02-10-2018, 06:17 AM   #12
nelson1379
Device: Kindle Paperwhite
Sorry to keep posting, but the non-web_edition scraping mechanism isn't reading today's edition webpage correctly. It correctly puts the first four articles in the "Front Page" section, but then it seems to skip over the rest of the "Front Page" section and puts all of the remaining articles into the "International" section.

I'm not sure what in the HTML is confusing the script between the top four articles and the rest. They're obviously formatted differently visually, but there's no h1 section between Front Page and International that the script is reading. I don't know Python, but I've been staring at it for a little while trying to figure it out... Perhaps it's something about the "rank-template featured-rank-template template-2 issue-template" div, which contains only the first four "Front Page" articles, that's messing it up. Sorry I can't be more helpful.
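
In case it helps whoever looks at this, here is a rough sketch of the behaviour I would expect: every article is filed under the most recent h1 heading in document order, no matter which div wraps it. This is not the actual script's code. The sample markup, the `article` tags, and the names `SectionGrouper` and `group_by_section` are all made up for illustration; only the class string on the featured div comes from the real page.

```python
from html.parser import HTMLParser

# Hypothetical markup, loosely modelled on the structure described above.
# The real page's tags and nesting may well differ.
SAMPLE = """
<h1>Front Page</h1>
<div class="rank-template featured-rank-template template-2 issue-template">
  <article><a href="/a1">Lead 1</a></article>
  <article><a href="/a2">Lead 2</a></article>
</div>
<div class="rank-template issue-template">
  <article><a href="/a3">More front page</a></article>
</div>
<h1>International</h1>
<div class="rank-template issue-template">
  <article><a href="/b1">World 1</a></article>
</div>
"""


class SectionGrouper(HTMLParser):
    """Files each article link under the last <h1> seen, ignoring wrapper divs."""

    def __init__(self):
        super().__init__()
        self.sections = {}      # section title -> list of (title, href)
        self.current = None     # title of the most recent <h1>
        self._in_h1 = False
        self._in_article = False
        self._href = None       # href of the link currently being read
        self._text = []         # text collected for the open <h1> or <a>

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
            self._text = []
        elif tag == "article":
            self._in_article = True
        elif tag == "a" and self._in_article:
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_h1 or self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            # A new heading starts a new section.
            self.current = "".join(self._text).strip()
            self.sections.setdefault(self.current, [])
            self._in_h1 = False
        elif tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            if self.current is not None:
                self.sections[self.current].append((title, self._href))
            self._href = None
        elif tag == "article":
            self._in_article = False


def group_by_section(html):
    parser = SectionGrouper()
    parser.feed(html)
    return parser.sections


if __name__ == "__main__":
    for section, articles in group_by_section(SAMPLE).items():
        print(section, [title for title, _ in articles])
```

With grouping like this, the articles outside the featured div would still land under "Front Page", because the section only changes when a new h1 appears. If the real script instead looks for articles inside one particular div per section, that could explain what I'm seeing, but that is only a guess.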