Old 02-07-2018, 08:37 AM   #1
nelson1379
Enthusiast
nelson1379 began at the beginning.
 
Posts: 26
Karma: 32
Join Date: Jan 2012
Device: Kindle Paperwhite
New York Times recipe broken

Looks like the "Today's Paper" webpage moved from

http://www.nytimes.com/pages/todayspaper/index.html

to

https://www.nytimes.com/section/todayspaper

And the webpage's layout is different as well.

I don't think I can fix it, but if anyone here knows how, the Github URL for the recipe is https://github.com/kovidgoyal/calibr...nytimes.recipe .
Old 02-07-2018, 11:02 AM   #2
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
Posts: 33,477
Karma: 10205098
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
The existing recipes were such a mess that I just ended up rewriting them. Note that I am not a NYT reader, so let me know if there are problems in the new recipes. https://github.com/kovidgoyal/calibr...451c71cd94b3b8

Decreased the size of the recipes from 1500 lines to 150 lines.
Old 02-09-2018, 07:45 AM   #3
nelson1379
Thank you, this recipe works very well! It's fantastic that it could be rewritten in a fraction of the code.

There are a couple of differences from before, but these are cosmetic and non-essential:

- There's some cruft at the end of each article. It starts with "A version of this article appears in print on..." The old script did not have this. However, this is pretty easy to ignore.

- The article text contains the same hyperlinks as the website. When tapped on a Kindle, accidentally or not, they open the slow-as-molasses Kindle browser. The old script seemed to erase the hyperlinks, which I never found useful (can't speak for others, though). Again, non-essential and easy to ignore.

- The resulting files seem larger than before (10 MB vs 3-4 MB for a weekday paper, 75 minutes vs 15 minutes to process on a Raspberry Pi 2, both with the setting compress_news_images_auto_size = 16). I will tool around with compress_news_images_max_size to see if I can get this back down to the same file size / processing time as before.
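
For anyone else tuning these settings, here is a rough sketch of the arithmetic behind compress_news_images_auto_size. It assumes, going by my reading of the calibre recipe documentation, that the target size for each JPEG is about (width * height) / factor bytes. The function name auto_size_budget_bytes is my own and not part of calibre.

```python
def auto_size_budget_bytes(width, height, factor=16):
    """Approximate per-image byte budget, assuming (width * height) / factor.

    This is a stand-alone illustration, not calibre code. A factor of None
    is treated as "auto compression disabled" and returns None.
    """
    if factor is None:
        return None
    return (width * height) / factor


# A hypothetical 1600x1200 photo with the factor used in this post (16).
budget = auto_size_budget_bytes(1600, 1200, 16)
print(f"{budget / 1024:.0f} KB")
```

If that reading is right, a larger factor means a smaller budget per image, while compress_news_images_max_size would instead cap each image at a fixed size (in KB, as far as I can tell) regardless of its dimensions.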

Thank you again!
Old 02-09-2018, 09:02 AM   #4
kovidgoyal
1) Easily fixed: https://github.com/kovidgoyal/calibr...8bff364d981ac3

2) Personally, I like to keep the links, as people who read them on devices with useful browsers might like to click them once in a while, plus it is good for the NYT to get visits.

3) No idea why that might be -- I never used the old recipe, so it's hard to say what's different.
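
For readers who would still rather drop the links in their own copy, here is a minimal stand-alone sketch of the idea, using only the Python standard library. It is not taken from the recipe and does not use any calibre API; the names LinkStripper and strip_links are made up for illustration. It removes the <a> tags but keeps the text inside them.

```python
from html import escape
from html.parser import HTMLParser


class LinkStripper(HTMLParser):
    """Re-emit HTML while dropping <a> tags but keeping their text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _render_attrs(self, attrs):
        # Rebuild the attribute string; valueless attributes stay bare.
        return "".join(
            f" {name}" if value is None else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
        )

    def handle_starttag(self, tag, attrs):
        if tag != "a":  # skip the opening anchor tag
            self.out.append(f"<{tag}{self._render_attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        if tag != "a":  # keep self-closing tags such as <br/>
            self.out.append(f"<{tag}{self._render_attrs(attrs)}/>")

    def handle_endtag(self, tag):
        if tag != "a":  # skip the closing anchor tag
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text was unescaped by the parser, so escape it again on output.
        self.out.append(escape(data, quote=False))


def strip_links(html_text):
    parser = LinkStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


sample = '<p>See <a href="https://example.com">this story</a> &amp; more.</p>'
print(strip_links(sample))
```

Comments, doctype declarations and similar constructs are dropped by this sketch, so it is only meant for simple article bodies.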
Old 02-09-2018, 01:18 PM   #5
NSILMike
Evangelist
NSILMike turned on, tuned in, and dropped out.
Posts: 492
Karma: 35936
Join Date: Apr 2011
Location: Suburb of Boston, MA
Device: Kindle KeyBoard & Samsung Tab 4
Just downloaded Calibre 3.17 (64-bit). I saw in the changelog that the New York Times recipe is improved. However, it is no longer listed in the 'Schedule news download' dialog under 'Fetch news'. Where did it go?
Thanks.
Old 02-09-2018, 01:47 PM   #6
NSILMike
Quote:
Originally Posted by NSILMike View Post
Just downloaded Calibre 3.17 (64-bit). I saw in the changelog that the New York Times recipe is improved. However, it is no longer listed in the 'Schedule news download' dialog under 'Fetch news'. Where did it go?
Thanks.
Found it by searching but not sure what section of the 'schedule news download' operation in Calibre it's now in...
Old 02-09-2018, 03:29 PM   #7
nelson1379
1. Thanks for the fix!
2. Makes sense.
3. Embarrassingly, this was user error: I forgot to change the webedition from true to false. I'm used to using false and had forgotten I'd changed it a while ago.
Old 02-09-2018, 08:27 PM   #8
kovidgoyal
@NSILMike: it is in the English section, where I think it always was.
Old 02-09-2018, 08:34 PM   #9
NSILMike
Quote:
Originally Posted by kovidgoyal View Post
@NSILMike: it is in the English section, where I think it always was.
Nope. And still isn't... something weird is going on?
Old 02-09-2018, 08:44 PM   #10
kovidgoyal
It is for me:
[Attached screenshot: Screenshot_20180210_071433.png]
Old 02-09-2018, 08:58 PM   #11
NSILMike
Quote:
Originally Posted by kovidgoyal View Post
It is for me:
Not for me...
Old 02-10-2018, 06:17 AM   #12
nelson1379
Sorry to keep posting, but the non-web_edition scraping mechanism isn't reading the Today's Paper webpage correctly -- it correctly puts the first four articles in the "Front Page" section, but then it seems to skip over the rest of the "Front Page" section and puts all of the remaining articles into the "International" section.

I'm not sure what in the HTML is confusing the script between the top four articles and the rest -- they're obviously formatted differently visually, but there's no h1 section between Front Page and International that the script is reading. I don't know Python, but I've been staring at it for a little while trying to figure it out... Perhaps it's something about that "rank-template featured-rank-template template-2 issue-template" div, which contains only the first four "Front Page" articles, that's messing it up. Sorry I can't be more helpful.
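
To make the expected behaviour concrete, here is a small stand-alone sketch of the grouping I would expect: walk the page in document order and file each article under the most recent section heading, no matter which div it sits in. The markup in SAMPLE is hypothetical and much simpler than the real page, and SectionIndexer and build_index are names I made up; this is not how the actual recipe is written.

```python
from html.parser import HTMLParser


class SectionIndexer(HTMLParser):
    """Group article links under the most recent <h1> section heading."""

    def __init__(self):
        super().__init__()
        self.sections = []  # list of (title, [urls]) in page order
        self._in_heading = False
        self._heading_text = []
        self._in_article = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_heading = True
            self._heading_text = []
        elif tag == "article":
            self._in_article = True
        elif tag == "a" and self._in_article and self.sections:
            # Links seen before any heading are ignored.
            href = dict(attrs).get("href")
            if href:
                self.sections[-1][1].append(href)

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_heading:
            self._in_heading = False
            title = "".join(self._heading_text).strip()
            self.sections.append((title, []))
        elif tag == "article":
            self._in_article = False

    def handle_data(self, data):
        if self._in_heading:
            self._heading_text.append(data)


def build_index(page_html):
    parser = SectionIndexer()
    parser.feed(page_html)
    parser.close()
    # Drop headings that ended up with no articles.
    return [(title, urls) for title, urls in parser.sections if urls]


# Hypothetical page: a "featured" div and a plain div share one section.
SAMPLE = """
<h1>Front Page</h1>
<div class="featured"><article><a href="/a1">One</a></article></div>
<div class="rest"><article><a href="/a2">Two</a></article></div>
<h1>International</h1>
<article><a href="/b1">Three</a></article>
"""

index = build_index(SAMPLE)
print(index)
```

The point of the sketch is that the featured div should not end the section: both /a1 and /a2 belong to "Front Page" because no new heading appears between them.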
Old 02-10-2018, 12:02 PM   #13
kovidgoyal
https://github.com/kovidgoyal/calibr...52eec88bcf77cf
Old 02-11-2018, 07:04 AM   #14
nelson1379
Thank you -- this fix captures all of the Front Page articles correctly, but everything that comes after the Front Page section (National, Obituaries, New York, etc.) is still gathered into the International section, which seems to cut off after 90-100 articles (maybe a device or file limitation).

More precisely, looking at the log file, it seems that the script picks up every article after the Front Page as belonging to every individual section, and then, since those sections are ostensibly the same, retains only the first section (International).

https://pastebin.com/Vr6SYq8K
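
In case it helps with checking the next fix, this is a tiny stand-alone sketch of how that symptom could be spotted automatically: given a parsed index of (section title, article URLs) pairs, report any sections whose article lists are identical. The sample data and the name find_duplicate_sections are invented for illustration and are not taken from the recipe or the log above.

```python
from collections import defaultdict


def find_duplicate_sections(index):
    """Return groups of section titles whose article lists are identical."""
    by_articles = defaultdict(list)
    for title, urls in index:
        by_articles[tuple(urls)].append(title)
    return [titles for titles in by_articles.values() if len(titles) > 1]


# Hypothetical index showing the symptom: every section after the
# Front Page has picked up the same articles.
index = [
    ("Front Page", ["/a1", "/a2"]),
    ("International", ["/b1", "/c1", "/d1"]),
    ("National", ["/b1", "/c1", "/d1"]),
    ("Obituaries", ["/b1", "/c1", "/d1"]),
]

duplicates = find_duplicate_sections(index)
print(duplicates)
```

An empty result would suggest each section is now scoped to its own articles.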

Someday I'd like to learn some Python so that I'm submitting pull requests and not just requests...
Old 02-11-2018, 09:46 AM   #15
kovidgoyal
Quote:
Originally Posted by nelson1379 View Post
Someday I'd like to learn some Python so that I'm submitting pull requests and not just requests...
Don't worry about it; making useful bug reports is appreciated as well.
https://github.com/kovidgoyal/calibr...067559c5615e48