Old 03-15-2013, 09:06 PM   #1
Waldo3
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2013
Device: Kindle
NY Times - Multiple articles problem

Calibre V.0.9.23. I am enormously impressed with the quality and general behavior of the Calibre periodical recipes. Many of these, particularly for the NYT, Economist, etc. are outstanding. It is clearly an ongoing challenge to keep these working correctly in the light of frequent changes to the related websites, particularly when some of the websites such as Bloomberg / Business Week do not even maintain conventional RSS feeds.

In the past few weeks I've noticed what is probably a website-change related problem with the NYT name / password recipe. Many if not most article links now actually contain several articles merged under the same title. For example, an article on the Metropolitan Opera in the Friday 3/15/13 edition contained a total of 8 articles concatenated together.

It appears that the original article of the 8 contains a link section at the end called "Related Articles". The recipe appears to be appending the text of all of these articles to the original, and possibly even chaining further articles, each of which contains more links at the end.

Although this could be a pleasing option, it is causing my Kindle Keyboard to either take a very long time to open a new article or to eventually crash / reset in some cases.

The total file size as a result is also bulking up considerably. On Friday 3/15 the NYT Mobi file was 13MB. On Sunday a week or two ago it was over 20MB, and would not even open on the Kindle Keyboard and subsequently crashed.

As a temporary workaround, I discovered that it was possible to convert the problematic Mobi to Epub format using onlineconvert.com. The resulting epub also reduced in size from 13MB to 7MB and appeared to correctly maintain the TOC. This reads fine in the Windows Firefox ePub reader. However, the multiple concatenated articles still remain the same in the epub version.

As far as the Kindle Keyboard crashes are concerned, I am beginning to wonder if there are possibly two problems: one related to the multiple articles and another related to the large file size or complex TOC / article linkages.
Old 03-15-2013, 11:36 PM   #2
kovidgoyal
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
As far as I recall, the inclusion of related articles is deliberate by the recipe author. You can disable it by customizing the recipe. Right click the fetch news button and choose customize recipe. Customize the builtin NYT recipe and change the line which says

recursions = 1
to
recursions = 0

Then use your customized recipe instead of the builtin one.
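To make the effect of that setting concrete, here is a minimal sketch of depth-limited link following. It is only an illustration of the idea, not calibre's actual fetcher: the function name `pages_fetched` and the link graph are hypothetical, and a real recipe also filters which links get followed.

```python
from collections import deque


def pages_fetched(start, links, recursions):
    """Return the pages reached from `start`, following links at most
    `recursions` levels deep (0 means: fetch only the article itself)."""
    seen = [start]
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth >= recursions:
            continue  # depth limit reached, do not follow this page's links
        for target in links.get(page, []):
            if target not in seen:
                seen.append(target)
                queue.append((target, depth + 1))
    return seen


# Hypothetical link graph: one article whose "Related Articles" box
# links to three more pages, one of which links further.
links = {
    "opera": ["related-1", "related-2", "related-3"],
    "related-1": ["related-4"],
}

print(pages_fetched("opera", links, 0))  # ['opera']
print(pages_fetched("opera", links, 1))  # ['opera', 'related-1', 'related-2', 'related-3']
```

With one level of recursion every related link is pulled into the same article, which would match the merged articles described in the first post; with zero, only the article itself is kept.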
Old 03-16-2013, 08:36 AM   #3
NSILMike
Guru
Posts: 735
Karma: 35936
Join Date: Apr 2011
Location: Shrewsbury, MA
Device: Lenovo Android Tablet
Am on version 0.9.23
I have similar issues with the Times. (And a different one as well.)
If I right click my fetch news button, the only options I see are 'schedule news download', 'add a custom news source', and 'download all scheduled...'

As for the different issue: on the second NYT recipe (has Headlines in the recipe title) it fails immediately for me now. Is there a bug, or is their paywall blocking it, or...?
Old 03-16-2013, 11:38 AM   #4
SilentSeven
Enthusiast
Posts: 27
Karma: 10
Join Date: Sep 2010
Device: Nexus7
Mike - I started a specific thread on the NY Times headline news issues. Should help to keep the topic focused...
Old 03-16-2013, 12:50 PM   #5
NSILMike
Guru
Posts: 735
Karma: 35936
Join Date: Apr 2011
Location: Shrewsbury, MA
Device: Lenovo Android Tablet
Quote:
Originally Posted by SilentSeven View Post
Mike - I started a specific thread on the NY Times headline news issues. Should help to keep the topic focused...
OK, I'll look for it now. (Good idea.)
Old 03-17-2013, 07:37 AM   #6
Waldo3
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2013
Device: Kindle
Quote:
Originally Posted by kovidgoyal View Post
As far as I recall, the inclusion of related articles is deliberate by the recipe author. You can disable it by customizing the recipe. Right click the fetch news button and choose customize recipe. Customize the builtin NYT recipe and change the line which says

recursions = 1
to
recursions = 0

Then use your customized recipe instead of the builtin one.
Thanks for the followup.
Old 03-21-2013, 12:35 PM   #7
Waldo3
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2013
Device: Kindle
As a follow-up to the file size / crash issue with the NYT subscription version, I changed to recursions = 0 and tested for a few days. File size is much reduced, to < 10 MB on both weekdays and Sunday. However, page turning and menu navigation are still very slow. The Kindle Keyboard eventually crashed this Tuesday and Wednesday even though the file size was in the 6-7 MB range. The problem is not quite as bad on the Paperwhite: no crashes occur, but navigation does not work correctly on the large-format TOC menu.

It seems that concatenated articles were not the problem. The behavior is similar to issues I've seen in the past with the Sunday NYT subscription edition which gets very big. What has changed recently is the updated recipe which adds daily most viewed, tech sections, etc. For example, the Wed 3/20 edition contained 17 sections and 227 total articles. Of these about half are in the recently added sections. The common issue with the earlier Sunday problems seems to be the total number of sections / articles.

Has anyone else seen these issues with the Kindle / MOBI NYT subscription version? Or possibly with other MOBI periodical-format publications? It seems that maybe the Kindle firmware itself chokes when sections / articles go over 100-200+. I regularly download over a dozen Kindle pubs in MOBI periodical format and never have problems otherwise.
Old 03-21-2013, 02:05 PM   #8
kovidgoyal
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Older Kindles tended to have issues with larger periodical downloads. I suggest you edit the recipe to remove the sections you don't want, just as you reduced recursions to zero.
Old 03-21-2013, 03:59 PM   #9
BobbyVan
Enthusiast
Posts: 42
Karma: 20
Join Date: Jan 2012
Device: Kindle Paperwhite
Recipe causing crashes...

Quote:
Originally Posted by Waldo3 View Post
As a follow-up to the file size / crash issue with the NYT subscription version, I changed to recursions = 0 and tested for a few days. File size is much reduced, to < 10 MB on both weekdays and Sunday. However, page turning and menu navigation are still very slow. The Kindle Keyboard eventually crashed this Tuesday and Wednesday even though the file size was in the 6-7 MB range. The problem is not quite as bad on the Paperwhite: no crashes occur, but navigation does not work correctly on the large-format TOC menu.

It seems that concatenated articles were not the problem. The behavior is similar to issues I've seen in the past with the Sunday NYT subscription edition which gets very big. What has changed recently is the updated recipe which adds daily most viewed, tech sections, etc. For example, the Wed 3/20 edition contained 17 sections and 227 total articles. Of these about half are in the recently added sections. The common issue with the earlier Sunday problems seems to be the total number of sections / articles.

Has anyone else seen these issues with the Kindle / MOBI NYT subscription version? Or possibly with other MOBI periodical-format publications? It seems that maybe the Kindle firmware itself chokes when sections / articles go over 100-200+. I regularly download over a dozen Kindle pubs in MOBI periodical format and never have problems otherwise.
This is a frequent problem for me. I'm a PW user and have also modified the recipe to compress images and set recursions to zero. Would a different output format (azw or azw3) possibly avoid this crashing? Can other Kindle-compatible formats retain periodical formatting?

Thanks.
Old 03-21-2013, 06:05 PM   #10
nickredding
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
Quote:
Originally Posted by Waldo3 View Post
What has changed recently is the updated recipe which adds daily most viewed, tech sections, etc.
You can turn those sections off as well, by customizing the recipe:
Code:
getTechBlogs = False
getPopularArticles = False
That will omit the most viewed/emailed and tech blog articles.

As of the upcoming release (I believe Kovid has integrated the code) you'll be able to also set
Code:
compress_news_images = True
which will rescale images in the articles to your device screen dimensions (set by your selected output profile) and compress the images by a factor of 16. This should have a fairly big impact on your resulting file size.

The low-end Kindle e-ink readers are pretty sensitive to file size. You'll have to make some decisions as to what sections you want to see if you are having issues. The thing about the NYTimes is it has a lot of content, and some people want it all and some don't. That's why the recipe will often require customization, and I've tried to make that easy by putting the controls at the top of the recipe with detailed comments explaining what they do.
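If you keep your customized recipe as a text file and would rather script these edits than retype them after each recipe update, something like the following could work. It is a rough sketch, not part of calibre: `set_recipe_option` is a hypothetical helper, and it assumes each control is a simple one-line `name = value` assignment, as in the snippets above. Any trailing comment on a rewritten line is dropped.

```python
import re


def set_recipe_option(source, name, value):
    """Rewrite every one-line `name = <old value>` assignment in recipe
    source text, preserving indentation. Raises KeyError if none is found."""
    pattern = re.compile(
        rf"^([ \t]*){re.escape(name)}[ \t]*=.*$", re.MULTILINE
    )
    new_source, count = pattern.subn(
        lambda m: f"{m.group(1)}{name} = {value!r}", source
    )
    if count == 0:
        raise KeyError(f"option {name!r} not found in recipe source")
    return new_source


# Stand-in for the top of a recipe file; the class body is made up,
# only the option names come from this thread.
recipe_text = (
    "class NYTimes:\n"
    "    recursions = 1\n"
    "    getTechBlogs = True\n"
    "    getPopularArticles = True\n"
)

for option, wanted in [
    ("recursions", 0),
    ("getTechBlogs", False),
    ("getPopularArticles", False),
]:
    recipe_text = set_recipe_option(recipe_text, option, wanted)

print(recipe_text)
```

Raising an error when an option is missing is deliberate: if a control gets renamed in a later recipe version, you find out immediately instead of silently downloading every section again.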
Old 03-24-2013, 07:59 AM   #11
Waldo3
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2013
Device: Kindle
@nickredding Thanks for the editing tips. Overall this is a terrific recipe. Out of curiosity do any of the Kindle phone / tablet apps handle the size better? The Amazon PC app does not handle the periodical format well - not officially supported. I'd be tempted to go epub and get a Nook, but I like the TOC features of Kindle periodicals.
Old 03-24-2013, 10:02 AM   #12
nickredding
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
Quote:
Originally Posted by Waldo3 View Post
@nickredding Thanks for the editing tips. Overall this is a terrific recipe. Out of curiosity do any of the Kindle phone / tablet apps handle the size better? The Amazon PC app does not handle the periodical format well - not officially supported. I'd be tempted to go epub and get a Nook, but I like the TOC features of Kindle periodicals.
The original Kindle Fire I have will handle periodical files over 100MB without any issues, so I assume the newer Fire models will also.
Old 03-29-2013, 07:28 AM   #13
Waldo3
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2013
Device: Kindle
OK I have been testing this week with the option

compress_news_images = True

This has dramatically reduced the average size of the NYT subscription recipe issues, from 5-13 MB in the previous 2 weeks to < 5 MB this week. For some reason the size of last Sunday's issue dropped the most, to 2.7 MB. I was suspicious (I did not do an A/B test), but the content seems to be intact. Navigation and page turning on the Kindle Keyboard and Paperwhite are also noticeably faster. Better yet, there have been no Kindle crashes.

On my PC's 24-in monitor, using a Mobi to epub conversion with Firefox, the photos are somewhat blocky but reasonably detailed. The text in some charts was readable but marginal. I gather that using another Calibre device profile in this case, rather than a conversion, may have improved image quality, but even this approach was acceptable to me.

I also tried the compression option with a custom Ars Technica recipe and had similar size reductions.

For the NYT, at least, it probably makes sense to set "compress_news_images = True" as the default option.
Old 03-29-2013, 01:38 PM   #14
nickredding
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
Quote:
Originally Posted by Waldo3 View Post
OK I have been testing this week with the option

compress_news_images = True

This has dramatically reduced the average size of the NYT subscription recipe issues, from 5-13 MB in the previous 2 weeks to < 5 MB this week. For some reason the size of last Sunday's issue dropped the most, to 2.7 MB. I was suspicious (I did not do an A/B test), but the content seems to be intact. Navigation and page turning on the Kindle Keyboard and Paperwhite are also noticeably faster. Better yet, there have been no Kindle crashes.

On my PC's 24-in monitor, using a Mobi to epub conversion with Firefox, the photos are somewhat blocky but reasonably detailed. The text in some charts was readable but marginal. I gather that using another Calibre device profile in this case, rather than a conversion, may have improved image quality, but even this approach was acceptable to me.

I also tried the compression option with a custom Ars Technica recipe and had similar size reductions.

For the NYT, at least, it probably makes sense to set "compress_news_images = True" as the default option.
There are a couple of other image compression options you might want to try to see if you get a better combination of image quality and size.

The setting compress_news_images_auto_size defaults to 16, which means the image compression target is set to w*h/16 bytes, where w and h are the (rescaled) image dimensions in pixels. You can set this to a lower number to compress less aggressively: for example, 8 raises the target to w*h/8 bytes.
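The arithmetic behind that target is easy to check by hand. The helper below is only a worked example of the w*h/N formula quoted above, not calibre code; the name `compression_target_bytes` is made up, and the 768 x 1024 image size is just a sample.

```python
def compression_target_bytes(width, height, auto_size=16):
    """Per-image byte target: pixel count divided by the auto_size factor."""
    return (width * height) // auto_size


# A 768 x 1024 image has 786,432 pixels.
for factor in (16, 8):
    target = compression_target_bytes(768, 1024, factor)
    print(f"auto_size={factor}: about {target // 1024} KB per image")
# auto_size=16: about 48 KB per image
# auto_size=8: about 96 KB per image
```

So halving the factor doubles the per-image budget, which is the trade-off between image quality and file size being described here.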

Also, instead of scaling images to your device output profile, you can set your own image scaling parameters using the scale_news_images parameter.

I use the following settings in my custom recipes because I often view the mobi files on my iPad as well as my Kindle Fire.
Code:
    compress_news_images = True
    compress_news_images_auto_size = 8
    scale_news_images_to_device = False
    scale_news_images = (768, 1024)
Today's NYTimes weighed in at 6.7MB which for my purposes is a good result. There are no visible compression artifacts on my iPad and the images are scaled to that screen size. The Fire simply rescales the images down to its screen dimensions.
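For anyone wondering what a (768, 1024) limit does to a particular picture, here is a small sketch of fitting an image inside a maximum box while keeping its aspect ratio. It is an illustration only: the helper `fit_within` is hypothetical, and the assumption that smaller images are left alone rather than enlarged, as well as the rounding, may differ from what calibre does internally.

```python
def fit_within(width, height, max_width, max_height):
    """Shrink (width, height) to fit inside the box, keeping the aspect
    ratio. Images that already fit are returned unchanged (assumption)."""
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(2048, 1536, 768, 1024))  # (768, 576)
print(fit_within(600, 400, 768, 1024))    # (600, 400)
```

A landscape photo is limited by the 768 pixel width, so it ends up well under the 1024 pixel height limit.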