MobileRead Forums

Steven630 06-21-2012 07:49 AM

Repeated contents in The Economist
 
Sometimes, especially in the special reports of The Economist, sentences are pulled out of the main text and displayed as quotes. Calibre doesn't seem to recognize these and treats them as ordinary paragraphs.

An example is this link: http://www.economist.com/node/21554747

You can see that "Having been trapped in a bubble during the fascist dictatorship, once they were freed Spanish banks were able to leapfrog rivals in more developed markets" is set in a larger size and is a pull quote from the article. Calibre, however, places this quote before the point where the sentence actually appears in the article.

Is there any way to fix the problem?

NotTaken 06-21-2012 08:50 AM

You could try changing:

Code:

dict(attrs={'class':['dblClkTrk', 'ec-article-info',
                'share_inline_header', 'related-items']}),

in remove_tags, to:

Code:

dict(attrs={'class':['dblClkTrk', 'ec-article-info',
                'share_inline_header', 'related-items',
                'pullquote']}),

I think the 'problem' was a feature to more accurately represent the published content :D
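
To illustrate what adding 'pullquote' to the class list is meant to achieve, here is a minimal standard-library sketch that drops every element carrying an unwanted class, together with everything nested inside it. This is not calibre's implementation (calibre does its own parsing and matching); the names ClassStripper and strip_classes are made up for this example, and the sketch assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID = {'br', 'img', 'hr', 'meta', 'link', 'input'}


class ClassStripper(HTMLParser):
    """Hypothetical helper: drop elements whose class list has an unwanted class."""

    def __init__(self, unwanted):
        super().__init__(convert_charrefs=True)
        self.unwanted = set(unwanted)
        self.out = []
        self.skip_depth = 0  # > 0 while we are inside a removed element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get('class') or '').split()
        if self.skip_depth:
            # Already inside a removed element: only track nesting.
            if tag not in VOID:
                self.skip_depth += 1
            return
        if self.unwanted.intersection(classes):
            # Start skipping; a void element has no content to skip.
            if tag not in VOID:
                self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.skip_depth:
            self.skip_depth -= 1
            return
        self.out.append('</%s>' % tag)

    def handle_data(self, data):
        # Note: character references arrive already decoded in this sketch.
        if not self.skip_depth:
            self.out.append(data)


def strip_classes(html, unwanted):
    parser = ClassStripper(unwanted)
    parser.feed(html)
    parser.close()
    return ''.join(parser.out)


sample = ('<div><p>Intro.</p>'
          '<div class="pullquote"><p>Quoted sentence.</p></div>'
          '<p>Body.</p></div>')
cleaned = strip_classes(sample, ['pullquote'])
print(cleaned)
```

With the pull quote removed, the sentence only appears once, at its real position in the article body.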

Steven630 06-21-2012 09:24 AM

Thanks. Or perhaps Calibre could render the quote in italics? (Though having the quote come before the sentence would still be a problem.)
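
A possible way to get italics, as an untested sketch: calibre recipes accept an extra_css attribute, so a rule targeting the pull quote class could be added to the recipe class. This assumes the quotes carry the class 'pullquote' (as suggested above) and that 'pullquote' is not listed in remove_tags, since removed elements cannot be styled. It changes only the appearance, not the position of the quote.

```python
# Fragment meant to sit inside the recipe class body (untested assumption:
# the pull quotes use class="pullquote" and are not stripped by remove_tags).
extra_css = '.pullquote { font-style: italic; }'
```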

Steven630 06-21-2012 11:21 AM

Just tried replacing the code. It works well. But I noticed that there are more images than in the issue downloaded with the built-in recipe. (They all come from "blog"-style articles of TE, like Charlemagne.)

For example:
http://www.economist.com/node/21556949 (Charlemagne)
http://www.economist.com/node/21556983 (Myanmar)

With the built-in recipe, the images at the bottom of the Charlemagne article, under "Charlemagnes of 2012", were not downloaded (that's great, since they are just images from past columns). The series of news images in the article on Myanmar was downloaded (which is also good).

However, with the modified recipe, the images from both Charlemagne and Myanmar were downloaded. I can't figure out why, since the modified version only removes the quote.

Is there a way to exclude images like "Charlemagnes of 2012" while still including the ones in "Myanmar"?

NotTaken 06-21-2012 05:12 PM

Anything is possible, just check the documentation.

Edit: if it's only images at the bottom, you could remove all tags after class=footnotes, but I've no idea whether that would wipe out other useful content that appears in that position.
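
In recipe terms, that idea could look roughly like the fragment below, using the remove_tags_after attribute of calibre recipes. It is untested against the Economist pages and assumes an element with class 'footnotes' really sits just above the unwanted images; anything else that follows that element would be dropped too.

```python
# Fragment meant to sit inside the recipe class body (untested assumption:
# the unwanted images always come after an element with class="footnotes").
remove_tags_after = dict(attrs={'class': 'footnotes'})
```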

Steven630 06-21-2012 10:35 PM

Thank you so much!


All times are GMT -4.