Old 05-03-2018, 02:17 PM   #1
jma1
Connoisseur
jma1 began at the beginning.
 
Posts: 85
Karma: 10
Join Date: Dec 2015
Device: Kindle
Request for new recipe: The Federalist

Attached are two versions of a new recipe for The Federalist. I would appreciate help with the issues below.

The base version, with auto cleanup turned off, returns all the articles with clean text, including inline images, but no author or date. Images are difficult to test because the feed has a short list of articles, and some days no article has inline images, just the picture at the top under the heading.

In the second version, called Test, I tried keep tags and remove tags to get the author and date. The date appears for all articles, but the output format is odd [dd yyyy ^p mmm]. The author appears alongside the date in the mobi for articles where the author byline is inline with the article title and text on the web page. Some articles have the author in a left-hand sidebar, and for that type I could not figure out how to specify the tags.

Is it possible to get the author, the date and possibly images from either of these recipes? Thanks in advance.
Old 05-03-2018, 10:22 PM   #2
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
Posts: 45,596
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You seem to have forgotten to attach the recipes.
Old 05-04-2018, 08:52 AM   #3
jma1
Recipes now attached.
Attached Files
File Type: recipe Federalist mode_1046.recipe (453 Bytes, 216 views)
File Type: recipe Test Federalist_1038.recipe (1.5 KB, 222 views)
Old 05-04-2018, 11:16 AM   #4
kovidgoyal
Your first recipe is getting the content from the RSS feed, so it will only contain whatever content is in the RSS feed. The second recipe gets it from the actual web pages. I'm not sure what you are trying to do with the huge number of keep/remove tag specifications. You shouldn't need more than 3-4 in keep_only_tags. Does this website have different formatting for different article pages? If so, it would help if you posted links to a few of these pages.
Old 05-04-2018, 11:24 AM   #5
kovidgoyal
I took a quick look, and this is what I came up with:

Code:
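from calibre.web.feeds.news import BasicNewsRecipe


def classes(classes):
    # Build a matcher spec that accepts any tag whose class attribute shares
    # at least one name with the space-separated list passed in.
    q = frozenset(classes.split(' '))
    return dict(
        attrs={'class': lambda x: x and frozenset(x.split()).intersection(q)}
    )


class AdvancedUserRecipe1502348373(BasicNewsRecipe):
    title = 'The Federalist'
    oldest_article = 7  # days
    max_articles_per_feed = 100
    no_stylesheets = True
    encoding = 'utf-8'
    # Download the linked article pages instead of using the feed's own content
    use_embedded_content = False
    remove_attributes = ['xmlns', 'lang', 'style', 'width', 'height']

    # Keep only elements carrying one of these classes; the rest is discarded
    keep_only_tags = [
        classes('entry-header'),
        classes('wp-post-image post-categories entry-content shortbio'),
    ]

    feeds = [
        ('All', 'http://thefederalist.com/feed/'),
    ]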
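
To see what the classes() helper in the recipe above actually matches, it can be exercised on its own, outside calibre. This is only an illustrative sketch: the helper is copied from the post, while the sample class strings ('entry-content clearfix', 'sidebar widget') are made up for the demonstration and are not taken from the site.

```python
def classes(classes):
    # Same helper as in the recipe: match when the tag's class attribute
    # shares at least one name with the requested set.
    q = frozenset(classes.split(' '))
    return dict(
        attrs={'class': lambda x: x and frozenset(x.split()).intersection(q)}
    )


spec = classes('wp-post-image post-categories entry-content shortbio')
matches = spec['attrs']['class']

# Hypothetical class attribute values, for illustration only.
print(bool(matches('entry-content clearfix')))  # one name in common
print(bool(matches('sidebar widget')))          # nothing in common
print(bool(matches(None)))                      # tag without a class attribute
```

The `x and ...` guard is what keeps tags without any class attribute from raising an error: the matcher receives None for them and simply returns a falsy value.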
Old 05-06-2018, 03:17 PM   #6
jma1
I tweaked your version, and now the byline (author and date) seems to appear for all articles. The articles come in two different styles when I open them from the RSS feed page, and the byline is coded and displayed differently in each.

The revised recipe and the mobi output are attached (two files). Please note that for some articles the article picture appears in the middle of the byline in the mobi. That is not a big issue, as the info is all there.

One of the two article formats has a right-hand sidebar with 'Most Popular' and 'Related Posts' sections. Your version eliminates the text, but the photos in those boxes, which do not belong to the article itself, still appear in the mobi output. I tried to get them out with remove tags but could not. Could those images be eliminated? An example of that article format is linked below:

http://thefederalist.com/2018/05/04/...f-destruction/

Thanks!
Attached Files
File Type: recipe cameupwith_1057.recipe (951 Bytes, 217 views)
File Type: mobi cameupwith - calibre.mobi (6.32 MB, 219 views)
Old 05-06-2018, 09:52 PM   #7
kovidgoyal
https://github.com/kovidgoyal/calibr...77515df996838d
Old 05-09-2018, 10:00 AM   #8
jma1
Kovid, that revision worked.

Now just one more item, please. Some articles include images inline with the content text, and these are not being captured in the mobi output. Some days the feed does not have any such articles. Here is an example article in today's feed:

http://thefederalist.com/2018/05/09/...ng-oil-prices/

And here is what the HTML looks like for one of the images:

<img class="aligncenter wp-image-182063" src="http://thefederalist.com/wp-content/uploads/2018/05/Lima5.8.c.jpg" alt="" data-portal-copyright="The Federalist" srcset="http://thefederalist.com/wp-content/uploads/2018/05/Lima5.8.c.jpg 955w, http://thefederalist.com/wp-content/....c-300x218.jpg 300w, http://thefederalist.com/wp-content/....c-768x558.jpg 768w, http://thefederalist.com/wp-content/....c-372x270.jpg 372w, http://thefederalist.com/wp-content/....c-200x145.jpg 200w, http://thefederalist.com/wp-content/....c-294x214.jpg 294w" sizes="(max-width: 600px) 100vw, 600px" style="display: block;" data-lazy-loaded="true" width="600" height="436">

Thanks.
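
The data-lazy-loaded attribute in the snippet above suggests the images are lazy-loaded, in which case the page as downloaded by a recipe may not carry the real image URL in src. One possible workaround, shown here purely as a sketch and not as the fix that was eventually committed, is to pick the widest candidate out of the srcset attribute. The function name widest_srcset_url and the example.com URLs are hypothetical; in a recipe, logic like this would typically be called from the preprocess_html hook to rewrite each img tag's src.

```python
def widest_srcset_url(srcset):
    # Return the URL with the largest width descriptor (e.g. "955w") from a
    # srcset string, or None if no usable candidate is found.
    # Assumes the URLs themselves contain no commas or spaces.
    best_url, best_width = None, -1
    for candidate in srcset.split(','):
        parts = candidate.strip().split()
        if len(parts) != 2 or not parts[1].endswith('w'):
            continue  # skip empty or density-based ("2x") candidates
        try:
            width = int(parts[1][:-1])
        except ValueError:
            continue
        if width > best_width:
            best_url, best_width = parts[0], width
    return best_url


# Hypothetical srcset value, modelled on the shape of the one quoted above.
example = (
    'http://example.com/img.jpg 955w, '
    'http://example.com/img-300x218.jpg 300w, '
    'http://example.com/img-768x558.jpg 768w'
)
print(widest_srcset_url(example))
```

Choosing the widest candidate favours image quality over file size; for a smaller e-book, one could instead pick the smallest width above some threshold.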
Old 05-09-2018, 10:24 AM   #9
kovidgoyal
https://github.com/kovidgoyal/calibr...0db2ac32c8c04a
Old 05-11-2018, 12:37 PM   #10
jma1
Perfect. Thanks.

All times are GMT -4.