Old 06-17-2011, 07:11 AM   #1
lgwapnitsky
Junior Member
lgwapnitsky began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Jun 2011
Device: Nook Simple Touch
Philadelphia Inquirer Recipe

I'm new to Calibre and have started exploring the recipes. I'd like to modify the Philadelphia Inquirer recipe, as it is currently only downloading the basic RSS feed information and no link to (or info from) the full article. I've not programmed in Python before and would appreciate some guidance on how I can tackle this issue.

Thanks,
Larry
Old 06-17-2011, 09:35 AM   #2
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by lgwapnitsky View Post
I'm new to Calibre and have started exploring the recipes. I'd like to modify the Philadelphia Inquirer recipe, as it is currently only downloading the basic RSS feed information and no link to (or info from) the full article. I've not programmed in Python before and would appreciate some guidance on how I can tackle this issue.

Thanks,
Larry
http://manual.calibre-ebook.com/news.html#news
Old 06-17-2011, 10:51 AM   #3
sexymax15
Enthusiast
sexymax15 began at the beginning.
 
Posts: 30
Karma: 12
Join Date: Jun 2011
Location: India
Device: Kindle 3g
Here's the recipe for the Philadelphia Inquirer (Philly.com). It works without any problems.

Spoiler:
Code:
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1308312288(BasicNewsRecipe):
    title = u'Philadelphia Inquirer(Philly.com)'
    oldest_article = 15  # days
    max_articles_per_feed = 20
    use_embedded_content = False  # download each linked article page, not just the RSS summary
    remove_empty_feeds = True
    no_stylesheets = True
    remove_javascript = True

    # remove_tags_before = {'class': 'article_timestamp'}
    # remove_tags_after = {'class': 'graylabel'}

    # Keep only the headline and paragraph tags of each article page.
    keep_only_tags = [dict(name=['h1', 'p'])]

    # Strip navigation, ads, polls and other page furniture by tag name, id and class.
    remove_tags = [
        dict(name=['hr', 'dl', 'dt', 'img', 'meta', 'iframe', 'link', 'script',
                   'form', 'input', 'label']),
        dict(id=['toggleConfirmEmailDiv', 'toggleTOS', 'toggleUsernameMsgDiv',
                 'toggleConfirmYear', 'navT1_philly', 'secondaryNav', 'navPlacement',
                 'globalPrimaryNav', 'ugc-footer-philly', 'bv_footer_include',
                 'footer', 'header', 'container_rag_bottom', 'section_rectangle',
                 'contentrightside']),
        {'class': ['megamenu3 megamenu', 'container misc', 'container_inner misc_inner',
                   'misccontainer_left_32', 'headlineonly', 'misccontainer_middle_32',
                   'misccontainer_right_32', 'headline formBegin',
                   'post_balloon', 'relatedlist', 'linkssubhead', 'b_sq', 'dotted-rule-above',
                   'container', 'headlines-digest', 'graylabel', 'container_inner',
                   'rlinks_colorbar1', 'rlinks_colorbar2', 'supercontainer',
                   'container_5col_left', 'container_image_left',
                   'digest-headline2', 'digest-lead', 'container_5col_leftmiddle',
                   'container_5col_middlemiddle', 'container_5col_rightmiddle',
                   'container_5col_right', 'divclear', 'supercontainer_outer force-width',
                   'containertitle  kicker-title',
                   'pollquestion', 'pollchoice', 'photomore', 'pollbutton',
                   'container rssbox', 'containertitle video ',
                   'containertitle_image ', 'container_tabtwo', 'selected',
                   'shadetabs', 'tabcontentstyle', 'tabcontent', 'inner_container',
                   'arrow', 'container_ad', 'containertitlespacer', 'adUnit', 'tracking',
                   'sitemsg_911 clearfix']},
    ]

    extra_css = """
        h1 {font-family: Georgia, serif; font-size: xx-large}
    """

    # Each feed is a (section title, RSS URL) pair.
    feeds = [(u'News', u'http://www.philly.com/philly_news.rss')]
Moderator Notice
Edited to add tags.
Attached Files
File Type: zip Philadelphia Inquirer(Philly.com)_1129.zip (1.2 KB, 141 views)

Last edited by Starson17; 06-17-2011 at 11:52 AM.
Old 06-17-2011, 11:05 AM   #4
lgwapnitsky
Junior Member
lgwapnitsky began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Jun 2011
Device: Nook Simple Touch
I saw that, but some of what's in the documentation isn't clearly detailed, e.g., what is "soup"? I'm not too familiar with Python and its inner workings (I'm currently programming heavily in C#).

Thanks,
Larry


Sexymax - didn't see your response.

Thanks!
Old 06-17-2011, 11:06 AM   #5
lgwapnitsky
Junior Member
lgwapnitsky began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Jun 2011
Device: Nook Simple Touch
Quote:
Originally Posted by sexymax15 View Post
Here's the recipe for the Philadelphia Inquirer (Philly.com). It works without any problems.

Beautiful! Perfect! I'll try that when I get home later!
Old 06-17-2011, 11:50 AM   #6
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by lgwapnitsky View Post
I saw that, but some of what's in the documentation isn't clearly detailed, e.g., what is "soup"? I'm not too familiar with Python and its inner workings (I'm currently programming heavily in C#).
FYI, "soup" is the HTML of the article page currently being processed, parsed by BeautifulSoup into a tree so that you can find, remove, replace and manipulate any HTML tag on that page by where it sits relative to other tags (next, previous), relative to tags at the same level (next sibling, previous sibling), relative to its parent, etc. It's at the heart of how Calibre's recipe system scrapes web pages.
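
To make the idea concrete, here is a minimal sketch of that kind of tree navigation. It assumes the standalone bs4 package rather than Calibre itself. The sample HTML and the clean_article helper are made up for illustration. In a recipe, similar code would normally live in the preprocess_html(self, soup) hook, which receives the already-parsed soup.

```python
from bs4 import BeautifulSoup

# Hypothetical article page, invented for this example.
SAMPLE_HTML = """
<html><body>
  <div id="header">Site navigation</div>
  <h1>Headline</h1>
  <p>First paragraph.</p>
  <div class="relatedlist"><p>Related links</p></div>
  <p>Second paragraph.</p>
</body></html>
"""


def clean_article(soup):
    """Drop unwanted blocks, roughly what a recipe's preprocess_html hook might do."""
    for tag in soup.find_all('div', attrs={'id': 'header'}):
        tag.extract()  # remove the tag and everything inside it from the tree
    for tag in soup.find_all('div', attrs={'class': 'relatedlist'}):
        tag.extract()
    return soup


soup = clean_article(BeautifulSoup(SAMPLE_HTML, 'html.parser'))

headline = soup.find('h1')                       # find a tag by name
first_para = headline.find_next_sibling('p')     # next <p> at the same level
container = headline.parent                      # the tag that encloses the headline
paragraphs = [p.get_text() for p in soup.find_all('p')]

print(headline.get_text(), '|', first_para.get_text(), '|', container.name)
print(paragraphs)
```

The paragraph inside the removed "relatedlist" block disappears along with its parent, which is why removing a single container is usually enough to get rid of a whole sidebar.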
Old 06-17-2011, 09:01 PM   #7
lgwapnitsky
Junior Member
lgwapnitsky began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Jun 2011
Device: Nook Simple Touch
Quote:
Originally Posted by sexymax15 View Post
Here's the recipe for the Philadelphia Inquirer (Philly.com). It works without any problems.

Question: can I add more of the Philly.com feeds to the feeds list? I think this is how it was set up in the recipe provided by Calibre.

Thanks,
Larry
Old 06-18-2011, 03:48 AM   #8
sexymax15
Enthusiast
sexymax15 began at the beginning.
 
Posts: 30
Karma: 12
Join Date: Jun 2011
Location: India
Device: Kindle 3g
Quote:
Originally Posted by lgwapnitsky View Post
Question: can I add more of the Philly.com feeds to the feeds list? I think this is how it was set up in the recipe provided by Calibre.

Thanks,
Larry
Just replace the feeds in this line with your own feeds.
Quote:
feeds = [(u'News', u'http://www.philly.com/philly_news.rss')]
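
For instance, a feeds list with several sections might look like the sketch below. Each entry is a (section title, feed URL) pair. Only the first URL comes from the recipe in this thread. The other two are placeholders, not verified Philly.com feeds, and would need to be replaced with real RSS addresses.

```python
# Each entry is a (section title, RSS URL) pair; Calibre builds one section per entry.
feeds = [
    (u'News', u'http://www.philly.com/philly_news.rss'),   # from the recipe above
    # Placeholder URLs below: substitute the real feed addresses you want.
    (u'Sports', u'http://www.example.com/sports.rss'),
    (u'Business', u'http://www.example.com/business.rss'),
]

for section_title, url in feeds:
    print(section_title, '->', url)
```

In the recipe itself, this list simply replaces the existing feeds line inside the class body, indented to match the other attributes.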
Old 06-28-2011, 10:51 PM   #9
Jetkey
Junior Member
Jetkey began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Jun 2011
Location: Maryland, USA
Device: Kindle 3 (Wi Fi)
Question

Hi folks, just joined the forum. I can't seem to get anything from the Inquirer (Philly.com) recipe. I have a bunch of other recipes that work just fine, but all I get from this one is a "News" link but no articles. I'm converting everything to mobi for my Kindle 3. Any ideas why it seems to work for some but not in my case?