Old 05-30-2009, 02:48 PM   #1
mobilereader72
Using PubDate in print_version of custom news source

Hi,

I'm rather new to this, but is there an easy way to use the pubdate information for the print_version of a custom news source?

For example, the article link is at:
Code:
http://www.somewebsite.com/.../article_idnumber
But print version is at:
Code:
http://www.somewebsite.com/.../print/20090529/idnumber
The 20090529 in the print URL is the pubdate of the article linked above, whereas the pubdate is listed in the XML source as:

Code:
<pubDate>Fri, 29 May 2009 23:31 -0400</pubDate>
Calibre seems to be able to parse the pubdate fine using the BasicNewsRecipe. So is there some sort of global variable I can use to include the pubdate in my print_version url?

For example:

Code:
def print_version(self, url): 
   return 'http://www.somewebsite.com/../print/' + pubdate + '/' + url.rsplit('/article_')[1]
From what I've read in the documentation, I will need to parse the feeds again using the parse_feeds() function in order to extract the pubdate data. Is this correct? Does anyone have any examples of how to do this? I can't seem to find any recipes that use the parse_feeds() function.

Any help would be appreciated. Thanks!
Old 05-30-2009, 02:50 PM   #2
kovidgoyal
creator of calibre
Rather than using pubdate (which may not always work), you can simply fetch the non-print URL inside the print_version method, using the index_to_soup method, and extract the URL of the print version from that page.
Old 05-30-2009, 03:07 PM   #3
mobilereader72
Thanks, but the print version link that I'm trying to use is not included on the non-print web page. I know it sounds odd, but the non-print page has changed: they now use Java to generate the printed version. I discovered an alternative method of creating a clean, text-only version, but this alternative method uses the pubdate as part of the URL.

Rather than rewrite my entire recipe, I was wondering if there was a quick and easy way to just change my print version URL so that it includes the pubdate.
Old 05-30-2009, 03:28 PM   #4
kovidgoyal
You'd basically have to re-implement parse_feeds in your recipe. You can just copy and paste it from BasicNewsRecipe and change it a little to extract the pubdate from the feed.
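
For illustration only, here is a rough stand-alone sketch of the idea behind that suggestion: read the pubdate of every feed item once, keep it in a mapping keyed by article link, and look it up when building the print URL. It is not calibre's parse_feeds and uses only the Python 3 standard library. The sample feed, the names pubdates_by_link and print_version_with_dates, and the /news/ path are all made up for the example; in a real recipe the mapping would presumably be stored on the recipe object.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feed; the pubDate format matches the one in the first post
SAMPLE_RSS = """<rss version="2.0"><channel>
<item>
  <link>http://www.somewebsite.com/news/article_12345</link>
  <pubDate>Fri, 29 May 2009 23:31 -0400</pubDate>
</item>
</channel></rss>"""


def pubdates_by_link(rss_text):
    """Map each item's <link> to its pubDate formatted as YYYYMMDD."""
    dates = {}
    for item in ET.fromstring(rss_text).iter('item'):
        link = item.findtext('link')
        pubdate = item.findtext('pubDate')
        if link and pubdate:
            parsed = parsedate_to_datetime(pubdate.strip())
            dates[link.strip()] = parsed.strftime('%Y%m%d')
    return dates


def print_version_with_dates(url, dates):
    """Build the print URL, or return the original URL if no date is known."""
    datestr = dates.get(url)
    if datestr is None or '/article_' not in url:
        return url
    art_id = url.rsplit('/article_', 1)[1]
    # Assumed print URL layout; adjust to the real site's path
    return 'http://www.somewebsite.com/print/' + datestr + '/' + art_id


dates = pubdates_by_link(SAMPLE_RSS)
print(print_version_with_dates('http://www.somewebsite.com/news/article_12345', dates))
```

Articles whose link is missing from the mapping fall back to the normal URL, so an unexpected feed entry does not break the download.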
Old 05-30-2009, 05:52 PM   #5
kiklop74
In this case I would recommend using the get_article_url method instead of print_version. In get_article_url you have access to the XML entry for each article, so you can replace every article URL with its print version:

Code:
    def get_article_url(self, article):
        raw_url = article.get('link', None)
        date_url = article.get('pubDate', None)
        if not raw_url or not date_url:
            # Without both values there is nothing to rewrite
            return raw_url
        # Extract values from date_url here;
        # we assume you have the final version of the date string in datestr
        datestr = "<processed valid value>"
        art_id = raw_url.rsplit('/article_', 1)[1]
        # Build the print URL directly: replacing only the host prefix would
        # leave the original article path appended to the result
        nurl = 'http://www.somewebsite.com/print/' + datestr + '/' + art_id
        return nurl
With such code you do not need print_version at all.

You can see examples of this in various recipes, such as La Prensa, NIN, etc.
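
The "extract values from date_url" step is left open above. One possible way to fill it in, assuming Python 3 and a pubDate string in the RFC 2822 style shown in the first post, is the standard library's email.utils.parsedate_to_datetime. The helper name pubdate_to_datestr and its keep_feed_offset parameter are invented for this sketch, and whether the site builds its print URLs from the local date or the UTC date is an assumption you would need to check against real links.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def pubdate_to_datestr(pubdate, keep_feed_offset=True):
    """Turn an RSS pubDate string into a YYYYMMDD string.

    With keep_feed_offset=True the date is taken in the feed's own
    time zone; otherwise it is converted to UTC first, which can
    shift the date for articles published late in the evening.
    """
    parsed = parsedate_to_datetime(pubdate)
    if not keep_feed_offset:
        parsed = parsed.astimezone(timezone.utc)
    return parsed.strftime('%Y%m%d')


sample = 'Fri, 29 May 2009 23:31 -0400'
print(pubdate_to_datestr(sample))                          # date in the feed's offset
print(pubdate_to_datestr(sample, keep_feed_offset=False))  # date after converting to UTC
```

For the sample date, the two calls give different days because 23:31 at -0400 is already past midnight in UTC, so it is worth comparing against a known print link before choosing one.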