Old 02-17-2011, 02:43 AM   #1
Finbar127
Trying to strip the date from an article URL

Hi Everyone,

I'm trying to create a recipe for a local newspaper. The article URL is formatted like this:

"http://www.mahopacnews.com/Articles-c-2011-02-15-207354.112113-Former-official-to-receive-750-daily-for-interim-position-plus-pension-.html"

The print version URL is formatted like this:

"http://www.mahopacnews.com/LPprintwindow.LASSO?-token.editorialcall=207354.112113"

The problem is that the "-c-2011-02-15-" section of the article URL contains a date that changes from article to article, so url.replace does not seem to work. Is there a workaround for this? I saw a couple of examples using url.split; however, I am new to this and can't seem to get it to work.

I would also like to strip out all the characters after the article number, i.e., "-Former-official-to-receive-750-daily-for-interim-position-plus-pension-.html"

I would appreciate any help that you folks could give me.

Thanks

John
Old 02-17-2011, 03:02 PM   #2
Finbar127
Figured it out

Code:
from calibre.web.feeds.recipes import BasicNewsRecipe

class AdvancedUserRecipe1297969350(BasicNewsRecipe):
    title = u'Mahopac News'
    description = 'Mahopac News Features'
    oldest_article = 2
    max_articles_per_feed = 100

    feeds = [(u' ', u'http://www.mahopacnews.com/rssheadlines.xml')]

    def print_version(self, url):
        # Article URLs look like .../Articles-c-YYYY-MM-DD-<id>-<headline>.html
        base_url = 'http://www.mahopacnews.com/LPprintwindow.LASSO?-token.editorialcall='
        # Splitting on '-' puts the article id at index 5, after
        # '.../Articles', 'c', and the three date parts.
        segments = url.split('-')
        return base_url + segments[5]
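
For anyone who finds this later: the split approach relies on the article number always being the sixth dash-separated piece. A regular expression is one possible alternative that matches the date and the number directly. This is only a sketch outside of calibre; the names to_print_url and ARTICLE_ID are made up for illustration, and the pattern assumes the number always has the form digits.digits right after a YYYY-MM-DD date, as in the example URL above.

```python
import re

BASE_URL = 'http://www.mahopacnews.com/LPprintwindow.LASSO?-token.editorialcall='

# Assumed pattern: a YYYY-MM-DD date followed by an article number
# of the form digits.digits (based on the single example URL).
ARTICLE_ID = re.compile(r'-\d{4}-\d{2}-\d{2}-(\d+\.\d+)')


def to_print_url(url):
    """Return the print-version URL, or the original URL if no id is found."""
    match = ARTICLE_ID.search(url)
    if match is None:
        # Unexpected URL shape: fall back to the regular article page.
        return url
    return BASE_URL + match.group(1)


article = ('http://www.mahopacnews.com/Articles-c-2011-02-15-207354.112113-'
           'Former-official-to-receive-750-daily-for-interim-position-plus-pension-.html')
print(to_print_url(article))
# http://www.mahopacnews.com/LPprintwindow.LASSO?-token.editorialcall=207354.112113
```

Inside a recipe, the body of to_print_url could go in print_version(self, url) in the same way. The fallback means a URL that does not match is fetched as the normal article page instead of raising an IndexError.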

Last edited by Finbar127; 03-02-2011 at 10:21 PM.
All times are GMT -4. The time now is 05:47 PM.