View Single Post
Old 09-24-2010, 05:39 PM   #2
TonytheBookworm
Addict
TonytheBookworm is on a distinguished road
 
TonytheBookworm's Avatar
 
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by jenden View Post
Could you please create a recipe for the french version of the Jerusalem post.
http://fr.jpost.com/

Thanks in advance.
I do not speak French but test this and see if it works like you expect. If it does then I will submit it as complete.
Spoiler:

Code:
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup, re
class JerusalemPost(BasicNewsRecipe):
    title = 'Jerusalem post'
    language = 'fr'
    __author__ = 'TonytheBookworm'
    description = 'news'
    publisher = 'jpost'
    category = 'news'
    oldest_article = 30
    max_articles_per_feed = 100
    linearize_tables = True
    no_stylesheets = True
    remove_javascript   = True
    
    masthead_url = 'http://static.jpost.com/JPSITES/images/JFrench/2008/site/jplogo.JFrench.gif'
   
    remove_tags = [
                   dict(name='a', attrs={'href':['javascript:window.print()']}),
                   dict(name='div', attrs={'class':['bot']}),
       
                   ]
    
    feeds          = [
                      ('NEWS', 'http://fr.jpost.com/servlet/Satellite?collId=1216805762036&pagename=JFrench%2FPage%2FRSS')
                      
                    ]
    def print_version(self, url):
        split1 = url.split("cid=")
        #for testing only -------
        #print 'SPLIT IS: ', split1
        #print 'ORG URL IS: ', url
        #---------------------------
        idnum = split1[1] # get the actual value of the id article
        #for testing only --------------------
        #print 'the idnum is: ', idnum
        #-------------------------------------- 
        print_url = 'http://fr.jpost.com/servlet/Satellite?cid=' + idnum + '&pagename=JFrench%2FJPArticle%2FPrinter'
        #for testing only -------------------------
        #print 'PRINT URL IS: ', print_url
        #------------------------------------------
        return print_url
      
    #example of how links should be formated
    #--------------------------------------------------------------------------------------------------------------              
    #org   version =  http://fr.jpost.com/servlet/Satellite?pagename=JFrench/JPArticle/ShowFull&cid=1282804806075
    #print version =  http://fr.jpost.com/servlet/Satellite?cid=1282804806075&pagename=JFrench%2FJPArticle%2FPrinter
    #------------------------------------------------------------------------------------------------------------------
TonytheBookworm is offline   Reply With Quote