Old 09-24-2010, 12:30 PM   #1
jenden
Junior Member
jenden began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Sep 2010
Device: kindle dx
French JPost recipe please

Could you please create a recipe for the French version of the Jerusalem Post?
http://fr.jpost.com/

Thanks in advance.
Old 09-24-2010, 05:39 PM   #2
TonytheBookworm
Addict
TonytheBookworm is on a distinguished road
 
Posts: 264
Karma: 62
Join Date: May 2010
Device: kindle 2, kindle 3, Kindle fire
Quote:
Originally Posted by jenden View Post
Could you please create a recipe for the French version of the Jerusalem Post?
http://fr.jpost.com/

Thanks in advance.
I do not speak French, but test this and see if it works as you expect. If it does, I will submit it as complete.

Code:
from calibre.web.feeds.news import BasicNewsRecipe


class JerusalemPost(BasicNewsRecipe):
    title = 'Jerusalem post'
    language = 'fr'
    __author__ = 'TonytheBookworm'
    description = 'news'
    publisher = 'jpost'
    category = 'news'
    oldest_article = 30  # days
    max_articles_per_feed = 100
    linearize_tables = True
    no_stylesheets = True
    remove_javascript = True

    masthead_url = 'http://static.jpost.com/JPSITES/images/JFrench/2008/site/jplogo.JFrench.gif'

    # Strip the "print this page" link and the div with class "bot"
    remove_tags = [
        dict(name='a', attrs={'href': ['javascript:window.print()']}),
        dict(name='div', attrs={'class': ['bot']}),
    ]

    feeds = [
        ('NEWS', 'http://fr.jpost.com/servlet/Satellite?collId=1216805762036&pagename=JFrench%2FPage%2FRSS'),
    ]

    # Example of how links should be formatted:
    # org   version = http://fr.jpost.com/servlet/Satellite?pagename=JFrench/JPArticle/ShowFull&cid=1282804806075
    # print version = http://fr.jpost.com/servlet/Satellite?cid=1282804806075&pagename=JFrench%2FJPArticle%2FPrinter
    def print_version(self, url):
        # Leave the URL alone if it carries no article id
        if 'cid=' not in url:
            return url
        # The article id is the value of the cid= parameter;
        # cut it off at the next '&' in case other parameters follow
        idnum = url.split('cid=')[1].split('&')[0]
        return ('http://fr.jpost.com/servlet/Satellite?cid=' + idnum +
                '&pagename=JFrench%2FJPArticle%2FPrinter')
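
For anyone who wants to try the link rewriting without running calibre, here is a minimal standalone sketch of the same idea. It assumes Python 3 and uses the standard library's query-string parser instead of splitting on "cid="; the names `to_print_url` and `PRINT_TEMPLATE` are made up for this illustration and are not part of the recipe or of calibre.

```python
from urllib.parse import urlsplit, parse_qs

# Printer-page pattern taken from the example links in the recipe comments
PRINT_TEMPLATE = ('http://fr.jpost.com/servlet/Satellite?cid={cid}'
                  '&pagename=JFrench%2FJPArticle%2FPrinter')


def to_print_url(url):
    """Return the printer-friendly URL for an article link.

    Hypothetical helper: if the link has no cid parameter,
    it is returned unchanged.
    """
    query = parse_qs(urlsplit(url).query)
    cids = query.get('cid')
    if not cids:
        return url
    return PRINT_TEMPLATE.format(cid=cids[0])


if __name__ == '__main__':
    org = ('http://fr.jpost.com/servlet/Satellite'
           '?pagename=JFrench/JPArticle/ShowFull&cid=1282804806075')
    print(to_print_url(org))
```

Parsing the query string rather than splitting the text means the position of cid in the link does not matter.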
Old 09-25-2010, 11:00 AM   #3
jenden
Good job, TonytheBookworm!
But it only works for the condensed latest news.
Would it be possible to get all the RSS feeds on the page?
Then we would have the full-length articles.
Thanks in advance.
Old 09-25-2010, 11:07 AM   #4
TonytheBookworm
Quote:
Originally Posted by jenden View Post
Good job, TonytheBookworm!
But it only works for the condensed latest news.
Would it be possible to get all the RSS feeds on the page?
Then we would have the full-length articles.
Thanks in advance.
Yeah, I wasn't sure whether that was the condensed version or not, which is why I wanted you to look at it. I will see what I can come up with and post back.
Old 09-25-2010, 11:45 AM   #5
TonytheBookworm

All right, I'm a little confused about what you are calling "all RSS".
I went to the original feed:
http://fr.jpost.com/servlet/Satellit...h%2FPage%2FRSS
and clicked on the first link, which brought me to this page:
http://fr.jpost.com/servlet/Satellit...=1282804806075

I then ran the recipe in calibre and looked at that article.
The content looks exactly the same. (Do you want the sidebars and all that other stuff too?)

I also checked whether all the RSS links listed in the feed are in the fetched feed that calibre generates. They are all there, so I'm confused as to what is missing. Sorry.
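
The check described above, that every link listed in the RSS feed also shows up in what was fetched, can be sketched in a few lines of Python 3. This is only an illustration under assumptions: `item_links`, `missing_links`, and the `SAMPLE` feed with its example.invalid addresses are made up here, and the real feed text and fetched URLs would have to be supplied by hand.

```python
import xml.etree.ElementTree as ET


def item_links(rss_text):
    """Return the <link> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [item.findtext('link', default='').strip()
            for item in root.iter('item')]


def missing_links(rss_text, fetched_urls):
    """Return the feed links that are not among the fetched URLs."""
    fetched = set(fetched_urls)
    return [link for link in item_links(rss_text) if link not in fetched]


# Made-up two-item feed, standing in for the real RSS text
SAMPLE = """<rss version="2.0"><channel><title>sample</title>
<item><title>a</title><link>http://example.invalid/a?cid=1</link></item>
<item><title>b</title><link>http://example.invalid/b?cid=2</link></item>
</channel></rss>"""

if __name__ == '__main__':
    print(missing_links(SAMPLE, ['http://example.invalid/a?cid=1']))
```

An empty result means every article in that one feed was picked up. It says nothing about other feeds on the site that the recipe does not list.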


I see what you're talking about now. At first glance at the main page I only saw one RSS icon. I'll look into it.

Last edited by TonytheBookworm; 09-25-2010 at 12:16 PM. Reason: Never mind, I see what you're talking about now. Yeah, I will take care of it.
Old 09-25-2010, 12:17 PM   #6
TonytheBookworm
Try this code and let me know if this is what you wanted.

Code:
from calibre.web.feeds.news import BasicNewsRecipe


class JerusalemPost(BasicNewsRecipe):
    title = 'Jerusalem post'
    language = 'fr'
    __author__ = 'TonytheBookworm'
    description = 'news'
    publisher = 'jpost'
    category = 'news'
    oldest_article = 30  # days
    max_articles_per_feed = 100
    linearize_tables = True
    no_stylesheets = True
    remove_javascript = True

    masthead_url = 'http://static.jpost.com/JPSITES/images/JFrench/2008/site/jplogo.JFrench.gif'

    # Strip the "print this page" link and the div with class "bot"
    remove_tags = [
        dict(name='a', attrs={'href': ['javascript:window.print()']}),
        dict(name='div', attrs={'class': ['bot']}),
    ]

    # One entry per RSS feed listed on the site: (section title, feed URL)
    feeds = [
        ('NEWS', 'http://fr.jpost.com/servlet/Satellite?collId=1216805762036&pagename=JFrench%2FPage%2FRSS'),
        ('JFrench En route vers la paix', 'http://fr.jpost.com/servlet/Satellite?collId=1216805762201&pagename=JFrench%2FPage%2FRSS'),
        ('JFrench Politique', 'http://fr.jpost.com/servlet/Satellite?collId=1215356737334&pagename=JFrench%2FPage%2FRSS'),
        ('JFrench Securite', 'http://fr.jpost.com/servlet/Satellite?collId=1215356737338&pagename=JFrench%2FPage%2FRSS'),
        ('JFrench Moyen Orient', 'http://fr.jpost.com/servlet/Satellite?collId=1215356737342&pagename=JFrench%2FPage%2FRSS'),
        ('JFrench Diplomatie / Monde', 'http://fr.jpost.com/servlet/Satellite?collId=1215356737346&pagename=JFrench%2FPage%2FRSS'),
        ('JFrench Economie / Sciences', 'http://fr.jpost.com/servlet/Satellite?collId=1215356737358&pagename=JFrench%2FPage%2FRSS'),
        ('JFrench Societe', 'http://fr.jpost.com/servlet/Satellite?collId=1215356737354&pagename=JFrench%2FPage%2FRSS'),
        ('JFrench Opinions', 'http://fr.jpost.com/servlet/Satellite?collId=1215356737350&pagename=JFrench%2FPage%2FRSS'),
        ('JFrench Monde juif', 'http://fr.jpost.com/servlet/Satellite?collId=1215356737366&pagename=JFrench%2FPage%2FRSS'),
        ('JFrench Culture / Sport', 'http://fr.jpost.com/servlet/Satellite?collId=1215356737362&pagename=JFrench%2FPage%2FRSS'),
    ]

    # Example of how links should be formatted:
    # org   version = http://fr.jpost.com/servlet/Satellite?pagename=JFrench/JPArticle/ShowFull&cid=1282804806075
    # print version = http://fr.jpost.com/servlet/Satellite?cid=1282804806075&pagename=JFrench%2FJPArticle%2FPrinter
    def print_version(self, url):
        # Leave the URL alone if it carries no article id
        if 'cid=' not in url:
            return url
        # The article id is the value of the cid= parameter;
        # cut it off at the next '&' in case other parameters follow
        idnum = url.split('cid=')[1].split('&')[0]
        return ('http://fr.jpost.com/servlet/Satellite?cid=' + idnum +
                '&pagename=JFrench%2FJPArticle%2FPrinter')
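
Since the feed URLs in this recipe differ only in their collId value, the list could also be generated from a table of section titles and ids. The following is a rough sketch of that idea, not what was submitted: `FEED_URL`, `SECTIONS`, and `build_feeds` are names invented for this example, and only three of the sections are shown.

```python
# URL pattern shared by the feeds in the recipe; only collId varies
FEED_URL = ('http://fr.jpost.com/servlet/Satellite?collId={coll_id}'
            '&pagename=JFrench%2FPage%2FRSS')

# (section title, collId) pairs copied from the recipe's feeds list
SECTIONS = [
    ('NEWS', '1216805762036'),
    ('JFrench En route vers la paix', '1216805762201'),
    ('JFrench Politique', '1215356737334'),
]


def build_feeds(sections):
    """Turn (title, collId) pairs into (title, feed URL) pairs."""
    return [(title, FEED_URL.format(coll_id=coll_id))
            for title, coll_id in sections]


if __name__ == '__main__':
    for title, url in build_feeds(SECTIONS):
        print(title, url)
```

Inside a recipe class, the result would be assigned to feeds, for example `feeds = build_feeds(SECTIONS)` with both names defined at module level. Adding a section then means adding one pair to the table.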
Old 09-25-2010, 01:56 PM   #7
jenden
It works perfectly.
Thanks a lot.
Old 09-25-2010, 02:02 PM   #8
TonytheBookworm
Quote:
Originally Posted by jenden View Post
It works perfectly.
Thanks a lot.
No problem, glad I could help.
Old 09-25-2010, 02:08 PM   #9
TonytheBookworm
Completed Code with Favicon

Bundled code and favicon for submission.
Attached Files
File Type: rar french jpost.rar (1.2 KB, 605 views)
Old 09-25-2010, 02:53 PM   #10
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
 
Posts: 43,850
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Added