MobileRead Forums > E-Book Software > Calibre > Recipes

Old 11-18-2010, 03:09 AM   #1
marbs
Can't seem to keep my articles!

I whipped up this recipe for globes.co.il.

I have everything working, but the articles come out wider than the PDF output page. I have no idea what is going on. Any ideas?
The code:
Code:
from calibre.web.feeds.news import BasicNewsRecipe
import re

class AdvancedUserRecipe1283848012(BasicNewsRecipe):
    description   = 'This is a recipe for Globes.co.il.'
    cover_url      = 'http://www.the7eye.org.il/SiteCollectionImages/BAKTANA/arye_avnery_010709_377.jpg'
    title          = u'Globes1'
    language              = 'he'
    __author__ = 'marbs'
    extra_css='img {max-width:100%;} body{direction: rtl;max-width:100%;} title{direction: rtl;} .article_description{direction: rtl;} a.article{direction: rtl;max-width:100%;} .calibre_feed_description{direction: rtl;}'
    simultaneous_downloads = 5
    remove_javascript     = True
    timefmt        = '[%a, %d %b, %Y]'
    oldest_article = 1
    max_articles_per_feed = 3


    feeds          = [(u'שוק ההון', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=585'),
                           (u'נדל"ן', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=607'),
                           (u'וול סטריט ושווקי העולם', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=1225'),
                           (u'ניתוח טכני', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=1294'),
                           (u'היי טק', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=594'),
                           (u'נתח שוק וצרכנות', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=821'),
                           (u'דין וחשבון', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=829'),
                           (u'רכב', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=3220'),
                           (u'דעות', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=845'),
                           (u'קניון המניות - טור שבועי', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=3175'),
                           (u'סביבה', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=3221')]

    def print_version(self, url):
        split1 = url.split("=")
        print_url = 'http://www.globes.co.il/serve/globes/printwindow.asp?did=' + split1[1]
        return print_url


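    def preprocess_html(self, soup):
        # Drop the black header row and the row before it, if the page has them.
        header = soup.find('tr', attrs={'bgcolor': 'black'})
        if header is not None:
            previous = header.findPrevious('tr')
            if previous is not None:
                previous.extract()
            header.extract()
        return soup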


An example of the print page is here.

Thank you.
Old 11-18-2010, 11:54 AM   #2
kovidgoyal
creator of calibre
You need to remove non-reflowable markup such as tables, pre tags, or any other tags with an explicitly specified width.
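
To make the advice concrete, here is a minimal sketch of what stripping fixed-width attributes does to a page. It uses only the Python 3 standard library and is not calibre's own code; inside a recipe the same effect is normally requested with the remove_attributes setting. The names WidthStripper and strip_fixed_widths are hypothetical, chosen for this illustration, and the sketch drops comments and doctype declarations for brevity.

```python
import html
from html.parser import HTMLParser

# Attributes that can pin an element to a fixed size (assumed list for this sketch).
FIXED_SIZE_ATTRS = {"width", "style"}


class WidthStripper(HTMLParser):
    """Re-emit HTML while dropping the attributes in FIXED_SIZE_ATTRS."""

    def __init__(self):
        # convert_charrefs=False so entities are passed through untouched.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _render(self, tag, attrs, close=""):
        kept = [(k, v) for k, v in attrs if k not in FIXED_SIZE_ATTRS]
        text = "".join(
            " %s" % k if v is None else ' %s="%s"' % (k, html.escape(v, quote=True))
            for k, v in kept
        )
        return "<%s%s%s>" % (tag, text, close)

    def handle_starttag(self, tag, attrs):
        self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._render(tag, attrs, " /"))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def strip_fixed_widths(markup):
    """Return markup with width and style attributes removed from every tag."""
    parser = WidthStripper()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


sample = '<table width="800"><tr><td style="width:400px" class="x">hi</td></tr></table>'
print(strip_fixed_widths(sample))
```

Once the width and style attributes are gone, the table is free to shrink to the page width instead of forcing the layout to 800 pixels. Tables and pre tags can still resist reflow on their own, so removing those tags outright may also be needed on some pages.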
Old 11-18-2010, 12:56 PM   #3
marbs
Got it

Thanks, Kovid. Adding remove_attributes for width and style fixed the page width.
Ready to be built in:
Code:
from calibre.web.feeds.news import BasicNewsRecipe
import re

class AdvancedUserRecipe1283848012(BasicNewsRecipe):
    description   = 'This is a recipe for Globes.co.il.'
    cover_url      = 'http://www.the7eye.org.il/SiteCollectionImages/BAKTANA/arye_avnery_010709_377.jpg'
    title          = u'Globes'
    language              = 'he'
    __author__ = 'marbs'
    extra_css='img {max-width:100%;} body{direction: rtl;max-width:100%;} title{direction: rtl;} .article_description{direction: rtl;} a.article{direction: rtl;max-width:100%;} .calibre_feed_description{direction: rtl;}'
    simultaneous_downloads = 5
    remove_javascript     = True
    timefmt        = '[%a, %d %b, %Y]'
    oldest_article = 1
    max_articles_per_feed = 100
    remove_attributes = ['width','style']


    feeds          = [(u'שוק ההון', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=585'),
                           (u'נדל"ן', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=607'),
                           (u'וול סטריט ושווקי העולם', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=1225'),
                           (u'ניתוח טכני', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=1294'),
                           (u'היי טק', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=594'),
                           (u'נתח שוק וצרכנות', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=821'),
                           (u'דין וחשבון', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=829'),
                           (u'רכב', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=3220'),
                           (u'דעות', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=845'),
                           (u'קניון המניות - טור שבועי', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=3175'),
                           (u'סביבה', u'http://www.globes.co.il/webservice/rss/rssfeeder.asmx/FeederNode?iID=3221')]

    def print_version(self, url):
        split1 = url.split("=")
        print_url = 'http://www.globes.co.il/serve/globes/printwindow.asp?did=' + split1[1]
        return print_url


    def preprocess_html(self, soup):
        # Drop the black header row and the row before it, if the page has them.
        header = soup.find('tr', attrs={'bgcolor': 'black'})
        if header is not None:
            previous = header.findPrevious('tr')
            if previous is not None:
                previous.extract()
            header.extract()
        return soup

    def fixChars(self, string):
        # Replace the Windows-1252 left single quote (\x91) with U+2018.
        fixed = re.sub(u"\x91", u"\u2018", string)
        return fixed
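
One fragile spot in the recipe above is print_version, which splits the URL on "=" and so picks up extra text whenever the article link carries more than one query parameter. The sketch below shows one possible alternative that reads the did parameter directly. It is written for Python 3, it is a standalone function rather than a recipe method, and the article URL in the usage lines is a hypothetical example, not one taken from the feeds.

```python
from urllib.parse import urlparse, parse_qs

# Print-page prefix copied from the recipe above.
PRINT_URL = 'http://www.globes.co.il/serve/globes/printwindow.asp?did='


def print_version(url):
    """Map an article URL to its print page using the 'did' query parameter."""
    query = parse_qs(urlparse(url).query)
    # Query keys are case-sensitive in parse_qs, so normalise them first.
    ids = {key.lower(): values for key, values in query.items()}.get('did')
    if not ids:
        # No document id found: fall back to the original page.
        return url
    return PRINT_URL + ids[0]


# Hypothetical article link with a second parameter after the document id.
article = 'http://www.globes.co.il/news/article.aspx?did=1000601234&fid=585'
print(print_version(article))
# The split("=") approach would yield '1000601234&fid' for the same link.
print(article.split("=")[1])
```

Inside the recipe the same body would go in the print_version(self, url) method. Falling back to the original URL is a design choice for this sketch; the recipe above would instead raise an IndexError on a link with no "=" in it.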

Last edited by kovidgoyal; 11-18-2010 at 01:08 PM.