Old 04-29-2012, 04:07 PM   #1
atordo
Connoisseur
Posts: 89
Karma: 19669
Join Date: Apr 2012
Device: Kindle Touch
Help with Wordpress feed (El Mundo Today)

I'm trying to create a recipe for:
http://www.elmundotoday.com/feed/

I've tweaked several Wordpress recipes found in this very forum, to no avail: the index is always empty (no articles), although downloading the feed manually shows the articles are there.

The feed is compressed with gzip, but I guess this should not be a problem for Calibre?

Below is my last attempt:
Code:
class AdvancedUserRecipe1335711936(BasicNewsRecipe):
    title          = u'El Mundo Today'
    description = 'La actualidad del mañana'
    cover_url = 'http://www.elmundotoday.com/wp-content/themes/EarthlyTouch/images/logo.png'
    oldest_article = 365
    max_articles_per_feed = 100
    auto_cleanup = False
    no_stylesheets = True
    language = 'es_ES'
    use_embedded_content  = True

    feeds  = [(u'El Mundo Today', u'http://www.elmundotoday.com/feed/')]
TIA.
Old 04-29-2012, 07:42 PM   #2
atordo
I should have tried this before posting:

I uncompressed the RSS file, copied it to the data directory of a very simple web server running on my computer, and pointed the feed in the recipe to localhost. Articles now show up.

So it seems gzip compression was indeed the problem.
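
For anyone who wants to check the same thing without a local web server, here is a minimal sketch in plain Python. It only illustrates the idea (spot the gzip magic bytes and decompress); it is not what calibre does internally. The helper name `maybe_gunzip` and the sample feed body are made up for the example.

```python
import gzip

# The first two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def maybe_gunzip(data: bytes) -> bytes:
    """Return data decompressed if it looks like gzip, otherwise unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


# Hypothetical feed body, compressed the way a server might send it.
raw_feed = b"<rss><channel><item><title>Hola</title></item></channel></rss>"
compressed = gzip.compress(raw_feed)

# Both the compressed and the plain body come back as the same XML.
print(maybe_gunzip(compressed) == raw_feed)
print(maybe_gunzip(raw_feed) == raw_feed)
```

If the bytes you downloaded start with `\x1f\x8b`, the body is gzip data rather than XML text, and a feed parser that receives it undecoded has nothing to parse.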
Old 04-29-2012, 11:55 PM   #3
kovidgoyal
creator of calibre
Posts: 26,287
Karma: 5381913
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
If you want calibre to handle gzip transparently, use

Code:
def get_browser(self):
    # Start from calibre's default browser, then let it decode gzip responses
    br = BasicNewsRecipe.get_browser(self)
    br.set_handle_gzip(True)
    return br
That should do the trick, though I haven't tested it.
Old 04-30-2012, 07:25 AM   #4
atordo
Thanks Kovid, that did it. Below is a working version of the recipe in case someone else is interested in the site:

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class ElMundoTodayRecipe(BasicNewsRecipe):
    title = 'El Mundo Today'
    description = u'La actualidad del mañana'
    category = 'Noticias, humor'
    cover_url = 'http://www.elmundotoday.com/wp-content/themes/EarthlyTouch/images/logo.png'
    oldest_article = 30
    max_articles_per_feed = 30
    auto_cleanup = True
    no_stylesheets = True
    language = 'es_ES'
    use_embedded_content  = True

    feeds = [('El Mundo Today', 'http://www.elmundotoday.com/feed/')]

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        br.set_handle_gzip(True)
        return br
Old 06-06-2012, 12:18 AM   #5
atordo
Updated version with better page parsing and some CSS for eye candy.

Code:
import re
from calibre.web.feeds.news import BasicNewsRecipe

class ElMundoTodayRecipe(BasicNewsRecipe):
    title = 'El Mundo Today'
    description = u'La actualidad del mañana'
    category = 'Noticias, humor'
    cover_url = 'http://www.elmundotoday.com/wp-content/themes/EarthlyTouch/images/logo.png'
    oldest_article = 30
    max_articles_per_feed = 60
    auto_cleanup = False
    no_stylesheets = True
    remove_javascript = True
    language = 'es_ES'
    use_embedded_content  = False

    preprocess_regexps = [
        (re.compile(r'</title>.*<!--Begin Article Single-->', re.DOTALL),
        lambda match: '</title><body>'),
        #(re.compile(r'^\t{5}<a href.*Permanent Link to ">$'), lambda match: ''),
        #(re.compile(r'\t{5}</a>$'), lambda match: ''),
        (re.compile(r'<div class="social4i".*</body>', re.DOTALL),
        lambda match: '</body>'),
    ]

    keep_only_tags = [
        dict(name='div', attrs={'class':'post-wrapper'})
    ]

    remove_attributes = [ 'href', 'title', 'alt' ]

    extra_css = '''
        .antetitulo{font-variant:small-caps; font-weight:bold} .articleinfo{font-size:small}
        img{margin-bottom:0.4em; display:block; margin-left:auto; margin-right:auto}
    '''

    feeds = [('El Mundo Today', 'http://www.elmundotoday.com/feed/')]

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        br.set_handle_gzip(True)
        return br
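
To see what the two active `preprocess_regexps` rules do, here is a small standalone sketch that applies them to a made-up page. The `apply_regexps` helper and the sample HTML are hypothetical; the helper only approximates how I understand calibre applies the rules (each pattern substituted in order over the raw HTML).

```python
import re

# The two active rules from the recipe above.
preprocess_regexps = [
    (re.compile(r'</title>.*<!--Begin Article Single-->', re.DOTALL),
     lambda match: '</title><body>'),
    (re.compile(r'<div class="social4i".*</body>', re.DOTALL),
     lambda match: '</body>'),
]


def apply_regexps(html, rules):
    """Apply each (pattern, replacement function) pair in order."""
    for pattern, func in rules:
        html = pattern.sub(func, html)
    return html


# Made-up page: site header, article marker, article body, social buttons.
sample = (
    '<html><head><title>Demo</title></head>\n'
    '<body><div id="header">menu</div>\n'
    '<!--Begin Article Single-->'
    '<div class="post-wrapper"><p>Texto</p></div>\n'
    '<div class="social4i">share buttons</div>\n'
    '</body></html>'
)

cleaned = apply_regexps(sample, preprocess_regexps)
print(cleaned)
```

The first rule drops everything between the title and the article marker, and the second drops everything from the social buttons up to the closing body tag. `re.DOTALL` is what lets `.*` run across line breaks.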
Old 06-06-2012, 03:23 AM   #6
Terisa de morgan
Wizard
Posts: 2,346
Karma: 1996742
Join Date: Jun 2009
Location: Madrid, Spain
Device: Kobo Aura, Kobo Mini, iPhone, iPad, Galaxy Tab3 8
Thank you, I'm interested (and always surprised by their news).
Old 06-06-2012, 02:49 PM   #7
atordo
Glad to know it's of use to someone else.