Old 03-12-2015, 03:09 PM   #1
Junior Member
nanodreams began at the beginning.
Posts: 1
Karma: 10
Join Date: Mar 2015
Device: Kindle
Post New Recipe -

Hi Guys!

I'd like to ask for help, as I have spent hours trying different approaches to get the content I need and I can't manage it.

In the end, the best approach was the auto_cleanup option, which detects exactly what I want, except that it removes the photo that accompanies each news item.

The RSS feed I'd like to parse is:

I'm using the following code:
        import time
        from calibre.ptempfile import PersistentTemporaryFile
        from calibre.web.feeds.news import BasicNewsRecipe

        class DiarioDeBurgos(BasicNewsRecipe):
            title          = u'Diario de Burgos'
            # Only fetch articles from the last day, at most 10 per feed
            oldest_article = 1
            max_articles_per_feed = 10
            ignore_duplicate_articles = {'url'}
            use_embedded_content = False
            no_stylesheets = True
            # Let calibre extract the article body automatically
            auto_cleanup = True

            feeds          = [
                                (u'Portada', u''),
                             ]

            def get_cover_url(self):
                return ''

I tried to use the 'auto_cleanup_keep' option, but it doesn't seem to work for me. I'd like to keep the div with the id `divImgNoticia0`; the tag looks like this:

<div id="divImgNoticia0" class="GaleriaNoticiaFoto" ...

I tried the following, but with no luck:

auto_cleanup_keep = '//div[@id="divImgNoticia0"]'
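
One way to rule out a typo in the XPath itself is to test it against a sample of the markup outside calibre. The sketch below uses only the Python standard library; the `SAMPLE` page and the `find_by_id` helper are hypothetical, and only the three tags quoted in this post come from the real site. It says nothing about how calibre's auto cleanup treats the element, only whether the expression selects it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for an article page. Only the three
# tags quoted in this post are taken from the real site.
SAMPLE = """
<html><body>
  <div class="Titular">Headline</div>
  <div id="divImgNoticia0" class="GaleriaNoticiaFoto"><img src="photo.jpg"/></div>
  <span id="ctl00_cph2Columnas_lblTextoNoticia">Article text</span>
  <div id="other">Unrelated block</div>
</body></html>
"""


def find_by_id(markup, element_id, tag="div"):
    """Return every <tag> element whose id attribute equals element_id."""
    root = ET.fromstring(markup.strip())
    # ElementTree only accepts relative paths on an element,
    # hence './/' here instead of the '//' used in the recipe.
    return root.findall(f".//{tag}[@id='{element_id}']")


matches = find_by_id(SAMPLE, "divImgNoticia0")
print(len(matches), matches[0].get("class"))
```

If the expression matches here but the photo is still dropped, the problem is more likely in how the cleanup step handles the element than in the XPath syntax.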

I'd really appreciate it if someone could help me identify what I'm doing wrong. The auto_cleanup_keep option looks easy to use, but somehow it isn't working.

The idea is to keep only these tags:

<div class="Titular">
<span id="ctl00_cph2Columnas_lblTextoNoticia">
<div id="divImgNoticia0" class="GaleriaNoticiaFoto" style="cursor:pointer;cursor:hand">

I also tried the 'keep_only_tags' option, but with no luck either: in that case the element 'ctl00_cph2Columnas_lblTextoNoticia' is not included in the output.
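
For reference, a keep_only_tags setup for the three tags listed above might look like the sketch below. The class `DiarioDeBurgosCleanup` is a hypothetical stand-in so the fragment runs without calibre; in a real recipe these would be attributes of the `BasicNewsRecipe` subclass. Turning auto_cleanup off here is an assumption, made so the two cleanup mechanisms do not interact, and is not something confirmed in this thread.

```python
class DiarioDeBurgosCleanup:
    """Hypothetical stand-in; a real recipe would subclass BasicNewsRecipe."""

    # Assumption: disable automatic cleanup while selecting tags by hand.
    auto_cleanup = False

    # Each entry describes one element to keep, by tag name and attributes.
    keep_only_tags = [
        dict(name='div', attrs={'class': 'Titular'}),
        dict(name='div', attrs={'id': 'divImgNoticia0'}),
        dict(name='span', attrs={'id': 'ctl00_cph2Columnas_lblTextoNoticia'}),
    ]


# Quick look at what would be kept.
for spec in DiarioDeBurgosCleanup.keep_only_tags:
    print(spec['name'], spec['attrs'])
```

Note that the text element is a span rather than a div, so its entry needs name='span'.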

Many thanks in advance for your help and time.


Last edited by PeterT; 03-12-2015 at 05:09 PM. Reason: Editted to include [code] . [/code] to make the script easier to read