Old 10-06-2012, 06:20 AM   #1
penguin
Junior Member
Posts: 3
Karma: 10
Join Date: Oct 2012
Device: Kindle Touch
Der Spiegel, Recipe Error Code:1

Hey,

I've been using the "Der Spiegel" recipe for the last few weeks without problems, but now I get the following error:

Code:
...
InputFormatPlugin: Recipe Input running
Using custom recipe
Found section  Hausmitteilung
Found article  
Python function terminated unexpectedly
  'href' (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 132, in main
  File "site.py", line 109, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 186, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 25, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 989, in run
  File "site-packages\calibre\customize\conversion.py", line 239, in __call__
  File "site-packages\calibre\ebooks\conversion\plugins\recipe_input.py", line 109, in convert
  File "site-packages\calibre\web\feeds\news.py", line 881, in download
  File "site-packages\calibre\web\feeds\news.py", line 1026, in build_index
  File "<string>", line 81, in parse_index
  File "site-packages\calibre\ebooks\BeautifulSoup.py", line 518, in __getitem__
KeyError: 'href'


I only customized the recipe to add the date to the title (which worked fine for the last few weeks), but now the original recipe also stops with the same error.
I have already done some Google research and tried to change the recipe, but without success. I'm not a pro, so maybe it's just a simple change.
The problem seems to be in this part of parse_index (the traceback ends at the link['href'] lookup):

Code:
...
        feeds = []
        for section in index.findAll('dt'):
            section_title = self.tag_to_string(section).strip()
            self.log('Found section ', section_title)

            articles = []
            for article in section.findNextSiblings(['dd','dt']):
                if article.name == 'dt':
                    break
                link = article.find('a')
                title = self.tag_to_string(link).strip()
                if title in self.empty_articles:
                    continue
                self.log('Found article ', title)
                url = self.PREFIX + link['href']
                articles.append({'title' : title, 'date' : strftime(self.timefmt), 'url' : url})
            feeds.append((section_title,articles))
        return feeds


If I set the URL manually to a single article, the recipe downloads and converts that article without any errors.
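
A minimal sketch of such a single-article test, assuming it is done by letting parse_index return a hard-coded feed list instead of scraping the index page. The helper name single_article_feed, the section title and the article path are hypothetical; only the return shape (a list of (section title, list of article dicts) pairs) is taken from the recipe.

```python
def single_article_feed(url, title='Test article', section='Test'):
    """Build the structure parse_index returns, for one article only."""
    article = {'title': title, 'date': '', 'url': url}
    return [(section, [article])]


# Hypothetical article path, for illustration only
feeds = single_article_feed('http://m.spiegel.de/some/article.html')
```

Inside the recipe, parse_index could temporarily return such a list to check that download and conversion work independently of the index parsing.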

I hope someone has an idea how to get the recipe working again! (By the way, I updated to calibre 0.9.1, but that changed nothing.)

Thanks, Matthias

Last edited by penguin; 10-07-2012 at 07:06 PM.
Old 10-07-2012, 05:39 AM   #2
Keksfabrik
Member
Posts: 14
Karma: 192992
Join Date: Oct 2012
Device: Kindle Paperwhite
Same here: same output, and my recipe is untouched. It seems to be broken.
Old 10-07-2012, 06:14 PM   #3
penguin
It seems to work again with the changes in the recipe below. Only the last one (href=True in the article.find call inside parse_index) is needed to make the recipe work again. The others are changes in style and title.


@Keksfabrik
Maybe you can report whether it fixes your problem.


Code:
#!/usr/bin/env python

__license__   = 'GPL v3'
__copyright__ = '2011, Nikolas Mangold <nmangold at gmail.com>'
'''
spiegel.de
'''
from calibre.web.feeds.news import BasicNewsRecipe
from calibre import strftime
import re  # standard library module, no need to import it via calibre

class DerSpiegel(BasicNewsRecipe):
    title                  = 'Der Spiegel '
    __author__             = 'Nikolas Mangold'
    description            = 'Der Spiegel, Printed Edition. Access to paid content.'
    publisher              = 'SPIEGEL-VERLAG RUDOLF AUGSTEIN GMBH & CO. KG'
    category               = 'news, politics, Germany'
    no_stylesheets         = False
    encoding               = 'cp1252'
    needs_subscription     = True
    remove_empty_feeds     = True
    delay                  = 1
    PREFIX                 = 'http://m.spiegel.de'
    INDEX                  = PREFIX + '/spiegel/print/epaper/index-heftaktuell.html'
    use_embedded_content   = False
    masthead_url = 'http://upload.wikimedia.org/wikipedia/en/thumb/1/17/Der_Spiegel_logo.svg/200px-Der_Spiegel_logo.svg.png'
    language               = 'de'
    publication_type       = 'magazine'
    extra_css              = ' body{font-family: Arial,Helvetica,sans-serif} '
    timefmt = '[%U/%Y]'
    title = title + strftime(timefmt)
    empty_articles = ['Titelbild']
    preprocess_regexps = [
        (re.compile(r'<p>◆</p>', re.DOTALL|re.IGNORECASE), lambda match: '<hr>'),
        ]

    def get_browser(self):
        def has_login_name(form):
            try:
                form.find_control(name="f.loginName")
            except Exception:  # control not found, so this is not the login form
                return False
            else:
                return True

        br = BasicNewsRecipe.get_browser()
        if self.username is not None and self.password is not None:
            br.open(self.PREFIX + '/meinspiegel/login.html')
            br.select_form(predicate=has_login_name)
            br['f.loginName'] = self.username
            br['f.password'] = self.password
            br.submit()
        return br

    remove_tags_before =  dict(attrs={'class':'spArticleContent'})
    remove_tags_after  =  dict(attrs={'class':'spArticleCredit'})

    def parse_index(self):
        soup = self.index_to_soup(self.INDEX)

        cover = soup.find('img', width=248)
        if cover is not None:
            self.cover_url = cover['src']

        index = soup.find('dl')

        feeds = []
        for section in index.findAll('dt'):
            section_title = self.tag_to_string(section).strip()
            self.log('Found section ', section_title)

            articles = []
            for article in section.findNextSiblings(['dd','dt']):
                if article.name == 'dt':
                    break
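                # href=True skips anchors that have no href attribute; those
                # caused the KeyError: 'href' in the traceback from post #1
                link = article.find('a',href=True)
                if link is None:
                    # no usable link in this entry, nothing to download
                    continue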
                title = self.tag_to_string(link).strip()
                if title in self.empty_articles:
                    continue
                self.log('Found article ', title)
                url = self.PREFIX + link['href']
                articles.append({'title' : title, 'date' : strftime(self.timefmt), 'url' : url})
            feeds.append((section_title,articles))
        return feeds
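
For the title change, here is a small sketch of what the timefmt suffix produces. It uses datetime from the standard library instead of calibre's strftime, on the assumption that both format %U (week of the year, weeks starting on Sunday) and %Y the same way; the helper name issue_title is made up for illustration.

```python
import datetime

TIMEFMT = '[%U/%Y]'  # same format string as timefmt in the recipe


def issue_title(base, day):
    """Append week-of-year and year to the base title."""
    return base + day.strftime(TIMEFMT)


print(issue_title('Der Spiegel ', datetime.date(2012, 10, 7)))
```

Note that in the recipe the title is built in the class body, so the date is fixed once, when the recipe class is loaded.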

Last edited by penguin; 10-07-2012 at 07:05 PM.
Old 10-07-2012, 06:41 PM   #4
Keksfabrik
The first attempt failed, but obviously I'm just too tired to copy and paste properly. I'll give it another try tomorrow.
Old 10-07-2012, 06:51 PM   #5
PeterT
Taking a break; Fed up
Posts: 6,679
Karma: 43902502
Join Date: Nov 2007
Location: Toronto
Device: Wife: Touch, Arc, Vox Me: Nexus 7, Glo
Quote:
Originally Posted by penguin
it seems to work again. Add the following highlighted parts. Only the last one is needed to make the recipe work again. The others are changes in style and title.


@Keksfabrik
maybe you can report if it fix your problem

[recipe code as posted in #3]

Remember that indentation is key...
Old 10-08-2012, 07:13 PM   #6
Keksfabrik
The second attempt failed, too. Further investigation is planned for next week.
Old 10-09-2012, 02:44 AM   #7
penguin
Have you tried adding the parts manually instead of copying and pasting?

Maybe try this part first and, if it works, add the other ones:

Code:
link = article.find('a',href=True)
You can also try reinstalling the latest version of calibre.

It takes a bit longer to download, because the recipe now adds images to the articles! But I tried again and it works for me.
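
For anyone wondering why href=True matters, here is a rough stand-alone sketch of the failure and the fix. It does not use calibre or BeautifulSoup: it uses Python 3's standard html.parser as a stand-in, and the sample markup and the AnchorCollector class are made up for illustration. The idea is the same, though: an anchor without an href attribute raises KeyError on ['href'], and restricting the search to anchors that have one avoids it.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the attributes of every <a> tag as a dict."""

    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self.anchors.append(dict(attrs))


# Made-up index entries: the first anchor has no href attribute
SAMPLE = ('<dd><a name="top"></a></dd>'
          '<dd><a href="/spiegel/print/x.html">Titel</a></dd>')

parser = AnchorCollector()
parser.feed(SAMPLE)

try:
    parser.anchors[0]['href']  # what link['href'] did before the fix
    failed = False
except KeyError:
    failed = True

# Rough equivalent of find('a', href=True): keep only anchors with an href
urls = ['http://m.spiegel.de' + a['href']
        for a in parser.anchors if 'href' in a]
```

In the recipe itself the one-line change above is all that is needed; this is only meant to show what goes wrong without it.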

Last edited by penguin; 10-09-2012 at 02:49 AM.
Old 10-09-2012, 04:44 PM   #8
Keksfabrik
It's working again. Thanks a lot! I just didn't manage to save the custom recipe properly...
Tags
calibre, der spiegel, keyerror href

All times are GMT -4.