I don't understand this.
I mean, in the debug.log (for the previous version of the scraped site):
https://gist.github.com/749781
Here is one section with its articles, as shown in the log:
And later in the log:
Quote:
Could not fetch link http://www.es.hu/2010-12-15_kovetseg...ok-titokparade
Traceback (most recent call last):
File "/usr/lib/calibre/calibre/web/fetch/simple.py", line 428, in process_links
soup = self.get_soup(dsrc)
File "/usr/lib/calibre/calibre/web/fetch/simple.py", line 189, in get_soup
return self.preprocess_html_ext(soup)
File "/tmp/calibre_0.7.34_tmp_WzGqsn/calibre_0.7.34_8gsQ4J_recipes/recipe0.py", line 122, in preprocess_html
url = links['href']
File "/usr/lib/calibre/calibre/ebooks/BeautifulSoup.py", line 518, in __getitem__
return self._getAttrMap()[key]
KeyError: 'href'
http://www.es.hu/2010-12-15_kovetseg...ok-titokparade saved to
Downloading
Fetching http://www.es.hu/2010-12-15_leminosites-
Failed to download article: TIMOTHY GARTON ASH: Követségi táviratok: titokparádé from http://www.es.hu/2010-12-15_kovetseg...ok-titokparade
Traceback (most recent call last):
File "/usr/lib/calibre/calibre/utils/threadpool.py", line 95, in run
(request, request.callable(*request.args, **request.kwds))
File "/usr/lib/calibre/calibre/web/feeds/news.py", line 838, in fetch_article
return self._fetch_article(url, dir, f, a, num_of_feeds)
File "/usr/lib/calibre/calibre/web/feeds/news.py", line 834, in _fetch_article
raise Exception(_('Could not fetch article. Run with -vv to see the reason'))
Exception: Nem lehet a cikket letölteni. Futtassa a -vv paraméterrel a hibaüzenetek megjelenítéséhez
Does this mean that I get the article href in the parse_index part, but the article
then fails to download because of an error in preprocess_html
(since that function contains: url = links['href'])?
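As far as I can tell, the KeyError itself only says that one of the tags reaching that line in preprocess_html has no href attribute at all (a named anchor, for example), so indexing it with ['href'] fails. Below is a minimal sketch of a guard, not the actual recipe code: collect_hrefs is a made-up helper name, the example.com URL is a placeholder, and plain dicts stand in for the tag objects so the snippet runs on its own. It assumes the tags offer .get() the way BeautifulSoup tags do.

```python
def collect_hrefs(anchors):
    """Return the href of every anchor that has one, skipping the rest."""
    hrefs = []
    for a in anchors:
        # a['href'] raises KeyError when the attribute is missing;
        # a.get('href') returns None in that case instead.
        url = a.get('href')
        if url:
            hrefs.append(url)
    return hrefs


# Plain dicts stand in for the tag objects here (assumption for illustration).
anchors = [
    {'href': 'http://example.com/article'},  # ordinary link
    {'name': 'top'},                         # named anchor without href
]
print(collect_hrefs(anchors))
```

Inside the recipe the same idea would mean using links.get('href') and skipping the tag when the result is empty, instead of links['href'].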