|
|
#16 |
|
Enthusiast
Posts: 45
Karma: 10
Join Date: Dec 2010
Device: Kindle 3 Wifi only
|
Thanks for your Answer Kovid!
But what if I want to get the article? Why can't my recipe download it? |
|
|
|
|
|
#17 |
|
creator of calibre
Posts: 45,622
Karma: 28549046
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You have to look and see why the link element has no href on the website, and figure out an alternative.
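One possible alternative, as a minimal sketch: skip anchors that carry no href (named anchors, for example) instead of indexing the attribute directly. This assumes the failing pages contain such anchors, which the thread does not confirm. It uses the standard library's HTMLParser rather than calibre's bundled BeautifulSoup so that it runs on its own; `LinkCollector`, `collect_links` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values, counting <a> tags that have none (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.skipped = 0

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        # dict.get returns None for a missing attribute instead of raising KeyError
        href = dict(attrs).get('href')
        if href:
            self.hrefs.append(href)
        else:
            self.skipped += 1


def collect_links(html):
    """Return (list of hrefs, number of anchors skipped for lacking an href)."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs, parser.skipped


# Illustrative markup only: one named anchor without href, one real link.
sample = '<p><a name="top">Top</a> <a href="/2011-01-16_esse-delendam">Read</a></p>'
hrefs, skipped = collect_links(sample)
print(hrefs, skipped)
```

In a recipe's preprocess_html the analogous guard would be `tag.get('href')` on the BeautifulSoup tag in place of `tag['href']`, followed by a check that the result is not None.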
|
|
|
|
|
|
#18 | ||
|
Enthusiast
Posts: 45
Karma: 10
Join Date: Dec 2010
Device: Kindle 3 Wifi only
|
I can't get it.
I mean, in the debug.log (for the previous version of the scraped site): https://gist.github.com/749781
Here is one section with its articles, as shown in the log:
Quote:
Quote:
|
||
|
|
|
|
|
#19 |
|
Enthusiast
Posts: 45
Karma: 10
Join Date: Dec 2010
Device: Kindle 3 Wifi only
|
Hi All,
With my updated recipe (which still needs refactoring) at https://gist.github.com/749788 I still can't get some of the articles that parse_index recognized as valid feed items, even though I can open them in my browser. Could someone tell me why?
Here is the debug.log: https://gist.github.com/749781
The important part is:
Code:
Could not fetch link http://www.es.hu/2011-01-16_van-e-sajtoszabadsag-magyarorszagon
Traceback (most recent call last):
File "/usr/lib/calibre/calibre/web/fetch/simple.py", line 428, in process_links
soup = self.get_soup(dsrc)
File "/usr/lib/calibre/calibre/web/fetch/simple.py", line 189, in get_soup
return self.preprocess_html_ext(soup)
File "/tmp/calibre_0.7.40_tmp_fNd0OI/calibre_0.7.40_CGdmix_recipes/recipe0.py", line 144, in preprocess_html
url = links['href']
File "/usr/lib/calibre/calibre/ebooks/BeautifulSoup.py", line 518, in __getitem__
return self._getAttrMap()[key]
KeyError: 'href'
http://www.es.hu/2011-01-16_van-e-sajtoszabadsag-magyarorszagon saved to
Downloading
Fetching http://www.es.hu/2011-01-16_esse-delendam
Failed to download article: KOLTAY ANDRÁS Van-e sajtószabadság Magyarországon? from http://www.es.hu/2011-01-16_van-e-sajtoszabadsag-magyarorszagon
Traceback (most recent call last):
File "/usr/lib/calibre/calibre/utils/threadpool.py", line 95, in run
(request, request.callable(*request.args, **request.kwds))
File "/usr/lib/calibre/calibre/web/feeds/news.py", line 846, in fetch_article
return self._fetch_article(url, dir, f, a, num_of_feeds)
File "/usr/lib/calibre/calibre/web/feeds/news.py", line 842, in _fetch_article
raise Exception(_('Could not fetch article. Run with -vv to see the reason'))
Exception: Could not fetch article. Run with -vv to see the reason
|
|
|
|
|
|
#20 |
|
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
I can't, but I can steer you to some debugging. When Calibre requests an article, its request differs from the one a browser makes. The trick is to make the browser look like Calibre, or vice versa. It can be a cookie issue or a header issue (Referer, etc.). Use Live HTTP Headers or Tamper Data in Firefox to inspect and control what the browser sends. Use the browser and header commands in the recipe to see and modify the headers, cookies, and referer that your recipe sends. When the two requests are the same, you will get the same results.
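To make that comparison concrete, here is a rough sketch that diffs two sets of request headers and then builds an opener that sends the browser's values. It uses only the Python standard library so it runs outside Calibre. `diff_headers`, `build_opener_like_browser` and all header values are hypothetical; the real values would come from whatever your browser extension shows.

```python
import urllib.request
from http.cookiejar import CookieJar


def diff_headers(browser_headers, recipe_headers):
    """Return {header: (browser_value, recipe_value)} for headers that differ.

    Header names are compared case-insensitively; a missing header shows as None.
    """
    b = {k.lower(): v for k, v in browser_headers.items()}
    r = {k.lower(): v for k, v in recipe_headers.items()}
    return {
        name: (b.get(name), r.get(name))
        for name in sorted(set(b) | set(r))
        if b.get(name) != r.get(name)
    }


def build_opener_like_browser(user_agent, referer):
    """Build an opener that keeps cookies and sends browser-like headers."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [('User-Agent', user_agent), ('Referer', referer)]
    return opener, jar


# Made-up header values, for illustration only.
from_browser = {'User-Agent': 'Mozilla/5.0', 'Referer': 'http://www.es.hu/'}
from_recipe = {'user-agent': 'Mozilla/5.0'}
print(diff_headers(from_browser, from_recipe))

opener, jar = build_opener_like_browser('Mozilla/5.0', 'http://www.es.hu/')
```

In a recipe the same idea would go into an overridden get_browser: the browser object it returns has an `addheaders` list of (name, value) tuples that can be set the same way.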
|
|
|
|