11-18-2010, 07:06 AM | #1 |
Junior Member
Posts: 3
Karma: 10
Join Date: Nov 2010
Device: Kindle
|
Article object in preprocess_html
The method preprocess_html() gets a "soup" argument, but I have a situation where the page fetched for an article turns out to be a request for authentication. After authenticating, one is expected to re-fetch the URL. Is the URL in the soup object or, better yet, is the article object (with title, URL, description, and date) available in the BasicNewsRecipe object (i.e., self)? I would love to add more attributes (e.g., byline) to the article object and have them available to preprocess_html() so that I can add more stuff to the fetched article. Thanks!
|
11-18-2010, 11:53 AM | #2 |
creator of calibre
Posts: 43,744
Karma: 22446736
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The soup is just the downloaded HTML. Article objects are stored in BasicNewsRecipe (IIRC under self._fetched_articles or something like that). You have access to both the soup and the article object in populate_article_metadata; however, populate_article_metadata is called after postprocess_html.
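A rough sketch of the populate_article_metadata route, for anyone landing here later. The hook signature (article, soup, first) is the one BasicNewsRecipe defines; everything else is an assumption for illustration: the recipe name, the helper byline_from_soup, the "byline" CSS class, and the idea that setting article.author is where you want the byline to end up. The stub base class only exists so the snippet can be read and run outside calibre.

```python
try:
    # Available when the recipe runs inside calibre.
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the sketch can be exercised outside calibre.
    class BasicNewsRecipe(object):
        pass


def byline_from_soup(soup):
    """Return the byline text, or None if the page has no byline.

    The 'byline' class name is hypothetical; use whatever the site uses.
    """
    tag = soup.find(attrs={'class': 'byline'})
    if tag is None:
        return None
    # Join all text nodes under the tag and trim surrounding whitespace.
    return ''.join(tag.findAll(text=True)).strip()


class BylineRecipe(BasicNewsRecipe):
    title = 'Byline example'  # hypothetical recipe

    def populate_article_metadata(self, article, soup, first):
        # Both the article object (article.url, article.title, ...) and the
        # downloaded soup are available here. Note that this hook runs
        # after postprocess_html, so it cannot feed preprocess_html.
        byline = byline_from_soup(soup)
        if byline:
            article.author = byline
```

Because this hook runs after the HTML has already been processed, it suits enriching the article's metadata but not re-fetching the page after authentication; that part would still have to happen earlier in the download.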
|