![]() |
#1366 |
Bookworm
Posts: 5
Karma: 10
Join Date: Dec 2009
Location: Quito, Ecuador
Device: BeBook
|
Hi,
I would like to be able to download the articles from http://www.elcomercio.com/ which is an Ecuadorian newspaper. Thank you very much. Best regards, Felipe |
![]() |
![]() |
#1367 |
Connoisseur
Posts: 51
Karma: 10
Join Date: Dec 2008
Location: Germany
Device: SONY PRS-500
|
ReadItLater Recipe Not Working for Me
kiklop74,
Thanks for letting us know about that instapaper.com site. Very interesting. I visited the site, created an account, and saved some articles to read later. However, when I run your recipe in Calibre, I get an error. Here it is:

Code:
    ERROR: Conversion Error: <b>Failed</b>: Fetch news from Read It Later

    Fetch news from Read It Later
    Resolved conversion options
    {'asciiize': False, 'author_sort': None, 'authors': None, 'base_font_size': 0, 'book_producer': None,
     'chapter': None, 'chapter_mark': 'pagebreak', 'comments': None, 'cover': None, 'debug_pipeline': None,
     'disable_font_rescaling': False, 'dont_download_recipe': False, 'dont_justify': True,
     'enable_autorotation': False, 'extra_css': None, 'font_size_mapping': None,
     'footer_regex': '(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s* <a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)' ,
     'header': False, 'header_format': '%t by %a',
     'header_regex': '(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s* <a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)' ,
     'header_separation': 0, 'input_encoding': None,
     'input_profile': <calibre.customize.profiles.InputProfile object at 0x02BCF970>,
     'insert_blank_line': False, 'insert_metadata': False, 'isbn': None, 'language': None,
     'level1_toc': None, 'level2_toc': None, 'level3_toc': None, 'line_height': 0,
     'linearize_tables': False, 'lrf': False, 'margin_bottom': 5.0, 'margin_left': 5.0,
     'margin_right': 5.0, 'margin_top': 5.0, 'max_toc_links': 50, 'minimum_indent': 0,
     'mono_family': None, 'no_chapters_in_toc': False, 'no_inline_navbars': False,
     'output_profile': <calibre.customize.profiles.SonyReaderOutput object at 0x02BCFB50>,
     'page_breaks_before': None, 'password': '', 'prefer_metadata_cover': False,
     'preprocess_html': False, 'pretty_print': False, 'publisher': None, 'rating': None,
     'read_metadata_from_opf': None, 'remove_first_image': False, 'remove_footer': False,
     'remove_header': False, 'remove_paragraph_spacing': False,
     'remove_paragraph_spacing_indent_size': 1.5, 'render_tables_as_images': False,
     'sans_family': None, 'series': None, 'series_index': None, 'serif_family': None, 'tags': None,
     'test': False, 'text_size_multiplier_for_rendered_tables': 1.0, 'title': None,
     'title_sort': None, 'toc_filter': None, 'toc_threshold': 6, 'use_auto_toc': False,
     'username': '', 'verbose': 2, 'wordspace': 2.5}
    InputFormatPlugin: Recipe Input running
    Python function terminated unexpectedly
      'NoneType' object has no attribute 'findAll' (Error Code: 1)
    Traceback (most recent call last):
      File "site.py", line 103, in main
      File "site.py", line 85, in run_entry_point
      File "site-packages\calibre\utils\ipc\worker.py", line 99, in main
      File "site-packages\calibre\gui2\convert\gui_conversion.py", line 24, in gui_convert
      File "site-packages\calibre\ebooks\conversion\plumber.py", line 745, in run
      File "site-packages\calibre\customize\conversion.py", line 211, in __call__
      File "site-packages\calibre\web\feeds\input.py", line 92, in convert
      File "site-packages\calibre\web\feeds\news.py", line 634, in download
      File "site-packages\calibre\web\feeds\news.py", line 751, in build_index
      File "c:\docume~1\hp_adm~1\locals~1\temp\calibre_0.6.37 _3vz3rn_recipes\recipe0.py", line 52, in parse_index
        for item in ritem.findAll('li'):
    AttributeError: 'NoneType' object has no attribute 'findAll'

I took out my username and password but am positive both were correct. Hope you can help.

XG |
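For anyone reading that traceback: `'NoneType' object has no attribute 'findAll'` means the lookup that was supposed to produce `ritem` found nothing, so the page the recipe fetched probably did not contain the expected list (a login page coming back instead is one possible cause). The guard can be sketched as below. This is only an illustration under stated assumptions: it uses the standard library's `xml.etree.ElementTree` in place of the BeautifulSoup object a recipe works with, it is written for Python 3, and the `list` id and both sample pages are made up.

```python
import xml.etree.ElementTree as ET


def list_items(markup, list_id):
    """Return the text of every <li> in <ul id=list_id>, or [] if that list is absent."""
    root = ET.fromstring(markup)
    container = root.find(".//ul[@id='%s']" % list_id)
    if container is None:
        # Nothing matched. Without this check, container.findall() would raise
        # AttributeError on None, the same failure mode as in the traceback.
        return []
    return [li.text or '' for li in container.findall('li')]


# Hypothetical pages, for illustration only.
READING_LIST = "<html><body><ul id='list'><li>First</li><li>Second</li></ul></body></html>"
LOGIN_PAGE = "<html><body><form><input name='user'/></form></body></html>"

print(list_items(READING_LIST, 'list'))  # ['First', 'Second']
print(list_items(LOGIN_PAGE, 'list'))    # []
```

In a recipe the equivalent check would be an `if ritem is None:` test before the `for` loop, ideally logging what was actually downloaded.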
![]() |
![]() |
#1368 |
Connoisseur
Posts: 51
Karma: 10
Join Date: Dec 2008
Location: Germany
Device: SONY PRS-500
|
kiklop74,
Oops!! I thought you were accessing the instapaper.com site. I see now in your recipe it's another site, readitlater.com. My apologies. Where can I find the recipe that accesses the instapaper.com site? XG |
![]() |
![]() |
#1369 |
Member
Posts: 12
Karma: 42
Join Date: Jan 2010
Device: Kindle
|
XG,
In the list of recipes by language it is under Unknown near the bottom. Denny |
![]() |
![]() |
#1370 |
Connoisseur
Posts: 51
Karma: 10
Join Date: Dec 2008
Location: Germany
Device: SONY PRS-500
|
Will Try The Instapaper.com Recipe
|
![]() |
![]() |
#1371 |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
New recipe for El Comercio:
|
![]() |
![]() |
#1372 |
Bookworm
Posts: 5
Karma: 10
Join Date: Dec 2009
Location: Quito, Ecuador
Device: BeBook
|
|
![]() |
![]() |
#1373 |
Junior Member
Posts: 9
Karma: 10
Join Date: Jan 2010
Device: Sony PRS-505
|
Please help!
I'm trying to write a recipe for http://szmobil.sueddeutsche.de/. I've been working on it for quite a while now, and after a brief success parsing one section, I can't get the login to work with calibre's browser instance.

Code:
    from calibre.web.feeds.recipes import BasicNewsRecipe

    class SzMobilRecipe(BasicNewsRecipe):
        title = u'S\xfcddeutsche Zeitung'
        oldest_article = 7
        max_articles_per_feed = 100
        description = 'Sueddeutsche Zeitung Mobile Ausgabe'
        language = 'de'
        needs_subscription = True

        def get_browser(self):
            br = BasicNewsRecipe.get_browser()
            if self.username is not None and self.password is not None:
                br.open('http://szmobil.sueddeutsche.de/login.php')
                br.select_form(nr=0)
                br['username'] = self.username
                br['password'] = self.password
                br.submit()
            return br

        # feeds = [(u'Streiflicht', u'http://szmobil.sueddeutsche.de/show.php?id=streif')]

        def parse_index(self):
            feeds = []
            for title, url in [
                ('Politik', 'http://szmobil.sueddeutsche.de/show.php?section=Politik')
                # ('Seite Drei', 'http://szmobil.sueddeutsche.de/show.php?section=Seite+drei'),
                # ('Meinungsseite', 'http://szmobil.sueddeutsche.de/show.php?section=Meinungsseite'),
                # ('Panorama', 'http://szmobil.sueddeutsche.de/show.php?section=Panorama'),
                # ('Feuilleton', 'http://szmobil.sueddeutsche.de/show.php?section=Feuilleton'),
                # ('Medien', 'http://szmobil.sueddeutsche.de/show.php?section=Medien'),
                # ('Wissen', 'http://szmobil.sueddeutsche.de/show.php?section=Wissen'),
                # ('Wirtschaft', u'http://szmobil.sueddeutsche.de/show.php?section=Wirtschaft'),
                # ('Sport', u'http://szmobil.sueddeutsche.de/show.php?section=Sport'),
                # ('Muenchen-Bayern', u'http://szmobil.sueddeutsche.de/show.php?section=M%FCnchen%2FBayern')
            ]:
                articles = self.nz_parse_section(url)
                if articles:
                    feeds.append((title, articles))
            return feeds

        def nz_parse_section(self, url):
            soup = self.index_to_soup(url)
            current_articles = []
            for li in soup.findAll('li'):
                a = li.find('a', href = True)
                if a is None:
                    continue
                title = self.tag_to_string(a)
                url = a.get('href', False)
                if not url or not title:
                    continue
                current_articles.append({'title': title, 'url': url, 'description':'', 'date':''})
            return current_articles
|
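One thing that may be worth checking in `nz_parse_section`: the `href` values are passed on exactly as they appear in the page. If the section pages use relative links (an assumption, not verified against the site), they would need to be resolved against the section URL before being handed to calibre. A minimal Python 3 sketch using only the standard library; `absolutize` is a hypothetical helper name and `show.php?id=example` is a made-up href:

```python
from urllib.parse import urljoin


def absolutize(section_url, hrefs):
    """Resolve each href against the URL of the section page it was found on."""
    return [urljoin(section_url, href) for href in hrefs]


section = 'http://szmobil.sueddeutsche.de/show.php?section=Politik'
# Made-up hrefs for illustration: one relative, one already absolute.
links = absolutize(section, ['show.php?id=example', 'http://example.com/a'])
for link in links:
    print(link)
```

Already-absolute links pass through unchanged, so applying this to every `href` should be harmless either way.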
![]() |
![]() |
#1374 |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
There is also one hidden field in that form. Try this:
Code:
    def get_browser(self):
        br = BasicNewsRecipe.get_browser()
        if self.username is not None and self.password is not None:
            br.open('http://szmobil.sueddeutsche.de/login.php')
            br.select_form(nr=0)
            br['username'] = self.username
            br['password'] = self.password
            br['id'] = 'streif'
            br.submit()
        return br
|
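Hidden fields like this one are easy to miss when reading a login page in the browser. A quick way to see every field a form will submit is to dump its `<input>` tags. The following is a standalone Python 3 sketch using the standard library's `html.parser`; the sample form is invented for illustration and is not the real szmobil markup.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect (name, type, value) for every <input> tag in a page."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'input':
            a = dict(attrs)
            # An <input> with no type attribute defaults to a text field.
            self.inputs.append((a.get('name'), a.get('type', 'text'), a.get('value', '')))


def list_inputs(html):
    collector = InputCollector()
    collector.feed(html)
    return collector.inputs


# Hypothetical login form, for illustration only.
SAMPLE_FORM = '''
<form action="login.php" method="post">
  <input type="hidden" name="id" value="streif">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="submit" value="Login">
</form>
'''

hidden = [name for name, kind, value in list_inputs(SAMPLE_FORM) if kind == 'hidden']
print(hidden)  # ['id']
```

Running this over the HTML actually returned by the login URL shows which hidden names and values the server expects to get back.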
![]() |
![]() |
#1375 |
Junior Member
Posts: 9
Karma: 10
Join Date: Jan 2010
Device: Sony PRS-505
|
Hi kiklop74,
many thanks for your reply. I tried it, but it didn't work: I got a ValueError: control 'id' is readonly. So I then tried this:

Code:
    def get_browser(self):
        br = BasicNewsRecipe.get_browser()
        if self.username is not None and self.password is not None:
            br.open('http://szmobil.sueddeutsche.de/login.php')
            br.select_form(nr=0)
            ctl_1 = br.find_control(type = 'hidden', name = 'id')
            ctl_1.readonly = False
            # 1st try
            ctl_1.value = 'streif'
            br['username'] = self.username
            br['password'] = self.password
            # 2nd try
            br['id'] = 'streif'
            br.submit()
        return br

Both attempts gave the same result. The ValueError disappeared, but the downloaded article pages were just the login page again. I'm more and more at a loss with it.

Regards, Gero |
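When the symptom is "the articles are the login page again", it can help to test the HTML that comes back right after `br.submit()` rather than waiting for a full download. A rough heuristic, sketched in Python 3 under the assumption that the login page contains a password input and article pages do not; `looks_like_login_page` is a hypothetical helper and both sample strings are made up:

```python
import re

# Matches an <input> tag whose type attribute is "password" (quoted or unquoted).
PASSWORD_INPUT = re.compile(r'<input\b[^>]*\btype\s*=\s*["\']?password', re.IGNORECASE)


def looks_like_login_page(html):
    """Heuristic: a page that still asks for a password is probably the login form."""
    return PASSWORD_INPUT.search(html) is not None


# Made-up pages for illustration.
login_html = '<form><input name="password" type="password"></form>'
article_html = '<h1>Streiflicht</h1><p>Article text</p>'

print(looks_like_login_page(login_html))
print(looks_like_login_page(article_html))
```

If the response to the submit already looks like a login page, the problem is in the form data; if only the later article fetches do, the session is more likely being lost between requests.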
![]() |
![]() |
#1376 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
My wife reads the Discover Magazine feed and tells me that the Main menu, Section menu and Next links at the top of each article page (deepest pages) are all linking to external locations. Looking at the epub, I see that those links are really relative links, but the html code for each article page includes a base tag of the form:
<base href="http://discovermagazine.com ...>
Removing the base tag seems to fix the problem. Do recipe bugs belong here or in the bug tracker? Thanks.
Last edited by Starson17; 03-02-2010 at 07:11 AM. |
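Until the recipe itself is fixed, the manual workaround of removing the tag can be sketched as a small substitution. This is a standalone Python 3 illustration, not the recipe's code; `strip_base_tags` is a hypothetical helper and the sample page is invented.

```python
import re

# Matches any <base ...> tag, whatever its attributes.
BASE_TAG = re.compile(r'<base\b[^>]*>', re.IGNORECASE)


def strip_base_tags(html):
    """Remove <base> tags so relative links resolve inside the e-book again."""
    return BASE_TAG.sub('', html)


# Made-up article page for illustration.
page = ('<html><head><base href="http://example.com/"><title>t</title></head>'
        '<body><a href="index.html">Section menu</a></body></html>')

cleaned = strip_base_tags(page)
print(cleaned)
```

Inside a recipe, the same compiled pattern could probably be supplied through `preprocess_regexps`, which as far as I know takes (compiled regex, callable) pairs, for example `(BASE_TAG, lambda match: '')`.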
![]() |
![]() |
#1377 |
Little Fuzzy Soldier
Posts: 580
Karma: 5711
Join Date: Sep 2008
Location: Nowhere in particular.
Device: cybook gen3, htc hero, ipaq 214
|
Question: When I download a recipe, calibre adds the "next", "previous" and "section menu" links by itself, right? My problem is that the "section menu" link doesn't point to the table of contents but to some nonexistent label, e.g. index.html#article_0. Is there some way I can make it point to the table of contents of the given feed? Please.
|
![]() |
![]() |
#1378 |
creator of calibre
Posts: 45,377
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Ah yes, <base> tags will screw things up. I'll add some code to strip them automatically in the next release.
@Abelturd: Section Menu links only work in recipes that have multiple sections. |
![]() |
![]() |
#1379 |
Little Fuzzy Soldier
Posts: 580
Karma: 5711
Join Date: Sep 2008
Location: Nowhere in particular.
Device: cybook gen3, htc hero, ipaq 214
|
Custom recipe for ŽIVÉ.sk (zive.sk), a Slovak IT news website.
|
![]() |
![]() |
#1380 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|