Old 06-17-2010, 09:20 PM   #2131
DoctorOhh
US Navy, Retired
DoctorOhh ought to be getting tired of karma fortunes by now.
 
Posts: 9,897
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Originally Posted by lordvetinari2 View Post
I downloaded the news to LRF instead and noticed that the "Next" text did not even have link formatting in Calibre, while in ePUB it did have link formatting but didn't work. It's as if there is no link at all, rather than an inactive link.
I've had this happen to me in the past. Unfortunately I can't remember what in the downloaded HTML caused it. I have very little HTML experience, but by looking at the test HTML I was able to quickly guess what in the downloaded page was getting in the way.

Just know that it is caused by something from the page you're grabbing and not a bug in calibre.
Old 06-18-2010, 04:40 AM   #2132
rty
Zealot
rty got an A in P-Chem.
 
Posts: 108
Karma: 6066
Join Date: Apr 2010
Location: Singapore
Device: iPad Air, Kindle DXG, Kindle Paperwhite
Can anyone please help me get the print version of this multipage Psychology Today article?

The article link is
The print version of this link is
The problem here is that the print version has a number (21819) that doesn't appear anywhere in the original link.
Old 06-18-2010, 06:03 AM   #2133
rty
Zealot
rty got an A in P-Chem.
 
Posts: 108
Karma: 6066
Join Date: Apr 2010
Location: Singapore
Device: iPad Air, Kindle DXG, Kindle Paperwhite
Can anybody help me figure out why the multipage part doesn't work in the following recipe?

Code:

from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1275708473(BasicNewsRecipe):
    title = u'My Psychology Today'
    # oldest_article = 7
    max_articles_per_feed = 100
    remove_javascript = True
    use_embedded_content = False
    no_stylesheets = True
    language = 'en'

    keep_only_tags = [dict(name='div', attrs={'id': ['contentColumn', 'content-content']})]
    remove_tags = [
        dict(name='div', attrs={'id': 'advertisement advertisement-zone-51'}),
        dict(name='div', attrs={'id': 'block-td_search_160'}),
        dict(name='div', attrs={'id': 'block-cam_search_160'}),
        dict(name='div', attrs={'class': 'article-sub-meta'}),
        dict(name='div', attrs={'class': 'article-terms meta'}),
    ]
    # remove_tags_after = dict(id=['rightColumn'])
    feeds = [(u'Contents', u'http://www.psychologytoday.com/articles/index.rss')]

    def append_page(self, soup, appendtag, position):
        # Follow the "next page" link and splice that page's content in.
        pager = soup.find('div', attrs={'class': 'pager-next'})
        if pager:
            nexturl = self.INDEX + pager.a['href']
            soup2 = self.index_to_soup(nexturl)
            texttag = soup2.find('div', attrs={'id': 'contentColumn'})
            for it in texttag.findAll(style=True):
                del it['style']
            newpos = len(texttag.contents)
            self.append_page(soup2, texttag, newpos)
            texttag.extract()
            appendtag.insert(position, texttag)

    def postprocess_html(self, soup, first):
        # Turn list markup into plain divs.
        for tag in soup.findAll(name=['ul', 'li']):
            tag.name = 'div'
        return soup


Thank you in advance.

Last edited by rty; 06-19-2010 at 08:48 AM.
Old 06-18-2010, 09:11 AM   #2134
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by rty View Post
The problem here is that the print version has a number (21819) that doesn't appear anywhere in the original link.

Read this.
Old 06-18-2010, 09:13 AM   #2135
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by rty View Post
Can anybody help me figure out why the multipage part doesn't work in the following recipe?
What are you trying to do? What part of your plan doesn't work? What do you see when you run it?
Old 06-18-2010, 12:54 PM   #2136
gambarini
Connoisseur
gambarini began at the beginning.
 
Posts: 98
Karma: 22
Join Date: Mar 2010
Device: IRiver Story, Ipod Touch, Android SmartPhone
Quote:
Originally Posted by rty View Post
Can anyone please help me get the print version of this multipage Psychology Today article?

The article link is
The print version of this link is
The problem here is that the print version has a number (21819) that doesn't appear anywhere in the original link.

It seems very simple; the print link is in the HTML...
Old 06-18-2010, 02:27 PM   #2137
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by gambarini View Post
It seems very simple; the print link is in the HTML...
I don't think I'd call it "very simple." Yes, the print link is on the article page, but there's no easy way with print_version to read the article page and substitute the print version link for the article link. You have to go to the article page before you can get the link, but if you've already gone there, it's too late to substitute the print link for the article link.

There are several solutions, but the standard one is to treat it as an obfuscated link.

It's not hard, but it's different enough that most recipes don't bother. It's often easier to just clean up the non-print version page than it is to get the obfuscated link to the print version.
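
To make the "obfuscated link" idea concrete, here is a minimal standalone sketch of its core: fetch the article page, find the print link inside it, fetch that page, and save it to a file. As far as I know, calibre's hook for this is a recipe with articles_are_obfuscated = True and a get_obfuscated_article(url) method that returns the path of such a file. Everything below is plain standard-library Python so it can run outside calibre; the '/print/' pattern in the href, the example.com URLs, and the names find_print_url and save_print_version are assumptions made for illustration, not something taken from the real site or from any posted recipe.

```python
import os
import tempfile
from html.parser import HTMLParser
from urllib.parse import urljoin


class PrintLinkFinder(HTMLParser):
    """Remember the first <a href> whose path contains '/print/' (assumed pattern)."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get('href') or ''
        if tag == 'a' and self.href is None and '/print/' in href:
            self.href = href


def find_print_url(article_html, article_url):
    """Return the absolute print-version URL, or None if no link is found."""
    finder = PrintLinkFinder()
    finder.feed(article_html)
    return urljoin(article_url, finder.href) if finder.href else None


def save_print_version(article_url, fetch):
    """Fetch the article page, follow its print link, and save that page.

    `fetch` is any callable mapping a URL to HTML text. Returns the path of
    the saved file; falls back to the article page if no print link exists.
    """
    article_html = fetch(article_url)
    print_url = find_print_url(article_html, article_url)
    html = fetch(print_url) if print_url else article_html
    fd, path = tempfile.mkstemp(suffix='.html')
    with os.fdopen(fd, 'w', encoding='utf-8') as f:
        f.write(html)
    return path


# Tiny in-memory "site" (hypothetical URLs) standing in for real HTTP requests.
pages = {
    'http://example.com/articles/nation-wimps':
        '<a href="/about">About</a><a href="/print/21819">Print</a>',
    'http://example.com/print/21819': '<p>full text</p>',
}
saved = save_print_version('http://example.com/articles/nation-wimps', pages.__getitem__)
print(saved)
```

Inside a recipe, the fetch step would presumably go through the recipe's own browser object instead of a dictionary lookup, and the saved file's path is what the obfuscated-article hook hands back to calibre.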
Old 06-19-2010, 12:59 AM   #2138
rty
Zealot
rty got an A in P-Chem.
 
Posts: 108
Karma: 6066
Join Date: Apr 2010
Location: Singapore
Device: iPad Air, Kindle DXG, Kindle Paperwhite
Quote:
Originally Posted by Starson17 View Post
What are you trying to do? What part of your plan doesn't work? What do you see when you run it?
Hi Starson,

Thanks for your quick response. The recipe works fine fetching all articles from http://www.psychologytoday.com/articles/index.rss except articles that span multiple pages, for example, the article titled "Nation of Wimps" http://www.psychologytoday.com/artic...1/nation-wimps.

The recipe was copied from the built-in recipe for AdventureGamer but unfortunately it doesn't work here.

Thanks.
Old 06-19-2010, 02:13 AM   #2139
gambarini
Connoisseur
gambarini began at the beginning.
 
Posts: 98
Karma: 22
Join Date: Mar 2010
Device: IRiver Story, Ipod Touch, Android SmartPhone
Quote:
Originally Posted by Starson17 View Post
I don't think I'd call it "very simple." Yes, the print link is on the article page, but there's no easy way with print_version to read the article page and substitute the print version link for the article link. You have to go to the article page before you can get the link, but if you've already gone there, it's too late to substitute the print link for the article link.

There are several solutions, but the standard one is to treat it as an obfuscated link.

It's not hard, but it's different enough that most recipes don't bother. It's often easier to just clean up the non-print version page than it is to get the obfuscated link to the print version.
Because it is an obfuscated feed, OK...
Otherwise, the print link sits under a tag that is easy to find.
Old 06-19-2010, 02:37 AM   #2140
rty
Zealot
rty got an A in P-Chem.
 
Posts: 108
Karma: 6066
Join Date: Apr 2010
Location: Singapore
Device: iPad Air, Kindle DXG, Kindle Paperwhite
Thanks again to Starson17 for pointing me in the right direction on obfuscated feeds.

Attached is my contribution for "Psychology Today". The current built-in recipe in Calibre doesn't handle multipage articles and doesn't fetch the magazine cover. This one does.
Attached Files
File Type: zip PsychologyToday.zip (860 Bytes, 308 views)
Old 06-19-2010, 03:06 AM   #2141
gambarini
Connoisseur
gambarini began at the beginning.
 
Posts: 98
Karma: 22
Join Date: Mar 2010
Device: IRiver Story, Ipod Touch, Android SmartPhone
I have solved my problem with parse_index and with multiple pages.

This is the new recipe (AutoProve2.zip).

It is updated roughly every month and contains complete car tests.
Each test is a feed, and the "tabs" of each test are the articles in that feed.

P.S.

I am re-posting 3 recipes that I posted earlier but cannot find in the new version of calibre.
Quote:
Originally Posted by gambarini View Post
libero-news.it

Better viewing

Quote:
Originally Posted by gambarini View Post
Recipe for
corrieredellosport (sport daily newspaper)
auto & autosprint (formula 1 and car news)
Attached Files
File Type: zip corriereauto.zip (1.9 KB, 231 views)
File Type: zip libero.zip (938 Bytes, 238 views)
File Type: zip AutoProve2.zip (1.3 KB, 229 views)

Last edited by gambarini; 06-19-2010 at 03:17 AM.
Old 06-19-2010, 03:52 AM   #2142
rty
Zealot
rty got an A in P-Chem.
 
Posts: 108
Karma: 6066
Join Date: Apr 2010
Location: Singapore
Device: iPad Air, Kindle DXG, Kindle Paperwhite
Recipe for "Maximum PC" (updated to handle Multipage issues by using Print Version)
Attached Files
File Type: zip Maximum PC.zip (822 Bytes, 240 views)
Old 06-19-2010, 04:12 AM   #2143
--abc--
Member
--abc-- began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Nov 2009
Device: Kindle 2 (intl.)
Quote:
Originally Posted by kiklop74 View Post
New recipe for Haaretz - Israel newspaper in English:
I get an error with this recipe:

Code:
ERROR: Conversion Error: <b>Failed</b>: Fetch news from Haaretz

Fetch news from Haaretz
Resolved conversion options
calibre version: 0.7.3
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0,
 'book_producer': None,
 'change_justification': 'original',
 'chapter': None,
 'chapter_mark': 'pagebreak',
 'comments': None,
 'cover': None,
 'debug_pipeline': None,
 'disable_font_rescaling': False,
 'dont_compress': False,
 'dont_download_recipe': False,
 'extra_css': None,
 'font_size_mapping': None,
 'footer_regex': '(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'header_regex': '(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x05130B90>,
 'insert_blank_line': False,
 'insert_metadata': False,
 'isbn': None,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0,
 'linearize_tables': False,
 'lrf': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'max_toc_links': 50,
 'no_chapters_in_toc': False,
 'no_inline_navbars': True,
 'no_inline_toc': False,
 'output_profile': <calibre.customize.profiles.KindleOutput object at 0x05130E70>,
 'page_breaks_before': None,
 'password': None,
 'personal_doc': '[PDOC]',
 'prefer_author_sort': False,
 'prefer_metadata_cover': False,
 'preprocess_html': False,
 'pretty_print': False,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': None,
 'remove_first_image': False,
 'remove_footer': False,
 'remove_header': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'rescale_images': False,
 'series': None,
 'series_index': None,
 'tags': None,
 'test': False,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'use_auto_toc': False,
 'username': None,
 'verbose': 2}
InputFormatPlugin: Recipe Input running
Python function terminated unexpectedly
   (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 103, in main
  File "site.py", line 85, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 99, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 24, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 815, in run
  File "site-packages\calibre\customize\conversion.py", line 211, in __call__
  File "site-packages\calibre\web\feeds\input.py", line 104, in convert
  File "site-packages\calibre\web\feeds\news.py", line 702, in download
  File "site-packages\calibre\web\feeds\news.py", line 856, in build_index
  File "site-packages\calibre\web\feeds\news.py", line 1301, in parse_feeds
  File "site-packages\calibre\web\feeds\news.py", line 347, in get_feeds
NotImplementedError
Old 06-19-2010, 05:13 AM   #2144
rty
Zealot
rty got an A in P-Chem.
 
Posts: 108
Karma: 6066
Join Date: Apr 2010
Location: Singapore
Device: iPad Air, Kindle DXG, Kindle Paperwhite
Quote:
Originally Posted by kovidgoyal View Post
Use

conversion_options = {'linearize_tables':True}
How do we remove the width property from tables? This conversion option doesn't seem to remove it.

I am working on a recipe for Forbes India, which someone here requested last week. If I can remove this table width property, I can complete the recipe immediately.

Thanks.
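
On the table-width question, one possible approach is to strip the width attributes from the downloaded HTML before conversion. The sketch below is plain Python and only shows the substitution itself; the name strip_table_widths and the sample markup are made up for illustration, and the pattern is an assumption about how the widths appear (as width="..." attributes on table tags, not inside style attributes).

```python
import re

# Matches a width attribute inside a table-related opening tag.
# Group 1 keeps everything in the tag that precedes the attribute.
WIDTH_ATTR = re.compile(
    r'''(<(?:table|t[dhr]|col)\b[^>]*?)\s+width\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)''',
    re.IGNORECASE,
)


def strip_table_widths(html):
    """Remove width=... attributes from table, tr, td, th and col tags."""
    return WIDTH_ATTR.sub(r'\1', html)


# Hypothetical sample markup with one quoted and one unquoted width.
sample = '<table width="600" border="0"><tr><td class="c" width=50%>x</td></tr></table>'
print(strip_table_widths(sample))
```

In a recipe, a compiled pattern like this could probably be plugged in through preprocess_regexps, which takes (compiled regex, replacement callable) pairs, for example (WIDTH_ATTR, lambda m: m.group(1)). A regular expression is a blunt tool for HTML, so a width that lives in a style attribute, or a tag with unusual quoting, would need separate handling.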
Old 06-19-2010, 07:40 AM   #2145
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by rty View Post
Thanks again to Starson17 for pointing me in the right direction on obfuscated feeds.

Attached is my contribution for "Psychology Today". The current built-in recipe in Calibre doesn't handle multipage articles and doesn't fetch the magazine cover. This one does.
It looks like you solved your problem by using the obfuscated feed that links to the print version instead of trying to fix the multipage problem? So you don't need comments on why multipage didn't work when you copied from AdventureGamer?
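
For anyone who does want the multipage route: in the recipe as posted in #2133, append_page is defined but nothing in the class appears to call it, and self.INDEX is used without ever being assigned, so either of those may be why the extra pages never showed up. The general pattern (fetch a page, look for the "next" link, repeat) can be sketched outside calibre with the standard library alone. The pager-next class comes from the posted recipe; the example.com URLs, the fetch callable, and the names next_page_url and collect_pages are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkFinder(HTMLParser):
    """Record the href of the first <a> inside <div class="pager-next">."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while the parser is inside the pager div
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            if tag == 'div':
                self.depth += 1
            elif tag == 'a' and self.href is None:
                self.href = attrs.get('href')
        elif tag == 'div' and 'pager-next' in (attrs.get('class') or '').split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == 'div':
            self.depth -= 1


def next_page_url(html, base_url):
    """Return the absolute URL of the next page, or None on the last page."""
    finder = NextLinkFinder()
    finder.feed(html)
    return urljoin(base_url, finder.href) if finder.href else None


def collect_pages(start_url, fetch, max_pages=20):
    """Follow "next" links from start_url and return the HTML of every page.

    `fetch` maps a URL to HTML text. The seen set and max_pages guard
    against a pager that loops back on itself.
    """
    pages, seen, url = [], set(), start_url
    while url and url not in seen and len(pages) < max_pages:
        seen.add(url)
        html = fetch(url)
        pages.append(html)
        url = next_page_url(html, url)
    return pages


# Tiny in-memory "site" (hypothetical URLs) standing in for real HTTP requests.
site = {
    'http://example.com/a': '<p>one</p><div class="pager-next"><a href="/a?page=1">next</a></div>',
    'http://example.com/a?page=1': '<p>two</p>',
}
print(len(collect_pages('http://example.com/a', site.__getitem__)))
```

A recipe version would do the same walk with soup objects, typically starting it from a hook such as preprocess_html so that each article actually triggers the page-following step.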