#1 |
Enthusiast
Posts: 45
Karma: 10
Join Date: Dec 2010
Device: Kindle 3 Wifi only
Hi All!
In my recipe I was able to find the needed content and extract it via keep_only_tags and remove_tags.

But the articles sit inside an inner table (thead, tr/td), which doesn't look good when I convert the recipe to MOBI for my Kindle: only the first screen is filled with text, and the second page is empty. So I tried to get rid of the unnecessary tags, but without luck. I tried postprocess_html, but it gave me a TypeError. Then I tried preprocess_regexps, but it gave me empty article pages.

The recipe in its current state (which works fine if you are creating e.g. PDF output) can be found here: https://github.com/zsoltika/.hu-reci...0_1_nap.recipe

So my question is: after cleaning up the article HTML via keep_only_tags and remove_tags, how does one replace some tags (in my case table, thead, tfoot, tr, td; but only the tag names, not their contents!) with another tag name (e.g. </?span>)?

And one more thing popped into my mind: wouldn't it be nicer if the various API callables/overrides at http://calibre-ebook.com/user_manual/news_recipe.html were numbered? I mean, I don't get which applies earlier in the process out of ['preprocess_html', 'preprocess_regexps', 'keep_only_tags', 'remove_tags'].

Thanks for any help!
#2 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
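Code:

    'linearize_tables' : True

Code:

    def postprocess_html(self, soup, first_fetch):
        # Rename the table tags in place; their contents are left untouched.
        for t in soup.findAll(['table', 'tr', 'td']):
            t.name = 'div'
        # The hook is expected to hand the processed soup back.
        return soup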
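
For comparison, a regex-only variant of the same rename, in the spirit of the preprocess_regexps attempt from the first post, might look like the sketch below. It is only an illustration under assumptions: `TABLE_TAGS`, `apply_rules` and the sample markup are made-up names, and `apply_rules` merely imitates a substitution pass so the rule can be tried outside calibre. The rule list follows the documented shape of preprocess_regexps: pairs of a compiled pattern and a callable that receives the match object.

```python
import re

# Hypothetical list of tag names to swap; adjust to the markup at hand.
TABLE_TAGS = r'table|thead|tbody|tfoot|tr|th|td'

# Same shape as a recipe's preprocess_regexps: (compiled pattern, callable) pairs.
# Group 1 captures the optional '/' so closing tags stay closing tags.
preprocess_regexps = [
    (re.compile(r'<(/?)(?:%s)\b[^>]*>' % TABLE_TAGS, re.IGNORECASE),
     lambda m: '<%sdiv>' % m.group(1)),
]


def apply_rules(html, rules):
    """Stand-in for the substitution pass (an assumption, not calibre code)."""
    for pattern, repl in rules:
        html = pattern.sub(repl, html)
    return html


sample = '<table class="x"><tr><td>Hello <b>world</b></td></tr></table>'
result = apply_rules(sample, preprocess_regexps)
print(result)
```

Only the opening and closing tags are rewritten, so the text and inline markup inside the cells survive, while attributes on the table tags are discarded. A pattern like this can be fooled by a `>` inside an attribute value, so the soup-based rename is likely the more robust choice.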
#3 |
Enthusiast
Posts: 45
Karma: 10
Join Date: Dec 2010
Device: Kindle 3 Wifi only
|
Worked like a charm, thank you!
Tags |
recipes, replacewith, tables |