#1
Connoisseur
Posts: 65
Karma: 10
Join Date: Dec 2010
Device: kindle voyage
Multiple Page Sites
The reusable code for loading multi-page articles is IMHO wrong. It uses preprocess_html, which is applied "after the cleanup as specified by remove_tags etc.", so no cleanup is done on the following pages; at least that is what I see on FAZ.NET. This site in particular offers an 'Article on one page' link, so that could be used before cleanup instead of appending pages, but I'm not sure what the correct way would be: skip_ad_pages (but this accepts a soup and returns HTML, so if the page is OK one cannot use it) or get_article_url (then the article might have to be loaded twice). Couldn't we have a function that takes and returns the same object, soup or text, and is applied right after loading the article content?
#2
creator of calibre
Posts: 45,342
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
preprocess_raw_html()
or, if the URL scheme is fixed, print_version()
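If the one-page URL can be derived from the article URL, the print_version() route might look roughly like the sketch below. The query parameter `singlePage=true`, the helper `single_page_url` and the recipe class name are hypothetical placeholders, not FAZ.NET's actual scheme; the import fallback is only there so the snippet can be tried outside calibre.

```python
from urllib.parse import urlsplit, urlunsplit

try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Fallback stub so the sketch can be exercised outside calibre.
    class BasicNewsRecipe(object):
        pass


def single_page_url(url, param='singlePage=true'):
    # Hypothetical helper: append an assumed "one page" query parameter
    # while keeping any existing query string and fragment intact.
    parts = urlsplit(url)
    query = parts.query + '&' + param if parts.query else param
    return urlunsplit(parts._replace(query=query))


class SinglePageRecipe(BasicNewsRecipe):
    title = 'Single page example'  # hypothetical recipe name

    def print_version(self, url):
        # calibre downloads the URL returned here instead of the
        # article URL, so cleanup runs on the full one-page article.
        return single_page_url(url)
```

Since the one-page version is fetched in place of the paged article, remove_tags and the other cleanup attributes apply to the whole text, and nothing has to be appended afterwards.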
#3
Connoisseur
Posts: 65
Karma: 10
Join Date: Dec 2010
Device: kindle voyage
Yes, thanks. I was actually hoping for a soup version, which would be easier to parse and would avoid duplicate parsing.
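Without a soup at that stage, the link can still be pulled out of the raw HTML with the standard library. A minimal sketch follows, assuming the link is a plain `<a>` tag identified by its text; the names `find_single_page_href` and `_LinkFinder` and the default link text are hypothetical, and the real FAZ.NET markup may differ.

```python
from html.parser import HTMLParser


class _LinkFinder(HTMLParser):
    """Remember the href of the first <a> whose text contains link_text."""

    def __init__(self, link_text):
        super().__init__()
        self.link_text = link_text.lower()
        self.href = None       # result, set once a matching link is seen
        self._current = None   # href of the <a> currently being read
        self._buf = []         # text collected inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == 'a' and self.href is None:
            self._current = dict(attrs).get('href')
            self._buf = []

    def handle_data(self, data):
        if self._current is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._current is not None:
            if self.link_text in ''.join(self._buf).lower():
                self.href = self._current
            self._current = None


def find_single_page_href(raw_html, link_text='Article on one page'):
    # Returns the href of the matching link, or None if there is none.
    finder = _LinkFinder(link_text)
    finder.feed(raw_html)
    return finder.href
```

Inside a recipe, preprocess_raw_html(self, raw_html, url) could call this helper and, when it returns a link, fetch that page and return its HTML instead of raw_html. The fetch itself is left out here, and the page is still parsed again by calibre afterwards, so this does not remove the duplicate parsing.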