|
|
#16 |
|
creator of calibre
Posts: 45,617
Karma: 28549044
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Code:
def preprocess_raw_html(self, raw_html, url):
    # Dump the downloaded HTML to a temp file for inspection, then return it unchanged
    with open('/path/to/tempfile.html', 'wb') as f:
        f.write(raw_html.encode('utf-8'))
    return raw_html
|
|
|
|
|
|
#17 |
|
Big Poppa
Posts: 110
Karma: 10
Join Date: Jul 2010
Device: Nook
|
Yep, the raw HTML looks the same as it does in my browser. There's a multiline comment in the head tag, but the other six are just plain comments, generally with a single space inside. So it seems to be a bug somewhere in BeautifulSoup, which isn't parsing the comments properly? (Or are multiline comments in the head not to spec?)
Either way, this regex doesn't do the job. Where is the final HTML generated? Just in the EPUB, you mean?
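For what it's worth, one common reason a comment-stripping regex misses a multiline comment is that `.` does not match newlines unless `re.DOTALL` is set. The regex tried earlier in the thread isn't shown here, so this may not be the actual cause. A minimal sketch, assuming the goal is simply to drop every comment before parsing; `strip_comments` is a hypothetical helper name, not part of calibre:

```python
import re

# Non-greedy, so each comment is removed on its own.
# re.DOTALL lets '.' span newlines, which multiline comments need.
COMMENT_RE = re.compile(r'<!--.*?-->', re.DOTALL)

def strip_comments(raw_html):
    """Return raw_html with every <!-- ... --> comment removed."""
    return COMMENT_RE.sub('', raw_html)

# Made-up sample: one multiline comment in the head, one plain comment in the body
sample = ('<head><!--\n  ascii\n  art\n--><title>t</title></head>'
          '<body><!-- x --><p>hi</p></body>')
cleaned = strip_comments(sample)
```

Something like this could be called from `preprocess_raw_html` before returning `raw_html`.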
|
|
|
|
|
|
|
|
#18 |
|
Big Poppa
Posts: 110
Karma: 10
Join Date: Jul 2010
Device: Nook
|
Through trial and error, I found that removing the head tag manually seems to fix it. Not sure if it's a bug or just bad HTML on the NYT's part, but the multiline ASCII art comment is what kills it.
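In case it helps anyone else, here is a rough sketch of doing that head removal inside the recipe rather than by hand. It assumes it's acceptable to lose everything in the head (title, stylesheet links), and `drop_head` is a hypothetical helper name, not a calibre function. Only the `preprocess_raw_html(self, raw_html, url)` signature comes from the code posted above.

```python
import re

# Matches the whole <head>...</head> element, including attributes and newlines.
# \b keeps it from matching <header>; IGNORECASE covers <HEAD>.
HEAD_RE = re.compile(r'<head\b[^>]*>.*?</head\s*>', re.DOTALL | re.IGNORECASE)

def drop_head(raw_html):
    """Remove the first <head> element (and any comments inside it)."""
    return HEAD_RE.sub('', raw_html, count=1)

# In a recipe this would be a method on the recipe class.
def preprocess_raw_html(self, raw_html, url):
    return drop_head(raw_html)
```

This is a plain regex, so it will misfire if the text `</head>` appears inside a comment within the head. It's meant as a workaround, not a proper parse.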
|
|
|
|