06-29-2011, 11:54 AM   #49
burbleburble
@Kovid
That solved one issue, but then it hit another. So I did ''.join(list) first and parsed from a string instead of a list. For some strange reason it no longer has a problem, even without replacing the null bytes.
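
For reference, a minimal sketch of the join-first workaround, assuming the page arrives as a list of string chunks. The helper name join_chunks and the sample data are made up for illustration; the optional null-byte stripping is there only because it came up earlier in the thread.

```python
def join_chunks(chunks, strip_nulls=False):
    """Join a list of text chunks into one string.

    Hypothetical helper: optionally drops NUL characters, which XML
    parsers generally reject.
    """
    text = ''.join(chunks)
    if strip_nulls:
        text = text.replace('\x00', '')
    return text


# Sample data, invented for this example.
chunks = ['<html><body>', '<p>Hello</p>', '</body></html>']
markup = join_chunks(chunks)
```

The resulting string is what then gets handed to the parser (for example lxml.etree.fromstring) in place of the list.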

But it is rather time-consuming to perform this operation first. Oh well. Still, is calibre's version of lxml not up to date? My own install parses from a list just fine!

Another question: I'm having trouble saving a page from WebKit. I tried both mainFrame().toHtml() and documentElement().toOuterXml(), and either way it won't save valid XHTML. It always leaves out the '/' on single-tag elements (like 'img', 'br', 'meta'). (Is that even valid in an epub?) This causes serious problems when I try to parse the output again with lxml. So, do you know of a way around this issue?
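
One crude stopgap, sketched here under assumptions rather than offered as the proper fix, is to post-process the saved markup and add the missing '/' with a regular expression. The function name self_close_void_tags and the tag list are illustrative, and the approach ignores comments, CDATA, and script contents, so it only suits fairly clean pages. Parsing the output with lxml's tolerant HTML parser (lxml.html) and re-serializing as XML is likely the sturdier route.

```python
import re

# Elements that never have a closing tag (partial list, for illustration).
VOID_TAGS = ('area', 'base', 'br', 'col', 'hr', 'img',
             'input', 'link', 'meta', 'param')

# Match an opening void tag plus its attributes. Quoted attribute values
# are consumed whole, so a '>' inside quotes does not end the match.
_VOID_RE = re.compile(
    r'<(%s)\b((?:[^>"\']|"[^"]*"|\'[^\']*\')*?)\s*/?>' % '|'.join(VOID_TAGS),
    re.IGNORECASE,
)


def self_close_void_tags(html):
    """Rewrite <br> as <br/>, <img ...> as <img .../>, and so on."""
    return _VOID_RE.sub(
        lambda m: '<%s%s/>' % (m.group(1), m.group(2)), html)


fixed = self_close_void_tags('<p>a<br>b<img src="x.png"></p>')
```

Tags that are already self-closed pass through unchanged apart from whitespace normalization before the slash.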

Thanks for all the help