MobileRead Forums - View Single Post

burbleburble · 06-29-2011, 11:54 AM

@Kovid
That solved one issue. But then it found another. So I just did ''.join(list) first, then parsed from a string instead of a list. For some strange reason it no longer has a problem, even without replacing null bytes.
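
For reference, the two approaches look roughly like this. It's only a minimal sketch: it uses the standard library's xml.etree.ElementTree as a stand-in for lxml (lxml.etree has fromstring and fromstringlist calls with the same names), and the chunks list and helper names are made up for illustration. The null-byte stripping is kept in both paths because the stand-in parser rejects null bytes.

```python
import xml.etree.ElementTree as ET

# Hypothetical chunks, standing in for the list that was handed to the parser.
chunks = ['<root>', '<item>one\x00</item>', '<item>two</item>', '</root>']


def parse_joined(parts):
    """Join the chunks into one string, strip null bytes, then parse."""
    text = ''.join(parts).replace('\x00', '')
    return ET.fromstring(text)


def parse_from_list(parts):
    """Parse straight from the list, stripping null bytes chunk by chunk."""
    return ET.fromstringlist([p.replace('\x00', '') for p in parts])


joined_root = parse_joined(chunks)
list_root = parse_from_list(chunks)

# Both routes should yield the same tree.
print([item.text for item in joined_root])
print([item.text for item in list_root])
```

Joining first costs one extra copy of the whole document, which is probably where the extra time goes on large inputs.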

But it is rather time-consuming to perform this operation first. Oh well. Still, is calibre's version of lxml not up to date? Because mine works fine parsing from a list!

Another question: I'm having trouble saving a page from WebKit. I tried both mainFrame().toHtml() and documentElement.toOuterXml(), and either way it won't save valid XHTML. It always leaves out the '/' on single-tag elements (like 'img', 'br', 'meta'). (Is that even valid in an epub?) This causes serious problems when trying to parse it again with lxml. So, do you know of a way around this issue?
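
One possible workaround, sketched under assumptions: run the saved HTML through a small filter that rewrites the single-tag elements in self-closing form before handing it to an XML parser. The sketch below uses only the standard library; VoidTagCloser, close_void_tags and the sample markup are hypothetical names for illustration, not anything from calibre or Qt. It does not escape stray '&' characters or handle script/style contents specially, so treat it as a starting point only. With lxml, parsing via lxml.html and serializing with etree.tostring(..., method='xml') should give a similar result.

```python
import xml.etree.ElementTree as ET
from html import escape
from html.parser import HTMLParser

# Elements that HTML serializers write without a closing slash.
VOID_TAGS = {'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
             'link', 'meta', 'param', 'source', 'track', 'wbr'}


class VoidTagCloser(HTMLParser):
    """Re-emit HTML markup with void elements written as <tag ... />."""

    def __init__(self):
        # Keep entity references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _attrs(self, attrs):
        # Valueless attributes (e.g. "disabled") get their own name as value.
        return ''.join(
            ' %s="%s"' % (k, escape(v if v is not None else k, quote=True))
            for k, v in attrs)

    def handle_starttag(self, tag, attrs):
        end = ' />' if tag in VOID_TAGS else '>'
        self.out.append('<%s%s%s' % (tag, self._attrs(attrs), end))

    def handle_startendtag(self, tag, attrs):
        self.out.append('<%s%s />' % (tag, self._attrs(attrs)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:  # drop stray closers such as </br>
            self.out.append('</%s>' % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append('&%s;' % name)

    def handle_charref(self, name):
        self.out.append('&#%s;' % name)

    def handle_comment(self, data):
        self.out.append('<!--%s-->' % data)

    def handle_decl(self, decl):
        self.out.append('<!%s>' % decl)


def close_void_tags(html):
    """Return the markup with void elements self-closed."""
    parser = VoidTagCloser()
    parser.feed(html)
    parser.close()
    return ''.join(parser.out)


sample = '<p>Hi<br><img src="a.png" alt="x"></p>'
fixed = close_void_tags(sample)
root = ET.fromstring(fixed)  # parses now that the tags are closed
print(fixed)
```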

Thanks for all the help
