08-11-2008, 01:11 PM   #18
llasram
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
Quote:
Originally Posted by bkilian
Examples of failure:
Ticket #938 (Which makes my entirely automated book conversion process tricky at best)
Ticket #939 (In which a book that I spent ages scanning maps and cleaning them up suddenly has no maps)
I've pushed fixes for these bugs up to my bug-fix branch -- Kovid should pick them up for the next release unless he's unhappy with them for some reason.

Quote:
Originally Posted by bkilian
And in general, it becomes _impossible_ to edit the resulting OPF and HTML files, which I tend to do a lot, when they're all in one line. ConvertLit appears to have no problems making an easy to read and edit file. (I had a hell of a time trying to add a <dc:Language> tag to a bunch of books because of this)
IceHand has asked for this too, and it's certainly on the todo list. The problem is that lit2oeb emits the HTML exactly as it is stored in the LIT file, without reformatting it. ConvertLIT pretty-prints the HTML as it extracts it, but gets it wrong quite frequently, inserting whitespace where it doesn't belong and producing output with e.g. small caps messed up as "S mall C aps". I've looked at pretty-printing with the HTML & XML parser/generators already in calibre, but they either have the same flaws as the ConvertLIT pretty-printer (BeautifulSoup) or don't always succeed in actually pretty-printing "document-style" XML (lxml).
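
To make the whitespace problem concrete, here is a rough sketch of how a pretty-printer that ignores inline context can break text. It uses Python's standard-library minidom purely as a stand-in; it is not the ConvertLIT, BeautifulSoup, or lxml pretty-printer, and the sample markup and helper names are made up for illustration.

```python
# Stand-in demo using only the standard library; this is NOT the actual
# pretty-printer from ConvertLIT, BeautifulSoup, or lxml.
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical markup: initial letters wrapped in inline elements,
# the way small caps are often encoded.
ORIGINAL = '<p><span class="sc">S</span>mall <span class="sc">C</span>aps</p>'


def naive_pretty_print(markup):
    """Put every node on its own indented line, ignoring inline context."""
    return minidom.parseString(markup).toprettyxml(indent="  ")


def visible_text(markup):
    """Concatenate all text nodes in document order."""
    return "".join(ET.fromstring(markup).itertext())


def collapse(text):
    """Collapse whitespace runs to single spaces, roughly as a renderer does."""
    return " ".join(text.split())


if __name__ == "__main__":
    pretty = naive_pretty_print(ORIGINAL)
    print(pretty)
    print(collapse(visible_text(ORIGINAL)))  # Small Caps
    print(collapse(visible_text(pretty)))    # S mall C aps
```

The newlines and indentation added between the inline elements and the text that follows them become significant whitespace once the file is rendered, which is why a safe pretty-printer has to know which elements are inline and leave their surroundings alone.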

Quote:
Originally Posted by bkilian
If I find anything else, I'll create corresponding bugs, but for the moment, I'm essentially stuck since I did a directory clean up and deleted all my LRF files to do a clean reconvert, and now I'll have to wait until these bugs are fixed.
I'm sorry you've run into some bugs, but please do submit any more issues you find and I'll get them fixed as quickly as I can.