02-12-2011, 10:04 AM   #9
BobC
Addict
Posts: 329
Karma: 245756
Join Date: Dec 2008
Location: Lancashire, U.K.
Device: BeBook 1, BeBook Pure, Kobo Glo, Various Android Apps
Quote:
Originally Posted by Begemot
OP here. I resorted to using Export Text in WinDjView.

This gets you a text dump with no formatting whatsoever. For my Libre it works well enough, but in general, this procedure is suboptimal.

Most DJVU files do seem to have a text layer (unless there is some on the fly OCR happening when you select an area on the page, which seems unlikely).

Thus, there must be a way (at least theoretically, until someone writes a converter) to preserve the formatting in the text layer.
I can assure you that the text layer is just that - text; its purpose is simply to provide the search capability. There is no formatting, and in many books there are OCR "mis-reads".
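
For anyone who wants to check this on their own files, here is a rough Python sketch. It assumes you have dumped the hidden text with DjVuLibre's djvused (its print-txt command writes the layer as nested parenthesised zones) and that the dump looks like the made-up sample below. The sample text, the coordinates and the function names parse and words are purely illustrative, and the string handling only covers escaped quotes and backslashes, not the octal escapes that can turn up for non-ASCII text.

```python
import re

# Tokens: parentheses, double-quoted strings (with backslash escapes), bare atoms.
TOKEN = re.compile(r'\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+')


def parse(text):
    """Parse a single parenthesised expression into nested Python lists."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        if tok.startswith('"'):
            # Only \" and \\ are unescaped here (a simplification).
            return re.sub(r'\\(["\\])', r"\1", tok[1:-1])
        if tok.lstrip("-").isdigit():
            return int(tok)
        return tok  # zone name such as page, line or word

    return read()


def words(zone):
    """Yield (text, bounding_box) for every zone that carries text directly."""
    name, x1, y1, x2, y2, *rest = zone
    for item in rest:
        if isinstance(item, list):
            yield from words(item)  # child zone: descend
        else:
            yield item, (x1, y1, x2, y2)  # text attached to this zone


# Hypothetical sample in the shape assumed above.
sample = """(page 0 0 2550 3300
 (line 300 3000 900 3060
  (word 300 3000 520 3060 "Chapter")
  (word 540 3000 600 3060 "1")))"""

for text, box in words(parse(sample)):
    print(text, box)
```

Each zone is a type, a bounding box, and either text or child zones. There is no slot for a font, a size, or italic/bold, so a converter has nothing to preserve beyond the words and where they sit on the page.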

If you want to understand DJVUs then you need to get the spec and study it. I've done quite a bit of work with adding TOCs to existing DJVUs and have converted a couple of books to FB2 - this involves manually proof-reading and correcting the dumped text then formatting it to match the original (italics, bold etc).
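
To give an idea of the TOC side, below is a minimal Python sketch that writes a bookmarks file of the kind DjVuLibre's djvused reads with its set-outline command. The chapter titles, page numbers, and the names quote, render and build_outline are invented for the example, and the "#N" page targets are assumed to be plain page numbers. Treat it as a starting point, not as the exact procedure described in this post.

```python
def quote(s):
    """Wrap a string in double quotes, escaping backslashes and quotes."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render(entries, indent=1):
    """Turn (title, page, children) tuples into indented outline lines."""
    lines = []
    for title, page, children in entries:
        pad = " " * indent
        line = f'{pad}({quote(title)} {quote("#" + str(page))}'
        if children:
            lines.append(line)
            lines.extend(render(children, indent + 1))
            lines[-1] += ")"  # close this entry after its last child
        else:
            lines.append(line + ")")
    return lines


def build_outline(entries):
    """Return the full text of a bookmarks file."""
    return "(bookmarks\n" + "\n".join(render(entries)) + ")\n"


# Hypothetical table of contents: (title, page number, sub-entries).
toc = [
    ("Chapter 1", 1, [("Section 1.1", 3, [])]),
    ("Chapter 2", 10, []),
]

with open("outline.txt", "w", encoding="utf-8") as f:
    f.write(build_outline(toc))
```

The resulting file would then be applied with something like djvused book.djvu -e 'set-outline outline.txt' -s, where book.djvu is a placeholder name. Work on a copy, since -s saves the change into the file.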

Don't expect too much out of what is a by-product of the search function.

BobC