MobileRead Forums - View Single Post

adi · 11-24-2008, 05:50 PM

Quote:

Originally Posted by thomega

That's true for most books, but the examples I refer to are theoretical physics texts: particle physics, quantum field theory, string theory - all full of equations and diagrams. There's no OCR program in the world that will be able to handle that. You can see an example at this link (the excerpt is from the introduction, which is very light on math; it gets heavier later). A copy of the whole book that can be downloaded from file sharing sites has the same quality. It's a pity that this stuff is not yet legally available for the DR 1000S.

I am in a similar situation, trying to read books in djvu with lots of math. So far, the easiest solution (easiest for the DR1000, not for me) is to first convert djvu -> pdf (using print-to-pdf from a djvu viewer), then split the pdf into individual pages with pdftk, and then convert each page to a png with imagemagick's convert. Store all the pngs in a single directory on the reader, and it shows up as a multipage document. All the steps can be automated, but it takes a lot of time for each book.
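For anyone who wants to automate the pdf -> png part, here is a rough sketch in Python. It assumes pdftk and ImageMagick's convert are both on the PATH and that the djvu has already been printed to pdf. The function names, the page_%04d naming pattern and the 150 dpi density are my own placeholders, not anything the DR1000 requires, so adjust to taste.

```python
import subprocess
from pathlib import Path


def burst_command(pdf: Path, out_dir: Path) -> list[str]:
    """Build the pdftk command that splits a pdf into one file per page."""
    # The page_%04d.pdf pattern is an arbitrary choice; zero padding keeps
    # the pages in the right order when sorted by name.
    return ["pdftk", str(pdf), "burst", "output", str(out_dir / "page_%04d.pdf")]


def png_command(page_pdf: Path, density: int = 150) -> list[str]:
    """Build the ImageMagick command that renders one single-page pdf as png."""
    # density is the rendering resolution in dpi (assumed value, tune it).
    return ["convert", "-density", str(density), str(page_pdf),
            str(page_pdf.with_suffix(".png"))]


def pdf_to_pngs(pdf: Path, out_dir: Path, density: int = 150) -> None:
    """Split the pdf page by page, then convert every page to a png."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(burst_command(pdf, out_dir), check=True)
    for page in sorted(out_dir.glob("page_*.pdf")):
        subprocess.run(png_command(page, density), check=True)
        page.unlink()  # keep only the pngs for the reader
```

Calling pdf_to_pngs(Path("book.pdf"), Path("book_pages")) should leave a directory of pngs that can be copied to the reader as is. The page-by-page loop is deliberate: it avoids feeding convert a multipage pdf.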

There are a couple of drawbacks. The size increases by a lot: a typical 400-page djvu document (4-5 MB) turns into 400 pngs of approx. 300 kB each, or roughly 120 MB in total. Also, in principle one does not need to split the pdf into individual pages, since convert is supposed to handle multi-page pdfs. But for pdfs created from djvu files, convert gives a ghostscript error on a multipage pdf.

Another drawback is that the DR cannot hide margins correctly. I normally end up doing a selection zoom once for each book.