Old 11-24-2008, 05:50 PM   #10
adi
Enthusiast
Posts: 34
Karma: 36
Join Date: Oct 2008
Device: irex digital reader
Quote:
Originally Posted by thomega View Post
That's true for most books, but the examples I refer to are theoretical physics texts: particle physics, quantum field theory, string theory - all full of equations and diagrams. There's no OCR program in the world that will be able to handle that. You can see an example at this link (the excerpt is from the introduction, which is very light on math; it gets heavier later). A copy of the whole book that can be downloaded from file sharing sites has the same quality. It's a pity that this stuff is not yet legally available for the DR 1000S.
I am in a similar situation, trying to read DjVu books with lots of math. So far, the easiest solution (easiest for the DR1000, not for me) is to first convert DjVu to PDF (using print-to-PDF from a DjVu viewer), then split the PDF into individual pages with pdftk, and then convert each page to a PNG with ImageMagick's convert. Then store all the PNGs in a single directory on the reader, where they show up as a multipage document. All the steps can be automated, but it takes a lot of time for each book.
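
In case it helps anyone, here is a rough Python sketch of how the pdftk and convert steps could be scripted. It assumes the DjVu file has already been printed to PDF by hand and that `pdftk` and ImageMagick's `convert` are on the PATH. The file names, the `pg_%04d` naming pattern and the 150 dpi density are my own placeholder choices, not anything the reader requires; the zero-padded numbers are only there so the pages sort in order.

```python
import subprocess
from pathlib import Path


def burst_command(pdf, work_dir):
    """Build the pdftk command that splits `pdf` into one PDF per page."""
    # pg_%04d.pdf is an assumed naming pattern (zero-padded so pages sort in order).
    pattern = str(Path(work_dir) / "pg_%04d.pdf")
    return ["pdftk", str(pdf), "burst", "output", pattern]


def convert_command(page_pdf, out_dir, density=150):
    """Build the ImageMagick command that renders one single-page PDF to a PNG."""
    # density=150 is an assumed default; tune it for file size versus sharpness.
    png = Path(out_dir) / (Path(page_pdf).stem + ".png")
    return ["convert", "-density", str(density), str(page_pdf), str(png)]


def pdf_to_pngs(pdf, work_dir, out_dir, density=150):
    """Split `pdf` into pages, then render each page to a PNG in `out_dir`."""
    Path(work_dir).mkdir(parents=True, exist_ok=True)
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(burst_command(pdf, work_dir), check=True)
    # Convert page by page instead of feeding the whole PDF to convert.
    for page in sorted(Path(work_dir).glob("pg_*.pdf")):
        subprocess.run(convert_command(page, out_dir, density), check=True)
```

Calling `pdf_to_pngs("book.pdf", "pages", "book_png")` would then leave the PNGs in `book_png`, ready to copy to the reader as one directory.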

There are a couple of drawbacks. The size increases a lot: typically a 400-page (4-5 MB) DjVu document gets converted to 400 PNGs of approx. 300 kB each, i.e. about 120 MB in total. Also, in principle one does not need to split the PDF into individual pages, since convert is supposed to handle multi-page PDFs. But for PDFs created from DjVu files, convert gives a Ghostscript error on a multi-page PDF.

Another drawback is that the DR cannot hide margins correctly. I normally end up doing a selection zoom once for each book.