The original file is DjVu, for which native output isn't supported. I actually tried to convert it to a PDF while preserving the hidden text, so that I could then use k2pdfopt in native mode to pretty it up for my Kindle. I used djvu2hocr (from the ocrodjvu package) to extract the text layer. Then I should be able to use either hocr2pdf, from ExactImage, or PDFBeads to merge it back with the images. On my MacBook, hocr2pdf produces a 1.4 MB PDF which freezes Adobe Reader and looks like gibberish in Preview.
I finally did succeed with pdfbeads, but it wasn't entirely straightforward.
On the other hand, if I reflow, k2pdfopt mangles a lot of the math formulas. I realize this might not be a high priority, but I thought I'd report it in case it would be easy or interesting to fix.
For instance, this formula:
ends up like this:
Here's a formula whose superscripts and subscripts get misaligned:
ends up as
There are also some issues with inline text, when math stuff overlaps a line.
For instance, the bottom of the fraction 7/2 here:
gets cut off and ends up floating beneath the word "unit" here:
This particular issue (usually with a "2") happens a lot in this book, I noticed.
A second issue to report: while processing a PDF file produced by k2pdfopt with Ghostscript, I get hundreds of these errors:
**** Unknown operator: 'inf'
**** Error reading a content stream. The page may be incomplete.
**** File did not complete the page properly and may be damaged.
It ends by saying:
**** This file had errors that were repaired or ignored.
**** The file was produced by:
**** >>>> K2pdfopt v1.65 <<<<
**** Please notify the author of the software that produced this
**** file that it does not conform to Adobe's published PDF
So, as instructed, I'm notifying you! A little searching turned up this bug, where the Ghostscript developers say that some floating point value is being written to the PDF, but "inf" is not valid in the PDF format, even if the floating point is INF.
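In case it helps, here is a rough sketch of the kind of guard I have in mind. It is not k2pdfopt's actual code (I haven't looked at how it writes numbers, and k2pdfopt itself is written in C); it is just a Python illustration of formatting a float as a PDF numeric token so that a non-finite value never reaches the content stream. The function name `pdf_number`, the fallback of 0, and the four-digit precision are all my own assumptions.

```python
import math


def pdf_number(value: float, fallback: float = 0.0, digits: int = 4) -> str:
    """Format a number as a plain decimal token for a PDF content stream.

    Hypothetical helper: 'inf' and 'nan' are replaced by `fallback`
    (an assumed default of 0) instead of being written out verbatim.
    """
    if not math.isfinite(value):
        value = fallback
    # Fixed-point formatting avoids exponent notation such as 1e-05,
    # which PDF numeric syntax does not allow either.
    text = f"{value:.{digits}f}"
    # Trim trailing zeros and a dangling decimal point: "612.0000" -> "612".
    text = text.rstrip("0").rstrip(".")
    # Tiny negative values can round to "-0"; normalize that to "0".
    if text in ("-0", ""):
        text = "0"
    return text


def cm_operator(matrix) -> str:
    """Build a 'cm' (transformation matrix) line from six numbers."""
    return " ".join(pdf_number(v) for v in matrix) + " cm"
```

With something like this, a matrix containing an infinite entry, for example `cm_operator((1, 0, 0, 1, float("inf"), 0))`, would be written with a 0 in that slot rather than the `inf` token Ghostscript complains about. Whether 0 is the right substitute depends on where the infinity comes from, so the real fix is probably upstream of the formatting.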
Sorry if I'm giving you trouble—I like the software a lot! It's attractive for mathematical use since it uses the original images, so that strange symbols and letters from lots of alphabets are always preserved. Thanks for making it.