Old 05-23-2013, 11:08 PM   #429
willus
Fuzzball, the purple cat
Posts: 1,273
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Quote:
Originally Posted by kundor View Post
The original file is DjVu, for which native output isn't supported. I actually tried to convert it to a PDF while preserving the hidden text, so that I could then use k2pdfopt in native mode to pretty it up for my Kindle. I used djvu2hocr (from the ocrodjvu package) to extract the text layer. Then I should be able to use either hocr2pdf, from ExactImage, or pdfbeads to merge it back with the images. On my MacBook, hocr2pdf produces a 1.4 MB PDF which freezes Adobe Reader and looks like gibberish in Preview.
I finally did succeed with pdfbeads, but it wasn't entirely straightforward.
Thank you for the links. I wasn't aware of these applications.

Quote:
Originally Posted by kundor View Post


On the other hand, if I reflow, k2pdfopt mangles a lot of the math formulas. I realize this might not be a high priority, but I thought I'd report it in case it would be easy or interesting to fix.
For instance, this formula:

ends up like this:

Here's a formula with superscripts and subscripts that get unaligned:

ends up as

There are also some issues with inline text, when math stuff overlaps a line.
For instance, the bottom of the fraction 7/2 here:

gets cut off and ends up floating beneath the word "unit" here:

This particular issue (usually with a "2") happens a lot in this book, I noticed.
Please attach a couple of example DjVu pages if you can. There are probably some settings that can be adjusted to help.

Quote:
Originally Posted by kundor View Post


A second issue to report: while processing a PDF file produced by k2pdfopt with Ghostscript, I get hundreds of these errors:
Code:
   **** Unknown operator: 'inf'
   **** Error reading a content stream. The page may be incomplete.
   **** File did not complete the page properly and may be damaged.
It ends up by saying:
Code:
   **** This file had errors that were repaired or ignored.
   **** The file was produced by: 
   **** >>>> K2pdfopt v1.65 <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.
So, as instructed, I'm notifying you! A little searching turned up this bug, where the Ghostscript developers say that a floating-point value is being written to the PDF as "inf", which is not valid in the PDF format even when the underlying value really is infinite.
Again, if you can, please attach an example source file and the command options that produce the bad PDF file.
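
To illustrate the Ghostscript complaint quoted above: PDF content streams accept only plain decimal numbers, so a writer has to catch non-finite values before formatting them. The following is a minimal Python sketch of such a guard, not k2pdfopt's actual code; the function name pdf_number, the zero fallback, and the four-decimal precision are all assumptions made for illustration.

```python
import math


def pdf_number(value: float, fallback: float = 0.0) -> str:
    """Format a float as a PDF numeric token (hypothetical helper).

    Non-finite values (inf, -inf, nan) are replaced by `fallback`,
    because tokens such as 'inf' are not valid PDF numbers.
    """
    if not math.isfinite(value):
        value = fallback
    # Fixed-point formatting: PDF numbers do not allow exponent notation.
    text = f"{value:.4f}".rstrip("0").rstrip(".")
    # Normalize an empty result or negative zero to plain "0".
    return "0" if text in ("", "-0") else text


def cm_operator(a: float, b: float, c: float, d: float, e: float, f: float) -> str:
    """Build a 'cm' (concatenate matrix) operator line from six numbers."""
    return " ".join(pdf_number(v) for v in (a, b, c, d, e, f)) + " cm"


if __name__ == "__main__":
    # A scale factor that overflowed to infinity is written as 0, not 'inf'.
    print(cm_operator(float("inf"), 0.0, 0.0, 1.5, 10.0, 20.25))
```

Silently substituting a fallback keeps the file readable, but it can hide the real bug (often a division by zero upstream), so logging or raising at the point where the non-finite value appears may be the better choice.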