Old 05-11-2012, 07:52 AM   #11
kiwidude
calibre/Sigil Developer
Posts: 4,228
Karma: 1334002
Join Date: Oct 2010
Location: London, UK
Device: Kindle Paperwhite 3G, iPad 3, iPad Air
The problem is that you can't just sidestep the PDF issue. It doesn't matter how many times what you guys are asking for gets rephrased...

PDFs have no "structure" such as indentation; many don't even have text, being just images. As I understand it, the various PDF converters attempt to reconstruct indentation and line breaks, applying heuristics to guess where paragraphs end and where indentation exists. But as has been repeated over and over, there are certain issues (some of them particularly in calibre's current PDF converter) that result in corrupted text, such as the oft-quoted double-L issue (ligatures).
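To make the "heuristics" point concrete, here is a rough Python sketch of the kind of guessing involved. It is not calibre's code or any real converter's method; the function names `repair_ligatures` and `guess_paragraphs` and the rules themselves are made up for illustration. It assumes the extracted text still contains Unicode ligature code points (if the converter dropped the characters outright, nothing here brings them back), and it assumes a new paragraph is signalled by a blank line or by an indented line following sentence-ending punctuation.

```python
import unicodedata


def repair_ligatures(text: str) -> str:
    # NFKC compatibility normalization expands ligature code points
    # (e.g. U+FB01 "fi", U+FB03 "ffi") into plain letters.
    # Caveat: NFKC also rewrites other compatibility characters
    # (superscripts, full-width forms), which may not be wanted.
    return unicodedata.normalize("NFKC", text)


def guess_paragraphs(lines: list[str]) -> list[str]:
    # Hypothetical heuristic: start a new paragraph on a blank line,
    # or on an indented line when the previous line ended a sentence.
    paragraphs: list[str] = []
    current: list[str] = []
    for raw in lines:
        stripped = raw.strip()
        if not stripped:
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        indented = raw.startswith((" ", "\t"))
        prev_ended = bool(current) and current[-1].endswith((".", "!", "?"))
        if indented and prev_ended:
            paragraphs.append(" ".join(current))
            current = []
        current.append(stripped)
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


if __name__ == "__main__":
    sample = [
        "The o\ufb03ce \ufb01le was",
        "finally converted.",
        "  A second paragraph",
        "starts here.",
    ]
    for para in guess_paragraphs([repair_ligatures(s) for s in sample]):
        print(para)
```

Rules like these break easily: a PDF with no indentation, a line that happens to end with a full stop mid-paragraph, or a two-column layout will all fool them, which is why settings that work for one PDF fall over on the next.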

Adobe themselves, who invented this awful format, can't come up with a tool that converts it to something more useful. Now if the originator of the format can't do it, what does that tell you? That it completely sucks for anything other than being rendered as a PDF.

So as I posted on the other thread your options are:

(1) Buy a decent sized tablet and open them in a PDF reader so you don't bother converting. That is what I and many others do, particularly for technical books which rely on layout. If you want an e-ink screen, go hunting for a Kindle DX or whatever other models might be out there...

(2) Do the conversion but live with the formatting being trashed. How trashed depends on a variety of factors, such as which tool, what settings and how that PDF was authored. There are no magic settings: you might stumble on something that looks "mostly alright" for one PDF and find it doesn't work well with the next one.

(3) Do the conversion but spend many hours making it readable using an HTML editor.

In my opinion it is a non-starter, but then I've only dabbled around the edges with PDF conversions. Calibre's perpetually on-hold "new" PDF engine contains some improvements that could be built on, but until it gets released (if it ever does) you really are pushing the proverbial uphill.

Last edited by kiwidude; 05-11-2012 at 08:06 AM.