Old 02-28-2015, 09:28 PM   #4
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Sticky: Read this before Posting PDF Questions

No, "converting perfectly into DOCX" does not exist, and will not and cannot fix the problems inherent to using PDF.

The best any PDF-to-Word converter could do is use an advanced (and expensive) parsing engine to make a best guess at where the paragraphs connect, and I do not know of any software that does this. Your converter probably does exactly what calibre does and applies a generic line-unwrap factor, so you will want to check the output for the odd split or joined paragraph.
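
To make the "line unwrap factor" idea concrete, here is a rough Python sketch of that kind of length-based guess. It is not calibre's code or any converter's real implementation; the function name, the 0.45 default, and the use of the median line length are all assumptions chosen for illustration.

```python
from statistics import median


def unwrap_lines(lines, factor=0.45):
    """Guess paragraph boundaries from line lengths alone.

    Hypothetical helper: a line shorter than factor * median length is
    assumed to end a paragraph; any longer line is assumed to wrap.
    """
    lengths = [len(line.strip()) for line in lines if line.strip()]
    if not lengths:
        return []
    threshold = factor * median(lengths)

    paragraphs, current = [], []
    for line in lines:
        text = line.strip()
        if not text:
            # A blank line always closes the paragraph in progress.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(text)
        if len(text) < threshold:
            # Short line: guess that the paragraph ends here.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


page = [
    "This is the first line of a paragraph that",
    "wraps onto a second line of similar size,",
    "then ends.",
    "A second paragraph starts here and also wraps",
    "to the end.",
]
result = unwrap_lines(page)
```

The guess only works when a paragraph happens to end on a short line. If the last line of a paragraph runs nearly to the margin, it gets joined to the next paragraph, and a short line in the middle of a paragraph splits it in two, which is exactly the kind of damage worth checking for.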

Either way, this is another common problem with PDF conversions (IIRC calibre does the same): it correctly unwrapped those two lines, but took the PDF at its word about the hyphen and line break, and assumed the two halves were separate words.
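
For illustration, a minimal Python sketch of rejoining words hyphenated across a line break. It is an assumption-laden example, not what calibre or any converter actually does; `join_lines` and its `keep_hyphen` set are hypothetical names.

```python
def join_lines(lines, keep_hyphen=frozenset()):
    """Join wrapped lines, merging a word split by a trailing hyphen.

    keep_hyphen is a hypothetical set of lowercase compounds (such as
    "well-known") whose hyphen is real and must survive the merge.
    Matching is exact, so trailing punctuation defeats the lookup.
    """
    words = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if words and len(words[-1]) > 1 and words[-1].endswith("-"):
            hyphenated = words[-1] + parts[0]
            if hyphenated.lower() in keep_hyphen:
                # Genuine compound: keep the hyphen.
                words[-1] = hyphenated
            else:
                # Assume the hyphen was only a line-break artifact.
                words[-1] = words[-1][:-1] + parts[0]
            parts = parts[1:]
        words.extend(parts)
    return " ".join(words)


fixed = join_lines(["The PDF con-", "version failed."])
compound = join_lines(["a well-", "known issue"], keep_hyphen={"well-known"})
```

The hard part is that the PDF itself does not say whether a trailing hyphen is a line-break artifact or part of the word, so any rule like this will sometimes guess wrong without a dictionary to check against.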