07-20-2016, 09:59 AM   #2
theducks
Well trained by Cats
Quote:
Originally Posted by richardfoley
I'm trying to convert an existing PDF into an eBook using Calibre v1.48.

Most of the process works fairly well, except for the first line of each chapter. The first line is "squashed", that is, the whitespace is all removed and the words all run together. This happens only on the first line of each chapter; all other paragraphs are fine.

Heuristics are switched on, and it seems not to matter which settings I use, the first line is always squashed. I've a suspicion this has to do with the first 3 lines in the PDF using a drop cap, as it's the only thing I can think of which is unique to the first line of each chapter. The 2nd and 3rd lines (which follow the drop cap in the PDF) appear just fine with their expected word spacings.

I've tried using "-d dirpath" and all of the (debug) output files have the squashed text in them already, so I suspect it's the parsing of the drop cap in the original PDF, somehow...

Thanks in advance for any ideas that might be a cause/fix for this.
I use the Editor to fix whatever ills.

In this case, you have a line-height set to some value less than 1.2 (a typical value).
This was probably inherited from a drop cap or other decoration in the original. Find the paragraph class in the CSS and fix that line (or remove it).
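
If the stylesheet is large, a small script can help locate the offending rule before you edit it by hand. The following is only a rough sketch, not anything calibre itself provides: the sample CSS, the class names, and the helper functions are made up for illustration, it only looks at unitless line-height values, and it does not handle nested blocks such as @media.

```python
import re

# Hypothetical stylesheet of the kind a PDF conversion might produce.
# The class names and values here are invented for illustration.
SAMPLE_CSS = """
.calibre1 { display: block; line-height: 1.2; }
.dropcap-para { display: block; line-height: 0.8; text-indent: 0; }
.calibre2 { font-size: 1em; }
"""

# Matches a flat "selector { declarations }" rule (no nested blocks).
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")

# Matches a unitless line-height declaration such as "line-height: 0.8;".
# Values with units (12px, 1.1em, 120%) are deliberately not matched.
DECL_RE = re.compile(
    r"(?<![\w-])line-height\s*:\s*(\d*\.?\d+)\s*(?:;|$)\s*",
    re.IGNORECASE,
)


def find_tight_line_heights(css, threshold=1.2):
    """Return (selector, value) pairs whose unitless line-height is below threshold."""
    hits = []
    for selector, body in RULE_RE.findall(css):
        match = DECL_RE.search(body)
        if match and float(match.group(1)) < threshold:
            hits.append((selector.strip(), float(match.group(1))))
    return hits


def drop_tight_line_heights(css, threshold=1.2):
    """Return css with unitless line-height declarations below threshold removed."""

    def fix_declaration(match):
        # Keep the declaration unless its value is below the threshold.
        return "" if float(match.group(1)) < threshold else match.group(0)

    def fix_rule(match):
        selector, body = match.group(1), match.group(2)
        return selector + "{" + DECL_RE.sub(fix_declaration, body) + "}"

    return RULE_RE.sub(fix_rule, css)


if __name__ == "__main__":
    for selector, value in find_tight_line_heights(SAMPLE_CSS):
        print(f"{selector}: line-height {value}")
    print(drop_tight_line_heights(SAMPLE_CSS))
```

Listing the matches first and then making the change in the Editor is the safer route; the removal helper is there only to show what "remove that line" amounts to.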
PDF is a terrible source, as it is a paste-up format where the commands can be anywhere on the page.