10-29-2012, 08:38 AM   #205
willus
Quote:
Originally Posted by pepijndevos View Post
The original book used hyphenation, but after con- version, it results in long words being split up...
I can't tell for sure because you didn't post the source PDF, but if the source is low quality and the hyphens are touching the letters next to them, then the hyphen-detection algorithm in k2pdfopt will not recognize them as hyphens and will not remove them. I've also tweaked the algorithm for the next release. If you want to post your source, I'll test it before the next release.
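
To illustrate why a touching hyphen can be missed, here is a minimal sketch. It is not k2pdfopt's actual code; it simply assumes that glyphs are separated by splitting a text row at blank pixel columns. The function name `split_glyphs` and the tiny bitmaps are made up for illustration.

```python
def split_glyphs(row_bitmap):
    """Return (start, end) column spans of ink runs in one text row.

    row_bitmap is a list of equal-length strings: '#' is ink, '.' is blank.
    Illustrative assumption only, not how k2pdfopt is implemented.
    """
    width = len(row_bitmap[0])
    # A column counts as inked if any pixel row has ink in that column.
    inked = [any(line[x] == "#" for line in row_bitmap) for x in range(width)]
    spans, start = [], None
    for x, has_ink in enumerate(inked):
        if has_ink and start is None:
            start = x                      # a new ink run begins
        elif not has_ink and start is not None:
            spans.append((start, x - 1))   # the run ended at the previous column
            start = None
    if start is not None:
        spans.append((start, width - 1))   # run reaches the right edge
    return spans

# A letter followed by a hyphen, with one blank column between them.
clean = [
    "###.....",
    "#.#.###.",
    "#.#.....",
]
# The same letter with the hyphen touching it (no blank column).
touching = [
    "###.....",
    "#.####..",
    "#.#.....",
]

print(split_glyphs(clean))     # two separate spans: letter and hyphen
print(split_glyphs(touching))  # one merged span: the hyphen is not isolated
```

In the clean case the hyphen shows up as its own short, wide span that a detector could test. In the touching case it is merged into the neighboring letter, so there is nothing hyphen-shaped left to find.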

Quote:
Originally Posted by pepijndevos View Post
At the start of a chapter, the first character spans multiple lines. This causes all the characters to the right to be copied as one block.
I haven't tried to implement anything to detect this yet. It's not trivial to detect such a situation reliably, but it could certainly be attempted. I'll put it on my list of possible future features. In theory there are adjustments you could make that would get k2pdfopt to detect the letter as a separate column: you'd have to set -cgr higher, close to 1, and -ch lower, probably to 0.5 or a little lower. There may be adverse side effects, though, since it may break the source contents into multiple columns where you don't want it to.
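
As a rough sketch of what such a command line might look like, the snippet below only builds and prints the command; it does not run k2pdfopt. The values 0.9 and 0.5 are just picks from the ranges mentioned above, `book.pdf` is a placeholder file name, and the helper name `drop_cap_command` is made up. Putting the source file after the options is an assumption about the usual invocation.

```python
import shlex

def drop_cap_command(source_pdf, cgr=0.9, ch=0.5):
    """Build a k2pdfopt command line for experimenting with drop caps.

    cgr and ch are illustrative starting points, not recommended values:
    -cgr "close to 1" and -ch "0.5 or a little lower" as suggested above.
    """
    args = ["k2pdfopt", "-cgr", str(cgr), "-ch", str(ch), source_pdf]
    # Quote each argument so file names with spaces survive a shell.
    return " ".join(shlex.quote(a) for a in args)

print(drop_cap_command("book.pdf"))
```

If the output gets split into unwanted columns, try moving -cgr back down or -ch back up in small steps.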

Quote:
Originally Posted by pepijndevos View Post
Similar to the above page number problem, this book lists the current chapter at the top of every page.
You can use the -mt option (top margin ignore). E.g. -mt 0.5 will ignore the top 0.5 inches of each source page.
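
To get a feel for how much of the page a given -mt value covers, here is a small arithmetic sketch. The 300 dpi figure and the helper names are assumptions for illustration; this shows the idea of ignoring the top strip, not how k2pdfopt handles it internally.

```python
def rows_ignored(top_margin_in, dpi):
    """Number of pixel rows covered by a top margin given in inches."""
    return round(top_margin_in * dpi)

def crop_top(page_rows, top_margin_in, dpi):
    """Drop the rows that fall inside the ignored top margin."""
    return page_rows[rows_ignored(top_margin_in, dpi):]

# Example: a page rendered at an assumed 300 dpi, 600 rows tall.
page = list(range(600))
print(rows_ignored(0.5, 300))          # rows skipped for -mt 0.5
print(len(crop_top(page, 0.5, 300)))   # rows left after skipping them
```

Measure the height of the chapter header in the source page and pick a -mt value slightly larger than that.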