Quote:
Originally Posted by pepijndevos
The original book used hyphenation, but after con- version, it results in long words being split up...
|
I can't tell without the source PDF, but if the source is low quality and the hyphens touch the letters next to them, the hyphen-detection algorithm in k2pdfopt will not recognize them as hyphens and will not remove them. I've also tweaked the algorithm for the next release. If you post your source, I'll test it before then.
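To illustrate why a touching hyphen can slip through, here is a toy sketch in Python. It is not k2pdfopt's actual algorithm. It assumes a line of text is a tiny bitmap ('#' is ink, '.' is blank), that glyphs are separated by blank columns, and that a trailing hyphen is a last glyph only one row tall. All names and the max_height threshold are hypothetical.

```python
def split_glyphs(bitmap):
    """Return (start, end) column spans of ink runs separated by blank columns."""
    width = len(bitmap[0])
    glyphs, start = [], None
    for x in range(width):
        ink = any(row[x] == "#" for row in bitmap)
        if ink and start is None:
            start = x                      # a new glyph begins
        elif not ink and start is not None:
            glyphs.append((start, x))      # a blank column ends the glyph
            start = None
    if start is not None:
        glyphs.append((start, width))
    return glyphs


def glyph_height(bitmap, span):
    """Number of rows between the topmost and bottommost ink in the span."""
    x0, x1 = span
    rows = [y for y, row in enumerate(bitmap) if "#" in row[x0:x1]]
    return rows[-1] - rows[0] + 1


def ends_with_hyphen(bitmap, max_height=1):
    """Toy test: the last glyph on the line is short enough to be a hyphen."""
    glyphs = split_glyphs(bitmap)
    return bool(glyphs) and glyph_height(bitmap, glyphs[-1]) <= max_height


# A tall letter followed by a clearly separated hyphen.
SEPARATED = [
    "##....",
    "##.###",
    "##....",
]

# The same line from a poor scan: the hyphen touches the letter.
TOUCHING = [
    "##....",
    "#####.",
    "##....",
]
```

In this toy model the touching hyphen merges with the letter into one tall glyph, so the check never sees a separate short mark at the end of the line.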
Quote:
Originally Posted by pepijndevos
At the start of a chapter, the first character spans multiple lines. This causes all the characters to the right to be copied as one block.
|
I haven't tried to implement detection of this yet. It's not trivial to detect such a situation reliably, but it could certainly be attempted, and I'll put it on my list of possible future features. In theory, you could adjust the settings so that k2pdfopt detects the letter as a separate column (you'd have to set -cgr higher, close to 1, and -ch lower, probably to 0.5 or a little lower), but there may be adverse side effects: it may break the source content into multiple columns where you don't want that.
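If you want to experiment with those two settings, a small Python helper like the one below could assemble the command line. This is only a sketch: the function name and the file name book.pdf are hypothetical, and 0.9 is just one guess at "close to 1" for -cgr. It builds the argument list without running anything.

```python
def build_dropcap_args(pdf_path, cgr="0.9", ch="0.5"):
    """Assemble a k2pdfopt argument list with a wider column-gap search
    range (-cgr) and a lower minimum column height (-ch).
    The default values here are illustrative guesses, not recommendations."""
    return ["k2pdfopt", "-cgr", cgr, "-ch", ch, pdf_path]


# Hypothetical input file; pass the list to subprocess.run() to execute it.
args = build_dropcap_args("book.pdf")
print(" ".join(args))
```

Keeping the values as parameters makes it easy to try a few combinations and back off if the page starts splitting into unwanted columns.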
Quote:
Originally Posted by pepijndevos
Similar to the above page number problem, this book lists the current chapter at the top of every page.
|
You can use the -mt option (top margin ignore). E.g. -mt 0.5 will ignore the top 0.5 inches of each source page.
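Since -mt is given in inches, it can help to convert a measured header height into that unit. The Python sketch below does the arithmetic. The function names and the 300 dpi figure are assumptions for illustration; use whatever resolution you measured the header at.

```python
def ignored_top_rows(mt_inches, dpi):
    """Pixel rows covered by a top margin of mt_inches at the given dpi."""
    return round(mt_inches * dpi)


def margin_for_header(header_rows, dpi):
    """Inverse: the -mt value (in inches) that covers header_rows pixel rows."""
    return header_rows / dpi


# At an assumed 300 dpi, -mt 0.5 corresponds to the top 150 pixel rows.
rows = ignored_top_rows(0.5, 300)
```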