Quote:
Originally Posted by willus
1. The unnecessary line break you point out has nothing to do with the spacing between Chinese characters. It occurs because the one line, due to the punctuation mark, is not fully justified (see attachment, green circle). Because the rest of the text is fully justified, k2pdfopt thinks that line, because not fully justified, has ended a paragraph. I could consider having an adjustable threshold setting for this, but for now this determination cannot be adjusted.
2. The -ws option, as you have discovered, can be tuned to a small value to allow k2pdfopt to re-flow between the characters (really should be -ws 0, not -ws -0 -- I'll have to fix that). As you can see in the attachment, k2pdfopt is correctly separating all of the Chinese characters with this setting.
3. The additional option -bp[-] that you put doesn't do anything. The brackets are used to indicate optional suffixes, e.g. when the command-line usage says -bp[+|-|--], it means you can put one of these four options: -bp, -bp+, -bp-, or -bp--.
1. Thank you for the explanation. I found unnecessary line breaks throughout the book.
In Chinese, a new paragraph almost always begins with an indentation of two characters. This might be a better indicator for a new paragraph.
When a line ends with a punctuation mark, especially a Chinese parenthesis, full justification is impossible. Note the difference between English brackets () and Chinese ones （）. A "）" always has a blank to its right (see image, parts underlined in blue).
2. So I just have to set ws to 0 when converting Chinese files, right?
3. Thank you for pointing out my mistake.
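To make the indentation idea from point 1 concrete, here is a minimal Python sketch. It is not k2pdfopt's code: the `Line` type, both function names, and the tolerance values are my own assumptions, and the justification rule is only a simplified model of the behaviour described in the quote. Each line of text is reduced to the x positions of its left and right edges.

```python
from dataclasses import dataclass

@dataclass
class Line:
    left: float   # x position of the first character's left edge
    right: float  # x position of the last character's right edge

def starts_by_justification(lines, col_right, char_width, tol=0.5):
    """Simplified model: a line that stops short of the right margin
    is taken to end a paragraph, so the next line starts a new one."""
    starts = [0] if lines else []
    for i in range(1, len(lines)):
        if lines[i - 1].right < col_right - tol * char_width:
            starts.append(i)
    return starts

def starts_by_indent(lines, col_left, char_width, indent_chars=2, tol=0.5):
    """Proposed rule: a line starts a paragraph only if it is indented
    by about `indent_chars` characters from the left margin."""
    threshold = col_left + (indent_chars - tol) * char_width
    return [i for i, ln in enumerate(lines) if ln.left >= threshold]

# Hypothetical column: left margin 0, right margin 200, characters 10 wide.
lines = [
    Line(20, 200),  # indented first line of a paragraph
    Line(0, 190),   # ends with a full-width parenthesis, slightly short
    Line(0, 200),
    Line(0, 120),   # real end of the paragraph
    Line(20, 200),  # indented first line of the next paragraph
]

print(starts_by_justification(lines, col_right=200, char_width=10))  # [0, 2, 4]
print(starts_by_indent(lines, col_left=0, char_width=10))            # [0, 4]
```

The justification rule reports a false paragraph start at line 2 because line 1 is slightly short, while the indentation rule only reports the two real starts. An actual implementation would of course need to handle unindented headings and other layouts.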
There is another odd behaviour (see image, the two parts underlined in red). Besides the unnecessary line break, the text is right-aligned.
Here are the original two pages of the PDF: before 1 and 2.pdf. (I have not underlined all of the unnecessary line breaks.)
I have found a piece of software, Kindle Comic Converter (KCC), that converts images to manga (comic) mobi files. I first use k2pdfopt to convert and reflow the PDF into PNG images and then use KCC to convert them into a mobi file. Unlike PDF files, manga mobi files don't refresh on every page turn on the Kindle; they behave like text-based mobi files. Page refreshing is the reason I avoid reading PDFs on my Kindle, even when the PDF files have been converted well. With the help of your tool and KCC, the PDF reading experience on the Kindle is maximized. I recommend that Kindle users use both k2pdfopt (for image optimization and text reflow) and KCC (for Kindle-friendly mobi files).
The only problem is that, since KCC was not developed for scanned text, it uses an aggressive auto-cropping mode, which cuts off all the margins produced by k2pdfopt.
Thank you again for your time!
Update
Just found Amazon's official tool, Kindle Comic Creator. I'll try it out and report the result.
Update again
Here is the result. Kindle Comic Creator, Amazon's official software, is amazing. It even accepts PDFs as source files. Just follow the steps: fill in the title and author, select the page-turning mode, choose whether to enable panel view (not needed for scanned PDFs), and so on. The output is also half the size of what the unofficial Kindle Comic Converter produces.
So k2pdfopt is best used with Kindle Comic Creator.