Converting a Chinese book: section markers based on punctuation cause an erroneous page break
Hi,
I have successfully converted many Chinese books from Word files to epub with Calibre. Recently I ran into a problem with one book: Calibre generates an extra page break.
The error happens in the only place where I have a short English phrase, as shown in the attached jpg file (I have turned on "show all marks" in Word).
I checked the debug output. In the input/index.html file, the lines are correct:
<p class="block_33">*</p>
<p class="block_5 text_10">Mee Soto</p>
<p class="block_7">(词/曲:疏效平、李家欣)</p>
But in parsed/index.html, they became:
<p class="block_33">*</p>
<p class="block_5 text_10" style="page-break-before:always">Mee Soto</p>
<p class="block_7">(词/曲:疏效平、李家欣)</p>
Note the extra "page-break-before:always" was added.
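In case it helps anyone reproduce this, below is a minimal Python sketch of how the two debug files can be compared to list the paragraphs that gained a forced page break. It uses only the standard library; the names `PageBreakFinder`, `find_page_breaks` and `added_page_breaks` are made up for this illustration and are not part of Calibre.

```python
from html.parser import HTMLParser


class PageBreakFinder(HTMLParser):
    """Collect the text of every <p> whose inline style forces a page break."""

    def __init__(self):
        super().__init__()
        self.hits = []          # text of paragraphs carrying the page-break style
        self._in_hit = False    # True while inside such a paragraph
        self._buf = []          # text pieces of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag != "p":
            return
        # Normalize the inline style so "page-break-before: always" also matches.
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        if "page-break-before:always" in style:
            self._in_hit = True
            self._buf = []

    def handle_data(self, data):
        if self._in_hit:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_hit:
            self.hits.append("".join(self._buf).strip())
            self._in_hit = False


def find_page_breaks(html):
    """Return the text of each paragraph that has page-break-before:always."""
    finder = PageBreakFinder()
    finder.feed(html)
    finder.close()
    return finder.hits


def added_page_breaks(input_html, parsed_html):
    """Paragraph texts with a forced break in parsed_html but not in input_html."""
    before = set(find_page_breaks(input_html))
    return [text for text in find_page_breaks(parsed_html) if text not in before]
```

The two arguments would be the contents of input/index.html and parsed/index.html from the debug folder, read as UTF-8 text. For the snippets quoted above, the only paragraph reported should be the "Mee Soto" one.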
In the log file, it says:
...
Median line length is 135, calculated with html format
Looking for more split points based on punctuation, currently have 2
marked 3 section markers based on punctuation. - Mee Soto</p>
...
So somehow Calibre thinks there is punctuation in "Mee Soto</p>".
But I don't see any, and I have spent quite a few days trying to get rid of the extra page break.
I also found that if I change "Mee Soto" to other English text, the page break is still there. But if I change "Mee Soto" to Chinese characters, Calibre does not generate the extra page break.
I'd appreciate it if anyone could help, or point out why Calibre sees punctuation in "Mee Soto</p>".
Thanks