| 
				
				Converting a Chinese book: section markers based on punctuation cause an erroneous page break
			 
 
			
			Hi,
 I have successfully converted many Chinese books from Word files to EPUB with Calibre. Recently, however, I ran into a book for which Calibre generates an extra page break.
 
 The error occurs at the only place in the book where I have a short English phrase, as shown in the attached JPG file (I have turned on "show all marks" in Word).
 
 I checked the debug output. In the input/index.html file, the markup is correct:
 
 <p class="block_33">*</p>
 <p class="block_5 text_10">Mee Soto</p>
 <p class="block_7">(词/曲:疏效平、李家欣)</p>
 
 But in parsed/index.html, it became:
 <p class="block_33">*</p>
 <p class="block_5 text_10" style="page-break-before:always">Mee Soto</p>
 <p class="block_7">(词/曲:疏效平、李家欣)</p>
 
 Note that an extra "page-break-before:always" style was added to the "Mee Soto" paragraph.
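
 In case anyone wants to repeat the comparison on their own debug output, here is a minimal sketch of how the two stages could be checked automatically. It assumes the markup is one paragraph per line, as in the excerpts above. The function name find_added_page_breaks is just something made up for this post, not part of Calibre.

```python
import re

# Matches an opening <p ...> tag whose attributes contain a forced page break.
PAGE_BREAK = re.compile(
    r'<p\b[^>]*page-break-before\s*:\s*always[^>]*>', re.IGNORECASE
)


def find_added_page_breaks(input_html, parsed_html):
    """Return lines of parsed_html that force a page break and do not
    appear verbatim in input_html (hypothetical helper, not a Calibre API)."""
    input_lines = {line.strip() for line in input_html.splitlines()}
    added = []
    for line in parsed_html.splitlines():
        stripped = line.strip()
        if PAGE_BREAK.search(stripped) and stripped not in input_lines:
            added.append(stripped)
    return added


if __name__ == "__main__":
    # Sample markup copied from the two debug stages quoted in this post.
    input_stage = (
        '<p class="block_33">*</p>\n'
        '<p class="block_5 text_10">Mee Soto</p>\n'
    )
    parsed_stage = (
        '<p class="block_33">*</p>\n'
        '<p class="block_5 text_10" style="page-break-before:always">Mee Soto</p>\n'
    )
    for line in find_added_page_breaks(input_stage, parsed_stage):
        print(line)
```

 To use it on real files, read input/index.html and parsed/index.html from the debug folder as UTF-8 text and pass the two strings to the function.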
 
 In the log file, it says:
 ...
 Median line length is 135, calculated with html format
 Looking for more split points based on punctuation, currently have 2
 marked 3 section markers based on punctuation. - Mee Soto</p>
 ...
 
 So somehow Calibre thinks there is punctuation in "Mee Soto</p>".
 But I don't see any, and I have spent quite a few days trying to get rid of the extra page break.
 
 I also found that if I change "Mee Soto" to other English text, the page break is still there. But if I change "Mee Soto" to Chinese characters, Calibre does not generate the extra page break.
 
 I'd appreciate it if anyone could help or point out why Calibre sees punctuation in "Mee Soto</p>".
 
 Thanks
 |