MobileRead Forums - View Single Post

marcbook · 01-19-2014, 12:52 PM

Hi!

Here is my challenge…

I'm trying to convert this EPUB file:

Moderator Notice
Removed link to copyrighted material

to PDF.

I've noticed that when I convert the file, the Chinese words are separated by spaces. That is great for students of Chinese, because Chinese is normally written without spaces between words. I then converted the book to HTMLZ, changed the extension to ZIP, uncompressed it and… surprise! If you open "index.html" in OpenOffice, you will see a gray character between words (maybe it's a non-breaking space, or a hyphen?). If you open the file in TextWrangler, you will see spaces between words, but that space is not the same one used in Western texts. It seems that the original EPUB file contains some special character that segments the Chinese words, but it is invisible in reading mode, so the sentences appear to have no spaces.
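For anyone who wants to repeat the HTMLZ step without renaming and unzipping by hand, here is a minimal sketch in Python. It relies only on what is described above (the HTMLZ is a ZIP archive containing "index.html"). The function name `read_index_html` and the UTF-8 encoding are assumptions, and the in-memory archive merely stands in for a real converted book.

```python
import io
import zipfile


def read_index_html(htmlz, member="index.html", encoding="utf-8"):
    """Return the text of `member` from an HTMLZ archive.

    `htmlz` may be a file path or a binary file-like object.
    The UTF-8 default is an assumption about how the file was written.
    """
    with zipfile.ZipFile(htmlz) as archive:
        return archive.read(member).decode(encoding)


# Stand-in for a real book: a tiny archive built in memory, with a
# zero-width space (U+200B) between the two characters.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("index.html", "<p>你\u200b好</p>".encode("utf-8"))
buffer.seek(0)

html = read_index_html(buffer)
```

With a real file, the call would be `read_index_html("book.htmlz")`, where the file name is hypothetical.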

I want to convert the EPUB to PDF and use some "Search and Replace" option that lets me replace that special character with spaces, but I don't know how to:

1) identify that special character
2) tell calibre to do a proper search and replace
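For point 1, one possible approach is to let a short script list every invisible or unusual separator in the text together with its Unicode code point and name. This is only a sketch: the function names and the choice of Unicode categories (space separators other than the plain space, line and paragraph separators, and format characters such as the zero-width space) are my assumptions, and the sample string is made up rather than taken from the book.

```python
import unicodedata
from collections import Counter

# Categories that usually render as blank or invisible:
# Zs = space separator, Zl/Zp = line/paragraph separator, Cf = format.
SUSPECT_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}


def find_suspect_characters(text):
    """Count every character that is a non-ASCII space or invisible mark."""
    return Counter(
        ch for ch in text
        if ch != " " and unicodedata.category(ch) in SUSPECT_CATEGORIES
    )


def describe(counts):
    """Format each suspect character as 'U+XXXX NAME xCOUNT'."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')} x{n}"
        for ch, n in counts.most_common()
    ]


# Made-up sample: Chinese words separated by zero-width spaces.
sample = "我\u200b喜欢\u200b中文"
report = describe(find_suspect_characters(sample))
```

Running the same functions on the contents of "index.html" should show which code point actually sits between the words.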

I tried the search and replace wizard, but it seems that the preview does not show this special character, so I cannot select it there.
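Since the character cannot be selected in the preview, it could instead be written as a Unicode escape in a regular expression. The sketch below shows the idea in plain Python. The three code points (zero-width space, no-break space, ideographic space) are only guesses at what the separator might be, and whether calibre's Search and Replace field accepts exactly the same escape syntax is an assumption I have not verified.

```python
import re

# Candidate separators, written as escapes so they never need to be
# typed or selected directly. Which one the book uses is unknown.
SEPARATOR_PATTERN = re.compile(r"[\u200b\u00a0\u3000]")


def separators_to_spaces(text):
    """Replace each candidate separator with an ordinary space."""
    return SEPARATOR_PATTERN.sub(" ", text)


# Made-up sample with zero-width spaces between the words.
converted = separators_to_spaces("我\u200b喜欢\u200b中文")
```

Once the real code point is known, the character class can be narrowed to that single escape.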

I definitely need help!

Thank you in advance!

Thread: Special find and replace (post #1)
marcbook · Junior Member · Posts: 4 · Karma: 10 · Join Date: Dec 2013 · Device: ipad3
Last edited by theducks; 01-19-2014 at 12:58 PM. Reason: removed link