09-11-2010, 08:24 AM   #13
GrzegorzN
Junior Member
Posts: 9
Karma: 10
Join Date: Aug 2010
Device: Kindle 3
I've tested the latest Calibre release (0.7.18). It does a good job of converting accented characters in my PDFs, but 'ą' characters are now prefixed with a space. Maybe sulka will give it a go and report as well; it'd be good to know whether it's just my exotic PDFs that are causing problems.

I enabled debug in the conversion options (should've done it right at the beginning...) and I see that in the input\index.html document an 'ą' character in the middle of a word is represented as:

&nbsp;˛<br>[CR][LF]a

However, I don't see any way of fixing that automatically: a leading space might be a valid word separator (if a word begins with an accented character), so it shouldn't be removed blindly. I guess I'll have to use an intermediate output format and apply some manual fixes to it, but at least the converter does most of the work for me now, so that's a big improvement.
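
For what it's worth, here is a rough sketch of what such a fix-up pass over the intermediate HTML might look like. It is not anything Calibre provides: `fix_ogonek`, `OGONEK_BASES` and the `keep_space` flag are names made up for illustration, and the pattern is only the one sequence quoted above (non-breaking space, standalone ogonek, `<br>`, CR/LF, base letter). That 'ę' and the capitals break the same way is an assumption.

```python
import re

# Hypothetical mapping: base letter -> precomposed letter with ogonek.
# Only 'a' is confirmed by the debug output; the rest are assumed.
OGONEK_BASES = {"a": "ą", "A": "Ą", "e": "ę", "E": "Ę"}

# Non-breaking space (entity or literal U+00A0), standalone ogonek (U+02DB),
# a <br> tag, a line break (CR/LF or bare LF), then the base letter.
BROKEN = re.compile(r"(?:&nbsp;|\u00a0)\u02db<br\s*/?>\r?\n([aAeE])")

def fix_ogonek(html, keep_space=False):
    """Collapse the broken sequence into a single precomposed character.

    keep_space=False assumes the sequence sits in the middle of a word and
    drops the space; keep_space=True keeps a plain space in front of it.
    The script cannot tell which case applies, so this stays a manual choice.
    """
    def repl(match):
        letter = OGONEK_BASES[match.group(1)]
        return " " + letter if keep_space else letter
    return BROKEN.sub(repl, html)
```

With the default setting, `fix_ogonek("m&nbsp;\u02db<br>\r\naka")` gives `"mąka"`. The space problem described above is not solved here, only made explicit: wherever the space really is a word separator, the result still has to be checked by hand or run with `keep_space=True`.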