I just want to add that I have a similar problem when I convert a PDF containing the font
Minion Pro. The original file is in German and could easily be mapped to a standard character set.
But letter combinations such as Th, ff, or fi (which this font sets as ligatures) cause the conversion to HTML to go wrong.
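In case it helps as a stopgap, here is a rough sketch of how such combinations could be turned back into plain letters after text extraction. This is not how calibre does it. It assumes the extracted text actually contains the Unicode ligature characters (U+FB00 to U+FB06, e.g. ff and fi). As far as I know the Th ligature has no Unicode code point of its own, so the `extra` table and the private-use code point U+E000 below are purely hypothetical placeholders. The function name `expand_ligatures` is also my own.

```python
import unicodedata


def expand_ligatures(text, extra=None):
    """Replace ligature characters with their plain-letter equivalents.

    `extra` is a hypothetical mapping for glyphs that have no standard
    Unicode ligature code point (for example a font-specific "Th").
    """
    extra = extra or {}
    out = []
    for ch in text:
        if ch in extra:
            # Font-specific glyph: use the caller-supplied replacement.
            out.append(extra[ch])
        elif "\ufb00" <= ch <= "\ufb06":
            # Latin ligatures in the Alphabetic Presentation Forms block:
            # compatibility decomposition yields the individual letters.
            out.append(unicodedata.normalize("NFKD", ch))
        else:
            # Everything else (including German umlauts) is left untouched.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # U+E000 stands in for a font-specific Th glyph (an assumption).
    sample = "\ue000ema: scha\ufb00en und \ufb01nden"
    print(expand_ligatures(sample, {"\ue000": "Th"}))
```

Only the ligature range is decomposed, rather than normalizing the whole string, so that other characters in the German text are not altered.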
I am looking forward to the new PDF conversion.
I am using calibre 0-72.