It looks like pdf2html converted them to 0x00AD (soft hyphen) rather than 0x002D (hyphen-minus). They're then stripped out somewhere between the final debugging output and the actual epub creation.
The paragraph breaks at some of these hyphens appear to come from bad line unwrapping during PDF conversion. I could play with line unwrapping to get a better PDF conversion and then manually convert the soft hyphens to regular hyphens.
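The manual soft-hyphen conversion could probably be scripted instead. Here is a minimal sketch in Python, assuming the intermediate text or HTML is available as a string before the epub step; the function names are made up for illustration and are not part of pdf2html or any conversion tool.

```python
# Hypothetical helpers: not part of pdf2html or any conversion tool.
SOFT_HYPHEN = "\u00ad"   # 0x00AD, invisible unless a line breaks there
HYPHEN_MINUS = "\u002d"  # 0x002D, the ordinary "-" character


def count_soft_hyphens(text: str) -> int:
    """Return how many soft hyphens the text contains."""
    return text.count(SOFT_HYPHEN)


def replace_soft_hyphens(text: str) -> str:
    """Replace every soft hyphen with a regular hyphen-minus."""
    return text.replace(SOFT_HYPHEN, HYPHEN_MINUS)


if __name__ == "__main__":
    sample = "a well\u00adknown problem"
    print(count_soft_hyphens(sample), "soft hyphen(s) found")
    print(replace_soft_hyphens(sample))
```

Note that this replaces every soft hyphen, including any that only marked an optional line-break point inside a word, so the result would still need a look over afterwards.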