Actually, Calibre does go through and remove hyphens from hyphenated words intelligently. It uses the document itself as a dictionary: if a variant of the word without the hyphen appears elsewhere in the text, it deletes the hyphen.
The problem in this case is that it's a crappy PDF with some other character encoded in addition to the hyphen. Unless this turns out to be a common issue across many PDFs (and I've never seen it, despite lots of test cases), it's probably not something that will get covered in the code.
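To make the "document as dictionary" idea concrete, here is a rough Python sketch. It is not Calibre's actual code: the function name `dehyphenate` and the regular expressions are my own assumptions, and the real implementation is certainly more careful. It only illustrates why a stray character next to the hyphen defeats the match.

```python
import re

def dehyphenate(text):
    """Hypothetical sketch: drop a hyphen when the joined word occurs in the text."""
    # Vocabulary: every run of ASCII letters in the document, lower-cased.
    vocab = {w.lower() for w in re.findall(r"[A-Za-z]+", text)}

    def join_if_known(match):
        joined = match.group(1) + match.group(2)
        # Keep the original text unless the joined form is a known word.
        return joined if joined.lower() in vocab else match.group(0)

    # Fragment, hyphen, optional whitespace (e.g. a line break), fragment.
    return re.sub(r"([A-Za-z]+)-\s*([A-Za-z]+)", join_if_known, text)


clean = "The docu-\nment is long. Every document has well-known quirks."
print(dehyphenate(clean))

# Assumed example of a bad PDF: a soft hyphen (U+00AD) follows the real hyphen,
# so the pattern never matches and the word is left alone.
broken = "The docu-\u00adment is long. Every document has quirks."
print(dehyphenate(broken) == broken)
```

In the first string "docu-" and "ment" get joined because "document" shows up elsewhere, while "well-known" keeps its hyphen because "wellknown" never appears. In the second string the extra character sits between the hyphen and the rest of the word, so nothing is changed.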