Originally Posted by ldolse
Actually Calibre does go through and remove hyphenated words intelligently. It uses the document itself as a dictionary to see if there is a variant of the word without a hyphen, and deletes the hyphen if there is a match.
The problem in this case is it's a crappy pdf with some other character encoded in addition to the hyphen. Unless this is a common issue across many pdfs (and I've never seen it with lots of test cases), it's probably not something that will get covered in the code.
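For anyone curious, the idea ldolse describes could be sketched roughly as follows. This is only an illustration of the "use the document as its own dictionary" approach, not Calibre's actual code; the function name `dehyphenate` and the regular expressions are assumptions made up for this example.

```python
import re

# Hypothetical pattern: a word fragment, a hyphen, optional whitespace
# (such as a line break), then another word fragment.
HYPHENATED = re.compile(r"\b([A-Za-z]+)-\s*([A-Za-z]+)\b")


def dehyphenate(text: str) -> str:
    """Join hyphenated words only when the joined form appears elsewhere in the text."""
    # The document acts as its own dictionary: collect every plain word in it.
    vocabulary = {word.lower() for word in re.findall(r"[A-Za-z]+", text)}

    def join_if_known(match: re.Match) -> str:
        joined = match.group(1) + match.group(2)
        # Drop the hyphen only if the unhyphenated variant exists in the document.
        if joined.lower() in vocabulary:
            return joined
        return match.group(0)

    return HYPHENATED.sub(join_if_known, text)


if __name__ == "__main__":
    sample = "The conver-\nsion failed. A second conversion worked. A well-known case."
    print(dehyphenate(sample))
```

In this sketch "conver-sion" is joined because "conversion" appears elsewhere in the sample, while "well-known" is left alone because "wellknown" never appears. A pattern like this would also miss a hyphen that has some other character encoded alongside it, which matches the problem described above.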
Globalistan - Pepe Escobar
seems to be a good-quality, non-commercial PDF, unless I'm misunderstanding the Creative Commons licence?
QUOTE from http://www.nimblebooks.com/wordpress...mmons-license/
GLOBALISTAN free under Creative Commons License
Inspired by the example of the science fiction novelist Peter Watts, who released the full text of his outstanding novel BLINDSIGHT under a Creative Commons License last year to deservedly rapturous acclaim from Boing Boing! and many others, Pepe Escobar and I are happy to announce the Free GLOBALISTAN Project.
The full text of Pepe’s brilliant new book, GLOBALISTAN: HOW THE GLOBALIZED WORLD IS DISSOLVING INTO LIQUID WAR, is now available under a Creative Commons license in both PDF and HTML format.
Maybe I should try grabbing and converting an HTML version instead? Unfortunately the link to the HTML version on the above site seems to be broken; only the PDF link works.
PS: Could someone please explain: if the book is being legally distributed for free, with the author's blessing, how come Amazon still wants £5.27 for a Kindle version?