Quote:
Originally Posted by elibrarian
I only work in clean text/xhtml - not Word. That's probably the only difference. Every other piece of software I've ever used to get text layers from PDFs has exported each and every space and line break for every (EVERY!) single line - except FlexiPDF.
That said, I just tried to export the text layer from an OCR'ed PDF I got from the Royal Library of Copenhagen to Word, and with the standard settings, I'll admit it stinks. But if you press "Format" at the bottom right of the export dialogue, you can alter the standard settings. I removed everything except "Text Output" and "De-hyphenate" for the Word export, and got a nice Word doc with none of the issues you mention. Not perfect (because the OCR from the Royal Library is not perfect), but very, very usable.
You'll probably have to fiddle with the settings to get exactly what you want, but I think you might be a little too quick to condemn FlexiPDF.
Regards,
Kim
P.S., Kim:
I did want to say, the Unbreaker in TransTools is simply AMAZEBALLS, and we all love it here. So, well done on that one! We are appreciative of the referral. LOVE IT.
(Still not wild about FlexiPDF, but that may just be a matter of taste.)
Hitch