Quote:
Originally Posted by Hitch
So, sports fans:
I ran a test using FlexiPDF, and the short answer is that however Kim is using it, it's 100% different from what we do here. Gotta be. Not only did I have to screw around to get it to export the entire document, but every time I do, the resulting 62 MB Word file (from a 400-page, super-clean PDF novel) crashes Word, causing serious errors. Maybe it works better in simpler environments, but...
On the pages that I did manage to export as a test without crashing, every paragraph is put inside a TABLE.
I did a "save as...Word" export from the same PDF in Acrobat, and sure, I got broken paragraphs, but I also got the body text styles, etc. The Acrobat export was far, far superior to the FlexiPDF export.
In short, again--Kim must be using it very, very, VERY differently than we do, because this is not a program that I'd use, having seen the results. I've tried probably 10 different "convert from PDF" programs, and honestly, this ranks amongst the worst. None of them are good, of course; that's the bottom line. But this one, putting each paragraph in a table? That's a whole new low.
I did buy the $25 TransTools suite; I figure if it's a disaster, I can afford to lose the money. I'm going to test it against the broken paras in the from-Acrobat export I did as part of the test. I'll report back on how it handles broken paragraphs, because that is a feature I could really use.
Hitch
I only work in clean text/xhtml - not Word. That's probably the only difference. Every other piece of software I've ever used to get text layers from PDFs has exported each and every space and line break for every (EVERY!) single line - except FlexiPDF.
That said, I just tried to export the text layer from an OCR'ed PDF I've got from the Royal Library of Copenhagen to Word, and using the standard settings, I'll admit it stinks. But if you press "Format" at the bottom right of the export dialogue, you can alter the standard settings. I removed everything except "Text Output" and "De-hyphenate" for the Word export, and got a nice Word doc with none of the issues you mention. Not perfect (because the OCR from the Royal Library is not perfect), but very, very usable.
You'll probably have to fiddle with the settings to get exactly what you want, but I think you might be a little too quick to condemn FlexiPDF.
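For anyone who still ends up with broken paragraphs in a plain-text export, here is a minimal Python sketch of the kind of cleanup a "De-hyphenate" plus paragraph-rejoin pass performs. It is not how FlexiPDF or TransTools work internally; the function name `reflow` and its heuristics (a blank line marks a paragraph break, a trailing hyphen followed by a lowercase letter marks a word split across lines) are assumptions made purely for illustration.

```python
import re


def reflow(text: str) -> str:
    """Join hard-wrapped lines into paragraphs and undo end-of-line hyphenation.

    Assumptions (hypothetical, not taken from any real tool):
    - paragraphs are separated by at least one blank line;
    - a line ending in "-" followed by a line starting with a lowercase
      letter is a single word that was split by the line break.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    cleaned = []
    for para in paragraphs:
        # Drop empty lines and surrounding whitespace inside the paragraph.
        lines = [line.strip() for line in para.splitlines() if line.strip()]
        joined = ""
        for line in lines:
            if joined.endswith("-") and line[:1].islower():
                # Remove the hyphen and glue the two word halves together.
                joined = joined[:-1] + line
            elif joined:
                # Ordinary line break inside a paragraph: replace with a space.
                joined += " " + line
            else:
                joined = line
        cleaned.append(joined)
    return "\n\n".join(cleaned)


sample = "The quick brown fox jum-\nped over the\nlazy dog.\n\nSecond para-\ngraph here."
print(reflow(sample))
```

Be aware that this heuristic will also merge genuine compounds that happen to break at the hyphen (for example "twenty-" / "five" becomes "twentyfive"), so the result still needs a proofread.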
Regards,
Kim