Old 08-03-2020, 10:59 AM   #42
Hitch
Bookmaker & Cat Slave
Quote:
Originally Posted by Tex2002ans
What did you export as?

You should see italics/bold showing up in the right half of Finereader:

Attachment 180911

Left should display the original document, and the Right half should show all the actual OCRed text.

Did you select Document Layout: "Formatted Text"? In the dropdown, you can also select DOCX:

Attachment 180910

(Personally, I keep everything on "Exact Copy" until I'm ready to export the document. This makes the Left/Right halves match much more closely, making it easier to make corrections.)


(snippage for brevity)

I explained some of this back in 2014: Post #5 in "Problems converting K2PDF Opt files to EPUB".



As long as your headings are marked up properly (<h1> <h2> <h3> ...), you can regenerate that from Sigil.

One trick that you can use, to make your life a bit less horrible, is to take the exported Word file, in the original format/layout, and then turn right around and export it to PDF, and then run a COMPARE of the original PDF versus the new one. Now...that only works worth a damn if you already have a text layer in the original PDF, but if you do, this can save you a crapload of braindamage. Take the compare, make the edits.

Take the new, cleaned-up Word file, export it to PDF, lather-rinse-repeat.

Yes, it's tedious and all that, but it's a shedload less tedious than trying to find all the OCR errors yourself manually. Does it find everything? Oh, hells, no, but it's an option that most people overlook.
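
For anyone who would rather script the compare step than eyeball it, here is a minimal sketch in Python. It assumes you have already pulled the text layer out of both PDFs into plain strings (with whatever extraction tool you prefer; that step isn't shown), and the function name `find_ocr_mismatches` is made up purely for illustration. It only flags word-level differences, so like the manual compare, it won't find everything.

```python
import difflib


def find_ocr_mismatches(original_text, ocr_text):
    """Return (original_words, ocr_words) pairs where the two texts disagree.

    Both arguments are plain strings, assumed to have been extracted
    from the original PDF and the re-exported PDF beforehand.
    """
    original_words = original_text.split()
    ocr_words = ocr_text.split()
    # autojunk=False so frequent words are not silently ignored on long texts
    matcher = difflib.SequenceMatcher(
        None, original_words, ocr_words, autojunk=False
    )
    mismatches = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            mismatches.append(
                (" ".join(original_words[i1:i2]), " ".join(ocr_words[j1:j2]))
            )
    return mismatches


if __name__ == "__main__":
    # Made-up sample strings with typical OCR confusions (rn/m, 1/l)
    original = "The modern reader will find the burn marks on page 11."
    ocred = "The modem reader will find the bum marks on page ll."
    for before, after in find_ocr_mismatches(original, ocred):
        print(f"original: {before!r}  ->  OCR: {after!r}")
```

Each pair it reports is a spot to go check in the Word file before the next export-and-compare round.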

Offered FWIW.

Hitch