Quote:
Originally Posted by Tex2002ans
What did you export as?
You should see italics/bold showing up in the right half of Finereader:
Attachment 180911
Left should display the original document, and the Right half should show all the actual OCRed text.
Did you select Document Layout: "Formatted Text"? In the dropdown, you can also select DOCX:
Attachment 180910
(Personally, I keep everything on "Exact Copy" until I'm ready to export the document. This makes the Left/Right halves match much more closely, making it easier to make corrections.)
(snippage for brevity)
I explained some of this back in 2014: Post #5 in "Problems converting K2PDF Opt files to EPUB".
As long as your headings are marked correctly (<h1> <h2> <h3> ...), you can regenerate that from Sigil.
|
One trick that you can use, to make your life a bit less horrible, is to take the exported Word file, in the original format/layout, turn right around and export it to PDF, and then run a COMPARE of the original PDF versus the new one. Now...that only works worth a damn if you already have a text layer in the original PDF, but if you do, this can save you a crapload of braindamage. Take the compare, make the edits.
Take the new, cleaned-up Word file, export to PDF, lather-rinse-repeat.
Yes, it's tedious and all that, but it's a shedload less tedious than trying to find all the OCR errors yourself manually. Does it find everything? Oh, hells, no, but it's an option that most people overlook.
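For anyone who would rather script the compare step than eyeball it, here is a rough sketch of the same idea in Python. It assumes you have already dumped the text layer of both PDFs to plain text with whatever extraction tool you prefer; the function name `word_diffs` and the sample strings are made up for illustration. It is not a replacement for a proper PDF compare, just a way to list the words where the two versions disagree.

```python
import difflib


def word_diffs(original_text, ocr_text):
    """Return (original_words, ocr_words) pairs where the two texts disagree."""
    # Split on whitespace so differences in line wrapping are ignored.
    a = original_text.split()
    b = ocr_text.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Record the mismatching run of words from each side.
            diffs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return diffs


if __name__ == "__main__":
    # Hypothetical sample text: typical "rn" -> "m" OCR confusions.
    original = "The modern reader will find\nthe corner of the room."
    ocr = "The modem reader will find the comer of the room."
    for old, new in word_diffs(original, ocr):
        print(f"original: {old!r}  ->  OCR: {new!r}")
```

Comparing word by word rather than line by line means a reflowed paragraph does not show up as one giant difference, which matters because the re-exported PDF rarely wraps lines the same way as the original.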
Offered FWIW.
Hitch