07-24-2020, 03:38 AM   #38
Shohreh
Addict
Thanks a lot for the info about layers.

Quote:
Originally Posted by Tex2002ans View Post
Didn't you already say in Post #28 that you ran this PDF through Finereader? Finereader should have carried over italics and other formatting for you.
Because FineReader did not carry over the formatting, I wanted to try other tools. Since the PDF contains two layers, it made sense to extract the "text" layer and compare it with the result of running the PDF through FineReader.

Turns out it's still a bit of work to…
  • Re-add formatting (bold, italics, etc.)
  • Fix the hyphenated words that FineReader didn't correct (still much better than starting from the raw text from pdftotext, since FineReader uses a dictionary to fix most of those)
  • Re-add footnotes
  • Take pictures of tables and… pictures, and insert them
  • Build a ToC
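
For the leftover hyphenated words, here is a rough sketch of the dictionary idea in Python. It is not what FineReader does internally (I don't know how it works); it just joins a word split across a line break when the joined form is in a word list. The `KNOWN_WORDS` set and the `dehyphenate` name are made up for the example; a real run would load a full dictionary file instead.

```python
import re

# Hypothetical mini word list for illustration only;
# in practice you would load a full dictionary for the book's language.
KNOWN_WORDS = {"formatting", "extraction", "footnotes"}

# A word fragment, a hyphen at the very end of a line, then the continuation.
_BREAK = re.compile(r"(\w+)-\n(\w+)")

def dehyphenate(text, known_words=KNOWN_WORDS):
    """Join words split across a line break when the joined form is a known word."""
    def fix(match):
        joined = match.group(1) + match.group(2)
        if joined.lower() in known_words:
            # Known word: drop the hyphen, keep the line break after the word.
            return joined + "\n"
        # Unknown word: leave it untouched so it can be reviewed by hand.
        return match.group(0)
    return _BREAK.sub(fix, text)

if __name__ == "__main__":
    print(dehyphenate("Re-add the format-\nting and the foot-\nnotes"))
```

Anything the word list doesn't recognize (real compounds like "well-known", names, etc.) is left alone, so it still needs a manual pass afterwards.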