Quote:
Originally Posted by AIDM2
So, there's a book that has Sanskrit text at the bottom of almost every page (see image).
|
Also, here is a link to the Archive.org version:
https://archive.org/details/materiamedicahi00kinggoog
Quote:
Originally Posted by AIDM2
Would it be possible to program ABBYY FineReader to ignore every instance of that text?
|
You will have to manually intervene in those cases.
You can do one of the following (ranked from worst to best):
- Ignore those sections and clean them up in the output after you export.
- Manually delete the garbage text from the Text Window (right side).

- Resize the "Text Recognition Box" so it does not include the footnote.
  - Don't forget to "Read" the page again.

- Resize the text box, and then mark the footnote as an Image instead.
  - I would choose this method.
  - Don't forget to "Read" the page again.

Marking the Sanskrit footnotes as images lets you export them all. You can then feed those images into an OCR tool that can read Sanskrit, transcribe the text manually with less effort, or, as a last resort, embed the Sanskrit in the book as images (although I HIGHLY recommend against embedding text as images).
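If you go the OCR route, here is a rough sketch of how the exported footnote images could be batch-fed to a Sanskrit-capable tool. I am assuming Tesseract as the tool (my pick, not something FineReader provides), installed with its Sanskrit language data (`san`), and that the footnotes were exported as PNG files. The function names and any folder names you pass in are hypothetical, so adjust them to your setup.

```python
import subprocess
from pathlib import Path


def tesseract_command(image: Path, out_dir: Path, lang: str = "san") -> list[str]:
    """Build the Tesseract CLI call for one exported footnote image."""
    # Tesseract appends ".txt" to the output base by itself.
    output_base = out_dir / image.stem
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_footnotes(image_dir: Path, out_dir: Path, lang: str = "san") -> list[Path]:
    """Run Tesseract on every PNG in image_dir and list the expected text files."""
    out_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for image in sorted(image_dir.glob("*.png")):
        # check=True stops the batch if Tesseract fails on an image
        # (for example, when the "san" language data is not installed).
        subprocess.run(tesseract_command(image, out_dir, lang), check=True)
        results.append(out_dir / f"{image.stem}.txt")
    return results
```

Calling `ocr_footnotes(Path("footnote_images"), Path("footnote_text"))` would leave one text file per footnote image, which you would still need to proofread against the scans before pasting anything back into the book.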