12-30-2020, 06:02 AM   #7
Frenzie
Wizard
Device: Kobo Aura H2O
Quote:
Originally Posted by ichnilatis
So, do I have to make this correction?

-- document languages for OCR
DKOPTREADER_CONFIG_DOC_LANGS_TEXT = {"English", "Ancient Greek"}
DKOPTREADER_CONFIG_DOC_LANGS_CODE = {"eng", "grc"} -- language code, make sure you have corresponding training data
DKOPTREADER_CONFIG_DOC_DEFAULT_LANG_CODE = "eng" -- that have filenames starting with the language codes
Something like that, yes. If you want to keep the change across updates, make sure to put it in defaults.persistent.lua rather than editing defaults.lua directly.

Quote:
From the screenshot you sent I conclude that the breathings (᾿ ῾), the circumflex (῀) and the grave accent (`) are not recognized... and some letters

Can this problem be solved?
Not really, unless you have an original document with a slightly higher DPI. It's probably much less of a problem in non-italic text, though, and a newer version of Tesseract might also do slightly better.