07-07-2021, 07:08 AM   #19
roger64
Hi

The tools I used: Tesseract, gImageReader and LibreOffice (LO).

All images are in the attached zip file.

The sources are the two attached images, Pasteur 01.jpg and Pasteur 02.jpg. It's a scientific (admittedly old) text with italics, superscripts and some special characters, so nothing especially easy.

I took the following screenshots and output files:
- écran gimagereader is what you get straight after recognition. At this point you can correct the mistakes flagged in red or simply carry on. I did not correct anything.
- écran gimagereader2 is what you get when you click the option that removes the line ends.
- Pasteur.txt is the output from gImageReader.
- Pasteur.odt is what you get in LO when you import Pasteur.txt into your working model (template).
- checking.png shows how I proceed during the checking phase: the image on the left, the working model on the right.
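
For anyone who wants to see what removing the line ends amounts to, here is a minimal Python sketch. It is not gImageReader's actual code, only my assumption of the general idea: lines inside a paragraph are joined, blank lines keep separating paragraphs, and words split by a hyphen at a line end are glued back together. The function name `unwrap_lines` is made up for the illustration.

```python
import re


def unwrap_lines(text):
    """Join the lines inside each paragraph; blank lines still separate paragraphs."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    joined = []
    for para in paragraphs:
        # Re-attach a word hyphenated across a line end: "fermen-\ntation" -> "fermentation".
        # Assumption: a hyphen at a line end is always a soft hyphen, which is not
        # true for real compounds, so the result still needs proofreading.
        para = re.sub(r"(\w)-\n(\w)", r"\1\2", para)
        # Replace every remaining line end with a single space.
        para = re.sub(r"\s*\n\s*", " ", para)
        joined.append(para.strip())
    return "\n\n".join(joined)


if __name__ == "__main__":
    sample = "La fermen-\ntation est un\nphénomène.\n\nSecond\nparagraphe."
    print(unwrap_lines(sample))
```

The de-hyphenation step is the risky part: a genuine compound word that happens to break at a line end will lose its hyphen, which is one more reason to keep the page image next to the text while checking.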

I hope these images and screenshots give you an honest idea of what Tesseract 4.1.1 can do now. The text of most fiction books is easier than this example.
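
If you would rather try the same pages without the GUI, the sketch below calls the `tesseract` command-line program from Python. It assumes that `tesseract` is on your PATH and that the French language data (`fra`) is installed. The language choice is my assumption for this example, and the helper names `tesseract_command` and `run_ocr` are invented for the illustration.

```python
import subprocess
from pathlib import Path


def tesseract_command(image, lang="fra"):
    """Build the argument list: tesseract <image> <output base> -l <lang>."""
    # Tesseract appends ".txt" to the output base by itself.
    out_base = str(Path(image).with_suffix(""))
    return ["tesseract", image, out_base, "-l", lang]


def run_ocr(image, lang="fra"):
    """Run Tesseract on one image and return the path of the text file it writes."""
    subprocess.run(tesseract_command(image, lang), check=True)
    return Path(image).with_suffix(".txt")


if __name__ == "__main__":
    for page in ("Pasteur 01.jpg", "Pasteur 02.jpg"):
        print(run_ocr(page))
```

This produces plain text only, so it skips the red-flagged corrections and the line-end option that gImageReader offers.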
Attached Files
File Type: zip tesseract.zip (3.18 MB, 229 views)