07-22-2023, 02:28 AM   #23
Karellen
Location: Australia
Device: Kobo Libra 2
That is a great video @Tex2002ans, really helpful.

Today I decided to create a better workflow with this new information.

Firstly, I had to do something about my prehistoric scanner. The interface is non-existent. Yep, when Adobe Flash Player was removed from Windows, I lost access to the scanner's GUI. Up until now I've been using the WIA function in Photoshop. Not ideal.

1. Hunting around for new WIA-compliant scanner software, I found this...
https://www.naps2.com/
Simple and easy to use. The BEST feature is that it can batch scan. You enter how many scans to make and how many seconds to wait between scans (6 sec in my case), then press Start. All you need to worry about is turning the page within those 6 seconds. In a matter of a few minutes, 15 scans are completed (30 book pages).

2. Once those scans are created, it's time for Scan Tailor Advanced.
It is very quick and simple thanks to all the batch processes. In a few minutes, the 30 pages are turned into OCR-ready TIFF images.

3. Then on to gImageReader...
https://github.com/manisandro/gImageReader/
Batch OCR the 30 pages.

4. Next comes LibreOffice, and the OCR text is copied across.
This is where it becomes quite time-consuming: fixing all those little OCR errors, then marking the chapter headings. Once done, export to EPUB.
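
Some of the mechanical cleanup in step 4 could probably be scripted before the manual pass. Below is a minimal sketch in Python, assuming the OCR output has been saved as plain text. The function name `clean_ocr_text` is hypothetical and not part of any tool mentioned in this post. It only rejoins words hyphenated across line ends, unwraps hard line breaks inside paragraphs, and collapses repeated spaces.

```python
import re


def clean_ocr_text(text: str) -> str:
    """Apply a few mechanical fixes to raw OCR text (hypothetical helper)."""
    # Rejoin words split by a hyphen at a line end: "scan-\nner" -> "scanner".
    # Caveat: this also joins genuine compounds broken at the hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)

    # Treat blank lines as paragraph separators.
    paragraphs = re.split(r"\n\s*\n", text)

    cleaned = []
    for para in paragraphs:
        # Unwrap single line breaks inside a paragraph.
        para = re.sub(r"\s*\n\s*", " ", para)
        # Collapse runs of spaces or tabs into one space.
        para = re.sub(r"[ \t]{2,}", " ", para).strip()
        if para:
            cleaned.append(para)

    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "The scan-\nner was  old.\nIt still works.\n\nNext para-\ngraph."
    print(clean_ocr_text(sample))
```

Real OCR mistakes (such as "rn" read as "m") still need a human eye, so this would only shorten the proofreading, not replace it.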

Of course, the ebook needs a bit of work for a good quality final product, but my main concern was the OCR side.

I have previously tried to scan pages from books, but it was a very frustrating experience: I spent close to three hours scanning 20 pages and adding them to an ebook. I realise now that I attempted this without the right knowledge and tools. So thanks for all the great pointers!!

If there is anything in my workflow that could be improved, please let me know.