Old 03-05-2022, 04:30 PM   #1
ownedbycats
Custom User Title
Posts: 8,656
Karma: 61234567
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
Is there a GUI for OCRmyPDF?

https://wiki.mobileread.com/wiki/OCRmyPDF

I am... fairly bad with using the command-line. Does anybody know if there was any sort of GUI made for this?

Old 03-05-2022, 06:34 PM   #2
willus
Fuzzball, the purple cat
Posts: 1,273
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
For any particular platform / operating system? It looks like OCRmyPDF is largely targeted at Linux but can be installed on OS X and Windows with some third-party support (e.g. Python, Homebrew, Cygwin). There are a lot of other OCR apps if you find OCRmyPDF difficult to use. If you have an iPhone, for example, there is Elucidate, which also uses the Tesseract OCR engine.
Old 03-05-2022, 08:42 PM   #3
ownedbycats
I somehow managed to completely skip over the fact that there's no real Windows implementation.
Old 03-06-2022, 10:53 AM   #4
Quoth
the rook, bossing Never.
Posts: 11,171
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
OCRmyPDF is really just a wrapper around Tesseract anyway. Even on Linux you might use something else with Tesseract!

You can run Linux for free, either in a VM (VirtualBox is free on Windows 10 and is a common way to run XP and Win7 on Win10), from a USB stick, or by dual booting. Or ditch Windows altogether (I did entirely in Jan 2017, though I keep a clone of my 2002 XP laptop in a VirtualBox VM on Linux and run Office 2003 under WINE).
Old 03-06-2022, 10:57 AM   #5
JSWolf
Resident Curmudgeon
Posts: 74,037
Karma: 129333114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
There's no good program for OCRing PDF. It's not possible.
Old 03-06-2022, 01:00 PM   #6
j.p.s
Grand Sorcerer
Posts: 5,285
Karma: 98804578
Join Date: Apr 2011
Device: pb360
Quote:
Originally Posted by JSWolf View Post
There's no good program for OCRing PDF.
That might well be true.

Quote:
Originally Posted by JSWolf View Post
It's not possible.
Prove it or retract.
Old 03-06-2022, 03:24 PM   #7
JSWolf
Quote:
Originally Posted by j.p.s View Post
That might well be true.

Prove it or retract.
If it were possible to easily convert PDF to something else that doesn't need a lot of fixing, it would have already been done.
Old 03-06-2022, 04:16 PM   #8
j.p.s
Quote:
Originally Posted by JSWolf View Post
If it were possible to easily convert PDF to something else that doesn't need a lot of fixing, it would have already been done.
And that is why we already have cheap fusion energy for our cities on Mars.

You have proved nothing. Prove your assertion, retract, or rephrase to be true.

(You have been told that making an assertion does not establish a fact, and it certainly does not constitute proof.)
Old 03-07-2022, 06:07 AM   #9
Quoth
Quote:
Originally Posted by JSWolf View Post
There's no good program for OCRing PDF. It's not possible.
It depends on the PDF image quality. On a good day it can beat a copy-typist.

That's too general a statement.

You certainly need to proof & edit the output.
Old 03-07-2022, 09:58 AM   #10
salamanderjuice
Guru
Posts: 727
Karma: 10215666
Join Date: Jul 2017
Device: Boox Nova 2
Quote:
Originally Posted by JSWolf View Post
If it were possible to easily convert PDF to something else that doesn't need a lot of fixing, it would have already been done.
It really depends on what you're using the OCR'd text for. To make an ePub? Yeah, it'll need some work. To search the text of a bunch of PDFs? It can do a pretty good job.
Old 03-07-2022, 05:02 PM   #11
ownedbycats
I am now very confused. I just wanted to add a text layer to some scanned booklets I had for search purposes.
Old 03-08-2022, 10:28 AM   #12
salamanderjuice
Tesseract is an open source OCR engine that a lot of programs use, OCRmyPDF included. There are a number of free programs that do OCR with it, such as gImageReader, which runs on Windows with a GUI. I don't think gImageReader will embed the text in a layer like OCRmyPDF does, but it will at least let you get a text document out.

JSWolf seems to think OCR on PDF is useless but he's wrong.

If you have Windows 10/11 it's also not that hard to get OCRmyPDF working under the Windows Subsystem for Linux (WSL).

There are also some websites that will do it, like https://www.sandwichpdf.com/
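
For the search-only use case in this thread, once OCRmyPDF is installed (under WSL, for example), a whole folder of scans can be handled by a small script instead of typing each command by hand. This is only a rough sketch, not an official tool: the function names are made up for illustration, and it assumes the `ocrmypdf` command is on your PATH. It uses OCRmyPDF's `-l` (language) and `--skip-text` options.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocrmypdf_command(src, dst, language="eng", skip_text=True):
    """Return the argument list for one OCRmyPDF run (nothing is executed)."""
    cmd = ["ocrmypdf", "-l", language]
    if skip_text:
        # --skip-text leaves pages that already contain text untouched.
        cmd.append("--skip-text")
    cmd += [str(src), str(dst)]
    return cmd


def add_text_layers(in_dir, out_dir, language="eng"):
    """OCR every PDF in in_dir and write searchable copies to out_dir."""
    # Assumption: ocrmypdf is installed and reachable on PATH.
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf was not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).glob("*.pdf")):
        dst = out_dir / src.name
        # check=True stops the batch if OCRmyPDF reports an error.
        subprocess.run(build_ocrmypdf_command(src, dst, language), check=True)
```

Calling `add_text_layers("scans", "searchable")` (both folder names are placeholders) would leave the originals alone and put the copies with a text layer in the second folder.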
Old 03-08-2022, 10:45 PM   #13
willus
Quote:
Originally Posted by ownedbycats View Post
I am now very confused. I just wanted to add a text layer to some scanned booklets I had for search purposes.
If you cannot get OCRmyPDF working in Windows or are not comfortable with the command line, you can try k2pdfopt. It will unfortunately re-render each page as a new bitmap (this can be good if you want to change the resolution of the original PDF, or it can be bad if the original PDF uses a better compression technique than k2pdfopt). If using the MS Windows GUI, select "copy" for the conversion mode and then check the "OCR (Tesseract)" box.

If you dig deeper there are options to adjust contrast, gamma correction, and output resolution.

Last edited by willus; 03-08-2022 at 10:52 PM.