11-10-2020, 03:07 PM   #1
Shohreh
Zealot
Posts: 148
Karma: 192898
Join Date: Jan 2016
Device: none
[SOLVED] Tool to OCR an "image" PDF → add text as extra layer?

Hello,

Is there a tool that can…
1. OCR an "image" PDF, and
2. Include the text output as an additional layer in the PDF, so that the user can search it, and possibly select, copy, and paste text elsewhere, as if it were a "text" PDF?

Thank you.
[Attachment: A9DC227F-30D7-497F-911E-D38A2888CD63.png]

Last edited by Shohreh; 11-10-2020 at 06:44 PM.
11-10-2020, 04:42 PM   #2
Doitsu
Grand Sorcerer
Posts: 5,583
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by Shohreh
Is there a tool that can…
1. OCR an "image" PDF, and
2. Include the text output as an additional layer in the PDF, so that the user can search it, and possibly select, copy, and paste text elsewhere, as if it were a "text" PDF?
Besides Adobe Acrobat, pretty much any commercial OCR tool, e.g. ABBYY FineReader, can do this.
There are also a couple of free Linux tools that can do this, e.g. pdfsandwich, but most of them are neither easy to install nor exactly user-friendly.

Last edited by Doitsu; 11-10-2020 at 04:45 PM.
11-10-2020, 06:44 PM   #3
Shohreh
Zealot
Thanks for the info.

I tried a couple of open-source apps (NAPS2 and OCRmyPDF), and the output is pretty good.
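
In case it helps anyone else, here is a rough sketch of the OCRmyPDF route as a small shell wrapper. The function name ocr_pdf, the DRY_RUN switch, and the file names are made up for illustration; the "eng" language default and the --skip-text choice are assumptions, not settings tested in this thread. It also assumes ocrmypdf and the matching Tesseract language data are installed.

```shell
# Sketch: add a searchable text layer to an image-only PDF with OCRmyPDF.
# ocr_pdf and DRY_RUN are hypothetical names used only for this example.
# Usage:   ocr_pdf scan.pdf scan_ocr.pdf        (runs ocrmypdf)
#          DRY_RUN=1 ocr_pdf scan.pdf out.pdf   (only prints the command)
ocr_pdf() {
    input=$1
    output=$2
    lang=${3:-eng}   # Tesseract language code; "eng" is an assumed default

    # -l selects the OCR language; --skip-text leaves pages that already
    # contain text untouched instead of aborting on them.
    set -- ocrmypdf -l "$lang" --skip-text "$input" "$output"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"    # show what would be executed
    else
        "$@"         # run ocrmypdf with the arguments built above
    fi
}
```

To check that a text layer was actually added, something like `pdftotext scan_ocr.pdf - | head` (from poppler-utils, if installed) should print recognized text rather than nothing.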
11-14-2020, 09:23 AM   #4
willus
Fuzzball, the purple cat
Posts: 1,272
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Thanks for the tips on NAPS2 and OCRmyPDF; they look like great utilities. k2pdfopt can also do this, and it uses Tesseract for the OCR as well.

k2pdfopt -mode copy -n- -ocr t file.pdf
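
If there is a whole folder of scans, the same command can be wrapped in a loop. This is only a sketch: ocr_folder and DRY_RUN are made-up names, and skipping files that end in _k2opt.pdf assumes k2pdfopt's default output naming, so adjust that if you set a different output name. k2pdfopt may also pause at its interactive menu for each file unless prompting is turned off; see its usage text for the relevant options.

```shell
# Sketch: run the command above on every PDF in a folder.
# ocr_folder and DRY_RUN are hypothetical names used only for this example.
ocr_folder() {
    dir=${1:-.}
    for f in "$dir"/*.pdf; do
        [ -e "$f" ] || continue          # glob matched nothing
        case "$f" in
            *_k2opt.pdf) continue ;;     # assumed default output suffix; skip earlier results
        esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "k2pdfopt -mode copy -n- -ocr t $f"
        else
            k2pdfopt -mode copy -n- -ocr t "$f"
        fi
    done
}
```

Running `DRY_RUN=1 ocr_folder ~/scans` first prints the commands without touching any files, which is a cheap way to confirm which PDFs would be processed.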
12-15-2020, 10:23 AM   #5
charsee
Member
Posts: 10
Karma: 10
Join Date: May 2019
Location: Pakistan
Device: kindle4/kobo touch
Quote:
Originally Posted by willus
k2pdfopt -mode copy -n- -ocr t file.pdf
Do these options go in the "Additional options" box?
12-19-2020, 12:47 PM   #6
willus
Fuzzball, the purple cat
Quote:
Originally Posted by charsee
Do these options go in the "Additional options" box?
With the MS Windows GUI, you can set them as shown in the attached screenshot. The OCR option automatically turns off native mode, so no separate setting is needed for -n-.
[Attachment: screenshot.png]