04-21-2023, 09:42 PM | #1 |
Junior Member
Posts: 7
Karma: 10
Join Date: Aug 2020
Device: Kindle Paperwhite 2021
|
converter for OCR from images
I have some books that the publisher in my country does not sell in digital format, and I would like to convert them to read on my Kindle.
On the Kindle I can increase the text size and read at night with its built-in light, and the book is 98% text. But I did not find a way to do this in calibre. Does it support this feature? |
04-21-2023, 10:03 PM | #2 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
No, calibre does not have any OCR capabilities.
|
04-22-2023, 09:13 AM | #3 |
Junior Member
Posts: 7
Karma: 10
Join Date: Aug 2020
Device: Kindle Paperwhite 2021
|
And is there no possibility of having this support in the future?
|
04-22-2023, 10:01 AM | #4 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
calibre is open source; anyone can contribute code to it. I will say this is not on my horizon.
|
04-28-2023, 02:01 AM | #5 |
Connoisseur
Posts: 52
Karma: 666
Join Date: May 2020
Location: Germany
Device: android smartphone + tablet
|
|
04-28-2023, 02:20 PM | #6 |
the rook, bossing Never.
Posts: 11,158
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
However, I think getting text from PDF images via OCR is a workflow best done before the final version is added to the Library. It's not conversion in the sense that mobi, azw3, epub etc. are converted to each other. It needs human proofing and editing.
|
04-29-2023, 04:07 AM | #7 | |
Connoisseur
Posts: 52
Karma: 666
Join Date: May 2020
Location: Germany
Device: android smartphone + tablet
|
Quote:
In my use case, however, it is about adding a text layer to a PDF whose layout should not be changed, for example to enable full-text search (FTS) and text extraction. In my experience, with a good scan and the correct configuration of the OCR software (Tesseract), the recognition errors are relatively small and hardly affect the FTS. If you copy a piece of text for a quote, you can easily check it against the original layout. Also, before writing the text layer, I plan to offer the scan result in a text editor, with the possibility of proofreading. |
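As a rough illustration of the text-layer idea described above, here is a minimal sketch that drives the Tesseract command-line tool from Python. It assumes the `tesseract` binary is installed and on the PATH; the helper names `build_tesseract_command` and `ocr_to_searchable_pdf` and the file names are made up for this example and are not part of calibre or of any existing plugin.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, lang="eng"):
    """Return the argv list for a searchable-PDF run of the tesseract CLI."""
    # The trailing "pdf" selects Tesseract's PDF output config, which keeps
    # the page image and puts the recognized text in an invisible layer.
    return ["tesseract", str(image), str(output_base), "-l", lang, "pdf"]


def ocr_to_searchable_pdf(image, output_base, lang="eng"):
    """Run tesseract if it is installed; return the PDF path, else None."""
    if shutil.which("tesseract") is None:
        return None  # tesseract is not on PATH, so nothing is run
    subprocess.run(build_tesseract_command(image, output_base, lang), check=True)
    # Tesseract appends ".pdf" to the output base name itself.
    return Path(f"{output_base}.pdf")


if __name__ == "__main__":
    # Hypothetical file name; "deu" is the Tesseract language code for German.
    print(build_tesseract_command("page-001.png", "page-001", lang="deu"))
```

This handles one page image at a time; merging the per-page PDFs and offering the recognized text for proofreading first would be separate steps.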
|
04-29-2023, 05:49 AM | #8 |
the rook, bossing Never.
Posts: 11,158
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
My experience of an unproofed OCR text layer is the Internet Archive. Indeed, it's only of any use for text search, and not great for that.
|
05-09-2023, 08:08 AM | #9 |
Junior Member
Posts: 2
Karma: 10
Join Date: May 2023
Device: kindle
|
I recently converted an image-only PDF (no text layer) into an EPUB. It took me a couple of hours to convert it into text.
Some OCR software is very accurate, but the one that gave me the fewest errors was Google Drive's. Make sure your PDF does not exceed 2 MB; you can split it if it does. Upload the files to Google Drive, right-click a file and choose to open it with Google Docs, and you're done. Then you can download it as txt, docx, etc. and convert it into EPUB. Good luck! |
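For the splitting step, one way to decide where to cut is to group consecutive pages so that each piece stays under the 2 MB figure mentioned above. The sketch below only plans the page ranges; it assumes you already have an approximate size in bytes for each page (for example, the size of each scanned image), and the function name `plan_chunks` is made up for this example. The actual splitting would still be done with whatever PDF tool you prefer.

```python
LIMIT_BYTES = 2 * 1024 * 1024  # the 2 MB figure from the post above


def plan_chunks(page_sizes, limit=LIMIT_BYTES):
    """Group consecutive pages into chunks whose summed size stays within limit.

    Returns a list of (first_page, last_page) tuples, 1-based and inclusive.
    A single page larger than the limit ends up in a chunk of its own.
    """
    chunks = []
    start = 1
    total = 0
    for page, size in enumerate(page_sizes, start=1):
        # Close the current chunk before this page would push it over the limit.
        if total and total + size > limit:
            chunks.append((start, page - 1))
            start, total = page, 0
        total += size
    if page_sizes:
        chunks.append((start, len(page_sizes)))
    return chunks


if __name__ == "__main__":
    # Hypothetical per-page sizes in bytes.
    print(plan_chunks([900_000, 900_000, 900_000, 100_000]))
```

Per-page sizes are only an estimate of the size of the resulting PDF pieces, so it is worth checking the actual file sizes after splitting.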
|