#1
Guru
Posts: 942
Karma: 53902736
Join Date: Jun 2015
Device: multiple
Easy way to check for PDFs with no text or buggy text?
Sometimes PDFs simply lack a text layer and need OCR. Sometimes they start out with text but lose it to pre-processing bugs. Is there some sort of quality-check tool for PDFs that can find the ones that lack text or have seriously screwed-up text?
#2
Still reading
Posts: 15,176
Karma: 111120239
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
If you can't select any text in any Linux-based PDF reader, then the PDF is only an image.
Selecting one page and pasting it into a text editor usually shows whether the text layer is rubbish OCR that is only really there to provide search.
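The paste-into-an-editor check above can be roughly automated. What follows is only a sketch of one possible heuristic, not an established tool: the function names `wordlike_ratio` and `classify_page_text`, the thresholds, and the English-only assumption about single-letter words are all made up for illustration. It takes text already copied or extracted from a page and guesses whether it is missing, suspect, or plausible.

```python
import string


def wordlike_ratio(text: str) -> float:
    """Fraction of whitespace-separated tokens that look like ordinary words."""
    tokens = text.split()
    if not tokens:
        return 0.0
    good = 0
    for tok in tokens:
        # Ignore punctuation stuck to the start or end of the token.
        core = tok.strip(string.punctuation)
        if not core:
            continue
        # Letters only, allowing internal apostrophes and hyphens.
        letters_only = all(ch.isalpha() or ch in "'-" for ch in core)
        # Assumption: in English, stray single letters other than a/A/I
        # usually point to broken spacing or bad OCR.
        plausible_length = len(core) >= 2 or core in ("a", "A", "I")
        if letters_only and plausible_length:
            good += 1
    return good / len(tokens)


def classify_page_text(text: str, min_chars: int = 50,
                       min_ratio: float = 0.6) -> str:
    """Return 'no_text', 'suspect', or 'ok' for text taken from one page.

    The thresholds are arbitrary starting points, not tuned values.
    """
    stripped = text.strip()
    if len(stripped) < min_chars:
        return "no_text"
    if wordlike_ratio(stripped) < min_ratio:
        return "suspect"
    return "ok"
```

Pages that are mostly numbers, tables, or formulas would be flagged as suspect by this heuristic, so anything it reports still needs a look by eye.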
#3
null operator (he/him)
Posts: 22,046
Karma: 30277960
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@MarjaE - I'm not aware of any calibre tools that will check PDF content in the way you want.
There may be a third-party utility that can do the check, probably on a single file, meaning you could use it in a script that walks the directory tree. The best place to ask is in the PDF forum ==>> PDF BR
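As a rough sketch of the "single-file utility plus a script that walks the directory tree" idea: the example below assumes Poppler's `pdftotext` command-line tool is installed and on the PATH, which is my assumption and not something anyone in this thread recommended. The function names and the `min_chars` threshold are hypothetical. It only looks at the first few pages and flags PDFs that yield almost no text, so it catches image-only files but not garbled text.

```python
import os
import subprocess
from typing import Callable, Iterator


def extract_text_pdftotext(path: str, last_page: int = 5) -> str:
    """Extract text from the first pages of a PDF with Poppler's pdftotext.

    Assumption: `pdftotext` is installed. `-l N` stops after page N and
    `-` sends the text to stdout instead of a file.
    """
    result = subprocess.run(
        ["pdftotext", "-l", str(last_page), path, "-"],
        capture_output=True, text=True, errors="replace", check=False,
    )
    # Treat a failed extraction the same as "no text found".
    return result.stdout if result.returncode == 0 else ""


def find_textless_pdfs(root: str,
                       extract: Callable[[str], str] = extract_text_pdftotext,
                       min_chars: int = 100) -> Iterator[str]:
    """Walk `root` and yield PDFs whose extracted text is shorter than min_chars."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(dirpath, name)
            if len(extract(path).strip()) < min_chars:
                yield path
```

Something like `for p in find_textless_pdfs("/path/to/library"): print(p)` would then list the candidates for OCR. The `extract` parameter is there so a different extraction tool can be swapped in without touching the walking code.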