02-13-2015, 01:14 PM   #8
Rob557
Zealot
Posts: 108
Karma: 810
Join Date: Jul 2012
Device: Kobo
Bulk Library Search for OCR Warning Indicators (partially solved)

It turns out that the Search ePub feature in the Quality Check add-on for Calibre includes a "show all occurrences" option. That option gives an indirect way to run a bulk search across the library and count how many times a special OCR-error character like � appears in each ePub.

With "show all occurrences" selected, every occurrence in the searched ePubs is listed on its own line in the results log. That list can be copied into something like Excel and analyzed to identify the most problematic books. It is probably best at first to limit the search to the subset of books already identified as having at least one occurrence.
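
For anyone who would rather skip the copy-into-Excel step, the same per-book tally can be sketched in a few lines of Python. This is not how Quality Check works internally; it is only a rough standalone alternative that opens each ePub as a zip archive and counts the � character (U+FFFD) in its HTML/XHTML files. The function names and the library folder are made up for illustration, and it assumes the book text is UTF-8 encoded.

```python
import zipfile
from collections import Counter
from pathlib import Path

# U+FFFD, the replacement character that often marks OCR/encoding damage.
REPLACEMENT_CHAR = "\ufffd"
# Assumption: book text lives in files with these extensions.
TEXT_SUFFIXES = (".html", ".xhtml", ".htm")


def count_in_epub(epub_path, needle=REPLACEMENT_CHAR):
    """Count occurrences of `needle` in the text files of one ePub."""
    total = 0
    with zipfile.ZipFile(epub_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(TEXT_SUFFIXES):
                # Assumes UTF-8. Undecodable bytes are dropped rather than
                # replaced, so decoding problems do not add extra U+FFFD.
                text = archive.read(name).decode("utf-8", errors="ignore")
                total += text.count(needle)
    return total


def rank_library(library_dir, needle=REPLACEMENT_CHAR):
    """Return (relative path, count) pairs, worst books first."""
    library_dir = Path(library_dir)
    counts = Counter()
    for epub_path in library_dir.rglob("*.epub"):
        try:
            found = count_in_epub(epub_path, needle)
        except zipfile.BadZipFile:
            continue  # skip files that are not valid zip archives
        if found:
            counts[str(epub_path.relative_to(library_dir))] = found
    return counts.most_common()
```

Calling rank_library on the Calibre library folder and printing the pairs gives a list sorted with the most problematic books at the top. Books with zero occurrences are left out.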