Bulk Library Search for OCR Warning Indicators (partially solved)
It turns out that the Search ePub feature in the Quality Check add-on for Calibre includes a "show all occurrences" option. That option provides, at least indirectly, a bulk search capability: it makes it possible to count how many times a special OCR-error character like � appears in the various ePubs.
With "show all occurrences" selected, every occurrence within the searched ePubs is listed on a separate line in the results log. It is probably best at first to limit the search to a subset of the books already identified as having at least one occurrence. The list can then be copied into something like Excel and analyzed to identify the most problematic books.
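For anyone who would rather skip the copy-into-Excel step, here is a rough Python sketch that produces the same kind of per-book tally outside Calibre. This is not how Quality Check works internally; it simply opens each ePub as a zip archive and counts the character directly. The function names (`count_marker_in_epub`, `rank_library`) are made up for illustration, and it assumes the book text is stored in UTF-8 `.xhtml`/`.html`/`.htm` files, which is typical but not guaranteed.

```python
import zipfile
from collections import Counter
from pathlib import Path

MARKER = "\ufffd"  # U+FFFD, the replacement character shown as a diamond/question mark
TEXT_SUFFIXES = (".xhtml", ".html", ".htm")  # assumption: book text lives in these files


def count_marker_in_epub(epub_path, marker=MARKER):
    """Count occurrences of marker in the text files of one ePub (hypothetical helper)."""
    total = 0
    with zipfile.ZipFile(epub_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith(TEXT_SUFFIXES):
                # errors="ignore" drops undecodable bytes; errors="replace" would
                # insert U+FFFD itself and inflate the count.
                text = zf.read(name).decode("utf-8", errors="ignore")
                total += text.count(marker)
    return total


def rank_library(library_dir, marker=MARKER):
    """Return (path, count) pairs for ePubs containing marker, worst first."""
    counts = Counter()
    for epub in Path(library_dir).rglob("*.epub"):
        try:
            n = count_marker_in_epub(epub, marker)
        except zipfile.BadZipFile:
            continue  # skip files that are not valid zip archives
        if n:
            counts[str(epub)] = n
    return counts.most_common()
```

Calling `rank_library()` on the Calibre library folder and printing the returned pairs gives a list sorted from most to fewest occurrences, which is roughly what the Excel analysis would produce. Try it on a small folder of books first, as with the Quality Check search.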