08-27-2014, 08:59 PM   #633
BetterRed
null operator (he/him)
Posts: 20,557
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by lunker View Post
Hi. I frequently come across OCR errors in the books I read.

To correct these errors, my method is to save the flawed words and/or lines to "My Clippings" and then open both "Clippings.txt" and the .epub file on the PC to make the corrections.

This whole process takes a very long time and is obviously boring. Here's the idea:

"In Calibre's Edit Book window, a plugin that

- pulls the txt file from a specified location,
- finds the highlighted parts in the book,
- asks the user for confirmation and replacing,
- additional ideas..."

would be very effective. Does anyone else share this situation? Is it possible to code a plugin like this?

I try to fix all the typographical errors before I start reading a book.

Many OCR errors can be fixed with a few well-crafted Search and Replace regular expressions, which can be saved for reuse. Have a look in the Editor sub-forum and also the Sigil forum, particularly at the Reg Ex sticky threads.
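
To give a rough idea of what a small, reusable set of search-and-replace expressions might look like, here is a minimal standalone Python sketch. The three rules and the names `OCR_RULES` and `apply_rules` are made up for illustration and are not taken from the sticky threads; real rules need to be tuned to the book in hand, and each replacement is best reviewed before it is applied.

```python
import re

# Hypothetical example rules: (compiled pattern, replacement).
# They illustrate the idea only; they are not a vetted rule set.
OCR_RULES = [
    # "tbe" is a common misreading of "the".
    (re.compile(r"\btbe\b"), "the"),
    # A digit 1 sandwiched between lowercase letters is probably an "l".
    (re.compile(r"(?<=[a-z])1(?=[a-z])"), "l"),
    # Collapse runs of two or more spaces into one.
    (re.compile(r" {2,}"), " "),
]


def apply_rules(text, rules=OCR_RULES):
    """Apply each (pattern, replacement) pair to the text, in order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(apply_rules("tbe wor1d  is round"))
```

The same patterns can be pasted into an editor's regex search box one at a time, which makes it possible to confirm each hit instead of replacing blindly.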

Spell checking can also help, with split words in particu lar [sic]. A search in the full word list (in the spell checker) for '-' can help get rid of extraneous hyphens, and Sigil's Reports can help identify 'weird characters'.
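
Outside the editors, the same two checks can be roughly approximated on plain text. This is only a sketch under my own assumptions: it does not reproduce Calibre's spell checker or Sigil's Reports, the function names are hypothetical, and "weird" is taken here to mean anything outside printable ASCII, so legitimate accented letters and curly quotes will show up in the list as well and need to be judged by eye.

```python
import re
from collections import Counter


def hyphenated_words(text):
    """Return the distinct words containing a hyphen, sorted, for manual review."""
    return sorted(set(re.findall(r"\b\w+(?:-\w+)+\b", text)))


def unusual_characters(text):
    """Count characters outside printable ASCII (newline and tab are ignored)."""
    return Counter(
        ch for ch in text if not (" " <= ch <= "~" or ch in "\n\t")
    )


if __name__ == "__main__":
    sample = "The ex-ample was well-known. Caf\u00e9 \u00a6 bar\n"
    print(hyphenated_words(sample))
    print(unusual_characters(sample))
```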

I would be more likely to find a use for a plugin like the one requested when correcting grammatical/stylistic 'errors' rather than typographical errors.

BR

Last edited by BetterRed; 08-27-2014 at 09:02 PM.