Old 04-01-2014, 03:11 AM   #4
Tex2002ans
Wizard
Quote:
Originally Posted by BetterRed
@jlocicero - I'm wondering if Sigil's Spell Check might be of some use - you could filter by spelling mistakes containing a 'hyphen'

BR
I second this suggestion. Sigil Spellcheck can point out every single instance of a hyphenated word:

[Attached image: SigilHyphenationSpellcheck.png]

Just add in a hyphen in the Filter box, and make sure "Show All Words" is checked.

I use this all the time to remove accidental hard hyphens left over from OCR.

I typically do a two-pass check: once with "Show All Words" unchecked, and once with "Show All Words" checked.
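If you want a similar list outside of Sigil, here is a rough Python sketch. It is not how Sigil's Spellcheck works internally, just an approximation of the hyphen filter: it collects every hyphenated word in a chunk of text and counts the occurrences. The function name and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

# Word characters joined by one or more internal hyphens, e.g. "well-known".
# Digits count as word characters, so "1914-1918" is listed too.
HYPHENATED = re.compile(r"\b\w+(?:-\w+)+\b")

def hyphenated_words(text):
    """Return a Counter of the hyphenated words found in text (case-sensitive)."""
    return Counter(HYPHENATED.findall(text))

# Made-up sample with one OCR-style stray hyphen ("au-thor").
sample = "The well-known au-thor wrote a well-known book in 1914-1918."
counts = hyphenated_words(sample)
for word, n in counts.most_common():
    print(n, word)
```

Words that show up only once or twice are usually the first ones worth checking by eye.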

To replace hyphens with en dashes, I use this Regex:

Search: ([0-9])-([0-9])
Replace: \1–\2

This handles the year ranges and page ranges that typically appear in a book. I don't recommend using "Replace All", though, because not every hyphen between two digits is a range (ISBNs and phone numbers, for example). Replace on a case-by-case basis, even though it takes a while longer.
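The same search can be tried outside of Sigil. The Python sketch below uses the regex from this post, lists each match with some surrounding context so it can be reviewed, and replaces only the matches you pick. The helper names, the context width, and the sample text are assumptions for illustration, not anything built into Sigil.

```python
import re

DIGIT_HYPHEN = re.compile(r"([0-9])-([0-9])")  # same pattern as the Search box above
EN_DASH = "\u2013"

def review_candidates(text, width=15):
    """Return (position, context) for each digit-hyphen-digit match."""
    out = []
    for m in DIGIT_HYPHEN.finditer(text):
        start = max(0, m.start() - width)
        end = min(len(text), m.end() + width)
        out.append((m.start(), text[start:end]))
    return out

def replace_at(text, position):
    """Swap the hyphen of the match starting at position for an en dash."""
    m = DIGIT_HYPHEN.match(text, position)
    if m is None:
        return text  # nothing to replace at this position
    hyphen = m.start() + 1
    return text[:hyphen] + EN_DASH + text[hyphen + 1:]

# Made-up sample: two real ranges and an ISBN that should keep its hyphens.
text = "See pp. 12-15 (1914-1918); ISBN 0-306-40615-2."
candidates = review_candidates(text)
for position, context in candidates:
    print(position, repr(context))

# Accept only the first two candidates (the page range and the year range).
fixed = replace_at(text, candidates[0][0])
fixed = replace_at(fixed, candidates[1][0])
print(fixed)
```

Because one character is swapped for one character, the positions from the first pass stay valid while you work through the list.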

If you want to get even more refined, there is no solid way to do it besides checking every single hyphen manually. It is probably better to pull the text from a better source, or re-OCR the book yourself and do a code comparison.
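One possible way to do that kind of comparison, sketched in Python with the standard difflib module. It assumes you have the two versions as plain text and that they differ only in small ways. The function name and sample strings are hypothetical, and a real comparison of the book's HTML would need more care than this.

```python
import difflib

def hyphen_differences(ocr_text, reference_text):
    """Return (ocr_word, reference_word) pairs that differ only by hyphens."""
    a = ocr_text.split()
    b = reference_text.split()
    pairs = []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "replace":
            continue  # only look at spans where the two versions disagree
        left = " ".join(a[i1:i2])
        right = " ".join(b[j1:j2])
        # Keep the pair only if removing hyphens makes both sides identical.
        if left.replace("-", "") == right.replace("-", ""):
            pairs.append((left, right))
    return pairs

# Made-up sample: the OCR version has two stray hyphens.
ocr = "The au-thor was well-known in the re-gion."
reference = "The author was well-known in the region."
diffs = hyphen_differences(ocr, reference)
for ocr_word, ref_word in diffs:
    print(ocr_word, "->", ref_word)
```

This only flags the disagreements. Which version is right still has to be decided by eye.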