07-01-2014, 09:09 PM   #7
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by mzmm
just thought i'd throw in that you don't need to escape most metacharacters inside a character class, so you could rewrite

[...]
Thanks a lot for the info.

That Regex was just one of the things I created WAYYYY back when I first started figuring out Regex, and since it continued to work so well, I just didn't mess with it. And better to be safe with escapes than sorry!

I actually stumbled across a few cases in the past few days involving left and right brackets, '[' and ']', which might have to be added to Regex #1 and Regex #2.
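
To make mzmm's point concrete, here is a minimal sketch in Python. The punctuation set below is made up for the demo (it is NOT the actual Regex #1 or #2), and the variable names are hypothetical. Inside a character class most metacharacters are already literal, so the escaped and unescaped forms should match the same characters. The brackets are one of the few cases where keeping the escape is the safe choice.

```python
import re

# Hypothetical punctuation sets, for illustration only.
escaped = re.compile(r'[\.\,\?\!\;\:]')   # every character escaped "to be safe"
unescaped = re.compile(r'[.,?!;:]')       # same class with no escapes

sample = 'Wait... what?! Yes; no: maybe, sure.'

# Both classes match exactly the same characters in the sample.
assert escaped.findall(sample) == unescaped.findall(sample)

# Adding '[' and ']' to the class: here the escapes are kept, since an
# unescaped ']' would normally close the class early.
with_brackets = re.compile(r'[.,?!;:\[\]]')
print(with_brackets.findall('see [1], then [2].'))
# prints ['[', ']', ',', '[', ']', '.']
```

Other regex flavors (PCRE in Sigil or Calibre, for instance) may differ in small details, so treat this as a sanity check rather than a guarantee.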

ALSO, there is the odd case I forgot to mention: the wrong punctuation being italicized (a QUITE common OCR error). For example,

Quote:
<p>Stigler, George. 1961. “The Economics of Information.<span class="italics">” Journal of Political Economy</span> 69.</p>
As you can see here, the RIGHT double quote is included in the italics, but isn't in my Regex #1.
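
A rough sketch of how this particular case could be caught, in Python. The pattern and the function name are hypothetical and are not part of Regex #1. It assumes italics are always marked up as `<span class="italics">`, and it only handles a right double quote sitting directly after the opening tag. It moves the quote in front of the span and leaves everything else alone (including the leading space inside the span).

```python
import re

# Assumption: italics use exactly this markup; adjust for other class names.
# Group 1 = the opening span tag, group 2 = one or more right double quotes.
LEADING_QUOTE = re.compile(r'(<span class="italics">)(”+)')

def move_leading_quote(html):
    """Move a right double quote from just inside the span to just before it."""
    return LEADING_QUOTE.sub(r'\2\1', html)

line = ('<p>Stigler, George. 1961. “The Economics of Information.'
        '<span class="italics">” Journal of Political Economy</span> 69.</p>')

fixed = move_leading_quote(line)
print(fixed)
```

Since a match often points to other errors nearby, it is probably wiser to use the pattern to find candidates and review each one than to run a blind replace-all.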

I typically tackle these on a case-by-case basis at a later date (sometimes I can spot other errors when this occurs). For example, quite often a quotation mark is the wrong way around, OR the "smart punctuation" algorithm went haywire and produced an anomaly.