05-17-2009, 09:05 AM   #28
pepak
Guru
Posts: 610
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-T3, Kobo Libra H2O
Quote:
Originally Posted by rogue_ronin View Post
Only had one false positive. There was a positive there too, but some short distance preceding it was an 'em and it was lumped into the positive, ie:
Yep, that's what I mentioned above:

Quote:
Originally Posted by pepak View Post
My regexp does that, except in the case when the apostrophed word appears before the actual opening quote. (That is, it will work fine if 'em is inside quotes and will work fine if there are no quotes after 'em).
Quote:
... and beat my prior search regex with an ugly stick
Your old regex may still be useful for replacing double-quotes. It is pretty poor with single-quotes, though. That's why I suggested my regexp :-)

Quote:
But I did the modified run after initially running his regex. Anyone see a reason why it might not work straight-up? I'm having trouble imagining a sentence that would false positive because of ; and &...
The starting semicolon could give you trouble with documents containing non-English characters. E.g. if your character's name ends with ç (written in the markup as the entity &ccedil;), the semicolon that closes the entity would get recognized as the end of a sentence.
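To make the semicolon problem concrete, here is a minimal Python sketch. The pattern `BREAK` is a hypothetical stand-in, not the regex discussed in this thread, and the sample sentence and the name in it are made up. It only assumes that the text stores ç as the entity `&ccedil;` and that a semicolon followed by whitespace is treated as a break.

```python
import html
import re

# Hypothetical stand-in (NOT the regex from this thread):
# a semicolon followed by whitespace is treated as a break.
BREAK = re.compile(r";(?=\s)")

def count_breaks(text: str) -> int:
    """Count how many semicolons the stand-in pattern treats as breaks."""
    return len(BREAK.findall(text))

# Made-up sample: the name ends with the entity for c-cedilla.
raw = "Then Ren&ccedil; left; nobody spoke."

# Decoding entities first removes the semicolon that belongs to the entity.
decoded = html.unescape(raw)

print(count_breaks(raw))      # 2: the entity's semicolon is a false hit
print(count_breaks(decoded))  # 1: only the real semicolon remains
```

Decoding entities before the search is one possible workaround. It only helps if the rest of the search does not itself rely on the entities being present.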

I approach your problem from the other side: I always insert a space between an apostrophe/single-quote and a quote/double-quote. Not only does it make my regexp work just fine, it is far nicer visually, too. Later on, when all quotes are converted, you can either remove the space or (better yet) convert it to a non-breaking space.
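The space-insertion approach could be sketched in Python roughly like this. Both patterns and both function names are illustrative assumptions, not the regexes used in this thread. The first step assumes plain ASCII quotes in the source. The second assumes the quotes have already been converted to curly ones by some other step that is not shown here.

```python
import re

# Step 1 (assumed ASCII input): find the empty position between a
# single quote and a double quote that touch, in either order.
ADJACENT = re.compile(r"""(?<=')(?=")|(?<=")(?=')""")

def separate_quotes(text: str) -> str:
    """Insert a plain space between touching single and double quotes."""
    return ADJACENT.sub(" ", text)

# Step 2 (after quote conversion): turn a space that sits between a
# curly single quote and a curly double quote into a non-breaking space.
# Note this also catches such a space that was already in the text.
SPACED = re.compile(
    "(?<=[\u2018\u2019]) (?=[\u201c\u201d])|(?<=[\u201c\u201d]) (?=[\u2018\u2019])"
)

def space_to_nbsp(text: str) -> str:
    """Replace the separating space with U+00A0 (non-breaking space)."""
    return SPACED.sub("\u00a0", text)

print(separate_quotes("\"'em all,\" he said."))
print(space_to_nbsp("\u201c \u2019em all,\u201d he said."))
```

With the space in place, the opening double quote and the apostrophe in 'em are no longer adjacent, so a quote-matching regex can treat them separately.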