Old 09-06-2018, 10:19 PM   #224
szarroug3
Zealot
Posts: 104
Karma: 10000
Join Date: Apr 2016
Device: Kindle PW2
Okay, so I've figured out what's wrong, but I can't figure out how to fix it. In the regex pattern I wrote, I use \b around the word I'm looking for. It turns out that this doesn't work when the first or last character of the word is non-ASCII.

There are three different positions that qualify as word boundaries:
  • Before the first character in the string, if the first character is a word character.
  • After the last character in the string, if the last character is a word character.
  • Between two characters in the string, where one is a word character and the other is not a word character.

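Basically, since the non-ASCII character doesn't count as a "word" character, a \b placed next to it doesn't fulfill any of these requirements: whatever sits on the other side (a space, punctuation, or the start or end of the string) isn't a word character either, so there is no boundary at that position.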
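
Here is a minimal sketch of the problem and two possible ways around it, assuming Python 3's re module. The sample text and the helper names boundary_search and lookaround_search are made up for illustration and are not from the actual pattern. The re.ASCII flag is used only to reproduce the ASCII-only behaviour of \b; in Python 2 the rough equivalent of the Unicode-aware case would be unicode strings combined with re.UNICODE.

```python
import re

def boundary_search(word, text, flags=0):
    # Hypothetical helper: wraps the word in \b, like the pattern described above.
    return re.search(r"\b" + re.escape(word) + r"\b", text, flags)

def lookaround_search(word, text, flags=0):
    # Hypothetical helper: instead of \b, only require that the word is not
    # directly preceded or followed by a word character.
    return re.search(r"(?<!\w)" + re.escape(word) + r"(?!\w)", text, flags)

text = "an étude at the café today"

# re.ASCII makes \w and \b ASCII-only, so "é" is not a word character and
# there is no boundary between it and the neighbouring space.
ascii_first = boundary_search("étude", text, re.ASCII)  # non-ASCII first char
ascii_last = boundary_search("café", text, re.ASCII)    # non-ASCII last char

# Without re.ASCII, Python 3 str patterns are Unicode-aware, so "é" counts
# as a word character and \b behaves as expected.
unicode_first = boundary_search("étude", text)
unicode_last = boundary_search("café", text)

# The lookaround version finds both words even in ASCII-only mode.
look_first = lookaround_search("étude", text, re.ASCII)
look_last = lookaround_search("café", text, re.ASCII)

print(ascii_first, ascii_last)
print(unicode_first.group(), unicode_last.group())
print(look_first.group(), look_last.group())
```

The lookaround variant has its own limitation in ASCII-only mode: a non-ASCII letter right next to the word is not seen as a word character, so the word could still be matched inside a longer word. Making the match Unicode-aware is probably the cleaner option where it is available.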

I'm still working on it.

Last edited by szarroug3; 09-06-2018 at 10:31 PM.