Quote:
Originally Posted by EbokJunkie
What about creation of temporary copy of each file with soft hyphens stripped?
Stripping out soft hyphens would mess up the locations of the terms it finds, since offsets in the stripped copy wouldn't line up with the original file.
But does C# support regex search? Why not search for aliases like this:
alias: Nessarose
search: N\x{00AD}*e\x{00AD}*s\x{00AD}*s\x{00AD}*a\x{00AD}*r\x{00AD}*o\x{00AD}*s\x{00AD}*e
That matches zero or more soft hyphens between each pair of letters, so it will find every instance of the term no matter where soft hyphens have been inserted into it.
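Here's a rough sketch of building that search string automatically from an alias. It's in Python rather than C#, only because that's easy to try out. The helper name soft_hyphen_pattern and the sample text are made up for illustration. Note that Python's re module doesn't take the Perl-style \x{00AD} escape, so the sketch puts the actual U+00AD character into the pattern instead.

```python
import re

SOFT_HYPHEN = "\u00ad"  # literal soft hyphen character (U+00AD)

def soft_hyphen_pattern(alias):
    """Hypothetical helper: match `alias` with optional soft hyphens between characters."""
    # Escape each character so punctuation in an alias is matched literally.
    parts = [re.escape(ch) for ch in alias]
    # Join with "zero or more soft hyphens" between every pair of characters.
    return re.compile((SOFT_HYPHEN + "*").join(parts))

# Made-up sample text: one hyphenated occurrence, one plain occurrence.
text = "Then Nes\u00adsa\u00adrose spoke. Nessarose left."
pattern = soft_hyphen_pattern("Nessarose")

# The spans are offsets into the original, unstripped text.
spans = [m.span() for m in pattern.finditer(text)]
print(spans)
```

Because the search runs against the untouched text, the reported spans point at the real positions in the file, which is the whole reason for not stripping the soft hyphens first.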
If you wanted to be absolutely sure, you could do some more fancy regex:
search:
Code:
N(\x{00AD}|­|­|­|­|­)*e(\x{00AD}|­|­|­|­|­)*s(\x{00AD}|­|­|­|­|­)*s(\x{00AD}|­|­|­|­|­)*a(\x{00AD}|­|­|­|­|­)*r(\x{00AD}|­|­|­|­|­)*o(\x{00AD}|­|­|­|­|­)*s(\x{00AD}|­|­|­|­|­)*e
The key part being the insertion of:
Code:
(\x{00AD}|­|­|­|­|­)*
between every pair of letters, which matches either the literal Unicode soft hyphen character or any of the HTML entity strings for it, such as &shy;, &#173;, or &#xAD;.
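A minimal sketch of the entity-aware version, again in Python. The list of entity spellings (&shy;, &#173;, &#xAD;) is my assumption about what might show up in the HTML, not a complete list, and entity_aware_pattern is a hypothetical helper name.

```python
import re

# Assumed spellings of the soft hyphen in HTML source; extend as needed.
SOFT_HYPHEN_FORMS = ["\u00ad", "&shy;", "&#173;", "&#xAD;"]

def entity_aware_pattern(alias):
    """Hypothetical helper: allow any soft hyphen form, repeated, between characters."""
    # Non-capturing group of all the forms, matched zero or more times.
    sep = "(?:" + "|".join(re.escape(f) for f in SOFT_HYPHEN_FORMS) + ")*"
    return re.compile(sep.join(re.escape(ch) for ch in alias))

# Made-up sample markup mixing entities and the literal character.
html = "<p>Nes&shy;sa&#173;rose and Nessa\u00adrose</p>"
matches = [m.group(0) for m in entity_aware_pattern("Nessarose").finditer(html)]
print(matches)
```

Using a non-capturing group (?:...) instead of a plain (...) keeps the match from filling up with dozens of capture groups, one per gap between letters.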
If C# supports inline mode changes like Perl, you could even make that string case-insensitive while preserving the case sensitivity of the alias:
Code:
((?i)\x{00AD}|­|­|­|­|­(?-i))*
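For what it's worth, here is a sketch of the same idea in Python. Python's re module doesn't accept a (?i)...(?-i) toggle in the middle of a pattern, so this uses the scoped form (?i:...) to get the same effect: only the soft hyphen part ignores case, while the letters of the alias stay case-sensitive. The entity spellings and the name pattern_for are assumptions for illustration.

```python
import re

# Assumed entity spellings. The (?i:...) group makes only this part
# case-insensitive, so &SHY; and &#xad; are accepted too.
SEP = r"(?i:\u00ad|&shy;|&#173;|&#xAD;)*"

def pattern_for(alias):
    """Hypothetical helper: case-sensitive alias, case-insensitive soft hyphen forms."""
    return re.compile(SEP.join(re.escape(ch) for ch in alias))

p = pattern_for("Nessarose")

# Entities in a different case still match between the letters.
hit = p.search("Nes&SHY;sa&#xad;rose")

# The alias itself is still case-sensitive, so a lowercase "n" does not match.
miss = p.search("nes&shy;sarose")

print(hit.group(0) if hit else None, miss)
```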