Yes, if you want to optimize: since the allowed XHTML whitespace characters are all single-byte, low code point values, you could check whether the char code is below some cutoff value and, if so, use it as an offset into a fixed array/vector to check for whitespace. I am sure there are other tricks as well.
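A rough sketch of the cutoff-plus-table idea in C, in case it helps. The names `WS_CUTOFF`, `ws_table` and `is_xml_whitespace` are made up for illustration, and I am assuming the whitespace set is the four XML ones (tab, LF, CR, space), so the cutoff is simply the space character:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: whitespace means the XML set (tab, LF, CR, space).
   The highest of those is space (0x20), so that is the cutoff. */
#define WS_CUTOFF 0x20u

/* Fixed table indexed by code point; 1 marks a whitespace character.
   Entries not listed are zero-initialized. */
static const uint8_t ws_table[WS_CUTOFF + 1] = {
    [0x09] = 1, /* tab */
    [0x0A] = 1, /* line feed */
    [0x0D] = 1, /* carriage return */
    [0x20] = 1  /* space */
};

/* One compare rejects everything above the cutoff; only low code
   points ever touch the table. */
static bool is_xml_whitespace(uint32_t cp)
{
    return cp <= WS_CUTOFF && ws_table[cp] != 0;
}
```

With this, `is_xml_whitespace('\t')` is true, while `is_xml_whitespace('a')` and `is_xml_whitespace(0x00A0)` (the non-breaking space) are false after a single comparison.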
FWIW, string search for a single char is one of the most heavily optimized glibc routines for just this reason. For short strings, for example, you can repeat the byte you are looking for across a 64-bit word and XOR it against eight bytes of the string at a time: any matching byte becomes zero, so you test for that value in multiple chars at once. There are many other approaches as well.
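To make the XOR trick concrete, here is a minimal sketch in C. It is not glibc's actual code (the real routines are tuned per architecture); `swar_find_byte` and `has_zero_byte` are names I made up, and the zero-byte test is the well-known bit trick:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101ULL
#define HIGHS 0x8080808080808080ULL

/* Nonzero if and only if at least one byte of x is zero. */
static uint64_t has_zero_byte(uint64_t x)
{
    return (x - ONES) & ~x & HIGHS;
}

/* Hypothetical helper: returns a pointer to the first occurrence of c
   in s[0..len), or NULL if it is not present. */
const char *swar_find_byte(const char *s, size_t len, unsigned char c)
{
    const uint64_t pattern = ONES * c; /* c repeated in all 8 bytes */
    size_t i = 0;

    /* Test 8 chars per step: after the XOR, a matching char is a zero byte. */
    for (; i + 8 <= len; i += 8) {
        uint64_t word;
        memcpy(&word, s + i, sizeof word); /* safe unaligned load */
        if (has_zero_byte(word ^ pattern))
            break; /* a match is somewhere in these 8 bytes */
    }

    /* Pin down the exact position, or scan the leftover tail. */
    for (; i < len; i++) {
        if ((unsigned char)s[i] == c)
            return s + i;
    }
    return NULL;
}
```

The word loop only answers "is there a match in these eight bytes?", and the byte loop then locates it, which keeps the sketch independent of endianness.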
Last edited by KevinH; 07-21-2016 at 07:01 PM.