First of all, David, your message reminded me that I forgot to include the straight quote and the curly quote in the main group of the regex.
Quote:
Originally Posted by davidfor
I just tried this and it picked up "I" by itself. Which I suppose if you are using title case for the matched words works, but, not if you are just changing them to lower case.
|
Yes, I was not worried about "I" because the OP said he wanted to capitalize each word (first letter uppercase), so "I" wasn't a concern.
If we want to exclude I\s and I' from the capture group, it starts to get tricky for me with regex alone (I tried (*SKIP)(*F) and a negative look-ahead, but with only partial success). I would rather handle this inside the regex function: we can lowercase everything and then re-capitalize every \bi[\s'’].
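To make the "lowercase everything, then restore I" idea concrete, here is a minimal sketch in Python. The helper name lower_keep_i is only illustrative, and the replace() wrapper assumes the usual signature of a calibre regex function; adapt it to whatever your search expression actually captures. I used a look-ahead instead of consuming the character after the "i", so the space or apostrophe is left untouched.

```python
import re

# A standalone "i" followed by whitespace, a straight apostrophe or a curly one.
# The look-ahead checks the next character without consuming it.
PRONOUN_I = re.compile(r"\bi(?=[\s'’])")

def lower_keep_i(text):
    """Lowercase the text, then put the pronoun "I" back in upper case."""
    return PRONOUN_I.sub("I", text.lower())

# Assumed shape of a calibre regex function; only `match` is used here.
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    return lower_keep_i(match.group())

print(lower_keep_i("WHEN I SAW IT, I'D LEFT"))  # when I saw it, I'd left
```

Note that an "i" at the very end of the matched text (with nothing after it) is not restored, exactly as with the \bi[\s'’] pattern above.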
Anyway, I totally agree with you that an automatic treatment will probably create lots of false negatives and false positives, and is not recommended. One solution could be to list all the all-caps words in a text file (using a regex function) and, after cleaning it up, use that file as an index telling the regex function which words to process (or, on the contrary, which ones to skip, depending on what the list looks like). But I guess that would be a lot of work for the expected result (I mean the cleaning; the two regex functions are quite easy to write).
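Roughly, the two steps could look like the sketch below. It is plain Python rather than a ready-made calibre function, the names collect_caps_words and capitalize_listed are made up for the example, and the pattern only covers unaccented A-Z words of two letters or more, which is an assumption on my part.

```python
import re

# An all-caps word of at least two letters (ASCII only in this sketch).
ALL_CAPS = re.compile(r"\b[A-Z]{2,}\b")

def collect_caps_words(text):
    """Step 1: list every distinct all-caps word, to be reviewed by hand."""
    return sorted(set(ALL_CAPS.findall(text)))

def capitalize_listed(text, allowed):
    """Step 2: capitalize only the words kept in the cleaned list."""
    def fix(match):
        word = match.group()
        return word.capitalize() if word in allowed else word
    return ALL_CAPS.sub(fix, text)

sample = "THE NASA ROCKET left THE pad"
print(collect_caps_words(sample))                      # ['NASA', 'ROCKET', 'THE']
print(capitalize_listed(sample, {"THE", "ROCKET"}))    # The NASA Rocket left The pad
```

Here the cleaned list is used as "words to process"; to use it as "words to skip" instead, invert the test in fix().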
Quote:
Originally Posted by davidfor
Or use a span with a transform to lower case or capitalize.
|
Yes, this is clever: since we don't touch the text itself, only its style, it's easy to switch from one appearance to another. The problem with that approach is still "I", which must resist the lowercasing, so the regex function will have a hard job putting the tags in the right places.
Another problem is that many e-readers don't respect the "text-transform" or "font-variant" properties, and mine is one of them. I don't know why, since they would be very easy to implement, but the fact remains: those properties are not bulletproof.
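One way to place the tags, as a rough sketch only: wrap each run of all-caps words in a span and let a one-letter "I" break the run, so it stays outside the tags. The class name "lc" is hypothetical and would need a matching CSS rule such as .lc { text-transform: lowercase; }. The pattern assumes words of two or more unaccented letters separated by single spaces, so punctuation and contractions like "IT'S" are not handled properly.

```python
import re

# A run of all-caps words (2+ letters each) separated by single spaces.
# A standalone "I" has only one letter, so it ends the run and is never wrapped.
CAPS_RUN = re.compile(r"\b[A-Z]{2,}(?: [A-Z]{2,})*\b")

def wrap_caps_runs(text, css_class="lc"):
    """Wrap each all-caps run in a span; the text itself is left unchanged."""
    return CAPS_RUN.sub(
        lambda m: f'<span class="{css_class}">{m.group()}</span>', text
    )

print(wrap_caps_runs("WHEN I SAW IT"))
# <span class="lc">WHEN</span> I <span class="lc">SAW IT</span>
```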