If you can supply a list of words in a text file, one pair per line separated by a vertical pipe character:
Td|I'd
I would be happy to write a small program that sorts and indexes the list, then walks the text of every XHTML file word by word, looks each word up in the list, and replaces it if the list gives a replacement. Please make the list case sensitive.
The hardest part will actually be deciding where to split the text of a sentence into words, and dealing with all the punctuation stuck to the ends of them.
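To give a rough idea of the approach, here is a minimal Python sketch, not the finished program. The names load_pairs and replace_words are made up for illustration. It uses a dictionary lookup instead of a sorted index, and it assumes a "word" is a run of letters or digits that may contain internal apostrophes, so trailing punctuation is never part of the word. Tags are separated from text with a simple regular expression rather than a real XHTML parser, so comments, CDATA sections and entities would need extra care.

```python
import re

def load_pairs(lines):
    """Parse 'wrong|right' lines into a case-sensitive lookup table."""
    pairs = {}
    for line in lines:
        line = line.strip()
        if not line or "|" not in line:
            continue  # skip blank or malformed lines
        wrong, right = line.split("|", 1)
        pairs[wrong] = right
    return pairs

# Assumption: a word is letters/digits, optionally joined by apostrophes
# (straight or curly). Punctuation on either side is left outside the match.
WORD = re.compile(r"[^\W_]+(?:['’][^\W_]+)*")

# Capturing group so that re.split keeps the tags in the result list.
TAG = re.compile(r"(<[^>]*>)")

def replace_words(xhtml, pairs):
    """Replace listed words in text nodes only; tags are copied unchanged."""
    def fix(match):
        word = match.group(0)
        return pairs.get(word, word)  # exact, case-sensitive lookup

    parts = TAG.split(xhtml)
    # After splitting on a captured group, odd indices hold the tags.
    return "".join(
        part if i % 2 else WORD.sub(fix, part)
        for i, part in enumerate(parts)
    )

pairs = load_pairs(["Td|I'd"])
print(replace_words('<p class="Td">Td like that, Td.</p>', pairs))
```

Because the match stops at the first character that is not part of a word, the comma and full stop after "Td" survive the replacement, and the class attribute inside the tag is not touched.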
KevinH