FWIW: The ICU tools include a transliterator called "Any-Publishing" that converts publishing punctuation to typewriter punctuation. For example, it converts ‘“foo”’ to '"foo"'. It might make sense to use it in a variant of "primary_contains" so that searching also ignores the punctuation style (curly vs. straight quotes, for instance). I don't understand enough about how you did the ICU Python bindings to know a) whether this idea is feasible or b) how to do it if it is.
It also might make sense to use something like the transliterator in the content server to process the search strings.
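To make the idea concrete, here is a rough sketch in plain Python. It does not use ICU at all: the translation table is a hand-written stand-in for the transliterator, and the names `to_typewriter` and `punctuation_insensitive_contains` are made up for illustration. It also only folds case, so unlike "primary_contains" it does not ignore accents.

```python
# Hypothetical stand-in for an ICU transliterator: maps a few common
# "publishing" punctuation marks to typewriter (ASCII) equivalents.
_TYPEWRITER = str.maketrans({
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
})


def to_typewriter(text):
    """Return text with publishing punctuation replaced by ASCII."""
    return text.translate(_TYPEWRITER)


def punctuation_insensitive_contains(needle, haystack):
    """Hypothetical search helper: substring test that ignores case and
    the difference between curly and straight punctuation."""
    return to_typewriter(needle).casefold() in to_typewriter(haystack).casefold()
```

The same `to_typewriter` step could be applied to the search string on the content server side before the query is run, so both places normalize punctuation the same way.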