05-14-2022, 07:58 AM   #1
chaley
Proposed change to library search

There have been several comments about the change to searching in V5.42, where naked (unprefixed) searches now ignore punctuation, spacing, and character variants (accented characters). Previously they ignored only character variants.

I am considering adding a new search prefix (Kovid suggests '^') that forces the search to respect punctuation and spacing while still ignoring variants. The existing search option "Unaccented characters match accented characters and punctuation is ignored" would have no effect on searches that use this prefix.

Examples using two titles:
  1. "Big, Bothered, and Bad"
  2. "Big Bummer"
Using a naked search with the option checked (V5.42 behavior):
  • title:"g" matches both
  • title:"g " matches both (the search is g<space>)
  • title:"g," matches both
  • title:"gb" matches both
  • title:"g b" matches both
  • title:"db" matches #1
  • title:"," matches both (it actually matches all books)
Using the proposed prefixed search:
  • title:"^g" matches both
  • title:"^g " matches #2
  • title:"^g," matches #1
  • title:"^gb" matches nothing
  • title:"^g b" matches #2
  • title:"^db" matches nothing
  • title:"^," matches #1
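For anyone who wants to experiment with the two behaviors, here is a rough Python sketch that reproduces the example results above. It is not calibre's code. The function names are made up for illustration, and I am assuming that "ignore punctuation and spacing" can be approximated by keeping only letters and digits, and that accents can be dropped by Unicode decomposition.

```python
import unicodedata


def fold_variants(text):
    """Lowercase the text and drop accents (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return no_accents.casefold()


def fold_all(text):
    """Like fold_variants, but also drop punctuation and spacing.

    Assumption: anything that is not a letter or digit counts as
    punctuation or spacing.
    """
    return "".join(ch for ch in fold_variants(text) if ch.isalnum())


def naked_match(query, value):
    """Naked search with the option checked: contains, ignoring everything."""
    return fold_all(query) in fold_all(value)


def caret_match(query, value):
    """Proposed ^ search: contains, ignoring accents and case only."""
    return fold_variants(query) in fold_variants(value)


titles = ["Big, Bothered, and Bad", "Big Bummer"]
for query in ["g", "g ", "g,", "gb", "g b", "db", ","]:
    naked = [n for n, t in enumerate(titles, 1) if naked_match(query, t)]
    caret = [n for n, t in enumerate(titles, 1) if caret_match(query, t)]
    print(repr(query), "naked:", naked, "caret:", caret)
```

Note that a query of "," folds to the empty string in the naked case, which is why it matches every book.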
To summarize the proposed behavior:
  • Naked search: either "contains ignoring punctuation, spacing, and accents" or "simple contains", depending on the existing preference. "Simple contains" is case-insensitive matching where letter variants (e.g., accents) are significant.
  • ^ search (new): contains ignoring accents
  • = search: exact match ignoring case
  • ~ search: regex based search
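The summary can be read as a small dispatch on the first character of the query. Below is a minimal sketch of that idea, again not calibre's implementation. The names `fold`, `matches`, and `ignore_punct_pref` are hypothetical, the punctuation handling is approximated by keeping only letters and digits, and treating the regex search as case insensitive is my assumption for the sketch.

```python
import re
import unicodedata


def fold(text, drop_punct=False):
    """Case-fold and strip accents; optionally drop non-alphanumerics."""
    decomposed = unicodedata.normalize("NFKD", text)
    out = "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()
    if drop_punct:
        # Assumption: punctuation and spacing means "not a letter or digit".
        out = "".join(c for c in out if c.isalnum())
    return out


def matches(query, value, ignore_punct_pref=True):
    """Pick the matching rule from the query prefix (hypothetical dispatcher)."""
    if query.startswith("^"):
        # Proposed prefix: contains, ignoring accents but not punctuation.
        return fold(query[1:]) in fold(value)
    if query.startswith("="):
        # Exact match ignoring case.
        return query[1:].casefold() == value.casefold()
    if query.startswith("~"):
        # Regex search; case insensitivity is assumed here.
        return re.search(query[1:], value, re.IGNORECASE) is not None
    if ignore_punct_pref:
        # Naked search with the preference checked.
        return fold(query, drop_punct=True) in fold(value, drop_punct=True)
    # Naked search with the preference unchecked: "simple contains".
    return query.casefold() in value.casefold()
```

For example, `matches("gb", "Big Bummer")` is true with the preference checked and false with `ignore_punct_pref=False`, while `matches("^cafe", "Café")` is true regardless of the preference.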
I am not interested in adding options to control more finely how naked searches and the new prefix search behave. Searching is used in too many places for me to guarantee the behavior. I would consider adding another prefix to force "simple contains" if there is agreement on the prefix character and if Kovid agrees.