Old 05-09-2022, 12:59 PM   #1
mbovenka
Wizard
Posts: 2,079
Karma: 14079267
Join Date: Oct 2007
Location: Almere, The Netherlands
Device: Kobo Sage
Somewhat surprising search results.

I searched for an author with the surname 'Tonelli' with the search 'authors:tonelli'. I'm aware that this is a substring search, so authors with the surname 'Antonelli' popping up was expected behavior.

What did surprise me was 'Bret Easton Ellis' as a match. Yes, if you totally ignore whitespace and the difference between first names and surnames, it's a match: Bret Eas[ton Elli]s.

For me, that breaks the Principle of Least Surprise.
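The match described above can be reproduced with a minimal Python sketch. This is not calibre's code: `squash` and `naive_contains` are hypothetical helpers, and the author list is made-up sample data. It only assumes that the search compares strings after dropping everything that is not a letter or digit.

```python
def squash(text: str) -> str:
    """Casefold and drop every character that is not a letter or digit."""
    return "".join(ch for ch in text.casefold() if ch.isalnum())


def naive_contains(query: str, value: str) -> bool:
    # Substring test on the squashed forms: once spaces and punctuation
    # are gone, word boundaries no longer exist.
    return squash(query) in squash(value)


# Hypothetical sample library.
authors = ["A. Tonelli", "B. Antonelli", "Bret Easton Ellis", "Toni Morrison"]
matches = [a for a in authors if naive_contains("tonelli", a)]
print(matches)  # ['A. Tonelli', 'B. Antonelli', 'Bret Easton Ellis']
```

'Bret Easton Ellis' squashes to 'breteastonellis', which contains 'tonelli' across the former word boundary.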
Old 05-09-2022, 01:43 PM   #2
DNSB
Bibliophagist
Posts: 46,181
Karma: 168983734
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by mbovenka View Post
I searched for an author with the surname 'Tonelli' with the search 'authors:tonelli'. I'm aware that this is a substring search, so authors with the surname 'Antonelli' popping up was expected behavior.

What did surprise me was 'Bret Easton Ellis' as a match. Yes, if you totally ignore whitespace and the difference between first names and surnames, it's a match: Bret Eas[ton Elli]s.

For me, that breaks the Principle of Least Surprise.
It's likely you have been bitten by the removal of punctuation from searches introduced in 5.42.
Old 05-09-2022, 07:05 PM   #3
mbovenka
Wizard
Posts: 2,079
Karma: 14079267
Join Date: Oct 2007
Location: Almere, The Netherlands
Device: Kobo Sage
Quote:
Originally Posted by DNSB View Post
It's likely you have been bitten by the removal of punctuation from searches introduced in 5.42.
Yes, probably, now that you mention it. Well, this is removing somewhat too much punctuation. Whitespace should stay relevant, I think...
Old 05-10-2022, 11:08 AM   #4
kovidgoyal
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Sadly, I know of no way to convince ICU to remove punctuation but keep spaces. The workaround would be to replace the space with some private-use Unicode codepoint before searching, which is a big performance hit.
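The workaround mentioned above can be sketched in plain Python. This is an illustration only and does not use ICU: `strip_ignorables` is a hypothetical stand-in for a matcher that ignores punctuation and space separators, and the placeholder U+E000 is an arbitrary choice of private-use codepoint.

```python
import unicodedata

# Arbitrary private-use codepoint (Unicode category "Co"); an assumption.
PLACEHOLDER = "\ue000"


def strip_ignorables(text: str) -> str:
    """Stand-in matcher step: drop punctuation (P*) and separators (Z*)."""
    return "".join(
        ch for ch in text.casefold()
        if unicodedata.category(ch)[0] not in ("P", "Z")
    )


def protect_spaces(text: str) -> str:
    # Swap each space for the placeholder so the stripping step keeps it.
    return text.replace(" ", PLACEHOLDER)


def contains(query: str, value: str, keep_spaces: bool) -> bool:
    if keep_spaces:
        query, value = protect_spaces(query), protect_spaces(value)
    return strip_ignorables(query) in strip_ignorables(value)


print(contains("tonelli", "Bret Easton Ellis", keep_spaces=False))
print(contains("tonelli", "Bret Easton Ellis", keep_spaces=True))
print(contains("oreilly", "Tim O'Reilly", keep_spaces=True))
```

With `keep_spaces=True` the cross-word match disappears while the apostrophe is still ignored. The extra `replace` pass over every candidate value is the kind of additional per-search work the post refers to.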
Old 05-10-2022, 11:29 AM   #5
Quoth
Still reading
Posts: 14,025
Karma: 105092227
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
So is this part of why search on Amazon, eBay, Google, DuckDuckGo etc. has been broken for a while? Part of the reason they are broken is to sell stuff, so Amazon especially seems to include absolutely unrelated things.
Old 05-10-2022, 12:13 PM   #6
JimmXinu
Plugin Developer
Posts: 6,970
Karma: 4604635
Join Date: Dec 2011
Location: Midwest USA
Device: Kobo Clara Colour running KOReader
We bumped into a similar issue for SmartEject.

A regular expression search doesn't use the remove punctuation code, so if you instead search: authors:~tonelli, you shouldn't get "Bret Easton Ellis" anymore. But then you may have to learn some regexp.
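As a rough illustration of the regexp route, here is a Python sketch. It assumes that the `~` search behaves like a case-insensitive `re.search` applied to the raw author string, with no punctuation or whitespace removal; `regex_match` and the author list are hypothetical.

```python
import re

# Hypothetical sample library.
authors = ["A. Tonelli", "B. Antonelli", "Bret Easton Ellis"]


def regex_match(pattern: str, value: str) -> bool:
    # The pattern is applied to the unmodified string, so spaces still count.
    return re.search(pattern, value, flags=re.IGNORECASE) is not None


# Plain pattern: still a substring match, but not across word boundaries.
plain = [a for a in authors if regex_match(r"tonelli", a)]

# Word-boundary anchors restrict the match to the whole surname.
whole = [a for a in authors if regex_match(r"\btonelli\b", a)]

print(plain)  # ['A. Tonelli', 'B. Antonelli']
print(whole)  # ['A. Tonelli']
```

Under the same assumption, the second pattern would correspond to searching `authors:"~\btonelli\b"`, which also excludes 'Antonelli'.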

I was going to comment that the change for 1969926 should be optional, but then I saw it already is!

Checkbox: Preferences > Searching > Unaccented characters match accented characters and punctuation is ignored

Unchecking that makes spaces work as expected again, but at the cost of no longer matching accented characters or ignoring punctuation.
Old 05-10-2022, 05:09 PM   #7
ownedbycats
Custom User Title
Posts: 10,973
Karma: 75337983
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
Would it make sense for the 'unaccented characters' option to be added to the search context for quick toggling?
Old 05-10-2022, 05:41 PM   #8
JSWolf
Resident Curmudgeon
Posts: 79,749
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Can spaces be left out of the removal of punctuation from searches introduced in 5.42?
Old 05-10-2022, 06:25 PM   #9
BetterRed
null operator (he/him)
Posts: 21,722
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by JimmXinu View Post


Checkbox: Preferences > Searching > Unaccented characters match accented characters and punctuation is ignored

Unchecking that makes spaces work as expected again, but at the cost of no longer matching accented characters or ignoring punctuation.
In earlier versions that setting was Preferences > Searching > Unaccented characters match accented characters, and it defaulted to checked.

Is it too late to have a separate setting for punctuation (including spaces), that defaults to unchecked?

BR
Old 05-10-2022, 06:32 PM   #10
mbovenka
Wizard
Posts: 2,079
Karma: 14079267
Join Date: Oct 2007
Location: Almere, The Netherlands
Device: Kobo Sage
Quote:
Originally Posted by kovidgoyal View Post
Sadly, I know of no way to convince ICU to remove punctuation but keep spaces. The workaround would be to replace the space with some private-use Unicode codepoint before searching, which is a big performance hit.
I understand. Well, I'll probably leave it as is, then, as accented characters not matching unaccented ones would probably irritate me more. But I'd like to add my voice to the request to have 'ignore punctuation' and 'ignore accents' be two different options, if at all possible.
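What two independent options might look like can be sketched in Python. This is purely hypothetical: the flag names `ignore_accents` and `ignore_punctuation` are invented for illustration, and the normalization uses Python's `unicodedata` rather than whatever calibre does internally.

```python
import unicodedata


def normalize(text: str, ignore_accents: bool, ignore_punctuation: bool) -> str:
    text = text.casefold()
    if ignore_accents:
        # Decompose, then drop combining marks: "ë" becomes "e".
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if ignore_punctuation:
        # Drop punctuation (P*) only; spaces are separators (Z*) and stay.
        text = "".join(
            ch for ch in text if not unicodedata.category(ch).startswith("P")
        )
    return text


def contains(query: str, value: str, *, ignore_accents: bool = True,
             ignore_punctuation: bool = False) -> bool:
    return (normalize(query, ignore_accents, ignore_punctuation)
            in normalize(value, ignore_accents, ignore_punctuation))


print(contains("bronte", "Charlotte Brontë"))
print(contains("tonelli", "Bret Easton Ellis", ignore_punctuation=True))
print(contains("oreilly", "Tim O'Reilly", ignore_punctuation=True))
```

Because spaces are never stripped in this sketch, 'tonelli' does not match 'Bret Easton Ellis' under any combination of the two flags.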
All times are GMT -4. The time now is 06:40 PM.