Old 09-27-2013, 02:02 PM   #1
MelBr
Zealot
Posts: 105
Karma: 414068
Join Date: Feb 2013
Device: iPad Pro, Kobo Aura One
Calibre's search is pretty lacking and has issues

My biggest gripe with calibre is search. If you're used to Google, Bing or Spotlight search results, calibre's search feels like you're stuck in the early 2000s.

Here's an example. Say you're searching for "author - title". See that hyphen? Well, unless your book has a hyphen in it, Calibre won't find it. It also won't find it if you type a colon and no colon appears in the title, as in "Title: Subtitle". Today I was searching for a book and couldn’t find it because I used a hyphen. I found it through Spotlight instead.
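
To make the request concrete, here is a minimal Python sketch of punctuation-insensitive matching. It is not calibre's code; `normalize` and `matches` are hypothetical helpers, and the assumption is simply that both the query and the title are reduced to lowercase word tokens before they are compared.

```python
import re

def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())

def matches(query: str, title: str) -> bool:
    """True if every word of the query occurs among the words of the title."""
    title_words = set(normalize(title))
    return all(word in title_words for word in normalize(query))

# A pasted title that differs only by punctuation still matches.
print(matches("Title - Subtitle", "Title: Subtitle"))
print(matches("Title: Subtitle", "Title Subtitle"))
```

With this kind of normalization, a hyphen or colon in the pasted text no longer decides whether the book is found.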

Another issue is stemming. Calibre still can't return search results for "network" if you type in "networks". http://stackoverflow.com/questions/9...in-python-list
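
As a rough illustration of what stemming means here, the Python sketch below strips a few common English suffixes before comparing words. It is deliberately crude and English-only; `crude_stem` and `stem_match` are hypothetical names, and a real search engine would use a proper algorithm such as the Porter stemmer instead.

```python
def crude_stem(word: str) -> str:
    """Very rough English-only stemmer: strips a few common suffixes."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stem_match(query: str, title: str) -> bool:
    """True if every stemmed query word equals some stemmed title word."""
    title_stems = {crude_stem(w) for w in title.split()}
    return all(crude_stem(w) in title_stems for w in query.split())

# "networks" and "network" reduce to the same stem, so the title is found.
print(stem_match("networks", "Computer Network Security"))
```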

All of these issues are becoming more pronounced as Calibre is moving towards being more monolithic so using other tools is becoming problematic. Before, you could just do a grep/find or Spotlight search on calibre folder and find the title/author easily but now with super-short filenames, that becomes impossible for books with longer titles. Calibre's philosophy is that you should not look into Calibre Library's folder but that means that search should be as good as or better than what the OS provides.

Unless you can easily find things through Calibre's search, using libraries with a large number of books is becoming problematic (to me anyway).
Old 09-27-2013, 02:23 PM   #2
Zetmolm
Guru
Posts: 615
Karma: 2362786
Join Date: Jan 2010
Device: PocketBook Verse Pro Colour
Interesting. I've never had a problem with Calibre's search. And I have almost 40000 books in my library. Actually, I find the search in Calibre very efficient.

If you search for your keys in your desk but your keys are not in there, you won't find them. If you search for a colon in a title but the colon is not there, you won't find it. To me that makes perfect sense.

But then, I grew up long before Google, Bing, etc.

I'm not criticizing you, it's just that it really strikes me how different people can have different expectations.
Old 09-27-2013, 03:36 PM   #3
PatNY
Zennist
Posts: 1,022
Karma: 47809468
Join Date: Jul 2010
Device: iPod Touch, Sony PRS-350, Nook HD+ & HD
Quote:
Originally Posted by MelBr View Post
Here's an example. Say you're searching for "author - title". See that hyphen? Well, unless your book has a hyphen in it, Calibre won't find it. It also won't find it if you type a colon and no colon appears in the title, as in "Title: Subtitle". Today I was searching for a book and couldn’t find it because I used a hyphen. I found it through Spotlight instead.

Just curious. Why would you even use punctuation in your search? If you simply use one or two key words from the title, then Calibre will find the book easily. For example, if the book is "The Ocean at the End of the Lane" and I put in ocean end in the search field, the book will be found instantly. There is never the need to put in colons or hyphens AFAIK.


Quote:
Another issue is stemming. Calibre still can't return search results for "network" if you type in "networks". http://stackoverflow.com/questions/9...in-python-list
Calibre will find matches for partial words so you never need to put in plurals and are better off being briefer if you're not sure of the exact word in the title.

For example, with the book above, if I put in just ocea la in the search field, Calibre will find it easily.
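
The behaviour described above can be modelled in a few lines of Python. This is only a sketch of the matching rule as I understand it from use, not calibre's implementation, and `contains_all` is a hypothetical name: each space-separated fragment of the query must appear somewhere in the title.

```python
def contains_all(query: str, title: str) -> bool:
    """True if every space-separated query fragment is a substring of the title."""
    haystack = title.lower()
    return all(fragment in haystack for fragment in query.lower().split())

title = "The Ocean at the End of the Lane"
print(contains_all("ocean end", title))  # whole words
print(contains_all("ocea la", title))    # partial words
print(contains_all("oceans", title))     # plural is longer than the title word
```

The last call shows why the shorter form is the safer one: "ocea" is contained in "ocean", but "oceans" is not.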

One thing you might want to do, if you haven't done so already, is go into preferences and limit the metadata search to author and title. It makes the results much more accurate, and I did have some problems with Calibre searches before I selected that option.

--Pat
Old 09-27-2013, 05:51 PM   #4
jgaiser
Omnivorous
Posts: 3,283
Karma: 27978909
Join Date: Feb 2008
Location: Rural NW Oregon
Device: Kindle Voyage, Kindle Fire HD, Kindle 3, KPW1
Yeah. What's with the hyphen?

Search - author:authorname and title:titlename works just fine for me
Old 09-27-2013, 07:15 PM   #5
MelBr
Zealot
Posts: 105
Karma: 414068
Join Date: Feb 2013
Device: iPad Pro, Kobo Aura One
You gents ever used Google or Bing? EVERY search site strips punctuation and stems the terms. Even Amazon, GoodReader and B&N do it. Thousands of sites that use Lucene, for example, do the same. It's the basics of IR.

And to answer your question: I copy/pasted the title of a book. It differed from the one inside of Calibre by a hyphen.

Calibre's search is not user friendly if you have to type in cryptic stuff like author:"=John Smith". And even then a hyphen or a comma or a semicolon or a period will trip you up so your point is quite pointless and doesn't solve the bigger issue.
Old 09-27-2013, 07:40 PM   #6
jgaiser
Omnivorous
Posts: 3,283
Karma: 27978909
Join Date: Feb 2008
Location: Rural NW Oregon
Device: Kindle Voyage, Kindle Fire HD, Kindle 3, KPW1
Quote:
Originally Posted by MelBr View Post
Calibre's search is not user friendly if you have to type in cryptic stuff like author:"=John Smith". And even then a hyphen or a comma or a semicolon or a period will trip you up so your point is quite pointless and doesn't solve the bigger issue.
Sorry you're disappointed. Almost every major search engine *I've* tried has its own little cryptic ways of doing certain things.

I suspect that Kovid is limited by the Python libraries. What I did was dig through the help files area for "Searching", learned what I needed and moved on.

I also suspect that nothing much is going to change. You learn to live with what you see as limitations or you go find something else that better meets your requirements. For me, calibre meets my needs.
Old 09-27-2013, 07:44 PM   #7
PeterT
Grand Sorcerer
Posts: 13,682
Karma: 79983758
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
Quote:
Originally Posted by jgaiser View Post
I also suspect that nothing much is going to change.
Remember there is another option; submit a patch to change how search works.
Old 09-27-2013, 07:50 PM   #8
jgaiser
Omnivorous
Posts: 3,283
Karma: 27978909
Join Date: Feb 2008
Location: Rural NW Oregon
Device: Kindle Voyage, Kindle Fire HD, Kindle 3, KPW1
Quote:
Originally Posted by PeterT View Post
Remember there is another option; submit a patch to change how search works.
Yes, of course... The source is open and Kovid seems to be pretty open to some changes. I used to code searches for MySQL databases and database searches can be very convoluted. So maybe that's why I don't see the search engine in calibre as much of a problem.
Old 09-27-2013, 08:38 PM   #9
BetterRed
null operator (he/him)
Posts: 22,003
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by MelBr View Post
My biggest gripe with calibre is search. If you're used to Google, Bing or Spotlight search results, calibre's search feels like you're stuck in the early 2000s.
@MelBr I have some empathy with you, I psych my way around the frustration by regarding calibre's 'Search' as 'Query'.

But I'm sort of puzzled as to why you'd include punctuation in the search string, as in "author - title" or "Title: Subtitle". Unless of course you pick up the string from the clipboard/paste buffer that was loaded by PuTTY or something similar, as happens to me via Autocopy in browsers and Click.To in Windows - fortunately I found a widget that scrapes 'noise' from clipboard entries.

Quote:
Originally Posted by MelBr View Post
All of these issues are becoming more pronounced as Calibre is moving towards being more monolithic so using other tools is becoming problematic.
Care to elaborate - I've only been using calibre for a couple of years, but I can't say I've noticed any significant deterioration (or improvement) in this regard.

Quote:
Originally Posted by MelBr View Post
Before, you could just do a grep/find or Spotlight search on calibre folder and find the title/author easily but now with super-short filenames, that becomes impossible for books with longer titles.
You seem to be suggesting there's been a change in calibre's 'restrictions' regarding folder & file naming that results in truncation and transliteration. Is that really the case, or are your book titles becoming longer?

To avoid folder & file name truncation I usually leave straplines and such out of the book Titles. I put them in a custom column called Strapline (column type 'Text, column shown in the Tag browser'), which I include in 'Columns to search' but hide in the Tag browser.

If you use a long title that results in truncated file names then Spotlight should find the full title in the opf file. The problem then is how to get those results into Calibre. Have a look at the Recoll (Spotlight for Linux) plugin, maybe you could use it as the basis to create a Spotlight plugin.

EDIT : Full text search is apparently on Kovid's list of things to do, don't know where it sits priority wise.

BR

Last edited by BetterRed; 09-27-2013 at 09:36 PM. Reason: note re Kovids 'plans'
Old 09-27-2013, 09:29 PM   #10
At_Libitum
Addict
Posts: 265
Karma: 724240
Join Date: Aug 2013
Device: KyBook
Quote:
Originally Posted by MelBr View Post
Calibre's search is not user friendly if you have to type in cryptic stuff like author:"=John Smith".
If you don't want to have to type the final query string yourself (which is perfectly natural, as I wouldn't want to have to remember the correct format either), you could always use the binocular icon to the left of the search bar to help you generate that 'cryptic' stuff.
Old 09-28-2013, 02:25 AM   #11
DoctorOhh
US Navy, Retired
Posts: 9,897
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Originally Posted by BetterRed View Post
You seem to be suggesting there's been a change in calibre's 'restrictions' regarding folder & file naming that results in truncation and transliteration. Is that really the case or are your book titles becoming longer.
Calibre reduced the length used for filenames (?) when moving to the new database backend. Many folks commented on it at the time. I don't recall the details because I don't care, but it was done.

Quote:
Originally Posted by BetterRed View Post
To avoid folder & file name truncation I usually leave straplines and such out of the book Titles. I put them in a custom column called Strapline (column type 'Text, column shown in the Tag browser'), which I include in 'Columns to search' but hide in the Tag browser.
Good option.
Old 09-28-2013, 03:03 AM   #12
kacir
Wizard
Posts: 3,463
Karma: 10684861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by MelBr View Post
Calibre's search is not user friendly if you have to type in cryptic stuff like author:"=John Smith". And even then a hyphen or a comma or a semicolon or a period will trip you up so your point is quite pointless and doesn't solve the bigger issue.
You do not HAVE to type it that way. You can simply type John Smith and get the result. You also have an option to use column names, such as author:"=John Smith". That is not all, however, because you can omit that equal sign so Calibre doesn't do a "full string match", so a much better way would be to search authors:"John" and authors:"Smith". This search term would return books written by "John Smith", "Smith, John", "John T. Smith", or even "Johnathan Smither".
There is also Regular Expression support, which makes the search extremely powerful.
You do not have to remember those authors:"=something" or title:"FooBar" forms; there is a nice icon to the left of the search bar that opens the "Advanced search" dialog, where you can construct an advanced query without learning anything. When the query is ready, you can refine it further by editing it directly. If you like the query you can save it as a Saved search and tweak it a little when you do a similar search next time. You can also build a Virtual Library: use the left pane to select a combination of authors, tags, formats, ratings, publishers, saved searches, whatever, and have the query constructed automagically. Then you can edit the query, or save it, or ...
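
To show the difference between the "=" form and the plain form, here is a small Python model of the two kinds of match. It only imitates the semantics described above and is not calibre's code; `equals` and `contains` are hypothetical helpers, the author list is made up, and case-insensitive comparison is an assumption.

```python
authors = [
    "John Smith",
    "Smith, John",
    "John T. Smith",
    "Johnathan Smither",
    "Jane Smith",
]

def equals(value: str, field: str) -> bool:
    """Model of a full string match, like authors:"=John Smith"."""
    return field.lower() == value.lower()

def contains(value: str, field: str) -> bool:
    """Model of a substring match, like authors:"John"."""
    return value.lower() in field.lower()

# authors:"=John Smith"
exact = [a for a in authors if equals("John Smith", a)]
# authors:"John" and authors:"Smith"
loose = [a for a in authors if contains("John", a) and contains("Smith", a)]

print(exact)
print(loose)
```

The exact form returns only the one author, while the combined substring form also picks up the reordered, middle-initial and longer-name variants.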

So, the query system in Calibre is very powerful and advanced and offers many features that even some databases lack. You can still use it just by typing a simple string, so you do not have to be a database guru to use it 99% of the time.
Now, please tell me what exactly is the search lacking? A built-in SQL query statements for JOIN?

Please Kovid, if you do read this discussion, *please*, do not dumb down the search just because some users can not be bothered to read documentation or use icon next to search bar ;-)

Last edited by kacir; 09-28-2013 at 03:11 AM.
Old 09-28-2013, 05:36 AM   #13
Zetmolm
Guru
Posts: 615
Karma: 2362786
Join Date: Jan 2010
Device: PocketBook Verse Pro Colour
Quote:
Originally Posted by kacir View Post
Now, please tell me what exactly is the search lacking? A built-in SQL query statements for JOIN?

Please Kovid, if you do read this discussion, *please*, do not dumb down the search just because some users can not be bothered to read documentation or use icon next to search bar ;-)
I think the OP was pretty clear about what he/she wanted: some sort of fuzzy search. Ignore punctuation. Apply stemming. And all that when you type a simple search term. Remember, Regular Expressions are not for everyone. Neither are SQL JOIN statements.

As I said, I never had any problem with the search features in Calibre, which indeed are very powerful. But I do understand that for many people it might be nice e.g. that if you type John Smith you also find books written by John Smythe.

But I agree with kacir that a search feature like that should never *replace* the current search. It might be a nice *addition* to it. And usage should be optional, e.g. in the form of a checkbox next to the search bar.
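
For what it's worth, a fuzzy match of that kind could be sketched with Python's standard `difflib` module. This is only an illustration of the idea of an optional, separate fuzzy mode, not a proposal for how calibre should do it; the author list and the 0.8 cutoff are arbitrary assumptions.

```python
import difflib

authors = ["John Smythe", "Joan Smith", "Jane Austen", "John Smith"]

# Return up to three names whose similarity to the query is at least 0.8.
close = difflib.get_close_matches("John Smith", authors, n=3, cutoff=0.8)
print(close)
```

Near spellings such as "John Smythe" pass the cutoff while unrelated names do not, which is the behaviour a checkbox next to the search bar could switch on.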
Old 09-28-2013, 06:10 AM   #14
BetterRed
null operator (he/him)
Posts: 22,003
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by DoctorOhh View Post
Calibre reduced the length used for filenames (?) when moving to the new database backend. Many folks commented on it at the time. I don't recall the details because I don't care, but it was done.
Yes I recall the discussions and I recall thinking --- hmmm, I've not noticed any change, maybe that's because I have the author and title related settings at the factory defaults. However my backup logs reveal that thousands of my library folders and format files haven't been synched for a year or more.

But this is probably because I avoid using long names in the first instance. If there's going to be any shortening of folder or file names I prefer to apply my domain knowledge to do it, rather than leave it to a mechanistic algorithm.

When I add a book the first thing I do is to get the author names and title 'right', in doing that I look at the names of the folders and files created, if necessary I make changes so that the folder and file names are 'sensible'.

Quote:
Originally Posted by kacir View Post
You do not HAVE to type it that way. You can simply type....
The OP already covered that point, in post #5 MelBr wrote the following

Quote:
Originally Posted by MelBr View Post
And to answer your question: I copy/pasted the title of a book. It differed from the one inside of Calibre by a hyphen.
Before critiquing a critic one has a duty of care to oneself to read what the critic actually wrote** - it amazes me how often that maxim is ignored at MobileRead.

** I forget who said that, might have been Chesterton or Pauline Kael. Oliver Cromwell put it this way: I beseech you, in the bowels of Christ, think it possible that you may be mistaken.

But the gist of your post does go to the nub of MelBr's issue.

Calibre's 'search facility' is primarily based on querying its heavily indexed relational database, which contains structured data. I know there are Comments and similar columns - but inclusion of them in a calibre search can have a dramatic deleterious impact on search times - as in minutes rather than a couple of seconds.

However, MelBr compares calibre to facilities found on search sites such as Google and sites that use Lucene. But I suspect MelBr knows full well that the techniques used to index structured data in a relational database (and a small single-user one at that) and those used to index unstructured data on a zillion web pages and documents are not quite the same thing.

And that's why in the context of calibre I interpret the word 'search' as 'query' - because that's what it does, it interrogates a relational database using Structured Query Language.

It's also why I will continue to use my OS for my real searches. I almost always want to search beyond my calibre libraries: I want to see emails, letters, receipts, invoices, blog posts, media transcripts etc, etc. I don't expect calibre to find those things any more than I expect Quod Libet (music library) to find pictures of my cat.

BR

Last edited by BetterRed; 09-28-2013 at 06:21 AM.
Old 09-28-2013, 07:07 AM   #15
chaley
Grand Sorcerer
Posts: 12,525
Karma: 8065948
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Quote:
Originally Posted by BetterRed View Post
And that's why in the context of calibre I interpret the word 'search' as 'query' - because that's what it does, it interrogates a relational database using Structured Query Language.
FWIW: calibre searches neither use SQL nor depend on the db structure. Search expressions are "compiled" into an abstract syntax tree (AST). The tree form of the expression is evaluated on a book-by-book basis, filtering results using set arithmetic to avoid evaluating a sub-expression which has no chance it can match. It further uses set arithmetic to filter the results by restriction (virtual library etc).
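
To give a feel for the approach, here is a heavily simplified Python sketch of evaluating an expression tree with set arithmetic. It is not the real calibre code: the node classes, the `BOOKS` table and the `evaluate` function are all made up for illustration, and matching is reduced to a plain substring test.

```python
from dataclasses import dataclass

@dataclass
class Term:
    field: str
    text: str

@dataclass
class And:
    left: object
    right: object

@dataclass
class Or:
    left: object
    right: object

BOOKS = {
    1: {"title": "The Ocean at the End of the Lane", "authors": "Neil Gaiman"},
    2: {"title": "American Gods", "authors": "Neil Gaiman"},
    3: {"title": "The Old Man and the Sea", "authors": "Ernest Hemingway"},
}

def evaluate(node, candidates: set[int]) -> set[int]:
    """Return the book ids in `candidates` that match `node`."""
    if isinstance(node, Term):
        return {i for i in candidates
                if node.text.lower() in BOOKS[i][node.field].lower()}
    if isinstance(node, And):
        # The right side is only evaluated on books that survived the left side.
        return evaluate(node.right, evaluate(node.left, candidates))
    if isinstance(node, Or):
        left = evaluate(node.left, candidates)
        # Books already matched by the left side need not be checked again.
        return left | evaluate(node.right, candidates - left)
    raise TypeError(f"unknown node: {node!r}")

# Tree for: authors:gaiman and title:ocean
query = And(Term("authors", "gaiman"), Term("title", "ocean"))
restriction = {1, 2, 3}  # e.g. the ids in the current virtual library
print(evaluate(query, restriction))
```

Passing a smaller `restriction` set is how a virtual library or other restriction would narrow the result in this model.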

Unanchored non-regexp text matches are determined using ICU (International Components for Unicode) equivalency rules. The ICU package "compiles" the query text into a canonical form that takes into consideration accented character equivalences in the locale being used, then scans the text being examined for that form. That is why searches for "solzen" will find "Solženicyn, Aleksandr", "stepa" will find "Petr Štepánek", and "strasse" will find "Straße", at least in the English locale. BTW: using this process explains why searching can fail quite badly on OS X in some locales. Apple ships an old, broken ICU package.
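
The effect can be approximated in plain Python with the standard `unicodedata` module. To be clear, this is not ICU and not what calibre does internally; it is a locale-blind stand-in (case folding plus stripping combining marks), and `fold` and `found` are hypothetical helpers, shown only to illustrate why those three searches succeed.

```python
import unicodedata

def fold(text: str) -> str:
    """Approximate accent- and case-insensitive form (not real ICU collation)."""
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def found(query: str, text: str) -> bool:
    """True if the folded query occurs anywhere in the folded text."""
    return fold(query) in fold(text)

print(found("solzen", "Solženicyn, Aleksandr"))
print(found("stepa", "Petr Štepánek"))
print(found("strasse", "Straße"))
```

Unlike ICU, this sketch applies the same folding in every locale, so it cannot express locale-specific equivalences.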

One optimization I have considered is to reorder the AST so that expressions that have a better chance of being restrictive are evaluated first. This optimization could improve the performance of expressions like "foo and title:bar" because the naked search term "foo" would be checked only if the title contains bar. Evaluated as written, all of the fields permitted to be checked for naked search terms would be checked before the title is checked. I haven't bothered yet because I don't have strong evidence that any improvement merits the work.
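
A toy Python model of that reordering idea follows. Everything in it is hypothetical (the field list, the `cost` estimate, the sample books); it only counts how many field comparisons are made when the AND-ed terms are evaluated as written versus cheapest first.

```python
SEARCHABLE_FIELDS = ["title", "authors", "tags", "comments"]

BOOKS = [
    {"title": "Bar Tales", "authors": "A. Foo", "tags": "", "comments": ""},
    {"title": "Other", "authors": "B. Baz", "tags": "foo", "comments": ""},
    {"title": "Another", "authors": "C. Qux", "tags": "", "comments": ""},
]

def cost(term: str) -> int:
    """Estimated cost: a field-qualified term checks one field, a naked term all."""
    return 1 if ":" in term else len(SEARCHABLE_FIELDS)

def term_matches(term: str, book: dict, counter: list) -> bool:
    """Check one term against one book, counting every field comparison."""
    if ":" in term:
        field, text = term.split(":", 1)
        fields = [field]
    else:
        text, fields = term, SEARCHABLE_FIELDS
    for field in fields:
        counter[0] += 1
        if text.lower() in book[field].lower():
            return True
    return False

def search(terms: list, counter: list) -> list:
    """AND all terms together, stopping at the first term that fails for a book."""
    return [b["title"] for b in BOOKS
            if all(term_matches(t, b, counter) for t in terms)]

as_written, reordered = [0], [0]
r1 = search(["foo", "title:bar"], as_written)
r2 = search(sorted(["foo", "title:bar"], key=cost), reordered)
print(r1 == r2, as_written[0], reordered[0])
```

Both orders return the same books; the reordered form simply performs fewer comparisons on this sample data, which is the kind of saving in question.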

Attempting to do stemming and sound equivalency in a product that runs in hundreds of languages is far beyond anything I would want to try to do. And given that historically Kovid and I are the only two people who have shown interest in working on calibre's search expression analyzer, that probably means it isn't going to happen any time soon.
All times are GMT -4. The time now is 04:57 AM.


MobileRead.com is a privately owned, operated and funded community.