Hello fellow Calibre users,
I've been using Calibre for a number of years now to manage all my e-books, including technical textbooks.
Lecture notes (pdf, docx) and personal technical notes, which aren't e-books, are not Calibre-ized.
As such, whenever I search for a technical concept, for example the term "homomorph*", I have to perform at least two searches: one using Calibre's (impressive) search functionality to look through my e-textbooks, and another to search across documents outside Calibre.
For succinctness, let me call searching using methods outside of the Calibre app "non-Calibre search". This covers searching using Windows File Explorer, the Mac Finder app, or, as is my habit, the Foxtrot Search app.
(I briefly describe the Foxtrot Search app at the end of this post.)
A naïve way to search once instead of twice is to have the non-Calibre search also look within Calibre's e-books. This may be done by including Calibre's e-book directory in the search scope.
When this is done, searching for the occurrence of the word "advertise" in a mix of Linear Algebra & Corporate Marketing documents works fine because "advertise" is rarely used in Linear Algebra exposition, resulting in the search only returning Corporate Marketing documents. Note that documents here refer to both Calibre-ized e-books and files outside of Calibre since the single search query was conducted over both Calibre's e-book directory and elsewhere.
However, when one is only concerned with Linear Algebra concepts, searching for words that commonly occur in both Linear Algebra and Corporate Marketing, like "rank", "transformation", "optimization", "projection", and "dimension", leads to search results containing irrelevant Corporate Marketing documents. This is why I described this method (the mere inclusion of Calibre's e-book directory in a non-Calibre search) as naïve.
A better approach that reduces irrelevant documents turning up in the search results is to instruct the search effort to focus solely on Linear Algebra documents.
To achieve this, Windows File Explorer, the Mac Finder app, and Foxtrot Search allow their users to specify filters. One such filter works with OS-native tags. To my knowledge (and I don't work in the space of filesystems), an OS-native tag for a file is commonly reflected as an extended attribute of that file. For example, if a file were tagged as "psychology" via the Mac Finder app, running the terminal command "xattr -l" on that file prints "psychology" among other things. More precisely, for the programmers reading this, it prints a bunch of text that contains the string "psychology".
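For the programmers: as far as I understand it, on macOS the Finder tags live in an extended attribute named com.apple.metadata:_kMDItemUserTags, stored as a binary property list holding an array of strings, where each string is the tag name optionally followed by a newline and a colour index. Below is a minimal Python sketch that decodes such a value once its raw bytes have been read. The function name parse_finder_tags is my own invention, and the sample bytes are built by hand rather than read from a real file.

```python
import plistlib

# Attribute name as I understand it; treat it as an assumption.
FINDER_TAGS_XATTR = "com.apple.metadata:_kMDItemUserTags"


def parse_finder_tags(raw: bytes) -> list[str]:
    """Decode a raw Finder-tags xattr value into plain tag names."""
    entries = plistlib.loads(raw)  # auto-detects binary or XML plist
    # Each entry looks like "name" or "name\n<colour index>".
    return [entry.split("\n", 1)[0] for entry in entries]


if __name__ == "__main__":
    # Hand-built stand-in for what the attribute might contain.
    sample = plistlib.dumps(
        ["psychology\n0", "Linear Algebra"], fmt=plistlib.FMT_BINARY
    )
    print(parse_finder_tags(sample))
```

This only covers decoding; reading the attribute from a file needs an OS-specific call or tool, which I have left out.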
So, via a tag filter, one could instruct the search effort to only focus on documents tagged "Linear Algebra", reducing the appearance of irrelevant documents in the search results.
To my knowledge, Calibre 17.9 doesn't reflect its own tags as OS-native tags by default, motivating me to pen this thread with the hope that someone might know how to get Calibre to do so, or to hear from someone who might be able to offer other possible solutions.
To summarize, it'd be quite neat if Calibre could reflect its own tags as OS-native tags so that other search applications could filter on these OS-native tags to reduce users' search efforts. From a certain perspective, OS-native tags used in this manner represent a primitive form of inter-application communication.
(To clarify, by "reflect Calibre tags as OS-native tags", I mean that if the file, book.pdf, were tagged as "real analysis" in the Calibre app, this file would then have an OS-native tag added, possibly an extended attribute represented by the text "calibre: real analysis", or another text to this effect. And when the user removes this tag using the Calibre app, the corresponding OS-native tag is removed as well.)
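To make the add-and-remove behaviour concrete, here is a rough Python sketch of the bookkeeping I have in mind. It is not how Calibre works today, just an illustration: the "calibre: " prefix is the hypothetical one from my example above, the function name sync_os_tags is made up, and actually writing the result back to the file's OS-native tags is left out.

```python
CALIBRE_PREFIX = "calibre: "  # hypothetical prefix, as in the example above


def sync_os_tags(os_tags, calibre_tags):
    """Return the OS tag list after mirroring the book's current Calibre tags."""
    # Tags the user added outside Calibre are left untouched.
    kept = [t for t in os_tags if not t.startswith(CALIBRE_PREFIX)]
    # Mirrored tags are rebuilt from Calibre's current state, so a tag
    # removed in Calibre simply drops out of the result.
    mirrored = [CALIBRE_PREFIX + t for t in sorted(set(calibre_tags))]
    return kept + mirrored


if __name__ == "__main__":
    before = ["psychology", "calibre: real analysis"]
    print(sync_os_tags(before, ["linear algebra"]))
```

Rebuilding the mirrored tags on every change avoids having to track individual additions and removals, at the cost of rewriting the attribute each time.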
****
Foxtrot Search (https://foxtrot-search.com/) is an application that lets users search within documents (.pdf, .doc, etc. files). It is quite sophisticated in my opinion, and has a UI that is really handy for finding textual content occurring several times in documents, across several documents, that may be scattered across many paths.
A thread was recently raised in Foxtrot's forums, seeking suggestions for filtering on Calibre's e-book path conventions:
https://forum.foxtrot-search.com/ind...h=612&start=0&