06-28-2012, 08:00 AM   #1
fufu42
Junior Member
Posts: 3
Karma: 10
Join Date: May 2012
Device: Kobo Touch
Best way to search inside ebook library?

Hi, I've been searching in all kinds of places for a while and decided to ask here about the following problem.

I have a collection of ebooks in many different formats (txt, pdf, epub, mobi, etc.), all inside a calibre directory. Unfortunately, calibre can view ebooks but has no full-text search function.

From my point of view, the minimum requirements for the search functionality would be:
- regular expressions in full text search
- search inside all common ebook formats
- unicode support

Optional but not vital:
- pre-indexed file content
- search inside archived files
- on-the-fly indexing

So far I've collected the following info:

- The program Beagle for Linux seems to have met my needs but is no longer maintained. I haven't tried installing the latest version. Has anybody done so recently?

- Google Desktop is discontinued, and it probably had only boolean operators rather than regex anyway.

- Copernic Desktop Search seems interesting, but the developer's site says nothing about regular expressions. I haven't tried it. Has anyone?

- Agent Ransack, which I just tried, seems interesting but probably can't search inside epub (and other formats), though it does fast full-text regex search in pdf, archived and other plain-text-like files. (If I decided to use that program, it would somewhat oddly mean that I'd have to convert all non-pdf files to pdf...) Agent Ransack does no indexing ahead of the search.

- I wouldn't hesitate to use command-line tools. Basically, grep can do all I need for now. The question then would be how to extract a searchable corpus, along with the necessary file information, from the library, including the pdf and epub formats.
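
To make the grep idea above concrete, here is a rough Python sketch of what I have in mind, not a finished tool. It assumes that an epub is a zip archive of (X)HTML files, that the external `pdftotext` program from Poppler is installed for pdf files (if it isn't, pdfs are silently skipped), and that txt files are UTF-8. The function names (`epub_text`, `pdf_text`, `search_library`) are made up for illustration, and mobi is not handled at all.

```python
import html
import re
import shutil
import subprocess
import zipfile
from pathlib import Path


def epub_text(path):
    """Yield (member name, plain text) for each (X)HTML file inside an epub."""
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                raw = zf.read(name).decode("utf-8", errors="replace")
                # Crude tag stripping; assumed good enough for grep-style search.
                yield name, html.unescape(re.sub(r"<[^>]+>", " ", raw))


def pdf_text(path):
    """Return pdf text via the external pdftotext tool, or "" if it is missing."""
    if shutil.which("pdftotext") is None:
        return ""
    result = subprocess.run(["pdftotext", str(path), "-"], capture_output=True)
    return result.stdout.decode("utf-8", errors="replace")


def search_library(root, pattern):
    """Yield (file, part, line) for every extracted line matching the regex."""
    regex = re.compile(pattern)  # str patterns are Unicode-aware in Python 3
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        try:
            if suffix == ".epub":
                parts = epub_text(path)
            elif suffix == ".pdf":
                parts = [("", pdf_text(path))]
            elif suffix == ".txt":
                parts = [("", path.read_text(encoding="utf-8", errors="replace"))]
            else:
                continue  # mobi and other formats are not handled here
            for part, text in parts:
                for line in text.splitlines():
                    if regex.search(line):
                        yield str(path), part, line.strip()
        except (zipfile.BadZipFile, OSError):
            continue  # skip files that cannot be read
```

Calling `search_library` with the library directory and a pattern such as `r"w\wrld"` would yield one tuple per matching line, with the file path kept alongside each hit so the file information isn't lost. Nothing is indexed ahead of time, so every search re-reads the whole library.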


Any other suggestions?

Thanks in advance!

Last edited by fufu42; 06-28-2012 at 09:57 AM.