|  07-27-2020, 12:30 PM | #16 | 
| Custom User Title            Posts: 11,351 Karma: 79528341 Join Date: Oct 2018 Location: Canada Device: Kobo Libra H2O, formerly Aura HD | 
			
			Since a good part of my library is PDFs, using pdftotext sped things up considerably. I did notice everything else lagging when I used 12 processes. I switched it to six and the lag disappeared. | 
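Editor's sketch of the trade-off described above: text extraction runs through `pdftotext`, with a cap on how many extractions run at once. This is not the plugin's actual code. The function names and the default of six workers (the value reported above to remove the lag) are assumptions for illustration, and it presumes the `pdftotext` binary is on the PATH.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Assumption: six workers, the value reported above to stop the system lagging.
MAX_WORKERS = 6


def pdftotext_command(pdf_path):
    """Build the command line; "-" as output file sends the text to stdout."""
    return ["pdftotext", pdf_path, "-"]


def extract_text(pdf_path):
    """Run pdftotext on one file and return the extracted text."""
    result = subprocess.run(
        pdftotext_command(pdf_path), capture_output=True, check=True
    )
    return result.stdout.decode("utf-8", errors="replace")


def extract_all(pdf_paths, max_workers=MAX_WORKERS):
    """Extract text from many PDFs with at most max_workers running at once.

    Threads are enough here because each worker only waits on an
    external pdftotext process.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(pdf_paths, pool.map(extract_text, pdf_paths)))
```

Lowering `max_workers` trades indexing speed for a more responsive desktop, which matches the 12 versus 6 observation in the post.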
|   |   | 
|  07-27-2020, 12:35 PM | #17 | 
| Connoisseur            Posts: 77 Karma: 90088 Join Date: Jul 2020 Device: android | |
|   |   | 
|  07-27-2020, 12:48 PM | #18 | |
| Bibliophagist            Posts: 48,088 Karma: 174315300 Join Date: Jul 2010 Location: Vancouver Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos | Quote: 
 My main reason for checking this out was that I use ElasticSearch with Graylog, which states that part of its reason for existence is to work around the shortcomings of ElasticSearch. Last edited by DNSB; 07-27-2020 at 12:53 PM. | |
|   |   | 
|  07-27-2020, 01:09 PM | #19 | |
| Connoisseur            Posts: 77 Karma: 90088 Join Date: Jul 2020 Device: android | Quote: 
 That said, I hope it's more than enough for local library management.   | |
|   |   | 
|  07-27-2020, 03:06 PM | #20 | 
| Custom User Title            Posts: 11,351 Karma: 79528341 Join Date: Oct 2018 Location: Canada Device: Kobo Libra H2O, formerly Aura HD | 
			
			My library is about 4000 books and 20 GB, though the bulk of that is image-heavy PDF files (lots of video game strategy guides).
		 Last edited by ownedbycats; 07-27-2020 at 03:09 PM. | 
|   |   | 
|  07-27-2020, 03:17 PM | #21 | 
| Wizard            Posts: 1,291 Karma: 1428263 Join Date: Dec 2016 Location: Goiânia - Brazil Device: iPad, Kindle Paperwhite, Kindle Oasis | 
			
			Thanks for the plugin! Awesome idea. I did not install pdftotext, but I only have 18 PDFs in my library. I'm testing it on a library with 1130 books (many with multiple formats: EPUB, AZW3/KFX) and about 3 GB.

			Info about the initial indexing: it took only 16 minutes to go from 0 to 99%. But now it has been stuck at 99% for about 3h45min. My system is an i7 7700HQ (16 GB of RAM). My processor has 4 cores (8 threads), and the plugin chose 8 max parallel processes. Now, the strange part: according to Task Manager (Windows), my CPU is only using 20% of its total capacity. While I was writing this post, it finished, after 4h05min. Now it searches instantly! Nice!

			------ My first impressions and questions ------
			1) Question: When you have multiple formats for one book, does it look up all the formats or just one?
			2) Question: In caps.json, it only shows EPUB, MOBI, PDF and TXT files. According to this, and other tests I have done, it does not index AZW3/KFX files. Is this correct?
			3) Question: How does the index work for new additions? Are the new files automatically indexed when I run ElasticSearch?
			4) Suggestion: It would be really important to have more options for search. Right now, it searches word by word, so I can't look for phrases or compound words (e.g. searching for "coffee table" finds books with "coffee" OR "table"). Also, accented characters are distinguished from non-accented ones.
			5) Info: According to the ElasticSearch Reference, to have more options for search, you would need to change your query from "match" to "query_string". This would allow operators, wildcards and regular expressions. P.S.: "match" queries can use operators too, but you would have to code that.
			6) Info: The ZIP file attached to the first post has another ZIP inside (with the full plugin). | 
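Editor's sketch for points 4 and 5 above, showing the request bodies involved as plain Python dictionaries. The field name `content` is hypothetical, since the thread does not say which field the plugin indexes, and the helper names are invented for illustration. The `match` options and the `query_string` query themselves are standard ElasticSearch Query DSL.

```python
import json

# Hypothetical field name; the plugin's real mapping is not shown in the thread.
FIELD = "content"


def match_any(text):
    """Default "match": terms are combined with OR ("coffee" OR "table")."""
    return {"query": {"match": {FIELD: {"query": text}}}}


def match_all_terms(text):
    """ "match" with an explicit operator: every term must be present."""
    return {"query": {"match": {FIELD: {"query": text, "operator": "and"}}}}


def query_string(expression):
    """ "query_string": the user's text is parsed for operators and wildcards."""
    return {
        "query": {
            "query_string": {"default_field": FIELD, "query": expression}
        }
    }


if __name__ == "__main__":
    # Print the three bodies so they can be compared side by side.
    for body in (
        match_any("coffee table"),
        match_all_terms("coffee table"),
        query_string('coffee AND tabl*'),
    ):
        print(json.dumps(body, indent=2))
```

One design consideration: `query_string` rejects syntactically invalid input with an error, so a plugin passing raw user text to it would likely need to handle failed queries.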
|   |   | 
|  07-27-2020, 04:26 PM | #22 | |||
| Bibliophagist            Posts: 48,088 Karma: 174315300 Join Date: Jul 2010 Location: Vancouver Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos | Quote: 
 Quote: 
 Quote: 
 I used the 3rd zip file from message #9 in this thread. Note that my setup is on Windows x64. I also restarted the full search after deleting the old setup and moving my computer related ebooks out of my calibre library. I also realized that I had not pointed to pdftotext properly and corrected that. Was a heck of a lot faster with those mostly oversized pdf files removed. 2 hours down to 5 minutes. | |||
|   |   | 
|  07-27-2020, 04:51 PM | #23 | |||||
| Connoisseur            Posts: 77 Karma: 90088 Join Date: Jul 2020 Device: android | 
			
			Thanks for the feedback, thiago.eec! You have probably encountered one of those "bad" PDF files that take ages to process without pdftotext. Quote: 
 Quote: 
 Quote: 
 Quote: 
 Quote: 
   | |||||
|   |   | 
|  07-27-2020, 05:19 PM | #24 | |
| Connoisseur            Posts: 77 Karma: 90088 Join Date: Jul 2020 Device: android | Quote: 
 New files won't be indexed automatically the moment you add new books, but they will be indexed the next time you run Power Search. | |
|   |   | 
|  07-27-2020, 06:20 PM | #25 | 
| Bibliophagist            Posts: 48,088 Karma: 174315300 Join Date: Jul 2010 Location: Vancouver Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos | |
|   |   | 
|  07-27-2020, 07:44 PM | #26 | 
| Custom User Title            Posts: 11,351 Karma: 79528341 Join Date: Oct 2018 Location: Canada Device: Kobo Libra H2O, formerly Aura HD | 
			
			While testing, I also had FanFicFare update some of my fanfics and then did a search for random words that appeared only in the newest chapters. It did re-index the ePubs that had changed.
		 | 
|   |   | 
|  07-28-2020, 11:37 AM | #27 | ||
| Wizard            Posts: 1,291 Karma: 1428263 Join Date: Dec 2016 Location: Goiânia - Brazil Device: iPad, Kindle Paperwhite, Kindle Oasis | Quote: 
 Quote: 
 I did a simple test here, and changing the query from "match" to "match_phrase" did the job, allowing phrases and compound words. Using "query_string" isn't that easy, though. | ||
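Editor's sketch of the change described above: the same search text sent as a `match_phrase` query instead of `match`. The helper name and the `content` field are hypothetical, and this is not the plugin's actual code. `slop` is a standard `match_phrase` option that allows that many positions of movement between the words.

```python
def phrase_query(field, phrase, slop=0):
    """Build a match_phrase body: the words must appear together, in order.

    With slop=0 "coffee table" matches only the exact phrase; a larger
    slop tolerates words in between or slightly out of position.
    """
    return {"query": {"match_phrase": {field: {"query": phrase, "slop": slop}}}}


if __name__ == "__main__":
    # "content" is an assumed field name for the indexed book text.
    print(phrase_query("content", "coffee table"))
```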
|   |   | 
|  07-28-2020, 05:32 PM | #28 | ||
| Connoisseur            Posts: 77 Karma: 90088 Join Date: Jul 2020 Device: android | Quote: 
 I just decided to be conservative in the first releases and support only those book formats that I could test well enough. I will extend this list in the next release once I make sure it works well. Quote: 
  This, however, means that I would need to find a way of sorting results by relevance. I don't know how easy that is to do in Calibre, but in general it seems to me the right way to go. | ||
|   |   | 
|  07-31-2020, 12:05 PM | #29 | 
| Connoisseur            Posts: 77 Karma: 90088 Join Date: Jul 2020 Device: android | 
			
			Version 1.2.0 released. It contains the following usability improvements: 
 Adds support for DOC, AZW3, KFX file formats. Last edited by mapozyan; 08-08-2020 at 01:00 PM. Reason: Removed attached version 1.2.0 | 
|   |   | 
|  07-31-2020, 12:18 PM | #30 | |
| Connoisseur            Posts: 77 Karma: 90088 Join Date: Jul 2020 Device: android | Quote: 
   | |
|   |   | 