#16 |
Custom User Title
Posts: 8,872
Karma: 62040409
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
Since a good part of my library is PDFs, using pdftotext sped things up considerably.
I did notice everything else lagging when I used 12 processes. I switched it to six and the lag disappeared. |
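The trade-off described above (12 parallel processes caused lag, six did not) can be sketched as a capped worker pool around pdftotext. This is only an illustration, not the plugin's code: the helper names and the "half the logical CPUs" rule are assumptions, and it relies on `pdftotext` being on the PATH and on `-` as the output file meaning standard output.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def pick_worker_count(cpu_count, fraction=0.5):
    """Use only part of the logical CPUs so the desktop stays responsive."""
    return max(1, int((cpu_count or 1) * fraction))


def pdftotext_command(pdf_path):
    # "-" as the output file asks pdftotext to write the text to stdout.
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def extract_text(pdf_path):
    """Run pdftotext on one file and return its text ('' on any failure)."""
    try:
        result = subprocess.run(
            pdftotext_command(pdf_path), capture_output=True, timeout=300
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""
    if result.returncode != 0:
        return ""
    return result.stdout.decode("utf-8", errors="replace")


def extract_all(pdf_paths, workers=None):
    """Extract text from many PDFs with a capped number of workers."""
    workers = workers or pick_worker_count(os.cpu_count())
    # Threads are enough here: the heavy work happens in the child processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(pdf_paths, pool.map(extract_text, pdf_paths)))
```

On a machine reporting 12 logical CPUs, `pick_worker_count(12)` gives 6, which matches the setting that removed the lag in the post above.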
#17 |
Connoisseur
Posts: 77
Karma: 90088
Join Date: Jul 2020
Device: android
|
|
#18 | |
Bibliophagist
Posts: 36,979
Karma: 148318166
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Quote:
My main reason for checking this out was that I use ElasticSearch with Graylog, which states that part of its reason for existence is to work around the shortcomings of ElasticSearch.
Last edited by DNSB; 07-27-2020 at 12:53 PM. |
|
#19 | |
Connoisseur
Posts: 77
Karma: 90088
Join Date: Jul 2020
Device: android
|
Quote:
That said, I hope it's more than enough for local library management. |
|
#20 |
Custom User Title
Posts: 8,872
Karma: 62040409
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
My library is about 4000 books and 20 GB, though the bulk of that is image-heavy PDF files (lots of video game strategy guides).
Last edited by ownedbycats; 07-27-2020 at 03:09 PM. |
#21 |
Guru
Posts: 946
Karma: 1183425
Join Date: Dec 2016
Location: Goiânia - Brazil
Device: iPad, Kindle Paperwhite
|
Thanks for the plugin! Awesome idea.
I did not install pdftotext, but I only have 18 PDFs in my library. I'm testing it on a library with 1130 books (many with multiple formats: EPUB, AZW3/KFX) and about 3 GB.
Info about the initial indexing: it took only 16 minutes to go from 0 to 99%. But now it has been stuck at 99% for about 3h45min. My system is an i7 7700HQ (16 GB of RAM). My processor has 4 cores (8 threads). The plugin chose 8 max parallel processes. Now, the strange part: according to Task Manager (Windows), my CPU is only using 20% of its total capacity.
While writing this post, it finished, after 4h05min. Now it searches instantly! Nice!
------ My first impressions and questions ------
1) Question: When you have multiple formats for one book, does it look up all the formats or just one?
2) Question: In caps.json, it only shows EPUB, MOBI, PDF and TXT files. According to this, and other tests I have done, it does not index AZW3/KFX files. Is this correct?
3) Question: How does the index work for new additions? Are the new files automatically indexed when I run ElasticSearch?
4) Suggestion: It would be really important to have more options for search. Right now, it searches word by word, so I can't look for phrases or compound words (e.g. coffee table: it will search for books with "coffee" OR "table"). Also, accented characters are treated as different from their non-accented counterparts.
5) Info: According to the ElasticSearch Reference, to have more options for search, you would need to change your query from "match" to "query_string". This would allow operators, wildcards and regular expressions. P.S.: "match" queries can use operators too, but you would have to code that.
6) Info: The ZIP file attached to the first post has another ZIP inside (with the full plugin). |
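To make points 4 and 5 concrete, here is a rough sketch of the request bodies involved, written as plain Python dicts. The field name `content` and the analyzer name `folding` are hypothetical and not taken from the plugin; the mapping layout assumes Elasticsearch 7 or later (typeless mappings).

```python
FIELD = "content"  # hypothetical field name, not taken from the plugin


def match_any(text):
    """Default `match`: terms are OR-ed, so 'coffee table' hits either word."""
    return {"query": {"match": {FIELD: {"query": text}}}}


def match_all_terms(text):
    """`match` with operator "and": every term must appear, in any order."""
    return {"query": {"match": {FIELD: {"query": text, "operator": "and"}}}}


def query_string(expr):
    """`query_string`: the user types operators, wildcards, quoted phrases."""
    return {"query": {"query_string": {"query": expr, "default_field": FIELD}}}


def folding_settings():
    """Index settings whose analyzer folds accents ('café' indexed as 'cafe')."""
    return {
        "settings": {"analysis": {"analyzer": {"folding": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "asciifolding"],
        }}}},
        "mappings": {"properties": {
            FIELD: {"type": "text", "analyzer": "folding"},
        }},
    }
```

With `query_string`, an expression such as `"coffee table" AND espresso*` is parsed by Elasticsearch itself, which is also why malformed user input can make the request fail; that may be part of what makes it less straightforward to adopt than `match`.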
#22 | |||
Bibliophagist
Posts: 36,979
Karma: 148318166
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Quote:
Quote:
Quote:
I used the 3rd zip file from message #9 in this thread. Note that my setup is on Windows x64. I also restarted the full search after deleting the old setup and moving my computer-related ebooks out of my calibre library. I also realized that I had not pointed to pdftotext properly and corrected that. It was a heck of a lot faster with those mostly oversized PDF files removed: 2 hours down to 5 minutes. |
|||
#23 | |||||
Connoisseur
Posts: 77
Karma: 90088
Join Date: Jul 2020
Device: android
|
Thanks, thiago.eec, for the feedback!
You have probably encountered one of the "bad" PDF files that take ages to process without pdftotext.
Quote:
Quote:
Quote:
Quote:
Quote:
|||||
#24 | |
Connoisseur
Posts: 77
Karma: 90088
Join Date: Jul 2020
Device: android
|
Quote:
New files aren't indexed the moment you add them, but they will be indexed the next time you run Power Search. |
|
#25 |
Bibliophagist
Posts: 36,979
Karma: 148318166
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
|
#26 |
Custom User Title
Posts: 8,872
Karma: 62040409
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
While testing, I also had FanFicFare update some of my fanfics and then did a search for random words that appeared only in the newest chapters. It did re-index the ePubs that had changed.
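One way such re-indexing of changed files could work is to keep a small fingerprint per file and compare it on the next run. The thread does not say how the plugin actually detects changes, so everything below (the function names, and using modification time plus size as the fingerprint) is an assumption made for illustration.

```python
import os


def fingerprint(path):
    """Cheap change marker: (modification time, size) of the file."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def files_to_index(paths, seen):
    """Return the paths that are new or changed relative to `seen`.

    `seen` maps path -> fingerprint recorded at the previous run.
    """
    return [p for p in paths if seen.get(p) != fingerprint(p)]


def remember(paths, seen):
    """Record current fingerprints after a successful indexing run."""
    for p in paths:
        seen[p] = fingerprint(p)
```

A search run would call `files_to_index` first, index only what it returns, and then call `remember` on those paths, so an ePub updated by another plugin is picked up the next time a search is started.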
|
#27 | ||
Guru
Posts: 946
Karma: 1183425
Join Date: Dec 2016
Location: Goiânia - Brazil
Device: iPad, Kindle Paperwhite
|
Quote:
Quote:
I did a simple test here, and changing the query from "match" to "match_phrase" did the job, allowing phrases and compound words. Using "query_string" isn't that easy, though. |
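For reference, the change described here amounts to swapping the query type in the request body. A minimal sketch, assuming a hypothetical field called `content` (the plugin's real field name is not given in the thread):

```python
def phrase_query(text, field="content", slop=0):
    """Build a `match_phrase` body: terms must appear together and in order.

    `slop` is the number of positions the terms may be apart; 0 means an
    exact phrase. The default field name is a placeholder.
    """
    body = {"query": text}
    if slop:
        body["slop"] = slop
    return {"query": {"match_phrase": {field: body}}}
```

`phrase_query("coffee table")` only matches documents containing the two words next to each other, while `phrase_query("coffee table", slop=1)` tolerates one intervening position.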
||
#28 | ||
Connoisseur
Posts: 77
Karma: 90088
Join Date: Jul 2020
Device: android
|
Quote:
I just decided to be conservative in the first releases and support only those book formats that I could test well enough. I will extend this list in the next release once I make sure it works well.
Quote:
This, however, means that I would need to find a way of sorting results by relevance. I don't know how easy that is to do in Calibre, but in general it seems to me the right way to go. |
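On the Elasticsearch side, each hit in a search response carries a `_score`, so a relevance ordering can be derived from the response itself. The sketch below assumes the document `_id` is what identifies a book; whether the plugin stores calibre book ids that way, and how such an ordering would be shown in Calibre, is not covered in the thread.

```python
def ranked_ids(response):
    """Return hit ids from a search response, most relevant first."""
    hits = response.get("hits", {}).get("hits", [])
    # `_score` can be null (e.g. when a custom sort is used); treat it as 0.
    ordered = sorted(hits, key=lambda h: h.get("_score") or 0.0, reverse=True)
    return [h["_id"] for h in ordered]
```

For example, a response whose hits are `{"_id": "12", "_score": 0.4}` and `{"_id": "7", "_score": 2.1}` yields `["7", "12"]`.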
||
#29 |
Connoisseur
Posts: 77
Karma: 90088
Join Date: Jul 2020
Device: android
|
Version 1.2.0 released.
Contains the following usability improvements:
Adds support for DOC, AZW3, KFX file formats.
Last edited by mapozyan; 08-08-2020 at 01:00 PM. Reason: Removed attached version 1.2.0 |
#30 | |
Connoisseur
Posts: 77
Karma: 90088
Join Date: Jul 2020
Device: android
|
Quote:
|