07-27-2020, 03:17 PM   #21
thiago.eec
Wizard
Posts: 1,236
Karma: 1419583
Join Date: Dec 2016
Location: Goiânia - Brazil
Device: iPad, Kindle Paperwhite, Kindle Oasis
Thanks for the plugin! Awesome idea.

I did not install pdftotext, but I only have 18 PDFs in my library.
I'm testing it on a library with 1130 books (many with multiple formats: EPUB, AZW3/KFX) and about 3GB.

Info about the initial indexing: It took only 16 minutes to go from 0 to 99%. But now it has been stuck at 99% for about 3h45min. My system is an i7 7700HQ (16GB of RAM). My processor has 4 cores (8 threads). The plugin chose 8 max parallel processes. Now, the strange part: according to Task Manager (Windows), my CPU is only at 20% of its total capacity.

While writing this post, it finished, after 4h05min. Now it searches instantly! Nice!


------ My first impressions and questions ------


1) Question: When you have multiple formats for one book, does it look up all the formats or just one?

2) Question: caps.json only lists EPUB, MOBI, PDF and TXT files. According to this, and other tests I have done, it does not index AZW3/KFX files. Is this correct?

3) Question: How does the index work for new additions? Are the new files automatically indexed when I run ElasticSearch?

4) Suggestion: It would be really important to have more options for search. Right now, it searches word by word, so I can't look for phrases or compound words (e.g., for coffee table, it will search for books with "coffee" OR "table"). Also, accented characters are distinguished from non-accented ones.

5) Info: According to the ElasticSearch Reference, to have more options for search, you would need to change your query from "match" to "query_string". This would allow operators, wildcards and regular expressions. P.S.: "match" queries can use operators too, but you would have to code that.

6) Info: The ZIP file attached to first post has another ZIP inside (with the full plugin).
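
To make points 4 and 5 more concrete, here is a rough sketch of the query bodies I have in mind, written as plain Python dicts. I have not checked how the plugin actually builds its requests, so the field name `content`, the analyzer name `folding`, and the helper function names are only assumptions for illustration. The query types themselves (`match`, `match_phrase`, `query_string`) and the `asciifolding` token filter come from the ElasticSearch Query DSL and analysis docs.

```python
import json

# Hypothetical field name; the plugin's real index mapping may use another one.
FIELD = "content"


def match_any(text):
    """Plain match query: analyzed terms are OR'ed by default."""
    return {"query": {"match": {FIELD: {"query": text}}}}


def match_all_terms(text):
    """match query with operator 'and': every term must be present."""
    return {"query": {"match": {FIELD: {"query": text, "operator": "and"}}}}


def match_phrase(text):
    """match_phrase query: terms must appear together, in order."""
    return {"query": {"match_phrase": {FIELD: text}}}


def query_string(expression):
    """query_string query: the user can type operators, wildcards, quoted phrases."""
    return {"query": {"query_string": {"query": expression, "default_field": FIELD}}}


def folding_settings():
    """Index settings for an accent-insensitive analyzer (hypothetical name 'folding')."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "folding": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "asciifolding"],
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print one example body so it can be pasted into a request for testing.
    print(json.dumps(query_string('"coffee table" OR caf*'), indent=2))
```

With something like this, `match_phrase("coffee table")` would only find books containing the exact phrase, while `query_string` leaves the choice to the user. The analyzer in `folding_settings()` would have to be assigned to the text field in the mapping and the library re-indexed before accents stop mattering.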