07-28-2020, 11:37 AM   #27
thiago.eec
Guru
Quote:
Originally Posted by mapozyan
You might probably have encountered one of "bad" PDF files which is taking ages to process without pdftotext.
The time-consuming files are image PDFs and fixed-layout books. Now that I have installed pdftotext, processing is much faster!

Quote:
Originally Posted by mapozyan
Correct. Now plugin supports following formats: CHM, CBZ, FB2, PDB, DJVU, EPUB, MOBI, DOCX, PDF, TXT, RTF. I will try to add AZW3/KFX support in next release.
I took a deeper look at the plugin. All files are converted to TXT and then passed to Elasticsearch, so the allowed formats can be any of those that calibre can handle. I changed the supported list to include DOC, AZW3, AZW4 and KFX, and they all got indexed.
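
To illustrate the idea, here is a rough sketch of a format check followed by a conversion to TXT. I don't claim this is how the plugin does it: the names SUPPORTED_FORMATS, is_supported and build_convert_command are made up for this example, and I am assuming the conversion goes through calibre's ebook-convert command-line tool, which picks the output format from the output file's extension.

```python
from pathlib import Path

# Assumed list: the formats quoted above plus the ones added in this test.
# The plugin's real list and variable name may differ.
SUPPORTED_FORMATS = {
    "chm", "cbz", "fb2", "pdb", "djvu", "epub", "mobi", "docx", "pdf",
    "txt", "rtf", "doc", "azw3", "azw4", "kfx",
}


def is_supported(book_path):
    """Return True if the file extension is in the supported list."""
    extension = Path(book_path).suffix.lower().lstrip(".")
    return extension in SUPPORTED_FORMATS


def build_convert_command(book_path, out_dir):
    """Build an ebook-convert command line that writes a .txt file.

    The command is only built here, not executed, so the sketch
    works even on a machine without calibre installed.
    """
    src = Path(book_path)
    dst = Path(out_dir) / (src.stem + ".txt")
    return ["ebook-convert", str(src), str(dst)]
```

The resulting list could be passed to subprocess.run, and the text file it produces is what would then be sent to Elasticsearch.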

Quote:
Originally Posted by mapozyan
Thanks, I am thinking on it. Will try to address in a next release.
I did a simple test here, and changing the query type from "match" to "match_phrase" did the job: it allows searching for phrases and compound words. Using "query_string" isn't as easy, though.
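
For reference, this is roughly what the change looks like in the Elasticsearch query body. It is only a sketch: build_query is a made-up helper, and the field name "content" is an assumption, since the plugin may store the text under a different field.

```python
def build_query(text, field="content", phrase=True):
    """Build an Elasticsearch search body as a plain dict.

    "match" analyzes the text and matches documents containing any of
    the terms; "match_phrase" requires the terms to appear together,
    in order. The field name "content" is an assumption.
    """
    query_type = "match_phrase" if phrase else "match"
    return {"query": {query_type: {field: text}}}


# The original behaviour and the changed one, side by side.
loose_body = build_query("fixed layout", phrase=False)
phrase_body = build_query("fixed layout")
```

Either dict can be sent as the body of a search request. "query_string" is harder to drop in because it parses the user's input as query syntax (operators, field names, wildcards), so malformed input can make the request fail.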