Quote:
Originally Posted by mapozyan
You have probably encountered one of the "bad" PDF files that take ages to process without pdftotext.
|
The time-consuming files are image PDFs and fixed-layout books. But now that I have installed pdftotext, it is way faster!
Quote:
Originally Posted by mapozyan
Correct. The plugin now supports the following formats: CHM, CBZ, FB2, PDB, DJVU, EPUB, MOBI, DOCX, PDF, TXT, RTF. I will try to add AZW3/KFX support in the next release.
|
I took a deeper look at the plugin. All files are converted to TXT and then passed to Elasticsearch, so the allowed formats can be any of those handled by calibre. I changed the supported list to include DOC, AZW3, AZW4 and KFX, and they all got indexed.
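To illustrate the kind of change I mean, here is a rough sketch rather than the plugin's actual code. The names SUPPORTED_FORMATS, is_indexable and build_convert_command are made up for this example; the only real thing in it is calibre's ebook-convert command-line tool, which picks the output format from the output file's extension.

```python
from pathlib import Path

# Formats listed by the plugin author, plus the ones I added.
# The variable name is hypothetical; the plugin may store this differently.
SUPPORTED_FORMATS = {
    "CHM", "CBZ", "FB2", "PDB", "DJVU", "EPUB", "MOBI", "DOCX", "PDF", "TXT", "RTF",
    "DOC", "AZW3", "AZW4", "KFX",
}


def is_indexable(path):
    """Return True if the file extension is in the supported list."""
    extension = Path(path).suffix.lstrip(".").upper()
    return extension in SUPPORTED_FORMATS


def build_convert_command(source, target_txt):
    """Build (but do not run) a calibre ebook-convert command line.

    ebook-convert infers the output format from the target extension,
    so a .txt target yields plain text ready for indexing.
    """
    return ["ebook-convert", str(source), str(target_txt)]
```

The command list could then be handed to subprocess.run, assuming calibre's command-line tools are on the PATH.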
Quote:
Originally Posted by mapozyan
Thanks, I am thinking about it. I will try to address it in the next release.
|
I did a simple test here, and changing the query from "match" to "match_phrase" did the job, allowing phrases and compound words. Using "query_string" isn't that easy, though.
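For anyone who wants to try the same thing, this is roughly what the two query bodies look like in the Elasticsearch query DSL. It is only a sketch: the function name build_query and the field name "content" are my assumptions, not what the plugin actually uses.

```python
import json


def build_query(text, field="content", phrase=True):
    """Build an Elasticsearch search body.

    "match" analyzes the text and matches documents containing any of the
    terms; "match_phrase" requires the terms to appear together, in order.
    The default field name "content" is an assumption for illustration.
    """
    query_type = "match_phrase" if phrase else "match"
    return {"query": {query_type: {field: text}}}


if __name__ == "__main__":
    # Print both variants side by side for comparison.
    print(json.dumps(build_query("white whale", phrase=False), indent=2))
    print(json.dumps(build_query("white whale", phrase=True), indent=2))
```

With "match", a search for two words also returns books that contain only one of them, while "match_phrase" only returns books where the words appear next to each other.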