07-30-2022, 10:04 AM   #1
JimmXinu
Plugin Developer
Posts: 6,973
Karma: 4604635
Join Date: Dec 2011
Location: Midwest USA
Device: Kobo Clara Colour running KOReader
Full Text Search Indexes All Formats?

I kicked off the full text search indexing and noticed that it reported ~twice as many books to index as I have in my library.

However, I do have both EPUB and AZW3 for all of them, and the two are textually identical.

Is there an option to only full text index one preferred format? Should there be?

In my case, I expect it would roughly halve the time and size of the search index.
07-30-2022, 10:18 AM   #2
ownedbycats
Custom User Title
Posts: 10,975
Karma: 75337983
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
Calibre reports 6517 index files in my library of 6250 books.

formats:#>1 reports 264 books, which matches up.
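As a rough check on those numbers (a sketch only, not how calibre counts internally): if every book has at least one format, the number of indexed files is the book count plus one for each additional format. The helper name and the tiny sample library below are made up for illustration.

```python
def count_format_files(formats_by_book):
    """Return (total format files, number of books with more than one format)."""
    total = sum(len(fmts) for fmts in formats_by_book.values())
    multi = sum(1 for fmts in formats_by_book.values() if len(fmts) > 1)
    return total, multi


# Hypothetical mini-library: book id -> formats attached to that book.
library = {
    1: ["EPUB"],
    2: ["EPUB", "PAPERBOOK"],
    3: ["EPUB", "AZW3", "PDF"],
}

total, multi = count_format_files(library)
extra = total - len(library)  # files beyond one per book
print(total, multi, extra)
```

By the same arithmetic, 6517 - 6250 leaves 267 extra files across 264 multi-format books, which would be consistent with a few books carrying three or more formats.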

What I find weird is that most of those secondary formats are PAPERBOOK, which is a 0-byte dummy file. Somehow I expected the FTS to only do formats that Calibre recognized and could open in its reader (well, it's a renamed text file, so it probably could...).

Last edited by ownedbycats; 07-30-2022 at 10:21 AM.
07-30-2022, 11:13 AM   #3
DrChiper
Bookish
Posts: 1,017
Karma: 2003162
Join Date: Jun 2011
Device: PC, t1, t2, t3, Clara BW, Clara HD, Libra 2, Libra Color, Nxtpaper 11
I have ebooks in several languages, and some of them are in multiple formats too, PDFs among them. Now, there are PDFs and there are PDFs with superimposed text that makes them searchable. No clue what this means for the indexing process, but they all seem to be indexed somehow.
07-30-2022, 11:36 AM   #4
kovidgoyal
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Yes, it indexes all formats (well, all formats calibre knows how to read). I prefer to be thorough rather than potentially miss something. Note that when searching, duplicate matches from multiple formats in the same book are coalesced.
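To picture what coalescing means here, a minimal sketch (not calibre's actual code): hits are grouped by book id so a book shows up once no matter how many of its formats matched. The function name, the tuple layout, and the sample hits are all assumptions made for illustration.

```python
def coalesce_by_book(hits):
    """Group (book_id, fmt, snippet) hits so each book appears only once."""
    grouped = {}
    for book_id, fmt, snippet in hits:
        # Keep the first snippet seen for a book; record every matching format.
        entry = grouped.setdefault(book_id, {"formats": [], "snippet": snippet})
        if fmt not in entry["formats"]:
            entry["formats"].append(fmt)
    return grouped


# Hypothetical raw hits: the same passage matched in two formats of book 12.
hits = [
    (12, "EPUB", "...the white whale..."),
    (12, "AZW3", "...the white whale..."),
    (40, "PDF", "...a whale of a time..."),
]

results = coalesce_by_book(hits)
for book_id, entry in results.items():
    print(book_id, entry["formats"], entry["snippet"])
```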
07-30-2022, 12:49 PM   #5
JimmXinu
Plugin Developer
Posts: 6,973
Karma: 4604635
Join Date: Dec 2011
Location: Midwest USA
Device: Kobo Clara Colour running KOReader
I agree that some people will benefit from indexing all formats and that it should probably be the default setting.

But for me it's an unneeded increase in DB size and indexing time.

Perhaps a tweak setting could be added at some point to limit indexing to a list of formats?
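A rough sketch of what such a setting could do, assuming a plain list of allowed format names. The function `formats_to_index` and its parameters are hypothetical and are not an existing calibre tweak or API.

```python
def formats_to_index(available, allowed=None):
    """Pick the formats to index for one book.

    available: formats the book actually has, e.g. ["EPUB", "AZW3"].
    allowed:   hypothetical user setting; None or empty means no restriction.
    """
    if not allowed:
        return list(available)
    wanted = {fmt.upper() for fmt in allowed}
    return [fmt for fmt in available if fmt.upper() in wanted]


print(formats_to_index(["EPUB", "AZW3"], allowed=["epub"]))
print(formats_to_index(["EPUB", "AZW3"]))
```

One thing a real implementation would have to decide is what happens to a book that has none of the listed formats. In this sketch it simply gets nothing indexed.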
08-01-2022, 04:52 PM   #6
JimmXinu
Plugin Developer
Posts: 6,973
Karma: 4604635
Join Date: Dec 2011
Location: Midwest USA
Device: Kobo Clara Colour running KOReader
FYI, I submitted a PR to implement this tweak.

A semi-related question: Indexing speed reverts to Slow all the time; is it supposed to?
08-01-2022, 05:39 PM   #7
Uncle Robin
Diligent dilettante
Posts: 3,661
Karma: 52758936
Join Date: Sep 2019
Location: in my mind
Device: Kobo Sage; Kobo Libra Colour
Quote:
Originally Posted by JimmXinu
I agree that some people will benefit from indexing all formats and that it should probably be the default setting.

But for me it's an unneeded increase in DB size and indexing time.

Perhaps a tweak setting could be added at some point to limit indexing to a list of formats?
I support this idea. The Power Search plugin offers users the choice of which formats to index, and if adding this functionality to Calibre's internal FTS is feasible, it would be a very worthwhile enhancement.
08-01-2022, 07:16 PM   #8
BetterRed
null operator (he/him)
Posts: 21,731
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by JimmXinu
A semi-related question: Indexing speed reverts to Slow all the time; is it supposed to?
The other content indexers I use also default to Slow; so, you could say it's a de facto standard.

BR
08-01-2022, 07:50 PM   #9
ownedbycats
Custom User Title
Posts: 10,975
Karma: 75337983
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
Quote:
Originally Posted by JimmXinu
A semi-related question: Indexing speed reverts to Slow all the time; is it supposed to?
During my initial index (during the beta), it reverted to slow whenever I closed the indexer window.
08-01-2022, 10:50 PM   #10
kovidgoyal
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Yes, it is meant to revert to slow by default. Fast is meant only for use during initial indexing, or when you add lots of books and want to index them quickly. Therefore when you close the indexer window, it reverts to slow. Fast makes the calibre UI (and your computer depending on specs) become very sluggish.
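The revert-to-slow behaviour can be pictured as a scoped setting. This is only a toy model, not calibre's code: the `fast_indexing` helper and the `state` dict are invented for illustration, with the `with` block standing in for the time the indexer window is open.

```python
from contextlib import contextmanager


@contextmanager
def fast_indexing(state):
    """Run with speed set to 'fast', then always drop back to 'slow'."""
    state["speed"] = "fast"
    try:
        yield state
    finally:
        # Mirrors closing the indexer window: the default is restored.
        state["speed"] = "slow"


state = {"speed": "slow"}
with fast_indexing(state):
    print("while window is open:", state["speed"])
print("after window is closed:", state["speed"])
```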
08-01-2022, 11:20 PM   #11
gbm
Wizard
Posts: 2,188
Karma: 8888888
Join Date: Jun 2010
Device: Kobo Clara HD,Hisence Sero 7 Pro RIP, Nook STR, jetbook lite
When I ran the indexer in fast mode I did not notice any slowdowns. I left the indexer window open until it completed and continued to browse web pages and watch videos. It finished my 5500+ book formats in about 15 minutes.

Linux Mint 20.3 Cinnamon

bernie
Quote:
Originally Posted by kovidgoyal
Yes, it is meant to revert to slow by default. Fast is meant only for use during initial indexing, or when you add lots of books and want to index them quickly. Therefore when you close the indexer window, it reverts to slow. Fast makes the calibre UI (and your computer depending on specs) become very sluggish.
08-02-2022, 01:18 AM   #12
kovidgoyal
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Well, yes, it depends on your computer's capabilities, how you have configured calibre, the type of files being indexed, etc.
08-02-2022, 07:23 AM   #13
ownedbycats
Custom User Title
Posts: 10,975
Karma: 75337983
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
Quote:
Originally Posted by gbm
When I ran the indexer in fast mode I did not notice any slowdowns. I left the indexer window open until it completed and continued to browse web pages and watch videos. It finished my 5500+ book formats in about 15 minutes.

Linux Mint 20.3 Cinnamon

bernie
In the beta, my initial index outright stalled out because it opened 100+ pdftohtml processes which pushed everything into swap memory. Thankfully, this was later fixed.
12-09-2022, 04:30 PM   #14
ownedbycats
Custom User Title
Posts: 10,975
Karma: 75337983
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
I was having a look at the index database to see if there was some way to take account of books that have no indexed text (dummy files, or non-OCR'd pdf scans).

The books_text table in the index database shows 5554 entries against my 6,462 book records. 842 book entries have only dummy paperback/overdrive files, so it seems to roughly match up (taking into account a) an unknown number of PDF scans and b) book records with multiple formats).

I don't want to run another full re-index, but I'm curious whether 5554 matches the number Calibre would report when doing that.

Alas, not sure what I can do. I have the ids of all the books indexed, so theoretically I could look for the missing numbers. But the gaps from deleted books...
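One way around the gaps, sketched under assumptions: rather than looking for missing numbers in a contiguous range, compare the indexed ids against the ids that actually exist in the library's own book table, so deleted books never enter the picture. The table and column names below (`books.id`, `books_text.book`, `books_text.searchable_text`) are assumptions about the schema and should be checked against the real databases. The in-memory databases are stand-ins so the example runs on its own.

```python
import sqlite3


def unindexed_book_ids(library_db, fts_db):
    """Return ids that exist in the library but have no indexed text."""
    library_ids = {row[0] for row in library_db.execute("SELECT id FROM books")}
    indexed_ids = {
        row[0]
        for row in fts_db.execute(
            "SELECT DISTINCT book FROM books_text "
            "WHERE searchable_text IS NOT NULL AND searchable_text != ''"
        )
    }
    return sorted(library_ids - indexed_ids)


# Stand-in for the library database; ids 3, 4, 6, 7, 8 were "deleted".
library_db = sqlite3.connect(":memory:")
library_db.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
library_db.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [(1, "A"), (2, "B"), (5, "C"), (9, "D")],
)

# Stand-in for the index database (assumed, simplified schema).
fts_db = sqlite3.connect(":memory:")
fts_db.execute(
    "CREATE TABLE books_text (book INTEGER, format TEXT, searchable_text TEXT)"
)
fts_db.executemany(
    "INSERT INTO books_text VALUES (?, ?, ?)",
    [
        (1, "EPUB", "some text"),
        (1, "AZW3", "some text"),
        (5, "PDF", ""),  # e.g. a scan with no extractable text
        (9, "EPUB", "more text"),
    ],
)

missing = unindexed_book_ids(library_db, fts_db)
print(missing)
```

Working on copies of the real database files would be the safer route, since calibre may have them open.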

Last edited by ownedbycats; 12-09-2022 at 04:33 PM.
All times are GMT -4.