Old 08-18-2023, 06:45 AM   #1
Quoth
Still reading
Posts: 14,012
Karma: 105092227
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
FTS: Book count etc.

I've 6337 "books" in Calibre. Some might have no files. Some might be image based PDF (may or may not have OCR layer).
I started the indexing for FTS, but it's indexing 14249 books!

Where there are multiple formats, can it be set to index only the epub, which always exists when there is more than one format?

What does it do with PDFs that have either text or an OCR text layer, if it handles PDFs at all?

Can I exclude rows from being indexed at all?
Old 08-18-2023, 06:47 AM   #2
Quoth
Still reading
Now at 26% and ~2 hours 18 minutes estimate. Set to "Fast".
Old 08-18-2023, 06:52 AM   #3
Comfy.n
want to learn what I want
Posts: 1,611
Karma: 7891011
Join Date: Sep 2020
Device: none
Quote:
Originally Posted by Quoth View Post
Where there are multiple formats, can it be set to index only the epub, which always exists when there is more than one format?
Indeed, this would be a great way to make the FTS database smaller and avoid the redundant results. Not sure if it's possible... I think this question was raised when the feature was implemented :\
Old 08-18-2023, 06:55 AM   #4
Comfy.n
want to learn what I want
Quote:
Originally Posted by Quoth View Post
Now at 26% and ~2 hours 18 minutes estimate. Set to "Fast".
I set my initial indexing to run overnight.
Old 08-18-2023, 07:14 AM   #5
kovidgoyal
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
No, it will index all formats, and you cannot exclude books. As for PDF, it extracts the text using the pdftotext tool.
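
For anyone who wants to see what pdftotext returns for a given PDF before indexing, here is a minimal sketch that calls the command-line tool from Python. It is not how calibre invokes the tool internally; the function names `build_pdftotext_cmd` and `extract_pdf_text` are made up for illustration, and it assumes a `pdftotext` binary (from Poppler or Xpdf) is on the PATH.

```python
import shutil
import subprocess
from typing import List, Optional


def build_pdftotext_cmd(pdf_path: str) -> List[str]:
    # Hypothetical helper. "-" as the output file asks pdftotext to write
    # the extracted text to stdout instead of a .txt file.
    return ["pdftotext", "-enc", "UTF-8", str(pdf_path), "-"]


def extract_pdf_text(pdf_path: str) -> Optional[str]:
    """Return the text pdftotext finds, or None if the tool is not installed."""
    if shutil.which("pdftotext") is None:
        return None
    result = subprocess.run(
        build_pdftotext_cmd(pdf_path),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout.decode("utf-8", errors="replace")
```

pdftotext only reads an existing text layer and does no OCR of its own, so an image-only scan without an OCR layer should come back empty or nearly so, while a scan with an OCR layer returns whatever the OCR produced.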
Old 08-18-2023, 07:23 AM   #6
Quoth
Still reading
Quote:
Originally Posted by kovidgoyal View Post
No, it will index all formats, and you cannot exclude books. As for PDF, it extracts the text using the pdftotext tool.
That's possibly handy for some of the PDFs.

Thanks.

So that explains indexing 14249 vs 6337 "books" in Calibre.
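
The difference is simply entries versus format files: 14249 / 6337 works out to about 2.25 format files per entry on average. A minimal sketch of that count, using a toy in-memory SQLite database. The `books` and `data` table names are modeled on calibre's metadata.db, but the schema here is a simplified assumption and not the real one.

```python
import sqlite3

# Toy stand-in for a library database (simplified, assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE data (id INTEGER PRIMARY KEY, book INTEGER, format TEXT);
    INSERT INTO books (id, title) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO data (book, format) VALUES
        (1, 'EPUB'), (1, 'AZW3'), (1, 'MOBI'),
        (2, 'PDF');
""")


def count_books_and_formats(db):
    """Return (number of book entries, number of format files)."""
    n_books = db.execute("SELECT COUNT(*) FROM books").fetchone()[0]
    n_files = db.execute("SELECT COUNT(*) FROM data").fetchone()[0]
    return n_books, n_files


n_books, n_files = count_books_and_formats(conn)
print(f"{n_books} entries, {n_files} format files to index")
```

Entry 3 has no formats at all, so it adds to the entry count but contributes nothing to the number of files indexed.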
Old 08-18-2023, 10:39 AM   #7
Quoth
Still reading
I'm impressed with the FTS and its flexibility. The estimated (one-off) index build time was close to reality.
Old 08-18-2023, 06:15 PM   #8
BetterRed
null operator (he/him)
Posts: 21,718
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Assuming FTS doesn't index book 'data' folders, you could move the converted PDFs, LITs, PRCs etc into it, perhaps into a '4 Posterity' sub folder.

IIRC you can reindex individual books… but I can't remember how.

Found it: it's in the FTS search results, viz:

[Attached image: Screenshot 2023-08-19 082635.jpg (69.2 KB)]

I don't use FTS, except on my Test library.

BR

Last edited by BetterRed; 08-18-2023 at 06:34 PM.
Old 08-18-2023, 07:28 PM   #9
Quoth
Still reading
I don't have "converted" PDFs, just four sorts:
1. Tech info, Service info (mostly not imported)
2. Manuals, instruction books etc that work OK on Sage
3. Scans of old books and magazines (mostly books, 100s of magazines not imported)
(The above three in no other format)

4. And finally, files for POD exported directly from LO Writer, for proofing layout/format only (never content) on Elipsa. Not many, but more than one size per title. These have a related epub for ebook publishing, plus azw3 and dual mobi purely for testing. The epub is made from a docx, which is then deleted (the original still exists in the same place as an odt). The azw3 and dual mobi are made from the epub. The pdf is not made by Calibre but is a direct LO Writer export.

The only other "waste" in the indexing is where the epub is from a PD mobi or a converted epub or a bought azw3 converted to epub.
Hence 6337 vs 14249
There are also some titles with no formats.

Adding new books is no issue. Seems instant to index.

Free space Storage 2.9 TB
full-text-search-db = 10.6GB

The entire library as an export is about 23 GB, so the FTS DB seems a bit large? Though the export is compressed.


Thanks for suggestions @BetterRed


I've no LITs, PDBs, LRFs etc. The Palm Z22 was just a test.

The FTS is very useful to me, not sure why I took so long to set it up.

Eventually a priority list of formats (like for transfer) would be an idea so that you could decide to only index one format per title.
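
To be clear about what I mean, a rough sketch of the selection logic. This is purely hypothetical: per the answer earlier in the thread, Calibre indexes every format and has no such setting, and the names `FORMAT_PRIORITY` and `pick_format_to_index` are invented for illustration.

```python
from typing import Iterable, Optional, Sequence

# Hypothetical user preference, highest priority first.
FORMAT_PRIORITY = ["EPUB", "AZW3", "MOBI", "PDF"]


def pick_format_to_index(
    available: Iterable[str],
    priority: Sequence[str] = FORMAT_PRIORITY,
) -> Optional[str]:
    """Pick the single format to index for one title.

    Returns the first format in the priority list that the title has.
    If none match, falls back to the alphabetically first available
    format, and returns None for a title with no formats at all.
    """
    normalized = {fmt.upper() for fmt in available}
    for fmt in priority:
        if fmt in normalized:
            return fmt
    return min(normalized) if normalized else None


print(pick_format_to_index(["mobi", "epub", "azw3"]))  # picks EPUB
print(pick_format_to_index(["pdf"]))                   # only format, so PDF
```

With something like this, a title that has epub, azw3 and dual mobi would be indexed once instead of three times, and the PDF-only scans and manuals would still be covered.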

(And rsync already backed it up to my server tonight. I never run the sync with Calibre or any editor/content-creating program open.)

Last edited by Quoth; 08-18-2023 at 07:33 PM.