09-12-2024, 05:53 PM   #10
chaley
Quote:
Originally Posted by Comfy.n
Now I'm trying to fix my advanced search issue, hopefully I can come up later with more detailed info. I think my notes database has grown too much lately, and when I do the search as indicated in your screenshot for Advanced Search, Calibre enters some seemingly infinite database lookup.
Your problem is almost certainly caused by the size of your library. The template I posted above is not optimized for a library of more than 70,000 books, which IIRC is what you have. It does a notes lookup for each book in the library. If that lookup takes 2 milliseconds per book then the search will run for 140 seconds (70,000 x 0.002 s). And who knows if 2ms is anywhere close to right.
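To put numbers on that, here is a tiny back-of-the-envelope sketch. The function name is made up for illustration, and the 2 ms figure is the same guess as above, not a measurement.

```python
def estimated_search_seconds(book_count, ms_per_lookup):
    """Rough total search time when every book costs one notes lookup."""
    return book_count * ms_per_lookup / 1000.0


# 70,000 books at an assumed 2 ms per lookup
print(estimated_search_seconds(70_000, 2))  # 140.0
```

Plug in your own book count and a measured per-lookup time to see how far off the guess is.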

This Python template is optimized for the search problem. The first time it is called (for the first book) it fetches all the note values and caches them using more efficient API calls. For each subsequent book it checks the cache, not the database. It would be best done as a stored template so the field and value could be passed as arguments. I didn't bother to do that because I don't know if this solves your problem.
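The caching idea on its own, stripped of anything calibre-specific, looks roughly like this. It is only a sketch: FakeNotesDB, all_notes and note_matches are hypothetical names, the plain dict stands in for context.globals, and the lower() comparison is a simplification of the ICU-based compare the real template uses.

```python
# Standalone sketch of the cache-on-first-call pattern; nothing here is calibre API.
class FakeNotesDB:
    """Hypothetical stand-in for the database; counts how often it is queried."""

    def __init__(self, notes):
        self.notes = notes      # item_id -> note text
        self.lookups = 0

    def all_notes(self):
        self.lookups += 1
        return dict(self.notes)


def note_matches(item_id, search_value, db, globals_dict):
    # Fill the cache on the first call only; later calls reuse it.
    cache = globals_dict.get('note_values')
    if cache is None:
        cache = db.all_notes()
        globals_dict['note_values'] = cache
    note = cache.get(item_id)
    if not note:
        return ''
    # Simplified case-insensitive check
    return 'Yes' if search_value.lower() in note.lower() else ''


db = FakeNotesDB({1: 'Wrote AAA series', 2: 'unrelated'})
shared = {}   # plays the role of context.globals
results = [note_matches(i, 'aaa', db, shared) for i in (1, 2, 3)]
```

However many items are checked, the fake database is only queried once. That is the whole point: the per-book cost becomes a dictionary lookup instead of a database lookup.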

The template:
Code:
python:
def evaluate(book, context):
	# Set these to what you want
	field_name = 'authors'
	search_value = 'aaa'

	db = context.db.new_api

	# check if we have already cached the notes
	note_items = context.globals.get('items_with_notes', None)
	if note_items is not None:
		# We have. Get the cached note values
		note_values = context.globals['note_values']
	else:
		# We haven't. Cache the note item ids and their values
		# First get all the item ids with notes and cache the result
		note_items = db.get_all_items_that_have_notes(field_name)
		context.globals['items_with_notes'] = note_items

		# Now get the note values for each item id with a note
		note_values = {}
		for note_item in note_items:
			note = db.notes_data_for(field_name, note_item)
			if note:
				# Get the plain text of the note
				note = note['searchable_text'].partition('\n')[2]
			# Put the value of the note into the cache.
			note_values[note_item] = note
		# Write the cached values to the globals
		context.globals['note_values'] = note_values

	# Check if this book is a match -- the field has a note containing the right text
	# get the item_id for the value of the desired field
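	fv = book.get(field_name)
	# if the field is multi-valued, use the first value (if there is one)
	if isinstance(fv, (list, tuple)):
		fv = fv[0] if fv else None
	# No value in the field means there can't be a matching note
	if fv is None or fv == '':
		return ''
	# Now get the internal ID of the value in field_name
	item_id = db.get_item_id(field_name, fv)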

	# Return the empty string if the item doesn't have a note
	if item_id not in note_items:
		return ''

	# Get the note value from the cache
	val = note_values.get(item_id, None)
	if val is None: # This shouldn't happen, but ...
		return ''

	# use a case insensitive compare to check if the search value is in the note
	from calibre.utils.icu import primary_contains
	return 'Yes' if primary_contains(search_value, val) else ''