Old 05-20-2024, 01:14 PM   #4
dandman
Enthusiast
So apparently source_relevance is not the only attribute the comparison takes into account when sorting the book results by relevance. It also considers the length of the comments, whether the result has a cover, whether the identifiers of the searched book and of the result match, etc.

I could not find that in the docs, only in the code comments.

I think it is an important point to mention, so I will quote it here for others:


The function identify_results_keygen returns a function that generates the key used to sort the results:
PHP Code:
identify_results_keygen(title, authors, identifiers)
Code:
        Returns a function that is used to generate a key that can sort Metadata
        objects by their relevance given a search query (title, authors,
        identifiers).

        These keys are used to sort the results of a call to :meth:`identify`.

        For details on the default algorithm see
        :class:`InternalMetadataCompareKeyGen`. Re-implement this function in
        your plugin if the default algorithm is not suitable.
and the docstring of InternalMetadataCompareKeyGen describes the default algorithm:

Code:
    Generate a sort key for comparison of the relevance of Metadata objects,
    given a search query. This is used only to compare results from the same
    metadata source, not across different sources.

    The sort key ensures that an ascending order sort is a sort by order of
    decreasing relevance.

    The algorithm is:

        * Prefer results that have at least one identifier the same as for the query
        * Prefer results with a cached cover URL
        * Prefer results with all available fields filled in
        * Prefer results with the same language as the current user interface language
        * Prefer results that are an exact title match to the query
        * Prefer results with longer comments (greater than 10% longer)
        * Use the relevance of the result as reported by the metadata source's search
           engine
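
To illustrate how a list of preferences like this can become a single sort key, here is a minimal sketch. It is not calibre's actual code: the function name `preference_key`, the plain dict fields and the example data are all made up for illustration, and the comments-length and source-relevance steps are left out. Each preference becomes 1 (preferred) or 2, and since Python compares tuples element by element, earlier preferences dominate later ones.

```python
def preference_key(result, query_identifiers, query_title, ui_language):
    """Hypothetical, simplified key: lower tuples sort first (more relevant)."""
    same_identifier = 1 if any(
        result["identifiers"].get(k) == v for k, v in query_identifiers.items()
    ) else 2
    has_cover = 1 if result.get("cover_url") else 2
    all_fields = 1 if all(
        result.get(f) for f in ("title", "authors", "comments")
    ) else 2
    language = 1 if result.get("language") == ui_language else 2
    exact_title = 1 if result.get("title", "").lower() == query_title.lower() else 2
    return (same_identifier, has_cover, all_fields, language, exact_title)


# Made-up results: the first has a cover and an exact title match,
# the second only shares an identifier with the query.
results = [
    {"title": "Dune", "authors": ["Frank Herbert"], "comments": "x",
     "identifiers": {}, "cover_url": "http://example.com/c.jpg", "language": "eng"},
    {"title": "Dune Messiah", "authors": ["Frank Herbert"], "comments": "x",
     "identifiers": {"isbn": "123"}, "cover_url": None, "language": "eng"},
]
ranked = sorted(
    results, key=lambda r: preference_key(r, {"isbn": "123"}, "Dune", "eng")
)
```

In this sketch the identifier match is the first tuple element, so the second result ranks first even though it has no cover and its title is not an exact match.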

For me it was the length of the comments that overrode the relevance.
Basically, source_relevance (stored as .extra) is taken into consideration only if everything else is equal: the other preferences tie and neither result's comments are more than 10% longer.

PHP Code:
    def compare_to_other(self, other):
        a = cmp(self.base, other.base)
        if a != 0:
            return a
        cx, cy = self.comments_len, other.comments_len
        if cx and cy:
            t = (cx + cy) / 20
            delta = cy - cx
            if abs(delta) > t:
                return -1 if delta < 0 else 1
        return cmp(self.extra, other.extra)
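
A small worked example of this effect, as a self-contained sketch. The `Key` class is a hypothetical stand-in that only holds the three attributes the comparison uses (`base`, `comments_len`, `extra`), the numbers are made up, and `cmp` is defined locally because Python 3 has no built-in `cmp`.

```python
from functools import cmp_to_key


def cmp(a, b):
    # Python 3 has no built-in cmp(); this is the usual replacement
    return (a > b) - (a < b)


class Key:
    """Hypothetical stand-in holding only the attributes the comparison uses."""

    def __init__(self, name, base, comments_len, extra):
        self.name, self.base = name, base
        self.comments_len, self.extra = comments_len, extra

    def compare_to_other(self, other):
        a = cmp(self.base, other.base)
        if a != 0:
            return a
        cx, cy = self.comments_len, other.comments_len
        if cx and cy:
            t = (cx + cy) / 20  # 10% of the average comments length
            delta = cy - cx
            if abs(delta) > t:
                return -1 if delta < 0 else 1
        return cmp(self.extra, other.extra)


# Same base tuple, so the comments length is checked before the relevance
best_match = Key("best match", (1, 1, 1, 1, 1), comments_len=500, extra=0)
long_blurb = Key("long blurb", (1, 1, 1, 1, 1), comments_len=1000, extra=5)

ordered = sorted([best_match, long_blurb],
                 key=cmp_to_key(lambda x, y: x.compare_to_other(y)))
print([k.name for k in ordered])
```

Here "long blurb" sorts ahead of "best match" even though its relevance value is worse, because its comments are more than 10% longer. If the two comment lengths were within that threshold, `extra` would decide instead.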
One can override identify_results_keygen in the plugin to re-implement the results comparison algorithm. Here is a simple example that compares only by the source_relevance attribute:

PHP Code:
    def identify_results_keygen(self, title=None, authors=None, identifiers={}):
        # return a function that will be used while sorting the identify results
        # based on the source_relevance field of the Metadata object
        return lambda x: x.source_relevance
For me this was important because I wanted to handle the case where no perfect match exists (due to a misspelled title or author) and still offer the user close options, ordered by match percentage (how closely each result matches the searched book).
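
One possible way to compute such a match percentage, as a rough sketch. The helper `match_percent`, the equal weighting of title and authors, and the use of `difflib.SequenceMatcher` from the standard library are assumptions made for illustration, not something calibre provides. The idea would be to store a value like `100 - match_percent(...)` in source_relevance, so that with the ascending sort described in the docstring above, closer matches come first.

```python
from difflib import SequenceMatcher


def match_percent(query_title, query_authors, result_title, result_authors):
    """Hypothetical helper: rough 0-100 similarity between query and result."""
    def ratio(a, b):
        return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

    title_score = ratio(query_title, result_title)
    author_score = ratio(" ".join(sorted(query_authors)),
                         " ".join(sorted(result_authors)))
    # Equal weighting of title and authors is an arbitrary choice
    return round(50 * title_score + 50 * author_score)


query = ("The Hobit", ["J.R.R. Tolkein"])  # misspelled on purpose
candidates = [
    ("The Silmarillion", ["J.R.R. Tolkien"]),
    ("The Hobbit", ["J.R.R. Tolkien"]),
]
# Lower value = closer match, so an ascending sort puts the best candidate first
ranked = sorted(candidates, key=lambda c: 100 - match_percent(*query, *c))
```

Character-level similarity is only one option. A plugin could just as well use whatever score its metadata source reports, as long as the value it stores sorts the closest match first.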