04-02-2025, 09:51 PM   #4
lomkiri
What about a search/replace over the whole epub, using a regex function?

Find: <(\w+)
Replace: the function below
Do a "Replace all" so that every opening tag in the epub is matched.
The number of replacements shown in the dialog box is the total number of tags, but nothing is changed in the epub, since the function returns each match as is.

Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):

    # Last pass: the function is called once more with match set to None
    if match is None:
        if not data:
            print('No tag found')
        else:
            print(f'Found a total of {number} tags, with {len(data)} different tags\n')
            for key in sorted(data):
                print(f'{key}: {data[key]}')
        return

    # Normal pass: count the tag name and return the matched text unchanged
    tag = match[1]
    data[tag] = data.get(tag, 0) + 1
    return match[0]

replace.call_after_last_match = True    # Ask for the last pass
The result will look like this:
Code:
Debug output from __count tags

Found a total of 12605 tags, with 22 different tags

a: 6
body: 78
br: 14
div: 143
em: 45
figure: 2
h1: 7
h2: 64
[etc.]
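
If you want to try the counting logic outside the editor first, here is a minimal standalone sketch of the same idea. It is not what calibre runs: `count_tags` and the `sample` strings are made up for illustration, and it uses `collections.Counter` instead of the `data` dict. Like the regex above, it only sees opening tags (`</p>` is not matched, `<br/>` is).

```python
import re
from collections import Counter

# Same pattern as the "Find" field: "<" followed by the tag name
TAG_RE = re.compile(r'<(\w+)')

def count_tags(documents):
    """Count opening-tag names across an iterable of (X)HTML strings."""
    counts = Counter()
    for text in documents:
        # m[1] is the captured tag name, as in the replace() function
        counts.update(m[1] for m in TAG_RE.finditer(text))
    return counts

# Hypothetical sample standing in for the files of an epub
sample = [
    '<html><body><p>One <em>two</em></p><br/></body></html>',
    '<html><body><p>Three</p></body></html>',
]

counts = count_tags(sample)
print(f'Found a total of {sum(counts.values())} tags, with {len(counts)} different tags\n')
for name in sorted(counts):
    print(f'{name}: {counts[name]}')
```

On a real book you would feed it the text of each HTML file instead of `sample`.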

Last edited by lomkiri; 04-02-2025 at 10:05 PM.