#16 | |
Member
Posts: 12
Karma: 10
Join Date: Jul 2022
Device: none
|
Quote:
Code:

    from calibre.ebooks.oeb.iterator import EbookIterator
    from calibre_plugins.action_chains.actions.base import ChainAction

    with open("test_dict.txt", "r") as f:
        tags_dict = f.read()

    class TagsFromEpub(ChainAction):

        name = 'Tags_F_Epub'
        support_scopes = True

        def get_word_count(iterator, book_path, icu_wordcount):
            '''Given an iterator for the epub (if already opened/converted), estimate a word count'''
            from calibre.utils.localization import get_lang

            if iterator is None:
                iterator = _open_epub_file(book_path)

            lang = iterator.opf.language
            lang = get_lang() if not lang else lang
            DEFAULT_STORE_VALUES = {}
            KEY_USE_ICU_WORDCOUNT = 'useIcuWordcount'
            icu_wordcount = c.get(cfg.KEY_USE_ICU_WORDCOUNT, cfg.DEFAULT_STORE_VALUES[cfg.KEY_USE_ICU_WORDCOUNT])
            count = _get_epub_standard_word_count(iterator, lang, icu_wordcount)
            print('\tWord count:', count)
            return iterator, count

        def _open_epub_file(book_path, strip_html=False):
            '''Given a path to an EPUB file, read the contents into a giant block of text'''
            iterator = EbookIterator(book_path)
            iterator.__enter__(only_input_plugin=True, run_char_count=True, read_anchor_map=False)
            return iterator

        def _get_epub_standard_word_count(iterator, lang='en', icu_wordcount=False):
            '''This algorithm counts individual words instead of pages'''
            book_text = _read_epub_contents(iterator, strip_html=True)

            wordcount = None
            if icu_wordcount:
                try:
                    from calibre.spell.break_iterator import count_words
                    print('\tWord count using icu_wordcount - trying to count_words')
                    wordcount = count_words(book_text, lang)
                    print('\tWord count - used count_words:', wordcount)
                except:
                    try:  # The above method is new and no-one will have it as of 08/01/2016.
                        print('\tWord count using icu_wordcount - trying to import split_into_words_and_positions')
                        from calibre.spell.break_iterator import split_into_words_and_positions
                        print('\tWord count - trying split_into_words_and_positions:')
                        wordcount = len(split_into_words_and_positions(book_text, lang))
                        print('\tWord count - used split_into_words_and_positions:', wordcount)
                    except:
                        pass

            if not wordcount:  # If not using icu wordcount, or it failed, use the old method.
                from calibre.utils.wordcount import get_wordcount_obj
                print('\tWord count using older method - trying get_wordcount_obj')
                wordcount = get_wordcount_obj(book_text)
                wordcount = wordcount.words
            return wordcount

        def tags_from_epub(path_to_epub):
            temp = []
            res = dict()
            for line in wordcount:
                for key,value in tags_dict.items():
                    if re.search(rf'{value}', line):
                        if value not in temp:
                            temp.append(value)
                            res[key] = value
                            regex = re.compile(value)
                            match_array = regex.finditer(line)
                            match_list = list(match_array)
                            for m in match_list:
                                print(key, ":",m.group())

        def run(gui, settings, chain):
            db = gui.current_db
            for book_id in chain.scope().get_book_ids():
                fmts = [ fmt.strip() for fmt in db.formats(book_id, index_is_id=True).split(',') ]
                if 'EPUB' in fmts:
                    path_to_epub = db.format_abspath(book_id, 'EPUB', index_is_id=True)
                    tags_from_epub(path_to_epub)
|
#17 | |
Wizard
Posts: 1,196
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
|
Quote:
For the "Run Python Code" action you should use run() as a separate function, not as a method of any class, as I previously told you to do in this post (note that there is NO mention of subclassing ChainAction). The other methods should be separate functions as well. I do not understand what you are trying to do with your code, and I do not have the time to debug it. If you can get a working function that returns whatever tags you want, I can help from there. However, here are a couple of points regarding your code:
P.S. If your main problem is converting the epub to text, the easiest way is to use calibre's conversion as follows:
Code:

    def convert_to_text(path_to_epub):
        import os, subprocess
        from calibre.ptempfile import PersistentTemporaryDirectory
        tdir = PersistentTemporaryDirectory('_temp_convert')
        # The .txt extension tells ebook-convert to produce plain text
        output_file = os.path.join(tdir, 'temp.txt')
        cmd = 'ebook-convert "{}" "{}"'.format(path_to_epub, output_file)
        subprocess.call(cmd, shell=True)
        return output_file

    path_to_txt = convert_to_text(path_to_epub)

Last edited by capink; 08-21-2022 at 07:44 AM.
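To illustrate the layout described above for the "Run Python Code" action, here is a minimal sketch with `run()` and its helper as plain module-level functions and no `ChainAction` subclass. The `db.formats`, `db.format_abspath`, and `chain.scope().get_book_ids()` calls are the ones used elsewhere in this thread; `epub_paths` is a hypothetical helper name, and printing the paths only stands in for whatever tagging step follows.

```python
def epub_paths(db, book_ids):
    """Yield (book_id, path) for every given book that has an EPUB format."""
    for book_id in book_ids:
        # Guard against a missing value in case a book has no formats at all
        fmts_string = db.formats(book_id, index_is_id=True) or ''
        fmts = [fmt.strip().upper() for fmt in fmts_string.split(',') if fmt.strip()]
        if 'EPUB' in fmts:
            yield book_id, db.format_abspath(book_id, 'EPUB', index_is_id=True)


def run(gui, settings, chain):
    """Entry point called by the action: a plain function, not a method."""
    db = gui.current_db
    for book_id, path_to_epub in epub_paths(db, chain.scope().get_book_ids()):
        # Placeholder: replace the print with the real tagging step
        print(book_id, path_to_epub)
```

Keeping the format lookup in its own function makes it possible to try it against a stand-in `db` object before running it inside calibre.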
|
#18 |
Resident Curmudgeon
Posts: 79,740
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
|
#19 | |
Member
Posts: 12
Karma: 10
Join Date: Jul 2022
Device: none
|
Quote:
Code:

    import re
    import ast
    import os

    with open(r"D:\User\Calibre Portable\Python_tareas\docs_pys\test_dict.txt") as f:
        tags_dict = f.read()

    def convert_to_text(path_to_epub):
        import os, subprocess
        from calibre.ptempfile import PersistentTemporaryDirectory
        tdir = PersistentTemporaryDirectory('_temp_convert')
        output_file = os.path.join(tdir, 'temp.txt')
        cmd = 'ebook-convert "{}" "{}"'.format(path_to_epub, output_file)
        subprocess.call(cmd, shell='true')
        return output_file

    def tags_from_epub(path_to_epub):
        path_to_txt = convert_to_text(path_to_epub)
        temp = []
        res = dict()
        for line in path_to_txt:
            for key,value in tags_dict.items():
                if re.search(rf'{value}', line):
                    if value not in temp:
                        temp.append(value)
                        res[key] = value
                        regex = re.compile(value)
                        match_array = regex.finditer(line)
                        match_list = list(match_array)
                        for m in match_list:
                            print(key)
        print("processed ")

    def run(gui, settings, chain):
        db = gui.current_db
        for book_id in chain.scope().get_book_ids():
            fmts = [ fmt.strip() for fmt in db.formats(book_id, index_is_id=True).split(',') ]
            if 'EPUB' in fmts:
                path_to_epub = db.format_abspath(book_id, 'EPUB', index_is_id=True)
                tags_from_epub(path_to_epub)
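Two things in the code as posted look likely to fail regardless of the conversion step: `f.read()` returns a string, so `tags_dict.items()` has nothing to iterate over, and `for line in path_to_txt` walks the characters of the file path instead of the lines of the converted file. Below is a minimal sketch of the matching step. It assumes that `test_dict.txt` holds a Python dict literal mapping tag names to regex patterns (the unused `import ast` hints at this, but it is a guess); `load_tags_dict` and `find_tags` are hypothetical names.

```python
import ast
import re


def load_tags_dict(path_to_dict):
    """Parse a file holding a dict literal such as {'Dragon': 'dragons?'}."""
    with open(path_to_dict, encoding='utf-8') as f:
        tags = ast.literal_eval(f.read())
    if not isinstance(tags, dict):
        raise ValueError('expected a dict literal in ' + path_to_dict)
    return tags


def find_tags(path_to_txt, tags_dict):
    """Return the sorted tag names whose pattern matches any line of the text file."""
    # Compile each pattern once instead of once per line
    patterns = {tag: re.compile(pat, re.IGNORECASE) for tag, pat in tags_dict.items()}
    found = set()
    # Open the converted file and read it line by line
    with open(path_to_txt, encoding='utf-8', errors='replace') as f:
        for line in f:
            for tag, regex in patterns.items():
                if tag not in found and regex.search(line):
                    found.add(tag)
    return sorted(found)
```

Inside `tags_from_epub`, the result of `convert_to_text(path_to_epub)` would be passed as `path_to_txt`. Case-insensitive matching is a design choice made here, not something the original code did.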
|
#20 |
Wizard
Posts: 1,196
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
|
Try the chain attached to this post. To import it: click Action Chains > Add/Modify chains > right click chain dialog > import chain.
Edit: Try it first on a test book to see whether you want to modify it further to suit you.
Last edited by capink; 08-30-2022 at 04:17 PM.
#21 | |
Member
Posts: 12
Karma: 10
Join Date: Jul 2022
Device: none
|
Quote:

    FileNotFoundError:[Errno 2] No such file or directory: 'C:\\Users\\AppData\\Local\\Temp\\calibre_ko1yesmi \\u0n9u8sm_temp_convert\\temp.txt'

    calibre 6.3* Portable embedded-python: True
    Windows-10-10.0.19041-SP0 Windows ('64bit', 'WindowsPE')
    ('Windows', '10', '10.0.19041')
    Python 3.10.1
    Traceback (most recent call last):
      File "calibre_plugins.action_chains.action", line 449, in run_chain
      File "calibre_plugins.action_chains.chains", line 390, in run
      File "calibre_plugins.action_chains.chains", line 205, in _run_loop
      File "calibre_plugins.action_chains.chains", line 182, in _run_loop
      File "calibre_plugins.action_chains.actions.code", line 130, in run
      File "module", line 36, in run
      File "module", line 16, in tags_from_epub
|
#22 |
Library Breeder (She/Her)
Posts: 1,265
Karma: 1937891
Join Date: Apr 2015
Location: Fullerton, California
Device: Paperwhite 2015 (2), PW 2024 (12 GEN), PW 2023 (11 GEN), Scribe (1st)
|
If this plugin works, then I will be very happy. I have been trying to update my tags by using the ENF plugin but it comes back with the most nouns and I have to weed through those to find "Dragon" or "Vampire" or "Biker" or "Wizard". I could use "Powersearch" but that still requires me to create a tag.
|
#23 |
Wizard
Posts: 1,196
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
|
It is working for me without producing this error. I am currently on Linux and do not have access to a Windows machine. Maybe it has something to do with the OS. Try the one attached below and see whether it makes a difference. Beyond that, I'm afraid I cannot help.
Last edited by capink; 09-04-2022 at 11:37 AM. |
#24 | |
null operator (he/him)
Posts: 21,718
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
Which user?
BR
Last edited by BetterRed; 09-04-2022 at 05:25 PM.
|
#25 |
Wizard
Posts: 1,196
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
|
I see what you are getting at. The problem is that it is not a hardcoded path but a temporary path computed by a calibre function. Why is it coming out this way? I don't know, and I cannot test on Windows. So I replaced the calibre function with a standard Python function, hoping that it might solve the problem.
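The attached chain is not reproduced in this thread, so the following is only a sketch of what swapping calibre's `PersistentTemporaryDirectory` for a standard-library call might look like. It assumes `tempfile.mkdtemp()` as the replacement and that `ebook-convert` can be found on the PATH; `build_convert_command` is a hypothetical helper name.

```python
import os
import subprocess
import tempfile


def build_convert_command(path_to_epub, output_dir):
    """Return (argument list, output path) for an ebook-convert call."""
    output_file = os.path.join(output_dir, 'temp.txt')
    # Passing a list of arguments avoids shell quoting problems with spaces in paths
    return ['ebook-convert', path_to_epub, output_file], output_file


def convert_to_text(path_to_epub):
    """Convert an EPUB to plain text in a fresh temporary directory."""
    # Assumption: tempfile.mkdtemp() stands in for the calibre helper
    output_dir = tempfile.mkdtemp(suffix='_temp_convert')
    cmd, output_file = build_convert_command(path_to_epub, output_dir)
    result = subprocess.run(cmd)
    # Fail early with a clear message instead of a later FileNotFoundError
    if result.returncode != 0 or not os.path.exists(output_file):
        raise RuntimeError('ebook-convert did not produce ' + output_file)
    return output_file
```

Checking the return code and the output file right after the conversion would at least show whether the reported `FileNotFoundError` comes from a failed conversion or from the path itself. Note that a directory made with `tempfile.mkdtemp()` is not removed automatically, so the caller has to clean it up.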
|
#26 |
null operator (he/him)
Posts: 21,718
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
|
#27 | |
Custom User Title
Posts: 10,969
Karma: 75337983
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
Quote:
|
|
#28 |
null operator (he/him)
Posts: 21,718
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
|
#29 | |
Member
Posts: 12
Karma: 10
Join Date: Jul 2022
Device: none
|
Quote:
|
#30 | |
null operator (he/him)
Posts: 21,718
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
I don't know what should be in the text file; I don't use the Action Chains plugin. My suggestion was aimed at getting the calibre temp folder out of the specifics of the Windows ecosystem.
BR
Last edited by BetterRed; 09-05-2022 at 05:44 AM.
|