#16
Member
Posts: 12
Karma: 10
Join Date: Jul 2022
Device: none
Quote:
I added the get_word_count definition, but it depends on other definitions. Running the code results in TypeError: TagsFromEpub.run() takes 3 positional arguments but 4 were given.
Code:
from calibre.ebooks.oeb.iterator import EbookIterator
from calibre_plugins.action_chains.actions.base import ChainAction

with open("test_dict.txt", "r") as f:
    tags_dict = f.read()

class TagsFromEpub(ChainAction):
    name = 'Tags_F_Epub'
    support_scopes = True

    def get_word_count(iterator, book_path, icu_wordcount):
        '''Given an iterator for the epub (if already opened/converted), estimate a word count'''
        from calibre.utils.localization import get_lang
        if iterator is None:
            iterator = _open_epub_file(book_path)
        lang = iterator.opf.language
        lang = get_lang() if not lang else lang
        DEFAULT_STORE_VALUES = {}
        KEY_USE_ICU_WORDCOUNT = 'useIcuWordcount'
        icu_wordcount = c.get(cfg.KEY_USE_ICU_WORDCOUNT, cfg.DEFAULT_STORE_VALUES[cfg.KEY_USE_ICU_WORDCOUNT])
        count = _get_epub_standard_word_count(iterator, lang, icu_wordcount)
        print('\tWord count:', count)
        return iterator, count

    def _open_epub_file(book_path, strip_html=False):
        '''Given a path to an EPUB file, read the contents into a giant block of text'''
        iterator = EbookIterator(book_path)
        iterator.__enter__(only_input_plugin=True, run_char_count=True, read_anchor_map=False)
        return iterator

    def _get_epub_standard_word_count(iterator, lang='en', icu_wordcount=False):
        '''This algorithm counts individual words instead of pages'''
        book_text = _read_epub_contents(iterator, strip_html=True)
        wordcount = None
        if icu_wordcount:
            try:
                from calibre.spell.break_iterator import count_words
                print('\tWord count using icu_wordcount - trying to count_words')
                wordcount = count_words(book_text, lang)
                print('\tWord count - used count_words:', wordcount)
            except:
                try: # The above method is new and no-one will have it as of 08/01/2016.
                    print('\tWord count using icu_wordcount - trying to import split_into_words_and_positions')
                    from calibre.spell.break_iterator import split_into_words_and_positions
                    print('\tWord count - trying split_into_words_and_positions:')
                    wordcount = len(split_into_words_and_positions(book_text, lang))
                    print('\tWord count - used split_into_words_and_positions:', wordcount)
                except:
                    pass
        if not wordcount: # If not using icu wordcount, or it failed, use the old method.
            from calibre.utils.wordcount import get_wordcount_obj
            print('\tWord count using older method - trying get_wordcount_obj')
            wordcount = get_wordcount_obj(book_text)
            wordcount = wordcount.words
        return wordcount

    def tags_from_epub(path_to_epub):
        temp = []
        res = dict()
        for line in wordcount:
            for key,value in tags_dict.items():
                if re.search(rf'{value}', line):
                    if value not in temp:
                        temp.append(value)
                        res[key] = value
                        regex = re.compile(value)
                        match_array = regex.finditer(line)
                        match_list = list(match_array)
                        for m in match_list:
                            print(key, ":",m.group())

    def run(gui, settings, chain):
        db = gui.current_db
        for book_id in chain.scope().get_book_ids():
            fmts = [ fmt.strip() for fmt in db.formats(book_id, index_is_id=True).split(',') ]
            if 'EPUB' in fmts:
                path_to_epub = db.format_abspath(book_id, 'EPUB', index_is_id=True)
                tags_from_epub(path_to_epub)
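
The TypeError reported in this post can be reproduced without calibre. The following is a minimal sketch, not the plugin's code: the stand-in class `TagsFromEpubDemo` and the placeholder arguments are hypothetical. It illustrates that calling a method on an instance passes the instance as an extra first argument, so a `run(gui, settings, chain)` defined inside a class without `self` receives four arguments, while the same function at module level receives three.

```python
class TagsFromEpubDemo:
    # Hypothetical stand-in for the class in the post; note the missing `self`.
    def run(gui, settings, chain):
        return "ran"


def run(gui, settings, chain):
    # The same signature as a plain module-level function.
    return "ran"


try:
    # The instance is passed implicitly, making four positional arguments.
    TagsFromEpubDemo().run("gui", {}, "chain")
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
print(run("gui", {}, "chain"))
```

Moving `run` out of the class is therefore enough to make the argument count line up, independently of whatever else the code does.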

#17
Wizard
Posts: 1,216
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
Quote:
For the "Run Python Code" action, you should define run() as a separate function, not as a method of any class, as I previously told you in this post (note that there is NO mention of subclassing ChainAction). The other methods should be separate functions as well. I do not understand what you are trying to do with your code, and I do not have the time to debug it. If you can get a working function that returns whatever tags you want, I can help from there. However, here are a couple of points regarding your code:
P.S. If your main problem is converting the epub to text, the easiest way is to use calibre's conversion as follows:
Code:
def convert_to_text(path_to_epub):
    import os, subprocess
    from calibre.ptempfile import PersistentTemporaryDirectory
    # Temporary directory managed by calibre
    tdir = PersistentTemporaryDirectory('_temp_convert')
    output_file = os.path.join(tdir, 'temp.txt')
    cmd = 'ebook-convert "{}" "{}"'.format(path_to_epub, output_file)
    # shell expects a boolean; the string 'true' only worked because it is truthy
    subprocess.call(cmd, shell=True)
    return output_file

path_to_txt = convert_to_text(path_to_epub)
Last edited by capink; 08-21-2022 at 08:44 AM.

#18
Resident Curmudgeon
Posts: 80,784
Karma: 150249619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3

#19
Member
Posts: 12
Karma: 10
Join Date: Jul 2022
Device: none
Quote:
Code:
import re
import ast
import os

with open(r"D:\User\Calibre Portable\Python_tareas\docs_pys\test_dict.txt") as f:
    tags_dict = f.read()

def convert_to_text(path_to_epub):
    import os, subprocess
    from calibre.ptempfile import PersistentTemporaryDirectory
    tdir = PersistentTemporaryDirectory('_temp_convert')
    output_file = os.path.join(tdir, 'temp.txt')
    cmd = 'ebook-convert "{}" "{}"'.format(path_to_epub, output_file)
    subprocess.call(cmd, shell='true')
    return output_file

def tags_from_epub(path_to_epub):
    path_to_txt = convert_to_text(path_to_epub)
    temp = []
    res = dict()
    for line in path_to_txt:
        for key,value in tags_dict.items():
            if re.search(rf'{value}', line):
                if value not in temp:
                    temp.append(value)
                    res[key] = value
                    regex = re.compile(value)
                    match_array = regex.finditer(line)
                    match_list = list(match_array)
                    for m in match_list:
                        print(key)
    print("processed ")

def run(gui, settings, chain):
    db = gui.current_db
    for book_id in chain.scope().get_book_ids():
        fmts = [ fmt.strip() for fmt in db.formats(book_id, index_is_id=True).split(',') ]
        if 'EPUB' in fmts:
            path_to_epub = db.format_abspath(book_id, 'EPUB', index_is_id=True)
            tags_from_epub(path_to_epub)
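
Two things in the code above would likely stop it from producing tags: `f.read()` returns a string, which has no `.items()`, and `for line in path_to_txt` iterates over the characters of the file path rather than the lines of the file. Below is a minimal sketch of one way around both. It assumes (the thread does not confirm this) that test_dict.txt holds a Python dict literal mapping tag names to regular expressions, which the unused `import ast` hints at. The helper names `load_tags_dict` and `find_tags` are hypothetical.

```python
import ast
import re


def load_tags_dict(path_to_dict):
    """Parse a file containing a dict literal such as {'Dragon': r'[Dd]ragons?'}."""
    with open(path_to_dict, encoding="utf-8") as f:
        tags = ast.literal_eval(f.read())
    if not isinstance(tags, dict):
        raise ValueError("expected a dict literal in {}".format(path_to_dict))
    return tags


def find_tags(path_to_txt, tags_dict):
    """Return the sorted tag names whose pattern matches any line of the text file."""
    patterns = {tag: re.compile(pattern) for tag, pattern in tags_dict.items()}
    found = set()
    # Open the converted file and walk its lines, not the path string itself.
    with open(path_to_txt, encoding="utf-8", errors="replace") as f:
        for line in f:
            for tag, regex in patterns.items():
                if tag not in found and regex.search(line):
                    found.add(tag)
    return sorted(found)
```

Inside `tags_from_epub`, this could be used as `find_tags(convert_to_text(path_to_epub), load_tags_dict(path_to_dict))`, returning the tag names instead of printing them so that a later step can write them to the book's metadata.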

#20
Wizard
Posts: 1,216
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
Try the chain attached to this post. To import it: click Action Chains > Add/Modify chains > right-click chain dialog > import chain.
Edit: Try it first on a test book to see whether you want to modify it further to suit you.
Last edited by capink; 08-30-2022 at 05:17 PM.

#21
Member
Posts: 12
Karma: 10
Join Date: Jul 2022
Device: none
Quote:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\AppData\\Local\\Temp\\calibre_ko1yesmi \\u0n9u8sm_temp_convert\\temp.txt'
calibre 6.3* Portable embedded-python: True
Windows-10-10.0.19041-SP0 Windows ('64bit', 'WindowsPE')
('Windows', '10', '10.0.19041')
Python 3.10.1
Traceback (most recent call last):
  File "calibre_plugins.action_chains.action", line 449, in run_chain
  File "calibre_plugins.action_chains.chains", line 390, in run
  File "calibre_plugins.action_chains.chains", line 205, in _run_loop
  File "calibre_plugins.action_chains.chains", line 182, in _run_loop
  File "calibre_plugins.action_chains.actions.code", line 130, in run
  File "module", line 36, in run
  File "module", line 16, in tags_from_epub

#22
Library Breeder (She/Her)
Posts: 1,301
Karma: 1937893
Join Date: Apr 2015
Location: Fullerton, California
Device: Paperwhite 2015 (2), PW 2024 (12 GEN), PW 2023 (11 GEN), Scribe (1st)
If this plugin works, then I will be very happy. I have been trying to update my tags by using the ENF plugin but it comes back with the most nouns and I have to weed through those to find "Dragon" or "Vampire" or "Biker" or "Wizard". I could use "Powersearch" but that still requires me to create a tag.

#23
Wizard
Posts: 1,216
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
It is working for me without producing this error. I am currently on Linux and do not have access to a Windows machine. Maybe it has something to do with the OS. Try the one attached below and see whether it makes a difference. Beyond that, I'm afraid I cannot help.
Last edited by capink; 09-04-2022 at 12:37 PM.

#24
null operator (he/him)
Posts: 22,018
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Which user?
BR
Last edited by BetterRed; 09-04-2022 at 06:25 PM.

#25
Wizard
Posts: 1,216
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
I see what you are getting at. The problem is that it is not a hardcoded path but a temporary path computed by a calibre function. Why is it coming out this way? I don't know, and I cannot test on Windows. So I replaced the calibre function with a standard Python function, hoping that might solve the problem.
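
The attached chain is not reproduced in the thread, so the following is only a sketch of what swapping calibre's `PersistentTemporaryDirectory` for the standard library could look like. It assumes that `ebook-convert` is on the PATH and that `tempfile.mkdtemp` is an acceptable substitute; the function names are hypothetical. Passing the command as a list avoids shell quoting, and checking the exit status and the output file surfaces a failed conversion before any later step tries to open a missing temp.txt.

```python
import os
import subprocess
import tempfile


def build_convert_command(path_to_epub, output_dir):
    """Return the ebook-convert argument list and the expected output path."""
    output_file = os.path.join(output_dir, "temp.txt")
    return ["ebook-convert", path_to_epub, output_file], output_file


def convert_to_text(path_to_epub):
    """Convert an EPUB to plain text in a fresh temporary directory."""
    # Standard-library temporary directory instead of calibre's helper.
    output_dir = tempfile.mkdtemp(suffix="_temp_convert")
    cmd, output_file = build_convert_command(path_to_epub, output_dir)
    # check=True raises CalledProcessError if ebook-convert exits with an error.
    subprocess.run(cmd, check=True)
    if not os.path.isfile(output_file):
        raise FileNotFoundError("conversion produced no output: " + output_file)
    return output_file
```

With this shape, an empty temporary folder shows up as an explicit conversion error rather than as a FileNotFoundError further down the chain.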

#26
null operator (he/him)
Posts: 22,018
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none

#27
Custom User Title
Posts: 11,360
Karma: 79528341
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
Quote:

#28
null operator (he/him)
Posts: 22,018
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none

#29
Member
Posts: 12
Karma: 10
Join Date: Jul 2022
Device: none
Quote:
One question: can I set any path for "CALIBRE_TEMP_DIR", or are there criteria for choosing it? The code creates an EMPTY temporary folder without the text file.

#30
null operator (he/him)
Posts: 22,018
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
I don't know what should be in the text file; I don't use the Action Chains plugin. My suggestion was aimed at getting the calibre temp folder out of the specifics of the Windows ecosystem.
BR
Last edited by BetterRed; 09-05-2022 at 06:44 AM.
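
`CALIBRE_TEMP_DIR` is an environment variable that calibre reads to choose its temporary directory, so it has to be set before calibre is launched. The thread does not state any criteria for the path; a reasonable assumption is only that it is an existing, writable folder. As a hedged sketch under that assumption, the hypothetical helper below validates a candidate folder and builds an environment for a child process. The launch command and the folder name in the final comment are illustrative only.

```python
import os
import tempfile


def env_with_calibre_temp_dir(temp_dir, base_env=None):
    """Return a copy of the environment with CALIBRE_TEMP_DIR pointing at temp_dir."""
    temp_dir = os.path.abspath(temp_dir)
    if not os.path.isdir(temp_dir):
        raise NotADirectoryError(temp_dir)
    # Create and delete a scratch file to confirm the folder is writable.
    with tempfile.NamedTemporaryFile(dir=temp_dir):
        pass
    env = dict(os.environ if base_env is None else base_env)
    env["CALIBRE_TEMP_DIR"] = temp_dir
    return env


# Illustrative use (hypothetical folder):
# subprocess.Popen(["calibre"], env=env_with_calibre_temp_dir(r"D:\calibre_tmp"))
```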