The EPUB CFI stored in last_read_positions is exciting metadata! I'm hoping to play around with it, starting with extracting the node or text at the last read position it identifies. Is this reasonable, and is it possible with code that already exists in calibre?
I think I'm stuck on building the concatenated HTML from an EPUB container. I imagine there is already a container method that generates it, but I haven't found it yet. Or maybe I'm approaching this all wrong. Any pointers? (My initial attempt is below.)
If that's possible, I'd also like to go the other way and generate a fragment identifier for a given node of an EPUB tree. Is this something that can be done from Python? That code looks like it lives in the pyj files (?)
Thanks!
Code:
import init_calibre
import calibre
from calibre.ebooks.oeb.polish.container import get_container
from calibre.ebooks.epub.cfi.parse import parser as cfi_parser, decode_cfi
from calibre.ebooks.oeb.polish.parsing import parse as parse_book
# select path from book where id = 296;
fname_epub = '/path/to/my/file296.epub'
# select cfi from last_read_positions where book = 296;
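cfi_str = '/36/2/4[x9780525538332_EPUB-16]/2/6/1:46'
container = get_container(fname_epub, tweak_mode=False)
# From my reading of the source, parse_path() returns a (parsed_path, leftover_text) tuple
parsed_cfi, leftover = cfi_parser().parse_path(cfi_str)
# calibre/gui2/tweak_book/boss.py uses editor.get_raw_data()
# maybe combine container.mime_map and then calibre.ebooks.oeb.polish.parsing?
raw_data = ...  # placeholder: this is the part I don't know how to build
root = parse_book(
    raw_data, decoder=lambda x: x.decode('utf-8'),
    line_numbers=True, linenumber_attribute='data-lnum')
# decode_cfi() seems to expect the CFI path as a string and parses it itself
node = decode_cfi(root, cfi_str)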
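
In case it helps frame the question, here is a minimal sketch of how I currently understand the CFI layout. It uses only the Python standard library (xml.etree.ElementTree as a stand-in for calibre's lxml trees), not calibre's own code. It rests on assumptions I have not verified: that the first step encodes the spine item as 2 * (index + 1), that the remaining steps start at the document node so /2 is the root html element, and that each even step 2k selects the k-th child element. The function names split_calibre_cfi, parse_steps, resolve_element and encode_element are made up for illustration and are not calibre APIs. If the first assumption holds, a single spine file would be enough and no concatenated HTML would be needed.

```python
import re
import xml.etree.ElementTree as ET

# One CFI step: "/<number>" with an optional "[id]" assertion.
STEP_RE = re.compile(r'/(\d+)(?:\[([^\]]*)\])?')


def split_calibre_cfi(cfi):
    """Split a position into (spine_index, in-document path).

    Assumption: the first step encodes the spine item as 2 * (index + 1).
    """
    if cfi.startswith('epubcfi(') and cfi.endswith(')'):
        cfi = cfi[len('epubcfi('):-1]
    m = STEP_RE.match(cfi)
    if m is None:
        raise ValueError(f'not a CFI path: {cfi!r}')
    return int(m.group(1)) // 2 - 1, cfi[m.end():]


def parse_steps(path):
    """Return ([(number, id_or_None), ...], char_offset_or_None)."""
    steps = [(int(n), i or None) for n, i in STEP_RE.findall(path)]
    m = re.search(r':(\d+)', path)
    return steps, (int(m.group(1)) if m else None)


def resolve_element(root, steps):
    """Follow the even steps down from the document node.

    Step 2k selects the k-th child element (1-based). An odd step refers
    to text rather than an element, so the walk stops there and the
    deepest element reached is returned.
    """
    children = [root]  # the document node has one child: the root element
    node = None
    for num, _id in steps:
        if num % 2:
            break
        node = children[num // 2 - 1]
        children = list(node)
    return node


def encode_element(root, target):
    """Inverse sketch: build the even-step path from the document to target.

    Ids are copied verbatim; escaping of special characters is not handled.
    """
    parents = {child: parent for parent in root.iter() for child in parent}
    parts = []
    node = target
    while node is not None:
        parent = parents.get(node)
        siblings = [root] if parent is None else list(parent)
        step = f'/{2 * (siblings.index(node) + 1)}'
        if node.get('id'):
            step += f"[{node.get('id')}]"
        parts.append(step)
        node = parent
    return ''.join(reversed(parts))


# Toy document standing in for one spine file.
doc = ET.fromstring(
    '<html><head/><body id="b"><div>'
    '<p>a</p><p>b</p><p>hello world</p>'
    '</div></body></html>')
spine_index, path = split_calibre_cfi('epubcfi(/36/2/4[b]/2/6/1:6)')
steps, offset = parse_steps(path)
node = resolve_element(doc, steps)
print(spine_index, node.tag, node.text[offset:], encode_element(doc, node))
```

The sketch treats the character offset as an index into the element's leading text, which only holds when the final odd step is /1. Text that follows a child element would need the tail handling that I assume the real implementation has.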