12-25-2020, 09:31 PM   #28
twynn92
Junior Member
Posts: 8
Karma: 1000
Join Date: Dec 2020
Device: none
Just a quick update: ebook-convert's HTMLZ output format is not suitable at all because it renames the images, numbering them from 00000 and incrementing by 1 for each successive image. Pandoc is much better for this use case, but I am now running into the fact that the KF8 text refers to one more image than the KFX text does. I will have to look at the surrounding text to see why that is and whether it will be trivial to work around, e.g., whether I can just discard the first or last image and have everything match in all other respects. Regular expressions and Notepad++ functions are really helping here, but this is definitely not easy to automate.
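
For what it's worth, the "look at the surrounding text" step could be roughed out in Python along these lines. This is only a sketch and not what I actually ran: it assumes both versions are already available as HTML strings, the function names `image_contexts` and `first_mismatch` are made up for illustration, and the regexes are a simplification that expects quoted `src` attributes. Since the image file names differ between the two versions, it compares the text in front of each image rather than the names.

```python
import re

# Simplified patterns (assumption: well-formed tags with quoted src values).
IMG_TAG = re.compile(r'<img\b[^>]*?\bsrc\s*=\s*["\']([^"\']+)["\'][^>]*>',
                     re.IGNORECASE)
ANY_TAG = re.compile(r'<[^>]+>')


def image_contexts(html, width=30):
    """Return (src, preceding_text) pairs for each <img>, in document order.

    preceding_text is the last `width` characters of tag-stripped,
    whitespace-normalized text between the previous image and this one.
    """
    results = []
    previous_end = 0
    for match in IMG_TAG.finditer(html):
        between = ANY_TAG.sub(' ', html[previous_end:match.start()])
        text = ' '.join(between.split())
        results.append((match.group(1), text[-width:]))
        previous_end = match.end()
    return results


def first_mismatch(contexts_a, contexts_b):
    """Index of the first image whose preceding text differs.

    Returns None if every compared pair matches (only the shorter
    list's length is compared).
    """
    for i, ((_, text_a), (_, text_b)) in enumerate(zip(contexts_a, contexts_b)):
        if text_a != text_b:
            return i
    return None


# Hypothetical miniature inputs standing in for the two conversions.
kf8_html = ('<p>Cover</p><img src="c.jpg"/>'
            '<p>Chapter one</p><img src="1.jpg"/>'
            '<p>The end</p><img src="2.jpg"/>')
kfx_html = ('<p>Chapter one</p><img src="a.jpg"/>'
            '<p>The end</p><img src="b.jpg"/>')

kf8 = image_contexts(kf8_html)
kfx = image_contexts(kfx_html)
print(len(kf8), len(kfx), first_mismatch(kf8, kfx))
```

A mismatch at index 0 would hint that the extra image sits at the very start, while no mismatch with unequal counts would hint that it is the last one. The two converters may not produce identical text around the images, so in practice the comparison would probably need to be looser than an exact string match.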

Should I just leave this topic be, i.e., post no further reports, since having the best of both worlds seems to be a specific use case that no one else really needs? In other words, this information would only be useful to someone like me who wants the highest-resolution images where possible while also keeping the semantic information. Everyone else is probably satisfied with the KFX output, since most users are likely to run it through Calibre anyway, which bloats the code a bit and definitely does not leave it untouched no matter what arguments are used. Even if I were to succeed, it's not like it'll help anyone else out...