Old 07-08-2014, 01:49 PM   #880
KevinH
Sigil Developer
Hi tkeo,

After looking at what you have done some more, I would like to adopt your idea of passing more information in filenames[] to make creating the opf easier.

So I propose replacing filenames[dir,filename] with something along the lines of:

fileinfo[key, dir, filename]

The "key" will be one of the following:
- skelid (skeleton number/partno converted to string) to match with RESC skelid
- "coverpage" - used when we create a coverpage
- None
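
To make the shape concrete, here is a rough sketch of how such a list might be filled. The helper name build_fileinfo, the "Text"/"Styles" directory names, and the file name patterns are only placeholders of mine for illustration, not what the attached code actually does:

```python
def build_fileinfo(nparts, has_coverpage=False, other_files=()):
    """Return a list of (key, dir, filename) tuples.

    All names and file patterns here are hypothetical placeholders.
    """
    fileinfo = []
    if has_coverpage:
        # key "coverpage" marks a cover page that we created ourselves
        fileinfo.append(("coverpage", "Text", "cover_page.xhtml"))
    for partno in range(nparts):
        # key is the skeleton number/partno converted to a string
        fileinfo.append((str(partno), "Text", "part%04d.xhtml" % partno))
    for dirname, filename in other_files:
        # everything else gets a key of None
        fileinfo.append((None, dirname, filename))
    return fileinfo


fileinfo = build_fileinfo(2, has_coverpage=True,
                          other_files=[("Styles", "style0001.css")])
for key, dirname, filename in fileinfo:
    print(key, dirname, filename)
```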

This fileinfo will be passed to the opf code along with k8resc (the much simpler version I proposed), and the spine and manifest will be built "on the fly" as they were originally, with the key used to access spine_order, spine_idrefs, and spine_properties as we build them.

I have modified my mobi_k8resc.py version to add an "x_" prefix to the idrefs given in the RESC. I have also offloaded most of the RESC header and extraction processing from kindleunpack.py into the new mobi_k8resc.py, and changed the resc that is returned to simply be the k8resc object (as it will have all of the other info you stored in the resc[] list).


The code in the opf then does the following:

For the metadata, we use what you have but teach it to grok the new k8resc extra metadata format instead.

For the manifest:

We use the imgnames, fileinfo, and used_map information as before, but now we look up the original idref in the k8resc.spine_idrefs dictionary as needed; otherwise we use our itemXXXXX style idrefs.
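
As a rough sketch of that lookup (the function name manifest_idref and the exact "item%05d" format are my assumptions for illustration; spine_idrefs is taken to be a plain dict keyed by the fileinfo key, with the "x_" prefix already applied to its values):

```python
def manifest_idref(key, spine_idrefs, counter):
    """Pick the id for one manifest item.

    Hypothetical helper: spine_idrefs is assumed to map a fileinfo key
    (skelid string or "coverpage") to the idref taken from the RESC.
    """
    if key is not None and key in spine_idrefs:
        # reuse the original idref from the RESC
        return spine_idrefs[key]
    # otherwise fall back to our own generated id
    return "item%05d" % counter


spine_idrefs = {"0": "x_titlepage", "1": "x_chapter1"}
print(manifest_idref("1", spine_idrefs, 4))
print(manifest_idref(None, spine_idrefs, 4))
```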

For the spine:

if k8resc exists and the length of k8resc.spine_order >= number of parts:

- we can use k8resc.spine_order to create the proper order, and get all idrefs and page properties for the spine from the k8resc object

else

- we build the spine in the order given by the fileinfo array which should match the k8proc.partInfo order as we always did previously.
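
The two branches above could look something like the following. This is only a sketch under my own assumptions: K8Resc here is a stand-in namedtuple and not the real class in mobi_k8resc.py, spine_order is assumed to be a list of keys, spine_idrefs and spine_properties are assumed to be dicts keyed the same way, and in the fallback I treat every fileinfo entry with a non-None key as a spine candidate.

```python
from collections import namedtuple

# Hypothetical stand-in for the k8resc object; the real class may differ.
K8Resc = namedtuple("K8Resc", "spine_order spine_idrefs spine_properties")


def build_spine(fileinfo, k8resc, nparts):
    """Return a list of (idref, properties) pairs in spine order."""
    spine = []
    if k8resc is not None and len(k8resc.spine_order) >= nparts:
        # RESC is usable: take order, idrefs and page properties from it
        for key in k8resc.spine_order:
            idref = k8resc.spine_idrefs[key]
            props = k8resc.spine_properties.get(key)
            spine.append((idref, props))
    else:
        # no usable RESC: keep the fileinfo order and generate our own ids
        keyed = [entry for entry in fileinfo if entry[0] is not None]
        for i, (key, dirname, filename) in enumerate(keyed):
            spine.append(("item%05d" % i, None))
    return spine


fileinfo = [("0", "Text", "part0000.xhtml"),
            ("1", "Text", "part0001.xhtml"),
            (None, "Styles", "style0001.css")]
resc = K8Resc(["1", "0"],
              {"0": "x_a", "1": "x_b"},
              {"1": "page-spread-left"})
print(build_spine(fileinfo, resc, 2))
print(build_spine(fileinfo, None, 2))
```

Note that the fallback never touches k8resc at all, so a missing or short RESC just gives the same spine order we produced previously.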

How does that sound?


To better explain what I am thinking, I have thrown together some changes with a new mobi_k8resc.py and some associated changes in the opf, but this has only been tested briefly for epub2! It will most likely die under epub3, but I think it illustrates the approach I was thinking about. If you agree, we would then try to reduce the redundancy using this mobi_opf.py, fix it to work with epub3, and remove all the dependencies on mobi_taglist.py, since it should no longer be needed.

So please see KindleUnpack_v072x_test.zip that is attached.

This is not a public release!!!!!

This version is just meant to demonstrate the approach and ideas so we can decide how best to move forward.

Take care,

KevinH
Attached Files
File Type: zip KindleUnpack_v072x_test.zip (96.2 KB, 230 views)

Last edited by KevinH; 07-08-2014 at 03:54 PM.