03-11-2012, 02:17 PM   #30
KevinH
Hi,

I will take a look at this, but no promises. Basically, KF8 is just a joined and reformatted epub. The problem is generating the skel, div, and other indexes and, if necessary, the DATP sections.

It seems all KF8 files come with DATP sections, which may or may not be required. AFAIK, no one has reverse engineered the DATP format (DATP sections are not required for original mobi files, except I think for mobi dictionaries). I do not have code that actually reads or writes DATP section contents, nor do I have code that writes indexes other than ncx indexes.

Also, the right way to deal with all of the restructuring of the html is probably to use lxml.etree, but I am not familiar with it at all. I always resort to regular expressions, but moving and extracting tags and tag contents and adding aid="" attributes with regular expressions alone would be a pain.
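
For what it is worth, here is a rough sketch of what adding aid attributes with a tree parser instead of regular expressions might look like. Everything in it is an assumption for illustration only: the set of tags that receive an aid (AID_TAGS), the base-32 counter used for the values, and the helper names to_base32 and add_aids are all made up here and are not taken from any existing KF8 code. It falls back to the standard library's xml.etree.ElementTree when lxml is not installed, since the few calls used (fromstring, iter, set, tostring) behave the same in both.

```python
try:
    from lxml import etree
except ImportError:
    # The calls used below exist with the same behavior in the stdlib module.
    import xml.etree.ElementTree as etree

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUV"

# Hypothetical set of tags that should receive an aid attribute.
AID_TAGS = {"p", "div", "span", "h1", "h2", "h3"}


def to_base32(n):
    """Encode a non-negative integer using the digits 0-9 and A-V."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 32)
        out.append(DIGITS[r])
    return "".join(reversed(out))


def add_aids(xhtml):
    """Give each element in AID_TAGS a sequential aid, in document order."""
    root = etree.fromstring(xhtml)
    count = 0
    for elem in root.iter():
        # lxml yields comments and processing instructions too; skip them.
        if not isinstance(elem.tag, str):
            continue
        # Drop a "{namespace}" prefix if the document is namespaced XHTML.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag in AID_TAGS:
            elem.set("aid", to_base32(count))
            count += 1
    return etree.tostring(root, encoding="unicode")


sample = "<body><div><p>one</p><p>two</p></div></body>"
result = add_aids(sample)
```

The point is only that the parser keeps track of where tags open and close, so adding or moving things does not depend on a regular expression matching the markup correctly.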

So don't hold your breath ...

If (and this is a big if) I can figure out how to use lxml.etree, and if there is code inside calibre to write ctoc, tag maps, and INDX indexes, and if the DATP sections are not required, then I can probably get something working.

If DATP sections are required, I simply do not have the free time to reverse engineer them.

KevinH