09-15-2014, 09:13 PM   #993
KevinH
Sigil Developer
Hi Doug,

In mobi_opf.py, in the part that builds the manifest for the opf, there is a media_map that determines the media-type for each file. The KF8 part unpacks to files with .xhtml extensions, while the older mobi part unpacks to .html, so it gets that strange media-type.

Code:
media_map = {
    '.jpg'  : 'image/jpeg',
    '.jpeg' : 'image/jpeg',
    '.png'  : 'image/png',
    '.gif'  : 'image/gif',
    '.svg'  : 'image/svg+xml',
    '.xhtml': 'application/xhtml+xml',
    '.html' : 'text/x-oeb1-document', # for mobi7
    '.pdf'  : 'application/pdf', # for azw4(print replica textbook)
    '.ttf'  : 'application/x-font-ttf',
    '.otf'  : 'application/x-font-opentype', # replaced?
    #'.otf' : 'application/vnd.ms-opentype', # [OpenType] OpenType fonts
    #'.woff' : 'application/font-woff', # [WOFF] WOFF fonts
    #'.smil' : 'application/smil+xml', # [MediaOverlays301] EPUB Media Overlay documents
    #'.pls' : 'application/pls+xml', # [PLS] Text-to-Speech (TTS) Pronunciation lexicons
    #'.mp3'  : 'audio/mpeg',
    #'.mp4'  : 'audio/mp4',
    #'.js'   : 'text/javascript', # not supported in K8
    '.css'  : 'text/css'
}
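
For illustration only, here is a minimal sketch of how a map like this could drive the manifest entries and how the .html media-type could be overridden. The helper names (guess_media_type, manifest_item) and the overrides parameter are hypothetical and are not taken from mobi_opf.py. Relabelling .html as application/xhtml+xml is just an assumed example of a change; it does not make the mobi7 markup valid xhtml.

```python
import os
from xml.sax.saxutils import quoteattr

# Trimmed copy of the map above; only the entries needed for this demo.
MEDIA_MAP = {
    '.jpg'  : 'image/jpeg',
    '.xhtml': 'application/xhtml+xml',
    '.html' : 'text/x-oeb1-document',  # for mobi7
    '.css'  : 'text/css',
}

def guess_media_type(filename, media_map=MEDIA_MAP, overrides=None):
    """Hypothetical helper: look up a media-type by file extension.

    Returns None when the extension is not in the map.
    """
    ext = os.path.splitext(filename)[1].lower()
    if overrides and ext in overrides:
        return overrides[ext]
    return media_map.get(ext)

def manifest_item(item_id, href, media_type):
    """Hypothetical helper: format one opf manifest <item> element."""
    return '<item id=%s href=%s media-type=%s />' % (
        quoteattr(item_id), quoteattr(href), quoteattr(media_type))

# Default behaviour: mobi7 .html keeps the old OEB 1 media-type.
print(guess_media_type('part0000.html'))

# Assumed change: declare .html files as xhtml instead.
xhtml_override = {'.html': 'application/xhtml+xml'}
media_type = guess_media_type('part0000.html', overrides=xhtml_override)
print(manifest_item('item1', 'part0000.html', media_type))
```

Keeping the override separate from the map leaves the mobi7 default untouched unless a caller asks for something else.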

So it would be easy to change in KindleUnpack. That said, I passed a content.opf from an old mobi through kindlegen 2.9 and it generated a lot of warnings and built a KF8 part that would never pass any epub check.

So it looks like even kindlegen requires a valid epub as input; otherwise it generates junk for the KF8 part. I thought that unpacking an old mobi and then passing it back through kindlegen might be an easy way to convert from html 3 to true xhtml. No such luck.

Frankly, I think we should use the old mobiml2xhtml.py codebase (actually its newer cousin from your KindleImport) to try to create at least a basic, valid epub-like structure from the old mobi part. Kindlegen seems to be much more adept at taking valid epub xhtml and making old html 3 than at doing the reverse.

If others agree, I would be happy to incorporate it into the next KindleUnpack release.

Take care,

Kevin