Old 09-05-2011, 05:11 AM   #136
avid-e-reader
Member
avid-e-reader began at the beginning.
 
Posts: 18
Karma: 10
Join Date: Dec 2010
Device: Kindle
(we crossed paths in the bitstream)
So although kindlegensrc.zip supposedly holds the source files, they are not the exact sources: the directory structure is modified, and the files are tweaked to reflect the changed structure.

Maybe exporting any unknown binary data into files would at least make disassembling it a bit easier. I have no clue where the .ncx file goes, or what format it gets stored in (maybe Kovid does), and without any obvious way of looking at it, that is pretty hard to figure out. If there are binary pieces that mobiunpack presently ignores, it seems reasonably likely that one (or more) of them is the .ncx data.

And maybe others would be the .mp3 files I was asking about earlier, although sadly adding .mp3 files is something that keeps slipping further out in my project list.
Old 09-05-2011, 05:12 AM   #137
siebert
Developer
siebert has a complete set of Star Wars action figures.
 
Posts: 137
Karma: 280
Join Date: Nov 2010
Device: Kindle 3 (Keyboard) 3G / iPad 3 WiFi / Nexus 4 (Android)
Quote:
Originally Posted by avid-e-reader View Post
But here's something I don't understand: there is also a content.opf in the kindlegensrc.zip file, but it doesn't seem to match the one generated by mobiunpack.
The kindlegensrc.zip contains the (slightly modified) sources used by kindlegen to create the mobi file. The content of kindlegensrc.zip should be sufficient to recreate the mobi file (with the exception of some fields which are not created by kindlegen based on the source).

If you have a kindlegensrc.zip, you can just ignore the remaining output of mobiunpack.

Unfortunately most mobi files don't contain the record which holds the kindlegensrc.zip, so using its content to improve the mobiunpack output won't help in most cases.

But at least the new mobiwriter in calibre should handle ncx files, so the calibre source should show how the ncx content is encoded in the mobi file.

Ciao,
Steffen
Old 09-05-2011, 05:17 AM   #138
pdurrant
The Grand Mouse
pdurrant ought to be getting tired of karma fortunes by now.
 
Posts: 32,896
Karma: 89897838
Join Date: Jul 2007
Location: Norfolk, England
Device: NOOK ST GlowLight
Quote:
Originally Posted by avid-e-reader View Post
A little more info: the <manifest> tag should probably look like:

<item href="misc/toc.ncx" id="toc" media-type="application/x-dtbncx+xml" />

but with the file name possibly different in different cases, based on the actual .ncx file name found? But here's something I don't understand: there is also a content.opf in the kindlegensrc.zip file, but it doesn't seem to match the one generated by mobiunpack.
The kindlegensrc.zip file is just extracted from the penultimate record in the Kindle ebook. It's put in there by Kindlegen, but is not actually used by any rendering software.
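
A minimal sketch of pulling that record out by hand, for anyone who wants to look at it outside MobiUnpack. It assumes the standard Palm database container layout (a 2-byte record count at offset 76, followed by 8-byte record entries that each start with a 4-byte big-endian offset). It also assumes the zip data can be found by searching the penultimate record for the usual zip local-file signature, rather than at a known fixed offset. The function names are made up for illustration and are not MobiUnpack's.

```python
import struct
from typing import List, Optional


def palmdb_records(data: bytes) -> List[bytes]:
    """Split a Palm database file (the Mobipocket container) into its records."""
    (count,) = struct.unpack_from(">H", data, 76)  # number of records
    offsets = [struct.unpack_from(">L", data, 78 + 8 * i)[0] for i in range(count)]
    offsets.append(len(data))  # the last record runs to the end of the file
    return [data[offsets[i]:offsets[i + 1]] for i in range(count)]


def penultimate_record_zip(data: bytes) -> Optional[bytes]:
    """Return the zip data found in the penultimate record, or None.

    ASSUMPTION: the zip is located by its 'PK\\x03\\x04' signature; any
    bytes in front of it are skipped without being interpreted.
    """
    records = palmdb_records(data)
    if len(records) < 2:
        return None
    record = records[-2]
    start = record.find(b"PK\x03\x04")
    return record[start:] if start != -1 else None
```

As noted in this thread, most mobi files have no such record, so a None result is the common case.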

All the other files generated by MobiUnpack are generated by decoding the info in the Kindle ebook. In particular, the opf file is put together from bits of info in the header, the EXTH records, and even the HTML. It should have most of the info in the original opf file, but not all of that info is actually contained in the Kindle ebook.
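
To illustrate the EXTH part, here is a sketch of walking an EXTH block. It assumes the layout described in the MobileRead wiki: the identifier "EXTH", a 4-byte header length, a 4-byte record count, then records made of a 4-byte type, a 4-byte length (which includes those 8 bytes) and the data. Locating the block inside record 0 is left out, and parse_exth is a hypothetical name, not the function MobiUnpack uses.

```python
import struct
from typing import List, Tuple


def parse_exth(block: bytes) -> List[Tuple[int, bytes]]:
    """Return the (type, data) pairs stored in an EXTH block."""
    if block[:4] != b"EXTH":
        raise ValueError("not an EXTH block")
    _header_length, count = struct.unpack_from(">LL", block, 4)
    pos = 12  # first record follows the 12-byte EXTH header
    records = []
    for _ in range(count):
        rec_type, rec_len = struct.unpack_from(">LL", block, pos)
        # rec_len counts the 8 bytes of type and length as well as the data
        records.append((rec_type, block[pos + 8:pos + rec_len]))
        pos += rec_len
    return records
```

Each pair can then be mapped to an opf element where the type is known; types that are not recognised can simply be reported and skipped.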
Old 09-05-2011, 05:21 AM   #139
avid-e-reader
Member
avid-e-reader began at the beginning.
 
Posts: 18
Karma: 10
Join Date: Dec 2010
Device: Kindle
And regarding .mp3 files, here's a sample .mobi with .mp3 that I got from somewhere, maybe with the Kindlegen documentation?
Attached Files
File Type: mobi Jabberwocky.mobi (1.20 MB, 64 views)
Old 09-05-2011, 05:22 AM   #140
pdurrant
The Grand Mouse
pdurrant ought to be getting tired of karma fortunes by now.
 
Posts: 32,896
Karma: 89897838
Join Date: Jul 2007
Location: Norfolk, England
Device: NOOK ST GlowLight
Quote:
Originally Posted by siebert View Post
But at least the new mobiwriter in calibre should handle ncx files, so the calibre source should show how the ncx content is encoded in the mobi file.
Oooo... I wonder if the developer of that has documented it in the wiki? That would make life easier. Hmm.. apparently not. When I have some spare time I'll check the calibre sources.
Old 09-05-2011, 05:27 AM   #141
siebert
Developer
siebert has a complete set of Star Wars action figures.
 
Posts: 137
Karma: 280
Join Date: Nov 2010
Device: Kindle 3 (Keyboard) 3G / iPad 3 WiFi / Nexus 4 (Android)
Quote:
Originally Posted by pdurrant View Post
Oooo... I wonder if the developer of that has documented it in the wiki? That would make life easier. Hmm.. apparently not. When I have some spare time I'll check the calibre sources.
As calibre can also decode a mobi, there might even be some Python code in calibre which creates the ncx file from an existing mobi.

Ciao,
Steffen
Old 09-05-2011, 05:30 AM   #142
siebert
Developer
siebert has a complete set of Star Wars action figures.
 
Posts: 137
Karma: 280
Join Date: Nov 2010
Device: Kindle 3 (Keyboard) 3G / iPad 3 WiFi / Nexus 4 (Android)
Quote:
Originally Posted by avid-e-reader View Post
And regarding .mp3 files, here's a sample .mobi with .mp3 that I got from somewhere, maybe with the Kindlegen documentation?
While reverse-engineering should be possible with only the mobi, it would be much easier if you could provide the sources for an example book which contains an mp3 file, as someone could then build two mobi files from the source (one with the mp3 and one without) and analyse the differences to learn how mp3 support is encoded.
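
The comparison could be sketched like this, assuming both files have already been split into their lists of records (the splitting itself is not shown, and record_differences is a name invented for this example). It uses Python's difflib to report which records were inserted, removed or replaced between the two builds.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def record_differences(without: List[bytes],
                       with_media: List[bytes]) -> List[Tuple[str, int, int, int, int]]:
    """List the record ranges that differ between two builds of the same book.

    Each entry is (tag, i1, i2, j1, j2): records without[i1:i2] correspond to
    with_media[j1:j2], and tag is 'insert', 'delete' or 'replace'.
    """
    matcher = SequenceMatcher(None, without, with_media, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]
```

An 'insert' entry points at records that exist only in the build with the mp3; a 'replace' entry points at records (such as the text or header records) that changed as a side effect.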

Ciao,
Steffen
Old 09-05-2011, 05:44 AM   #143
avid-e-reader
Member
avid-e-reader began at the beginning.
 
Posts: 18
Karma: 10
Join Date: Dec 2010
Device: Kindle
The source was just an .html file and a .mp3 file (in a subdirectory named multimedia). Attached as a .zip.
Attached Files
File Type: zip MultimediaSample.zip (1.18 MB, 92 views)
Old 09-05-2011, 11:57 AM   #144
DaleDe
Grand Sorcerer
DaleDe ought to be getting tired of karma fortunes by now.
 
Posts: 9,744
Karma: 5072196
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2
Quote:
Originally Posted by avid-e-reader View Post
A little more info: the <manifest> tag should probably look like:

<item href="misc/toc.ncx" id="toc" media-type="application/x-dtbncx+xml" />

but with the file name possibly different in different cases, based on the actual .ncx file name found? But here's something I don't understand: there is also a content.opf in the kindlegensrc.zip file, but it doesn't seem to match the one generated by mobiunpack.
Of course it does not match. The kindlegensrc is likely an epub source file, while mobiunpack generates a mobi source file. These are not the same thing and are not even created with the same version of the IDPF standard. Perhaps you do not realize that there was an earlier version of the eBook standards that was originally used by eBook readers as a source file. The Mobi, Lit, and eBookwise IMP formats were all derived from that earlier standard. See our wiki under Open eBook for more details.
Old 09-05-2011, 05:53 PM   #145
Hitch
Bookmaker & Cat Slave
Hitch ought to be getting tired of karma fortunes by now.
 
Posts: 2,593
Karma: 14407089
Join Date: Apr 2010
Location: Phoenix, AZ
Device: Kindle2, iPad, KindleFire and NookColor
Quote:
Originally Posted by avid-e-reader View Post
And regarding .mp3 files, here's a sample .mobi with .mp3 that I got from somewhere, maybe with the Kindlegen documentation?
The Jabberwocky mobi rather notoriously does not work. I'd say, therefore, that it's a skosh useless as an exemplar.

Hitch
Old 09-06-2011, 07:28 AM   #146
siebert
Developer
siebert has a complete set of Star Wars action figures.
 
Posts: 137
Karma: 280
Join Date: Nov 2010
Device: Kindle 3 (Keyboard) 3G / iPad 3 WiFi / Nexus 4 (Android)
Quote:
Originally Posted by avid-e-reader View Post
The source was just an .html file and a .mp3 file (in a subdirectory named multimedia). Attached as a .zip.
I've modified this sample to also add a video file and ran mobiunpack on it.

The handling of audio and video files is almost identical to that of image files (surprise!).

The only difference is that a 12-byte header is prepended to the original audio/video file; it starts with "AUDI" or "VIDE", followed by 2 integers of unknown value.

Also, quite similar to the image handling, the source attributes of the html tags are replaced with the record numbers:

src="file.mp3" -> mediarecindex="00002"
poster="file.jpg" -> recindex="00003"

So it should be easy to add support for audio/video to mobiunpack.
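
As a rough sketch of what that support might look like (not code from mobiunpack): the first function strips the 12-byte header from an audio/video record, and the second puts a file name back in place of a mediarecindex attribute. It assumes the two unknown integers are 32-bit big-endian values, which fits the 12-byte size but is otherwise a guess. The function names and the names mapping from record number to file name are hypothetical.

```python
import re
import struct
from typing import Dict, Optional, Tuple

MEDIA_MAGICS = (b"AUDI", b"VIDE")
_MEDIAREC = re.compile(r'mediarecindex="(\d+)"')


def strip_media_header(record: bytes) -> Optional[Tuple[str, Tuple[int, int], bytes]]:
    """Return (kind, unknown_ints, payload) for an audio/video record, else None."""
    if len(record) < 12 or record[:4] not in MEDIA_MAGICS:
        return None
    # ASSUMPTION: the two integers of unknown meaning are big-endian 32-bit
    unknown = struct.unpack_from(">LL", record, 4)
    return record[:4].decode("ascii"), unknown, record[12:]


def restore_media_src(html: str, names: Dict[int, str]) -> str:
    """Replace mediarecindex="NNNNN" with src="file name" where a name is known."""
    def repl(match):
        name = names.get(int(match.group(1)))
        return 'src="%s"' % name if name else match.group(0)
    return _MEDIAREC.sub(repl, html)
```

Attributes whose record number has no known file name are left untouched, so nothing is lost if a record could not be extracted.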

But is audio/video support really used in the wild?

My understanding is that only very few Kindle platforms support them (is there a list of the supported platforms?)

Ciao,
Steffen
Old 09-06-2011, 10:46 AM   #147
pdurrant
The Grand Mouse
pdurrant ought to be getting tired of karma fortunes by now.
 
Posts: 32,896
Karma: 89897838
Join Date: Jul 2007
Location: Norfolk, England
Device: NOOK ST GlowLight
NCX Decoding Puzzle

Well, I took a quick look at where the ncx file might be stored, and it turns out that when an ncx file is added to the sources, three extra records are added to the Mobipocket file.

Here's the source NCX file, along with the three added sections of the Mobipocket file (separated out into individual files).

I don't have time to properly decode the binary formats, but if anyone fancies a puzzle, here they are. The task is to work out how to reconstruct (as best as possible) the source ncx file from the compiled binary files.
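
For anyone taking up the puzzle without a hex editor to hand, a small hex dump helper is sketched below. It is a generic inspection aid only and assumes nothing about the structure of the three records; hexdump is just a name chosen for this example.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values and printable ASCII, one row per line."""
    pad = width * 3 - 1  # width of a full row of hex pairs joined by spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # non-printable bytes are shown as dots in the text column
        text = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text}")
    return "\n".join(lines)
```

Dumping each binary file next to the source ncx makes it easier to spot where the navigation labels and offsets end up.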
Attached Files
File Type: zip NCX Data.zip (2.9 KB, 85 views)
Old 09-06-2011, 11:22 AM   #148
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
 
Posts: 9,424
Karma: 43260000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by pdurrant View Post
I don't have time to properly decode the binary formats, but if anyone fancies a puzzle, here they are. The task is to work out how to reconstruct (as best as possible) the source ncx file from the compiled binary files.
For those looking for insight into the ncx reconstruction from the calibre source code, I'd start with calibre/ebooks/mobi/input.py, which will lead you to calibre/ebooks/mobi/reader.py, specifically the MobiReader class and its extract_contents function.

I can't get my head around it all quite yet, but maybe someday!
Old 09-10-2011, 12:07 PM   #149
kaizoku
Junior Member
kaizoku began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Sep 2011
Device: Mac
Getting some unknown Metadata error with this sample file. Rename the file to .azw4.
Attached Files
File Type: avi B005KDOWQK_EBSP.azw4.avi (3.80 MB, 53 views)
Old 09-10-2011, 02:01 PM   #150
pdurrant
The Grand Mouse
pdurrant ought to be getting tired of karma fortunes by now.
 
Posts: 32,896
Karma: 89897838
Join Date: Jul 2007
Location: Norfolk, England
Device: NOOK ST GlowLight
Quote:
Originally Posted by kaizoku View Post
Getting some unknown Metadata error with this sample file. Rename the file to .azw4.
I think that unknown Metadata should only be showing as a warning. There is almost always some unknown metadata, as the Mobipocket/Print Replica file format is undocumented.

MobiUnpack used to ignore it, now it mentions it. You can ignore it.