Old 07-31-2011, 03:28 PM   #44
nickredding
onlinenewsreader.net
nickredding knows the difference between 'who' and 'whom'
 
Posts: 328
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
Quote:
Originally Posted by kovidgoyal
Nevermind, looking at the TBS bytes from that document, their structure is completely different from kindlegen 1.2 TBS entries, so you'd have to decode them from scratch, the info you'll need will all be present in the decompiled_nyt/ dir.
Not true. The TBS bytes generated by Kindlegen 1.1 and 1.2 are identical. I have attached my own parsing of them, produced with a modified version of a Python script called mobiunpack (also attached). I don't understand the output from your debug code. For example, for NYT.MOBI your code seems to say the TBS bytes for the first record are 80 0 80 80 (from tbs_indexing.txt):
Code:
******************** TBS Indexing (27 records) ********************

Record #1: Starts at: 0 Ends at: 4095
	Contains: 3 index entries (0 ends, 0 complete, 3 starts)
TBS bytes: 80 0 80 80
	Starts:
		Index Entry: 0 (Parent index: -1, Depth: 0, Offset: 121, Size: 107660) [Periodical]
		Index Entry: 1 (Parent index: 0, Depth: 1, Offset: 568, Size: 76568) [The Front Page]
		Index Entry: 3 (Parent index: 1, Depth: 2, Offset: 2968, Size: 13248) [Amid New Talks, Some Optimism on Debt Crisis]

TBS: 0 (0000)
Outermost index: 0
Unknown extra start bytes: {}
The section at the start of this record is: 0
First article in this record of section 0 (relative to its parent section): 0 [0 absolute index]
The section 0 has at most one article in this record
My parsing of the first record shows 86 80 02 A0 85, as in:
Code:
    PACKED HTML Record[  0]  Base =         0h [        0 ]  Size =   7B0h [   1968 ]
**Unpacked HTML Record[  0]  0 - 4099   TBS =  86 80 02 A0 85
       TBS HTML Record       86 80 02 A0 85
Decode TBS HTML Record       Type 6 <first section article, ncx=idx+1>
                             20h(idx=2 flags=0) NCX[3] HTML = 2968 - 16215, parent=1, flags=6, flagdata=0
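
For anyone who wants to poke at these bytes by hand, here is a minimal Python sketch of a forward variable-width-integer decoder. It assumes the convention commonly described for MOBI files (7 payload bits per byte, most significant bits first, high bit set on the last byte of each integer). The function names `decode_forward_vwi` and `split_vwis` are made up for this illustration and are not taken from mobiunpack or from the calibre debug code, and the split it produces is not claimed to match the field grouping either tool uses.

```python
def decode_forward_vwi(data, pos=0):
    """Decode one forward-encoded variable-width integer starting at pos.

    Assumed convention: each byte carries 7 payload bits, and the last
    byte of the integer has its high bit (0x80) set.
    Returns (value, next_pos).
    """
    value = 0
    while pos < len(data):
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:  # high bit marks the final byte of this integer
            return value, pos
    raise ValueError("ran out of bytes before a terminating byte was found")


def split_vwis(data):
    """Split a byte string into consecutive forward-encoded integers."""
    values = []
    pos = 0
    while pos < len(data):
        value, pos = decode_forward_vwi(data, pos)
        values.append(value)
    return values


# The TBS bytes reported by the modified mobiunpack for the first record.
tbs = bytes.fromhex("86 80 02 A0 85")
print([hex(v) for v in split_vwis(tbs)])
```

Under this assumed encoding the leading byte 86 decodes to 6, which lines up with the "Type 6" label in the output above. How the remaining bytes map onto index and flag fields is a matter for the actual parsers, so treat the rest of the split as raw values only.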
Attached Files
File Type: txt unpacknyt.txt (12.7 KB, 717 views)
File Type: txt unpacknyt-1-2.txt (12.7 KB, 752 views)
File Type: mobi nyt.mobi (337.1 KB, 568 views)
File Type: mobi nyt-1-2.mobi (697.6 KB, 557 views)
File Type: txt mobiunpack.txt (29.2 KB, 641 views)