07-31-2011, 10:19 AM | #31 |
Connoisseur
Posts: 82
Karma: 10
Join Date: Oct 2010
Device: Kindle
|
Kovid, I have a problem running the inspect-mobi part of your source code on my Ubuntu 11.04 machine. Would you check the following error message?
Spoiler:
I checked line 1117 of the corresponding source code, which reads as follows: Spoiler:
Thanks. |
07-31-2011, 11:40 AM | #32 |
creator of calibre
Posts: 43,795
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
tylau0: Update your calibre source code, that bug was fixed yesterday.
|
07-31-2011, 11:51 AM | #33 |
Connoisseur
Posts: 82
Karma: 10
Join Date: Oct 2010
Device: Kindle
|
I did use the source code at http://status.calibre-ebook.com/dist/src, downloaded this morning. Could you double-check whether the updated version has been uploaded? Or should I check an alternative site?
Thanks again. |
07-31-2011, 11:54 AM | #34 |
creator of calibre
Posts: 43,795
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Use
bzr branch lp:calibre
The tarball is the source code corresponding to 0.8.12. |
07-31-2011, 12:29 PM | #35 | |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
Quote:
Two interesting things here: the TAGX entries being generated by Kindlegen 1.1 obviously don't conform to what the debug code is expecting, and (responding to your other comments) Kindlegen 1.1 never generates secondary index data. You've got a lot further than I (or GRiker) did understanding the MOBI format, but I'm still scratching my head over the fact that Kindlegen 1.1 output works and is missing DATP and secondary index records, and also appears to use a TAGX block which is invariant (the latter is also true of Amazon-generated periodicals). My approach (failed so far) has been to try to get the MOBI output to look like it came from Kindlegen 1.1, and the last hurdle I faced was the TBS records which I couldn't replicate because of the apparently arbitrary byte sequences. If you have decoded these then maybe I can get it to work. |
|
07-31-2011, 12:34 PM | #36 |
creator of calibre
Posts: 43,795
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Can you post a file that causes it to crash? It should be easy for me to add support for its tag structure. And note that I've been committing changes to inspect mobi up until a few hours ago. The last revision is 10040.
|
07-31-2011, 12:43 PM | #37 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
I'll bzr the whole lot and try again--debug crashed on all mobi files (including calibre-generated) so yesterday's bzr must be out of date.
|
07-31-2011, 02:17 PM | #38 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
Kovid - I downloaded 10042 and this is what I get from the attached Kindlegen 1.1 generated file (produced using my method: ebook-convert --> OEB --> Kindlegen).
Code:
C:\Users\Nick\Calibre-Kindle\News-Files>calibre-debug --inspect-mobi nyt.mobi
Python function terminated unexpectedly
  Dont know how to interpret flag 0b0010 while reading section transitions (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 132, in main
  File "site.py", line 109, in run_entry_point
  File "C:\Users\Nick\calibre\src\calibre\debug.py", line 236, in main
    inspect_mobi(opts.inspect_mobi)
  File "C:\Users\Nick\calibre\src\calibre\ebooks\mobi\debug.py", line 1466, in inspect_mobi
    print(str(f.tbs_indexing), file=out)
  File "C:\Users\Nick\calibre\src\calibre\ebooks\mobi\debug.py", line 1173, in __str__
    ans += self.dump_record(r, dat)[-1]
  File "C:\Users\Nick\calibre\src\calibre\ebooks\mobi\debug.py", line 1224, in dump_record
    dat['geom'][0])
  File "C:\Users\Nick\calibre\src\calibre\ebooks\mobi\debug.py", line 1314, in interpret_periodical
    byts = read_section_transitions(byts, ssi)
  File "C:\Users\Nick\calibre\src\calibre\ebooks\mobi\debug.py", line 1245, in read_section_transitions
    raise ValueError('Dont know how to interpret flag 0b0010'
ValueError: Dont know how to interpret flag 0b0010 while reading section transitions |
07-31-2011, 02:29 PM | #39 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
I added the plugin tweak to use your new MOBI writer code, and running debug on the resulting file gives me this:
Code:
C:\Users\Nick\Calibre-Kindle\News-Files>calibre-debug --inspect-mobi nytcalibre2.mobi
Python function terminated unexpectedly
  'MOBIFile' object has no attribute 'secondary_index_header' (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 132, in main
  File "site.py", line 109, in run_entry_point
  File "C:\Users\Nick\calibre\src\calibre\debug.py", line 236, in main
    inspect_mobi(opts.inspect_mobi)
  File "C:\Users\Nick\calibre\src\calibre\ebooks\mobi\debug.py", line 1456, in inspect_mobi
    if f.secondary_index_header is not None:
AttributeError: 'MOBIFile' object has no attribute 'secondary_index_header' |
07-31-2011, 02:39 PM | #40 |
creator of calibre
Posts: 43,795
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Ah that's an error in reading the TBS, I'll look at it in a moment.
|
07-31-2011, 02:49 PM | #41 |
creator of calibre
Posts: 43,795
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
OK, those are really strange TBS bytes. I have no idea what they mean. I've never seen anything like them in amazon generated, calibre or kindlegen 1.2 output. I've committed a change to MOBI inspect to just print the error to stdout and continue, so you should be able to see the rest of the decompiled data.
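For readers following along, the "print the error and continue" behaviour described above can be sketched roughly as below. This is only an illustration of the pattern, not calibre's actual code: `dump_all` and the `interpret` callback are hypothetical names, and the real inspector works on richer record objects.

```python
def dump_all(records, interpret):
    """Interpret every record, reporting failures instead of aborting.

    `records` is any sequence of raw TBS byte strings and `interpret` is a
    callable that may raise ValueError on bytes it does not understand.
    Both are hypothetical stand-ins for whatever the real inspector uses.
    """
    lines = []
    for i, rec in enumerate(records):
        try:
            lines.append('Record #%d: %s' % (i, interpret(rec)))
        except ValueError as err:
            # Report the problem for this record and move on to the next one,
            # so the rest of the dump is still produced.
            lines.append('Record #%d: failed to interpret: %s' % (i, err))
    return lines
```

With this shape, one undecodable record costs only its own line of output, and the remaining records are still dumped.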
|
07-31-2011, 02:58 PM | #42 |
creator of calibre
Posts: 43,795
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Those bytes occur only on records that are spanned (i.e. have no start/end points for periodical/section/article nodes) which means they probably contain information about the spanning node.
I'm not overly keen to decode them, since as I said, they only seem to occur in kindlegen 1.1 output. But all the data you need to decode them is in tbs_indexing.txt so knock yourself out if you feel like it. |
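To make the term "spanned" concrete, here is a minimal sketch of how an index entry relates to a single text record, given the record's start/end offsets and the entry's offset and size as printed in tbs_indexing.txt. The function name `classify` and the inclusive-end convention (last byte = offset + size - 1) are assumptions made for illustration, not something taken from calibre's source.

```python
def classify(rec_start, rec_end, offset, size):
    """Relate one index entry to one text record.

    rec_start/rec_end: first and last byte offsets of the record (inclusive).
    offset/size: the index entry's start offset and length in bytes.
    Returns 'complete', 'starts', 'ends', 'spans' or 'outside'.
    """
    first = offset
    last = offset + size - 1  # assumed inclusive end of the entry
    if last < rec_start or first > rec_end:
        return 'outside'
    starts_here = rec_start <= first <= rec_end
    ends_here = rec_start <= last <= rec_end
    if starts_here and ends_here:
        return 'complete'
    if starts_here:
        return 'starts'
    if ends_here:
        return 'ends'
    # The entry began before this record and finishes after it.
    return 'spans'
```

A record in which every overlapping entry comes back as 'spans' has no start or end points of its own, which is the case being described for the unusual TBS bytes.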
07-31-2011, 03:00 PM | #43 |
creator of calibre
Posts: 43,795
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Never mind. Looking at the TBS bytes from that document, their structure is completely different from kindlegen 1.2 TBS entries, so you'd have to decode them from scratch. The info you'll need will all be present in the decompiled_nyt/ dir.
|
07-31-2011, 03:28 PM | #44 | |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
Quote:
Code:
******************** TBS Indexing (27 records) ********************
Record #1: Starts at: 0 Ends at: 4095
  Contains: 3 index entries (0 ends, 0 complete, 3 starts)
  TBS bytes: 80 0 80 80
  Starts:
    Index Entry: 0 (Parent index: -1, Depth: 0, Offset: 121, Size: 107660) [Periodical]
    Index Entry: 1 (Parent index: 0, Depth: 1, Offset: 568, Size: 76568) [The Front Page]
    Index Entry: 3 (Parent index: 1, Depth: 2, Offset: 2968, Size: 13248) [Amid New Talks, Some Optimism on Debt Crisis]
  TBS: 0 (0000)
  Outermost index: 0
  Unknown extra start bytes: {}
  The section at the start of this record is: 0
  First article in this record of section 0 (relative to its parent section): 0 [0 absolute index]
  The section 0 has at most one article in this record
Code:
PACKED HTML Record[ 0] Base = 0h [ 0 ] Size = 7B0h [ 1968 ]
**Unpacked HTML Record[ 0] 0 - 4099
TBS = 86 80 02 A0 85
TBS HTML Record 86 80 02 A0 85
Decode TBS HTML Record
Type 6 <first section article, ncx=idx+1> 20h(idx=2 flags=0)
NCX[3] HTML = 2968 - 16215, parent=1, flags=6, flagdata=0 |
|
07-31-2011, 04:41 PM | #45 |
creator of calibre
Posts: 43,795
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That's because the extra data flags in your mobi are incorrect. They should be:
0b11 (assuming the only trailing data is multibyte overlap and indexing)
Instead, they are:
0b1011
This causes the reading of the trailing data to be incorrect. |
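As a rough illustration of the difference between the two flag values in the post above, the sketch below lists which bits are set in each. The labels for the two low bits follow the post (multibyte overlap and indexing); the bit-to-label mapping, the helper names, and the treatment of any other bit as "unknown" are assumptions for illustration only.

```python
# Labels for the two low bits follow the post above; anything else is
# reported as unknown rather than guessed at.
KNOWN_FLAGS = {0: 'multibyte overlap', 1: 'indexing (TBS)'}


def describe_extra_data_flags(flags):
    """Return one label per set bit, lowest bit first."""
    labels = []
    bit = 0
    while flags >> bit:
        if flags & (1 << bit):
            labels.append(KNOWN_FLAGS.get(bit, 'unknown bit %d' % bit))
        bit += 1
    return labels


def unexpected_bits(actual, expected=0b11):
    """Return the bits set in `actual` that are not set in `expected`."""
    return actual & ~expected


print(describe_extra_data_flags(0b11))
print(describe_extra_data_flags(0b1011))
print(bin(unexpected_bits(0b1011)))
```

The extra set bit in 0b1011 is the one a reader would take as announcing a further kind of trailing data that the file does not actually contain, which is consistent with the trailing data being misread.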