Old 09-14-2011, 07:54 AM   #181
siebert
Developer
siebert has a complete set of Star Wars action figures.
 
Posts: 155
Karma: 280
Join Date: Nov 2010
Device: Kindle 3 (Keyboard) 3G / iPad 9 WiFi / Google Pixel 6a (Android)
Quote:
Originally Posted by pdurrant View Post
An interesting idea. I haven't really explored the dev hub.
It seems to support only subversion (*shudder*)...

As I've said before, I pushed my git repository to github and any fellow developer should feel free to create forks for their own development which can be merged if a feature is ready: https://github.com/siebert/mobiunpack

Ciao,
Steffen
Old 09-14-2011, 09:27 AM   #182
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi fandrieu,

Great work!

I will take a shot at combining your latest version with a version that uses Siebert's readTag routine to parse the TAGX which can be found in the indx0 section to find the field bitmaps for each tag and parse them. That way we can forget about all of the if type == 0x1f lines and just use the correct bitmaps to decipher which fields are present and then read them.
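A minimal sketch of that idea, not the actual mobiunpack code. It assumes the reverse-engineered TAGX layout (the magic `TAGX`, a 4-byte table length, a 4-byte control byte count, then 4-byte entries of tag, values per entry, bitmask and end flag) and that every mask is a single bit. The names `read_tagx` and `tags_present` are made up for illustration.

```python
import struct

def read_tagx(data, start=0):
    """Parse a TAGX table.

    Returns (control_byte_count, [(tag, nvalues, mask, endflag), ...]).
    The layout used here is an assumption based on reverse engineering.
    """
    if data[start:start + 4] != b"TAGX":
        raise ValueError("no TAGX table at offset %d" % start)
    # Table length and control byte count, both big-endian 32-bit integers.
    table_len, control_byte_count = struct.unpack_from(">LL", data, start + 4)
    tags = []
    for pos in range(start + 12, start + table_len, 4):
        tags.append(struct.unpack_from(">BBBB", data, pos))
    return control_byte_count, tags

def tags_present(control_byte, tags):
    """Return the tag numbers whose (single-bit) mask is set in control_byte."""
    return [tag for tag, nvalues, mask, endflag in tags
            if not endflag and control_byte & mask]
```

With a table like this, the per-type `if` chains can be replaced by one loop over `tags_present(control_byte, tags)` that reads the values for each tag in table order.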

Thanks!

KevinH
Old 09-14-2011, 09:38 AM   #183
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi DaleDe,

Quote:
Originally Posted by DaleDe View Post
This is great interaction and development. I wonder if the dev hub available here would be better for the purpose.
Interesting idea. Paul and I hosted a code.google.com site for mobiunpack.py but we received almost no contributions or input over the years. Siebert was the first new developer to come on and he found the source on this site (not our code.google.com dev site) and after his extensive changes he added his own git site.

Based on similar experiences from other small (couple of files only) dev projects, it appears to me that using development specific hosting with its own hurdle of concurrent versioning tool (git vs svn vs mercurial vs cvs vs rcs, etc.) and the lack of visits by users who might have an "itch to scratch" simply lowers contributions.

I think the same thing happens with users of both Sigil and Calibre. They are constantly pointed to other official sites but most of the impetus for change is done or initiated via MR.

So unless we are disrupting things with our posts, I would prefer to keep things here just to maximize our exposure to new users (and hopefully potential developers) who might want to contribute a new feature or quick fix.

My 2 cents ...

KevinH
Old 09-14-2011, 10:07 AM   #184
fandrieu
Member
fandrieu began at the beginning.
 
Posts: 11
Karma: 10
Join Date: Sep 2011
Device: kindle 3
KevinH, sorry to flood the thread with zips, but here is a new version.

I tried the NCX code against all the mobis I could lay my hands on...

The only "real" error I got was with really fat ebooks (technical books with more than a thousand entries): the INDX1 is split across more than one section!

I first added a few checks to prevent exceptions, but more importantly found out that the actual number of "data" INDX sections is stored in the INDX0.

So I modified the code to take this into account and parse multiple INDXx.
In the zip file you'll find a file for this test case, a dummy book with 4000 entries on 5 levels (that's a 600kb ncx...)
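A rough sketch of how the multi-section case could be handled, not the code in the zip. The header field names are the ones printed in the debug output later in this thread; that they are consecutive big-endian 32-bit values after the `INDX` magic, and that the `count` field of INDX0 holds the number of data sections, are assumptions. `parse_indx_header` and `data_indx_sections` are hypothetical names.

```python
import struct

# Field names as they appear in the thread's "parsed INDX header" output.
INDX_FIELDS = ("len", "nul1", "type", "gen", "start", "count",
               "code", "lng", "total", "ordt", "ligt")

def parse_indx_header(section):
    """Read the fixed fields that follow the 'INDX' magic (assumed layout)."""
    if section[:4] != b"INDX":
        raise ValueError("not an INDX section")
    values = struct.unpack_from(">%dL" % len(INDX_FIELDS), section, 4)
    return dict(zip(INDX_FIELDS, values))

def data_indx_sections(sections, indx0_index):
    """Return every data INDX section that follows INDX0.

    Assumption: INDX0's 'count' field is the number of data sections.
    """
    header = parse_indx_header(sections[indx0_index])
    n_data = header["count"]
    first = indx0_index + 1
    return sections[first:first + n_data]
```

Each returned section would then be parsed in turn and its entries appended to one list, instead of stopping after INDX1.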

While I was at it, as suggested by siebert, I used his tagx code to parse the rest of INDX0, but the script still doesn't do anything with the data.

Please use this version instead of the previous one if you plan on integrating the changes.

Thanks, fand.

PS: I also included the (simplistic) script I used to test the code on all my books, if anyone is interested...
Attached Files
File Type: zip mobiunpack_testncx2.zip (96.5 KB, 227 views)

Last edited by fandrieu; 09-14-2011 at 02:23 PM. Reason: reup: fixed an error in child reordering
Old 09-14-2011, 12:39 PM   #185
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Hi, the above (mobiunpack_testncx2.zip) test script isn't recognizing the ncx in most of my mobi's. The multi-level stuff seems to be off by one. Any of my mobi's that have a strictly flat ncx (one level), the script mistakenly reports as having "No ncx." And with a mobi that has a two-level ncx, the script builds a one-level (flat ncx file)... ignoring the parent level if an entry has a parent.

I may be wrong, but I seem to remember something about calibre flattening the ncx regardless. I'm not sure the Kindle properly handles a multi-level ncx file. Something about only the parent levels (and not the children) showing on the progress bar as "jump points" (which is the only useful function the ncx provides on a Kindle). I could be completely mistaken about all that, though... I'll have to do some testing.

Last edited by DiapDealer; 09-14-2011 at 12:49 PM.
Old 09-14-2011, 01:28 PM   #186
fandrieu
Member
fandrieu began at the beginning.
 
Posts: 11
Karma: 10
Join Date: Sep 2011
Device: kindle 3
Quote:
Originally Posted by DiapDealer View Post
Hi, the above (mobiunpack_testncx2.zip) test script isn't recognizing the ncx in most of my mobi's. The multi-level stuff seems to be off by one. Any of my mobi's that have a strictly flat ncx (one level), the script mistakenly reports as having "No ncx." And with a mobi that has a two-level ncx, the script builds a one-level (flat ncx file)... ignoring the parent level if an entry has a parent.
I wouldn't be surprised if it's off by one; on the contrary, I don't expect the code to be correct at this stage.
But so far I couldn't find a book that reproduces the problem, which is pretty weird. I'll look into it further...

Quote:
Originally Posted by DiapDealer View Post
I may be wrong, but I seem to remember something about calibre flattening the ncx regardless. I'm not sure the Kindle properly handles a multi-level ncx file. Something about only the parent levels (and not the children) showing on the progress bar as "jump points" (which is the only useful function the ncx provides on a Kindle). I could be completely mistaken about all that, though... I'll have to do some testing.
As far as I know that's right, and it makes multi-level NCX pretty useless for now.
But anyway kindlegen does produce this kind of file and my goal with this code was to extract as much from the mobi as possible, so that you can re-compile the files from mobiunpack into an as-identical-as-possible new mobi...
Old 09-14-2011, 03:22 PM   #187
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi All,

Okay, I took fandrieu's latest, and modified it to pass the tagx info to the readINDX1 routine and fixed an off by one in the code that sorts the NCX.

I think this should now be close.

PS: Actually I still think sortINDX has an off-by-one issue and my change may not be the correct one! My change fixed my problem but will probably fail for some other case. Recursion is so fun!
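For what it's worth, here is one way to nest the entries without recursion. It is only a sketch, not the sortINDX code: it assumes each parsed entry is a dict with a hypothetical `parent` key holding the index of its parent entry, or -1 for a top-level entry. Note that testing `parent <= 0` instead of `parent < 0` would wrongly treat children of entry 0 as top-level, which is exactly the kind of off-by-one that is easy to introduce here.

```python
def nest_entries(entries):
    """Turn a flat list of NCX entries into a tree of nested dicts.

    Assumption: entry["parent"] is the list index of the parent entry,
    or -1 when the entry sits at the top level.
    """
    nodes = [dict(entry, children=[]) for entry in entries]
    roots = []
    for index, node in enumerate(nodes):
        parent = node["parent"]
        if parent < 0:
            # Index 0 is a valid parent, so only negative values mean "no parent".
            roots.append(node)
        elif parent >= len(nodes) or parent == index:
            raise ValueError("entry %d has a bad parent index %d" % (index, parent))
        else:
            nodes[parent]["children"].append(node)
    return roots
```

A strictly flat NCX simply yields a list of roots with empty `children` lists, so it needs no special case.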

Either way it needs to be worked on and fixed. We should also re-factor things into classes and maybe even separate it into files that encapsulate the various functions in some smarter way.

Last edited by KevinH; 09-15-2011 at 06:56 PM. Reason: add a PS
Old 09-14-2011, 05:08 PM   #188
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I'm getting good results with these latest scripts. I'm still trying to find something in one of my books that breaks it, but I'm not having much luck.

Quote:
Originally Posted by KevinH
Either way it needs to be worked on and fixed. We should also re-factor things into classes and maybe even separate it into files that encapsulate the various functions in some smarter way.
I'm all for class-ifying, but if given a vote, I would rather that mobiunpack remain one self-contained script.

Last edited by DiapDealer; 09-14-2011 at 05:33 PM.
Old 09-14-2011, 05:28 PM   #189
fandrieu
Member
fandrieu began at the beginning.
 
Posts: 11
Karma: 10
Join Date: Sep 2011
Device: kindle 3
Quote:
Originally Posted by DiapDealer View Post
I'm getting good results with these latest scripts. I'm still trying to find something in one of my books that breaks it, but I'm not having much luck.
I just found a book with the same kind of problem:
calibre fetched a scheduled feed just while I was testing some files, so I tried the resulting "periodical" mobi and that was it.

It seems the problem is with the INDX parsing. I got this output:

Code:
parsed INDX header:
len 192 nul1 0 type 1 gen 0 start 1256 count 54 code 4294967295 lng 4294967295 total 0 ordt 0 ligt 0
contextual data @ xB
DF	0	-1	1	6
contextual data @ x98
2	2	E2	-1	-1
contextual data @ x127
46	2	E2	-1	-1
which shows that from the second entry everything is mangled.
There's actually an extra VWI in the first "DF" entry so the rest is shifted.
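To illustrate why one unread VWI mangles everything after it, here is a small sketch of variable-width integer decoding. It assumes the forward encoding usually described for MOBI indexes (7 data bits per byte, high bit set on the last byte); `read_vwi` and `read_fields` are hypothetical helper names, not functions from the script.

```python
def read_vwi(data, offset):
    """Decode one forward variable-width integer.

    Returns (value, bytes_consumed). Assumed convention: each byte carries
    7 bits of data and the high bit marks the final byte.
    """
    value = 0
    consumed = 0
    while True:
        byte = data[offset + consumed]
        consumed += 1
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:
            return value, consumed

def read_fields(data, offset, count):
    """Read `count` consecutive VWIs; returns (values, offset_after_last)."""
    values = []
    for _ in range(count):
        value, used = read_vwi(data, offset)
        values.append(value)
        offset += used
    return values, offset
```

If an entry really holds three VWIs but the parser only expects two, `read_fields` stops one value early and the returned offset falls short of the real end of the entry, so the next entry is read from the wrong position.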

I guess the right way to fix this would be to use the TAGX data to reliably know what to expect in the entries.
In this particular case our current "type-based" rules might work if we took into account the differences between book and periodical style indexes... but I have yet to fiddle with that...

EDIT:
I missed KevinH's last post...
Thanks for the tagx code, I'll look into it.
And yes, there were some errors in the sortINDX code. I actually (silently, out of shame) re-uploaded the zip earlier with >= replaced by > in the first test, plus other fixes.

EDIT2:
tagx: pretty impressive, many thanks for quickly implementing this tagx bit I had skipped altogether
sortINDX: you got the second ">0" error but missed the one I mentioned above
refactor: I was toying with the OOP approach before but held off to keep in sync with other versions; I have a mobiunpack_ootest.py somewhere...

Last edited by fandrieu; 09-14-2011 at 06:05 PM.
Old 09-14-2011, 06:04 PM   #190
pdurrant
The Grand Mouse 高貴的老鼠
pdurrant ought to be getting tired of karma fortunes by now.
Posts: 71,506
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
Bear in mind that calibre-generated Mobipocket files might not be valid in all instances, since the code was written with reverse-engineered info, not with documentation of the format.
Old 09-14-2011, 07:02 PM   #191
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi All,

Okay I merged the fixes that fandrieu made to his version (fixes to sortINDX, other changes) and added in a few other typo fixes and now I think we have a version we can use as the basis for public testing and as a basis for refactoring into classes while trying to keep to just one file.

Very nice work fandrieu!

mobiunpack_fand_updated2.zip is attached.

KevinH

Last edited by KevinH; 09-15-2011 at 08:32 PM.
Old 09-14-2011, 07:58 PM   #192
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
The above script is slightly broken for MOBI's that have no NCX (when DEBUG_NCX is set to False). In that circumstance, the outncx variable is referenced before it's assigned in the unpackBook function. The <spine> element is also incorrect in the opf for a MOBI with no ncx file.

I made two small changes to the unpackBook function that make it work for MOBI's with no NCX. A quick diff will reveal the simple changes.
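The actual changes are in the script itself; the snippet below only illustrates the two kinds of fix. It assumes the spine problem concerns the `toc="ncx"` attribute, which should only be written when an NCX file exists. `build_opf_spine` and `unpack_book` are simplified, hypothetical stand-ins rather than the real functions.

```python
def build_opf_spine(item_ids, has_ncx):
    """Build the OPF <spine>, referencing the NCX only when there is one."""
    toc_attr = ' toc="ncx"' if has_ncx else ""
    lines = ["<spine%s>" % toc_attr]
    lines += ['<itemref idref="%s"/>' % item_id for item_id in item_ids]
    lines.append("</spine>")
    return "\n".join(lines)

def unpack_book(ncx_entries, item_ids):
    """Hypothetical skeleton showing where outncx gets its default."""
    # Assign the default first, so every later reference is safe
    # even when the book has no NCX at all.
    outncx = None
    if ncx_entries:
        outncx = "toc.ncx"
    spine = build_opf_spine(item_ids, has_ncx=outncx is not None)
    return outncx, spine
```

The point is simply that `outncx` gets a value on every code path before anything reads it.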

I'm having quite a bit of success with unpacking various books and rebuilding them with Kindlegen.

Last edited by DiapDealer; 09-16-2011 at 01:20 PM.
Old 09-14-2011, 08:06 PM   #193
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi DiapDealer,

Nice catch! I never actually tested it on a book without an NCX.
If your version seems to work for everyone, then we have one to release before we attempt the refactoring/adding of classes.

Thanks,

KevinH

Quote:
Originally Posted by DiapDealer
The above script is slightly broken for MOBI's that have no NCX (when DEBUG_NCX is set to False). In that circumstance, the outncx variable is referenced before it's assigned in the unpackBook function. The <spine> element is also incorrect in the opf for a MOBI with no ncx file.

I made two small changes to the unpackBook function that make it work for MOBI's with no NCX. A quick diff will reveal the simple changes.

I'm having quite a bit of success with unpacking various books and rebuilding them with Kindlegen.
Old 09-14-2011, 09:54 PM   #194
fandrieu
Member
fandrieu began at the beginning.
 
Posts: 11
Karma: 10
Join Date: Sep 2011
Device: kindle 3
Hehe, I didn't take the time to check your latest fixes (it's pretty late here), but you seem to have spotted the misplaced outncx=False line.

I just wanted to add another bit that troubled me:
I merged the (hopefully fixed) sortINDX & buildNCX functions, removing an "evolutionary" clutch with the added bonus of correct indenting (I didn't take much time to test it, though...)
Attached Files
File Type: zip mobiunpack_testncx_onemore.zip (16.6 KB, 236 views)
Old 09-15-2011, 10:42 AM   #195
siebert
Developer
siebert has a complete set of Star Wars action figures.
 
Posts: 155
Karma: 280
Join Date: Nov 2010
Device: Kindle 3 (Keyboard) 3G / iPad 9 WiFi / Google Pixel 6a (Android)
Hi,

I've looked into the latest source provided by fandrieu and the handling seems to take some shortcuts. I assume that the ncx index also contains an IDXT section; why don't you use it to find the start and end position of each entry, so you can verify that you've decoded all bytes?
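Roughly what I have in mind, as a sketch only. It assumes the IDXT block is the magic `IDXT` followed by one big-endian 16-bit offset per entry, and that the last entry ends where IDXT begins; the `start` and `count` header fields would presumably supply `idxt_offset` and `entry_count`. The function names are hypothetical.

```python
import struct

def entry_spans(section, idxt_offset, entry_count):
    """Return a (start, end) byte range for every index entry.

    Assumed layout: 'IDXT' followed by entry_count big-endian 16-bit offsets.
    """
    if section[idxt_offset:idxt_offset + 4] != b"IDXT":
        raise ValueError("IDXT not found at offset %d" % idxt_offset)
    offsets = list(struct.unpack_from(">%dH" % entry_count, section, idxt_offset + 4))
    offsets.append(idxt_offset)  # the last entry ends where IDXT begins
    return list(zip(offsets[:-1], offsets[1:]))

def check_fully_decoded(span, decoded_end):
    """Raise if decoding stopped before or after the end of the entry."""
    start, end = span
    if decoded_end != end:
        raise ValueError("entry at %d: decoded up to %d, expected %d"
                         % (start, decoded_end, end))
```

Running `check_fully_decoded` after each entry would have flagged the extra value in the periodical index right away, instead of silently shifting the following entries.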

The tag handling code will work only if all bitmasks are single bits. Is this always the case? I would then at least add an assertion which will fail for non-single bitmasks.
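Something along these lines, as a sketch under the assumption that TAGX entries are available as (tag, values per entry, mask, end flag) tuples. `check_tag_masks` is the assertion I mean; `masked_value` shows how a multi-bit mask could be handled if one ever turns up. All names are hypothetical.

```python
def is_single_bit(mask):
    """True when exactly one bit is set in mask."""
    return mask != 0 and (mask & (mask - 1)) == 0

def check_tag_masks(tags):
    """Fail loudly if any TAGX entry uses a mask wider than one bit."""
    for tag, nvalues, mask, endflag in tags:
        if endflag:
            continue  # end-of-control-byte markers carry no mask
        assert is_single_bit(mask), "tag %d has multi-bit mask 0x%02x" % (tag, mask)

def masked_value(control_byte, mask):
    """General case: extract the masked bits, shifted down to bit 0."""
    value = control_byte & mask
    while mask and not mask & 1:
        mask >>= 1
        value >>= 1
    return value
```

For a single-bit mask `masked_value` returns 0 or 1, so it could replace the simple bit test without changing the current behaviour.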

Ciao,
Steffen
All times are GMT -4.