11-06-2013, 06:15 AM | #616 |
KCC Co-Author
Posts: 845
Karma: 765434
Join Date: Mar 2013
Location: Poland
Device: Kindle Oasis 2
|
There is a very large difference in execution time between extracting the M7 part and the KF8 part.
Creating MOBI7 from a 373 MB MOBI file (KindleGen output) takes 2.34200000763 sec. Creating MOBI8 from the same file takes 103.877000093 sec. The entire time difference is caused by the insertsectionrange function, specifically by these lines: Code:
datalst.append(secdata)
datalst.append(datain[secstart:])
Does anybody have an idea how to optimize that?
Last edited by AcidWeb; 11-06-2013 at 03:12 PM. |
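One possible reading of those timings, offered as a guess rather than a diagnosis: a slice such as `datain[secstart:]` copies everything after the insertion point, so if that happens once per inserted section, the total work grows with the number of sections times the file size. The sketch below is not KindleUnpack code. `insert_one_by_one` and `insert_batched` are hypothetical helpers that contrast per-section rebuilds with a single join, and both ignore the section offset table that a real MOBI container would also need updated.

```python
def insert_one_by_one(data, offset, sections):
    """Insert each section separately; every step re-copies the whole buffer."""
    pos = offset
    for sec in sections:
        # Two large slices plus a join for every single section.
        data = b"".join([data[:pos], sec, data[pos:]])
        pos += len(sec)
    return data


def insert_batched(data, offset, sections):
    """Insert all sections in one pass; the large buffer is sliced only once."""
    pieces = [data[:offset]]
    pieces.extend(sections)
    pieces.append(data[offset:])
    return b"".join(pieces)


if __name__ == "__main__":
    base = b"HEAD" + b"TAIL"
    images = [b"<img1>", b"<img2>", b"<img3>"]
    # Both strategies must produce identical bytes; only the cost differs.
    assert insert_one_by_one(base, 4, images) == insert_batched(base, 4, images)
    print(insert_batched(base, 4, images))
```

With a few hundred image sections in a file of roughly 200 MB, the first variant moves many gigabytes in total, while the second moves the data about once. Whether insertsectionrange actually behaves like the first variant would need to be checked against the real source.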
11-07-2013, 08:55 AM | #617 |
Sigil Developer
Posts: 7,657
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi,
I am sure there are ways to speed up anything, but comparing Mobi7 and Mobi8 extraction is like comparing apples to oranges. The Mobi 8 format is much, much more complicated: it splits the entire document into shells for each chapter and "fragments" of html, splits out css, splits out svg, etc., and these need to be re-assembled in their proper order before things like internal links can be dealt with.
If the final mobi file is generated from a well-designed epub, things should proceed reasonably quickly. If, however, you are feeding Kindlegen a Mobi7 file as input, you will be generating a worst-case situation where the entire document is one large single html file, not split into chapters and subsections as it would be inside an epub. This would slow extraction down to a great extent.
Also, a 373 MB Mobi file is enormous! I can't think of any text besides a dictionary of some sort (or a graphic novel?) that would require that much space. For more normal file sizes of under 10 MB, the processing time to extract a mobi should not be significant on any late-model computer. KevinH |
11-07-2013, 09:04 AM | #618 |
KCC Co-Author
Posts: 845
Karma: 765434
Join Date: Mar 2013
Location: Poland
Device: Kindle Oasis 2
|
Yes. We are talking about comics: up to 500 MB (in extreme cases) per file produced by KindleGen. There is no text at all inside, only images. The input is EPUB 2.0 with a separate html page for every comic page.
I would drop KindleGen with pleasure, but it is the only software on the planet that correctly makes comic-type MOBIs. I'm trying to optimize the last step of my application. I use a modified KindleUnpack to get the MOBI8 part, and I'm doing that only to change header field 501, as no software can edit both headers in a hybrid MOBI created by KindleGen. Everything works correctly, but the inefficiency of this solution makes me sick :-)
tl;dr I need a faster method to cut the graphics-filled MOBI8 out of a hybrid MOBI, or something that can edit both headers in a hybrid MOBI. Sadly, due to my poor Python skills I can't do it myself. I barely understand how KindleUnpack works :-P
Last edited by AcidWeb; 11-07-2013 at 09:20 AM. |
11-07-2013, 09:49 AM | #619 |
Grand Sorcerer
Posts: 27,553
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I hope you'll pardon me for saying so, but that sounds like a heck of a lot of work to do just to make your personal documents look like they're not personal documents.
|
11-07-2013, 09:53 AM | #620 |
KCC Co-Author
Posts: 845
Karma: 765434
Join Date: Mar 2013
Location: Poland
Device: Kindle Oasis 2
|
Yes. That is the price of perfection :-)
|
11-07-2013, 09:54 AM | #621 | |
The Grand Mouse 高貴的老鼠
Posts: 71,514
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
Quote:
If they are about the same size, it might be possible to do some optimisation. |
|
11-07-2013, 09:59 AM | #622 |
KCC Co-Author
Posts: 845
Karma: 765434
Join Date: Mar 2013
Location: Poland
Device: Kindle Oasis 2
|
Input: 391 131 728 bytes
MOBI8: 196 558 244 bytes
MOBI7: 196 136 667 bytes |
11-07-2013, 10:06 AM | #623 |
The Grand Mouse 高貴的老鼠
Posts: 71,514
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
|
11-08-2013, 08:29 AM | #624 |
Sigil Developer
Posts: 7,657
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Ah,
It is the split function that is taking all the time, not the unpacking. When splitting to get a mobi 7, you are basically removing the mobi 8 part at the end without touching the image/graphics sections in the mobi container at all. When splitting out a mobi 8, you have to remove the text of the mobi 7 section and then copy all of the graphical image sections and put them back into the right place to create a proper mobi 8, since all image/graphics sections are stored only in the old mobi 7 sections to prevent their duplication. So yes, I would expect to see much longer times for mobi 8s in this case. KevinH |
11-08-2013, 10:00 AM | #625 |
Grand Sorcerer
Posts: 27,553
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
In other words: the stand-alone mobi7 portion is created by simply lopping off that which is NOT mobi7 (so to speak) from the hybrid, whereas the stand-alone kf8 needs to be stitched together from pieces of both formats in the original hybrid (especially with regard to images--which only occur once in the hybrid format). Is that about right? So I guess it makes sense that in the case of a hybrid mobi that contains ONLY images (and lots of them), creating the stand-alone kf8 is going to take a lot more processing.
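To make the "lop off" versus "stitch together" distinction concrete, here is a rough sketch of how the sections of a hybrid file could be enumerated. It relies on the PalmDB container layout (section count at byte 76, then one 8-byte table entry per section starting at byte 78) and on the assumption, not stated in this thread, that the two halves are separated by a section whose payload is the literal bytes `BOUNDARY`. `build_container` is a hypothetical helper that only fabricates sample data; none of this is the mobi_split.py code.

```python
import struct

PDB_HEADER_LEN = 78   # fixed PalmDB header; the section count sits at offset 76
ENTRY_LEN = 8         # per-section table entry: 4-byte offset + 4 bytes flags/id


def section_offsets(data):
    """Return the start offset of every section in the container."""
    (count,) = struct.unpack_from(">H", data, 76)
    return [struct.unpack_from(">L", data, PDB_HEADER_LEN + ENTRY_LEN * i)[0]
            for i in range(count)]


def read_section(data, offsets, n):
    """Slice out section n; the last section runs to the end of the file."""
    end = offsets[n + 1] if n + 1 < len(offsets) else len(data)
    return data[offsets[n]:end]


def find_boundary(data):
    """Index of the first section equal to b'BOUNDARY' (assumed marker), or None."""
    offsets = section_offsets(data)
    for n in range(len(offsets)):
        if read_section(data, offsets, n) == b"BOUNDARY":
            return n
    return None


def build_container(sections):
    """Hypothetical helper: pack sections into a minimal PalmDB-like blob."""
    header = bytearray(PDB_HEADER_LEN)
    struct.pack_into(">H", header, 76, len(sections))
    table = bytearray()
    offset = PDB_HEADER_LEN + ENTRY_LEN * len(sections)
    for i, sec in enumerate(sections):
        table += struct.pack(">LL", offset, 2 * i)
        offset += len(sec)
    return bytes(header) + bytes(table) + b"".join(sections)


if __name__ == "__main__":
    sample = build_container([b"mobi7 header", b"mobi7 text", b"image 1",
                              b"image 2", b"BOUNDARY", b"kf8 header", b"kf8 text"])
    print("boundary section:", find_boundary(sample))
```

In this picture, a stand-alone mobi7 keeps everything before the boundary, while a stand-alone kf8 needs the sections after it plus the image sections that sit before it.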
Last edited by DiapDealer; 11-08-2013 at 01:27 PM. |
11-08-2013, 12:01 PM | #626 |
Sigil Developer
Posts: 7,657
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi Diapdealer,
Yes that is exactly the issue! I am not sure how or even if we can optimize the split mobi code to deal with this better. Perhaps we can do block copies or extract the entire image section in one piece or ... but a 196 meg set of images is big! Hmm... |
11-08-2013, 12:45 PM | #627 |
Sigil Developer
Posts: 7,657
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi,
I think in the end writing a program to allow changing of all metadata in dual mobis might be easier. Almost all of the code needed is in the split python file. I am sure a simple command line tool where you pass in the metadata number and new value could be hacked up from the mobi_split.py code. What do you think? Would something like that be useful in the general case? I always thought mobi perl handled that. KevinH |
11-08-2013, 12:56 PM | #628 |
KCC Co-Author
Posts: 845
Karma: 765434
Join Date: Mar 2013
Location: Poland
Device: Kindle Oasis 2
|
MOBIPerl handles only the MOBI7 header. There is no software that can edit the second set of headers in a hybrid MOBI.
Generally, a typical Kindle user touches the headers only to remove the Personal tag from an ebook. Most users use Calibre Quality Check to do it. Obviously, this tool can't edit the MOBI8 header in a hybrid MOBI either. Personally, my eyes shine at the thought of such a tool. |
11-08-2013, 01:50 PM | #629 |
Sigil Developer
Posts: 7,657
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi,
If only one metadata tag needs to be changed in each header of a dual mobi, it is guaranteed to already exist, and a simple replacement of the value for that metadata element is all that is needed, I am sure we can hack together a python script to do just that quite easily. So exactly which metadata item number do you want to change, and from what value to what value? KevinH |
11-08-2013, 01:55 PM | #630 |
KCC Co-Author
Posts: 845
Karma: 765434
Join Date: Mar 2013
Location: Poland
Device: Kindle Oasis 2
|
I only need to set field 501 to "EBOK", without the quotes.
If you are trying to make a more universal tool, it should also edit fields 113 and 504. Both fields store a fake ASIN that looks like this: 67168329-a721-4bc7-8a2a-351f88cb72e9
EDIT: In files created by KindleGen all three are empty.
Last edited by AcidWeb; 11-08-2013 at 02:05 PM. |
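For the simple case described above (the field already exists and the new value has the same length), an in-place overwrite could look roughly like the sketch below. It assumes the usual EXTH layout: the tag `EXTH`, a 4-byte block length, a 4-byte record count, then records made of a 4-byte id, a 4-byte length that includes the 8-byte record header, and the payload. `set_exth_value` and `make_exth` are hypothetical names, and the value `XXXX` is only a placeholder. An empty or missing field would change the record length and is not handled here.

```python
import struct


def set_exth_value(header, exth_id, new_value):
    """Overwrite one EXTH record payload without changing any lengths."""
    # Simplification: a real tool should locate EXTH from the MOBI header
    # length instead of searching for the tag bytes.
    start = header.find(b"EXTH")
    if start < 0:
        raise ValueError("no EXTH block found")
    _block_len, count = struct.unpack_from(">LL", header, start + 4)
    out = bytearray(header)
    pos = start + 12
    for _ in range(count):
        rec_id, rec_len = struct.unpack_from(">LL", header, pos)
        if rec_id == exth_id:
            if rec_len - 8 != len(new_value):
                raise ValueError("length differs; offsets would need rewriting")
            out[pos + 8:pos + rec_len] = new_value
            return bytes(out)
        pos += rec_len
    raise KeyError(exth_id)


def make_exth(records):
    """Hypothetical helper: build an unpadded EXTH block from (id, bytes) pairs."""
    body = b"".join(struct.pack(">LL", rid, 8 + len(val)) + val
                    for rid, val in records)
    return b"EXTH" + struct.pack(">LL", 12 + len(body), len(records)) + body


if __name__ == "__main__":
    block = make_exth([(100, b"Some Author"), (501, b"XXXX")])
    patched = set_exth_value(block, 501, b"EBOK")
    assert len(patched) == len(block)
    print(patched[-4:])
```

In a hybrid file the same call would have to be applied twice, once to the MOBI7 header section and once to the KF8 header section, because each carries its own EXTH block.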
|