Hi Steffen,
Quote:
Originally Posted by siebert
I never tried that, but as it is just some GUI calling the actual mobiunpack.py for the unpacking, it should work if you make sure that it uses my mobiunpack.py instead of the delivered one; otherwise you won't get dictionary support or the speed optimization for huge files.
Steffen
I just wanted to say very nice job with your new mobiunpack.py version!
I diffed your speedup changes against the original and all looks great except for one thing: why did you remove the imghdr code that detects the proper image type and creates the file with the proper extension? I, for one, want all of my file extensions to match the actual contents of the file, because not every program ignores the extension when working with files. Are you using fake "image" files to store extra sections (non-html, non-image) from the original mobi file? Perhaps index information from the dictionaries?
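Just so we're talking about the same thing, here is roughly the kind of check I mean. This is a from-memory sketch, not the exact code from the original mobiunpack.py, and the function name get_image_extension is my own:

Code:
import imghdr

def get_image_extension(data):
    # ask imghdr to sniff the actual image format from the raw bytes
    imgtype = imghdr.what(None, data)
    if imgtype is None:
        # not a recognizable image; caller can fall back to a generic name
        return None
    # normalize imghdr's "jpeg" answer to the usual ".jpg" extension
    if imgtype == 'jpeg':
        return '.jpg'
    return '.' + imgtype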
Also, it would be nice to gather all of the string concats and file writes into one function that takes the "big-file" flag and the new data and handles it appropriately, just to make the code look cleaner. Something like the sketch below is what I have in mind.
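Purely as a sketch (the class name and flag name here are made up, not taken from your code):

Code:
class TextAccumulator:
    """Collects text pieces; streams straight to disk when big_file is set."""

    def __init__(self, outfile, big_file=False):
        self.outfile = outfile
        self.big_file = big_file
        self.pieces = []

    def add(self, data):
        if self.big_file:
            # huge files: write each piece to disk immediately
            self.outfile.write(data)
        else:
            # normal files: keep pieces in memory, join once at the end
            self.pieces.append(data)

    def finish(self):
        if not self.big_file:
            self.outfile.write(''.join(self.pieces))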
That said, I find it hard to believe that even a 26 MB mobi file fills up memory on today's multi-GB machines. It may simply be that the string concatenation needs to be replaced by appending the string pieces to a list and doing a "".join(list) at the end. That would prevent the creation of multiple copies of the 26 MB string, which is what must be filling up memory, perhaps because garbage collection is not aggressive enough to reclaim and reuse it in a timely fashion. At least older versions of Python used to recommend that approach for heavy string concats.
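In other words, something along these lines (the variable names are hypothetical, just to illustrate the join idiom):

Code:
# stand-in for the decompressed record data
sections = ['chunk1 ', 'chunk2 ', 'chunk3']

# rather than:
#   rawtext = ''
#   for data in sections:
#       rawtext += data        # re-copies the whole growing string each time
# do:
pieces = []
for data in sections:
    pieces.append(data)        # cheap append, no copying of the accumulated text
rawtext = ''.join(pieces)      # single concatenation at the very end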
Thanks again for all of your work on it.
Take care,
KevinH