10-26-2012, 04:23 PM   #428
DiapDealer
Grand Sorcerer
Some proposed changes to MobiUnpack for perusal and opinions. Very minor.

Changes to mobi_opf.py: escaping even more metadata so that invalid OPFs are avoided. I honestly thought I had the bases covered when I escaped the contents of the two "handler" methods, but metadata like "Updated Title" was still falling through the cracks when it contained '&' and similar characters. I escaped the "Subject" tag contents for good measure as well.
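For anyone wondering what the escaping amounts to, here is a minimal sketch. It is not the actual mobi_opf.py code: it assumes the standard library's xml.sax.saxutils.escape is a reasonable way to do the escaping, and metadata_element is a hypothetical helper name made up for illustration.

```python
from xml.sax.saxutils import escape


def metadata_element(tag, value):
    """Build one OPF metadata element with its text content XML-escaped.

    Hypothetical helper for illustration only; not taken from mobi_opf.py.
    """
    # escape() replaces '&', '<' and '>' with their XML entities, so a
    # title or subject containing them no longer produces an invalid OPF.
    return "<%s>%s</%s>" % (tag, escape(value), tag)


print(metadata_element("dc:subject", "Science & Nature"))
print(metadata_element("dc:title", "Tom & Jerry <Updated>"))
```

Without the escape() call, the bare '&' in the first example would end up in the OPF as-is, and an XML parser would reject the file.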

Changes to mobi_unpack.py: this one's frankly because there's a bug in my plugin that I flat-out can't fix without a change to the underlying MobiUnpack code. I feel a little sleazy requesting the change, but in all honesty, the bug could affect any programs that may want to import some of Mobi_Unpack's code as modules in the future (or at least that's what I'm telling myself).

Anyway... Python's unicode function will blow up (with a TypeError) if you ask it to decode a string that's already unicode, i.e. when you call it with an encoding argument. And since calibre's strings are unicode by default, passing a unicode filepath string to MobiUnpack's unpackBook method will blow up later, when an attempt is made to "re"-unicode the basename portion of that string. I've added a simple check to determine whether or not the string is already unicode before further processing. Without this change, no unicode filepath strings can be passed to the main method. I have to convert them to normal byte strings first, which of course means that no non-ASCII characters can be part of the file's pathname. Unless I'm missing something obvious, which is always quite probable.
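The check described above can be sketched roughly like this. It is an illustration under assumptions, not the code in mobi_unpack.py: basename_as_text and text_type are hypothetical names, and falling back to the filesystem encoding is my guess at a sensible default. The try/except is only there so the same sketch runs on both Python 2 and Python 3.

```python
import os
import sys

try:
    text_type = unicode  # Python 2: the separate unicode type
except NameError:
    text_type = str  # Python 3: str is already unicode


def basename_as_text(path, encoding=None):
    """Return the basename of path as unicode text.

    Hypothetical helper for illustration; not the actual unpackBook code.
    """
    name = os.path.basename(path)
    if isinstance(name, text_type):
        # Already unicode: decoding it again is what raises the TypeError.
        return name
    # Byte string: decode it exactly once.
    # Assumption: the filesystem encoding is the right default here.
    return name.decode(encoding or sys.getfilesystemencoding() or "utf-8")


print(repr(basename_as_text(u"/books/caf\u00e9.azw3")))
print(repr(basename_as_text(b"/books/caf\xc3\xa9.azw3", "utf-8")))
```

With a guard like this, a caller such as a calibre plugin can hand over a unicode path directly, and a byte-string path still gets decoded as before.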

I also have a weird test-case azw3 that brings up a fantastically convoluted encoding issue, but I may need to ponder that a bit more yet.

EDIT: To those who downloaded the very first zip, I apologize. I made a mistake and uploaded the corrected one as soon as possible, but a couple of people beat me to it. Download again for the latest (and what I think is the correct) version.
Attached Files
File Type: zip Mobi_Unpack_v056.zip (45.7 KB, 222 views)

Last edited by DiapDealer; 10-26-2012 at 05:30 PM.