Old 11-03-2012, 03:43 PM   #436
NiLuJe
BLAM!
Posts: 13,477
Karma: 26012492
Join Date: Jun 2010
Location: Paris, France
Device: Kindle 2i, 3g, 4, 5w, PW, PW2, PW5; Kobo H2O, Forma, Elipsa, Sage, C2E
Just a quick heads up: I'm using a trimmed-down version of MobiUnpack in the latest K5 ScreenSavers hack. (I say trimmed down because I only needed to extract the cover, so I chopped off everything I didn't need.)

It works surprisingly well so far (after a painful cross-compile of Python 2.7.3 >_<"). The only thing of note I ran into was a MemoryError in loadSection() on the last section.
I looked at how Calibre was doing it, and saw that it wraps the read in a try/except block to catch OverflowError exceptions (and, indeed, a bit of good old printf debugging suggests that after overflows on the last section).
I tweaked that a bit:

Code:
@@ -267,7 +67,12 @@
     def loadSection(self, section):
         before, after = self.sections[section:section+2]
         self.stream.seek(before)
-        return self.stream.read(after - before)
+        try:
+            return self.stream.read(after - before)
+        # This bombs out with a MemoryError on Kindle on the last section (where after overflows)
+        except (OverflowError, MemoryError):
+            self.stream.seek(before)
+            return self.stream.read()
And it does the job, but I was wondering if there wasn't a cleaner way to achieve that... (I feel a bit dirty catching a MemoryError exception...).

Last edited by NiLuJe; 11-03-2012 at 03:58 PM.
Old 11-03-2012, 04:54 PM   #437
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 71,406
Karma: 305065800
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
Quote:
Originally Posted by NiLuJe View Post
Just a quick heads up: I'm using a trimmed down version of MobiUnpack in the latest K5 ScreenSavers hack . (I say trimmed down, because I only needed to extract the cover, so I chopped off everything I didn't need ).

It works surprisingly well (after a painful cross-compile of Python 2.7.3 >_<") so far, the only thing of notice I ran into was a MemoryError on the loadSection() of the last section.
I looked at how Calibre was doing it, and saw that it wrapped it in a try/except block to catch OverflowError exceptions (and, indeed, a bit of good old printf debugging seems to point out that after looks like an overflow on the last section).
I tweaked that a bit:

Code:
@@ -267,7 +67,12 @@
     def loadSection(self, section):
         before, after = self.sections[section:section+2]
         self.stream.seek(before)
-        return self.stream.read(after - before)
+        try:
+            return self.stream.read(after - before)
+        # This bombs out with a MemoryError on Kindle on the last section (where after overflows)
+        except (OverflowError, MemoryError):
+            self.stream.seek(before)
+            return self.stream.read()
And it does the job, but I was wondering if there wasn't a cleaner way to achieve that... (I feel a bit dirty catching a MemoryError exception...).
It's trying to do something clever to be usable with both files and streams, and so didn't get the file length. The version I'm currently working on abandons the attempt to work with streams (since it seems completely unnecessary), and so should eliminate this infelicity.
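A minimal sketch of the file-length approach, assuming the section offsets have already been parsed into a list. The SectionReader class and its names are hypothetical and are not MobiUnpack's actual code. The idea is just to append the real end-of-file position as the final boundary, so the read of the last section needs no try/except:

```python
import io


class SectionReader(object):
    """Hypothetical reader for illustration, not MobiUnpack's real class."""

    def __init__(self, stream, offsets):
        self.stream = stream
        # Measure the file once so the last section has a real end boundary.
        stream.seek(0, io.SEEK_END)
        file_length = stream.tell()
        self.sections = list(offsets) + [file_length]

    def load_section(self, section):
        # Each section runs from its own offset up to the next boundary.
        before, after = self.sections[section:section + 2]
        self.stream.seek(before)
        return self.stream.read(after - before)


# Three made-up sections packed into an in-memory "file".
data = io.BytesIO(b"AAAABBBBBBCC")
reader = SectionReader(data, [0, 4, 10])
last = reader.load_section(2)
```

This only works for seekable files, which fits with dropping stream support.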
Old 11-03-2012, 05:06 PM   #438
NiLuJe
BLAM!
Posts: 13,477
Karma: 26012492
Join Date: Jun 2010
Location: Paris, France
Device: Kindle 2i, 3g, 4, 5w, PW, PW2, PW5; Kobo H2O, Forma, Elipsa, Sage, C2E
@pdurrant: Thanks for the explanation (and good luck )!
Old 11-06-2012, 12:26 PM   #439
KevinH
Sigil Developer
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
Quote:
Originally Posted by pdurrant View Post
It's trying to do something clever to be usable with both files and streams, and so didn't get the file length. The version I'm currently working on abandons the attempt to work with streams (since it seems completely unnecessary), and so should eliminate this infelicity.
Hi Paul,

That stream interface was only added to allow MobiUnpack to work inside Calibre before Calibre fully supported KF8-style ebooks. It needed to handle both interfaces because what Calibre handed to MobiUnpack might be a stream or a file, depending on the circumstances.

So feel free to take it out, as Calibre no longer makes internal use of MobiUnpack. You might then need to add something to the plugin interface code so that DiapDealer's MobiUnpack plugin continues to work, if you run into any problems.

Hope this helps explain why it was there.

Take care,

KevinH
Old 11-06-2012, 12:48 PM   #440
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 71,406
Karma: 305065800
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
Quote:
Originally Posted by KevinH View Post
That stream interface was only added to allow MobiUnpack to work inside Calibre before Calibre fully supported KF8-style ebooks.
That's a relief!
Old 11-07-2012, 01:55 PM   #441
hockpa2e
Junior Member
Posts: 2
Karma: 10
Join Date: Nov 2012
Location: New Jersey, USA
Device: Kindle Keyboard 3G
Hi, a DeDRM'd mobi7 that I unpacked has corrupted index entries in the HTML. The values in the idx:orth tags are byte sausage, with illegal control characters and everything:

<idx:orth value="^Ch^H\^H_">

(This is how emacs shows control characters.) The same thing happens whether I use the latest or older versions of mobiunpack.

All the other data in the file seems fine. The charset is utf-8.

Any idea what could be causing this? Thanks.
Old 11-07-2012, 10:38 PM   #442
DiapDealer
Grand Sorcerer
Posts: 27,463
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Support for dictionary type MOBIs (or anything with extensive use of idx:orth) has always been quite limited and very unreliable. While the text is fine (as you've discovered), the actual dictionary functionality is often broken when trying to rebuild the source.
Old 11-08-2012, 06:54 AM   #443
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 71,406
Karma: 305065800
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
I have uploaded MobiUnpack 0.59.

The main changes have been in the debug/dump code, which now identifies and dumps (in one form or another) every section in the file, as well as providing much more info on the Mobi headers and EXTH, hopefully replicating the functionality of DumpMobiHeader in a reasonably nicely formatted way.

There's still a lot to be done to make it really neat and tidy, but that will have to wait for another day.
Old 11-08-2012, 06:51 PM   #444
NiLuJe
BLAM!
Posts: 13,477
Karma: 26012492
Join Date: Jun 2010
Location: Paris, France
Device: Kindle 2i, 3g, 4, 5w, PW, PW2, PW5; Kobo H2O, Forma, Elipsa, Sage, C2E
@pdurrant: And indeed, it now works properly without an ugly hack on the Kindle, thanks!
Old 11-15-2012, 04:01 PM   #445
DiapDealer
Grand Sorcerer
Posts: 27,463
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I'm starting to get the idea that I'm chasing my own tail with regard to ensuring compliant OPF files.

I thought the escape method from the standard xml.sax library was working quite well on metadata items. And it is, in fact, converting all instances of '&', '<' and '>' to XML-compliant entities, as intended. But I'm discovering that a lot of metadata out there (especially KF8 subjects/descriptions) seems to contain HTML entities. This, by itself, wouldn't pose a problem. The problem is that my XML escape method is dutifully whacking all the ampersands in those poor defenseless entities and turning them into gibberish, basically.

So in one more attempt to overthink a process... enter the criminally underutilized (not to mention unsung) "unescape" method of Python's HTMLParser module. The unescape method first converts all entities that may be present in the data to their unicode character representations (OPF files are utf-8/16 by spec, after all). Only then does the XML escape method fix up any stray ampersands and/or left/right angle brackets.
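A minimal sketch of that two-step order, not the code in the attached mobi_opf.py. It assumes Python 3, where the same behaviour is exposed as html.unescape (under the Python 2.7 used in this thread it was HTMLParser.HTMLParser().unescape). The clean_metadata name is made up for illustration:

```python
from html import unescape
from xml.sax.saxutils import escape


def clean_metadata(value):
    """Hypothetical helper: decode HTML entities first, then XML-escape."""
    # Step 1: turn entities such as &amp; or &eacute; into real characters.
    decoded = unescape(value)
    # Step 2: escape only the characters XML cares about (&, <, >).
    return escape(decoded)


# Escaping alone double-encodes an existing entity.
naive = escape("Fish &amp; Chips")
# Unescaping first leaves it intact.
fixed = clean_metadata("Fish &amp; Chips")
```

Here naive ends up with the ampersand of the entity escaped a second time, while fixed comes back as a single, valid &amp; entity.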

All this rambling means that I have an updated mobi_opf.py script for you to consider, pdurrant.
Attached Files
File Type: zip mobi_opf.zip (3.2 KB, 189 views)
Old 11-15-2012, 05:16 PM   #446
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 71,406
Karma: 305065800
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
Thanks! I'll take a look as soon as I can.
Old 11-25-2012, 11:53 AM   #447
miquele
Connoisseur
Posts: 75
Karma: 498122
Join Date: May 2010
Location: Europe
Device: Bookeen Cybook Gen3, Kindle 3, Kindle PW, Kindle Voyage
Hello adamselene,
I'm trying to change the incorrectly set language code of a dictionary in order to make it work on the Paperwhite. I followed your script, which runs. However, I get the following error: "Error: Dictionary contains multiple inflection index sections, which is not yet supported".
I assume this breaks the process? Do you know of another way to change the in/out language of a .mobi file?
Thanks a lot!
Attached Thumbnails: output.png (14.8 KB)

Last edited by miquele; 11-25-2012 at 12:10 PM. Reason: attachment
Old 11-25-2012, 05:23 PM   #448
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 71,406
Karma: 305065800
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
Quote:
Originally Posted by miquele View Post
Hello adamselene,
I'm trying to change the incorrectly set language code of a dictionary in order to make it work on the Paperwhite. I followed your script, which runs. However, I get the following error: "Error: Dictionary contains multiple inflection index sections, which is not yet supported".
I assume this breaks the process? Do you know of another way to change the in/out language of a .mobi file?
Thanks a lot!
Hmm... I don't know a lot about dictionaries, but I'd suspect that the language codes are in the header, and so could be changed without decompiling/recompiling. But I don't know of a tool to do that.
Old 12-10-2012, 06:08 PM   #449
nleblanc88
Junior Member
Posts: 4
Karma: 10
Join Date: Jul 2012
Device: None
I'd like to contribute v060 if I could. What this version fixes:

--

Encoding chapter names in UTF-8. This prevents NCX and OPF files from being written in non-UTF-8 encodings.

--

In my tests, chapter names with UTF-8 characters were not being written properly to the resulting .NCX file. This caused the file charset to be detected as "unknown-8bit", and trying to parse these files would result in errors.

This patch fixes this issue. I've attached the source.
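For anyone curious what this kind of fix looks like, here is a minimal sketch. It is not the attached patch: the write_nav_points function, the file name and the simplified NCX fragment are all made up for illustration. The point is only that the file is opened with an explicit UTF-8 encoding and that titles are handled as unicode text:

```python
import io
import os
import tempfile
from xml.sax.saxutils import escape


def write_nav_points(path, chapter_titles):
    """Hypothetical helper: write a simplified NCX fragment as UTF-8."""
    # An explicit encoding avoids falling back to a platform default.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(u'<?xml version="1.0" encoding="utf-8"?>\n<navMap>\n')
        for title in chapter_titles:
            # Titles are unicode text; escape &, < and > for XML.
            f.write(u"  <navPoint><navLabel><text>%s</text></navLabel></navPoint>\n"
                    % escape(title))
        f.write(u"</navMap>\n")


out_path = os.path.join(tempfile.mkdtemp(), "toc_fragment.ncx")
write_nav_points(out_path, [u"Pr\u00e9face", u"Tom & Jerry"])
with io.open(out_path, "rb") as f:
    raw = f.read()
```

The bytes in raw decode cleanly as UTF-8, so a parser that trusts the XML declaration should not trip over accented chapter names.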

--

I'd also like to bring up the idea of setting up a git repository for this project (bitbucket.com or github.com). I'd love to keep contributing to this project, and I think this would not only make it easier for me and others to do so, but also help the author keep track of all versions. I'd be willing to set this up if anybody would like.
Attached Files
File Type: zip Mobi_Unpack_v060.zip (81.8 KB, 213 views)
Old 12-11-2012, 09:16 AM   #450
KevinH
Sigil Developer
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
your changes

Hi,

Could you post a diff of your proposed changes? I have modified my tree with a number of other fixes (Amazon Page Break, fixes for div tables that have broken insert positions, fixes for ncx with broken insert positions, fixes for non-existent links to css files, fixes for hangs in debug mode, fixes for not properly describing CTOC sections in the section description output, the addition of DiapDealer's opf output fixes for metadata that incorporates html tags, etc.).

As for hosting this project, we already have a Google Code project, but it went pretty much unused after a short while and development seemed to continue only here.

I am planning a major clean-up of the code over the holiday break to hopefully simplify things and fix the little nits. Once we have all of the patches and clean-up done, perhaps trying once again to create a shared repository might be a good idea.

Thanks,

KevinH