08-06-2011, 04:51 AM   #116
siebert
Developer
siebert has a complete set of Star Wars action figures.
 
Posts: 155
Karma: 280
Join Date: Nov 2010
Device: Kindle 3 (Keyboard) 3G / iPad 9 WiFi / Google Pixel 6a (Android)
Quote:
Originally Posted by karunaji
How hard it would be to figure this out? I tried to look at the format description but I probably won't have that much time to figure it out.
I think you'll learn the needed things better from the code than from the format description.

I don't know how fluent your Python is, but supporting multiple index sections shouldn't be very hard. The most important thing you have to figure out is whether the additional sections use their own tag tables or not. If they do, you have to hold multiple tag tables in addition to multiple sections, but that's also doable, just a bit more code.
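To make the "multiple tag tables" case a bit more concrete, here is a minimal sketch of one way to keep a tag table next to each index section. Everything in it is an assumption for illustration: the names IndexSection and tag_table_for are hypothetical, the tuple layout of a tag table entry is a guess, and this is not how mobiunpack is actually structured.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical layout of one tag table entry (an assumption, not taken
# from the format description): (tag, values_per_entry, mask, end_flag)
TagEntry = Tuple[int, int, int, int]


@dataclass
class IndexSection:
    """One index section plus the tag table it declares, if any."""
    data: bytes
    tag_table: Optional[List[TagEntry]] = None


def tag_table_for(sections: List[IndexSection], index: int) -> List[TagEntry]:
    """Return the tag table to use when decoding sections[index].

    A section that has its own table uses it. Otherwise the nearest
    earlier section that has one is used, which covers the case where
    all sections share a single table.
    """
    for section in reversed(sections[: index + 1]):
        if section.tag_table is not None:
            return section.tag_table
    raise ValueError("no tag table found for section %d" % index)


sections = [
    IndexSection(b"first", [(1, 1, 1, 0)]),
    IndexSection(b"second"),                  # shares the first table
    IndexSection(b"third", [(2, 1, 3, 0)]),   # brings its own table
]
```

If it turns out that every section uses the same table, the fallback branch handles that without any extra code, so the same structure works for both outcomes.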

You can run mobiunpack with WRITE_RAW_DATA set to True to dump the index sections as individual files and use a hex editor to analyse the data.
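If no hex editor is at hand, a few lines of Python can print a similar view of a dumped section. This is only a sketch: hex_dump is a hypothetical helper, not part of mobiunpack, and the sample bytes below are made up rather than taken from a real dictionary.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values and printable ASCII per line."""
    pad = width * 3 - 1  # width of a full row of "xx " groups
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("%02x" % b for b in chunk)
        # Show printable ASCII as is, everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append("%08x  %s  %s" % (offset, hex_part.ljust(pad), text_part))
    return "\n".join(lines)


# Made-up sample data, only to show the output layout.
sample = b"INDX\x00\x01"
print(hex_dump(sample))
```

To look at a real dump, read the file with open(path, "rb").read() and pass the result to hex_dump, where path is whatever file name the raw data was written to.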

But I'm afraid that in your case (German-English dictionary) you will also run into an issue with the inflection rules, as the current implementation doesn't properly handle words with special characters such as German umlauts. That might be just a simple text encoding issue, but it could also be something that needs additional reverse engineering effort.
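To illustrate what a "simple text encoding issue" could look like, the sketch below shows two generic ways an umlaut can go wrong: decoding with the wrong codec, and counting in bytes instead of characters. It is an assumption that either of these is the actual cause here; the word and the codec pair (UTF-8 vs. cp1252) are chosen only for the demonstration.

```python
word = "Häuser"

# UTF-8 stores "ä" as two bytes, so the byte length differs from the
# character length.
raw = word.encode("utf-8")
char_count = len(word)  # 6 characters
byte_count = len(raw)   # 7 bytes

# Decoding UTF-8 bytes with a single-byte codec splits the umlaut
# into two unrelated characters.
misdecoded = raw.decode("cp1252")

# A rule like "drop the last 2 characters" gives different results
# depending on whether it is applied to characters or to bytes.
by_chars = word[:-2]
by_bytes = raw[:-2].decode("utf-8")

# Cutting at a byte offset in the middle of the umlaut leaves an
# incomplete sequence that cannot be decoded strictly.
truncated = raw[:2]  # b"H" plus the first byte of "ä"
broken = truncated.decode("utf-8", errors="replace")

print(misdecoded, by_chars, by_bytes, repr(broken))
```

For pure ASCII words all of these variants agree, which would explain why such a problem stays hidden in English dictionaries.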

Quote:
I want it because I have bought PONS DE>EN dictionary and I use it to practice reading German texts. This dictionary provides word pronunciation in square brackets [] but in Kindle pop-up window it appears empty. I have to press ENTER and it's rather inconvenient. I decoded the dictionary and I can see that the pronunciation is composed of small images. But Kindle font actually contains extended IPA characters so it should be trivial to replace the images to chars and repack the dictionary.
Do I understand correctly that the popup window on the Kindle device doesn't support images, so you can't read the pronunciation?

In the Kindle app the pronunciation is displayed fine, but I had to reformat my dictionary because the pronunciation and other information take up so much space that the popup window doesn't show the actual translation for most words. So I removed the pronunciation and unnecessary whitespace from the formatting to get a usable dictionary for the Kindle app.
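A cleanup of this kind can be sketched with two regular expressions. This assumes the pronunciation is plain text inside square brackets and that the entry is a simple string; a real dictionary entry is HTML-like markup, so the patterns would have to be adapted. The function name strip_pronunciation and the sample entry are hypothetical.

```python
import re


def strip_pronunciation(entry: str) -> str:
    """Remove [bracketed] pronunciation and collapse extra whitespace."""
    # Drop everything from "[" up to the next "]".
    without_brackets = re.sub(r"\[[^\]]*\]", "", entry)
    # Collapse runs of spaces, tabs and newlines into a single space.
    return re.sub(r"\s+", " ", without_brackets).strip()


# Made-up entry, only to show the effect.
entry = "Haus  [haʊs]\n  n  house"
print(strip_pronunciation(entry))
```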

Quote:
I just tried to decode another dictionary DE>RU and it also contains multiple inflection index sections.
The German language seems to require many more inflection rules than, for example, English. So I would assume that most if not all German-to-whatever dictionaries will contain multiple inflection index sections.

If you want to test more dictionaries, mobipocket.com provides free sample downloads for (all?) dictionaries they sell. The samples are also DRM-free, so you can just run mobiunpack on them.

Ciao,
Steffen