09-01-2014, 04:36 PM | #961 |
Sigil Developer
Posts: 8,099
Karma: 5450184
Join Date: Nov 2009
Device: many
|
Hi Doitsu,
It is not syntax related. It is related to how many different inflection rules are needed. In languages that make heavy use of prefixes and suffixes, the number of rules is so large that a single mobi section cannot hold them all, so multiple mobi sections are needed. I will need access to one of the broken dictionaries to figure this out. Same thing for Japanese. PM me so we can work out how to arrange these test cases. I should be back in 3 days or so. Take care, Kevin |
09-06-2014, 05:23 AM | #962 |
Zealot
Posts: 128
Karma: 500
Join Date: Aug 2011
Device: kindle, boox
|
I've been testing the code you posted with some dictionaries.
I have one strange case: before the source code changes the dictionary unpacked, but the orth values looked as if they were encrypted. The structure is also odd: the definition goes after the </idx:entry>. With the new code the dictionary fails to unpack. Before, for the word "love", it gave: Code:
<idx:entry> <idx:orth value="owh"> </idx:entry> <h2><b>love </b>
It fails to unpack with the new code: Code:
Parsing dictionary index data 26074
ocnt 0, oentries 0, op1 0, op2 0, otagx 0
parsed INDX header:
len C0 nul1 0 type 1 gen 0 start E35C count C4C code FFFFFFFF lng FFFFFFFF total 0 ordt 0 ligt 0 nligt 0 nctoc 0
{'count': 3148, 'nctoc': 0, 'code': 4294967295L, 'nul1': 0, 'len': 192, 'ligt': 0, 'start': 58204, 'nligt': 0, 'ordt': 0, 'lng': 4294967295L, 'total': 0, 'type': 1, 'gen': 0}
None None
Error: unpack requires a string argument of length 0
Error: Unpacking Failed
Testing with a big dictionary, it fails both before and after, probably because of too many references or something similar. With another one, it works before and after, but without inflected forms. It has multiple INDX sections, and it also has a strange structure like the first dictionary: Code:
<idx:entry> <idx:orth value="love"> </idx:entry> <div><a id="filepos62042024" />
I also tested with the free wordnet3 English-Spanish dictionary. It worked fine before and after, with inflections and structure OK, but I suppose this one was generated with Mobipocket. |
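The "unpack requires a string argument of length 0" message in the log above is the wording Python 2.7's struct module uses when the data passed to struct.unpack does not match the size implied by the format string. One way such a message can arise, and this is only a guess about the failing code path, is a format built from a count of zero (the log shows ocnt 0 and oentries 0) being applied to non-empty data. A minimal sketch with a size check added; the helper names unpack_u16_array and describe_mismatch are hypothetical and not part of KindleUnpack:

```python
import struct


def unpack_u16_array(data, count):
    """Unpack `count` big-endian unsigned 16-bit values from `data`.

    Raises ValueError with an explicit message when the sizes disagree,
    instead of letting struct.unpack raise its terser struct.error.
    """
    fmt = ">%dH" % count
    expected = struct.calcsize(fmt)
    if len(data) != expected:
        raise ValueError(
            "format %r needs %d bytes, got %d" % (fmt, expected, len(data))
        )
    return struct.unpack(fmt, data)


def describe_mismatch(data, count):
    """Return the struct.error text of an unchecked unpack, or None if it succeeds."""
    try:
        struct.unpack(">%dH" % count, data)
    except struct.error as exc:
        return str(exc)
    return None


# A count of zero gives a zero-size format, so any non-empty data is rejected.
message = describe_mismatch(b"\x00\x01", 0)
```

The exact wording of the message differs between Python 2 and Python 3, so the sketch only captures it rather than asserting a particular text.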
09-06-2014, 05:53 AM | #963 | |
Grand Sorcerer
Posts: 5,635
Karma: 23191067
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
(I had problems with some files because I didn't indent the code correctly.) BTW, English inflections are freely available as part of Kevin Atkinson's AGID project. |
|
09-06-2014, 04:10 PM | #964 |
Sigil Developer
Posts: 8,099
Karma: 5450184
Join Date: Nov 2009
Device: many
|
Hi,
If you work with dictionaries, please unzip the attached mobi_dict.py.zip file, use it to replace its namesake in KindleUnpack_v073/lib/, and let me know of any successes and failures. I had a hunch but have too small a set of dictionaries to know whether it is correct. If it is, we may have these things fully unpacking. If not, I am stuck, because there are two types of ORDT tables and my hunch as to how to decide which to use will have been wrong. I have my fingers crossed. FWIW, this version seems to work with all dictionaries I have ... German, French, Sven, Collins, American English, SampleDict, ja_dict, Liddell, etc. and even seems to work with multiple inflection sections (though I am not sure whether correctly, as I have no source for most of those). I hope this does the trick ... KevinH |
09-07-2014, 04:28 AM | #965 |
Zealot
Posts: 128
Karma: 500
Join Date: Aug 2011
Device: kindle, boox
|
I've tested:
Case 1: Encrypted values. Now works fine, and with inflections too; before this new code it unpacked without them. Wow!
Case 2: Huge dictionary or something similar. Fails with the same error (I'll send it to you so you can figure out what's happening). Code:
{'count': 2316, 'nctoc': 0, 'code': 4294967295L, 'nul1': 0, 'oentries': 0, 'len': 192, 'ligt': 0, 'start': 43964, 'otype': 0, 'nligt': 0, 'ordt': 0, 'lng': 4294967295L, 'total': 0, 'type': 1, 'gen': 0}
None None
Find link anchors
Insert data into html
Insert hrefs into html
Remove empty anchors from html
Insert image references into html
Error:
Error: Unpacking Failed
Case 4: Works fine.
Final result: except for the dictionary that failed, it now works well with all of them, in one case without inflected forms. |
09-07-2014, 07:07 AM | #966 | |
Grand Sorcerer
Posts: 5,635
Karma: 23191067
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
For good measure I've also tested it with the default Kindle app dictionaries. It worked with most of them, except for the Spanish dictionary (B005F12G7O_EBOK.azw) and the two Chinese dictionaries (B00AZOHEFU_EBOK.azw & B00AZOHEGE_EBOK.azw). For some odd reason KindleUnpack apparently assumed that the Chinese dictionaries are mobi files with attached source files, because it tried to extract a build log. Code:
File contains kindlegen build log, extracting as kindlegenbuild.log
Unpacking raw markup language
Write ncx
Info: Document contains orthographic index, handle as dictionary
Error:
Error: Unpacking Failed
Code:
Info: Document contains orthographic index, handle as dictionary
Error: Dictionary contains multiple inflection index sections, which is not yet supported
inflectionTagTable: [(5, 1, 3, 0), (26, 1, 12, 0), (27, 1, 48, 0), (0, 0, 0, 1)]
Error:
Error: Unpacking Failed
I used the wrong version for the test. Last edited by Doitsu; 09-07-2014 at 10:08 AM. |
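For readers wondering what the 4-tuples in the inflectionTagTable line might represent, here is a rough sketch of decoding a TAGX-style tag table into such tuples. The byte layout (a "TAGX" marker, a 4-byte block length, a 4-byte control byte count, then 4-byte entries) and the field names tag, values_per_entry, mask and end_flag are assumptions based on how MOBI index tag tables are commonly described; nothing in this thread states them. The input buffer is synthetic, rebuilt from the logged tuples, and the control byte count of 1 is made up for the example.

```python
import struct
from collections import namedtuple

# Field names are an assumption; the log only shows raw 4-tuples.
TagEntry = namedtuple("TagEntry", "tag values_per_entry mask end_flag")


def read_tag_section(data, start=0):
    """Parse a TAGX block into (control_byte_count, [TagEntry, ...]).

    Returns (0, []) when no TAGX marker is found at `start`.
    """
    if data[start:start + 4] != b"TAGX":
        return 0, []
    # Two big-endian 32-bit values follow the marker: the length of the
    # whole block and the number of control bytes per index entry.
    block_length, control_byte_count = struct.unpack_from(">LL", data, start + 4)
    tags = []
    for pos in range(start + 12, start + block_length, 4):
        tags.append(TagEntry(*struct.unpack_from(">4B", data, pos)))
    return control_byte_count, tags


# Synthetic TAGX block built from the tuples printed in the log.
logged = [(5, 1, 3, 0), (26, 1, 12, 0), (27, 1, 48, 0), (0, 0, 0, 1)]
body = b"".join(struct.pack(">4B", *entry) for entry in logged)
block = b"TAGX" + struct.pack(">LL", 12 + len(body), 1) + body

control_bytes, tags = read_tag_section(block)
```

Under these assumptions the last tuple, (0, 0, 0, 1), would be an end-of-table marker rather than a real tag.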
|
09-07-2014, 09:46 AM | #967 |
Sigil Developer
Posts: 8,099
Karma: 5450184
Join Date: Nov 2009
Device: many
|
Hi,
Are you sure you used the latest version of the mobi_dict.py I just posted with the Spanish dictionary? This error message should no longer be present in the new mobi_dict.py. Error: Dictionary contains multiple inflection .. Thanks, Kevin |
09-07-2014, 10:07 AM | #968 | |
Grand Sorcerer
Posts: 5,635
Karma: 23191067
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
The Spanish dictionary decompiled fine. |
|
09-07-2014, 02:14 PM | #969 | ||
Sigil Developer
Posts: 8,099
Karma: 5450184
Join Date: Nov 2009
Device: many
|
Hi elchamaco:
Quote:
Quote:
Error: Dictionary uses obsolete inflection rule scheme which is not yet supported
I am sorry, but there is no chance I can decode obsolete inflection rules with a sample of one and no source. You will have to live with that failure. Take care, KevinH |
||
09-07-2014, 02:15 PM | #970 |
Sigil Developer
Posts: 8,099
Karma: 5450184
Join Date: Nov 2009
Device: many
|
Hi Doitsu,
I am now confused. Do the two Chinese dictionaries work with the current version, or do they still need some work to decode? Thanks, Kevin |
09-07-2014, 06:00 PM | #971 | |
Grand Sorcerer
Posts: 5,635
Karma: 23191067
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
BTW, the Xian Dai Han Yu Ci Dian dictionary (B00AZOHEFU_EBOK.azw) source files apparently contain several syntax errors according to the kindlegen.log file (idx:entry definitions without idx:orth parameters, unresolved hyperlinks etc.). |
|
09-08-2014, 07:48 AM | #972 |
Sigil Developer
Posts: 8,099
Karma: 5450184
Join Date: Nov 2009
Device: many
|
Hi Doitsu,
Sounds good. I will disable all of the debug output and add it into the material for the next release. Take care, KevinH |
09-08-2014, 02:24 PM | #973 |
Zealot
Posts: 128
Karma: 500
Join Date: Aug 2011
Device: kindle, boox
|
It's strange. I freed some space, so now I have 20 GB free, and it's a computer with 6 GB of RAM, but it is still unable to unpack. I had Python 2.7.3, so I updated to 2.7.8, and it still fails (Win7 64-bit). I'll try another day on another computer.
And the other thing: is it normal that the definition is not included inside the <idx:entry> tag in some dictionaries? |
09-08-2014, 07:07 PM | #974 |
Sigil Developer
Posts: 8,099
Karma: 5450184
Join Date: Nov 2009
Device: many
|
Hi elchamaco,
Are you testing with the exact same dictionary you posted for me? Are you using KindleUnpack GUI, or Calibre or running kindleunpack.py directly from the command line? I tested by running directly from the command line. Please give that a try. KevinH Last edited by KevinH; 09-08-2014 at 10:40 PM. |
09-09-2014, 02:42 PM | #975 |
Zealot
Posts: 128
Karma: 500
Join Date: Aug 2011
Device: kindle, boox
|
It's strange. I had been using the GUI version (kindleunpack.pyw), so I tried the command line, but it still gives an error. This is the error on the command line:
Code:
Traceback (most recent call last):
  File "E:\pru\lib\kindleunpack.py", line 936, in <module>
    sys.exit(main())
  File "E:\pru\lib\kindleunpack.py", line 925, in main
    unpackBook(infile, outdir, apnxfile, epubver, use_hd)
  File "E:\pru\lib\kindleunpack.py", line 840, in unpackBook
    process_all_mobi_headers(files, apnxfile, sect, mhlst, K8Boundary, False, epubver, use_hd)
  File "E:\pru\lib\kindleunpack.py", line 763, in process_all_mobi_headers
    processMobi7(mh, metadata, sect, files, imgnames)
  File "E:\pru\lib\kindleunpack.py", line 583, in processMobi7
    srctext, usedmap = proc.insertHREFS()
  File "E:\pru\lib\mobi_html.py", line 101, in insertHREFS
    srctext = srctext[0:12]+'<meta http-equiv="content-type" content="text/html; charset='+metadata.get('Codec')[0]+'" />'+srctext[12:]
MemoryError |
|
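The traceback above ends in a MemoryError on a line that builds a complete new copy of the book text by string concatenation. One thing worth ruling out on a 64-bit Windows machine is that the Python interpreter itself is a 32-bit build, since a 32-bit process is limited in address space no matter how much RAM or disk is free. This is a possibility to check, not a confirmed cause. A small sketch; the function names are hypothetical:

```python
import struct
import sys


def pointer_bits():
    """Return the pointer width of the running interpreter (32 or 64)."""
    return struct.calcsize("P") * 8


def is_64bit_interpreter():
    """True when the interpreter itself is a 64-bit build."""
    return sys.maxsize > 2 ** 32


if __name__ == "__main__":
    # Report the interpreter version and whether it is a 32- or 64-bit build.
    print("Python %s, %d-bit build" % (sys.version.split()[0], pointer_bits()))
```

Run it with the same python.exe that is used to launch kindleunpack.py, because several Python installations can coexist on one machine.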