Thank you very much for your comments. I very much enjoyed reading your article "Dictionaries for Bookeen Cybook Odyssey". Maybe I will study your script as a way to start learning Python.
Originally Posted by AlPe
For 2), an easy way is to store, for word W, the offset, in bytes from the beginning of the chunk, where the definition of W starts.
This does not seem to be how the Kobo engine works. If it were, mnjkl, clsdclsd and I could not have successfully manipulated the content of the dictionaries. I do not know how mnjkl and clsdclsd did it, but I, for one, did not replace the definitions with definitions of exactly the same length. Rather, I added English text at the end of the Japanese definitions, thereby increasing the offset of all subsequent entries each time. As further information, I can say that the position is not indicated by the node position either (3rd child of the html node or so): I inserted new siblings (<w>...</w>) and the subsequent siblings were still accessed correctly.
Therefore, my guess is that the position of a dictionary entry in the .html is determined by a simple text search for name="W". That way both cases are covered: the main head entry (<a name="go">) and the variant (<variant name="goes"/>).
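To make the guess concrete, here is a minimal Python sketch of the idea. It is only an illustration of my hypothesis, not the actual Kobo code: the function name find_entry and the tiny sample HTML are made up for the example. It shows that a byte offset stored before an edit goes stale once a definition grows, while a plain search for name="W" still finds both the head entry and the variant.

```python
def find_entry(html, word):
    # Hypothetical stand-in for the engine's lookup: a plain substring
    # search for the name attribute. Returns -1 if the word is absent.
    return html.find('name="{}"'.format(word))


# Made-up sample chunk with two entries and one variant form.
original = (
    '<html>'
    '<w><a name="go"/>to move<variant name="goes"/></w>'
    '<w><a name="stop"/>to halt</w>'
    '</html>'
)

# Offset of "stop" as it would be stored before any editing.
stored_offset = find_entry(original, "stop")

# Lengthen the first definition, as I did with the Japanese entries.
edited = original.replace("to move", "to move; to travel")

# The stored offset no longer points at the entry ...
stale = edited[stored_offset:stored_offset + len('name="stop"')]
# ... but a fresh text search still finds it, and the variant too.
fresh_offset = find_entry(edited, "stop")
variant_offset = find_entry(edited, "goes")
```

Because the search string includes the closing quote, looking up "go" does not accidentally match name="goes".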
From the behaviour of the Japanese dictionary I got the impression that things are handled a little differently there. I still have to think it through. Most important, of course, is to get the marisa tools working.