View Single Post
Old 11-19-2021, 04:48 AM   #1
MrBeef12
Junior Member
MrBeef12 began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Nov 2021
Device: Kindle Paperwhite 7th Generation
Creating a Kindle dictionary with inflections

Hello everyone!

I am currently trying to create a Russian dictionary that has proper inflections sourced from Wiktionary because all current other dictionaries are not that great.

There is some documentation about this provided by Amazon. It basically says that you should:

1) Create an XHTML file with their special markup specifying all inflections etc.
2) Turn it into an epub
3) Open it with Kindle Previewer
4) Export it with Kindle Previewer to MOBI

So I created a large XHTML file (23 MB or so) according to the Amazon specifications and opened it in Kindle Previewer, and it looked fine. However, Kindle Previewer does not let you export XHTML files to MOBI. They want you to create an intermediate epub file.

I tried using Pandoc to do the conversion, which did not work because it stripped out all the specific HTML tags and only left in paragraphs. Then I tried using calibre. The normal XHTML -> epub conversion failed because the XHTML file was too large, according to an error message. Calibre suggests to turn on the "heuristic mode" if you run into this error, which I tried, but which did not finish running after hours of runtime.

Then I attempted to create the epub file myself, using a sample file taken from this tutorial. I discovered that this is not trivial, and a check using epubcheck revealed many hard-to-understand errors in my generated file. The generation of the epub file is also a bit complicated by the fact that you probably need to split the XHTML files into many smaller files, which should maybe be 250 kb in size, because e-readers tend to struggle with parsing larger files.

So I thought there should maybe be an easier way to do this, or maybe a library that helps doing this. Maybe it would even be a good idea to output the words + inflections into some other easier dictionary format and then convert it to a MOBI using an existing library and leaving out the XHTML generation completely. Currently I am using Python, but I'd also use other languages if it is necessary. What could I try?

There is an apparently closed source script here that unfortunately doesn't support inflections, so does not work. And there are instructions here that advise converting the file to PRC using Mobipocket Creator and then opening it with Kindle Previewer. The problem with this approach is that Kindle Previewer throws the error:

> Kindle Previewer does not support this file, which has either been created using an older version of KindleGen or a third party application. We recommend using EPUB or DOCX format directly for previewing and publishing your book on Kindle.

There are also more detailed instructions for Mobipocket Creator here, which tell you to directly move the generated .prc file onto the kindle. I tried that but it is not being recognized as a dictionary.
MrBeef12 is offline   Reply With Quote