MobileRead Forums - View Single Post

Doitsu · 12-17-2013, 03:42 PM

Quote:

Originally Posted by susan_cassidy

A .prc file should be about the same as a .mobi file, usually. .mobi would be preferred.

That's not the case for dictionaries, since Amazon hasn't ported the Kindle dictionary format to KF8. I.e., in terms of functionality it doesn't make a difference whether dictionaries are generated with the older mobigen.exe or the current kindlegen.exe. (Kindle Dictionaries cannot be generated with Calibre.)

Quote:

Originally Posted by susan_cassidy

I don't know if a Kindle, for example, would allow a dictionary with a .prc suffix.

In that case you may want to refrain from answering dictionary-related questions. BTW, the answer is yes: Kindles and Kindle apps accept both .prc and .mobi dictionaries.

Quote:

Originally Posted by susan_cassidy

The dictionary would not be tab-delimited, but all in HTML.

Kindles and Kindle apps do not support uncompiled HTML dictionaries; they obviously need to be compiled.

Quote:

Originally Posted by Difermo

After editing (find and fix all mistakes from ocr) i save as plain text (utf-8). That is file with dictionary.txt extension WHAT NEXT!!! WHAT NEXT!!!!

That depends on your technical skills. If you know your way around a Unicode editor that supports regular expressions, you could:

A) Use lots of search and replace operations (or a custom script) to add the required Mobipocket dictionary tags, manually create an .opf file with file references and open it with MobiGen.exe or KindleGen.exe to generate the dictionary.

Each entry in the HTML source file should look more or less like this (I added spaces to make the code more readable; they're optional):

Code:

<html>
<body>

<idx:entry>
	<b><idx:orth>book
	<idx:infl>
		<idx:iform value="books"/>
	</idx:infl>
	</idx:orth> </b> 
	<i>noun</i> <br/>
	a written or printed work
</idx:entry>
<br/><br/>
<hr/>
<idx:entry>
	<b><idx:orth>go
	<idx:infl>
		<idx:iform value="goes"/>
		<idx:iform value="going"/>
		<idx:iform value="went"/>
		<idx:iform value="gone"/>
	</idx:infl>
	</idx:orth> </b> 
	<i>verb</i> <br/>
	move from one place or point to another
</idx:entry>
<br/><br/>

</body>
</html>
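
If you go the custom script route for option A, the core of it could look roughly like the Python sketch below. It only renders one entry in the same layout as my example; the function name render_entry and its parameters are made up for illustration and are not part of MobiGen, KindleGen or any other tool. You would still need to wrap the entries in the html/body tags and create the .opf file yourself.

```python
from html import escape


def render_entry(headword, part_of_speech, definition, inflections=()):
    """Return one <idx:entry> block in the layout of the example above.

    All names here are hypothetical; this is a sketch, not an official tool.
    """
    lines = ["<idx:entry>", f"\t<b><idx:orth>{escape(headword)}"]
    if inflections:
        # Inflected forms go inside <idx:orth>, as in the example markup.
        lines.append("\t<idx:infl>")
        for form in inflections:
            lines.append(f'\t\t<idx:iform value="{escape(form, quote=True)}"/>')
        lines.append("\t</idx:infl>")
    lines.append("\t</idx:orth> </b>")
    lines.append(f"\t<i>{escape(part_of_speech)}</i> <br/>")
    lines.append(f"\t{escape(definition)}")
    lines.append("</idx:entry>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_entry("book", "noun", "a written or printed work", ["books"]))
```

The escape() calls are there because OCR text often contains stray &, < or > characters that would otherwise break the markup.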

B) Use lots of search and replace operations to reshape the text file into a UTF-8 text file whose lines all have the following format:

Code:

Headword<TAB>Definition<CR/LF>

<TAB> stands for the invisible tab character and <CR/LF> for the line-break that you create when you press Enter. (Unix-style line-breaks are OK, too.)
Once you've done that, you can use tab2opf.py to generate the source files required for MobiGen/KindleGen. Alternatively, you could use the files provided on this website.
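
Before you feed a 40,000-line file to any converter, it may be worth checking that every line really has the Headword<TAB>Definition shape. Here is a minimal Python sketch of such a check; parse_tab_lines is a hypothetical helper I made up for this post, not something that ships with tab2opf.py.

```python
def parse_tab_lines(text):
    """Split 'Headword<TAB>Definition' lines into (headword, definition) pairs.

    Returns the pairs plus the 1-based numbers of lines that do not fit
    the format. Hypothetical helper, for illustration only.
    """
    entries, bad_lines = [], []
    # splitlines() copes with both Windows (CR/LF) and Unix (LF) line breaks.
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        headword, tab, definition = line.partition("\t")
        if not tab or not headword.strip() or not definition.strip():
            bad_lines.append(number)  # no tab, or an empty half
            continue
        entries.append((headword.strip(), definition.strip()))
    return entries, bad_lines


if __name__ == "__main__":
    sample = (
        "book\ta written or printed work\r\n"
        "go\tmove from one place or point to another\n"
        "a line without a tab\n"
    )
    entries, bad_lines = parse_tab_lines(sample)
    print(len(entries), "entries; problem lines:", bad_lines)
```

To run it on your own file, read it with open("dictionary.txt", encoding="utf-8").read() and pass the result to the function, then fix the reported lines in your editor.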

Quote:

Originally Posted by Difermo

3)is stardict format actualy online format of all kinds of dictionary that i need to convert to dictionary.txt?

StarDict is a completely different format. However, if you happen to find a Serbian-Serbian StarDict or Babylon BGL dictionary, you could use another tool, PyGlossary, to convert it to a tab-delimited file that you could then use as the input file for tab2opf.py.

Quote:

Originally Posted by Difermo

4)what about dictionary.prc? I see some are creating dictionary.prc (not mobi extension) what is different betvene them, and does kindle device read bout of them same

PRC dictionaries were created with the older Mobigen.exe and MOBI dictionaries with KindleGen.exe; otherwise they're more or less identical.

Quote:

Originally Posted by Difermo

5) Does Inflections code must be on every word (more then 40 000), or it must be writen at begining of dictionary.txt? or at the end of dictionary.txt Is there any script to install all inflections?

Unfortunately, tab2opf.py does not support inflections, so you'll need to find someone who can write a custom script that adds them to each entry in the required format. Inflections have to be attached to every headword that has them; they cannot be listed once at the beginning or end of the file. In my example dictionary code, inflections are coded in the <idx:infl>...</idx:infl> block.
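
To give whoever writes that script an idea of what is involved, here is a rough Python sketch. It assumes you already have a separate list of inflected forms with one headword<TAB>form pair per line; that list format is purely my assumption, and the function names are invented for illustration. Where the forms come from is up to you.

```python
from collections import defaultdict
from html import escape


def load_inflections(text):
    """Group 'headword<TAB>form' lines into {headword: [forms]}.

    The input format is an assumption made for this sketch.
    """
    forms = defaultdict(list)
    for line in text.splitlines():
        headword, tab, form = line.partition("\t")
        headword, form = headword.strip(), form.strip()
        # Skip malformed lines and duplicate forms.
        if tab and headword and form and form not in forms[headword]:
            forms[headword].append(form)
    return dict(forms)


def render_infl_block(forms):
    """Build the <idx:infl> block for one headword, or '' if it has no forms."""
    if not forms:
        return ""
    items = "".join(
        f'<idx:iform value="{escape(form, quote=True)}"/>' for form in forms
    )
    return f"<idx:infl>{items}</idx:infl>"


if __name__ == "__main__":
    table = load_inflections("go\tgoes\ngo\tgoing\ngo\twent\nbook\tbooks\n")
    print(render_infl_block(table.get("go", [])))
```

The resulting block would then be placed inside the <idx:orth> element of the matching entry, the same way as in my example code.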

You may also want to read the Kindle Dictionary FAQ, which will answer many of your questions.