12-05-2013, 07:38 AM   #7
Doitsu
Grand Sorcerer
Posts: 5,737
Karma: 24031401
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by mobiuser View Post
Uhmm, the conversion process doesn't seem to consider the html tags (color, font style...)
IIRC, Babylon uses a couple of custom HTML tags that you need to delete, but all other HTML code (for example, <span> tags or HTML entities) can be left in the source files, since the Mobipocket format was originally based on HTML 3.2.
(The latest Kindle format also partially supports HTML5, but not for dictionaries.)

Quote:
Originally Posted by mobiuser View Post
Is there anything I can do to improve the result?
I'm afraid not, unless you're familiar with a scripting language and/or regular expressions.

Quote:
Originally Posted by mobiuser View Post
I think I can resolve the symbols and the accented characters problems with a "search and replace" in notepad++, but for the html code I don't think I can do anything.
As I've already mentioned, valid HTML 3.2 markup doesn't need to be deleted.
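
For what it's worth, here's a rough Python sketch of how you could strip only the unwanted tags with a regular expression while leaving ordinary markup and HTML entities alone. Note the assumptions: the ALLOWED_TAGS set is just an incomplete list I made up for illustration (adjust it to whatever your source file actually uses), <customtag> is a hypothetical stand-in for a Babylon-specific tag, and strip_unknown_tags is my own helper name, not part of any tool. It also won't cope with attribute values that contain a ">" character.

```python
import re

# Assumption: an illustrative, incomplete set of tag names to keep.
# <span> is included because it can be left in the source files.
ALLOWED_TAGS = {"b", "i", "u", "br", "p", "font", "span", "sub", "sup",
                "a", "img", "hr", "div", "ul", "ol", "li"}

# Matches an opening or closing tag and captures its name.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9_]*)\b[^>]*>")


def strip_unknown_tags(text, allowed=ALLOWED_TAGS):
    """Delete tags whose name is not in `allowed`; keep their inner text."""
    def keep_or_drop(match):
        name = match.group(1).lower()
        return match.group(0) if name in allowed else ""
    return TAG_RE.sub(keep_or_drop, text)


# <customtag> is a made-up example of a tag that should be removed.
sample = '<customtag>haus</customtag> <b>n.</b> caf&eacute; <font color="red">x</font>'
cleaned = strip_unknown_tags(sample)
print(cleaned)
```

Only the tag itself is deleted; the text between the opening and closing tag stays, and entities such as &eacute; are not touched.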

IMHO, the whole project probably isn't worth the effort, because Amazon has recently added some very reasonably priced Kindle dictionaries from traditional dictionary publishers with good coverage.
Also, even if you manage to completely reformat the input file, the dictionary won't contain inflections, so it will be of limited use unless the dictionary language isn't heavily inflected or you're an advanced user who can automatically identify inflected word forms.