05-23-2009, 09:42 AM   #11
keng2000
Researcher and Consultant
Quote:
Originally Posted by Kardell
Can you describe the whole process in detail? It'd be great.
Thank you in advance

All I could find on the internet were ways to convert from one dictionary format to another (dictd, StarDict, Babylon and others).
I did not find anything that goes from .idx back to .index.
But I finally found the original dictionary as a single plain-text file (TIS-620), with 5 XML tags for each word and the entries separated by <DOC>.
Then I did some research and, luckily, got to a solution:
1. Convert the file to UTF-8.
2. Replace every & with &amp;.
3. Use a Perl program to convert it to C5 format.
4. Use dictfmt to build the .dict and .index files.
5. Use dictzip to make the .dict.dz file.
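
In case the C5 format in step 3 is unfamiliar: as far as I understand the dictfmt man page, -c5 expects each headword to be preceded by a line of five or more underscores and a blank line, and treats everything up to the next such marker as the definition. The sketch below is not parse_etlex.pl (that script is not shown in this post). It is a hypothetical shell helper, to_c5, that assumes the entries have already been reduced to "headword<TAB>definition" lines, and the two sample entries are made up. It only shows the shape of the output.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): turn
# "headword<TAB>definition" lines on stdin into C5-style records on stdout.
# Assumed record layout: a line of underscores, a blank line, the headword,
# the definition, then a blank line before the next record.
to_c5() {
    awk -F '\t' '{ printf "_____\n\n%s\n%s\n\n", $1, $2 }'
}

# Demo with two made-up entries.
printf 'cat\tn. a small domesticated feline\ndog\tn. a domesticated canine\n' | to_c5
```

The real conversion has to read the <DOC> records and their 5 XML tags instead of tab-separated lines, which is why a proper parser (Perl in this case) is the more natural tool for it.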

The following is my sample script.
SOURCE: etlex
PERL: parse_etlex.pl

=======================
# Convert from TIS-620 to UTF-8; -c drops characters that cannot be converted
iconv -f tis620 -t utf-8 -c etlex > etlex.utf-8

# DO THIS IN AN EDITOR:
# Change & to &amp;
# Add <etlex> at the TOP
# Add </etlex> at the BOTTOM
# Save as UTF-8 with signature (BOM)

# Write the C5 output to etlex.c5 and discard error messages
./parse_etlex.pl etlex.utf-8 > etlex.c5 2> /dev/null

cat lexitron.info etlex.c5 | dictfmt \
-c5 -u ftp://ftp.opentle.org/pub/lexitron/s...tron-data.zip \
-s "LEXiTRON version 2, etlex" \
--without-info --allchars \
--utf8 etlex

dictzip etlex.dict
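
The manual editor step in the script (escaping & and adding the <etlex> wrapper) could probably be scripted too. Here is a minimal sketch, assuming a sed that treats \& in the replacement as a literal ampersand (POSIX sed does). The function name wrap_etlex is made up for this example. It escapes every & it sees, so it should be run only once on a file, and it does not add the UTF-8 signature (BOM) that the editor step saves with.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): read the UTF-8
# dictionary text on stdin, escape every ampersand, and wrap the whole
# thing in a single <etlex> root element on stdout.
wrap_etlex() {
    printf '<etlex>\n'
    sed 's/&/\&amp;/g'
    printf '</etlex>\n'
}

# Demo on one made-up line.
printf 'salt & pepper\n' | wrap_etlex
```

With the file names from the script, usage would be something like `wrap_etlex < etlex.utf-8 > etlex.tmp && mv etlex.tmp etlex.utf-8` before running parse_etlex.pl. Whether the Perl script needs the BOM to be present is something I cannot tell from the post.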