01-12-2010, 09:50 AM   #9
Joghurt
Programmer
Posts: 40
Karma: 500
Join Date: Oct 2009
Device: PB360
A small example Linux script for converting the EN_DE TXT file from dict.cc into XDXF that converter.exe seems to tolerate:
Quote:
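#!/bin/bash

# Input: dict.cc TXT export (tab-separated); output: same name with .xdxf
IN="$1"
OUT="${IN%.txt}.xdxf"

# Write the XDXF header
cat >"$OUT" <<'EOF'
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE xdxf SYSTEM "http://xdxf.sourceforge.net/xdxf_lousy.dtd">
<xdxf lang_from="ENG" lang_to="GER" format="visual">
<full_name>English-German dictionary</full_name>
<description>Copyright: http://www.dict.cc/; Version: 1.0</description>
EOF

# Drop comment lines, keep only lines containing a tab ($'\t' is a real tab;
# a plain '\t' is not a tab for grep), escape XML special characters, then
# wrap the first column in <k> and leave the rest as the definition.
grep -v '^#' "$IN" | grep $'\t' \
  | perl -pe 's/&/&amp;/g; s/</&lt;/g; s/>/&gt;/g; s/^([^\t]+)\t(.+)$/<ar><k>$1<\/k>$2<\/ar>/' >>"$OUT"

echo "</xdxf>" >>"$OUT"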
...at least converter.exe then reports "Total words: 444277", and the file ends up 9 MB in size.
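
One way to cross-check that number before running converter.exe is to count the <ar> entries in the generated XDXF file. This is only a minimal sketch: it assumes one <ar>...</ar> entry per line, which is how the script writes them, and `count_entries` is a name made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: count the dictionary entries in an XDXF file.
# Assumes one <ar>...</ar> entry per line, as the conversion script writes them.
count_entries() {
    grep -c '^<ar>' "$1"
}
```

Called as `count_entries EN_DE.xdxf` (the file name is just a placeholder), I would expect the result to be close to the "Total words" figure that converter.exe prints, though I have not checked how converter.exe counts.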

Whether it actually works is something I still have to try at home; I just threw this together quickly...