Yes, I see it. Garbage in = garbage out, I am afraid.
This is the entry in the chunk c_1 in the archive dict-file:
Code:
<b>A,</b> N. m. [<f>&a;</f>] ou [<f>&â;</f>] Voyelle et première lettre de l'alphabet. <i>Une panse d' <i>a </i>, </i>la première partie d'un petit <i>a </i> dans l'écriture. <f>&os;</f> <i>N'avoir pas fait une panse d' <i>a </i>, </i>c'est-à-dire n'avoir rien écrit. <f>&ns;</f> <i>Prouver par A + B, </i>avec précision et rigueur. <f>&ns;</f> <i>De A à Z, </i>du début à la fin. <f>&ns;</f> <i>A4, </i>format d'une feuille de papier de 21 X 29,7 cm. <i>A3, </i>format 29,7 X 42 cm. <f>&ns;</f> La.
And this is the entry in the StarDict file generated by Penelope:
As you can see, the accented characters are not the problem. Rather, the idx-file contains a doctype definition that declares all those entities:
Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"[
<!ENTITY ns "♦">
<!ENTITY os "•">
<!ENTITY oo "›">
<!ENTITY co "‹">
<!ENTITY a "a">
<!ENTITY â "ɑ">
<!ENTITY an "ɑ̃">
<!ENTITY b "b">
<!ENTITY d "ɗ">
<!ENTITY e "ə">
<!ENTITY é "e">
<!ENTITY è "ɛ">
<!ENTITY in "ɛ̃">
<!ENTITY f "f">
<!ENTITY g "ɡ">
<!ENTITY h "h">
<!ENTITY h2 "'">
<!ENTITY i "i">
<!ENTITY j "J">
<!ENTITY k "k">
<!ENTITY l "l">
<!ENTITY m "m">
<!ENTITY n "n">
<!ENTITY gn "ɲ">
<!ENTITY ing "ɳ">
<!ENTITY o "o">
<!ENTITY o2 "ɔ">
<!ENTITY oe "ɶ">
<!ENTITY on "ɔ̃">
<!ENTITY eu "ɸ">
<!ENTITY un "ɶ̃">
<!ENTITY p "p">
<!ENTITY r "ʀ">
<!ENTITY s "s">
<!ENTITY ch "ʃ">
<!ENTITY t "t">
<!ENTITY u "ɥ">
<!ENTITY ou "u">
<!ENTITY v "v">
<!ENTITY w "w">
<!ENTITY x "x">
<!ENTITY y "y">
<!ENTITY z "z">
<!ENTITY Z "ʒ">]>
<html xml:lang="fr"
xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
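To illustrate what those declarations are for, here is a minimal Python sketch that reads a few of the ENTITY lines and expands the matching references in an entry. It is not what Penelope does internally; the function names `parse_entities` and `expand_entities` are made up for this example, and the regular expressions assume the simple one-line `<!ENTITY name "value">` form shown above.

```python
import re

# A few declarations copied from the doctype above (not the full list).
DOCTYPE_SNIPPET = '''
<!ENTITY ns "♦">
<!ENTITY os "•">
<!ENTITY a "a">
<!ENTITY â "ɑ">
'''

# Assumes each declaration has the form <!ENTITY name "value">.
ENTITY_DECL = re.compile(r'<!ENTITY\s+(\S+)\s+"([^"]*)">')
# Matches references such as &ns; or &â;
ENTITY_REF = re.compile(r'&([^;&\s]+);')


def parse_entities(doctype_text):
    """Return a dict mapping entity names to their replacement text."""
    return dict(ENTITY_DECL.findall(doctype_text))


def expand_entities(text, entities):
    """Replace declared references; leave unknown ones untouched."""
    def repl(match):
        return entities.get(match.group(1), match.group(0))
    return ENTITY_REF.sub(repl, text)


entities = parse_entities(DOCTYPE_SNIPPET)
entry = "<b>A,</b> N. m. [<f>&a;</f>] ou [<f>&â;</f>] Voyelle. <f>&os;</f> <f>&ns;</f>"
print(expand_entities(entry, entities))
```

References that are not declared in the snippet are passed through unchanged, so the viewer still has to deal with them itself.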
However, the &nbsp; sequence is normal HTML for a non-breaking space. If you change the sametypesequence in the ifo file from m to h, they don't disappear from linguae, but GoldenDict does switch (as does KOReader):
sametypesequence=m
sametypesequence=h
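The change from m to h is easy to make by hand in a text editor. If you want to script it, here is a minimal Python sketch that works on the text of an ifo file. The function name `set_sametypesequence` and the sample field values (bookname, wordcount, idxfilesize) are made up for illustration; only the sametypesequence line comes from this thread.

```python
def set_sametypesequence(ifo_text, new_type="h"):
    """Return the ifo text with the sametypesequence line set to new_type."""
    lines = []
    for line in ifo_text.splitlines():
        if line.startswith("sametypesequence="):
            line = "sametypesequence=" + new_type
        lines.append(line)
    return "\n".join(lines) + "\n"


# Hypothetical ifo content; all values except sametypesequence are invented.
SAMPLE_IFO = (
    "StarDict's dict ifo file\n"
    "version=2.4.2\n"
    "bookname=Example dictionary\n"
    "wordcount=1\n"
    "idxfilesize=10\n"
    "sametypesequence=m\n"
)

print(set_sametypesequence(SAMPLE_IFO))
```

To apply it to a real file you would read the ifo file as UTF-8, pass its text through the function, and write the result back, keeping a backup of the original first.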