Posts #13 and #15 suggest that the only whitespace formatting that survives is the br-tag. All other tags, such as span, quote and blockquote, are stripped by the converter to dic-format.
VanDale is styled with blockquotes:
Code:
<ar><head><k>leerling</k></head><def>
<blockquote><span> <span><b>l<span><u>ee</u></span>r·ling</b></span> <span> <span>de<sup><i>m</i></sup></span> </span> <span> <span> <span><i>meerv</i></span>: leerlingen</span>; <span><span><i>verkleinw</i></span>: leerlingetje</span></span> </span> <span> <span> <span> <span><i>zelfst nw</i></span> </span> <span> <span><span><b>1. </b></span> <span> <span>iem. die onderwijs krijgt</span> </span> <span>
<blockquote align="left"><span><b>• </b></span> <span><i>een <span><b>externe</b></span> leerling</i></span> </blockquote>
<blockquote align="left"><span><b>• </b></span> <span><i>een <span><b>ijverige</b></span> leerling</i></span> </blockquote>
<blockquote align="left"><span><b>• </b></span> <span><i>een <span><b>interne</b></span> leerling</i></span> </blockquote>
<blockquote align="left"><span><b>• </b></span> <span><i>een <span><b>knappe</b></span> leerling</i></span> </blockquote>
<blockquote align="left"><span><b>• </b></span> <span><i>een leerling van <span><b>school</b></span> /sturen/verwijderen/</i></span> </blockquote>
<blockquote align="left"><span><b>• </b></span> <span><i>een <span><b>zwakke</b></span> leerling</i></span> </blockquote>
<blockquote align="left"><span><b>• </b></span> <span><i>de <span><b>zwakste</b></span> leerling van de klas</i></span> | <span>de slechtst presterende leerling</span>; <span>land, bedrijf enz. dat het slechtst presteert op een bepaald terrein</span> </blockquote>
</span> </span> <span><span><b>2. </b></span> <span> <span> <span><b>volgeling</b></span> van iemands leer of stelregels</span> </span> <span>
<blockquote align="left"><span><b>• </b></span> <span><i>de leerlingen van <span><b>Jezus</b></span></i></span> </blockquote>
</span> </span> </span> </span> </span></blockquote>
</def></ar>
(Empty lines are inserted for readability.)
The new Greek dictionary is styled with div-blocks:
Code:
<ar><head><k>ελληνο-</k></head><def>
<div style="margin-left:0em"><span style="color:darkslategray"></span> [<!-- T --><span style="font-family:'Helvetica'">elıno</span><!-- T -->]</div>
<div style="margin-left:0em"><span style="color:darkslategray"><b>ελληνό-</b></span> [<!-- T --><span style="font-family:'Helvetica'">elınó</span><!-- T -->], όταν κατά τη σύνθεση ο τόνος ανεβαίνει στο α' συνθετικό</div>
<div style="margin-left:0em"><span style="color:darkslategray"><b>ελλην-</b></span> [<!-- T --><span style="font-family:'Helvetica'">elın</span><!-- T -->], σπάνια όταν το β' συνθετικό αρχίζει από [<!-- T --><span style="font-family:'Helvetica'">o</span><!-- T -->]</div>
<div style="margin-left:1em">α' συνθετικό σε σύνθετες λέξεις με αναφορά κατά περίπτωση στα ονόματα<i><span style="color:slategray">Έλληνας, ελληνικός, ελληνισμός.</span></i></div>
<div style="margin-left:2em"><span style="color:green"><b>1.</b></span> </div>
<div style="margin-left:3em"><span style="color:mediumseagreen"><b>α.</b></span> (σε σύνθετα επίθετα) για σχέση (φιλία, συνθήκη <i class="p" style="color:green">κτλ.</i>) μεταξύ των Ελλήνων και του λαού που υποδηλώνει το β' συνθετικό· (<i class="p" style="color:green">πρβ.</i> <i><span style="color:slategray">-ελληνικός</span></i><sub>1</sub>): <i><span style="color:slategray">~αγγλικός, ~αμερικανικός, ~γερμανικός, ~ολλανδικός, ~τουρκικός</span></i>.</div>
<div style="margin-left:3em"><span style="color:mediumseagreen"><b>β.</b></span> σε εθνικά ουσιαστικά: <i><span style="color:slategray">Ελληνοκύπριος,</span></i> ο Έλληνας της Κύπρου. <span style="color:lime"><b>||</b></span> <i><span style="color:slategray">Ελληνοαμερικανός.</span></i></div>
<div style="margin-left:2em"><span style="color:green"><b>2.</b></span></div>
<div style="margin-left:3em"><span style="color:mediumseagreen"><b>α.</b></span> με αναφορά στη νεοελληνική γλώσσα. <i class="p" style="color:green">ΑΝΤ</i> ξενο-: <i><span style="color:slategray">ελληνόγλωσσος, ~μαθής, ελληνόφωνος.</span></i> <span style="color:lime"><b>||</b></span> (<i class="p" style="color:green">παρωχ.</i>) με αναφορά στην αρχαία ή τη λόγια μορφή της ελληνικής γλώσσας: <i><span style="color:slategray">~διδάσκαλος</span></i>.</div>
<div style="margin-left:3em"><span style="color:mediumseagreen"><b>β.</b></span> για λεξικό, λεξιλόγιο <i class="p" style="color:green">κτλ.</i> στο οποίο οι λέξεις της ελληνικής γλώσσας ερμηνεύονται (αποδίδονται) στη γλώσσα που υποδηλώνει το β' συνθετικό: <i><span style="color:slategray">~αγγλικός,</span></i> <i class="p" style="color:green">ΑΝΤ</i> αγγλοελληνικός· <i><span style="color:slategray">~γερμανικός, ~ελληνικός, ~τουρκικός.</span></i></div>
<div style="margin-left:2em"><span style="color:green"><b>3.</b></span></div>
<div style="margin-left:3em"><span style="color:mediumseagreen"><b>α.</b></span> με αναφορά <i class="p" style="color:green">συνήθ.</i> στον αρχαίο ελληνικό πολιτισμό: <i><span style="color:slategray">~λάτρης</span></i>.</div>
...
</def></ar>
(I've truncated the entry at your screenshot ending and inserted empty lines for readability.)
So what is new compared to the info in posts #13 and #15 is that, apparently, the div-tag is retained: every div-block gets its own block. So replacing the blockquotes with divs could be an approach for VanDale.
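A minimal sketch of that substitution, assuming the definition markup is available as a plain string. The function name `blockquotes_to_divs` is made up for illustration, and whether the converter keeps nested divs or the `align` attribute is untested:

```python
import re

def blockquotes_to_divs(definition):
    """Rename every <blockquote ...> / </blockquote> tag to <div ...> / </div>.

    Attributes such as align="left" are left untouched; only the tag name changes.
    """
    # The optional "/" is captured so opening and closing tags are handled alike.
    return re.sub(r"<(/?)blockquote\b", r"<\1div", definition)

example = '<blockquote align="left"><span><b>• </b></span> <i>een leerling</i> </blockquote>'
print(blockquotes_to_divs(example))
```

The inner markup (span, b, i) is not changed, so only the block-level behaviour of the entry should differ after conversion.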
The margin-left:..em styling doesn't work as expected. Both 1.α. and 2.α. have a 3em left margin, but they are not aligned. α' and 1. have 0em and 1em respectively, yet they are aligned.
Obviously, we are not in control! Maybe the left margins are displayed correctly, but the div-blocks have different widths and are centered. Maybe we can force them to full width with style="width: 100%"?
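To try the width idea on a test entry, something like the following could be used. It is only a sketch: `force_full_width` is a made-up helper, it only touches divs that already carry a style attribute (as in the Greek sample), and whether the reader honours `width` at all is exactly what would need testing:

```python
import re

def force_full_width(definition):
    """Prepend width:100% to the existing style attribute of every <div>."""
    # Group 1 is everything up to and including style=" inside a single div tag.
    return re.sub(r'(<div\b[^>]*\bstyle=")', r"\1width:100%;", definition)

example = '<div style="margin-left:3em"><b>α.</b> text</div>'
print(force_full_width(example))
```

Span styles are left alone, so colours and fonts stay as they are.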
@Getkey You could just get the xdxf-file, remove all entries except those you are testing, change the HTML around and convert to see what works. If you could provide me with an example of the working code and an accompanying screenshot, I am willing to implement it and rerun the script on all offending dictionaries.
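Cutting the xdxf-file down to a few test entries could be done roughly like this. It is a sketch under assumptions: entries are plain `<ar>...</ar>` blocks without attributes, as in the samples above, and `keep_entries` is a made-up helper name. It works on the text rather than through an XML parser, so markup inside the definitions does not have to be well-formed:

```python
import re

AR_BLOCK = re.compile(r"<ar>.*?</ar>\s*", re.DOTALL)   # one dictionary entry
HEADWORD = re.compile(r"<k>(.*?)</k>", re.DOTALL)      # its first headword

def keep_entries(xdxf_text, headwords):
    """Return xdxf_text with every <ar> entry removed whose headword is not in headwords."""
    def keep_or_drop(match):
        block = match.group(0)
        key = HEADWORD.search(block)
        return block if key and key.group(1) in headwords else ""
    return AR_BLOCK.sub(keep_or_drop, xdxf_text)

sample = (
    "<xdxf>"
    "<ar><head><k>leerling</k></head><def>...</def></ar>\n"
    "<ar><head><k>ελληνο-</k></head><def>...</def></ar>\n"
    "</xdxf>"
)
print(keep_entries(sample, {"leerling"}))
```

Everything outside the `<ar>` blocks (header, closing tag) is kept, so the trimmed file should still be accepted by the converter.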