MobileRead Forums - View Single Post

Markismus · 11-15-2021, 05:38 AM

I've created a subroutine to unescape the HTML entities:

Code:

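sub unEscapeHTMLString{
    my $String = shift;
    # Decode the five predefined entities back to literal characters.
    $String =~ s~\&lt;~<~sg;
    $String =~ s~\&gt;~>~sg;
    $String =~ s~\&apos;~'~sg;
    $String =~ s~\&quot;~"~sg;
    # Decode &amp; last, so that e.g. &amp;quot; becomes &quot; and not ".
    $String =~ s~\&amp;~&~sg;
    return $String;
}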
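
A chain of substitutions like this depends on the order in which the entities are decoded: an input such as &amp;lt; should come out as the literal text &lt;, not as <. Below is a minimal sketch of a single-pass variant that avoids the ordering question altogether, assuming only these five entities need handling. The names unEscapeHTMLStringOnce and %EntityMap are hypothetical and not part of the actual script.

```perl
use strict;
use warnings;

# Hypothetical lookup table covering only the five entities handled above.
my %EntityMap = (
    lt   => '<',
    gt   => '>',
    apos => "'",
    quot => '"',
    amp  => '&',
);

# Hypothetical single-pass variant: every entity is matched and replaced
# exactly once, and the replacement text is never scanned again, so
# '&amp;lt;' ends up as '&lt;' rather than '<'.
sub unEscapeHTMLStringOnce {
    my $String = shift;
    $String =~ s/&(lt|gt|apos|quot|amp);/$EntityMap{$1}/g;
    return $String;
}

print unEscapeHTMLStringOnce('&lt;b&gt;rec&lt;/b&gt; &amp;lt; &quot;x&quot;'), "\n";
# prints: <b>rec</b> &lt; "x"
```

Numeric references (for example &#39;) and other named entities are left untouched by this sketch. If the dictionary sources contain those, the decode_entities function from the HTML::Entities module on CPAN may be the more complete option.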

The entry for rectum is now generated as:

Code:

<head><k>rectum</k></head><def><b>rec</b>‧<b>tum</b></c> /ˈrektəm/  <i><c> noun</c></i> (<i>plural </i><b>rectums</c></b><i> or</i> <b>recta</c></b> /-tə/) [countable]</c><i> medical</c></i>
<blockquote>[Date: </c>1400-1500</c>; Language: </c>Modern Latin</c>; Origin: </c>rectum intestinum</c> </c><i>'straight intestine'</c></i>]</blockquote>
<blockquote> the lowest part of your ↑<kref>bowel</kref>s</c> ⇨ <b>rectal</b></blockquote></def>
</ar>

@GetKey Could you check whether the Longman Dictionary of Nov 15th displays without HTML codes?