Quote:
Originally Posted by nrapallo
I get this output when the program crashes:
Quote:
dict.xdxf, line 78723: unclosed xml tag
|
I've tried to verify what's left unclosed, but cannot locate anything using simple text searches and such. I even tried using xmllint.exe from the libxml2 2.7.6 windows port, but it's not been easy to find the culprit(s).
|
From my experiments: the xdxf file is a well-formed XML document, but converter.exe can fail to parse files that contain lines longer than 4096 bytes (bytes, not characters).
For example, this xdxf file would trigger the error:
<?xml version="1.0" encoding="UTF-8" ?>
<ar><k>word</k>
put_4090_simple_letters_here <fictional tag ends after the 4096 boundary>
So not every long line causes the converter to fail, only the unlucky ones, apparently those where a tag straddles the 4096-byte boundary. We need to break the lines to make them shorter. I use the following awk script, which works for files containing <br><br>:
awk '{ gsub("<br><br>", "<br>\n<br>"); print; }' dict.xdxf >dict2.xdxf
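To check whether dict2.xdxf still contains lines over the limit, something like the following could be used. It is only a sketch: the function name long_lines is made up for illustration, the 4096 default is simply the figure from the experiments above, and I am assuming the trailing newline does not count towards the limit (awk does not count it). LC_ALL=C is set so that length() counts bytes rather than multibyte characters.

```shell
#!/bin/sh
# Hypothetical helper: list lines whose length in bytes exceeds a limit.
#   $1 = file to check
#   $2 = byte limit (optional, defaults to 4096, the value observed above)
long_lines() {
    # LC_ALL=C forces byte semantics, so UTF-8 characters count as
    # several bytes each instead of one character.
    LC_ALL=C awk -v max="${2:-4096}" \
        'length($0) > max { printf "%d: %d bytes\n", NR, length($0) }' "$1"
}
```

Running long_lines dict2.xdxf prints one "line number: size" pair per offending line and nothing at all if every line fits. Any lines that still show up would need to be broken at some other tag, since the <br><br> replacement did not reach them.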
I'll try to find an appropriate place to report this bug. Will follow rkomar's suggestion and have a look at the-ebook.org.