Bug in Kobo processing of epub files causing hang in "Processing content"
Yesterday I converted about 220 articles from the Stanford Encyclopedia of Philosophy into epub and transferred them to my Kobo. I was not surprised to see it get stuck at "Processing Content". Using the very tedious and laborious process described by others, I was eventually able to transfer all but two of the files.
This really is unacceptable. Perhaps the Kobo team is unable to prevent their "Processing Content" stage from freezing. But the device should at least provide better information to help people work around their inadequate software. Instead of saying what percentage it thinks it has completed, the Kobo could tell us which file it is currently processing. We could then strongly suspect that the last file it reports is the one it is stuck on. It could also provide a list of the files it processed without problems. It could probably even transfer those files instead of transferring NO files if it has trouble with one file. (If you get to "90%", for example, on 200 files, NO file at all will be transferred after all that work!)
Putting this issue aside, what was the problem with those two files? I looked into the original html trying to find something that those files had and none of the other files had. And indeed there was something: the value of the "name" attribute in various <a> elements contains colons. For references, for example, "Carnap:23a" was used as the anchor for a paper Carnap published in 1923. This was carried over into the epub files produced by Calibre's command-line ebook-convert. I don't know whether this is allowed in html or in epub's xhtml, but every program I tried accepts it: all browsers, and all epub readers I have on my pc.
The only thing that did not like it was the Kobo, which could not tell me why and either hung or crashed while processing such a file.
I changed all these attribute values so that they no longer contained colons, and the Kobo finally allowed me to add the files.
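For anyone who wants to check their own files before going through the trial and error, here is a rough Python sketch of the kind of check and fix I mean. It is not what I actually ran, and it rests on assumptions: the function names are made up for illustration, the underscore replacement is arbitrary, it treats "id" the same as "name" on the guess that the Kobo might dislike both, and the regular expressions are a crude text match rather than a real html parser, so they will also touch things like <meta name="..."> values or links to other sites.

```python
import re
import zipfile

# name="..." or id="..." attributes, single or double quoted (crude text match).
ANCHOR_ATTR = re.compile(r'''\b(name|id)\s*=\s*(["'])(.*?)\2''', re.IGNORECASE)
# The fragment part of an href, e.g. href="page.html#Carnap:23a".
HREF_FRAGMENT = re.compile(r'''(\bhref\s*=\s*(["'])[^"'#]*#)([^"']*)\2''', re.IGNORECASE)


def find_colon_anchors(markup):
    """Return the name/id attribute values in markup that contain a colon."""
    return [m.group(3) for m in ANCHOR_ATTR.finditer(markup) if ":" in m.group(3)]


def replace_colons(markup, replacement="_"):
    """Replace colons in name/id values and in the href fragments that point to them."""
    def fix_attr(m):
        value = m.group(3).replace(":", replacement)
        return f"{m.group(1)}={m.group(2)}{value}{m.group(2)}"

    def fix_href(m):
        # group(1) is everything up to and including '#', group(2) is the quote.
        return f"{m.group(1)}{m.group(3).replace(':', replacement)}{m.group(2)}"

    markup = ANCHOR_ATTR.sub(fix_attr, markup)
    return HREF_FRAGMENT.sub(fix_href, markup)


def scan_epub(path):
    """Map each html/xhtml member of an epub (a zip archive) to its colon anchors."""
    hits = {}
    with zipfile.ZipFile(path) as book:
        for member in book.namelist():
            if member.lower().endswith((".html", ".xhtml", ".htm")):
                text = book.read(member).decode("utf-8", errors="replace")
                found = find_colon_anchors(text)
                if found:
                    hits[member] = found
    return hits
```

The idea would be to run replace_colons on the source html before conversion, or scan_epub on the converted files to see which ones are likely to cause trouble. The links have to be rewritten along with the anchors, otherwise the in-text references stop working.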
A couple of other comments. First: even if colons are not allowed (I don't feel too much like checking this out now, but I can't imagine why they would not be), that does not excuse the Kobo's failure. If that were the problem, it should simply tell us there is a syntax error in the epub file. It should behave much better in this case, not hang or crash.
Second: this took me a lot longer than it needed to, because once I found the first file the Kobo could not handle (about fifty files in), I put it aside and kept going with the laborious trial-and-error process until I got all but one of the remaining files on. Only then did I look at the two files. Had I taken the first problematic file when I found it and looked for an anomaly in it, I might have been able to avoid much of the rest of the long process. Hopefully others will take a smarter path in this regard.
However, Kobo must fix this. Not just the problem with colons, but also the messages it gives just before it crashes or hangs on an epub it cannot handle.