MobileRead Forums - View Single Post

jackie_w · 12-15-2012, 05:42 PM

Quote:

Originally Posted by meme

Pretty Print Tidy will not create any sgc classes. Only HTML Tidy will do that. Pretty Print Tidy will format your code to look nice AND it will try to correct common errors in the file in order to make it valid XML code.

Thanks for this info, meme. I can confirm that, because of my concern about those wretched sgc-n classes, I had originally set Clean Source to OFF. Now that I've set it to 'Pretty Print Tidy', all seems well. Self-inflicted problem. D'oh!

Quote:

Originally Posted by meme

To be more specific about the issue with nbsp in the file - the files are missing a DOCTYPE that describes what is in the document. You can add this in manually, or just let Pretty Print Tidy do it for you - which I recommend for most people.

Code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

It looks as if a calibre import/conversion (html-->zip-->epub) doesn't add an XHTML DOCTYPE declaration at any stage, which surprises me a little. That said, I can't say I've ever had a problem reading a calibre-converted epub on any of my readers. calibre's headers start like this:

Code:

<?xml version='1.0' encoding='utf-8'?>
<html xmlns="http://www.w3.org/1999/xhtml">
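To illustrate why the missing DOCTYPE matters for nbsp, here is a rough Python sketch. It assumes Python's built-in xml.etree.ElementTree as a stand-in for a strict XML parser (it is not what Sigil or calibre use internally), and the helper name parses_ok is made up for the example. With a calibre-style root element and no DTD, the named entity is undefined, whereas the numeric reference and the literal character should both parse.

```python
import xml.etree.ElementTree as ET

# calibre-style root element: XHTML namespace, but no DOCTYPE.
HEADER = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>'
FOOTER = '</p></body></html>'


def parses_ok(fragment):
    """Hypothetical helper: True if the fragment parses as XML inside the wrapper."""
    try:
        ET.fromstring(HEADER + fragment + FOOTER)
        return True
    except ET.ParseError:
        return False


named = parses_ok('a&nbsp;b')     # named entity, needs a DTD to define it
numeric = parses_ok('a&#160;b')   # numeric character reference, no DTD needed
literal = parses_ok('a\u00a0b')   # the single Unicode character U+00A0
```

Only the named form depends on a DTD, which fits meme's point that the DOCTYPE is what makes the nbsp entity valid.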

Something else I noticed about nbsp/calibre/Sigil is that calibre always changes the &nbsp; entity to the single Unicode character during the html-->zip import process, but Sigil changes it back to an entity again. I did wonder whether calibre did this because some readers couldn't handle the entity, but the Sigil edited epub seems to be OK on the Sony PRS350 and the KoboGlo I tried last night.
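For anyone who wants to see the two representations side by side, here is a minimal Python sketch of the round trip. It only illustrates the entity and the character being swapped; the function names are invented for the example and this is not how calibre or Sigil actually implement it.

```python
NBSP_ENTITY = '&nbsp;'
NBSP_CHAR = '\u00a0'  # NO-BREAK SPACE, the character the entity stands for


def entity_to_char(text):
    """Entity -> single Unicode character (the direction calibre appears to go)."""
    return text.replace(NBSP_ENTITY, NBSP_CHAR)


def char_to_entity(text):
    """Single Unicode character -> entity (the direction Sigil appears to go)."""
    return text.replace(NBSP_CHAR, NBSP_ENTITY)


sample = 'Chapter&nbsp;1'
as_char = entity_to_char(sample)
round_trip = char_to_entity(as_char)
```

Both forms should render the same on a reader; the character form simply doesn't rely on the entity being declared anywhere.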