MobileRead Forums > E-Book Formats > ePub
04-01-2012, 05:39 PM   #1
KevinH
Sigil Developer
Posts: 7,636
Karma: 5433388
Join Date: Nov 2009
Device: many
questions on self-closing tags and legal xhtml in epubs

Hi,

I have been playing around with html5lib and lxml in Python and libxml2 in C to write code that processes epubs, and I have run into difficulties parsing xhtml documents that contain the following self-closing tags. Are these legal in strict xhtml as used in epub 2? Are they still legal in epub 3?

<title />

<a id="blah" />

<div id="blah" />

<div id="blah" class="clearfix" />

When I parse xhtml with these self-closing tags in it, the parsers get very confused (and this must all tie back to libxml2, since I believe they are all front ends to that library). They either start replacing the tag's < and > with their html entities, or they assume the ending tag is never found and add a new ending tag much, much farther on, which can easily change the meaning, especially for the float-clearing "clearfix" class approach.

Even modern browsers seem to have trouble dealing with these particular self-closing tags.

I know in pure xml almost any tag can be a self-closing tag, but I thought under strict XHTML for epubs only specific tags like <meta /> and <hr /> were allowed to be self-closing and that all others must be explicitly and separately closed to guarantee proper ebook viewing.

Does anyone know what the spec actually says? Having to work around these bugs is quite painful, and looking for and fixing all of these before parsing the xhtml makes things quite slow at times.
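For reference, the "look for and fix before parsing" workaround could be sketched roughly as follows. This is only a minimal illustration, not the code actually used in any of the tools mentioned: the function name `expand_self_closing`, the regular expression, and the choice of which elements count as void (taken from the HTML list of void elements) are all assumptions, and a regex pass like this ignores edge cases such as comments, CDATA sections, or a `>` inside an attribute value.

```python
import re

# Elements that HTML treats as void; these may safely stay as <br />, <hr />, etc.
# (Assumed list for this sketch.)
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "param", "source", "track", "wbr",
}

# Matches a self-closing tag such as <div id="blah" class="clearfix" />.
# Group 1 is the tag name, group 2 the raw attribute text (possibly empty).
SELF_CLOSING = re.compile(r"<([A-Za-z][\w:.-]*)((?:\s+[^<>]*?)?)\s*/>")


def expand_self_closing(xhtml: str) -> str:
    """Rewrite <tag ... /> as <tag ...></tag> for non-void elements."""

    def fix(match):
        name = match.group(1)
        attrs = match.group(2).rstrip()
        if name.lower() in VOID_ELEMENTS:
            return match.group(0)  # leave <br />, <meta ... /> untouched
        return f"<{name}{attrs}></{name}>"

    return SELF_CLOSING.sub(fix, xhtml)


print(expand_self_closing('<div id="blah" class="clearfix" />'))
```

Running every file through a pass like this before handing it to an html-oriented parser is exactly the extra step that costs time on large books.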

Ideas anyone?

Thanks,

Kevin
04-01-2012, 09:11 PM   #2
DaleDe
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Self closing tags should be those things that don't have data. There is no reason to have a <div> that is self closing. It makes no sense at all as div is meant to enclose something. You can just assign the id to a different tag. There is generally no reason to use the a tag by itself any longer for the same reason. I wouldn't bother using it for title either.
04-01-2012, 09:54 PM   #3
KevinH
Sigil Developer
Posts: 7,636
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi,

Thanks for your response. I don't want to use them. I am finding them in the wild inside epubs, where they are not displayed properly by some of the ebook readers I have access to (nor by all browsers), and they are not handled properly by lxml, html5lib and libxml2, which are often used to parse xhtml and are typically found inside ebook reading / handling software like kindlegen, calibre, sigil, etc.

I think the "clearfix" example is often used to fix bugs when using css to float an image right or left. This float behaviour often needs to be cleared. The div can carry a class that actually clears the float without containing anything else, as the following text needs to wrap around the floated image. The others are simply strange to me, but they do exist, even inside commercial epubs.

I was hoping that someone would have some idea whether they are actually legal xhtml or an artifact of xml processing software being improperly used to handle xhtml code.

KevinH


04-02-2012, 01:07 AM   #4
DaleDe
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Ahh. You can read about ePub in our wiki, and I am pretty sure these are not legal in xhtml. The wiki has links to the specs. While xhtml is modeled on xml, it is really designed to make html conform to the standards of xml, not to turn it into some arbitrary xml. Hope this helps. You are certainly right that many ebook readers will not interpret these like xml. They are all basically designed for html that conforms to xml, and this is what the spec says.

Dale
04-02-2012, 01:53 PM   #5
Jellby
frumious Bandersnatch
Posts: 7,515
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I've used self-closing divs sometimes. For scene breaks, where I want an empty div with a fixed height, writing <div class="break" /> seemed cleaner than <div class="break"></div>. It works fine in my reader (ADE-based) and didn't cause flightcrew to complain. Last time I checked, I came to the conclusion that it was valid.
04-23-2012, 10:12 PM   #6
user_none
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
To fully answer this question: yes, self-closing tags (the a and div elements in this example) are perfectly valid according to the EPUB spec. The following example conforms to the EPUB 2 spec.

Code:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title/>
</head>

<body>
<p>
  <a id="blah" />
</p>

<div id="blah1" />

<div id="blah2" class="clearfix" />
</body>
</html>
The relevant sections of the EPUB 2 spec are 1.4.1.2 and Appendix A. XHTML 1.1 does allow self-closing tags as used above. Specifically, Appendix A requires that the document validate against the XHTML 1.1 DTD.
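As a rough cross-check, the same document can be fed to a plain XML parser to see how the self-closing elements come out. The sketch below uses Python's standard-library `xml.etree.ElementTree`, which is an assumption made here for convenience (the thread itself discusses lxml, html5lib and libxml2). It only confirms that the document is well-formed and shows the resulting tree; it does not validate against the XHTML 1.1 DTD. The helper `is_empty` is a hypothetical name introduced for the illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# The example document from the post above, as bytes so the encoding
# declaration is honoured by the parser.
DOC = b"""<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title/>
</head>
<body>
<p>
  <a id="blah" />
</p>
<div id="blah1" />
<div id="blah2" class="clearfix" />
</body>
</html>
"""

# Raises xml.etree.ElementTree.ParseError if the input is not well-formed XML.
root = ET.fromstring(DOC)


def is_empty(elem):
    """True if the element has no child elements and no non-blank text."""
    return len(elem) == 0 and not (elem.text or "").strip()


# Every element carrying an id in this sample should come out empty.
empty_ids = [e.get("id") for e in root.iter() if e.get("id") and is_empty(e)]

# The two divs should be siblings directly under body, not nested.
body = root.find(f"{XHTML_NS}body")
body_children = [child.tag.replace(XHTML_NS, "") for child in body]

print(empty_ids)
print(body_children)
```

When parsed as XML, each self-closing element is an empty element and the two divs stay siblings under body. The trouble described earlier in the thread shows up when the same bytes go through an html-oriented parser, which may treat `<div ... />` as an opening tag only.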