MobileRead Forums - View Single Post

rpspringuel · 02-14-2014, 01:01 PM

Quote:

Originally Posted by kovidgoyal

IIRC epub:type="pagebreak" is an epub3 specific extension. Currently, almost nothing supports it.

Well, my reason for using it isn't that it's currently supported, but rather to future-proof my work and make it easier on the next guy who wants to expand on this. Since I'm making an admitted hack on how azw3 works, and thus could do any number of things, I might as well use something that will be nicer for whoever comes next.

Quote:

Originally Posted by kovidgoyal

The colon refers to a XML namespace. If you want to use it, you have to declare the epub namespace and make sure the document you are modifying is valid XML. The IDPF just likes to make everyone's life harder by using XHTML instead of plain HTML 5.

Declaring the namespace isn't that hard. I just need to add xmlns:epub="http://www.idpf.org/2007/ops" in the right place.
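To illustrate what the declaration buys you, here is a minimal Python sketch using only the standard library's xml.etree.ElementTree. The sample document, the id/title values on the span, and the helper names is_well_formed and find_pagebreak_ids are assumptions made up for this example; this is not how calibre's editor handles the file internally.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"  # the namespace URI quoted above

# Hypothetical sample document; the epub prefix is declared on the html tag.
DECLARED = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:epub="http://www.idpf.org/2007/ops">'
    '<body><p>End of page 11.'
    '<span epub:type="pagebreak" id="page12" title="12"/>'
    'Start of page 12.</p></body></html>'
)

# Same document with the declaration removed: epub: is now an unbound prefix.
UNDECLARED = DECLARED.replace(' xmlns:epub="http://www.idpf.org/2007/ops"', "")


def is_well_formed(xhtml_text):
    """Return True if a namespace-aware XML parser accepts the text."""
    try:
        ET.fromstring(xhtml_text)
    except ET.ParseError:
        return False
    return True


def find_pagebreak_ids(xhtml_text):
    """Return the id of every span whose epub:type is "pagebreak"."""
    root = ET.fromstring(xhtml_text)
    type_attr = f"{{{EPUB_NS}}}type"  # ElementTree spells names as {uri}local
    return [
        span.get("id")
        for span in root.iter(f"{{{XHTML_NS}}}span")
        if span.get(type_attr) == "pagebreak"
    ]
```

With the declaration in place the parser accepts the document and the marker can be found by its namespaced attribute. Without it, a namespace-aware parser rejects the file because the epub prefix is unbound.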

Unfortunately, the editor currently doesn't know what to do with this declaration. If I add it to the metadata tag in the metadata file (where several other namespaces are declared), the declaration is lost in a save/close/reopen cycle. Uses of the namespace in the text files are left in place, except that the : gets converted to u0003a. If I try to use one of the other namespaces declared in the same place (dc, opf, calibre), the character swap still happens. If I declare the epub namespace within the html tag of a text document, the declaration is removed and the namespace is stripped from the tags where it is used (i.e. epub:type="pagebreak" becomes type="pagebreak"). All of this is specific to editing azw3 files; editing ePubs exhibits none of these behaviors (ePubs even retain the : when the namespace hasn't been declared).

As for the file being valid XML, isn't that a given? I understood azw3 to be an Amazon-specific compilation of ePub. Since ePub files have to be valid XML (or, more specifically, XHTML), shouldn't an azw3 file be valid XML too? Am I missing something?

Quote:

Originally Posted by kovidgoyal

Other than that, it's fine, although note that inserting an empty span tag into a document can have side effects, since the document can use CSS selectors based on tag counts.

As I said before, the only sure way of modifying the document with no side effects is to use data- attributes. But that has the limitation of restricting page markers to existing tag locations.

Unfortunately, I think this is a risk that I'll have to live with. If I force the pagebreak markers onto existing tags, I'll have to move them away from where the breaks actually occur, which rather defeats the purpose of what I want to do in the first place. Since this approach is based on the ePub standard (which also uses span tags), that would imply that CSS selectors based on tag counts are not recommended in this situation anyway.
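The trade-off between the two approaches can be sketched in Python with the standard library's xml.etree.ElementTree. Everything here is an assumption for illustration: the sample paragraph, the helper names, the id/title values, and the data-pagebreak attribute name are hypothetical, and the offset handling only covers the text before the paragraph's first child tag.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"

# Keep readable prefixes when serializing (affects ElementTree output only).
ET.register_namespace("", XHTML_NS)
ET.register_namespace("epub", EPUB_NS)

# Hypothetical paragraph with two existing child tags.
SAMPLE = (
    '<p xmlns="http://www.w3.org/1999/xhtml">'
    'one <em>two</em> three <em>four</em></p>'
)


def insert_pagebreak_span(para, offset, page):
    """Insert an empty span at character `offset` of the paragraph's leading text."""
    span = ET.Element(
        f"{{{XHTML_NS}}}span",
        {f"{{{EPUB_NS}}}type": "pagebreak", "id": f"page{page}", "title": str(page)},
    )
    text = para.text or ""
    # Split the leading text so the marker sits exactly at the break.
    para.text, span.tail = text[:offset], text[offset:]
    para.insert(0, span)  # every existing child moves one position to the right
    return span


def mark_with_data_attribute(element, page):
    """Tag an existing element instead; the shape of the tree is unchanged."""
    element.set("data-pagebreak", str(page))  # hypothetical attribute name
    return element


if __name__ == "__main__":
    para = ET.fromstring(SAMPLE)
    insert_pagebreak_span(para, 3, 12)
    print(ET.tostring(para, encoding="unicode"))
```

After insert_pagebreak_span runs, the first em is the paragraph's second child rather than its first, so a rule such as em:first-child would stop matching it. mark_with_data_attribute leaves the child count alone, but the marker can only sit on a tag that already exists.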