element "u" not allowed anywhere - Page 3

Toxaris · 04-18-2014, 05:21 PM

Quote:

Originally Posted by skreutzer

Well, but you guys are missing one important point here: the standard specifications are for machine readability, a check for formal correctness. They also define how a device manufacturer should render the defined constructs, but of course they can't force manufacturers to react to constructs according to the standard, they can't provide implementations for every combination of constructs, and they even leave some decisions up to the implementor (such as footnote rendering in EPUB3), which is perfectly fine. There is reading software other than the renderers in e-reading devices, such as web servers and processing tools, which might render completely differently or not at all. And as devices change over time, the standard serves as a common protocol for how information should be encoded, so that software and devices of the future can access the encoded information in the best possible way.

Have you read the specs at all? There is a lot that can be interpreted in different ways. Some of it is minor (auto margins), but some is major and concerns structure. Good specifications and standards should not leave room for interpretation. It should be clear what is and what isn't correct. Then, and only then, can a good validator be built, and compliant readers would interpret everything the same way. Unfortunately, that is a utopia.

skreutzer · 04-18-2014, 05:44 PM

No, of course I haven't read the specification in its entirety, but you know well that there is much less room for interpretation regarding structure than there is for rendering, that a lot of statements are made to specify what is actually valid structure, and that there's even a validator implementation available from the IDPF. After all this, the room left for interpretation regarding structure is neither common in real-world cases nor trivial, or it needs to be fixed in the future. On the other hand, invalid constructs can without doubt be detected, too. And on the reader side, as already mentioned, developers have to expect worst-case scenarios for the input files; on the EPUB creation tool side, developers have to expect worst-case scenarios for the readers of their output files, instead of causing those worst-case scenarios themselves and leaving the mess to the user/reader, who can't help himself.

Jellby · 04-19-2014, 02:56 AM

Quote:

Originally Posted by Toxaris

Good specifications and standards should not leave room for interpretation. It should be clear what is and what isn't correct. Then, and only then, can a good validator be built, and compliant readers would interpret everything the same way. Unfortunately, that is a utopia.

Even in that utopian world we would still have non-compliant readers that call themselves EPUB readers, though.

Rev. Bob · 04-23-2014, 03:22 AM

Quote:

Originally Posted by Toxaris

The problem is that even if the validation is green, it says nothing. The book can still be broken. The validation checks several things, but not whether the ePUB will work. The real issue is that all readers (and applications) have their own interpretation of the specifications and that the specifications give room for interpretation.

Sticking to the specifications and standards is always a good idea, but the validation will not help you with that.

Back in the long long-ago, in my first Computer Science class, I was taught that there are three classes of errors when writing software: syntax, run-time, and logic. Take it as a given that when writing something, you're going to make mistakes; you want to pray that they're as close to the beginning of that list as possible, because that's the easiest stuff to fix.

A validator is strictly a syntax checker. Formats like EPUB have DTDs that define the permitted syntax in precise ways, and validation uses that data to catch all that easy stuff. If your book doesn't validate, nobody's going to care about fixing any rendering problems you have with it; all they have to do is point out that you're feeding the renderer invalid output, and that makes it your problem.

That certainly doesn't mean that every syntactically valid document will display properly; that's where run-time (in this case, rendering engine) and logic (i.e., bad CSS rules that interact in an unforeseen way) problems come into play. However, you want to start on the firmest foundation you possibly can...and that means using validation.

Fun fact: Calibre deviates from the NCX specification in a way that the EpubCheck tool won't catch. Specifically, it assigns a dtb:depth value that is one level too high, because Kovid apparently misread that part of the spec. I know it's been called to his attention at least once, but he took the same sort of "meh, works okay, doesn't matter, won't fix it" approach described earlier in this thread. So, if you care about getting it right, don't forget to tweak that value...

Rev. Bob · 04-23-2014, 03:46 AM

Quote:

Originally Posted by Toxaris

There is a lot that can be interpreted in different ways. Some of it is minor (auto margins), but some is major and concerns structure. Good specifications and standards should not leave room for interpretation. It should be clear what is and what isn't correct.

You're equivocating between two meanings of the term interpret.

EPUB has specs, expressed as DTDs, that exactly define the permitted structure. That's a statement of fact; that's what a DTD exists to do, and both EPUB 2.0.1 and EPUB 3.0 have one. In a conflict - a validation error - between a given EPUB and the spec it claims to follow, the DTD is right, literally by definition. In that sense, there's no fuzziness, no room for "interpretation" in the first sense - there's just Correct and Not Correct.

The "interpretation" you're talking about is the second sense, on the rendering side, which is about how a given engine processes a document, and there are several reasons to allow fuzziness on that side. For instance, an audio rendering engine (remember, DTB stands for Digital Talking Book!) has no use for visual instructions, and is therefore free to disregard them.

The rendering engine exists to convert a standard-format document into something usable on whatever hardware is running that engine, and since there's all sorts of different hardware out there, that means we need all kinds of different trade-offs. That has implications that go both ways, though; just as an author can't rely on colors for emphasis because audio and grayscale devices exist, those devices should also have some way of indicating that those cues exist. A valid document should never break a rendering engine; if it does, the engine is faulty. However, an invalid document is by definition broken, and while it's good for engines to be tolerant of minor errors, an author should always assume that any invalid content will break in some engine, somewhere, that didn't build in that level of fault tolerance.

Both parties play a role, and the author's is to make sure that they provide clean content that is well-formed and as easy to process as possible. Doing so means the document works for the widest audience, and that's never a bad thing.

JLius · 04-23-2014, 04:00 AM

Hi all

I guess the answers to my initial question are a lot more complicated than the question itself

I don't understand half of what you guys are talking about, but I find it interesting nonetheless.

Quote:

Fun fact: Calibre deviates from the NCX specification in a way that the EpubCheck tool won't catch. Specifically, it assigns a dtb:depth value that is one level too high, because Kovid apparently misread that part of the spec. I know it's been called to his attention at least once, but he took the same sort of "meh, works okay, doesn't matter, won't fix it" approach described earlier in this thread. So, if you care about getting it right, don't forget to tweak that value...

Care to elaborate on that? Where is this dtb:depth value set, and how do I tweak it so that EpubCheck catches the deviations?

By the way, I use Calibre to try to create an EPUB novel from an MS Word document. I use the Calibre conversion because it already does some of the work for me. (You can also import directly in the editor, but for some reason I always get error messages when I try it that way.)
But I don't keep any of Calibre's CSS; I erase it and replace it with my own. I clean up the HTML, trying to keep the style as simple as possible. I also use Calibre to generate a TOC.
Used like this, I believe Calibre can be a valid tool, same as Sigil, no?

Toxaris · 04-23-2014, 04:20 AM

Quote:

Originally Posted by JLius

Care to elaborate on that? Where is this dtb:depth value set, and how do I tweak it so that EpubCheck catches the deviations?

You can't. Well, you can change the value in the NCX, but a wrong value will simply not get caught by the validation.
If the value created by Calibre is wrong, then it should be corrected by Kovid whether it works or not. You should always keep as close to the specifications as possible.

Quote:

Originally Posted by JLius

By the way, I use Calibre to try to create an EPUB novel from an MS Word document. I use the Calibre conversion because it already does some of the work for me. (You can also import directly in the editor, but for some reason I always get error messages when I try it that way.)
But I don't keep any of Calibre's CSS; I erase it and replace it with my own. I clean up the HTML, trying to keep the style as simple as possible. I also use Calibre to generate a TOC.
Used like this, I believe Calibre can be a valid tool, same as Sigil, no?

I haven't tested Calibre's Word conversion fully, but I just don't like what he does with the styles. I fully understand why he does it like that, and it is his choice. No problem with that.
However, I like the methods I use myself (no surprise...) with my Add-in. The downside for some people is that it needs Word (2007 and up) and must run on Windows (blame MS for that). A lot of your manual steps are automated that way.

Rev. Bob · 04-23-2014, 04:39 AM

Quote:

Originally Posted by JLius

I guess the answers to my initial question are a lot more complicated than the question itself

I don't understand half of what you guys are talking about, but I find it interesting nonetheless.

Actually, the answer to your question is really simple. <u> is not a valid element in an EPUB's XHTML code, which is why you got an error message that translates to "that thing's not allowed here!" The solution is, as stated earlier, to change all your U elements to SPAN elements that are associated with a CSS rule that says to underline the contents. Other posts have gone into more detail on that front; there's no need for me to repeat it.
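For anyone who wants to see that fix spelled out, here is a minimal sketch in Python. The class name "underline" and the function name are just illustrative assumptions, and it only handles bare <u> tags with no attributes; it is one way to do the swap, not the only one.

```python
import re

# Hypothetical class name; any name works as long as the CSS rule matches it.
UNDERLINE_CLASS = "underline"

# The rule to add to the book's stylesheet so the spans are underlined.
CSS_RULE = f"span.{UNDERLINE_CLASS} {{ text-decoration: underline; }}"

# Match only bare <u> / </u> tags, so <ul> and </ul> are left alone.
_OPEN_U = re.compile(r"<u\s*>", re.IGNORECASE)
_CLOSE_U = re.compile(r"</u\s*>", re.IGNORECASE)


def replace_u_elements(xhtml: str) -> str:
    """Swap <u>...</u> pairs for <span class="underline">...</span>."""
    xhtml = _OPEN_U.sub(f'<span class="{UNDERLINE_CLASS}">', xhtml)
    return _CLOSE_U.sub("</span>", xhtml)


if __name__ == "__main__":
    sample = "<p>This is <u>important</u>.</p>"
    print(replace_u_elements(sample))
    print(CSS_RULE)
```

The same thing can be done with a search-and-replace in Sigil or any text editor; the point is only that each U element becomes a SPAN with a class, and the stylesheet carries the underline.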

Quote:

Originally Posted by JLius

Care to elaborate on that? Where is this dtb:depth value set, and how do I tweak it so that EpubCheck catches the deviations?

It's in your NCX file, which is probably named toc.ncx, and you'll see it near the beginning of the file. If you don't have any nesting in your table of contents, that value should be 1 - because you have a "flat TOC" which is one level deep. However, Calibre will give it a value of 2.

"Nesting" in that sense is, for instance, something like this:

Book One: The Larch
- Part I
-- Chap 1
-- Chap 2
- Part II
-- Chapman
-- Cleese
-- Palin
Book Two: Canada

... and so on. That structure shows a TOC that is three levels deep. The "Book X" is the first level, "Part Y" is the second, and "Chapter Z" is the third. Thus, for that TOC, the META tag in the NCX header that has the name "dtb:depth" should have a value of 3.

However, Calibre will assign a value of 4, because it's silly that way. I can't explain better without going pretty far into the weeds, and we're not exactly on paved ground as it is.
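If you'd rather not count levels by eye, the check can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function names are made up, the sample NCX is trimmed (no docTitle or content elements), and it just reports the declared and actual depth so you can correct the META value by hand.

```python
import xml.etree.ElementTree as ET

# Namespace used by NCX 2005-1 documents.
NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"
NAVPOINT = f"{{{NCX_NS}}}navPoint"


def nav_depth(element) -> int:
    """Return the deepest navPoint nesting level at or below `element`."""
    deepest_child = max(
        (nav_depth(child) for child in element.findall(NAVPOINT)), default=0
    )
    own_level = 1 if element.tag == NAVPOINT else 0
    return own_level + deepest_child


def declared_and_actual_depth(ncx_text: str):
    """Return (value of the dtb:depth meta, depth counted from the navMap)."""
    root = ET.fromstring(ncx_text)
    actual = nav_depth(root.find(f"{{{NCX_NS}}}navMap"))
    declared = None
    for meta in root.iter(f"{{{NCX_NS}}}meta"):
        if meta.get("name") == "dtb:depth":
            declared = int(meta.get("content"))
    return declared, actual


# Trimmed sample: Book > Part > Chap is three levels, but the header says 4.
SAMPLE = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head><meta name="dtb:depth" content="4"/></head>
  <navMap>
    <navPoint id="b1" playOrder="1">
      <navLabel><text>Book One: The Larch</text></navLabel>
      <navPoint id="p1" playOrder="2">
        <navLabel><text>Part I</text></navLabel>
        <navPoint id="c1" playOrder="3">
          <navLabel><text>Chap 1</text></navLabel>
        </navPoint>
      </navPoint>
    </navPoint>
    <navPoint id="b2" playOrder="4">
      <navLabel><text>Book Two: Canada</text></navLabel>
    </navPoint>
  </navMap>
</ncx>"""

if __name__ == "__main__":
    declared, actual = declared_and_actual_depth(SAMPLE)
    print(f"declared dtb:depth = {declared}, actual nesting = {actual}")
```

When the two numbers differ, the one counted from the navMap is the one the header should carry.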

Quote:

Originally Posted by JLius

But I don't keep any of Calibre's CSS; I erase it and replace it with my own. I clean up the HTML, trying to keep the style as simple as possible. I also use Calibre to generate a TOC.

I'm actually in the process of writing a book, too. I'm doing it all from scratch, though; I use LibreOffice* for the actual writing, and I maintain a CSS file that correlates to the style names I've defined. Despite how the usual guides say to do things,** I'm keeping each book element (part dividers, chapters, dedication, etc.) in a separate file, but I'm using the same template for all of it. This way, as I complete or edit a chapter, I can add it to the EPUB-in-progress and check it out.

If I were using Calibre, I'd be afraid that it would merrily rename all of my classes and thereby break that link between my hand-tooled CSS file and the LibreOffice-generated HTML.

* Weirdly, the Portable version works better for me than the "real" Windows version. I don't know why, but the "real" version exports uppercase HTML tags while the Portable version exports lowercase tags. Pretty much all I have to do is replace the header, unwrap the text, convert <br> to <br/>, and I'm good to go. If anyone reading knows how to automate those steps, I'd really like to know. I can do it in a few seconds, but that's every time on every file...

** Smashwords says to use one big document, but I'm hoping that by the time I'm done, their "direct EPUB upload" feature will be out of beta and will be set up to cross-generate to their other formats. I don't trust meat grinders.
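On automating the cleanup from the first footnote: a rough Python sketch of two of those steps (unwrapping the text and converting <br> to <br/>), plus lowercasing the tag names, might look like the following. The function name is hypothetical, it leaves the header replacement and uppercase attribute names alone, and it assumes paragraphs are plain <p> elements; treat it as a starting point rather than a tested converter.

```python
import re

_BR = re.compile(r"<br\s*>", re.IGNORECASE)
_TAG_NAME = re.compile(r"(</?)([A-Za-z][A-Za-z0-9]*)")
_PARAGRAPH = re.compile(r"<p\b[^>]*>.*?</p>", re.DOTALL)


def clean_libreoffice_html(html: str) -> str:
    """Apply the small fixes needed before the markup goes into an EPUB."""
    # 1. Self-close line breaks so the markup is well-formed XHTML.
    html = _BR.sub("<br/>", html)
    # 2. Lowercase element names (attribute names are left untouched).
    html = _TAG_NAME.sub(lambda m: m.group(1) + m.group(2).lower(), html)
    # 3. Unwrap hard-wrapped lines, but only inside each paragraph.
    return _PARAGRAPH.sub(
        lambda m: re.sub(r"\s*\n\s*", " ", m.group(0)), html
    )


if __name__ == "__main__":
    exported = '<P CLASS="body">First line<BR>\nsecond line</P>\n<P>Next</P>'
    print(clean_libreoffice_html(exported))
```

Run over each exported chapter file, this would replace the by-hand search-and-replace passes; the header still has to be swapped separately.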

Rev. Bob · 04-23-2014, 04:51 AM

Quote:

Originally Posted by Toxaris

If the value created by Calibre is wrong, then it should be corrected by Kovid whether it works or not. You should always keep as close to the specifications as possible.

That was my argument, too. Kovid disagreed, rather strongly, for reasons that still don't make sense to me.

For those who want to join me out here in the weeds...

Spoiler:

JLius · 04-23-2014, 05:19 AM

All right, thanks for the elaboration.
I haven't encountered any problems with the TOC generated by Calibre, though; I've tested it on the Sony PRS-T1, iBooks, and Readium.
So you're saying that, with a flat TOC, I should change the value from "2" to "1" (no nesting in my TOC) and my TOC is best friends again with the EPUB specs?

Rev. Bob · 04-23-2014, 05:23 AM

Quote:

Originally Posted by JLius

So you're saying that, with a flat TOC, I should change the value from "2" to "1" (no nesting in my TOC) and my TOC is best friends again with the EPUB specs?

As Pennywise would say, "Kee-RECT!"

Jellby · 04-23-2014, 05:32 AM

Quote:

Originally Posted by Rev. Bob

Even when shown the certified correction, Kovid fell back on the "it doesn't break anything, so why bother" defense for keeping the status quo. I disagree, but it's his program and his choice.

Have you tried submitting a patch? Maybe he can't be bothered to introduce the change himself, but wouldn't object to someone else doing it.

Arios · 04-29-2014, 10:59 AM

Sorry to be late and off topic, but as the initial question of the thread seems to be solved, it's less of a problem.

@Rev. Bob: Very good replies (#34, #35)! Overall, your answer is very instructive because it helps me distinguish several things that I tend to confuse: what is related to the DTD, and what trade-offs devices must make to deal with different specs (or their own "limits"). Food for thought. The distinction between "interpret" in a strict and a fuzzy sense is also well thought out.

I can not help you with the problem you have reported about LO but have you tried Amanuensis (http://amanuensis.pagesperso-orange.fr/) as it is specifically designed to use odt files?

Or AWP (http://www.atlantiswordprocessor.com/en/)

Not a solution, I know, just a detour.

Rev. Bob · 04-29-2014, 02:46 PM

Quote:

Originally Posted by Arios

I can not help you with the problem you have reported about LO but have you tried Amanuensis (http://amanuensis.pagesperso-orange.fr/) as it is specifically designed to use odt files?

I have not, because I've got LO-Portable tamed (with some help from Notepad++) to meet my needs well enough. The work I have to do to fix LOP-output HTML is annoying, but minimal, especially if I keep on top of it by building the book chapter by chapter, as I have been.

Arios · 04-29-2014, 08:12 PM

Quote:

I have not, because...

Understood!

Have a nice evening/day.

