03-10-2011, 01:52 PM   #4
haven
Device: Kindle 3, iPad, Iliad Irex
Quote:
Originally Posted by kovidgoyal
nowrap makes no sense on an inline element, only on a block level element. Some html renderers will automatically convert an inline element with nowrap to a block element, others will ignore the nowrap. The former is what you see happening.
That makes sense to me intuitively. I'm inexperienced enough in this territory that I don't even know which standards apply, so http://www.w3.org/TR/CSS2/text.html#white-space-prop may be the wrong place to look. But if it does apply, w3.org defines white-space as applying to "all elements", no?

My example code fragment was a trimmed, sanitized version of an online book from a well-regarded author and publisher. (Their upper case looked quaint to me too, but I left it verbatim in case it was relevant to debugging.) Now that I've found the issue, which pervades the book, I can fix it for my own purposes with an awk script or, worst case, by writing my own parser. I thought it might be helpful to write it up, since similar code blocks could affect others. I'm attaching a picture of the differential output before and after Calibre parses the HTML fragment I posted; the purple highlighting shows where the Calibre parser gets confused by the assumption that white-space: nowrap implies a block-level element.
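
For anyone wanting to do the same cleanup, here is a minimal sketch of the kind of script I mean, in Python rather than awk. It rests on assumptions: that the offending markup is an inline style declaration along the lines of STYLE="white-space: nowrap" in a double-quoted attribute, and that simply dropping the declaration is acceptable. The function name strip_nowrap is made up for illustration, and a regex pass like this is not a real HTML parser.

```python
import re

# Assumption: the problem markup looks like <SPAN STYLE="white-space: nowrap">.
# The real book's markup may differ, so adjust the patterns as needed.
NOWRAP_DECL = re.compile(r"white-space\s*:\s*nowrap\s*;?\s*", re.IGNORECASE)
STYLE_ATTR = re.compile(r'\s+style\s*=\s*"([^"]*)"', re.IGNORECASE)


def strip_nowrap(html):
    """Remove white-space: nowrap from double-quoted style attributes."""

    def fix_attr(match):
        remaining = NOWRAP_DECL.sub("", match.group(1)).strip()
        # Drop the attribute entirely if nothing else was declared in it.
        # A surviving attribute is written back with a lowercase name.
        return f' style="{remaining}"' if remaining else ""

    return STYLE_ATTR.sub(fix_attr, html)


if __name__ == "__main__":
    sample = '<SPAN STYLE="white-space: nowrap">some text</SPAN>'
    print(strip_nowrap(sample))
```

Other declarations in the same style attribute are left alone; only the nowrap declaration is removed before the file goes to Calibre.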

Thanks for the incredibly quick reply, and for your invaluable labor of love in creating Calibre. I'm humbled by your contribution and very appreciative.
Attached Thumbnails
Name: raw-vs-parsed-html-highlighted.png