Old 02-08-2018, 03:48 PM   #95
sherman
Guru
Posts: 876
Karma: 2676800
Join Date: Aug 2008
Location: Taranaki - NZ
Device: Kobo Aura H2O, Kobo Forma
Quote:
Originally Posted by sjfan
Knuth-Plass is not suitable for use with HTML in general; floats can cause the line width to vary based on the line height of preceding lines, which can in turn vary if you use dynamic line breaks the way K-P does.

Basically, you don't know the allowed width of a given line without having already fixed the layout of the lines above it.

Jonathan Kew discusses here (in the item explaining why Firefox doesn't use K-P):


The good news is that the simple case (fixed- or at least known-width lines) should be sufficient for many ebooks, and K-P is ideal there, but you're going to have to think about how to determine ahead of time whether you can precalculate the line widths, and how to fall back to a different line-breaking algorithm if you cannot. “I don't support floats” is one possible answer, but many ebooks do use inline images so it's probably not acceptable.

You may also want to detect long paragraphs, since K-P is quadratic on the length of lines. If you can find it, P.I Cooper's article in the July 1966 Advances in Computer Typesetting discusses a way to break paragraphs into overlapping chunks to avoid degenerate behavior in long paragraphs.
Note that TeX and its variants use K&P, and they also support floats, so it's not an insurmountable problem. Whether you can do it quickly enough, on the other hand...

Chances are, especially in an ebook, that a float is going to occur at the beginning of a paragraph. If the float is not moved up, then it's a simple matter of placing it before setting the rest of the paragraphs. The float will have fixed dimensions by this point, so the width of every line beside and below it is known before any breaking starts.
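
To make the simple case concrete, here is a rough Python sketch of what I mean. It is not real Knuth-Plass (no boxes, glue, penalties or hyphenation), just a simplified total-fit breaker over monospace words, and the names `line_widths` and `break_lines` are made up for illustration. The assumption is that the float sits at the top of the paragraph with known dimensions, so the width of each line depends only on its line number.

```python
import math

def line_widths(column_width, float_width, float_height, line_height):
    """Return a function mapping a 0-based line index to its available width.

    Assumes the float is placed at the top of the paragraph, so only the
    first ceil(float_height / line_height) lines are narrowed.
    """
    narrowed = math.ceil(float_height / line_height)

    def width(i):
        return column_width - float_width if i < narrowed else column_width

    return width

def break_lines(words, width):
    """Simplified total-fit breaking (monospace, single spaces between words).

    Minimises the sum of squared trailing space on every line except the
    last. The state includes the line number because the available width
    depends on it. Worst case is cubic in the word count, fine for a sketch.
    """
    n = len(words)
    inf = float("inf")
    # best[k][j]: minimal cost of setting words[:j] in exactly k lines
    best = [[inf] * (n + 1) for _ in range(n + 1)]
    prev = [[None] * (n + 1) for _ in range(n + 1)]
    best[0][0] = 0
    for k in range(n):
        w = width(k)
        for i in range(n):
            if best[k][i] == inf:
                continue
            length = -1  # cancels the space counted before the first word
            for j in range(i + 1, n + 1):
                length += len(words[j - 1]) + 1
                if length > w:
                    break
                slack = w - length
                cost = 0 if j == n else slack * slack
                if best[k][i] + cost < best[k + 1][j]:
                    best[k + 1][j] = best[k][i] + cost
                    prev[k + 1][j] = i
    k = min(range(n + 1), key=lambda c: best[c][n])
    if best[k][n] == inf:
        raise ValueError("a word is wider than the line it must go on")
    lines, j = [], n
    while k > 0:
        i = prev[k][j]
        lines.append(" ".join(words[i:j]))
        j, k = i, k - 1
    lines.reverse()
    return lines

# A 4-wide, 20-tall float in a 10-wide column with line height 10:
# the first two lines are 6 wide, the rest are 10 wide.
width = line_widths(10, 4, 20, 10)
result = break_lines("aaa bb cc ddddd ee".split(), width)
```

A first-fit breaker would put "aaa bb" on the first line and leave "cc" stranded on the second; the total-fit version accepts a slightly loose first line to get a better paragraph overall, which is the whole appeal of K&P.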

If the float is raised, then it becomes more complicated, and you probably have to re-break the preceding paragraph(s) one or more times until it fits. An expensive operation to be sure, but if one only has to re-break the occasional paragraph in a book, it's probably not too much of an issue.


(Note, I'm not an expert on these matters, but it is of interest to me. I've long wanted to see K&P or a similar algorithm implemented in an HTML renderer for ebooks.)