Thread: Soft Hyphens
08-23-2008, 01:01 PM   #1
wallcraft
Soft Hyphens

The thread Problems reading epub on prs-505 indicates that soft hyphens are a problem in ePub ebooks. From Robin’s HTML 4.0 Conformance Test:

Quote:
A soft hyphen indicates where an optional word break may occur. When a soft hyphen breaks a word between one line and the next, a hyphen character is displayed at the end of the first line. When a soft hyphen does not break a word between lines, the hyphen must not be displayed.

Soft hyphens are vital for text that must be displayed on a tiny screen or in a narrow frame. Web browsers have no excuse for rendering them incorrectly, when they can be minimally compliant by ignoring them completely.

However, the ebook readers I tested don't handle soft hyphens well.
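
To make the quoted rule concrete, here is a minimal Python sketch of the behaviour a reader would need: break only at spaces or soft hyphens, show a "-" only where a soft hyphen actually ends a line, and show nothing for the soft hyphens that are not used. The function name `wrap_soft` and the fixed-width character counting are my own assumptions for illustration; real readers measure glyph widths and none of the programs tested is known to work this way.

```python
SHY = "\u00ad"  # U+00AD SOFT HYPHEN, written &shy; in HTML

def wrap_soft(text, width):
    """Greedy wrap: break at spaces or soft hyphens only (illustrative sketch)."""
    lines, current = [], ""
    for word in text.split():
        # Pieces of the word between soft hyphens; empty pieces are dropped.
        fragments = [f for f in word.split(SHY) if f]
        while fragments:
            prefix = current + " " if current else ""
            # Find the largest number of fragments that still fits on this line.
            for k in range(len(fragments), 0, -1):
                # A visible hyphen is added only when the word is really broken.
                hyphen = "-" if k < len(fragments) else ""
                candidate = prefix + "".join(fragments[:k]) + hyphen
                if len(candidate) <= width:
                    break
            else:
                k = 0
            if k == 0 and current:
                # Nothing fits after the existing text: start a new line, retry.
                lines.append(current)
                current = ""
                continue
            if k == 0:
                # Even one fragment is wider than the line: let it overflow.
                k = 1
                hyphen = "-" if k < len(fragments) else ""
                candidate = fragments[0] + hyphen
            fragments = fragments[k:]
            if fragments:
                # The word was broken at a soft hyphen; the line ends here.
                lines.append(candidate)
                current = ""
            else:
                current = candidate
    if current:
        lines.append(current)
    return lines

sample = "a discre" + SHY + "tionary hyphen"
print(wrap_soft(sample, 10))  # narrow frame: the soft hyphen is used
print(wrap_soft(sample, 80))  # wide frame: the soft hyphen is invisible
```

In the narrow case the word is split and a "-" ends the first line; in the wide case the word is shown whole with no hyphen at all, which is the behaviour the quote asks for.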

The attached ebooks are based on http://www.cs.tut.fi/~jkorpela/shytest.html, which is from Soft hyphen (SHY) – a hard problem?. I enclose single-file HTML (ZIP), MOBI (via MobiPocket Creator) and ePub (via BookGlutton) versions. The screenshots are from a Windows PC using Adobe Digital Editions, Sony Ebook Library (PRS-505 like), MobiPocket Reader, FBReader and uBook.

The uBook version (last screenshot) appears to do the best job, but it does not display the "-" when a soft hyphen is positioned at the end of a line in the actual document. It might in fact be ignoring all the soft hyphens and using its own hyphenation, since it can give "discre-tionary", a break that does not come from the soft hyphens. Adobe Digital Editions (ePub) breaks on a soft hyphen, but does not add a "-" when it does so. Sony Ebook Library, which is based on ADE, also breaks on a soft hyphen, but shows a "?" at every soft hyphen. MobiPocket shows every soft hyphen as "-" and does not break words. FBReader does break words, but shows every soft hyphen as "-".

Soft hyphens could provide a viable alternative (or augmentation) to on-the-fly hyphenation, but only if ebook readers either use them for hyphenation or ignore them completely.
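
Until readers behave, one workaround is to strip the soft hyphens before building the ebook, which gives the "ignore them completely" result on any reader. The Python sketch below is only an illustration under my own assumptions: the function name `strip_soft_hyphens` is made up, and it handles just the raw U+00AD character plus the `&shy;`, decimal and hexadecimal entity spellings in HTML source.

```python
import re

SHY = "\u00ad"  # U+00AD SOFT HYPHEN

# Matches &shy;, the decimal and hexadecimal character references for
# code point 173 (0xAD), and the raw soft hyphen character itself.
_SHY_PATTERN = re.compile("&shy;|&#0*173;|&#[xX]0*[aA][dD];|" + SHY)

def strip_soft_hyphens(html):
    """Return the HTML source with every soft hyphen removed (sketch)."""
    return _SHY_PATTERN.sub("", html)

print(strip_soft_hyphens("dis&shy;cre" + SHY + "tion&#173;ary&#xAD;!"))
```

This throws away the hyphenation hints, so it only makes sense for readers that would otherwise show a stray "-" or "?" at every soft hyphen.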
Attached Thumbnails
shytest_ADE.gif (165.5 KB)
shytest_PRS.gif (155.3 KB)
shytest_WMR.gif (196.2 KB)
shytest_FBR.gif (175.0 KB)
shytest_uBK.gif (84.5 KB)
Attached Files
File Type: epub shytest.epub (2.4 KB)
File Type: prc shytest.prc (3.5 KB)
File Type: zip shytest_tidy.html.zip (691 Bytes)