Thread: JBPatch
05-28-2012, 05:18 PM   #152
ixtab
[WiP] Hyphenation

Just a short "Work in Progress" announcement: I'm currently looking into .mobi/.azw hyphenation.

Yes, besides tinkering with the internals of the KT, I'm also actually using it to read books

... and there is one thing that has always seriously bugged me: the page layout. In principle, text is justified... well, except when long words are involved. And my native language features a lot of long words, so the result is rather ugly to read on the Kindle.

Interestingly enough, the mobi reader of the Kindle actually does include hyphenation support, but it is deactivated by default, most probably because it's simply buggy.

The screenshot below shows a very, very early development version of an attempt to fix this.

[Image: hyphenation-sample.png]

Notes:
  • The strange page layout is intentional. It is used because it results in a larger number of hyphenations.
  • The weird hyphenation character ("@" instead of "-") is also intentional, because it eases visual verification and debugging.
  • Hyphenation is completely dumb at this time. It simply assumes that a word can be "hyphenated" at every character, and does not account for syllables or language. This is why it produces wrong results (obvious from the example, at least for German readers).
  • The green arrows show where hyphenation has kicked in, "correctly" as per the definition above.
  • The red boxes show where line spacing is still wrong. I currently don't know where that comes from, but it's annoying. It might be that the next word is not long enough for hyphenation (improbable), or that the previous word is too short to trigger adjustment/hyphenation (probable). In any case, it's most probably a bug in the reader. I'll look into it later.
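
For anyone who wants to play with the idea, here is a minimal sketch of the "dumb" rule from the notes above. It is not the reader's actual code: it assumes fixed-width characters (line width is counted in characters) and simple greedy line filling, and the names `wrap` and `dumb_break_points` are made up for illustration.

```python
def dumb_break_points(word):
    """The 'dumb' rule: every position inside the word is a legal break."""
    return range(1, len(word))


def wrap(text, width, hyphen="@", break_points=dumb_break_points):
    """Greedy line filling (assumption: one character = one unit of width).

    A word that does not fit on the current line is split at the largest
    legal break point that still leaves room for the hyphen mark.
    """
    lines, current = [], ""
    for word in text.split():
        while word:
            prefix = current + " " if current else ""
            room = width - len(prefix)
            if len(word) <= room:
                current = prefix + word
                break
            # Legal breaks that fit on this line together with the hyphen mark.
            fits = [i for i in break_points(word) if i + len(hyphen) <= room]
            if fits:
                cut = max(fits)
                lines.append(prefix + word[:cut] + hyphen)
                current, word = "", word[cut:]
            elif current:
                # Nothing fits here: close the line and retry on an empty one.
                lines.append(current)
                current = ""
            else:
                # Line is too narrow for any split: emit the word unbroken.
                lines.append(word)
                word = ""
    if current:
        lines.append(current)
    return lines


print(wrap("Donaudampfschiff ist lang", 10))
# ['Donaudamp@', 'fschiff i@', 'st lang']
```

The output shows the same kind of wrong break as the screenshot (for example "i@" / "st"). A language-aware version would only need a different `break_points` function that returns the legal positions for a given word; the line-filling loop could stay as it is.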

As I said, this is still under development, and I have no idea when it will be ready to be published, especially because correct hyphenation is language-dependent. If you think you can help with this, whether "technically" or "linguistically", all suggestions are really appreciated.

PS: +300 Karma for the first person to comment about the book that the above excerpt is from. And please, don't cheat: either you have actually read the book, and your answer is interesting, or you just googled it, and your answer is worthless (... but if you already went through the pain of looking it up, then just read it. It's free, it's a classic of literature, and it's really worth reading!).

Last edited by ixtab; 05-28-2012 at 06:54 PM.