05-18-2011, 03:02 PM   #677
FethryDuck
Device: Kindle3, DXG
Well, I was very surprised by the results I described in my last post. Could the text really be encoded differently depending on which folder it is stored in?

@reprep: I did not know about the Turkish characters, so I am including a link here:
http://en.wikipedia.org/wiki/Wikiped...ish_characters

My view and hope has been that both books and dictionaries would handle UTF-8, but that does not seem to work for Turkish. This puts us somewhere near the heart of the Duokan mess, as they attempt a larger code base. Turkish seems to need 16 bits per character (its special letters such as ğ, ş and ı take two bytes each in UTF-8), which is also the issue with Duokan. (I am uncertain what they are attempting at the moment.)

Turkish will need a separate character encoder for ISO 8859-9, or some exceptions as described in the link. Maybe GB18030 will handle it in the future?
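
To make the byte-size point concrete, here is a small Python sketch using only the standard codecs. It only shows how the encodings themselves treat the Turkish-specific letters. It says nothing about what Duokan actually does internally, and the names TURKISH_LETTERS, byte_lengths and round_trips are made up for this example.

```python
# Hypothetical sample: the six letters ISO 8859-9 has where ISO 8859-1
# has the Icelandic letters.
TURKISH_LETTERS = "ĞğİıŞş"


def byte_lengths(text, encoding):
    """Map each character to the number of bytes it needs in `encoding`."""
    return {ch: len(ch.encode(encoding)) for ch in text}


def round_trips(text, encoding):
    """True if encoding and then decoding gives back the same text."""
    return text.encode(encoding).decode(encoding) == text


# Two bytes (16 bits) per letter in UTF-8, one byte in ISO 8859-9.
utf8_lengths = byte_lengths(TURKISH_LETTERS, "utf-8")
latin5_lengths = byte_lengths(TURKISH_LETTERS, "iso8859_9")

# The same byte decoded with the wrong single-byte table gives a
# different letter, which is the kind of garbling a reader would see.
misread = "ğ".encode("iso8859_9").decode("iso8859_1")

# Python's gb18030 codec can already represent these letters.
gb_ok = round_trips(TURKISH_LETTERS, "gb18030")

print(utf8_lengths)
print(latin5_lengths)
print(misread, gb_ok)
```

If the reader assumes one-byte Latin-1 text while the file is ISO 8859-9 (or the other way round), the letters get swapped as in the `misread` line, so the decoder has to know which table the file uses.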