Old 01-24-2006, 05:43 AM   #7
Laurens
Jah Blessed
Posts: 1,295
Karma: 1373
Join Date: Apr 2003
Location: The Netherlands
Device: iPod Touch
More about encodings

Quote:
Originally Posted by wyfwyf
Sunrise (Java) has a neat feature that allows different language encodings for the source page and the output page. I can set the source page to Big5 (Traditional Chinese) and the output page to GBK (Simplified Chinese). Very useful for reading on my Palm.
Actually, thinking about this some more, you should be able to convert to Simplified Chinese just fine. Simply set the target language to "Chinese (Simplified)" in the Document Properties. Any traditional Chinese pages (encoded in Big5, for instance) are then automatically converted to GB2312.

Internally, Sunrise XP handles character conversion as follows:
Big5 -> UTF-8 -> GB2312
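
To make that chain concrete, here is a rough sketch in Python. This is not Sunrise XP's actual code; the function name `transcode` and its parameters are made up for illustration, and Python's built-in `big5` and `gb2312` codecs stand in for whatever Sunrise XP uses internally. Python strings are Unicode, so the decoded `str` plays the role of the intermediate UTF-8 step.

```python
def transcode(data: bytes, source: str = "big5", target: str = "gb2312") -> bytes:
    """Re-encode bytes from one character encoding to another via Unicode.

    Hypothetical helper for illustration only. It changes the byte
    encoding; it does not map Traditional glyphs to Simplified ones.
    """
    text = data.decode(source)   # source bytes -> Unicode text
    return text.encode(target)   # Unicode text -> target bytes


# Build a Big5 sample from characters that exist in both character sets.
big5_bytes = "中文".encode("big5")
gb_bytes = transcode(big5_bytes)
print(gb_bytes.decode("gb2312"))
```

Note that with the default strict error handling, a character that has no GB2312 code point (which is likely for Traditional-only forms) raises `UnicodeEncodeError` instead of being converted, so a real converter would need an extra mapping or fallback step.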

The character encoding MUST be provided in the Content-Type header or HTML <meta> tag, as Sunrise XP has no encoding detection mechanism.
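
As an illustration of what "provided in the Content-Type header or HTML <meta> tag" means in practice, here is a minimal Python sketch that looks for a declared charset in those two places and refuses to guess otherwise. The function name `declared_encoding`, the regular expressions, and the 4096-byte scan limit are my own assumptions for the example, not how Sunrise XP is implemented.

```python
import re
from typing import Optional

# Matches "charset=Big5", "charset='big5'", etc. in a header value.
_CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)
# Same idea, but inside a <meta ...> tag in the raw (undecoded) page bytes.
_META_RE = re.compile(rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)


def declared_encoding(content_type: Optional[str], body: bytes) -> str:
    """Return the charset declared by the server or the page (hypothetical helper)."""
    # 1. Prefer the HTTP Content-Type header, if it carries a charset.
    if content_type:
        match = _CHARSET_RE.search(content_type)
        if match:
            return match.group(1).lower()
    # 2. Otherwise look for a <meta> declaration near the top of the page.
    match = _META_RE.search(body[:4096])
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. No detection mechanism: fail rather than guess.
    raise ValueError("no charset declared in header or <meta> tag")


print(declared_encoding("text/html; charset=Big5", b""))
```

A page that declares its encoding in neither place ends up in the `ValueError` branch, which mirrors the situation described above where the conversion cannot work.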

From my limited knowledge, converting from Traditional to Simplified Chinese is much easier than the other way around. I can't read the language and I don't have a handheld that can display Chinese characters, though, so I have no way to verify that it works. I have only tested the encoding support with a Cyrillic-script language (Russian), and that seems to work well.