02-12-2013, 08:58 PM   #2
user_none
Sigil & calibre developer
There is no reliable way to detect a file's character encoding. Your best bet is to decode the file and check the result manually or, if you already know the encoding, to specify it explicitly.
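
As a rough illustration of "decode and check manually", here is a minimal Python sketch. The function name `try_candidates` and the list of candidate encodings are my own assumptions for the example, not anything built into calibre or Sigil. It only shows you each decoding that succeeds; you still have to look at the results and pick the one that reads correctly.

```python
def try_candidates(data, candidates):
    """Decode raw bytes with each candidate encoding.

    Returns a dict mapping encoding name -> decoded text for every
    candidate that decodes without raising. A successful decode does
    NOT prove the encoding is right; it still needs a manual check.
    """
    results = {}
    for enc in candidates:
        try:
            results[enc] = data.decode(enc)
        except (UnicodeDecodeError, LookupError):
            # Invalid bytes for this encoding, or unknown encoding name.
            continue
    return results


if __name__ == "__main__":
    raw = "café".encode("utf-8")  # stand-in for bytes read from a file
    for enc, text in try_candidates(raw, ["ascii", "utf-8", "cp1252"]).items():
        print(enc, repr(text))
```

If you do know the encoding, skip the guessing and pass it explicitly, for example `open(path, encoding="utf-8")` or `data.decode("utf-8")`.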

In your case you could check the text for characters such as Ă and treat their presence as a sign that the wrong encoding was used. However, these are valid characters in that encoding, so the technique only works when you know they will not appear in the correctly decoded text. If this is a novel in a specific language, it will work most of the time. But it is not foolproof, and it is not a good general-purpose method.
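
A minimal sketch of that trigger in Python, under some assumptions of mine: the `SUSPICIOUS` set, the `looks_misdecoded` name, and the threshold are made up for the example, and the demo assumes the bad text came from UTF-8 bytes decoded as cp1250. Adjust all of these for your actual files.

```python
# Characters that should not appear in correctly decoded text for
# this particular book. This set is an assumption; choose it per language.
SUSPICIOUS = frozenset("Ă")


def looks_misdecoded(text, suspicious=SUSPICIOUS, threshold=1):
    """Return True if at least `threshold` suspicious characters occur.

    This is only a heuristic: the characters are valid in the encoding,
    so text that legitimately contains them is flagged as well.
    """
    hits = sum(1 for ch in text if ch in suspicious)
    return hits >= threshold


if __name__ == "__main__":
    good = "café"
    bad = good.encode("utf-8").decode("cp1250")  # simulate the wrong decode
    print(repr(bad), looks_misdecoded(bad))
    print(repr(good), looks_misdecoded(good))
```

Note that text which genuinely uses Ă (Romanian, for example) would be flagged too, which is exactly why this only works when you know those characters should not be there.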