Old 02-26-2008, 05:51 PM   #345
tompe
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Originally Posted by kovidgoyal View Post
MOBI files specify their encoding in the header. Not sure if mobi2html uses that information. Try mobi2oeb
How have you handled UTF-8 characters? Do you know whether filepos is an offset into the byte stream, or into the character stream, where a 2- or 3-byte sequence counts as one character? I have an example file, but with either method filepos ends up pointing to a strange position...
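To make the two interpretations concrete, here is a minimal Python sketch of how a byte offset and a character offset drift apart once the text contains multi-byte UTF-8 sequences. The helper names and the sample string are made up for illustration, and nothing here says which interpretation filepos actually uses.

```python
def byte_to_char_offset(data: bytes, byte_offset: int) -> int:
    """Convert an offset into UTF-8 bytes to an offset into decoded characters."""
    # errors="ignore" drops a trailing partial sequence if the offset
    # happens to land in the middle of a multi-byte character.
    return len(data[:byte_offset].decode("utf-8", errors="ignore"))


def char_to_byte_offset(text: str, char_offset: int) -> int:
    """Convert an offset into characters to an offset into UTF-8 bytes."""
    return len(text[:char_offset].encode("utf-8"))


# Hypothetical sample: U+2019 takes 3 bytes, U+00A0 takes 2 bytes.
sample_text = "It\u2019s\u00a0fine.TARGET"
sample_bytes = sample_text.encode("utf-8")

char_index = sample_text.index("TARGET")    # position counted in characters
byte_index = sample_bytes.index(b"TARGET")  # position counted in bytes

print(char_index, byte_index)
# Treating the byte offset as a character index overshoots the target:
print(sample_text[byte_index:])
# Treating it as a byte offset lands on the target:
print(sample_bytes[byte_index:].decode("utf-8"))
```

If a filepos value lands a few characters past the expected anchor, and the overshoot grows with the number of non-ASCII characters before it, that would suggest a byte offset is being applied to decoded text.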

Does anybody know where I can find correctly encoded MobiPocket files that use UTF-8, have a table of contents, and contain UTF-8 character sequences like "0xe2 0x80 0x99" (’) or "0xc2 0xa0" (nbsp)?
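For checking whether a candidate file really contains such sequences, a rough Python sketch could count them in the book text. It assumes the text has already been extracted and decompressed into a bytes object (the raw file will usually not contain the sequences in plain form); `MARKERS` and `count_markers` are hypothetical names, not part of any MobiPocket tool.

```python
# The two byte sequences mentioned above, with the characters they encode.
MARKERS = {
    "right single quotation mark (U+2019)": b"\xe2\x80\x99",
    "no-break space (U+00A0)": b"\xc2\xa0",
}


def count_markers(text_bytes: bytes) -> dict:
    """Count how often each multi-byte marker sequence occurs in the text."""
    return {name: text_bytes.count(seq) for name, seq in MARKERS.items()}


# Hypothetical stand-in for decompressed book text.
sample = "Don\u2019t\u00a0stop, it\u2019s fine".encode("utf-8")
for name, count in count_markers(sample).items():
    print(name, count)
```

A file where both counts are zero would not be a useful test case for the filepos question.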

I wonder if mobigen will give me such a file. I will test...