Thread: EPUB output
11-14-2008, 12:06 PM   #277
kovidgoyal
creator of calibre
OK, I've added the encoding declaration.

Quote:
Originally Posted by thawk
Yes, I know that. But I have run into the charset problem several times.
Most of the books I have converted are fine, but there are 2 or 3 books whose titles are displayed incorrectly by the 505 and by epub-meta. The titles are encoded in UTF-8, but it seems that epub-meta treats the title as a sequence of ASCII bytes and re-encodes each byte as UTF-8.

e.g.
I have a book titled '中文版', whose UTF-8 encoding is e4 b8 ad e6 96 87 e7 89 88,
but the result in epub-meta is 'ä¸*ćç', which is c3 a4 c2 b8 c2 ad c4 87 c3 a7.
I have checked metadata.opf, and it is correctly encoded in UTF-8. After adding the UTF-8 declaration line to metadata.opf, both the 505 and epub-meta display the title correctly.
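
The double encoding described above can be reproduced with a short sketch. It is written for Python 3 and assumes the misreading step behaves like a Latin-1 decode (one character per byte); the function names are hypothetical and are not part of calibre. The bytes quoted in the post are shorter than what this produces and differ after the first six, possibly because some characters were lost or altered when the text was pasted, so treat this as an illustration of the mechanism, not an exact replay.

```python
def misread_as_latin1(utf8_bytes):
    """Treat every byte as its own character, as a reader with no
    encoding declaration might do (Latin-1 is an assumption here)."""
    return utf8_bytes.decode("latin-1")


def repair_double_encoding(garbled_text):
    """Undo the damage: map each character back to a byte, then decode
    the resulting byte string as UTF-8."""
    return garbled_text.encode("latin-1").decode("utf-8")


title = "中文版"
utf8_bytes = title.encode("utf-8")         # the correct bytes in metadata.opf
garbled = misread_as_latin1(utf8_bytes)    # nine characters instead of three
double_encoded = garbled.encode("utf-8")   # what ends up being displayed
repaired = repair_double_encoding(garbled)

print(utf8_bytes.hex(" "))
print(double_encoded.hex(" "))
print(repaired == title)
```

Declaring the encoding in the file, as was done here, avoids the misreading in the first place, so no repair step is needed.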

By the way, without the UTF-8 line, 'epub-meta test.epub' displays the wrong title on the console, and 'epub-meta test.epub > file' reports an error:

Traceback (most recent call last):
  File "/usr/bin/epub-meta", line 8, in <module>
    load_entry_point('calibre==0.4.104', 'console_scripts', 'epub-meta')()
  File "/usr/lib/python2.6/site-packages/calibre/ebooks/metadata/epub.py", line 238, in main
    print unicode(get_metadata(stream, extract_cover=False))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 11-19: ordinal not in range(128)
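
The redirect failure above is what happens when text is written to a stream whose encoding is ASCII: in Python 2, stdout has no encoding when it is redirected to a file, so printing a unicode object falls back to the ascii codec. The sketch below reproduces the same error class in Python 3 using an in-memory stream as a stand-in for the redirected stdout. The helper name write_title is hypothetical, and this is not how epub-meta itself is written.

```python
import io


def write_title(title, encoding):
    """Write title to an in-memory byte stream that stands in for a
    redirected stdout using the given encoding; return the bytes written."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(title + "\n")
    stream.flush()
    return raw.getvalue()


# With an explicit UTF-8 encoding the title is written without error.
utf8_output = write_title("中文版", "utf-8")

# With ASCII, the write fails in the same way as the traceback above.
try:
    write_title("中文版", "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(utf8_output)
print(ascii_failed)
```

Encoding the output explicitly (UTF-8 here, as an assumption about what the receiving file should contain) sidesteps the dependence on how stdout happens to be connected.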