12-02-2009, 07:07 PM   #58
user_none
Sigil & calibre developer
Quote:
Originally Posted by macr0t0r
... First off, the \U tag has proven to be unreliable with some fonts, and it's a train-wreck on Symbian devices.
Good to know. That code is from calibre's PML output, and I only test against the desktop software with the standard font. Looking at the docs, it seems that \U only supports certain fonts and versions.

Quote:
Originally Posted by macr0t0r
Second, I don't believe there are extended codes for \x80 and \x81.
Looks like there aren't.

Quote:
Originally Posted by macr0t0r
However, this is a fascinating little trick. Perhaps this could work?
Code:
text = re.sub('[\x82-\xff]', lambda x: '\\a%03d' % ord(x.group()), text)
Then, perhaps I could fall back to unicode for whatever is left:
Code:
text = re.sub('[^\x00-\xff]', lambda x: '\\U%04x' % ord(x.group()), text)
This should work very well inside the eReader script, because you should never encounter characters that are not covered by either the \a or the \U tag.
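For anyone who wants to try it, here is a minimal sketch of the two passes wrapped into one function. It assumes Python 3 and a str of already-decoded text; the function name escape_for_pml is just something I made up for illustration, it is not part of calibre or the eReader script. The order of the passes doesn't really matter, since the first pass only emits plain ASCII that the second pattern can't match.

```python
import re

def escape_for_pml(text):
    # Hypothetical helper name; not taken from calibre or the eReader script.
    # Assumes `text` is a Python 3 str (one element per code point).

    # Pass 1: characters 0x82-0xFF become \a plus a three-digit decimal code.
    text = re.sub('[\x82-\xff]', lambda m: '\\a%03d' % ord(m.group()), text)

    # Pass 2: anything above 0xFF becomes \U plus a four-digit hex code point.
    # Code points above 0xFFFF would produce more than four digits here.
    text = re.sub('[^\x00-\xff]', lambda m: '\\U%04x' % ord(m.group()), text)

    return text

print(escape_for_pml('caf\xe9 \u0394'))
```

Note that \x80 and \x81 pass through untouched, since there don't seem to be extended codes for them.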