10-22-2015, 10:17 AM   #138
jackie_w
Grand Sorcerer
Quote:
Originally Posted by Jellby
I don't see how that would work for Greek or Cyrillic, given that there's no Greek or Cyrillic in LOWERS. Unless you mean extending your definition to include other alphabets.
What I meant was that each Greek char in a book where lower != upper would be scrambled. It won't be scrambled to a Greek char, but it will be scrambled to undecipherable ASCII. I don't know much about Greek. Are you saying that there are many Greek chars which don't have different upper/lower variations? If so, that would indeed be a problem as far as uploading to MR is concerned.
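To make the lower != upper idea concrete, here is a minimal Python sketch. It is not the plugin's actual code: the names scramble_char and scramble_text are made up for illustration, and treating "cased" as ch.lower() != ch.upper() is my assumption about how the rule could be expressed. Any cased letter, Greek or otherwise, becomes a random ASCII letter of the same case, decimal digits become random ASCII digits, and everything else passes through untouched.

```python
import random
import string


def scramble_char(ch, rng):
    """Return a random ASCII stand-in for a cased letter or decimal digit.

    Hypothetical helper: characters with no upper/lower distinction
    (e.g. Hebrew or CJK letters, punctuation) are returned unchanged,
    which is exactly the gap discussed above.
    """
    if ch.lower() != ch.upper():
        # The character has distinct case forms, so treat it as a letter.
        if ch == ch.upper():
            return rng.choice(string.ascii_uppercase)
        return rng.choice(string.ascii_lowercase)
    if ch.isdecimal():
        return rng.choice(string.digits)
    return ch


def scramble_text(text, rng=None):
    """Scramble every character of text independently."""
    rng = rng or random.Random()
    return "".join(scramble_char(ch, rng) for ch in text)


if __name__ == "__main__":
    # Greek is scrambled to ASCII; the Hebrew word survives unchanged.
    sample = "\u0391\u03b8\u03ae\u03bd\u03b1 2015 \u05e9\u05dc\u05d5\u05dd"
    print(scramble_text(sample, random.Random(0)))
```

The Hebrew word in the sample shows the weak spot: letters in scripts without case are left readable, so a book in such a script would not be safe to upload after scrambling this way.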

Quote:
Originally Posted by Jellby
If you use python you could start here. That basically tells you the same as your LOWERS, UPPERS and DIGITS. I haven't really used that stuff, but it looks pretty straightforward. Some additional thought might be needed to scramble non-ascii characters to other non-ascii characters in their same "group", but I think it's easier to just scramble anything into ascii.
Which is where we are now. I'm inclined to leave it like this unless more issues become apparent.
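For anyone curious what the "group" approach might look like, here is a rough sketch using Python's standard unicodedata module. The function name char_group and the group labels are my own invention for illustration. unicodedata.category() returns the Unicode general category, where "Lu" is an uppercase letter, "Ll" a lowercase letter, "Nd" a decimal digit, and other codes starting with "L" are letters with no case.

```python
import unicodedata


def char_group(ch):
    """Classify one character by its Unicode general category.

    Hypothetical helper: the labels are illustrative, not taken
    from any existing tool.
    """
    cat = unicodedata.category(ch)
    if cat == "Lu":
        return "upper"
    if cat == "Ll":
        return "lower"
    if cat == "Nd":
        return "digit"
    if cat.startswith("L"):
        # Letters without case, e.g. Hebrew, Arabic, CJK ("Lo"),
        # plus titlecase ("Lt") and modifier letters ("Lm").
        return "other_letter"
    return "other"


if __name__ == "__main__":
    for ch in ["\u03a3", "\u03c2", "\u00e9", "\u00fe", "5", "\u05d0", " "]:
        print(repr(ch), unicodedata.category(ch), char_group(ch))
```

A classification like this would also catch the uncased letters that a plain lower != upper test lets through, so they could be scrambled to ASCII as well.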

Quote:
Originally Posted by Jellby
EDIT: Scrambling to non-ascii characters will probably cause problems with fonts: a font may have a character for "é", but not for "þ" (even though they are in the same group). And any scrambling will cause problems with subset fonts.
I did observe this situation during testing. I'm not sure it's necessarily a showstopper for debugging common problems, though.

When I release the v0.2 version, perhaps some multilingual people who are following this thread will beta test examples of non-English books and report back on any issues they find?