Old 01-11-2012, 04:23 PM   #95
ixtab
Update:
I just checked in the newest version.
  • UTF-8 output: should be fixed, both for extract and iso2utf tasks.

About the individual properties, I'll go from easiest to most complicated:
  • CliticUtilResources: blacklisted. I have no idea what they're good for (there doesn't seem to be a "base property" for them, so they might just be artifacts). In any case, not including them in the translation leaves everything at the default, which should be safe.
  • SimpleDictLookupFormatResources: blacklisted, same reason as above.
  • DefaultDictionaries: blacklisted. We cannot reliably know what default dictionaries should be for a given language anyway, and users can always add their preferred dictionaries. So staying with the default seems the safest bet.
  • SimpleTextListFormatResources: This is a tricky case. The "_en" version overrides some of the definitions in the base, but not all of them: list.format.type_default.patterns exists in the base and is not overridden in any of the existing locales. OTOH, all of the other existing locales "follow" the _en version in terms of content, and the "default" pattern should be ok for all languages. So, in this particular case, the _en one should be the base for the translation, and the "base" one should be excluded from translation.
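
To make the SimpleTextListFormatResources situation concrete, here is a minimal Python sketch of how a locale file layers over the base. Only list.format.type_default.patterns is a real key name from above; the other key and all values are made up for illustration, and the real lookup is of course done by the Java resource bundle machinery, not by this code.

```python
def effective_bundle(base, overlay):
    """Return what a lookup sees: overlay keys win, the rest fall back to base."""
    merged = dict(base)
    merged.update(overlay)
    return merged


def base_only_keys(base, overlay):
    """Keys defined in the base that the overlay never overrides."""
    return sorted(set(base) - set(overlay))


# Hypothetical contents; only the first key name is taken from the post.
base = {
    "list.format.type_default.patterns": "base default pattern",
    "list.format.type_and.patterns": "base and-pattern",
}
en = {
    "list.format.type_and.patterns": "en and-pattern",
}

print(base_only_keys(base, en))
print(effective_bundle(base, en))
```

If the _en file is used as the translation source, the keys reported by base_only_keys simply stay out of Transifex and keep falling back to the base at runtime.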

So the final question is about the naming of source files. I'm still not terribly happy with renaming "base" properties to "_en", because it changes the "semantics" and is not required for transifex. As mentioned earlier, the source filename can be anything (it can even be completely different from the "translated file pattern"); the only requirement is to specify which locale the source file is in. My proposal is to use the "blank" properties as sources, with exactly one exception as outlined above (SimpleTextListFormatResources). That one can be added manually to .tx/config.
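
As a rough illustration of the two kinds of .tx/config entries this would give, here is a small Python sketch that prints one section per resource. The project slug, resource names, and directory layout are placeholders I made up; source_file, source_lang, and file_filter are the usual .tx/config keys, but treat the exact values as assumptions to be checked against the real repository.

```python
import configparser
import io


def tx_config_section(project, resource, source_file, file_filter, source_lang="en"):
    """Render one [project.resource] section in .tx/config style."""
    parser = configparser.ConfigParser(interpolation=None)
    parser[f"{project}.{resource}"] = {
        "file_filter": file_filter,
        "source_file": source_file,
        "source_lang": source_lang,
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


# Normal case: the "blank" (base) properties file is the source.
# "my-project", "SomeResources" and the paths are hypothetical.
blank_source = tx_config_section(
    "my-project",
    "SomeResources",
    source_file="src/SomeResources.properties",
    file_filter="translations/SomeResources_<lang>.properties",
)

# The one exception: the _en file is the source instead of the base.
en_source = tx_config_section(
    "my-project",
    "SimpleTextListFormatResources",
    source_file="src/SimpleTextListFormatResources_en.properties",
    file_filter="translations/SimpleTextListFormatResources_<lang>.properties",
)

print(blank_source)
print(en_source)
```

In both cases source_lang is what tells transifex which locale the source is in; the source filename itself carries no meaning.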

What do you think?

About the conversion, it certainly makes sense to announce it a day or two before the change. The conversion itself should be pretty simple:
  1. make a full update (tx pull --all)
  2. convert existing localizations to utf-8 (kt-l10n.jar iso2utf)
  3. remove all en_US files (to clean up previous source file state)
  4. extract all resources as UTF-8 (kt-l10n.jar extract)
  5. update .tx/config
  6. push sources
  7. push translations
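
For anyone wondering what step 2 amounts to, here is a minimal Python sketch of an ISO-8859-1 to UTF-8 conversion for .properties content. It is not the code in kt-l10n.jar, just my assumption of the general idea: decode the bytes as ISO-8859-1, turn \uXXXX escapes for non-ASCII characters into the real characters, and re-encode as UTF-8. The function name is made up.

```python
import re

# Either an escaped backslash (left untouched) or a \uXXXX escape.
_ESCAPE = re.compile(r"(\\\\)|\\u([0-9a-fA-F]{4})")


def _unescape(match):
    if match.group(1):
        # "\\" is a literal backslash in a properties file; keep it as-is.
        return match.group(0)
    code = int(match.group(2), 16)
    if code < 0x80:
        # Keep ASCII escapes (e.g. an escaped "=") so parsing does not change.
        return match.group(0)
    return chr(code)


def iso_properties_to_utf8(raw):
    """Convert ISO-8859-1 .properties bytes to UTF-8 bytes."""
    text = raw.decode("iso-8859-1")
    text = _ESCAPE.sub(_unescape, text)
    # Join surrogate pairs that were written as two \uXXXX escapes.
    text = text.encode("utf-16", "surrogatepass").decode("utf-16")
    return text.encode("utf-8")


sample = b"greeting=Gr\\u00fc\xdfe\n"
print(iso_properties_to_utf8(sample).decode("utf-8"))
```

Running something like this over copies of the files in a temporary directory and diffing the result against the jar's output would be a cheap sanity check before going live.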

Of course, it's best to test-drive this in temporary directories first, but that is roughly the way it should go. The entire process, when going live, should take at most half an hour (most of that time spent uploading to transifex).

Does that make sense?