MobileRead Forums - View Single Post

eureka · 01-08-2012, 09:34 AM

Quote:

Originally Posted by ixtab

... don't import anything yet, until we have cleared the UTF-8 issue

I did some further testing, and these are the results:

- to change a resource on transifex from PROPERTIES to MOZILLAPROPERTIES, (it seems) it has to be removed and re-created. Removing a resource loses all associated translations, so we MUST make a backup of everything first.
- as expected, Java does not support UTF-8 properties out-of-the-box, but it should be possible to integrate this into the tool.

So, if we want to go for this, the workflow would be:
1. update the tool to assume everything is UTF-8. (ixtab) [this would not affect the extract part, but only the compile part -- or am I wrong?]
2. once the tool is ready, make a backup of current translation state, wipe all resources, upload result of extraction as new MOZILLAPROPERTIES resources, convert existing translations, re-upload existing translations. (eureka)

Is this correct, and should we go for it? I have "assigned" 2. to you, but I'm fine to help with the conversion part (i.e., to write some kind of tool to convert a .properties file from PROPERTIES to MOZILLAPROPERTIES format, aka from ISO-8859-1 to UTF-8).

Let me know...

OK, good plan. I'm fine with the assigned task, and converting Unicode escape sequences to UTF-8 doesn't look hard in Python, so I'll do it.
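
A rough sketch of how that conversion could look in Python 3. The function names unescape_unicode and convert_file are placeholders of mine, not part of the tool, and I'm assuming that only the \uXXXX sequences need to be replaced while all other escapes (\n, \\, \: and so on) stay as they are. Whether Transifex expects exactly this for MOZILLAPROPERTIES still has to be checked.

```python
import re

# A \uXXXX escape counts only if the backslash is not itself escaped,
# i.e. it is preceded by an even number of backslashes (group 1).
_ESCAPE = re.compile(r'(?<!\\)((?:\\\\)*)\\u([0-9a-fA-F]{4})')


def unescape_unicode(text):
    """Replace \\uXXXX escapes with the characters they stand for."""
    def repl(match):
        # Keep the preceding escaped backslashes, decode the code unit.
        return match.group(1) + chr(int(match.group(2), 16))

    result = _ESCAPE.sub(repl, text)
    # Escapes above U+FFFF are written as two surrogates; a UTF-16
    # round trip joins each pair into one real character.
    return result.encode('utf-16', 'surrogatepass').decode('utf-16')


def convert_file(src_path, dst_path):
    """Read an ISO-8859-1 .properties file and write it back as UTF-8."""
    # newline='' keeps the original line endings untouched.
    with open(src_path, encoding='iso-8859-1', newline='') as src:
        text = src.read()
    with open(dst_path, 'w', encoding='utf-8', newline='') as dst:
        dst.write(unescape_unicode(text))
```

For example, unescape_unicode(r'greeting=Gr\u00fc\u00dfe') gives 'greeting=Grüße', while an escaped backslash followed by u0041 is left alone. One thing the sketch does not handle: an escape that decodes to a character with special meaning in the file (such as = or :) would change how the line is parsed.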

It would be better if the extract part also produced UTF-8 output. In practice this is superfluous (only the en_US resources are taken from the extract result, and these are identical in ISO-8859-1 and UTF-8), but it would be more consistent and less error-prone: if, through my (or someone else's) mistake, a localized resource leaked from the extract result into the Git repo and on to Transifex, it would already be in the right encoding.