Old 01-02-2012, 09:08 AM   #61
ixtab
Posts: 2,907
Karma: 6736092
Join Date: Dec 2011
Device: K3, K4, K5, KPW, KPW2
I have come up with pretty much the same approach, see the attached file...

I have also tested mine with the browser part, which uses a slightly different format (multiple vars, arrays, etc.). There is also the audio player UI, which uses a MessageFormat *inside* the "variable". I fell asleep after doing this and briefly verifying it, but take a look at the attached files and let me know if it makes sense...
The zip contains the source code (is it readable enough? it's still lots of regexes) as well as a proposed output (and folder structure).
Attached Files
File Type: zip js-loc.zip (22.0 KB, 247 views)
Old 01-03-2012, 04:18 AM   #62
eureka
but forgot what it's like
Posts: 741
Karma: 2345678
Join Date: Dec 2011
Location: north (by northwest)
Device: Kindle Touch
Quote:
Originally Posted by ixtab View Post
I have come up with pretty much the same approach, see the attached file...

I have also tested mine with the browser part, which uses a slightly different format (multiple vars, arrays, etc.). There is also the audio player UI, which uses a MessageFormat *inside* the "variable". I fell asleep after doing this and briefly verifying it, but take a look at the attached files and let me know if it makes sense...
The zip contains the source code (is it readable enough? it's still lots of regexes) as well as a proposed output (and folder structure).
Ohhh, duplicating efforts is no good. I think this is a lack of communication on my side, as I didn't share my plans before starting work. Maybe a wiki page on Bitbucket with a list of needed tools and other plans would be useful. Then each of us could pick a field of operation and add his name next to each tool/action he is currently working on.

So, I've taken a look at your code (btw, you should definitely commit it) and its results. It all looks OK to me (the comments were useful, thanks). I'd prefer that you continue your work and finish it, so we can use your tool. How do you feel about that? I won't remove my POC (yet), as I want to see whether I can parse those WAF strings you've mentioned. But I won't turn it into a complete, usable tool if you agree to finish yours.

I've seen that you've changed wifi_wizard_dialog_strings.js by removing the WifiWizardDialogStringTable_{pre,post}_unable variables and adding their contents into the appropriate strings below. While it's an understandable and suitable approach, I'd like to see these strings as separate resources. (BTW, my POC handles this situation very well.) Also, I'd suggest taking a look at the available JS parsers for Perl; it could be easier to work with a parse tree than with regexp matching results. (I don't know if there are any JS parsers for Perl, I'm just suggesting another way.)

On the repo directory structure for localizing the Pillow and WAF apps: I'd like to see it as follows:

Code:
src/5.0.0/pillow
  original/
    strings/
      wifi_wizard_dialog_strings.js
      wifi_wizard_dialog_strings.tjs
      ...
  locales/
    en_US/
      strings/
        wifi_wizard_dialog_strings.properties
        ...
    de_DE/
      strings/
        wifi_wizard_dialog_strings.properties
        ...
Code:
src/5.0.0/waf
  original/
    browser/
      js/
        strings.tjs
        strings.js
    store/
      strings/
        strings.tjs
        strings.js
  locales/
    browser/
      locales/
        en_US/
          js/
            strings.properties
        de_DE/
          js/
            strings.properties
    store/
      locales/
        en_US/
          strings/
            strings.properties
        de_DE/
          strings/
            strings.properties
And on the KT it would go into '/usr/share/webkit-1.0/pillow/locales', '/var/local/waf/browser/locales', '/var/local/waf/store/locales', ...

This is a somewhat obscure, multilevel hierarchy, but the idea behind it is simple: keep the original files in a separate directory, mirror the original hierarchy as much as possible, and place a 'locales' directory in the 'root' directory of each app (pillow, browser, store, etc.).

One new note. There are also translatable resources that are missing from our plans: the Kindle User Guide and the low-level screens.

On locale change, langpicker.so somehow finds the appropriate KUG (Kindle User Guide) in /opt/amazon/kug and copies it to /mnt/us/documents. It also chooses localized low-level screens from /opt/amazon/low_level_screens/locale_CODE.

The KUG is in AZW format (i.e. MOBI); it should be possible to transform it to XHTML, and transifex supports translating XHTML.

But for the low-level screens (.png images) there is no direct solution. I've thought about converting them to base64, then placing the resulting string in one of the file types supported by transifex and writing instructions for translators on how to decode it and encode it back after translation. The biggest image is only about 20K as .png and 32K encoded in base64, so size isn't a problem, I assume.
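To make the base64 idea a bit more concrete, here is a minimal Python sketch of the round trip a translator would have to do. The function names and the 76-column line wrapping are assumptions for illustration only; nothing like this exists in the repo yet, and the sample bytes are a stand-in, not a real low-level screen.

```python
import base64


def png_to_base64_text(png_bytes: bytes, width: int = 76) -> str:
    """Encode image bytes as base64 text, wrapped into fixed-width lines."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return "\n".join(encoded[i:i + width] for i in range(0, len(encoded), width))


def base64_text_to_png(text: str) -> bytes:
    """Decode wrapped base64 text back into the original bytes."""
    # Whitespace (including the line breaks added above) is dropped first.
    return base64.b64decode("".join(text.split()))


if __name__ == "__main__":
    # Stand-in data: the PNG signature plus filler, NOT a real screen image.
    sample = b"\x89PNG\r\n\x1a\n" + bytes(range(256)) * 4
    text = png_to_base64_text(sample)
    assert base64_text_to_png(text) == sample
    print(len(sample), "bytes ->", len(text.replace("\n", "")), "base64 characters")
```

The wrapped text could then be stored as the value of a single resource, with the decode/edit/encode steps spelled out in the translator instructions.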
Old 01-03-2012, 05:18 AM   #63
ixtab
Posts: 2,907
Karma: 6736092
Join Date: Dec 2011
Device: K3, K4, K5, KPW, KPW2
Quote:
Originally Posted by eureka View Post
Ohhh, duplicating efforts is no good. I think this is a lack of communication on my side, as I didn't share my plans before starting work. Maybe a wiki page on Bitbucket with a list of needed tools and other plans would be useful. Then each of us could pick a field of operation and add his name next to each tool/action he is currently working on.
Not a bad idea. Would you set up a skeleton for it?

Quote:
Originally Posted by eureka View Post
So, I've taken a look at your code (btw, you should definitely commit it) and its results. It all looks OK to me (the comments were useful, thanks). I'd prefer that you continue your work and finish it, so we can use your tool. How do you feel about that? I won't remove my POC (yet), as I want to see whether I can parse those WAF strings you've mentioned. But I won't turn it into a complete, usable tool if you agree to finish yours.

Also, I'd suggest taking a look at the available JS parsers for Perl; it could be easier to work with a parse tree than with regexp matching results. (I don't know if there are any JS parsers for Perl, I'm just suggesting another way.)
I'm ok with continuing my stuff. In any case, don't remove yours -- it only takes a few bytes of space and can still be useful later on. Maybe as the tools directory grows, we can simply put a README with explanations in it.
I also totally agree that actually parsing the JS is a much better idea, because it's much more robust. The current script depends heavily on a "uniform" layout of the files, which they all currently follow, but it may well break in the future. I'll look into it, but don't hold your breath...

Quote:
Originally Posted by eureka View Post
I've seen that you've changed wifi_wizard_dialog_strings.js by removing the WifiWizardDialogStringTable_{pre,post}_unable variables and adding their contents into the appropriate strings below. While it's an understandable and suitable approach, I'd like to see these strings as separate resources. (BTW, my POC handles this situation very well.)
This is a misunderstanding
The pre_ and post_ things are NOT there in the original files, but rather were already factored out by me when I thought we'd have to translate stuff manually. The file in the last zip actually IS the original ;-) -- should've made this clearer though.

Quote:
Originally Posted by eureka View Post
On the repo directory structure for localizing the Pillow and WAF apps: I'd like to see it as follows:

Code:
src/5.0.0/pillow
  original/
    strings/
      wifi_wizard_dialog_strings.js
      wifi_wizard_dialog_strings.tjs
      ...
  locales/
    en_US/
      strings/
        wifi_wizard_dialog_strings.properties
        ...
    de_DE/
      strings/
        wifi_wizard_dialog_strings.properties
        ...
Code:
src/5.0.0/waf
  original/
    browser/
      js/
        strings.tjs
        strings.js
    store/
      strings/
        strings.tjs
        strings.js
  locales/
    browser/
      locales/
        en_US/
          js/
            strings.properties
        de_DE/
          js/
            strings.properties
    store/
      locales/
        en_US/
          strings/
            strings.properties
        de_DE/
          strings/
            strings.properties
And on the KT it would go into '/usr/share/webkit-1.0/pillow/locales', '/var/local/waf/browser/locales', '/var/local/waf/store/locales', ...

This is a somewhat obscure, multilevel hierarchy, but the idea behind it is simple: keep the original files in a separate directory, mirror the original hierarchy as much as possible, and place a 'locales' directory in the 'root' directory of each app (pillow, browser, store, etc.).
Agreed, with one minor suggestion: remove the intermediate "locales" level; IMO it is superfluous. I guess that a directory containing "original", "en_US", "de_DE" etc. is easier (and 'original' cannot be confused with a locale anyway).

Quote:
Originally Posted by eureka View Post
One new note. There are also translatable resources that are missing from our plans: the Kindle User Guide and the low-level screens.

On locale change, langpicker.so somehow finds the appropriate KUG (Kindle User Guide) in /opt/amazon/kug and copies it to /mnt/us/documents. It also chooses localized low-level screens from /opt/amazon/low_level_screens/locale_CODE.

The KUG is in AZW format (i.e. MOBI); it should be possible to transform it to XHTML, and transifex supports translating XHTML.

But for the low-level screens (.png images) there is no direct solution. I've thought about converting them to base64, then placing the resulting string in one of the file types supported by transifex and writing instructions for translators on how to decode it and encode it back after translation. The biggest image is only about 20K as .png and 32K encoded in base64, so size isn't a problem, I assume.
Yeah, that certainly is yet another problem, but I suggest leaving it for later. We have enough other things to do at the moment... BTW, which png files do you mean? Can you give an example?

I figured out a much more serious problem yesterday. You may have noticed that I created two new resources on transifex. For some obscure reason, these files had not been found by my extraction tools, so I created the properties files manually. Now, there is *at least* one further resource: com/amazon/ebook/booklet/reader/resources/ReaderResources.class in opt/amazon/ebook/lib/ReaderSDK-impl.jar
This one contains many resources as well. The real problem is that the resource (which is a ListResourceBundle) also contains ARRAYS of Strings which need to be translated. PropertyResourceBundle doesn't support arrays, though. So in the end, in order to translate these properly, we MUST create a ListResourceBundle class instead of a flat properties file.

Any ideas?!
Old 01-03-2012, 08:14 AM   #64
eureka
but forgot what it's like
Posts: 741
Karma: 2345678
Join Date: Dec 2011
Location: north (by northwest)
Device: Kindle Touch
Quote:
Originally Posted by ixtab View Post
Not a bad idea. Would you set up a skeleton for it?
Here it is. (I've also enabled the "Issues" tab for the repository. Maybe it will be useful for the workflow at some point.)

Quote:
Originally Posted by ixtab View Post
This is a misunderstanding
The pre_ and post_ things are NOT there in the original files, but rather were already factored out by me when I thought we'd have to translate stuff manually. The file in the last zip actually IS the original ;-) -- should've made this clearer though.
Ha-ha. Nice. OK, no time to talk, now I need to solve another nonexistent problem.


Quote:
Originally Posted by ixtab View Post
Agreed, with one minor suggestion: remove the intermediate "locales" level; IMO it is superfluous. I guess that a directory containing "original", "en_US", "de_DE" etc. is easier (and 'original' cannot be confused with a locale anyway).
You are right. How about this:

Code:
src/5.0.0/pillow
  original/
    strings/
      wifi_wizard_dialog_strings.js
      wifi_wizard_dialog_strings.tjs
      ...
  en_US/
    strings/
      wifi_wizard_dialog_strings.properties
      ...
  de_DE/
    strings/
      wifi_wizard_dialog_strings.properties
      ...
Code:
src/5.0.0/waf
  original/
    browser/
      js/
        strings.tjs
        strings.js
    store/
      strings/
        strings.tjs
        strings.js
  en_US/
    browser/
      js/
        strings.properties
    store/
      strings/
        strings.properties
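As a rough sketch of how a tool could map between the two sides of this layout, the following Python helper derives the per-locale .properties path from a file under 'original'. The function name is hypothetical, and it assumes the layout exactly as drawn above (one 'original' directory per app, with locale directories as its siblings).

```python
from pathlib import PurePosixPath


def properties_path(original_file: str, locale: str) -> str:
    """Map a file under '<app>/original/...' to '<app>/<locale>/....properties'."""
    parts = PurePosixPath(original_file).parts
    # Raises ValueError if the path does not contain an 'original' component.
    idx = parts.index("original")
    mirrored = parts[:idx] + (locale,) + parts[idx + 1:]
    return str(PurePosixPath(*mirrored).with_suffix(".properties"))


if __name__ == "__main__":
    print(properties_path(
        "src/5.0.0/pillow/original/strings/wifi_wizard_dialog_strings.js", "de_DE"))
    print(properties_path("src/5.0.0/waf/original/browser/js/strings.js", "en_US"))
```

Because only the 'original' component is swapped, the rest of the hierarchy is mirrored automatically for pillow, browser, store and any other app.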
Quote:
Originally Posted by ixtab View Post
Yeah, that certainly is yet another problem, but I suggest leaving it for later. We have enough other things to do at the moment... BTW, which png files do you mean? Can you give an example?
Look for them in /opt/amazon/low_level_screens.

Quote:
Originally Posted by ixtab View Post
I figured out a much more serious problem yesterday. You may have noticed that I created two new resources on transifex. For some obscure reason, these files had not been found by my extraction tools, so I created the properties files manually. Now, there is *at least* one further resource: com/amazon/ebook/booklet/reader/resources/ReaderResources.class in opt/amazon/ebook/lib/ReaderSDK-impl.jar
This one contains many resources as well. The real problem is that the resource (which is a ListResourceBundle) also contains ARRAYS of Strings which need to be translated. PropertyResourceBundle doesn't support arrays, though. So in the end, in order to translate these properly, we MUST create a ListResourceBundle class instead of a flat properties file.

Any ideas?!
Oh, s! I have only one idea: take the same approach as for the JavaScript files. The JAD output doesn't look hard to parse.

(Though it would be even better to work with the original bytecode and change it directly.)

Last edited by eureka; 01-03-2012 at 08:15 AM. Reason: fix a link to image
Old 01-03-2012, 08:34 AM   #65
ixtab
Posts: 2,907
Karma: 6736092
Join Date: Dec 2011
Device: K3, K4, K5, KPW, KPW2
Quote:
Originally Posted by eureka View Post
Here it is. (I've also enabled the "Issues" tab for the repository. Maybe it will be useful for the workflow at some point.)
Looks good. I'll update it when it makes sense... The folder structure also looks good.

Quote:
Originally Posted by eureka View Post
Look for them in /opt/amazon/low_level_screens.
Are you sure that these are ever displayed? On my Kindle, this is displayed in German after translating blanket.mo and setting the locale. Maybe they are just fallback screens. I don't have the device with me, so I can't look at the other screens, but I would assume they are never used. (I'll check when I get home whether any screen contains strings I haven't encountered.)

Quote:
Originally Posted by eureka View Post
Oh, s! I have only one idea: take the same approach as for the JavaScript files. The JAD output doesn't look hard to parse.

(Though it would be even better to work with the original bytecode and change it directly.)
You're right. Actually, the jad "parsing" my previous tool did obviously missed quite a few things, so I was thinking about rewriting it in Java and walking through the class definitions directly. This may also be useful for such a project (the parsing part is exactly the same; there's only the "modification" part to add). I have stumbled upon some Java libs for manipulating class code before (I don't remember them right now), but that may be the way to go.
Old 01-03-2012, 03:57 PM   #66
MatzeMatz
Enthusiast
Posts: 27
Karma: 10
Join Date: Dec 2011
Device: Kindle Touch
Wow! You two (ixtab and eureka) are making progress upon progress on the translation stuff!
Every time I find time to look at the forum, the thread has grown with new progress messages.
I'm really impressed...

I'd still like to help, but it seems that my time is much more limited than yours: most of my free time goes into reading the thread messages rather than doing some "real" work...
It seems my time is too limited to do much more at the moment...

Nevertheless, I still want to help as a "limited resource".

I've read (most of) this thread's messages, but I'm really lost as to what I could do...

Is it correct that all parts which are checked in at transifex are already translated?
So I assume a lot of strings are still missing...?

@eureka: which firmware version does your device have? v5.0.0 or v5.0.1?
I also assume there are differences between the texts which have to be translated.
Is the current focus more on v5.0.0 or more on v5.0.1?
What are the plans for both versions?
Support only one of them? Or both together?
If both, how do we handle the differences?
I wonder if it's possible to make some branches for that on transifex?
(I'm also trying to read the transifex documentation to understand that, but it's boring...)

regards,
MatzeMatz
Old 01-03-2012, 04:11 PM   #67
ixtab
Posts: 2,907
Karma: 6736092
Join Date: Dec 2011
Device: K3, K4, K5, KPW, KPW2
@eureka: I checked the low_level_screens, and except for the shipping mode message, all seem to have been localized. Besides, I *think* that I had to reset my device to shipping mode recently (because I had thoroughly demolished everything) while the German locale was installed, and I *think* that there was no text at all (only 1, 2, 3 images so you know what you should do, but no textual description). But I didn't pay close attention, so I may be wrong. I don't want to wreck my Kindle right now just to verify this...

@MatzeMatz: I actually only had time over the last week as well. Now I'm back at work, so things will necessarily go slower from my side. If you want to contribute, take a look at the de_DE strings which are untranslated ;-)

I think everybody has 5.0.1 on their devices, so this is more of a glitch in the current naming. The content is actually from 5.0.1. I suggest not worrying about that yet (except that the current naming may be confusing).

If you want to use transifex, make sure to use the .tx/config from the git repository, otherwise you end up with a completely messed up directory structure. Then, just "tx pull -l de" or "tx push -l de" accordingly.
Old 01-03-2012, 04:22 PM   #68
ixtab
Posts: 2,907
Karma: 6736092
Join Date: Dec 2011
Device: K3, K4, K5, KPW, KPW2
BTW, did you check the /opt/amazon/low_level_screens/sq-AL/600x800/ pictures? WTF?
Old 01-03-2012, 09:54 PM   #69
ixtab
Posts: 2,907
Karma: 6736092
Join Date: Dec 2011
Device: K3, K4, K5, KPW, KPW2
ok, final update for today.

I rewrote the parser to consider all compiled ResourceBundles (this does *NOT* include the very few .properties files). The unfiltered result is attached.

3 notes:
  1. This file is purely informational. Don't even bother to download it unless you're deep into the topic.
  2. There is exactly one file which throws an Exception with this method, because it requires a special JRE. That's not a big issue though.
  3. The file contains *all* retrieved information, and is thus a superset of what would have to be (and can be) translated. For us, most probably only String and String[] resources are relevant. There are (very) few String[][] as well.
Attached Files
File Type: txt out.txt (488.8 KB, 5451 views)
Old 01-04-2012, 02:26 AM   #70
eureka
but forgot what it's like
Posts: 741
Karma: 2345678
Join Date: Dec 2011
Location: north (by northwest)
Device: Kindle Touch
Quote:
Originally Posted by MatzeMatz View Post
@eureka: which firmware version does your device have? v5.0.0 or v5.0.1?
I also assume there are differences between the texts which have to be translated.
Is the current focus more on v5.0.0 or more on v5.0.1?
What are the plans for both versions?
Support only one of them? Or both together?
If both, how do we handle the differences?
I wonder if it's possible to make some branches for that on transifex?
I have v5.0.0 on my device. When I named the resources, I assumed that ixtab had this version too.

I don't know if there are any differences between 5.0.0 and 5.0.1 (concerning resources). I think I'll create a list of MD5 sums for every file that resources were extracted from, and provide it. If someone then makes such a list for 5.0.1, we can find the differing files and will know what was changed (if anything).
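Once two such lists exist, the comparison itself is simple. A minimal Python sketch, meant to run on a PC rather than on the KT, and assuming both lists use the usual md5sum line format ("<hash>  <path>"); the function names are made up for illustration:

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sum_list(text):
    """Parse md5sum-style lines ('<hash>  <path>') into a {path: hash} dict."""
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        checksum, path = line.split(None, 1)
        # md5sum marks binary mode with a leading '*' before the path.
        sums[path.lstrip("*").strip()] = checksum
    return sums


def changed_files(old, new):
    """Return paths whose checksum differs, or that appear in only one list."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


if __name__ == "__main__":
    # Made-up checksums and paths, purely to show the call sequence.
    old = parse_md5sum_list("aaa  /a.jar\nbbb  /b.js\n")
    new = parse_md5sum_list("aaa  /a.jar\nccc  /b.js\nddd  /c.js\n")
    print(changed_files(old, new))
```

Files reported by changed_files() are the only ones whose resources would need to be re-extracted and compared string by string.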

My plan is to support any future firmware version. Any new (or changed) resource will be introduced on transifex with a "tag" [5.x.x]. Other (unchanged) resources will be taken as-is from the translation for the previous firmware. In the Git repo these new/changed resources will go into the src/5.x.x folder, which will also store some metadata listing the resources that need to be taken from another folder (of a previous firmware version).
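One possible way to express that fallback, sketched in Python under stated assumptions: instead of explicit metadata, this version simply walks back through older firmware folders until it finds the resource. The function names and the dict standing in for the src/5.x.x folder contents are hypothetical.

```python
def version_key(version):
    """Turn '5.0.1' into (5, 0, 1) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve_resource(name, target, available):
    """Return the newest version at or before `target` that provides `name`.

    `available` maps a firmware version to the set of resource names stored
    in that version's folder. Returns None if no version provides it.
    """
    candidates = sorted(
        (v for v in available if version_key(v) <= version_key(target)),
        key=version_key,
        reverse=True,
    )
    for version in candidates:
        if name in available[version]:
            return version
    return None


if __name__ == "__main__":
    # Hypothetical repo contents: 5.0.1 only stores what changed since 5.0.0.
    available = {
        "5.0.0": {"a.properties", "b.properties"},
        "5.0.1": {"b.properties"},
    }
    print(resolve_resource("a.properties", "5.0.1", available))
    print(resolve_resource("b.properties", "5.0.1", available))
```

An explicit metadata file, as described above, would make the same lookup a plain table read; the walk-back variant just avoids having to maintain that table by hand.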

Quote:
Originally Posted by ixtab View Post
@eureka: I checked the low_level_screens, and except for the shipping mode message, all seem to have been localized. Besides, I *think* that I had to reset my device to shipping mode recently (because I had thoroughly demolished everything) while the German locale was installed, and I *think* that there was no text at all (only 1, 2, 3 images so you know what you should do, but no textual description). But I didn't pay close attention, so I may be wrong. I don't want to wreck my Kindle right now just to verify this...
OK, it makes sense. The .po files contain the same messages as the low-level screens, plus numbers that look like coordinates for positioning the messages on the low-level screen. So I think Amazon makes translators' lives a little bit easier with this.

Quote:
Originally Posted by ixtab View Post
BTW, did you check the /opt/amazon/low_level_screens/sq-AL/600x800/ pictures? WTF?
I don't know. sq-AL is Albanian, right? And the text on the pictures is surely not Albanian. It's in English, but with characters from foreign alphabets.

(Though using the name Albanian for a distorted language has a subtle connotation for a user of the Russian internet: http://en.wikipedia.org/wiki/Olbanian_language)

Quote:
Originally Posted by ixtab View Post
ok, final update for today.

I rewrote the parser to consider all compiled ResourceBundles (this does *NOT* include the very few .properties files). The unfiltered result is attached.
Looks good.

OK, so there are two tasks for now: dealing with resources in JavaScript files and in Java .class files. Which one would you like to take? I'll take the other one then. (Anyway, we'll end up with one tool written in Python.)

Looking at the output of your tool, I've had an insight: the original .js and .class files might not be needed for constructing a translation.

JS strings are always in one format: there is one variable containing an object with strings and with other objects, which in turn contain strings or other objects... As the .properties files have the name of the object and its keys ("PasswordDialogStringTable.passwordEntryTitle" or "bapp.strings.error.na"), we could construct the JavaScript file from the ground up. It won't look the same as the original file, but it will return the same result.
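A rough Python sketch of that reconstruction, assuming the first part of each dotted key is the JS variable name and every value is a plain string (both are assumptions; the string values below are placeholders, not the real resources):

```python
import json


def unflatten(properties):
    """Turn {'a.b.c': 'x'} into {'a': {'b': {'c': 'x'}}}."""
    root = {}
    for dotted_key, value in properties.items():
        *parents, leaf = dotted_key.split(".")
        node = root
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


def to_javascript(properties):
    """Emit one 'var Name = {...};' statement per top-level key."""
    tree = unflatten(properties)
    # json.dumps output is used as a JS literal; non-ASCII becomes \uXXXX.
    return "\n".join(
        f"var {name} = {json.dumps(value, indent=2)};" for name, value in tree.items()
    )


if __name__ == "__main__":
    # Keys are taken from the examples above; the values are placeholders.
    props = {
        "PasswordDialogStringTable.passwordEntryTitle": "Enter Password",
        "bapp.strings.error.na": "N/A",
    }
    print(to_javascript(props))
```

The generated file would not match the original formatting, but evaluating it should yield an object with the same keys and values.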

The expression with a function in waf/adviewer/scripts/resources.js evaluates to "var AdResources = {strings: {...} }" in the end, so it's pretty much the same format. And the information that the value of an object's property in waf/browser/js/strings.js is an array could be encoded in the property name. These rare exceptions to the common rule could be hard-coded in the tool.
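For the array case, one possible encoding (an assumption, not something we have agreed on) is an index suffix in the property name, e.g. "menu[0]", "menu[1]". A small Python sketch with hypothetical function and key names:

```python
import re

# Matches keys such as 'some.name[3]' and captures the name and the index.
_INDEXED = re.compile(r"^(?P<name>.+)\[(?P<index>\d+)\]$")


def encode_array(name, values):
    """Flatten a list into indexed property keys: name[0], name[1], ..."""
    return {f"{name}[{i}]": value for i, value in enumerate(values)}


def decode_arrays(properties):
    """Rebuild lists from indexed keys; other keys are passed through."""
    plain = {}
    arrays = {}
    for key, value in properties.items():
        match = _INDEXED.match(key)
        if match:
            items = arrays.setdefault(match.group("name"), {})
            items[int(match.group("index"))] = value
        else:
            plain[key] = value
    for name, items in arrays.items():
        plain[name] = [items[i] for i in sorted(items)]
    return plain


if __name__ == "__main__":
    flat = encode_array("menu", ["first", "second"])
    flat["title"] = "placeholder"
    print(decode_arrays(flat))
```

Each array element stays a separate translatable string this way, and the tool can rebuild the array when it generates the JS file.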

Though I can't say for sure whether .class files can be reconstructed from .properties in the output format of your new parser, I hope so.

The advantage is the absence of Amazon code in the repository.
Old 01-05-2012, 04:13 AM   #71
eureka
but forgot what it's like
Posts: 741
Karma: 2345678
Join Date: Dec 2011
Location: north (by northwest)
Device: Kindle Touch
Quote:
Originally Posted by eureka View Post
I don't know if there are any differences between 5.0.0 and 5.0.1 (concerning resources). I think I'll create a list of MD5 sums for every file that resources were extracted from, and provide it. If someone then makes such a list for 5.0.1, we can find the differing files and will know what was changed (if anything).
I've generated the list of md5sums of the resources from KT 5.0.0. It is attached (along with the list generation script, which is meant to be run on the KT).
Attached Files
File Type: txt resources.md5sum.txt (6.3 KB, 337 views)
File Type: txt list_of_resources_md5sum.sh.txt (412 Bytes, 227 views)
Old 01-05-2012, 05:34 AM   #72
ixtab
Posts: 2,907
Karma: 6736092
Join Date: Dec 2011
Device: K3, K4, K5, KPW, KPW2
Quote:
Originally Posted by eureka
OK, so there are two tasks for now: dealing with resources in JavaScript files and in Java .class files. Which one would you like to take? I'll take the other one then. (Either way, we'll end up with one tool written in Python.)
If you're ok with it, I'd focus on the Java part and let you look into the JS part. That way, we should end up with better quality tools for both (the prior resource extraction script was just a hack, the new one does it in a much better way; and it's better to use a proper JS parser than just regexes).

Quote:
Originally Posted by eureka
Looking at the output of your tool, I had an insight: we may not need the original .js and .class files to construct a translation.

JS strings are always in one format: there is one variable holding an object that contains strings and other objects, which in turn contain strings or other objects... Since the .properties file has the name of the object and its keys ("PasswordDialogStringTable.passwordEntryTitle" or "bapp.strings.error.na"), we could construct the JavaScript file from the ground up. It won't look the same as the original file, but it will produce the same result.
Good idea, and it would probably work for the Java part. For JS I'm not entirely sure. As I mentioned in one of the posts above, there is at least one instance where there is actually JS code instead of a simple string, using a MessageFormat (or something like it) as the value. Maybe we can simply keep the *entire* value as the property (so e.g. for strings, including the single or double quotes around them. For potential "real" JS code, it is then up to the translator to make sure that what they commit is still valid JS)?
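To make the "construct from the ground up" idea concrete, here is a minimal Python sketch. It assumes every value is a plain string, that the first segment of the dotted key is the JS variable name, and that no key is both a string and a parent of other keys; the sample texts in the usage note are invented. It leans on the fact that JSON output is (for ordinary strings) also a valid JavaScript object literal. It does not handle the MessageFormat case mentioned above.

```python
import json


def build_tree(props):
    """Turn {"A.b.c": "text"} into {"A": {"b": {"c": "text"}}}."""
    tree = {}
    for dotted_key, value in props.items():
        *parents, leaf = dotted_key.split(".")
        node = tree
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return tree


def to_javascript(props):
    """Emit one "var <name> = {...};" statement per top-level object."""
    statements = []
    for var_name, value in build_tree(props).items():
        body = json.dumps(value, indent=2, ensure_ascii=False)
        statements.append(f"var {var_name} = {body};")
    return "\n".join(statements) + "\n"
```

For example, to_javascript({"bapp.strings.error.na": "Not available"}) yields a single "var bapp = {...};" statement with the nested strings/error/na objects. Whether the original files declare their tables this way is an assumption that would need checking against the real .js files.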

Quote:
Originally Posted by eureka
Though I can't say for certain whether .class files can be reconstructed from .properties in the output format of your new parser, I hope so.

The advantage is that there would be no Amazon code in the repository.
As to not having Amazon code in the repo: definitely a good idea.

For the (new) Java props parser, the "finding" part is actually really easy. I simply find all classes which extend ResourceBundle, and then use normal Java to get the actual properties (getKeys/getObject). No parsing needed.

For writing back changes, I took a look around, and SERP (http://serp.sourceforge.net) looks good. (Other alternatives: http://java-source.net/open-source/bytecode-libraries). I'll see what can be done when I get to it (as announced earlier, I'm shorter on free time now... maybe over the weekend).

Concerning the md5sums, I'll check 5.0.1 ones tonight.
Old 01-05-2012, 06:35 AM   #73
eureka
Quote:
Originally Posted by ixtab
If you're ok with it, I'd focus on the Java part and let you look into the JS part. That way, we should end up with better quality tools for both (the prior resource extraction script was just a hack, the new one does it in a much better way; and it's better to use a proper JS parser than just regexes).
I'm OK with the JS part.

Quote:
Originally Posted by ixtab
Good idea, and it would probably work for the Java part. For JS I'm not entirely sure. As I mentioned in one of the posts above, there is at least one instance where there is actually JS code instead of a simple string, using a MessageFormat (or something like it) as the value. Maybe we can simply keep the *entire* value as the property (so e.g. for strings, including the single or double quotes around them. For potential "real" JS code, it is then up to the translator to make sure that what they commit is still valid JS)?
I think I've found this JS code. It's in media_player_bar_strings.js, in the value of the trackCountMessageFormat property, and consists of
Code:
new MessageFormat("Track {index} of {count}")
I think that with a JS parser I could recognize such an object instantiation, "split" it into its syntax parts (down to the string passed as the function parameter) and extract the contained string into .properties as 'MediaPlayerBarStrings.trackCountMessageFormat (MessageFormat)=Track {index} of {count}' or something like it. Then, when compiling the .properties back, just wrap the translated string in a 'new MessageFormat(...)' construction. We'll see...

Certainly it won't be a general solution but rather a "hardcoded" specific rule, but it's better than keeping the original JS source.
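A rough Python sketch of that hardcoded rule follows. Note that it uses a regular expression as a stand-in for the JS parser, purely to show the round trip; the " (MessageFormat)" key suffix is the tentative marker proposed above, and the function names are invented. It also assumes values are double-quoted string literals without JS-only escapes, so that they can be read and written as JSON strings.

```python
import json
import re

# Matches: new MessageFormat("...") with a single double-quoted argument.
MESSAGE_FORMAT_RE = re.compile(
    r'^\s*new\s+MessageFormat\(\s*("(?:[^"\\]|\\.)*")\s*\)\s*$'
)
SUFFIX = " (MessageFormat)"  # tentative marker appended to the property key


def extract_value(key, js_value):
    """Return (properties_key, translatable_text) for one JS value expression."""
    match = MESSAGE_FORMAT_RE.match(js_value)
    if match:
        return key + SUFFIX, json.loads(match.group(1))
    # Assumption: anything else is a plain double-quoted string literal.
    return key, json.loads(js_value)


def compile_value(prop_key, text):
    """Return (js_key, js_value_expression) for one translated property."""
    literal = json.dumps(text, ensure_ascii=False)
    if prop_key.endswith(SUFFIX):
        return prop_key[: -len(SUFFIX)], f"new MessageFormat({literal})"
    return prop_key, literal
```

Extracting 'new MessageFormat("Track {index} of {count}")' under the key MediaPlayerBarStrings.trackCountMessageFormat gives the suffixed key and the bare text; compile_value then wraps a translated text back into the same constructor call.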

Quote:
Originally Posted by ixtab
For the (new) Java props parser, the "finding" part is actually really easy. I simply find all classes which extend ResourceBundle, and then use normal Java to get the actual properties (getKeys/getObject). No parsing needed.

For writing back changes, I took a look around, and SERP (http://serp.sourceforge.net) looks good. (Other alternatives: http://java-source.net/open-source/bytecode-libraries). I'll see what can be done when I get to it (as announced earlier, I'm shorter on free time now... maybe over the weekend).
OK. Good luck!
Old 01-07-2012, 03:42 PM   #74
ixtab
So..... there is some progress on the Java side, and I just checked in the sources and binaries. This is a big improvement over the previous "hacky" approach, in multiple ways.
  • Until now, only a subset of all applicable (localization) strings was known, so much, but not all, of the translation could be done.
  • We can now reliably determine everything that can potentially be localized.
  • As far as localization is concerned, and as long as it concerns displayable text, we should be able to provide everything that the Java part of the Kindle encompasses. Plain Strings, as well as 1-, 2-, and 3-dimensional String arrays can now be extracted and translated. For the technically inclined folks, this is where the real challenge is, and where much of the labor has gone.
  • The new tool can create jar files ready to be deployed on the Kindle -- i.e., a locale .jar for a specific language can be built with a single program invocation.

For the people waiting for "when will there be a binary package for KT in my language", please wait a little more. We're working on it, but we're not there yet.

For more adventurous people: you can try to create a bundle containing your localized values and deploy it to your Kindle.

Last edited by ixtab; 01-08-2012 at 01:28 AM.
Old 01-08-2012, 01:46 AM   #75
ixtab
Just realized that the last post wasn't really complete. So here is how to actually use it:

- extract:
java -jar tool/kt-l10n.jar extract -s ~/kindle-touch/fs/opt/amazon -t tmp/
This will recursively scan ~/kindle-touch/fs/opt/amazon for jar files, and write all relevant properties to tmp/.

- compile locale:
java -jar tool/kt-l10n.jar compile -s ./src/5.0.0/framework/ -t /tmp/locale-de.jar de

I have not taken the final step of updating the source .properties on transifex. eureka, could you please verify that the output of "extract" makes sense before we go any further? (BTW, this would also be a way to check for differences between 5.0.0 and 5.0.1. I tried the md5sum approach, but it's pretty useless because the md5sums of almost all files have changed. I'm attaching the output of "extract" for 5.0.1.) Then the only thing left to do is to actually update the source files for transifex... Could you please do that? (I don't know how you managed to automatically update the .tx/config and have the files appear on transifex with the correct names etc.)
Attached Files
File Type: zip extract_501.zip (153.4 KB, 223 views)
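For checking 5.0.0 against 5.0.1 at the level of the extracted strings rather than file checksums, something like the following Python sketch might do. It assumes the "extract" output is a tree of .properties files with matching relative paths in both versions, and it uses a deliberately simplified reader (one key=value per line, no line continuations, escapes left undecoded). The function names are invented for illustration.

```python
from pathlib import Path


def load_properties(path):
    """Very small .properties reader: key=value lines, '#' or '!' comments."""
    props = {}
    # latin-1 never fails to decode; escapes such as \u00e4 are kept verbatim.
    for raw in Path(path).read_text(encoding="latin-1").splitlines():
        line = raw.strip()
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def diff_extracts(old_dir, new_dir):
    """Return {relative_file: [keys that were added, removed or changed]}."""
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    names = {
        p.relative_to(d).as_posix()
        for d in (old_dir, new_dir)
        for p in d.rglob("*.properties")
    }
    report = {}
    for name in sorted(names):
        old = load_properties(old_dir / name) if (old_dir / name).is_file() else {}
        new = load_properties(new_dir / name) if (new_dir / name).is_file() else {}
        changed = sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
        if changed:
            report[name] = changed
    return report
```

Files that are identical in both trees do not appear in the report, so the result only lists strings a translator would actually need to revisit.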
All times are GMT -4.