Old 09-19-2014, 05:51 PM   #1
arspr
Dead account. Bye
Posts: 587
Karma: 668244
Join Date: Mar 2011
Device: none
Reflections about how "translatable" Calibre is - Custom columns.

Hi, guys. Sorry, this post is going to be quite long, but I really think it's necessary. Even if you only read the first few lines, thank you.

I want to start this thread to prompt some thinking about a possible weakness in Calibre. Calibre is in theory English software that can "easily" be translated into any other language. In general this is quite true, but recent experience has shown me that when you work with custom columns, that statement starts to fail... And it especially fails if you intend to switch the GUI language while working on the very same library...

Sure, that is not a common situation at all, and maybe fixing it is not worth the effort. Maybe it would require a complete overhaul of the code or whatever (I'm in no way a code expert), and some of the "bugs/feature requests" I've opened on Launchpad have been dismissed or have received alternative solutions (which are sometimes somewhat "hackish").

I don't intend to demand better solutions (no user of this GREAT software has that right when the software is just donation-ware), but I think that you, Kovid, and your team of coders should spend a little time giving a second thought to this "problem" from a general point of view rather than thinking about specific issues. Maybe in the end you just decide that Calibre is perfect as it is, or that you will continue putting partial patches on the critical issues that might arise, or that even if there could be much better solutions, you just don't have enough time/energy to spend on it. Whatever you decide is just great, but please be fully aware that in this specific area of custom columns, when they are complex, Calibre is just English software in a multi-language "disguise".

Again, no personal problem at all: if you decide that this is the way Calibre is going to be, for whatever reasons, that's perfectly fine.

After this long introduction I'm going to post some of the examples I've found. Possibly some of them are just plain misuse of Calibre's features due to my own ignorance. But it's also quite possible that these are not the only issues Calibre has... So I think that, read as a whole (rather than as individual issues), they give a good description of the general situation.



Custom column names cannot be automatically translated
I mean that when you change from English to Spanish, "Author(s)" and "Language(s)" automatically change into "Autor(es)" and "Idiomas", but that's not possible with my custom column names. In fact you have to change each custom column name manually. There's no "table", "list" or similar place where you could enter the custom column name in each of the possible languages.



Custom template function help cannot be translated
The very same situation.



Some troubles with yes/no columns when they are calculated
Official Yes/No-type columns work really well except for one issue, which is not language related: they cannot be selected to appear in the Tag Browser.

But there's a really useful trick: copying them into an auxiliary extra column ({#my_original_yes_no_column}) which can be shown in the Tag Browser.

And then, especially after this patch, you can say that they work in any language. Their "Yes/No" values are transparently translated into "Sí/No", "Oui/Non" or whatever in both the table view (I mean, #my_original_yes_no_column) and the Tag Browser (through its copied column).

But the real problem arises when you calculate a custom column through some chain of functions that gives a "Yes/No" result. What do you do in that case?
  • If you hardcode "Yes" and "No" as the possible result values (or "Sí/No" or "Oui/Non", depending on your primary language), it works perfectly within that original language. You get the green tick / red cross, and that column is virtually indistinguishable from a built-in Yes/No-type column.
    But if you change the GUI language, it doesn't work anymore because, in Spanish or French, "Yes/No" have no meaning as keywords.
  • Alternatively, you could hardcode "True"/"False", which are actually the real keywords inside a Yes/No column.
    If you do this, the Calibre table view works perfectly (you get the correct icons) no matter what language you select in the GUI. But then the Tag Browser looks somewhat odd, because the native yes/no columns say "Sí/No" (or whatever the words are in each particular language) and the calculated pseudo-yes/no columns say "True/False". It just doesn't look right.

Possible solution, but rejected in this bug ticket: implementing yes_in_GUI_language() and no_in_GUI_language() template functions...
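
To make the proposal concrete, here is a minimal standalone sketch of what such functions might do. It is not calibre code: the lookup table, the `gui_lang` parameter and the lowercase function names are assumptions made purely for illustration (in calibre the words would come from its own translation catalogs, not from a hardcoded dictionary).

```python
# Hypothetical lookup table, for illustration only.
YES_NO_WORDS = {
    "en": ("Yes", "No"),
    "es": ("Sí", "No"),
    "fr": ("Oui", "Non"),
}


def yes_in_gui_language(gui_lang):
    """Return the word for 'yes' in gui_lang, falling back to English."""
    return YES_NO_WORDS.get(gui_lang, YES_NO_WORDS["en"])[0]


def no_in_gui_language(gui_lang):
    """Return the word for 'no' in gui_lang, falling back to English."""
    return YES_NO_WORDS.get(gui_lang, YES_NO_WORDS["en"])[1]


def computed_flag(condition, gui_lang):
    """Value a calculated pseudo-yes/no column would display."""
    if condition:
        return yes_in_gui_language(gui_lang)
    return no_in_gui_language(gui_lang)


print(computed_flag(True, "es"))
print(computed_flag(False, "fr"))
```

With something like this, the calculated column would show the same words as a native Yes/No column after a GUI language switch, instead of a hardcoded "Yes"/"No" or "True"/"False".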



Problems with languages-like column
Well, the problem here is that there are no language-type custom columns... Languages is a tricky kind of column because it actually stores language codes (Spanish is "spa", English is "eng", ...), but it shows the conventional, readable name of that language in the language selected for the GUI. So spa appears as "Spanish", "Español", "Espagnol", ... depending on the GUI language.

But then, I do have an extra pseudo-language-like custom column: #original_language of the ebook. As you can imagine, I've just selected a Tag-Like custom column which actually contains the conventional name of that original language...

Problems:
  • When I switch to another GUI language, languages is automatically updated but #original_language is not. And it looks weird.
  • Comparisons between languages and #original_language are really, really tricky. (I make this comparison to detect and classify which books are translations and which ones aren't).
    You have to use the language_codes() and language_strings() functions which have very limited GUI language support. In my particular library case, they work because I've filled #original_language in English. But imagine that I had set it as #langue_originale (I mean in French) and I had set the GUI language to Spanish... It's just impossible to make comparisons between languages and #langue_originale in that case...
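
The comparison problem can be illustrated with a small standalone sketch. Nothing here is calibre's API: the table is a tiny invented subset, and `name_to_code`, `code_to_name` and `is_translation` are hypothetical helpers. The point is only that a name-to-code lookup needs to know which language the name was typed in, which is the missing piece described above.

```python
# Illustrative subset: language code -> display name per GUI language.
LANGUAGE_NAMES = {
    "eng": {"en": "English", "es": "Inglés", "fr": "Anglais"},
    "spa": {"en": "Spanish", "es": "Español", "fr": "Espagnol"},
    "fra": {"en": "French", "es": "Francés", "fr": "Français"},
}


def name_to_code(name, name_lang):
    """Map a display name written in name_lang back to its language code."""
    wanted = name.strip().casefold()
    for code, names in LANGUAGE_NAMES.items():
        if names.get(name_lang, "").casefold() == wanted:
            return code
    return None  # unknown name, or name written in a different language


def code_to_name(code, gui_lang):
    """Display name of a language code in the given GUI language."""
    return LANGUAGE_NAMES[code][gui_lang]


def is_translation(book_codes, original_name, name_lang):
    """True if the original language is known and not among the book's codes."""
    original_code = name_to_code(original_name, name_lang)
    return original_code is not None and original_code not in book_codes


# A Spanish edition whose original language was typed in French.
print(is_translation(["spa"], "Anglais", "fr"))
```

Once both sides are codes, the comparison no longer depends on the GUI language; only the display step (`code_to_name`) does.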

Possible best solution (but partially rejected quite long ago):
  • Enable TRUE language-type custom columns, and
  • Add full arbitrary language support (through an extra argument) to the language_codes() and language_strings() functions, and
  • Add a "mask as language" option on calculated custom columns. In this way this kind of column containing language codes, (the result of whichever calculations), would be shown as their human readable names in the GUI language.




Additional translation values for Tags and Tag-like custom columns
I mean, if I set a Tag to "Novel", "Thriller", "Fantasy" or whatever, I might like to get "Novela", "Suspense", "Fantástico"... in Spanish. But that is not possible.

And of course it's the same with tag-like columns. I have an #edit_tweaks_made_by_me_on_the_ebook column where I have introduced "Justify text", "Typos", "Remove page margins"... which I would like to see as "Justificar texto", "Erratas", "Eliminar márgenes de página"... in Spanish.

Last edited by arspr; 09-19-2014 at 05:55 PM.
Old 09-20-2014, 05:34 PM   #2
DaltonST
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
Additional translation values for Tags

arspr,

Most of your thoughts are outside of my scope, but I would like to comment on your subsection entitled "Additional translation values for Tags and Tag-like custom columns".

Tags are usually downloaded metadata. Amazon, for example, bases its categories (i.e., Tags) on BISAC Subject Headings and Codes. BISAC stands for Book Industry Standards and Communications; the list is maintained by the Book Industry Study Group (BISG), which is based (unsurprisingly) in New York, where the publishing industry is clustered. Everything BISG publishes for use by real Publishers is in English, as far as I can discern. Their current list can be found at: https://www.bisg.org/complete-bisac-...s-2013-edition .

When you see a Tag in Calibre named "Thriller", it likely was assigned to a particular ebook by its publisher. The publisher was given the BISAC codes for a particular ebook by its author, because the author presumably knows best what their book is about.

For example, the BISAC Subject Heading FIC030000 has the English meaning of "FICTION / Thrillers / Suspense".

The entire list of BISAC Subject Codes can be downloaded from BISG at https://www.bisg.org/publications/bi...s-2013-edition for $295.

If someone were to translate the BISAC Subject Codes into a particular language, and if those translations were imported into a special static reference table in Calibre, then the standard Calibre API could be used to acquire the translations from that (currently non-existent) table for the purpose of translating Tags that had a match.
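
As a rough sketch of that idea, assuming such a table existed: the dictionary below stands in for the (currently non-existent) reference table, and `translate_tag` is a hypothetical helper, not part of the Calibre API. Only the English heading for FIC030000 comes from the BISAC list quoted above; the Spanish string is an invented placeholder, not an official translation.

```python
# Hypothetical static reference table: BISAC code -> heading per language.
BISAC_HEADINGS = {
    "FIC030000": {
        "en": "FICTION / Thrillers / Suspense",
        "es": "FICCIÓN / Thrillers / Suspense",  # placeholder translation
    },
}


def translate_tag(tag, gui_lang, source_lang="en"):
    """Return the heading in gui_lang if the tag matches a known heading.

    Tags without a match (or without a translation) are returned unchanged.
    """
    for headings in BISAC_HEADINGS.values():
        if headings.get(source_lang) == tag:
            return headings.get(gui_lang, tag)
    return tag


print(translate_tag("FICTION / Thrillers / Suspense", "es"))
print(translate_tag("My private tag", "es"))
```

Keying the table by BISAC code rather than by the English text means the stored value never has to change when the display language does.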

However, there is an immediately available alternative option. My plugin 'Derive Genres' is multilingual, and can be used to populate the custom column Genre (#genre) using Boolean Tag Rules. The examples that I provide with that plugin in order to jump-start its users were inspired by Amazon's categories (Tags), which themselves were derived from BISAC Subject Codes. The few test data examples shown in the attached .jpg file are in Spanish, Catalan, Portuguese and German. The Boolean Operators (AND, OR, and NOT) are supported only in English, Spanish, French, German and Hindi because I had to hardcode their translations to English for use in the Boolean Equations, but everything else is fully user-defined since the rules themselves (maintained in a .csv file) must be encoded in Unicode (UTF8). So, your English Tags can be turned into Spanish Genres. Or, you can mix and match and have Fiction genres in Spanish and Nonfiction in German, with the exception of Romance which could be in French or Catalan. My point is that it is highly flexible.

If you have any interest, I refer you to its 19-page instructions/user-guide available within the plugin for more information.


Best of luck.
Attached thumbnail: tag_to_genre_boolean_rules.JPG (109.1 KB)
Old 09-21-2014, 02:32 PM   #3
arspr
Dead account. Bye
Posts: 587
Karma: 668244
Join Date: Mar 2011
Device: none
Dalton, you are just explaining how default Tag values are populated by calibre, plus a "hack" to convert them to whatever you want. But it doesn't solve the root problem: tags, #genre or #my_new_translated_column are always fixed. They do not get automatic translations at all the way Languages or Yes/No columns do (which in fact just look up "parallel" values for other languages in a table; no "intelligent" translation at all).
Old 09-21-2014, 03:17 PM   #4
Terisa de morgan
Grand Sorcerer
Posts: 6,233
Karma: 11768331
Join Date: Jun 2009
Location: Madrid, Spain
Device: Kobo Clara/Aura One/Forma,XiaoMI 5, iPad, Huawei MediaPad, YotaPhone 2
@arspr, what do you mean by "intelligent" translation? As I understand, you're asking for calibre translating all the texts you add. Is it right?
Old 09-21-2014, 03:39 PM   #5
DaltonST
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
Quote:
Originally Posted by arspr View Post
Dalton, you are just explaining how default Tag values are populated by calibre, plus a "hack" to convert them to whatever you want. But it doesn't solve the root problem: tags, #genre or #my_new_translated_column are always fixed. They do not get automatic translations at all the way Languages or Yes/No columns do (which in fact just look up "parallel" values for other languages in a table; no "intelligent" translation at all).
@arspr,

The proper word I believe you are looking for is not "intelligent", but "dynamic". That is the first method I explained in my post (BISAC, etc.). I guess you didn't grok it. Calibre would dynamically translate what is in metadata.db from whatever code is there to whatever language the GUI is currently displaying on the screen for a particular column. You would not see what is actually physically stored in the database, but its dynamic translation to whatever language you want to see at that moment. The screen would display "não ficção" even when the underlying data was originally "nie fikcją" (assuming you wanted Portuguese on the screen instead of Polish).

The dynamic translation would have to be based on something local within Calibre, since there are no little linguist people inside of Calibre with little keyboards. I guess Google Translate could be tied into the process if a local translation were not found. Most Tags are easily translated with a school dictionary. Others that are somewhat idiomatic (such as BDSM) are not. It would be a waste of computing resources (and human wait-time) to constantly search the internet for the translation of the English word "love" to Spanish, but anything not local to Calibre could be retrieved from the web.
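
The "local first, web only as a fallback" lookup described here could be sketched as follows. This is a standalone illustration under stated assumptions: `make_translator` is a hypothetical helper, and `remote_lookup` is any caller-supplied function. No real translation service API is used or implied.

```python
def make_translator(local_table, remote_lookup=None):
    """Build a translate(text, target_lang) function.

    local_table maps (text, target_lang) -> translation.
    remote_lookup, if given, is called only on a local miss, and its
    successful results are cached so the same request is not repeated.
    """
    cache = {}

    def translate(text, target_lang):
        key = (text, target_lang)
        if key in local_table:
            return local_table[key]
        if key in cache:
            return cache[key]
        if remote_lookup is not None:
            result = remote_lookup(text, target_lang)
            if result:
                cache[key] = result
                return result
        return text  # nothing found: show the stored value unchanged

    return translate


# Stand-in for a web service, used only to demonstrate the fallback.
def fake_remote(text, target_lang):
    return {("thriller", "es"): "suspense"}.get((text, target_lang))


translate = make_translator({("love", "es"): "amor"}, fake_remote)
print(translate("love", "es"))      # answered from the local table
print(translate("thriller", "es"))  # answered by the fallback, then cached
```

Common words never leave the machine, and anything the fallback cannot resolve is simply displayed as stored.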

What you are asking for is essentially an entire equivalent to http://translation.babylon.com/ made local to and used dynamically by Calibre for the display on the screen of any custom column metadata that is desired.

That is a wonderful idea. I fully support it. I would adore such a thing. Thank you for suggesting it. Next step: finding someone willing to build it for you.



n.b. A dictionary is just a printed table of "from-to" combinations. If I were a dictionary, I would be offended to be called a "hack".

Last edited by DaltonST; 09-21-2014 at 03:45 PM.
Old 09-22-2014, 02:27 PM   #6
arspr
Dead account. Bye
Posts: 587
Karma: 668244
Join Date: Mar 2011
Device: none
Quote:
Originally Posted by Terisa de morgan View Post
@arspr, what do you mean by "intelligent" translation? As I understand, you're asking for calibre translating all the texts you add. Is it right?
Not exactly. (My English is worse than I thought)

What I'm suggesting is that each value you put in Tags or in Tag-like custom columns ("Novel", "Typos", "Or whatever other thing you want to type") could in fact be a list of values, each one corresponding to a different language.

I mean, if I create "Novel" as a new Tag, all the English, French, Spanish, German, Hindi, or whatever language values are automatically populated with "Novel". But I should be able to edit that series of "Novel" translations to whatever I want. So if I go to the Spanish value and overwrite "Novel" with "Novela", then whenever I change the GUI language to Spanish I get "Novela" instead of "Novel", because in fact "Novel" is not just a single word but a customizable list of equivalent translated words...

Of course there are a lot of difficulties to solve with this scheme. Some examples:
  • What do you do with Word_A and Word_B in English when they both share the same Palabra_Común translation in Spanish?
  • How do you set up the "Rename" feature when in fact you might have to rename a whole list of words at once?
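
The scheme above can be sketched as a small data structure. This is only an illustration of the idea, not a proposal for how calibre stores tags: the class name, its methods and the fixed list of GUI languages are all assumptions.

```python
GUI_LANGUAGES = ("en", "es", "fr")  # assumed fixed set, for illustration


class TagTranslations:
    """Each tag holds one user-editable display value per GUI language."""

    def __init__(self, languages=GUI_LANGUAGES):
        self.languages = tuple(languages)
        self.values = {}  # tag as created -> {language: display value}

    def add(self, tag):
        """Create a tag; every language starts out with the typed value."""
        self.values.setdefault(tag, {lang: tag for lang in self.languages})

    def set_translation(self, tag, lang, value):
        """Overwrite the display value of a tag for one language."""
        self.values[tag][lang] = value

    def display(self, tag, gui_lang):
        """Value shown for a tag in the current GUI language."""
        return self.values[tag].get(gui_lang, tag)

    def collisions(self, lang):
        """Display values shared by more than one tag in the given language."""
        seen = {}
        for tag, per_lang in self.values.items():
            seen.setdefault(per_lang[lang], []).append(tag)
        return {shown: tags for shown, tags in seen.items() if len(tags) > 1}


store = TagTranslations()
store.add("Novel")
store.set_translation("Novel", "es", "Novela")
print(store.display("Novel", "es"))
print(store.display("Novel", "fr"))
```

The `collisions` method shows where the first difficulty would surface: if Word_A and Word_B are both given the Spanish value Palabra_Común, the two tags stay distinct in storage but become indistinguishable on screen in Spanish.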


Quote:
Originally Posted by DaltonST View Post
@arspr,

The proper word I believe you are looking for is not "intelligent", but "dynamic". That is the first method I explained in my post (BISAC, etc.). I guess you didn't grok it. [...]
n.b. A dictionary is just a printed table of "from-to" combinations. If I were a dictionary, I would be offended to be called a "hack".
First of all, and just in case, sorry if the word "hack" offended you. No offence was meant in any way.

Wow, your suggestion about real translations on the fly is really, really ambitious (and also too dangerous). As a matter of fact, I never use automatic translation services like Google Translate on foreign web pages except when there is no other way of understanding the page. They usually contain too many errors...

But please re-read my explanation to Terisa. What I am suggesting is much simpler. It's just a list of values instead of single values... No "automatic" translation at all, just switching to the third value of the vector/list for French and to the seventh for Russian... But it's the user who must fill that vector with the desired translations.
Old 09-22-2014, 05:33 PM   #7
Terisa de morgan
Grand Sorcerer
Posts: 6,233
Karma: 11768331
Join Date: Jun 2009
Location: Madrid, Spain
Device: Kobo Clara/Aura One/Forma,XiaoMI 5, iPad, Huawei MediaPad, YotaPhone 2
How many languages, @arspr? How do you know which position corresponds to each language? How big does the database get? Does calibre have to create all the entries "on the fly" any time you add a value? I think, as the Spanish saying goes, that it is "demasiado arroz para tan poco pollo" (too much rice for so little chicken).
Old 09-22-2014, 06:07 PM   #8
BetterRed
null operator (he/him)
Posts: 20,568
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Does anyone know of any consumer-targeted software (free or pay4me) that allows the user to specify translation tables for data? I can't think of any.

I know of a big-end-of-town package that has it; it costs big-end-of-town money too. I suspect one of the larger central banks might have done it in its bespoke systems, such as the infamous TARGET2.

BR
Old 09-22-2014, 06:15 PM   #9
arspr
Dead account. Bye
Posts: 587
Karma: 668244
Join Date: Mar 2011
Device: none
Of course, Terisa, I don't know all the drawbacks, or how to code it, or even if it is worth the effort. And of course maybe (possibly) it's a completely crazy idea... As I said, I just wanted to start a brainstorm, nothing more, nothing less.

Nevertheless I've just solved the yes/no issue. It was reaaaaaally easy. Now all my calculated yes/no columns support GUI language changes!!!

Follow the link to launchpad in my first post to get the solution...