Old 11-19-2015, 04:08 PM   #1
Kay_2
Junior Member
Posts: 3
Karma: 10
Join Date: Feb 2015
Device: Win 8.1 / 7Pro, Android
Force transcription for Non-ASCII characters?

Hey there!
I was wondering whether there is any way to force calibre to transcribe titles, authors and so on in a particular way. I'm not sure about other languages, but with Chinese and Japanese the result is pretty sad right now, because calibre ignores the common transcription conventions and produces nonsense.

An example:

The author 仲村佳樹 (read "Nakamura Yoshiki" or, since Asian names are written family name first, "Yoshiki Nakamura" in Western order) becomes "Zhong Cun Jia Shu"
The title スキップ・ビート 第30巻 (read as "Skip Beat dai 30 kan") becomes "sukitupubito Di 01Juan"

While in this case one might be tempted to overlook the faux pas because the title is a Japanese-English loanword, the issue becomes more pronounced with "normal" Japanese titles:

The title コイバナ! 恋せよ花火 第08巻 ("Koibana Koiseyo Hanabi") becomes "koibana! Lian seyoHua Huo Di 08Juan"
The author ななじ眺 (read "Nanaji Nagamu" or, in Western order, "Nagamu Nanaji") becomes "Nanazi Tiao"

While I was about to praise the system for at least transcribing Japanese Hiragana and Katakana correctly, the "zi" in "Nanazi" made me change my mind: the usual (Hepburn) transcription of じ is "ji", not "zi".
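To illustrate where a "zi" can come from: the Kunrei-shiki romanization system writes じ as "zi", while Hepburn writes "ji". The snippet below is only a minimal sketch with a tiny hand-made table (the names `HEPBURN`, `KUNREI` and `romanize_kana` are made up for this example); whether calibre's own table follows Kunrei-shiki is an assumption, not something I have checked.

```python
# Tiny illustrative kana tables; only a few kana from the examples are covered.
HEPBURN = {"な": "na", "じ": "ji", "こ": "ko", "い": "i", "ば": "ba"}

# Kunrei-shiki differs from Hepburn for some kana, including じ.
KUNREI = {**HEPBURN, "じ": "zi"}


def romanize_kana(text, table):
    """Replace each kana found in `table`; leave every other character as-is."""
    return "".join(table.get(ch, ch) for ch in text)


print(romanize_kana("ななじ", HEPBURN))
print(romanize_kana("ななじ", KUNREI))
```

A real table would of course need all kana plus the rules for small っ/ッ and the long-vowel mark ー, which this sketch leaves out.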

But let's get to the really messed-up part: Japanese Kanji or Chinese Hanzi.

Whenever there is a Kanji in a word, calibre apparently uses the first Chinese reading on some list, with no regard for any language rules. I'd rather have the name stored in the original script right away if that means I won't get nonsense file names out of it. Or, one can always hope, find a way to use a correct transcription.

Why, if it ignores the language metadata either way, doesn't calibre at least recognize that if the title contains one Japanese character, the rest must be Japanese too? That would at least fix the matter for any words that contain Hiragana, Katakana or Japanese-only Kanji... The tougher part would probably be the characters that have both Japanese and Chinese readings.
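What I have in mind could look roughly like this. It is only a sketch: it checks the Unicode Hiragana/Katakana block (U+3040 to U+30FF) and the main CJK ideograph block (U+4E00 to U+9FFF), and the function names are invented for the example. It says nothing about how calibre works internally.

```python
def is_kana(ch):
    """True for characters in the Hiragana or Katakana Unicode blocks."""
    return "\u3040" <= ch <= "\u30ff"


def is_han(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def guess_script_language(text):
    """Return 'ja' if any kana is present, 'ambiguous' for Han-only text,
    and None when the text contains neither."""
    if any(is_kana(ch) for ch in text):
        return "ja"
    if any(is_han(ch) for ch in text):
        return "ambiguous"  # could be Chinese or Japanese
    return None


for sample in ("コイバナ! 恋せよ花火 第08巻", "仲村佳樹", "Skip Beat"):
    print(sample, "->", guess_script_language(sample))
```

The second sample shows the tough case: a name written only in Kanji stays ambiguous, so the language field would still be needed there.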

While it is probably a lot harder to write a script that converts Chinese and Japanese characters back into alphabet letters, there must be some way, because it works the other way around if you install the IME.

When you type "wangzi" in Chinese or "ouji" in Japanese, you automatically get "王子" both times. I'm no expert, but can't calibre use that IME "intelligence" together with the language-field metadata to create transcriptions automatically? I'm aware that it won't always work, since some characters have several possible readings even within a single language, but it would be a lot better than the bogus output right now.
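In code, the idea would be something like the following. The dictionary is a hypothetical stand-in for whatever data an IME uses, filled only with the one example from above, and `READINGS` and `transcribe` are names I made up.

```python
# Hypothetical reading dictionary: word -> {language code: transcription}.
READINGS = {
    "王子": {"zh": "wangzi", "ja": "ouji"},
}


def transcribe(word, language):
    """Look up the reading for `word` in the given language.

    Falls back to the untouched word when no reading is known, so nothing
    gets replaced by a guess from the wrong language.
    """
    return READINGS.get(word, {}).get(language, word)


print(transcribe("王子", "zh"))
print(transcribe("王子", "ja"))
```

The important part is the fallback: if the language or the word is unknown, the original characters are kept instead of being transcribed wrongly.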

If that is too complicated, is it possible to force the issue "by hand"?

Right now I always write the titles in their "native" language and put the transcription in brackets behind them. Otherwise I have no chance of finding anything in the folders when I cannot access the calibre interface and/or need to copy certain files. (I use calibre to store scanned comic RAR archives as well, and because calibre doesn't always store the whole title when it is too long, I end up with file names that have nothing to do with any language known to mankind.)

So right now I always write entries like this:
Title: コイバナ! 恋せよ花火 第08巻 [Koibana Koiseyo Hanabi] Author: ななじ眺 & Nagamu Nanaji
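If someone wants to post-process entries written this way, the bracketed part can be pulled out again with a regular expression. This is just a sketch that assumes the transcription is always the last thing in the title and always in square brackets; `split_title` is a made-up helper, not a calibre function.

```python
import re

# Native title, optional whitespace, then "[transcription]" at the very end.
_BRACKETED = re.compile(r"^(?P<native>.*?)\s*\[(?P<latin>[^\[\]]+)\]\s*$")


def split_title(title):
    """Split 'Native title [Transcription]' into (native, transcription).

    Returns (title, None) when there is no trailing bracketed part.
    """
    match = _BRACKETED.match(title)
    if match is None:
        return title, None
    return match.group("native"), match.group("latin")


print(split_title("コイバナ! 恋せよ花火 第08巻 [Koibana Koiseyo Hanabi]"))
print(split_title("Some Plain Title"))
```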

I thought about creating a new sorting system with an additional "transcription" column, and then making calibre put everything under author_transcription -> series_transcription -> title_transcription or something like that. But halfway through the idea I remembered that this would f*** with the rest of my "alphabet-friendly" library entries as well. Oh, and the system language doesn't change anything here either. Same problems.
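What would be needed to keep the "alphabet-friendly" entries intact is a fallback: use the transcription field when it is filled in, otherwise the normal field. As a plain Python sketch of that logic (the dictionary keys and function names are hypothetical, and this is not how calibre builds its paths):

```python
def folder_name(book, field):
    """Prefer '<field>_transcription' when it is set, else the normal field."""
    return book.get(field + "_transcription") or book[field]


def library_path(book):
    """Build 'author/series/title' from a book record given as a dict."""
    return "/".join(folder_name(book, f) for f in ("author", "series", "title"))


japanese_book = {
    "author": "ななじ眺",
    "author_transcription": "Nagamu Nanaji",
    "series": "コイバナ! 恋せよ花火",
    "series_transcription": "Koibana Koiseyo Hanabi",
    "title": "第08巻",
    "title_transcription": "Vol 08",
}
english_book = {"author": "Jane Doe", "series": "Some Series", "title": "Book One"}

print(library_path(japanese_book))
print(library_path(english_book))
```

With a fallback like this, entries without a transcription would end up exactly where they are now.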

Does anyone have an idea how to solve this issue? I know it's not exactly a grave problem, but if there is a possible solution, I'd like to try it.