Mobileread
Need Help from Chinese Speakers
#1  KevinH 10-26-2020, 12:07 PM
Sigil uses Transifex for translations. I can handle and understand most of this except for the Chinese (zh) language codes. I see 3 codes instead of the expected 2.

I expected zh_CN to be simplified Chinese.
I expected zh_TW to be traditional Chinese.

All based on my searching and reading on websites.

But we have 3 zh-related language codes on Transifex for Sigil.

There are the expected ones:

zh_CN and zh_TW but there is also just a zh code with no region.

When I diff the translation files, it is clear that zh_CN and zh have very few differences and many are just whitespace differences, whereas zh_TW and zh seem to have many more differences.

So is zh different enough from zh_CN to warrant including a third translation for basically the same language? Sigil releases are already big downloads, and I do not want to include a language variant that is not significantly different from another.

So if zh_CN really represents simplified Chinese and zh_TW really represents traditional Chinese (ignoring regions and country boundary claims here), what does zh (no region code) bring to the table?

Thanks for any help here.

#2  Tex2002ans 10-26-2020, 05:27 PM
Quote KevinH
When I diff the translation files, it is clear that zh_CN and zh have very few differences and many are just whitespace differences, whereas zh_TW and zh seem to have many more differences.
Don't know if you stumbled upon my topic from a few months ago:

"Should Chinese Fonts be Embedded in Ebooks?"

Many of the CJK languages use the same character, but display differently depending on the language/font:

Like 返 (U+8FD4) can be displayed 5 different ways:

https://en.wikipedia.org/wiki/File:Source_Han_Sans_Version_Difference.svg

Quote KevinH
So is zh different enough from zh_CN to warrant including a third translation for basically the same language? Sigil releases are already big downloads, and I do not want to include a language variant that is not significantly different from another.
Unsure, unsure. But in Post #8 of the above thread, I also linked to talks discussing common Asian-language bugs/issues within open source programs.

Also, "Source Han Sans" is one of the major CJK fonts designed by Adobe. They also have a lot of great documentation discussing many ins-and-outs and differences between the languages (punctuation alignment, etc. etc.).

https://github.com/adobe-fonts/source-han-sans/

I haven't done too much research into it since then, and since I can't read/write any Asian languages, it all looks too similar to me (probably why so many "western" programs have so many Asian-rendering bugs!).

#3  KevinH 10-26-2020, 06:01 PM
Yes, what I am comparing really has nothing to do with fonts. I compared the two different .ts files as UTF-8 byte sequences.

I just need to know if/how the "zh_CN" translation differs from the plain "zh" translation. The difference between zh_CN and zh_TW is simplified versus traditional, so that is understood.
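
For anyone who wants to repeat the comparison without whitespace noise, here is a rough sketch in Python. It assumes the usual Qt Linguist .ts layout (`context` / `name` / `message` / `source` / `translation`); the function names and the sample strings are made up for illustration and are not taken from Sigil's actual files.

```python
import xml.etree.ElementTree as ET


def load_translations(root):
    """Map (context name, source string) -> translation for one .ts tree."""
    table = {}
    for ctx in root.iter("context"):
        name = ctx.findtext("name", default="")
        for msg in ctx.iter("message"):
            source = msg.findtext("source", default="")
            table[(name, source)] = msg.findtext("translation", default="")
    return table


def normalize(text):
    """Collapse every run of whitespace so whitespace-only edits compare equal."""
    return " ".join(text.split())


def real_differences(a, b):
    """Return entries present in both tables whose translations differ
    after whitespace normalization."""
    return [
        (key, a[key], b[key])
        for key in sorted(a.keys() & b.keys())
        if normalize(a[key]) != normalize(b[key])
    ]


# Hypothetical sample data, not real Sigil strings.
SAMPLE_ZH = ET.fromstring(
    '<TS language="zh"><context><name>MainWindow</name>'
    "<message><source>Open</source><translation>打开 </translation></message>"
    "<message><source>Save</source><translation>保存</translation></message>"
    "</context></TS>"
)
SAMPLE_ZH_CN = ET.fromstring(
    '<TS language="zh_CN"><context><name>MainWindow</name>'
    "<message><source>Open</source><translation>打开</translation></message>"
    "<message><source>Save</source><translation>储存</translation></message>"
    "</context></TS>"
)

diffs = real_differences(
    load_translations(SAMPLE_ZH), load_translations(SAMPLE_ZH_CN)
)
```

With the sample data, only the "Save" entry is reported, since "Open" differs by a trailing space alone. For real files, something like `ET.parse(path).getroot()` would replace the inline samples.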

#4  The_book 10-27-2020, 06:57 AM
I don't think so. Combining zh and zh_CN should be fine. They may differ in metadata, so zh may still be needed for Default Language For Metadata, but for Sigil's User Interface Language I think keeping just zh_CN/zh_TW (or, more exactly, zh-Hans/zh-Hant) will be fine.
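
To make the zh_CN/zh_TW versus zh-Hans/zh-Hant correspondence concrete, here is a small Python sketch. The table follows the common convention (also reflected in CLDR's likely-subtags data) that a bare zh resolves to Simplified; the function name is hypothetical, and this is not how Sigil or Transifex resolve codes as far as I know.

```python
# Region -> script, following the usual convention for Chinese locales.
REGION_TO_SCRIPT = {
    "CN": "Hans",  # Mainland China: Simplified
    "SG": "Hans",  # Singapore: Simplified
    "TW": "Hant",  # Taiwan: Traditional
    "HK": "Hant",  # Hong Kong: Traditional
    "MO": "Hant",  # Macau: Traditional
}


def to_script_tag(code):
    """Reduce a Chinese locale code such as 'zh_CN' to 'zh-Hans' or 'zh-Hant'."""
    parts = code.replace("_", "-").split("-")
    if parts[0].lower() != "zh":
        raise ValueError(f"not a Chinese locale code: {code!r}")
    # An explicit script subtag wins over any region.
    for part in parts[1:]:
        if part.title() in ("Hans", "Hant"):
            return f"zh-{part.title()}"
    for part in parts[1:]:
        if part.upper() in REGION_TO_SCRIPT:
            return f"zh-{REGION_TO_SCRIPT[part.upper()]}"
    # Assumption: bare 'zh' (or an unlisted region) is treated as Simplified.
    return "zh-Hans"
```

Under this mapping, zh and zh_CN both land on zh-Hans, which is consistent with the two translation files being nearly identical.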

#5  KevinH 10-27-2020, 10:08 AM
Thank You!

#6  DiapDealer 10-27-2020, 12:15 PM
The only input I can offer is that I seem to vaguely remember some sort of dispute between two of Sigil's Transifex Chinese translation camps. I don't remember if it was these two or not. I could be way off base, though.

#7  KevinH 10-27-2020, 12:34 PM
Hope not. I really do not plan or want to bundle two versions of the same language just because they cannot agree.
