Characters, glyphs, hanzi, etc....
-------------------
Jen-yih Chu wrote:
: Dear Dr. Lee:
: ....
: I use "Hansii" instead of "Chinese" because
: my friends from Korea, Japan and Vietnam used
: to complain to me that those should better
: be called "Hansii" than "Chinese" characters.
: ....
-------------------
Hi,
I'd just like to comment that the appropriate term should be used when talking
about characters derived from the Chinese script.
Chinese : Hanzi
Japanese : Kanji
Korean : Hanja
Vietnamese : Chu+~ Ha'n (used exclusively for texts of Chinese origin)
Of course, there are also characters which were created in the non-Chinese
countries that adopted the script. These are
Japanese : kokuji (a few have permeated into use in Chinese)
Korean : gugja (no data)
Vietnamese : Chu+~ No^m (used exclusively for Vietnamese)
There is also a neutral word, 'glyph', which at once means the "character" and
embodies an idea of "meaning". However, a glyph is appearance dependent, that
is, a single character can be represented by different glyphs (e.g. simplified
and traditional forms).
(Vietnam now uses an alphabetic script: a slightly modified Roman alphabet,
expanded to cover Vietnamese's rich set of vowels.)
From a computing point of view, there is wide overlap of characters between the
languages, alongside characters which are used in only one of them. So seemingly
differing glyphs across the CJKV spectrum can be gathered together and
'unified'. This is part of the project known as Unicode. The characters in the
encodings used in China, Japan, and Korea were systematically researched, and it
was found that some 20902 characters are unique.
These are the basis of the CJKV characters used in Unicode.
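As a rough check on that figure, here is a minimal Python sketch. It assumes the
unified characters occupy the code points U+4E00 through U+9FA5; that range is
not given in this message, and the function names are only illustrative.

```python
# Assumed bounds of the unified ideograph block (not stated in the message).
CJK_FIRST = 0x4E00
CJK_LAST = 0x9FA5


def unified_count(first=CJK_FIRST, last=CJK_LAST):
    """Return the number of code points in the inclusive range first..last."""
    return last - first + 1


def in_unified_block(ch):
    """Return True if the single character ch falls inside the assumed block."""
    return CJK_FIRST <= ord(ch) <= CJK_LAST


if __name__ == "__main__":
    print(unified_count())
    # The three characters of the name in the signature, in traditional form.
    print([in_unified_block(c) for c in "宋偉雄"])
```

Under that assumption the count comes out at the 20902 quoted above, and any
hanzi, kanji or hanja drawn from the source encodings should test as inside the
block.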
This allows Big5 or GB encoded documents to be transformed into the Unicode
encoding. Likewise, Japanese encodings can be converted to it, as can Korean.
For Vietnamese, there is a large body of characters which are not in Unicode,
but there is plenty of room available in this encoding to place them in the
user defined section.
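To illustrate the conversion, here is a hedged sketch using the codecs that ship
with Python. The two byte strings are the UTF-7 and HZ samples from the
signature of this message (a closing "-" is added to the UTF-7 one); the helper
names are hypothetical, and the "user defined section" is taken here to mean
Unicode's Private Use Area, which is an assumption about what is meant above.

```python
import unicodedata

# Samples copied from the signature below; the trailing "-" is added here
# to terminate the UTF-7 shifted sequence.
UTF7_SAMPLE = b"+W4tQSZbE-"   # traditional form of the name
HZ_SAMPLE = b"~{KNN0P[~}"     # simplified form, HZ-wrapped GB


def convert(raw, source, target):
    """Re-encode raw bytes from one encoding to another via Unicode."""
    return raw.decode(source).encode(target)


def is_private_use(ch):
    """Return True if ch lies in a Private Use Area (category 'Co')."""
    return unicodedata.category(ch) == "Co"


traditional = UTF7_SAMPLE.decode("utf-7")
simplified = HZ_SAMPLE.decode("hz")

# Legacy encodings reached by going through Unicode first.
big5_bytes = traditional.encode("big5")
gb_bytes = simplified.encode("gb2312")

if __name__ == "__main__":
    print(traditional, [hex(ord(c)) for c in traditional])
    print(simplified, [hex(ord(c)) for c in simplified])
    print(convert(big5_bytes, "big5", "utf-8"))
    print(is_private_use("\ue000"))
```

Going through Unicode in the middle means each legacy encoding only needs one
table, to and from Unicode, rather than one table per pair of encodings.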
What Unicode does is allow cross-cultural communication, and not only between
CJKV countries: Cyrillic, Greek, and other alphabetic characters are included to
enable this.
From a practical point of view, the encodings do not include rare characters
used in old Chinese texts, so Unicode cannot be viewed as an all-encompassing
entity. What computing does is raise one's awareness of the limits one faces
when one needs to put characters on the screen. With movable type printing,
there is no problem, as new glyphs can be made, whereas here you're stuck with
the 20902 characters in Unicode, and at the mercy of the font developer.
A standard for characters, CNS 11643-1992, contains about 48027 hanzi.
There is also the CCCII, which is forecast to contain 75684 characters when it
is completed.
Dylan.
*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-*
* CJKV Encodings Dylan W.H. Sung mabr12@dial.pipex.com *
*====================================================================*
* Unicode [IPĖ UTF-7 +W4tQSZbE UTF-8 宋偉雄 *
* Chinese Song4 Wei3Xiong2 Sung3 Wai5Hung4 Sung4 Vui3Hiung2 *
* Big5 GB ΰ GBK Hanzi ~{KNN0P[~} *
* Big5-Zhuyin *
* Japanese Sou I-Yuu *
* EUCJIS װͺ SJIS v̗Y NewJIS $BAW0NM:(J *
* EUCJIS-Hiragana 椦 EUCJIS-Katakana 楦 *
* Korean Song Wi Ung *
* KSC-Hanja KSC-Hangul *
* Vietnamese To^'ng Vi~ Hu'ng *
* TCVN Chu+~ Ha'n *
*====================================================================*
* http://www.geocities.com/Tokyo/Pagoda/3847/ *
*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-*