
Characters, glyphs, hanzi, etc....



-------------------
Jen-yih Chu wrote:
: Dear Dr. Lee:
: ....
: I use "Hansii" instead of "Chinese" because
: my friends from Korea, Japan and Vietnam used
: to complain to me that those should better
: be called "Hansii" than "Chinese" characters.
: ....
-------------------

Hi,

I'd just like to comment that the appropriate term should be used when talking
about characters derived from the Chinese script.

Chinese    : Hanzi
Japanese   : Kanji
Korean     : Hanja
Vietnamese : Chu+~ Ha'n (used exclusively for texts of Chinese origin)

Of course, there are also characters which were created in the non-Chinese
countries which adopted the script. These are:

Japanese   : kokuji (a few have permeated into use in Chinese)
Korean     : gugja (no data)
Vietnamese : Chu+~ No^m (used exclusively for Vietnamese)

There is a neutral word, 'glyph', which at once means the "character" and
embodies an idea of "meaning". However, a glyph is appearance dependent: a
single character can be represented by different glyphs (e.g. simplified and
traditional forms).

(Vietnam now uses an alphabetic script: a slightly modified Roman alphabet,
expanded to cover Vietnamese's rich set of vowels.)

From a computing point of view, there is a wide overlap of characters between
the languages, alongside characters which belong to only one of them. So
seemingly differing glyphs across the CJKV spectrum can be gathered together and
'unified'. This is part of the project known as Unicode. The characters in the
encodings used in China, Japan, and Korea were systematically researched, and it
was found that some 20902 characters are unique.
These are the basis of the CJKV characters used in Unicode.
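
As a rough illustration of that figure, here is a minimal Python sketch. It
assumes the 20902 unified characters are the ones laid out in the Unicode block
running from U+4E00 to U+9FA5; the helper name in_unified_block is made up for
this example, and the test string is the name from my signature.

```python
# Assumption: the 20902 unified ideographs occupy U+4E00..U+9FA5.
URO_FIRST = 0x4E00
URO_LAST = 0x9FA5


def in_unified_block(ch: str) -> bool:
    """Return True if ch falls inside the assumed unified ideograph range."""
    return URO_FIRST <= ord(ch) <= URO_LAST


# Size of the range, counting both end points.
block_size = URO_LAST - URO_FIRST + 1
print(block_size)

# Show the code point of each character in a sample name and test membership.
name = "宋偉雄"
print([f"U+{ord(c):04X}" for c in name])
print(all(in_unified_block(c) for c in name))
```

A rare character from an old text would simply fail this membership test, which
is the limit discussed further on.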

This allows Big5 or GB encoded documents to be transformed into the Unicode
encoding. Likewise, documents in the Japanese and Korean encodings can be
converted to it.
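
To give an idea of what such a conversion looks like in practice, here is a
small sketch using the codecs that ship with Python ("big5", "gb2312",
"shift_jis", "euc_kr"). The function name to_unicode and the sample strings are
only illustrative, and a real document may need a different codec (or error
handling for bytes the codec does not know).

```python
def to_unicode(raw: bytes, source_encoding: str) -> str:
    """Decode legacy-encoded bytes into a Unicode string."""
    return raw.decode(source_encoding)


# Sample text; the byte strings stand in for data read from legacy files.
traditional = "宋偉雄"
simplified = "宋伟雄"

big5_bytes = traditional.encode("big5")      # Taiwan / Hong Kong style
gb_bytes = simplified.encode("gb2312")       # mainland China style
sjis_bytes = "漢字".encode("shift_jis")      # Japanese
euckr_bytes = "漢字".encode("euc_kr")        # Korean (hanja)

# Each legacy byte sequence decodes to the same kind of Unicode string.
for raw, enc in [(big5_bytes, "big5"), (gb_bytes, "gb2312"),
                 (sjis_bytes, "shift_jis"), (euckr_bytes, "euc_kr")]:
    text = to_unicode(raw, enc)
    # Re-encode as UTF-8, one common way of storing Unicode text.
    print(enc, text, text.encode("utf-8").hex())
```

Note that the Japanese and Korean byte strings differ, yet both decode to the
same two Unicode characters. That is the unification at work.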

For Vietnamese, there is a large body of characters which are not in Unicode,
but there is plenty of room available in this encoding to place them in the
user-defined (private use) section.
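
A sketch of how that might be done, assuming the user-defined section meant here
is the Private Use Area at U+E000..U+F8FF. The allocate helper and the
placeholder labels are hypothetical; any such assignment is a private agreement
between sender and receiver, not part of the standard.

```python
import unicodedata

# Assumption: the "user defined section" is the Private Use Area of the
# Basic Multilingual Plane.
PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF
pua_size = PUA_LAST - PUA_FIRST + 1


def is_private_use(ch: str) -> bool:
    """True if ch is a private-use code point (general category 'Co')."""
    return unicodedata.category(ch) == "Co"


def allocate(labels):
    """Hypothetical helper: give each unencoded character a private code point."""
    if len(labels) > pua_size:
        raise ValueError("not enough private-use code points")
    return {label: chr(PUA_FIRST + i) for i, label in enumerate(labels)}


# Placeholder labels standing in for characters that lack a standard code.
table = allocate(["nom-char-001", "nom-char-002", "nom-char-003"])
for label, ch in table.items():
    print(label, f"U+{ord(ch):04X}", is_private_use(ch))
```

The catch is that a font and a mapping table must travel with the document,
since nobody else will know what those code points stand for.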

What Unicode does is allow cross-cultural communication, and not only among the
CJKV countries: Cyrillic, Greek, and other alphabetic characters are included
as well.

From a practical point of view, the encodings do not include rare characters
used in old Chinese texts, so Unicode cannot be viewed as an all-encompassing
entity. What computing does is raise one's awareness of the limits one faces
when one needs to put characters on the screen. With movable type printing there
is no problem, as new glyphs can be made, whereas here you're stuck with one of
the 20902 characters in Unicode, and at the mercy of the font developer.

A standard for characters, CNS 11643-1992, contains 48027 hanzi. There is also
CCCII, which is forecast to contain 75684 characters when it is completed.

Dylan.

*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-*
*  CJKV Encodings      Dylan W.H. Sung      mabr12@dial.pipex.com    *
*====================================================================*
*             UTF-7 +W4tQSZbE   UTF-8 宋偉雄                         *
*  Chinese    Song4 Wei3Xiong2   Sung3 Wai5Hung4   Sung4 Vui3Hiung2  *
*             Hanzi ~{KNN0P[~}                                       *
*  Japanese   Sou I-Yuu                                              *
*             NewJIS $BAW0NM:(J                                      *
*  Korean     Song Wi Ung                                            *
*  Vietnamese To^'ng Vi~ Hu'ng                                       *
*====================================================================*
*            http://www.geocities.com/Tokyo/Pagoda/3847/             *
*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-*