From TORI

Australien-Krokodil.JPG

ワニ (crocodile) by Pbuergler [1][2]

DEN view mouth ja.gif
by msdmanuals, 2017 [3]

口, X53E3, is Unicode character number 21475, a KanjiLiberal.

In Japanese, 口 can be pronounced "kuchi" and means "mouth".
At the site nihongomaster, 口 appears as kanji number 845 [4].

Several other Unicode characters have a similar drawing, with similar (or identical) glyphs, and are easily confused with each other. As of 2025, the glyph (a "rectangle") has not been assigned a unique, exclusive Unicode number; the glyph alone does not identify the character.
Because of this confusion, the character 口 and the other characters with the same or similar glyphs are not allowed in the technical language Tarja.

Encoding of ⼝, ⼞, ロ, 口, ﾛ

These 5 Unicode characters look similar.
Their glyphs appear as rectangles.
Some programming is necessary to identify each of these characters.
Their UTF-8 encoding can be revealed with the PHP program du.t; the command

php du.t ⼝⼞ロ口ﾛ

produces the following output:

⼝⼞ロ口ﾛ
The array has 15 bytes; here is its splitting:
e2 bc 9d e2 bc 9e e3 83 ad e5 8f a3 ef be 9b
array(5) {
  [0]=>
  string(3) "⼝"
  [1]=>
  string(3) "⼞"
  [2]=>
  string(3) "ロ"
  [3]=>
  string(3) "口"
  [4]=>
  string(3) "ﾛ"
}
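
The source of du.t is not shown here, so the following is only a minimal sketch, in Python rather than PHP, of how a similar dump could be produced. The function name dump_utf8 is hypothetical; the five characters are written as escapes so that they cannot be mixed up in the program text.

```python
def dump_utf8(text):
    """Return a (character, hex bytes) pair for every character in text."""
    return [(ch, ch.encode("utf-8").hex(" ")) for ch in text]


# X2F1D, X2F1E, X30ED, X53E3, XFF9B, in this order.
sample = "\u2f1d\u2f1e\u30ed\u53e3\uff9b"
data = sample.encode("utf-8")

print(f"The string has {len(data)} bytes; here is its splitting:")
print(data.hex(" "))
for ch, hex_bytes in dump_utf8(sample):
    # One line per character: Unicode number, glyph, UTF-8 bytes.
    print(f"X{ord(ch):04X} {ch} -> {hex_bytes}")
```

The byte sequence printed by this sketch should agree with the 15 bytes listed in the dump above.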

Unicode character number 12061, id est, X2F1D
Picture: ⼝ ; uses 3 bytes. These bytes are:
xE2 xBC x9D in the hexadecimal representation and
226 188 157 in the decimal representation

Unicode character number 12062, id est, X2F1E
Picture: ⼞ ; uses 3 bytes. These bytes are:
xE2 xBC x9E in the hexadecimal representation and
226 188 158 in the decimal representation

Unicode character number 12525, id est, X30ED
Picture: ロ ; uses 3 bytes. These bytes are:
xE3 x83 xAD in the hexadecimal representation and
227 131 173 in the decimal representation

Unicode character number 21475, id est, X53E3
Picture: 口 ; uses 3 bytes. These bytes are:
xE5 x8F xA3 in the hexadecimal representation and
229 143 163 in the decimal representation

Unicode character number 65435, id est, XFF9B
Picture: ﾛ ; uses 3 bytes. These bytes are:
xEF xBE x9B in the hexadecimal representation and
239 190 155 in the decimal representation
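
The three bytes listed for each character follow from the general UTF-8 rule for numbers between X0800 and XFFFF: the 16 bits of the Unicode number are split into groups of 4, 6 and 6 bits and prefixed with 1110, 10 and 10. The sketch below applies this rule by hand and compares the result with Python's built-in encoder; the function name utf8_three_bytes is an assumption made for illustration.

```python
def utf8_three_bytes(code_point):
    """Encode a code point from the 3-byte UTF-8 range by bit arithmetic."""
    if not 0x0800 <= code_point <= 0xFFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("only non-surrogate code points X0800..XFFFF are handled")
    b1 = 0xE0 | (code_point >> 12)          # 1110xxxx: top 4 bits
    b2 = 0x80 | ((code_point >> 6) & 0x3F)  # 10xxxxxx: middle 6 bits
    b3 = 0x80 | (code_point & 0x3F)         # 10xxxxxx: low 6 bits
    return b1, b2, b3


for cp in (0x2F1D, 0x2F1E, 0x30ED, 0x53E3, 0xFF9B):
    manual = utf8_three_bytes(cp)
    # The hand-made bytes must agree with the built-in encoder.
    assert bytes(manual) == chr(cp).encode("utf-8")
    hex_form = " ".join(f"x{b:02X}" for b in manual)
    dec_form = " ".join(str(b) for b in manual)
    print(f"{cp} X{cp:04X}: {hex_form} = {dec_form}")
```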

Phonetic

Character 口 can be pronounced "kuchi" [5].

When it follows other kanji in a compound, it can also be pronounced "guchi".

Graphic

Character 口 appears as a rectangle; this rectangle may be slightly deformed, depending on the software used to display it. Four other Unicode characters (⼝, ⼞, ロ, ﾛ) have a similar shape.

Semantic

Character 口 may mean "mouth" [5].

Examples

入口, いりぐち ("iriguchi"), entrance [6]

出口, でぐち ("deguchi"), exit

Confusion

口 (X53E3) is a highly confusable character. Even a native Japanese speaker, looking at the characters
⼝, ⼞, ロ, 口, ﾛ, cannot answer:
which of them is X2F1D (KanjiRadical)?
which of them is X2F1E (KanjiRadical)?
which of them is X30ED (Katakana "ro")?
which of them is X53E3 (KanjiLiberal)?
which of them is XFF9B (a character from the UnicodeBottom table)?
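
A program can answer these questions even when the eye cannot. The sketch below uses the names stored in the Unicode Character Database through Python's standard unicodedata module; these are the official Unicode names, not the TORI categories KanjiRadical or KanjiLiberal, and the helper name describe is an assumption.

```python
import unicodedata


def describe(ch):
    """Return the X-number and the official Unicode name of one character."""
    return f"X{ord(ch):04X}", unicodedata.name(ch, "UNNAMED")


# The five look-alikes, written as escapes to keep them apart.
for ch in "\u2f1d\u2f1e\u30ed\u53e3\uff9b":
    number, name = describe(ch)
    print(number, ch, name)
```

With a current Unicode database this reports two Kangxi radicals (MOUTH and ENCLOSURE), the Katakana letter RO, the CJK unified ideograph 53E3, and the halfwidth Katakana letter RO.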

Teachers of Japanese, manuals on Japanese, and dictionaries (Jishos) usually ignore this confusion and do not mention the problems, accidents, and catastrophes it causes. This makes the problem even worse: those who learn Japanese and begin to use it meet the confusion unprepared and have to investigate such cases by themselves.

Only one of the 5 characters, 口 (X53E3), qualifies as KanjiLiberal.
Perhaps it is this character that should be used for the generation of Japanese texts.
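
One existing mechanism that partly supports this preference is Unicode compatibility normalization (NFKC). The sketch below is an illustration only, not a method prescribed by this article: it shows that NFKC maps the radical X2F1D to X53E3, but it maps X2F1E to a different ideograph (X56D7) and the halfwidth XFF9B to the Katakana X30ED, so normalization alone does not merge all five characters.

```python
import unicodedata


def nfkc_number(ch):
    """Return the X-number of the character after NFKC normalization."""
    normalized = unicodedata.normalize("NFKC", ch)
    return "".join(f"X{ord(c):04X}" for c in normalized)


for ch in "\u2f1d\u2f1e\u30ed\u53e3\uff9b":
    print(f"X{ord(ch):04X} -> {nfkc_number(ch)}")
```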

In order to avoid confusion with other Unicode characters, it is better to avoid the glyph 口.

This makes sense as long as no universal font (Uniglif) is established as a default, providing a bijective, one-to-one relation between the glyphs and the corresponding Unicode characters.

Tarja

Due to the confusion mentioned above, none of the characters ⼝, ⼞, ロ, 口, ﾛ is allowed in the technical language Tarja. Tarja is a modification of Japanese that excludes those glyphs that have not yet been assigned a default, unique, exclusive Unicode number.
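
The rule can be checked mechanically. The sketch below covers only the five characters discussed on this page; the full list of characters excluded from Tarja is not given here, and the names FORBIDDEN and find_forbidden are assumptions made for illustration.

```python
# The five look-alike characters from this page, as escapes.
FORBIDDEN = frozenset("\u2f1d\u2f1e\u30ed\u53e3\uff9b")


def find_forbidden(text):
    """Return (position, X-number) for every forbidden character in text."""
    return [
        (index, f"X{ord(ch):04X}")
        for index, ch in enumerate(text)
        if ch in FORBIDDEN
    ]


# The word for "entrance" contains X53E3 as its second character.
print(find_forbidden("\u5165\u53e3"))
# Plain ASCII text contains none of the five characters.
print(find_forbidden("mouth"))
```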

When translating from Japanese to Tarja, the Unicode number (X53E3) can be used in place of the character, or an appropriate transliteration in Hiragana or Romaji, or an ASCII word borrowed from another language (Mouth from English, Bouche from French, Boca from Spanish, etc.).
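
The first of these options, writing the Unicode number instead of the character, is easy to automate. The following is a minimal sketch under the assumption that only the five characters from this page need to be replaced; the function name to_x_notation is hypothetical, and the choice between a number, a transliteration, or a borrowed word remains with the translator.

```python
# The five look-alike characters from this page, as escapes.
CONFUSABLE = frozenset("\u2f1d\u2f1e\u30ed\u53e3\uff9b")


def to_x_notation(text):
    """Replace each confusable character with its X-number, e.g. X53E3."""
    return "".join(
        f"X{ord(ch):04X}" if ch in CONFUSABLE else ch
        for ch in text
    )


# "Entrance": the first kanji is kept, the second becomes X53E3.
print(to_x_notation("\u5165\u53e3"))
```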

Such a translation makes sense as long as no universal font with a one-to-one relation between glyphs and characters (an analogue of the sci-fi Uniglif) is established as a default.
The Uniglif or its analogue is supposed to avoid the confusions, errors, and mistakes typical of the default Unicode fonts of the 21st century. Once it exists, Tarja may lose its principal goal and meaning (to avoid confusion).

Warning

The description above is meant to serve as a patch, fixing the main fault (gap) common to most publications dedicated to the character 口.

Such a description may require corrections from a native Japanese speaker.

However, the description above can be considered an appeal to develop an advanced font (an analogue of the sci-fi Uniglif) that provides a one-to-one relation between the glyphs and the Unicode characters.

As long as such a font is not accepted as a default, it may make sense to avoid using any of the characters ⼝, ⼞, ロ, 口, ﾛ in technical documents.

References

  1. https://commons.wikimedia.org/wiki/File:Australien-Krokodil.JPG Australian freshwater crocodile (Crocodylus johnsoni). Author: Pbuergler. Photo taken in Zoo Basel (Switzerland).
  2. https://en.wikipedia.org/wiki/Mouth A freshwater crocodile at Basel Zoo in Switzerland. In animal anatomy, the mouth, also known as the oral cavity, buccal cavity, or in Latin cavum oris
  3. https://www.msdmanuals.com/ja-jp/ホーム/18-口と歯の病気/口の生物学/口の生物学 口の生物学 (Biology of the Mouth). Author: Rosalyn Sulyanto, DMD, MS, Harvard School of Dental Medicine and Boston Children's Hospital. Last reviewed/revised October 2017.
  4. https://www.nihongomaster.com/dictionary/kanji/845 3 Strokes. Radicals: 囗. JLPT Level 5. Definition: mouth. Readings: On'Yomi (音読み) コウ, ク; Kun'yomi (訓読み) くち. Popular Words With This Kanji: 人口, じんこう population, common talk// 口, くち mouth, opening, hole, gap, orifice, mouth (of a bottle), spout, nozzle, mouthpiece, gate, door, entrance, exit, speaking, speech, talk (i.e. gossip), taste, palate, mouth (to feed), opening (i.e. vacancy), available position, invitation, summons, kind, sort, type, opening (i.e. beginning), counter for mouthfuls, shares (of money), and swords// 窓口, まどぐち ticket window, teller window, counter, contact person, point of contact// 口座, こうざ account (e.g. bank)// 入口, 入り口, いりぐち, いりくち, はいりぐち, はいりくち entrance, entry, gate, approach, mouth// 大口, おおぐち, おおくち big mouth, boastful speech, tall talk, large amount, large sum// 河口, 川口, かこう, かわぐち mouth of river, estuary// 悔しい, 口惜しい, 悔やしい, くやしい, くちおしい vexing, annoying, frustrating, regrettable, mortifying // ..
  5. https://ja.wikipedia.org/wiki/口 The mouth (口, kuchi) is the foremost end of the digestive tract. It is the part that takes in food and has structures for cutting, grasping, and ingesting it; together with the nasal cavity it is also an end of the respiratory system, and it is part of the vocal organs. Depending on context it is also called the oral cavity (口腔, kōkō); as an exception, the Japanese medical community treats the reading kōkū as the formal one. Not only in biology, an opening such as a hole is generally called a "mouth" (口). For this reason 口 is used in various idioms and figurative expressions.
  6. https://nihongoichiban.com/2011/05/01/jlpt-vocabulary-入口/ Vocabulary Card – 入口 – iriguchi. May 1, 2011, by Nicolas. Furigana: いりぐち. Romaji: iriguchi. Meaning: entrance

Keywords

«Confusion», «du.t», «Japanese», «Kanji», «KanjiConfudal», «KanjiLiberal», «KanjiRadical», «Kuchi», «Guchi», «Tarja», «Unicode», «Uniglif», «SomeU», «⼝» («X2F1D»), «⼞» («X2F1E»), «ロ» («X30ED»), «口» («X53E3»), «ﾛ» («XFF9B»)