Difference between revisions of "口"

From TORI
(File -> pic)
Line 1: Line 1:
  +
{{top}}
<div style="float:right; margin:-40px -8px 0px 4px">
 
  +
<div style="float:right; margin:-72px 0px 0px 6px">
[[File:Australien-Krokodil.JPG|200px]]<center>
 
  +
{{pic|Australien-Krokodil.JPG|220px}}<center>
 
<p style="margin:-1px 0px 12px 0px">
 
<p style="margin:-1px 0px 12px 0px">
 
ワニの[[口]] by Pbuergler <ref>
 
ワニの[[口]] by Pbuergler <ref>
Line 15: Line 16:
 
</center>
 
</center>
   
[[File:DEN view mouth ja.gif|200px]]<br><center>
+
{{pic|DEN view mouth ja.gif|220px}}<br><center>
 
[[口]] by msdmanuals, 2017 <ref>
 
[[口]] by msdmanuals, 2017 <ref>
 
https://www.msdmanuals.com/ja-jp/ホーム/18-口と歯の病気/口の生物学/口の生物学
 
https://www.msdmanuals.com/ja-jp/ホーム/18-口と歯の病気/口の生物学/口の生物学
Line 61: Line 62:
 
</ref>.
 
</ref>.
   
Many other [[Unicode]] characters have similar drawing and are easy to confuse each other.
+
Many other [[Unicode]] characters have similar drawing, similar (or the same) [[glyph]](s) and are easy to confuse each other. To year 2025, the [[glyph]] [[口]] ("rectangle") has not yet assigned a unique exclusive [[Unicode]] number; the [[glyph]] does not identify the [[character]].<br>
  +
For this confusion, [[character]] [[口]] and other characters with the same or similar [[glyph]]s are not allowed in the technical language [[Tarja]].
==[[⼝]],[[⼞]],[[ロ]],[[口]],[[ロ]]==
 
  +
The 5 unicode characters are similar.<br>
 
  +
==Encoding of [[⼝]],[[⼞]],[[ロ]],[[口]],[[ロ]]==
All of them, at the first glance, look as rectangles.<br>
 
  +
  +
The 5 unicode characters look similar.<br>
  +
Their [[glyph]]s appear as rectangles.<br>
 
Some programming is necessary to identify each of these characters.<br>
 
Some programming is necessary to identify each of these characters.<br>
 
Their [[Utf8]] encoding can be revealed with [[PHP]] program [[du.t]]; command
 
Their [[Utf8]] encoding can be revealed with [[PHP]] program [[du.t]]; command
Line 71: Line 75:
   
 
produces the following output:
 
produces the following output:
<div style="margin:-2px -8px 0px -200px; background-color:#ffFFff">
 
<div style="margin:2px 0px 0px 30px; line-height:1.3em">
 
 
<poem>
 
<poem>
 
⼝⼞ロ口ロ
 
⼝⼞ロ口ロ
Line 165: Line 167:
 
which of them is [[XFF9B]] ( Character from the [[UnicodeBottom]] table)?
 
which of them is [[XFF9B]] ( Character from the [[UnicodeBottom]] table)?
   
The manuals on Japanese, as well as dictionaries ([[Jisho]]s) usually ignore this confusion.<br>
+
The teachers of Japanese, manuals on Japanese, as well as dictionaries ([[Jisho]]s) usually ignore this confusion and do not mention
  +
problems, accidents, catastrophes caused by this confusion.
Those, who learn Japanese, have to investigate such a cases by themselves.
 
  +
This makes the problem even worse:
  +
Those, who learn Japanese and begin to use it,
  +
meet the confusion being not prepared to it, and
  +
have to investigate such a cases by themselves.
   
 
Only one of the 5 characters, [[&#X53E3;]], [[X53E3]], is qualified as [[KanjiLiberal]].<br>
 
Only one of the 5 characters, [[&#X53E3;]], [[X53E3]], is qualified as [[KanjiLiberal]].<br>
Perhaps, namely this character should be used for generation of Japanese texts.
+
Perhaps, namely this character should be used for generation of [[Japanese]] texts.
  +
  +
In order to avoid confusions with other [[Unicode]] characters,
  +
it is better to avoid glyph [[&#X53E3;]].
  +
  +
This has sense while no universal font ([[Uiglif]])
  +
is established as a default, providing bijective, one-to-one relation between the
  +
[[glyph]]s and the corresponding [[Unicode]] [[character]]s.
  +
  +
==[[Tarja]]==
  +
Due to the confusion mentioned above, none of characters
  +
[[&#X2F1D;]],
  +
[[&#X2F1E;]],
  +
[[&#X30ED;]],
  +
[[&#X53E3;]],
  +
[[&#XFF9B;]],
  +
Is allowed in technical language [[Targa]].
  +
It is modification of [[Japanese]] excluding those [[glyph]]s
  +
that have not yet assigned a
  +
default unique exclusive [[Unicode]] number.
  +
  +
At the translating from [[Japanese]] to [[Tarja]],
  +
the indication of the Unicode number ([[X53E3]])
  +
can be used, or the appropriate transliteration with [[Hiragana]] or [[Romaji]],
  +
as well as an ascii word borrowed from other language ([[Mouth]] form English, [[Bouche]] from French, [[Boca]] from Spanish, etc.)
  +
  +
Such a translation has sense while some universal font
  +
with one-to-one relation between [[glyph]]s and [[character]]s (analogy of the sci-fi [[Uniglif]])
  +
is not yet established as a default.<br>
  +
The [[Uniglif]] or its analogy is supposed to avoid confusions, errors, mistakes typical for the default Unicode fonts of century 21. Then, [[Tarja]] may loss its principal goal and meaning (avoid confusions).
  +
  +
==Warning==
  +
  +
The description above is supposed to serve as a patch, fixing the main fault, gap<br>
  +
common for the most of publications dedicated to character [[口]].
  +
  +
Such a description may require a correction(s) from side of a native [[Japanese]] speaker.
  +
  +
However, the description above can be considered as an appeal for elaboration of the advanced font
  +
(analogy of the sci-fi [[Uniglif]]) that provides the one-to-one relation
  +
between the [[glyph]]s and the Unicode [[character]]s.
  +
  +
While such a font is not accepted as a default, it may have sense to avoid
  +
using any of [[character]]s
  +
[[&#X2F1D;]],
  +
[[&#X2F1E;]],
  +
[[&#X30ED;]],
  +
[[&#X53E3;]],
  +
[[&#XFF9B;]]
  +
in technical documents.
   
 
==References==
 
==References==
  +
{{ref}}
<references/>
 
  +
{{fer}}
   
 
==Keywords==
 
==Keywords==
  +
«[[Confusion]]»,
  +
«[[du.t]]»,
  +
«[[Japanese]]»,
  +
«[[Kanji]]»,
  +
«[[KanjiConfudal]]»,
  +
«[[KanjiLiberal]]»,
  +
«[[KanjiRadical]]»,
  +
«[[Kuchi]]»,
  +
«[[Guchi]]»,
  +
«[[Tarja]]»,
  +
«[[Unicode]]»,
  +
«[[Uniglif]]»,
  +
«[[SomeU]]»,
  +
«[[&#X2F1D;]]» («[[X2F1D]]»),
  +
«[[&#X2F1E;]]» («[[X2F1E]]»),
  +
«[[&#X30ED;]]» («[[X30ED]]»),
  +
«[[&#X53E3;]]» («[[X53E3]]»),
  +
«[[&#XFF9B;]]» («[[XFF9B]]»)
   
[[Japanese]],
 
[[Kanji]],
 
[[KanjiConfudal]],
 
[[KanjiLiberal]],
 
[[KanjiRadical]],
 
[[Unicode]],
 
[[SomeU]],
 
[[&#X2F1D;]] ([[X2F1D]]),
 
[[&#X2F1E;]] ([[X2F1E]]),
 
[[&#X30ED;]] ([[X30ED]]),
 
[[&#X53E3;]] ([[X53E3]]),
 
[[&#XFF9B;]] ([[XFF9B]])
 
 
</div></div>
 
 
[[Category:Japanese]]
 
[[Category:Japanese]]
 
[[Category:Kanji]]
 
[[Category:Kanji]]

Revision as of 11:47, 7 November 2025


Australien-Krokodil.JPG

ワニの口 (a crocodile's mouth) by Pbuergler [1][2]

DEN view mouth ja.gif
口 (mouth) by msdmanuals, 2017 [3]

口, X53E3, is Unicode character number 21475, a KanjiLiberal.

In Japanese, 口 can be pronounced "kuchi" and means "mouth".
On the site nihongomaster, 口 appears as kanji number 845 [4].

Many other Unicode characters have a similar drawing, with similar (or identical) glyphs, and are easily confused with one another. As of 2025, the glyph 口 ("rectangle") has not been assigned a unique, exclusive Unicode number; the glyph does not identify the character.
Because of this confusion, the character 口 and other characters with the same or similar glyphs are not allowed in the technical language Tarja.

Encoding of ⼝, ⼞, ロ, 口, ロ

The 5 Unicode characters look similar.
Their glyphs appear as rectangles.
Some programming is necessary to identify each of these characters.
Their Utf8 encoding can be revealed with the PHP program du.t; the command

php du.t ⼝⼞ロ口ロ

produces the following output:

⼝⼞ロ口ロ
The array has 15 bytes; here is its splitting:
e2 bc 9d e2 bc 9e e3 83 ad e5 8f a3 ef be 9b
array(5) {
  [0]=>
  string(3) "⼝"
  [1]=>
  string(3) "⼞"
  [2]=>
  string(3) "ロ"
  [3]=>
  string(3) "口"
  [4]=>
  string(3) "ロ"
}

Unicode character number 12061 id est, X2F1D
Picture: ⼝ ; uses 3 bytes. These bytes are:
xE2 xBC x9D in the hexadecimal representation and
226 188 157 in the decimal representation

Unicode character number 12062 id est, X2F1E
Picture: ⼞ ; uses 3 bytes. These bytes are:
xE2 xBC x9E in the hexadecimal representation and
226 188 158 in the decimal representation

Unicode character number 12525 id est, X30ED
Picture: ロ ; uses 3 bytes. These bytes are:
xE3 x83 xAD in the hexadecimal representation and
227 131 173 in the decimal representation

Unicode character number 21475 id est, X53E3
Picture: 口 ; uses 3 bytes. These bytes are:
xE5 x8F xA3 in the hexadecimal representation and
229 143 163 in the decimal representation

Unicode character number 65435 id est, XFF9B
Picture: ロ ; uses 3 bytes. These bytes are:
xEF xBE x9B in the hexadecimal representation and
239 190 155 in the decimal representation
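The byte breakdown above can be cross-checked independently. The following is a minimal Python sketch, not the du.t program itself (whose source is not shown here); the function name describe and the field names are hypothetical and chosen only for illustration. The five characters are written as escapes so that they cannot be mistaken for one another in the source.

```python
# Illustrative sketch only; this is NOT the du.t program.
# The five look-alike characters, written as escapes to avoid any confusion.
CHARS = "\u2f1d\u2f1e\u30ed\u53e3\uff9b"


def describe(ch: str) -> dict:
    """Return the code point and the UTF-8 bytes of a single character."""
    data = ch.encode("utf-8")
    return {
        "number": ord(ch),                            # decimal code point
        "x_name": f"X{ord(ch):04X}",                  # e.g. X53E3
        "hex": " ".join(f"x{b:02X}" for b in data),   # bytes, hexadecimal
        "dec": " ".join(str(b) for b in data),        # bytes, decimal
    }


for ch in CHARS:
    info = describe(ch)
    print(
        f"Unicode character number {info['number']}, id est {info['x_name']}; "
        f"bytes {info['hex']} (hexadecimal), {info['dec']} (decimal)"
    )
```

Each of the five characters takes 3 bytes in UTF-8, which agrees with the 15-byte total reported above.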

Phonetic

The character 口 can be pronounced "kuchi" [5].

When written next to other kanji, it can also be pronounced "guchi".

Graphic

The character 口 appears as a rectangle; this rectangle may be slightly deformed, depending on the software used to display it. Four other Unicode characters (⼝, ⼞, ロ, ロ) have a similar shape.

Semantic

The character 口 may mean "mouth" [5].

Examples

入口, いりぐち ("iriguchi"), entrance [6]

出口, でぐち ("deguchi"), exit

Confusion

口 (X53E3) is a highly confusable character. Even a native Japanese speaker, looking at the characters
⼝, ⼞, ロ, 口, ロ, cannot answer:
which of them is X2F1D (KanjiRadical)?
which of them is X2F1E (KanjiRadical)?
which of them is X30ED (Katakana "ro")?
which of them is X53E3 (KanjiLiberal)?
which of them is XFF9B (character from the UnicodeBottom table)?
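A program, unlike a reader, can answer these questions, because it sees code points rather than glyphs. The sketch below uses Python's standard unicodedata module to print the official Unicode name of each character; the helper name identify is hypothetical. The official names differ from the TORI classes mentioned above (KanjiRadical, KanjiLiberal, UnicodeBottom), which are not part of the Unicode standard.

```python
import unicodedata

# The five look-alike characters, written as escapes.
CHARS = "\u2f1d\u2f1e\u30ed\u53e3\uff9b"


def identify(ch: str) -> tuple:
    """Return (X-number, official Unicode name) for one character."""
    return f"X{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")


for ch in CHARS:
    x_number, name = identify(ch)
    print(x_number, name)
```

The names returned include KANGXI RADICAL MOUTH for X2F1D, KATAKANA LETTER RO for X30ED and HALFWIDTH KATAKANA LETTER RO for XFF9B.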

Teachers of Japanese, manuals on Japanese, and dictionaries (Jishos) usually ignore this confusion and do not mention the problems, accidents, and catastrophes it causes. This makes the problem even worse: those who learn Japanese and begin to use it meet the confusion unprepared and have to investigate such cases by themselves.

Only one of the 5 characters, 口 (X53E3), is qualified as KanjiLiberal.
Perhaps this is the character that should be used for generating Japanese texts.

In order to avoid confusion with other Unicode characters, it is better to avoid the glyph 口.

This makes sense as long as no universal font (Uniglif) is established as a default, providing a bijective, one-to-one relation between the glyphs and the corresponding Unicode characters.

Tarja

Due to the confusion mentioned above, none of the characters ⼝, ⼞, ロ, 口, ロ is allowed in the technical language Tarja. Tarja is a modification of Japanese that excludes those glyphs that have not yet been assigned a default, unique, exclusive Unicode number.

When translating from Japanese to Tarja, the Unicode number (X53E3) can be indicated, or an appropriate transliteration in Hiragana or Romaji can be used, or an ASCII word borrowed from another language (Mouth from English, Bouche from French, Boca from Spanish, etc.).
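The first of these options, indicating the Unicode number, is mechanical enough to sketch in code. The Python fragment below is only an illustration under stated assumptions: the function name to_x_notation is hypothetical, the replacement is inserted without any separator, and nothing here is an established Tarja tool. Transliteration into Hiragana or Romaji depends on the reading in context and is not attempted.

```python
# The five characters excluded from Tarja, written as escapes.
CONFUSABLE = "\u2f1d\u2f1e\u30ed\u53e3\uff9b"


def to_x_notation(text: str) -> str:
    """Replace each confusable character by its X-number, e.g. X53E3."""
    return "".join(
        f"X{ord(c):04X}" if c in CONFUSABLE else c
        for c in text
    )


# Example: the word for "entrance", U+5165 followed by U+53E3.
print(to_x_notation("\u5165\u53e3"))
```

All other characters pass through unchanged, so the result remains readable while each replaced character stays uniquely identifiable.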

Such a translation makes sense as long as a universal font with a one-to-one relation between glyphs and characters (an analog of the sci-fi Uniglif) is not yet established as a default.
Uniglif or its analog is supposed to avoid the confusions, errors, and mistakes typical of the default Unicode fonts of the 21st century. Once it exists, Tarja may lose its principal goal and meaning (avoiding confusion).

Warning

The description above is supposed to serve as a patch, fixing the main fault, a gap common to most publications dedicated to the character 口.

Such a description may require corrections from a native Japanese speaker.

However, the description above can be considered an appeal for the development of an advanced font (an analog of the sci-fi Uniglif) that provides a one-to-one relation between the glyphs and the Unicode characters.

As long as such a font is not accepted as a default, it may make sense to avoid using any of the characters ⼝, ⼞, ロ, 口, ロ in technical documents.
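Avoiding these characters in a technical document is easier if their occurrences can be located automatically. The following Python sketch reports the position and X-number of each occurrence; the function name find_confusable is hypothetical, and the list of characters is just the five discussed here.

```python
# The five look-alike characters, written as escapes.
CONFUSABLE = "\u2f1d\u2f1e\u30ed\u53e3\uff9b"


def find_confusable(text: str) -> list:
    """Return (index, X-number) for every confusable character in text."""
    return [
        (i, f"X{ord(c):04X}")
        for i, c in enumerate(text)
        if c in CONFUSABLE
    ]


# Example: a string containing U+53E3 at index 1 and U+FF9B at index 3.
for index, x_number in find_confusable("a\u53e3b\uff9b"):
    print(index, x_number)
```

An empty result means that none of the five characters appears in the text.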

References

  1. https://commons.wikimedia.org/wiki/File:Australien-Krokodil.JPG English: Australian freshwater crocodile (Crocodylus johnsoni) Author Pbuergler Photo taken in Zoo Basel (Switzerland). Aufnahme: Zoo Basel (Schweiz)
  2. https://en.wikipedia.org/wiki/Mouth A freshwater crocodile at Basel Zoo in Switzerland In animal anatomy, the mouth, also known as the oral cavity, buccal cavity, or in Latin cavum oris
  3. https://www.msdmanuals.com/ja-jp/ホーム/18-口と歯の病気/口の生物学/口の生物学 口の生物学 執筆者: Rosalyn Sulyanto, DMD, MS, Harvard School of Dental Medicine and Boston Children's Hospital 最終査読/改訂年月 2017年 10月
  4. https://www.nihongomaster.com/dictionary/kanji/845 3 Strokes Radicals: 囗 JLPT Level 5 Definition of mouth // Readings On'Yomi (音読み) Kun'yomi (訓読み) コウ ク くち Popular Words With This Kanji 人口, じんこう population, common talk// 口, くち mouth, opening, hole, gap, orifice, mouth (of a bottle), spout, nozzle, mouthpiece, gate, door, entrance, exit, speaking, speech, talk (i.e. gossip), taste, palate, mouth (to feed), opening (i.e. vacancy), available position, invitation, summons, kind, sort, type, opening (i.e. beginning), counter for mouthfuls, shares (of money), and swords// 窓口, まどぐち ticket window, teller window, counter, contact person, point of contact// 口座, こうざ account (e.g. bank)// 入口, 入り口, いりぐち, いりくち, はいりぐち, はいりくち entrance, entry, gate, approach, mouth// 大口, おおぐち, おおくち big mouth, boastful speech, tall talk, large amount, large sum// 河口, 川口, かこう, かわぐち mouth of river, estuary// 悔しい, 口惜しい, 悔やしい, くやしい, くちおしい vexing, annoying, frustrating, regrettable, mortifying // ..
  5. https://ja.wikipedia.org/wiki/口 口(くち)は、消化管の最前端である。食物を取り入れる部分であり、食物を分断し、把持し、取り込むための構造が備わっていると同時に、鼻腔と並んで呼吸器の末端ともなっており、発声器官の一部でもある。文脈により口腔(こうこう)とも言う。なお口腔の読みの例外として、日本の医学界においては(こうくう)を正式とする。 生物学に限らず、一般に穴等の開口部を指して口と呼ぶ。このため、「口」は様々な慣用句や比喩表現に使われる(後述の「通念」を参照)。
  6. https://nihongoichiban.com/2011/05/01/jlpt-vocabulary-入口/ Vocabulary Card – 入口 – iriguchi, May 1, 2011, by Nicolas. Furigana: いりぐち Romaji: iriguchi Meaning: entrance

Keywords

«Confusion», «du.t», «Japanese», «Kanji», «KanjiConfudal», «KanjiLiberal», «KanjiRadical», «Kuchi», «Guchi», «Tarja», «Unicode», «Uniglif», «SomeU», «⼝» («X2F1D»), «⼞» («X2F1E»), «ロ» («X30ED»), «口» («X53E3»), «ロ» («XFF9B»)