I'm curious what percentage of native Japanese words contain digraphs or, to put it another way, what the average number of digraphs (or individual kana) in a Japanese word would be. I'm a big math nerd, so this popped into my head as I'm learning Japanese.
-
I am curious what the term “digraph” refers to in the context of hiragana, which is a syllabary and not an alphabet. Things like きゃ, きゅ, きょ, etc.? – aguijonazo Mar 15 '22 at 06:34
-
The term digraph is usually used for sequences that together tend to represent a sound, for example in English. I'm not aware that Japanese consider sequences such as きゃ, しゃ, ちゅ to be digraphs though. Perhaps you can clarify what you intend by "digraph". – jogloran Mar 15 '22 at 07:33
-
@jogloran You are both on the right track, in that I do mean things like りゃ、りゅ、りょ – JShoe Mar 15 '22 at 13:54
-
For that you would need to get a comprehensive list of words from somewhere and somehow extract only "native" words. Excluding recent loan words might be easy because they use katakana. Do you also need to exclude 漢語? – aguijonazo Mar 15 '22 at 22:20
1 Answer
From a 68,000-word dictionary, I counted 22,000 words whose readings include one or more of 「っ, ゃ, ゅ, ょ」. Unfortunately, I triple-counted unusual words like 出張 (しゅっちょう), whose reading contains three of these kana. 「ょ」 was in 11,000 of the words, while 「ゃ」 was in just 2,200. I ignored all katakana.
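
The multiple counting can be avoided by checking each reading once for the presence of any small kana, rather than adding up per-kana matches. Here is a rough Python sketch of that idea, assuming the readings are already available as a list of hiragana strings. The function name `small_kana_stats` and the five sample readings are made up for illustration and do not come from the dictionary mentioned above. It also keeps っ separate from ゃ, ゅ, ょ, since only the latter form combinations like りゃ, りゅ, りょ.

```python
from collections import Counter

YOON_KANA = "ゃゅょ"  # small kana that form combinations such as きゃ, しゅ, ちょ
SOKUON = "っ"         # small tsu, tracked separately from the three above


def small_kana_stats(readings):
    """Summarise small-kana usage, counting each reading at most once per tally."""
    words_with_any = 0   # readings containing at least one of っ, ゃ, ゅ, ょ
    words_with_yoon = 0  # readings containing at least one of ゃ, ゅ, ょ
    total_yoon = 0       # total occurrences of ゃ, ゅ, ょ over all readings
    per_kana = Counter() # number of readings in which each small kana appears

    targets = set(YOON_KANA + SOKUON)
    for reading in readings:
        present = set(reading) & targets  # distinct small kana in this reading
        if present:
            words_with_any += 1
        if present - {SOKUON}:
            words_with_yoon += 1
        per_kana.update(present)
        total_yoon += sum(reading.count(k) for k in YOON_KANA)

    n = len(readings)
    return {
        "words": n,
        "words_with_any": words_with_any,
        "words_with_yoon": words_with_yoon,
        "per_kana": dict(per_kana),
        "share_with_yoon": words_with_yoon / n if n else 0.0,
        "avg_yoon_per_word": total_yoon / n if n else 0.0,
    }


# Hypothetical toy input; a real run would load readings from a dictionary file.
sample = ["しゅっちょう", "やま", "きって", "きょう", "さくら"]
stats = small_kana_stats(sample)
print(stats)
```

Here しゅっちょう adds one to the overall tally even though it contains three different small kana, while the per-kana tallies still record each of them. Full-size や in やま is not matched, because only the small forms are in the target set.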

davewp