
Re: [edict-jmdict] Family-name rankings (Spelling/Reading)



<snip: Family-name rankings>

Hi Jim (et al.),

3 thoughts, broken up into separate mails for clarity.

Firstly, this is great news, and very interesting!
(Japanese names being pretty complicated.)


1st point: Spelling/Reading
(This point was touched on by earlier posts; I expand.)

AFAICT, this is a listing of *kanji* frequencies.
Properly, a Japanese name is a pair: (kanji, yomi),
and thus there are 3 possible rankings:
* ranking of a given (k, y)
* ranking of a given (k) (irrespective of y)
* ranking of a given (y) (irrespective of k)
…which are all interesting.
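
As a rough sketch of how the three rankings relate (Python; the counts
are invented placeholders, not real statistics, and the names
pair_counts and marginal are just for illustration): given one table of
(kanji, yomi) counts, the other two rankings fall out by summing over
the component you ignore.

```python
from collections import Counter

# Invented placeholder counts for (kanji, yomi) pairs; NOT real data.
pair_counts = Counter({
    ("河野", "こうの"): 60,
    ("河野", "かわの"): 40,
    ("高野", "たかの"): 70,
    ("高野", "こうの"): 20,
})

def marginal(pair_counts, index):
    """Sum pair counts over the other component (0 = kanji, 1 = yomi)."""
    totals = Counter()
    for pair, n in pair_counts.items():
        totals[pair[index]] += n
    return totals

# The three rankings, most frequent first.
by_pair = pair_counts.most_common()                # ranking of (k, y)
by_kanji = marginal(pair_counts, 0).most_common()  # ranking of k, any y
by_yomi = marginal(pair_counts, 1).most_common()   # ranking of y, any k
```

Note that the reverse does not work: kanji-only counts (as in the
current list) cannot be split back into per-reading counts.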

Just kanji data (as we have here) is already interesting;
if we had reading frequency information, that would be very
useful in practice.

For example, I suspect that 井上 is generally pronounced
いのうえ but I see (thanks to ENAMDICT) there are many other
readings, so seeing *how* common or rare these are would be
interesting.
Conversely, given (the personal name) ようこ, seeing how
common different spellings are would be useful – how common
is 「太陽の“陽子”」? (for the おひさま fan?) vs. 洋子
(Beatles fans) vs. 曜子 etc.

…so it would be nice, given a kanji spelling, to have the readings
in order of frequency (which I think ENAMDICT strives for),
together with frequency data (%), and conversely given a
reading.
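
To make the "readings in order of frequency, with %" idea concrete,
here is a minimal sketch (Python; the counts are invented placeholders,
the はるこ reading of 陽子 is included only as an illustrative
assumption, and the function name shares is hypothetical). The same
function serves both directions: kanji to readings, and reading to
spellings.

```python
# Invented placeholder counts for (kanji, yomi) pairs; NOT real data.
pair_counts = {
    ("陽子", "ようこ"): 60,
    ("洋子", "ようこ"): 30,
    ("曜子", "ようこ"): 10,
    ("陽子", "はるこ"): 20,  # assumed alternative reading, for illustration
}

def shares(pair_counts, key, index):
    """List (other component, percent) for pairs whose component
    `index` (0 = kanji, 1 = yomi) equals `key`, most frequent first."""
    matches = {pair[1 - index]: n
               for pair, n in pair_counts.items()
               if pair[index] == key}
    total = sum(matches.values())
    ranked = sorted(matches.items(), key=lambda item: item[1], reverse=True)
    return [(other, 100 * n / total) for other, n in ranked]

readings_of_kanji = shares(pair_counts, "陽子", 0)    # kanji -> readings
spellings_of_yomi = shares(pair_counts, "ようこ", 1)  # reading -> spellings
```

An unknown key simply gives an empty list.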

Personal names are admittedly tricky b/c they change pretty
rapidly in popularity, so we would really need time-data
(popularity in a given year of birth being easiest), but this
should be possible for family names, given data.
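
For the time-data case, one possible shape (a sketch only; Python, with
invented years and counts, and a hypothetical helper rank_in_year) is
to key the counts by year of birth as well and rank within each year:

```python
# Invented placeholder counts keyed by (birth year, kanji, yomi); NOT real data.
year_counts = {
    (1970, "洋子", "ようこ"): 90,
    (1970, "海斗", "かいと"): 1,
    (2005, "洋子", "ようこ"): 3,
    (2005, "海斗", "かいと"): 80,
}

def rank_in_year(year_counts, name, year):
    """1-based rank of a (kanji, yomi) pair among names recorded for
    the given birth year, or None if it does not occur in that year."""
    in_year = {(k, y): n
               for (yr, k, y), n in year_counts.items()
               if yr == year}
    if name not in in_year:
        return None
    ordered = sorted(in_year, key=in_year.get, reverse=True)
    return ordered.index(name) + 1

rank_1970 = rank_in_year(year_counts, ("海斗", "かいと"), 1970)
rank_2005 = rank_in_year(year_counts, ("海斗", "かいと"), 2005)
```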


On a personal note, in Japanese I go by
西山海斗(にしやまかいと) where
西山 is something like #800 in popularity
(by a “list of top 1,000 family names” I saw)
and 海斗 has been a very common name lately
(top 10 boys’ names in a few years of the past decade).
西山 is pretty unambiguous (kanji/reading clear from each
other), while かいと has several spellings (海斗 is most common)
and it’s slightly tricky to explain the spelling due to 斗
(easiest I’ve found is 「うみ」の海、北斗七星の斗;
 does anyone have better suggestions?)

Sorry for going off-track – kanji frequency is also quite interesting!

best,
  ~nils