Re: [edict-jmdict] "P" Markers - Google as corpus?
Take good old ほうれんそう, a word I learned very early in
Japanese. At present it has an entry with 菠薐草, 法蓮草 and
ほうれん草, with the 菠薐草 version tagged as "ichi1", thus
getting it a P. The major dictionaries ONLY have the 菠薐草
version. The Googits have:
菠薐草: 36,800 (Google), 22,400 (yahoo.co.jp)
法蓮草: 27,900 (Google), 21,600 (yahoo.co.jp)
ほうれん草: 1,950,000 (Google), 2,950,000 (yahoo.co.jp)
At least that's consistent.
鳳蓮草: 73 (Google) (!), 76 (yahoo.co.jp)
So the unsanctioned ほうれん草 (which is what it was labelled
as at my local スーパー) wins hands down, and the official
version is almost beaten by 法蓮草, which to me is an ateji and
probably a 変換ミス (kana-kanji conversion error).
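
To put the comparison in proportion, here is a minimal Python sketch that turns the hit counts quoted above into shares and a ranking. The names HITS, shares and ranked are made up for illustration, and it assumes that the second figure given for each form is the yahoo.co.jp count, which the post labels explicitly only for 菠薐草 and 鳳蓮草.

```python
# Hit counts copied from the figures quoted in this post.
# Assumption: the second number for each form is the yahoo.co.jp count.
HITS = {
    "菠薐草": {"google": 36_800, "yahoo": 22_400},
    "法蓮草": {"google": 27_900, "yahoo": 21_600},
    "ほうれん草": {"google": 1_950_000, "yahoo": 2_950_000},
    "鳳蓮草": {"google": 73, "yahoo": 76},
}


def shares(hits, engine):
    """Return each variant's fraction of the total hits for one engine."""
    total = sum(counts[engine] for counts in hits.values())
    return {form: counts[engine] / total for form, counts in hits.items()}


def ranked(hits, engine):
    """Return the variants ordered from most to fewest hits for one engine."""
    return sorted(hits, key=lambda form: hits[form][engine], reverse=True)


if __name__ == "__main__":
    for engine in ("google", "yahoo"):
        pct = shares(HITS, engine)
        for form in ranked(HITS, engine):
            print(f"{engine:7} {form} {pct[form]:.2%}")
```

On these figures ほうれん草 takes well over nine tenths of the hits on both engines, and the two engines rank the four forms in the same order. Raw hit counts are only a rough proxy for usage, so this is an illustration of the arithmetic rather than a proposed selection rule.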
http://dictionary.goo.ne.jp/search.php?MT=%E8%CA%E9%B3%C1%F0&kind=jn
「法蓮草」「鳳蓮草」とも書く ("also written 法蓮草 or 鳳蓮草")
Note that 大辞林 (and presumably other dictionaries) actually
have notes like ...
▼菠▼薐草
The meaning of ▼ being "Not in 常用漢字表". You could, at
a pinch, take ▼ as also meaning "Write this word/kanji in
kana". In which case ほうれん草 is as, or more, official
than 菠薐草.
So should ほうれん草 be promoted to pride-of-place at the
front of the ほうれんそう headwords? Should the official
菠薐草 be stripped of its "P"? What should be the overall
policy?
In the absence of an equivalent of the ▼ tags in WWWJDIC,
paper dictionary entries with all-kanji headwords should
probably not be applied strictly where partially kana
versions exist (and are more common). In other words,
paper dictionaries are generally saying "This is the
right kanji _if_ you write it in kanji."
I strongly believe that WWWJDIC should not be in the position
of 'endorsing' rare usages by having 'full-kanji' headwords
first in the line-up unless those words are often written
in 'full-kanji'. That can be the equivalent of having
為る 【する】 without the (uk).