
Re: [edict-jmdict] New top kanji forms for numbers



Chris, thanks for bringing this up. I'm responsible for most of those edits but I was a little sloppy with the prio-tagging. Normally we put [spec1] tags on kanji/surface forms that are more common than same-entry kanji that have other prio-tags (news1, etc.) but I didn't for most of these numerals. I will go through them and fix this so it's less ambiguous which surface form is the most common.

Anton, 百 and other kanji are still included and prio-tagged in each numeric entry (where it makes sense); they just come after other, more common forms now. I wouldn't ever recommend using an older version of the dictionary files, as they are continuously improved and updated. Sure, we translate "100" as "100", but we're also specifying how it's pronounced and making it clear which way is the most common way to represent these numerals in Japanese.

Best,
Marcus

On Sat, Jun 1, 2019 at 11:50 AM Anton Tagunov anton.tagunov@********* [edict-jmdict] <edict-jmdict@***************> wrote:
 

You = gods, me = worshiper :)

Still.. doesn't this make 100 _both_ the primary form and the main translation?

Effectively translating 100 to 100? :)

In the meantime I feel rather happy to be using an older version of the dictionary mapping 100 to 百. Of course I am aware they are rarely used, but they are glyphs I need to learn..

Thx,
learner

On Sat, 1 Jun 2019, 01:58 Jim Breen jimbreen@********* [edict-jmdict], <edict-jmdict@***************> wrote:
 

Sorry for the slow response. Marcus Richert has been trying to send to the group about this
but Yahoo has been rejecting his emails. I had the same issue with another list a few days
back.

The 全角 numerics appear to be the most common surface forms these days, at least in WWW pages
but probably elsewhere too. We're tagging them "by hand", as they don't show up in the older
ranking metrics.

Jim


On Thu, 30 May 2019 at 06:24, Chris Vasselli clindsay@********* [edict-jmdict] <edict-jmdict@***************> wrote:


Hi everybody,

I noticed recently that a bunch of entries for numbers have been getting updated with a new top kanji form using the full-width Arabic numeral representation. For example, the top kanji form for  is now 100.

I’m not necessarily against this change, but I was curious to hear the reason for it.  If, as a Japanese learner, you looked up ひゃく or “one hundred” in a dictionary, I’m not completely sure you’d want to see 100 as the primary form; I’m guessing you’d want to see 百. Of course, if 100 is truly more common, then maybe that’s the appropriate form to show. I’m not sure. Just wanted to bring it up for discussion.

Also, in the above case the 百 form is still marked with the [ichi1,news1,nf01] tags, which I believe is supposed to indicate that that’s the most common form. But the 100 entry is the first one in the list. So it seems slightly ambiguous to me which is being indicated as the most common form.

Chris




--
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
http://www.jimbreen.org/                                 http://nihongo.monash.edu/