Possible simplification of readings in JMdict



When I first designed the JMdict structure in the late 90s I was keen
to provide for back-compatibility with the older EDICT format, as a
number of sites, etc. were using it. One aspect of that was to make
sure the kanji and reading parts matched precisely. For example, in a
(hypothetical) entry with a kanji part of "何を食べる;ナニを食べる" the reading
part had to have both なにをたべる and ナニをたべる, with restrictions tying the
readings to the matching kanji forms.

This approach has led on occasion to some rather complex and ugly
entries, and it's appropriate to ask whether it's really worth doing.
Does it really matter? A recent example of this is the 喉が渇く entry (*),
where some variants were added containing ノド in place of 喉. The reading
part of that entry now contains:
のどがかわく[喉が渇く,のどが渇く,喉が乾く,のどが乾く,喉がかわく];ノドがかわく[ノドが渇く,ノドが乾く]
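To make the shape of that reading part concrete, here is a minimal Python sketch that splits it into readings and their restriction lists. It assumes the display notation shown above (readings separated by ";", restrictions in "[...]" separated by ","); the function name parse_readings is hypothetical and this is not how JMdict itself stores or processes the data.

```python
import re

# Assumed display notation: reading[kanji,kanji,...];reading[kanji,...]
READINGS = (
    "のどがかわく[喉が渇く,のどが渇く,喉が乾く,のどが乾く,喉がかわく];"
    "ノドがかわく[ノドが渇く,ノドが乾く]"
)

_PART = re.compile(r"([^\[\]]+)(?:\[([^\[\]]*)\])?")


def parse_readings(text):
    """Map each reading to the list of kanji forms it is restricted to.

    A reading with no bracketed restriction maps to an empty list,
    meaning it applies to every kanji form.
    """
    result = {}
    for part in text.split(";"):
        match = _PART.fullmatch(part.strip())
        if match is None:
            raise ValueError(f"unrecognised reading part: {part!r}")
        reading, restrictions = match.groups()
        result[reading] = restrictions.split(",") if restrictions else []
    return result


parsed = parse_readings(READINGS)
for reading, kanji_forms in parsed.items():
    print(reading, len(kanji_forms), kanji_forms)
```

Under the proposed simplification the same entry would carry a single unrestricted reading, i.e. the equivalent of {"のどがかわく": []}.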

Would it really matter if it just had "のどがかわく"? Looking up the entry
using kana alone would/should still find it, provided the developers of
the lookup software treat hiragana and katakana forms as equivalent.
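As an illustration of what matching both kana forms could look like on the application side, here is a minimal Python sketch that folds katakana to hiragana before indexing and before lookup. The names kata_to_hira, build_index and lookup are hypothetical, the number 1277350 is simply the q= value from the footnote URL used as an illustrative key, and real applications may well normalise differently.

```python
def kata_to_hira(text):
    """Fold katakana (U+30A1..U+30F6) to hiragana; leave everything else alone.

    The two blocks are laid out in parallel, 0x60 code points apart.
    The long-vowel mark ー and kanji are outside the range and pass through.
    """
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


def build_index(entries):
    """Index entry ids by their kana-folded readings."""
    index = {}
    for entry_id, readings in entries.items():
        for reading in readings:
            index.setdefault(kata_to_hira(reading), set()).add(entry_id)
    return index


def lookup(index, query):
    """Return the ids of entries whose folded reading equals the folded query."""
    return index.get(kata_to_hira(query), set())


# Simplified entry carrying only the hiragana reading.
ENTRIES = {1277350: ["のどがかわく"]}
INDEX = build_index(ENTRIES)

print(lookup(INDEX, "のどがかわく"))
print(lookup(INDEX, "ノドがかわく"))
```

With folding of this kind, a search for ノドがかわく reaches the entry even though only のどがかわく is stored.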

A simplification like this would only apply to those sorts of mixed
terms. Entries where readings fully or partly in katakana are
considered appropriate would not be affected.

Any views on this?

Jim

(*) https://www.edrdg.org/jmdictdb/cgi-bin/entr.py?svc=jmdict&sid=&q=1277350
-- 
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
http://www.jimbreen.org/
http://nihongo.monash.edu/