Re: Possible simplification of readings in JMdict
This has just come up in the edits, with アコヤ貝 being proposed for
addition to the 阿古屋貝 entry. I propose to approve it without adding
アコヤがい as a reading.
Jim
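
For consumers of the data, approving アコヤ貝 without a matching アコヤがい reading only works if kana lookups ignore the hiragana/katakana distinction. Below is a minimal sketch of that kind of matching. The names fold_kana and reading_matches are hypothetical and not part of JMdict or any JMdict tool, and the stored reading あこやがい is assumed from the discussion rather than taken from the entry itself.

```python
from __future__ import annotations

# Hypothetical helpers for illustration; not part of JMdict or its tooling.
KATAKANA_FIRST = 0x30A1  # ァ
KATAKANA_LAST = 0x30F6   # ヶ
KANA_OFFSET = 0x60       # distance between the katakana and hiragana blocks


def fold_kana(text: str) -> str:
    """Map katakana ァ..ヶ to the matching hiragana; leave other characters alone."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_FIRST <= ord(ch) <= KATAKANA_LAST
        else ch
        for ch in text
    )


def reading_matches(query: str, readings: list[str]) -> bool:
    """True if the query equals any stored reading once both are kana-folded."""
    folded = fold_kana(query)
    return any(fold_kana(r) == folded for r in readings)


# Assumed stored reading for the 阿古屋貝 entry, with no katakana variant.
akoyagai_readings = ["あこやがい"]
print(reading_matches("アコヤがい", akoyagai_readings))
print(reading_matches("あこやがい", akoyagai_readings))
```

The prolonged sound mark ー and any kanji pass through unchanged, so this only handles the plain kana-for-kana substitutions discussed in this thread.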
On Thu, 29 Jul 2021 at 10:17, Jim Breen <jimbreen@gmail.com> wrote:
>
> On Tue, 27 Jul 2021 at 00:05, Chris Vasselli <clindsay@gmail.com> wrote:
> [...]
> > I want to clarify a little bit about your proposal, especially your last paragraph. Are you saying that, in your hypothetical ナニを食べる for example, if there were real-world usage of the reading ナニをたべる as a surface form then that form would be included in the database, but if not, it would be excluded?
>
> Yes, something like that. The [nokanji] cases would stay, of course.
>
> To give a real example. An entry such as:
> 鉛筆
> 【 えんぴつ; エンピツ (nokanji) 】
> would stay as it is, but:
> 鉛筆削り; えんぴつ削り; エンピツ削り
> 【 えんぴつけずり (鉛筆削り, えんぴつ削り); エンピツけずり (エンピツ削り) 】
> would see the reading field change to just:
> 【 えんぴつけずり 】
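
The change described in the two examples above can be sketched as a small transformation over an entry's readings. This is only one interpretation of the proposal, not an agreed rule or an existing JMdict tool: the Reading class, fold_kana and simplify are hypothetical names, and the sketch cannot tell which katakana readings editors would consider appropriate to keep, so it only protects those flagged nokanji.

```python
from dataclasses import dataclass, field


@dataclass
class Reading:
    text: str
    restricted_to: list = field(default_factory=list)  # kanji forms it applies to
    nokanji: bool = False


def fold_kana(text):
    """Map katakana ァ..ヶ to hiragana; leave other characters alone."""
    return "".join(
        chr(ord(ch) - 0x60) if 0x30A1 <= ord(ch) <= 0x30F6 else ch for ch in text
    )


def simplify(kanji_forms, readings):
    """Drop katakana-variant readings that duplicate a hiragana reading.

    Assumption: a reading is redundant when it is not nokanji, contains
    katakana, and its folded form is already present as another reading.
    """
    texts = {r.text for r in readings}
    inherited = {}  # hiragana reading -> kanji forms taken over from dropped variants
    kept = []
    for r in readings:
        folded = fold_kana(r.text)
        if not r.nokanji and folded != r.text and folded in texts:
            # An unrestricted variant applied to every kanji form.
            inherited.setdefault(folded, []).extend(r.restricted_to or kanji_forms)
            continue
        kept.append(r)

    result = []
    for r in kept:
        if r.text in inherited and r.restricted_to:
            extra = [k for k in inherited[r.text] if k not in r.restricted_to]
            forms = r.restricted_to + extra
            if set(forms) >= set(kanji_forms):
                forms = []  # covers every kanji form, so no restriction is needed
            r = Reading(r.text, forms, r.nokanji)
        result.append(r)
    return result


sharpener = simplify(
    ["鉛筆削り", "えんぴつ削り", "エンピツ削り"],
    [
        Reading("えんぴつけずり", ["鉛筆削り", "えんぴつ削り"]),
        Reading("エンピツけずり", ["エンピツ削り"]),
    ],
)
print([r.text for r in sharpener])
```

Merging the dropped reading's restrictions into the surviving one, rather than simply clearing them, is a design choice made here so that entries with other, unrelated readings keep their restrictions.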
>
> > At first blush, I imagine as long as there is a consistent and well-documented understanding of what the presence/absence of the form means, and all forms that actually occur in real world text still appear in the database, then as a consumer of JMdict that shouldn’t be too hard to adapt to.
>
> Yes, it's a bit of a trade-off between precision and visual clutter.
>
> Jim
>
> > On Jul 26, 2021, 4:20 AM -0400, Jim Breen <jimbreen@gmail.com>, wrote:
> >
> > When I first designed the JMdict structure in the late 90s I was keen
> > to provide for back-compatibility with the older EDICT format, as a
> > number of sites, etc. were using it. One aspect of that was to make
> > sure the kanji and reading parts matched precisely. For example, in a
> > (hypothetical) entry with a kanji part of "何を食べる;ナニを食べる" the reading
> > part had to have both なにをたべる and ナニをたべる, with restrictions tying the
> > readings to the matching kanji forms.
> >
> > This approach has led on occasion to some rather complex and ugly
> > entries, and it's appropriate to ask whether it's really worth doing.
> > Does it really matter? A recent example of this is the 喉が渇く entry (*),
> > where some variants were added containing ノド in place of 喉. The reading
> > part of that entry now contains:
> > のどがかわく[喉が渇く,のどが渇く,喉が乾く,のどが乾く,喉がかわく];ノドがかわく[ノドが渇く,ノドが乾く]
> >
> > Would it really matter if it just had "のどがかわく"? Looking up the entry
> > using kana alone should still find it, provided the developers match
> > both kana forms.
> >
> > A simplification like this would only apply to those sorts of mixed
> > terms. Entries where having readings fully or partly in katakana are
> > considered appropriate would not be affected.
> >
> > Any views on this?
> >
> > Jim
> >
> > (*) https://www.edrdg.org/jmdictdb/cgi-bin/entr.py?svc=jmdict&sid=&q=1277350
> > --
> > Jim Breen
> > Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
> > http://www.jimbreen.org/
> > http://nihongo.monash.edu/
> >
> > --
> > You received this message because you are subscribed to the Google Groups "EDICT-JMdict" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to edict-jmdict+unsubscribe@googlegroups.com.
> > To view this discussion on the web visit https://groups.google.com/d/msgid/edict-jmdict/CABHGxq6osUfyXqGP%2BvfJRHAOZPmOhr6ENEqFv0FS_pPN2H1agQ%40mail.gmail.com.
> >
--
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
http://www.jimbreen.org/
http://nihongo.monash.edu/
--
You received this message because you are subscribed to the Google Groups "EDICT-JMdict" group.
To unsubscribe from this group and stop receiving emails from it, send an email to edict-jmdict+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/edict-jmdict/CABHGxq5vsAOo_E1wxcnPz-Lc9fmVGRsmZbVAHekrAVYFLvZMgw%40mail.gmail.com.