RE: [edict-jmdict] Errors and Remarks on JMDict
[Stuart McGraw (RE: [edict-jmdict] Errors and Remarks on JMDict) writes:]
>>
>> I would like to suggest that cross references be sense->sense rather
>> than sense->entry as they are now. A cross-reference from 背が高い
>> to 高い(tall) is more accurate than one to 高い(high;tall;expensive).
I quite agree. This is something we should move to once we are
maintaining the data in a database.
It won't be too hard - there are only 1500+ xrefs at present and most are
from/to single-sense entries.
>> Another issue would be representation in jmdict and/or edict.
For edict/edict2 it could simply be: "(see 高い s2)" or something like
that.
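
As a rough illustration of how a client might read such a reference, here
is a minimal Python sketch. The "(see TARGET sN)" shape is only the
suggestion above, not an agreed edict format, and the names XREF_RE and
parse_xref are hypothetical.

```python
import re

# Hypothetical pattern for the "(see 高い s2)" style suggested above.
# The " sN" part is optional, so a plain "(see 高い)" still parses.
XREF_RE = re.compile(r"\(see (?P<target>\S+?)(?: s(?P<sense>\d+))?\)")


def parse_xref(text):
    """Return (target, sense_number or None), or None if no xref is found."""
    match = XREF_RE.search(text)
    if match is None:
        return None
    sense = match.group("sense")
    return match.group("target"), (int(sense) if sense else None)


with_sense = parse_xref("(see 高い s2)")
without_sense = parse_xref("(see 高い)")
```

A reference with no sense number falls back to the current entry-level
behaviour, so existing xrefs would not need to change.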
>> For jmdict, if one were adopting Marc's change, the additional change to add a
>> sense element would be minor. Marc's suggested change (or equivalent) is a
>> prerequisite for my change anyway, since specifying explicit senses when the target
>> can resolve to multiple entries makes no sense (oops, sorry). For edict, one could
>> either make a change in format, or keep the same format (point 5 above)
I think we need to decouple the tagging of senses from the ordering of
senses.
It makes good sense to follow the more-or-less standard lexicographic
practice and display senses in descending frequency of use, which means that
files such as edict/edict2 need to be generated that way, and jmdict
should either have the senses in that order, or have a discrete sense-order
attribute/entity. In addition, there should be a sense-id to which xrefs can
point, which will stay the same even if the senses are rearranged. Otherwise
whenever sense order changes, we'd have to find all entries with xrefs to
the entry and check/modify the xref details.
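
To make the decoupling concrete, here is a minimal Python sketch under
stated assumptions: the class names, the "order" field and the sense-id
strings such as "takai-tall" are all made up for illustration and are not
part of jmdict. The split of 高い into three senses is likewise only for
the example.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    sense_id: str   # stable identifier that xrefs point at (hypothetical)
    glosses: list
    order: int      # display rank, free to change over time


@dataclass
class Entry:
    headword: str
    senses: dict = field(default_factory=dict)  # keyed by sense_id

    def add(self, sense):
        self.senses[sense.sense_id] = sense

    def display_order(self):
        # Senses sorted for output, e.g. when generating edict/edict2.
        return sorted(self.senses.values(), key=lambda s: s.order)


def resolve(entries, xref):
    """Look up an xref given as (headword, sense_id)."""
    headword, sense_id = xref
    return entries[headword].senses[sense_id]


entries = {"高い": Entry("高い")}
entries["高い"].add(Sense("takai-high", ["high"], order=1))
entries["高い"].add(Sense("takai-tall", ["tall"], order=2))
entries["高い"].add(Sense("takai-expensive", ["expensive"], order=3))

# Sense-level xref as it might be stored on 背が高い.
xref = ("高い", "takai-tall")

# Rearranging the senses touches only the order field, never the xref.
entries["高い"].senses["takai-tall"].order = 0
target = resolve(entries, xref)
```

Because the xref stores the sense-id rather than a position, reordering
the senses requires no pass over the entries that refer to them.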
Cheers
Jim
--
Jim Breen http://www.csse.monash.edu.au/~jwb/
Clayton School of Information Technology, Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia Fax: +61 3 9905 5146
(Monash Provider No. 00008C) ジム・ブリーン@モナシュ大学