Re: [edict-jmdict] <xref> tags without a destination or with too many destinations
On 5 November 2010 03:36, Jean-Luc Léger <jean-luc.leger@dspnet.fr> wrote:
> On Thu, 4 Nov 2010 21:39:05 +1100, Jim Breen <jimbreen@gmail.com> wrote:
>> For sense focus the dreaded dotted numbers may as well continue.
>> Alternatively it could be something like: "(See 何か(2))".
>
> I think the "dreaded dotted numbers" are a problem only in JMDict, so we
> can keep them in EDICT.
That would mean no changes to clients such as WWWJDIC, which suits me fine.
>> For JMdict, something like:
[...]
>> <xref type="see" seq="1234560" id="amdg">何か</xref>
>> or possibly just:
>> <xref type="see" seq="1234560" id="amdg"/>
>> (This would effectively be file-global using the (1234560,amdg) tuple.)
>
> keep in mind that if you want to generate Edict* from JMDict, it would be
> easier to have kanji/reading and sense ordinal rank directly available in
> the xref.
Yes, it would help to have the kanji/reading in the xref. I hadn't thought
of having the ordinal there as well, and expected to build that from a
double pass over the data. Something like:
<xref type="see" seq="1234560" id="5678" id_ord="2">何か.なにか</xref>
would avoid that.
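
As a rough sketch of why that helps: with the kanji/reading and the ordinal
sitting in the xref, an EDICT-style reference can be built from the single
element, with no second pass. The function name edict_xref is hypothetical,
the "kanji.reading" split follows the example above, and the output uses the
"(See 何か(2))" form floated earlier in this thread, not a settled format.

```python
import xml.etree.ElementTree as ET


def edict_xref(xml_fragment: str) -> str:
    """Render one <xref> element as an EDICT-style cross-reference.

    Assumes the proposed attributes (type, seq, id, id_ord) and a
    "kanji.reading" text payload; none of this is a finalised format.
    """
    elem = ET.fromstring(xml_fragment)
    # Text is "kanji.reading"; only the kanji part is shown in the reference.
    kanji, _, _reading = (elem.text or "").partition(".")
    ordinal = elem.get("id_ord")
    # Only the "see" type appears in the thread, so only it is handled here.
    suffix = f"({ordinal})" if ordinal else ""
    return f"(See {kanji}{suffix})"


xref = '<xref type="see" seq="1234560" id="5678" id_ord="2">何か.なにか</xref>'
print(edict_xref(xref))  # (See 何か(2))
```

If id_ord is absent, the sketch simply drops the sense number and emits a
reference to the whole entry.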
>> For the JEL, we need the sense id glued to the sense in some way.
>> Something like:
>>
>> [s=spqr][n]....
>> [s=amdg][n,vs]....
>>
>> Messy.
>
> Hope we can find a way to keep JEL as it is now (i.e. with ordinal
> numbers).
> Sense Ids should be managed automatically by program.
> Like Glenn said, "amdg"/"spqr" don't have any meaning. Neither do numbers
> (local or global).
> Users should only have to see/manipulate ordinal numbers.
So just have the JEL as:
[sense][n] ...
[sense][n,vs] ....
>> For the xref itself:
>>
>> [see=1234560[spqr]]
>
> Ordinal number is quite enough here. The program that adds it into the
> database can convert the Sense Ordinal number to its Sense Id.
But you'd need to count down the list in the target entry. I'd hate to
do that with 上がる, which has 26 senses!
Or maybe the JEL shows ordinals and you just say:
...... [see=123450[7]]
That "7" gets replaced with the (hidden) id of the current 7th sense, and
hence if the senses get shuffled later, the xref target changes? I guess
that can work.
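
To make that concrete, here is a minimal sketch of the two directions: on
submission the ordinal in the JEL is swapped for the hidden sense id, and on
display the ordinal is recomputed from wherever that id currently sits. All
names here (resolve_xref, display_ordinal, the sense_ids_by_seq mapping and
the letter ids) are made up for illustration; the real database layout is not
specified anywhere in this thread.

```python
import re

# Matches the proposed JEL form, e.g. "[see=123450[7]]".
XREF_RE = re.compile(r"\[see=(\d+)\[(\d+)\]\]")


def resolve_xref(jel_text, sense_ids_by_seq):
    """Turn the ordinal in a JEL xref into (seq, hidden sense id)."""
    match = XREF_RE.search(jel_text)
    if match is None:
        raise ValueError("no [see=SEQ[N]] xref found")
    seq, ordinal = int(match.group(1)), int(match.group(2))
    sense_ids = sense_ids_by_seq[seq]  # ids in current display order
    if not 1 <= ordinal <= len(sense_ids):
        raise ValueError(f"entry {seq} has no sense {ordinal}")
    return seq, sense_ids[ordinal - 1]


def display_ordinal(seq, sense_id, sense_ids_by_seq):
    """Recompute the ordinal to show for a stored (seq, sense id) pair."""
    return sense_ids_by_seq[seq].index(sense_id) + 1


# Hypothetical entry with seven senses, identified by opaque ids.
senses = {123450: ["a", "b", "c", "d", "e", "f", "g"]}
seq, sense_id = resolve_xref("...... [see=123450[7]]", senses)

# Shuffle the senses: the stored id still names the same sense, and the
# ordinal shown in the JEL follows it to its new position.
senses[123450] = ["g", "a", "b", "c", "d", "e", "f"]
print(display_ordinal(seq, sense_id, senses))  # 1
```

The editor only ever sees the ordinal; the id is looked up once at submission
and stays attached to the sense afterwards.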
Jim
--
Jim Breen
Adjunct Snr Research Fellow, Clayton School of IT, Monash University
Vice-president: Hawthorn Rowing Club, Treasurer: Japanese Studies Centre
Graduate student: Language Technology Group, University of Melbourne