Re: [edict-jmdict] Handling leading particles in Edict.



> >  e.g. With あ、またコンピューターが固まっちゃったよ。
> >  it would be indexed to ちゃう1[1]{ちゃった}
> >
> >  While 変顔のつもりとちゃうやで〜これがメラニーのpouty lookアカンか〜笑。
> >  would be indexed to ちゃう2
>
> Not sure how that would help.

I chose a bad example, and failed to notice a small existing mistake.

The ちゃう clash can be partially avoided for linking
purposes by using じゃう instead of ちゃう where possible, which is
what makes it a bad example. I wouldn't want to throw away
existing 'work-arounds' where they exist, certainly not at first.

The small mistake was the sense numbers on ちゃう (which I think
were left over from when it was all one entry).

> I'd still need a way to relate ちゃう1
> to "ちゃう;じゃう (v5u) to do something completely" and ちゃう2 to
> "ちゃう (exp) (1) (osb:) No!; (2) isn't it?; wasn't it?"

ちゃう1 is the first encountered example of the headword ちゃう in
JMDICT, ちゃう2 is the second encountered example.
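
A rough sketch of that numbering, assuming the entries are read in
file order as (headword, gloss) pairs. The function name and the
sample list are hypothetical, the 固まる gloss is a placeholder, and
none of this is actual JMDICT or WWWJDIC code:

```python
from collections import Counter, defaultdict

def number_ambiguous_headwords(entries):
    """Label entries; a headword that occurs more than once gets 1, 2, ...
    in order of appearance, and a unique headword is left unchanged."""
    totals = Counter(headword for headword, _ in entries)
    seen = defaultdict(int)
    labelled = []
    for headword, gloss in entries:
        if totals[headword] > 1:
            seen[headword] += 1
            label = f"{headword}{seen[headword]}"
        else:
            label = headword
        labelled.append((label, gloss))
    return labelled

# Hypothetical sample in "file order"; only ちゃう is ambiguous here.
sample = [
    ("ちゃう", "(v5u) to do something completely"),
    ("固まる", "placeholder gloss"),
    ("ちゃう", "(exp) (osb:) No!; isn't it?"),
]
labelled_sample = number_ambiguous_headwords(sample)
```

Only the duplicated headword picks up a suffix, so unambiguous
entries stay exactly as they are now.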

> At least in the ちゃう case you can use じゃう (as you do now) as the
> index target.

For the contraction of てしまう, not for the osb: one.

> I really can't see a definitive solution apart from using entry
> sequence numbers.

From my point of view, the advantages of using 1, 2 instead of entry
numbers are:
1) I can remember '1' and '2' - I can't remember 1112100 and 1443220
(this makes a big difference in ease of entry in the first place
and 'at a glance' understandability of the data).
2) The suffix is only applied to the cases that actually need it
(thus adding few bytes to the file and, in theory, little extra
processing on my side).

It might be worth my doing it even if only on my own computer.

As far as implementation with WWWJDIC goes, you already have the
example of TEMPSUB and FIX to show that information can be in
an entry that is not displayed on the standard WWWJDIC display.
Could you have something like NONUNIQ1 NONUNIQ2 for the ambiguous
cases and use that to alter the generation of the [EX] link?

P.S. I said ...
"after the  heading(+reading if req.)"
but, on reflection, I think it looks better after the headword
even if a reading is required.

駆ける1(かける) not 駆ける(かける)1
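
To make the placement concrete, here is a small sketch of how such an
index token could be assembled with the number directly after the
headword. The function and its parameter names are hypothetical, and
the order of the optional parts (reading in parentheses, then the
surface form in braces) is my assumption, not a statement of how the
existing index is generated:

```python
def format_index_token(headword, ordinal=None, reading=None, surface=None):
    """Build an index token with the disambiguating number placed
    immediately after the headword, before any reading."""
    token = headword
    if ordinal is not None:
        token += str(ordinal)          # 駆ける1, not 駆ける(かける)1
    if reading:
        token += f"({reading})"        # optional reading
    if surface:
        token += "{" + surface + "}"   # optional form as it appears in the sentence
    return token

with_reading = format_index_token("駆ける", ordinal=1, reading="かける")
with_surface = format_index_token("ちゃう", ordinal=1, surface="ちゃった")
plain = format_index_token("固まる")
```

An entry that needs no number comes out unchanged, so existing
index lines would not have to be touched.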