Exp problem
That would be amazing, Jean-Luc. I've brought up the problem before with Jim, but perhaps I should have posted something on this list. Many, many "exp" entries don't have a verb (v5, v1) tag. I fix them manually when I find them, but I don't think I could ever fix them all within my lifetime. It would be great if you could automate the task.
--- In edict-jmdict@yahoogroups.com, Jean-Luc Léger <reiga@...> wrote:
>
> On Fri, 16 Jul 2010 15:32:39 +1000, Jim Breen <jimbreen@...> wrote:
> >
> > The problem is that the existing 74 vs-i verbs are all expressions of
> > the XXXXã?«ã??ã?? variety. WWWJDIC handles them as cases of "vs" by
> > stripping off the ã??ã?? then treating it as though it were å??å¼·, et al.
> >
> > I could:
> >
> > (a) leave the 愛する brigade as "vs-s";
> > (b) leave the 勉強 class as "vs" (these are the ones that Daijirin tags
> > "（名）スル" and the Japanese NLP people call サ変名詞/verbal noun).
> > There are thousands of entries with this tag;
> > (c) leave "vs-i" on 為る/する, mainly to keep WWWJDIC happy. It's
> > not at all irregular, so I may rebadge it "vs-x".
> > (d) do something different for the expressions with ã??ã?? (XXXã?«ã??ã??,
> > XXXã??ã??ã??, XXXã?'ã??ã??, etc.) There are about 200 of these, of which 74
> > are currently tagged "vs-i". Again I don't like "vs-i" on them as they
> are
> > not irregular. Maybe "vs-e" for "expression using ã??ã??"?
>
> For the last one, take into account expressions with verbs other than する.
>
> For example, take these entries:
>
> 気にする [きにする] /(exp,vs-i)
> 気になる [きになる] /(exp,vi,v5r)
>
> Both should be managed the same way. So having a 'vs-e' tag seems
> illogical.
> My opinion is that 'exp' + a tag describing the conjugation of the final
> word (here a verb), is quite enough.
> In summary, I would use the same tag in c) and d) (call it vs-x if you
> don't like vs-i)
>
> Tell me if I misunderstood the problem and the need to separate c) and d).
>
> By the way, I see many verbal expressions with only (exp). Do we agree
> that they should also have a vxxx tag?
> If so, I will build a script to extract every one that needs a tag and
> which tag it needs (from the one put on the entry of the verb alone).
> That could be a good occasion to think about a bulk updater for JMDictDB,
> if you don't already have one :D
>
> JL
>
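
The script proposed above (find expressions tagged only "exp" and borrow the conjugation tag from the entry of the final verb alone) could be sketched roughly as follows. This is only an illustration under stated assumptions: it reads EDICT-style text lines rather than JMDictDB, the VERB_TAGS set is a partial list chosen for the example, the function names are hypothetical, and the sample entries and glosses are made up for the demonstration. Matching the longest trailing substring of the headword against standalone verb entries is a heuristic, so every suggestion would still need human review.

```python
import re

# Conjugation tags treated as "verb tags" in this sketch (assumed, partial list).
VERB_TAGS = {"v1", "v5k", "v5r", "v5s", "v5u", "vk", "vs-i", "vs-s"}

# EDICT-style line: HEADWORD [READING] /(tag,tag) gloss/ ; the reading is optional.
ENTRY_RE = re.compile(r"^(\S+)(?:\s+\[([^\]]+)\])?\s+/\(([^)]*)\)")


def parse_entry(line):
    """Return (headword, reading, tags) or None if the line does not match."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    headword, reading, tags = m.groups()
    return headword, reading, [t.strip() for t in tags.split(",")]


def suggest_verb_tags(lines):
    """Map each exp-only headword to the verb tags of its final verb, if found."""
    entries = [e for e in map(parse_entry, lines) if e is not None]

    # Standalone verbs: entries with a conjugation tag and no "exp" tag.
    verbs = {}
    for headword, _reading, tags in entries:
        conj = [t for t in tags if t in VERB_TAGS]
        if conj and "exp" not in tags:
            verbs[headword] = conj

    suggestions = {}
    for headword, _reading, tags in entries:
        # Only expressions that do not already carry a verb tag are candidates.
        if "exp" not in tags or any(t in VERB_TAGS for t in tags):
            continue
        # Try the longest proper suffix first, then shorter ones.
        for start in range(1, len(headword)):
            suffix = headword[start:]
            if suffix in verbs:
                suggestions[headword] = verbs[suffix]
                break
    return suggestions


# Made-up sample data in EDICT-like form, for illustration only.
sample = [
    "する /(vs-i) to do/",
    "なる [なる] /(v5r,vi) to become/",
    "気にする [きにする] /(exp) to mind/",
    "気になる [きになる] /(exp,vi,v5r) to weigh on one's mind/",
    "お早う [おはよう] /(int) good morning/",
]

if __name__ == "__main__":
    for headword, tags in suggest_verb_tags(sample).items():
        print(headword, "->", ",".join(tags))
```

With the sample above, only 気にする is reported, because 気になる already carries v5r and the interjection is not an expression. A real run would also have to cope with verbs that have several entries or several conjugation classes, and with expressions whose last word is not a verb at all.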