Re: PoS vs-i issues
> Re Scott's comment about Rikaichan et al., well I guess
> if WWWJDIC can be programmed around it, they can
> do the same.
They might be able to program around it but it still strikes me as logically inconsistent. Shouldn't there be a different POS for an entry that includes する as part of it and one that doesn't? Perhaps you could use a new POS tag for this.
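To make the concern concrete, here is a minimal Python sketch of what a third-party conjugator would have to do if a single vs tag covered both kinds of entry. It is only an illustration, not how WWWJDIC or Rikaichan actually work: the names `SURU_FORMS`, `suru_stem` and `conjugate_vs` are made up for this example, only a handful of forms are listed, and the しない/さない distinction discussed in the quoted message is ignored.

```python
# Hypothetical sketch: not WWWJDIC's or Rikaichan's actual code.
SURU = "する"

# A few forms of する, keyed by illustrative labels (not a full paradigm).
SURU_FORMS = {
    "plain": "する",
    "negative": "しない",
    "past": "した",
    "polite": "します",
    "te": "して",
}


def suru_stem(headword: str) -> str:
    """Return the part of the headword that precedes する.

    If the headword already ends in する (e.g. 気にする), strip it;
    otherwise (e.g. 理解) the whole headword is the stem.
    """
    if headword.endswith(SURU):
        return headword[: -len(SURU)]
    return headword


def conjugate_vs(headword: str) -> dict:
    """Build the listed する forms for a vs-tagged headword of either kind."""
    stem = suru_stem(headword)
    return {label: stem + form for label, form in SURU_FORMS.items()}


if __name__ == "__main__":
    print(conjugate_vs("理解")["negative"])
    print(conjugate_vs("気にする")["past"])
```

The point is that the software has to guess from the headword itself whether する is already included, which is exactly the information a separate tag would otherwise carry.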
Also, I would remove vs-i and vs-s from the user-available choices (if such a thing is possible with the new system). (If I'm correct, vs-s is completely useless?)
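If the tags do end up being merged as proposed in the quoted message below, the per-entry change might look roughly like this Python sketch. It works on a plain list of PoS tags rather than on the real database, whose schema I am not assuming anything about; `merge_suru_tags` and the `keep_as_is` flag (for an entry such as 為る/する that might be left alone) are hypothetical names for illustration.

```python
# Hypothetical sketch of the vs-s / vs-i -> vs merge on one entry's tag list.
SURU_TAGS_TO_MERGE = {"vs-s", "vs-i"}


def merge_suru_tags(pos_tags, keep_as_is=False):
    """Replace vs-s and vs-i with vs, keeping order and dropping duplicates.

    keep_as_is=True returns the tags unchanged, for any entry that is
    meant to be exempt from the merge.
    """
    if keep_as_is:
        return list(pos_tags)
    merged = []
    for tag in pos_tags:
        new_tag = "vs" if tag in SURU_TAGS_TO_MERGE else tag
        if new_tag not in merged:
            merged.append(new_tag)
    return merged


if __name__ == "__main__":
    print(merge_suru_tags(["exp", "vs-i"]))
    print(merge_suru_tags(["n", "vs", "vs-s"]))
```

Dropping duplicates matters for entries that already carry vs alongside one of the other two tags.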
--- In edict-jmdict@yahoogroups.com, Jim Breen <jimbreen@...> wrote:
>
> On the old submission system, Scott raised the issue of
> vs/vs-i tags on XXXX¤¹¤ë entries. My creaky memory had it
> that vs-i only was used by °Ù¤ë/¤¹¤ë itself, and I was
> alarmed to see it's on 70+ entries, mostly XXXX¤Ë¤¹¤ë.
>
> Scott's most recent comment is:
>
> > Regarding the POS debate, I thought that the tag was exclusive
> > to cases where the する verb is not part of the word and must be
> > added after it. e.g. 理解ー＞理解する。 Having the vs tag cover both
> > those cases and the cases where suru is part of the expression
> > e.g. 気にする could be potentially confusing, especially to software.
> > I'm afraid it could confuse third-party (e.g. rikaichan) conjugators.
> > I also noticed that some entries already use the vs-i tag. Wouldn't
> > using the vs-i tags keep things separate and more logically consistent?
>
> I hunted back in the ML archive and found we were discussing this
> about 15 months ago, where I said:
>
> > I only introduced "vs-i" to be able to handle entries which had the する
> > included, but which didn't do the さない/さなかった thing. It can be
> > dropped when this comes in, except perhaps for 為る/する itself.
>
> The full message and some discussion can be seen at:
> http://tech.groups.yahoo.com/group/edict-jmdict/message/3382
>
> It seems I partially implemented this, because XXXX¤¹¤ë entries
> with a "vs" tag work fine in WWWJDIC's tables, whereas "vs-I"
> just produces a table for ¤¹¤ë alone.
>
> What I never got around to doing was rolling the tables for "vs"
> and "vs-s" together, or removing the vs-i from everything except
> 為る/する.
>
> Anyway I think the approach should be:
>
> (a) stop using vs-i
> (b) modify WWWJDIC to show up the しない/さない cases
> in the one table with a footnote on the さない as described
> in the posting last year
> (c) convert all the vs-s (74) and vs-i (218) to just vs
>
> Re Scott's comment about Rikaichan et al., well I guess
> if WWWJDIC can be programmed around it, they can
> do the same.
>
> The thought of correcting ~300 PoSes is a bit scary, but
> we can probably do a global change using database tools.
>
> Comments?
>
> Jim
>
> --
> Jim Breen
> Adjunct Snr Research Fellow, Clayton School of IT, Monash University
> Treasurer: Hawthorn Rowing Club, Japanese Studies Centre
> Graduate student: Language Technology Group, University of Melbourne
>