On 03/23/2018 11:38 PM, Marcus Richert superbrightfuture@********* [edict-jmdict] wrote:
> I don't quite want to give up on the idea of merging adj-ix and adj-i entries just yet. It's been a very rewarding discussion, but I think there are solutions that really should satisfy all the different issues which have been brought up with the merge.
> This is what I think we should be doing:
>
> 1) To account for the fact that 良い and 無い share an irregular inflection (the addition of -さ with the suffix そう e.g. 良さそう and なさそう), we should implement a misc tag to these,
This is related to the v5aru vs v5r+aru discussion I think (i.e. do we
want a flat pos tag namespace with one tag encoding all the conjugation
properties of a word or should those properties be described by a
composite set of tags?)
While I didn't mention it before, if an additive tag is used I think it
should be another pos tag -- a misc tag would be dispersing information
about conjugation in two quite different places which I don't think is
good design.
> or maybe make them both "adj-i-s" instead of "adj-i" and "adj-ix" (if we want to follow the pattern previously established with v5u-s)
This would maintain the "one pos tag is all you need to know" status quo.
To the objection about inventing our own unique parts-of-speech definitions,
I still don't think that is a problem as long as ours are easily reducible
to conventional parts-of-speech. It may even be advantageous if it allows
reduction to different conventions: いい as non-conjugatable vs いい as an
irregular conjugation.
> 2) To account for the fact that いい doesn't conjugate, we can either
>
> a) implement a [noconj] tag and add it to any reading/kanji that includes いい. We might want to recommend that dictionaries hide this tag from the end user and only use it for conjugation tables and the like.
>
> /or/
>
> b) always place よい before いい, with a usage note stating that いい is more common in 終止形/連体形.
I think this assumes that a conjugator will only conjugate the first
of multiple readings. But that seems to be excessively restrictive.
For example, the conjugator in the JMdictDB web pages conjugates all
the readings (as well as all the kanji). (I consider that a feature! :-)
Click the Conjugations link in
http://edrdg.org/~smg/cgi-bin/entr.py?e=1974645&svc=jmtest
for example.
[Why the heck does Yahoo remove indentation?]
Given any form of a word and its part-of-speech tag [*], one can currently
conjugate it, mechanically, with no further information. It would be
sad to lose that property in my opinion, either by adding a requirement
to make decisions based on the word's surface form or by restricting
conjugatability to only the first reading.
[*] At least for modern words; I'm excluding the v4*, v2* and other pos
tags whose conjugation rules I, at least, haven't a clue about.
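
To make the "form plus pos tag is all you need" property concrete, here is a
minimal sketch in Python. It is not the JMdictDB conjugator: the RULES table
layout and the conjugate() function are assumptions made up for illustration,
only a few forms of two adjective classes are covered, and only kana forms
are handled (a kanji form such as 良い would need its own ending rule).

```python
# Hypothetical, heavily reduced rule table:
#   pos tag -> (ending to strip, {form name: replacement ending})
# The pos tag names come from this thread; everything else is illustrative.
RULES = {
    "adj-i":  ("い",   {"plain": "い",   "negative": "くない",   "past": "かった"}),
    "adj-ix": ("いい", {"plain": "いい", "negative": "よくない", "past": "よかった"}),
}

def conjugate(word, pos):
    """Conjugate a kana form using nothing but the form and its pos tag."""
    ending, forms = RULES[pos]
    if not word.endswith(ending):
        raise ValueError(f"{word!r} does not end in {ending!r} as {pos} requires")
    stem = word[: -len(ending)]
    # Every form is stem + replacement ending; no per-word decisions are made.
    return {name: stem + suffix for name, suffix in forms.items()}

ii_forms = conjugate("かっこいい", "adj-ix")
takai_forms = conjugate("たかい", "adj-i")
```

Note that conjugate() never looks at which reading it was given beyond the
ending the pos tag itself prescribes. That is the property that would be lost
if the conjugator had to special-case いい by surface form.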
> My preference is for solution a) but I really don't think that there are any big issues with b) either, considering that altogether, counting the conjugated forms, "よい" *is* probably the more common surface form of the two.
>
>
> 3) Either way though, merging 良い and いい will make a handful of entries such as 格好良い a little messier than what they already are, but averaged out, I think this change would still make things tidier, not messier. Even if it did actually make things messier averaged out as well, though, I wonder to what extent it would really matter? I hadn't ever thought of it in that way previously, but I kind of took to heart what Stuart said about JMdictDB not actually being a dictionary, but a store of data that can be used to produce dictionaries. Merging things might, on occasion, make things look a little messy in the database, but in an interactive dictionary app, you could set up things in a neater way, only displaying more basic information at a first glance, and then displaying other information based on the user pressing/tapping different kanji, etc.
>
> Also, in the specific case of 格好いい, I think we might just as well split it into 格好(かっこう)良い(いい/よい) and かっこ良い(いい/よい) instead of よい vs いい - neither daijs nor daijr use the 格好 kanji in their かっこ良い entries.
>
>
>
> On Fri, Mar 9, 2018 at 8:51 AM, s mcgraw smcg6347@*********** [edict-jmdict] <edict-jmdict@***************> wrote:
>
> Interesting idea. I don't think it presents a problem for the jmdictdb
> submission code (syntactically it's no different than the "nokanji" tag.)
>
> The jmdictdb conjugator conjugates each reading-kanji pair in the
> cross-product of all the readings with all the kanji for each
> sense with a conjugatable PoS. It should (but I don't think does
> yet) filter out any combinations of readings, kanji or senses that
> are excluded by the restr, stagr or stagk restrictions. A "noconj"
> tag on a reading could, in effect, during the conjugation process,
> be expanded to equivalent restr pairs so it would fit quite naturally
> into the conjugation process.
>
>
> On 03/07/2018 02:29 PM, Marcus Richert superbrightfuture@********* [edict-jmdict] wrote:
> > In regards to Stuart's suggestion about having a misc tag on よい/いい entries, another idea (not sure how feasible) could be to add a "does not conjugate" tag to readings and kanji, i.e. something like
> >
> > 格好良い
> > かっこういい[noconj];かっこうよい
> >
> > or
> >
> > 格好良い;かっこいい[noconj]
> > かっこういい[noconj];かっこうよい
> >
> > We could recommend dictionaries using jmdict to not show this tag to the end user, but just use it for conjugation info. We could possibly use it for other things as well, like all-katakana readings of i-adjectives (see ウザイ <http://www.edrdg.org/jmdictdb/cgi-bin/entr..py?svc=jmdict&sid=&e=1933202>, etc.).
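
The cross-product conjugation with a [noconj] tag discussed in the quoted
messages above could be sketched roughly as follows. This is an illustration
only, not the JMdictDB code: conjugatable_pairs(), the shape of the restr
mapping and the noconj set are all hypothetical, and for brevity the sketch
filters tagged items directly rather than expanding them into equivalent
restr pairs first.

```python
from itertools import product

def conjugatable_pairs(kanji, readings, restr=None, noconj=frozenset()):
    """Yield the (kanji, reading) pairs a conjugator should process.

    restr  -- hypothetical mapping: reading -> set of kanji it is limited to
    noconj -- hypothetical set of readings/kanji carrying the [noconj] tag
    """
    restr = restr or {}
    for k, r in product(kanji, readings):
        # Skip pairs excluded by a reading-to-kanji restriction.
        if r in restr and k not in restr[r]:
            continue
        # Skip any pair involving a form tagged as non-conjugating.
        if k in noconj or r in noconj:
            continue
        yield (k, r)

# The 格好良い example from the thread, with the いい reading tagged [noconj].
pairs = list(conjugatable_pairs(
    kanji=["格好良い"],
    readings=["かっこういい", "かっこうよい"],
    noconj={"かっこういい"},
))
```

With the いい reading tagged, only the 格好良い/かっこうよい pair is left to be
conjugated. The tagged reading still exists in the entry for display and
lookup; it is merely kept out of the conjugation tables.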
------------------------------------
Posted by: s mcgraw <smcg6347@***********>
------------------------------------