I concur with Chris here. I have no opinion about separate or merged いい/よい
entries but I don't see the benefit of dropping the adj-ix tag. For those to
whom it makes no difference, it is trivial to map adj-ix to adj-i prior to
processing. However, going the other way is more difficult. For example,
the jmdictdb conjugator is driven solely by the PoS tag with no need to look
at the text of the word itself. Were that needed it would be a significant
architectural change.
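
To make the asymmetry concrete, here is a minimal Python sketch. It is not
jmdictdb's actual code: the RULES table, downgrade_pos() and conjugate() are
hypothetical names, and the table covers only kana readings and two forms.

```python
# Hypothetical sketch, not jmdictdb's real conjugator or data.

def downgrade_pos(pos_tags):
    """Map adj-ix to adj-i for consumers that ignore the distinction."""
    return ["adj-i" if tag == "adj-ix" else tag for tag in pos_tags]

# Rules keyed only by (PoS tag, form): how many trailing kana to drop and
# which ending to append. Kana readings only; kanji forms are not handled.
RULES = {
    ("adj-i", "negative"): (1, "くない"),
    ("adj-i", "past"): (1, "かった"),
    ("adj-ix", "negative"): (2, "よくない"),
    ("adj-ix", "past"): (2, "よかった"),
}

def conjugate(reading, pos, form):
    """Conjugate a kana reading using the PoS tag alone to pick the rule."""
    drop, ending = RULES[(pos, form)]
    return reading[:-drop] + ending

print(downgrade_pos(["adj-ix", "n"]))
print(conjugate("かっこいい", "adj-ix", "past"))
print(conjugate("いい", "adj-i", "negative"))
```

The last call shows what happens once the tag is gone: with only adj-i to go
on, the table produces いくない unless the code starts inspecting the text of
the word.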
Another point is that the adj-ix tag captures the information that has been
proposed to be duplicated in far less machine-useable form as a note in the
entry. That is, the note can easily be generated from the tag. Again, going
the other way (looking for some particular text in the note in order to
rephrase it for example), is more fragile.
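
As a rough illustration of the tag-to-note direction, again a sketch with
made-up names (NOTE_FOR_TAG, notes_from_pos) and made-up note wording:

```python
# Hypothetical note text; the real wording would be an editorial decision.
NOTE_FOR_TAG = {
    "adj-ix": "conjugated from the よい stem: よくない, よかった, etc.",
}

def notes_from_pos(pos_tags):
    """Return the display notes implied by an entry's PoS tags."""
    return [NOTE_FOR_TAG[tag] for tag in pos_tags if tag in NOTE_FOR_TAG]

print(notes_from_pos(["adj-ix"]))
```

Changing the wording later means editing one string, whereas recovering the
tag from free-text notes means pattern-matching whatever each note happens
to say.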
It seems to me that even if いい is sometimes conjugated the same way as よい,
いい/よい (and friends) is a sufficiently unique case, and one that is described
as such even to beginning students of Japanese, as to justify a separate tag.
Given the number of PoS tags already in use it doesn't seem like a huge gain
to eliminate this one.
-- Stuart
On 03/05/2018 08:16 PM, Chris Vasselli clindsay@********* [edict-jmdict]
wrote:
> Hi all,
>
> Just wanted to give my 2 cents as a developer of an app <http://nihongo-app.com> that relies on this data. For some background, my app has a dictionary component, but it also uses the part of speech data to find the words that are contained in a piece of Japanese text. I rely on the part of speech information to power a deconjugation system that can take something like 頭がよくなさそう and link it to the dictionary entry for 頭が良い.
>
> I can definitely adapt to whatever decision you guys make here, but it’s certainly useful for me to have entries that end in 良い to have a separate part of speech, or at least some sort of special tagging. To me, it’s useful information that can be provided by the database, that I will otherwise need to infer on my own. More information is better than less information, and the [adj-ix] marking would help me distinguish the cases where I need to do special handling.
>
> For example, Marcus mentioned that “we could maybe consider instituting a rule that よい should always come before いい in the kanji/readings.” In that case, this would make the conjugation/deconjugation logic work properly. But I would need to special-case these entries when they show up in the dictionary component of my app to show the いい version as the primary reading of the word. I want to display the most common reading, which is the いい version.
>
> Without any sort of tagging that this is an いい adjective, I will probably have to just look for words that end in よい and also contain an alternate いい reading, and assume that those need to be flipped for display. This seems a little brittle, and I’d rather have the database provide that information.
>
> The alternative of listing the いい version as the first version has its own problems. For example, when generating conjugation tables, we’d want to use the よい version. Even if, as Rene mentioned, いくない could in some circumstances be a valid conjugation, it’s certainly not the conjugation I would want to present to my users, at least as the default conjugation. So again, I need a way to distinguish entries that need this kind of special treatment.
>
> In short, if we merge よい/いい entries back together, app developers still need a way to know that these entries need special handling. [adj-ix] seems like a reasonable way to do this in my mind.
>
> Just my 2 cents. Thanks!
>
> Chris
>
>
> On Mar 6, 2018, 11:40 AM +0900, Jim Breen jimbreen@********* [edict-jmdict] <edict-jmdict@***************>, wrote:
>>
>> Yes, I'd missed that comment in Daijirin too. I had been leaning towards
>> merging them back together with "adj-ix", but now I think that they can be
>> "adj-i" with a note on the いい entry. Any software that conjugates adjectives
>> will need to be aware that 良い/いい needs a bit of special treatment.
>>
>> Thanks for the very useful discussion.
>>