
Re: [edict-jmdict] Re: いい/よい POS matters



Sorry I took some time to answer.

On 2014-07-30 08:18, Stuart McGraw smcg4191@frii.com [edict-jmdict] wrote:
Formality and Conjugation are not the same things.
It might be easier for learners to have them together in the same
table but ..

Are you also putting Honorific and Humble forms in those tables of
yours? One could say they are conjugations as well!
Would you say that honorific verbs don't conjugate at all because they
don't have an honorific form?
See なさる, ご覧になる and ご存知です
 What is and is not a conjugation is obviously subject to debate.
 For example, I have seen some sources that give a "desirative"
 conjugation, eg. 書く -> 書きたい.  But one has to draw a line
 somewhere because otherwise one would have to include zillions
 of various auxiliary verb suffix forms.

Exactly. My point of view is that the choice of where to draw the line must be made in applications using JMDict and not in JMDict itself.


 I based my choice of conjugated forms on what I commonly saw
 in JSL textbooks, on the internet (eg wwwjdic's conjugations)
 etc.  I realize *any* specific choice can be criticized as including
 too many or not enough.

Yup, the choices you made for your application are completely valid.


 Honorific and humble forms do not seem to be included in many
 (or perhaps any) of the conjugation tables I've seen.

Another way to create "conjugation tables" of a word could be to stick
to its formality level and point to the conjugation tables of auxiliary
words such as です and ます.
 When I first tried to come up with code to do conjugations,
 that is an approach I tried:  view conjugations as the outcome
 of a series of sequential primitive conjugation transforms.

 Thus v5* verbs (eg 書く) would not have a negative past form
 but rather only a negative form that results in an adj-i and
 it is the i-adj that has a past form:

 書く    ->      書かない    ->       書かなかった
 v5k     (neg)   adj-i       (past)

 or

 書く    ->      書きます    ->          書きませんでした
 v5k     (fml)   masu        (neg-past)

 In the above, the second row of each table shows the PoS of each
 verb form and the transformations (in parens) applied.

 It is interesting to try to reduce conjugations to a minimal
 sequence of primitive conjugations but I came to the conclusion
 that it is hard to display such results in a convenient form to
 a user.  That is, you'll likely end up presenting the above in
 a row labeled: "neg-past" regardless of whether it was generated
 stepwise or directly.  And the same for conjugations of だ/です.

I agree. Even though those sequences of primitive conjugations are really how the language works, it's an absolute nightmare to display them like this, and probably not useful to many people :)
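
For readers who want to try the stepwise view discussed above, here is a minimal sketch in Python. It is not code from either participant's application: the `TRANSFORMS` table, the `apply_chain` function and the use of `None` for "no further PoS" are assumptions made for illustration, and only the four transforms needed for the two 書く examples are included.

```python
# Hypothetical sketch of "sequential primitive conjugation transforms".
# Key: (current PoS, transform name).
# Value: (suffix to strip, suffix to add, PoS of the result).
# The result PoS of the final steps is not given in the thread, so None is used.
TRANSFORMS = {
    ("v5k", "neg"): ("く", "かない", "adj-i"),
    ("v5k", "fml"): ("く", "きます", "masu"),
    ("adj-i", "past"): ("い", "かった", None),
    ("masu", "neg-past"): ("ます", "ませんでした", None),
}


def apply_chain(form, pos, steps):
    """Apply each transform in turn and return every (form, PoS) along the way."""
    trace = [(form, pos)]
    for step in steps:
        if (pos, step) not in TRANSFORMS:
            raise ValueError(f"no transform {step!r} for PoS {pos!r}")
        old, new, next_pos = TRANSFORMS[(pos, step)]
        if not form.endswith(old):
            raise ValueError(f"{form} does not end in {old}")
        form = form[: -len(old)] + new
        pos = next_pos
        trace.append((form, pos))
    return trace


for steps in (["neg", "past"], ["fml", "neg-past"]):
    chain = apply_chain("書く", "v5k", steps)
    print(" -> ".join(form for form, _ in chain))
```

The intermediate PoS is what makes the chain work: 書かない is conjugated further only because the "neg" step hands back an adj-i. A display layer can still collapse the whole chain into one labeled row, which is the point made above.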


Would be nice if what we put in JMDict doesn't force us to choose
between any of these points of view.
 I agree.  And having cop-desu (and masu) PoS tags would indeed
 make it easier to do conjugations in a minimal stepwise manner.
 However that is not a co-requisite for having a cop-da PoS tag.

Absolutely. cop-da is a must-have, while cop-desu may still be debated.
For your application I guess you could just ignore cop-desu (or choose to redirect it to the cop-da conjugation table).


I thought perhaps です in 暑いです would be an example where one could
take it as an independent auxiliary verb. But then, it doesn't conjugate
at all: (X)暑いでした.

What? Of course 暑いでした exists!
I must have missed your point here.
 I thought that 暑いでした is considered grammatically incorrect.

OK, I might have reacted too quickly, and being at work I didn't check what I said ^^


 My point was: re René's comments, when is viewing です as an
 independent auxiliary verb more advantageous than as a "formal"
 conjugation form of だ?

 綺麗です (to borrow from Nil's post) does not seem to be an example
 since です in that usage is (ISTM) a form of だ.  The only case I can
 think of where です is not a form of だ is something like 暑いです.
 Yet that is not an example either because です does not conjugate
 (AFAIK) in that usage.

OK. Not sure when either. Maybe in a fragment of Japanese text in an automated analysis? By the way, do you distinguish だ after a noun and だ after an adjectival noun? One transforms into the "no" particle before another noun and the other transforms into "na".
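
To make the noun / adjectival-noun distinction concrete, here is a tiny sketch. The `ATTRIBUTIVE` mapping and the `attributive` function are hypothetical names for illustration; the PoS tags "n" and "adj-na" are the usual JMdict ones, and the example words are my own.

```python
# Hypothetical sketch: which linking form appears before a following noun
# depends on the PoS of the word, following the framing in the question above.
ATTRIBUTIVE = {
    "n": "の",       # noun + の + noun
    "adj-na": "な",  # adjectival noun + な + noun
}


def attributive(word, pos, head_noun):
    """Join `word` to `head_noun` with の or な according to its PoS."""
    return word + ATTRIBUTIVE[pos] + head_noun


print(attributive("学生", "n", "本"))
print(attributive("綺麗", "adj-na", "花"))
```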


Again, although I see your points re です and いい and I can see that
they may lead to a more technically accurate way of producing a Japanese
grammar, I have to resort to the fact that in many (the vast majority?)
of JSL teaching materials, です *is* presented as a conjugated form of だ
and いい as an irregular i-adjective. I would like to see JMdict apply
PoS tags that will allow (easily) either interpretation.

The question here is more: what do you want to present when you click
on the "conjugation table" of です?
 Before that, the question is, do you want to present conjugations
 of です at all?  If one takes the view that it is a form of だ, then
 one would look at the conjugations of だ and find です therein.
 If one takes the view that it is an independent, conjugatable
 auxiliary verb, then indeed, what would one want to see?
 I suppose it would be the same as the affirm-formal and neg-formal
 columns of a だ conjugation table.

"nothing" was a valid answer to my question :) (it's what one gets right now)
Or you could send the user back to the cop-da conjugation table.
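
The three answers discussed so far (show nothing, redirect to the だ table, or show only the formal columns of the だ table) can be sketched as below. The layout of `DA_TABLE`, the column names and the `desu_table` function with its policy strings are assumptions for illustration, not anything defined by JMdict or by an existing application. じゃ variants are left out for brevity.

```python
# Hypothetical だ conjugation table: rows are tenses, columns combine
# polarity (aff/neg) with formality (plain/formal).
DA_TABLE = {
    "non-past": {
        "aff-plain": "だ",
        "aff-formal": "です",
        "neg-plain": "ではない",
        "neg-formal": "ではありません",
    },
    "past": {
        "aff-plain": "だった",
        "aff-formal": "でした",
        "neg-plain": "ではなかった",
        "neg-formal": "ではありませんでした",
    },
}


def desu_table(policy):
    """Return what to show for です under one of three assumed policies."""
    if policy == "nothing":
        return None
    if policy == "redirect":
        return DA_TABLE
    if policy == "formal-columns":
        return {
            tense: {col: form for col, form in cols.items() if col.endswith("-formal")}
            for tense, cols in DA_TABLE.items()
        }
    raise ValueError(f"unknown policy: {policy}")


print(desu_table("formal-columns"))
```

Which policy to use stays an application decision; the dictionary data only needs to make all three possible.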


By the way, we should have 2 new kinds of xref:
- one that links conjugated words to their plain form (at the same level
  of formality): でしょう->です, な->だ
- one that links a formal, honorific or humble verb to an informal verb:
  なさる->する, です->だ
(and maybe link dialect verbs to the "normal" verb?)

Those xrefs could be used to get the "right" conjugation table when you
click on any of those.
 I would like to see such xrefs.  (They seem to be mostly already
 there but as "see:..." xrefs rather than being marked more
 specifically.)  But I am doubtful about using them as the basis
 for doing conjugations.

They could be used to redirect the user to the word that has a PoS and a conjugation table. Either directly, or just because the link would appear more clearly among other xrefs.
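
As a rough sketch of that redirection: follow the typed xrefs from the word the user clicked on until reaching a word for which the application has a conjugation table. The kind names "plain-form" and "informal", the `XREFS` mapping and the `resolve` function are all hypothetical; JMdict does not currently mark xrefs this way.

```python
# Hypothetical typed xrefs, using the examples proposed above.
XREFS = {
    "でしょう": ("plain-form", "です"),
    "な": ("plain-form", "だ"),
    "なさる": ("informal", "する"),
    "です": ("informal", "だ"),
}


def resolve(word, xrefs, has_table):
    """Follow xrefs from `word` to the first word that has a conjugation table.

    Returns None if the chain ends (or loops) before such a word is found.
    """
    seen = set()
    while word not in has_table:
        if word in seen or word not in xrefs:
            return None
        seen.add(word)
        _kind, word = xrefs[word]
    return word


# An application that treats です as a form of だ:
print(resolve("でしょう", XREFS, {"だ", "する", "なさる"}))
# An application that gives です its own table (the cop-desu view):
print(resolve("でしょう", XREFS, {"だ", "です", "する", "なさる"}))
```

The same xref data serves both points of view; only the set of words with a table changes.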


 Most verbs in JMdict are only in dictionary form and not in any
 conjugated forms.  Thus, any conjugations presented to a user
 will need to include the formal forms as conjugations of the
 dictionary form or the formal forms will not be presented at
 all.  (I.e. unlike だ/です there is no 食べます entry from which
 to derive the formal forms of 食べる.)

We have 行ってきます as an expression with a link back to 行って来る.
I guess we have a few of them, though very, very few.
Of course, those are almost never conjugated any more!


 Having a few words such as だ use a different method (present
 only the "plain" conjugations, use an xref to find a "formal"
 pseudo-dictionary form, and then generate conjugations of that)
 seems to be introducing a lot of gratuitous inconsistency.

Oh, no! I never thought to use those links that way.
Only use them from the conjugated word to its dictionary form (which will give the full conjugation table).

 I would not object if です got a dedicated PoS tag like
 "cop-desu" -- I was just wondering how useful it is to be able
 to treat it as a conjugatable word in its own right given
 it has a link to だ, and だ's conjugations would likely have
 です and its conjugations per the reasoning above.

Well, です really is different from any other formal word, because you can find it in dictionaries. One could say it's a dictionary form by itself, and as such should have its own conjugation table.
Point of view!
(Of course ます also has its own entry and conjugation forms in some dictionaries.)

JL