Re: parsing submission data (was: [edict-jmdict] xrefs in WWWJDIC)
On 03/06/07, Stuart McGraw <smcg4191@frii.com> wrote:
Jim Breen wrote:
> Actually [[s_lang="en: the source word"]] would probably be better, just to
> emphasise to the user that the language code is needed.
>
> "from" may well be better than "s_lang", but it might confuse with "trans"
> and "lit". (Many people get those confused.)
The last time this came up,
http://tech.groups.yahoo.com/group/edict-jmdict/message/1503,
http://tech.groups.yahoo.com/group/edict-jmdict/message/1524
you were considering making trans a sense.lsource
attribute, e.g.
<lsource lang="en" translit="soapland">
rather than a tagged gloss
Well, I had __ls_type="wasei"__ in there, as "soapland"
isn't meaningful English...
<gloss lang="en" translit>soapland</gloss>
or gloss-like element
<translit lang="en">soapland</translit>
(BTW, I think the last two are informationally equivalent
and would have identical representations in the database.)
That's still my thinking. I don't want a real gloss to contain broken or
non-English. The 和製英語 from which a loanword is transliterated is of
interest, but it must be seen as information relating to a sense, not a
translation of the Japanese word itself.
If you did go with the first, then it would come down to
teaching people:
1. Use [from....] to specify the foreign language word
or pseudo-word a Japanese word was derived from.
1a. Use [from ... trans*] when that word is not a real
word in the source language.
Yes.
2. Use a gloss to provide the meaning of the Japanese
word in English (or the specified gloss language).
May or may not be the same as the word given in
[from:...] (but will never be the same as [from...trans].)
Yes, but there's no need for a [from="..."] if the 外来語 is from a
real English/etc. word or phrase.
3. Use [lit] for a gloss that is an unusually word-for-word
translation (but still a legitimate word/whatever in the
gloss language) [This is not expressed very clearly
but you get the idea I hope.]
Mostly. Take the recent entry: 鬼の居ぬ間に洗濯. It contains:
"(lit: refreshing oneself while the ogre is gone)." As a gloss
that is uselessly literal, and should not be regarded as a gloss at all,
but rather as information about the background/structure of the Japanese
phrase (most occurrences of "lit:" are for expressions like that.)
I suspect that documenting and teaching people this will
be easier than teaching them some other things, like when
a gloss goes in an existing sense and when it is a new sense.
I think so too.
For a learnable syntax, how about:
[from="fr: avec"] or just [from=it:] in the case of フェットチーネ.
and
[wasei="soaplady"] or [wasei="de: gebroken Deutsch"]
and
[lit="refreshing oneself while the ogre is gone"]
In fact the first two will end up as the same XML element, but with
different attributes.
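
As a rough illustration of the three tag forms above, here is a minimal
Python sketch of how a submission parser might pick them out and map the
first two onto one element. The names parse_tags and to_lsource, the two
regexes, and the default of "en" when no language code is given are all
assumptions made for the example. The lsource element and the lang and
ls_type attributes follow the fragments quoted earlier in this thread,
not any settled DTD.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical tag pattern: [name="value"] or the unquoted form [name=value].
TAG_RE = re.compile(r'\[(from|wasei|lit)=(?:"([^"]*)"|([^\]\s]*))\]')
# A leading two- or three-letter language code followed by a colon.
LANG_RE = re.compile(r'^([a-z]{2,3}):\s*(.*)$')


def parse_tags(text):
    """Return a list of (kind, lang, word) tuples for each tag in text."""
    tags = []
    for m in TAG_RE.finditer(text):
        kind = m.group(1)
        value = m.group(2) if m.group(2) is not None else m.group(3)
        lang, word = None, value
        if kind != "lit":
            lm = LANG_RE.match(value)
            if lm:
                lang, word = lm.group(1), lm.group(2)
            else:
                lang = "en"  # assumption: no code given means English
        tags.append((kind, lang, word or None))
    return tags


def to_lsource(kind, lang, word):
    """Serialise a from/wasei tag as one element; only attributes differ."""
    if kind not in ("from", "wasei"):
        raise ValueError("only from/wasei tags map to lsource")
    el = ET.Element("lsource", {"lang": lang})
    if kind == "wasei":
        el.set("ls_type", "wasei")
    el.text = word
    return ET.tostring(el, encoding="unicode")


if __name__ == "__main__":
    sample = '[from="fr: avec"] [from=it:] [wasei="soaplady"]'
    for tag in parse_tags(sample):
        print(tag, to_lsource(*tag))
```

A [lit="..."] tag comes back from parse_tags with no language and is
deliberately rejected by to_lsource, since it describes the Japanese
phrase rather than a source word.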
Cheers
Jim
--
Jim Breen
Honorary Senior Research Fellow
Clayton School of Information Technology,
Monash University, VIC 3800, Australia
http://www.csse.monash.edu.au/~jwb/