Re: [edict-jmdict] Indicating 高低アクセント
Jim Breen wrote:
> - in Australia? Quite possibly. In a case a few years back
> (http://www.findlaw.com.au/article/6404.htm)
> an attempt to republish the telephone book was struck down, establishing
> a clear difference from the Feist case in the US.
If I were a lawyer, I would argue that a telephone book and a dictionary
are inherently different in nature. The former contains information
that, strictly speaking, _originates_ with the telephone company (after
all, they are the ones who issue, revoke, and change telephone numbers),
while most, if not all, entries that would make it into a common (or
general) dictionary don't originate with the dictionary company but
"belong" to everyone from the get-go. (There is a grey zone: specialized
dictionaries containing expressions that are mostly known only to
practitioners of an arcane art might be considered differently.)
> So do I think there is a significant risk in taking a one-digit code
> representing a fact from approximately 20% of entries where 大辞林 and
> JMdict intersect? No I don't, just as I thought there was no risk when I
> compared JMdict's nouns with those in 大辞林 and looked for a スル tag
> when fleshing out the "vs" tags. Both are cases of factual information
> making up a very small proportion of both works.
That confirms what I suspected: we are dealing with a small proportion
of the work in question. In addition, this is information that does not
belong to anybody in particular and that can be presented in more than
one way (thus one does not need to copy the style or method of another
dictionary even if one uses their data). And if we use those pitch
markers only in those cases where we think they are vital for
differentiation (i.e., in a subset of the entries in the 大辞林), then I
think there is absolutely no problem using the data.
Now, this gets me back to the original topic: yes, I would like to have
some pitch information in the database, but that does (better: should)
not be in the form of LH markings. Using numbers or letters to indicate
that there is a _difference_ in pitch between two entries that otherwise
look the same would be enough to alert me to pay attention to this
matter in the spoken language (where it counts - only there, ne?). The
actual pitch information itself should then be derived from acoustic
sources, i.e., spoken (real-life) samples that I think may/will one day
be part of the database... :-)
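
A rough sketch of the "mark only the difference" idea, in Python. This is
not JMdict's actual format or anybody's real data model: the tuple layout
(headword, reading, code), the function name, and the letter codes are
assumptions made up for illustration. The letters are opaque labels, not
real accent values from 大辞林 or any other dictionary.

```python
from collections import defaultdict


def flag_pitch_contrasts(entries):
    """Return the readings whose entries carry more than one distinct code.

    `entries` is an iterable of (headword, reading, code) tuples. The code
    is an opaque label (hypothetical here); it only needs to differ between
    entries that are pronounced with a different pitch pattern.
    """
    codes_by_reading = defaultdict(set)
    for _headword, reading, code in entries:
        codes_by_reading[reading].add(code)
    # A reading is flagged only if its entries disagree on the code.
    return {r for r, codes in codes_by_reading.items() if len(codes) > 1}


# Illustrative sample only; "a" and "b" are arbitrary labels.
sample = [
    ("箸", "はし", "a"),
    ("橋", "はし", "b"),
    ("机", "つくえ", "a"),
]

print(flag_pitch_contrasts(sample))
```

A reader (or a lookup tool) would then only need to know that はし is
flagged, and could turn to a spoken sample for the actual pitch.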
Regards: Hendrik