Re: [edict-jmdict] Policy on names
On Tue, 13 Jul 2010 11:14:55 +1000, Jim Breen <jimbreen@gmail.com> wrote:
> On 12 July 2010 23:37, Jean-Luc Leger <reiga@dspnet.fr.eu.org> wrote:
>> since you talk about names in JMDict, I have this small file of location
>> names in edict that I extracted some months ago.
>> It may help to draw the line for the grey part.
>> http://dspnet.fr/~reiga/location_proper_nouns.euc
>
> Thanks. I see that some of the entries, e.g.
> 独和 [どくわ] /(n) German-Japanese/
> are not really location names. I see that sort of thing as
> only being in JMdict.
Oh yes, those. Well, it made sense when I compiled that file.
>
>> My point of view regarding names in JMDict is: I don't care which are
>> present and which aren't as long as they are marked as such, so that
>> search tools can decide to choose them or not.
>> You can see this as the first step to merge JMDict and JMNedict.
>
> So some sort of tag. I'll go along with that. We could have a
> "fld" tag of "ne" (named entity), and rely on the pseudo PoS to
> classify them into given, family, organization, etc.
I don't know. To me, a field tag describes the topic of the discussion in
which a word occurs, and any topic can have named entities.
Moreover, don't the 'given', 'organization', ... flags already imply a named
entity?
Still, as a temporary measure, if putting JMNedict tags into JMdict proves
to be difficult, using an 'ne' flag (as a POS?) would be better than doing
nothing.
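
To make the "marked as such" idea concrete, here is a rough sketch in
Python of how a search tool might use such a flag. Everything in it is an
assumption for illustration only: the Entry class, the NAME_TAGS set, the
search() helper and the 東京 sample entry are hypothetical and are not
part of JMdict, JMNedict or any existing tool. Only the 独和 entry comes
from this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical, simplified dictionary entry (not the real JMdict schema)."""
    headword: str
    reading: str
    gloss: str
    tags: set = field(default_factory=set)


# Assumed tag names: 'ne' is the temporary flag discussed above; the others
# stand in for JMNedict-style name types, which would imply a named entity.
NAME_TAGS = {"ne", "given", "family", "organization"}


def is_named_entity(entry):
    """An entry counts as a name if it carries any name-related tag."""
    return bool(entry.tags & NAME_TAGS)


def search(entries, query, include_names=True):
    """Exact match on headword or reading; names can be filtered out."""
    hits = [e for e in entries if query in (e.headword, e.reading)]
    if not include_names:
        hits = [e for e in hits if not is_named_entity(e)]
    return hits


# Sample data: the first entry is quoted earlier in this thread, the second
# is made up to show a tagged name.
ENTRIES = [
    Entry("独和", "どくわ", "German-Japanese", {"n"}),
    Entry("東京", "とうきょう", "Tokyo", {"n", "ne"}),
]

ordinary_only = search(ENTRIES, "東京", include_names=False)
with_names = search(ENTRIES, "東京", include_names=True)
```

With the flag in place, the decision to show or hide names moves entirely
to the search tool, which is all I am asking for.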
JL