RE: [edict-jmdict] Re: Language codes (was: a few more jmdict errors)
Jim Breen wrote:
On 09/04/2008, Stuart McGraw <smcg4191@frii.com> wrote:
> Jim Breen wrote (and Jean-Luc Leger wrote similarly):
> > > The Algonquin language uses the code "alq", not "alg". :-)
> >
> > Not according to
> > http://www.loc.gov/standards/iso639-2/php/code_list.php
> > which is the "home" of that standard.
>
> Ah, I see what happened. You were using ISO-639-2 (1998), I was
> using ISO-639-3 (2007) http://www.sil.org/iso639-3/default.asp,
> (downloadable code table at http://www.sil.org/iso639-3/download.asp)
> which defines Algonquin as "alq".
>
> It is surprising to me that an already established code would be
> changed but that seems to be the case.
It's one of several where 639-2 and 639-3 differ. ara->arb is another.
It's a result of a different treatment of language families, which is
partly why there are two standards.
I looked at the Wikipedia article on ISO-639 and a little at some of
the docs at the SIL site but my eyes started to glaze over pretty fast.
Is the intention that there be two different standards or that -3 is
a successor to -2? (Of course intent and reality are sometimes
different.)
> Any reason for preferring -2 to -3? The latter has quite a few
> more languages and you can never tell when you might want to include
> Zeem language glosses in JMdict. :-) Its downloadable table also
> seems to have more information (e.g. language type and scope
> columns), though that's not directly relevant to jmdict use.
The reason I preferred 639-2 was that it has B codes, whereas
639-3 only has T codes. For most users of EDICT/JMdict I think
chi, ger and tib are likely to be more useful than zho, deu and bod.
Excerpted from the ISO639-3 code table at
http://www.sil.org/iso639-3/download.asp
Id   Part2B  Part2T  Part1  Scope  Type  Ref_Name
zho  chi     zho     zh     M      L     Chinese
deu  ger     deu     de     I      L     German
bod  tib     bod     bo     I      L     Tibetan
I don't know if the "Part2B" column is technically part of the
standard or supplementary information, but as it is distributed
officially with it, that seems like rather a fine point. (I did
not, however, look at all the entries to see whether all the B
codes are the same in -3 as in -2.)
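
For what it's worth, pulling the B codes out of that table could be
scripted. Below is a minimal sketch in Python, assuming the downloadable
table is tab-delimited and uses the column names shown in the excerpt
above. The sample data is just the three rows quoted there, and
read_rows, b_code_map and differing_codes are hypothetical helper names,
not anything supplied with the standard.

```python
import csv
import io

# Sample rows copied from the excerpt above. The tab-delimited layout and
# the exact header names are assumptions about the downloadable table.
SAMPLE = (
    "Id\tPart2B\tPart2T\tPart1\tScope\tType\tRef_Name\n"
    "zho\tchi\tzho\tzh\tM\tL\tChinese\n"
    "deu\tger\tdeu\tde\tI\tL\tGerman\n"
    "bod\ttib\tbod\tbo\tI\tL\tTibetan\n"
)


def read_rows(table_text):
    """Parse the table text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(table_text), delimiter="\t"))


def b_code_map(table_text):
    """Map each 639-3 Id to its Part2B code, skipping rows without one."""
    return {
        row["Id"]: row["Part2B"]
        for row in read_rows(table_text)
        if row["Part2B"]
    }


def differing_codes(table_text):
    """List (T code, B code, name) for rows where the B and T codes differ."""
    return [
        (row["Part2T"], row["Part2B"], row["Ref_Name"])
        for row in read_rows(table_text)
        if row["Part2B"] and row["Part2B"] != row["Part2T"]
    ]


if __name__ == "__main__":
    for t_code, b_code, name in differing_codes(SAMPLE):
        print(f"{name}: T={t_code} B={b_code}")
```

Checking those B codes against the 639-2 list itself would need that
list as a second input, which is not shown here.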