
Re: [edict-jmdict] German glosses in JMdict



On 10/09/2007, Francis Bond <bond@ieee.org> wrote:
>  I am a little confused about how the German glosses appear in JMdict:
>  they seem to be split without concern for parenthesis:
>  E.g.,
>  <entry>
>  <ent_seq>1117660</ent_seq>
>  <r_ele>
>  <reb>プロトアクチニウム</reb>
>  <re_pri>gai1</re_pri>
>  </r_ele>
>  <sense>
>  <pos>&n;</pos>
>  <gloss>protoactinium (Pa)</gloss>
>  <gloss xml:lang="ru">протакти́ний (Pa)</gloss>
>  <gloss xml:lang="de">{Chem.}</gloss>
>  <gloss xml:lang="de">Protactinium</gloss>
>  <gloss xml:lang="de">(beim natürlichen Zerfall von Uran entstehendes
>  radioaktives Metall</gloss>
>  <gloss xml:lang="de">Zeichen: Pa)</gloss>
>  </sense>
>  </entry>
>
>  I would expect this to be either:
>  <gloss xml:lang="de">(beim natürlichen Zerfall von Uran entstehendes
>  radioaktives Metall; Zeichen: Pa)</gloss>
>  or possibly
>  <gloss xml:lang="de">(beim natürlichen Zerfall von Uran entstehendes
>  radioaktives Metall)</gloss>
>  <gloss xml:lang="de">(Zeichen: Pa)</gloss>
>
>  But the current split seems a little odd.

These problems largely stem from the use of Hans-Joerg Bibiko's
EDICTification of WaDoku, in which he put "/" all over the place. I
have now edited it to drop the split between the domain tag and the first
gloss. I have also dropped the split in front of "Zeichen: " in the
entries for elements, so I think it will appear as per your first
preference.
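
For illustration only, here is a minimal Python sketch of the kind of
rejoining described above. It assumes the German glosses of one sense are
already available as a list of strings. The function name rejoin_glosses,
the "; " joiner and the domain-tag pattern are hypothetical choices for this
example, not the procedure actually used on the file.

```python
import re

# Assumption: a domain tag is a gloss consisting solely of one {...} group,
# e.g. "{Chem.}". This pattern is illustrative, not taken from any JMdict tool.
DOMAIN_TAG = re.compile(r"^\{[^{}]*\}$")


def rejoin_glosses(glosses, joiner="; "):
    """Rejoin glosses that were split after a domain tag or inside parentheses."""
    merged = []
    depth = 0  # number of "(" still unclosed in the glosses seen so far
    for gloss in glosses:
        if merged and depth > 0:
            # Still inside an open parenthesis: this piece belongs to the
            # previous gloss, so glue it back on.
            merged[-1] += joiner + gloss
        elif merged and DOMAIN_TAG.match(merged[-1]):
            # The previous gloss is a bare domain tag: attach it to this gloss.
            merged[-1] += " " + gloss
        else:
            merged.append(gloss)
        # Update the open-parenthesis count; never let it go negative.
        depth = max(depth + gloss.count("(") - gloss.count(")"), 0)
    return merged


split = [
    "{Chem.}",
    "Protactinium",
    "(beim natürlichen Zerfall von Uran entstehendes radioaktives Metall",
    "Zeichen: Pa)",
]
for gloss in rejoin_glosses(split):
    print(gloss)
```

Counting parentheses rather than matching on "Zeichen: " keeps the sketch
general, but it would also rejoin pieces that were split deliberately inside
a parenthesis, so any real use would need checking against the data.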

While I was at it, I changed the internal gender tags to (f), (m) and (n).

High time that file was rebuilt totally.

>  P.S. I first noticed this in がり, which has other problems (^_^).
>  <entry>
>  <ent_seq>1003340</ent_seq>
>  <r_ele>
>  <reb>がり</reb>
>  </r_ele>
>  <sense>
>  <pos>&n;</pos>
>  <gloss>sliced ginger prepared in vinegar (served with sushi)</gloss>
>  <gloss>pickled ginger</gloss>
>  <gloss xml:lang="de">{Persönlichk.}</gloss>
>  <gloss xml:lang="de">Boutros Boutros Ghali</gloss>
>  <gloss xml:lang="de">(ägyptischer Diplomat</gloss>
>  <gloss xml:lang="de">1922-</gloss>
>  <gloss xml:lang="de">1992-96 Generalsekretär der UNO)</gloss>
>  </sense>
>  </entry>

Yes, an unfortunate matching with WaDoku's "ガリ" entry. Dropped.

Thanks

Jim

PS: I made matching edits to the "jddict" file used by WWWJDIC.

-- 
Jim Breen
Honorary Senior Research Fellow
Clayton School of Information Technology,
Monash University, VIC 3800, Australia
http://www.csse.monash.edu.au/~jwb/