
Tanaka Corpus



Greetings,

I am sorry to announce that Paul Blay has had to give
up maintaining the Tanaka Corpus and the associated
indices. At present it is all back with me.

I'd like to thank Paul for the immense amount of excellent
work he has put into this task over the last few years. He
volunteered to look after it while I was travelling in Europe, and
then he stayed with it. He didn't just keep my simple text-based
system running; he loaded it all into a database and did
extensive work extending and validating the indices. The quality
of the sentences and the indices has improved significantly
under his care.

Paul has passed on to me the databases, macros, scripts, etc. that
he used. If there is anyone out there who would like to succeed Paul
in this task, I'd LOVE to hear from you. Paul's database, etc. is in
Access, but any database would do once things have been converted.

Another avenue that should be explored is interfacing better with
Trang Ho's "Tatoeba" project (http://tatoeba.fr/), which is
extending the Tanaka sentences to other languages. Trang's site
allows online editing of the sentences, and she has added some
links back to WWWJDIC.

In the interim, I will just be hand-editing the text files here.
If/when a different system is used, or if anyone wants to run with
the database, I can do a "diff" to pick up the changes I have made
in the meantime.

Thanks again, Paul, for the great work.

Cheers

Jim

-- 
Jim Breen
Honorary Senior Research Fellow
Clayton School of Information Technology,
Monash University, VIC 3800, Australia
http://www.csse.monash.edu.au/~jwb/