Difference between revisions of "Main Page"

From EDRDG Wiki
m (Reverted edits by Cecdes (talk) to last revision by Scott)
==Electronic Dictionary Research and Development Group==

Welcome to the Wiki of the [[About EDRDG|Electronic Dictionary Research and Development Group]]. The Wiki is being developed as a repository of information and documentation about the Group's projects.
  
==The JMdict/EDICT Project==

This project aims to build a freely usable general Japanese dictionary file. It began in 1991 with the EDICT Japanese-English file in a simple format, and in 1999 expanded into the XML-format JMdict file. Since then the file has been maintained by Jim Breen in a mark-up system from which the JMdict file (in both English and multiple-language editions), the EDICT file, and the extended EDICT2 file have been generated. Public input into the project has been mainly via WWW forms incorporated in the WWWJDIC server. New editions of the files have been generated daily.

Some useful links are:
*the [http://www.csse.monash.edu.au/~jwb/j_jmdict.html overview documentation of the JMdict file]
*the [http://www.csse.monash.edu.au/~jwb/edict.html overview documentation of the EDICT file]
*the main [http://www.csse.monash.edu.au/~jwb/edict_doc.html documentation of the JMdict/EDICT dictionary files]
*the [http://www.csse.monash.edu.au/~jwb/edrdg/licence.html licence statement for use of the projects' files]. This licence also applies to the contents of this Wiki.
*lists of [[JMdictEDICT_software|packages and servers]] using the JMdict/EDICT files
*the [[editorial policy]] and guidelines for the JMdict/EDICT files (under development)
*an [[Entries Under Development]] page, where people can place incomplete words and phrases for later filling out to become full entries.

==JMdictDB Database==

The maintenance of the JMdict/EDICT dictionary files is now handled by the online JMdict Database (JMdictDB) system developed by Stuart McGraw since June 2010. For more information see:
* an [[JMdictDB Project|overview]] of the database;
* Stuart's [http://edrdg.org/~smg/ summary page];
* the [http://edrdg.org/jmdictdb/cgi-bin/edhelpq.py quick overview] to editing entries;
* the [http://edrdg.org/jmdictdb/cgi-bin/edhelp.py full help file] for editing entries.
  
==The Tanaka Corpus==

This project maintains and extends the [[Tanaka Corpus]], a large collection of parallel Japanese/English sentence pairs.

The Corpus is now maintained within the [http://tatoeba.org/home Tatoeba Project]. This project has extended the file to include many other languages, and many sentences are available in three or more languages. The project WWW site has extensive facilities for searching and editing the sentences, and has an active community of people entering and editing sentences.
  
==The KANJIDIC Project==

The KANJIDIC project has compiled files of data on kanji used in Japanese text processing. The files cover the kanji in three Japanese standards:
* JIS X 0208-1998, which includes 6,355 kanji;
* JIS X 0212-1990, which includes an extra 5,801 kanji;
* JIS X 0213-2004, which extends JIS X 0208, overlaps with some of JIS X 0212, and adds 884 extra kanji.
  
Three data files are distributed by this project:
* the KANJIDIC2 file, which is in XML format and contains all the kanji. For this file the following information is available:
** a project [http://www.csse.monash.edu.au/~jwb/kanjidic2/ overview page]
** a file [http://www.csse.monash.edu.au/~jwb/kanjidic2/kanjidic2_ov.html overview]
** the [http://www.csse.monash.edu.au/~jwb/kanjidic2/kanjidic2_dtdh.html DTD]
** a [http://www.csse.monash.edu.au/~jwb/kanjidic2/kd2examph.html sample entry]
* the KANJIDIC file, which covers the 6,355 kanji in JIS X 0208. For this there is the
** [http://www.csse.monash.edu.au/~jwb/kanjidic2/ overview page]
** [http://www.csse.monash.edu.au/~jwb/kanjidic_doc.html original documentation]
* the KANJD212 file, which covers the 5,801 kanji in JIS X 0212. For this there is the
** [http://www.csse.monash.edu.au/~jwb/kanjd212_doc.html original documentation]
  
==The COMPDIC Project==

The COMPDIC project involved the compilation of a glossary of terms used in the computing and telecommunications industries. The file was in the "EDICT" format. See the [http://ftp.monash.edu.au/pub/nihongo/compdic_doc.html brief documentation].

In 2008 the entries in the COMPDIC file were included in the JMdict/EDICT file. While it is no longer maintained as a separate file, an extract of the entries relating to computing and telecommunications is still generated.
  
==The ENAMDICT/JMnedict Project==

The ENAMDICT file contains about 720,000 proper names in Japanese. It is in EDICT format, with some special tags to indicate the type of proper name. It is also available in XML format as the Japanese-Multilingual named entity dictionary (JMnedict). There is a basic [http://www.csse.monash.edu.au/~jwb/enamdict_doc.html documentation page].
  
==The KRADFILE/RADKFILE Project==

This project provides a decomposition of kanji into a number of visual elements or radicals to support software which provides a lookup service using kanji components.

There is an [http://www.csse.monash.edu.au/~jwb/kradinf.html information page] about the files.
  
==The WWWJDIC Dictionary Server==

* [http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwjdic.cgi?1C home page] of the server (at Monash).
* [http://www.csse.monash.edu.au/~jwb/wwwjdicinf.html User's Guide]
* [[WWWJDIC in Japanese]] project
* [[Common words]] - the 850 common words from Ogden's list. These are to be used to enhance English-Japanese lookups.
  
==Wishlist==

This is a set of [[wishlist]] items for the various projects. Feel free to add suggestions.

There is also an old [http://www.csse.monash.edu.au/~jwb/edictredev/edictwishlist.html wishlist page]. Some of the items in this section have been copied from it.
  
==Mailing List==

There is a [http://tech.groups.yahoo.com/group/edict-jmdict/ mailing list] for people engaged in the EDRDG projects.
  
==How Can I Help?==

From time to time people ask how they can best contribute to the projects. There are many ways of assisting, the main ones being:
* adding to and enhancing the main (EDICT/JMdict) dictionary file. This is best done by using the [http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwjdic.cgi?17 New Entry/Amendment] page of WWWJDIC.
* adding extra Japanese-English sentence pairs to the collection based on the Tanaka Corpus. There is a [http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwjdic.cgi?14 New Examples] function in WWWJDIC for this.
* assisting with the translation of the WWWJDIC interface into other languages. At present the priority is to make it fully available in Japanese. See the [[WWWJDIC in Japanese]] page.
* working through the lists of words Paul Blay has placed on the [[Talk:Tanaka_Corpus]] page, which could become new dictionary entries.
* joining and participating in the [http://tech.groups.yahoo.com/group/edict-jmdict/ mailing list] for people engaged in the EDRDG projects.
 
 

Revision as of 15:41, 28 January 2011

Electronic Dictionary Research and Development Group

Welcome to the Wiki of the Electronic Dictionary Research and Development Group. The Wiki is being developed as a repository of information and documentation about the Group's projects.

The JMdict/EDICT Project

This project aims to build a freely usable general Japanese dictionary file. It began in 1991 with the EDICT Japanese-English file in a simple format, and in 1999 expanded into the XML-format JMdict file. Since then the file has been maintained by Jim Breen in a mark-up system from which the JMdict file (in both English and multiple-language editions), the EDICT file, and the extended EDICT2 file have been generated. Public input into the project has been mainly via WWW forms incorporated in the WWWJDIC server. New editions of the files have been generated daily.

Some useful links are:

JMdictDB Database

The maintenance of the JMdict/EDICT dictionary files is now handled by the online JMdict Database (JMdictDB) system developed by Stuart McGraw since June 2010. For more information see:

The Tanaka Corpus

This project maintains and extends the Tanaka Corpus, a large collection of parallel Japanese/English sentence pairs.

The Corpus is now maintained within the Tatoeba Project. This project has extended the file to include many other languages, and many sentences are available in three or more languages. The project WWW site has extensive facilities for searching and editing the sentences, and has an active community of people entering and editing sentences.

The KANJIDIC Project

The KANJIDIC project has compiled files of data on kanji used in Japanese text processing. The files cover the kanji in three Japanese standards:

  • JIS X 0208-1998, which includes 6,355 kanji.
  • JIS X 0212-1990, which includes an extra 5,801 kanji.
  • JIS X 0213-2004, which extends JIS X 0208, overlaps with some of JIS X 0212, and adds 884 extra kanji.

Three data files are distributed by this project:

The COMPDIC Project

The COMPDIC project involved the compilation of a glossary of terms used in the computing and telecommunications industries. The file was in the "EDICT" format. See the brief documentation.

In 2008 the entries in the COMPDIC file were included in the JMdict/EDICT file. While it is no longer maintained as a separate file, an extract of the entries relating to computing and telecommunications is still generated.

The ENAMDICT/JMnedict Project

The ENAMDICT file contains about 720,000 proper names in Japanese. It is in EDICT format, with some special tags to indicate the type of proper name. It is also available in XML format as the Japanese-Multilingual named entity dictionary (JMnedict). There is a basic documentation page.

The KRADFILE/RADKFILE Project

This project provides a decomposition of kanji into a number of visual elements or radicals to support software which provides a lookup service using kanji components.

There is an information page about the files.
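
The idea of component-based lookup can be sketched in a few lines of Python. This is only an illustration: the small decomposition table and the function names below are assumptions made for the example, and they are not taken from the project's files, whose actual format is described on the information page.

```python
from collections import defaultdict

# Hypothetical sample data (kanji -> visual components), simplified for
# illustration; it is NOT copied from the KRADFILE/RADKFILE files.
KANJI_COMPONENTS = {
    "明": {"日", "月"},
    "時": {"日", "土", "寸"},
    "村": {"木", "寸"},
    "林": {"木"},
}


def build_component_index(decomposition):
    """Invert a kanji -> components table into component -> kanji."""
    index = defaultdict(set)
    for kanji, components in decomposition.items():
        for component in components:
            index[component].add(kanji)
    return index


def lookup_by_components(index, selected):
    """Return the set of kanji that contain every selected component."""
    selected = list(selected)
    if not selected:
        return set()
    # Start from the kanji containing the first component, then narrow down.
    result = set(index.get(selected[0], set()))
    for component in selected[1:]:
        result &= index.get(component, set())
    return result


if __name__ == "__main__":
    index = build_component_index(KANJI_COMPONENTS)
    print(sorted(lookup_by_components(index, ["木", "寸"])))
```

Keeping one table in each direction mirrors the two file names: one direction answers "which components make up this kanji?", while the inverted index answers "which kanji contain these components?", which is the question a lookup service needs to answer quickly.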

The WWWJDIC Dictionary Server

  • Common words - the 850 common words from Ogden's list. These are to be used to enhance English-Japanese lookups.

Wishlist

This is a set of wishlist items for the various projects. Feel free to add suggestions.

There is also an old wishlist page. Some of the items in this section have been copied from it.

Mailing List

There is a mailing list for people engaged in the EDRDG projects.

How Can I Help?

From time to time people ask how they can best contribute to the projects. There are many ways of assisting, the main ones being:

  • adding to and enhancing the main (EDICT/JMdict) dictionary file. This is best done by using the New Entry/Amendment page of WWWJDIC.
  • adding extra Japanese-English sentence pairs to the collection based on the Tanaka Corpus. There is a New Examples function in WWWJDIC for this.
  • assisting with the translation of the WWWJDIC interface into other languages. At present the priority is to make it fully available in Japanese. See the WWWJDIC in Japanese page.
  • working through the lists of words Paul Blay has placed on the Talk:Tanaka_Corpus page, which could become new dictionary entries.
  • joining and participating in the mailing list for people engaged in the EDRDG projects.