Difference between revisions of "JMdictDB Project"

From EDRDG Wiki

Latest revision as of 04:10, 17 September 2017

JMdictDB Database Project

Overview

The JMdictDB online database has been developed by Stuart McGraw to support the maintenance of the JMdict/EDICT, JMNEdict/ENAMDICT and other dictionary files originally compiled by Jim Breen. From May 2010 the JMdict/EDICT file has been maintained using the database, with full public access enabled in July 2010.

Access to the database is in several forms:

  • the WWWJDIC servers link directly to the edit screen of the JMdictDB system when a user wishes to add a new entry or amend an existing entry.
  • other servers using the JMdict/EDICT file are encouraged to offer similar links.
  • the JMdictDB system's own search/lookup screens. These can look up entries using Japanese words, English words and the entries' sequence numbers. There is a basic search screen and an advanced search screen.

Users are able to propose new entries and edit existing entries. New entries and amended entries are held as "pending" until approved by one of the editors working with the project. User submissions can be viewed on the View Updates page (http://www.edrdg.org/jmdictdb/databaseupdates.html). Approved changes can also be seen in the summary of differences between the daily editions of the EDICT file.

The contents of the JMdictDB database are released daily as the current JMdict and EDICT dictionary files, and are automatically added to the WWWJDIC dictionary server.

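For more information, see:

  • the quick overview (http://edrdg.org/jmdictdb/cgi-bin/edhelpq.py) of entering/editing entries;
  • the full help file (http://edrdg.org/jmdictdb/cgi-bin/edhelp.py) for entering/editing entries;
  • the Editorial Policy page;
  • Stuart's information page (http://edrdg.org/~smg/);
  • a presentation/software demonstration Jim Breen gave at the 2013 AsiaLex conference in Bali (abstract: http://www.edrdg.org/~jwb/2013asialexJMdictDB.pdf, OHPs: http://www.edrdg.org/~jwb/2013asialexpres.pdf).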

Processing Flow

User Creates/Amends an Entry

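Users can enter a new entry using the entry form (http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&c=1&sid=), or edit an existing entry using the same form preloaded with the entry details (for example, http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&sid=&q=1413240). For existing entries, the form can be reached via a WWW server such as WWWJDIC, or via the system's own search form (http://www.edrdg.org/jmdictdb/cgi-bin/srchformq.py?svc=jmdict&sid=).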

On completing the entry/edit, the user clicks on the Next button, which leads to the confirmation page. If the entry is satisfactory, the user can click on the Submit button, which will mark the entry as Pending and queue it for consideration by an editor. At this stage a new entry will be allocated a Sequence Number which can be used for tracking its progress.

Editor Verifies Entry

An editor can view Pending entries and either approve them, perhaps after some modification, or in rare cases reject them. The editing process will be faster if the submission is accompanied by references such as dictionary extracts, quoted text, WWW site URLs, etc. The progress of an entry can be tracked by using the links on the View Updates page.

Dictionary Distribution

Once each day the dictionary database (approved entries only) is converted to an XML file from which the distribution formats (JMdict, EDICT2, EDICT, etc.) are generated. These are placed on the Monash FTP server and loaded into the Monash WWWJDIC server, from which the other WWWJDIC servers will progressively update their files.
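The flow described in this section can be sketched as a small state model. This is only an illustration of the steps in the text: the class, function and state names are hypothetical, the sequence number is made up, and none of it reflects how JMdictDB itself stores entries.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class State(Enum):
    """Simplified lifecycle states (illustrative names, not JMdictDB's schema)."""
    DRAFT = "draft"        # being filled in on the entry form
    PENDING = "pending"    # submitted, waiting for an editor
    APPROVED = "approved"  # accepted by an editor
    REJECTED = "rejected"  # declined by an editor


@dataclass
class Entry:
    text: str
    seq: Optional[int] = None   # None until a sequence number is allocated
    state: State = State.DRAFT


def submit(entry: Entry, next_seq: int) -> Entry:
    """User clicks Submit: mark as Pending; a new entry gets a sequence number."""
    if entry.state is not State.DRAFT:
        raise ValueError("only a draft can be submitted")
    if entry.seq is None:
        entry.seq = next_seq
    entry.state = State.PENDING
    return entry


def review(entry: Entry, approve: bool, amended_text: Optional[str] = None) -> Entry:
    """Editor approves (perhaps after modification) or rejects a Pending entry."""
    if entry.state is not State.PENDING:
        raise ValueError("only a pending entry can be reviewed")
    if amended_text is not None:
        entry.text = amended_text
    entry.state = State.APPROVED if approve else State.REJECTED
    return entry


def daily_release(entries: List[Entry]) -> List[Entry]:
    """Only approved entries go into the daily distribution."""
    return [e for e in entries if e.state is State.APPROVED]
```

In this sketch a rejected or still-pending entry never reaches `daily_release`, which mirrors the "approved entries only" rule above.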

Viewing Current and Previous Edits

If you wish to see what new entries or amendments are currently being processed, and recently-approved changes, there are several ways this can be done:

(a) you can get a display of all the not-yet-approved new entries and amendments. To do this:

- go to the Advanced Search Form (http://www.edrdg.org/jmdictdb/cgi-bin/srchform.py?svc=jmdict&sid=)
- check the Active, Deleted and Rejected boxes in "Status", and the Unapproved box in "Approved".
- click on the Search button

(b) you can use the Advanced Search Form to display all the approved and unapproved changes for a given day. To simplify this we have a View Updates Page where you can access this information by clicking on the day you wish to see.

(c) each day a "differences" page is created showing the old and new entries side-by-side. These are in the "EDICT2" format, but are still a handy way of seeing the additions, amendments and deletions. Go to the folder containing these files (http://www.csse.monash.edu.au/~jwb/edictdiffs/) and click on the date you wish to see.

Interface from Other Systems

WWW servers and web-enabled devices using the JMdict or EDICT2 versions of the dictionary can link directly to edit screens in the JMdictDB system using the Entry Sequence Number in each entry. This is in the <ent_seq> element in the JMdict version and in the "EntLnnnnnnn" field at the end of each EDICT2 entry. The URL to use is:

http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&sid=&q=nnnnnnn where nnnnnnn is the sequence number

Using that URL results in an entry edit screen being loaded with the current contents of the entry.
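As a minimal sketch of how a client might build such a link, the following pulls the sequence number out of an EDICT2 line and fills it into the URL given above. The function names are hypothetical, and the regular expression assumes the field is literally "EntL" followed by digits, as described in the text; the sample sequence number in the usage line is made up.

```python
import re
from typing import Optional

# URL template taken from the text above; {seq} stands for nnnnnnn.
EDIT_URL = "http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&sid=&q={seq}"

# Assumption: the EDICT2 field is "EntL" followed by the sequence digits.
_ENTL = re.compile(r"EntL(\d+)")


def edict2_seq(line: str) -> Optional[int]:
    """Return the sequence number from an EDICT2 line, or None if absent."""
    matches = _ENTL.findall(line)
    # The field sits at the end of the entry, so take the last match.
    return int(matches[-1]) if matches else None


def edit_url(seq: int) -> str:
    """Build the edit-screen URL for an existing entry."""
    return EDIT_URL.format(seq=seq)


if __name__ == "__main__":
    sample = "何か [なにか] /(exp) something/EntL1234567/"  # made-up number
    seq = edict2_seq(sample)
    if seq is not None:
        print(edit_url(seq))
```

For a JMdict XML source, the same `edit_url` helper would be given the text of the entry's <ent_seq> element instead.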

Complete new entries can be submitted in the EDICT or EDICT2 format. For example, to submit the entry: "何か [なにか] /(exp) something/", the URL to use is:

http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&c=1&j=何か.......

The entry must be in UTF-8 encoding, and the Japanese and space characters must be "URL encoded", e.g. "%E4%BD%95%E3%81%8B%20[%E3%81%AA%E3%81%AB%E3%81%8B]%20/%28exp%29%20something/".
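The encoding step can be reproduced with Python's standard library. The sketch below assumes the whole encoded entry is passed as the value of the `j` parameter, which is how the truncated URL above appears to work; the function name is hypothetical. Brackets and slashes are left unencoded only to match the example string in the text.

```python
from urllib.parse import quote

# Base of the new-entry URL from the text above, up to the "j=" parameter.
NEW_ENTRY_URL = "http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&c=1&j="


def new_entry_url(edict_entry: str) -> str:
    """URL-encode an EDICT-format entry (as UTF-8) and append it as 'j'."""
    # quote() encodes str input as UTF-8 by default; "/", "[" and "]" are
    # kept literal so the result matches the example given in the text.
    encoded = quote(edict_entry, safe="/[]")
    return NEW_ENTRY_URL + encoded


if __name__ == "__main__":
    print(new_entry_url("何か [なにか] /(exp) something/"))
```

A stricter client could pass `safe=""` to encode the brackets and slashes as well; whether the form accepts that variant is not stated in the text.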