JMdict/EDICT Editorial Policy and Guidelines
Note: This policy is being redrafted to align it with the new JMdictDB system being phased in for additions and amendments.
These guidelines are intended for people preparing new entries or amendments for the JMdict/EDICT files. Typically these entries or amendments will be made via the JMdictDB on-line database system.
Before proposing a new entry or an amendment, you should:
- familiarize yourself with the style of the dictionary, particularly the way the English meanings are typically worded;
- make very sure it is not already an entry. An amazing number of "new" entries turn out to be in the dictionary already, or variants of existing entries;
- check you have written it correctly. Has it the correct kanji? Is the reading correct, with the vowel length right, ず/づ issues resolved, etc.?
- verify the source. There are excellent online dictionaries available, e.g. the Sanseido dictionaries at the Goo site. The Eijiro dictionary at the ALC site is also useful. If the word or phrase can't be found in a dictionary, WWW references to where it is used may suffice, but the meaning and context must be clear. Dictionary and other reference information should be included in the "Reference" section of the form.
- verify that the word or phrase is common enough to include in the dictionary. Page counts for Google or Yahoo are useful for this purpose. In general unless a word or phrase has more than about 50 hits on the WWW, it is not worth submitting.
Dictionary Entry Fields
The Kanji section of the entry form contains the form of the Japanese word/phrase which contains kanji, special characters or letters from non-Japanese scripts (e.g. ＭＰ３プレーヤー). The word/phrase should be written in full-width characters (e.g. it is not MP3プレーヤー).
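As an illustration of the full-width requirement, the following is a minimal sketch of how a submitter might normalize a headword before entering it. The function name `to_fullwidth` is hypothetical and is not part of JMdictDB; the sketch assumes that every printable ASCII character should be converted, and it deliberately leaves spaces and half-width katakana untouched.

```python
def to_fullwidth(text: str) -> str:
    """Convert printable ASCII characters to their full-width forms.

    Hypothetical helper for illustration only; not part of JMdictDB.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if 0x21 <= code <= 0x7E:
            # '!'..'~' map onto U+FF01..U+FF5E at a fixed offset of 0xFEE0.
            out.append(chr(code + 0xFEE0))
        else:
            # Kana, kanji, spaces and anything else are passed through unchanged.
            out.append(ch)
    return "".join(out)


print(to_fullwidth("MP3プレーヤー"))
```

Applied to the example above, this turns the half-width MP3プレーヤー into the preferred ＭＰ３プレーヤー.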
There may be more than one version of the word or phrase in this section. The usual reasons for having more than one version (also known as "surface forms" or "orthographical variants") are:
- alternative kanji in the word, e.g. 合気道 and 合氣道
- variations in okurigana, e.g., 生け花 and 生花
- part of a word being written either in kanji or kana, e.g., 言い付ける and 言いつける
Where there are multiple forms of a word, enter them with the most commonly used form first, and then order them in decreasing frequency of use.
Synonyms should not be included here. Instead they should be entered as separate dictionary entries, and a cross-reference inserted to them.
Some other points to note:
- in the case of na-adjectives (形容動詞), the な is NOT included in the entry (some Japanese dictionaries include it.) Use a part-of-speech of "adj-na".
- as most adverbs are derived from either regular adjectives (く form) or na-adjectives (に), there is no need to have an entry unless the adverb is not apparent from the adjective.
- for verbs formed from adding する to a noun, do not include the する in the headword - instead use the part-of-speech of "vs". The exception to this is the group of single-kanji-plus-する verbs such as 愛する. For these include the complete verb and use the "vs-s" part-of-speech.
- for adverbs that are indicated by と, e.g. まざまざと, do not include the と, instead note the part-of-speech as "adv-to".
- for adjectives that use たる (and と in the adverbial form), e.g. 依然たる, 依然と, omit the たる and と and use "adj-t" as the part-of-speech.
- for the -さ (-ness) and -く (adverb) inflections of adjectives, only include them if the meaning is not obvious from the gloss of the adjective itself.
A set of tags, e.g. iK or oK, can be applied to the words in this section. These should be used sparingly.
In the Reading section of the entry form, enter either the reading(s) of the word/phrase in the Kanji section, or the word itself if it is written only in kana. Readings associated with kanji should normally be in hiragana; the main exceptions being:
- Chinese or Korean words and names, which are often transliterated using katakana;
- the names of biological species which should be entered in both katakana and hiragana (if there is also a kanji form.)
More than one reading can be entered where alternatives are possible. This can occur when
- a kanji has alternative readings;
- where there are different transliterations of 外来語, e.g., ダイヤモンド and ダイアモンド;
- where a species name is being recorded; in these cases both hiragana and katakana forms should be entered. The katakana form must have "[nokanji]" after it to indicate that it is used without the kanji form, and a "[uk]" should be included in the Meanings field. Place the hiragana form first (client software such as WWWJDIC will display the katakana form first.)
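The species-name rule above asks for a hiragana reading followed by a katakana reading tagged "[nokanji]". A minimal sketch of deriving that pair from the hiragana form is shown below. The function names are hypothetical, and the sketch returns the readings as a plain list rather than assuming any particular separator used by the entry form.

```python
def hiragana_to_katakana(reading: str) -> str:
    """Shift hiragana (U+3041..U+3096) to the matching katakana code points."""
    out = []
    for ch in reading:
        code = ord(ch)
        if 0x3041 <= code <= 0x3096:
            # The katakana block sits 0x60 code points above the hiragana block.
            out.append(chr(code + 0x60))
        else:
            out.append(ch)
    return "".join(out)


def species_readings(hiragana: str) -> list:
    """Return the hiragana form first, then the katakana form with [nokanji]."""
    return [hiragana, hiragana_to_katakana(hiragana) + "[nokanji]"]


print(species_readings("ぜにがたあざらし"))
```

The "[uk]" tag still has to be added to the Meanings field by hand; this sketch only covers the readings.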
Where alternative readings are restricted to particular variants of the kanji form, specify this using the [restr=KKK] pattern after the reading. As in the Kanji section, place the more common reading(s) first.
If a 外来語 (e.g. ベースボール) means the same as a native Japanese word (e.g. 野球), do not include the 外来語 form as a reading of the kanji. Instead create a separate entry and create cross-references between them. Similarly if two kana-only words have the same meaning, do not place them in the same entry unless they are related, e.g. spelling or pronunciation variants.
If the kanji part contains katakana (e.g. 一眼レフ), use katakana in the Reading as well for the matching portion (いちがんレフ).
A set of tags, e.g. ik or ok, can be applied to the words in this section. These should be used sparingly.
The Meanings section of the entry form is divided into senses, i.e. distinct meanings. These are indicated by a sense number: [1], [2], etc. Each sense can have a number of part of speech tags (POS), e.g. [n], [adj-i] and miscellaneous tags, e.g. [abbr] and [col].
The meanings consist of one or more short translations of the Japanese word or phrase.
- do not copy translations, especially longer ones, directly from other dictionaries. For simple terms there may not be much in the way of alternatives, but for longer explanations use your own words, reword things, etc. Significant copying carries a risk of charges of plagiarism or copyright violation.
- where the Japanese has more than one distinct meaning, break the section into senses.
- make each translation a separate item, i.e. place a ";" between them. This makes reverse look up and exact match on the English possible. Some examples:
- abbreviations: "three letter acronym; TLA" not "three letter acronym (TLA)"
- conjunctions: "rice field; rice paddy" not "rice field or paddy"
- make sure the meaning agrees with the part-of-speech. If the Japanese word is a noun, don't make the translation a verb (e.g. to xxxx)
- when entering a verb, use the infinitive in English (to run, to jump, etc.)
- for adjectives, the English entry should be just the adjective, not the adjective and copula:
- "lucky" not "be lucky" or "is lucky"
- where different forms of English use different terms, include all major variants (e.g. both "snow pea" and "mange tout" or "tap" and "faucet".)
- include both "British" and "American" spellings. For short meanings it is better to repeat the meaning with the alternative spelling; however, it is also acceptable to just put the alternative at the end in parentheses, e.g. "full colour (color)". Do not use patterns such as "colo(u)r" as they can't be searched for successfully.
- do not use capital letters unless referring to a proper name (person, place, etc.)
- do not precede the meaning with the articles "a", "an" or "the" unless it is absolutely necessary to make the meaning clear.
- when entering the scientific name of a plant, animal, etc. put it in brackets after the first common English name, e.g. "spectacled bear (Tremarctos ornatus)". Note that the first word of the scientific name will have a capital letter. (See the note on "Names of biological species" below.)
- put any context in brackets, e.g.: "consulting (the oracle)" not "consulting the oracle".
- when indicating a field or domain for an entry, e.g., "comp" or "ling", state it using the [fld=xxxx] pattern. For example:
- [fld=comp] floating-point
- when using "e.g." to expand on the meaning of a word by giving examples, or when using "i.e." to qualify the meaning of a word, place the expansion in parentheses after the initial translation. For example say "hand game (e.g. rock, paper, scissors)", not "hand game, e.g. rock, paper, scissors". Also, do not include a comma after e.g. or i.e.
- where the part-of-speech includes "vs", i.e. the Japanese word will function as a verb with the addition of する, there are two options:
- enter the English meaning just as a noun or participle with a POS of (n,vs); not as a verb. See the 料理 entry, which has: "cooking; cookery; cuisine", not "cook".
- have one sense with a POS of "n" and a second sense with a POS of "vs" in which meaning can be "to ...".
(If the POS is "vs" alone, then the meaning will be given as a verb, but such occurrences are rare.)
- never create an English meaning purely based on the translation of the meanings of the kanji making up a word. Sometimes it will be correct, but there are many cases where the result would be quite wrong. (魂柱 does not mean "spirit pillar").
- short explanatory notes can be included as part of a sense. Use the pattern [note="this is a note"].
- if the word comes from another language, mark this next to the English meaning. The format is [lsrc=lng:], where lng is the three-letter code from the ISO 639-2:1998 "Codes for the representation of names of languages" standard:
- アルバイト [n,vs] part-time job [lsrc=ger:Arbeit]
- アールデコ [n] art deco [lsrc=fre:]
Don't do this for (i) common Sino-Japanese vocabulary, (ii) English loan-words where the first translation listed is the source word.
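Several of the wording rules above are mechanical enough to check automatically before submitting. The following is a rough sketch of such a check, not an official tool: the function `lint_glosses` and its messages are hypothetical, it covers only three of the rules (leading articles, "colo(u)r"-style patterns, and a comma after "e.g." or "i.e."), and its findings are hints for the editor rather than hard errors.

```python
import re

# Gloss begins with "a", "an" or "the" followed by whitespace.
LEADING_ARTICLE = re.compile(r"^(a|an|the)\s", re.IGNORECASE)
# Optional letters embedded inside a word, e.g. "colo(u)r".
OPTIONAL_LETTER = re.compile(r"\w\(\w+\)\w")
# A comma directly after "e.g." or "i.e.".
COMMA_AFTER_ABBR = re.compile(r"\b(e\.g\.|i\.e\.),")


def lint_glosses(meanings: str) -> list:
    """Return (gloss, message) pairs for glosses that break a checked rule.

    `meanings` is the text of one sense, with glosses separated by ";".
    """
    problems = []
    for gloss in (part.strip() for part in meanings.split(";")):
        if not gloss:
            continue
        if LEADING_ARTICLE.match(gloss):
            problems.append((gloss, "starts with an article"))
        if OPTIONAL_LETTER.search(gloss):
            problems.append((gloss, "uses an unsearchable optional-letter pattern"))
        if COMMA_AFTER_ABBR.search(gloss):
            problems.append((gloss, "has a comma after e.g. or i.e."))
    return problems


for gloss, message in lint_glosses("a colo(u)r; hand game, e.g., rock, paper, scissors"):
    print(f"{gloss!r}: {message}")
```

A leading article is sometimes necessary for clarity, so a flagged gloss should be reviewed rather than changed automatically.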
Cross-references must refer to other dictionary entries. Where the target word has a kanji form, that form should be used.
Most cross-references will be synonyms, references to the full form (in the case of abbreviations), words from which the entry is derived, etc.
Specify the cross-reference using the pattern [see=言葉] or [ant=何等] (see the detailed instructions). Where the reference is to a particular headword/reading combination, use the format: kanji・reading, e.g., [see=金本位・かねほんい]. For targets that are a particular sense of the target word, append the sense number, e.g. [see=漢字[2]].
Please note that the "ant" (antonym) tag should only be used for genuine opposites. Words such as "short" and "tall" are antonyms; "short person" and "tall person" are not - use the regular "[see=...]" form for these.
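To make the cross-reference patterns concrete, here is a small sketch that pulls the "see" and "ant" tags out of a piece of text and splits a kanji・reading target into its two parts. It is an illustration only and not JMdictDB's own parser: the function name and the returned dictionary keys are hypothetical, sense-specific targets are not handled, and a headword that itself contains "・" would be split incorrectly.

```python
import re

# Matches [see=...] and [ant=...] with a target that contains no brackets.
XREF = re.compile(r"\[(see|ant)=([^\[\]]+)\]")


def parse_xrefs(text: str) -> list:
    """Extract cross-references as dicts with type, headword and reading."""
    refs = []
    for kind, target in XREF.findall(text):
        # Split on the first katakana middle dot, if there is one.
        headword, _, reading = target.partition("・")
        refs.append({
            "type": kind,
            "headword": headword,
            "reading": reading or None,
        })
    return refs


print(parse_xrefs("[see=金本位・かねほんい] [ant=何等]"))
```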
On occasions two or more entries may be merged when there are grounds for assuming they are variants of each other. The basic principle that is applied is a "two-out-of-three" rule. For the candidate entries, if at least two out of the (a) kanji-headword, (b) reading and (c) meaning fields are the same, the entries may be merged. Otherwise they must be separate entries.
Two entries with no kanji could be merged if they have the same meaning and the kana forms are related, e.g. are variants of each other.
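The "two-out-of-three" rule can be expressed as a simple count of matching fields. The sketch below is a simplification under stated assumptions: each candidate is reduced to a single kanji form, reading and meaning, "the same" is tested by exact string equality, and whether two kana forms are related is supplied by the editor through the hypothetical `kana_related` flag. Real entries have several forms and glosses, so the decision remains an editorial judgment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    kanji: str      # empty string when the entry has no kanji form
    reading: str
    meaning: str


def may_merge(a: Candidate, b: Candidate, kana_related: bool = False) -> bool:
    """Apply the two-out-of-three rule to a pair of candidate entries."""
    if not a.kanji and not b.kanji:
        # Kana-only entries: same meaning, and the kana forms must be
        # identical or judged by the editor to be variants of each other.
        return a.meaning == b.meaning and (a.reading == b.reading or kana_related)
    matches = sum([
        a.kanji == b.kanji,
        a.reading == b.reading,
        a.meaning == b.meaning,
    ])
    return matches >= 2


print(may_merge(Candidate("合気道", "あいきどう", "aikido"),
                Candidate("合氣道", "あいきどう", "aikido")))
```

Under this sketch 野球 and ベースボール would not be merged, since only the meaning matches; they stay separate entries linked by cross-references.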
Names of biological species
My suggestion for rules for biological species:
- Whenever possible, both the common name and the scientific name (using binomial nomenclature) of a species should be provided. The preferred format is: common_name (scientific_name), e.g. European magpie (Pica pica). If the common name is unknown, the preferred format is: scientific_name (description), e.g. Mola mola (a species of sunfish).
- Common names should be written in dictionary form. This means that only proper nouns and proper adjectives should be capitalized, even for officially standardized common names. e.g. "American kestrel", not "American Kestrel".
- Generic names (and names of higher taxa) are always capitalized; specific epithets are never capitalized. e.g. "Tyrannosaurus rex", not "tyrannosaurus rex" or "Tyrannosaurus Rex"
- Where applicable, subspecific taxonomic categories should be written out fully using ICZN or ICBN rules.
- For animal subspecies, this consists of merely writing the subspecific epithet. For example, the cinnamon bear, a subspecies of American black bear, should be submitted as: cinnamon bear (Ursus americanus cinnamomum)
- For plant subspecies, the abbreviation "subsp." should be used before the subspecific epithet. For example, occluded bindweed, a subspecies of hedge bindweed, should be submitted as: occluded bindweed (Calystegia sepium subsp. erratica)
- For varieties, the abbreviation "var." must be used.
- For forms, "f." must be used.
- Cultivar epithets should be capitalized and placed in single quotes. (e.g. Taxus baccata 'Variegata')
- Do not submit the author name. e.g., raspberry (Rubus idaeus), not raspberry (Rubus idaeus L.) (The "L." stands for Linnaeus.)
- Whenever possible, junior synonyms should not be submitted. Submit only the single scientific name currently accepted as the senior synonym. Wikipedia and The Encyclopedia of Life are good resources for finding the most up-to-date classifications.
- Submissions should include the Japanese name in kanji, hiragana, and--in the vast majority of cases--katakana. Biological names are very often written in katakana, and thus a (uk) tag is usually warranted. Nevertheless, the katakana reading should always be placed after the hiragana reading. For example, 銭形海豹 [ぜにがたあざらし,ゼニガタアザラシ] (n) (uk) harbor seal (Phoca vitulina)/harbour seal/common seal
- Names of higher taxa should include the headword written entirely in kanji, even though it may be only rarely used in practice. Reading restrictions will be used where appropriate. For example, セリ科,芹科 [セリか(セリ科),せりか(芹科)] (n) Apiaceae (parsley family of plants)/Umbelliferae
- When unsure of a kanji headword, it can often be determined from the English translation or the appearance of the species. For example, the white-cheeked pintail (Anas bahamensis) is known as ホオジロオナガガモ in Japanese. This word does not appear in any Japanese dictionary, but it is rather obviously written as 頬白尾長鴨. Include a kanji headword whenever it can be determined in this manner, but never guess. ReneMalenfant 21:05, 25 August 2009 (UTC)