Higher-level taxa in JMdict



As we were recently discussing which people and place names 'deserve' to end up in JMdict, I think we need some kind of policy for deciding whether to include taxa above the species level.

As far as I'm concerned, all species with a common name belong in JMdict itself, but most higher-level taxa (e.g., genera, families) ~do not~.  I see including all the ~科 and ~属 forms as no different from including all the possible ~的 entries (something we currently do not allow).  And while you might say that it would be useful to include the ~科 and ~属 entries for an E-J search, I have five general points against their inclusion.

  1. It would arguably be useful to also include the countless ~的 entries so that people would be able to search for "-ic" and "-ive" forms of many English words, etc., but we've still decided as a general rule to exclude all ~的 entries that cannot be found in a kokugo dictionary.
  2. Kokugo dictionaries omit the vast majority of genus and family names for a very simple reason: their construction is almost always derived from the name of a member.  For instance, the family "Myxinidae" is called "メクラウナギ科", which simply means "hagfish family".  You don't find "メクラウナギ科" in a kokugo dictionary for precisely the same reason that you wouldn't find "hagfish family" in any English dictionary: it's derivative and obvious.
  3. "Myxinidae" is not English; it's Latin.  English dictionaries generally do not include Latin names (with exceptions for extremely important cases like "Anopheles" and "Culex", or where no other word exists for its member species, as in "Stegosaurus").  Last I checked, there wasn't a Latin project for JMdict, so if you're searching for "Myxinidae", you're looking up the wrong language.  Such entries belong in JMnedict or---even better---a specialized J-E taxonomy dictionary, perhaps one that could be run as a partner project to JMdict.  (And it might not be hard to start such a thing up automatically using Wikipedia and the UJSSB database here: http://research2.kahaku.go.jp/ujssb/)
  4. For many of these taxon names, not a single instance of actual usage can be found.  They simply aren't used outside of one place: dedicated taxonomic dictionaries (something that edict is not).  For instance, インドシュモクザメ属 (submitted last month by Jim Rose) gets only 38 hits, and as far as I can tell, they're all just species lists, etc.  As expected, most of these hits already include the Latin name Eusphyra, because the Latin names are the accepted standards used worldwide.  Even in Japan.
  5. We attempt to discern which person, place and organization names are important enough to include in JMdict and which belong in JMnedict instead.  I don't see any reason to bend those guidelines and allow the names of all ranked taxa, especially considering that (unlike personal names) taxon names sometimes become obsolete and therefore require maintenance.

So I recommend that all ~属, ~科, etc. entries that cannot be found in any of the major kokugo dictionaries (or whose non-Latin foreign-language equivalents cannot be found in a non-specialists' dictionary) be excluded from JMdict.


Rene