On Jun 21, 2010, at 3:52 PM, René Malenfant wrote:
The problem, as I see it, is for example... I don't speak Portuguese, Dutch, nor German... and yet any of these could be the origin of an otherwise native-sounding word. So you didn't quote what I was responding to, but the idea that you can always tell a "native" Japanese katakana word from a "borrowed" katakana word is unlikely... especially when it comes to the many diverse and random names of species that could be based on Latin or an author or a location on the planet which is then katakana-ized... is that a foreign word if the location where a species was found was written in Latin and then turned into Japanese ??? Gets confusing.
Again, I don't see the problem. If uncertain of the word's origin, submit it as katakana:hiragana. There's no reason why the cases in which you have imperfect information should determine how you treat the cases in which you have perfect information.
The primary reason would be to have internal logical consistency...
ALL? www.oceandictionary.net lists thousands of organisms by Latin name, and provides (almost) nothing but the katakana name. Does that fail to meet the standard of a dictionary? Or does it have to fail to use standard taxonomic conventions to be considered a dictionary?
I wouldn't exactly quote oceandictionary as a source that has been assembled with great attention to detail, taxonomically speaking. For instance, "red snapper" turns up multiple translations (as expected), but none of them are provided with binomial names for disambiguation, and the only taxonomically accepted "red snapper" (according to the AFS, FDA, and CFIA) does not appear on the list at all. In fact, if you turn to their E-J marine life dictionary, there is nary a binomen to be seen. It takes a great taxonomical source to do away with the actual "taxonomy" bit of it.
oceandictionary.net, at least the part I use, is not an E <-> J dictionary... Not sure what you are looking at. Every entry has binomial names... and as this is a Japanese dictionary... the Japanese name... almost always, perhaps always Katakana, and would be completely irrelevant to AFS, FDA etc as those are English and possibly French standards as far as I can tell without wasting more time looking it up. The useful pages on the site are for example, the page for the letter A:
Which for those too lazy to click on the link starts with these three entries:
Abalistes stellatus: オキハギ
Abarenicola pacifica: イソタマシキゴカイ
Abbottina rivularis: ツチフキ
The reason katakana is used is as I described earlier... because that's the convention for species names. I really don't think "laziness" was the motive. The overwhelming factor is that the number of species which happen to have a kanji-based name is extremely small in comparison to the number of species which are ONLY named via katakana... which you then ascribe to laziness, and invent kanji compounds which nobody has ever used... except you... that's the danger here.
I'm well aware that katakana is standard, however that does not explain why certain information is ~omitted~. Katakana is sufficient within the realm of taxonomy. In a taxonomic database there is simply no reason to provide hiragana or kanji, which requires a great deal more effort to compile for very little practical return. On the other hand, if you turn to kokugo dictionaries you consistently see that kanji headwords are provided, even for words that lack them in oceandictionary and other such sources.
Again you are adding your personal collection of pejoratives like laziness and "sufficient" within "the realm of taxonomy". But I think not. It's the standard. Neither lazy nor just-getting-by sufficient. For many, many, many species, the majority by far, there is no kanji compound and there never was. That a few have a kanji compound is really not that important, especially as it's not "proper" to use that as the name in most cases.
And again, the oceandictionary E-J dictionary lacks even the binomial names. According to your reasoning, this must be because of convention rather than lack of time/effort. This is clearly not true in either case. Wikipedia, which is obviously a more collaborative effort, is usually quite good at including kanji for species names even though they use katakana for the page titles.
So again you've said it lacks binomial names... but actually it's basically a list of binomial names, as anyone can see by clicking on the link above (warning, set to shift-JIS). And none of the Wikipedia pages use kanji as the titles of pages on species because that's non-standard... off the wall... not how it's done.
Again ALL? Except none of the dictionaries devoted to actual living creatures...
Which is a great summary of what EDICT is not.
EDICT/JMdict is whatever its creators make it... and for now, for example, I am pouring in a lot of species names -> and, to the best of my ability, the names actually used. Yes, "EDICT" is, to the extent of work put in thus far, and to the extent of work to come... a dictionary of living creatures. As well as many other dictionaries. It's not limited to any narrow scope of terminology or dictionary type that I'm aware of. It is whatever contributors wish to contribute. When it has 10,000 species names, and there is no Japanese dictionary online with more fish names... I will safely claim it is the premier ichthyological dictionary.
Um... our "Target Audience" might be quite a bit broader than students... as I've already shared how my staff uses the dictionary to create scientific posters which are in turn displayed for the benefit of Japanese tourists from JAPAN... and teaching students to learn non-standard and unusual forms of the language is certainly not beneficial to anyone. However, should a kanji compound exist, its inclusion in the list of names for a species would be beneficial to students... as long as they realize that this isn't the preferred name of the species nor the preferred way of writing it to the public.
There exists a (uk) tag that suggests to people that they should write it using the kana reading rather than kanji headword. Just as we expect people to use the README to determine what (uk) and all the other tags mean, we might also expect them to learn the extremely simple rule that Japanese biological names are usually written in katakana, even though it may be listed after hiragana in EDICT's readings. In fact, that could even become part of the description of the (uk) tag.
(uk) doesn't tell them which kana script. Nor does it account for the many times when the fish has several Katakana names of Japanese origin and a compound that is rarely if ever used that may match only one of those. (uk) tells people to use the hiragana... which is the most wrong case of all. Nesting additional rules inside of obscure documentation... is not particularly powerful design of a dictionary format. This die-hard insistence that kanji compounds must take priority is baseless, and simply an arbitrary impediment to delivering a more qualified form of data.
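For reference, the "biological names are written in katakana" rule discussed above is mechanical enough to sketch. The following is only an illustration in Python: it assumes a reading stored in hiragana and a hypothetical is_species flag supplied by the caller. Neither the flag nor the function names are part of EDICT or any of its tools.

```python
# Hiragana (U+3041..U+3096) and katakana (U+30A1..U+30F6) are laid out
# in parallel in Unicode, 0x60 code points apart.
KANA_OFFSET = 0x60


def hira_to_kata(text: str) -> str:
    """Convert hiragana characters to katakana; leave everything else alone."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )


def display_form(reading: str, is_species: bool) -> str:
    """Hypothetical rule: show species readings in katakana, others as stored."""
    return hira_to_kata(reading) if is_species else reading


print(display_form("おきはぎ", is_species=True))
```

This only covers the script conversion. It does not address the case raised above where one species has several katakana names and a kanji compound matches only one of them.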
For what it's worth, I agree with your suggestions about なかぐろ. Submit the most-dotted version and let parsers figure it out. Dots are a pain.
I see Jim's POV on having both. I can live with either scenario. I just want there to be a rule set down so I know how I'm gonna handle it on the entry side. Once the rule is there, we all live with the consequences and move forward.
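On the "submit the most-dotted version and let parsers figure it out" idea, here is a rough sketch of what the parser side could look like, in Python. The function names and the sample word are made up for illustration; this is not how any existing EDICT tool is known to work.

```python
NAKAGURO = "\u30fb"            # ・ KATAKANA MIDDLE DOT
HALFWIDTH_NAKAGURO = "\uff65"  # ･ HALFWIDTH KATAKANA MIDDLE DOT


def strip_nakaguro(text: str) -> str:
    """Remove full-width and half-width middle dots from a headword."""
    return text.replace(NAKAGURO, "").replace(HALFWIDTH_NAKAGURO, "")


def same_headword(a: str, b: str) -> bool:
    """Treat two headwords as equal if they differ only in middle dots."""
    return strip_nakaguro(a) == strip_nakaguro(b)


# The dotted entry is stored; an undotted query still matches it.
print(same_headword("レッド・スナッパー", "レッドスナッパー"))
```

Storing the dotted form loses nothing, since the undotted form can always be derived from it, while the reverse is not possible.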