JMdictDB - Japanese Dictionary Database


jmdict 2027080 Active (id: 2291429)
べきでは無い [sK] 可きではない [sK]
べきではない [spec1]
1. [exp,adj-i] [uk]
▶ should not
▶ must not


12. A 2024-02-08 22:15:23  Jim Breen <...address hidden...>
11. A* 2024-02-08 20:41:04  Stephen Kraus <...address hidden...>

Google N-gram Corpus Counts
│ べきでは無い │    18,958 │  1.6% │ - add, sK
│ 可きではない │       210 │  0.0% │ - add, sK
│ 可きでは無い │        33 │  0.0% │
│ べきではない │ 1,197,884 │ 98.4% │
│ べきでは無く │       248 │  0.3% │
│ 可きでは無く │         0 │  0.0% │
│ 可きではなく │         0 │  0.0% │
│ べきではなく │    74,530 │ 99.7% │
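The percentage column above appears to be each form's share of the total within its group (the four ～ない forms, then the four ～なく forms). A minimal sketch of that arithmetic, assuming this is how the shares were derived; the function name `shares` is hypothetical and not part of JMdictDB:

```python
def shares(counts):
    """Return each form's percentage of the group total, rounded to one decimal."""
    total = sum(counts.values())
    if total == 0:
        # Avoid division by zero when every form in the group has a zero count.
        return {form: 0.0 for form in counts}
    return {form: round(100 * n / total, 1) for form, n in counts.items()}


# Counts copied from the table above, grouped by ending (assumed grouping).
nai = {
    "べきでは無い": 18958,
    "可きではない": 210,
    "可きでは無い": 33,
    "べきではない": 1197884,
}
naku = {
    "べきでは無く": 248,
    "可きでは無く": 0,
    "可きではなく": 0,
    "べきではなく": 74530,
}

for group in (nai, naku):
    for form, pct in shares(group).items():
        print(f"{form}\t{pct}%")
```

Under that assumption the computed shares match the table: 1.6%, 0.0%, 0.0%, 98.4% for the first group and 0.3%, 0.0%, 0.0%, 99.7% for the second.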
I didn't read the long discussion in this entry but I don't think we have a problem with these sorts of kanji forms anymore.
@@ -3,0 +4,8 @@
@@ -9,0 +18,2 @@
10. A 2020-04-11 09:27:15  Jim Breen <...address hidden...>
I've responded to Rob directly.
9. A* 2020-04-04 17:04:55  Rob Harwood <...address hidden...>
Thanks for the explanations on your decision. And sorry for the long reply. I was just worried that if I didn't go into the whole thought process, it might seem like I was arguing for a very nit-picky thing, when really for me it's much more of a big-picture kind of question.

With that in mind, I notice that no one has yet responded to the point I made about JMDICT being the data source for many Japanese language learning tools, and even if data may not reflect popularity (as in the example of "(wa) katakana at top"), it may still be worthwhile for the sake of documenting the language itself.

All that said, I respect and accept your decision on this particular entry; but I'm still interested in hearing your thoughts/opinions about this more 'philosophical' point about how JMDICT should (or should not) attempt to support language learners, rather than just reflect current-day usage.

Is such discussion appropriate in these comments or is there another/better way I can ask about this kind of thing?

Jim, I'll look for your email address and try to email you. Thanks for the offer. [D'oh! I see it now at the top of the confirmation page.]
8. A 2020-04-03 06:24:04  Jim Breen <...address hidden...>
Thanks, Marcus. I think that covers it, and I'll close this off now. I don't mind having the kanji form shown on the basic forms (GG5 has 可き in the べき entry), but I do think it is inappropriate to have them in the virtually-always-kana expressions.
Re この/此の/etc. I see Daijirin and GG5 have the 此の, but the other JEs don't.
Marcus: I've added a GitHub issue for [rare]
Rob: if you want to find out more about the n-grams, email me. They are not the counts you get from Google searches, which are almost useless.
@@ -4,3 +3,0 @@
@@ -13 +9,0 @@

jmdict 2027080 Rejected (id: 2063320)
べきではない [spec1]
1. [exp] [uk]
▶ should not
▶ must not

6. R 2020-04-03 04:40:09  Marcus Richert <...address hidden...>
Rejecting fork
5. A* 2020-04-02 08:56:10  Rob Harwood <...address hidden...>
Google n-gram counts for various searches:
"可きではない"	108
"冖冠"		96
"ワ冠"		105

可きではない	124
冖冠		128
ワ冠		121

Parsing issues:
A reply to the comment. (I have quite a bit to say in reply. I'll try to keep it as brief as possible. If such lengthy replies don't belong in these kinds of edit comments, please let me know. I've included my email address if that's useful.)

1. I added a [uk] tag to mirror the same in 可き. That should at least notify that the kanji version(s) are less common.

2. I attempted to replicate your Google n-grams, but I'm not sure of the correct procedure, though I searched around for it. It seems it's just page counts? I assume with 'duplicates omitted', and you go to the last page of the search to get most accurate estimates? If I'm "doin' it rong", please point me to the correct procedure.

  2.1 I could not replicate your exact n-gram counts. Perhaps it depends on prior search history or something. So, instead I just used the same procedure each time, and hopefully my relative counts will give a better-than-nothing data point.

  2.2 I wasn't confident with the counts with 'raw' searches, so I include primarily results with "quotation" marks around them, and 'raw' results underneath. Hopefully the quoted searches provide more accurate results. The counts without quotes were larger, but not wildly out of proportion.

  2.3 There are slight differences between the counts at the top of the page, e.g. "... about 108 results", and at the bottom of the page, e.g. "... we have omitted some entries very similar to the 110 already displayed." For simplicity, since the differences are small, and the top-of-page counts are more obvious, I've used the top counts.

3. I did a simple search for "beki" on (which relies on JMDICT), and it converted the romaji into a search for べき. A direct search for べき returns identical results. I chose this search because that's how I've often used; it seemed natural.

  3.1 The first result was of course 可き. Result #6 was 冖冠, and result #11 was べきではない. A similar search on JMDICT itself gave a different order: #3 冖冠, #12 可き, and #17 べきではない. Not sure why this was the order, just reporting it.

A search for べき (on results in 可き, 冖冠 (kanji "wa" radical at top (radical 14)), and べきではない with roughly equal footing (all in top 15 results), though with べきではない below 冖冠, even on the JMDICT search, even though both 可き and べきではない are vastly more 'popular' than 冖冠, which is a fairly obscure Japanese language terminology (AFAICT).

So, I thought, "How does 冖冠 stack up in Google n-gram results?" Turns out that it's about even with 可きではない. Roughly 100 page counts with duplicates omitted.

That clearly says to me that while Google n-grams will be a useful heuristic, there are clearly additional considerations other than mere internet popularity that factor into whether it's worth including an entry or not. For example, including an entry about the 'wa' radical in a kanji is relevant to the Japanese language itself, so I conclude that -- even though 冖冠 has a low Google n-gram -- considerations of an entry's relevance to the Japanese language itself can overcome that rough rule-of-thumb.

But, just to be certain I wasn't jumping the gun, I noticed that the entry for 冖冠 has a 'see also' reference to ワ冠. Perhaps 冖冠 is obscure, and ワ冠 is actually the more common terminology? But no, as the n-gram counts in the References above show, it's actually nearly identical to 冖冠, and thus to 可きではない also.

What about 可きではない? Does it have any relevance to the Japanese language itself? I have two lines of reasoning that it does:

1. The first entry on the Google search for "可きではない" is for a book from 2000 titled 徳田秋声全集 (Complete Works of Shūsei Tokuda). 徳田秋声 lived from 1 February 1872 to 18 November 1943, so it seems to me that he would still be considered relatively 'modern', fitting JMDICT's goal of reflecting modern Japanese. According to the English Wikipedia, several of his novels were made into movies. One of his short stories was included in "The Columbia Anthology of Modern Japanese Literature" (2005), and another in "Modern Japanese Stories: An Anthology" (originally 1962, but with several reprints and a second edition, latest published 2005) by Ivan I. Morris (who himself has a Wikipedia page and academic credentials, etc.).

So, this is at least one case (literally the first Google result) in which 可きではない appears in culturally/historically significant modern (according to scholars at Columbia University) Japanese literature/writing. Since several other of the results are also physical books, I think that may be what's skewing the n-gram so low. While it might not be very common in present day usage, on the Internet especially, it was used in modern literary works whose contents barely show up on Google searches. And most of those results will be marked as 'duplicates' by Google, yet these are popular books, earning themselves many reprints even from the year 2000 and beyond. Clearly, this usage is not so rare as the Google n-gram heuristic at first makes it appear.

I hope that helps to dispense with the Google n-gram objection. More important (IMHO) though is my second point.

2. A major purpose of any Japanese-English dictionary, or especially a Japanese-Multilingual dictionary, is to aid people who are not only using the dictionary as a static tool of reference -- the way most native English speakers would use an English dictionary, or a fluently bilingual person might need to look up the odd word here or there -- but as a dynamic tool to aid in *learning* Japanese from the standpoint of a native English (or whatever language) speaker.

To this end, it makes sense to include 'not so common on the Internet with native Japanese speakers these days' information that nevertheless reflects the structure and consistency of the Japanese language itself, thus to make it 'visible' to the confused beginner and intermediate learners.

The word 可き happens to be one of those particularly confusing words from the standpoint of someone mostly only familiar with English grammar, idioms, and vocabulary. Especially in the way that it connects with the words around it. I know it's confusing, because it's been a stumbling block for me for weeks, more so than most other grammar/vocabulary points I've been able to work through with relative ease.

I use (again, based largely on JMDICT data) more as a User's Manual than as a mere dictionary. It helps me find the similarities between words and kanji that are obscured 'in the wild' where all sorts of Internet idioms used by native speakers would be eternally mysterious to me if I didn't have a tool like to help me parse out the meanings, and see the connections between kana-from-the-wild and the corresponding kanji in my 'User's Manual'.

As it stands, itself runs into stumbling blocks when one tries to learn about 可き and how to use it properly (regardless whether with kanji or kana). I will give you a perfect example: uses the JMDICT entries and part-of-speech metadata to help you understand a text fragment (word, phrase, sentence, whatever) by pasting in the entire text fragment into the search box and it will do its best to parse the text, classify each word or expression in the sentence, and allow you to click on each parsed word to see what it means in context. So, in order to help me understand 可き better, I wanted to learn how it connects with ではない grammar (and not, for example, がない or some other idiom).

So, I searched on for "可きではない", as seems very natural to do. But instead of getting the actual entry for "べきではない", as you'd expect, instead Jisho doesn't find that entry (because the kanji is not included in that entry) so it tries to parse-out this text fragment as best it can. The results are quite discouraging. Link to this example in References above. It splits up the text like this:

可 き で は ない 

In other words, it is forced to interpret the 可 kanji as か ('possible') instead of being べ as part of べき. This confusion is no doubt caused by the other characters ではない being forced to be interpreted as で, は, and ない, rather than as part of the idiomatic way that べき joins with ではない instead of something like がない.

Likewise, if we try to find an example sentence using 可きではない, a quick google search turns up that Shūsei Tokuda book, with a handy text fragment "躊躇す可きではないが". Sadly, this extra context doesn't seem to do the parser any good. It botches the parsing as:

ちゅうちょす	か 
躊躇す		可 き で は ない が 

The same issue arises: The べきではない entry simply *lacks* the crucial piece of information that it *can* be written as 可きではない.
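To illustrate the failure mode described above, here is a minimal sketch of a greedy longest-match segmenter over a toy lexicon. This is an assumption for illustration only: it is not the actual parser used by any site mentioned here, and the function name `segment` and the lexicon contents are hypothetical.

```python
def segment(text, lexicon):
    """Greedy longest-match segmentation; unknown characters become single tokens."""
    max_len = max(map(len, lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        piece = text[i]  # fallback: a single (possibly unknown) character
        # Try the longest candidate first, then shrink until a lexicon hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                piece = candidate
                break
        tokens.append(piece)
        i += len(piece)
    return tokens


# Toy lexicon: the expression is listed only in its kana form.
kana_only = {"べきではない", "可", "で", "は", "ない"}
print(segment("可きではない", kana_only))

# Same lexicon with the kanji form added as an alternative spelling.
with_kanji = kana_only | {"可きではない"}
print(segment("可きではない", with_kanji))
```

With the kana-only lexicon the input falls apart into 可 / き / で / は / ない, mirroring the split shown above; once the kanji spelling is listed, the whole expression is matched as a single token.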

But, critically, it's not that 可きではない is all that popular. In fact it has probably fallen out of current usage, for the most part. But the more important point, which is the entire point I'm trying to make here with point #2, is that 可きではない is *correct* Japanese. So, from my perspective as a Japanese learner, I would wish for my 'User's Manual' to be able to recognize this correct Japanese and tell me what it means without sending me on a wild goose chase trying to figure out what the connection to 可 (possible) has to do with anything, and why き is marked as an unknown noun (perhaps 'tree', perhaps 'spirit', who knows?).

Summing Up

Now, this may seem like a lot of hullabaloo about one measly entry in JMDICT. What's the big deal?

Well, as I mentioned in my initial comment, this is just *one* example of where a minor touch-up to the JMDICT data could help others who come after me avoid tripping up, getting confused, and running around in circles. There are many other examples I've run across over the past couple of months, and I held off trying to make any edits, since it seemed a little too risky (I didn't know much about the innards of JMDICT at the time, and I didn't want to mess anything up). But this example with 可きではない seemed like a perfect little test-case to try out a small edit that would smooth out the learning experience for anyone else following a similar learning path.

And just today, I was trying to understand the system of こんな, そんな, あんな, どんな, and how they connect to related words like こういう, etc., and I found yet more instances where the JMDICT entries are simply incomplete (from a learner's point of view), and where 'current popularity on the Internet' should probably not be the main guide as to whether some minor changes are accepted/rejected, but rather whether those changes would improve the quality of all the various tools used by a Japanese language learner (not just, but several other learning sites I'm using as well), that all inevitably depend on the JMDICT data itself.

I finally understand that こんな system, but it took quite a while, and it really didn't need to if the JMDICT entries just had a little additional information, and the correction of a few inconsistencies between all the different related entries. All of which I'd feel much more confident to try to fix if this particular 可きではない test proves worthwhile.

So, I apologize for this huge comment. But it seems like an important enough point that I felt it necessary to justify it at some length: that dictionaries, especially language-to-language dictionaries, aren't just about 'what's being used most frequently', but also about being a language-learning toolkit/documentation/user-manual for language learners.

Hopefully this doesn't annoy the heck out of ya. If you got this far, thanks for your patience and time. Cheers! And take care these days.
@@ -12,0 +13 @@
4. A* 2020-03-31 03:16:11  Rob Harwood <...address hidden...>
A quick google search for "可きではない" found 6000+ hits. I briefly scanned a few to make sure I didn't mess up the search, and they were legitimate.
Just adding a kanji version, to parallel 可き. This is my first edit, so if there are any problems, please let me know. I've been tempted to make some similar edits in the past, but wasn't sure about it, so this is a test for me. 

The motivation to add this kanji reading is that I'm learning kanji via different sites that indirectly use JMDICT data, and having these additional kanji 'readings' would assist in this learning process.
@@ -3,0 +4,3 @@
3. A 2017-02-10 11:56:08  Jim Breen <...address hidden...>
2. A* 2017-02-10 11:38:26  Johan Råde <...address hidden...>
G n-grams 1197451
@@ -5,0 +6 @@
