20. |
A 2024-02-07 00:58:00 Stephen Kraus <...address hidden...>
|
|
Refs: |
おんな〈の/ノ〉〈子/こ/コ〉
╭─ーーーーー─┬─────────╮
│ おんなの子 │ 15,253 │
│ おんなノ子 │ 20 │
│ おんなのこ │ 124,421 │
│ おんなのコ │ 3,647 │
╰─ーーーーー─┴─────────╯ |
|
Comments: |
Had a typo in my search pattern (おんあ instead of おんな). No big difference. |
19. |
A 2024-02-06 23:22:38 Stephen Kraus <...address hidden...>
|
|
Refs: |
〈女/おんあ/オンナ〉〈の/ノ〉〈子/こ/コ〉
Google N-gram Corpus Counts
╭─ーーーーー─┬────────────┬───────╮
│ 女の子 │ 14,136,830 │ 97.0% │
│ 女のコ │ 263,573 │ 1.8% │
│ オンナの子 │ 35,514 │ 0.2% │ - add, sK
│ 女のこ │ 19,098 │ 0.1% │ - add, sK
│ 女ノ子 │ 770 │ 0.0% │
│ 女ノコ │ 286 │ 0.0% │
│ オンナノ子 │ 49 │ 0.0% │
│ オンナノコ │ 102,435 │ 0.7% │
│ オンナのコ │ 21,350 │ 0.1% │
│ オンナのこ │ 517 │ 0.0% │
│ おんあのこ │ 45 │ 0.0% │
╰─ーーーーー─┴────────────┴───────╯ |
|
Diff: |
@@ -11,0 +12,8 @@
+<ke_inf>&sK;</ke_inf>
+</k_ele>
+<k_ele>
+<keb>オンナの子</keb>
+<ke_inf>&sK;</ke_inf>
+</k_ele>
+<k_ele>
+<keb>女のこ</keb> |
18. |
A 2022-08-16 23:33:00 Robin Scott <...address hidden...>
|
|
Refs: |
女の子 14136830 98.2%
女のコ 263573 1.8% |
|
Diff: |
@@ -11,0 +12 @@
+<ke_inf>&sK;</ke_inf> |
17. |
A 2022-03-12 03:54:27 Jim Breen <...address hidden...>
|
|
Comments: |
I think there are many cases where absolute values are important, but I sort-of agree that it's not so important in these sorts of cases. I'll add it to my work-around file for WWWJDIC glossing. |
|
Diff: |
@@ -18,4 +17,0 @@
-</r_ele>
-<r_ele>
-<reb>オンナノコ</reb>
-<re_nokanji/> |
16. |
A* 2022-03-11 13:55:09 Robin Scott <...address hidden...>
|
|
Comments: |
I don't think absolute values for variant forms are that meaningful when it comes to common words. With a threshold like 10k hits, we'd end up including all sorts of rubbish that most people would very rarely (if ever) encounter. If text glossing considerations are the main sticking point here, then we should implement hidden kanji/reading fields. Users don't need to be shown オンナノコ. |
|
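The hidden-field idea in the comment above could be sketched as follows. This is only an illustration: the `hidden` flag, the dictionary layout, and both helper functions are hypothetical, and are not part of the dictionary's schema or of any existing tool.

```python
# Hypothetical in-memory model of an entry's surface forms.
# The "hidden" flag is an assumption made for this sketch.
FORMS = [
    {"text": "女の子", "hidden": False},
    {"text": "おんなのこ", "hidden": False},
    {"text": "オンナノコ", "hidden": True},  # matchable, but not shown
]


def displayed_forms(forms):
    """Forms a user would see in the rendered entry."""
    return [f["text"] for f in forms if not f["hidden"]]


def searchable_forms(forms):
    """Forms a lookup or text glosser may match, hidden ones included."""
    return [f["text"] for f in forms]


print(displayed_forms(FORMS))
print(searchable_forms(FORMS))
```

With a split like this, a text glosser could still match オンナノコ while the rendered entry stays uncluttered.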
15. |
A* 2022-03-11 05:27:51 Jim Breen <...address hidden...>
|
|
Comments: |
I think the best approach is to use a combination of the proportion of forms and the absolute numbers. Having a form out there like オンナノコ with over 100k usages in WWW pages and not having any mention of it in the entry would be a mistake IMNSHO. Having a rule-set such as "either 2% or more of surface forms, or 10k or more in the n-gram counts" would be better.
WWWJDIC will catch オンナノコ for regular lookups because for kana-only cases it maps おんなのこ and オンナノコ together. It won't catch it for text-glossing because it doesn't fold the kana versions together there (for good reason). I'd certainly like to be able to handle text containing オンナノコ correctly without having to do some quite messy reprogramming. |
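A minimal sketch of the proposed rule-set ("either 2% or more of surface forms, or 10k or more in the n-gram counts"), applied to the counts listed in the Refs of entry 11. The function name, the parameter names, and the choice to put every listed form in the denominator are assumptions made for illustration.

```python
# Counts copied from the Refs of entry 11 in this log.
COUNTS = {
    "女の子": 14_136_830,
    "女のコ": 263_573,
    "女の児": 1_935,
    "おんなのこ": 124_421,
    "おんなのコ": 3_647,
    "オンナノコ": 102_435,
}


def passes_rule(count, total, min_share=0.02, min_count=10_000):
    """True if a form meets either the share or the absolute threshold."""
    return count / total >= min_share or count >= min_count


# Assumption: the denominator is the sum of all forms listed above.
TOTAL = sum(COUNTS.values())

kept = [form for form, n in COUNTS.items() if passes_rule(n, TOTAL)]
share_only = [form for form, n in COUNTS.items() if n / TOTAL >= 0.02]

for form, n in COUNTS.items():
    verdict = "keep" if form in kept else "drop"
    print(f"{form}\t{n:>10,}\t{n / TOTAL:6.2%}\t{verdict}")
```

Under these assumptions the combined rule keeps 女の子, 女のコ, おんなのこ and オンナノコ, while a share-only test at 2% keeps just 女の子.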
14. |
A* 2022-03-11 03:25:45 Marcus Richert <...address hidden...>
|
|
Comments: |
Yes, I agree with Robin. I've said this plenty of times before - it doesn't make sense to just look at the raw numbers, or all our P-tagged entries should be horrible messes with lots of different and rare versions. There's a balance we need to strike between presenting easy-to-read entries, and trying to include absolutely everything.
It's also worth noting that Yomichan for example has no problem whatsoever displaying the correct entry when presented with オンナノコ, even if that reading isn't included in the entry. (jisho falters, but that's something I think Kim could easily fix if it was brought to his attention) |
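One way a lookup tool can match オンナノコ against an entry that lists only おんなのこ is to fold katakana to hiragana before comparing. The sketch below shows that general technique; it is not taken from Yomichan, jisho, or WWWJDIC, and the function name `fold_kana` is hypothetical.

```python
def fold_kana(text: str) -> str:
    """Map katakana U+30A1..U+30F6 to the matching hiragana.

    Hiragana sits exactly 0x60 code points below katakana in Unicode.
    Other characters (kanji, the long-vowel mark ー, ASCII) pass through.
    """
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


print(fold_kana("オンナノコ"))
print(fold_kana("女のコ"))
```

Folding both the query and the indexed readings lets a kana-only lookup treat the two scripts as equivalent, at the cost of losing the script distinction for that comparison.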
13. |
A* 2022-03-11 00:41:12 Robin Scott <...address hidden...>
|
|
Comments: |
I don't think it should be included. We generally don't include katakana forms unless they account for a significant proportion of the total hits. With less than 1% of the hits for 女の子, オンナノコ doesn't pass that test. I don't think it's helpful to have obscure katakana forms. |
12. |
A 2022-03-10 09:38:58 Jim Breen <...address hidden...>
|
|
Comments: |
100k is a lot of hits. Worth having. |
|
Diff: |
@@ -17,0 +18,4 @@
+</r_ele>
+<r_ele>
+<reb>オンナノコ</reb>
+<re_nokanji/> |
11. |
A 2022-03-10 05:41:44 Marcus Richert <...address hidden...>
|
|
Refs: |
女の子 14136830
女のコ 263573
女の児 1935
おんなのこ 124421
おんなのコ 3647
オンナノコ 102435 |
|
Comments: |
Even 女のコ is barely needed: it gets less than 2% of the total hits, on the level of what I've been applying [rK] to. オンナノコ gets less than 1%, so I'm not really seeing a strong case for including it. |
|
Diff: |
@@ -13,3 +12,0 @@
-<k_ele>
-<keb>女の児</keb>
-</k_ele>
@@ -18,2 +14,0 @@
-<re_restr>女の子</re_restr>
-<re_restr>女の児</re_restr>
@@ -24,7 +18,0 @@
-<r_ele>
-<reb>おんなのコ</reb>
-<re_restr>女のコ</re_restr>
-</r_ele>
-<r_ele>
-<reb>オンナノコ</reb>
-</r_ele>
@@ -34,2 +22,2 @@
-<xref type="see" seq="1420010">男の子・おとこのこ・1</xref>
-<xref type="see" seq="1420010">男の子・おとこのこ・1</xref>
+<xref type="see" seq="1420010">男の子・1</xref>
+<xref type="see" seq="1420010">男の子・1</xref>
@@ -43,2 +31,2 @@
-<xref type="see" seq="1420010">男の子・おとこのこ・2</xref>
-<xref type="see" seq="1420010">男の子・おとこのこ・2</xref>
+<xref type="see" seq="1420010">男の子・2</xref>
+<xref type="see" seq="1420010">男の子・2</xref> |
10. |
A* 2022-03-10 03:47:10
|
|
Diff: |
@@ -26,0 +27,3 @@
+</r_ele>
+<r_ele>
+<reb>オンナノコ</reb> |
9. |
A 2020-06-18 15:36:46 Marcus Richert <...address hidden...>
|
|
Refs: |
女の子 14136830
女のコ 263573
女の児 1935 ← do we need this? |
|
Diff: |
@@ -12 +11,0 @@
-<ke_pri>spec1</ke_pri>
@@ -28 +26,0 @@
-<re_pri>spec1</re_pri> |
8. |
A 2020-06-18 01:06:26 Marcus Richert <...address hidden...>
|
|
Comments: |
Bizarre. How did that make it in, and stay undetected for this long? Thanks for noticing it. |
7. |
A* 2020-06-17 14:06:22 Frazer Robinson <...address hidden...>
|
|
Comments: |
What is the source for the おんにゃのこ reading? It sounds like it is how this word is pronounced when someone is mimicking a cat speaking.
I think it should be removed. It doesn't appear in the Kotobank entry. |
|
Diff: |
@@ -30,4 +29,0 @@
-<r_ele>
-<reb>おんにゃのこ</reb>
-<re_nokanji/>
-</r_ele> |
6. |
A 2020-06-03 01:40:31 Rene Malenfant <...address hidden...>
|
|
Diff: |
@@ -36,0 +37 @@
+<xref type="see" seq="1420010">男の子・おとこのこ・1</xref>
@@ -43,0 +45 @@
+<xref type="see" seq="1420010">男の子・おとこのこ・2</xref> |
5. |
A 2020-06-03 01:39:48 Rene Malenfant <...address hidden...>
|
4. |
A* 2020-05-22 23:47:01 Jim Breen <...address hidden...>
|
|
Refs: |
Daijirin: (1) 女である子供。女児。(2) 俗に、若い女性。
GG5: (1)〔少女〕 a girl; 〔娘〕 a daughter; 〔女の赤ちゃん〕 a baby girl. (2)〔若い女性〕 a young woman; a girl.
中辞典: 〈娘〉 a daughter; 〈若い女〉 a girl; 〈子供〉 a (little) girl; 〈赤ん坊〉 a baby girl |
|
Comments: |
I don't think it's a separate sense - it's more an indication of the breadth of the main sense. I'd prefer to follow the references on this. |
|
Diff: |
@@ -38,4 +37,0 @@
-</sense>
-<sense>
-<pos>&exp;</pos>
-<pos>&n;</pos>
@@ -42,0 +39 @@
+<gloss>baby girl</gloss> |
3. |
A* 2020-05-21 18:42:14 Frazer Robinson <...address hidden...>
|
|
Comments: |
I feel like having "girl" and "daughter" together makes it look like it mainly refers to someone's daughter, rather than any young female in any context.
Feels like two separate senses to me. |
|
Diff: |
@@ -37,0 +38,4 @@
+</sense>
+<sense>
+<pos>&exp;</pos>
+<pos>&n;</pos> |
2. |
A 2015-10-08 12:23:10 Jim Breen <...address hidden...>
|
|
Refs: |
Daijr, GG5 |
|
Diff: |
@@ -37,0 +38,6 @@
+<gloss>daughter</gloss>
+</sense>
+<sense>
+<pos>&exp;</pos>
+<pos>&n;</pos>
+<gloss>young woman</gloss> |
1. |
A* 2015-10-07 13:56:50 luce
|
|
Diff: |
@@ -34,0 +35 @@
+<pos>&exp;</pos> |