5. |
A 2022-07-02 03:20:06 Marcus Richert <...address hidden...>
|
|
Refs: |
daijr 「平安時代の写本が―する」 |
|
Diff: |
@@ -11,0 +12 @@
+<pos>&vi;</pos> |
4. |
A 2021-03-01 20:02:35 Robin Scott <...address hidden...>
|
|
Comments: |
Thanks for the info. I'll keep that in mind for when I next encounter confusing results. |
3. |
A* 2021-03-01 04:17:27 Jim Breen <...address hidden...>
|
|
Refs: |
KM n-grams:
伝存 368
伝存する 121
伝存し 133 |
|
Comments: |
It's most likely due to imperfections in the old IPADIC morpheme dictionary that Taku Kudo et al. used in 2007 to generate the n-grams. IPADIC probably has 伝存し as a distinct morpheme, and might even have 伝存する. Both IPADIC and Jumandic have an issue with compounds being included as morphemes (日本語 was one of them), which played hell with proper analysis. The compilers of UniDic, which I used to build the KM n-grams, have taken a much stricter line on this.
When I see anomalous n-gram results like this I usually cross-check with the KM ones, as the ratios there are usually more plausible. |
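The cross-check described in this thread can be sketched as a small script. This is only an illustration, not a tool used by the editors: the function names and the threshold of 1.0 are assumptions, and it relies on the assumption that a count for 伝存 should already include every occurrence of 伝存する and 伝存し, so the inflected forms together should not exceed it. The counts are copied from the two reference lists in this entry's history.

```python
# Counts posted in this thread. "older" is the set Robin Scott asked about;
# "km" is the KM n-gram set posted in reply.
older_counts = {"伝存": 2596, "伝存する": 1899, "伝存し": 1431}
km_counts = {"伝存": 368, "伝存する": 121, "伝存し": 133}

INFLECTED = ["伝存する", "伝存し"]  # forms that contain the base token


def inflected_share(counts, base="伝存", inflected=INFLECTED):
    """Return the summed inflected-form counts divided by the base count."""
    return sum(counts[form] for form in inflected) / counts[base]


def looks_anomalous(counts, base="伝存", inflected=INFLECTED):
    """Flag a count set whose inflected forms outnumber the base token.

    Assumption: a share above 1.0 suggests the tokenizer treated some
    inflected forms as separate morphemes instead of counting the base.
    """
    return inflected_share(counts, base, inflected) > 1.0


for label, counts in (("older", older_counts), ("km", km_counts)):
    print(f"{label}: share={inflected_share(counts):.2f} "
          f"anomalous={looks_anomalous(counts)}")
```

With these figures the older set has 1899 + 1431 = 3330 inflected occurrences against 2596 for 伝存, while the KM set has 121 + 133 = 254 against 368, which matches the observation that the KM ratios are the more plausible ones.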
2. |
A* 2021-03-01 00:19:38 Robin Scott <...address hidden...>
|
|
Refs: |
伝存 2596
伝存する 1899
伝存し 1431 |
|
Comments: |
Jim, could you explain why the combined counts for 伝存する and 伝存し exceed those of 伝存? |
1. |
A* 2021-02-27 22:36:12 Jim Breen <...address hidden...>
|
|
Refs: |
GG5, ルミナス |