Re: [edict-jmdict] [rare] tag for obscure kanji?
I am one of the Kyoto people.
We have a newer (and larger) 10B web corpus, and I can compute n-gram statistics for it.
The thing is, I am not sure how to match JMDict entries with JUMAN/Juman++ analysis results.
JMDict is not a morphological analysis dictionary the way Unidic/Jumandic/IPAdic are, so there are multiple possible many-to-many matchings between JMDict entries and analyzed tokens. Phrases are problematic as well. We also don't really disambiguate hiragana.
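To make the matching problem concrete, here is a rough Python sketch. It assumes the analyzer output has already been reduced to a list of surface strings and that the headwords are a plain set of strings. The function name `match_headwords`, the `max_span` parameter, and the toy data are all hypothetical; nothing here calls JUMAN/Juman++ or reads the real JMDict file.

```python
from collections import defaultdict


def match_headwords(tokens, headwords, max_span=4):
    """Map each headword to the token spans whose concatenated surfaces equal it.

    tokens    -- list of surface strings from a segmenter (assumed input format)
    headwords -- set of dictionary headword strings (assumed input format)
    max_span  -- hypothetical cap on how many consecutive tokens to join
    """
    matches = defaultdict(list)
    for start in range(len(tokens)):
        surface = ""
        for end in range(start, min(start + max_span, len(tokens))):
            surface += tokens[end]
            if surface in headwords:
                # Half-open span [start, end + 1) over the token list.
                matches[surface].append((start, end + 1))
    return dict(matches)


# Toy segmentation and toy headword set, for illustration only.
tokens = ["日本", "語", "を", "勉強", "する"]
headwords = {"日本", "日本語", "語", "勉強", "勉強する", "する"}

result = match_headwords(tokens, headwords)
for headword, spans in result.items():
    print(headword, spans)
```

Even on this toy input the token 日本 is covered by two headwords (日本 and 日本語), and the headword 日本語 spans two tokens, so any count assigned to an entry depends on a policy choice such as longest match or counting every overlap. The sketch also ignores readings and conjugated forms, which is where the hiragana ambiguity would come in.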