Re: [edict-jmdict] [rare] tag for obscure kanji?



> An [ofk] tag alone wouldn't entirely solve this problem though.
> ...
> Google N-grams:
> 眼鏡    2228370
> めがね  1206845
> メガネ  3949144

Can someone remind me where those n-grams are, and their license?

Is it possible to keep them in a column in the jmdict sql database?
(Even better: scaled to a 0.0 to 1.0 range)
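A minimal sketch of the scaling idea, using the three counts quoted above. The function name `scale_counts` is made up for illustration, and dividing by the largest count within one entry is only one assumption about what "scaled" could mean; scaling against the largest count in the whole corpus, or on a log scale, would be equally plausible.

```python
def scale_counts(counts):
    """Scale raw n-gram counts to 0.0-1.0 by dividing by the largest count.

    Hypothetical helper: the choice of "largest count in this group" as the
    denominator is an assumption, not something the n-gram data prescribes.
    """
    top = max(counts.values(), default=0)
    if top == 0:
        # No data (or all zeros): avoid dividing by zero.
        return {form: 0.0 for form in counts}
    return {form: n / top for form, n in counts.items()}


# Counts copied from the quoted message.
counts = {"眼鏡": 2228370, "めがね": 1206845, "メガネ": 3949144}
scaled = scale_counts(counts)

for form, value in scaled.items():
    print(f"{form}\t{value:.3f}")
```

With this choice the most frequent form always gets 1.0 and the others are expressed as a fraction of it.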

And then have the option of a special XML export that includes them?

I'd love these numbers on all entries, rather than trying to draw an
arbitrary line between uK, rK, ofk, etc.

If the license does not allow it, does it allow end users to download
the data and do this merge on their own machine?

Darren