Re: [edict-jmdict] Editorial Policy - kana order
> It should perhaps be stored as both most-dotted and least-dotted forms,
> but I think we should only have to submit the most-dotted. Seems like
> it would be easier to enforce consistency this way.
Sounds good, but I think we should also make sure the most common
version is there.
E.g. if a word has versions with 0, 1 and 2 dots, and the 1-dot
version is the most common, it would be good if that was in there too,
and listed first. (There is no algorithmic way to turn a 2-dot word into
a 1-dot word, because nothing tells you which dot to drop.)
(If the 1-dot version is unusual, there is no need to add it, I think.)
But, anyway, I imagine this is a very rare case.
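
To make the dot point concrete, here is a rough Python sketch. It is
only an illustration: the function names and the example word are made
up for this mail and are not part of any JMdict tooling. Stripping every
dot gives exactly one least-dotted form, but asking for a 1-dot form of
a 2-dot word gives two candidates, and the code cannot say which one
people actually write.

```python
from itertools import combinations

DOT = "\u30fb"  # KATAKANA MIDDLE DOT, the "dot" discussed in this thread


def least_dotted(word):
    """Remove every dot; the result is always unique."""
    return word.replace(DOT, "")


def forms_with_n_dots(most_dotted, n):
    """List every form that keeps exactly n of the original dots.

    Hypothetical helper: it only enumerates candidates and cannot
    tell which candidate is the common spelling.
    """
    parts = most_dotted.split(DOT)
    gaps = range(1, len(parts))  # gap i sits just before parts[i]
    forms = []
    for keep in combinations(gaps, n):
        out = parts[0]
        for i in gaps:
            out += (DOT if i in keep else "") + parts[i]
        forms.append(out)
    return forms


# Illustrative 2-dot word (assumed example, not taken from the dictionary)
word = "アイス・クリーム・コーン"
zero_dot = least_dotted(word)
one_dot_candidates = forms_with_n_dots(word, 1)
```

Here zero_dot is a single string, while one_dot_candidates holds two
different spellings, so the common 1-dot form has to come from an
editor rather than from a script.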
Darren
--
Darren Cook, Software Researcher/Developer
http://dcook.org/gobet/ (Shodan Go Bet - who will win?)
http://dcook.org/work/ (About me and my work)
http://dcook.org/blogs.html (My blogs and articles)