Re: [edict-jmdict] Future EDICT/JMdict, etc. maintenance system
[Pawel Szymczykowski (Re: [edict-jmdict] Future EDICT/JMdict, etc. maintenance system) writes:]
>> On 9/1/06, Jim Breen <Jim.Breen@infotech.monash.edu.au> wrote:
>> > I had done a mental MySQL design in which one table could completely
>> > hold about 95% of entries. The rest, i.e. entries with many kanji/kana
>> > variants or lots of glosses, would go into overflows.
>>
>> If you get a chance, could you describe this mental schema in a bit
>> more detail? I think it might help to jumpstart some of the
>> brainstorming on proposed interfaces and how they might relate to the
>> underlying data store.
Well, I had a concept of a basic table as follows:
* entry number
* flag indicating if this entry has been deleted or merged with another
* comment associated with delete/merge
* up to 3 instances of kanji headword, info tags, priority tags
* flag indicating if there are more kanji headwords.
* up to 3 instances of reading text, no-kanji flag, reading restriction,
  reading info and reading priority tags
* flag indicating if there are more readings
* language, dialect & etymology fields
* up to 3 instances of sense, each sense consisting of:
    - kanji restriction
    - reading restriction
    - part-of-speech
    - cross-ref(s)
    - antonym(s)
    - domain/field(s)
    - misc info (tags)
    - comment field
    - up to 5 glosses
* flag indicating if there are more senses
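
To make the layout above a bit more concrete, here is a rough Python sketch of how an entry might be split between the basic table and the overflows. It is only an illustration, not the MySQL design itself: the names Entry, Sense and split_entry are hypothetical, most of the per-field tags are left out, and the handling of senses with more than 5 glosses is an assumption, since the list above does not say where the extra glosses would go.

```python
from dataclasses import dataclass, field
from typing import List

# Limits taken from the list above; everything else is a hypothetical sketch.
MAX_KANJI = 3
MAX_READINGS = 3
MAX_SENSES = 3
MAX_GLOSSES = 5


@dataclass
class Sense:
    glosses: List[str] = field(default_factory=list)
    pos: str = ""  # part-of-speech; other per-sense fields omitted for brevity


@dataclass
class Entry:
    seq: int  # entry number
    kanji: List[str] = field(default_factory=list)
    readings: List[str] = field(default_factory=list)
    senses: List[Sense] = field(default_factory=list)


def split_entry(entry):
    """Return (main_row, overflow) for one entry.

    main_row holds what fits in the basic table plus the "more ..." flags;
    overflow holds whatever did not fit.
    """
    main_row = {
        "seq": entry.seq,
        "kanji": entry.kanji[:MAX_KANJI],
        "more_kanji": len(entry.kanji) > MAX_KANJI,
        "readings": entry.readings[:MAX_READINGS],
        "more_readings": len(entry.readings) > MAX_READINGS,
        "senses": [
            Sense(s.glosses[:MAX_GLOSSES], s.pos)
            for s in entry.senses[:MAX_SENSES]
        ],
        "more_senses": len(entry.senses) > MAX_SENSES,
    }
    overflow = {
        "kanji": entry.kanji[MAX_KANJI:],
        "readings": entry.readings[MAX_READINGS:],
        "senses": entry.senses[MAX_SENSES:],
        # Assumption: extra glosses are kept per sense index in the overflow.
        "glosses": {
            i: s.glosses[MAX_GLOSSES:]
            for i, s in enumerate(entry.senses[:MAX_SENSES])
            if len(s.glosses) > MAX_GLOSSES
        },
    }
    return main_row, overflow
```

An entry with at most 3 kanji headwords, 3 readings and 3 senses of at most 5 glosses each comes back with all three flags false and empty overflow lists, which is the case the single table is meant to cover for most entries.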
At present about 3% of entries have 2 or more senses marked.
Just my thoughts
Jim
--
Jim Breen http://www.csse.monash.edu.au/~jwb/
Clayton School of Information Technology, Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia Fax: +61 3 9905 5146
(Monash Provider No. 00008C) ジム・ブリーン@モナシュ大学