
Re: [edict-jmdict] ID numbers in Edict




On Apr 10, 2010, at 11:34 PM, Jim Breen wrote:

On 11 April 2010 17:55, Paul Blay <blay.paul@googlemail.com> wrote:
> I think I've suggested this before, but how about including the unique
> Entry ID numbers in the Edict2 file? (Or having an Edict3 file if
> necessary)
>
> I think there would be a good 'market' for that addition from all
> those who aren't up to coping with XML but want an easier way of matching up
> Edict Entries between updates than is possible at present.

I could easily do this, as it's currently an option in the utility that makes
the edict2 format. At present it can do it one of two ways:

- simply dumps the sequence number at the end of the entry as though it were
a meaning, e.g. 漢字 [かんじ] /(n) kanji/1001000/
- flags it with "EntL", e.g. 漢字 [かんじ] /(n) kanji/ EntL1001000/ (this is the form
used by WWWJDIC.)
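
For anyone wanting to see what consuming either form might look like, here is a minimal sketch in Python. It assumes the number is always the last slash-delimited field on the line and consists only of digits; the names SEQ_FIELD and split_entry_id are made up for illustration and are not part of any existing tool.

```python
import re

# Trailing sequence-number field in either of the two forms above:
#   .../1001000/      bare number as the last field
#   .../EntL1001000/  number flagged with "EntL"
# Assumption: the ID is a run of digits and is the final field on the line.
# Caveat: in the bare form, a gloss that is itself all digits would be
# mistaken for an ID, which is one reason a flag is helpful.
SEQ_FIELD = re.compile(r"/\s*(?:EntL)?(\d+)/\s*$")


def split_entry_id(line):
    """Return (entry_text, entry_id); entry_id is None if no ID field is found."""
    match = SEQ_FIELD.search(line)
    if match is None:
        return line.rstrip(), None
    # Keep everything up to and including the slash that closes the last gloss.
    return line[: match.start() + 1], int(match.group(1))


if __name__ == "__main__":
    for sample in (
        "漢字 [かんじ] /(n) kanji/1001000/",
        "漢字 [かんじ] /(n) kanji/ EntL1001000/",
        "漢字 [かんじ] /(n) kanji/",
    ):
        print(split_entry_id(sample))
```

The same function handles a file without IDs by returning None, so a reader of the plain format would not break.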

Neither is exactly what a developer or user wants to be hit with unannounced,
but I don't really want to make an "edict3" at this stage.

Maybe I could simply pop the number in: 漢字 [かんじ] / (n) kanji/#1001000/ and deal with the flak (if any). That "#" indicates it's some sort of sequence number. I must say I have no idea who uses the "edict2" file, or for what.

Comments, anyone?

Jim



As long as there is a file with a unique number for each entry, I'm a happy camper. Haven't built Ice Mocha 2 yet, but it's coming. It's coming. I swear it's coming.

That number will make daily updates possible. It's critical.
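
To illustrate why, here is a rough sketch of an ID-keyed update, assuming the "#" form floated above ends up as the last field of each line. The function names and the sample IDs other than 1001000 are invented for the example; this is not how any existing tool works.

```python
import re

# Assumed format: the entry ID is the final field, written as /#<digits>/.
ID_FIELD = re.compile(r"/#(\d+)/\s*$")


def index_by_id(lines):
    """Map entry ID -> entry text (without the ID field); lines lacking an ID are skipped."""
    entries = {}
    for line in lines:
        match = ID_FIELD.search(line)
        if match:
            entries[int(match.group(1))] = line[: match.start() + 1]
    return entries


def diff_entries(old_lines, new_lines):
    """Return sorted lists of (added, removed, changed) entry IDs between two snapshots."""
    old, new = index_by_id(old_lines), index_by_id(new_lines)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return added, removed, changed


if __name__ == "__main__":
    yesterday = [
        "漢字 [かんじ] /(n) kanji/#1001000/",
        "foo /bar/#2/",  # made-up entry
    ]
    today = [
        "漢字 [かんじ] /(n) kanji/Chinese characters/#1001000/",
        "baz /qux/#3/",  # made-up entry
    ]
    print(diff_entries(yesterday, today))
```

Without the number, an edited headword or reading looks like one deletion plus one unrelated addition; with it, the change is matched to the same entry.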