
Re: [edict-jmdict] Examples file duplicate id numbers



G'day,

Tatoeba treats the English and Japanese sentences as first-class objects, but not the tuples that pair them, so it has no IDs for the tuples (although I also wish it did).  So if we are importing them into a database, we need to assign our own ID to each tuple.
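
One way to do that, as a rough sketch only: keep a small mapping table that hands out a single integer per (Japanese ID, English ID) pair and returns the same integer on every later import. The table name `example_pair`, the column names, and the helper names `open_db` and `pair_id` are all made up for illustration, and SQLite is just an assumption about the target database.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) pair-ID table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS example_pair (
            pair_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- our own ID
            jpn_id  INTEGER NOT NULL,                   -- Tatoeba sentence ID
            eng_id  INTEGER NOT NULL,                   -- Tatoeba sentence ID
            UNIQUE (jpn_id, eng_id)
        )
        """
    )
    return conn


def pair_id(conn, jpn_id, eng_id):
    """Return the integer ID for a sentence pair, assigning one if needed."""
    # A pair that is already known is left untouched, so its ID is preserved.
    conn.execute(
        "INSERT OR IGNORE INTO example_pair (jpn_id, eng_id) VALUES (?, ?)",
        (jpn_id, eng_id),
    )
    row = conn.execute(
        "SELECT pair_id FROM example_pair WHERE jpn_id = ? AND eng_id = ?",
        (jpn_id, eng_id),
    ).fetchone()
    return row[0]
```

Other tables can then reference `pair_id` as a foreign key. Re-running the import over an updated examples file keeps the IDs of existing pairs as long as the mapping table itself is kept between imports, and `AUTOINCREMENT` stops SQLite from reusing the ID of a deleted pair.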

On 5 April 2010 09:10, Glenn Maynard <glenn@********> wrote:
 

On Sun, Apr 4, 2010 at 8:56 PM, Francis Bond <bond@********> wrote:
> I would prefer that we use a combination of numbers
> jpn-id:eng-id so that every pair gets a unique ID.

I haven't tried to do this yet, but having a static, unique ID
(preferably a single integer, not a tuple) for each entry is
absolutely critical for importing to a database: it gives a stable
primary key to link with other tables, one that is preserved across
later updates when entries are edited. It also makes it possible to
link entries to each other when updating from one version to the next
(to tell which entries are edits of what). I haven't dealt with this
particular database yet, but it's surprising that it would be missing.

--
Glenn Maynard



--
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University