Re: [edict-jmdict] Keeping track of entries between JMdict releases
Thanks for the answers. I see there are 2192 entries that have been deleted
between my 2015 JMdict version and the latest one. Some of these really
had no place in a dictionary, so they will probably have no impact on my
users. For the others, I will try to match the k_ele and r_ele with other
entries in order to figure out where a given entry has moved.
This won't be perfect, but it should be good enough. Hopefully the most
common entries (which are the ones people are most likely to study) won't
be affected.
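
Here is a minimal sketch of that matching heuristic, in case it helps the discussion. It assumes the standard JMdict layout (entry, ent_seq, k_ele/keb, r_ele/reb). The function names, the scoring weights and the sample entries are all made up for illustration. An entry that had kanji forms must share at least one keb and one reb with its candidate successor; a kana-only entry is matched on reading alone, which is a much weaker signal.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def load_entries(xml_text):
    """Map each ent_seq to a (kanji forms, reading forms) pair of frozensets."""
    entries = {}
    for entry in ET.fromstring(xml_text).iter("entry"):
        kebs = frozenset(e.text for e in entry.iterfind("k_ele/keb"))
        rebs = frozenset(e.text for e in entry.iterfind("r_ele/reb"))
        entries[entry.findtext("ent_seq")] = (kebs, rebs)
    return entries


def find_successors(old, new):
    """For each ent_seq missing from `new`, list likely successors, best first."""
    # Index the new release by reading so candidates can be looked up quickly.
    by_reb = defaultdict(set)
    for seq, (_, rebs) in new.items():
        for reb in rebs:
            by_reb[reb].add(seq)

    moved = {}
    for seq, (kebs, rebs) in old.items():
        if seq in new:
            continue  # the ent_seq survived, nothing to do
        candidates = set()
        for reb in rebs:
            candidates |= by_reb.get(reb, set())
        scored = []
        for cand in candidates:
            new_kebs, new_rebs = new[cand]
            shared_kebs = kebs & new_kebs
            if kebs and not shared_kebs:
                continue  # same reading but different word (homophone)
            # Arbitrary weights: a shared kanji form counts double.
            score = 2 * len(shared_kebs) + len(rebs & new_rebs)
            scored.append((-score, cand))
        moved[seq] = [cand for _, cand in sorted(scored)]
    return moved


# Made-up sample data; real ent_seq values are 7-digit numbers.
OLD_XML = """<JMdict>
<entry><ent_seq>1</ent_seq><k_ele><keb>明日</keb></k_ele><r_ele><reb>あした</reb></r_ele></entry>
<entry><ent_seq>2</ent_seq><k_ele><keb>食べる</keb></k_ele><r_ele><reb>たべる</reb></r_ele></entry>
<entry><ent_seq>3</ent_seq><r_ele><reb>ガラクタ</reb></r_ele></entry>
</JMdict>"""

NEW_XML = """<JMdict>
<entry><ent_seq>1</ent_seq><k_ele><keb>明日</keb></k_ele><r_ele><reb>あした</reb></r_ele></entry>
<entry><ent_seq>10</ent_seq><k_ele><keb>食べる</keb></k_ele><k_ele><keb>喰べる</keb></k_ele><r_ele><reb>たべる</reb></r_ele></entry>
<entry><ent_seq>11</ent_seq><k_ele><keb>太辺留</keb></k_ele><r_ele><reb>たべる</reb></r_ele></entry>
</JMdict>"""

moved = find_successors(load_entries(OLD_XML), load_entries(NEW_XML))
print(moved)
```

An empty candidate list means the entry was most likely deleted outright, and a list with several candidates would need a tie-break or a question to the user.
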
If you know of a better heuristic, please let me know.
Cheers,
Alex.
On Fri, Jan 3, 2020 at 9:39 AM Marcus Richert superbrightfuture@gmail.com
[edict-jmdict] <edict-jmdict@yahoogroups.com> wrote:
>
>
> Alexandre,
>
> Over 40,000 entries have been revised or added since mid-2015. Staying
> with the 2015 version is definitely not doing your users any favors.
>
> On Thu, Jan 2, 2020, 22:58 Alexandre Courbot gnurou@gmail.com
> [edict-jmdict] <edict-jmdict@yahoogroups.com> wrote:
>
>>
>>
>> Hi everyone,
>>
>> This is a question I raised a few years ago, but as far as I know it
>> has not reached a final conclusion yet.
>>
>> I maintain dictionary software that relies on JMdict and allows
>> users to "mark" entries for study and other things. Every time the software
>> is updated (which has not happened in recent years, but I plan on
>> making a new release soon) I like to ship the latest JMdict data with it.
>>
>> Due to JMdict's dynamic nature, entries get merged, split and deleted,
>> which makes keeping track of "studied" entries across versions challenging.
>>
>> My software currently relies on the ent_seq to keep track of studied
>> entries. The challenge is what to do when, after an update, a given ent_seq
>> is not present anymore.
>>
>> Older versions (around 2010 I think) used to have a comment indicating
>> where a deleted entry had been merged. After that comment was removed, I
>> started a discussion on this list (thanks Yahoo for purging the archives)
>> about this, and an "audit" tag was proposed to keep track of entry status,
>> but AFAICT it does not exist in recent dictionary versions.
>>
>> So my question is, do we now have a reliable way to track entries that
>> get merged and deleted? It seems to be a pretty reasonable thing to have. I
>> am trying to consider other ways to identify entries, maybe by serializing
>> their k_ele and r_ele, but I don't expect this method to be reliable enough.
>>
>> Until I can find a good solution, I'm afraid I will have to keep using
>> the same JMdict file as my latest 2015 release to avoid troubling users.
>>
>> Thanks,
>> Alex.
>>
>>
>