
Re: [edict-jmdict] PoS vs-i issues



On 07/16/2010 08:34 PM, Jim Breen wrote:
> On 17 July 2010 02:53, Jean-Luc Léger <reiga@dspnet.fr.eu.org> wrote:
>> By the way, I see many verbal expressions with only (exp). Do we agree
>> that they should also have a vxxx tag?
>> If so, I will build a script to extract every one that needs a tag and
>> which tag it needs (taken from the tag on the entry for the verb alone).
>> That could be a good occasion to think about a bulk updater for JMDictDB,
>> if you don't already have one :D
> 
> If anyone can do that sort of extraction, it's JL, who has been a quiet
> and significant contributor in the background, and has scripted some
> significant checks and cleanups of errant entries.
> 
> Something like:
> 
> nnnnnnn  v5k
> mmmmm v1
> ...
> 
> would be a great help. We are not quite into bulk updating scripts
> yet, but they are certainly feasible.

Depending on the nature of the update, a simple (or not so simple)
SQL statement is often all that is needed.  When more logic is needed,
the jmdictdb Python API is intended to be easy to use, though API
documentation is still a problem.  A list like the one Jim proposed
above could easily be converted into the SQL insert commands needed to
perform the update.  The only complicating factor I can think of is
that some of the entries may have pending edits, but that could
probably be fixed up before doing the update.  A database usually
makes this kind of thing pretty easy.
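
As a rough sketch of that conversion (not the actual jmdictdb code), 
something like the following Python would turn a "seq tag" list into 
insert statements.  The table and column names used here (entr.id, 
entr.seq, pos.entr, pos.sens, pos.kw, kwpos.id, kwpos.kw) and the choice 
of tagging sense 1 are assumptions made for illustration and would need 
to be checked against the real schema.  Entries with pending edits are 
not handled.

```python
import re

# Accept only plain ASCII sequence numbers and simple tag names, so the
# values can be placed in the generated SQL text without quoting problems.
SEQ_RE = re.compile(r"[0-9]+")
TAG_RE = re.compile(r"[a-z0-9-]+")


def parse_tag_list(text):
    """Parse lines of the form '<seq> <tag>' into (seq, tag) pairs.

    Blank lines, '...' and anything else that does not look like a
    sequence number followed by a tag are skipped.
    """
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        seq, tag = fields
        if SEQ_RE.fullmatch(seq) and TAG_RE.fullmatch(tag):
            pairs.append((int(seq), tag))
    return pairs


def to_insert_statements(pairs, sens=1):
    """Build one INSERT per (seq, tag) pair.

    Hypothetical schema: entr(id, seq), kwpos(id, kw), pos(entr, sens, kw).
    The tag is attached to sense number `sens` of the entry.
    """
    stmts = []
    for seq, tag in pairs:
        stmts.append(
            "INSERT INTO pos (entr, sens, kw) "
            f"SELECT e.id, {int(sens)}, k.id FROM entr e, kwpos k "
            f"WHERE e.seq = {seq} AND k.kw = '{tag}';"
        )
    return stmts


if __name__ == "__main__":
    sample = "1234567  v5k\n7654321 v1\n...\n"
    for stmt in to_insert_statements(parse_tag_list(sample)):
        print(stmt)
```

The sequence numbers in the sample are made up.  Generating the 
statements as text, rather than executing them directly, lets the list 
be reviewed before anything is run against the database.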