
Re: [edict-jmdict] Abbreviations (Was: Combining entries)



On 07/18/2010 08:56 PM, Jim Breen wrote:
> 2010/7/19 Stuart McGraw <smcg4191@frii.com>:
>[...]
> That's a fair point.  Also I've been looking at the places where this may apply.
> There are lots of entries like:
> 
> カンペ /(n) (abbr) (See カンニングペーパー) large sketchbook used .....
> 
> It should be quite feasible to extract a set of these and create SQL
> sequences to convert them into a new style.

Many of them could be converted that way, but entries with multiple
xrefs might require some manual intervention.  For example:

jmdict 1612820   Active   {id:59187}
お早う [ ichi1] ; 御早う
【 おはよう [ichi1] 】
1. [int] [uk,abbr]
  《from お早く》
  ▶ Good morning
  Cross references:
    see: 1002340 お早うございます 1.good morning
    see: 1404975 早い 1.fast;quick;hasty;brisk
    see: ...
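
As a rough illustration of how the automatic cases might be separated
from the ones needing a human, here is a minimal sketch in Python.  It
assumes EDICT-style text lines in which xrefs appear as "(See ...)"
markers, possibly with comma-separated targets; the function name, the
regular expression and the "auto"/"manual" labels are all hypothetical,
not part of any existing tool.

```python
import re

# Assumed marker syntax: "(See target)" or "(See target1,target2)".
SEE_RE = re.compile(r"\(See ([^)]+)\)")

def classify_abbr_entry(line):
    """Return (status, targets) for one EDICT-style line.

    status is "skip" (no abbr tag), "auto" (exactly one xref target),
    or "manual" (zero or several targets, so the abbreviated-from
    target cannot be chosen mechanically).
    """
    if "(abbr)" not in line:
        return "skip", []
    targets = []
    for group in SEE_RE.findall(line):
        # One marker may hold several comma-separated targets.
        targets.extend(t.strip() for t in group.split(","))
    status = "auto" if len(targets) == 1 else "manual"
    return status, targets

# The example line quoted earlier in this message.
print(classify_abbr_entry(
    "カンペ /(n) (abbr) (See カンニングペーパー) large sketchbook used ....."))
```

Entries reported as "auto" could be fed to the SQL conversion; the
"manual" ones, like the お早う entry above, would go on a review list.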

>[...]
>> Rather than adding a new element that will later need to be changed
>> again to the more general
>>  <xref type="abbr" seq="nnnnnnn">ナニナニ</xref>
>> (or similar) form, ISTM that it would make sense to make the change
>> to the latter form now.  Either all xrefs could be changed to this
>> form now, or for backward compatibility, it could be used only for
>> the new abbr xrefs with <see> and <ant> remaining but growing a
>> seq="nnnnnnn" attribute.  (I believe that the common convention in
>> the xml/html world of ignoring unknown attributes would cause this
>> change to introduce at most only a very small amount of backward
>> incompatibility.)
> 
> That makes sense to me. So hold off until JMdict moves to a revised
> DTD and set of entities. (I quite agree about getting the sequence number
> explicitly into the xrefs.)

A couple of months ago there was some discussion here about a major revamp
of the DTD in the thread "Changing entities to attribute values", started
by Glenn Maynard in January
  http://tech.groups.yahoo.com/group/edict-jmdict/message/3561
and revived by you in June
  http://tech.groups.yahoo.com/group/edict-jmdict/message/3753
Is that the revision you are thinking of?  That seems like a pretty big
undertaking, which I presume means it will be a long time before it
comes to pass.

My point about the xref change is that it is a pretty minor change (or
so it seems to this non-XML-guru) that would be unlikely to have a big
impact on JMdict users (far less, I think, than the lsource/dialect change
of a year or two ago).  So is it necessary to put it off, especially as
the need for abbr xrefs seems to be significant?
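
To illustrate why I think the impact would be small, here is a minimal
sketch in Python using the standard xml.etree.ElementTree module.  The
XML fragment is hypothetical: the "type" and "seq" attributes follow the
form proposed above and are not in the current DTD, and the helper names
are made up for the example.  A consumer that reads only the element
text sees the same values whether or not the new attributes are present.
(A validating parser would of course still need the updated DTD.)

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one xref in the proposed new form, one in
# the existing attribute-less form.
SAMPLE = """<sense>
<xref type="abbr" seq="1002340">お早うございます</xref>
<xref>早い</xref>
<gloss>Good morning</gloss>
</sense>"""

def xref_texts(sense):
    """Old-style consumer: reads only the element text."""
    return [x.text for x in sense.findall("xref")]

def xref_details(sense):
    """New-style consumer: also reads the optional attributes.

    Missing attributes come back as None, so old-form xrefs still work.
    """
    return [(x.get("type"), x.get("seq"), x.text)
            for x in sense.findall("xref")]

sense = ET.fromstring(SAMPLE)
print(xref_texts(sense))
print(xref_details(sense))
```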