Re: [edict-jmdict] Abbreviations (Was: Combining entries)
On 07/18/2010 08:56 PM, Jim Breen wrote:
> 2010/7/19 Stuart McGraw <smcg4191@frii.com>:
>[...]
> That's a fair point. Also I've been looking at the places where this may apply.
> There are lots of entries like:
>
> カンペ /(n) (abbr) (See カンニングペーパー) large sketchbook used .....
>
> It should be quite feasible to extract a set of these and create SQL
> sequences to convert them into a new style.
A lot could be, but the ones with multiple xrefs might require some
manual intervention? For example:
jmdict 1612820 Active {id:59187}
お早う [ ichi1] ; 御早う
【 おはよう [ichi1] 】
1. [int] [uk,abbr]
《from お早く》
▶ Good morning
Cross references:
see: 1002340 お早うございます 1.good morning
see: 1404975 早い 1.fast;quick;hasty;brisk
see: ...
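To get a feel for how many entries fall into each group, something like
the following rough sketch could be run over the EDICT-format file. It is
only an illustration, not a proposed conversion tool: the function name
classify() and the two regexes are my own invention, and it assumes that
xrefs always appear as "(See ...)" with multiple targets either repeated
or comma-separated, which I have not checked against the whole file.

```python
import re

# Assumed patterns for the EDICT text layout; not verified on the full file.
ABBR_RE = re.compile(r"\(abbr\)")
SEE_RE = re.compile(r"\(See ([^)]+)\)")

def classify(line):
    """Return 'auto', 'manual', or None for one EDICT-format line.

    None   -> not tagged (abbr), nothing to convert
    auto   -> exactly one xref target, a scripted conversion looks safe
    manual -> zero or several xref targets, someone should look at it
    """
    if not ABBR_RE.search(line):
        return None
    # Collect every target, whether written as "(See A) (See B)"
    # or as "(See A,B)" (the latter form is an assumption).
    targets = [t.strip()
               for group in SEE_RE.findall(line)
               for t in group.split(",")
               if t.strip()]
    return "auto" if len(targets) == 1 else "manual"

single = "カンペ /(n) (abbr) (See カンニングペーパー) large sketchbook used ...../"
# Hypothetical EDICT rendering of the お早う entry shown above.
multi = "お早う /(int) (uk) (abbr) (See お早うございます) (See 早い) Good morning/"
plain = "早い /(adj-i) fast/quick/hasty/brisk/"

results = [classify(single), classify(multi), classify(plain)]
```

Entries that come back as "manual" are the ones where a script cannot
tell which of the xrefs is the full form of the abbreviation.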
>[...]
>> Rather than adding a new element that will later need to be changed
>> again to the more general
>> <xref type="abbr" seq="nnnnnnn">ナニナニ</xref>
>> (or similar) form, ISTM that it would make sense to make the change
>> to the latter form now. Either all xrefs could be changed to this
>> form now, or for backward compatibility, it could be used only for
>> the new abbr xrefs with <see> and <ant> remaining but growing a
>> seq="nnnnnnn" attribute. (I believe that the common convention in
>> the xml/html world of ignoring unknown attributes would cause this
>> change to introduce at most only a very small amount of backward
>> incompatibility.)
>
> That makes sense to me. So hold off until JMdict moves to a revised
> DTD and set of entities. (I quite agree about getting the sequence number
> explicitly into the xrefs.)
A couple of months ago there was some discussion here about a major revamp
of the DTD in the thread "Changing entities to attribute values", started
by Glenn Maynard in January
http://tech.groups.yahoo.com/group/edict-jmdict/message/3561
and revived by you in June
http://tech.groups.yahoo.com/group/edict-jmdict/message/3753
Is that the revision you are thinking of? That seems like a pretty big
undertaking, which I presume means a long time before it comes to pass.
My point about the xrefs change is that it is a pretty minor change (or
so it seems to this non-XML-guru) that would be unlikely to have a big
impact on JMdict users (far less, I think, than the lsource/dialect change
of a year or two ago). So is it necessary to put it off, especially as
the need for abbr xrefs seems to be significant?
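
To illustrate why I think the impact would be small, here is a rough
sketch using Python's standard xml.etree.ElementTree (a non-validating
parser). The element and attribute names (<see>, <xref type=...>, seq=)
are just the ones floated in this thread, not anything taken from the
current DTD, and the two reader functions are hypothetical stand-ins for
an existing application and an updated one.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: an old-style <see> that has grown a seq
# attribute, plus a new-style abbr xref. Seq number is from the
# お早う example earlier in this message.
SAMPLE = """<sense>
  <see seq="1002340">お早うございます</see>
  <xref type="abbr" seq="1002340">お早うございます</xref>
</sense>"""

def old_style_xrefs(sense):
    """Stand-in for an existing reader: knows only <see>/<ant> text."""
    return [(el.tag, el.text) for el in sense if el.tag in ("see", "ant")]

def new_style_xrefs(sense):
    """Stand-in for an updated reader: also uses seq and <xref type=...>."""
    out = []
    for el in sense:
        if el.tag in ("see", "ant"):
            out.append((el.tag, el.get("seq"), el.text))
        elif el.tag == "xref":
            out.append((el.get("type"), el.get("seq"), el.text))
    return out

sense = ET.fromstring(SAMPLE)
old = old_style_xrefs(sense)   # the extra attribute is simply never looked at
new = new_style_xrefs(sense)
```

The old-style reader never asks for the seq attribute and skips the
element it does not know, so it keeps working unchanged. The caveat is
that a validating parser run against the old DTD would presumably still
object to the undeclared attribute and element.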