
Re: [edict-jmdict] Entry-wide notes



Thinking out loud about the XML options for entry-wide information field(s), a couple of approaches come to mind.

- we could have a single <info> element containing a set of child elements of different types, e.g. <lit-trans>, <deriv>, etc.

- there could be multiple <info> elements, with an attribute indicating the type of each, e.g. <info i-type="lit">, <info i-type="deriv">, etc.

I think the latter is the more modern approach. 
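
To make the two options concrete, here is a rough Python sketch of how a downstream consumer might read each layout. The element names <lit-trans> and <deriv> and the i-type values are just the examples from above, the note text is placeholder data, and the function names are hypothetical. None of this is a settled format.

```python
import xml.etree.ElementTree as ET

# Option 1 (assumed layout): one <info> wrapper holding differently named children.
OPTION_1 = """
<entry>
  <info>
    <lit-trans>placeholder literal translation</lit-trans>
    <deriv>placeholder derivation note</deriv>
  </info>
</entry>
"""

# Option 2 (assumed layout): repeated <info> elements, each typed by an attribute.
OPTION_2 = """
<entry>
  <info i-type="lit">placeholder literal translation</info>
  <info i-type="deriv">placeholder derivation note</info>
</entry>
"""


def read_option_1(entry):
    """Return (type, text) pairs, taking the type from each child's tag name."""
    info = entry.find("info")
    if info is None:
        return []
    return [(child.tag, child.text) for child in info]


def read_option_2(entry):
    """Return (type, text) pairs, taking the type from the i-type attribute."""
    return [(info.get("i-type"), info.text) for info in entry.findall("info")]


notes_1 = read_option_1(ET.fromstring(OPTION_1))
notes_2 = read_option_2(ET.fromstring(OPTION_2))
```

In this sketch the second layout lets a new note type be added as a new attribute value without introducing a new element name, whereas the first needs a new element for each type.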

What I have no feel for is whether there needs to be any internal structure for the information in these elements.

Anyway, just thinking.

Jim


On Sat, 6 Jul. 2019, 7:52 am Jim Breen, <jimbreen@*********> wrote:
In principle an entry-wide note/comment would be a good thing.

Since it would be a relatively significant change to the distributed dictionary file, and quite a few downstream systems would probably need to be updated, it would need some care as well as advance notice.

If the DTD, etc. is to be modified, it would be an appropriate time to do any other changes that we think would be useful. Now is the time to raise them.

Jim


On Fri, 5 Jul. 2019, 1:20 am Marcus Richert superbrightfuture@********* [edict-jmdict], <edict-jmdict@***************> wrote:


Would it be hard to implement entry-wide notes?

I think this has been brought up briefly before, but I wanted to give this issue its own topic. In entry 2573910, I've suggested adding a note saying the expressions とんでもありません and とんでもございません are considered incorrect and that a different form is preferred. This note is about the construction itself (just like some of our partial ateji notes) and therefore applies to all 3 senses, despite sitting inside just the first one. Normally we don't duplicate these notes and instead hope the intended message is obvious, but in many cases, like this one, I think an unfortunate amount of ambiguity is left. I think there are plenty of entries where this holds true, especially when the note is a comment on the reading or the kanji.

The XML code as I understand it is basically:

<entry>
<k_ele>
kanji stuff
</k_ele>
<r_ele>
reading stuff
</r_ele>
<sense>
<s_inf>contents of note</s_inf>
<gloss>contents of gloss</gloss>
</sense>
</entry>

Would it be so hard to allow <s_inf> tags to also appear at the parent level without throwing a syntax error, i.e. directly inside <entry> instead of only inside <sense>? It wouldn't be hard to deal with from the editor side, if you could just put the note above the first sense in the "Meanings" box.
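
As a rough illustration of what a consumer would then have to handle, here is a minimal Python sketch. It assumes an <s_inf> placed directly under <entry>, which is only the proposal above and not something the format is known to allow; the sample text and the function name split_notes are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry: one <s_inf> directly under <entry> (the proposed
# entry-wide note) and one inside <sense> (the existing per-sense note).
SAMPLE = """
<entry>
  <k_ele>kanji stuff</k_ele>
  <r_ele>reading stuff</r_ele>
  <s_inf>note applying to the whole entry</s_inf>
  <sense>
    <s_inf>note applying to this sense only</s_inf>
    <gloss>contents of gloss</gloss>
  </sense>
</entry>
"""


def split_notes(entry):
    """Separate entry-wide notes from per-sense notes.

    findall("s_inf") matches direct children only, so notes nested
    inside <sense> are not picked up as entry-wide notes.
    """
    entry_notes = [e.text for e in entry.findall("s_inf")]
    sense_notes = [
        [e.text for e in sense.findall("s_inf")]
        for sense in entry.findall("sense")
    ]
    return entry_notes, sense_notes


entry_notes, sense_notes = split_notes(ET.fromstring(SAMPLE))
```

A validating parser would still need the DTD itself to permit the new position, so this only covers the reading side.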

Best,
Marcus