Errors in cross-references
Hi again,
while I was trying to resolve cross-references in JMdict, I found three
references referring to JMnedict entries:
1. <xref>フィレオフィッシュ</xref> in seq 2137630
2. <xref>NHK</xref> in seq 2176930
3. <xref>タミフル</xref> in seq 2648970
(That would be fine if there were a way to mark them as referring to
entries in another dictionary. However, as there are only three such
cross-references, it seems more likely that they are simply errors.)
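A check of this kind can be sketched roughly as follows. It is only an
illustration under assumptions: the inline sample stands in for the real
JMdict file, the seq numbers in it are made up, and `unresolved_xrefs` is a
hypothetical helper name. Only the first component of each xref is looked up.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for JMdict; seq numbers here are invented for the example.
SAMPLE = """<JMdict>
<entry><ent_seq>9000001</ent_seq>
  <r_ele><reb>テスト</reb></r_ele>
  <sense><xref>試験</xref><xref>NHK</xref><gloss>test</gloss></sense>
</entry>
<entry><ent_seq>9000002</ent_seq>
  <k_ele><keb>試験</keb></k_ele>
  <r_ele><reb>しけん</reb></r_ele>
  <sense><gloss>examination</gloss></sense>
</entry>
</JMdict>"""

def unresolved_xrefs(root):
    """Return (seq, xref text) pairs whose headword matches no keb or reb."""
    headwords = {el.text for tag in ("keb", "reb") for el in root.iter(tag)}
    missing = []
    for entry in root.iter("entry"):
        seq = entry.findtext("ent_seq")
        for xref in entry.iter("xref"):
            # Only the first component is checked; reading and sense are ignored.
            target = xref.text.split("・")[0]
            if target not in headwords:
                missing.append((seq, xref.text))
    return missing

print(unresolved_xrefs(ET.fromstring(SAMPLE)))
```

In the sample, the reference to NHK is reported because no entry in the same
file carries it as a keb or reb.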
Additionally, there are two references that appear twice within
the same entry and sense:
1. <xref>豚トロ・とんトロ・1</xref> appears twice in seq
2677780 (and the sense number is unnecessary)
2. <xref>無線呼出符号・1</xref> appears twice in seq 2827567
(and the sense number is unnecessary)
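Duplicates like these can be found by counting xref texts per sense. The
sketch below assumes the usual entry/sense/xref nesting; the sample entry and
the name `duplicate_xrefs` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample entry: sense 1 repeats the same xref, sense 2 does not.
SAMPLE = """<JMdict>
<entry><ent_seq>9000001</ent_seq>
  <r_ele><reb>れい</reb></r_ele>
  <sense>
    <xref>例・1</xref>
    <xref>例・1</xref>
    <xref>見本</xref>
    <gloss>example</gloss>
  </sense>
  <sense><xref>例・1</xref><gloss>instance</gloss></sense>
</entry>
</JMdict>"""

def duplicate_xrefs(root):
    """Return (seq, sense number, xref text) for xrefs repeated in one sense."""
    found = []
    for entry in root.iter("entry"):
        seq = entry.findtext("ent_seq")
        for number, sense in enumerate(entry.findall("sense"), start=1):
            counts = Counter(x.text for x in sense.findall("xref"))
            found.extend((seq, number, text)
                         for text, n in counts.items() if n > 1)
    return found

print(duplicate_xrefs(ET.fromstring(SAMPLE)))
```

The same xref appearing in two different senses of one entry is not counted,
since that can be legitimate.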
I have also noticed that there are about 4000 cross-references that
specify a reading although the target entry has only one.
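One possible way to detect such cases is sketched below. It is an assumption
about how "redundant" might be defined, not a description of how the figure
above was obtained: a reading is flagged when the kanji form occurs in exactly
one entry and that entry has exactly that one reading. The sample data and the
name `redundant_readings` are invented.

```python
import xml.etree.ElementTree as ET

# Invented sample: the target entry has a single reading.
SAMPLE = """<JMdict>
<entry><ent_seq>9000001</ent_seq>
  <k_ele><keb>試験</keb></k_ele>
  <r_ele><reb>しけん</reb></r_ele>
  <sense><gloss>examination</gloss></sense>
</entry>
<entry><ent_seq>9000002</ent_seq>
  <r_ele><reb>テスト</reb></r_ele>
  <sense><xref>試験・しけん</xref><gloss>test</gloss></sense>
</entry>
</JMdict>"""

def redundant_readings(root):
    """Return (seq, xref text) where the named reading is the target's only one."""
    readings_by_keb = {}
    for entry in root.iter("entry"):
        rebs = [r.text for r in entry.iter("reb")]
        for keb in entry.iter("keb"):
            readings_by_keb.setdefault(keb.text, []).append(rebs)
    found = []
    for entry in root.iter("entry"):
        seq = entry.findtext("ent_seq")
        for xref in entry.iter("xref"):
            # Drop a trailing sense number, keep kanji and reading parts.
            parts = [p for p in xref.text.split("・") if not p.isdigit()]
            if len(parts) != 2:
                continue
            keb, reb = parts
            candidates = readings_by_keb.get(keb, [])
            # Redundant only if the keb is unambiguous and has a single reading.
            if len(candidates) == 1 and candidates[0] == [reb]:
                found.append((seq, xref.text))
    return found

print(redundant_readings(ET.fromstring(SAMPLE)))
```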
Last but not least, in the references to the following entries, the use
of the centre-dot can be misleading for the parsing software (and the DTD
disallows it: "The target keb or reb must not contain a centre-dot."):
1. シルキー・シャーク
2. シンガポール・スリング
3. カーゴ・スリング
4. ベイビー・スリング
5. マヌカ・ハニー
6. タックス・ヘイブン
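To illustrate why these are a problem, here is a minimal sketch of a parser
that splits the xref text on the centre-dot and assigns the parts by position.
The function name `parse_xref` and the headword/reading/sense tuple are
assumptions made for the example, not part of any existing tool.

```python
def parse_xref(text):
    """Split xref text into (headword, reading, sense) by position alone."""
    parts = text.split("・")
    sense = None
    # A trailing all-digit component is taken to be a sense number.
    if len(parts) > 1 and parts[-1].isdigit():
        sense = int(parts.pop())
    headword = parts[0]
    reading = parts[1] if len(parts) > 1 else None
    return headword, reading, sense

# A well-formed reference is split as intended.
print(parse_xref("豚トロ・とんトロ・1"))
# A target containing a centre-dot is wrongly split into headword and reading.
print(parse_xref("シルキー・シャーク"))
```

The second call returns シルキー as the headword and シャーク as its reading,
so a lookup of the intended target fails.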
Maybe the best solution would be for the xref to contain only
XML-structured information (seq, type and an optional restriction to
particular senses, kanji or readings). The restriction to kanji/readings
could be expressed in much the same way as senses can now be restricted
using stagr/stagk, along these lines:
<!ELEMENT xref ((xtagk*, xtagr*)|xtags*)>
-- or, if a single kanji/reading/sense restriction is enough:
<!ELEMENT xref ((xtagk, xtagr?)|xtagr|xtags)?>
<!ATTLIST xref seq CDATA #REQUIRED>
<!ATTLIST xref type CDATA #IMPLIED>
<!-- Type of cross-reference, implied value "see". -->
<!ELEMENT xtagk (#PCDATA)>
<!ELEMENT xtagr (#PCDATA)>
<!-- These elements, if present, indicate that the cross-reference is
restricted to the lexeme represented by the keb and/or reb of the entry
identified by xref's seq attribute. -->
<!ELEMENT xtags (#PCDATA)>
<!-- These elements, if present, indicate that the cross-reference is
restricted to particular senses (represented by their numbers) of the
entry identified by xref's seq attribute. -->
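With a structure like this, a consumer would no longer need to split any text.
The sketch below reads two hypothetical instances of the proposed element; the
seq values, the type value "ant" and the name `read_xref` are all made up for
illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical instances of the proposed format; seq values are invented.
SAMPLE = """<sense>
  <xref seq="9000001"><xtagk>豚トロ</xtagk><xtagr>とんトロ</xtagr></xref>
  <xref seq="9000002" type="ant"><xtags>1</xtags><xtags>3</xtags></xref>
</sense>"""

def read_xref(elem):
    """Turn one proposed-style xref element into a plain dictionary."""
    return {
        "seq": elem.get("seq"),
        # A missing type attribute falls back to the implied value "see".
        "type": elem.get("type", "see"),
        "kanji": [e.text for e in elem.findall("xtagk")],
        "readings": [e.text for e in elem.findall("xtagr")],
        "senses": [int(e.text) for e in elem.findall("xtags")],
    }

refs = [read_xref(x) for x in ET.fromstring(SAMPLE).findall("xref")]
for ref in refs:
    print(ref)
```

Because the target is identified by seq alone, a centre-dot inside a keb or
reb no longer affects how the reference is resolved.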
--
Adam Nohejl