
Meta Information in Examples



As you have probably noticed, the English sentences in the
Tanaka Corpus are accompanied by a number of notes given in square brackets.

I have typed up an introduction to their present usage in the JMDICT wiki:
http://www.edrdg.org/wiki/index.php/Talk:Tanaka_Corpus#Meta_information_in_the_corpus

I have suggested some changes to the way these tags are handled to
Jim.  This post is to let people know what's happening and
to allow comments / discussion before anything goes in.

At present, all the tags go at the end of the English line.
I have suggested moving those that apply to the Japanese text onto the
end of the Japanese line.  (All those [M] and [F] tags, for a start.)
A number of tags apply to both languages, and so I think they should go
on both sides.
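
To make the proposed move concrete, here is a rough sketch in Python.
It is only an illustration, not how the corpus is actually processed:
the function names, the assumption that tags sit as a trailing run of
square-bracket notes, the made-up sentence pair, and the idea of passing
in the sets of "Japanese" and "shared" tags are all mine.

```python
import re

# Assumed layout: tags are a trailing run of [..] notes at the end of a line.
TRAILING_TAGS = re.compile(r"(?:\s*\[[^\[\]]+\])+\s*$")
TAG = re.compile(r"\[([^\[\]]+)\]")


def split_tags(line):
    """Split a line into (text, [tag names]) using the trailing [..] notes."""
    match = TRAILING_TAGS.search(line)
    if not match:
        return line.rstrip(), []
    return line[:match.start()].rstrip(), TAG.findall(match.group())


def attach(text, tags):
    """Put the given tags back on the end of a line, space separated."""
    return " ".join([text] + [f"[{t}]" for t in tags])


def redistribute(japanese, english, japanese_tags, shared_tags):
    """Move tags from the end of the English line to where they apply.

    japanese_tags: tags that describe the Japanese text only (e.g. M, F).
    shared_tags:   tags that describe both lines (hypothetical set).
    Any other tag is left on the English line.
    """
    english_text, tags = split_tags(english)
    jp_side = [t for t in tags if t in japanese_tags or t in shared_tags]
    en_side = [t for t in tags if t not in japanese_tags]
    return attach(japanese, jp_side), attach(english_text, en_side)


# Made-up sentence pair, for illustration only.
jp, en = redistribute("彼は学生です。", "He is a student. [M]", {"M", "F"}, set())
print(jp)
print(en)
```

Which tags count as Japanese-only, English-only or shared would need to
be decided tag by tag; the sketch just takes that decision as input.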

Very few tags apply to the English line only (as we haven't really been
thinking about that), but if this goes ahead you might expect a gradual
increase in them, for example where the English line contains archaic
or old-fashioned English.  Remember that the Tanaka Corpus was originally
aimed more at speakers of Japanese learning English than vice versa,
so this could be an opportunity to make it more user-friendly for them.

Secondly, I have suggested that the meta information could be left off
the in-line examples, to save screen space and reduce clutter.

Lastly, I think it might be good if the customization screen
included an option to show / hide meta information.
Are there people out there who find the tags distracting / annoying?
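
For what it's worth, a show / hide option could be as simple as the
following Python sketch.  The function name and the show_meta flag are
hypothetical, and it assumes the tags are a trailing run of
square-bracket notes at the end of a line; the example sentence is
made up.

```python
import re

# Assumed layout: tags are a trailing run of [..] notes at the end of a line.
TRAILING_TAGS = re.compile(r"(?:\s*\[[^\[\]]+\])+\s*$")


def render(line, show_meta=True):
    """Return the line as displayed, with or without its trailing tags."""
    if show_meta:
        return line
    return TRAILING_TAGS.sub("", line).rstrip()


# Made-up example sentence, for illustration only.
print(render("He is a student. [M]", show_meta=True))
print(render("He is a student. [M]", show_meta=False))
```

The same function, called with show_meta=False, would also cover
leaving the tags off the in-line examples.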