
Re: [edict-jmdict] Re: EDICT derriviation



Jim et al.,

-On [20100115 10:06], Jim Breen (jimbreen@gmail.com) wrote:
>> Its purpose will be to include a Polish translation and probably make it
>> more user friendly. There is actually a better way than using XML. It's
>> called JSON (JavaScript Object Notation - http://www.json.org/) - a
>> light-weight XML alternative with bidirectional compatibility. That way the
>> size of the file, and its compatibility (though XML compatibility and
>> flexibility are already VERY high) could be optimized.

The thing is that the dictionary data files are not meant to be user
friendly. They're (intermediate) data files and as such intended for machine
reading and processing. In this capacity XML is one of the best formats
available.

>I don't know much about JSON, but I am aware that it doesn't have the
>richness of XML in handling complex data structures.

Correct.

>There is nothing to stop you passing the JMdict file through a filter
>that turned <gloss xml:lang="foo">xxxxx</gloss> into
><gloss><foo>xxxxx</foo></gloss> before crunching it into JSON.

Converting XML data into JSON is trivial. As you said, there is nothing
stopping you from converting

<gloss xml:lang="de">Wort</gloss>
<gloss xml:lang="es">palabra</gloss>

into something like

{
  "gloss": {
    "de": "Wort",
    "es": "palabra"
  }
}
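
For what it's worth, the conversion above could be sketched in a few lines
of Python using only the standard library. This is a rough illustration, not
a finished converter: the function name glosses_to_json, the <sense> wrapper
around the sample and the "eng" fallback for glosses without an xml:lang
attribute are all assumptions made for the example.

```python
import json
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under its namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def glosses_to_json(xml_fragment, default_lang="eng"):
    """Collect <gloss> elements into a {"gloss": {lang: text}} JSON string.

    default_lang is an assumed fallback for glosses lacking xml:lang.
    """
    root = ET.fromstring(xml_fragment)
    glosses = {}
    for gloss in root.iter("gloss"):
        lang = gloss.get(XML_LANG, default_lang)
        # A later gloss in the same language overwrites an earlier one here.
        glosses[lang] = gloss.text
    return json.dumps({"gloss": glosses}, ensure_ascii=False, indent=2)


# Hypothetical wrapper element so the fragment has a single root.
sample = """<sense>
<gloss xml:lang="de">Wort</gloss>
<gloss xml:lang="es">palabra</gloss>
</sense>"""

print(glosses_to_json(sample))
```

Note that an entry with several glosses in the same language would need a
list per language rather than a single string.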

-- 
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
Care-charmer Sleep, son of the sable Night, Brother to Death, in silent
darkness born...