
Re: [edict-jmdict] RESTful interface for wwwjdic.com

To: edict-jmdict@***************
Subject: Re: [edict-jmdict] RESTful interface for wwwjdic.com
From: Jim Breen <jimbreen@*********>
Date: Sun, 24 Jul 2011 10:51:48 +1000

Hi Mark (and everyone)

"techy nature" warning noted, and there are rather techy bits in the reply.

On 24 July 2011 09:13, Mark Burns <markthedeveloper@googlemail.com> wrote:
> I've been on this list for sometime, but had my messages filtered in gmail so not really kept up-to-date with the progress of things.
> This is a rather technical topic, so be pre-warned for those of you not of a techy nature.

Just what I need on a wet Sunday morning.

> I did a quick search of the list and didn't find anything about this topic though. So here goes.
> One thing I notice when using wwwjdic.com is that it doesn't seem to follow the principles of RESTful design.
> Wikipedia:
> http://en.wikipedia.org/wiki/Representational_State_Transfer

It sure doesn't. Not surprising really - wwwjdic began in 1998, two years before
Roy Fielding proposed RESTful designs.

> and from Fielding's research:
> http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm

> I notice that firstly we get a redirect, then secondly, the requests are POSTed to the server.

The redirect is because wwwjdic.com is a domain name (actually owned by a
friend, who set up the redirect).

> I think that a nicer architecture would be a GET on a search result resource. Which would correspond to showing that resource.

I agree. The history of it goes like this:

- when I first decided to throw together a WWW-based dictionary system (there'd
been two before, including Jeffrey Friedl's Perl one, which William Maton looks
after now) I hunted for advice on how to do it. Such information was sparse back
then, but a couple of sources advised using POST rather than GET because some
browsers had quite severe limits on the overall length of URL strings.
So that's what I did.
- I quickly added an API-like "backdoor" which does use GET, e.g.
http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwjdic.cgi?1MUJ%E8%87%AA%E6%83%9A%E3%82%8C%E3%82%8B
- I eventually realized that the POST/GET decision was not a great one, and
that I should try and fix it up, but I have never got around to it. I'd need to
do it in a way that didn't upset the backdoor method, as there is a heap of
sites and smartphone apps out there using it.

> What this would mean for the site is that you can just go to for example.
> wwwjdic.com/魚 or wwwjdic.com/fish

Um, er...

> Or if you prefer, wwwjdic.com/search/魚

Even that has problems. Search what? The kanji dictionary? One of the 26
dictionaries available? Words starting-with/containing?  Coded how?

Something more workable would be: wwwjdic.com/?key=魚&st=1&dic=1&posn=any&...
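
A minimal sketch of how a query-string URL of that shape could be assembled. It is an illustration only: wwwjdic does not offer this interface, the parameter names (key, st, dic, posn) and their values are simply the ones written above, the trailing "..." is left unfilled, and `build_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Example host taken from the URL written in the message; this interface
# is a proposal, not something the site actually serves.
BASE_URL = "http://wwwjdic.com/"


def build_search_url(key: str, st: str = "1", dic: str = "1", posn: str = "any") -> str:
    """Percent-encode the search key as UTF-8 and append the named options."""
    query = urlencode({"key": key, "st": st, "dic": dic, "posn": posn})
    return f"{BASE_URL}?{query}"


print(build_search_url("魚"))
```

Naming each option keeps the URL readable and lets new options be added without breaking existing links.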

> I chose the top level there because it's probably the most common use-case for the site.

Most common, yes, but probably only about 35% of accesses.

> I'm a Ruby-on-Rails developer and you get this kind of routing stuff almost for free, but I believe wwwjdic.com is a CGI site?

Absolutely. See:
http://www.csse.monash.edu.au/~jwb/wwwjdicinf.html#techbits_tag

Or to be more precise, it uses vanilla old-fashioned HTTP GET/POST interactions.

> I'm not experienced with that stuff so I don't know how hard this stuff would be to implement.
> It would be nice at least to have a URL that you can go straight to with a GET request to get a search.

As I mentioned, it's really there already. That "wwwjdic.cgi?1MUJ%E8%87%AA..."
I quoted above does that - the 4-character preamble just says what options are
to be used (1 = EDICT, M = backdoor method, U = UTF-8 coding, J = Japanese key).
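
The preamble can be read mechanically. The sketch below splits a backdoor query string into its four option characters and the decoded key. Only the codes explained above are named; the lookup tables and `describe_backdoor_query` are hypothetical, and any other code is passed through unchanged rather than guessed.

```python
from urllib.parse import unquote

# Only the codes explained in the message are listed here.
DICTIONARY = {"1": "EDICT"}
MODE = {"M": "backdoor method"}
CODING = {"U": "utf8 coding"}
KEY_TYPE = {"J": "Japanese key"}


def describe_backdoor_query(query: str) -> dict:
    """Split '1MUJ<percent-encoded key>' into its options and decoded key."""
    preamble, encoded_key = query[:4], query[4:]
    dic, mode, coding, key_type = preamble  # one character per option
    return {
        "dictionary": DICTIONARY.get(dic, dic),
        "mode": MODE.get(mode, mode),
        "coding": CODING.get(coding, coding),
        "key_type": KEY_TYPE.get(key_type, key_type),
        "key": unquote(encoded_key),  # percent-decoding assumes UTF-8
    }


print(describe_backdoor_query("1MUJ%E8%87%AA%E6%83%9A%E3%82%8C%E3%82%8B"))
```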

The backdoor also has a "raw output" option (1ZUJ%E8%87%AA...) where
the results come back in a minimal page. Phone apps such as "WWWJDIC Android"
use it, as it saves them having to scrape the results from the full pages.
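
A sketch of how a client might build either form of the backdoor URL. The base URL and the 1MUJ / 1ZUJ preambles come from the examples above; `backdoor_url` and its `raw` flag are hypothetical names, and the sketch only constructs the URL without contacting the server.

```python
from urllib.parse import quote

# CGI address as quoted earlier in this message.
CGI_URL = "http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwjdic.cgi"


def backdoor_url(key: str, raw: bool = False) -> str:
    """Build an EDICT lookup URL for a Japanese key, UTF-8 percent-encoded.

    raw=False uses the 'M' backdoor form; raw=True uses the 'Z' raw-output form.
    """
    mode = "Z" if raw else "M"
    return f"{CGI_URL}?1{mode}UJ{quote(key, safe='')}"


print(backdoor_url("自惚れる"))
print(backdoor_url("自惚れる", raw=True))
```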

> You then also get the opportunity to do browser caching and responding with a 304 Not modified header.
> I also think it would be great if the search worked for Japanese or English rather than having an additional checkbox.
> This is the kind of thing that the server can infer.

Actually the checkbox is only relevant for romaji keys. It needs to be told
whether it is the (English) word "kimono" or きもの as the search results are
quite different.
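
To illustrate why only romaji keys are ambiguous, here is a small sketch of the inference a server could make. It is not how wwwjdic works: the function names are hypothetical, and the check covers only the basic hiragana, katakana, and CJK ideograph blocks.

```python
def contains_japanese_script(key: str) -> bool:
    """True if any character is kana or a CJK unified ideograph."""
    return any(
        "\u3040" <= ch <= "\u30ff"      # hiragana and katakana blocks
        or "\u4e00" <= ch <= "\u9fff"   # CJK unified ideographs
        for ch in key
    )


def needs_language_hint(key: str) -> bool:
    """A key with no Japanese script could be English or romaji,
    so the user still has to say which one is meant."""
    return not contains_japanese_script(key)


print(needs_language_hint("kimono"))  # plain ASCII: English word or romaji?
print(needs_language_hint("きもの"))   # kana: unambiguously Japanese
```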

> Anyway, these are technical reflections, but also based on usability. I find that I often need to briefly look a word up, and find it
> a little frustrating going to the interface, and waiting for the redirect, then checking a box, and inputting my search followed by
> submitting the form. It would be great to just type in a predictable URL in my address bar to get a result.

Well:

(a) avoid wwwjdic.com. It's just a redirect to
http://www.edrdg.org/cgi-bin/wwwjdic/wwwjdic?1C
Use that site natively, or any of the other 5 sites.

(b) use the backdoor for fast access. The really simple way to do it is
to install the toolbar buttons -
http://www.csse.monash.edu.au/~jwb/wwwbuttongen.html
That's how I do the vast majority of my wwwjdic accesses. Since I use the
mouse-oriented X-windows, often it's just a case of highlighting the text and
clicking the button.

> And Jim if it's a lot of work for you, then I'd gladly write an alternative in Rails if you wanted to run it on your servers. I don't think
> that it would take a great deal of work for me to do.

Sorry if my comments above seem rather defensive. It's more an explanatory
narrative, and some advice on how to make the best use of what's there.

Somewhere down my to-do list is the complete rewrite of wwwjdic to use
a highly RESTful AJAXy approach with a user-configurable CSS-driven
presentation. I doubt I'll get there:
- I have years of work ahead on another major project
- I'm now well into my 7th decade (wwwjdic started when I was just over 50)
and my best programming days are well behind me.
- after 13 years wwwjdic is so full of bells and whistles that I'm daunted
by the thought of starting again

I have considered putting the wwwjdic code up on sourceforge or
googlecode and saying "do what you want with it", but I'm shy about its
convoluted nature and undocumented state. Maybe I'll do that when
I'm not going to be around to answer questions. Certainly I ask people
like William Maton, who looks after the edrdg.org site, to make sure it is
available if I fall off the perch.

ANYONE keen to do a RoR rewrite of wwwjdic is welcome to do so, and
they are welcome to a copy of the code on a don't-release-it-yet basis. If
it's not too much overhead (load, installation, maintenance) it could
run on the edrdg.org site. (I don't know what the back-end database
implications for RoR are. WWWJDIC uses simple text files with ISAM-like
token-level indexing, which is light and fast.)
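
For readers unfamiliar with the idea, the sketch below shows one way a token-level index over a plain text file can work: each token maps to the byte offset of the line containing it, and a sorted list is searched by bisection. This is an assumption-laden illustration, not the WWWJDIC code; the function names and the two sample lines are made up for the example.

```python
import bisect
import re

# Made-up sample lines in a dictionary-like layout, one entry per line.
DATA = "魚 [さかな] /fish/\n着物 [きもの] /kimono/\n".encode("utf-8")


def build_index(text: bytes) -> list:
    """Return a sorted list of (token, byte offset of the containing line)."""
    index = []
    offset = 0
    for line in text.splitlines(keepends=True):
        for token in set(re.findall(r"\w+", line.decode("utf-8").lower())):
            index.append((token, offset))
        offset += len(line)  # offsets are counted in bytes, not characters
    index.sort()
    return index


def lookup(index: list, text: bytes, token: str) -> list:
    """Bisect to the first matching token, then read each matching line."""
    token = token.lower()
    i = bisect.bisect_left(index, (token, -1))
    lines = []
    while i < len(index) and index[i][0] == token:
        start = index[i][1]
        end = text.find(b"\n", start)
        if end == -1:
            end = len(text)
        lines.append(text[start:end].decode("utf-8"))
        i += 1
    return lines


INDEX = build_index(DATA)
print(lookup(INDEX, DATA, "fish"))
```

The appeal of this layout is that the data stays a human-readable text file and a lookup touches only the index plus the matching lines.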

I'd better get back to what I should be doing.....

Cheers

Jim

--
Jim Breen
Adjunct Snr Research Fellow, Clayton School of IT, Monash University
Webmaster: Hawthorn Rowing Club, Treasurer: Japanese Studies Centre
Graduate student: Language Technology Group, University of Melbourne
