
Minor WWWJDIC fix.



I often use the Japanese PM's WWW page (http://www.kantei.go.jp)
to test the Translate Words function in its download mode (a pretty
ugly function, as it makes no attempt to retain the formatting, etc.).

I tried it yesterday and it failed spectacularly. I realised it was because
the site had changed from Shift_JIS to UTF-8, and I had never got
around to putting a UTF-8 conversion into that function.

I have been using a version of Ken Lunde's old "jconv" utility, which
has the advantage that it does its own test of the code used. It is
pre-Unicode, and treats UTF-8 coding as though it were Shift_JIS.
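
To illustrate why a pre-Unicode detector can be fooled, here is a small
Python sketch (not jconv itself, and the sample text is my own choice).
It assumes Python's built-in "shift_jis" codec behaves like a typical
Shift_JIS decoder. The UTF-8 bytes for a short Japanese string happen to
form legal Shift_JIS byte sequences, so the decode succeeds and quietly
produces the wrong characters instead of raising an error.

```python
# Illustration only: UTF-8 bytes that are also legal Shift_JIS sequences.
original = "日本"                      # sample text, chosen for this sketch
utf8_bytes = original.encode("utf-8")  # six bytes: E6 97 A5 E6 9C AC

# A decoder that knows nothing about UTF-8 reads these as two double-byte
# kanji interleaved with two half-width katakana, with no error raised.
misread = utf8_bytes.decode("shift_jis")

print(len(utf8_bytes), "bytes in UTF-8")
print("decoded as Shift_JIS:", misread)
print("matches the original?", misread == original)
```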

So I dusted off a code-checking utility I wrote in 1993(!) when
Unicode was so new that practically no-one was using it and
when UTF-8 was still called by its old name: FSS-UTF. I have
inserted it into the WWWJDIC code to test the incoming WWW
text, and changed to the standard iconv() utility to convert
everything into EUC, which I use internally. It all seems OK now.
A probably long-overdue fix, as the number of pages in UTF-8
is growing.
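
The test-then-convert step described above can be sketched roughly as
follows. This is a minimal Python illustration, not the WWWJDIC code: the
function names are hypothetical, Python's codecs stand in for iconv(),
and a strict UTF-8 decode stands in for the 1993 code-checking utility.
It also assumes that anything which is not valid UTF-8 is Shift_JIS,
which is a simplification.

```python
def guess_encoding(data: bytes) -> str:
    """Guess the encoding of incoming page text (hypothetical helper).

    Assumption: input is either UTF-8 or Shift_JIS. Real Shift_JIS text
    is very rarely valid UTF-8, so a strict UTF-8 decode is used as the test.
    """
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "shift_jis"


def to_euc_jp(data: bytes) -> bytes:
    """Convert incoming bytes to EUC-JP, the internal encoding."""
    source = guess_encoding(data)
    # Decode with the guessed encoding, then re-encode as EUC-JP.
    # Characters with no EUC-JP equivalent will raise UnicodeEncodeError.
    return data.decode(source).encode("euc_jp")


if __name__ == "__main__":
    sample = "日本語"
    for encoding in ("utf-8", "shift_jis"):
        raw = sample.encode(encoding)
        print(encoding, "->", guess_encoding(raw), to_euc_jp(raw).hex())
```

Checking for UTF-8 first matters here, since (as noted above) UTF-8
bytes can pass for Shift_JIS but the reverse is uncommon.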

Bug fixes aside, I have a few changes I'd like to make when I get
a chance. One is Hendrik's suggestion to add an option to make
the various links open in a new tab. Adding options to the
customization system is a bit messy and I like to batch them up.
If anyone has other customization options they'd like, now
is a good time to tell me.

Cheers

Jim

-- 
Jim Breen
Adjunct Snr Research Fellow, Clayton School of IT, Monash University
Treasurer: Hawthorn Rowing Club, Japanese Studies Centre
Graduate student: Language Technology Group, University of Melbourne