Re: [edict-jmdict] Diacritics in lsrc
On 6 July 2010 13:04, Ben Bullock <benkasminbullock@gmail.com> wrote:
on now.)
> It looks like you've already solved this problem, but if you're having
> trouble running Python on your computer, here is a Perl solution (it's
> been there for a year or two):
>
> http://www.lemoda.net/perl/strip-diacritics/index.html
>
> The only part of that you need is the bottom subroutine, "decompose".
> It needs Perl 5.8 or later (from 2003 onwards). I can make a short script to
> read from standard input/write to standard output if you want it.
Interesting, and a bit of a shock to see that it whips the にごり (dakuten)
marks off kana as well. (It left the dots on top of "i", though 8-)})
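For the record, here is a rough Python sketch of the same decompose-and-strip idea, which also shows why the kana lose their marks. Under NFD, が splits into か plus U+3099, and U+3099 is a combining mark just like an acute accent, so a filter that drops every combining mark takes it out too. The "i" keeps its dot because plain "i" has no decomposition. This assumes Python 3 and only the standard unicodedata module. The function name strip_marks and the keep_kana_marks switch are made up for illustration, and this is a guess at the approach, not a port of the Perl routine.

```python
import unicodedata

# Combining dakuten and handakuten, as produced by NFD on voiced kana.
KANA_MARKS = {"\u3099", "\u309A"}

def strip_marks(text, keep_kana_marks=False):
    """Decompose to NFD, drop combining marks, recompose to NFC.

    With keep_kana_marks=True the kana voicing marks are left alone,
    so only the accents on Latin (and other) letters are removed.
    """
    kept = []
    for ch in unicodedata.normalize("NFD", text):
        # combining() returns a non-zero class for combining marks.
        if unicodedata.combining(ch):
            if not (keep_kana_marks and ch in KANA_MARKS):
                continue
        kept.append(ch)
    # Recompose so any retained kana marks rejoin their base kana.
    return unicodedata.normalize("NFC", "".join(kept))

if __name__ == "__main__":
    print(strip_marks("café がぎ"))                        # cafe かき
    print(strip_marks("café がぎ", keep_kana_marks=True))  # cafe がぎ
```

Whether the lsrc text should keep the kana marks is a separate question; the switch is only there to show that the two cases can be told apart.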
I see the old Suns have Perl 5.6.1. If I get nowhere with the Python
upgrade, I'll see if I can stir the Perl along.
Thanks
Jim
--
Jim Breen
Adjunct Snr Research Fellow, Clayton School of IT, Monash University
Treasurer: Hawthorn Rowing Club, Japanese Studies Centre
Graduate student: Language Technology Group, University of Melbourne