
Re: [edict-jmdict] Re: Choosing a database backend



[David Ranvig (Re: [edict-jmdict] Re: Choosing a database backend) writes:]
>> 
>> However, I did test JMdict and could not find any 4-byte UTF-8 characters in it.

This is correct. 

>> The characters in kanjidic that offend MySQL seem to start at Unicode code
>> point 0x2000B and end at code point 0x2A6B2. So characters within that range
>> might serve as good test data.

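Those are the "new" kanji in JIS X 0213 that Unicode places outside the
Basic Multilingual Plane (in CJK Extension B), which is why they need
4 bytes in UTF-8.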

4-byte UTF-8 sequences will become a database issue if and when
the kanjidic database happens. By then MySQL will probably have
got its act together on full UTF-8 support.
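
For anyone who wants to repeat the check, here is a minimal Python sketch
of how one might scan text for characters that need 4 bytes in UTF-8 and
build test data from the range quoted above. The function name
four_byte_chars is hypothetical, and this is not the test David actually
ran; it only relies on the fact that any code point above U+FFFF encodes
to 4 bytes.

```python
def four_byte_chars(text):
    """Return the characters in text that need 4 bytes in UTF-8.

    Code points above U+FFFF lie outside the Basic Multilingual Plane
    and always encode to exactly 4 bytes.
    """
    return [ch for ch in text if ord(ch) > 0xFFFF]


# Endpoints of the range quoted above, usable as test data.
samples = [chr(0x2000B), chr(0x2A6B2)]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):05X} -> {len(encoded)} bytes: {encoded.hex()}")
```

Both endpoints encode to 4 bytes, so inserting either into a table and
reading it back should show whether a given backend handles them.
Running four_byte_chars over the contents of a dictionary file would
list any such characters it contains.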

Jim

-- 
Jim Breen                                http://www.csse.monash.edu.au/~jwb/
Clayton School of Information Technology,               Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia                  Fax: +61 3 9905 5146
(Monash Provider No. 00008C)                ジム・ブリーン@モナシュ大学