The KANJIDIC Project
The KANJIDIC project has compiled files of data on kanji used in Japanese text processing. The files cover the kanji in three Japanese standards:
- JIS X 0208-1998, which includes 6,355 kanji.
- JIS X 0212-1990, which includes an additional 5,801 kanji.
- JIS X 0213-2012, which extends JIS X 0208, overlaps with part of JIS X 0212, and adds further kanji.
Three sets of data files are distributed by this project:
- the KANJIDIC2 file, which is in XML format and contains all the kanji in the three standards. For this file the following information is available:
- the KANJIDIC file, which covers the 6,355 kanji in JIS X 0208. For this there is the
- the KANJD212 file, which covers the 5,801 kanji in JIS X 0212. For this there is the
There is also a combined overview of the KANJIDIC/KANJD212 files.
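Because KANJIDIC2 is distributed as XML, it can be read with any standard XML parser. The following is a minimal sketch in Python, not an official reader. The element and attribute names it uses (`character`, `literal`, `misc/stroke_count`, `reading` with `r_type`, `meaning` with `m_lang`) are assumptions about the file layout and should be checked against the DTD shipped with the file. The embedded sample entry is hand-written for illustration and is not copied from the distributed data.

```python
import xml.etree.ElementTree as ET

# Hand-written sample shaped like a KANJIDIC2 entry (assumed layout,
# not taken from the real file).
SAMPLE = """<kanjidic2>
  <character>
    <literal>亜</literal>
    <misc><stroke_count>7</stroke_count></misc>
    <reading_meaning>
      <rmgroup>
        <reading r_type="ja_on">ア</reading>
        <meaning>Asia</meaning>
        <meaning m_lang="fr">Asie</meaning>
      </rmgroup>
    </reading_meaning>
  </character>
</kanjidic2>"""


def parse_entries(xml_text):
    """Return one dict per <character> element in the given XML text."""
    root = ET.fromstring(xml_text)
    entries = []
    for ch in root.iter("character"):
        strokes = ch.findtext("misc/stroke_count")
        entries.append({
            "literal": ch.findtext("literal"),
            "stroke_count": int(strokes) if strokes else None,
            # Assumption: on-readings are marked with r_type="ja_on".
            "on_readings": [r.text for r in ch.iter("reading")
                            if r.get("r_type") == "ja_on"],
            # Assumption: English meanings carry no m_lang attribute.
            "meanings": [m.text for m in ch.iter("meaning")
                         if m.get("m_lang") is None],
        })
    return entries


entries = parse_entries(SAMPLE)
```

For the full file, which holds many thousands of entries, `ET.iterparse` may be preferable to loading the whole document at once. The per-character extraction logic would stay the same.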