KANJIDIC Project

From EDRDG Wiki

The KANJIDIC Project

(Note that this page is in the process of being rewritten, so please be patient with any aspects that seem incomplete.)

Introduction

The KANJIDIC project, which began in 1991, aims to compile and distribute comprehensive information on the kanji used in Japanese text processing. It covers the 13,108 kanji in three main Japanese standards, including JIS X 0208 and JIS X 0212.

Three data files are distributed by this project:

  • the KANJIDIC2 file, which is in XML format and Unicode/UTF-8 coding, and contains information about all 13,108 kanji. For this file the following information is available:
  • the KANJIDIC file, which is in EUC-JP coding and covers the 6,355 kanji in JIS X 0208. For this there is the
  • the KANJD212 file, which also is in EUC-JP coding and covers the 5,801 kanji in JIS X 0212. For this there is the

There is also a combined overview of the KANJIDIC/KANJD212 files.
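
Because the three files use two different encodings, a reader has to pick the right decoder for each one. The following is a minimal Python sketch of that idea, not the project's own tooling. The function names and the sample data are hypothetical, and the XML layout it assumes (a `<character>` element with a `<literal>` child per kanji) is an assumption made for illustration rather than something stated on this page.

```python
import xml.etree.ElementTree as ET


def decode_legacy(raw):
    """Decode bytes taken from an EUC-JP file (KANJIDIC or KANJD212) into text."""
    return raw.decode("euc_jp")


def literals(xml_text):
    """Collect the kanji found in a KANJIDIC2-style XML document.

    Assumption: each entry is a <character> element holding a <literal> child.
    """
    root = ET.fromstring(xml_text)
    return [entry.findtext("literal") for entry in root.iter("character")]


# Hypothetical one-entry sample standing in for the real XML file.
sample_xml = "<kanjidic2><character><literal>亜</literal></character></kanjidic2>"
print(literals(sample_xml))

# Round trip of a single JIS X 0208 kanji through EUC-JP bytes.
euc_bytes = "亜".encode("euc_jp")
print(euc_bytes.hex(), decode_legacy(euc_bytes))
```

For the real files, the same calls would be applied to content read from disk: bytes opened in binary mode for the EUC-JP files, and UTF-8 text for the XML file.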