Re: Kanjidic skip code comments and corrections
Just a bit more follow-up on this.
First, thanks for the SKIPs for the ~900 "new" JIS213 kanji. I ran
them through my updater script and they are now in kanjidic2.xml.
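
For anyone wanting to confirm which entries still lack a SKIP, here is a minimal sketch (not the updater script itself). It assumes the kanjidic2.xml layout in which each <character> has a <literal>, SKIP codes appear as <q_code qr_type="skip">, and misclassification codes carry a skip_misclass attribute. The function name and the two-entry sample document are made up for illustration and are not real file contents.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: two hand-written entries in kanjidic2-style markup.
SAMPLE = """<kanjidic2>
  <character>
    <literal>競</literal>
    <query_code>
      <q_code qr_type="skip">1-10-10</q_code>
      <q_code qr_type="skip" skip_misclass="posn">2-10-10</q_code>
    </query_code>
  </character>
  <character>
    <literal>冄</literal>
  </character>
</kanjidic2>"""


def split_by_skip(root):
    """Return (kanji with a primary SKIP, kanji without one)."""
    with_skip, without_skip = [], []
    for ch in root.iter("character"):
        literal = ch.findtext("literal")
        # Ignore misclassification codes; only a primary SKIP counts.
        primary = [
            q.text
            for q in ch.iter("q_code")
            if q.get("qr_type") == "skip" and q.get("skip_misclass") is None
        ]
        (with_skip if primary else without_skip).append(literal)
    return with_skip, without_skip


root = ET.fromstring(SAMPLE)
have, missing = split_by_skip(root)
print(len(have), "with SKIP;", len(missing), "without:", "".join(missing))
```

To run it against the real file, replacing ET.fromstring(SAMPLE) with ET.parse("kanjidic2.xml").getroot() should be enough, assuming the file is in the current directory.
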
Second, I've looked at the additional kanji at the bottom of the page
at https://news.sljfaq.org/skip.html, in particular the last 4.
For 冄, 韱 and 韯 I think the proposed 4-* SKIPs are probably correct.
I've added them to the database driving kanjidic2.
For 竸, as indicated earlier, I'm pretty sure 1-11-11 is correct.
I'm really not sure about 隺. I'm inclined towards staying with 2-3-8.
Anyway, I'll run all of those past Jack Halpern and get his opinion/ruling.
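
The stroke-count arguments in this thread can be checked mechanically. This is a rough sketch, assuming the usual SKIP layout: for patterns 1 to 3 the second and third numbers are the stroke counts of the two parts, so they sum to the total; for pattern 4 the second number is the total itself. The function name is made up, and the values in CASES are the ones quoted in this thread.

```python
def skip_stroke_total(skip):
    """Total stroke count implied by a SKIP code such as '2-3-7'."""
    pattern, second, third = (int(n) for n in skip.split("-"))
    if pattern in (1, 2, 3):
        return second + third  # two parts: their stroke counts add up
    if pattern == 4:
        return second          # solid pattern: second number is the total
    raise ValueError(f"unknown SKIP pattern in {skip!r}")


# (kanji, SKIP code, stroke count) as mentioned in this thread
CASES = [
    ("隺", "2-3-7", 10),
    ("竸", "1-11-11", 22),
    ("竸", "2-10-12", 22),
]

for kanji, skip, strokes in CASES:
    implied = skip_stroke_total(skip)
    status = "ok" if implied == strokes else "MISMATCH"
    print(kanji, skip, "implies", implied, "strokes; listed", strokes, status)
```

A check like this only catches arithmetic inconsistencies. It says nothing about whether the pattern number or the split between the parts is the right one.
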
Cheers
Jim
On Wed, 3 Feb 2021 at 12:20, Jim Breen <jimbreen@gmail.com> wrote:
>
> On Tue, 2 Feb 2021 at 22:35, Ben Bullock <benkasminbullock@gmail.com> wrote:
>
> > One I would like to draw your attention to is 隺 for which kanjidic appears to have an incorrect stroke count of 11, it should be 10. Amusingly it's possible to work out which kanji sites are using kanjidic for their information source by looking at what stroke count this has at each web site.
>
> Yes, it should be 10 and the SKIP should be 2-3-7. Fixed.
>
> Having been spreading Japanese lexical data around the network world
> for about 30 years, it's hard not to come across it all the time. It's
> also often possible to detect the sites/systems which don't update
> their files. I get a bit cross when people contact me about errors
> that were in fact fixed ages ago. I get even crosser when I cop abuse
> on various forums for those long-fixed errors. (And don't get me
> started on people who prefer to spray criticisms over proposing
> corrections.)
>
> > Also I would guess that Halpern has 竸 in the dictionary, but kanjidic has two different things for it, 1-11-11 and 2-2-8, both mathematically unlikely given the symmetry in the character (how would two identical parts result in an odd number of strokes or a different number of strokes left and right?)
>
> Halpern only has this in one of his later dictionaries, of which I
> don't have a copy. I think the 1-11-11 is correct; it's consistent
> with the 1-10-10 he has for 競. The misclassification code for that one
> is 2-10-10, so for 竸 I'm making it 2-10-12. The stroke count of 22 is
> supported by several sources, including Unihan.
>
> > According to the version of kanjidic2.xml mentioned on the page above, there are 13108 characters in total but only 12156 have skip codes.
>
> Yes, the ~900 are the kanji that are in JIS X 0213 but not in JIS X 0212.
>
> > I've put skip codes for the remaining ones here:
> >
> > https://kanji.sljfaq.org/news/missing.json
>
> Thanks. That looks very useful. I'll add them to the kanjidic file for
> JIS213. Great to have them.
>
> Jim
>
> --
> Jim Breen
> Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
> http://www.jimbreen.org/
> http://nihongo.monash.edu/