
Re: [edict-jmdict] Commas in kanji/reading



That sounds fine to me.

For comparison, there are 52 entries that have interpuncts in the kanji field.  But none of them have interpuncts in the reading field.


On Dec 8, 2017, at 6:18 PM, Jim Breen jimbreen@********* [edict-jmdict] <edict-jmdict@***************> wrote:


As usual I can see there are good arguments for having them and not having them.

I really doubt it's appropriate to have them in the reading field at all.

How about the following:

- where appropriate, the Kanji field may have both expressions with and
without commas where the commas add clarity.
- the reading field will not have commas.
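The two rules above can be sketched in code. This is only an illustration under assumptions: `apply_comma_policy` and `strip_commas` are hypothetical helpers, not part of any JMdict tooling, and treating U+FF0C as a comma alongside U+3001 is an assumption rather than something the proposal states.

```python
# Minimal sketch of the proposed policy (hypothetical helpers, not JMdict tooling).
IDEOGRAPHIC_COMMA = "\u3001"  # 、
FULLWIDTH_COMMA = "\uff0c"    # ， (assumption: also treated as a comma)
COMMAS = (IDEOGRAPHIC_COMMA, FULLWIDTH_COMMA)


def strip_commas(text: str) -> str:
    """Return text with every ideographic/fullwidth comma removed."""
    for comma in COMMAS:
        text = text.replace(comma, "")
    return text


def apply_comma_policy(kanji_forms, readings):
    """Keep each kanji form with and without commas; strip commas from readings."""
    out_kanji = []
    for form in kanji_forms:
        # The comma form comes first, then the comma-less variant.
        for candidate in (form, strip_commas(form)):
            if candidate not in out_kanji:
                out_kanji.append(candidate)
    out_readings = []
    for reading in readings:
        stripped = strip_commas(reading)
        if stripped not in out_readings:
            out_readings.append(stripped)
    return out_kanji, out_readings


kanji, readings = apply_comma_policy(
    ["朝焼けは雨、夕焼けは晴れ"],
    ["あさやけはあめ、ゆうやけははれ"],
)
print(";".join(kanji), readings)
```

Applied to the proverb discussed in this thread, the sketch yields the comma and comma-less kanji forms plus a single comma-free reading.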

I've just edited 2570040 to reflect this approach:

朝焼けは雨、夕焼けは晴れ;朝焼けは雨夕焼けは晴れ [あさやけはあめゆうやけははれ] /(exp) (proverb) Red sky at night, sailors delight; red sky in morning, sailors take warning/Red sky at night, shepherds delight; red sky in morning, shepherds take warning/

How does that go?

Jim


On 8 December 2017 at 15:10, René Malenfant rene_malenfant@*********** [edict-jmdict] <edict-jmdict@***************> wrote:


Yes, but I think Daijirin may have a general policy to omit commas, which is why there’s little/no overlap with Daijisen.  I mean, if they don’t have commas for that one...


On Dec 8, 2017, at 12:08 AM, Marcus Richert superbrightfuture@********* [edict-jmdict] <edict-jmdict@***************> wrote:


I don't necessarily disagree on this one but it's worth noting that in daijs, it's 来た見た勝った without the commas.

Marcus

On Fri, Dec 8, 2017 at 12:04 PM, René Malenfant rene_malenfant@hotmail.com [edict-jmdict] <edict-jmdict@***************> wrote:
 

I’m in favour of keeping the commas when they add clarity.  Something like “来た、見た、勝った” doesn’t work without commas (or, sub-optimally, spaces).




On Dec 7, 2017, at 10:54 PM, Marcus Richert superbrightfuture@gmail.com [edict-jmdict] <edict-jmdict@***************> wrote:


My personal opinion is that they needlessly complicate things, and that including versions both with and without commas would be messy. I think we should avoid them as far as possible. Not even the daij seem to have consistent policies on when to use them, so it's hard to lean on either of them for guidance, but considering that daijs, with its 300k entries, has only 10 or so non-title entries with commas, the few that have snuck in seem more like mistakes than a policy decision.

For example, "仰いで天に愧じず、俯して地に怍じず" is one of the only two (!) entries that have commas in both daijs and daijr. Grammatically, it's no different from "君子は周して比せず小人は比して周せず", which on the other hand does NOT have a comma in either dictionary's entry. It's worth mentioning that "仰いで..." can also be found without the comma in the wild: http://teabreakt.studio-web.net/TEXT-kotowaza-irohakaruta.pdf

Marcus

On Wed, Dec 6, 2017 at 8:25 PM, Jim Breen jimbreen@********* [edict-jmdict] <edict-jmdict@******roups.com> wrote:
 

I've avoided commas, but I guess they aren't a problem. Perhaps we need both
versions, with readings to match. The U+3001 comma maps onto the JIS zenkaku
comma, which is appropriate.
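The mapping can be checked with a small sketch. It assumes "JIS zenkaku comma" refers to the JIS X 0208 character that Python's standard `euc_jp` and `shift_jis` codecs produce for U+3001; the variable names are illustrative only.

```python
import unicodedata

comma = "\u3001"  # 、

# Unicode character name for the code point discussed above.
name = unicodedata.name(comma)

# Encode with Python's built-in JIS-based codecs (assumption: these stand in
# for "the JIS zenkaku comma" mentioned in the message).
euc_bytes = comma.encode("euc_jp")
sjis_bytes = comma.encode("shift_jis")

# The character survives a round trip through either legacy encoding.
round_trip_ok = (
    euc_bytes.decode("euc_jp") == comma
    and sjis_bytes.decode("shift_jis") == comma
)

print(name, euc_bytes.hex(), sjis_bytes.hex(), round_trip_ok)
```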

I see we have 朝焼けは雨、夕焼けは晴れ with comma and 
夕焼けは晴れ朝焼けは雨 without comma. We probably should be consistent.

I see GG5 has 朝焼けは雨 as a subentry.

Any other views on commas?

Jim




On 6 December 2017 at 15:16, Marcus Richert superbrightfuture@****l.com [edict-jmdict] <edict-jmdict@***************> wrote:


I noticed recently we have 3 entries with commas, specifically “Ideographic commas”, U+3001 (、), in the kanji and reading:

 

http://www.edrdg.org/jmdictdb/cgi-bin/srchres.py?svc=jmdict&s1=1&y1=3&t1=%E3%80%81&s2=1&y2=1&t2=&s3=1&y3=1&t3=&idtyp=seq&idval=&search=Search&src="" style="line-height: 1.22em;" class="">at=2&nfcmp=%3C%3D&nfval=&gacmp=%3E%3D&gaval=&smtr=&smtrm=0&ts0=&ts1=&mt=0&grp=

 

Out of these three, two have entries in daijr, with the commas intact. One is in daijs, without the comma. I’ve noticed that daijr will very often include commas where daijs will not: in their respective entries for the saying “虎は死して皮を留め人は死して名を残す”, daijr has a comma after “留め”, while daijs doesn’t. Likewise it’s “天知る,地知る,我知る,人知る” in daijr but “天知る地知る我知る子知る” in daijs, and so on. On the other hand, in the daijs “知る” entry, the same proverb is listed under “[下接句]” as “天知る、地知る、我知る、子()知る”.

 

Daijs and daijr also use different symbols for their commas, at least on dic.yahoo.jp: daijr uses “，” (“Fullwidth Comma”, U+FF0C), while daijs uses “、” (“Ideographic Comma”, U+3001).
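Since the two comma characters look similar on screen, a quick way to tell which one a headword contains is to list the code points. The sketch below is an illustration only: `comma_report` is a hypothetical helper, and including the ASCII comma in the set is an assumption.

```python
import unicodedata

# Comma-like characters to look for (ASCII comma included as an assumption).
COMMA_LIKE = {",", "\u3001", "\uff0c"}


def comma_report(text: str):
    """List (code point, Unicode name) for each comma-like character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in text
        if ch in COMMA_LIKE
    ]


print(comma_report("天知る\uff0c地知る"))   # fullwidth comma
print(comma_report("天知る\u3001地知る"))   # ideographic comma
```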

 

Daijs does have a handful of entries that do include commas; doing a search on dic.yahoo.jp with the ideographic comma, I got a list of 50 entries. I think I counted 10 which aren’t titles of novels, etc., among them “権力は腐敗する、絶対的権力は絶対に腐敗する”. Oddly enough, out of the 5 of those 10 entries that also have entries in daijr, 3 do not actually have a comma in daijr!

 

Even when neither daij uses commas, though, other sources might: in daij (and JMDict) it’s “沈黙は金雄弁は銀”, but in 新和英中辞典, 英語ことわざ教訓辞典 and kotowaza-allguide it’s “沈黙は金、雄弁は銀”.

 

When should we be using commas in JMDict?

 

Marcus






-- 
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
http://www.jimbreen.org/












