
Re: [edict-jmdict] Updating Tatoeba Indices



Thanks for pointing this out. I should have dealt with it when the split
happened.

There are 930 sentences indexed by 如何, of which 168 use いかが and the rest
どう. Fortunately the original indexing (by Paul Blay) had locked them all into
the どう reading, so the index element for the いかが ones was "如何(どう){いかが}":
in other words, it was linked to the 如何(どう) entry but appears in the sentence
as いかが. A simple global replacement changed these to "如何(いかが){いかが}".
The fix may just make this week's download of sentences.
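
For anyone wanting to try the same kind of fix on a local copy of the
indices, here is a rough Python sketch of that replacement. It is not the
Tatoeba facility itself. The function names and the sample index line are
made up for illustration, and it assumes each index is held as one string
per sentence.

```python
# Illustrative sketch only: the helper names and sample data are hypothetical.
OLD_ELEMENT = "如何(どう){いかが}"    # linked to the どう entry, surface form いかが
NEW_ELEMENT = "如何(いかが){いかが}"  # linked to the いかが entry instead


def fix_index(index_line: str) -> str:
    """Return the index line with the old element swapped for the new one."""
    return index_line.replace(OLD_ELEMENT, NEW_ELEMENT)


def fix_all(index_lines):
    """Fix every line and report how many lines actually changed."""
    fixed = [fix_index(line) for line in index_lines]
    changed = sum(1 for old, new in zip(index_lines, fixed) if old != new)
    return fixed, changed


if __name__ == "__main__":
    sample = [
        "ご機嫌 如何(どう){いかが} です か",  # should be re-linked
        "如何(どう) 思う{思います} か",        # genuine どう, left alone
    ]
    fixed, changed = fix_all(sample)
    for line in fixed:
        print(line)
    print("lines changed:", changed)
```

Because the whole element, including the {いかが} part, is matched literally,
sentences where 如何 really is read どう are not touched.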

The global replacement capability in Tatoeba is very useful, although the
nature of the indices makes it a bit tricky at times.
The regular-expression sentence search facility in WWWJDIC can be useful
for finding these sorts of things too.
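
As a rough offline analogue of that kind of search (not the WWWJDIC
facility itself), the Python sketch below tallies how a given headword is
indexed across a set of index lines. It assumes elements are separated by
spaces and written as headword(reading), optionally followed by a [sense
number] and a {surface form}. The function name and the sample lines are
hypothetical.

```python
import re
from collections import Counter


def tally_readings(index_lines, headword):
    """Count (reading, surface form) pairs used for one headword.

    The surface form is None when the element has no {...} part.
    """
    pattern = re.compile(
        r"(?<!\S)"                # element starts the line or follows a space
        + re.escape(headword)
        + r"\(([^)]*)\)"          # (reading)
        + r"(?:\[\d+\])?"         # optional [sense number]
        + r"(?:\{([^}]*)\})?"     # optional {surface form}
    )
    counts = Counter()
    for line in index_lines:
        for match in pattern.finditer(line):
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "ご機嫌 如何(どう){いかが} です か",
        "如何(どう) 思う{思います} か",
        "如何(どう){いかが} です",
    ]
    for (reading, surface), n in tally_readings(sample, "如何").items():
        print(reading, surface, n)
```

A tally like this makes it easy to spot a reading and surface form that do
not belong together before deciding on a global replacement.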

Cheers

Jim


On Sat, 10 Oct 2020 at 05:38, Chris Vasselli clindsay@gmail.com
[edict-jmdict] <edict-jmdict@yahoogroups.com> wrote:

>
>
> Hi all,
>
> I noticed that due to a change in JMdict back in July where いかが and どう
> were split into two entries, the Tatoeba indices for most sentences
> including いかが are now broken. They now point to どう instead of いかが.
>
> I’d be happy to help fix issues like this when I come across them, but I
> realized I actually don’t know how to contribute to those Tatoeba indices.
> Is that maintained by the JMdict team, or the Tatoeba team?
>
> Also, it looks like this may have affected over 100 sentences. Is there a
> way to do bulk updates to those indices?
>
> Feel free to redirect me to Tatoeba if this is not something you have the
> answer to.
>
> Thanks,
> Chris
>
>
> 



-- 
Jim Breen
Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
http://www.jimbreen.org/
http://nihongo.monash.edu/