
Re: [edict-jmdict] Yahoo - over-estimating or more sites indexed?



Paul,

I would argue that the engine with the lower hit count has simply not done as thorough a job.  If one engine says the pages exist, it's hard to say they don't.  The one with the lower count just hasn't concentrated on the Japanese language enough.

Jim



On Dec 20, 2007, at 5:58 AM, Paul Blay wrote:

Hi,

This isn't directly related to Edict, but indirectly relates to
the assigning of (P) tags based on web page hits.

A good while back I looked into page hit estimates returned by
Google vs. Yahoo Japan and Google was pretty consistently higher.
The reverse now appears to be the case.

Google might have become more conservative with estimates,
Yahoo might have become less conservative with estimates,
or Yahoo Japan might just be indexing more (Japanese) pages than Google.

Imagine Yahoo says "this word is on 1,500,000 pages" and Google says
"this word is on 740,000 pages". If Yahoo is accurately reporting
that, should it be a (P) or should we stick with Google (as the lower