
Re: Website worriers



Yesterday I was manually blocking the addresses sending GETs to
the "edform.py" program once they reached an unreasonable level.
A real user doing editing is unlikely to make more than 100 or so in a
day. An hourly "cron" script is monitoring and logging the busy ones.
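
An hourly check of this kind could be sketched roughly as follows. This is
not the actual cron script: it assumes the web server writes Apache-style
"combined" access-log lines, and the names LINE_RE, THRESHOLD,
count_edform_gets and busy_addresses are made up for illustration. The
threshold of 100 simply reuses the rough daily figure above.

```python
import re
from collections import Counter

# Assumed threshold, taken from the "100 or so in a day" estimate for a real editor.
THRESHOLD = 100

# Assumes Apache-style "combined" log lines: client address first,
# then the quoted request line starting with the method and the path.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "GET (\S+)')


def count_edform_gets(lines):
    """Count GET requests to edform.py per client address."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and "/edform.py" in m.group(2):
            counts[m.group(1)] += 1
    return counts


def busy_addresses(counts, threshold=THRESHOLD):
    """Return (address, count) pairs above the threshold, busiest first."""
    return [(ip, n) for ip, n in counts.most_common() if n > threshold]
```

Run from cron, the script would open the current access log, pass the file
object to count_edform_gets(), and log whatever busy_addresses() returns.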

About 3am (UTC) I blocked 85.203.22.34 when it went over 2,000
accesses. This address is in Cyprus (no DNS).
By 430am 104.238.63.6 had reached 400 accesses, so I blocked it too.
It is in Tokyo.
Things were quiet until about 1130am when 93.209.83.236 joined
in. I blocked it at about 900 accesses. It's in Berlin and actually reports
a domain name (p5DD153EC.dip0.t-ipconnect.de).

Around 12pm 178.73.220.93 (in Stockholm) leapt in. By 330pm
it had made over 4,000 accesses (I was asleep by then, so no blocking.)
Some time after 5pm 91.65.218.146 (also in Berlin) took over and
by 7pm had reached 2.5k accesses.
Then 46.246.1.163 (back in Stockholm) took over and by 1230pm
had made 6,300 accesses.
Then at 1133pm (UTC) it went quiet and there's been nothing since
(215am UTC).

All very curious. I'm impressed with the way the script or whatever is
being shipped around. It's not intense enough to be a DoS and for the
time being I'm happy to keep watching and making the occasional
manual block. There was a 7-hour gap yesterday, so I won't be surprised
if they start again.
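
Since the earlier message (quoted below) noted that every blocked request
carried a client identifier ending in "Gecko/20041107 Firefox/x.x", one way
to spot a new address early would be to match on that string instead of
waiting for the counts to build up. A minimal sketch, again assuming
Apache-style "combined" log lines where the client identifier is the last
quoted field; UA_RE, SUSPECT_SUFFIX and suspect_addresses are hypothetical
names, not part of any existing script.

```python
import re

# Suffix reported in the quoted message for all the blocked requests.
SUSPECT_SUFFIX = "Gecko/20041107 Firefox/x.x"

# Assumes the client identifier is the last double-quoted field on the line.
UA_RE = re.compile(r'^(\S+) .*"([^"]*)"\s*$')


def suspect_addresses(lines, suffix=SUSPECT_SUFFIX):
    """Return the sorted client addresses whose identifier ends with suffix."""
    found = set()
    for line in lines:
        m = UA_RE.match(line)
        if m and m.group(2).endswith(suffix):
            found.add(m.group(1))
    return sorted(found)
```

The result is only a list of candidates; the decision to block stays manual.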

Jim


On Thu, 7 Nov 2019 at 12:21, Jim Breen <jimbreen@gmail.com> wrote:
>
> One of the fun things about running a busy website is that you
> have to watch out for traffic you don't really want. Since there is
> a data charge from the hosting company, I watch out for users
> who do silly things like trying to download the whole of JMdict
> via wwwjdic, one entry at a time. Most of the semi-professional
> harvester sites obey the robots.txt go-away rules, but there are
> still the rogues.
>
> One user who is annoying me at present is firing requests at the
> edform.py script, e.g.
> http://www.edrdg.org/jmdictdb/cgi-bin/edform.py?svc=jmdict&sid=&q=1582000
> (This is the one that loads up an entry for edit.)
>
> When I noticed them they were sending in 20-30k of these a day.
> There is no identifying information in the request, and the IP address
> never shows up in the DNS data. I am now blocking them with a
> kernel filter and after a couple of hours they switch to another IP
> address and resume. The current culprit is at 85.203.22.34 and
> has sent in about 2,000 in the last hour. The log shows an odd client
> identifier ending in "Gecko/20041107 Firefox/x.x". That same
> pattern is on all the requests I've been blocking, and AFAICT no other
> user has it. Anyway another block going in.
>
> Jim
>
>
> --
> Jim Breen
> Adjunct Snr Research Fellow, Japanese Studies Centre, Monash University
> http://www.jimbreen.org/
> http://nihongo.monash.edu/


