The Y!J Bot Runs Wild

March 28, 2011 | Last updated: April 6, 2011

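I am not sure whether to be angry or simply appalled, but a Yahoo! crawler is running wild. It appears to be the bot that generates images for bookmarks, and its User-agent is BMT/1.0 (Y!J-AGENT). Within a short time it requests the same pages over and over from different IP addresses, which is thoroughly unreasonable behavior for a bot.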

Taking one page as an example:

203.216.255.76 | - [28/Mar/2011:16:09:39 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.72 | - [28/Mar/2011:16:09:50 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.74 | - [28/Mar/2011:16:10:09 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.76 | - [28/Mar/2011:16:10:20 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.71 | - [28/Mar/2011:16:10:30 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.75 | - [28/Mar/2011:16:10:39 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.77 | - [28/Mar/2011:16:10:50 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.75 | - [28/Mar/2011:16:11:09 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.79 | - [28/Mar/2011:16:11:10 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.76 | - [28/Mar/2011:16:11:22 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.70 | - [28/Mar/2011:16:11:30 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.75 | - [28/Mar/2011:16:12:00 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.77 | - [28/Mar/2011:16:13:38 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.70 | - [28/Mar/2011:16:14:23 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.74 | - [28/Mar/2011:16:15:58 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.76 | - [28/Mar/2011:16:15:59 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.76 | - [28/Mar/2011:16:16:07 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.72 | - [28/Mar/2011:16:16:24 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.76 | - [28/Mar/2011:16:16:29 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.75 | - [28/Mar/2011:16:16:35 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.75 | - [28/Mar/2011:16:16:35 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

The duplicate entry above is not a copying mistake; the request is recorded twice in the log.

203.216.255.71 | - [28/Mar/2011:16:17:07 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.75 | - [28/Mar/2011:16:17:13 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.79 | - [28/Mar/2011:16:17:27 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.77 | - [28/Mar/2011:16:20:12 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.70 | - [28/Mar/2011:16:20:57 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.74 | - [28/Mar/2011:16:22:32 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.76 | - [28/Mar/2011:16:22:33 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.76 | - [28/Mar/2011:16:22:42 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.72 | - [28/Mar/2011:16:22:58 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.76 | - [28/Mar/2011:16:23:03 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.75 | - [28/Mar/2011:16:23:46 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.79 | - [28/Mar/2011:16:24:01 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.75 | - [28/Mar/2011:18:47:13 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.78 | - [28/Mar/2011:18:47:43 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.77 | - [28/Mar/2011:18:48:13 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.70 | - [28/Mar/2011:18:48:43 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.75 | - [28/Mar/2011:18:54:05 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.78 | - [28/Mar/2011:18:54:30 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.77 | - [28/Mar/2011:18:54:51 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.70 | - [28/Mar/2011:18:55:20 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.78 | - [28/Mar/2011:19:01:17 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

203.216.255.70 | - [28/Mar/2011:19:02:08 +0100] "GET /ja/snapshots/seasons-Autumn-2010-3.html HTTP/1.1" 200 6778 "-" "Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"

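The entries above cover just one HTML file, given as an example. The bot also downloads the page's elements, such as images, which makes it an extreme nuisance. It did the same thing on more than 30 pages. At this rate it would not be surprising if Yahoo!'s good sense as a company were called into question.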
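To put numbers on the pattern in the excerpt, here is a minimal Python sketch that tallies requests per IP address and measures the time span they cover. It assumes the log layout shown above (IP, ` | - `, bracketed timestamp, request, status, size, referrer, User-agent); the names `parse_line` and `summarize` are made up for illustration, and the sample reuses only the first four entries of the excerpt.

```python
import re
from collections import Counter
from datetime import datetime

# Layout assumed from the excerpt: ip | - [time] "request" status size "referrer" "agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \| - \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+) "[^"]*" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # %b expects English month abbreviations (default C locale).
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record


def summarize(lines, agent_marker="Y!J-AGENT"):
    """Count requests per IP for agents containing agent_marker."""
    records = [r for r in map(parse_line, lines) if r and agent_marker in r["agent"]]
    per_ip = Counter(r["ip"] for r in records)
    times = [r["time"] for r in records]
    span = (max(times) - min(times)).total_seconds() if times else 0.0
    return {"total": len(records), "per_ip": per_ip, "span_seconds": span}


# The first four entries of the excerpt, rebuilt from their shared parts.
UA = ("Mozilla/5.0 (compatible; BMT/1.0 (Y!J-AGENT); Windows NT 5.1; ja; "
      "rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12")
PAGE = "/ja/snapshots/seasons-Autumn-2010-3.html"
SAMPLE = [
    f'{ip} | - [28/Mar/2011:{clock} +0100] "GET {PAGE} HTTP/1.1" 200 6778 "-" "{UA}"'
    for ip, clock in [
        ("203.216.255.76", "16:09:39"),
        ("203.216.255.72", "16:09:50"),
        ("203.216.255.74", "16:10:09"),
        ("203.216.255.76", "16:10:20"),
    ]
]

summary = summarize(SAMPLE)
print(summary["total"], dict(summary["per_ip"]), summary["span_seconds"])
```

Run against the full log file instead of `SAMPLE`, the same function would show how many addresses took part and how tightly the requests were packed.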

It was so bad that I decided to shut the bot out with robots.txt.
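The post does not record the actual robots.txt rule, so the following is only a sketch of how such a rule could be checked before deploying it, using Python's standard `urllib.robotparser`. The token `BMT` is an assumption taken from the User-agent string; which token, if any, this bot honors is not known. The helper name `is_allowed` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule: the token "BMT" is a guess based on the User-agent string.
ROBOTS_TXT = """\
User-agent: BMT
Disallow: /
"""


def is_allowed(robots_txt, agent_token, path):
    """Return True if robots_txt permits agent_token to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, path)


PAGE = "/ja/snapshots/seasons-Autumn-2010-3.html"
print(is_allowed(ROBOTS_TXT, "BMT", PAGE))           # the targeted bot
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", PAGE))  # any other crawler
```

A check like this only confirms that the file says what was intended. robots.txt is advisory, and a crawler is free to ignore it.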

Added March 29, 2011

I thought I had barred this crawler with robots.txt, but it kept coming back just as persistently today, so I ran out of patience and blocked it with .htaccess.

Added April 6, 2011

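My .htaccess rules contained a mistake, and the requests showed no sign of letting up, so in the end, on March 31, I asked my hosting company to block the IP addresses at the firewall. Things finally settled down.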
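The post does not say exactly which addresses the hosting company blocked. As a rough sketch, the addresses visible in the excerpt (203.216.255.70 through .79, with .73 never appearing) can be collapsed into CIDR blocks with Python's standard `ipaddress` module, which is the form a firewall request usually takes. The function name `covering_networks` is hypothetical, and the result covers the whole span between the lowest and highest observed address, including any address in between that was not seen.

```python
import ipaddress


def covering_networks(addresses):
    """Return the CIDR blocks spanning the lowest to highest address given."""
    ips = sorted(ipaddress.ip_address(a) for a in addresses)
    return list(ipaddress.summarize_address_range(ips[0], ips[-1]))


# Addresses that appear in the log excerpt (.73 does not appear).
OBSERVED = [
    "203.216.255.70", "203.216.255.71", "203.216.255.72",
    "203.216.255.74", "203.216.255.75", "203.216.255.76",
    "203.216.255.77", "203.216.255.78", "203.216.255.79",
]

for network in covering_networks(OBSERVED):
    print(network)
```

Whether the bot also used addresses outside this span cannot be told from the excerpt alone, so the full log would need the same treatment before any range is blocked.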

Searching the Internet turns up few reports of similar cases, so I still do not really know whether this was an exception or something else entirely.