Google Crawler hitting your site too aggressively?
If your Drupal site suffers occasional slowdowns or outages, check whether crawlers are hitting your site too hard.
Several clients have complained about this, and upon investigation we found that the culprit was Google's own crawler.
The telltale sign is lots of queries executing with large numbers in the LIMIT clause. Depending on your site's specifics, these queries may also show up as slow queries.
This means that crawlers are accessing very old content (hundreds of pages back).
Here is an example from a recent client:
As you can see, Google's crawler is going back 340+ pages for the last query.
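The page depth can be read off the LIMIT clause with simple arithmetic. As a rough sketch, assuming MySQL-style `LIMIT offset, count` syntax and Drupal's zero-based pager: the helper name `pager_page` and the sample query below are hypothetical illustrations, not taken from the client's site.

```python
import re

# Matches MySQL-style "LIMIT offset, count" (assumed form of the pager query).
LIMIT_RE = re.compile(r"LIMIT\s+(\d+)\s*,\s*(\d+)", re.IGNORECASE)


def pager_page(sql):
    """Return the zero-based pager page implied by a query's LIMIT clause.

    Returns None if the query has no "LIMIT offset, count" clause.
    """
    match = LIMIT_RE.search(sql)
    if match is None:
        return None
    offset, count = int(match.group(1)), int(match.group(2))
    if count == 0:
        return None
    # Each page holds `count` rows, so the offset divided by the
    # page size gives the page being requested.
    return offset // count


# Hypothetical query for illustration only.
sample = "SELECT n.nid FROM node n ORDER BY n.created DESC LIMIT 3400, 10"
print(pager_page(sample))  # 340
```

An offset of 3400 with 10 items per page corresponds to page 340 of the listing, which is far deeper than a human visitor would normally browse.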
Going to your web server's log would show something like this:
Note the page= part of the URL, and Googlebot as the user agent.
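To check your own logs for this pattern, something like the following could help. It is a minimal sketch that assumes an Apache/Nginx combined-format access log; the function name `deep_googlebot_hits`, the `min_page` threshold of 100, and the sample log lines are all hypothetical and should be adapted to your site.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Pulls the request path out of a combined-format log line (assumed format).
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')


def deep_googlebot_hits(lines, min_page=100):
    """Return (page, path) pairs for Googlebot requests at or beyond min_page."""
    hits = []
    for line in lines:
        if "Googlebot" not in line:
            continue
        match = REQUEST_RE.search(line)
        if match is None:
            continue
        path = match.group(1)
        query = parse_qs(urlsplit(path).query)
        for value in query.get("page", []):
            # Drupal can encode several pagers as "0,340"; take the deepest one.
            parts = [int(p) for p in value.split(",") if p.isdigit()]
            if parts and max(parts) >= min_page:
                hits.append((max(parts), path))
    return hits


# Hypothetical log lines for illustration only.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample = [
    '192.0.2.1 - - [01/Jan/2024:00:00:00 +0000] "GET /blog?page=340 HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT_UA,
    '192.0.2.2 - - [01/Jan/2024:00:00:01 +0000] "GET /blog?page=2 HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT_UA,
    '192.0.2.3 - - [01/Jan/2024:00:00:02 +0000] "GET /blog?page=500 HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]
print(deep_googlebot_hits(sample))  # [(340, '/blog?page=340')]
```

In practice you would pass an open log file instead of the sample list, for example `deep_googlebot_hits(open("access.log"))`, where the file name is again an assumption. Matching on the user agent string alone does not prove the request came from Google, since the header can be spoofed.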
The solution is often to go into Google Webmaster Tools and reduce the crawl rate for the site, so that the crawler is not hitting too many pages at the same time. Start with a 20% reduction. You may need to go as far as 40% in severe cases.
Either way, you will need to experiment to find a value that fits your site's specific case.
Here is how to change Google's crawl rate.