Post-mortem of a Search Engine Suicide

I prob’ly don’t need to tell you this, but don’t ever ask search engines to delist your site. Ever. They actually do it.

A bit of background…

Last month I relaunched justin hileman dot info with a fresh new theme, and a couple of cool features. While I was developing the replacement site, I had a staging version which I didn’t want Google to index. I used a handy little robots.txt file to keep the search engine spiders at bay:

User-agent: *
Disallow: /

This file did the trick.
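To see why those two lines are so effective, here is a minimal sketch of how a well-behaved crawler reads them, using Python's standard `urllib.robotparser`. The helper name `is_allowed`, the `example.com` host, and the sample paths are made up for illustration; real crawlers have their own parsers, so treat this as an approximation of the behavior and not as what Google actually runs.

```python
from urllib.robotparser import RobotFileParser

# The same two-line file used on the staging site.
STAGING_ROBOTS = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and these paths are placeholders, not real URLs from the site.
for path in ("/", "/blog/some-post", "/about"):
    allowed = is_allowed(STAGING_ROBOTS, "Googlebot", "http://example.com" + path)
    print(path, "allowed" if allowed else "blocked")
```

Every path is reported as blocked, because `Disallow: /` matches any URL path that starts with a slash, which is all of them.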

The staging site wasn’t indexed, exactly as advertised. Unfortunately, when the time came to deploy, my über-restrictive robots.txt file overwrote the existing file and slipped into the live site.

Google, Yahoo, and friends dutifully ignored every page on my site. The majority of the damage came almost immediately. Most high-traffic and high-PR pages were dropped from the index within a few hours. My crawl rate fell, and the major search engines removed more pages every time they crawled my site.

My SERP traffic, understandably, tanked.

Yesterday my search engine referrals hit zero.

Search engine referrer traffic, post-apocalypse

The sharp drop in SERP traffic on February 12th coincides with the first Google crawl with the new robots.txt. The second drop, around the 23rd, was the result of Yahoo’s reindex. In just a few days my site was completely unlisted from the major search engines.

As of the time of this post, a search for “justin hileman dot info”, which should return about 1500 pages on this domain, returns nothing.

Google Search for justin hileman dot info

How could this have been avoided?

Google Webmaster Tools provides a great overview of your site. It dutifully lists any problems encountered while spidering your domain, and what might have caused them. In my case, there’s a huge red flag:

Google Webmaster tools site overview

A few restricted URLs are normal, but 817 is certainly a bit excessive. Had I paid attention to the tools Google provides, I would have noticed the abrupt change in crawl rate and the spike in restricted URLs. But at the time, I only saw the decline in traffic, and didn’t think to consult Webmaster Tools.
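A check at deploy time might also have caught the mistake before any crawler saw it. The following is a minimal sketch, assuming Python's standard `urllib.robotparser` and a deploy process that can run a script and stop on a non-zero exit code. The function names, the default `Googlebot` user agent, and the paths in the usage note are hypothetical.

```python
import sys
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, user_agent="Googlebot"):
    """Return the subset of paths that robots_txt forbids for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


def check_before_deploy(robots_path, must_be_crawlable):
    """Return 0 if every listed path is crawlable, 1 otherwise."""
    with open(robots_path, encoding="utf-8") as f:
        blocked = blocked_paths(f.read(), must_be_crawlable)
    for path in blocked:
        # Report each offending path so the failure is easy to diagnose.
        print("robots.txt blocks", path, file=sys.stderr)
    return 1 if blocked else 0
```

A deploy script could then call something like `sys.exit(check_before_deploy("robots.txt", ["/", "/about"]))` and refuse to ship a robots.txt that shuts out pages which are supposed to be indexed.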

The moral of the story:

First, search engines actually respect your robots.txt file. Second, it’s a really bad idea to tell them to go away, because they will. And take your traffic with them.

Google, Yahoo, I’ve learned my lesson… Please relist me.