With a few simple changes to a robots.txt file, the daily page impressions on one of my sites have more than quadrupled. The bottom line is this: don’t underestimate the effects of Google’s search results.
One of my sites gets steady traffic from Counter Strike addicts (and has for a few years now). In recent months I had noticed a decline in AdSense page impressions for that site, but had dismissed it. The site is old and isn’t updated at all, so I was thinking that its day had passed. On the other hand, several pages are still linked from dozens (if not hundreds) of gaming forums across the world.
On a whim I checked the contents of my robots.txt and found that I was denying web crawlers access to various parts of the site. I searched Google and didn’t find any of the site’s pages. Crap. So, I made one simple change to my site and the hits are rolling in again. I removed all of the Disallow lines, thereby letting the bots, spiders, and crawlers do their thing without restriction.
The directories I removed from robots.txt either:
- had authentication requirements, which deny access anyway
- were old and had been physically removed from the system
The only other URLs I wouldn’t want crawled are for administrative use, and they all use various authentication means. Why not list them in robots.txt anyway, just in case? Because there are plenty of bad spiders and bots that either ignore the robots.txt standard or have been found to harvest email addresses. Why would I want to advertise administrative URLs to potential attackers?
At 1145, I saved the new robots.txt file to disk:
User-agent: Mediapartners-Google*
Disallow:

User-agent: TurnitinBot
Disallow: /
The first section allows the AdSense crawler access to everything, so Google can display relevant ads. An empty Disallow line means nothing is off limits. TurnItIn owes me money, so their crawler, TurnitinBot, is denied all access in the second section.
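If you want to sanity-check rules like these before waiting on a crawler, here is a minimal sketch using Python's standard urllib.robotparser module. It is not something I ran at the time; the example.com URL and the Googlebot agent string are stand-ins I picked for illustration, and real crawlers may interpret the rules a little differently than this parser does (especially the trailing asterisk in the first section).

```python
from urllib.robotparser import RobotFileParser

# The two sections from the post, one directive per line.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google*
Disallow:

User-agent: TurnitinBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical page URL, used only for illustration.
url = "http://example.com/survey.html"

# Ask the parser whether each crawler may fetch the page.
for agent in ("Mediapartners-Google", "Googlebot", "TurnitinBot"):
    verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
    print(f"{agent}: {verdict}")
```

With this file, only TurnitinBot should come back as blocked. Any crawler that no section names, such as the regular Google search crawler, falls through to the default of being allowed.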
This morning I noticed that I was back in Google’s search results for “counterstrike survey” and that yesterday’s AdSense impressions had quadrupled. Today’s impressions are about five times the norm, and the day is barely half over.
Granted, my pages were already ranked pretty high for that particular search. My pages are also linked from numerous other sites. Your mileage may vary, but Google’s web search is definitely not to be underestimated.