Pipermail Archives Removed from Google Index
May 20th, 2006 by Alex
A few days ago, requests for the public archives of my SAIC mailing list ceased. It appears that the pipermail archives have been removed from Google’s index entirely.
A few weeks ago, I noticed a rather large spike in traffic to the archives. My referral logs showed that much of the new traffic originated from Google search results. A quick site search on Google yielded a LOT of pages from the archives. Some queries for “SAIC IPO” even returned my pages on the first page of results.
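For anyone who wants to run the same kind of check, here is a minimal sketch of tallying Google search referrals from an access log. It assumes Apache's Combined Log Format, which may not match your server, and the function name `google_queries`, the sample log lines, and the request paths are all made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumed layout (Combined Log Format):
#   ... "request" status bytes "referrer" "user-agent"
LOG_RE = re.compile(
    r'"(?P<request>[^"]*)" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"$'
)

def google_queries(lines):
    """Count the search terms (the q= parameter) of Google referrers."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line.strip())
        if not match:
            continue  # skip lines that are not in the assumed format
        ref = urlparse(match.group("referrer"))
        host = ref.hostname or ""
        if host == "google.com" or host.endswith(".google.com"):
            query = parse_qs(ref.query).get("q", [""])[0]
            counts[query] += 1
    return counts

# Hypothetical sample lines, not taken from the real logs.
sample = [
    '1.2.3.4 - - [18/May/2006:10:00:00 -0400] '
    '"GET /pipermail/example/000123.html HTTP/1.1" 200 5120 '
    '"http://www.google.com/search?q=SAIC+IPO" "Mozilla/5.0"',
    '5.6.7.8 - - [18/May/2006:10:01:00 -0400] '
    '"GET /pipermail/example/ HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

for term, hits in google_queries(sample).most_common():
    print(hits, term)
```

Running it daily over the log makes both a spike and a sudden drop to zero easy to spot.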
It is hard to find news articles of interest without a search engine of some sort, especially with two years of content. I originally submitted the site to Google so that I wouldn’t have to index it myself. If I could pull in a bit of revenue from AdSense, so much the better. Before the spike in traffic, the archive pages never ranked very highly, but combining a query with the “site:vision.moundalexis.com” operator was a useful enough way to search the archives for a particular topic.
Today, a site search yields only the Mailman listinfo pages and none of the pipermail archives. It isn’t a global phenomenon either; other sites’ pipermail archives are still indexed. For example, a query for “Blue Security” still returns pages from ISN’s archives. At first I thought that the type of content was to blame: reprints. Given the nature of ISN, whose archives are still indexed, that doesn’t make sense either.
This is proof that Google, as grand as they can be, can’t be depended on for some services. While this isn’t news, given their terms of service, it’s a new problem for me to deal with. I’ll probably never know why the pages were removed from the index, but it appears that I’ll have to find a new mechanism for searching the archives.
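One possible stopgap, sketched here rather than settled on: if the list's raw messages are available as an mbox file, Python's standard `mailbox` module can scan them locally. The function name `search_mbox` and the file path in the usage line are hypothetical, and a plain substring match like this is far cruder than a real search engine.

```python
import mailbox

def search_mbox(path, term):
    """Return (subject, date) for messages whose subject or text body contains term."""
    term = term.lower()
    hits = []
    for msg in mailbox.mbox(path):
        body_parts = []
        # walk() visits the message itself and any MIME sub-parts.
        for part in msg.walk():
            if part.get_content_maintype() == "text":
                payload = part.get_payload(decode=True) or b""
                body_parts.append(payload.decode("utf-8", errors="replace"))
        subject = str(msg.get("Subject", ""))
        text = subject + "\n" + "\n".join(body_parts)
        if term in text.lower():
            hits.append((subject, str(msg.get("Date", ""))))
    return hits

if __name__ == "__main__":
    # Hypothetical path; point this at wherever the list's mbox actually lives.
    for subject, date in search_mbox("archive.mbox", "IPO"):
        print(date, subject)
```

This rereads the whole file on every query, which is tolerable for two years of a small list but would call for a proper index on anything larger.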