
Saving Usenet

This morning I became caught up in a Deja thread concerning Usenet archiving. As many users of Deja have noticed, it is now somewhat difficult to find netnews posts more than a year old. The initial fear, which ended up being unfounded, was that Deja was purging its archives of old material. This prompted a few scattered discussions of Usenet archiving, and a rather disturbing picture has emerged.

Usenet itself has existed since 1979, and there were popular mailing lists even earlier, probably the most famous being SF-LOVERS. However, Deja's archive extends back only to the spring of 1995. Usenet's supposed "golden age," which started with the "Great Renaming" in 1987, is for the most part lost. There is one small, fascinating archive of material from 1981 through 1982, the Usenet Oldnews Archive Newsgroups List, and beyond that, nothing.

There are obviously a lot of cracks in any techno-utopian discourse of the Internet as the ultimate knowledge tool, but it is more than a bit disturbing that so much of the Internet simply disappears. I tend to think of Usenet and listserv as the principal Internet knowledge repositories prior to the wide-scale popularity of the Web. The idea that these resources and the history they contain are gone for good leaves me feeling sick.

The Web itself doesn't fare much better. I dug up a Lynx bookmark file from 1995, and checking the 117 links told an interesting story. Most high-profile sites listed in 1995 - Yahoo, EFF, Hotwired, and others - remain up, while a few, most notably GNN (remember them?), are gone. However, the real carnage is among special-interest sites and links to specific documents in any domain. Roughly 80% of these links are dead.

All the Gopher sites were dead as well, as one would expect, including the beloved WELL gopher. Most of the telnet links were dead, as were about half the FTP links.

The June 1999 OCLC Web Characterization Project report seems to indicate that fully 44% of IP addresses associated with a Web site in 1998 were no longer associated with a site in 1999. At best, we are seeing a substantial churn as Web sites move from server to server. At worst, we are seeing the loss of knowledge and personal narrative with each passing day.

Deja has no plans to extend its archive back before 1995. Since Deja is the only Usenet archive on the Web, the best we can hope for is that it will change its mind. Let Deja know what you think: Email Deja and urge them to extend the Usenet archive to cover the years before 1995.