For my last session of SAA 2010, I chose to attend "A Flickr of Hope: Harvesting Social Networking Sites."
Lori Donovan spoke on behalf of the Internet Archive, which launched Archive-It. Founded as a digital library in 1996, the Internet Archive is now the biggest digital repository, with over 4.5 PETABYTES of information. With Archive-It, one of its most popular services, users can create, manage and present collections of web content. The service uses a suite of open source software to crawl sites and enable searches, and it uses the Wayback Machine to let users surf the web as it appeared when a particular website was captured! Super cool!
Jennifer Ricker of the North Carolina State Archives shared her institution’s experience as a pilot partner with the Internet Archive, which launched a beta version of Archive-It in 2005. The service harvests websites as often as the host institution schedules a crawl. The NC State Archives was interested in capturing the web presence of its state agencies over time. The Governor encouraged the Archives to pursue this project, and her support certainly seems to have ensured that the process continues. The governor’s office is also a user of the service, searching the captured information and viewing the rendered results.
Currently the NC State Archives harvests data from accounts on Facebook, Twitter, Flickr and YouTube. However, although data on YouTube is archived, it cannot be rendered as of now; that is, you can’t watch the videos, but you can see that they’ve been captured. In the future they hope to harvest accounts and data from LinkedIn. For more information and to view the harvested websites, visit <a href="http://www.archives.ncdcr.gov/webarchives/index.html">the State Archives’ web archives page</a>.
Similarly, Bonita Weddle of the New York State Archives has embarked on a harvesting project to capture the web presence of state archives, using an OCLC harvesting tool. They have captured data, or a panoramic sweep of websites, during the turnover of the Governor’s office beginning in 2006. In addition to creating MARC catalog records for these harvested sites, Weddle shared that they will be creating finding aids for them in the near future.