Date of Award

Summer 2008

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

Committee Director

Michael L. Nelson

Committee Member

Kurt Maly

Committee Member

Steven J. Zeil

Committee Member

Mohammed K. Zubair

Committee Member

Simeon Warner

Abstract

Digital preservation of the World Wide Web poses unique challenges, different from the preservation issues facing professional Digital Libraries. The complete list of a website’s resources cannot be cited with confidence, and the descriptive metadata available for the resources is so minimal that it is sometimes insufficient for a browser to recognize. In short, the Web suffers from a counting problem and a representation problem. Refreshing the bits, migrating from an obsolete file format to a newer format, and other classic digital preservation problems also affect the Web. As digital collections devise solutions to these problems, the Web will also benefit. But the core World Wide Web problems of Counting and Representation need a targeted solution.

As the host of web content, the web server is uniquely positioned to assist in the preservation of the resources it serves. It knows both which resources it has and what kind of resources they are. This dissertation presents research in which preservation functions have been integrated into the web server itself. The CRATE Model defines a method for addressing the Counting Problem and the Representation Problem using existing web server-compatible technology. A series of experiments that evaluated this approach is presented, along with a technical review of the MODOAI web server module, which acts as the preservation agent. The feasibility of this approach is demonstrated by a quantitative analysis of its use in a commercial web testing environment.

ISBN

9780549753667
