Date of Award

Summer 2000

Document Type


Degree Name

Doctor of Philosophy (PhD)


Computer Science

Committee Director

Kurt Maly

Committee Member

David Keyes

Committee Member

Stewart N.T. Shen

Committee Member

Frank C. Thames

Committee Member

Mohammad Zubair


Discussion of digital libraries (DLs) is often dominated by the merits of various archives, repositories, search engines, search interfaces and database systems. While these technologies are necessary for information management, information content and information retrieval systems should progress on independent paths and each should make limited assumptions about the status or capabilities of the other. Information content is more important than the systems used for its storage and retrieval. Digital information should have the same long-term survivability prospects as traditional hardcopy information and should not be impacted by evolving search engine technologies or vendor vagaries in database management systems.

Digital information can achieve independence from archives and DL systems through the use of buckets. Buckets are an aggregative, intelligent construct for publishing in DLs. Buckets allow the decoupling of information content from information storage and retrieval. Buckets exist within the Smart Objects and Dumb Archives model for DLs, in which many of the functionalities and responsibilities traditionally associated with archives are “pushed down” (making the archives “dumber”) into the buckets (making them “smarter”). Among the responsibilities assigned to buckets are the enforcement of their terms and conditions, and the maintenance and display of their contents. These additional responsibilities come at the cost of storage overhead and increased complexity for the archived objects. However, tools have been developed to manage the complexity, and storage is cheap and getting cheaper; the potential benefits buckets offer DL applications appear to outweigh their costs.
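The division of responsibilities described above can be illustrated with a minimal sketch. It is not the implementation described in the dissertation: the class names, method names, and the user-list form of terms and conditions are all hypothetical, chosen only to show where the logic lives when the object is "smart" and the archive is "dumb".

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """Hypothetical bucket: content bundled with the logic that governs it."""

    bucket_id: str
    # Assumption: terms and conditions reduced to a simple list of allowed users.
    allowed_users: set[str]
    elements: dict[str, bytes] = field(default_factory=dict)

    def add_element(self, name: str, data: bytes) -> None:
        # The bucket maintains its own contents.
        self.elements[name] = data

    def display(self, user: str) -> list[str]:
        # The bucket, not the archive, enforces its terms and conditions.
        self._check(user)
        return sorted(self.elements)

    def get_element(self, user: str, name: str) -> bytes:
        self._check(user)
        return self.elements[name]

    def _check(self, user: str) -> None:
        if user not in self.allowed_users:
            raise PermissionError(f"{user} may not access bucket {self.bucket_id}")


class DumbArchive:
    """Hypothetical archive that only tracks which buckets it holds."""

    def __init__(self) -> None:
        self._buckets: dict[str, Bucket] = {}

    def add(self, bucket: Bucket) -> None:
        self._buckets[bucket.bucket_id] = bucket

    def list_ids(self) -> list[str]:
        return sorted(self._buckets)

    def get(self, bucket_id: str) -> Bucket:
        # No access checks here: those are the bucket's responsibility.
        return self._buckets[bucket_id]
```

In this sketch the archive performs no access control and knows nothing about the contents of a bucket, so a bucket moved to a different archive would carry its rules and its display logic with it.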

We describe the motivation, design and implementation of buckets, as well as our experiences deploying buckets in two experimental DLs. We also introduce two modified forms of buckets: a “dumb archive” (DA) and the Bucket Communication Space (BCS). The DA is a slightly modified bucket that performs simple set management functions. The BCS provides a well-known location for buckets to gain access to centralized bucket services, such as similarity matching, messaging and metadata conversion. We also discuss lessons learned from using buckets in the NCSTRL+ and Universal Pre-print Server (UPS) experimental digital libraries. We conclude with comparisons to related work and a discussion of possible areas for future work involving buckets.


In Copyright. URI: This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).