Document Type
Conference Paper
Publication Date
2012
DOI
10.1145/2232817.2232930
Publication Title
JCDL '12: Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries
Pages
437-438
Conference Name
12th ACM/IEEE-CS Joint Conference on Digital Libraries, Washington, DC, June 10-14, 2012
Abstract
The Internet Archive's Wayback Machine is the most common way that typical users interact with web archives. The Internet Archive uses the Heritrix web crawler to transform pages on the publicly available web into Web ARChive (WARC) files, which can then be accessed using the Wayback Machine. Because Heritrix can only access the publicly available web, many personal pages (e.g., password-protected pages, social media pages) cannot be easily archived into the standard WARC format. We have created a Google Chrome extension, WARCreate, that allows a user to create a WARC file from any webpage. Using this tool, content that might have been otherwise lost in time can be archived in a standard format by any user. This tool provides a way for casual users to easily create archives of personal online content. This is one of the first steps in resolving issues of "long term storage, maintenance, and access of personal digital assets that have emotional, intellectual, and historical value to individuals".
Original Publication Citation
Kelly, M., & Weigle, M. C. (2012). WARCreate: Create wayback-consumable WARC files from any webpage. Paper presented at the 12th ACM/IEEE-CS Joint Conference on Digital Libraries, Washington, DC.
ORCID
0000-0002-2787-7166 (Weigle)
Comments
© by the author/owners.
Included with the kind permission of the author.