Conference Name

Coalition for Networked Information: Spring 2019 Membership Meeting, April 8-9, 2019, Ritz-Carlton Hotel, St. Louis, Missouri


[Summary] The authenticity, integrity, and provenance of resources we encounter on the web are increasingly in question. While many people are inured to the possibility of altered images, the easy accessibility of powerful software tools that synthesize audio and video will unleash a torrent of convincing “deepfakes” into our social discourse. Archives will no longer be monopolized by a countable number of institutions such as governments and publishers, but will become a competitive space filled with social engineers, propagandists, conspiracy theorists, and aspiring Hollywood directors. While the historical record has never been singular or unmalleable, current technologies empower an unprecedented number of skillful would-be editors of history.

Web archives have a role to play in verifying the integrity and priority of resources. Unfortunately, web archives have a 1990s, ad-hoc approach to trust, interoperability, and audit. We implicitly trust the Internet Archive in the same way we used to trust email, Google, Apple, and Facebook. That we do not currently associate web archives with surveillance, spam, and subterfuge does not mean they are somehow immune in a way the other tools and services are not; it only means that the theatre of conflict has yet to encompass web archives. As the political, cultural, and economic stakes of disinformation rise, we can expect two primary changes.

First, existing, trusted web archives will be attacked. Obvious targets will be the machines and facilities themselves, but more subtle attacks will involve legitimately crawled pages, which then masquerade as pages with fake URLs and date stamps, thereby obfuscating the provenance of otherwise untrustworthy sources.

Second, the number of web archives will proliferate, and not all will be trustworthy. When web archives required custom tools and expensive hardware, only a limited number of people were capable of operating them, and they were well known in our community. We now have a dynamic marketplace of web archives, many of which are short-lived, and at least some of which could be operated by replicants (distinguishable from humans only by their empathetic responses to questions about tortoises).

In summary, is that really an archived tweet from 2016 with a video of your favorite politician in an unflattering situation? Or is it a backdated deepfake, injected into a trusted archive, and then replicated across several less established archives, all of which are secretly operated by the same entity?


Audiovisual recording on publisher's website.


© 2019 The Author.

Published under a Creative Commons Attribution 4.0 International (CC BY 4.0) License.

Original Publication Citation

Nelson, M. L. (2019). Web archives at the nexus of good fakes and flawed originals [Presentation]. Coalition for Networked Information: Spring 2019 Membership Meeting. St. Louis, Missouri.


ORCID: 0000-0003-3749-8116 (Nelson)