Cal Lee

Last updated on 30 November 2018

Christopher (Cal) Lee works at the School of Information and Library Science at the University of North Carolina at Chapel Hill.

Digital preservation is about conveying meaningful information between contexts over time [1].  A great deal of the complexity stems from digital information residing at multiple levels of representation [2].  This process is never free.  It requires resources (human, technical, financial).  Ensuring a steady flow of resources over time is difficult.

At any given time, dedicated individuals and informal groups play a vital role in the provision of resources (collecting, organizing, storing and sharing information in which they have an interest).  Commercial providers of information systems also play a major role, by providing the platforms upon which consumers create, manage and share information.

It's risky to rely solely on individuals, informal groups, and commercial information system providers, because they often don't have the capability or incentive to effectively allocate resources over long periods of time.  The two primary responses to this issue are to (1) systematically channel resources through the individuals/groups/providers (e.g. training, donations, new business models), or (2) transfer responsibility for stewardship of the information to third parties.  Traditionally, those third parties have been "memory institutions," including libraries, archives and museums (LAMs).

Whoever is responsible for digital preservation must deal with an ever-changing social and technical environment.  It's impossible to predict what the specific changes will be, especially over long time periods.  This makes it dangerous to optimize digital preservation approaches to a particular set of assumed conditions [3]. 

Instead, it's wise to aim for robustness, i.e. not only effectiveness in the short-term but also sufficient flexibility to remain effective in a wide range of possible future contexts.

Robust action involves making choices that don't unnecessarily lock someone into a narrow path of future choices [4].  "To escape a dependency on choice, action must be invariant to a wide range of preference judgements.  This flexible kind of action, called local action, buys more time to observe opportunity context." [5]  In chess, for example, players at the highest level often prioritize moves that leave their options open and give them time to see what their opponent does, rather than launching into a sequence whose success depends on their opponent doing something specific.

Similarly, robust design is based on a recognition of both immediate (better known) and future (less well-known) needs.  A "design is robust when its arrangement of concrete details are immediately effective in locating the novel product or process within the familiar world, by invoking valued schemas and scripts, yet preserve the flexibility necessary for future evolution, by not constraining the potential evolution of understanding and action that follows use." [6]  Limiting the dependencies between subsystems can also make a design more robust against disruptions from the environment [7].

Robustness for digital preservation can range from fairly low-level considerations, such as the robustness of file formats [8], to higher-level institutional strategies and structures.  According to a RAND Corporation report [9]:

The future of archiving and preservation is highly uncertain... A robust strategy will have to anticipate these uncertainties and prepare for future trends that are foreseeable. (p.2)

...using multiple scenarios, it becomes possible to test policy recommendations for their robustness. If an option appears to be effective in several, highly different scenarios, this implies that the option is robust.  For options that are not robust, it is equally significant to understand under which circumstances they are not effective. (p.76)
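The scenario test described in the report can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the scenario names, the option names, the scoring rules and the 0.5 threshold are all invented here and do not come from the RAND report. An option counts as robust when it stays above the threshold in every scenario, and for the others the sketch lists the circumstances in which they fail.

```python
from typing import Callable, Dict, List

Scenario = Dict[str, float]
Score = Callable[[Scenario], float]


def failing_scenarios(score: Score, scenarios: Dict[str, Scenario],
                      threshold: float) -> List[str]:
    """Return the names of scenarios in which an option scores below threshold."""
    return [name for name, s in scenarios.items() if score(s) < threshold]


def assess(options: Dict[str, Score], scenarios: Dict[str, Scenario],
           threshold: float = 0.5) -> Dict[str, List[str]]:
    """Map each option to the scenarios where it is ineffective.

    An empty list means the option was effective in every scenario,
    i.e. it is robust in the sense used above.
    """
    return {name: failing_scenarios(score, scenarios, threshold)
            for name, score in options.items()}


# Hypothetical scenarios (assumption): funding level and vendor survival, 0..1.
SCENARIOS: Dict[str, Scenario] = {
    "stable funding, vendor survives": {"funding": 0.9, "vendor": 1.0},
    "budget cuts, vendor survives": {"funding": 0.3, "vendor": 1.0},
    "stable funding, vendor fails": {"funding": 0.9, "vendor": 0.0},
}

# Hypothetical options (assumption) with toy effectiveness rules.
OPTIONS: Dict[str, Score] = {
    "single proprietary platform": lambda s: s["vendor"] * 0.9,
    "open formats, two storage providers": lambda s: 0.6 + 0.2 * s["funding"],
}

if __name__ == "__main__":
    for option, failures in assess(OPTIONS, SCENARIOS).items():
        verdict = "robust" if not failures else f"fails in: {failures}"
        print(f"{option} -> {verdict}")
```

The useful output is less the "robust" label than the list of failing scenarios, which is what tells you where a non-robust option breaks down.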

One could argue that because the above report was from more than a decade ago, and there are now many systems available to support digital preservation, we no longer need to take robust approaches.  Instead, we can simply pick a software offering and consider digital preservation to be solved.  However, such a response would be misguided.  While there have been many exciting advances in both digital preservation research and market offerings, the future of preservation is inherently uncertain, because it requires the navigation of an evolving landscape. 

A vital priority in digital preservation is interoperability. LAMs constantly struggle with the need to insist on strict conformance to standards internally while also interacting with actors (creators, users) who pay little (if any) attention to those standards.  In short, they must follow Postel's Law:

The implementation of a protocol must be robust. Each implementation must expect to interoperate with others created by different individuals. While the goal of this specification is to be explicit about the protocol there is the possibility of differing interpretations. In general, an implementation should be conservative in its sending behavior, and liberal in its receiving behavior. That is, it should be careful to send well-formed datagrams, but should accept any datagram that it can interpret (e.g., not object to technical errors where the meaning is still clear). [10]

Stated more simply, "be conservative in what you do, be liberal in what you accept from others." [11]
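To make the principle concrete for a preservation workflow, here is a minimal Python sketch, assuming a repository that receives date metadata from depositors in whatever form they happen to use. The function names and the list of accepted formats are hypothetical choices for this illustration. It is liberal in what it accepts (several common date spellings, stray whitespace) and conservative in what it emits (always ISO 8601).

```python
from datetime import date, datetime
from typing import Optional

# Formats a depositor might plausibly supply. This list is an assumption made
# for illustration; note that "%d/%m/%Y" reads 01/02/2018 as 1 February.
# "%B" matches English month names under Python's default (C) time locale.
ACCEPTED_FORMATS = ("%Y-%m-%d", "%d %B %Y", "%d/%m/%Y", "%B %d, %Y")


def parse_date_liberally(text: str) -> Optional[date]:
    """Accept any input whose meaning is still clear; return None otherwise."""
    cleaned = " ".join(text.split())  # tolerate stray or repeated whitespace
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # not this format; try the next one
    return None  # refuse to guess when no format matches


def emit_date_conservatively(value: date) -> str:
    """Always send one strictly well-formed representation (ISO 8601)."""
    return value.isoformat()


if __name__ == "__main__":
    for raw in ("30 November 2018", "  2018-11-30 ", "November 30, 2018"):
        parsed = parse_date_liberally(raw)
        if parsed is not None:
            print(repr(raw), "->", emit_date_conservatively(parsed))
```

Being liberal does not mean guessing: input that cannot be interpreted returns None, so it can be flagged for a person to review.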

It can be helpful to distinguish three different aspects of robustness:

  • Diversity - If you only have one vaccine and it doesn't protect against the current pandemic, a billion copies of that vaccine won't help you.
  • Redundancy - A single copy of a diverse library collection poses a serious risk of loss.
  • Multiple locations - Diversity and redundancy may not help if everything is in one place (e.g. a rainforest that is destroyed).
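One way to apply the three aspects to stored copies of a digital object is sketched below in Python. The mapping is my own assumption for illustration: redundancy is the number of copies, diversity is the number of distinct storage technologies, and multiple locations is the number of distinct sites. The `Copy` record, its fields and the minimum of two are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Copy:
    """One stored copy of an item (hypothetical record for illustration)."""
    item: str        # identifier of the preserved item
    technology: str  # e.g. storage medium or software stack
    location: str    # e.g. site or region


def robustness_report(copies: List[Copy], item: str) -> Dict[str, int]:
    """Count copies, distinct technologies and distinct locations for an item."""
    own = [c for c in copies if c.item == item]
    return {
        "redundancy": len(own),
        "diversity": len({c.technology for c in own}),
        "locations": len({c.location for c in own}),
    }


def weak_aspects(report: Dict[str, int], minimum: int = 2) -> List[str]:
    """Return the aspects that fall below the (assumed) minimum, sorted by name."""
    return sorted(name for name, count in report.items() if count < minimum)


if __name__ == "__main__":
    copies = [
        Copy("report.pdf", "disk", "Chapel Hill"),
        Copy("report.pdf", "disk", "Chapel Hill"),
        Copy("report.pdf", "tape", "Chapel Hill"),
    ]
    report = robustness_report(copies, "report.pdf")
    print(report, "weak:", weak_aspects(report))
```

In this example the item has three copies on two technologies, yet the check still flags "locations", because everything sits at one site.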

Once again, none of the above comes for free.  Studies of robustness in various settings have shown that "it is not possible to simply increase general robustness of the system without a sacrifice in performance and increased resource demands." [12]  A major part of digital preservation work is making the case for robustness along various dimensions.  This involves foregrounding long-term adaptability in the face of pressure to solely prioritize short-term optimization.

How can advocates for digital preservation make this case most effectively?  One answer is based on, you guessed it, robustness.  In a study of the history of networking technologies, Urs von Burg argues:

"...IBM was able to insist that the IEEE create a standard for its Token Ring LAN technology in addition to the Ethernet standard that the IEEE had promulgated.  But whereas Token Ring was supposedly an open standard, IBM did what it could to retain proprietary control over its development, and these efforts interfered with the formation of a robust technological community.

Such a community must be nurtured and constructed." [13]


[1] Lee, Christopher A. "A Framework for Contextual Information in Digital Collections." Journal of Documentation 67, no.1 (2011): 95-143.

[2] Lee, Christopher A. "Digital Curation as Communication Mediation." In Handbook of Technical Communication, edited by Alexander Mehler, Laurent Romary, and Dafydd Gibbon, 507-530. Berlin: Mouton De Gruyter, 2012.

[3] Lee, Cal. Never Optimize: Building & Managing a Robust Cyberinfrastructure, History and Theory of Infrastructure: Distilling Lessons for New Scientific Cyberinfrastructures, Ann Arbor, MI, September 28 - October 1, 2006.

[4] Padgett, John F., and Christopher K. Ansell. "Robust Action and the Rise of the Medici, 1400-1434." American Journal of Sociology 98, no. 6 (1993): 1259-319.

[5] Leifer, Eric Matheson. Actors as Observers: A Theory of Skill in Social Relationships. New York: Garland, 1991. (emphasis added) 

[6] Hargadon, Andrew B., and Yellowlees Douglas. "When Innovations Meet Institutions: Edison and the Design of the Electric Light." Administrative Science Quarterly 46, no. 3 (2001): 476-501. (emphasis added)

[7] Simon, Herbert A. "The Architecture of Complexity." Proceedings of the American Philosophical Society 106 (1962): 467-82.

[8] Rog, Judith, and Caroline van Wijk. "Evaluating File Formats for Long-term Preservation." The Hague: National Library of the Netherlands, 2008.

[9] Hoorens, Stijn, Jeff Rothenberg, Constantijn van Orange, Martijn van der Mandele, and Ruth Levitt. "Addressing the Uncertain Future of Preserving the Past: Towards a Robust Strategy for Digital Archiving and Preservation." Santa Monica, CA: RAND Corporation, 2007. (emphasis added)

[10] Postel, Jon. RFC 760 "DOD Standard Internet Protocol." 1980. (emphasis added)

[11] Information Sciences Institute. RFC 761 "DOD Standard Transmission Control Protocol." 1980.

[12] Kitano, Hiroaki. "Biological Robustness." Nature Reviews Genetics 5 (2004): 827-837.

[13] Burg, Urs von. The Triumph of Ethernet: Technological Communities and the Battle for the LAN Standard, Innovations and Technology in the World Economy. Stanford, CA: Stanford University Press, 2001. p.9 (emphasis added)



#1 Christopher Lee 2018-11-30 16:33
There are many interesting parallels between the themes I've raised and those of "Community Cultivation – A Field Guide," recently released by the Educopia Institute.

As the authors state, "Without deep knowledge of how to build a support community, and how to manage such elements as resources, communications, engagement, and governance, innovators find the bridge between grant funding and ongoing operations very difficult to cross."
