Jon Tilbury is Chief Technology Officer for Preservica, based in the UK.
Digital Preservation has come a long way since the early research projects. The earliest practitioners were academics and specialists who set this field in the right direction and contributed hugely to defining what Digital Preservation is, creating the language we all use today: SIPs and DIPs, ingest and dissemination, and preservation planning. The journey will be complete when information is preserved without anyone needing to understand how, and long-term retention and use is just another tick box in your day-to-day IT platform. How far away are we from creating this preserved future?
The early Digital Preservation research projects started in the late 1990s and reached their peak with large numbers of EC-funded projects in the first 15 years of the millennium. I became involved in the early PRONOM days and enjoyed many trips around Europe on four different research projects as practitioners exchanged ideas and built prototypes that encapsulated them. We used the OAIS reference model to create a common language that we all now use to describe our systems.
As a result of this early research work, pioneer organisations developed production systems to deliver this technology, in our case starting with the UK National Archives in 2003. Some of these systems evolved into products that could be re-used across many use cases, and as a result you can now choose from a range of off-the-shelf products, each with a different focus and commercial model. The growth of the cloud over this period has expanded the choice to include hosted platforms, a viable option that incorporates much of the good practice defined by the early research. I have been especially pleased to see how this has allowed less well-funded organisations to get access to leading-edge technology and create some fascinating and important collections.
We now have good-quality choices for archivists, librarians and record keepers that map onto standard practices and allow them to preserve information in the way they are used to with physical assets. You can transfer the files, arrange them, describe them and define the security settings. You can make them available via comprehensive catalogues and migrate the content to appropriate formats using standard workflows. So why, with all this capability, does it feel like we are at the start of the journey, not at the end?
The trouble is that the rest of the world does not think like a cultural heritage curator. They don’t want to think about organisation: their information is already organised as they wish. They don’t want to decide what needs to be preserved, other than defining some rules describing what sort of information needs to be kept. They don’t want to think about multiple copies in multiple locations; the system should do this for them. And they don’t want to think about file formats; the system should constantly reset the format to match the technology they are using.
Bridging this divide is the remaining challenge of Digital Preservation technology, and one my innovation team at Preservica is currently working on. It requires operational and preservation systems to be seamlessly linked, using machine intelligence to decide what needs to be kept, and it requires preservation planning and action to be automated. This latter challenge will shift the current community of academics, practitioners and vendors towards cooperating to agree on the best identification tools, the significant properties to be extracted and the best migration strategies. Once agreed, this advice will be sent out automatically and, for those users who choose, applied automatically. I will be speaking more about this in Den Haag on International Digital Preservation Day, when we discuss Significant Properties and how to share them.
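To make the idea of shared, automatically applied advice a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a description of any real product or registry: the `Advice` and `Asset` structures, the `plan_actions` function and the format labels are all hypothetical, and a real system would take its format identification from an agreed tool and run an actual migration where this sketch merely records a plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advice:
    """One piece of community-agreed advice (hypothetical structure)."""
    source_format: str             # label reported by an identification tool
    target_format: str             # format the community recommends migrating to
    significant_properties: tuple  # properties that must survive the migration


@dataclass
class Asset:
    name: str
    format: str  # hypothetical format label, not a real registry identifier


def plan_actions(assets, advice_list, auto_apply):
    """Return (applied, pending) migration plans for the given assets.

    With auto_apply=True, matching advice is recorded as applied (a real
    system would run the migration at this point). Otherwise the plan is
    only queued for a person to review.
    """
    by_format = {a.source_format: a for a in advice_list}
    applied, pending = [], []
    for asset in assets:
        advice = by_format.get(asset.format)
        if advice is None:
            continue  # no advice published for this format; leave it untouched
        plan = (asset.name, advice.target_format)
        (applied if auto_apply else pending).append(plan)
    return applied, pending


# Illustrative data only: the labels below are made up for this example.
advice = [Advice("legacy-doc", "open-doc", ("page count", "text content"))]
assets = [Asset("minutes.doc", "legacy-doc"), Asset("photo.png", "png")]

applied, pending = plan_actions(assets, advice, auto_apply=True)
print(applied, pending)
```

The opt-in flag is the important design choice here: the same published advice reaches everyone, but only users who have chosen automation see it acted on without review.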
My hope is that, using these advances, Digital Preservation can become truly ubiquitous. It will move beyond those in the know to those with the need. At first it will be built into information systems to preserve corporate memory, but it will soon become part of the consumer technology landscape, harvesting our digital lives and preserving them for the future. The irony is that my dream of making Digital Preservation truly widespread will only be delivered when we don’t even know it is happening.