Ailie O’Hagan is the Digital Preservation Librarian at Queen’s University Belfast.
On the surface, the essential goal of digital preservation is simple: to maintain a digital object, its integrity and authenticity, for as long as necessary, while ensuring it remains usable and accessible. However, with the ready availability of digital software and platforms to suit all needs, and with increased complexity and interactivity, concepts of ‘usable’ or ‘authentic’ become harder to define, and the ongoing tasks of digital preservation must adapt to fit within stakeholder capacity.
In the university environment, we accept a wide variety of formats for diverse projects, across schools and disciplines with different priorities and requirements. This past year at Queen’s University Belfast, we began active digital preservation of our e-thesis content. Because we were working around pre-existing upload and verification processes designed for sharing, storing, and linking large volumes of research content, digital preservation was introduced at the end of the chain. We wanted to improve this position, as the linear workflow disconnected digital preservation from our researcher community and meant we were tasked with preserving content that had already been compressed, encrypted, or hyperlinked in response to rights management or repository submission requirements. Our goal, then, was to establish which options and tools might improve our ability to preserve the long-term integrity and understanding of these pre-existing collections, and to inform an inclusive preservation strategy for their management.
At the DPC Unconference in May, I opened this topic with the member community, asking: how do we balance inclusivity with stakeholder capacity? As with our collections, the discussion was varied. A ‘one-size-fits-all’ approach to recommended file formats and deposit volume capacity can unintentionally bias our collections against depositors in fields working with larger (e.g. audio / visual) file types, or with interactive research outputs. Equally, accepting larger deposits has implications for sustainability, while our expectations around digitised and commissioned materials (for which we can control the specifications) will not be the same as for born-digital deposits. Being too restrictive in the file formats that we accept undermines the information value of the content and risks replicating the historical exclusion of voices. For example, open-source apps are more accessible to creators working in lower-budget settings, often arts and humanities or community groups, but traditionally less attention is given to preserving them. In some cases, this exacerbates a socio-economic divide, as proprietary apps and operating systems, which are unavailable to many communities, have more funding behind them to maintain and preserve their projects.
So how can we be more inclusive across the digital preservation community? Suggestions included supporting preservation of open-source apps, providing context and explanatory notes for our collections to acknowledge any bias in institutional digitisation policies, and using decision trees as a useful device to support inclusive appraisal.
Working with colleagues from the Sonic Arts Research Centre and Open Research in the Library at Queen’s, we shared a survey with postgraduate and academic researchers to understand what motivates file format choices in our own research community. Researchers revealed diverse reasons behind their file format choices, ranging from convenience, cross-compatibility, and personal preference to industry or publisher requirements, accessibility, and access control. For instance, researchers in Electronics, Electrical Engineering and Computer Science are more likely to favour formats that are open-source and shareable, whereas those in Pharmacy and Psychology prioritise access control. The Schools of Arts, English and Languages, and Natural and Built Environment typically work with the greatest variation of file formats and data types, with larger proportions of visual outputs accounting for large file sizes and variation in proprietary software use. By considering what our researchers perceive to be the significant properties of their work, we can create better tailored support, guidance and advocacy for digital preservation.
During the summer, I travelled to The National Archives (UK) to get an alternative institutional perspective on inclusive preservation. Whether working with technical Digital Preservation or community-facing Archives Sector Leadership, the TNA Digital Archives team emphasised the importance of good relationships and records. Rather than seeking perfect file formats, their advice was to put effort into documentation, with ReadMe files and boilerplates, and into capturing rights documentation. For web archives, we can create metadata in layers, or consider Elasticsearch or cataloguing, to better preserve context and facilitate discovery. Tools such as the PRONOM registry improve inclusive preservation by providing information about the signatures, software, and technical elements of our file formats, supporting our preservation decision-making and problem-solving. PRONOM is shaped by community contributions, which helps to provide equitable support for file formats, regardless of perceived commercial or societal value. And allow time: to engage with depositors, ensuring we capture rich information about our acquisitions, and to be kind to ourselves. Digital archiving is complex, and it may take weeks to archive and preserve a collection. Removing the pressure on ourselves to ingest and preserve immediately means we can be more generous about the collections we preserve.
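For readers less familiar with what a format ‘signature’ is, the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SIGNATURES` table and the `identify` function are hypothetical names of my own, the byte patterns are widely documented ‘magic numbers’ rather than PRONOM’s own records, and real identification tools do considerably more than this.

```python
from typing import Optional

# Illustrative table only (not PRONOM data): each label maps to a list of
# (byte offset, expected bytes) pairs that must all match.
SIGNATURES = {
    "PDF document": [(0, b"%PDF-")],
    "PNG image": [(0, b"\x89PNG\r\n\x1a\n")],
    "WAVE audio": [(0, b"RIFF"), (8, b"WAVE")],
    "ZIP container": [(0, b"PK\x03\x04")],
}


def identify(data: bytes) -> Optional[str]:
    """Return the first label whose byte patterns all match, else None."""
    for label, parts in SIGNATURES.items():
        if all(data[offset:offset + len(magic)] == magic
               for offset, magic in parts):
            return label
    return None  # Unknown format: a prompt to ask the depositor, not to reject.


# Usage: check the opening bytes of a deposited file's content.
print(identify(b"%PDF-1.7\n..."))
print(identify(b"plain text notes"))
```

An unmatched file returns `None`, which in practice is where documentation and conversations with depositors matter most. A match on a container format such as ZIP says little about what is inside, which is one reason richer, community-maintained signature information is valuable.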
Understanding our digital preservation needs requires input from many stakeholders and perspectives. Considering the different angles on how our collections are created, used, and managed can help us to be more inclusive in our preservation practice.
This opportunity to gather different perspectives on digital preservation was enabled as part of a TNA-RLUK Professional Fellowship project 2024-25. The next step for this project is to work with University schools-based and Digital Preservation practitioner focus groups to define a workflow and toolkit to support different creators. To find out more, or to be involved, please sign up here.