Introduction - Digital Preservation Coalition

Computational Access: A beginner's guide
Introduction

Providing access to digital content is a core activity for digital preservation practitioners, making up one of the six functional entities of the Reference Model for an Open Archival Information System (OAIS), or ISO 14721:2012. It is widely recognised within the digital preservation community that there is little point in preserving content for the long term if there is no intention to facilitate access at some point (either now or at a future date), but even providing simpler forms of access to content can be a challenge for digital preservation practitioners (as described in Developing an Access Strategy for Born Digital Archival Material). This is particularly the case where large volumes of content and more complex methodologies are employed.

Computational methods of providing access to digital content and metadata are generally considered to be more advanced techniques, certainly a step up from the more standard models of access, for example where a user can browse an online catalogue and view or download one file at a time.

The Levels of Born Digital Access from DLF is a helpful and practical resource which articulates three levels of access under a series of headings, moving from the most simple to the more advanced. Computational access techniques are included at the highest level of the model where the ‘Tools’ section of level 3 states that an organization should “Provide remote access and sophisticated tools for exploring, rendering, and interpretation of data; provide hardware and software to support access to legacy/obscure content, including emulation services.” Examples given within the supporting information include:

“Provide open and web-based remote access to materials, including via programming interfaces” and
“Provide software for exploring, rendering, and interpreting materials, such as text mining, data visualization, annotation, and natural language processing tools.”

Similarly, the DPC’s Rapid Assessment Model (DPC RAM) puts computational access techniques at the highest level of the model. Level 4 of 'Discovery and access' states that “Advanced resource discovery and access tools are provided, such as faceted searching, data visualization or custom access via APIs”.

There is typically no one-size-fits-all approach to digital preservation and this also extends to access strategies. Organizations are encouraged to weigh up their own priorities, resources and the needs of their users in informing their own approach. So whilst it is acknowledged that not all practitioners will strive for the highest levels of either of these models, many in the community are curious about understanding and exploring these more advanced approaches to access in order to inform their own decision making.

The access strategies of an organization should of course be aligned with the needs of its users. The growing desire of users to be able to carry out their own computational processing on archival metadata, for example, is mentioned in Born digital archive cataloguing and description. Whilst user needs are not covered in any great detail in our online resource, a helpful introduction can be found in Understanding user needs and it is acknowledged that engaging with users should be a key step in establishing appropriate access strategies.

What is the purpose of this guide?

Computational access is a term mentioned with increasing frequency by those in the digital preservation community. Many practitioners are aware it might be helpful to them (and indeed to their users), but do not have an understanding of what exactly it entails, how it is best applied and, perhaps most importantly, where to start. To add to the challenge, computational access raises professional and ethical concerns. These well-founded but sometimes partially formed concerns, in combination with a lack of practical experience and know-how, mean computational access has been relatively slow to develop within the digital preservation community despite its potential to help with our ambitions to facilitate greater access to digital archives.

The topic was highlighted as a priority by DPC members at the DPC unconference in June 2021. It was clear from discussions that digital preservation practitioners felt this was an area they would like to explore, but one of the key barriers was simply not knowing where to start. This guide has been created to provide an introduction to this topic, and to help the community move forward in applying computational access techniques.

Who is this guide for?

This guide is primarily aimed at digital preservation practitioners with no prior knowledge of computational access. It is a beginner’s guide, intended to provide an overview of key topics as well as tips on getting started and examples of a range of different implementations. It does not hold all the answers, but instead aims to move practitioners towards an understanding of computational access terms and approaches and give them the necessary information and resources to consider whether these techniques could be used to provide access to the digital archives that they hold.

It is not specifically aimed at researchers or users of collections who might want to use computational access techniques to analyze and understand digital collections. Other resources that will help those users are available – see, for example, the Programming Historian and GLAM Workbench. It is important, however, that digital preservation practitioners keep potential users and use cases in mind whilst reading and using this guide.