Edith Halvarsson

Last updated on 4 November 2020

Edith Halvarsson is a Digital Preservation Officer at the Bodleian Libraries


It seems like it was only a couple of months ago that Bodleian Libraries’ last yearly roundup was posted on the DPC blog. While it has at times felt like time has stood still this year, looking back on the huge amount of work which colleagues have completed over the past months is a tell-tale sign that we are nearing the end of 2020.

So, what is new at the Libraries? After much research, requirements gathering and testing in previous years, we are now seeing many of our proof-of-concept and pilot projects moving into their final delivery phases. It has also been a year when digital services, research, and teaching have been discussed and highlighted across the University of Oxford. In response, the Libraries has ramped up its web archiving activities to capture how the University is adapting to remote working during the COVID-19 pandemic. Below is a summary of our year so far.

Microservices

Firstly, DRUMROLLS… We are pleased to announce that (in time for the World Digital Preservation Day celebrations) the Libraries has put its first digital preservation microservice into production! This is one of six microservices which will monitor collections held in the Libraries’ digital repositories. The first microservice to go into production is File Integrity (including file fixity). It will be followed by services for tape backup analysis, characterisation, and validation of the digital content.

Also making its world debut is the microservices logo. Each of the dots represents one of the four repositories which are now being monitored by the microservices.

So, how does it work? The microservices framework is built using the University of Oxford’s Elastic Stack (SAVANT), Zabbix, and Grafana. The framework was developed with modularity in mind, which means that new digital preservation tools (such as DROID, JHOVE etc.) can easily be added or removed on request. The microservices have been developed to operate independently from the repository application layer: they access files directly on the filesystem. This means that new repositories can also quickly be added as new tenants of the service.
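To illustrate what a file fixity check that reads files directly from the filesystem can look like, here is a minimal Python sketch. It is not the Libraries’ implementation: the function names (`sha256_of`, `build_manifest`, `check_fixity`), the choice of SHA-256, and the in-memory manifest are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def check_fixity(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files on disk now against a previously stored manifest."""
    current = build_manifest(root)
    return {
        # Recorded earlier but no longer on disk.
        "missing": sorted(set(manifest) - set(current)),
        # On disk now but never recorded.
        "unexpected": sorted(set(current) - set(manifest)),
        # Present in both, but the checksum differs.
        "changed": sorted(
            name
            for name in manifest.keys() & current.keys()
            if manifest[name] != current[name]
        ),
    }
```

In this sketch a repository would be registered by calling `build_manifest` once on its storage directory (a hypothetical path) and keeping the result; later runs of `check_fixity` then report anything missing, unexpected, or changed. Because only a directory path is needed, no repository software has to be involved in the check.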

The microservices generate reports which are visualised with Grafana. Each repository owner can customize their reports to display the type of information most relevant to them, in terms of the digital preservation risk profile of their content. The Libraries’ repository owners have worked this year to fine-tune their displays.

 

Born-digital archives ingest

While the microservices provide ongoing monitoring of digital collections, the Libraries has recognised that there is still a tool gap for initially preparing digital archives for ingest into a repository. This includes the ability to unpack a range of disk image formats and extract metadata which can assist in the cataloguing and preservation of archives.

In 2019 the Libraries ran a proof-of-concept (PoC) looking at Archivematica as a potential tool for providing these functions. Funding was subsequently allocated to address some of the requirements gaps found in the PoC. Due to the scale of its archive collections, the Libraries is undertaking in-house development work to enhance the software’s throughput performance before it can be put into production. This work is scheduled to be completed in 2021.

A summary from Archives Trainee Marjolein Platjee about the outcomes of the original PoC can be read here.

 

Web archiving

The Libraries’ Modern Archives have a well-established web archiving programme which has the mission of documenting the web presence of the University of Oxford in the areas of Arts and Humanities, the Social Sciences, and Science, Medicine and Technology.

During COVID-19 the Libraries observed that the University’s web presence changed as departmental websites were repurposed for disseminating rapid updates. This has particularly been the case for sites relating to the Medical Sciences. In response, the Libraries ramped up its web archiving activities to capture the daily changes to these resources. A deep dive into how this work was undertaken, written by archivist Kelly Burchmore, can be found on Bodleian’s Archives and Manuscripts blog.

 

Looking beyond Oxford, at a national level, staff have contributed to the UK Web Archive COVID-19 collection. This collection was developed in partnership with the UK Legal Deposit Libraries and seeks to capture the impact of the pandemic in each nation of the UK. The collection can be browsed here and is regularly added to.

At the start of 2020, the Libraries also wrapped up a large-scale web archiving initiative to preserve content hosted on websites built on legacy technologies (some dating back as far as the mid-1990s). This involved creating new web captures of all sites hosted on the Libraries’ IT infrastructure and, where possible, redelivering content via a central repository. Users trawling through old exhibition sites may now instead be redirected to the Oxford University Research Archive or Digital Bodleian.

http://www.paradigm.ac.uk was one of the sites which had its research outputs migrated to the Oxford University Research Archive.

 

OCFL (Oxford Common File Layout)

Another development we are excited about this year is the completion of version 1.0 of the Oxford Common File Layout (OCFL). Version 1.0 of OCFL represents the culmination of over two years’ work by the international OCFL Editorial team. (Disclaimer: The OCFL initiative originated in September 2017 from informal discussions at a Fedora/Samvera camp in Oxford. Like Dublin Core, the standard takes its name from the city it was conceived in and is not an Oxford University led initiative. However, staff in the Libraries are active contributors to the OCFL Editorial group.)

What is OCFL and why are we excited about it? OCFL is an approach to storing digital information in a way that is application independent. OCFL standardises how files and metadata are stored on disk and provides an efficient method of versioning them. OCFL will enable organisations to build repository services on top of their OCFL-compliant directories. This is of great value, since storing information in this way mitigates a key risk of repository migrations: losing data when it is exported from one repository and imported into another. Other systems and services (such as the Libraries’ microservices) can also make use of content stored according to the OCFL standard, independent of any repository software.
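To make the versioning idea more concrete, here is a small Python sketch of how an OCFL-style inventory can record versions without storing unchanged files twice. It is a simplified illustration rather than a conformant implementation: the field names mirror those of the OCFL 1.0 inventory, but the helpers `new_inventory` and `add_version` are hypothetical, nothing is written to disk, and none of the specification’s validation rules are applied.

```python
import hashlib
import json


def new_inventory(object_id: str) -> dict:
    """Start an empty inventory (simplified: no versions exist yet)."""
    return {
        "id": object_id,
        "type": "https://ocfl.io/1.0/spec/#inventory",
        "digestAlgorithm": "sha512",
        "head": None,
        "manifest": {},   # digest -> content paths actually stored on disk
        "versions": {},   # version name -> state of the object at that version
    }


def add_version(inventory: dict, files: dict[str, bytes], created: str) -> dict:
    """Record a new version; only content not seen before gets a new path."""
    version = f"v{len(inventory['versions']) + 1}"
    state: dict[str, list[str]] = {}
    for logical_path, data in sorted(files.items()):
        digest = hashlib.sha512(data).hexdigest()
        state.setdefault(digest, []).append(logical_path)
        if digest not in inventory["manifest"]:
            # New content: it would be stored under this version's directory.
            inventory["manifest"][digest] = [f"{version}/content/{logical_path}"]
    inventory["versions"][version] = {"created": created, "state": state}
    inventory["head"] = version
    return inventory


if __name__ == "__main__":
    inv = new_inventory("example:object-1")  # hypothetical identifier
    add_version(inv, {"a.txt": b"alpha", "b.txt": b"beta"}, "2020-01-01T00:00:00Z")
    add_version(inv, {"a.txt": b"alpha", "b.txt": b"beta, revised"}, "2020-06-01T00:00:00Z")
    print(json.dumps(inv, indent=2))
```

In the second version only `b.txt` has changed, so the manifest gains a single new entry while `a.txt` continues to point at the copy stored under `v1`. Because the inventory is plain JSON sitting alongside the files, any tool that can read the filesystem can work out what an object contains.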

P.S. OCFL made it onto the finalists’ shortlist for the DPC Digital Preservation Awards for Research and Innovation in 2020. Read about it here.

 

Find out more?

To find out more about the Libraries’ digital preservation work follow Bodleian Library Digital Systems and Services on Twitter (@BDLSS) #DPOxford. 

