Emma Hancox is a Digital Archivist at the University of Bristol Special Collections and Theatre Collection.
At the University of Bristol we have recently been working on improving our capabilities for digital preservation. Some of the activities we have been carrying out include developing a digital preservation policy, assessing ourselves against the DPC’s rapid assessment model and implementing Preservica as our digital preservation system.
Before the Covid-19 pandemic our efforts had mainly been focussed on dealing with legacy digital material in our archive collections and digital images created through past digitisation efforts. As it did for organisations across the country, the pandemic brought major changes to the teaching, management, and operation of the University of Bristol. In response, we decided that it was important to collect records documenting these changes in our Special Collections archive so that future researchers could understand how the University responded. We also wanted our Coronavirus records to complement those being collected by other repositories nationally and internationally. We see our project, and the work being done by these other archives to collect Covid-19 related material, as encapsulating the World Digital Preservation Day 2020 theme of Digits: For Good.
The team worked together to draw up a list of records to focus on. We found that these records were being created almost at the same time as we wanted to collect them, and we were in the relatively new position of most of the records being digital. The most successful approach was to contact the record creators, who could give us access to the files; this had the bonus of building awareness of our project at the same time. The University’s primary method of communicating with its community, most of whom were working from home, was via weekly livestream updates on SharePoint (also available to watch back after the live event). We were unable to collect these videos directly from SharePoint, but liaising with the creator of these files meant we were able to obtain the video files plus their accompanying transcripts. We also worked on collecting minutes of committees established specifically as part of the University’s Coronavirus response.
One of the added benefits of the Covid-19 collecting project was that we were able to begin learning how to use the web archiving functionality within the Preservica system to collect WARC files of important Covid-19-related University of Bristol web pages at risk of disappearing. We concentrated on pages giving instructions and information to staff and students about Coronavirus, and on relevant University news stories. We were unable to collect intranet pages that required students or staff to log in to read them.
Although we were able to crawl most pages successfully, we learnt that expandable menus on some of the University web pages are driven by JavaScript and could not be crawled by the Heritrix crawler within Preservica. To capture sites with JavaScript-driven content we would have needed to explore using the WebRecorder tool, but as we were carrying this work out on top of other duties, we had to mark this as something to look at in the future. It also made us aware that, if we decide to carry out a more comprehensive web archiving programme, it would be useful to be able to influence the way pages are designed so that they can be preserved in the future.
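A minimal sketch of how pages like these might be flagged before a crawl, assuming Python and its standard library only. The class name, the function name and the choice of `aria-expanded` and `onclick` as signals are illustrative assumptions; they are not part of Preservica or Heritrix, and the heuristic will miss menus that are built in other ways.

```python
from html.parser import HTMLParser


class ScriptDependenceParser(HTMLParser):
    """Collects rough signals that a page relies on JavaScript (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.script_count = 0  # number of <script> elements seen
        self.toggle_count = 0  # elements that look like expand/collapse controls

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "script":
            self.script_count += 1
        # Assumption: aria-expanded and onclick are common (not universal)
        # markers of menus opened by script rather than by plain links.
        if "aria-expanded" in attr_map or "onclick" in attr_map:
            self.toggle_count += 1


def needs_manual_review(html: str) -> bool:
    """Return True when a page has scripts and at least one script-style toggle."""
    parser = ScriptDependenceParser()
    parser.feed(html)
    return parser.script_count > 0 and parser.toggle_count > 0
```

Pages for which `needs_manual_review` returns `True` could be set aside for capture with a browser-based tool, while the rest go through the standard crawl.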
On the Executive team blog, the University published a series of short video conversations in which its research leaders spoke about their response to Covid-19. We found that the Heritrix web crawler was unable to harvest these from the blog. Instead, we liaised with the Marketing and Communications team to collect the individual video files.
Further work on the project will be needed to ingest the material into Preservica, catalogue it and make it accessible, but we have the beginnings of a resource that we hope will be invaluable to future researchers. Revisiting the theme of Digits: For Good, the obvious good that came out of our project was the creation of a Covid-19 archive for the University of Bristol that will add to the jigsaw of material collected by other archives. There are many other positives, though, including raising awareness of digital preservation in different areas of the University through conversations with creators of digital records, developing our skills in web archiving, and pulling together as a team to form a digitally rich archive.