The Bodleian Libraries have partnered with Cambridge University Library on a two-year collaborative project: Digital Preservation at Oxford and Cambridge (DPOC). Funded by the Polonsky Foundation, the project is researching and developing requirements for each library’s digital preservation programme. Each library has appointed three Polonsky Fellows, specialising in the following areas: Policy and Planning, Outreach and Training, and Technology.
The idea behind the split is simple: a sustainable digital preservation programme does not rely on technology alone. It requires policies, strategies, funding and staff with the necessary skills. The roles were based on the Three-Legged Stool model, in which a successful programme depends on a balanced approach. Technology is one of the legs, because digital preservation by its very nature relies on technology. We need technology and tools that are flexible, fit for purpose and adaptable over time. Organisational policies, strategies and workflows are necessary to provide the programme structure and a foundation to build upon. The Resources leg is about procuring sustainable funding, but also about having staff with the right skills to use the technology and to carry out the necessary work.
Three-Legged Stool, DP Workshop
Somehow, as if it happened overnight, the DPOC project is now in its ninth month. It seems like just yesterday that we were arguing about project names and trying not to get lost between buildings. In reality, we’ve been doing a lot of surveying and research; here’s some of what we’ve been up to.
All about the skills
As the Outreach and Training Fellow, my job is not just to find out what people need to learn, but to make sure there’s a programme in place that facilitates it. I began by reviewing skill frameworks like DigCurV to determine what questions I needed to ask. It wasn’t enough to have a list of skills that digital preservation practitioners and managers should have; I needed ways to quantify them.
After a lot of research and development, I developed two types of training needs survey to help assess current skill levels and drive the direction of the training programme. The surveys have gone out and the results are mostly in. I am now using the results to make recommendations for a training programme, which will begin soon.
Overall, there is a strong need for building confidence and understanding of the theories and practicalities of digital preservation. More importantly, training needs to be relevant to the digital collections staff are working with, which requires a high degree of course specialisation and additional learning on my part. To get the most out of training, I have been developing activities and strategies that actively involve the participants in the learning process, rather than just speaking at them.
Training activity for ‘Personal Digital Archiving’ - how to organise and label your digital files
Surveying our digitized collections
Part of being able to devise a strong preservation plan for our digitized collections is knowing what we have and where we have it. After over 20 years of digitization projects, the Bodleian Libraries hold plenty of interesting content created back in the early days of heritage digitization. Much international standardisation and technological advancement have happened since, but we are still invested in these older but very popular resources. Edith Halvarsson, the Policy and Planning Fellow, has been tasked with finding these weird and wonderful legacy digitized collections, which exist on both project websites and offline storage. She’s also been mapping workflows, checking policies and researching best practice.
During the survey process, we made some discoveries. Tucked away in bottom drawers, in boxes and in fire safes were hard drives, tapes and DVDs holding the remains of past digitization projects, many of them outsourced. Our largest find is below; it had been stored carefully in the bottom drawer of a filing cabinet for many years (don’t fret, the images are already online at LUNA and the master TIFFs archived on tape).
What we found in the bottom drawer
Alongside surveying the Bodleian Libraries’ digitized legacy collections, we’re looking at how metadata and images can be retrieved and re-delivered through the Bodleian Libraries’ new, more robust workflows (developed for the Polonsky Foundation Digitization Project). Edith found that learning about the current preservation risks to legacy collections helped her focus on areas where the Bodleian Libraries can pre-empt the same issues through strong data management plans and policies for newer workflows. After months of hard work, a draft of the Digitized Image Collection Survey is finally on the desks of the DPC for feedback and recommendations.
To image the hard drives and other materials we’ve found around the digital library, our Technology Fellow, James Mooney, has set up a BitCurator imaging station. With our trusty new USB write blocker in place, we have been acquiring the hard drive images, virus scanning them, verifying the content and calculating checksums. The images are then stored on a newly provisioned digital preservation storage server, which is connected to a large network storage array, where they are re-verified and also written to tape.
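A minimal sketch of the checksum-and-re-verify idea, in Python. This is an illustration only, not the imaging station's actual tooling: the choice of SHA-256 and the function names sha256_of and verify_copy are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large disk images do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(acquired: Path, stored: Path) -> bool:
    """True when the stored copy has the same checksum as the acquired image.
    Both paths are hypothetical; any two files can be compared."""
    return sha256_of(acquired) == sha256_of(stored)
```

Recording the checksum at acquisition time and recomputing it after each copy (to the storage server, then to tape) is what makes a later mismatch detectable.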
Edith and James have been preparing the ground for us to validate all the master TIFF files generated by the Polonsky Foundation Digitization Project. We are planning to validate the files with both JHOVE and DPF Manager, and with over 500,000 images this should make for an interesting case study.
James has also been working on automating a series of DROID file format identification scans of the Oxford Research Archive (ORA), Digital.Bodleian and our large-scale data repositories. We hope that, with regular scans, we will be able to report more accurately on the file formats stored in these repositories as the file signatures improve.
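To show how repeated scans could be compared over time, here is a rough Python sketch that tallies format identifiers in two CSV exports and reports how many fewer files are unidentified in the later one. It rests on assumptions: that each scan is exported as CSV with a column named PUID, that a blank value means the file was not identified, and that the export contains file rows only (folder rows would need filtering out first). The function names are hypothetical.

```python
import csv
import io
from collections import Counter


def count_formats(csv_text: str, column: str = "PUID") -> Counter:
    """Count rows per format identifier in a CSV export.
    Blank or missing values are counted under 'unidentified'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter((row.get(column) or "unidentified") for row in reader)


def newly_identified(earlier: Counter, later: Counter) -> int:
    """How many fewer files are unidentified in the later scan."""
    return earlier["unidentified"] - later["unidentified"]
```

Keeping the tallies from each run makes it possible to see whether updated signatures are actually reducing the number of unidentified files in a repository.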
Reviewing our legacy collections, auditing our current infrastructure, working with service owners to add digital preservation actions into existing workflows, and working on plans to migrate content and metadata from existing platforms into Digital.Bodleian are all on the go, so an interesting summer lies ahead!
For more information, please visit: www.dpoc.ac.uk