Blog
Unless otherwise stated, content is shared under CC-BY-NC Licence
The first International Digital Preservation Day (#IDPD17) is finally here!
Glasgow, 30th November 2017
Dear colleagues and friends around the world,
Welcome one, welcome all! The first International Digital Preservation Day (#IDPD17) is finally here!
This day is for everyone who works in digital preservation. It’s about their work. It’s about the opportunities created by the digital materials they safeguard and make accessible. It’s about the hard work and ingenuity, often unrecognised, that makes a secure digital legacy possible. And it’s about fostering links across this growing but highly dispersed community. Supported by digital preservation networks around the world – old friends and new – IDPD17 is open to participation from anyone and everyone interested in securing our digital legacy.
Selecting a Digital Preservation System for the University of St Andrews
"We build our computer (systems) the way we build our cities: over time, without a plan, on top of ruins." - Ellen Ullman
John Geddy’s map of St Andrews, c.1580. National Library of Scotland (MS.20996) http://maps.nls.uk/view/00001427
Thoughts on Fixity Checking in Digital Preservation Systems
Neil Jefferies is Head of Innovation at Bodleian Libraries, University of Oxford
I would like to query the rationale for actually doing periodic fixity checking in isolation. This has bugged me for a bit so I am going to unload.
As far as I can see, the main reasons would be undetected corruption on storage and tampering that doesn’t hijack the chain of custody.
All storage media now have built-in error detection and correction using Reed-Solomon, Hamming or something similar, which is generally capable of dealing with small multi-bit errors. In modern environments, this gives unrecoverable read error rates of at worst around 1 in 10^14 bits, which is around one error in 12TB of total reads, and generally several orders of magnitude better. Write errors are less frequent: they do occur, but can be detected by device firmware and retried elsewhere on the medium. These are absolute worst-case figures and result in *detectable* failure long before we even get to computing fixity. The chance of bit flips occurring in such a pattern as to defeat error correction coding is several orders of magnitude less; it is similar to the chance of bit flips resulting in an unchanged MD5 hash. Interestingly, in most cases the mere act of reading data allows devices to detect and correct future errors as the storage medium becomes marginal, so there is value in doing that.
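As a rough check on the arithmetic above, here is a minimal sketch that converts an error rate into a volume of data read. It assumes the 1 in 10^14 figure is counted per bit read and that a terabyte is the decimal 10^12 bytes used by drive vendors; the function names are illustrative only.

```python
# Assumption: "1 in 10^14" is an unrecoverable read error rate per bit read.
BITS_PER_BYTE = 8
BYTES_PER_TB = 10**12  # decimal terabyte (assumption), as used by drive vendors


def tb_read_per_error(bits_per_error: int) -> float:
    """Average volume of data read, in TB, per unrecoverable read error."""
    return bits_per_error / BITS_PER_BYTE / BYTES_PER_TB


def expected_errors(tb_read: float, bits_per_error: int) -> float:
    """Expected number of unrecoverable read errors when reading tb_read TB."""
    return tb_read * BYTES_PER_TB * BITS_PER_BYTE / bits_per_error


worst_case_tb = tb_read_per_error(10**14)
print(f"Worst case: one unrecoverable read error per {worst_case_tb} TB read")
```

Under these assumptions the worst case works out at 12.5TB per error, in line with the "around 12TB" quoted above. Each further order of magnitude in the error rate multiplies that volume by ten.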
Preservation storage workshop at iPres 2017
Read Jaye Weatherburn's account of sessions at iPRES 2017. Jaye, who works at the University of Melbourne, attended iPRES with support from the DPC's Leadership Programme which is generously funded by our Commercial Supporters.
At iPres 2017 (September, Kyoto, Japan) I attended the Digital Preservation Storage Workshop: Exploring Preservation Storage Criteria and Distributed Digital Preservation.
I was particularly keen to attend this workshop, as in my role at the University of Melbourne I am actively working with the research support community to develop better understanding of what digital preservation storage means and its requirements, as part of our Digital Preservation Project. We have used the most recent version (version 2) of the preservation storage criteria to run a workshop with our archivists, records managers, and IT staff, and have found this list of 58 criteria extremely useful both for increasing knowledge and understanding about preservation storage needs, and for generating discussion about what is required for preservation storage for different digital collections at the university.
The preservation storage criteria were originally developed by Kate Zwaard, Gail Truman, Sibyl Schaefer, Jane Mandelbaum, Nancy McGovern, Steve Knight, and Andrea Goethals, and have been further developed through workshops and presentations at various conferences and meetings during 2016-2017 (iPres conferences, PASIG meetings, the Library of Congress Designing Storage Architectures for Digital Collections 2016 meeting).
Cloudy Culture: Preserving digital culture in the cloud
Part 4: Costs and tools
The National Library of Scotland, Edinburgh Parallel Computing Centre, National Galleries of Scotland and the Digital Preservation Coalition are working together on a project called Cloudy Culture to explore the potential of cloud services to help preserve digital culture. This is one of a number of pilots under the larger EUDAT project, funded through Horizon2020.
We’ve already published an introduction to Cloudy Culture and reports on uploading and file fixity and downloading. This final report covers the use of preservation tools in the cloud using MediaInfo as an example, and the costs of using the cloud. The costs are based on a cloud service hosted by the Edinburgh Parallel Computing Centre (EPCC) using iRODS data management software (https://irods.org). The research questions we ask are:
- Can we use arbitrary preservation tools e.g. MediaInfo, in the cloud, even when the cloud uses one operating system and the tools run on another?
- How quickly did the tool run?
- How do EPCC cloud service costs compare to local storage and Amazon Cloud costs?
DPC 2.0: Growth and Change
It’s been a real pleasure in the last few years – even in the last few weeks – to witness the way the DPC membership has grown. Some statistics underline that this is not just anecdotal but genuinely represents a step change in our organization. Let’s compare the DPC over the last 10 years:
- In 2007 the DPC offered 3 public and member events; in 2017 that was 45
- In 2007 the DPC consisted of 2 staff; it now employs 5.8 FTE and is looking to expand that in the coming years
- In 2007 the DPC had fewer than 28 members; in 2017 it is over 70
- In 2007 the DPC had a turnover of 178K; in 2017 that was 412K
- In 2007 the DPC was active in 3 countries; in 2017 that was 3 continents
Data management at iPRES 2017
Read Chris Fryer's account of sessions at iPRES 2017. Chris, who works at the Parliamentary Archives, attended iPRES with support from the DPC's Leadership Programme which is generously funded by our Commercial Supporters.
First things first, I'm extremely fortunate and grateful to be in a position to attend events such as iPRES. My heartfelt thanks go out to the DPC for the Scholarship and my current employers the Parliamentary Archives.
"Data management": a wonderfully flexible phrase which covers a multitude of sins. Thankfully, this particular session was chock full of work which will be of interest to anyone involved in the weird and wonderful world of digital preservation. The session kicked off with the spotlight on access. The Library of Congress showcased their brand new Labs initiative, which aims to act as a place to encourage innovation with Library of Congress digital collections.
Reflections on PASIG 2017, Oxford
Read Matt Zawadski's account of sessions on Emulation and Software Preservation at PASIG 2017. Matt, who works at the University of Sheffield, attended PASIG with support from the DPC's Leadership Programme which is generously funded by our Commercial Supporters.
The PASIG conference arrived at the Museum of Natural History in Oxford with delegates from over 30 different countries attending. The first day consisted of a bootcamp, designed to get delegates ‘on the same page’ regarding all things digital preservation (DP) whilst days two and three of the conference consisted of presentations on various topics covering many aspects of digital preservation.
The venue for PASIG 2017 is home to both the Museum of Natural History and the Pitt Rivers museum, and both were open for delegates to avail themselves of during quieter periods when conference presentations regarding digital preservation were not foremost in their thoughts. The museums and Oxford served as a mirror to some broader themes that struck me as I attended the conference as a first timer, due to a generous scholarship from the DPC.
Here I am - advocating again……
Kirsty Lingstadt is Head of Digital Library and Deputy Director of Library and University Collections at the University of Edinburgh
Well here I am writing my very first blog post. My geeky other half has been blogging since LiveJournal was a thing. (These days he has a regular column at Black Gate Magazine and often guest posts for SF author Charles Stross.) He’s always telling me I should write up some of my increasingly technical “war stories”, and I’m always responding by saying no – it’s not that interesting. But the folks at the DPC swear they’re interested, so here I am…
I didn’t start out as a technical person, but it turns out I’ve been doing digital preservation for the last 13 years – much longer than I realised! Mostly I’m self-taught, with some boosts from various “introduction to digital preservation” courses. I still fondly remember attending a Digital Curation 101 Course in Oxford in 2005 (with Seamus Ross in attendance), which gave me my first formal insight into some of the challenges I was already facing and to some extent continue to face. My organisation at the time sent me because during my job interview I’d talked extensively about the need to preserve not just the physical collections but also the born digital ones. Hence I was there to learn more.
Acquisition and Appraisal at iPres2017
Read Jaye Weatherburn's account of sessions on Acquisition & Appraisal at iPRES 2017. Jaye attended iPRES 2017 with support from the DPC's Leadership Programme which is generously funded by our Commercial Supporters.
Thanks to a DPC Leadership Program scholarship (made possible by Commercial Supporters Arkivum, Preservica and Mirror Web) I attended iPres 2017 in Kyoto. This blog post focusses on the three presentations in the “Acquisition & Appraisal” session (Wednesday 27th September, 1410-1510). Each of the three presentations in this group focused on very different areas, from legacy media to augmented reality games, and also featured email analysis software. Each presentation had at its core the desire to provide answers to some digital preservation challenges, while also generating avenues and ideas for future research. I’ve gone with a descriptive style for my report on these sessions, with a wee bit of personal reflection.
The three presentations were:
- A Case Study on Retrieval of Data from 8 inch Disks ‒ Of the Importance of Hardware Repositories for Digital Preservation
- ePADD: Computational analysis software enabling screening, browsing, and access for email collections
- Challenges in Preserving Augmented Reality Games: A Case Study of Ingress and Pokémon GO