Helen Dafter is the Archivist for The Postal Museum in the UK.
In common with most other archives, The Postal Museum’s management of digital records has evolved over time. Since 2017 (and earlier for some records) I have tried to capture as much metadata as possible about digital records at the point of acquisition. This includes documenting file formats on entry.
However, the documentation for digital records received prior to this was less detailed. Our earliest digital records date from the mid-1980s and in some cases I was lucky if the existence of a floppy disc was recorded in the entry documentation. It was not unknown for me to become aware of digital media in what appeared to be an analogue deposit only when I opened the box. Another complication with these earlier records is that they are often on obsolete media. Having these on my desk always created lots of curiosity from younger colleagues.
Stage 1 Accessing the content
When I started to consider our legacy records my priority was to get the content off the media and onto the secure server we use for digital storage. This would at least provide me with assurance that the content was in one place (making it easier to monitor) and backed up.
I began tackling these media around 2018 and adopted a mixture of in-house and outsourced approaches. I had access to an external 3.5” floppy disc drive and an external CD drive, and I used these with MagicISO and Exactfile to extract and verify the content from some of the media. In some cases this was not successful, and I didn’t have the hardware to access other media such as the 5.25” floppy discs or optical discs. As these were a small quantity I was able to secure budget to have these extracted externally. This is always a bit of a conundrum – without knowing what is on the media it can be difficult to justify the expense of extracting the data, but with limited entry documentation extracting the data may be the only way to make an informed appraisal of the content.
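For readers who want to see what verifying extracted content can look like in practice, the following is a minimal sketch in Python rather than a description of how MagicISO or Exactfile work. It assumes the source media is still readable as a folder and that the copy sits in a second folder; the function names (`sha256_of`, `build_manifest`, `compare_manifests`) are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of one file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(source: dict[str, str], copy: dict[str, str]) -> list[str]:
    """List files that are missing, changed, or unexpected in the copy."""
    problems = []
    for name, checksum in source.items():
        if name not in copy:
            problems.append(f"missing: {name}")
        elif copy[name] != checksum:
            problems.append(f"changed: {name}")
    for name in sorted(copy.keys() - source.keys()):
        problems.append(f"unexpected: {name}")
    return problems
```

Calling `compare_manifests(build_manifest(media_folder), build_manifest(server_folder))` returns an empty list when every file arrived intact. Saving the manifest alongside the records would also give a baseline for later monitoring.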
Stage 2 Assessing and appraising the content
Initial examination of the records retrieved from removable media was limited and focussed on the material processed externally. This identified several system files, a number of 1KB files, and some rather random content such as an account of a trip to London and an extract from ‘Gulliver’s Travels’. It should be remembered that in the 1980s computers were still somewhat of a novelty and staff may have been teaching themselves about the functionality of different software. Somehow these ephemeral records made their way into the archive. In most cases these records were deleted. A few system or supporting files were retained as they may have potential to interpret or use other content we have in the archive.
This activity was then paused for a variety of reasons including a lack of confidence, the need for input from IT and staff in other teams, and a focus on the proactive acquisition of more recent digital records from the businesses. While it may seem illogical to seek out more digital records before having a full understanding of what was already held, records are easier to manage if acquired promptly from staff who understand their context and can provide any appropriate documentation alongside the records. I didn’t want to risk perpetuating the problems we were facing by delaying collecting activity.
In late 2020, as colleagues returned from furlough and some of the immediate reactive work began to ease or become more embedded, I was able to return my attention to some of the long-standing and somewhat neglected areas of digital preservation. One of my first priorities was legacy stamp artwork. This was prioritised due to ongoing conversations regarding receiving the more recent equivalents of this material and the fact that, at least in theory, this material should be straightforward (in terms of content) and significant (stamp artwork is a core record for us).
I began using DROID to get an overview of the file formats and identify any duplicate records. Once I had a list of file formats I referred to PRONOM and the Library of Congress file format sustainability documentation to enhance my understanding of the formats and potential issues. I also adopted a very belt and braces approach of copying the records to My Documents to test which ones were openable and understandable with the software available to me. In making copies I avoided any risk of altering either the content or metadata of the preservation copies. Local copies were deleted from both My Documents and the recycle bin once I completed each assessment.
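As a rough illustration of the kind of overview described above, the sketch below tallies files by extension and groups files with identical content. It is not how DROID works: DROID identifies formats by matching file signatures against PRONOM, whereas a file extension is only a crude stand-in and can be missing or misleading on older records. The function name `summarise_files` is hypothetical.

```python
import hashlib
from collections import Counter, defaultdict
from pathlib import Path


def summarise_files(root: Path) -> tuple[Counter, list[list[str]]]:
    """Count files per extension and list groups of byte-identical files."""
    extensions: Counter = Counter()
    by_checksum: dict[str, list[str]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Extensions are lower-cased so that .TIF and .tif are counted together.
        extensions[path.suffix.lower() or "(none)"] += 1
        # Reads each file whole, which is fine for small legacy files.
        checksum = hashlib.sha256(path.read_bytes()).hexdigest()
        by_checksum[checksum].append(path.relative_to(root).as_posix())
    duplicates = [names for names in by_checksum.values() if len(names) > 1]
    return extensions, duplicates
```

Because the function only reads files, it can be pointed at a working copy or at the preservation copies without changing their content. Any duplicate groups it reports are candidates for appraisal, not automatic deletion, since identical files may sit in different contexts for a reason.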
Once I had an insight into what we hold and how accessible (or otherwise) it was, I could start to think about any interventions that might be required. This will form the focus of part two of my blog.