This blogpost has been written by St George’s, University of London (SGUL) Records Manager Kirsten Hylan, Research Data Support Manager Sarah Stewart, and Archivist Juulia Ahvensalmi.
‘Digital Preservation: A Concerted Effort’ is the theme of this year’s World Digital Preservation Day. It celebrates how working with colleagues at our university and beyond allows us to both share and gain knowledge, which in turn supports our efforts to preserve our digital records.
By engaging in dialogue with our colleagues, both at St George’s and at other institutions, we regularly see in practice how the digital preservation community generously shares its experience and knowledge. As a specialist health care university, the information we produce today as part of our research and education endeavours needs to remain accessible to have the greatest possible impact and play a positive role in society. World Digital Preservation Day is another opportunity to consider our challenges while learning how other practitioners have overcome theirs. This year’s blogpost will consider a dataset stored on various storage media and the efforts underway to ensure its ongoing viability.
Research Datasets
St George’s, University of London (SGUL), is a specialist health and medical sciences university in South-West London. The Archivist, Research Data Support Manager, and Records Manager work together to advocate for digital preservation, win funds for a digital preservation system, and identify areas that hold records requiring a long-term storage solution. As a medical school, we have created many unique datasets that contribute to scientific knowledge and the teaching of medicine.
We have a responsibility to care for research data to ensure it can be used and reused. The reality, though, is that the bulk of research data is not published, and we are still building a culture where data is transferred to information managers for preservation. As a result, datasets are vulnerable to obsolescence, and researchers risk losing the ability to access and develop them.
Storage media within the Addicts' Index include digital files, floppy discs, microfilm, external hard discs, and paper documents.
The importance of publishing, sharing and archiving data was underscored during the COVID-19 pandemic, and many research funders such as the MRC, Wellcome Trust and Gates Foundation have clear policies on data sharing and data availability. Legacy research datasets are also important for understanding health outcomes, yet these research datasets are often difficult to access and re-use. The so-called Addicts’ Index is one such example. Legacy records offer researchers context for how research has evolved over time, and how and why decisions were made historically. Without access we are unable to understand and build on decisions that impact society.
Originally a Home Office project to collect data on individuals seeking treatment for drug dependence, the Addicts’ Index was created in 1968 and ran until 1997. The records are a unique resource for addiction specialists, medical historians and sociologists studying, for instance, patterns of drug use. They provide a record of changing attitudes and approaches to treatment, and of practices relating to service provision.
The formats of the resource reflect the time of their creation, and the dataset has undergone various changes. From a now lost cardex system to paper files to various computer databases, the resource now exists on microfilm and as PDF files on external hard drives, neither of which is currently easily accessible. Whilst the dataset presents several challenges, both in terms of digital preservation and in terms of the sensitive data it contains, it is a valuable resource.
A 5.25-inch floppy disk with data relating to the Addicts' Index.
The challenges we face with the Addicts’ Index are mirrored across the research landscape, where it is estimated that 80% of datasets over 20 years old are not available. Our digital records are growing fast, with most records produced in a digital form that is at risk of becoming obsolete, lost, corrupted, or unreadable if not effectively managed and preserved.
The Addicts’ Index is one dataset. However, the challenges we face in making it a viable resource again and ensuring its ongoing viability are mirrored across all areas of the university. Between us, the digital preservation team are responsible for managing research data, archives, and records such as pensions and contracts that need to be maintained for significant periods of time. The decisions we make now will determine whether our records can be accessed in the future.
Software Sustainability
The Addicts’ Index is a good example of the need to preserve software as well as data. Many of the storage media on which the Addicts’ Index, or parts of it, are stored are now obsolete and require special systems and software to read. The Index therefore underscores the importance of preserving not only digital data but also the software used to read and analyse that data.
Digital media carriers, many of them now obsolete.
More broadly, researchers often write specific software programs or code which can provide context for their research data, whether this is a simulation modelling a health condition or code for an algorithm that finds patterns within a larger dataset. As research becomes increasingly digital and data-oriented, it is important to ensure that software and code are preserved as part of good research information management.
Digital Preservation: A Concerted Effort
Managing digital records and data is indeed a collaborative effort. Key to achieving our goals at St George’s is the ongoing support of our Directors and staff for our efforts to raise awareness and take steps to ensure the ongoing viability of our records. Looking outside our own institution, we can see an information management community that has recognised the scale of the challenge ahead of us and the need for collaboration and knowledge sharing to put mitigating actions in place. Our shared community of practice is built on knowledge sharing and uses its networks to discuss challenges and to share and evaluate best practices. It is this concerted effort that will see us through as we work to preserve our records and ensure that they can be shared with future generations.
You can engage with the day and find out more about our work on Twitter and Instagram using the hashtags #WDPD2023 and #SGULWDPD2023. If you are interested in learning more about digital preservation at St George’s, or would like to get involved, please contact digpres@sgul.ac.uk.