The storage media prioritization methodology and tool were developed in early 2023 by the Archives & Special Collections (ASC) digital team at the University of Glasgow. The aim was to encapsulate the knowledge and guidance found in digital preservation community resources, such as the 'Bit List' of Digitally Endangered Species, in a methodology and tool that address a recurring issue in digital archiving: how to prioritize archival processing of computer storage media – especially those at higher risk of loss – in a consistent, transparent manner.

The motivation for this work arose from the need to manage our ever-growing collections of born-digital assets. ASC currently holds an estimated 300TB of records that have been deposited on computer storage media, both contemporary and legacy, ranging from hard disk drives and USB sticks to optical media and floppy disks. At such scale and variability, it is practically impossible to process and preserve everything at once. Choosing which storage media to prioritize is more challenging than it might appear: a well-maintained CD-ROM created 15 years ago and stored in optimal conditions might have a far greater chance of survival than a six-month-old Solid State Drive that has been spending its days at the bottom of a backpack (with all the hardship this might entail).

The methodology and tool mark a departure from the previously employed micro-appraisal approach, whereby the selection of storage media assets for processing was done on a case-by-case basis, based on predominantly empirical criteria. They introduce instead an evidence-based approach to decision-making on the prioritization of born-digital records for active archival management and preservation.

This work represents the first effort that draws and builds upon the digital preservation community’s expertise and resources for prioritizing archival processing of computer storage media. In doing so, it exemplifies and emphasizes the extensibility, reusability and continuing value of documented community practice in providing solutions to common digital preservation and archiving problems.

The methodology collates existing knowledge on maintaining and preserving computer storage media and distils it into four criteria for prioritizing archival processing:

  • The average lifespan of the medium, as indicated in academic literature and community resources.

  • The year of production of the medium, as a measure of longevity and obsolescence.

  • The environmental conditions in which the medium has been stored after being deposited.

  • The classification of storage media adopted by the 'Bit List' of Digitally Endangered Species, so as to inform the methodology with community practice and avoid duplicating existing effort.

Each criterion is assigned a score from 1 to 5, with 1 being the best-case scenario and 5 the worst, for each of 24 types of storage media. The list summarizes our current knowledge of the average lifespan and contemporaneity of storage media, and reflects the status identified in the 2023 'Bit List'.

The tool uses this information to calculate a priority score and provide an indicative action for a selected storage medium. Users can choose a weight for each of the prioritization criteria, thus customizing priority scores to their specific needs. Priority scores range from 1 to 5, each with an indicative action:

  1. Low priority - action within 3 years
  2. Low priority - action within 1 year
  3. Medium priority - action within 6 months
  4. High priority - action within 3 months
  5. Extreme priority - immediate action 
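
As an illustration only, the calculation could be sketched in Python as follows. The formula used by the tool is not described here, so this sketch assumes a weighted average of the four criterion scores, rounded to the nearest whole number; the names `CriterionScores` and `priority_score` and the example scores are hypothetical and are not taken from the tool.

```python
from dataclasses import dataclass

# Indicative actions for priority scores 1-5, as listed above.
ACTIONS = {
    1: "Low priority - action within 3 years",
    2: "Low priority - action within 1 year",
    3: "Medium priority - action within 6 months",
    4: "High priority - action within 3 months",
    5: "Extreme priority - immediate action",
}


@dataclass(frozen=True)
class CriterionScores:
    """Scores (1 = best case, 5 = worst case) for the four criteria."""
    lifespan: int            # average lifespan of the medium
    production_year: int     # year of production
    storage_conditions: int  # environmental conditions after deposit
    bit_list_class: int      # 'Bit List' classification


def priority_score(scores, weights=(1.0, 1.0, 1.0, 1.0)):
    """Return a priority score from 1 to 5.

    ASSUMPTION: a weighted average rounded half up. This is a
    stand-in for illustration, not the tool's documented formula.
    """
    values = (
        scores.lifespan,
        scores.production_year,
        scores.storage_conditions,
        scores.bit_list_class,
    )
    if any(not 1 <= v <= 5 for v in values):
        raise ValueError("each criterion score must be between 1 and 5")
    if len(weights) != 4 or any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("need four non-negative weights, not all zero")
    weighted = sum(v * w for v, w in zip(values, weights)) / sum(weights)
    # Round half up, then clamp to the 1-5 scale.
    return min(5, max(1, int(weighted + 0.5)))


# Hypothetical scores for a medium kept in poor storage conditions.
example = CriterionScores(lifespan=2, production_year=2,
                          storage_conditions=5, bit_list_class=2)
equal = priority_score(example)
emphasised = priority_score(example, weights=(1, 1, 3, 1))
print(equal, ACTIONS[equal])
print(emphasised, ACTIONS[emphasised])
```

Under these assumptions, giving more weight to storage conditions moves the same medium up one priority level, which is the kind of customization the criterion weights are meant to allow.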

The methodology and tool have been highly effective in prioritizing the processing of storage media at our institution, due primarily to their simplicity and ease of use. We believe they can prove equally useful to other organisations and support their preservation planning and action. In particular, our work can contribute towards efficient resource allocation and risk mitigation. Prioritizing archival processing using the methodology and tool focuses effort on higher-risk records. Efficient prioritization helps manage costs at scale, so that resources can be allocated where they matter most. In turn, prioritizing high-risk materials minimizes the chance of data loss due to deterioration or technological obsolescence.

On the broader scale of the digital preservation community, our work highlights and establishes the significance of consistent, evidence-based criteria for prioritization. Digital preservation professionals can follow predefined guidelines, ensuring transparency in active archival management. Consistent prioritization practices benefit the entire community by maintaining uniformity. Moreover, prioritization informs long-term planning and advocacy efforts. The community can advocate for resources, funding, and policies based on the importance of preserving specific records.

Although our work is primarily geared towards helping digital archivists prioritize the processing of computer storage media, anyone can use the tool to quickly assess whether a storage medium is at risk of data loss. The methodology encapsulates the current state of the art in the preservation of computer storage media, yet it is open and extensible, so the community can continue to update and maintain it for future use.

The methodology has been published in the proceedings of iPRES 2023: The 19th International Conference on Digital Preservation (https://eprints.gla.ac.uk/295807/). The prioritization tool is licensed under CC BY-NC-SA 4.0 and is freely available to access and download from the Enlighten Research Data repository: https://doi.org/10.5525/gla.researchdata.1634

 

