Helen Hockx-Yu is Program Manager for Digital Asset Strategy and Don Brower is the Digital Library Infrastructure Lead at the University of Notre Dame in the USA.
Like many academic and research libraries, the University of Notre Dame's Hesburgh Library collection has evolved over the last twenty years or so from analogue to increasingly digital, and from a physically owned and locally stored collection to a broad range of both local and external resources, organised around users’ needs.
The complexity and varied nature of our collection mean there is no one-size-fits-all digital preservation approach. Different strategies are required, depending on the significant properties of the collection items.
To gain a better understanding of the Library’s overall digital content, and to help plan for digital preservation, we recently co-led a project at Hesburgh Libraries, University of Notre Dame, where we developed a Typology of Digital Collection as a framework to guide digital preservation.
Inventory is a common method that libraries use to assess, examine and track the condition of their collections. This would be a valid starting point for digital preservation, too, but over a certain size inventories become hard to work with. Moreover, we have a few special collections where just performing an inventory would be a significant undertaking. We therefore deferred inventory to the next stage and decided to start with something more general, which can provide us with a high-level overview and help define the scope of digital preservation.
A typology is a general classification of collection items. It provides a structure for understanding the items by highlighting the properties either shared or not shared between them. This allows one to see patterns underlying the individual items. A typology can be contrasted with an inventory, which can become unwieldy and distract us from seeing the big picture.
We created the typology by interviewing subject selectors, digitization staff, and others who work with digital materials. We also surveyed our holdings, and then compiled the information into a big list. We then analysed the list and grouped the items based on the digital preservation treatment that may be required to maintain their ongoing access. These groups became the types. We ended up with three broad categories of types:
- Vended collections are resources that the Library is given permission to use for a limited purpose or timeframe. Patrons generally access vended collections online, on platforms provided by copyright holders or licensors. Examples are electronic journals, books, and databases. Unlike physical purchases, vended content may “disappear” from the Library’s collection once the licence is contractually terminated.
- Library-managed collections are content over which the Library chooses, or is obliged, to exercise stewardship. These are for the most part kept on premises, but the Library may also use external services to host or provide access. There are two sub-types: digital surrogates (or digitised items) and born-digital items. Digital surrogates are electronic captures of physical items, such as digitised manuscripts, newspapers, and audiovisual material. In contrast, born-digital items are not derived from physical items; instances include mobile apps, digital 3D models, panoramic photos, and GIS datasets.
Physical media were included in the project and are a type relevant to digital preservation, because some of these are expected to be digitised and will consequently produce digital surrogates.
- In-house digital creation is content that is produced for various purposes and does not consist of collection items. Instances include marketing and instructional materials, student creations, source code, websites, blogs, access copies, and libguides.
The typology made it clear that not every type, or every instance of a type, needs to be or can be preserved. This helps us determine the scope of the Library’s digital preservation programme, or long-term stewardship, which centres on collections managed by the Library and on digital surrogates produced from physical media for long-term retention. Digital surrogates produced for access, for example images from a book scanned for use on a poster, are outside the scope of long-term preservation.
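For readers who keep their holdings list in a spreadsheet or script, the typology and the scope decision above could be modelled roughly as follows. This is only an illustrative sketch, not the tooling used in the project: the names `CollectionType`, `Item`, `group_by_type` and `in_preservation_scope`, and the sample items, are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class CollectionType(Enum):
    """The broad types described above (labels are illustrative)."""
    VENDED = "vended"
    LIBRARY_MANAGED = "library-managed"
    PHYSICAL_MEDIA = "physical media"
    IN_HOUSE = "in-house digital creation"


@dataclass(frozen=True)
class Item:
    name: str
    type: CollectionType


def group_by_type(items):
    """Group item names by type to get a high-level overview."""
    groups = defaultdict(list)
    for item in items:
        groups[item.type].append(item.name)
    return dict(groups)


def in_preservation_scope(item):
    """Assumed scope rule: only library-managed content is in scope."""
    return item.type is CollectionType.LIBRARY_MANAGED


# Hypothetical sample entries, loosely based on the examples in the text.
items = [
    Item("Electronic journal", CollectionType.VENDED),
    Item("Digitised manuscript", CollectionType.LIBRARY_MANAGED),
    Item("GIS dataset", CollectionType.LIBRARY_MANAGED),
    Item("Poster image scanned from a book", CollectionType.IN_HOUSE),
]

overview = group_by_type(items)
in_scope = [item.name for item in items if in_preservation_scope(item)]
```

Here `overview` gives the high-level picture by type, while `in_scope` lists only the items that would fall under long-term stewardship under the assumed rule. A real scope decision would involve more judgement than a single type check.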
Using the typology as a framework, we next developed high-level digital preservation recommendations for each type.
For Vended collections, there is relatively little that the Library can do to preserve the content proactively. One area where the Library can exercise some influence is the licence negotiation process. We therefore recommend asking explicit questions about continued access, the possibility of archiving, and the content providers’ preservation commitment. We also added a word of caution about local archival copies, which do not in themselves guarantee perpetual access. They become the Library’s responsibility and should be treated as part of Library-managed collections.
Instances of In-house digital creation are generally not intended for long-term preservation. They may have significant short-term value or be required to support the Library’s operations, so they need to be kept safely, as with any other operational data. Some instances within this type may become part of the Library’s collection, for example doctoral dissertations or master's theses. A formal process should be followed to accept material into the Library’s collection, which conceptually moves the content from one type to another. Some instances may be considered University Records. Our recommendation is to raise awareness of the University’s Records Management and Archives Policy and to work with the University Archives to identify and appraise such records and implement the records retention schedule accordingly.
For Library-managed collections, we made specific organisational, technical, and process-related recommendations, including content inventorisation and prioritisation, as well as maintaining an ongoing overview of digital holdings: you simply cannot protect your data if you don’t know your data. A more detailed implementation plan will be developed and carried out to move us forward. The goal is to progress through, and ultimately achieve, all Levels of Digital Preservation recommended by the National Digital Stewardship Alliance (NDSA).[1]
Our work is still in progress. The typology approach has been useful for conceptualising our digital collection, giving us enough detail to understand the broad patterns while keeping us from losing sight of the forest for the trees too early in the process.
[1] Levels of Digital Preservation. https://ndsa.org//activities/levels-of-digital-preservation/.