Caylin Smith is the Legal Deposit Libraries Senior Project Manager for the British Library.
Earlier this month, the British Library hosted a workshop for the Legal Deposit Libraries’ Emerging Formats project. The purpose of the workshop was to engage stakeholders and look at some new approaches to address challenges the project has identified.
In this post, Caylin Smith, project manager for the Legal Deposit Libraries and a researcher on the Emerging Formats project, outlines the importance of Non-Print Legal Deposit in the UK, provides background information on the Emerging Formats project and in-scope content, and introduces the workshop.
Sara Day Thomson of the DPC then describes her experience of the workshop and how the Emerging Formats project reflects broader trends in digital preservation.
Non-Print Legal Deposit, Emerging Formats, and the challenges of supporting complex publications by Caylin
What is Non-Print Legal Deposit and why is it important?
As you might already know, there are six legal deposit libraries (LDLs) in the UK, and together they are responsible for a shared collection of print and digital publications.
With regard to digital publications, these libraries have been collecting eBooks and eJournals published in the UK as well as the UK web domain since 2013; more recently, the libraries have started collecting sheet music and geospatial datasets.
The print and digital collections are an invaluable resource for current and future generations of researchers. Digital items are preserved within the British Library’s digital repository, and access is provided at designated terminals at each of the LDLs.
The right to collect digital material is outlined in the Legal Deposit Libraries (Non-Print Works) Regulations 2013, a main driver of which was to help ensure comprehensive collecting of digital published works produced in the UK and to prevent a digital black hole from forming.
As you might also already know, technology and the publishing world do not stand still. The tools that publishers use to create their works today might not be the same as those in the future.
As the publishing world evolves, so too must the LDLs. This isn’t just the case for these six libraries, but for libraries worldwide, as publications are created in new and more-complex formats than those already collected.
For the LDLs, these publications are in scope to collect, but more work is needed to understand how to collect, preserve, and make them accessible in a library context.
So what is this content exactly?
Within the Emerging Formats project, the LDLs are focused on eBook mobile apps; narrative web content that’s more complex and dynamic than websites currently collected; and, broadly speaking, structured data.
eBook apps are essentially mobile apps, but ones that share characteristics of a digital book. These apps are created for Apple’s iOS and/or Android platforms, amongst others, and are downloaded directly to a smartphone or tablet. They come in a vast array of non-fiction and fiction genres, including academic texts, cookbooks, and children’s literature, to name just a few.
The web-based content the LDLs are looking to collect is highly interactive and narrative driven. These works share characteristics with some eBook mobile app publications, especially children’s literature, where they can be very device dependent and make use of a device’s functionality, like its camera, touchscreen, or speakers.
Structured data is the third category and the one that’s the hardest to define. Within the project, content created in this format includes published databases, datasets, and data feeds. The data must also include some sort of human input in its creation and not solely be automated, though this is difficult to identify if the creator is unknown. This criterion is not specific to structured data but was introduced to differentiate publications the project wanted to prioritise over data created without any human support.
Understanding what content has been created
The LDLs are looking to collect publications by creators who might not consider themselves to be publishers in the traditional sense of the word—or their creations to be publications.
Publisher—or creator—engagement has been a large activity within the project. Throughout the project, the LDLs have engaged with traditional publishers as well as creators who identify as something else: a storytelling lifestyle brand, for example. They have also invited content creators to discuss their works with library staff and external stakeholders, and made visits to creators’ studios.
The LDLs want to learn as much as they can about the content they’re looking to collect so they can understand how to preserve and make it accessible for current and future generations of users.
Along with creator engagement, engagement with external stakeholders is also an important activity within the project. These additional stakeholders include researchers, members of the digital preservation community, as well as contacts working in related disciplines, such as time-based media conservation.
The LDLs recently hosted a workshop at the British Library with creators, library colleagues, and external stakeholders to help them unpack why these publications are so complex.
Workshop
The day started out with background information on Non-Print Legal Deposit, the Emerging Formats project, and examples of the types of works the LDLs are looking to collect. Michael Day from British Library Digital Preservation presented findings and observations this team has made from assessing sample publications, outlining why these works are more challenging to support. This work is ongoing, but you can read more about observations so far in their iPres paper.
Daniel Merlin Goodbrey, a comic creator and academic, presented a selection of his digital comics and spoke of the challenges of supporting these works to ensure they can be enjoyed by current and future generations of readers. One challenge in particular concerns Adobe Flash, a multimedia software platform that is nearing its end of life.
Daniel made an astute comment about more-complex digital content: if you don’t know how it worked to begin with, how do you know how it should work?
This comment really captures the territory the LDLs are having to navigate. The libraries have scalable workflows to acquire, ingest, preserve, and provide access to existing content types acquired under Non-Print Legal Deposit, but these activities get more challenging with works that have significant hardware and software dependencies.
The LDLs are also finding themselves in territory similar to time-based media conservators, where they’re looking to collect items that are more complex and unique, and therefore require more curatorial and technical knowledge to understand.
The workshop’s afternoon session consisted of a series of activities that encouraged attendees to think more creatively about the challenges presented in the morning session and helped them to understand the perspectives of different stakeholders. This part was led by David Crowe, a project manager at the British Library and a certified Collaboration Architect.
The Emerging Formats workshop from Sara’s perspective
It’s not every day I get to play with LEGO at work. These activities weren’t just some amusing party game – they’re based on the Cynefin Framework for moving from chaos to complexity and then hopefully, one day, to obvious. It turns out these colourful bricks provide a great way to generate a conversation about something nebulous and dynamic.
The Emerging Formats workshop brought together stakeholders invested in the on-going development of new approaches for managing these objects. Together we discussed how these mobile app stories and web-based interactive narratives pose difficulties at different points in the digital lifecycle – from creation to preservation planning. The designation of ‘emerging formats’ itself is complicated – aiming to describe a broad range of digital objects, as described by Caylin above.
Creators, librarians, and researchers worked together with LEGO sharks and glitter pens to piece together a clearer picture of the challenges. What are these objects made of? What is the best way to make them available? What are the limitations to how they can be shared? How will we know if the object being collected is complete and in its most authentic form? How do we communicate the importance of collecting, curating, and preserving these works to others?
There might not be a ‘one size fits all’ answer to these questions. The presentations in the morning demonstrated the variety of digital artefacts included in the project. Caylin presented an interactive narrative designed to be used on an iPad – on a very specific version of the operating system. Daniel Merlin Goodbrey talked us through the evolution of web comics and lamented that some early instances of the form have already disappeared or broken. These different forms, while similar in some ways, pose different types of challenges and require different approaches to access and preservation. The app, for instance, relies heavily on a particular piece of hardware and software; the web comics, on the other hand, pose problems for web crawlers due to the use of Flash. And this is only the tip of the iceberg.
I hope the project team working on Emerging Formats found the day useful, but from the sounds of it, their work is well under way. From the background provided by Michael Day, the project has already made good progress on identifying the challenges that lie ahead. From operating system dependencies to unusual file types, the team have uncovered a range of obstacles and, as a result, begun to formulate what some future solutions might look like.
In some ways, perhaps, the initial autopsy of the problem – the object and its context – is the easy part. The hard(er) part will be finding and understanding options for access and preservation and making a decision about the way forward. The digital preservation community – including the UK LDLs in fact – have been here before: facing a new challenge arising from innovations in technology and how people use it. In 2013 the LDLs faced the daunting challenge of capturing the UK web after the implementation of Non-Print Legal Deposit legislation. Now, the UK Web Archive is up and running, containing several full crawls of the UK web and UK-related web content for access in the LDL reading rooms. No small feat.
Forging ahead will mean making informed choices – with the input and feedback of the larger community of creators, publishers, users and other collecting institutions. Even if the solution isn’t perfect (and it never is), the future of these complex digital objects will require collaboration and regular review as technologies continue to evolve.
From a participant’s point of view, though, the day was certainly a very productive one. We are all now having the same conversation – together – moving forward.
Further Reading:
“Emerging Formats: Complex digital media and its impact on the UK Legal Deposit Libraries” by Caylin Smith and Ian Cooke
“Preservation Planning for Emerging Formats at the British Library” by Michael Day, Maureen Pennock, Caylin Smith, Jerry Jenkins, and Ian Cooke