On Sale at All Good Pharmacies: Eternal Life
There’s a paradox that links digital preservation with medicine. Digital preservation systems are subject to the same obsolescence that they exist to guard against: even great doctors catch colds. I believe my doctor is mortal but that doesn’t mean I reject her advice. Her advice is not intrinsically dependent on her own experience but is situated within a global, dynamic community of research and practice. The medical profession deals with the problem of its own mortality by shifting the locus of its competence from the specific to the general. So, if mortality is to medicine what obsolescence is to digital preservation, and if really great doctors don’t have to prove themselves by living forever, what about digital preservation tools? What should we do about the problem of obsolescence? Should truly great digital preservation systems demonstrate their worth by living forever? What is the locus of their competence?
Obsolescence as Pathology
The analogy between digital preservation and medicine doesn’t extend very far but it works in this regard too: neither medicine nor digital preservation exists for its own sake. Despite what you might see on the television, the evidence of a successful medical profession isn’t revealed in constant crises, complex diagnoses or relentless pathologies: it’s about healthy, happy and contented people, leading meaningful lives away from hospital and only going there periodically for routine checks and correctives. Medicine is about people and for people. And so with digital preservation: success isn’t fixing corrupted files or heroic efforts to resuscitate moribund systems, it’s in real and enduring impacts from robust and sustainable digital things that we monitor routinely and act on occasionally.
Let’s think about this for a moment because there’s a twist coming, and it needs to be set up properly. Digital assets – files, data, images, records, programmes – have value because they create opportunities in the real world. With digital preservation on our side we can cure cancer, detect crime and speak truth to power; we can buy and sell, laugh and cry, and we can manage processes through decades. These impacts are great but they are not guaranteed. That’s because access depends on a specific configuration of software, hardware and people, and all of these change. So, there are constantly emerging barriers to re-use. We may worry about data loss or reproducibility but really, it’s the loss of opportunity and impact that matters. It might look as though we are obsessed with file formats, workflows, rights, metadata, authenticity and any number of other issues. But we’re only doing that so that we can come good on the digital promise. Our focus is (or ought to be) on people and opportunity. So, to summarise: digital preservation is for people.
We’ve worked hard at the problems towards the bottom of this stack. We’ve become good at looking after bytes in their many configurations. We’re also getting better at moving up, explaining the value and the opportunity. But there’s another aspect we barely mention. If we could get to the source of the problem then our lives might be simpler. If the problem is that software, hardware and people change, shouldn’t we move upstream to deal with the cause rather than the symptoms? What are the causes of change? How do the consequences of change play out?
Change and decay in all around I see
I have inadvertently just asked you to explain the nature of change in society and technology. That’s an awfully big question and I am not sure we will solve it in a blog post. (There was a time when, as an archaeologist, I might have attempted some grand theory on the structuring structures of pre-disposition. But I would likely have been wrong and those structures predispose me in other ways now. I no longer have it in me.)
We could narrow it down a little to ask whether we might engineer an optimal socio-technical environment in which to keep our digital resources alive. We might not be able to explain change but that needn’t stop us trying to do away with it. It’s appealing in part because everyone could play. I expect I would simply end up listing all the things I hated about Windows 95. Kate Zwaard recently proposed that the Internet peaked with the Hamster Dance, a conclusion I find hard to argue with. Almost as soon as the idea is out of my mouth, it becomes preposterous. There is not, nor will there ever be, a perfect digital paradigm. It’s an absurd refraction of file format normalization. Although I don’t think anyone seriously thinks like this, I know from experience that others think we think like that. If our colleagues think we are saying ‘stop the world we want to get off’, then I feel sure they will indeed find a way to stop the world and help us off.
If we’re so good at spotting obsolescence and preparing other people for it, what happens when we turn that assessment on ourselves? Can we understand the implications of change for digital preservation tools and the community that deploys them? How and why should we respond?
People, Technology, Standards, Trust (then back to people again)
Bearing in mind that I routinely pitch digital preservation as a socio-technical problem, I am going to answer in social and technical terms. Let’s start with the social bit and end back on the social bit, by way of technology and standards.
Just to prove that I am hip with the data-driven zeitgeist I am going to open with a whole fact, one which I hope others can corroborate: the digital preservation community is growing. Let me illustrate the trend from a DPC perspective with four simple observations. The Digital Preservation Handbook was launched in 2002 as the work of two authors: the 2016 edition credits thirty-three authors. The Digital Preservation Awards in 2010 had one winner: in 2016 we had 6 winners with nominations from 10 countries and 4 continents. In 2009 the DPC had 2 staff and in 2017 we have 6. In 2009 we had 33 members and in 2017 we have 66. I am sure that others can identify similar trend lines. The simple fact is there’s more of us than there used to be.
The growth is welcome but it generates a certain amount of disruption. The community is not simply growing but is becoming more diverse. We have more stuff too, it seems ever-more complicated, and it is subject to more and different regulation. That means the use cases for preservation are more demanding, the requirements more expansive and expectations more exacting. It may seem like a good time to move into the digital preservation business (and everyone is welcome) but we’re a disruptive and eccentric bunch. One solution will not rule them all and if it did, it wouldn’t do so for long.
So far so good for the people-bit. Now what about the technology bit of my socio-technical challenge? There’s a paradox that digital preservation systems are subject to the same kind of obsolescence that they are designed to prevent. Because such systems routinely implement a fixed (and frankly limited) set of tools and services, the risk of obsolescence is in fact relatively concentrated. We need to embrace our own mortality: digital preservation tools are products of their own times and so are a contingent solution to an enduring problem. We also need to make sure that by introducing digital preservation solutions into our organizations we don’t inadvertently accelerate the problem we’re supposed to be guarding against. That’s not unprecedented. Extending the earlier metaphor to breaking point, let’s call this the equivalent of ‘hospital acquired infection’. Learned counsel might suggest it’s a form of negligence.
We could get into another discussion here about the funding of tools like JHOVE or DROID, but that’s not the best argument. The tools can come and go; it’s the standards that matter. You can deprecate, castigate or fetishize tools, but it is standards that might be described as the digital preservation warranty. Standards codify and validate the community’s best guesses at resilience, declaring them formally and openly so that developers can make best practice executable. Better still, if we can make our expectations atomic and to some extent modular then we can build architectures which are more robust. Components that become obsolete can be discarded and replaced. The more atomic the component, the easier that change will be. I grant you we are not currently overwhelmed by tool developers offering to build us a pick-and-mix constellation of standards-compliant micro-services. But as the community grows that is likely to change. In any case there’s a reasonable chance that the solutions will be proprietary, meaning we might be asked to put our trust in black box solutions. But that simply means the standards matter more. Provided tools can be shown to comply with some standard specification, they should in theory come with some elementary community validation. Put another way: if the system fails and the system is entirely true to the requirements, and if the requirements are bound to the standard, it’s neither the technology nor the requirement that is at fault. It turns the spotlight on the standards and the community that validates them.
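To make the idea of atomic, swappable components a little more concrete, here is a minimal sketch (in Python, with entirely invented class and tool names) of what a standards-shaped contract might buy us: the workflow depends only on the declared interface, so the component behind it can be deprecated and replaced without disturbing the rest of the pipeline.

```python
from abc import ABC, abstractmethod


class FormatIdentifier(ABC):
    """A hypothetical 'standard' contract: any compliant component can sit behind it."""

    @abstractmethod
    def identify(self, path: str) -> str:
        """Return a format identifier (e.g. a PRONOM PUID) for the file at `path`."""


class CurrentIdentifier(FormatIdentifier):
    """Stand-in for today's identification tool (a real one would shell out to it)."""

    def identify(self, path: str) -> str:
        return "fmt/18"  # placeholder result for illustration


class SuccessorIdentifier(FormatIdentifier):
    """A later, different tool that honours the same contract."""

    def identify(self, path: str) -> str:
        return "fmt/18"


def characterise(files: list[str], identifier: FormatIdentifier) -> dict[str, str]:
    # The workflow knows only the interface, not the tool, so an obsolete
    # component can be discarded and replaced without touching this code.
    return {f: identifier.identify(f) for f in files}


if __name__ == "__main__":
    print(characterise(["report.pdf"], CurrentIdentifier()))
    print(characterise(["report.pdf"], SuccessorIdentifier()))  # swap in the replacement
```

The point is not the code itself but the shape: the smaller and better specified the contract, the cheaper it is to retire the component behind it.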
Standards matter in all sectors but they should matter to us in particular. Facing a large and complicated task, standards offer a beginners’ guide to requirements; working on behalf of heterogeneous and risk-averse agencies, standards enable meaningful comparison between incommensurate processes. Standards hold out the promise of interoperability and are a keystone of succession planning. They are our best working solution to the paradox of obsolescence and a technology that cannot guarantee itself.
That doesn’t mean standards are a transferable proxy for success. Our standards – reference models, recommendations and frameworks – always need careful interpretation to be implemented. It sometimes seems that too many digital preservation organisations have no ambition further than to implement some standard or other and assume their job is done. As if some tick-box exercise is sufficient. Take ISO16363 as an example here. This is the formalised version of TRAC and holds, in concert with ISO14721 (aka OAIS) and ISO16919 (which has no traceable origin), the prospect of certification as a ‘Trusted Digital Repository’. The metrics break down into three major blocks: Organizational Infrastructure, Digital Object Management and Infrastructure and Security. No shortage of inquisition flows through these headings as every possible competence and capacity is examined. And here is where I am beginning to struggle. I would never not want a trusted digital repository. But reading the standards I am none the wiser about how I could buy one.
Dive a bit deeper to see what I mean. Here are three metrics, selected from the three different sections of the standard:
- 3.4.1 The repository shall have short- and long-term business planning processes in place to sustain the repository over time;
- 4.4.2.1 The repository shall have procedures for all actions taken on AIPs;
- 5.2.1 The repository shall maintain a systematic analysis of security risk factors associated with data, systems, personnel, and physical plant.
All well and good. I understand what they mean and I can’t fault the logic. But it’s also not how a market works: these are not things I can buy and they are things no one could ever sell. So any review of the digital preservation marketplace begins secure in the knowledge that none of the products on sale will ever meet the standard. Is (insert product name here) a Trusted Digital Repository system or solution? By definition, it cannot be. How can users know that the products they procure or develop are doing what is required? And how might the companies and developers behind these systems validate their claims to ‘do digital preservation’?
So there’s a mismatch between the standard, which presumes a soup-to-nuts interrogation of organizational capacity, and how the marketplace is developing in reality. In a strange way we may be inadvertently discouraging developers. More on that in a moment.
For now, to repeat what others have heard me say, I think the idea of a ‘trusted digital repository’ rings a little hollow. I can summarise this in three points: I struggle with the word ‘trusted’; the word ‘digital’; and the word ‘repository’. Firstly, I would rather have an ‘untrusted’ repository that meets its challenges openly. I want to hear about the failings and mistakes, and if I don’t then I have no real evidence that anything has been learned. I want to trust the people and their capacities, not the thing they use. Secondly, ‘digital’ seems a strangely transferred epithet. A ‘digital repository’ is like a ‘bread oven’: the oven is not made of bread. Much of what we need to build the repository is not digital: it’s policy and process, flesh and blood, income and expenditure. And finally, ‘repository’ is a peculiar metaphor which, like ‘file’ or ‘desktop’ or ‘document’, signifies and simplifies highly complex systems. Don’t get me started on why so many ill-fitting metaphors permeate how we talk about managing bit streams because that’s a whole other blog post. For now, I would simply observe that ill-fitted metaphors like ‘repository’, ‘package’, ‘file’ and ‘document’ can be intellectual quicksand. (Ill-fitted similes are like quicksand.) Is a ‘trusted digital repository’ the answer to our digital preservation problems? Well, if digital preservation is a process, not a thing, then in the long run it’s more important to have trusted tools, processes and staff. Every time someone tells me they want a trusted repository I check the ground under their feet to make sure it’s solid.
That’s a digression. Here’s the real point: the technology paradox is infectious. We may rightly deflect criticism away from technologies by requiring them to live up to standards but that’s only going to buy us a little time. Because the standards become obsolete too.
Standards as social infrastructure
So, if we make sure systems don’t become obsolete by reference to the standards, and we make sure standards don’t become obsolete by reference to the community, can the community become obsolete too? Bluntly: yes. More accurately, the long-term health of all our digital objects depends on an effort to monitor and renew the social infrastructure of community interaction, a challenge made harder (but arguably more enjoyable) by our rapid growth and increasing diversity.
The question of guaranteeing that our social infrastructure is fit for purpose turns out to be critically important. People make standards. People who show up, have time and have good connections. People with specific interests, ethics, capabilities and objectives. I have just been invited to a review of the MoReq standard, in Pretoria in May next year. I’d love to go but there’s no airfare and no hotel, and I have other commitments anyway. The Consultative Committee for Space Data Systems made a big and generous offer in 2002 when it gave us OAIS. But it’s hardly surprising when local records offices or university libraries struggle to comply with it: they are inadvertently comparing themselves to NASA. PREMIS is a remarkable thing, so wonderful that it won the 2005 Digital Preservation Award. How often can you say ‘award-winning metadata standard’? It’s great. But it’s not holy.
And that’s not just a challenge for the standards we already have, it’s a significant obstacle for what we lack. As I said earlier, the fact that I cannot buy or sell an explicitly standards-compliant digital preservation system seems to be a problem that we could deal with quickly. Here are some ideas, not all my own, that are not beyond the wit and wisdom of this growing community to implement. Imagine a model product specification for a digital preservation system, broken into a series of specifiable components, and supported by some kind of product validation framework. There is an entire industry dedicated to software quality assurance that we could learn from. Test corpora would help (segmented into different types of organisational function as much as by MIME type), as would robust and external data integrity checking. And why not have a quality assurance framework for crash-testing: everything from a faulty plug in the data centre to a full-scale evacuation and deployment in another system. That would let DP vendors distinguish themselves from run-of-the-mill content management and repository systems. It would stop us buying EDRMS and wondering why they are generally so rubbish at digital preservation. Oh, and ‘Invitations To Tender’ or ‘Requests for Proposals’ would be simpler because certain products could carry a sort of digital preservation passport. And if the DPC’s vision is ‘our digital memory accessible tomorrow’ then by reducing some of those barriers we’d stand a decent chance of having ‘a bit more of our digital memory accessible tomorrow’.
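To make the ‘robust and external data integrity checking’ idea a little more concrete, here is a minimal sketch (in Python) of routine fixity checking against a manifest recorded at ingest. The manifest shape and paths are invented for illustration; a real service would also log its results, schedule re-checks and keep the manifest somewhere independent of the storage it verifies.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(manifest: dict[str, str], root: Path) -> list[str]:
    """Return the relative paths whose current checksum no longer matches the manifest."""
    failures = []
    for relative_path, expected in manifest.items():
        if sha256_of(root / relative_path) != expected:
            failures.append(relative_path)
    return failures


# Hypothetical usage: a manifest recorded at ingest, checked again later.
# manifest = {"data/report.pdf": "9f86d081884c7d659a2feaa0c55ad015..."}
# print(verify_fixity(manifest, Path("/archive/aip-0001")))
```

The value of an exercise like this lies less in the checksum arithmetic than in the external, repeatable evidence it produces: exactly the sort of thing a product validation framework could test against.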
But back to the big question – what are we to do about obsolescence? The question echoes back with a transformation that puts community at the core. ‘What is the origin and direction of our standards? How have the mechanisms of scrutiny and revision adapted to our growing number, diversity and complexity?’ In the context of a rapidly growing and rapidly changing community there is a pressing need to make sure the relevant committees and working parties remain representative, transparent and inclusive. That probably means ensuring they are serviced by a credible and consistent infrastructure on behalf of the whole community. My guess: standards are still reviewed and revised as though it were 2002. My worry: this will not be good enough for 2020. To phrase it differently, OAIS requires that an archive monitor and respond to the changing needs of the designated community. Well said! Right back at you.
Before misapprehensions fester, none of this is intended as a commentary on the content of the standards, still less on the people who generously give of their time and expertise to create them. The subtext is about drawing attention to the vital work they do, giving encouragement and expressing admiration. Rather than griping at them, it’s a sort of self-criticism: we should do more to help them make standards better. The community infrastructure is here already and it comes with robust mechanisms to ensure that it remains geared to the needs of the growing and diverse community. Frankly, my salary depends on that.
Thus the question arises: can we do a better job of connecting the need for standards developed with the whole community to the DPC’s growing and renewing community infrastructure? I don’t only think we could, I think we really need to. It’s why last year we put time and effort into the community standards forum. That was a tentative step towards a more active role in standards for the DPC. And as we progress on the development of our new strategic plan, perhaps it’s time to put that to the test. Should we add standards development explicitly to the DPC portfolio? I’d love to hear your thoughts.
Digital Preservation: A Human Project
To recap, digital preservationists instinctively understand the need to monitor and renew emerging technology. It’s how we protect against obsolescence and there’s been a lot of effort on this. We have too often taken for granted the social infrastructure in which digital preservation occurs, so efforts to renew and monitor community coherence and interaction are disjointed. There are some impressive outcomes in terms of training and workforce development which we can celebrate. But it’s surprisingly hard to find good examples of how the community infrastructure needed for the validation of standards has adapted. That matters partly because the community is growing, and mostly because, if you enter the paradoxical world of technical obsolescence, the community’s emergent codification of good practice is the warranty of all that we might seek to achieve.
For all our technology, digital preservation is a human project. Just as we articulate the value of digital preservation as an investment in people and opportunity, so we turn to the community to validate our processes. That community is dynamic, a dynamism which implies ongoing renewal of our social infrastructure. I can think of worse slogans: digital preservation by people, for people.
I'd like to acknowledge my friend Pete McKinney of the National Library of New Zealand, chair of the PREMIS editorial board, who generously commented on an early draft of this post.
Comments
And now I'll need to move to comment two...
For me standards should be simple and "composable" and do one thing well - Like good software. So (perhaps) we have a standard for ensuring the long term integrity of bit streams ISOXYZ123 - I can buy XYZ123 compliant storage! (perhaps) We have a "versioning" standard that lets me relate one set of digital objects to another and assert one is a new "version" of the other say ISOXYZ321 - I can buy a "version control" component that is XYZ321 compliant and put it on top of my XYZ123 compliant storage! A Tool Box not a universal tool - If we only have a hammer (OAIS) then all Digital Preservation problems might become nails!
Dramatis Personae (alphabetical by surname)
Kevin Ashley: @kevingashley
Euan Cochrane: @euanc
William Kilbride: @WilliamKilbride
Jenny Mitcham: @Jenny_Mitcham
Ed Pinsent: @EdwardPinsent
Anthea Seles: @archivista13
Steph Taylor: @CriticalSteph
Comments:
Jenny Mitcham - Today's lunch time reading supplied by @WilliamKilbride and a very good read it is too! Obsolescence 2.0
Jenny Mitcham - Fave line re Trusted Digital Repositories has to be "I would rather have an 'untrusted' repository that meets its challenges openly"
Steph Taylor - I like that!
Jenny Mitcham - Me too! I've never really been entirely happy with the Trusted Digital Repository label ...and how it is sometimes quite randomly applied!
Steph Taylor - Agree. It seems sometimes like a leftover from the wild west days of the Internet. In some cases it's useful, but not in all.
Ed Pinsent - Clifford Lynch wondered if "competent" rather than "trusted" would be a better term. He said that in 2006!
Jenny Mitcham - That could be better. "Trusted" always raises questions in my mind - ie who by? @WilliamKilbride
Kevin Ashley - This is a bit TCFT, but 'Trusted' was in docs from 1990s & was wrong term; 'Trustworthy' more useful, but still only a stepping stone.
Euan Cochrane - I dunno... I thought it was "trustworthy" as assessed by auditors? I think the issue is with certification vs trust
Jenny Mitcham - Yes and no - because different organisations use the term differently - some to signal certification and some not - can be confusing
Euan Cochrane - Isn't that why trustworthy has been defined in #ISO16363 though?
Jenny Mitcham - Maybe I'm thinking of 'trusted' rather than 'trustworthy'...I think it is a confusing landscape!
William Kilbride - well ISO16363 really is RAC not TRAC. But the word 'trusted' is ubiquitous and I'm therefore concerned when I hear it used like that. 1/2
William Kilbride - But also, I like that 'trusted' is past tense, ie 'we used to trust that repository but not now'. If there were a future participle ... 2/2
Euan Cochrane - yeah, I'm just not as convinced as @WilliamKilbride that certification is problematic I guess. It's expensive but that's kinda the point.
William Kilbride - ... trusted, trustworthy, certified, competent, audited: all lead back to the community as infrastructure.
Jenny Mitcham - The problem is perhaps that DP never stands still - we need to evolve, react and experiment. Certification doesn't really capture that.
Euan Cochrane - I'm (reasonably) convinced we need a more comprehensive centralized digipres community services provider. Ideally permanently endowed #easy
Euan Cochrane - Minimum reqs stay reasonably static. Certification against those seems worthwhile. @WilliamKilbride points out the standards must evolve.
Euan Cochrane - I'm pushing back mainly because i don't think we've yet given certification a chance. It could be great but will take time IMHO
Anthea Seles - Certification is entirely problematic! It's riddled with unrealistic expectations: infrastructure, capacity and resources.