I’ve been thinking about interoperability a bit recently and wondered if it might help us frame a broader discussion on how digital preservation might integrate with the wider technology landscape. My basic thesis is that digital preservation remains a niche topic and this is bad news. For all the reasons we’ve discussed before, it’s hard to get chief technology officers, let alone finance directors, to invest for the sake of the long term, but if we can build digital preservation capability for inclusion in other systems then everyone will be a winner. There are two messages here: that digital preservation vendors and their clients need to be alert to interoperability (as many already are) so that digital preservation capacity can be deeply embedded within diverse systems; but also that digital preservation is a special case of interoperability and as such has something to offer the rest of the technology sector.

Virtue Signalling

Interoperability is a wonderful ideal with equally wonderful outcomes.  It’s a commonplace of communications technology, creating the conditions in which products or information or systems from multiple services or sources work seamlessly together to the greater benefit of all, normally through the consistent application of shared standards.  You barely notice when it works but can instantly tell when it sucks, because the systems are supposed to do the work and humans shouldn’t have to force the issue.  It normally applies to the here-and-now but there’s no fundamental requirement for it to be only in the present tense.  It’s not hard to recast digital preservation as a subset of interoperability, then use that new status to shape a debate about what’s next for our community.  You might describe it as ‘diachronic’ as opposed to ‘synchronic’ interoperability: interoperability between the past and the future rather than just within the present.  You might boil it down to ‘our digital memory available tomorrow’.

If I am right, then a simple test for digital preservation systems would be to ask whether they are ‘good at interoperability in the here and now’, because if they can’t do it now then you have to ask what the prospects are that they will come good in the future.  And by extension, if we can identify the challenges to interoperability now then we have some small proxy for the challenges to come.

But if you will permit me, interoperability is a wonderfully expressive word too.  There are a lot of syllables in there evoking the many moving parts needed to bring it about.  It has too many letters to win you points in Scrabble.  And it’s a composite word where the juxtaposition of multiple parts compounds to some grander outcome.  It starts with ‘inter’, a connection; it ends in ‘ability’, the capacity to do something you couldn’t do before; and somewhere in the middle is the ‘opera’, a highly stylised song and dance that often ends in tears.  I commend it to you as a perfectly constructed signifier for an entirely benevolent idea. Add it to your lexicon. Impress your friends, just not at Scrabble. 

Brothers in Arms

The quest for interoperability has been a significant influence on my career.  I think of the cross-search technologies from my time at the Archaeology Data Service (Z39.50, anyone?) or my brief entanglement with spatial data infrastructures, but I can start earlier than that.  Let me transport you to Glasgow and the long, hot summer of 1976.

Now I need to explain that our home – three boys under ten – was an ‘Action Man’ kind of place in 1976. For those in the know I had the ‘Explorer’ and ‘Talking Commander’ which came with all manner of uniforms, guns, bombs, boots, parachutes, radios, tents, sleeping bags, telescopes, and diving gear not to mention an entire Scorpion Tank. Anyone coming into our house in 1976 would know this, so it was an act of thoughtfulness as well as great generosity when my uncle returned from a holiday in the US carrying a great cache of diminutive American militaria to extend the domestic armoury. There were uniforms for each branch of the US armed forces and all manner of kit to match. But if I were to tell you that GI Joe was slightly smaller and slightly slimmer than his Scottish brother-in-arms, then you will begin to guess the problem. I remember the disappointment that Action Man’s gripping hands couldn’t hold the assorted hand grenades, guns, walkie-talkies, binoculars, frying pans etc; or that his feet couldn’t squeeze into the boots.

But I still weep great tears of simple, irrepressible joy when I remember what happened to the couture. (I can hardly stop my hands from shaking as I type these words for you in the departures lounge at Osaka airport. My childish giggles are winning me some strange looks.)

The helmets and hats, perched at jaunty angles on Action Man’s mighty head, confected a rakish, devil-may-care look: a louche side to his character which lumpen British tailoring had evidently repressed. And the uniforms. Oh the uniforms! The tropical white outfit for a US naval rating, complete with bell-bottoms, smock and matching pork-pie hat, would have raised an eyebrow, even on GI Joe. But squeeze that jaunty ensemble over the little big man’s muscle-bound frame and … well hallo sailor! An astonishing mannequin-cross of Popeye, The Village People and Right Said Fred. Imagine 25 centimetres of stern-faced but perfectly honed army beefcake rippling through 20 centimetres of taut yet pristine, body-hugging white velour. A tiny persona transformed. Things seen that can never be unseen and still achingly funny 40 years on: pure camp, base comedy, high farce, interoperability fail.

I think you might describe this as a variation in significant properties, but the unflappable wee hero cared not a jot. He just got on with fighting baddies, planting bombs and exploring stuff, so I’ll presume implied consent to explore the challenges of interoperability through this pint-sized humiliation.

Dis-interoperability

Let’s scale up to something serious. Imagine you were trying to launch a new product and your company had not one but two world-class engineering facilities in different countries. Imagine these facilities each used subtly different implementations of the same design software. Imagine there were minute variations in data processing deep in these systems. Now try assembling parts in one facility based on calculations from the other: it’s more than a fashion faux pas. Let’s situate the story in a highly regulated and risk-averse sector in which not only reputations and profits are on the line but lives, too. Naming no names, but that’s happened, and it almost bankrupted one of Europe’s largest engineering companies. No one imagined that interoperability could be an issue until it suddenly, massively became one.

Let’s add a time dimension. Imagine you manage a rambling, high-profile and historic public building. To help maintain this property you commission a comprehensive photogrammetric survey. The building has a lifecycle of centuries but CAD packages have a shelf life of two or three years. You don’t urgently need the data so you don’t think about it. Ten years later you finally get approval for the massive, overdue repairs, and in the meantime the system has become obsolete, taking the data with it. So you end up wondering whether it’s cheaper to pay someone to salvage the data or just re-survey the whole building. Of course the building is its own datum so there’s really no competition: a new survey is commissioned, the original data is abandoned, and the effort to generate it rendered futile. That’s happened, at least once to my knowledge. No one thought to check if the data could be exported, or if old and new CAD systems could talk to each other. Interoperability anyone?

Before this all becomes too sanctimonious, let’s acknowledge the practical and tactical barriers to interoperability. Both of these examples involve large and complex systems which take effort and skill to map. And there are layers: depending on what you want to achieve, the entire computing stack presents obstacles. There may simply not be common standards or protocols to facilitate exchange and, even if there are (MARC, EAD, GML, take your pick), that’s no guarantee it will be easy. Ask anyone who has tried to build a cross-search portal on the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) how many different ways there are to implement even a lightweight and now widely understood standard like Dublin Core.
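To make that concrete, here is a minimal sketch in Python of the kind of survey that exposes the problem: harvest oai_dc records from an OAI-PMH endpoint and tally how differently a single Dublin Core element, dc:date, is actually populated. The endpoint URL is a placeholder rather than a real service, and a complete harvest would also need to follow resumptionTokens.

```python
# A minimal sketch, assuming a hypothetical OAI-PMH endpoint: fetch the first
# page of oai_dc records and count the different "shapes" of dc:date values.
# (A complete harvest would also follow resumptionTokens; omitted for brevity.)

import collections
import requests
import xml.etree.ElementTree as ET

OAI_ENDPOINT = "https://repository.example.org/oai"  # placeholder, not a real service
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

response = requests.get(OAI_ENDPOINT, params={"verb": "ListRecords", "metadataPrefix": "oai_dc"})
root = ET.fromstring(response.content)

date_shapes = collections.Counter()
for record in root.iterfind(".//oai:record", NS):
    for date in record.iterfind(".//dc:date", NS):
        value = (date.text or "").strip()
        # Reduce each value to a crude pattern: digits become 9, letters become A.
        shape = "".join("9" if c.isdigit() else "A" if c.isalpha() else c for c in value)
        date_shapes[shape] += 1

# In practice this tends to print a surprising variety: "9999", "9999-99-99",
# "99/99/9999", "AAA 9999", free text and so on; all notionally the same element.
for shape, count in date_shapes.most_common():
    print(f"{count:5d}  {shape}")
```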

The effort can be expensive but not necessarily rewarding, especially when one of the consequences of interoperability is that competitors can see inside your systems, or that customers can move away more readily. Traditional business logic locks competitors out and locks customers in. So there has to be a good reason for a technology provider to embed interoperability in their products, such as greater reach, enhanced functionality or market expectation.

But there are some useful examples of times when all the participants in a given market have come together and recognised that everyone is better off if everyone sticks to the same broad principles and expectations. My favourite example is the Open Geospatial Consortium, an international and industry-wide organization that, in the late 1990s, engineered what you might call an interoperability-based non-aggression pact, consolidating a whole community around a carefully developed and widely adopted set of standards for spatial data infrastructures. It transformed much of the underlying technology, not to mention the data, and the global market for spatial technologies and mapping promptly surged.

Ages of Migrations

Stick with me here: there are six digital preservation themes that I want to tease out.

My mailbox tells me that digital preservation is entering a new round of repository procurement. I am not sure if this is the second or third phase; indeed Jon Tilbury, who’s been keeping score for a while now, tells me it’s the fourth (he told me at the PASIG conference in Oxford earlier this month; a great conference, by the way). The number doesn’t really matter, it’s the direction that counts. What I see in the DPC is a group of early innovators retiring existing repository systems, and at least one of these is doing it for the second time. But at the same time, a great number of other DPC members are procuring digital preservation systems for the first time. So it’s a great time to be in the repository business and it’s harvest time for those interested in repository migration. Instinct tells me that the global digital preservation community will look askance at products or tools that, for whatever reason, hinder or defeat such migration. To put that another way: interoperability means open standards ab initio. The winners in the digital preservation marketplace will be the ones that demonstrate a commitment to interoperability, and by extension a commitment to open standards.

There’s also some exciting news from the E-Ark project, which has been well reported through DPC channels over the last couple of years. To recap, this project sought to standardize OAIS information packages in a way that would make them directly comparable across multiple platforms. Time will tell how applicable this is for the many stakeholders outside of the European national archives in which E-Ark was trialled but the worst-case scenario is that the concept has been demonstrated and the rest is just work. It seems unlikely that one size will fit all: but it’s plausible that a few sizes will fit most.

In any case, being able to share information packages, and to ensure that someone else can make sense of them, is neither a trivial job nor a trivial responsibility. It’s the foundation of succession planning in digital preservation. Repositories that are unsure about how to do this should think quite carefully before accepting any more collections. In most cases the information packages will have a lifecycle longer than the repository service that maintains them, and in more than a few the information lifecycle is longer than that of the institution that runs the repository. In other words, if the AIPs are not portable by design then, paradoxically, the repository and its institutional context can become threats to the data they seek to preserve.
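By way of illustration only, the sketch below imagines a ‘portability smoke test’ you could run over exported packages before relying on them as an exit strategy. The expected layout (a METS file at the package root plus metadata and representations folders, broadly the shape E-Ark proposes) and the FLocat check are assumptions made for the sake of the example, not a statement of any actual specification.

```python
# A toy portability check, assuming an illustrative package layout
# (METS.xml at the root, plus metadata/ and representations/ directories).
# This is not the E-Ark specification; it only sketches the kind of question
# a repository should be able to answer before accepting more collections.

import sys
from pathlib import Path

EXPECTED = ["METS.xml", "metadata", "representations"]

def check_package(package):
    """Return a list of problems that would stop a stranger reusing this package."""
    problems = []
    for name in EXPECTED:
        if not (package / name).exists():
            problems.append("missing " + name)
    # A portable package must describe its own contents: here we simply look for
    # file location (FLocat) entries in the root METS file as a crude indicator.
    mets = package / "METS.xml"
    if mets.exists() and "flocat" not in mets.read_text(encoding="utf-8", errors="ignore").lower():
        problems.append("METS.xml does not appear to reference any files")
    return problems

if __name__ == "__main__":
    for pkg in map(Path, sys.argv[1:]):
        issues = check_package(pkg)
        print(pkg, "OK" if not issues else "; ".join(issues))
```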

Other areas need more work if we are to support interoperability as a virtue in digital preservation. Interoperability almost always means standards that are properly adopted, mutually supportive and actively managed. In a blog post in May I raised concerns about how digital preservation standards emerge and are validated: the community infrastructure around them is obsolete, which risks not only the competence of the standards but to some extent the community itself, and certainly creates contradictions and confusion about what the standards represent. Although we still have some details to flesh out, I am pleased that the DPC Board has now explicitly extended our mandate to include a more sustained effort on standards. More on that in due course.

This new mandate will hopefully take us some way towards a more unified voice and a more participatory framework for standards. It will ultimately be modelled around the DPC and its membership, which is large, growing and healthily diverse, but it is by no means yet a single voice for the whole digital preservation community. So let me encourage you: if you’re not a member yet and you feel you have a role or capacity for leadership, then take seriously our open invitation to get involved.

The reference to the DPC’s emerging programme reminds me also to say that we can’t simply spend our time thinking about digital preservation standards. The real challenge for the future of digital preservation is how our standards interact with others. Are we willing or able to admit the proliferation, confusion and contradictions that the world throws at us, or are we going to wait for the world to bend to us? And that in turn leads me to ask about our own community and what path we should travel. Are we inclusive and engaging, responding flexibly to a world that is dynamic and disrupted; or are we waiting for the world to come to us in our perfection? Our standards say a lot about us: where we think we’re headed and with whom we expect to be working.

When the cap fits

Setting interoperability as a goal for digital preservation gives us two and possibly three benefits: it encourages us to integrate digital preservation tools with other products, it provides a litmus test for repository systems, and it provides a frame of reference that explains our purpose in terms that others will recognise. If it looks like interoperability for now, then it looks like interoperability for the long term too.

Acknowledgements

I am grateful to Mairi-Claire Kilbride, Sarah Middleton and Sara Day Thomson who reviewed this prior to release.

Comments   

#1 William Kilbride 2017-10-04 12:19
I've been asked to provide some images to illustrate Action Man for an international audience who might not have been exposed to Scottish toys of the 1970's. Here's a link to a picture of the Talking Commander whom I recall being the centrepiece of the collection: https://goo.gl/images/oupQnH
#2 Ian Meldon 2017-10-04 12:49
My gut reaction to this is that they are playing the same game as any company that builds in obsolescence / a need to keep returning to them for updates/new tools with whatever you buy. The worst case scenario is perhaps that they intentionally don't address the issue of interoperability, hoping to trap clients with single tech vendors long-term. Echoes of this: I sense the DPC has a kinship with the right to repair movement! https://www.economist.com/news/business/21729744-tractors-smartphones-mending-things-getting-ever-harder-right-repair-movement
#3 Sebastian Gabler 2017-10-04 12:51
Excellent blog post. My thoughts: OAIS addresses long-term access just as it does preservation. In fact, this has been a long-time dream of IT departments. Interfacing across standards is a huge issue with traditional IT. The Semantic Web has answered a lot of the challenges in a non-intrusive manner.
Some preservation platforms such as FEDORA are beginning to leverage Semantic Web technologies. That is a good thing imho.
#4 Jonathan Tilbury 2017-10-06 11:21
I totally agree that any DP system should build in an exit strategy from the start, either via a storage structure standard like eARK or a documented alternative storage structure. Alternatively, this doesn't have to be access to the storage but could be via a comprehensive API that allows harvesting of the complete information store.
This is not the same as mandating the structure of the primary storage though which is often set up differently for operational reasons.
Lastly, interoperability with data sources is another problem altogether and requires live harvesting of content ready for preservation. There are commercial tools that support this and these should be integrated with DP tools.
#5 Jon Tilbury 2017-10-06 14:21
- As you say interoperability means different things to different people. The main challenge to DP at the moment is getting things in seamlessly so I think the transfer of content from Information Management Systems to Digital Preservation Systems using rules and auto-transfer is the key gap at the moment
- The follow up issue of DP System A to DP System B will become a problem and any DP vendor should build this into their system to allow it to happen seamlessly. We have always allowed this and know this is the thing that keeps the sector awake at night
- eARK may be a solution to the latter but I suspect it is too bound up in the SIP, DIP and AIP mindset to be of use for transfer in for corporations that just don’t think that way. There are plenty of commercial applications such as SkySync and Xillio which do this out of the box.
part 2 of 3 ...
#6 Jon Tilbury 2017-10-06 14:21
...part 3 of 3

- eARK will be useful for DP portability but I see it as a migration mechanism, not the internal primary storage of a DP system, which is tuned to live usage. In particular it doesn’t allow for changes in structure and I am not sure about its scalability, for example with very wide records and live streaming of data. We will work with eARK to reassure ourselves these things have been thought of
- A lot of the current DP community still think about access to the internal storage as the way to do migration rather than having public APIs. We are extending our APIs to make this more straightforward, but I think a lot of the focus should go here

The aspiration of interoperability is laudable but the manner of its delivery perhaps needs bringing up to date.

Also, happy to be score keeper of DP generations!
