As Rosetta’s business analyst, I’ve always positioned myself on the receiving end of this blog. The ever-growing experience and different perspectives of digital preservation expressed in the stories shared here - and elsewhere - have significantly contributed to my ability to provide informed recommendations and decisions concerning the product’s roadmap and feature set. Therefore, when approached to contribute a post for this World Digital Preservation Day, I was ambivalent: My eagerness to contribute to this conversation seemed at odds with the product-oriented type of knowledge I can offer, which many would find uninteresting, if not inappropriate. After mulling over this for a while, I decided to share a recent dilemma I’ve been facing, with the hope that it will resonate well with the, er, designated community.
From the start, the distribution of roles and responsibilities in Rosetta development was well defined: The business logic is dictated by the community of Rosetta users, and the solution is executed by Ex Libris. Whereas for Alma development we were able to lean on decades of our ILS experience with Aleph and Voyager, for digital preservation we started with virtually nothing comparable to bring to the table. Thus, Rosetta’s architecture is essentially a reflection of what our developer partners thought it should be: OAIS, METS AIPs, PREMIS implementation, etc. Our modular approach, allowing for the integration of tools and the self-contained Format Library, is also a direct result of this policy. We were informed what digital preservation looks like, ours not to reason why.
I like to think we turned a corner several years ago, when our computer-science-trained business analyst was replaced by a library-science-trained one (who, for the previous couple of years, had been providing Rosetta customer support). The ensuing reassignment of internal development tasks left the technical designs to the engineers, allowing more time for the business analyst to observe other practices and methodologies. In particular, the opportunity to participate in the SCAPE project exposed me to a completely different way of thinking about and doing digital preservation; so different that it was difficult to apply anything I learned to improving Rosetta. Looking back, however, I believe Rosetta did improve: It improved through our ability to engage our user community with new ideas and openly raise some uncomfortable questions. And 10 years after going live, I found myself asking: Is Rosetta properly focused on digital preservation?
By now, we all have a pretty good idea of what a digital preservation system should do. If tender responses are any indication, it looks like all the major vendors are basically offering similar functionality, which would reflect the consolidation and maturity of the community’s requirements. What's not clear to me is whether we share the same ideas on what a digital preservation system should probably NOT do, and, more importantly, whether we’ve considered the possible implications of not giving this question its due attention.
Rosetta is described as “an end-to-end digital asset management and preservation solution.” A single system for both aspects – management and preservation – makes perfect sense. Maintaining two separate systems is far more expensive, and doing the first without the second is, in this day and age, irresponsible.
It’s not an easy task to demarcate the borders between asset management and preservation. Clearly, identification, characterization and validation of submitted content are closer to preservation. As an OAIS component, dissemination in itself is part of preservation, while you’d be hard-pressed to claim the same for IIIF support or maintaining viewers. A dedicated preservation system could probably get away with accepting SIPs in one or two standard formats, whereas repositories must be far more accommodating to diverse users and their respective preferred submission methods.
Given Rosetta’s dual management/preservation mission statement, we constantly face the challenge of maintaining a balance between the two. This challenge is compounded by diverse customer profiles – small and large institutions, consortia, and service providers. It is in this context that we are wary of requests to expand Rosetta’s responsibility, such as adding built-in backup solutions and monitoring tools. While such tools are indispensable for a preservation ecosystem, we’ve learned that, as with preservation tools, the key is not adding them to Rosetta’s code base but providing additional APIs and reference implementations. We found our customers receptive to this approach.
But asking myself how quick Rosetta has been to adopt recent community recommendations, I suspect the answer is not quite quick enough. Fortunately, our developer partners had the foresight to require the inclusion of Intellectual Entities in our data model, but there is remaining work to be done implementing PREMIS 3 changes, specifically nested Intellectual Entities (Intellectual Entities including other Intellectual Entities). And while we've introduced significant improvements concerning SIP and AIP validation for the upcoming release, event indexing and search could stand improvement, multiple extractor support would benefit from more attention, and multiple copy awareness remains a twinkle in product management's eye.
There are several good reasons for this backlog: This year we invested heavily in improving user experience, and next year we'll be spending approximately a third of our development resources on infrastructure updates and associated regression testing (multiplied by those many aforementioned features). Yes, these will advance digital preservation insofar as practitioners' daily routines will be more comfortable, and meeting IT department security policies will surely go a long way towards keeping the system operational. Efficiency and system performance will hopefully improve with these changes; larger throughput and therefore more preserved objects sounds worthwhile. But doing so won't bring us any closer to addressing emerging core digital preservation needs per se.
As it turned out, our users were asking themselves a similar question. Of our four Rosetta User Groups, recently formed to manage and submit enhancement requests to product management, two are focused on digital preservation functionality. The new format of our annual user meeting, which dedicates nearly a full day to working through and prioritizing the groups’ requests and is followed up by periodic calls throughout the year, is already proving invaluable in keeping the product on track. The results of our combined efforts, already evident in Rosetta's upcoming release and in ongoing work towards the next release, are, as far as this librarian is concerned, excellent reasons for optimism.