Corinne Frappart

Last updated on 9 November 2023

Corinne Frappart is Digital Archivist at the Publications Office of the EU


Introduction

As we recently celebrated World Digital Preservation Day (WDPD) in 2023, we find ourselves standing at the crossroads of history in the field of digital preservation, tasked with the challenge of safeguarding the ever-increasing number of digital artifacts that define our past, inform our present, and shape our future.

The theme for this year's WDPD, "Digital Preservation: A Concerted Effort," underscored the collaborative nature of the digital preservation community's mission. One of the most emblematic examples of this collaboration is the development of open-source software. In this blog post, I will delve into the specific case of RODA, an open-source software for long-term preservation, and show how its use by public administrations leads to improvements that benefit the digital preservation community at large.

Origin of this blog

It all started in April 2023, when I was invited to present the Publications Office of the European Union’s (the ‘Office’) activities at a conference dedicated to the eArchiving Initiative[1]. I concluded my talk by mentioning the positive impact of our activities on other users.

When my superiors suggested that I give this presentation again in August to the wider and less technical IFLA audience[2], I tried to focus more on the benefits than on the tools used. I contacted the contractor tasked with maintaining our digital repository, who is an important player in the evolution of RODA, the open-source software we use. Reading the list of the Office’s requests that had led to enhancements opened our eyes to the decisive role that an administration can play in financing improvements to open-source projects. It is this surprise that I would like to share in this blog post.

But before we get into the details, let’s set the scene.

Setting up the context

What kind of administration do I work for?

The Publications Office of the EU[3] is the central point of access to EU law, as well as to publications, open data, research results, procurement notices and other official information. In fact, it simultaneously plays the roles of a publishing house, a legal depository and an archive for the publications authored by the EU institutions.

Another of the Office’s missions is to facilitate the reuse of published knowledge. Even though the “EU reuse directive”[4] focuses on the reuse of content, we will see that its spirit can also be applied to software.


Figure 1 - The Publications Office of the European Union and its services

The eArchiving Initiative

By nature, long-term digital preservation is concerned with both reuse and sustainability. That is why the Office’s strategy favours the sharing of resources and technology by means of open source and open standards.

The eArchiving Initiative perfectly meets this objective. eArchiving is funded by the European Commission under the Digital Europe Programme (DIGITAL), which brings digital technologies to businesses, citizens, and public administrations. The eArchiving Initiative provides core specifications, software, and knowledge to help people store information for longer.


Figure 2 - The eArchiving Initiative and its funding

Since 2016, the Office has opted for the main reference standards (OAIS, METS, PREMIS) that are used worldwide in the field of electronic archiving and that are naturally recommended by eArchiving. The Office also models its archival packages according to E-ARK, the European standard developed by the eArchiving Initiative. The long-term digital preservation software adopted by the Office is RODA, a fully open-source solution proposed by eArchiving.


Figure 3 - The eArchiving Initiative as used by the Publications Office of the EU

The mechanism of the virtuous circle

Now we come to the heart of the matter.

When the Office needs functionalities that are not present in RODA, our contractor[5] assesses where the customisation will take place. If the new feature is useful to all users, it is developed within the core application. If it is specific to a subset of users, it becomes a plugin to the open-source software. This plugin is kept in a private environment if it is very specific to the Office, or commercialised if it arouses the interest of other clients.

A virtuous circle comes into play. The Office gets the developments it has paid for and needs to carry out its archiving tasks. This is the first, expected benefit. But these same developments benefit all other RODA users. Basic users can download the enriched upgrades for free, and companies with needs similar to the Office’s can buy the new plugins. New users are attracted by the product, and their different needs lead to further improvement of the open-source software through the addition of new features. A new, enhanced version is made available, and the Office, like all the other users, can take full advantage of it. The virtuous circle reinforces the positive aspects of the initial situation.


Figure 4 - Development of the digital preservation open-source software RODA by the Publications Office of the EU, and the benefits of its reuse by the community

Some concrete examples

Figure 4 shows examples of developments that were directly linked to the use of RODA by the Publications Office. It is the number of items on this list, and the importance of each of them, that surprised me and my colleagues so much.

Let’s first take the ingest workflow. The archive receives SIPs in the METS format from its main producer. The METS packages are converted into E-ARK, the internal archival format of the AIPs. As this conversion is specific to the Office, it is performed by a dedicated plugin.

On the other hand, the reporting and monitoring systems are features that are especially appreciated by users who regularly ingest large volumes of data, and are perhaps less important for basic users. That is why they are commercial plugins.

Finally, Representation Information is needed by any archive, as it provides the information that users and future generations need to understand and render the bit sequences constituting the Content Data Object of the archived material. It was therefore made part of the RODA core, under the name of “Representation Network”.
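For readers who like to see a rule written down, the assessment described earlier (core, private plugin or commercial plugin) can be sketched in a few lines of Python and applied to the three examples above. This is only an illustration: the class, field and function names are hypothetical, they are not part of RODA, and the contractor's real assessment is a human judgement rather than a two-flag test.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    """Where a requested feature ends up (labels are illustrative)."""
    CORE = "core application"
    PRIVATE_PLUGIN = "private plugin"
    COMMERCIAL_PLUGIN = "commercial plugin"


@dataclass(frozen=True)
class FeatureRequest:
    """Hypothetical summary of a feature request; not a RODA data structure."""
    name: str
    useful_to_all_users: bool
    interests_other_clients: bool = False


def decide_placement(request: FeatureRequest) -> Placement:
    """Apply the rule from the text: core if useful to all users,
    otherwise a plugin, commercial only if other clients are interested."""
    if request.useful_to_all_users:
        return Placement.CORE
    if request.interests_other_clients:
        return Placement.COMMERCIAL_PLUGIN
    return Placement.PRIVATE_PLUGIN


# The three examples discussed in this post, with flags assumed from the text.
examples = [
    FeatureRequest("METS to E-ARK conversion", useful_to_all_users=False),
    FeatureRequest("Reporting and monitoring", useful_to_all_users=False,
                   interests_other_clients=True),
    FeatureRequest("Representation Information", useful_to_all_users=True),
]

for request in examples:
    print(f"{request.name}: {decide_placement(request).value}")
```

The check for "useful to all users" comes first on purpose: a feature that benefits everyone goes into the core even if it would also have found paying clients.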

In conclusion

The public money that the Office spends to customise the open-source software preserving its collections for the long term triggers a snowball effect: it benefits not only the Office itself, but also the whole community of users of that software, by creating sustained growth. But the benefit we are most proud of is the realisation that, through our requests for developments, we have also been of service to the wider digital preservation community.

 

[1] https://digital-strategy.ec.europa.eu/en/library/earchiving-initiative-online-event

[2] https://2023.ifla.org/, session 175, Digital Technologies and Sustainability: Say Your Piece in 7+3.

[3] https://op.europa.eu/

[4] http://data.europa.eu/eli/dir/2019/1024/oj

[5] Currently a consortium composed of NetCompany and Keep Solutions (the latter specialised in the maintenance of RODA).

