Last December, the DPC completed a study on EOSC, FAIR, and digital preservation, ‘FAIR Forever? Long Term Data Preservation Roles and Responsibilities’, commissioned by the EOSC Sustainability Working Group and funded by the EOSC Secretariat Project in 2020. In this blog post, I will share some of our key findings and recommendations from the study, along with two special announcements!
Background to the study (with special announcement #1)
In 2015, the European Open Science Cloud (EOSC) initiative by the European Commission set out to provide a large infrastructure that would bring scientists and their audiences together; federate existing infrastructures; augment these infrastructures; and revolutionize how scientific knowledge is created in all disciplines, in all geographies.
The FAIR guiding principles (Findability, Accessibility, Interoperability, and Reusability) have been foundational to the development and implementation of EOSC from the start. EOSC was envisioned as a web of FAIR data and services, making research data interoperable and machine-actionable following the FAIR principles.
The continued sharing and re-use of FAIR data to reproduce research and build upon it are critical for Open Science, and the long-term preservation of the federated data in EOSC was an area of sustainability where more research was needed.
The EOSC Sustainability Working Group invited DPC to assess the current strengths, weaknesses, opportunities, and threats to digital preservation in the context of EOSC. Following the submission and approval of the research proposal last August, we carried out a desk-based assessment of relevant EOSC documents, interviewed representatives from various regional and thematic ESFRI research infrastructures, and ran focus group sessions with people working in research management and digital preservation at research performing organizations to get an overall view of digital preservation in EOSC.
The first of our two special announcements is that the FAIR Forever final report is now published and freely available here through the EOSCSecretariat.eu Zenodo Community! The report details our findings relating to digital preservation capacity and our recommended coordinated actions by repositories, researchers, and stakeholders to better ensure the long-term preservation of—and access to—research data in all its forms. It is a hefty 58 pages, so for those of you who do not have the time to read it in full, a few of the key findings and recommendations are presented below.
What we found
We found many undoubted strengths within the EOSC vision, and opportunities created through the federated FAIR data and services provided by the EOSC. Some of the strengths include a commitment to persistent identifiers, data management planning, robust data storage and repository certification. There were clear points of intersection between the FAIR principles, data management and preservation planning for EOSC. Beyond EOSC, FAIR provides an entry point for preservation awareness and early planning among active researchers and an opportunity to guide and assess preservation early in the research data lifecycle.
Furthermore, projects and efforts are already underway to make digital preservation services available through EOSC, notably the three high-tech industry consortia selected via the ARCHIVER (Archiving and Preservation for Research Environments) Pre-commercial Procurement Tender to build archival and preservation services for European research communities.
But we also found weaknesses and threats that will affect how well digital assets can be secured in the long term. We found that:
Digital preservation is not explicit: Whilst there were references to preserving and archiving research data with acknowledged importance, digital preservation was largely implicit in the EOSC vision. Additionally, there was a strong (perhaps rhetorical) tendency to focus on the preservation of data, with the term often used inconsistently among EOSC stakeholders.
Roles, responsibilities and accountabilities are unclear: Key findings from the study’s interviews, interactions, and focus groups reinforce the need for clearer roles and responsibilities, and for recommended solutions to mitigate the risks. In particular, participants in this research emphasized the need to clarify accountabilities that are implicit but never activated within data management plans (DMPs).
Data and reputation are at risk: Risks to reputation and data for EOSC arise from the technical complexity and uncertain accountabilities in the EOSC vision. Furthermore, there remain additional challenges tied to existing and available resources, such as the lack of clear funding and costing models for digital preservation and specific skills and training for the various actors in and across preservation activities.
What we recommended (with special announcement #2)
Based on the study's cumulative findings, we made nineteen recommendations for action, with each action assigned an owner and a priority level based on importance and urgency.
The final report presents the nineteen recommendations in two ways: first numerically by action area, with owners and priority noted, and second by owner, with recommendation number and priority (as set out below).
For the EOSC Secretariat
- Recommendation One: of urgent priority, establish a working party or task group, reporting directly to the EOSC Association Board with respect to digital preservation.
- Recommendation Two: of high priority, formalize terms of reference and host an initial meeting of a digital preservation task group to establish an iterative work plan.
- Recommendation Three: of medium priority, establish an operational basis for partnership to deliver the candidate model services proposed in this report:
  - A legacy code or software preservation service
  - A mechanism to ensure accountability and implementation of preservation in DMPs
  - A business case factory or service for preservation cost modelling
  - A programme to support researchers with preservation at the point of creation
  - A mechanism for digital preservation policy across institutions within EOSC
- Recommendation Eleven: of medium priority, establish a mechanism to align EOSC implementation and interpretation of 'FAIR' with the path dependent and continuous quality improvement cycles of digital preservation.
- Recommendation Thirteen: of medium priority, establish and verify business models for preservation services.
- Recommendation Sixteen: of high priority, establish an ongoing basis for partnership in the digital preservation community, including beyond the research data community.

For the EOSC Association Board
- Recommendation Five: of urgent priority, designate a Senior Digital Preservation Rapporteur on behalf of the Board to directly communicate and liaison with a Digital Preservation Task Group, to monitor and oversee EOSC's responses to digital preservation risks.
- Recommendation Eighteen: of high priority, obtain strategic control of digital preservation risks to EOSC.
- Recommendation Nineteen: of medium priority, establish a strategic trajectory for management of digital preservation risks, embedding these within reviews and enhancements.

For Funders
- Recommendation Six: of urgent priority, articulate to all grant holders the clear view that adherence to FAIR principles requires preservation actions to be monitored and managed over the entire life of a project not simply at the point of completion.
- Recommendation Seven: of high priority, audit preservation pathways for all research outputs to identify critically endangered content.
- Recommendation Eight: of high priority, initiate a process to establish accountabilities and obligations with respect to implementation of data management plans.
- Recommendation Nine: of medium priority, establish mechanisms to engage expert communities of practice in the validation of data management plans.
- Recommendation Fifteen: of medium priority, identify costs of action versus inaction with respect to high value, critically endangered content.
- Recommendation Seventeen: of medium priority, establish more sustained digital preservation training for researchers and repository managers.

For Research Repositories
- Recommendation Four: of urgent priority, adapt workplans to include quality improvement mechanisms where these do not already exist, including DPC Rapid Assessment Model, establishing thereby a strategic framework to achieve baseline certification for primary preservation services, or identifying preservation pathways for data.
- Recommendation Ten: of medium priority, provide strategic framework for audit of data management plans.
- Recommendation Fourteen: of medium priority, identify costs of action versus inaction with respect to high value, critically endangered content.

For the Digital Preservation Community
- Recommendation Twelve: of urgent priority, provide a place for EOSC to share lessons and articulate emerging requirements outwith the research data 'bubble'.
But we want to hear your thoughts on these recommendations. Under which role or ownership do you identify yourself? What do you think of their feasibility? Is anything critical missing now, or likely to become critical as EOSC embarks on its new phase of implementation? How ready is the digital preservation community for EOSC, and how is EOSC enabling digital preservation?
Discussing these questions is important because 2021 is a significant year for research data management, with the European Open Science Cloud (EOSC) embarking on a new implementation phase of its compelling and ambitious prospectus.
This brings us to the second special announcement. As you may have noticed over the last couple of weeks, the DPC is hosting the ‘FAIR forever? FAIRer for longer’ webinar on 18th March, where we will share some of the study’s key findings and recommendations and get a discussion going with leaders, representatives, and attendees. We are pleased to confirm that our speakers and panel members will include:
- Bob Jones, CERN and EOSC Association
- Jessica Klemeier, EMBL, on behalf of Rupert Lueck, EMBL and EOSC Sustainability Working Group
- Natalie Harrower, DRI and EOSC FAIR Working Group
- Herve L’Hours, UK Data Archive and CoreTrustSeal
We hope you can join us next Thursday. More information and registration details are available here: https://www.dpconline.org/events/fair-forever-event