Research Data Published through Repositories

Vulnerable

Research data published through digital repositories or other service providers with the specialist skills to manage the data and an ongoing commitment to ensure its preservation.

Digital Species: Research Outputs

Trend in 2023:

Reduced risk: Material improvement

Consensus Decision

Added to List: 2019

Trend in 2024:

No change

Previously: Vulnerable

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

It would require a small effort to preserve materials in this group, requiring the application of proven tools and techniques.

Examples

Recognized data repositories in specialist disciplines; institutional data repositories in subject specialist centres and partnerships.

‘Endangered’ in the Presence of Aggravating Conditions

Lack of long-term commitment; lack of user community; lack of visibility to potential depositors; lack of institutional commitment; insufficient documentation; uncertainty over IPR or the presence of orphaned works.

‘Lower Risk’ in the Presence of Good Practice

Certification and documented good practice; effective documentation requirements for depositors; proven financial sustainability; skilled staff, including the professionalisation of disciplinary and general data stewardship as a clear career option; participation in the digital preservation community; research data management training offered by repositories and research funders to depositors, in particular early-career researchers.

2023 Review

This entry was added in 2019 as a separate entry. It was first introduced in 2017 under ‘Published research outputs,’ though without explicit reference to the capacity of the repository infrastructure. The 2019 Jury split that entry into a range of contexts for research outputs, including this one, classified as Vulnerable: research data is at lower risk when it is published through a well-founded repository that has the capacity and commitment to ensure preservation and that maintains its capability through its own professional development activities.

The 2021 Jury agreed with this classification but commented on the improvements and initiatives towards the preservation of research data and outputs, leading to a 2021 trend towards reduced risk. The 2022 Taskforce identified a 2022 trend towards reduced risk, based on material improvement over the previous year that had not only offered examples of good research data management and preservation practice but also suggested a significant shift towards a culture of change and collaboration across different research communities and stakeholders. Those mentioned included (but were not limited to) improvements and initiatives by the European Open Science Cloud (EOSC), Science Europe, the Research Data Alliance (RDA), the Digital Curation Centre (DCC) and related projects on the preservation of research data and outputs.

The 2023 Council agreed with the Vulnerable classification and noted a trend towards reduced risk due to increasing research data management and engagement activity by libraries, which should result in a growing number of datasets being deposited. The 2023 Council also noted that empirical data on depositing trends would be useful to assess this.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

A key consideration with this entry is whether the data repository is integrated with a preservation system to facilitate long-term access to, and usability of, datasets.

The loss of tools, data or services within this group would impact on people and sectors around the world, particularly those involved with reproducibility and those wishing to use the datasets for further research.

Although there have been improvements in current practice, policies and workflows, there is still a significant corpus of information that was deposited before these improvements came into force. It is unlikely that there will be the time, will or resources to bring this information up to current standards.

Adding preservation metadata to research data holdings may help make data more robust in the long term where using a preservation system is not an option. With an emphasis on environmental sustainability, some repositories hesitate to mandate additional copies of large datasets, which may be in the region of hundreds of terabytes, as this adds to both storage cost and carbon footprint, especially when capturing and preserving the research methodology would enable the dataset to be recreated.
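As an illustration of what basic preservation metadata might look like where a full preservation system is not available, the following minimal sketch records a checksum, size and relative path for every file in a dataset directory, and later reports files that are missing or no longer match. The function names (`build_manifest`, `changed_files`) and the manifest layout are assumptions made for this example, not a standard or any repository's actual practice; real deployments would more likely follow an established packaging or metadata specification.

```python
import hashlib
import json
import sys
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dataset_dir: Path) -> dict:
    """Build a simple fixity manifest (hypothetical layout) for every file under dataset_dir."""
    files = []
    for path in sorted(p for p in dataset_dir.rglob("*") if p.is_file()):
        files.append({
            "path": path.relative_to(dataset_dir).as_posix(),
            "size_bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return {
        "created": datetime.now(timezone.utc).isoformat(),
        "algorithm": "sha256",
        "files": files,
    }


def changed_files(dataset_dir: Path, manifest: dict) -> list:
    """Return relative paths from the manifest that are missing or whose checksum differs."""
    problems = []
    for entry in manifest["files"]:
        target = dataset_dir / entry["path"]
        if not target.is_file() or sha256_of(target) != entry["sha256"]:
            problems.append(entry["path"])
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print a manifest for the directory given on the command line.
    print(json.dumps(build_manifest(Path(sys.argv[1])), indent=2))
```

Storing such a manifest outside the dataset directory, and re-running the check periodically, gives a lightweight fixity record without requiring an additional copy of the data itself.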

Case Studies or Examples:

See also:

  • A recent analysis by Martin Eve of CrossRef shows scholarly content at risk. The findings, based on an assessment of around 7.5 million of the e-books and articles for which CrossRef provides a fixed identifier or Digital Object Identifier (DOI), suggest that around a quarter of academic publications are not being preserved for the future. For c. 2 million articles in the study there was no evidence of preservation, and 4.3 million of the works studied were preserved in at least one place. See: Eve, M. P. (2024) ‘Digital Scholarly Journals Are Poorly Preserved: A Study of 7 Million Articles’. Journal of Librarianship and Scholarly Communication 12(1). Available at: https://doi.org/10.31274/jlsc.16288

  • Strecker, D., Pampel, H., Schabinger, R. & Weisweiler, N.L. (2023) ‘Disappearing repositories -- taking an infrastructure perspective on the long-term availability of research data’. Available at: https://doi.org/10.48550/arXiv.2310.06712 

  • L’Hours, H., Kleemola, M., von Stein, I., van Horik, R., Herterich, P., Davidson, J., Rouchon, O., Mokrane, M., & Huber, R. (2021) ‘FAIR + Time: Preservation for a Designated Community (01.00)’. Available at: https://doi.org/10.5281/zenodo.4783116 

  • Science Europe. (2021) ‘Practical Guide to Sustainable Research Data: Maturity Matrices for Research Funding Organisations, Research Performing Organisations, and Research Data Infrastructures’. Available at: https://www.scienceeurope.org/media/b3odxx3s/sepractical-guide-sustainable-research-data.pdf [accessed 24 October 2023]

  • European Open Science Cloud (EOSC) (n.d.) ‘Development and outputs of the European Open Science Cloud (EOSC) Long-Term Data Preservation Task Force’. Available at: https://www.eosc.eu/advisory-groups/long-term-data-preservation [accessed 24 October 2023]

