Ingrid Dillo is Deputy Director at DANS and co-chair of the RDA COVID-19 Working Group. Marjan Grootveld is Coordinator Projects and Policy at DANS and is involved in FAIRsFAIR.
Under the umbrella of the Research Data Alliance (RDA) and at the request of the European Commission, more than six hundred data professionals and domain experts developed guidelines for sharing research data to address the COVID-19 pandemic. The guidelines for researchers, published in July 2020, help four domains find a balance between timeliness and accuracy: clinical science, omics, epidemiology, and social sciences. Instruments and practices mentioned in the guidelines include data management plans, metadata, and ethical and privacy considerations, along with the technology needed to support them. Furthermore, the RDA COVID-19 Working Group’s report contains recommendations for governments and research funders, for instance to promote open science across international jurisdictions through policy and investment. Documentation is seen as crucial for all stakeholders: researchers should document their methodology, data cleaning, and data provenance, while decision makers should document their decisions. And of course, documentation should be preserved for the future.
Because the pandemic is as much a social as a medical phenomenon, the role of social scientists is vital. They collect and reuse data to inform policy makers and leaders about social and economic issues arising from COVID-19. The RDA COVID-19 Working Group’s report mentions areas of research like social isolation, food security, and education impact; an example from the DANS long-term archive (more about that below) concerns a survey on political preferences, including support for the emergency measures that were taken. Social science data such as demographics are also valuable for all disciplines if we want to better understand data in their context. Since a large share of social science studies are observational, much of the data cannot be recreated, which makes preservation even more important.
(Image: Bas van der Schot in this opinion article.)
DANS provides an archive for long-term preservation of and access to research data. Social and behavioural sciences make up a substantial share of the more than 150,000 datasets currently in the archive, which has acquired the CoreTrustSeal. However, it’s one thing to run an archive that is certified as trustworthy. It is another, although related, thing for potential reusers to trust the preserved data. That’s why we make quality demands on deposits while also supporting the depositors. For example, we support researchers and research support staff by providing information about personal data, preferred file formats for sustainability, usage licences and CC0, and discipline-specific requirements. For social science datasets specifically, we expect documentation in the form of a codebook, which describes the variables, the study’s population, the types of data, the sampling procedure, et cetera. More indirectly, DANS supports social scientists as a member of the Consortium of European Social Science Data Archives (CESSDA), whose members collaboratively offer the Data Management Expert Guide. This online guide helps quantitative and qualitative social science researchers to discover, manage and archive data.
What DANS, as provider of a long-term data repository, offers and expects is strongly aligned with the principle that data should be Findable, Accessible, Interoperable, and Reusable, or in short: FAIR. With our partners in the FAIRsFAIR project we help both researchers, through the FAIR-Aware advocacy tool, and repositories, in adopting FAIR good practices. In this spirit we wholeheartedly agree with the RDA COVID-19 Working Group that “Embracing the FAIR agenda is now critical for all social scientists collecting data relating to COVID-19, and future pandemics, in order to ensure maximum benefit from the data. In the current emergency context, it is a moral imperative to preserve the data and share it in the most open way possible for each case.” In practice this can vary from learning about rich metadata standards in other disciplines to adding “COVID-19” as a keyword, a great suggestion by COAR.