Unpublished Research Data
Data sets produced in the course of research but never shared or made available outside of the initial research team. |
||
Digital Species: Research Outputs |
Trend in 2023: Material Improvement |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Practically Extinct |
Imminence of Action Action is recommended within twelve months. Detailed assessment is a priority. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability Loss seems likely. By the time tools or techniques have been developed, the material will likely have been lost. |
Examples Unpublished research data outputs of different kinds, such as unstructured or structured datasets, databases, or other organized collections of computerized information or data such as periodical articles, books, graphics and multimedia. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Originating researcher no longer active or changed research focus; staff on temporary contracts; dependence on single student or staff member; weak or fluid institutional commitment to subject matter; weak institutional commitment to data sharing; uncertainty over IPR or the presence of orphaned works; encryption; limited or dysfunctional data management planning. |
||
‘Endangered’ in the Presence of Good Practice Replication and documentation; data management plan; preservation pathway agreed. |
||
2023 Review This entry was added in 2019 as a subset of the ‘Unpublished Research Outputs’ reported in 2018, which was split into entries to draw attention to the different preservation requirements and concerns that arise. This entry relates specifically to research data which has not been shared or published by any means and is thus in contravention of the ‘FAIR’ principles which require data to be Findable, Accessible, Interoperable and Reusable. Without proper planning, research data can have a high barrier to re-use, especially where documentation is lacking. The 2019 Jury took the view that documentation and re-use go hand in hand, and researchers should be under no illusions that data not documented or shared faces material and immediate risks of extinction. The 2020 Jury agreed with the assessment. The 2021 Jury identified a trend towards reduced risk in light of more robust collaborative initiatives to jointly address the risk of data loss in and across research communities. The 2022 Taskforce identified a trend towards even more reduced risk based on material improvement over the last year (‘Material improvement’ trend), which had not only offered examples of good research data management and preservation practices but also suggested a significant shift toward a culture of change and collaboration across different research communities and stakeholders. Those mentioned included (but were not limited to) improvements and initiatives by the European Open Science Cloud (EOSC), Science Europe, Research Data Alliance (RDA), Digital Curation Centre (DCC) and related projects on the preservation of research data and outputs. The 2023 Council, in light of the trends in 2021 and 2022, changed the classification from Practically Extinct to Critically Endangered, noting a positive trend of increased research data management activity and engagement by libraries, which should help to ensure that more research datasets are properly deposited in data repositories. 
They added that many, if not most, libraries at research-producing HEIs are doing more in terms of research data management, which is becoming a much larger part of what libraries do, with activities in this area growing and scaling up. However, the scale of unpublished datasets is hard to assess, as they are by definition unknown. |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments If we do not know it exists, does it exist? It may also be that in certain circumstances this includes data that is unfavourable and has intentionally not been published. If the data is perceived as high-value, someone in the research team will likely take steps to ensure it is protected. We can be proactive and offer advice, but ultimately it is down to them. We cannot keep everything! This is a wide field, so the scale and impact are hard to describe, but the risk is higher than for papers due to potential file format complexity. Success depends on how effective an institution’s research data management communications are. Advocacy and research are needed to show the scale of the problem, as well as education regarding open science and preservation. Simply having a data management plan (DMP) prepared is not sufficient; it needs to be properly implemented and kept up to date and relevant for both the researcher and the repository which will take a copy of the data. The DMP should be used to appraise what data is worth long-term preservation (e.g. NERC Data Value Check List), and what data is of lower quality, non-reusable, or even a reputational risk should it be shared further. |
Media Art by Deceased Artists or Defunct Workshops
Media art where the artists or creative technicians are either deceased or not able to provide guidance on authenticity and installation. |
||
Digital Species: Media Art |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within three years, detailed assessment within one year. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability It would require a major effort to prevent or reduce losses in this group, including the development of new preservation tools or techniques. |
Examples Works produced by media artists now deceased, such as: Jeremy Blake, Beatriz Da Costa, Heiko Daxl or Stanislaus Ostoja-Kotkowski. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Lack of documentation to enable maintenance; uncertainty over IPR or the presence of orphaned works; complex interdependencies on specific hardware, software or operating systems; lack of capacity in the gallery or workshop; lack of strategic investment; complex external dependencies; loss of institutional memory resulting from staff churn; poor working relationship between the gallery and artist/workshop; lack of conservation assessment. |
||
‘Endangered’ in the Presence of Good Practice Strong documentation; clarity of preservation path and ensuing responsibilities; proven preservation plan; capacity of workshop to support re-installation; capacity of gallery to conserve; capacity of gallery to re-install; retention of institutional memory including archives of correspondence between gallery and artist/workshop; strong and continuing working relationship between the gallery and artist/workshop; regular conservation assessment. |
||
2023 Review This entry was added in 2019 as a subset of the 2017 ‘Media Art’ entry, which was first introduced with particular reference to historical media art, but split by the 2019 Jury to ensure greater specificity in its recommendation. This entry represents works held in galleries where the artist is deceased or the workshop has closed, and there is limited prospect of obtaining new documentation. The 2020 Jury found a trend towards greater risk because galleries, which often rely on visitors for income, had been closed for extended periods amid economic dislocation. The 2021 Jury agreed on a continued trend towards greater risk based on the increasing risk of loss, with greater time sensitivity for early media artworks. The 2023 Council agreed with the Critically Endangered classification with overall risks remaining on the same basis as before (‘No change’ to trend). |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). However, the 2024 Council added that it is important to draw attention to the need for guidance on authenticity and installation. Emulation tools are helping, but the missing guidance on authenticity increases the risk. |
||
Additional Comments This entry includes a point in the lifecycle of all media art, so good practice recommendations are likely to become more important over time. Preservation issues may not become visible until the piece is brought out of storage for loan or exhibition, underscoring the value of continuous or periodic conservation assessment. The range of data/formats/hardware/software etc. can be new and varied, providing organizations with an ongoing technical challenge that they are not initially equipped to deal with. Some loss seems inevitable. Preservation of legacy media artworks is dependent on access to obsolete technology and also the knowledge of how to operate said technology. Documentation around the production process and artist intent can be limited, and it becomes more critical without any access to artists or technicians. This creates risk around the preservation of a truly authentic artwork. Case Studies or Examples:
See also:
|
Grey Literature
Semi-published research outputs such as blogs, dissertations, informal conference papers or commissioned reports which are not formally published but which can contain original and insightful contributions within scholarly communications. This entry covers a wide spectrum of very diverse types of materials which all have different preservation considerations. |
||
Digital Species: Research Outputs |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within three years, detailed assessment within one year. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability Loss seems likely. By the time tools or techniques have been developed, the material will likely have been lost. |
Examples Blogs, technical reports, conference papers, dissertations, commercial research. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Originating researcher no longer active or changed research focus; staff on temporary contracts; dependence on single student or staff member; weak or fluid institutional commitment to subject matter; weak institutional commitment to data sharing; uncertainty over IPR or the presence of orphaned works; encryption; lack of recognition; non-disclosure agreements. |
||
‘Endangered’ in the Presence of Good Practice Use of persistent identifiers; embedded within repository infrastructure; quality assurance. |
||
2023 Review This entry was introduced in 2017 under ‘Research Data,’ though without explicit reference to grey literature. In 2019, the Jury split this entry into a range of contexts for research outputs. This entry represents activities which build towards formal publications and research outputs but which do not typically accumulate in institutional repositories. The 2020 Jury noted a trend towards greater risk because higher education and research institutions faced budget uncertainties, and a number of institutions introduced early severance schemes or put staff on short-term contracts at greater risk of redundancy. While this puts other types of research output at risk, the ad hoc nature of grey literature placed it at greater risk. The 2021 Jury agreed with the Critically Endangered classification but argued the content of grey literature is not entirely unique if it eventually makes its way into published outputs, and they also noted improvements and initiatives towards preservation of semi-published research data and outputs over the last year, which together led to a consensus on a 2021 trend towards reduced risk. The 2022 Taskforce agreed with the 2021 assessment, with risks remaining on the same basis as described (no change to the trend). The 2023 Council agreed with the Critically Endangered classification and that overall risks remained on the same basis as before (‘No change’ to trend), but noted that there will always be an element of risk to materials under this entry due to its semi-official nature. The Council also noted that this entry covers a wide spectrum of material, and all had different preservation considerations. |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments Loss of material like this would be common in the analogue world, but in the digital age, we have the capacity and perhaps something of a responsibility to ensure that it is captured: failing to do so is more of a lost opportunity to extend the available research resource. The ADS’s Grey Literature Library demonstrates what could be done if information architectures are deployed to mirror and extend professional practice. Workflows and policies regarding tagging, collecting and EDRMS may help protect such data into the future. Past materials are almost certainly partially lost. Not all funder-maintained specialist repositories accept grey literature for long-term storage (e.g., UKRI-NERC EDS). Such submissions are redirected to generic open data repositories such as Zenodo, which mint DOIs but do not offer data quality assurance for different data types. See also:
|
Consumer Social Media Free at the Point of Use
Social media platforms free at the point of use with a business model based on reselling user data for consumer behavior and/or advertising analysis, mainly for profit-driven corporations. This entry broadly includes digital content created, shared and hosted on social media platforms as well as current interfaces of social media platforms. |
||
Digital Species: Social Media |
Trend in 2023: Towards even greater risk |
Consensus Decision |
Added to List: 2017 |
Trend in 2024: Towards even greater risk |
Previously: Critically Endangered |
Imminence of Action Immediate action necessary. Where detected, should be stabilized and reported as a matter of urgency. |
Significance of Loss The loss of tools or services within this group would have a global impact. |
Effort to Preserve | Inevitability Loss seems likely. By the time tools or techniques have been developed, the material will likely have been lost. |
Examples Instagram, Facebook, Twitter/X, Pinterest, Yahoo Groups, Truth Social, Reddit, Mumsnet, Sina Weibo, Flickr, Bebo, and legacy BBS. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Lack of preservation capacity in provider; Lack of preservation commitment or incentive from provider; proprietary products or formats, including user interface; poor data protection; inaccessibility to web archiving; political or commercial interference; Lack of offline equivalent; super-abundance; Uncertainty over IPR or the presence of orphaned works; Lossy compression in upload scripts. |
||
‘Endangered’ in the Presence of Good Practice Offline backup and documentation of media assets; Migration plan; Early warning from vendors; Roadmap from vendors; Accessible to web harvest; Suitable export functionality; Licencing enables preservation; Preservation commitment from vendor; Preservation capability in vendor; Resilient to hacking; Selection criteria. |
||
2023 Review This entry was added by the 2019 Jury as a subset of a broader social media entry first introduced in 2017. It was created as a standalone entry to draw attention to the different threats faced by online services that are paid for versus ‘free at the point of use’ (both depend on the business model of the vendor and the terms they impose). The 2021 Jury raised the risk classification from Endangered to Critically Endangered based on concerns arising from trends towards harmful and malicious hate speech as well as misinformation and deliberate deletion. The 2022 Taskforce agreed on a trend towards even greater risk based on the continued, significant trend towards hate speech, misinformation and disinformation, and deliberate deletion in light of ongoing global conflicts that include (but are not limited to) social and economic inequalities and climate change. In particular, they mentioned the sale of Twitter prompting a moment of instability in consumer social media, with the scale of Twitter, evident acrimony between parties prior to the sale and the hostile news coverage afterward, elevating the risks associated with social media. They also drew attention to issues surrounding platforms enabling extreme views not permitted on mainstream platforms, which emerged and proliferated noticeably and which, from a preservation standpoint, could be argued to be both at very high risk and historically significant. Based on the assessment of the rescoped entry, the 2023 Council agreed on the Critically Endangered classification and noted an increase in the imminence of action required as well as the effort to preserve. The need for major efforts to prevent or reduce losses continues, but it is now much more likely that loss of material has already occurred and will continue to occur before tools or techniques have been developed. 
There is a greater urgency to prioritize the assessment of these materials and develop tools or techniques to prevent or reduce further losses in this group. The 2023 Council recommended further rescoping and adjusting of this and other social media entries in light of how web-based and cloud-based business products and services have developed in recent years. This included:
The 2023 Bit List Council additionally recommended that the next major review for the Bit List include:
|
||
2024 Interim Review The 2024 Council identified a trend towards even greater risk due to a number of factors, summarized below. Creators and archivists relying on consumer social media free at the point of use inhabit a precarious position. Free services may be favored by agencies or individual creators who are least able to respond to closure or loss because of the low barrier to entry associated with ‘free at the point of use’ services. Proprietary interfaces and services pose risks, as companies prevent third-party attempts to preserve either hosted content and/or the end-user experience of the environment. An inability to preserve social media interfaces diminishes future potential for emulation and may inhibit researchers' ability to glean important context, as described in the Bit List 2023 review. Additional barriers to preservation via web capture are also present in terms of service for user accounts that explicitly prohibit crawling. For example, the X Terms of Service state “You may not access the Services in any way other than through the currently available, published interfaces that we provide. For example, this means that you cannot scrape the Services, try to work around any technical limitations we impose, or otherwise attempt to disrupt the operation of the Services” and “crawling or scraping the Services in any form, for any purpose without our prior written consent is expressly prohibited” (X, 2023). Another example, from the Facebook Terms of Service, states “You may not access or collect data from our Products using automated means” (Facebook, 2022). An additional recommendation for the next 2025 review is to assess if ‘proprietary formats’ (e.g. the platform interfaces) adequately demonstrates the scope of this entry and answers the first bullet point of the 2023 Council recommendation. The 2023 recommendations for re-scoping and combining entries will also be assessed in more detail in 2025. 
2024 Council members also raised concerns regarding Artificial Intelligence and Machine Learning, noting that for this entry and, more broadly, anything related to Social Media, an emerging risk is AI training fears. This manifests in two ways:
|
||
Additional Comments Social media free-at-the-point-of-use remains at a critical risk due in large part to the policies of unregulated (or underregulated) corporate platforms such as Facebook, X (previously Twitter), and their parent companies. The content shared on these platforms and the history of the development of platform infrastructure and policy itself provide a critical source of information for policy-makers and researchers. The complete lack of preservation provision and deliberate obstruction of archiving attempts for public interest puts this valuable content at high risk of loss and draws attention to the critical risk posed by these examples of platforms. Content hosted on social media platforms (that users might not have stored elsewhere) is at risk and users may lose the opportunity to keep their own data for personal archiving or to donate to an organization. Collecting organizations may lose the opportunity to archive hosted content within their collecting remit using web or API harvesting tools. In both instances, data remains at high risk because it is hosted by companies that could change policies or access on a whim. Another barrier is the inability to archive even free content unless the archivist has a login (as with Browsertrix). Additionally, there are social media companies requiring payment to access data for preservation. The interfaces of social media platforms, which researchers may want to see in order to study the evolution of the platforms over time (typically through web harvesting), are also at risk. Preservation is affected by researcher API access being shut down, halting preservation of entire platforms. There are also differences between the themes/collecting policies of institutions and researchers who are scraping their own data and depositing it in repositories. Preserving this material en masse is still incredibly difficult, but many of these platforms allow users to download their own personal content/archives. 
However, these lose all the context of social media and therefore, whilst they do preserve the data, they do not preserve the essence of the material. Platforms like X (previously Twitter) have both opened and closed their API further in recent years, but others like Yahoo have closed, and Facebook, as well as X (previously Twitter), continue to be almost hostile towards archiving and preservation attempts. With digital materials from premium or institutional social media services, the business model and sustainability are more obvious, and contracts may be enforceable more readily. Moreover, because these services have a slightly higher barrier to entry, they may be favored by agencies that are better able to respond to closure or loss. Traditional web archiving can be employed where the user pays for a service, but the content is ultimately publicly available (such as Flickr). But much is unclear about how to preserve internal social media / closed networks that web archiving cannot get to, or that existing tools do not cover. Social media capture via web harvesting has become increasingly difficult. Social media platforms have done nothing to address the barriers to automated capture that prevent the preservation of even so-called public content. Examples include campaign websites or other election-related content that is only published on Facebook or on X (previously Twitter) because these services are ‘free.’ This content is of particular concern as it appears on no other website. Web archivists are constantly shifting strategies and approaches and trying out new (but limited) tools to best capture this content. If we cannot successfully preserve these platforms, we are missing out on documenting organizations, campaigns and elections around the globe. Much of this data exists as data sets based on aggregated use rather than individual files. 
Often these are external proprietary platforms bound by intellectual property law and potentially privacy law which will impede the imminence of action. What recourse do archives or digital repositories have to deal with this and capture the materials? Case Studies or Examples:
See also:
|
Always Online Games
Video games that are required to be continuously online. Gameplay is referenced here particularly as a means of participation, along with social media and in-game interaction between players. This can include Massively Multiplayer Online games and single player games with always-on DRM. |
||
Digital Species: Gaming |
Trend in 2023: Towards even greater risk |
Consensus Decision |
Added to List: 2019 (rescoped 2023) |
Trend in 2024: Towards even greater risk |
Previously: Critically Endangered |
Imminence of Action Action is recommended within three years, detailed assessment within one year. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability Loss seems likely: by the time tools or techniques have been developed the material will likely have been lost. |
Examples Fortnite, World of Warcraft, Neverwinter, League of Legends. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Lack of skills, commitment or policy from corporate owners; uncertainty over IPR or the presence of orphaned works; controversies around IPR; lack of offline backup; changing business model of providers; limited recognition of value of game play; limited recognition of value of game preservation; over dependence on goodwill of ad-hoc community; lack of preservation know-how at service providers; dependency on bespoke hardware or interfaces; increased reliance on always-on DRM for single player games. |
||
‘Endangered’ in the Presence of Good Practice Well documented code; IPR supportive of preservation; large and committed user community; removal of always-on DRM for single player games. |
||
2023 Review This entry was added in 2019 as a subset of the 2017 entry for ‘gaming’. The 2020 and 2021 Juries noted a trend towards greater risk, due to the increased significance of these games during the COVID Pandemic as well as the evolving nature of MMOs, to the extent that the 2021 Jury changed the risk classification from Endangered to Critically Endangered. The 2023 Council agreed with the 2022 Taskforce suggestion to consider the naming and scope of the entry, rescoping this entry to ‘Always Online Games’ covering all games that have to always be online, whether that is due to being MMOs, server-based games or single-player games with Always-Online DRM. Games that have online components but are not required to always be online fit into the new ‘Games with Online Play Components’ entry. |
||
2024 Interim Review The 2024 Council identified a trend towards even greater risk based on shifts in business models and increased litigation over the last year, resulting in more shutdowns which impact preservation efforts. This also raises the time sensitivity for action: if no new preservation efforts are made and existing ones are shut down, the likelihood of loss increases. |
||
Additional Comments Preservation of Always Online games in a playable state requires preservation or re-creation of the servers that are used to run these games. Even then, for MMOs or multiplayer games, it would be impossible to recreate these games at their various peaks. Re-created games will never have the same configuration of subscribers, to say nothing of the innumerable changes made to the software over the years, which have significantly altered how the game works and looks. This nicely encapsulates why video recordings of (online) gameplay are important. Loss is inevitable, and it has already happened. The social and cultural aspects of play are incredibly important, and on-screen recording is the most robust way to capture that. Whilst it is expected that MMOs and always multiplayer games (such as Fortnite) would always require an internet connection due to their reliance on servers, single player games, or those where the primary gameplay is single player, being always online due to DRM provides an added risk to preservation. If the server shuts down, then even the single player components might not be playable, so loss happens faster than for a single player game that does not rely on servers. For more details, see the ‘Shut Down or Discontinued Video Games’ entry. Case Studies or Examples:
See also:
|
Open Source Intelligence Sources of Current Conflicts
Open source intelligence produced, collected and analysed from publicly, openly available social media and web content with the purpose of answering a specific intelligence question and that supports crowd-sourced investigation and fact-checking to verify or refute claims of state agencies and rebel groups in the context of current political or military conflict. |
||
Digital Species: Legal Data |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within twelve months, detailed assessment is a priority. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques. |
Examples Social media sources relating to current conflicts, such as in Yemen or Syria. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Loss of authenticity; lack of preservation agency; limited or no digital preservation capability; Uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Offline backup captured by the journalist or investigating authority. |
||
2023 Review This entry was added as a subset in 2019, as part of a broader ‘Open Source Intelligence Sources’ entry which the Jury split into three elements, relating to current, recent and historic sources. This entry relates in particular to materials relating to current and ongoing conflicts. Social media companies have a policy to take down or suppress content that they consider to be propaganda for terrorist groups. This has had the unintended consequence of deleting or suppressing content that was being used in open source investigation or fact-checking for journalistic or judicial purposes, and which may therefore be an impediment to refutation or prosecution. However, a new generation of cloud-based services, such as Hunchly, has emerged in the last few years, allowing investigators to copy and stabilize content to private accounts in the process of investigating it, so that the ethical requirements of social media companies and the integrity of the investigation are both served. The 2021 Jury noted that such content stays at risk, and the process of investigation is slower than algorithmic deletion. Nonetheless, there is a notable difference between the investigation of current conflicts and that of historic ones where evidence has been lost. The 2022 Taskforce identified a trend towards even greater risk based on the increased significance of crowd-sourced investigations and fact-checking in light of ongoing global conflicts that include (but are not limited to) those in Ukraine. The 2023 Council agreed with the Critically Endangered classification with the overall risks remaining on the same basis as before (‘No change’ to trend). The 2023 Council also clarified the meaning of ‘open source’ for this entry, explaining that it refers to intelligence openly available online, and noting that open source can also refer to a specific software or content licence that permits limited uses of IP, so this distinction would be helpful for readers. |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). The Council acknowledge the continuing challenge of ensuring the preservation of complete and accurate resources given that: platform owners continue to be obliged to remove content that violates community standards; copyright and ownership increasingly hinder capture and preservation of the open source materials; and, with the rise in fakes, preservationists must attend to standards for legal admissibility and authentication, which vary from one jurisdiction to another. |
||
Additional Comments Preservation is important for social context, and this content may be picked up inadvertently in other ways, but it is ambiguous who has ultimate responsibility for collecting and preserving it. Case Studies or Examples:
See also:
|
Politically Sensitive Data
Digital content where the knowledge to preserve exists, and there is no threat of obsolescence, but where political interests may be served by elimination, falsification or concealment. |
||
Digital Species: Political Data |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2017 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Immediate action necessary. Where detected should be stabilized and reported as a matter of urgency. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques. |
Examples Online News; social media and web-based campaigning; social media relating to 2016 UK/EU referendum; Promises made in Scottish independence referendum 2014; US Environmental Data; UK Public Finance Initiative (PFI) documents; Recordings of Leinster House. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Opaque terms and conditions that facilitate deletion or obfuscation; lack of access to web-harvesting; significant lobby interest; change of administration; data resides in single jurisdiction; reputational risk to collecting institution; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Robust political archives; robust preservation services for investigative journalists. |
||
2023 Review This entry was added in 2017 with additional comment and contextualization offered by the 2019 Jury. The 2019 Jury agreed that the nature and extent of political campaigning online continue to become more apparent, drawing attention to the manipulation of digital media but not explicitly the issue of deliberate deletion, alteration or concealment. They further noted that GDPR provides a pretext for the disposal of records, and that the increased capability of archives to secure the content from outgoing governments and ministers was a source of encouragement. Nonetheless, they pointed to a pressing need for a deep and comprehensive assessment of the risks faced by politically sensitive data and the impact which such deletions have on the public good. The 2020 Jury noted a trend towards greater risk, 2020 having been a year of significant political and economic upheaval, in part because of the pandemic though also because of popular protest and the outcomes of elections around the world. Moreover, it had been widely reported that senior officials in government had avoided scrutiny and record-keeping laws by using self-deleting messaging applications. The 2021 and 2022 reviews also identified trends towards greater risk based on the continuation and increase of significant political and economic upheaval, and again noted the widely reported use of self-deleting messaging applications by senior officials. In those circumstances, politically sensitive records were likely to be at greater risk. The 2022 Taskforce agreed, and noted the significance of elimination, falsification or concealment in light of political upheaval, social and economic inequalities and climate change. 
The case of political upheaval and protest in Iran had further amplified the risks, and anonymous digital art and social media activism had burgeoned in response to gendered violence and acts of political repression in the latter half of the year. However, preservation infrastructures, such as national libraries and collecting archives within universities, are conflicted and therefore unlikely, unable or unwilling to preserve content that is explicitly and radically critical of the regime. For those reasons there was a 2022 trend toward even greater risk. The 2023 Council agreed with the Critically Endangered classification with overall risks remaining on the same basis as before (‘No change’ to trend). They also discussed abuses of GDPR, which can be used to block access to public records and political data. The existence of ‘special category data’ under GDPR is used to justify denying access even to people’s own data. These justifications usually do not reflect how GDPR actually works, but they are used as a way to shut down such challenges. |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments There is a question of whether it is the duty of archives/libraries to preserve the falsification itself, or instead to preserve the constituent pieces that allow researchers to infer elimination, falsification or concealment. See also:
|
Records of Local Government
Records from local government (i.e., below the state level) which are required for transparency and may be in many diverse forms, but in which the local authority may lack the capacity to manage the complex digital preservation requirements that arise. |
||
Digital Species: Public Records |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within twelve months. Detailed assessment is a priority. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques. |
Examples Born digital records of small and medium-sized agencies; fast-changing internal manuals, advice or policies shared electronically; records of care services; Documentation supporting long-lived contractual relations like Public Finance Initiatives; Organizational Slack channels; network drives; EDRMS; Email. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Lack of preservation infrastructure; conflation of backup with preservation; loss of authenticity or integrity; long-lived business processes; poor storage; churn of staff; significant volumes or diversity of data; poorly developed digitization; ill-informed records management; poorly developed migration or normalization; long-standing protocols or procedures that apply unsuitable paper processes to digital materials; encryption; political instability; lack of sustained funding; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Well managed data infrastructure; preservation enabled at the point of creation; carefully managed authenticity; use of persistent identifiers; finding aids; well managed records management processes; recognition of preservation requirements; strategic investment in digital preservation; preservation roadmap; participation in digital preservation community. |
||
2023 Review This entry was added in 2019 as a subset of a previous entry for ‘Records of long duration from Local Government or Other Government Agencies.’ The split was intended to allow greater concentration on the challenges that these distinct types of agency face. Local government typically operates across a broad range of digital formats and services, but it is unclear, and unlikely, that relatively small archival agencies are properly funded locally to support the wide range of digital preservation requirements that arise. The 2020 Jury noted a trend towards greater risk based on significant political and economic upheaval placing additional strain on local government and its agencies, and placing already vulnerable records at greater risk. Trends towards greater risk were also noted by the 2021 Jury and 2022 Taskforce, who contributed examples like Grenfell to demonstrate the precarity of local government records, especially where these risks overlap with those of records of non-governmental agencies. Such cases underline the significance and impact of loss, the impetus for action, and the call on governing frameworks where enforcement is failing (depending on the jurisdiction). The 2023 Council generally agreed with the Critically Endangered classification with the overall risks remaining on the same basis as before (‘No change’ to trend). The 2023 Council additionally recommended revisiting and rescoping this entry as part of the next major revision of the Bit List. Some Council members recommended splitting this entry into separate entries to differentiate the various risks associated with different types of digital public records. Others raised concerns regarding the breadth of records held by local government, and suggested that it is perhaps not appropriate to have a distinct entry or split entries for records of local governments but rather to provide examples of different kinds of public records in and across other entries. |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments The diversity of 'local government records' makes this category quite difficult to score. First, local governments have differing responsibilities in different jurisdictions. For example, local governments in the UK have more responsibilities than those in Australia. Also, given the number of local government agencies in a state or country, the quality of recordkeeping and digital preservation practice can vary greatly. Additionally, the variety of records that are created by local governments means that some formats or record types may be generally at low risk, while others may be practically extinct. Given this complexity it is important to make clear that the imminence of action, significance of loss, and effort to preserve are context-dependent and generalized. The main factors that reduce risk for these records are that local government is regulated, and that there are clear recordkeeping standards that apply to digital records. Local governments also have consistent funding (although it may not be enough and may not be directed at digital preservation). We feel that due to the breadth of records held by local governments, it is perhaps not appropriate for them to have a distinct record series, but rather to be a featured example of other series. This approach would still assist in advocacy for local government as they would be able to cross-reference their digital holdings against these classifications. Significant research by the UK National Archives into Local Government Archives in England underlines the digital skills shortages that exist, especially with respect to preservation. There may be a benefit from splitting into a) legally required public records and b) additional information that may enrich our digital preservation of society. My assumption was that the roles and requirements for records management are clearly defined, but if this is not the case and there are inadequate resources to match the requirement, then the risk goes up. Case Studies or Examples:
|
Records of Non-Governmental Agencies
Records of independent agencies and contractors that act on behalf of the state in the delivery of public services, and which may be present in many diverse forms, but for which the NGO or contractors may lack the capacity to meet the complex digital preservation requirements that arise, or may have a business motive to minimize or ignore requirements for the maintenance of the record. |
||
Digital Species: Public Records |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within twelve months. Detailed assessment is a priority. |
Significance of Loss The loss of tools, data or services within this group would impact on many people and sectors. |
Effort to Preserve | Inevitability It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques. |
Examples Born digital records of small and medium-sized agencies; fast-changing internal manuals, advice or policies shared on intranets or EDRMS; records of care services; historic guidelines and manuals which evidence 'best practice at the time'; Documentation supporting long-lived contractual relations like Public Finance Initiatives; Organizational Slack channels; network drives; EDRMS; Email. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Lack of preservation infrastructure; conflation of backup with preservation; loss of authenticity or integrity; long-lived business processes; poor storage; churn of staff; significant volumes or diversity of data; poorly developed digitization specifications; ill-informed records management; poorly developed migration or normalization specifications; long-standing protocols or procedures that apply unsuitable paper processes to digital materials; encryption; political instability; lack of sustained funding; denial of responsibility; failure to include archives within contract from commissioning agency; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Well managed data infrastructure; preservation enabled at the point of creation; carefully managed authenticity; use of persistent identifiers; finding aids; well managed records management processes; application of records management standards; recognition of preservation requirements at highest levels; strategic investment in digital preservation; transfer protocols to public archive; participation in digital preservation community. |
||
2023 Review This entry was added in 2019 as a subset of a previous entry for ‘Records of long duration from Local Government or Other Government Agencies.’ The split was intended to allow greater concentration on the challenges that these different types of agencies face. Non-governmental organizations typically operate across a broad range of digital formats and services, acting on behalf of the public sector. The 2020 Jury noted a trend towards greater risk based on 2020 being a year of significant political and economic upheaval, putting additional strain on NGOs and placing already vulnerable records at greater risk. Trends towards greater risk were also noted by the 2021 Jury and 2022 Taskforce, who contributed examples like Grenfell to demonstrate the precarity of the records of non-governmental agencies, especially where these risks overlap with those of local government. Such cases underline the significance and impact of loss, the impetus for action, and the call on governing frameworks that are failing in enforcement for these agencies (e.g., by examining the current recordkeeping regimes that keep them accountable). The 2023 Council generally agreed with the Critically Endangered classification, with the overall risks remaining on the same basis as before (‘No change’ to trend). The 2023 Council additionally recommended revisiting and rescoping this entry as part of the next major revision of the Bit List. Some Council members recommended splitting this entry into separate entries to differentiate the various risks associated with different types of digital non-governmental records. Others suggested that it is perhaps not appropriate to have a distinct entry or split entries for records of non-governmental agencies but rather to provide examples of different kinds of these digital materials in and across other entries. |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments There is a large variation in the types of records held by NGOs. Additionally, the quality of digital preservation performed by NGOs can vary widely. Therefore, the same approach to scoring was taken for this entry as the one above. We consider records of NGOs to be at greater risk due to there being less regulation, and the regulations that exist being less stringently enforced. An additional risk factor for these records is a blurring of the lines of responsibility, which can lead to records 'falling through gaps', or to difficulties funding digital preservation practice. This can be further complicated by outdated legislation which does not take into account the complexity of privatisation and public/private partnerships. For example, the legislation that PROV operates under is 50 years old. This, in turn, can lead to regulation and enforcement being more complex than it is for government agencies. Case Studies or Examples:
See also:
|
Records of Quasi Non-Governmental Agencies
Records from agencies at arm's length from government, whether locally, nationally or internationally. They may be required to maintain archives for the purposes of transparency, sometimes for extended periods, and sometimes in diverse and complicated forms. |
||
Digital Species: Public Records |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within twelve months. Detailed assessment is a priority. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques. |
Examples Records of non-executive state or national agencies; museum or leisure trusts; industry or public regulators; public audit services; public-good funding and investment agencies; autonomous and semi-autonomous public agencies; sovereign wealth funds; public/private partnerships; publicly owned companies. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Lack of preservation infrastructure; conflation of backup with preservation; loss of authenticity or integrity; long-lived business processes; poor storage; churn of staff; significant volumes or diversity of data; poorly developed digitization specifications; ill-informed records management; poorly developed migration or normalization specifications; long-standing protocols or procedures that apply unsuitable paper processes to digital materials; encryption; political instability; lack of sustained funding; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Well-managed data infrastructure; preservation enabled at the point of creation; carefully managed authenticity; use of persistent identifiers; finding aids; well-managed records management processes; application of records management standards; recognition of preservation requirements at highest levels; strategic investment in digital preservation; preservation roadmap; participation in digital preservation community. |
||
2023 Review This entry was added in 2019 as a subset of a previous entry for ‘Records of long duration from Local Government or Other Government Agencies.’ The split was intended to allow greater concentration on the challenges that these different types of agencies face. Quasi non-governmental agencies operate at arm’s length from government, but the ‘QuaNGO’ or ‘ALEO’ (Arm’s-Length Executive Organization) may lack the capacity to meet the complex digital preservation requirements that arise, and may be unable to deposit in the government archive. The 2021 Jury added that arm's length bodies are still public bodies, and they have a duty of care for maintaining evidence of their actions and transactions. They often receive public funding and, depending on the archives legislation, may be required to transfer records to an archive. The issue arises when there is a lack of clarity regarding the recordkeeping requirements, or neglect of records and information once they have outlived their usefulness. These bodies still create records that affect citizens' lives and have a duty to document; the 2021 Jury therefore changed the classification from Endangered to Critically Endangered. The 2021 Jury and 2022 Taskforce noted a trend towards greater risk when looking at the precarity of records in QuaNGO agencies in periods of significant political and economic upheaval, which place greater strain on the funding that supports preservation capacity. The 2023 Council generally agreed with the Critically Endangered classification with the overall risks remaining on the same basis as before (‘No change’ to trend). The 2023 Council additionally recommended revisiting and rescoping this entry as part of the next major revision of the Bit List. Some recommended splitting this entry into separate entries to differentiate the various risks associated with different types of digital records of quasi non-governmental agencies. 
Others suggested that it is perhaps not appropriate to have a distinct entry or split entries but rather provide examples of different kinds of these digital materials in and across other entries. |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments There is a large variation in the types of records held by QuaNGOs and/or ALEOs. Additionally, the quality of digital preservation performed can vary widely. Therefore, the same approach to scoring was taken for this entry as the one above. As with NGOs, we consider these records to be at greater risk due to there being less regulation, and the regulations that exist being less stringently enforced. An additional risk factor for these records is a blurring of the lines of responsibility, which can lead to records 'falling through gaps', or to difficulties funding digital preservation practice. This can be further complicated by outdated legislation which does not take into account the complexity of privatisation and public/private partnerships. Although the split draws attention to the different pressures faced by QuaNGOs, it could be further subdivided into legally required public records and additional information that may enrich our digital preservation of society. The classification assumes that the roles and requirements for records management are clearly defined, but if this is not the case or there are inadequate resources to match the requirements, then the risk goes up. While the 2022 trend shows increases in risk, there are some green shoots of hope in Ireland, found when working actively with the agencies and communicating some of the concerns about their data, so that there is better awareness which will hopefully turn into action. See also:
|