Unpublished Research Data

Practically Extinct

Data sets produced in the course of research but never shared or made available outside of the initial research team.

Digital Species: Research Outputs

Trend in 2023:

Material Improvement (reduced risk)

Consensus Decision

Added to List: 2019

Trend in 2024:

No Change

Previously: Practically Extinct

Imminence of Action

Action is recommended within twelve months. Detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

Loss seems likely. By the time tools or techniques have been developed, the material will likely have been lost.

Examples

Unpublished research data can include many kinds of output, such as unstructured or structured datasets, databases, or other organized collections of computerized information or data such as periodical articles, books, graphics and multimedia.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Originating researcher no longer active or changed research focus; staff on temporary contracts; dependence on single student or staff member; weak or fluid institutional commitment to subject matter; weak institutional commitment to data sharing; uncertainty over IPR or the presence of orphaned works; encryption; limited or dysfunctional data management planning.

‘Endangered’ in the Presence of Good Practice

Replication and documentation; data management plan; preservation pathway agreed.

2023 Review

This entry was added in 2019 as a subset of the ‘Unpublished Research Outputs’ reported in 2018, which was split into entries to draw attention to the different preservation requirements and concerns that arise. This entry relates specifically to research data which has not been shared or published by any means and is thus in contravention of the ‘FAIR’ principles which require data to be Findable, Accessible, Interoperable and Reusable. Without proper planning, research data can have a high barrier to re-use, especially where documentation is lacking. The 2019 Jury took the view that documentation and re-use go hand in hand, and researchers should be under no illusions that data not documented or shared faces material and immediate risks of extinction. The 2020 Jury agreed with the assessment. The 2021 Jury identified a trend towards reduced risk in light of more robust collaborative initiatives to jointly address the risk of data loss in and across research communities. The 2022 Taskforce identified a trend towards even more reduced risk based on material improvement over the last year (‘Material improvement’ trend), which had not only offered examples of good research data management and preservation practices but also suggested a significant shift toward a culture of change and collaboration across different research communities and stakeholders. Those mentioned included (but were not limited to) improvements and initiatives by the European Open Science Cloud (EOSC), Science Europe, Research Data Alliance (RDA), Digital Curation Centre (DCC) and related projects on the preservation of research data and outputs.

The 2023 Council, in light of the trends in 2021 and 2022, changed the classification from Practically Extinct to Critically Endangered, noting a positive trend of increased research data management activity and engagement by libraries, which should help to ensure that more research datasets are properly deposited in data repositories. They added that many, if not most, libraries at research-producing higher education institutions (HEIs) were doing more in terms of research data management, which was becoming a much larger part of what libraries do, with activities in this area growing and scaling up. However, the scale of unpublished datasets is hard to assess, as they are by definition unknown.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

If we do not know it exists, does it exist? It may also be that in certain circumstances this includes data that is unfavourable and has intentionally not been published. If perceived as high-value, someone in the research team will likely take steps to ensure it is protected. We can be proactive and offer advice, but ultimately it is down to them. We cannot keep everything!

This is a wide field, so the scale and impact are hard to describe, but the risk is higher than for papers due to potential file format complexity.

Success depends on how effective an institution’s research data management communications are. Advocacy and research are needed to show the scale of the problem, as well as education regarding open science and preservation.

Simply having a data management plan (DMP) prepared is not sufficient; it needs to be properly implemented and kept up to date and relevant for both the researcher and the repository which will take a copy of the data. The DMP should be used to appraise which data are worth long-term preservation (e.g. using the NERC Data Value Check List) and which data are of lower quality, non-reusable, or even a reputational risk should they be shared further.

Media Art by Deceased Artists or Defunct Workshops

Critically Endangered

Media art where the artists or creative technicians are either deceased or not able to provide guidance on authenticity and installation.

Digital Species: Media Art

Trend in 2023:

No Change

Consensus Decision

Added to List: 2019

Trend in 2024:

No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, including the development of new preservation tools or techniques.

Examples

Works produced by media artists now deceased, such as: Jeremy Blake, Beatriz Da Costa, Heiko Daxl or Stanislaus Ostoja-Kotkowski.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of documentation to enable maintenance; uncertainty over IPR or the presence of orphaned works; complex interdependencies on specific hardware, software or operating systems; lack of capacity in the gallery or workshop; lack of strategic investment; complex external dependencies; loss of institutional memory resulting from staff churn; poor working relationship between the gallery and artist/workshop; lack of conservation assessment.

‘Endangered’ in the Presence of Good Practice

Strong documentation; clarity of preservation path and ensuing responsibilities; proven preservation plan; capacity of workshop to support re-installation; capacity of gallery to conserve; capacity of gallery to re-install; retention of institutional memory including archives of correspondence between gallery and artist/workshop; strong and continuing working relationship between the gallery and artist/workshop; regular conservation assessment.

2023 Review

This entry was added in 2019 as a subset of the 2017 ‘Media Art’ entry, which was first introduced with particular reference to historical media art but was split by the 2019 Jury to ensure greater specificity in its recommendations. This entry represents works held in galleries where the artist is deceased or the workshop has closed, and there is limited prospect of obtaining new documentation. The 2020 Jury found a trend towards greater risk because galleries, which often rely on visitors for income, had been closed for extended periods amid economic dislocation. The 2021 Jury agreed on a continued trend towards greater risk based on the growing likelihood of loss, with early media artworks being especially time-sensitive.

The 2023 Council agreed with the Critically Endangered classification with overall risks remaining on the same basis as before (‘No change’ to trend).

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). However, the 2024 Council added that it is important to draw attention to the key importance of providing guidance on authenticity and installation. Emulation tools are helping, but missing guidance on authenticity increases the risk.

Additional Comments

This entry includes a point in the lifecycle of all media art, so good practice recommendations are likely to become more important over time. Preservation issues may not become visible until the piece is brought out of storage for loan or exhibition, underscoring the value of continuous or periodic conservation assessment. The range of data/formats/hardware/software etc. can be new and varied, providing organizations with an ongoing technical challenge that they are not initially equipped to deal with. Some loss seems inevitable.

Preservation of legacy media artworks is dependent on access to obsolete technology and also the knowledge of how to operate said technology. Documentation around the production process and artist intent can be limited, and becomes more critical when there is no access to artists or technicians. This creates risk around the preservation of a truly authentic artwork.

Case Studies or Examples:

  • Resources and outputs from the Preserving and Sharing Born Digital and Hybrid Objects From and Across The National Collection project. See V&A Research Projects (n.d.) ‘Preserving and Sharing Born Digital and Hybrid Objects’. Available at: https://www.vam.ac.uk/research/projects/preserving-and-sharing-born-digital-and-hybrid-objects [accessed 24 October 2023].

  • This includes decision model work around acquisition of complex collections such as born digital and hybrid art. See Ensom, T. and McConnachie, S. (2022) ‘Preserving and sharing born-digital and hybrid objects from and across the National Collection’, Decision Model Report: March 2022. Available at: http://doi.org/10.5281/zenodo.7097489

See also:

  • The DPC ‘Preserving Digital Art’ Technology Watch Guidance Note is aimed at institutions starting to collect digital art as part of a wider collecting remit. It offers basic guidance on the specificities of digital art and how it may differ from other digital content in an institution’s care. See: Falcão, P. (2024) ‘Preserving Digital Art’, DPC Technology Watch Guidance Note 24-02. Available at: http://doi.org/10.7207/twgn24-02

  • NEW MEDIA MUSEUMS: Creating Framework for Preserving and Collecting Media Arts in V4, initiated by the Olomouc Museum of Art as a joint international platform for sharing experience with building and maintaining collections of new media artworks across different types of institutions. The aim of the project is to find workable methods for heritage institutions to build and maintain collections of media arts, which are necessary for safeguarding this area for the benefit of society. See Central European Art Database (2021) ‘NEW MEDIA MUSEUMS: Creating Framework for Preserving and Collecting Media Arts in V4’. Available at: http://cead.space/Detail/projects/3797 [accessed 24 October 2023].

  • The LIMA project ‘Collaborative infrastructure for sustainable access to digital art’ aims to prevent the loss of digital artworks and to jointly develop the knowledge needed to preserve these works in a sustainable way. The project ‘Infrastructure sustainable accessibility digital art’ invests in research, training, knowledge sharing and conservation to prevent the loss of both digital artworks and the knowledge to preserve them. See LIMA (n.d.) ‘Collaborative infrastructure for sustainable access to digital art’. Available at: https://www.li-ma.nl/lima/article/collaborative-infrastructure-sustainable-access-digital-art [accessed 24 October 2023].

  • Ellis, T. (2023) ‘Saving Stan: Preserving the Digital Artwork of Joseph Stanislaus Ostoja-Kotkowski’, iPRES 2023 Conference, Urbana-Champaign, Illinois, USA, 19–22 September.

Grey Literature

Critically Endangered

Semi-published research outputs such as blogs, dissertations, informal conference papers or commissioned reports which are not formally published but which can contain original and insightful contributions within scholarly communications. This entry covers a wide spectrum of very diverse types of materials which all have different preservation considerations.

Digital Species: Research Outputs

Trend in 2023:

No Change

Consensus Decision

Added to List: 2019

Trend in 2024:

No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

Loss seems likely. By the time tools or techniques have been developed, the material will likely have been lost.

Examples

Blogs, technical reports, conference papers, dissertations, commercial research.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Originating researcher no longer active or changed research focus; staff on temporary contracts; dependence on single student or staff member; weak or fluid institutional commitment to subject matter; weak institutional commitment to data sharing; uncertainty over IPR or the presence of orphaned works; encryption; lack of recognition; non-disclosure agreements.

‘Endangered’ in the Presence of Good Practice

Use of persistent identifiers; embedded within repository infrastructure; quality assurance.

2023 Review

This entry was introduced in 2017 under ‘Research Data,’ though without explicit reference to grey literature. In 2019, the Jury split this entry into a range of contexts for research outputs. This entry represents activities which build towards formal publications and research outputs but which do not typically accumulate in institutional repositories. The 2020 Jury noted a trend towards greater risk because higher education and research institutions faced budget uncertainties, and a number of institutions introduced early severance schemes or put staff on short-term contracts at greater risk of redundancy. While this puts other types of research output at risk, the ad hoc nature of grey literature placed it at greater risk. The 2021 Jury agreed with the Critically Endangered classification but argued the content of grey literature is not entirely unique if it eventually makes its way into published outputs, and they also noted improvements and initiatives towards preservation of semi-published research data and outputs over the last year, which together led to a consensus on a 2021 trend towards reduced risk. The 2022 Taskforce agreed with the 2021 assessment, with risks remaining on the same basis as described (no change to the trend).

The 2023 Council agreed with the Critically Endangered classification and that overall risks remained on the same basis as before (‘No change’ to trend), but noted that there will always be an element of risk to materials under this entry due to their semi-official nature. The Council also noted that this entry covers a wide spectrum of material, all with different preservation considerations.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

Loss of material like this would be common in the analogue world, but in the digital age we have the capacity, and perhaps something of a responsibility, to ensure that it is captured; failing to do so is an opportunity lost to extend the available research resource. The ADS’s Grey Literature Library demonstrates what could be done if information architectures are deployed to mirror and extend professional practice.

Workflows and policies regarding tagging, collecting and EDRMS may help protect such data into the future. Past materials are almost certainly partially lost.

Not all funder-maintained specialist repositories accept grey literature for long-term storage (e.g., UKRI-NERC EDS). Such material is redirected to generic open data repositories such as Zenodo, which mint DOIs but do not offer data quality assurance for different data types.

See also:

  • Policy Commons has a mission to index and preserve grey literature from IGOs, NGOs, think tanks and governments and has, to date, indexed and preserved around 4 million items from c.11,000 institutions across the world. See Policy Commons (n.d.) Available at: https://policycommons.net/ [accessed 24 October 2023]

Consumer Social Media Free at the Point of Use

Critically Endangered

Social media platforms free at the point of use with a business model based on reselling user data for consumer behavior and/or advertising analysis, mainly for profit-driven corporations. This entry broadly includes digital content created, shared and hosted on social media platforms as well as current interfaces of social media platforms.

Digital Species: Social Media

Trend in 2023:

Towards even greater risk

Consensus Decision

Added to List: 2017

Trend in 2024:

Towards even greater risk

Previously: Critically Endangered

Imminence of Action

Immediate action necessary. Where detected, should be stabilized and reported as a matter of urgency.

Significance of Loss

The loss of tools or services within this group would have a global impact.

Effort to Preserve | Inevitability

Loss seems likely. By the time tools or techniques have been developed, the material will likely have been lost.

Examples

Instagram, Facebook, Twitter/X, Pinterest, Yahoo Groups, Truth Social, Reddit, Mumsnet, Sina Weibo, Flickr, Bebo, and legacy BBS.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of preservation capacity in provider; lack of preservation commitment or incentive from provider; proprietary products or formats, including user interface; poor data protection; inaccessibility to web archiving; political or commercial interference; lack of offline equivalent; super-abundance; uncertainty over IPR or the presence of orphaned works; lossy compression in upload scripts.

‘Endangered’ in the Presence of Good Practice

Offline backup and documentation of media assets; Migration plan; Early warning from vendors; Roadmap from vendors; Accessible to web harvest; Suitable export functionality; Licencing enables preservation; Preservation commitment from vendor; Preservation capability in vendor; Resilient to hacking; Selection criteria.

2023 Review

This entry was added by the 2019 Jury as a subset of a broader social media entry first introduced in 2017. It was created as a standalone entry to draw attention to the different threats faced by online services that are paid for versus ‘free at the point of use’ (both depend on the business model of the vendor and the terms they impose). The 2021 Jury raised the risk classification from Endangered to Critically Endangered based on concerns about trends towards harmful and malicious hate speech as well as misinformation and deliberate deletion. The 2022 Taskforce agreed on a trend towards even greater risk based on the continued, significant trend towards hate speech, misinformation and disinformation, and deliberate deletion in light of ongoing global conflicts that include (but are not limited to) social and economic inequalities and climate change. In particular, they noted that the sale of Twitter had prompted a moment of instability in consumer social media: the scale of Twitter, the evident acrimony between parties prior to the sale and the hostile news coverage afterward all elevated the risks associated with social media. They also drew attention to platforms that enable extreme views not permitted on mainstream platforms; these emerged and proliferated noticeably and, from a preservation standpoint, are arguably both at very high risk and historically significant.

Based on the assessment of the rescoped entry, the 2023 Council agreed on the Critically Endangered classification and noted an increase in the imminence of action required as well as the effort to preserve. The need for major efforts to prevent or reduce losses continues, but it is now much more likely that loss of material has already occurred and will continue to occur by the time tools or techniques have been developed. There is a greater urgency to prioritize the assessment of these materials and develop tools or techniques to prevent or reduce further losses in this group.

The 2023 Council recommended further rescoping and adjusting of this and other social media entries in light of how web-based and cloud-based business products and services have developed in recent years. This included:

  • Clarifying the scope. This entry broadly refers to the preservation of content and interfaces of social media platforms, with these platforms designed to facilitate the creation and sharing of media through interactive social networks. These services, particularly those provided by largely unregulated (or underregulated) platforms, pose critical risks for not only capturing and preserving the content hosted on the social media platform but also the interfaces of the platforms themselves.

  • Similarly, the entry specifically refers to risks for digital materials created, shared, and hosted via social media services offered ‘free at the point of use,’ in which the business model and sustainability can only be guessed, and contracts tend to be asymmetrical in favor of the supplier. Moreover, because these services have a low barrier to entry, they may be favored by agencies or individuals least able to respond to closure or loss.

  • As part of this rescoping, relevant information concerning cloud-based aspects was incorporated into the ‘Cloud-based Services and Communications Platforms’ entry to more clearly differentiate the risks associated with cloud hosting and computing technologies, allowing this entry on consumer social media free at the point of use to focus on challenges, notably those relating to harvesting and managing content and interfaces of web-based social networking platforms.

The 2023 Bit List Council additionally recommended that the next major review for the Bit List include:

  • A restructure and splitting of the entry to create separate entries for ‘digital content hosted on social media platforms’ and for ‘interfaces of social media platforms’, where each can be teased out to provide greater clarity about specific risks, aggravating factors and recommended actions. This should include expanding on API access to data, providing examples of legacy content already lost, and pointing to examples where risk is especially high (e.g., things that are still up but alarmingly fragile!)

  • A consideration of merging the ‘Data Posted to Defunct or Little-used Social Media Platforms’ entry with this entry, to incorporate examples of loss in the presence of aggravating conditions.

  • A consideration of merging the ‘Born Digital Photos and Video Shared on Social Media’ entry with this entry, to provide examples of particular types of digital content hosted on social media platforms that are lost or at risk. This is mostly due to the fact that so many of the ‘regular’ social media platforms have tended toward more ways to mimic or copy TikTok-style videos, and making the distinction will become harder in the future since they all have similar functionality and ways to create photo/video content.

  • A consideration of merging the ‘Legacy Interfaces and Services’ entry with this entry to provide examples of particular interfaces of social media platforms that are lost or at risk.

2024 Interim Review

The 2024 Council identified a trend towards even greater risk due to a number of factors, summarized below.

Creators and archivists relying on consumer social media free at the point of use inhabit a precarious position. Free services may be favored by agencies or individual creators who are least able to respond to closure or loss because of the low barrier to entry associated with ‘free at the point of use’ services. Proprietary interfaces and services pose risks, as companies prevent third-party attempts to preserve either hosted content and/or the end-user experience of the environment. An inability to preserve social media interfaces diminishes future potential for emulation and may inhibit researchers' ability to glean important context, as described in the Bit List 2023 review.

Additional barriers to preservation via web capture are also present in terms of service for user accounts that explicitly prohibit crawling. For example, the X Terms of Service state “You may not access the Services in any way other than through the currently available, published interfaces that we provide. For example, this means that you cannot scrape the Services, try to work around any technical limitations we impose, or otherwise attempt to disrupt the operation of the Services” and “crawling or scraping the Services in any form, for any purpose without our prior written consent is expressly prohibited” (X, 2023). Another example, from the Facebook Terms of Service, states “You may not access or collect data from our Products using automated means” (Facebook, 2022).

An additional recommendation for the next review in 2025 is to assess whether ‘proprietary formats’ (e.g. the platform interfaces) adequately demonstrates the scope of this entry and answers the first bullet point of the 2023 Council recommendation. The 2023 recommendations for re-scoping and combining entries will also be assessed in more detail in 2025.

2024 Council members also raised concerns regarding Artificial Intelligence and Machine Learning, noting that for this entry and, more broadly, anything related to social media, fears about AI training are an emerging risk. This manifests in two ways:

  • Social media users deleting their content entirely in an attempt to prevent it from being used or sold for training a Large Language Model (Edwards, 2024). However, as noted by the Council members, it can be difficult to have a good sense of how widespread this is or how much it’s affecting content that people want to preserve – but it is nonetheless worth noting.

  • Website operators blocking scrapers for similar reasons (Voorhees, 2024). This primarily targets AI crawlers but may also affect web crawlers used for preservation. It can similarly be difficult to have a good sense of how widespread this is, but it seems likely that anti-crawler provisions are only going to get worse, not better.
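
The second point can be illustrated with a minimal sketch, assuming a site expresses its anti-crawler provisions through robots.txt rules (one common mechanism, not the only one). It uses Python's standard urllib.robotparser to compare a rule aimed at a single AI crawler with a blanket rule. The user-agent name example-preservation-crawler, the URL and the two rule sets are hypothetical and chosen only for illustration; GPTBot is used as an example of an AI crawler token.

```python
from urllib.robotparser import RobotFileParser


def allowed_agents(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Report, for each user agent, whether the given rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from text, no network access
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical rule set that targets one AI crawler only.
TARGETED = """\
User-agent: GPTBot
Disallow: /
"""

# Hypothetical blanket rule set that applies to every crawler.
BLANKET = """\
User-agent: *
Disallow: /
"""

# "example-preservation-crawler" is an invented name standing in for an archival crawler.
AGENTS = ["GPTBot", "example-preservation-crawler"]
URL = "https://example.org/posts/1"

targeted_result = allowed_agents(TARGETED, AGENTS, URL)
blanket_result = allowed_agents(BLANKET, AGENTS, URL)

print("targeted:", targeted_result)
print("blanket: ", blanket_result)
```

Under these assumptions the targeted rule leaves the preservation crawler unaffected, while the blanket rule excludes it along with the AI crawler. That is the sense in which anti-AI provisions may also affect crawlers used for preservation.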

Additional Comments

Social media free-at-the-point-of-use remains at a critical risk due in large part to the policies of unregulated (or underregulated) corporate platforms such as Facebook, X (previously Twitter), and their parent companies. The content shared on these platforms and the history of the development of platform infrastructure and policy itself provide a critical source of information for policy-makers and researchers. The complete lack of preservation provision and deliberate obstruction of archiving attempts for public interest puts this valuable content at high risk of loss and draws attention to the critical risk posed by these examples of platforms.

Content hosted on social media platforms (that users might not have stored elsewhere) is at risk and users may lose the opportunity to keep their own data for personal archiving or to donate to an organization. Collecting organizations may lose the opportunity to archive hosted content within their collecting remit using web or API harvesting tools. In both instances, data remains at high risk because it is hosted by companies that could change policies or access on a whim. There is also the inability to archive even free content without logging in as an archivist (as with Browsertrix). Additionally, there are social media companies requiring payment to access data for preservation.

There are interfaces of social media platforms that researchers may want to see to study the evolution of the platforms over time (through web harvesting typically) that are at risk. Preservation is affected by researcher API access being shut down, halting preservation of entire platforms. There are also differences between the themes/collecting policies of institutions and researchers who are scraping their own data and depositing it in repositories.

Preserving this material en masse is still incredibly difficult, but many of these platforms allow users to download their own personal content/archives. However, these downloads lose all the context of social media and therefore, whilst they do preserve the data, they do not preserve the essence of the material. Platforms like X (previously Twitter) have both opened and closed their API further in recent years, but others like Yahoo have closed, and Facebook, as well as X, continue to be almost hostile towards archiving and preservation attempts.

With digital materials from premium or institutional social media services, the business model and sustainability are more obvious, and contracts may be enforceable more readily. Moreover, because these services have a slightly higher barrier to entry, they may be favored by agencies that are better able to respond to closure or loss. Traditional web archiving can be employed where the user pays for a service, but the content is ultimately publicly available (such as Flickr). But much is unclear about how to preserve internal social media / closed networks that web archiving cannot get to, or existing tools do not cover.

Social media capture via web harvesting has become increasingly difficult. Social media platforms have done nothing to address the barriers to automated capture that prevent the preservation of even so-called public content. Examples include campaign websites or other election-related content that is only published on Facebook or on X (previously Twitter) because these services are ‘free.’ This content is of particular concern as it appears on no other website. Web archivists are constantly shifting strategies and approaches and trying out new (but limited) tools to best capture this content. If we cannot successfully preserve these platforms, we are missing out on documenting organizations, campaigns and elections around the globe. Much of this data exists as data sets based on aggregated use rather than individual files.

Often these are external proprietary platforms bound by intellectual property law and potentially privacy law, which will impede timely action. What recourse do archives or digital repositories have to deal with this and capture the materials?

Case Studies or Examples:

  • Mentioned examples of additional barriers to preservation via web capture present in terms of service for user accounts that explicitly prohibit crawling included X (2023) ‘Terms of Service’, Effective: September 29, 2023. Available at https://web.archive.org/web/20240611040225/https://x.com/en/tos [accessed 06 September 2024]; and Facebook (2022) ‘Terms of Service’, Date of Last Revision: July 26, 2022. Available at: https://web.archive.org/web/20240610150804/https://www.facebook.com/terms/ [accessed 06 September 2024]

  • Mentioned examples relating to AI and ML concerns included Edwards, B. (2024) ‘Stack Overflow users sabotage their posts after OpenAI deal’, Ars Technica. Available at: https://arstechnica.com/information-technology/2024/05/stack-overflow-users-sabotage-their-posts-after-openai-deal/ [accessed 06 September 2024]; and Voorhees, J. (2024) ‘How We’re Trying to Protect MacStories from AI Bots and Web Crawlers – And How You Can, Too’, Available at: https://www.macstories.net/stories/ways-you-can-protect-your-website-from-ai-web-crawlers/ [accessed 06 September 2024]

  • A range of use cases are presented in Thomson, S. (2016) ‘Preserving Social Media’, DPC Technology Watch Report (16-02). Available at: http://doi.org/10.7207/twr16-02.

  • The National Library of Scotland ‘The Archive of Tomorrow: Health Information and Misinformation in the UK Web Archive’ project, to record the proliferation of misinformation about coronavirus. See Archive of Tomorrow (2022-2023), National Library of Scotland. Available at: https://www.nls.uk/about-us/working-with-others/archive-of-tomorrow/ [accessed 24 October 2023].

  • The archiving of the ‘In Her Shoes’ collection, part of the Archiving Reproductive Health (ARH) project. Working with key stakeholders, including activist organisations like Abortion Rights Campaign, Together for Yes, Terminations for Medical Reasons, Coalition to Repeal the Eighth, and many others, ARH gathered and preserved a selection of digital objects and research data, including social media, that tells part of the story of this historic campaign. ARH published collections of design and publicity material from activist groups, as well as a sequence of stories from the popular Facebook page ‘In Her Shoes’, a page where people anonymously shared stories of their experiences of being unable to access abortion in Ireland. This initiative received a 2022 Digital Preservation Award for Safeguarding the Digital Legacy. See Archiving Reproductive Health Project (2022), ‘Archiving Reproductive Health’, Digital Preservation Awards 2022. Available at: https://www.dpconline.org/events/digital-preservation-awards/dpa2022-archiving-reproductive-health [accessed 24 October 2023].

  • An example of a tool available to help libraries and archives with capture is Archive Social. See CIVICPLUS (n.d.) ‘ArchiveSocial’. Available at: https://archivesocial.com/ [accessed 24 October 2023].

See also:

  • In the 2023 NDSA Web Archiving Survey Report, one of the major takeaways related to respondents’ concerns about ability to collect social media—in particular, Twitter, Instagram, Facebook, and Reddit. Content housed within social networks has always been difficult to capture for a myriad of reasons and recent changes to numerous social platforms have made this task harder. See: National Digital Stewardship Alliance (NDSA) (2023) Web Archiving Survey Results: An NDSA Report. October 2023. Available at: https://doi.org/10.17605/OSF.IO/N5MYR [accessed 11 September 2024]

  • Willison, S. (2024) ‘Slop is the new name for unwanted AI-generated content’, 8 May 2024, Simon Willison’s Weblog. Available at: https://simonwillison.net/2024/May/8/slop/ [accessed 12 September 2024]

  • Cannelli, B. (2022) ‘Mapping social media archiving initiatives: state of the art, trends, and future perspectives’, IIPC Blog. Available at: https://netpreserveblog.wordpress.com/2022/11/30/mapping-social-media-archiving-initiatives-state-of-the-art-trends-and-future-perspectives/ [accessed 24 October 2023].

  • A 2022 report on a nationwide questionnaire survey conducted to obtain the responses of people to hypothetical scenarios of social media archiving by the National Diet Library in Japan, noting legal and ethical concerns as well as respondent views on the preserving of private data publicly available on social media. See Shiozaki, R. (2022) ‘People’s perceptions on social media archiving by the National Library of Japan’. Journal of Information Science. Available at: https://doi.org/10.1177/01655515221108692

Always Online Games

Critically Endangered

Video games that are required to be continuously online. Gameplay is referenced here particularly as a means of participation, along with social media and in-game interaction between players. This can include Massively Multiplayer Online games and single player games with always-on DRM.

Digital Species: Gaming

Trend in 2023:

Towards even greater risk

Consensus Decision

Added to List: 2019 (rescoped 2023)

Trend in 2024:

Towards even greater risk

Previously: Critically Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

Loss seems likely: by the time tools or techniques have been developed the material will likely have been lost.

Examples

Fortnite, World of Warcraft, Neverwinter, League of Legends.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of skills, commitment or policy from corporate owners; uncertainty over IPR or the presence of orphaned works; controversies around IPR; lack of offline backup; changing business model of providers; limited recognition of value of game play; limited recognition of value of game preservation; over dependence on goodwill of ad-hoc community; lack of preservation know-how at service providers; dependency on bespoke hardware or interfaces; increased reliance on always-on DRM for single player games.

‘Endangered’ in the Presence of Good Practice

Well documented code; IPR supportive of preservation; large and committed user community; removal of always-on DRM for single player games.

2023 Review

This entry was added in 2019 as a subset of the 2017 entry for ‘gaming’. The 2020 and 2021 Juries noted a trend towards greater risk, due to the increased significance of these games during the COVID Pandemic as well as the evolving nature of MMOs, to the extent that the 2021 Jury changed the risk classification from Endangered to Critically Endangered.

The 2023 Council agreed with the 2022 Taskforce suggestion to consider the naming and scope of the entry, rescoping this entry to ‘Always Online Games’ covering all games that have to always be online, whether that is due to being MMOs, server-based games or single-player games with Always-Online DRM. Games that have online components but are not required to always be online fit into the new ‘Games with Online Play Components’ entry.

2024 Interim Review

The 2024 Council identified a trend towards even greater risk based on shifts in business models and increased litigation over the last year, resulting in more shutdowns which impact preservation efforts. This also raises the time sensitivity for action: if no efforts are made to preserve these games and existing efforts are shut down further, the likelihood of loss rises.

Additional Comments

Preserving Always Online games in a playable state requires preservation or re-creation of the servers used to run them. Even then, for MMOs or multiplayer games, it would be impossible to recreate these games as they were at their various peaks. A re-created game will never have the same configuration of subscribers, to say nothing of the innumerable changes made to the software over the years, which have significantly altered how the game works and looks. This is why video recordings of (online) gameplay are important. Loss is inevitable, and it has already happened. The social and cultural aspects of play are incredibly important, and on-screen recording is the most robust way to capture them.

MMOs and always-multiplayer games (such as Fortnite) are expected to require an internet connection because they rely on servers. Single player games, or those where the primary gameplay is single player, that are always online due to DRM carry an added risk to preservation. If the server shuts down, even the single player components might not be playable, so loss happens faster than for a single player game that does not rely on servers. For more details, see the ‘Shut Down or Discontinued Video Games’ entry.

Case Studies or Examples:

See also:

Open Source Intelligence Sources of Current Conflicts

Critically Endangered

Open source intelligence produced, collected and analysed from publicly, openly available social media and web content with the purpose of answering a specific intelligence question and that supports crowd-sourced investigation and fact-checking to verify or refute claims of state agencies and rebel groups in the context of current political or military conflict.

Digital Species: Legal Data

Trend in 2023:

No change

Consensus Decision

Added to List: 2019

Trend in 2024:

No change

Previously: Critically Endangered

Imminence of Action

Action is recommended within twelve months. Detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Social media sources relating to current conflicts, such as in Yemen or Syria.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Loss of authenticity; lack of preservation agency; limited or no digital preservation capability; uncertainty over IPR or the presence of orphaned works.

‘Endangered’ in the Presence of Good Practice

Offline backup captured by the journalist or investigating authority.

2023 Review

This entry was added as a subset in 2019, as part of a broader ‘Open Source Intelligence Sources’ entry which the Jury split into three elements, relating to current, recent and historic sources. This entry relates in particular to materials relating to current and ongoing conflicts. Social media companies have a policy to take down or suppress content that they consider to be propaganda for terrorist groups. This has had the unintended consequence of deleting or suppressing content that was being used in open source investigation or fact-checking for journalistic or judicial purposes, which may therefore impede refutation or prosecution. However, a new generation of cloud-based services, such as Hunchly, has emerged in the last few years, allowing investigators to copy and stabilize content to private accounts in the process of investigating it, so the ethical requirements of social media companies and the integrity of the investigation are both served. The 2021 Jury noted that such content stays at risk, and the process of investigation is slower than algorithmic deletion. Nonetheless, there is a notable difference between the investigation of current conflicts and that of historic ones where evidence has been lost. The 2022 Taskforce identified a trend towards even greater risk based on the increased significance of crowd-sourced investigations and fact-checking in light of ongoing global conflicts that include (but are not limited to) those in Ukraine.

The 2023 Council agreed with the Critically Endangered classification with the overall risks remaining on the same basis as before (‘No change’ to trend). The 2023 Council also added clarification to the meaning of ‘open source’ for this entry, to explain its meaning in relation to intelligence openly available online, noting that open source can also refer to a specific software or content licence that permits limited uses of IP so this distinction would be helpful for readers. 

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

The Council acknowledge the continuing challenge of ensuring the preservation of complete and accurate resources given that: platform owners continue to be obliged to remove content that violates community standards; copyright and ownership increasingly hinder capture/preservation of the open source materials; and with the rise in fakes, preservationists must attend to standards for legal admissibility and authentication which vary from one jurisdiction to another.

Additional Comments

Preservation is important for social context, and this content may be picked up inadvertently in other ways, but it is unclear who has ultimate responsibility for collecting and preserving it.

Case Studies or Examples:

  • The Ukraine Investigations by GLAN and Bellingcat Justice & Accountability project to investigate alleged atrocity crimes taking place in Ukraine. The aim of the project is to conduct a set of open source investigations into incidents causing civilian harm occurring in Ukraine according to robust legal standards with the aim of making them available to national and international prosecutors who are gathering evidence of alleged crimes. In this case, the open source content gathered during Bellingcat’s investigations will be preserved by Mnemonic, an independent third-party organization maintaining an archive of digital content from Ukraine, as it has done for Syria, Yemen and Sudan. See Glan and Bellingcat (n.d.), ‘Methodology for Online Open Source Investigations’. Available at: https://www.glanlaw.org/online-open-source-methodology [accessed 24 October 2023]

See also:

  • The website of the Forensic Architecture (FA) research agency, based at Goldsmiths, University of London, offers examples of OSINT. See Forensic Architecture (n.d.). Available at: https://forensic-architecture.org/methodology/osint [accessed 24 October 2023]

  • The website of the Coalition for Content Provenance and Authenticity (C2PA). The C2PA addresses the prevalence of misleading information online through the development of technical standards for certifying the source and history (or provenance) of media content. C2PA is a Joint Development Foundation project, formed through an alliance between Adobe, Arm, Intel, Microsoft and Truepic. See Coalition for Content Provenance and Authenticity (n.d.). Available at: https://c2pa.org/ [accessed 24 October 2023]

  • Baumhofer, E. and Reilly, B.F. (2022) ‘Preserving Open Source Digital Evidence: A Guide for Practitioners Working on Dealing with the Past’, Available at: https://www.swisspeace.ch/articles/preserving-open-source-digital-evidence [accessed 24 October 2023]

  • Higgins, E. (2019) ‘Bellingcat and beyond. The future for Bellingcat and online open source investigation’, iPres Conference 2019, Amsterdam. Available at: https://www.youtube.com/watch?v=kZAb7CVGmXM [accessed 24 October 2023]

  • Dubberley, S., and Ivens, G. (2022) ‘Outlining a Human-Rights Based Approach to Digital Open Source Investigations’, The Human Rights, Big Data and Technology Project. Available at: http://repository.essex.ac.uk/32642/1/Outlining%20a%20Human-Rights%20Based%20Approach%20to%20Digital%20Open%20Source%20Investigations.pdf [accessed 24 October 2023]

Politically Sensitive Data

Critically Endangered

Digital content where the knowledge to preserve exists, and there is no threat of obsolescence, but where political interests may be served by elimination, falsification or concealment.

Digital Species: Political Data

Trend in 2023:

No change

Consensus Decision

Added to List: 2017

Trend in 2024:

No change

Previously: Critically Endangered

Imminence of Action

Immediate action necessary. Where detected should be stabilized and reported as a matter of urgency.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Online news; social media and web-based campaigning; social media relating to 2016 UK/EU referendum; promises made in Scottish independence referendum 2014; US environmental data; UK Private Finance Initiative (PFI) documents; recordings of Leinster House.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Opaque terms and conditions that facilitate deletion or obfuscation; lack of access to web-harvesting; significant lobby interest; change of administration; data resides in single jurisdiction; reputational risk to collecting institution; uncertainty over IPR or the presence of orphaned works.

‘Endangered’ in the Presence of Good Practice

Robust political archives; robust preservation services for investigative journalists.

2023 Review

This entry was added in 2017 with additional comment and contextualization offered by the 2019 Jury. The 2019 Jury agreed that the nature and extent of political campaigning online continue to become more apparent, drawing attention to the manipulation of digital media but not explicitly the issue of deliberate deletion, alteration or concealment. They further noted that GDPR provides a pretext for the disposal of records, and that the increased capability of archives to secure the content from outgoing governments and ministers was a source of encouragement. Nonetheless, they pointed to a pressing need for a deep and comprehensive assessment of the risks faced by politically sensitive data and the impact which such deletions have on the public good. The 2020 Jury added a trend towards greater risk, based on 2020 being a year of significant political and economic upheaval, in part because of the pandemic though also because of popular protest and the outcomes of elections around the world. Moreover, it had been widely reported that senior officials in government had avoided scrutiny and record-keeping laws by using self-deleting messaging applications. The 2021 and 2022 reviews also identified trends towards greater risk based on the continuation and increase of significant political and economic upheaval, and again noted the reported use of self-deleting messaging applications by senior officials. In those circumstances, politically sensitive records were likely to be at greater risk. The 2022 Taskforce agreed, and noted the significance of elimination, falsification or concealment in light of political upheaval, social and economic inequalities and climate change.

The case of political upheaval and protest in Iran had further amplified the risks, and anonymous digital art and social media activism had burgeoned in response to gendered violence and acts of political repression in the latter half of the year. However, preservation infrastructures, such as national libraries and collecting archives within universities, are conflicted and therefore unlikely, unable or unwilling to preserve content that is explicitly and radically critical of the regime. For those reasons there was a 2022 trend toward even greater risk.

The 2023 Council agreed with the Critically Endangered classification with overall risks remaining on the same basis as before (‘No change’ to trend). They also provided discussion and comments around GDPR abuses. GDPR can be abused to block access to public records and political data. The existence of ‘special category data’ under GDPR is used to justify denying access even to people’s own data. These justifications usually do not reflect how GDPR actually works, but they are used as a way to shut down these challenges.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

There is a question of whether the duty of archives/libraries is to preserve the falsification itself or instead to preserve the constituent pieces that allow researchers to infer elimination, falsification or concealment.

See also:

  • World Wide Web Foundation, The Open Data Barometer, which provides a global measure of how governments are publishing and using open data for accountability, innovation and social impact. It looks at the 30 governments that have adopted the Open Data Charter and those that, as G20 members, have committed to G20 Anti-Corruption Open Data Principles. See World Wide Web Foundation (n.d.) ‘The Open Data Barometer’. Available at: https://opendatabarometer.org/ [accessed 24 October 2023]

  • Ovenden, R., (2020) ‘Undelete our government’, Digital Preservation Coalition Blog. Available at: https://www.dpconline.org/blog/undelete-our-government [accessed 24 October 2023]

  • Mitcham, J. (2022) ‘What’s up with using WhatsApp?’, Digital Preservation Coalition Blog. Available at: https://www.dpconline.org/blog/what-s-up-with-using-whatsapp [accessed 24 October 2023]

  • Example of data rescue work by the Environmental Data & Governance Initiative (EDGI), initially formed in November 2016 to document and analyze changes to environmental governance that would transpire under the Trump Administration. EDGI subsequently became the preeminent watchdog group for material on federal environmental data issues on government websites, and a national leader in highlighting President Trump’s impacts such as declines in EPA enforcement. See Environmental Data & Governance Initiative (n.d.) ‘Archiving Data’. Available at: https://envirodatagov.org/archiving/ [accessed 24 October 2023]

  • Johnston, L. and England, E. (2021) ‘A Framework Enabling the Preservation of Government Electronic Records’, Digital Preservation Coalition Blog. Available at: https://www.dpconline.org/blog/bit-list-blog/blog-nara-wdpd [accessed 24 October 2023]

Records of Local Government

Critically Endangered

Records from local government (i.e., below the state level) which are required for transparency and may be in many diverse forms, but in which the local authority may lack the capacity to manage the complex digital preservation requirements that arise.

Digital Species: Public Records

Trend in 2023:

No change

Consensus Decision

Added to List: 2019

Trend in 2024:

No change

Previously: Critically Endangered

Imminence of Action

Action is recommended within twelve months. Detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Born digital records of small and medium-sized agencies; fast-changing internal manuals, advice or policies shared electronically; records of care services; documentation supporting long-lived contractual relations like Private Finance Initiatives; organizational Slack channels; network drives; EDRMS; email.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of preservation infrastructure; conflation of backup with preservation; loss of authenticity or integrity; long-lived business processes; poor storage; churn of staff; significant volumes or diversity of data; poorly developed digitization; ill-informed records management; poorly developed migration or normalization; longstanding protocols or procedures that apply unsuitable paper processes to digital materials; encryption; political instability; lack of sustained funding; uncertainty over IPR or the presence of orphaned works.

‘Endangered’ in the Presence of Good Practice

Well managed data infrastructure; preservation enabled at the point of creation; carefully managed authenticity; use of persistent identifiers; finding aids; well managed records management processes; recognition of preservation requirements; strategic investment in digital preservation; preservation roadmap; participation in digital preservation community.

2023 Review

This entry was added in 2019 as a subset of a previous entry for ‘Records of long duration from Local Government or Other Government Agencies.’ The split was intended to allow greater concentration on the challenges that these distinct types of agency face. Local government typically operates across a broad range of digital formats and services, but it is unclear and unlikely that relatively small archival agencies are properly funded locally to support the wide range of digital preservation requirements that arise. The 2020 Jury noted a trend towards greater risk based on significant political and economic upheaval placing additional strain on local government and its agencies, putting already vulnerable records at greater risk. Trends towards greater risk were also noted by the 2021 Jury and 2022 Taskforce, which contributed examples like Grenfell to demonstrate the precarity of local government records, especially when these risks overlap with records of non-governmental agencies. These examples underline the significance and impact of loss, the impetus for action, and the call on governing frameworks where enforcement is failing (depending on the jurisdiction).

The 2023 Council generally agreed with the Critically Endangered classification with the overall risks remaining on the same basis as before (‘No change’ to trend). The 2023 Council additionally recommended revisiting and rescoping this entry as part of the next major revision of the Bit List. Some Council members recommended splitting this entry into separate entries to differentiate the various risks associated with different types of digital public records. Others raised concerns regarding the breadth of records held by local government, and that it is perhaps not appropriate to have a distinct entry or split entries for records of local governments but rather provide examples of different kinds of public records in and across other entries.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

The diversity of 'local government records' makes this category quite difficult to score. First, local governments have differing responsibilities in different jurisdictions. For example local governments in the UK have more responsibilities than in Australia. Also, given the number of local government agencies in a state or country, the quality of recordkeeping and digital preservation practice can vary greatly. Additionally, the variety of records that are created by local governments means that some formats or record types may be generally at low risk, while others may be practically extinct. Given this complexity it is important to make clear that the imminence of action, significance of loss, and effort to preserve are context-dependent and generalized.

The main factors that reduce risk for these records are that local government is regulated, and there are clear recordkeeping standards that apply to digital records. Local governments also have consistent funding (although it may not be enough and may not be directed at digital preservation).

We feel that due to the breadth of records held by local governments, it is perhaps not appropriate for them to have a distinct record series, but rather be a featured example of other series. This approach would still assist in advocacy for local government as they would be able to cross reference their digital holdings against these classifications.

Significant research by the UK National Archives into Local Government Archives in England underlines the digital skills shortages that exist, especially with respect to preservation. There may be a benefit from splitting into a) legally required public record and b) additional information that may enrich our digital preservation of society. My assumption was that the roles and requirements for records management are clearly defined, but if this is not the case and there are inadequate resources to match the requirement, then the risk goes up.

Case Studies or Examples:

  • The Grenfell Tower fire and Grenfell Tower Inquiry illustrate the precarity of local government records, especially when third-party contractors are involved. Not only does it show the potential impact of aggravating conditions for Records of Local Government, but it also applies to those of Records of Non-Governmental Agencies. See Grenfell Tower Inquiry (n.d.) ‘Grenfell Tower Inquiry’. Available at: https://www.grenfelltowerinquiry.org.uk/ [accessed 24 October 2023]

  • In Scotland, record keeping legislation such as the Public Records (Scotland) Act 2011 is relevant and governs some of this. See National Records of Scotland (n.d.), ‘Public Records (Scotland) Act 2011’. Available at: https://www.nrscotland.gov.uk/record-keeping/public-records-scotland-act-2011 [accessed 24 October 2023]

  • The work and outputs of the EDRMS Preservation Taskforce, such as the EDRMS Preservation Toolkit, may be helpful for guidance in this context. See Digital Preservation Coalition (2021) ‘EDRMS Preservation Toolkit’. Available at: https://www.dpconline.org/digipres/implement-digipres/edrms-preservation-toolkit [accessed 24 October 2023]

  • The Kickstart Cymru project, which builds on the work that has been undertaken in Wales to preserve and provide access to digital information now and in the future. Underpinned by the Digital Preservation Policy for Wales, it is a multi-stranded initiative involving archivists, researchers, consultants, students and IT professionals to promote digital preservation in the local authority, education and cultural sectors. This included funding for a programme partnership of six archive services to support local government collaboration on shared problems. One issue identified was the need to preserve and provide long-term access to records held on business systems whose operational lifespans are shorter than the period for which the records must be kept. The initiative is responsive to specific sectoral needs, but with an overarching aim of enhancing digital preservation capacity. Elements of the initiative include building skills, addressing specific digital preservation issues, co-creating documentation and providing kits to undertake practical preservation. See Archive Wales (2022), ‘Kickstart Cymru: Enhancing digital preservation capacity in Wales’, Digital Preservation Awards 2022. Available at: https://www.dpconline.org/events/digital-preservation-awards/dpa2022-kickstart-cymru [accessed 24 October 2023].

  • The issues and approaches raised by the Tuvalu Future Now Project, a set of three major initiatives designed to preserve its nationhood, governance and culture in the event of a worst-case scenario. The third initiative is the development of a digital nation. It includes digitising and transferring access to government and consular services and all accompanying administrative systems into the cloud to enable elections to continue to be held, and government bodies to continue in their roles. It also includes a virtual copy of Te Afualiku, the first island in Tuvalu to be digitally recreated through satellite imagery, photos and drone footage, creating a digital twin to not only help inform decisions around urban planning and development but also examine how to use augmented and virtual reality to allow displaced and future generations of Tuvaluans to continue to exist as both a culture and a nation, complete with ancestral knowledge and value systems. If this concept becomes a reality, the Tuvaluan people will be able to interact with one another in a digital dimension, in a way that imitates real life and helps to preserve shared language and customs. See Fainu, K. (2023) ‘Facing extinction, Tuvalu considers the digital clone of a country’, The Guardian. Available at: https://www.theguardian.com/world/2023/jun/27/tuvalu-climate-crisis-rising-sea-levels-pacific-island-nation-country-digital-clone [accessed 24 October 2023].

Records of Non-Governmental Agencies

Critically Endangered

Records of independent agencies and contractors that act on behalf of the state in the delivery of public services, and which may be present in many diverse forms, but for which the NGO or contractors may lack the capacity to meet the complex digital preservation requirements that arise, or may have a business motive to minimize or ignore requirements for the maintenance of the record.

Digital Species: Public Records

Trend in 2023:

No Change

Consensus Decision

Added to List: 2019

Trend in 2024:

No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within twelve months. Detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Born-digital records of small and medium-sized agencies; fast-changing internal manuals, advice or policies shared on intranets or EDRMS; records of care services; historic guidelines and manuals which evidence 'best practice at the time'; documentation supporting long-lived contractual relations like Public Finance Initiatives; organizational Slack channels; network drives; EDRMS; email.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of preservation infrastructure; conflation of backup with preservation; loss of authenticity or integrity; long-lived business processes; poor storage; churn of staff; significant volumes or diversity of data; poorly developed digitization specifications; ill-informed records management; poorly developed migration or normalization specifications; long-standing protocols or procedures that apply unsuitable paper processes to digital materials; encryption; political instability; lack of sustained funding; denial of responsibility; failure to include archives within contract from commissioning agency; uncertainty over IPR or the presence of orphaned works.

‘Endangered’ in the Presence of Good Practice

Well managed data infrastructure; preservation enabled at the point of creation; carefully managed authenticity; use of persistent identifiers; finding aids; well managed records management processes; application of records management standards; recognition of preservation requirements at highest levels; strategic investment in digital preservation; transfer protocols to public archive; participation in digital preservation community.

2023 Review

This entry was added in 2019 as a subset of a previous entry for ‘Records of long duration from Local Government or Other Government Agencies.’ The split was intended to allow greater concentration on the challenges that these different types of agencies face. Non-governmental organizations typically operate across a broad range of digital formats and services acting on behalf of the public sector. The 2020 Jury noted a trend towards greater risk because 2020 was a year of significant political and economic upheaval, which put additional strain on NGOs and left already vulnerable records at greater risk. Trends towards greater risk were also noted by the 2021 Jury and 2022 Taskforce, which cited examples like Grenfell to demonstrate the precarity of records of non-government agencies, especially when these risks overlap with those of local government. These examples underline the significance and impact of loss, the impetus for action, and the need to address governing frameworks that fail in enforcement for these agencies (e.g., by examining the current recordkeeping regimes that keep them accountable).

The 2023 Council generally agreed with the Critically Endangered classification, with the overall risks remaining on the same basis as before (‘No change’ to trend). The 2023 Council additionally recommended revisiting and rescoping this entry as part of the next major revision of the Bit List. Some Council members recommended splitting this entry into separate entries to differentiate the various risks associated with different types of digital non-governmental records. Others suggested that it is perhaps not appropriate to have a distinct entry or split entries for records of non-governmental agencies but rather provide examples of different kinds of these digital materials in and across other entries.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

There is a large variation in the types of records held by NGOs. Additionally, the quality of digital preservation performed by NGOs can vary widely. Therefore, the same approach to scoring was taken for this entry as the one above.

We consider records of NGOs to be at greater risk due to there being less regulation, and the regulations that exist being less stringently enforced.

An additional risk factor for these records is a blurring of the lines of responsibility, which can lead to records 'falling through gaps', or to difficulties funding digital preservation practice. This can be further complicated by outdated legislation which does not take into account the complexity of privatisation and public/private partnerships. For example, the legislation that PROV (Public Record Office Victoria) operates under is 50 years old. This, in turn, can make regulation and enforcement more complex than they are for government agencies.

Case Studies or Examples:

  • The Grenfell Tower fire and Grenfell Tower Inquiry illustrate the precarity of local government records, especially when third-party contractors are involved. The case shows the potential impact of aggravating conditions not only for Records of Local Government but also for Records of Non-Governmental Agencies. See Grenfell Tower Inquiry (n.d.) ‘Grenfell Tower Inquiry’. Available at: https://www.grenfelltowerinquiry.org.uk/ [accessed 24 October 2023].

  • There can be some grey areas depending on the legislative context. The Public Records (Scotland) Act 2011, for example, covers government agencies and any non-governmental organization contracted to do work on behalf of government agencies. See National Records of Scotland (n.d.), ‘Public Records (Scotland) Act 2011’. Available at: https://www.nrscotland.gov.uk/record-keeping/public-records-scotland-act-2011 [accessed 24 October 2023].

See also:

  • The Policy Commons has a mission to index and preserve grey literature from IGOs, NGOs, think tanks and governments and, to date, has indexed and preserved around 4 million items from c.11,000 institutions across the world. See Policy Commons (n.d.) Available at: https://policycommons.net/ [accessed 24 October 2023].

Records of Quasi Non-Governmental Agencies

Critically Endangered

Records from agencies at arm's length to government, whether locally, nationally or internationally. They may be required to maintain archives for the purposes of transparency, sometimes for extended periods, and sometimes in diverse and complicated forms.

Digital Species: Public Records

Trend in 2023:

No Change

Consensus Decision

Added to List: 2019

Trend in 2024:

No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within twelve months. Detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Records of non-executive state or national agencies; museum or leisure trusts; industry or public regulators; public audit services; public-good funding and investment agencies; autonomous and semi-autonomous public agencies; sovereign wealth funds; public/private partnerships; publicly owned companies.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of preservation infrastructure; conflation of backup with preservation; loss of authenticity or integrity; long-lived business processes; poor storage; churn of staff; significant volumes or diversity of data; poorly developed digitization specifications; ill-informed records management; poorly developed migration or normalization specifications; long-standing protocols or procedures that apply unsuitable paper processes to digital materials; encryption; political instability; lack of sustained funding; uncertainty over IPR or the presence of orphaned works.

‘Endangered’ in the Presence of Good Practice

Well-managed data infrastructure; preservation enabled at the point of creation; carefully managed authenticity; use of persistent identifiers; finding aids; well-managed records management processes; application of records management standards; recognition of preservation requirements at highest levels; strategic investment in digital preservation; preservation roadmap; participation in digital preservation community.

2023 Review

This entry was added in 2019 as a subset of a previous entry for ‘Records of long duration from Local Government or Other Government Agencies.’ The split was intended to allow greater concentration on the challenges that these different types of agencies face. Records of quasi non-governmental agencies are at arm’s length to government, but the ‘QuaNGO’ or ‘ALEO’ (Arms-Length Executive Organization) may lack the capacity to meet the complex digital preservation requirements that arise, or may be unable to deposit in the government archive. The 2021 Jury added that arm’s length bodies are still public bodies, and they have a duty of care for maintaining evidence of their actions and transactions. They often receive public funding and, depending on the archives legislation, may be required to transfer records to an archive. The issue arises when there is a lack of clarity regarding the recordkeeping requirements, or neglect of records and information once they have outlived their usefulness. Because these bodies still create records that affect citizens’ lives and have a duty to document, the Jury changed the classification from Endangered to Critically Endangered. The 2021 Jury and 2022 Taskforce noted a trend towards greater risk, given the precarity of records in QuaNGO agencies in periods of significant political and economic upheaval, which place greater strain on the funding that supports preservation capacity.

The 2023 Council generally agreed with the Critically Endangered classification with the overall risks remaining on the same basis as before (‘No change’ to trend). The 2023 Council additionally recommended revisiting and rescoping this entry as part of the next major revision of the Bit List. Some recommended splitting this entry into separate entries to differentiate the various risks associated with different types of digital records of quasi non-governmental agencies. Others suggested that it is perhaps not appropriate to have a distinct entry or split entries but rather provide examples of different kinds of these digital materials in and across other entries.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

There is a large variation in the types of records held by QuaNGOs and/or ALEOs. Additionally, the quality of digital preservation performed can vary widely. Therefore, the same approach to scoring was taken for this entry as the one above.

Similar to the risks of NGOs, we consider these records to be at greater risk due to there being less regulation, and the regulations that exist being less stringently enforced.

An additional risk factor for these records is a blurring of the lines of responsibility, which can lead to records 'falling through gaps', or to difficulties funding digital preservation practice. This can be further complicated by outdated legislation which does not take into account the complexity of privatisation and public/private partnerships.

Although the split draws attention to the different pressures faced by QuaNGOs, the entry could be further subdivided into legally required public records and additional information that may enrich our digital preservation of society. The classification assumes that the roles and requirements for records management are clearly defined, but if this is not the case, or there are inadequate resources to match the requirements, then the risk goes up.

While the 2022 trend shows increases in risk, there are some green shoots of hope in Ireland, found when working actively with the agencies and communicating some of the concerns they have for their data, so that there is better awareness, which will hopefully turn into action.
