Legacy Video Files


 

Endangered

Video files in any format containing moving pictures and sound recordings, particularly those that are proprietary, that contain or utilize encrypted Digital Rights Management (DRM), or that are carrier-bound.

Digital Species: Sound and Vision, Formats

Trend in 2023:

No change

Consensus Decision

Added to List: 2019

Trend in 2024:

No change

Previously: Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent losses in this group, such as the development of new preservation tools or techniques.

Examples

STARDIVA; AVI; MOV; MKV; MP3; MP4; on DVD or other carriers.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Lack of replication; encryption; digital rights management; uncertainty over IPR or the presence of orphaned works; proliferation of file formats; weak or non-existent technical documentation; lack of preservation capability or commitment; poorly managed digitization processes or QA; reliance on encoding/decoding software.

‘Vulnerable’ in the Presence of Good Practice

Effective replication; normalization of file formats; strong technical documentation; preservation pathway; good descriptive cataloguing; clear licensing to enable preservation; trusted repository.

2023 Review

This entry was added in 2019 under ‘Video files’ to emphasize the issues of video preservation that pertain to offline recording, whether from broadcast, the film industry, or institutional and private collections. The 2019 Jury noted the connections between this entry and others relating to social media but argued for a standalone entry to emphasize the range of issues tied to numerous formats and standards. The 2021 Jury discussed the need for further rescoping, arguing that the entry was too broad to be useful without specifying at-risk types or formats. For this reason, its scope was narrowed to legacy videos that are proprietary, encrypted or carrier-bound. The classification remained Endangered with a 2021 trend towards greater risk given the growing content of at-risk legacy video files but a limited mandate. The 2022 Taskforce noted no change to the trend (they agreed these risks remain on the same basis as before).

The 2023 Council agreed with the Endangered classification, with the overall risks remaining on the same basis as before (‘No change’ to trend). Additionally, they agreed that a submitted Bit List nomination for the NSV-based STARDIVA storage format would serve better as an example of a video file format especially at risk due to aggravating conditions than as a separate stand-alone entry.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Council members in the Formats species group recommended further rescoping to address overlaps, in particular the differences between the formats and physical carriers mentioned, which have different risks and issues. The listing of DVDs or other carriers, for example, may be better suited under the Portable Media species.

Additional Comments

There are simply too many formats and too many standards, but the FFmpeg project and its related tools have significantly mitigated the technical risk to most video files. This enables a practitioner to transform the vast majority of file formats to safer preservation formats while retaining significant properties. However, technical risk is only one of the factors. There needs to be institutional engagement with audio-visual data as a priority. The issue then becomes one of identifying the organizations responsible and, constrained by the cost to store video data, making effective selection decisions.
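As a rough illustration of the kind of transformation described above, the following Python sketch builds and runs an FFmpeg command that re-encodes a legacy file to FFV1 video and FLAC audio in a Matroska container. The choice of FFV1/FLAC/Matroska as the target, the function names and the output naming are assumptions made for this example and are not prescribed by this entry. It also assumes an `ffmpeg` binary is installed and that the source file is not encrypted or DRM-protected. Which properties count as significant, and whether the original file is retained alongside the normalized copy, remain local policy decisions.

```python
import shutil
import subprocess
from pathlib import Path


def build_normalize_command(source: Path, target_dir: Path) -> list[str]:
    """Build an ffmpeg command that re-encodes `source` as FFV1 + FLAC in Matroska.

    The target formats are an assumption for this sketch, not a recommendation
    taken from the entry itself.
    """
    target = target_dir / (source.stem + ".mkv")
    return [
        "ffmpeg",
        "-n",              # never overwrite an existing output file
        "-i", str(source),
        "-map", "0:v",     # keep every video stream
        "-map", "0:a?",    # keep every audio stream, if any are present
        "-c:v", "ffv1",    # lossless video codec
        "-level", "3",     # FFV1 version 3
        "-g", "1",         # intra-only: each frame decodes independently
        "-slicecrc", "1",  # embed per-slice CRCs for error detection
        "-c:a", "flac",    # lossless audio codec
        str(target),
    ]


def normalize(source: Path, target_dir: Path) -> Path:
    """Run the command built above and return the path of the new file."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    target_dir.mkdir(parents=True, exist_ok=True)
    command = build_normalize_command(source, target_dir)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return Path(command[-1])
```

A call such as `normalize(Path("legacy/clip.avi"), Path("normalized"))` would write `normalized/clip.mkv`. Subtitle, data and attachment streams are deliberately not mapped here and would need separate handling.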

Case Studies or Examples:

  • The NSV-based STARDIVA storage format is a video format with multiple audio streams used in simultaneous translation session recording by agencies such as the UN. It is a proprietary format that is no longer supported, cannot be natively preserved and cannot be viewed correctly using standard video playback tools. As noted by the nominator, given its use by agencies such as the UN, the loss of this format would be a loss of an international record. The nominator added that by using MediaArea LeaveSD it can be partially normalized for preservation purposes. See MediaArea (2022) ‘LeaveSD’. Available at: https://mediaarea.net/LeaveSD or https://github.com/MediaArea/LeaveSD. [accessed 24 October 2023]

 

See also:

  • NFSA, 2015. Deadline 2025: collections at risk. Of note, on page 04, “Tape that is not digitized by 2025 will in most cases be lost forever as: Analogue video and audiotape, as well as early digital tape formats, will be effectively inaccessible due to the practical inability to maintain playback systems”. NFSA (2015), ‘Deadline 2025’. Available at: https://www.nfsa.gov.au/corporate-information/publications/deadline-2025 [accessed 24 October 2023].


Legacy Media Art


Endangered

Media art in storage or not otherwise displayed but where the artists or technicians are available to support installation.

Digital Species: Media Art

Trend in 2023:

No change

Consensus Decision

Added to List: 2019

Trend in 2024:

No change

Previously: Endangered

Imminence of Action

Action is recommended within three years, detailed assessment in one year.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

It would require a small effort to preserve materials in this group, requiring the application of proven tools and techniques.

Examples

Media art in storage.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Lack of documentation to enable maintenance; uncertainty over IPR or the presence of orphaned works; complex interdependencies on specific hardware, software or operating systems; lack of capacity in the gallery or workshop; lack of strategic investment; complex external dependencies; loss of institutional memory resulting from staff churn; poor working relationship between the gallery and artist/workshop; lack of conservation assessment.

‘Vulnerable’ in the Presence of Good Practice

Strong documentation; clarity of preservation path and ensuing responsibilities; proven preservation plan; capacity of workshop to support re-installation; capacity of gallery to conserve; capacity of gallery to re-install; retention of institutional memory including archives of correspondence between gallery and artist/workshop; strong and continuing working relationship between the gallery and artist/workshop; regular conservation assessment.

2023 Review

This entry was added in 2017 as ‘Media Art,’ which was first introduced with particular reference to historical media art. The 2019 Jury rescoped this entry to ensure greater specificity in its recommendation, so that it represents works held in galleries but no longer displayed, where there is a continuing working relationship between the gallery and the artist or workshop and a reasonable expectation that support for preservation could still be obtained when required. The 2020 Jury identified a trend towards greater risk, given that many museums and galleries, which often rely on visitors for income, had been closed for extended periods. Moreover, any form of digital material that relies on an individual’s knowledge is at particular risk in a pandemic. For similar reasons, the 2021 Jury also identified a 2021 trend towards greater risk, noting that digital materials in museum and gallery records are likely to be at greater risk in these circumstances.

The 2023 Council agreed with the Endangered classification with overall risks remaining on the same basis as before (‘No change’ to trend), while also noting a decrease in imminence of action as well as the required effort to preserve.

2024 Interim Review

The 2024 Council agreed that these risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

This entry attempts to capture a point in the lifecycle of media art where preservation risks are increasing but not yet critical. There is a risk that preservation issues will not become apparent until the piece is brought out of storage when considered for loan or exhibition, often on timescales that make it too late to address preservation concerns effectively. Galleries should be aware that the range of data, formats, hardware and software embedded in media art can be wide, and that these components can change at different speeds.

Early action is needed to prevent the material from becoming Critically Endangered once the artist has died or relationships break down. Where the artist is still around, there is a major reduction in the inevitability of loss and in its potential to be a newsworthy subject. The loss would be just as impactful and significant, though.

Preservation of legacy media artworks is dependent on access to obsolete technology and also the knowledge of how to operate said technology. Documentation around the production process and artist intent can be limited. This is a risk in terms of preserving a truly authentic artwork.

See also:

  • The DPC ‘Preserving Digital Art’ Technology Watch Guidance Note is aimed at institutions starting to collect digital art as part of a wider collecting remit. It offers basic guidance on the specificities of digital art and how it may differ from other digital content in an institution’s care. See: Falcão, P. (2024) ‘Preserving Digital Art’, DPC Technology Watch Guidance Note 24-02. Available at: http://doi.org/10.7207/twgn24-02

  • The Archiving Australian Media Arts Project is a research project funded by the Australian Research Council involving collaboration between university researchers and cultural institutions. The aim was to develop a good practice method for stabilising digital media artworks, to provide emulated access to the artworks for researchers in reading rooms, and to investigate the contemporary exhibition and re-display of historical media artworks. This has also led to the AusEaaSI project. Available at: https://aama.net.au/ and https://auseaasi.org/ [accessed 06 September 2024].

  • NEW MEDIA MUSEUMS: Creating Framework for Preserving and Collecting Media Arts in V4, initiated by the Olomouc Museum of Art as a joint international platform for sharing experience with building and maintaining collections of new media artworks across different types of institutions. The aim of the project is to find workable methods for heritage institutions to build and maintain collections of media arts, which are necessary for safeguarding this area for the benefit of society. See Central European Art Database (2021) ‘NEW MEDIA MUSEUMS: Creating Framework for Preserving and Collecting Media Arts in V4’. Available at: http://cead.space/Detail/projects/3797 [accessed 24 October 2023].

  • The LIMA project ‘Collaborative infrastructure for sustainable access to digital art’ aims to prevent the loss of digital artworks and to jointly develop the knowledge to preserve these works in a sustainable way. The project invests in research, training, knowledge sharing and conservation to prevent the loss of both digital artworks and the knowledge to preserve them. See LIMA (n.d.) ‘Collaborative infrastructure for sustainable access to digital art’. Available at: https://www.li-ma.nl/lima/article/collaborative-infrastructure-sustainable-access-digital-art [accessed 24 October 2023].


Electronic Hospital and Medical Records


Endangered

Personal medical records and records of hospital treatment are increasingly—if not uniformly—born digital. By implication, those records should be retained through the lifetime of the patient, or in some instances longer as required for intergenerational study; and yet there is little evidence of the medical profession participating in the digital preservation community.

Digital Species: Sensitive Data

Trend in 2023:

No change

Consensus Decision

Added to List: 2017

Trend in 2024:

No change

Previously: Endangered

Imminence of Action

Immediate action necessary. Where detected, materials should be stabilized and reported as a matter of urgency.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

Loss seems inevitable: loss has already occurred or is expected to occur before tools or techniques develop.

Examples

Medical scans; records of treatment and care plans; health advice and notifications.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Loss of context; loss of authenticity or integrity; poor storage; lack of understanding; churn of staff; significant volumes of data; significant diversity of data; ill-informed records management; poorly developed transfer and integrity checking; poorly developed migration or normalization specifications; long-standing protocols or procedures that apply unsuitable paper processes to digital materials; encryption; uncertainty over IPR or the presence of orphaned works.

‘Vulnerable’ in the Presence of Good Practice

Well-managed data infrastructure; preservation enabled at the point of creation; carefully managed authenticity; use of persistent identifiers; well-managed records management processes; application of records management standards; recognition of preservation requirements at highest levels; strategic investment in digital preservation; preservation roadmap; participation in the digital preservation community.

2023 Review

This entry was first submitted in 2017 under ‘Medical and hospital records.’ At that time, there was limited capacity to address the topic. It was published as ‘of concern’ to revisit and review by the 2019 Jury and also independently received as a submission to the open nomination process under ‘Electronic hospital and medical records.’ The entry covers a broad range of material, and it may be useful in future years to split the entry into more discrete entries. Still, the 2021 Jury agreed to keep the current description and classification to draw attention to the scale of the digital preservation challenges which arise in hospitals and the medical profession. The same reasoning for greater risk in 2020 was used for 2021; there has been significant strain through the Covid pandemic, with resources stretched to meet an overwhelming demand and rigid, exacting protocols. In this environment, it is hard to avoid the sense that records are also now at greater risk. The Jury further commented that hospital records may be at greater risk than we think, where there may already be poor maintenance of records during their lifecycle, poor migration planning, etc. The 2022 Taskforce recommended that the 2023 Council bring in additional subject matter expertise for feedback and comment on any changes in risks relating to growth and volume of born-digital records, increasing or peculiar budget strain conditions, changes pertaining to sensitivity and potential destruction linked to ransomware or conflicts.

The 2023 Council agreed with the previous Endangered classification with the overall risks remaining on the same basis as before (‘No change’ to trend) though also suggesting an increased timeline for imminence of action and greater inevitability of loss.

2024 Interim Review

The 2024 Council recommends a major rescoping of the Sensitive Data species, with plans to remove it as a species and incorporate key elements and examples into relevant entries for the next Bit List in 2025. This is because it is not clear how sensitive data works as a species when many of the other species mentioned could have sensitive data concerns; the sensitivity of the data is more like an extra category of risk that potentially applies across any species.

Additional Comments

Increasing sensitivity and awareness of data protection requirements could act inadvertently as a barrier to lifecycle data management. It is striking how little evidence there is of health technology companies participating in the global digital preservation community.

The processes implemented by São João hospital (see below) are encouraging, but too many medical establishments are operating in an excessively ad hoc way when it comes to records management. As well as preservation, issues of data protection and ethical obligations are at the forefront when working with this kind of material.

Case Studies or Examples:

  • The São João University Hospital Center (SJUHC) Health Records Repository project offers an example of changing practices relating to the project’s implementation of a long-term digital preservation repository capable of ingesting, preserving and providing access to digital clinical information. As part of the Hospital’s digital transformation strategy, the Health Records Repository promotes change in the management of daily medical records through the implementation of procedures for preparation, digitization and preservation of health records. The results of the last two years of activity of the Health Records Digital Repository reveal a higher efficiency in the access and reuse of clinical information in the context of healthcare. This initiative was nominated for a 2022 Digital Preservation Award. See SJUHC and the Portuguese National Archives (2022), ‘Long-term preservation of Digital Health Records’, Digital Preservation Awards 2022. Available at: https://www.dpconline.org/events/digital-preservation-awards/dpa2022-digital-health-records [accessed 24 October 2023].

  • The National Library of Scotland ‘The Archive of Tomorrow: Health Information and Misinformation in the UK Web Archive’ project as it relates to capture of health advice published on the web. See Archive of Tomorrow (2022-2023), National Library of Scotland. Available at: https://www.nls.uk/about-us/working-with-others/archive-of-tomorrow/ [accessed 24 October 2023].

  • The Conti cyber-attack on Health Service Executive Ireland. See Health Service Executive Ireland (2021) ‘Conti cyber attack on the HSE: Independent Post Incident Review’. Available at: https://www.hse.ie/eng/services/publications/conti-cyber-attack-on-the-hse-full-report.pdf [accessed 24 October 2023].


Digital Recordings Published via Cloud-based Music Sharing Platforms


 

Vulnerable

Music licensed and playable through corporate platforms protected by rights management and subscription revenues and presented as compressed single-track recordings.

Digital Species: Sound and Vision, Cloud

Trend in 2023:

No change

Consensus Decision

Added to List: 2019

Trend in 2024:

No change

Previously: Endangered

Imminence of Action

Action is recommended within twelve months, detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Spotify, iTunes, Bandcamp, SoundCloud.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Lack of skills, commitment or policy from corporate owners; conflating backup with preservation; loss of original multitrack recordings; poor planning and roadmap for corporate infrastructure; slapdash procurement or migration to new systems; mergers and acquisitions; profusion of corporate systems; uncertainty over IPR or the presence of orphaned works; single point of failure; technical protection measures that inhibit preservation actions; encryption.

‘Vulnerable’ in the Presence of Good Practice

Strong backup and documentation; use of open formats and open source software; data management planning for preservation; licensing that enables preservation; corporate preservation capability; resilient to hacking; authenticity and integrity managed; recognition of preservation functions at executive level; technology watch; regular preservation audits; accreditation and participation in the professional preservation community.

2023 Review

This entry was previously under the 2017 ‘Digital Music Production and Sharing’ entry until it was split by the 2019 Jury into four subsets, recognizing the different challenges faced. It is particularly concerned with the music industry at scale and the services that connect the vast majority of artists to their audiences. These are typically large and well-funded, and typically recognize the value of the content they publish. But this is not without risks. It is perhaps surprising that the music industry does not yet have any equivalent to the non-print legal deposit regime that applies to other types of publication, including sheet music in some jurisdictions. The 2020 Jury agreed with the risk classification with no change to trend. The 2021 Jury noted a large amount of vulnerable material on user-driven platforms where material can be very ephemeral (removals resulting from, e.g., account deletion, space limitations, copyright claims) and the issue of licensing with the instability of the business model. For this reason, the scope was widened to include ad hoc sharing so that the entry broadly included all platforms such as SoundCloud and Bandcamp, which are more community-driven, as well as Spotify; this resulted in a raised classification from Vulnerable to Endangered with 2021 trend towards greater risk. The 2022 Taskforce agreed with the classification, and that risks remain on the same basis as before, with no change to trend.

The 2023 Council agreed with the Endangered classification and overall risks remaining on the same basis as before (‘No change’ to trend) but also noted an increase in the imminence of action needed for capture given instabilities and ephemeral aspects.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

The preservation of recorded music is one of our generation’s most important jobs, but it is unclear where responsibility lies. There are commercial incentives to do so, but also incentives to reduce costs. Whilst public archives are permitted to keep this material in some jurisdictions, they typically do not have the resources to do so. Consequently, there is an expectation that rights holders will maintain their own archival copies, but they may not do so. National collecting organizations may need to develop a role to address this.

If managed well, there is hope. It may not be an issue where the production company holds the original recordings: if a streaming service (e.g., Spotify) lost a track, it could go to the production company and ask for a copy. However, it is an issue for those outside of production companies and for platforms such as SoundCloud and Bandcamp, which are more community-driven.


Digital Music and Ephemera Shared on Social Media


Endangered

Digital materials created by musicians and fans as a by-product of performance or recording, shared on websites and other social media platforms.

Digital Species: Sound and Vision, Social Media

Trend in 2023:

No change

Consensus Decision

Added to List: 2019

Trend in 2024:

No change

Previously: Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to address losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Fan sites; private or illicit recordings of concerts; informal music sharing between networks such as TikTok, MySpace and Facebook.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Dependence on social media provider; lack of offline equivalent; uncertainty over IPR or the presence of orphaned works; unstable or small community of interest; encryption.

‘Vulnerable’ in the Presence of Good Practice

Offline equivalent; intellectual property rights conducive to preservation; partnership with collecting institutions; availability to web archiving.

2023 Review

This entry was created in 2019 as a subset of a previous 2017 entry, ‘Digital Music Production and Sharing,’ which was split to draw attention to the different challenges faced by the different forms. While there are some overlaps with other entries relating to social media as well as those relating to community-generated content, it is a separate entry to emphasize the context in which music is shared and enjoyed; this context could be lost if attention were on products controlled by studios or artists. The 2020 Jury agreed that risks remained on the same basis as before, with no change to trend. The 2021 Jury discussed content increasingly being shared across multiple platforms, which is both good and bad for risk. A multi-platform nature provides an element of protection against total loss, but the role and type of interaction with the content on each platform are also important and expanding with limited attempts at preservation. For these reasons, the 2021 trend moved towards greater risk with the need for selective approaches based on the increasing volume of material. The 2022 Taskforce noted no change to the trend (they agreed these risks remain on the same basis as before).

The 2023 Council agreed with the Endangered classification with the overall risks remaining on the same basis as before (‘No change’ to trend).

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

The ephemera is increasingly stored on websites that themselves are fragile and are removed, and nothing held on these services can be relied on in archival timeframes.

Web archiving and social media archiving have matured, so a representative sample is probably readily available for countries that are more mature in their digital preservation activities, but not for countries that are less so.

This entry also connects to other entries, namely ‘Consumer Social Media Free at the Point of Use’ and ‘Data Posted to Defunct or Little-used Social Media Platforms’. There are similarities in regard to the increased uncertainty around major social media sites, such as X (previously Twitter), and the risks associated with preserving the underlying social media, which have an impact on the digital objects that fall under this entry. However, this entry draws attention to additional risks associated with preservation of the digital forms and contexts in which these materials are shared and enjoyed.

Case Studies or Examples:

  • The case of MySpace’s removal of MP3s demonstrates a big loss of shared digital recordings through the platform and subsequent recovery efforts by an academic group and Internet Archive. See Kleinman, Z. (2019) ‘MySpace admits losing 12 years’ worth of music uploads’, BBC News. Available at: https://www.bbc.co.uk/news/technology-47610936 [accessed 24 October 2023] and Sketch the Cow (2019) ‘The Myspace Dragon Hoard (2008-2010)’, Internet Archive. Available at: https://archive.org/details/myspace_dragon_hoard_2010 [accessed 24 October 2023].

  • The case of ‘Yahoo Groups’ closure serves to underline the fragility of community content hosted by third parties. See Brinkmann, M. (2020) ‘Farewell Yahoo Groups! Shutting down on December 15, 2020’, gHacks Technology News. Available at: https://web.archive.org/web/20230902153115/https://www.ghacks.net/2020/10/14/farewell-yahoo-groups-shutting-down-on-december-15-2020/


Corporate Records of Long Duration on Network Drives, Intranets and EDRMS


Endangered

Records on internal corporate network drives, intranets or document management services where access is limited to a distinct group of users, and in which the lifecycle of the record or the business processes they support is greater than the technology on which they are created or retained.

Digital Species: Sensitive Data

Trend in 2023:

No change

Consensus Decision

Added to List: 2017

Trend in 2024:

No change

Previously: Endangered

Imminence of Action

Action is recommended within twelve months; detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

It would require a major effort to preserve materials in this group, with the development of new preservation tools or techniques.

Examples

Born-digital records of small and medium sized enterprises; fast-changing internal manuals, advice or policies shared on intranets or EDRMS; records of long-lived products and services; historic guidelines and manuals which evidence ‘best practice’; documentation supporting long-lived contractual relations; online terms and conditions; corporate Slack channels; Google Drives; EDRMS; email.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Lack of systematic preservation function; lack of preservation path or plan for data; dependence on proprietary products or formats; poor management of data protection; political or commercial interference; lack of offline equivalent; over-abundance through poor disposal or naming and version control; lack of capacity; lack of commitment; loss or lack of documentation; sector-specific software or data types; encryption; uncertainty over IPR or the presence of orphaned works.

‘Vulnerable’ in the Presence of Good Practice

Preservation infrastructure and pathways; replication; appraisal and selection including de-duplication.

2023 Review

This entry was added in 2017 to draw attention to the pressing need for digital preservation in business, especially in small to medium enterprises. The 2020 Jury noted how the Covid pandemic has caused profound dislocation across the economy and placed many companies and agencies at financial risk. The likelihood of liquidation, mergers or acquisitions means that these records are trending towards greater risk. The 2021 Jury agreed with the trend towards greater risk, adding that increased risk is not necessarily because there are no assigned parent archives to take on these materials; rather, it is because they too often sit in these spaces for some time before being transferred to the archives. They are often not well managed or maintained by their creating agencies, putting them at risk of accidental deletion or corruption. There remain increased risks without business continuity and trust.

The 2023 Council agreed with the continued Endangered classification with the overall risks remaining on the same basis as before (‘No change’ to trend), though it also noted an increased imminence of action needed for this entry and highlighted the importance of processes in the preservation of these records. They recommended that the next review consider rescoping this entry, possibly adding a new entry covering sensitive data in databases that would fall into both the Sensitive Data and new Databases digital species groups.

2024 Interim Review

The 2024 Council recommends that a major rescoping of the Sensitive Data species is necessary, with plans to remove it as a species and incorporate key elements and examples to relevant entries for the next 2025 Bit List. This is because it is not clear how sensitive data works as a species, when many of the other species mentioned could have sensitive data concerns, and the sensitivity of the data is more like an extra category of risk that potentially applies across any species. Additionally, further input from those working with corporate records in this context is invited.

Additional Comments

Corporate records should form part of organizational records management schemes, and so responsibilities should be clear; however, this may be much more challenging for smaller organizations without dedicated roles or with complex data types.

Processes become as important as technology when it comes to preserving this kind of material. If an organization lacks good records organization, naming conventions and similar practices, its material may be as vulnerable to loss through poor process as through technological failure or format obsolescence.

Closer collaboration over the digital record lifecycle with recordkeeping organizations such as IRMS/ARA and digital preservation organizations would help to ensure best practice from (before) record creation to its long-term preservation and would help to identify any risks and bridge gaps ‘from the cradle to the grave’. Joining forces and resources will enable the community to raise awareness of the impact of best practices on the organizational governance and related efficiencies.


Content on Cloud Video Services Produced by the Service Provider

Endangered

Video materials, primarily films and television programs, which are produced by companies that maintain their own distribution platforms and are exclusively available through these platforms.

Digital Species: Sound and Vision, Cloud

Trend in 2023:

No change

Consensus Decision

Added to List: 2019

Trend in 2024:

No change

Previously: Endangered

Imminence of Action

Action is recommended within three years, and detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Netflix, Amazon Prime, Disney+.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Lack of skills, commitment or policy from corporate owners; conflating backup with preservation; loss of original recordings; lack of preservation voice at executive level; poor planning and roadmap for infrastructure; slapdash procurement or migration to new systems; mergers and acquisitions; profusion of corporate systems; uncertainty over IPR or the presence of orphaned work; single point of failure; technical protection measures that inhibit reasonable preservation actions.

‘Vulnerable’ in the Presence of Good Practice

Backup and documentation; use of open formats and open source software; data management planning; licensing that enables preservation; corporate preservation capability; resilient to hacking; authenticity and integrity managed; recognition of preservation functions at executive level; technology watch; preservation audits; participation in the preservation community.

2023 Review

This entry was added in 2019 to represent collections that are highly significant in cultural and social terms. It was adopted as the Jury was unclear whether the content could be played outside of the producers’ publication platform, with technical dependencies between content and software amplified by rights management. The 2020 and 2021 Juries agreed with the Endangered classification, with discussions around how the growth of content produced with no or limited preservation mandate leads to greater risk. The continued scale of that growth and opacities regarding preservation by companies also led to the 2022 Taskforce noting a trend towards even greater risk. However, just as the 2022 Taskforce was completing its work, they welcomed the news of the BFI taking on responsibility for the preservation of key titles from Netflix. They commented that this represented a commitment to act on previous recommendations but was not yet a ‘material improvement’ at that stage, so it did not alter the 2022 trend.

The 2023 Council agreed with the Endangered classification, with the overall risks remaining on the same basis as before (‘No change’ to trend). 

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

This entry has five aspects to consider: 1. It falls outside the scope of traditional regulatory frameworks and archiving has not yet been included in any legislative framework, unlike broadcast TV, where there is a designated archive in most developed nations. 2. As a result, the collection and preservation of content from online platforms is underdeveloped, and the content remains unavailable in public archives. 3. These risks are mitigated by the fact that the commercial archives are technologically advanced, with mature digital ecosystems and skills; much of the content has a ‘long tail’ business model, and because commercial products have value, preservation incentives are clear. 4. However, the content is often stored at scale on LTO tapes, and so specific issues arise with the obsolescence of LTO tape technologies for the broadcast sector. 5. Nonetheless, issues remain around archiving relevant assets which may not be valued by the production company.

It may also be worth considering broadening legal deposit legislation so there is a mandate to deposit this content with an appropriate repository - though the volume may be unwelcome as many institutions are under-resourced.

Case Studies or Examples:

  • Work by the BFI National Archive in 2023 in the UK. A formal agreement with Netflix in 2022 was followed by a similar agreement with Amazon Prime Video in Summer of 2023, and by October 2023, the digital preservation workflow for curator-selected UK Netflix content was established, with two complete seasons (20 episodes) under preservation, and throughput building. See: BFI (2022) ‘Bridgerton, Top Boy and Heartstopper join the BFI National Archive and the nation’s screen heritage’, BFI News. Available at: https://www.bfi.org.uk/news/bridgerton-top-boy-heartstopper-bfi-national-archive-netflix [accessed 24 October 2023]

Cloud-based Services and Communications Platforms


Endangered

Digital content produced, stored and accessed within commercial cloud-based services and communications platforms. This entry broadly includes services based on a costed subscription and contract business models, premium or institutional versions, and also free online utilities offered at no cost to end-users but with a business model based on gathering and reselling consumer insights.

Digital Species: Cloud

Trend in 2023:

No change

Consensus Decision

Added to List: 2019

Trend in 2024:

No change

Previously: Endangered

Imminence of Action

Action is recommended within twelve months. Detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

Loss seems likely. By the time tools or techniques have been developed, the material will likely have been lost.

Examples

Google services such as Drive, Docs, and Sheets; Microsoft services such as SharePoint and Teams; Slack, Prezi, Yammer, Dropbox.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Unstable business model from service providers; abandonment of the service due to various reasons (e.g., service provider bought over or pivots to new market opportunities); lack of export functionality; unstable terms and conditions; lack of onsite copy of key media; lack of investment in infrastructure; lack of strategic plan for IT provision; uncertainty over IPR or the presence of orphaned works; conflating preservation and access.

‘Vulnerable’ in the Presence of Good Practice

Clear export and migration pathways; preservation responsibility shouldered by the service provider; offline backup for key media; fit to preservation and records management plan; strategic roadmap for adoption.

2023 Review

This entry was added in 2021 as a merging of two separate 2019 entries, ‘Consumer Cloud-based Utilities’ and ‘Premium or Institutional Social Media’ to place emphasis on the similarities and common threats faced by services that are both ‘paid-for’ and ‘free-at-the-point-of-use’, namely similar aggravating conditions relating to increasing dependencies on the vendor’s business models and the terms and conditions imposed. The 2021 Jury also noted a trend towards increased risk in light of greater reliance on the cloud and localized disruptions to cloud services over the pandemic and wider (global) dependence on these services, especially Google Drive, for record-keeping and business workflows. The 2022 Taskforce agreed with the previous assessment, with no change to trend.

The 2023 Council agreed with the Endangered classification with the overall risks remaining on the same basis as before (‘No change’ to 2023 trend), but also noted increases in imminence and effort to preserve, recognizing that while the need for major efforts to prevent or reduce losses continues, it is now much more likely that loss of material has already occurred, and will continue to do so, by the time tools or techniques have been developed.

This entry was previously categorized under the Social Media digital species. The 2023 Council adjusted this and other social media entries in light of how web-based and cloud-based business products and services had developed in recent years. This included:

  • Narrowing the scope. The scope of the entry was narrowed to focus more specifically on the various risks associated with digital content created, stored and shared using cloud-based services, especially business-related tools or collaboration tools which are not being well preserved (e.g. Slack, Google Drive, SharePoint). These challenges primarily relate to services using their own cloud-based format, export functionality and quality.

  • As part of this rescoping, relevant information concerning cloud-based aspects was incorporated from the previous ‘Born Digital Photographs and Video Shared via Social Media Platforms’ and ‘Consumer Social Media Free at the Point of Use’, and this entry also now falls under a new ‘Cloud’ species group to more clearly differentiate between social media and cloud services. The Council adopts the view that just because a service is web-based and allows users to upload content for cloud hosting does not necessarily mean that it is ‘social’ or ‘media’.

2024 Interim Review

The 2024 Council agreed these risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

Most platforms allow users to export their own data from them, which is helpful.

Similar to the ‘Born Digital Photos and Video Shared on Social Media’ entry, significance and impact scores are high because some users exclusively create and store important content on these services, although uses of these services vary greatly. Also, subscription services such as Microsoft Teams, though far from having adequate preservation provision, will have more robust backup and recovery governed by institutional contracts, whereas Google Drive and Google 'office' services that are free-at-the-point-of-use do not provide these mitigating measures.

Dropbox is a content hosting/storage service and does support downloading a file at the same quality as the file uploaded. If any one of these platforms disappeared overnight or put new restrictions on access to user content, it would certainly make headlines, as witnessed with Flickr's change in storage limit capacity for non-paying users.

There are similarities and common threats faced by services that are both ‘paid-for’ and ‘free-at-the-point-of-use’, namely aggravating conditions relating to increasing dependencies on the vendor’s business models and the terms and conditions imposed. However, with digital materials from consumer cloud-based utilities, the business model and sustainability can only be presumed, and contracts tend to be asymmetrical in favour of the supplier. Moreover, because these services have a low barrier to entry, they may be favoured by agencies or individuals least able to respond to closure or loss. If referring to the entire platforms and the risk of the entirety of data on these, the concern is that the corporation providing the service suddenly decides it is no longer of value to them. In these circumstances, materials could be removed quickly. That has happened previously and will certainly be seen again. Preservation is not a commitment that most providers make.

Existing tools could be modified to tackle some of the closed networks. Still, it is likely to require investments, perhaps related to corporate records in some cases (thinking about internal Slacks, for instance), and more education about the importance of preserving this material and not trusting the publishing platforms to host the content forever.


Born Digital Photographs and Video shared via Social Media or Uploaded to Cloud Services

Endangered

Digital images or video with no analogue equivalent and where the only copy is online with a social media platform. This entry includes images or videos created and shared as part of personal digital archiving, but also for businesses and others publishing data only via these services. Users of these services will likely lose their data if social media companies fold or make extracting or downloading data difficult.

Digital Species: Social Media, Sound and Vision, Web, Cloud

Trend in 2023:

No change

Consensus Decision

Added to List: 2018

Trend in 2024:

No change

Previously: Endangered

Imminence of Action

Action is recommended within three years, and detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

Loss seems likely: by the time tools or techniques have been developed, the material will likely have been lost.

Examples

Flickr; Vimeo; YouTube; Instagram; Periscope; Snapchat; TikTok; Vine; Facebook; X (previously Twitter).

‘Critically Endangered’ in the Presence of Aggravating Conditions

Lack of skills, commitment or policy from corporate owners; lack of storage replication by provider; dependence on proprietary products or formats; poor management of data protection; inaccessibility to automated web crawlers; political or commercial interference; lack of offline equivalent; over-abundance; uncertainty over IPR or the presence of orphaned works; lossy compression applied in upload scripts.

‘Vulnerable’ in the Presence of Good Practice

Offline backup; lossless compression; good documentation; access to web harvesting; clarity of intellectual property rights that enable preservation; credible preservation commitment from service provider; export pathway.

2023 Review

This entry was added by the 2018 Jury as a subset of a broader social media entry first introduced in 2017. It draws attention to the distinct challenges posed by the continually rising volume of photographs and video recordings on social media and, therefore, to the scale of the challenge of ensuring a meaningful legacy. That challenge is aggravated by overabundance, in which appraisal decisions for preservation or deletion are overwhelmed. The entry shares many challenges with others in the social media group, with a dependency on a global service provider whose business model can only be presumed and tied to users via asymmetrical contracts that favour the supplier. Moreover, because these services have a low barrier to entry, they are used by agencies or individuals least able to respond to closure or loss. Both the 2020 and 2021 reviews of the entry noted a trend towards greater risk. The 2020 trend referenced the closure of the EverAlbum photo storage service and changes to the Flickr free service, which provided examples of the short turnaround of closures within the photo-sharing community and pointed to the volatility in the market. The 2021 trend was added in light of surrounding global crises (predominantly the coronavirus pandemic, compounded by vaccine hesitancy, but also the deterioration of the world's democracies) as a result of widespread misinformation, increasing the significance and impact of loss of digital materials. The 2022 Taskforce agreed these risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (no change to the trend).

The 2023 Council agreed with the Endangered classification and noted an increase in effort to preserve, recognizing that while the need for major efforts to prevent or reduce losses continues, it is now much more likely that loss of material has already occurred, and will continue to do so, by the time tools or techniques develop.

The 2023 Council recommended adjusting this and other social media entries in light of how web-based and cloud-based business products and services had developed in recent years. This restructuring and revision included:

  • Narrowing the scope. The scope of the entry has been narrowed to draw attention to the challenges of preserving images and videos disseminated through social media platforms specifically (e.g., sharing-driven and social networking platforms such as YouTube, Instagram and TikTok). These challenges primarily relate to harvesting and managing the images, video recordings and data generated by users' interactions on web-based networking platforms.

  • As part of the above rescoping, the entry's name was changed from ‘Born Digital Photographs and Video Shared via Social Media or Uploaded to Cloud Services’ to ‘Born Digital Photographs and Video Shared via Social Media Platforms’, and information concerning cloud-based aspects was incorporated into the ‘Cloud-based Services and Communications Platforms’ entry to more clearly differentiate the risks associated with cloud hosting and computing technologies.

The 2023 Council additionally recommended that the next major review revisit and consider merging the Born Digital Photographs and Video shared via Social Media Platforms and Consumer Social Media Free at the Point of Use entries. This is mostly due to the fact that so many of the ‘regular’ social media platforms have tended toward more ways to mimic or copy TikTok style videos, and making the distinction will become harder in the future since they all have similar functionality and ways to create photo/video content.

2024 Interim Review

The 2024 Council agreed these risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

They added that the platforms listed are publication platforms for media, and there’s a risk of vulnerability, loss and lack of tools to preserve media longer than what the platform can provide, especially for creative industries, small business, and political activists. Politicians and government agencies will post public records to these platforms. Without archived YouTube, Vimeo, etc. publication pages, those public-facing records lose contextual value. They also recommend that for the 2025 review, rescoping or absorbing this entry into another should be discussed with the wider Bit List council.

2024 Council members also raised concerns regarding Artificial Intelligence and Machine Learning, noting that for this entry and, more broadly, anything related to Social Media, an emerging risk is AI training fears. This manifests in two ways:

  • Social media users deleting their content entirely in an attempt to prevent it from being used or sold for training a Large Language Model (Edwards, 2024). However, as noted by the Council members, it can be difficult to have a good sense of how widespread this is or how much it’s affecting content that people want to preserve – but it is nonetheless worth noting.

  • Website operators blocking scrapers for similar reasons (Voorhees, 2024). This primarily targets AI crawlers but may also affect web crawlers used for preservation. It can similarly be difficult to have a good sense of how widespread this is, but it seems likely that anti-crawler provisions are only going to get worse, not better.
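
To illustrate how anti-crawler provisions aimed at AI training could also catch preservation crawlers, the following is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example.org` URLs and the `ExampleArchiveBot` user agent are all hypothetical, chosen only for illustration. Real sites and crawlers will differ, and blocking may also happen by other means that this sketch does not cover.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one rule aimed at an AI crawler,
# plus a broad wildcard rule covering every other crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /media/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The AI crawler is blocked everywhere.
    print(allowed("GPTBot", "https://example.org/about"))
    # A hypothetical preservation crawler is caught by the wildcard rule
    # for media files, even though it was not the intended target.
    print(allowed("ExampleArchiveBot", "https://example.org/media/video.mp4"))
    # It can still fetch pages outside the disallowed path.
    print(allowed("ExampleArchiveBot", "https://example.org/about"))
```

In this sketch the media files, which are the content of most interest for this entry, are exactly what the preservation crawler is prevented from collecting.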

Additional Comments

We can point to some examples of content only on YouTube, for instance, that may be desired for acquisition for a library. Typically, YouTube would be acquired through web archiving, but with recent and ongoing challenges capturing this content, it may require contacting the creator to try to acquire the original video files to preserve through other workflows. This creates the challenge of determining who to reach out to, how to transfer those files and, if the files only exist on the social media platform, how to extract them to transfer to an organization for preservation. With crawling capabilities limited these days, libraries will have to rely more on individuals archiving their own content and donating it to organizations. It's not clear what that workflow looks like and if there are adequate methods to support it.

The types of users for these services vary greatly - from a private individual uploading a few videos to share with friends to major agencies who use the platform to disseminate important information or research. The extent to which private individuals and even large institutions are aware of digital preservation risks is unclear, though anecdotal evidence suggests that awareness is extremely low. Therefore, it can be assumed that most users (regardless of the significance of their content) do not keep local copies or take other measures to mitigate the risk of loss from these types of platforms. Additionally, risk varies from platform to platform. YouTube, for example, only allows low-quality downloads even for content owners. Therefore, if a content owner lost or deleted an original video file, it would be impossible to recover a high-quality copy from YouTube.

The vast majority of content may be accessible for as long as the platform where it is hosted is popular (and has a viable business model); however, more insidious content (such as malicious misinformation or hate speech) may be deleted by content creators (potentially backed by hostile governments) to avoid prosecution or tracing. It is unclear to what extent these platform providers are compelled to provide access to servers / deleted content or private content for evidential purposes in the course of legal or criminal investigations. The lack of transparency and standardized international regulation of these platforms makes their content vulnerable to exploitation and malicious use by individuals, corporations, and hostile governments.

With digital materials from premium or institutional social media services, the business model and sustainability are more obvious, and contracts may be enforceable more readily. Moreover, because these services have a slightly higher barrier to entry, they may be favoured by agencies better able to respond to closure or loss. Traditional web archiving can be employed where the user pays for a service, but the content is ultimately publicly available (such as Flickr). But much is unclear about how to preserve internal social media / closed networks that web archiving cannot get to, or existing tools do not cover.


3D Digital Engineering Drawings

Endangered

3D digital engineering models produced as part of building or engineering design processes. The production of such drawings has progressed from a digital analogue of paper to complex digital environments such as BIM (Building Information Modelling) which combine original drawings, libraries of compound objects, and links to external data sets such as about the performance of materials and maintenance of parts.

Digital Species: Engineering Data

Trend in 2023:

No change

Consensus Decision

Added to List: 2017

Trend in 2024:

No change

Previously: Endangered

Imminence of Action

Action is recommended within three years, and detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability 

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Building Information Modelling (BIM), Computer Aided Design (CAD), Product Data Management in engineering and architecture.

‘Critically Endangered’ in the Presence of Aggravating Conditions

 Lack of skills, commitment or policy from corporate owners; lack of preservation mandate or collecting institution; lack of preservation capability in data owner; irregularities in supply chains; complex or long data supply chains; dependencies on proprietary software or formats; lack of persistent identifiers; uncertainty over IPR or the presence of orphaned works; temporary joint-venture companies; poor records management; poor regulatory compliance; encryption.

‘Vulnerable’ in the Presence of Good Practice

Well managed data infrastructure; preservation from the point of creation; carefully managed IPR; persistent identifiers; well managed records management processes; recognition of preservation requirements at highest levels; strategic investment in digital preservation; host clearly identified; participation in the digital preservation community.

2023 Review

This entry was first submitted in 2017 when the Jury lacked the capacity to consider it in detail. In 2019 it was assessed with additional expertise co-opted, with the decision to remain a very broad category, including major one-off construction and engineering projects, a long tail of more minor building programmes, and large volume but homogeneous production processes in engineering. The 2021 Jury agreed with its Endangered status with an identified 2021 trend towards greater risk. They noted the key consideration that the lifecycle of the products and the data that describes them vastly exceeds the short life cycles of the infrastructures on which they are designed. This challenge is compounded by supply chains that may involve many different stages of production, as well as the delivery of large projects through transitory joint venture companies that have no residual mechanism or capacity to preserve the data thereafter. Although there had been advancements in the development of new preservation tools and techniques for these materials, there were recent examples of the loss of 3D architectural drawings; those have had a huge impact, especially at the local level, as well as significant impacts on infrastructure, travel, and how people interact with built environments throughout the world. The 2022 Taskforce agreed these risks remained on the same basis as before (‘No change’ to the 2022 trend).

The 2023 Council agreed with the Endangered classification and seconded the trend reported last time: that risks continue on the same basis as before (‘No change’ to 2023 trend). Most of the complexities of the format and issues remained the same, such as reliance on proprietary software and complex or unknown copyrights with the datasets. Moving forward, the Council highlighted that there needs to be greater focus on, and understanding of, the long-term preservation of these outputs within the sector.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additionally, some Council members pointed to where there may be overlaps between the 3D Design Engineering Drawings and Formats species groups for this entry, in particular in ways to further address STEP, PDF/E, STL, 3DM, etc.

Additional Comments

Data in this category enables the safety and security of critical infrastructure, but neither the responsibility for maintaining the data nor the retention periods are clear. Although examples of good practice exist, the extent to which there are working solutions at large seems doubtful, and it is surprising that there are not more diverse success stories to report.

Case Studies or Examples:

  • The Grenfell Tower Inquiry offers a case to consider how the loss of 3D Digital Engineering Drawings can have a huge impact, especially at the local level. For example, if Grenfell had been done using 3D technologies, do we have confidence that those materials would have been adequately preserved? What would have been the local impact? What would have been the impact on the inquiry? See Grenfell Tower Inquiry (n.d.) ‘Grenfell Tower Inquiry’. Available at: https://www.grenfelltowerinquiry.org.uk/ [accessed 24 October 2023]

  • In 2006, it was reported that the Airbus A380 was 2 years behind schedule due to different offices using different versions of the CATIA CAD/CAM software. See Shelly, T. (2006) ‘What can go wrong when you give IT the large’, Manufacturing Management. Available at: https://www.manufacturingmanagement.co.uk/content/features/what-can-go-wrong-when-you-give-it-the-large/ [accessed 24 October 2023]

See also:

  • The DPC Design and Construction Records technology watch report, which aims to support archival professionals as well as active designers and facilities managers, considering acquisition, preservation, and access approaches that account for both the technical and cultural components of the broad range of born-digital design and construction records created throughout the course of designing, building, and maintaining a built space. As well as bringing together a helpful summary of relevant work in this area and discussing a range of case studies it also covers the concept of visual digital literacy which is the first step towards understanding and managing this content. See Leventhal, A, and Thompson, J. (2021) ‘Preserving Born-Digital Design and Construction Records’, DPC Technology Watch Report 21-01. Available at: http://doi.org/10.7207/twr21-01

  • The Library of Congress had a symposium on 3D design and assets in 2017. See Leventhal, A. (2018) ‘Designing the Future Landscape: Digital Architecture, Design & Engineering Assets’, Library of Congress. Available at: https://www.loc.gov/preservation/digital/meetings/DesigningTheFutureLandscapeReport.pdf [accessed 24 October 2023]
