Blog

Unless otherwise stated, content is shared under CC-BY-NC Licence

File format identification: A student project at the University of Sheffield Library

Chris Loftus

Last updated on 23 August 2019

This blog post was written by Peter Vickers, a postgraduate student in Speech and Language Processing hired by the University Library, as part of the University of Sheffield’s OnCampus programme, to look into file identification and archiving.


Forgotten Scripts

Below is an inscription written in Linear A, a Minoan script which has been found on thousands of historical objects across Greece. Because the language bears no close similarity to a language we understand, and we have no Rosetta Stone to decipher it, linguists have had to use speculation and comparison to attempt to decode the script. Whilst over the past decades Linear A has been related to the proto-Greek Linear B, the Hittite Luwian script, Phoenician, and Indo-Iranian, none of these comparisons has either achieved widespread academic acceptance or allowed for the translation of much of the Linear A corpus. For now, at least, Linear A, and all of the debts, curses and tax returns it encodes, are indecipherable.

Given our cultural interest in lost languages and the knowledge they might encode, I wonder what researchers in 100 years will make of all the digital content we create. Linear A is 3,500 years old – old enough to be forgiven for having been forgotten. Meanwhile, last week I found myself unable to access the data on a five inch floppy disk, a format which was still in use twenty years ago. Of course, the loss is not the same – I could use the library’s archival system to read the disk. However, the data on the disk might itself be in an obsolete file format. To compare it to the Linear A problem: recovering the data might be compared to the legibility of our script, whilst opening the files might be compared to our ability to translate it.

Open repositories: or how I learned to start worrying and hate jingoism

Hrafn Malmquist

Last updated on 29 July 2019

Disclaimer: I must state that the following blog-post is written in a personal capacity, airing opinions that are my own and are not intended to endorse a particular piece of software. They should not be considered official on behalf of my current employer, The University of Edinburgh.

Last month, in June 2019, I attended the fourteenth Open Repositories (OR) conference, held in Hamburg and organised by Hamburg University. Hamburg is a beautiful city, and the conference coincided with Hamburg University’s centenary.

It is one of the biggest conferences of its kind in the world and had a packed four-day schedule. It was the first OR I attended and I delivered a presentation: “Automating OAIS compliant digital preservation using Archivematica and DSpace”. A bit more about that later. I saw many interesting talks, from both an ideological and a technical perspective (I am a developer, although I do have a background in library and information science). I’ll now proceed to tell you a bit about my experience at the conference.

Enhancing Services to Preserve New Forms of Scholarship

Karen Hanson

Last updated on 22 July 2019

Karen Hanson is Senior Research Developer for Portico


The last decade or so has seen the emergence of a new kind of scholarly work: the enhanced digital monograph. While still recognizable as monographs, these resources include a variety of dynamic features that cannot be replicated in print format. These works represent a leap forward for scholarship, but their formats, use of dynamic features, and composite nature present complex preservation challenges.

To help address these challenges, a new collaborative project funded by the Andrew W. Mellon Foundation partners preservation institutions, libraries, and university presses that are producing enhanced monographs. The goal is to examine what aspects of these works can be preserved at scale, and produce guidelines to improve their preservability that publishers and authors can use while creating these works.

Ten Years On – Some Myths Debunked About the Artist FKA The DPC Leadership Programme

Sharon McMeekin

Last updated on 19 July 2019

Our illustrious (!) leader William Kilbride started with the DPC in February 2009, and one of the first new initiatives he introduced was the DPC’s Leadership Programme. For ten years now the programme has been one of the core elements of our workforce development activities. It offers grants so that our members can attend training and development opportunities they may not otherwise be able to attend. The programme has also helped ensure that organizations that offer training can have some assurance of a return on their investment. In its lifetime the DPC Leadership Programme has provided well over 100 grants for members to attend training and development opportunities. This began back in May 2009 with two grants for individuals from the National Library of Wales and Cambridge University to attend the Digital Preservation Training Programme.

An update from Oxford

Michael Popham

Last updated on 18 July 2019

Michael Popham is Head of Digital Collections & Preservation at Bodleian Libraries, University of Oxford


It has now been six months since the Polonsky funded “Digital Preservation at Oxford and Cambridge” project (www.dpoc.ac.uk) officially came to a close, but the impact of this work is still causing ripples across both organizations.

Within the Bodleian Libraries at the University of Oxford, we have been seeking funding to support a number of business cases created as a direct result of recommendations arising from the work of the Polonsky Fellows. The digital assets in our care have been acquired over an extended period of time (three decades or more) and are extremely varied. They consist of digital images and textual transcriptions of items in our physical collections, research data and outputs, born-digital archival deposits, databases used to catalogue discrete collections of specialist material, and assorted A/V files (created for even more assorted reasons), employing almost every technology and file format that has been popular over the past 30 years. As the Bodleian Libraries seek to collect and create ever-increasing amounts of digital data, the scale of the challenge we face is growing exponentially.

Controlling the costs of long-term digital accessibility - A cost model for long-term digital accessibility

Herman Uffen, Tamar Kinkel and Shannon Roest

Last updated on 15 July 2019

Herman Uffen, Tamar Kinkel and Shannon Roest work for BMC on behalf of the Dutch Digital Heritage Network


The difficulties of managing and controlling the costs of digital sustainability

Controlling and managing the costs of digital sustainability remains a recurring topic in the field. Back in 2015 the 4C-project stated the following view: “In five years time it will be easier to design or procure more cost effective and efficient digital curation services because the costs, benefits and the business cases for doing so will be more widely understood across the curation life cycle and by all relevant stakeholders. Cost modelling will be part of the planning and management activities of all digital repositories.”

This view has not been realized yet. The costs of digital sustainability are often still unclear and difficult to manage. This is because they are usually difficult to determine and are often not recorded as such in the regular financial operations of institutions. Furthermore, these structural costs are often funded on a project basis, with a focus on the short term.

In response to these findings, the Dutch Digital Heritage Network developed and implemented a cost model for the fields of cultural heritage, media, archives and science in the Netherlands. The intent is to create transparency in the costs of digital sustainability and to provide a tool that enables these costs to be controlled and managed in the future.

Plans for Future Particle Colliders and Their Impact on Data Preservation

Jamie Shiers

Last updated on 12 July 2019

Jamie Shiers works in the Information Technology Department at CERN and is Manager of the Data Preservation for Long-Term Analysis in High Energy Physics (DPHEP) Collaboration


Every 5 – 7 years, physicists from around the world get together to discuss their views on the priorities for Particle Physics – both in Europe and in collaboration with corresponding plans for other parts of the world[1]. The most recent of these symposia was held in Granada in May 2019, with the intent of forming a strategy that can be approved by the CERN Council in May 2020. There was notable enthusiasm for a new electron-positron collider (this might be linear or circular, and built in Europe or elsewhere). Should such a machine be hosted at CERN – for example, in a 100 km circular tunnel corresponding to one of the proposals – it would be unlikely to enter operation before the mid to late 2030s.

‘Access is What we are Preserving’: But for Whom?

David Underdown and Leontien Talboom work at The National Archives UK


Designated Communities, Representation Information and Knowledge Bases in a Wiki World 

Thanks to Andy Jackson of the British Library's web archiving team for partial inspiration for the title of this post (the rest of the conversation is also largely relevant to the post).

Introduction

Most of us in the digital preservation field are familiar with the Open Archival Information System (OAIS) model. After nearly two decades this model has become a backbone for many of our architectures, certification and protocols. Terms such as AIP, SIP and DIP are in common use and the first sighting of the OAIS diagram at a conference is frequently remarked on. The model has given us a common language to communicate our digital preservation needs with. But how many of us have actually read and engaged with the model further than the most common terms from it? We would like to admit that the first time we read the guidelines of the OAIS model was at the start of this year, even though we have been using terms from the model throughout both our digital archiving careers.

Developing a digital preservation service for the British Museum - Part 1

Glenn Cumiskey

Last updated on 4 July 2019

Glenn Cumiskey is Digital Preservation Resource Manager at the British Museum


The British Museum faces many challenges in its plans to better manage its data. For example, the capacity of the British Museum storage area network (SAN) has for some years been unable to meet the demands placed on it by the growing volume of data created by Museum staff in pursuit of their day-to-day activities.

Data growth is driven by many factors. As technologies fall in cost while the assets they generate grow in size and complexity, the number of assets created and their storage footprint will naturally, and potentially exponentially, increase.

Not Your Childhood Fire Drills

Bradley Daigle

Last updated on 27 June 2019

Bradley Daigle is the Content Lead for the Academic Preservation Trust and Chair of the Coordinating Committee for the NDSA's Leadership Group


While at the most recent PASIG, held in Mexico City, I sat in the audience and responded to a question that I interpreted to be about preservation stewardship responsibility and where it resides. Apparently, the solution I put forward was not something that others had considered or were planning. Therefore, in this post, I am providing some context and explanation of what we are doing at Academic Preservation Trust (APTrust) to provide a more rounded approach to preservation responsibility. This overview will be a description of what I call “fire drills” or what our tech team refers to as “test restores”.
