Blog

Unless otherwise stated, content is shared under CC-BY-NC Licence

iPres 2019 session - NEW HORIZONS // Access & FAIR

Leontien Talboom

Last updated on 7 October 2019

Leontien is a collaborative PhD student at The National Archives, UK and University College London; her research is about access to born-digital material. She attended iPres2019 with support from the DPC's Leadership Programme, which is generously funded by DPC Supporters.


The next session that I will be covering from iPres 2019 is another New Horizons session, this time focusing on access and the FAIR data principles. The FAIR data principles are a set of guiding principles for making data findable, accessible, interoperable and reusable. If you want further information on the FAIR principles, the LIBER Europe report on FAIR is a great place to start: it gives a good overview of the basic principles and a comprehensive list of references that you can follow up.

As access is one of the main research topics in my PhD project, I really wanted to attend this session and see how other organisations and people approach this topic. Just as in the previous session that I covered, three papers were presented, with some time left for questions at the end.

iPres 2019: A roundup

John Pelan

Last updated on 4 October 2019

John Pelan is Director of the Scottish Council on Archives and he attended iPres2019 with support from the DPC's Leadership Programme which is generously funded by DPC Supporters.


I attended iPres 2019 as a representative of the Scottish Council on Archives, not as a digital preservation or records management professional.  In my pre-event blog for iPres 2019, I wrote that I hoped that the conference would improve my knowledge of digital preservation and related issues which, in turn, would help inform SCA’s programme of work.  However, I was not prepared for the incredible diversity, complexity and technicality of subjects covered.  While I did, at times, feel like a fish out of water, I did come away from the event with a better understanding of the importance and increasing urgency of managing and preserving digital material.  My highlights included the presentation on the challenges and lessons of setting up an open access repository with four universities in Palestine; the panel discussion on preserving eBooks; the three keynote speakers; and, of course, chatting to new people at the conference reception.

The World Digital Preservation Day effect

Sarah Middleton

Last updated on 3 October 2019

World Digital Preservation Day is just around the corner. November 7th is just 5 weeks away which means, for me, the countdown has really begun!

We’ve chosen a theme - ‘At-risk Digital Materials’ - to tie in with the new edition of the BitList of 'Digitally Endangered Species' we’re publishing on the day, and work on that is underway in earnest. We’ve updated logos, added MORE logos in different languages, created event packs and stickers, posted them to all corners of the globe, and invited a whole bunch of interesting bloggers to write for us on the day… so now it’s getting exciting!

iPres 2019 session - NEW HORIZONS // Web Archiving

Leontien Talboom

Last updated on 2 October 2019

Leontien is a collaborative PhD student at The National Archives, UK and University College London; her research is about access to born-digital material. She attended iPres2019 with support from the DPC's Leadership Programme, which is generously funded by DPC Supporters.


The first session that I will be covering from iPres is on Web Archiving. My own research is on access to born-digital archival material, and since websites and other web material are among the many examples of born-digital material, I couldn't miss this session! During this session three papers were presented, each with a slightly different approach to web archives.

Landing on the moon

Lizzie Richmond

Last updated on 26 September 2019

Lizzie Richmond works at the University of Bath


It has been 2 years since our last blog. We would like to be able to report giant leaps in digital preservation at the University of Bath, but the truth is there haven’t been any. That doesn’t mean there hasn’t been progress; there has. It’s just that sometimes it can feel like the small steps aren’t really moving you forward.

I saw the film ‘First Man’ recently and it reminded me (again) just how mind-blowingly amazing it is that the 1969 moon landing ever happened. So much innovation, ingenuity, perseverance and pure blind faith to arrive some place no one had ever been. So many failures, trips back to the drawing board, recalibrations and adjustments.

iPres 2019 New Horizons Panel - Sustaining Digital Preservation in the Nuclear Field

Jaana Pinnick

Last updated on 25 September 2019

Jaana Pinnick is Research Data & Digital Preservation Manager at the British Geological Survey and attended iPRES2019 with support from the DPC's Leadership Programme which is generously funded by DPC Supporters.


The full title of this New Horizons panel was 'Achieving criticality of preservation knowledge: sustaining digital preservation in the nuclear field'. As I work at the British Geological Survey and its National Geoscience Data Centre to preserve earth and geoscience data, this session was a must for me! The purpose of the panel was to give the digital preservation community at large an opportunity to share thoughts and experiences on preserving records in the nuclear sector. The classified nature of the sector's information makes it difficult to exchange data with the wider community.

I was glad to hear my fellow DPC scholarship winner Elizabeth Kata from the International Atomic Energy Agency (IAEA) and Jim Moye from J&A Preservation talk about the particular issues in very long-term preservation, but I was disappointed to hear that Jenny Mitcham from DPC was unable to join them. However, William Kilbride did his best Jenny Mitcham impression, which was much appreciated by the audience!

My armchair iPRES highlights

Jenny Mitcham

Last updated on 19 September 2019

Digital preservationists flocked to Amsterdam in huge numbers this week to attend iPRES 2019 - an international opportunity for conversations about all things digital preservation!

I was disappointed to have to cancel my own plans to attend the conference at the last minute, but undeterred, decided to engage as much as I could remotely (mostly from the comfort of my dining room...not actually an armchair). I could not miss out on potentially hearing about new theories, models, standards and examples of good practice in digital preservation.

It was great to have online access to the iPRES2019 programme and all of the papers, panel and poster abstracts, and of course to be able to follow the prolific tweeting on #ipres2019. I tried to read the conference papers ahead of time, which gave context to the deluge of tweets.

So this is not your typical conference round-up (no pictures of interesting sights and local food!). Instead, I’ve tried to pick out some of the papers that were of particular interest to me, and to encourage you (whether you were there or not) to dive in and have a look.

Introducing the DPC RAM

Jenny Mitcham

Last updated on 20 September 2019

“If you can’t measure it, you can’t control it.”

Martin Robb, National Programme Manager, NDA

I’ve heard this phrase several times since starting work on a digital preservation project with the Nuclear Decommissioning Authority here in the UK. Colleagues at the NDA were very keen that, as part of our two-year project with them, we found an appropriate way of measuring where they are now in their digital preservation journey and establishing a clear direction of travel.

Maturity modelling was the obvious answer.

As mentioned in a previous blog post, we didn’t want to re-invent the wheel, so we did some research, looking at the digital preservation maturity models that were available and hoping to find one that was suitable to use in the context of the NDA.

Integrated Preservation Suite (IPS): a scalable preservation planning toolset for diverse digital collections

Peter May

Last updated on 16 September 2019

Peter May is the British Library’s Digital Preservation Technical Architect


Preservation planning is a long-established function in digital preservation. Its purpose is to ensure that digital content can move forwards through time for future users without suffering unacceptable loss, either to intellectual content or functionality. Many different activities support preservation planning, and at the British Library these have included collection profiling, format sustainability assessments, defining digital preservation policy, content sampling, and preservation risk modelling. These activities have led to an excellent understanding of what is needed to preserve our digital content and the risks that are likely to manifest.

Missing from this picture, however, was the ability for us to put this knowledge into practice in an automated manner so that technical risks can be effectively and efficiently mitigated, at scale, and across all the collections. Our approach, formalised in our Integrated Preservation Suite (IPS) project, is our developing solution to this challenge.

How to correctly identify the file type of a text file from its contents?

Santhilata Kuppili Venkata

Last updated on 13 September 2019

Dr Santhilata Kuppili Venkata is Digital Preservation Specialist / Researcher at The National Archives, UK


Plain text file format identification is of interest in the digital preservation field. At The National Archives (TNA), we have begun research with text file format identification as the main topic. Our research question is: 'How to correctly identify the file type of a text file from its contents?'

The motivation for this research and the dataset used are discussed in part 1, published earlier. In this part, we present the methodology, which treats text file format identification as a classification problem. For now, we consider the classification of five formats: two programming source code types (Python and Java), two data file types (.csv and .tsv) and one text file type (.txt).

Methodology - ML to the rescue

Artificial Intelligence (AI) and Machine Learning (ML) have become an integral part of our lives. Machine learning is a set of generic algorithms that can extract patterns from data and use them to predict an outcome. TNA deals with a huge variety of file types for digital archiving, so an iterative process model that includes file types gradually is appropriate. The methodology should be flexible enough to apply to more file types progressively. Whenever a new file type is to be included, its features (specific characteristics) should be compared against the existing features and engineered for addition to the list. Hyperparameters for the models should then be adjusted to get better performance. The flow graph in Figure 1 shows the methodology developed.
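To make the idea of treating format identification as a classification problem more concrete, here is a minimal sketch in Python. It is not TNA's actual pipeline: the features in `extract_features`, the toy samples in `TRAINING` and the nearest-centroid classifier are all assumptions chosen for illustration, since the post does not list the real features or models.

```python
import math
from collections import defaultdict


def extract_features(text):
    """Turn raw file contents into a small numeric feature vector.

    These six features are hypothetical, hand-picked examples.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()] or [""]
    n = len(lines)
    return [
        sum(ln.count(",") for ln in lines) / n,   # mean commas per line
        sum(ln.count("\t") for ln in lines) / n,  # mean tabs per line
        # share of lines ending like Java statements or blocks
        sum(ln.rstrip().endswith((";", "{", "}")) for ln in lines) / n,
        # share of lines ending like Python block openers
        sum(ln.rstrip().endswith(":") for ln in lines) / n,
        sum(text.count(k) for k in ("def ", "import ", "self.")) / n,
        sum(text.count(k) for k in ("public ", "void ", "new ")) / n,
    ]


def train_centroids(samples):
    """Average the feature vectors of each label: label -> centroid."""
    grouped = defaultdict(list)
    for label, text in samples:
        grouped[label].append(extract_features(text))
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def classify(text, centroids):
    """Return the label whose centroid is closest (Euclidean distance)."""
    vec = extract_features(text)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


# Toy training data, one sample per format (illustrative only).
TRAINING = [
    ("python", "import os\n\ndef main():\n    print(os.getcwd())\n"),
    ("java", "public class A {\n    void run() {\n        int x = 1;\n    }\n}\n"),
    ("csv", "id,name,age\n1,Ann,30\n2,Bob,25\n"),
    ("tsv", "id\tname\tage\n1\tAnn\t30\n2\tBob\t25\n"),
    ("txt", "This is a plain note.\nIt has ordinary sentences only.\n"),
]

centroids = train_centroids(TRAINING)
print(classify("a,b,c\n4,5,6\n", centroids))  # csv
```

Adding a new file type in this sketch means adding labelled samples and, where the existing features cannot separate it from the others, engineering an extra feature in `extract_features`, which mirrors the iterative process described above. A real system would need far more training data and a properly evaluated model.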
