Jenny Mitcham

Last updated on 3 November 2022

On this World Digital Preservation Day of 2022 I wanted to celebrate file format identification. It is a topic that is central to digital preservation ingest workflows and close to many of our hearts. We are so lucky within this community to have a number of different tools that can carry out and automate file format identification at scale, and of course the amazing resource that is PRONOM that underpins so many of those tools.

Leading up to World Digital Preservation Day I started thinking about what we would do if we didn't have those tools, and also what happens when we have to investigate and interpret each file individually. These thoughts were also triggered by a tweet from Helen Dafter of the Postal Museum about a double file extension that she had recently come across. It was great to see the file format enthusiasts of the digital preservation community piling in to try and help solve the mystery...!

So what would happen if we had to identify each file format individually...?

Firstly I thought it would be quite hard...and secondly I thought it could play out a little bit like a 'knock knock' joke...

all jokes

Helen's digital preservation dilemma proved too difficult to illustrate...but perhaps this demonstrates that we can't always identify everything... 

knock knock pdf xml

Thankfully, file identification normally isn't this hard...thanks to all of those people and organizations who put time and effort into making it easier for all of us.

However, we will always have occasional struggles with specific files that can't be automatically identified, and it is good to know that when we do, the wonderful digital preservation community will rise to the challenge and try to help us solve them!

Comments

Helen Dafter
2 years ago
So much value in the community - helping me realise I hadn't missed the obvious, giving me confidence to try different tools and approach the problem from a different angle, and signposting me to things to try. While I didn't fully resolve the issue I came away with confidence that I had carried out sufficient due diligence to go back to the depositor. Always appreciated!