Claire Newing is a Web Archivist at The National Archives, UK.
I’m really excited to be writing about how The National Archives (UK) has improved our social media archive. Different types of social media content are listed as either ‘Endangered’ or ‘Critically Endangered’ on the DPC ‘Bit List’ of Digitally Endangered Species so it seems like an appropriate subject for World Digital Preservation Day 2020.
We originally launched our social media archive in 2014 following a project we worked on with our former technical supplier, Internet Memory Research (IMR), to develop a method of capturing YouTube and Twitter content. My colleague, Tom Storrar, wrote about the project in detail in this post on The National Archives Blog: https://blog.nationalarchives.gov.uk/archiving-social-media/.
Caption: UK Government Web Archive YouTube Archive access pages as they appeared in June 2014 - https://webarchive.nationalarchives.gov.uk/20140603161203/http://www.nationalarchives.gov.uk/webarchive/videos.htm
Capturing the YouTube and Twitter channels of UK ministerial departments became part of our business as usual work but we always knew that we wanted to do more. When we entered into a contract with new suppliers, MirrorWeb, in 2017, we were able to expand the programme to begin capturing the channels of a wider range of UK central government organisations.
The method of capture remains broadly the same: MirrorWeb capture posts directly from the APIs provided by the platforms and make them available to users through custom interfaces. However, some changes made to the backend process meant that we were also able to increase the frequency at which we captured those channels. Whereas IMR captured all the available contents of each channel in full around twice each year, MirrorWeb check the channels and capture any newly added content daily. Finally, the custom front end of the service was redesigned and re-launched alongside other UK Government Web Archive pages.
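To illustrate the difference between a full capture and a daily check for newly added content, here is a minimal Python sketch. It is not MirrorWeb's code and makes no calls to any platform API: the `Post` record and the `select_new_posts` and `daily_capture` functions are hypothetical names invented for this example, and the archive is assumed to be a simple dictionary keyed by post ID.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Post:
    """Hypothetical record for one post returned by a platform API."""
    post_id: str
    channel: str
    published: date
    text: str


def select_new_posts(fetched: Iterable[Post], archived_ids: Iterable[str]) -> List[Post]:
    """Return the fetched posts that are not yet archived, oldest first."""
    known = set(archived_ids)
    new_posts = [p for p in fetched if p.post_id not in known]
    return sorted(new_posts, key=lambda p: p.published)


def daily_capture(fetched: Iterable[Post], archive: Dict[str, Post]) -> int:
    """Add newly seen posts to the archive and return how many were added."""
    new_posts = select_new_posts(fetched, archive.keys())
    for post in new_posts:
        archive[post.post_id] = post
    return len(new_posts)
```

Run once a day against whatever a channel currently exposes, a check like this only stores posts it has not seen before, whereas a full capture would fetch and store everything on each run.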
Shortly after we re-launched the web archive we undertook some user research which revealed that web archive users were generally unaware of the social media archive. When we showed it to them they thought it was interesting but felt it was not very useful without a search function.
Last year, we launched a project with MirrorWeb to improve the social media archive, guided by the results of the user research. In November 2019 we were proud to re-launch the archive with the following improvements.
We were able to start capturing UK Government Flickr channels for the first time. This was particularly important at the time as Flickr had announced that free account users would only be able to store 1000 photos. If an account held more than 1000 images, the additional images would start to be deleted from 12 March 2019. We identified that several accounts in scope for our programme were at risk of losing images so we were very pleased to be able to capture the channels in full before the deletion took place.
Flickr posts are captured directly from the API and made available through a custom interface in a similar way to YouTube and Twitter content. By capturing directly from the API we were able to capture all the posts which existed on the channel at the time of capture. Some government organisations started using Flickr over a decade ago so we were able to capture a great deal of older content, including some posted by organisations such as the Department for Innovation, Universities and Skills (DIUS) which closed some years ago.
Image of former Prime Minister Gordon Brown archived from the Department for Innovation, Universities and Skills (DIUS) Flickr channel in the UK Government Social Media Archive - https://webarchive.nationalarchives.gov.uk/flickr/diusgovuk/3487816057
We also further increased the number of channels we regularly capture. We now capture the channels of the full range of organisations in scope for our web archiving programme, including temporary bodies such as public inquiries, those set up to support specific campaigns, such as #knifefree, and those of NHS bodies with a national focus. We now capture over 300 Twitter channels, over 200 YouTube channels and almost 50 Flickr channels.
Access page for the archived #knifefree YouTube channel in the UK Social Media Archive - https://webarchive.nationalarchives.gov.uk/video/UCjfhjCG1BlOFi-bkYwDMYJg
As part of the improvement project we also undertook a complete re-design of our social media archive access pages. Our platform access pages were adapted to better display the increased number of channels. A filter has been added to the top of each page to help users find channels quickly.
YouTube platform access page in the UK Government Social Media Archive filtered to display channels with a name including the word ‘Research’ - https://webarchive.nationalarchives.gov.uk/video/
Channel access pages now include a search function to enable users to search by keyword(s) within the channel and a date range filter.
Finally, we introduced a full text search function (https://webarchive.nationalarchives.gov.uk/social/search/) which enables users to search by keyword(s) across the content of the social media archive. It searches the full text of archived Tweets and the titles and descriptions of archived YouTube videos and Flickr multimedia posts. Search results can be filtered by platform, filter and channel and keyword(s) can be included or excluded. We believe this is a world first and are very excited by how it has improved the findability of content in the archive and by the new opportunities it offers our users.
UK Government Social Media Archive full text search facility showing results of a search for “Mo Farah” - https://webarchive.nationalarchives.gov.uk/social/search/
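For readers who like to see the idea in code, the sketch below shows one way keyword search with included and excluded terms, platform and channel filters and a date range could work. It is only an illustration under stated assumptions and not how the archive's search is implemented: `ArchivedPost` and `search` are hypothetical names, matching is a plain case-insensitive substring test, and the posts are scanned one by one from an in-memory list.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Sequence


@dataclass(frozen=True)
class ArchivedPost:
    """Hypothetical record for one archived item."""
    platform: str    # for example "twitter", "youtube" or "flickr"
    channel: str
    published: date
    text: str        # Tweet text, or a title plus description


def search(
    posts: Iterable[ArchivedPost],
    include: Sequence[str] = (),
    exclude: Sequence[str] = (),
    platform: Optional[str] = None,
    channel: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> List[ArchivedPost]:
    """Return posts that pass every filter, in their original order."""
    results = []
    for post in posts:
        text = post.text.lower()
        if platform is not None and post.platform != platform:
            continue
        if channel is not None and post.channel != channel:
            continue
        if start is not None and post.published < start:
            continue
        if end is not None and post.published > end:
            continue
        # Every included keyword must appear in the text.
        if not all(word.lower() in text for word in include):
            continue
        # No excluded keyword may appear in the text.
        if any(word.lower() in text for word in exclude):
            continue
        results.append(post)
    return results
```

A call such as `search(posts, include=["gladstone"], platform="youtube")` would return only YouTube items whose title or description mentions the keyword.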
We have many plans to further improve the archive in future. They include researching methods of capturing other social media platforms such as Instagram and LinkedIn; working to introduce an integrated full text search function which will search across the UK Government Web Archive and UK Social Media Archive; and undertaking a further user research project to guide future developments.
Please try out the new and improved archive and send any feedback to: webarchive@nationalarchives.gov.uk.
Finally, we all know that the internet, including social media, is really all about the cats, and the UK Government Social Media Archive is no exception. So I’ll leave you with a video archived from the HM Treasury YouTube channel of Gladstone, the Treasury cat, advertising London Open House.
Video entitled ‘Where’s Gladstone gone?’ archived from the HM Treasury YouTube channel held in the UK Government Social Media Archive: https://webarchive.nationalarchives.gov.uk/video/hmtreasuryuk/1qBHH2NGII4