Roxana Maurer is the Digital Preservation Co-ordinator at the National Library of Luxembourg (BnL).
Yesterday evening the National Library (BnL) was supposed to host the ceremony for the Digital Preservation Awards 2020 here, in Luxembourg City. Like so many other in-person events, however, the Awards were moved online because of the pandemic. As I had the pleasure and privilege of being one of this year’s judges, I can only recommend that you read more about the finalists on the DPC’s website, because extraordinary work is happening in the field of Digital Preservation. And definitely do not forget to check out the winners at 12:00 GMT today!
Although it might be easier to focus on the negative aspects with everything that has been happening this year, I would like to use this blogpost to focus on this year’s theme for World Digital Preservation Day (WDPD20): “Digits: for Good”. There are still two months left in this year, but I’m going to take this opportunity to look at and celebrate all the things we did manage to do this year (or the things we learned from what didn’t work out), instead of looking at what we had planned or hoped to do.
Let’s start with Web Archiving. Although the BnL joined the Web Harvesting party quite late, my colleagues Ben Els, Digital Curator for the Web Archive, and Yves Maurer, Technical Manager for the Web Archive and Deputy Coordinator for the IT Department, have done a wonderful job, especially in the past year. The library’s move to its new building in October last year coincided with giving the public access to our Web Archive within the Reading Room. Activities have only intensified since the beginning of this year.
In the first two months of the year, before Luxembourg was touched by the COVID-19 pandemic, Ben’s efforts were focused on promoting BnL’s web archiving activities and raising awareness. January saw the release of the webarchive.lu site, an information platform on the archiving of the Luxembourg web and an access point for any search of websites archived by the BnL, as well as interviews in the written news media and on radio. Together with colleagues from the National Audiovisual Center (CNA), Ben and Yves also organised “Content at risk”, a conference-debate initially planned for World Digital Preservation Day 2019, about the challenges of preserving digital content on new media, weakened by the volatility of the web.
March saw a rise in COVID-19 cases in Luxembourg, and the arrival of the lockdown in mid-March meant that all the BnL teams had to redefine the way they did their work and sometimes completely change the focus of their activities. In terms of web harvesting, the BnL had an immediate and clear focus for the upcoming months, which Ben has described in detail in his IIPC blogpost “Luxembourg Web Archive – Coronavirus Response” and the WARCnet paper “Exploring special web archive collections related to COVID-19: The case of the BnL”.
On my side of the aisle, after successfully switching into production with our digital preservation system at the beginning of May (from home, during lockdown, might I add), I decided to take on the additional challenge of trying to ingest BnL’s Web Archive. DPC colleagues might remember the discussion we had with the DPC Web Archiving & Preservation Working Group at the beginning of June. A few months later, I was quite happy to share my success in seeing the first WARC files ingested into our Digital Preservation platform.
Unfortunately, the joy didn’t last long, because after only 3 weeks I had to not only stop the ingest, but also delete everything that I had ingested up to that point. If you’re interested in the details of this journey full of ups and downs, you can have a look at my short talk at WeMissiPRES: “Ingesting Web Archives into Digital Preservation systems: Necessity or heresy?”
I cannot finish this section about web archiving without mentioning the great work done by Ben and Yves for the DPC/IIPC Introduction to Web Archiving Training or for the organization of next year’s IIPC Web Archiving Conference.
“Digitization is not digital preservation” – how many times have we all heard this? Nevertheless, digitization is a very important topic in the realm of Digital Preservation at the BnL, with digitized content being one of the oldest and most important flows of data that are candidates for long-term access and discovery.
As for ingesting our digitized content, there is not much to say, other than that the work continues with our digital preservation software providers on ingesting and exporting our multi-manifestation METS/ALTO objects.
One major development I would like to mention is the release of our new IIIF-compliant viewer for digitized (METS/ALTO) content, the result of the hard work of Ralph Marschall, my colleague responsible for Digitization at the BnL.
Ralph and I were also interviewed by Tageblatt, Luxembourg's second-biggest daily newspaper, which ran a two-page article about Digitization, Digital Preservation and even Persistent Identifiers at the National Library.
The BnL offers its digitized content not only through its viewer, but also through Open Data sets (where copyright is not an issue) available on the Open Data Platform of the National Library of Luxembourg. This year saw the addition of new sets from our digitization campaigns, and additional work is in progress on documenting the new viewer’s APIs.
BnL’s digitized content is also the first data set to be used in Machine Learning projects. Pit Schneider started working at the BnL in April as a Data Scientist. His first two projects focus, respectively, on improving Optical Character Recognition (OCR) and on improving the exploration of digitized content through Named Entity visualization, map integration, Wikidata integration and timelines. These are the first BnL Artificial Intelligence projects, with hopefully many more to come.
Although it is not comparable in size with the two previously discussed data flows, no discussion of digital preservation at the BnL would be complete without mentioning the most diverse and, at times, the most difficult data flow: born-digital content (excluding websites). For the BnL that includes publications received through legal deposit, but also various other born-digital content from either the BnL or external partners. The work that goes into collecting this content and discussing access and reuse issues with copyright holders is often invisible, but in most cases essential if we are to preserve and transform these objects for long-term access and discovery. Yorick Schmit, Digital Curator and responsible for Digital Legal Deposit at the BnL, is doing a lot of this work in our project, and I couldn’t be happier working with him on all collection aspects, from ingest to access.
The last item I would like to mention here is the release of the info.persist.lu platform and of the Persistent Identifier Service for other Luxembourg institutions. Although I was already talking about the use of ARKs as persistent identifiers at the BnL in my blogpost for World Digital Preservation Day 2018, it was only this June that the release took place, after several months of teamwork with Yorick.
Looking at the list of planned projects I had prepared at the beginning of this year, I might feel a pang of regret and disappointment that many of them didn’t happen. Nevertheless, looking at all the things that my colleagues and I managed to achieve this year, despite extraordinary circumstances and, at times, additional stress, I cannot help but feel grateful. Therefore, this World Digital Preservation Day, I am raising a glass to all the great things accomplished so far and to many more to come next year!