Natalie Vielfaure is the Digital Curation Archivist for the Research Services and Digital Strategies unit at the University of Manitoba Libraries in Winnipeg, Manitoba, Canada.
At the start of the COVID-19 pandemic, the University of Manitoba Libraries and the University of Winnipeg Archives launched a coordinated effort to archive websites documenting the COVID-19 experience in Manitoba, Canada. This post provides reflections on this experience and the challenges of documenting ongoing social phenomena.
With the global proliferation of COVID-19, many of us working in the GLAM sector have felt the pressures of capturing this unprecedented experience for posterity. Cautionary tales of the ‘forgotten pandemic’ of 1918 stress the importance of documenting the COVID-19 experience through the acquisition of diaries, photographs, web archives, and other documentary heritage. Like many institutions, the University of Manitoba Libraries and the University of Winnipeg Archives undertook a coordinated effort to capture this historical event by crawling websites related to our regional pandemic experience.
The ongoing endeavour led us to reflect on some of the challenges of documenting ongoing social events in perpetuity. The COVID-19 pandemic has no definitive end date. While it is a unique experience, similar challenges present themselves in collections documenting other ongoing social phenomena, such as truth and reconciliation efforts in Canada, and systemic racism. In such cases, the greatest challenge is determining how to effectively and realistically capture a representation of these events without exhausting institutional resources. Like all digital preservation activities, web archiving is constrained by finite resources such as limited storage space and staff time. Consequently, defining the scope of such activities is an important step in ensuring that resources are effectively used. But how do you scope something that is not entirely ‘scope-able’?
Though capturing Manitoba’s COVID-19 experience is already limited by geographic boundaries and subject matter, setting parameters around time is more complex. Both the event and the recovery period that will follow are important to document, but neither has a timeline that can be predicted. Theoretically, one could set an arbitrary end date, where the goal might be to capture the first year of the pandemic, for example. Alternatively, storage space might be capped, where the goal might be to capture 10 GB of content annually over a three-year period. However, while these methods might make the work more achievable and sustainable, with no predictable timeline, neither guarantees that the resulting collection will be truly representative of the overall experience.
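The two scoping options described above, an arbitrary end date and an annual storage cap, could be expressed as a simple check run before each crawl. This is only an illustrative sketch: the function name, the end date, and the usage figures are hypothetical and do not describe either institution's actual workflow. Only the 10 GB annual cap comes from the example above.

```python
from datetime import date

GB = 1024 ** 3  # bytes per gigabyte (binary convention; an assumption)


def crawl_allowed(today, end_date, used_bytes_this_year, annual_cap_bytes=10 * GB):
    """Return True if a new crawl fits both the time window and the storage cap.

    Hypothetical helper: both limits are arbitrary scoping decisions,
    not properties of the event being documented.
    """
    if today > end_date:
        return False  # past the arbitrary end date
    return used_bytes_this_year < annual_cap_bytes  # room left in this year's budget


# Hypothetical values for illustration only.
end = date(2021, 3, 31)
print(crawl_allowed(date(2020, 9, 1), end, 4 * GB))
```

Either limit keeps the project sustainable, but neither says anything about whether what was captured before the cutoff is representative.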
Capturing something that has become such a ubiquitous part of our daily lives presents another hurdle. Typically, scoping rules may help automate the capture of content that includes keywords such as “COVID-19”, “Coronavirus”, or “pandemic”. But what happens when the event you’re documenting becomes so pervasive that it no longer needs to be named? Several months into the pandemic, many webpages no longer use the term “pandemic” at all, nor do they refer to the virus by name. Its newfound omnipresence means that when we read about travel restrictions, schools, masks, or closures, among other subjects, we know we’re reading about COVID-19, even when it’s not explicitly called out. Consequently, even tasks that can be automated still require a heavy investment of staff resources to ensure continued relevance.
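The limitation of keyword-based scoping can be illustrated with a small sketch. The keyword list comes from the examples above; the function name and the two sample sentences are invented for illustration, and real crawlers apply scoping rules in their own ways, so this is not a depiction of any particular tool.

```python
import re

# Keywords taken from the examples in the text; matching is case-insensitive.
KEYWORDS = ("covid-19", "coronavirus", "pandemic")
KEYWORD_PATTERN = re.compile(
    "|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE
)


def in_scope(page_text):
    """Return True if the page text mentions any scoping keyword (hypothetical rule)."""
    return KEYWORD_PATTERN.search(page_text) is not None


# Invented sample sentences, not real captured content.
named = "Manitoba announces new COVID-19 public health orders."
unnamed = "Schools will require masks when classes resume; travel restrictions remain."

print(in_scope(named))    # True
print(in_scope(unnamed))  # False
```

The second sentence is plainly about the pandemic to a human reader, yet a keyword rule would pass over it. Broadening the list to terms like “masks” or “closures” would pull in unrelated material, which is why a person still has to review what is in scope.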
In fact, human intervention seems to be the only way to come close to surmounting these obstacles. While limited staff resources present challenges here as well, being selective and carefully appraising potential content is crucial to stretching those finite resources as far as possible and slowing the pace at which storage space is consumed. Periodically reviewing content to ensure that key elements of the pandemic (regional data, lived experience, decision-making and responses by government and health officials, economic impact, and so on) are captured is also crucial to ensuring that the resulting collection is cohesive and representative. Sharing resources through cross-institutional collaborations and looking for funding opportunities also support a more equitable distribution of human, financial, and material resources.
These may not be novel solutions, but they are important considerations. Even with the best-laid plans, our goalposts are, at best, set in ever-shifting terrain as we undertake such projects. With no clear and foreseeable end point, some institutions undertaking this work may very well exhaust their resources before we see the end of the pandemic. We may be required to concede that good is good enough and prematurely bring a project to a close even where there is still much left to capture. Though the way forward is unclear, this work is indisputably important and relies on the resiliency and input of the GLAM community. A comprehensive and holistic experience of the pandemic may never be fully captured, but by engaging in partnerships and collaboration we may, at the very least, be able to grant future generations of researchers access to a representative snapshot of these unprecedented times, and ensure that this pandemic is not forgotten.