Sally DeBauche is Digital Archivist at Stanford University Libraries in the USA
Email offers singular insight into, and evidence of, a person’s or organization’s self-expression, as well as records of collaboration; professional, social, and familial networks; and all manner of transactions. Email is an essential component of the archival record: the modern equivalent of the typed or handwritten correspondence of past centuries. However, email is also a complex format that poses many technical challenges to archivists working to preserve and provide access to it. In their 2018 report, the Task Force on Technical Approaches for Email Archives described email as “…not one thing, but a complicated interaction of technical subsystems for composition, transport, viewing, and storage.”
Compounding this complexity, email is not a constant or consistent format. Because email technology has evolved and email clients have fallen in and out of use, archivists working with historical email collections will encounter a wide variety of email file types. As a result, even the most essential tasks of capturing, processing, preserving, and providing access to email pose a host of technical obstacles for cultural heritage institutions.
While there are many challenges in preserving email, there is also a growing community of archivists and technical specialists working to meet them. Harvard University is currently redeveloping its EAS email preservation tool as open source software so that other institutions can use it in their processing workflows. The RATOM project is developing software based on natural language processing, digital forensics, and machine learning for the appraisal and processing of email archives. These projects, among other efforts throughout the community, demonstrate that the importance of preserving email is well recognized by archivists, institutional administrators, funding agencies, and foundations alike.
Funding plays a significant role in the advancement of email preservation. Many projects that focus on email preservation, including the RATOM project and the ePADD project, are partially or entirely grant funded. Indeed, the Task Force on Technical Approaches for Email Archives report identified securing additional funding as a prerequisite for the further development of almost all of the research projects that it profiled.
Interoperability between email processing tools is another critical step toward ensuring that practitioners can work with the wide variety of email archives in their institutions’ holdings and move seamlessly through each stage of the processing workflow. An interoperable toolset will lower the technical barrier to use, allowing more institutions to work with email.
The ePADD project team was honored by the Digital Preservation Coalition’s recognition of our efforts to promote curation of and access to historical email collections. Through the platform provided by the DPC, the ePADD team hopes to contribute to and grow the discussion of email preservation and access. Only through greater research and experimentation and, critically, the publication of that research can our professional community begin to systematically preserve this vital piece of the historical record.