Elvis Valdes Ramirez, UN International Residual Mechanism for Criminal Tribunals (IRMCT)
Background
The International Residual Mechanism for Criminal Tribunals (“the Mechanism”) is the successor to the International Criminal Tribunal for the former Yugoslavia (“ICTY”) and the International Criminal Tribunal for Rwanda (“ICTR”), which have, over the last two decades, accumulated large quantities of digital records. The Mechanism is mandated, under Article 27 of its statute, to manage the archives of the ICTR, the ICTY, and the Mechanism itself, including preserving them and providing access to them. The digital component of the archives is estimated at approximately 3 petabytes. It comprises all types of born-digital and digitized material in a variety of formats, originating from network shared drives, business systems, Electronic Documents and Records Management Systems (“EDRMS”), email systems, websites and a selection of bespoke systems developed in-house. The Mechanism has implemented an EDRMS (currently HP Records Manager) for managing and accessing records, and a Digital Preservation System (“DPS”), Preservica, for preserving digital material. The Mechanism’s implementations of the EDRMS and the DPS adhere to policies and guidelines established by the United Nations Archives and Records Management Section (ARMS) and follow international good practice and standards.
The challenge
The main challenge was to find a solution for packaging and structuring metadata records and their related objects (files) after they are exported from the EDRMS and before they are ingested into the DPS. All of this had to be done in a manner that conforms to United Nations policies and international standards for good practice.
The following list highlights some of the agreed prerequisites, constraints and assumptions for the solution:
- Records exported from the EDRMS consist of metadata and objects (files).
- No comparable technical implementations by other organizations for transferring records from an EDRMS to a DPS are available for reuse.
- The export functions and capabilities of the Mechanism’s EDRMS have been properly assessed.
- The metadata of each record exported from the EDRMS must be packaged and formatted in XML, using a metadata standard approved by the Mechanism. A proprietary metadata schema must be built to accommodate information that cannot be mapped using existing metadata standards.
- Ingested records must have a pre-defined minimum set of metadata for digital preservation.
- No resources (technical or human) are available to develop a programming interface using the APIs and SDKs available in both systems.
- The DPS’s technical capabilities for creating Submission Information Packages (SIPs) and ingesting records have been properly assessed and are well understood.
- Robust integrity checks and access controls must be applied throughout the process.
- The solution must be implemented with existing resources and must be endorsed and approved by management.
Case study
To address the challenge, the organization’s Digital Archivists first articulated the business requirements. Two options were initially assessed: wrapping the selected metadata in METS files, or creating BagIt bags, and then ingesting those as SIPs into the DPS. After testing, both options were discarded in favour of a tool supplied with the DPS that creates SIPs including descriptive metadata. Other metadata (structural, administrative, preservation and technical) are added when the Archival Information Packages (AIPs) are created. A decision was subsequently taken to develop an application to automate the packaging and structuring of metadata and associated objects for ingest. The application must take records exported from the EDRMS as input and save the descriptive metadata files and related objects (files) in the predefined structure that the SIP Creator tool supplied with the DPS requires for creating SIPs.
Main steps in the application’s workflow
1. The application is launched.
2. A user enters the following parameters:
- Location of the file exported from the EDRMS
- A delimiter character, if a delimiter-separated values file is supplied
- The style sheet to use (the style sheets are based on a metadata schema, e.g. MODS)
- Additional information such as a prefix for output file names, the output files’ extension, etc.
- Output folder where the files and their metadata will be saved
3. A user starts the process.
4. The application packages the records (metadata) exported from the EDRMS, in the format of the selected metadata standard, together with their related objects (files), and creates the predefined structure in the selected output location. The saved output is then used as input by the SIP Creator tool of the DPS.
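The workflow above can be illustrated with a minimal Python sketch. It is not the Mechanism's application: the column names (`Record Number`, `Title`, `Date Created`, `File Path`), the element names, and the one-folder-per-record layout are all hypothetical, and a direct column-to-element mapping stands in for the XSLT style sheets described in the text. It reads a delimiter-separated export, checks that the mapped columns are present, writes one XML metadata file per record, and copies the related object alongside it.

```python
import csv
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical mapping of export columns to metadata element names.
# In the application described above, this role is played by XSLT style sheets.
COLUMN_TO_ELEMENT = {"Title": "title", "Date Created": "dateCreated"}
ID_COLUMN = "Record Number"  # hypothetical column identifying each record
FILE_COLUMN = "File Path"    # hypothetical column locating the exported object


def package_export(export_file, output_dir, delimiter=",", prefix="", extension=".xml"):
    """Write one metadata file per record and copy its object next to it."""
    output_dir = Path(output_dir)
    written = []
    with open(export_file, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        # Validate that every mapped column exists in the export header.
        required = set(COLUMN_TO_ELEMENT) | {ID_COLUMN, FILE_COLUMN}
        missing = required - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"export is missing columns: {sorted(missing)}")
        for row in reader:
            # Assumed layout: one folder per record under the output folder.
            record_dir = output_dir / row[ID_COLUMN]
            record_dir.mkdir(parents=True, exist_ok=True)
            root = ET.Element("record")
            for column, element in COLUMN_TO_ELEMENT.items():
                ET.SubElement(root, element).text = row[column]
            metadata_path = record_dir / f"{prefix}{row[ID_COLUMN]}{extension}"
            ET.ElementTree(root).write(
                str(metadata_path), encoding="utf-8", xml_declaration=True
            )
            shutil.copy2(row[FILE_COLUMN], record_dir)
            written.append(metadata_path)
    return written
```

A call such as `package_export("export.tsv", "sip_input", delimiter="\t", prefix="meta_")` mirrors the parameters a user enters in step 2; the file and folder names here are placeholders.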
The following were the key requirements/specifications of the application:
- Metadata exported from the EDRMS and used as input by the application must be in a delimiter-separated values file (comma, tab, etc.) or in XML format.
- Columns in the files exported from the EDRMS must contain metadata information and, optionally, some configuration information used by the application.
- Metadata created for each record must be based on an XML schema (an international metadata standard or a bespoke schema).
- A separate style sheet (XSLT) must be created for each metadata standard used in the application, mapping the columns of the delimiter-separated values file exported from the EDRMS to the corresponding schema elements.
- Style sheets used to create metadata files must be regularly validated against their corresponding schemas by the application.
- The mapping of metadata columns in the delimiter-separated values file exported from the EDRMS to schema elements must be validated by the application.
- A checksum must be calculated for objects (files) when they are moved or copied to the output location during the process, using an established algorithm (MD5, SHA-1, etc.).
- The application must save the objects (files) and their related descriptive metadata files in a predefined hierarchical structure.
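The checksum requirement in the list above can be sketched as follows. This is an illustration under stated assumptions, not the application's actual code: the function names are hypothetical, and SHA-256 is used as the default purely for the example (any algorithm name accepted by Python's `hashlib.new`, such as `"md5"` or `"sha1"`, can be passed instead). The checksum is computed before and after the copy, and the copy is rejected if the two values differ.

```python
import hashlib
import shutil
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_with_verification(source, destination_dir, algorithm="sha256"):
    """Copy a file and confirm that the copy has the same checksum as the source."""
    source = Path(source)
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    before = file_checksum(source, algorithm)
    target = Path(shutil.copy2(source, destination_dir))
    after = file_checksum(target, algorithm)
    if before != after:
        raise OSError(f"checksum mismatch after copying {source} to {target}")
    # The verified checksum is returned so it can be recorded with the object.
    return target, after
```

Returning the digest together with the target path lets the caller store it alongside the descriptive metadata, so that later integrity checks can compare against the same value.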