In this issue:
- What's on, and What's new
- Editorial: Aiming for Obsolescence (William Kilbride, DPC)
- Who's who: Sixty second interview with Gareth Knight, Digital Curation Specialist, King’s College London
- One world: Max Kaiser, Bettina Kann, Sven Schlarb (Austrian National Library); Christoph Becker (Vienna University of Technology); Hannes Kulovits (Austrian State Archives)
- Featured Project: Kate Fernie (MDR Associates) tells us about the DigiCurV project
- Your view: Commentary, questions and debate from readers
Oracle Preservation and Archives Special Interest Group
4-6th April 2011
This meeting will bring together Education, Government, Healthcare and Research organisations from Europe, the Middle East and Africa interested in discussing computing solutions and best practices in the area of long-term data retention and access. This meeting will address practical architectural topics around Repositories, Data Curation, Content Management, Semantic Data Management, Tiered Storage, and Long-Term Preservation.
Preserving Digital Sound and Vision: A briefing
8th April 2011
Emerging tools and services for digital preservation are typically built around the need to preserve texts, documents, images and data sets. Audio and video – broadly defined as time-based media – have received less attention within the library and archive communities, partly because they have historically been seen as distinct, partly because they present new technical challenges, and partly because they have hitherto represented only a small proportion of the collections which memory institutions and research archives collect. However, the simplicity with which digital video and audio can be captured and the ease and popularity of online distribution mean that they are now ubiquitous, creating new concern for long term access. As more and more of our cultural and scientific legacy is being created in digital audio-visual formats, so those managing long term access to data need to understand the challenges and opportunities which these formats bring. New skills and new techniques will be required to ensure our digital audio and video memory is accessible tomorrow.
Managing and sharing social science research data
8 April 2011
This workshop, aimed at social researchers at all stages of their career, will cover key legal and ethical issues surrounding the managing and sharing of data in research with people (surveys, interviews, focus groups, observations and ethnography). The focus of the workshop will be on the provision of guidance and practical exercises, with content based on advice provided to researchers by the UK Data Archive.
AQuA workshop - solving digital content issues and automating the solutions
11-13 April 2011
Manually checking material for these kinds of problems is laborious, challenging and, most critically, expensive. Checking samples of material reduces the cost, but can let through problematic quality issues. Automated tools that can check every digital item in a precise way should allow us to reduce our costs and increase the overall quality of our digital collections. The AQuA events will provide the opportunity to get hands-on experience of developing and applying digital preservation techniques and technology to digital collections.
Joint NGLIS, GIG, CILIP conference - Opening Up Government - Do Information Professionals have the key to the door?
11 April 2011
The government library and information professional community will join forces for a one day conference which is being jointly run by the Network of Government Library and Information Professionals (NGLIS), Government Information Group (GIG) which forms part of CILIP, and the Committee of Departmental Librarians (CDL). The organizers of this joint conference would like to offer sponsorship to a newly qualified information professional or student to attend the forthcoming CDG New Professionals Conference.
Getting Started in Digital Preservation
15 April 2011
Following on from the very successful 'Decoding the Digital' conference, the British Library Preservation Advisory Centre and the Digital Preservation Coalition are delighted to invite you to the last of four events designed to raise awareness of digital preservation issues, increase involvement with digital preservation activities and sign-post the support and resources available to help you on your way. This event provides an introduction to digital preservation, builds an understanding of the risks to digital materials, includes practical sessions to help you apply digital preservation planning and tools, and features speakers sharing their own experience of putting digital preservation into practice.
UK Data Archive Seminar on Data management planning and practices for Research Centres and Programmes
4 May 2011
UK Data Archive has been running a project called JISC Data Management for ESRC Research Data-Rich Investments. This one-day seminar on data management planning and practices for research centres and programmes will bring together researchers, directors and support staff of ESRC-funded research centres and programmes with funding councils and data services. The aim is to share knowledge and good practices in research data management, and to discuss the roles and responsibilities of the various stakeholders who need to plan and implement effective data management and sharing in research centres and programmes.
DCC Research Data Management Forum (RDMF): Planning for research data management: meeting funder imperatives
5-6 May 2011
The sixth meeting of the Research Data Management Forum (RDMF6) will be hosted by the University of Leicester on 5th and 6th May 2011. The theme for the meeting will provide an opportunity to explore the principles for data management planning that are set by the major funders, what these mean in practice for researchers and institutions, as well as how plans are assessed, monitored and their agreed outputs measured.
I2S2 pre-RDMF workshop: Data infrastructure challenges: working across scale, disciplinary and institutional boundaries
5 May 2011
This half-day workshop will explore research data management challenges in a diverse range of contexts. Specifically, it will explore data integration and interoperability across differing degrees of scale, data flows between disciplines and data exchange across and within institutional boundaries.
DCC DC 101 Lite: DATUM
12 May 2011
The DCC will be running a one-day workshop as part of the JISC funded DATUM for Health: Research data management training for health studies project. DC 101 provides an introduction to data management and digital curation; the roles and responsibilities involved, and an overview of current tools that can assist with data management and curation activities.
Digital Preservation Training Programme (DPTP)
16-18 May 2011
Digital Preservation Training Programme (DPTP) is designed for all those working in institutional information management who are grappling with fundamental issues of digital preservation. It provides the skills and knowledge necessary for institutions to combine organisational and technological perspectives, and devise an appropriate response to the challenges that digital preservation needs present.
Aligning National Approaches to Digital Preservation
23-25 May 2011
Ensuring long-term access to digital resources is a task few institutions or even countries can take on by themselves. Cooperation is key to successful digital preservation: cooperation between individual institutions, sectors, and countries.
This conference intends to provide a participatory forum for information exchange and focused work on these topics for the purpose of building international collaborations to support the preservation of our collective digital memory. The outcomes for the event will be a strategic alignment of national approaches to enable new forms of international cooperation and an edited volume that documents an action plan for building collaboration among interested digital preservation initiatives.
CILIPS One Day Seminar on Shared Services
25 May 2011
This one day seminar will explore the cooperative landscape in library services.
A Roundtable Meeting on the Economic Sustainability of Digital Information (The ESDI Roundtable)
The purpose of this Roundtable meeting is to take a focused look at issues relating to economic sustainability, a topic that originates from the work of the Blue Ribbon Task Force on Sustainable Digital Preservation and Access (BRTF) (http://brtf.sdsc.edu/). Discussion at the Roundtable will doubtless reference the BRTF findings, but the purpose is also to focus on new work and to hear from an international mix of participants about the various national actions that are currently in train to ensure an economically sustainable digital future. The meeting will feature a presentation and discussion of work that is being commissioned by the Digital Curation Centre/JISC and OCLC Research to produce a reference model that will help decision-makers understand the economic context surrounding digital lifecycle management, and inform the development of viable economic sustainability strategies. The reference model will be based on the findings and recommendations of the BRTF Final Report, but aims to translate the report’s conclusions into a practically-oriented tool for economic decision-making. The goal is to produce a reference model that underpins the economic aspect of lifecycle digital planning and management, much as the OAIS Reference Model has underpinned planning and management for technical/workflow issues.
Data For Life: Digital Preservation and Health Sciences
26 May 2011
This DPC briefing day, held in conjunction with the Datum project at the University of Northumbria and sponsored by JISC, is intended to introduce key concepts of digital preservation to students and information managers working in the health and wellbeing sectors. It will provide a forum to review and debate the latest developments in the preservation of digital qualitative research data in the health field and it will initiate a discussion on how the necessary skills can most effectively be developed. Based on commentary and case studies from leaders in the field, participants will be presented with emerging tools and technologies and will be encouraged to propose and debate the future for these developments.
Digital Preservation Management: Short-Term Solutions for Long-Term Problems
5-10 June 2011
The intended audience for this five-day workshop series is managers at organizations of all kinds who are or will be responsible for managing digital content over time. The workshops were initially developed at Cornell University beginning in 2003 under the direction of Anne Kenney and Nancy McGovern.
Doctoral Consortium of the ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES
13 June 2011
The Doctoral Consortium is a workshop for Ph.D. students from all over the world who are in the early phases of their dissertation work (i.e., the consortium is not intended for those who are finished or nearly finished with their dissertation). The goal of the Doctoral Consortium is to help students with their thesis and research plans by providing feedback and general advice on using the research environment in a constructive and international atmosphere. Students will present and discuss their research in the context of a well-known and established international conference, in a supportive atmosphere with other doctoral students and a panel of established researchers.
DCC Roadshow: Glasgow
22-24 June 2011
The DCC is organising a series of inter-linked UK workshops aimed at supporting institutional data management, planning and training. The events will run over 3 days and will provide institutions with advice and guidance tailored to a range of different roles and responsibilities. The workshops are free and can be booked individually. The third DCC Roadshow is being organised in conjunction with the University of Glasgow Library and will take place from 22-24 June 2011.
DPC Directors' Group
27th June 2011
Details to follow, Oxford
Digital Forensics for Preservation: DPC Members' Briefing
28th June 2011
Details to follow, Oxford
International Curation Education Forum
29 June 2011
The event is being subsidised and led by JISC in association with: the Digital Curation Centre (DCC); the Institute of Museum and Library Services; the School of Library and Information Science, University of North Carolina at Chapel Hill; and the Department of Information Studies, University College London. The Forum will be an ideal opportunity for a number of different groups to congregate including: academics; curation training professionals; digital curators; repository managers; archivists; records managers; data managers; data librarians; publishers; commercial service providers; and students. It should be of interest to anyone who attended the DigCCurr conferences at UNC Chapel Hill (2007 & 2009) and will also build on the discussions of the IDEA (International Digital Curation Education Action) Group.
DMP-ESRC data management outputs published
The UK Data Archive's DMP-ESRC project has published a series of its outputs, including generic data management recommendations for research centres and programmes and an activity-based data management costing tool.
International Journal of Digital Curation Volume 6, Issue 1
This issue contains 16 peer-reviewed papers and 5 general articles, drawn from among those presented at the 6th International Digital Curation Conference (IDCC) and both the 2009 and 2010 International Conferences on Preservation of Digital Objects (iPres).
Helping institutions to make the right strategic decisions
As senior managers are looking for more information about finance and costs, a fresh toolkit from JISC is helping them source hard evidence to support decision making. The popular strategy infokit has been revamped to include a major new section on ‘business intelligence’ and the role it can play in enabling institutions to make informed evidence-based decisions.
UK Open Access Group statement
The UK Open Access Implementation Group calls on universities not to enter into one-to-one negotiations with publishers on self-archiving rights for their staff, and instead to rely on publicly declared rights as shown on the Sherpa-RoMEO website. The OAIG membership includes: Guild HE, Universities UK, UCL, Wellcome Trust, The University of Salford, The University of Edinburgh, SCONUL, Research Libraries UK, Research Councils UK, Public Library of Science, JISC, Association of Research Managers and Administrators.
Issue 66 of Ariadne is now available.
New and enhanced version of DROID now available
DROID 6.0 is now available from The National Archives. DROID 6.0 identifies files in a completely new way, making it more accurate than previous versions. In addition to using binary signatures, DROID 6.0 now uses container signatures for looking inside zip and OLE2 formats, for much better identification. It is also easier to update and create new signatures. DROID 6.0 is faster too, optimising recognition code so that it needs to scan a lot less of the file than previously to make an accurate identification.
Invest to save, says Dr Malcolm Read
Speaking to policy makers, funders, vendors and researchers at a workshop in Belgium run by e-Infranet, which develops policies to promote world-class ICT infrastructures, Dr Read said: “In an uncertain funding climate it’s essential that we prioritise. Cloud computing can give universities access to economies of scale which offers real financial benefits – as well as the potential to improve your carbon footprint and deal more flexibly with the changing needs of students and staff in the fast-moving university environment.” JISC is currently funding 11 pilot cloud projects together with the Engineering and Physical Sciences Research Council to look into the usefulness of the cloud for research in more detail and find out what the benefits and issues are.
A new Virtual Research Environment (VRE) tool called Ojax++, which allows scholars to get the most from popular web-based applications, was recently launched to the global e-research community. Funded by Science Foundation Ireland, under the direction of Dr Judith Wusteman at the UCD School of Information and Library Studies, Ojax++ enables researchers to use popular online tools, such as GoogleDocs, Delicious, blogging tools and Twitter, as well as more research-specific Web 2.0 tools. Ojax++ then aggregates the data from those applications so that, regardless of which web applications researchers use to conduct their research, they can organise their work and collaborate on that work in one place, using Ojax++. The tool has been made freely available to the e-research community.
Scholarly Electronic Publishing Bibliography 2010
Digital Scholarship has released the Scholarly Electronic Publishing Bibliography 2010. It covers digital copyright, digital libraries, digital preservation, digital rights management, digital repositories, economic issues, electronic books and texts, electronic serials, license agreements, metadata, publisher issues, open access, and other related topics. Most sources have been published from 1990 through 2010. Many references have links to freely available copies of included works. The Scholarly Electronic Publishing Bibliography 2010 is available as an open access PDF file and a low-cost paperback. All versions of the bibliography are available under a Creative Commons Attribution-Noncommercial 3.0 United States License.
Digital Curation and Preservation Bibliography 2010
Digital Scholarship has also released the Digital Curation and Preservation Bibliography 2010. This 80-page book presents over 500 English-language articles, books, and technical reports that are useful in understanding digital curation and preservation. This selective bibliography covers digital curation and preservation copyright issues, digital formats (e.g., data, media, and e-journals), metadata, models and policies, national and international efforts, projects and institutional implementations, research studies, services, strategies, and digital repository concerns. Most sources have been published from 2000 through 2010; however, a limited number of key sources published prior to 2000 are also included. Many references have links to freely available copies of included works.
The Digital Curation and Preservation Bibliography 2010 is available as an open access PDF file and as a low-cost paperback. All versions of the bibliography are available under a Creative Commons Attribution-Noncommercial 3.0 United States License.
Researchers want support storing research data
Researchers express a clear need for help in storing the research data they use daily. They do find it very important to be in control of what happens to their data. Researchers wish to control who has access to the data and under which conditions. These findings came from a literature review commissioned by SURFfoundation. The study shows that researchers regard preservation as a next step, outside their immediate scope of interest. Storage and preservation are distinct issues for researchers. This study collates the requirements of researchers with regard to storage of and access to research data. The research is based on fifteen recent sources, covering the Netherlands, the UK, the USA, Australia and Europe.
LoC Digital Preservation Newsletter
The March 2011 Library of Congress Digital Preservation Newsletter is now available.
RIN Information Handling Surveys
Two new RIN surveys are open for contribution. The first aims to investigate the experiences of supervisors with respect to their research students' (PhD, DPhil etc) knowledge and skills in information handling. The second looks at the experiences of UK research students (PhD, DPhil, etc but not Masters-level students) with respect to a particular aspect of research: the information handling skills and knowledge required in their subject.
JISC Inform 30 goes digital
JISC Inform 30 Spring 2011 is in a new interactive format full of video interviews, comment and advice on how digital technologies can help you address today's financial and business challenges in your college or university. We have included comment boxes so you can add your own thoughts and links to other articles creating a valuable set of resources to refer to.
The March/April 2011 issue of D-Lib Magazine is now available.
DPC funds DPTP scholarships
The DPC is pleased to announce that it will be offering five scholarships to attend the Digital Preservation Training Programme in Glasgow, 16-18th May 2011. The deadline for applications is 1200 on Friday 29th April. All DPC members are entitled to apply.
NISO and UKSG Announce 30 More Publishers Endorse KBART
The National Information Standards Organization (NISO) and UKSG are pleased to announce that another 30 publishers are now able to supply metadata that conforms to the recommended practice, KBART: Knowledge Bases And Related Tools (NISO RP-9-2010). Endorsement of this publication, which contains practical recommendations for the timely exchange of accurate metadata between content providers and knowledge base developers, indicates that the format and content of data supplied by the publisher to knowledge bases and related tools conform to the KBART recommendations. The newest endorsers are the American Psychological Association, Edinburgh University Press and the Scitation® platform, which delivers metadata on behalf of 28 society publishers.
Information Literacy State-of-the-Art Report: Colombia
A new "State of the Art" report on information literacy has been published. This is part of the series prepared for the IFLA Information Literacy section, and it is available in Spanish and in English. This one is on information literacy in Colombia and it reports on developments in libraries and universities on this subject, and publications and research that have been carried out in recent years.
Digital Preservation Policies: Guidance for archives
TNA have released a new publication to help archives develop digital preservation policies.
Editorial: Aiming for Obsolescence (William Kilbride, DPC)
Many of you will have heard me quip about the paradox at the core of the DPC’s strategic plan: it’s our mission to make ourselves obsolete. The more effective we are, the more rapidly our members will be able to develop, refine, communicate and embed digital preservation into their organisations ... and the less need there will be for our programme of ‘enabling’ and ‘agenda setting’. My obsolescence will be a measure of my success. It’s going to be tough selling that to a future employer.
Let’s clear up a few things. Firstly and perhaps most importantly, this question about digital preservation is tractable. It’s entirely fixable and we can already see in many cases how it will be fixed and how it is being fixed. If you doubt this, simply look back over the last decade of DPC Annual Reports and note how agencies have passed through various stages of confusion, reflection, assessment, development and deployment. Granted, these have not travelled in a straight line, nor at a uniform pace: research institutions and research data management seem to be way out in the lead. In short, DPC members will quickly show you what success looks like.
Secondly, digital preservation is one of the major challenges of our time, but it’s not one of the major questions. We know what to do about it and can be reasonably confident about finding answers. We may argue about the details of implementation and we can quickly descend into the obscure minutiae that my teenage cousins tell me to call ‘too much information’ (OMG - TMI). That’s emphatically different from some of the really great questions of our generation: how to manage the developing world’s reasonable expectations for economic growth without compromising the West’s desire to retain standards of living based on monopolistic consumption; how to reconcile Western ideals of individual autonomy with Eastern ideals of social and corporate action; how to maintain (or introduce?) conscience into a genuinely free market; what is justice and how might it be embedded within institutions? Such challenges have kept intellectuals awake for many years and will do so for many years to come. They are based on fundamentally incommensurate positions and their resolution as often as not boils down to questions of power. Digital preservation is simply not like this.
Finally, almost mystically, the case for digital preservation is a subset of the case for being. I’ve been told that digital preservation is irrelevant without access. I disagree but for radically different reasons than you might expect: if the best we can suggest is access then we’ll make ourselves irrelevant as well as obsolete. Time and again people ask me to help them make a case to senior managers to justify digital preservation investment within their institutions. I struggle because there is no exogenous answer. If there were it would be anodyne (cf access). The reason why we need digital preservation is so that we can get on with our real jobs. It enables smarter, safer, greener, healthier and wealthier communities. It makes for better science, government, commerce and law. It enables creativity, diversity and witness. Digital preservation is about witness, impact and legacy and it matters because the organisations we work for think witness, legacy and impact are important too. The rest is detail.
One may well ask therefore, if digital preservation is tractable and if the reason for addressing it is so compelling, why is digital preservation not more mainstream? (Put another way – if obsolescence is a measure of success for an organization like the DPC, is the fact that we’re currently recruiting for two new staff the sign of a sort of strategic failure?) I’m confident that readers will want to provide comments on this topic and they’re encouraged to do so using the comments boxes below. For now I’d posit three underlying dynamics pertaining to the technological domain that mean that the tools and skills necessary to secure the long term value of data are not yet fit for purpose.
Firstly, it’s a truism that data volumes continue to expand. The speed of data growth is normally discussed with projections of what will be in the future: and sure enough the 2010 Digital Universe Report continues to supply astounding numbers about the scale of the data problem we’re going to face. You may think we’ve got data management problems now but they predict that by 2020 the Digital Universe will have expanded to 44 times its current size. You can develop a more subtle analysis of data growth by looking backwards. At a seminar organised by the Incremental Project I was reminded that in the late 1990s the majority of archaeological research projects produced data sets smaller than 1GB (Condron et al 1999). You could fit dozens and dozens of them on the natty little memory sticks which the Incremental Project has distributed to help communicate their findings. Our processes have needed to expand to keep up with the pace of production.
Secondly, consider the growing complexity of data sets and the increasing difficulty of drawing accurate and meaningful boundaries around them. Web archiving provides a useful illustration of this. There was a time when web pages were relatively static and relatively self-contained, and users shared a simple point of view on them. I wouldn’t want to underestimate the technical challenges but the process of web archiving presented a relatively robust and easily understood workflow of harvest, storage, preservation and access. But web 2.0 means that web pages are more like live data feeds which update very frequently and with multiple personalized points of view and with greater integration of content. The humble HTML document is a much more complicated beast than it was in 1996. The same can be said of very many data types. It means our tools need to be all the more refined to manage the ever greater sophistication of the digital objects we work with.
Finally, think of the growing expectations and reliance we place on data. The increases in production and sophistication are also linked to a growing appetite and demand for data. The most dramatic expression of this in recent days has been the recognition of cybercrime as a significant threat to national security. The 2010 Strategic Defence and Security Review presents this in stark terms:
‘Over the last decade the threat to national security and prosperity from cyber attacks has increased exponentially. Over the decades ahead this trend is likely to continue to increase in scale and sophistication, with enormous implications for the nature of modern conflict. We need to be prepared as a country to meet this growing challenge’ (HMG 2010, 4)
This ‘Tier One’ risk would have been unimaginable the last time a Conservative defence minister undertook a defence review: it’s a sign of the dependence that civil society places on electronic resources. Digital preservation may only be a small part of our response to this risk but it points to the fact that the stakes in long term data management are rising. Digital preservation is no longer an interesting but marginal research activity.
There are other, strictly non-technological considerations that would keep the DPC relevant for a long time: cultural change within organisations; workforce development; legal and regulatory acuity; capacity; commercialisation. But the underlying technological drivers remain strong and mean that digital preservation is going to be dynamic and interesting for some time yet.
I’m going to need help making myself obsolete.
Condron, F, Richards, J, Robinson, D, and Wise, A (1999) Strategies for Digital Data: Findings and Recommendations from Digital Data in Archaeology: A Survey of User Needs, Archaeology Data Service, online at: http://ads.ahds.ac.uk/project/strategies/ last visited 01/04/2011
Her Majesty’s Government (2010) Securing Britain in an Age of Uncertainty: The Strategic Defence and Security Review, TSO, online at: http://www.direct.gov.uk/prod_consum_dg/groups/dg_digitalassets/@dg/@en/documents/digitalasset/dg_191634.pdf?CID=PDF&PLA=furl&CRE=sdsr last visited 1st April 2011
Who's Who: sixty second interview with Gareth Knight, Digital Curation Specialist, King’s College London
Where do you work and what's your job title?
I work for the Centre for e-Research at King’s College London as Digital Curation Specialist.
Tell us a bit about your organisation
The Centre for e-Research is a research centre located at King’s College London. It was launched in 2008, following the closure of the Arts and Humanities Data Service (AHDS) and Methods Network. We perform applied research in the areas of: sustainable e-infrastructures for research; digital libraries and digital archives including data use, creation, curation and preservation; researcher practices in the digital domain; and ICT-Methods with particular expertise in e-Science, geo-spatial and geo-temporal methods, text mining, textual analysis, and the use of grid and cloud infrastructures. Much of our work is collaborative, performed in conjunction with staff located in King’s College London, as well as external partners located in the higher education, cultural heritage and business communities in the UK, Europe and internationally.
We recently developed and launched an MA in Digital Asset Management (MADAM) Programme, with the Centre for Computing in the Humanities (CCH). The MA equips students with a range of strategic, technical and practical skills necessary to work in libraries, archives, cultural heritage institutions and business enterprises.
What projects are you working on at the moment?
I’m the Primary Investigator for the JISC-funded FIDO (Forensic Investigation of Digital Objects) project. The project is working with the Archives & Information Management (AIM) service at King’s College London to develop new practices to handle digital media. Although traditionally concerned with paper records, they are increasingly receiving information stored on removable media and computer systems. To enable archive staff to identify user-created information and transfer it into a digital archive in a documented manner, the project is evaluating the use of digital forensics techniques and trialling the use of several open source forensics tools. Once suitable tools have been identified, we will be working with archival staff to ensure that they are aware of the issues associated with digital data and are familiar with working practices for handling digital records.
FIDO is, to some extent, a follow-on to the recently concluded Preservation Exemplar at King’s (PEKin) project. In this project, we sought to identify the diverse types of digital assets being created by researchers, research groups and business units in the college, determine the current provision of data management, and implement a digital archive for use by AIM. When performing the investigation, we used elements of the Data Asset Framework (DAF) and DRAMBORA toolkits to identify data assets that may have archival value, determine the potential issues that may limit access to or use of these assets over time, and produce a set of recommendations for avoiding or mitigating these risks. We have adopted the Alfresco Document Management System (http://www.alfresco.com/), which serves as a dark archive for the curation and preservation of digital assets of short and/or long-term value to the institution.
I’m also busy on the MA in Digital Asset Management programme at the moment, teaching the Digital Preservation: Theory and Practice module, as well as the odd lesson for other modules. We’ve examined a number of topics, including trusted repositories, preservation planning, significant properties/characteristics, preservation cost models, and everything in between.
How did you end up in digital preservation?
Like most people working in the field, by accident. I started with a Bachelor's degree in Sociology and used it to examine student reaction to the adoption of Computer Aided Learning systems in the late 90s. This led to an MSc course on Computing for Business. On the basis of my knowledge of humanities research and computing, I was able to get a job at the Arts & Humanities Data Service Executive in London, contributing to the digital repository that was being developed at the time and working on various preservation-related projects. When the AHDS closed in 2008, I moved into the Centre for e-Research, and have been working on various research projects ever since.
What are the challenges of digital preservation for an organisation such as yours?
The digital preservation challenges that we encounter are similar to other institutions – ‘How do we identify and capture data that should be preserved?’ and ‘How do we improve data management practices?’
We are attempting to address these challenges through a combination of education, to ensure that data creators are aware of the value that their data may have in the long-term to themselves and others, and technology, working with research departments to embed curation tools and activities into their data processing workflow. The data assessment activity performed in the PEKin project showed that many staff members had been affected by data access issues as a result of hardware or software incompatibility, or had been delayed in their work, as a result of being unable to locate data or not having sufficient contextual information to understand content. However, many recognized that they were probably creating the same issue for others, expressing uncertainty on the retention requirements for their data, or the correct approach for organizing and storing their data. We have been working with these data creators and managers to develop documentation and training suitable to their needs.
We are also beginning to consider the impact of new technologies that are being introduced into the workplace. The popularity of portable devices, from mobile phones to tablet devices, provides new environments in which to create data. However, there is as yet little understanding of the preservation needs of data created on these systems.
What sort of partnerships would you like to develop?
We are keen to work with institutions on topics related to the management of research and academic business data, and the development of e-infrastructure that will support research.
If we could invent one tool or service that would help you, what would it be?
A content characterisation tool, similar to the PLANETS XCL Extractor/Comparator that allows comparison of large numbers of complex objects, such as videos and CAD documents, would be fantastic. The PLANETS XCL tool does a good job with images and there have been efforts to combine the disparate functionality of various metadata extraction tools in FITS. However, there are many other types of object commonly found in digital repositories that could be supported.
And if you could give people one piece of advice about digital preservation ....?
May I be greedy and suggest two things? First, as is often stated, don’t worry about preserving your data for the next 50 years. Just consider what needs to be done in the next 5 years. When those five years have elapsed, consider the next five years, and repeat as necessary.
Second, not everything needs to be stored in the long-term. Rather than keep everything, determine what you must or should retain in the long-term. Once done, consider when you should remove the data that you are unable or unwilling to retain.
If you could save for perpetuity just one digital file, what would it be?
It should be something that has had a considerable impact upon our culture and exists in a digital only form. It would have to be the first version of Tetris developed by Alexey Pajitnov in 1984.
Of course, I would hope that someone else will have saved an emulator or own one of the original machines, in order to access it.
Finally, where can we contact you or find out about your work?
In this section we invite a partner or colleague to update us about major work in their home country that will interest readers, or about major international initiatives. In this issue we talk to....
Max Kaiser, Head of Research and Development, Austrian National Library; Bettina Kann, Head of Digital Library Division, Austrian National Library; Sven Schlarb, Senior Developer, Digital Library, Austrian National Library; Christoph Becker, Senior Researcher, Vienna University of Technology; and Hannes Kulovits, Head of Digital Archive Division, Austrian State Archives
Digital preservation in Austria has been a collaborative effort from the very beginning. As early as 1999-2001, the Austrian Online Archive (AOLA), a joint project of the Austrian National Library and Vienna University of Technology, addressed the emerging challenge of archiving the web and produced one of the first prototype web archives in Europe. Ten years later, digital preservation research and practice in Austria have evolved into a compact but highly active interdisciplinary field, well connected both nationally and with the international community. The complementary mix of institutions tackling preservation challenges ranges from research to memory institutions, including the Vienna University of Technology, the Austrian Institute of Technology, the Austrian National Library and the Austrian State Archives. The domains span a spectrum from digitised cultural heritage, online publications and eGovernment records to the web and its contextualised dynamic content.
Preserving the scientific and cultural digital heritage requires integration of activities and research across institutional and disciplinary boundaries. Academic research is developing conceptual models and methods, and applied research in practical preservation tools is growing as well: Secure Business Austria, an industrial research center for IT-Security, recently created a new department for Digital Preservation Research.
Collection and preservation of publications play a significant role in safeguarding a country's cultural memory. Legal frameworks governing the deposit of publications to national libraries have recently been amended in many countries to include online publications as well. In Austria, the relevant amendment to the Media Act, enacted in March 2009, empowers the Austrian National Library to collect online publications and build up an archive of Austrian websites. The amendment regulates not only the means of collection, but also the subsequent use of the material.
As early as 2004, the Austrian government introduced a comprehensive records management system to increase the speed and transparency of administrative processes. The 2007 EU benchmark survey confirmed Austria's top position on both full online availability and sophistication, with scores of 100% and 99% respectively; the latter had improved to 100% by 2010. Consequently, all transactions from and to government departments are fully digital, producing born-digital electronic records. The Austrian State Archives have the mandate to archive these records and provide long-term access for government agencies and citizens.
To ensure the security and integrity of bit-stream storage, the Austrian State Archives and the Austrian National Library use a high-security governmental backup data center located in the Austrian Alps. While this facility guarantees almost 100% data recovery, it is the memory institutions’ responsibility to ensure authentic access and understandability of content over time.
The difficulty and complexity of creating full descriptions of digital objects, including the specification of the file formats, storage media, and viewers, were recently demonstrated with the “Planets Time Capsule” which was deposited in a Swiss mountain vault in spring 2010 in a joint action initiated by the Vienna University of Technology, the British Library, and the Austrian National Library. This event achieved extraordinary visibility in media coverage around the world. The awareness raised by such initiatives also supported the visibility of the International Conference on Preservation of Digital Objects (iPRES 2010) which was hosted by the Austrian National Library and the Vienna University of Technology in September 2010.
Activities have also been successful in raising awareness of digital preservation in emerging problem areas. In a national project, the Vienna University of Technology and the Austrian Institute of Technology collaboratively developed a practical tool named HOPPLA that integrates fully automated migration services for small businesses and home users. This is primarily aimed at providing flexibility and a low barrier to entry for small-scale scenarios in which digital preservation will increasingly be an important factor. At the other end of the wide range of scenarios, the challenges of preserving national cultural heritage require a focus on highly scalable solutions.
The Austrian National Library’s digital archive consists of born-digital material from legal deposit and files from the digitisation of the library’s analogue collections. The library has been carrying out large-scale digitisation projects for several years. The most significant initiative so far is a public-private partnership with Google in which 600,000 public domain books with a total of around 180 million pages will be digitised in the next six years. While in digitisation projects institutions have control over file formats, metadata and workflows, web archiving and government records-keeping pose much more complex challenges.
The Austrian National Library’s web archiving program (Web@rchiv Austria) started in 2008 with the primary objective of collecting and preserving a significant and representative part of the national web space. The collection strategy combines “domain harvesting” of the .at domain, “selected harvesting” of websites with frequent updates like newspapers, and “event harvesting” of websites related to significant events like elections. So far, these activities have resulted in the collection of around 500 million files from 1.1 million domains, with a total size of more than 10 TB. It is inherently impossible to exert control over the formats in which this material is produced, and the result is an extremely heterogeneous collection of material, posing challenges in long-term preservation and access.
During the last four years the Austrian State Archives have been establishing a digital long-term archive that provides access to trusted information. In addition to the technical infrastructure, different committees have been established to guarantee steady evolution of the entire system for the upcoming eight years. In a preservation committee, experts from the problem domain, research and industry advise on strategies to preserve the content in the future. The results of past DP projects as well as active participation in future DP projects are both vital for the success of this work.
Many recent advances in the field of digital preservation have been driven by international research and development projects. In this context, Austrian research and memory institutions have been contributing to EC-funded digital preservation research from the very beginning. Participation in projects like the DELOS Digital Preservation Cluster and DigitalPreservationEurope has laid foundations for upcoming initiatives. In particular, the Planets FP6 project (Preservation and Long-term Access through Networked Services), with three Austrian partners, made a significant impact in digital preservation research and practice. In order to ensure sustainability of the project outcomes and results, the Open Planets Foundation (OPF) was established in 2010 and is currently taking over maintenance and further development of key project results.
Planets results have been taken up by Austrian memory institutions: The Austrian State Archives have been integrating the Planets preservation planning tool Plato into operations for trustworthy, evidence-based decision making. A comprehensive preservation plan takes into account legal and technical constraints such as storage space, infrastructure and delivery, IPR issues, user objectives, object characteristics, and costs. The Austrian National Library used both the preservation planning tool and the Planets Testbed to evaluate data migration strategies and implemented several Planets services as prototypes. In addition, integration of the Planets services platform with third party systems like ExLibris Rosetta has been demonstrated on a typical data migration scenario.
While Planets focused on distributed services, SCAPE (Scalable Preservation Environments) takes key outcomes forward to deliver scalable control and operations of quality-assured actions through a data-centric preservation platform. SCAPE is a new European project in FP7, involving key partners of Planets, including the Austrian National Library, the Austrian Institute of Technology, and the Vienna University of Technology. SCAPE has three concrete application areas, Digital Repositories from the library community, Web Content from the web archiving community, and Research Data Sets from the scientific community, for which the project will develop scalable preservation solutions.
The insight into digital preservation problems is improving and first solution components are emerging, but clearly, a number of challenges are waiting for us. The key challenge of scalability is being addressed in SCAPE, and both the National Library and Secure Business Austria are part of the FP7 Network of Excellence APARSEN. Moreover, the Parliamentary Archives are engaging in ARCOMEM, an FP7 project targeting the increasingly challenging dynamics of the web to transform archives into “collective memories” integrated with their user communities.
New frontiers are opened up by the FP7 IP TIMBUS - Timeless Business Processes and Services - in which Secure Business Austria plays a leading role. TIMBUS uses established digital preservation knowledge to tackle emerging challenges in environments such as Supply Chain Manufacturing and Civil Engineering, where interdependent networks of services will need to be executed in a future in which technological environments may speak entirely different languages than they do now.
In this section we invite a partner or colleague to update us about a new project or feature that will interest readers. In this issue we hear from.... Kate Fernie, MDR Partners (Consulting) Limited
January saw the launch of a new project - Digital Curation Vocational Education Europe (1) or DigCurV - for which I am the project manager. The project has been funded by the European Commission’s flagship Leonardo da Vinci programme (2) to establish a framework and curriculum for vocational education and training for digital curators in libraries, archives and museums. It feels like an encouraging sign that the EC sees education and training in digital preservation as important.
In DigCurV we have a group of partners with a strong track record in digital preservation and curation, including HATII, DPC and DCC in the UK; and from Europe Goettingen State and University Library (for the nestor qualification consortium), Fondazione Rinascimento Digitale, Trinity College Dublin, and Vilnius University Library; with the iSchool at the University of Toronto and IMLS from Canada and the USA.
The project began with a lively kick-off meeting in London (3) where we discussed the status of vocational training in the member countries and topical issues such as innovation in adult education, the perspectives of both employers and staff, continuing professional development, methodologies, accreditation and certification. We also made plans for the coming thirty months of the project.
This year we will be carrying out surveys to inform the development of the curriculum. There is to be a study made of existing training opportunities (both a desk-top study and an online questionnaire to course providers (4)) and a complementary study of training needs in the sector from the perspectives of both employers and staff. The aim is to identify the key skills and competences, and training profiles. Next year will see the first draft of the curriculum and a series of focus groups and other events to gather feedback from stakeholders.
By the end of the project our aim is to launch the DigCurV curriculum as a framework from which to develop vocational training and continuing professional development.
Compiled by Kirsten Riley.
What's new is a joint publication of DPC and DCC.