Supporting Evidence |
Key motivators |
Hurricane Sandy caused flooding in data centers resulting in potential loss of business data |
|
U.S. Nuclear System Runs on Early Computers and 8-Inch Floppy Disks |
|
As much as 80% of scientific data from the 1990s is irretrievable |
|
Precedent-setting Supreme Court opinions contain links to online sources that are disappearing |
|
Meet the digital historians on a mission to preserve data for future generations |
|
Future-Proofing Critical Digital Data in an Increasingly Complex Global Regulatory Environment: extract from a report undertaken by the IGI (https://iginitiative.com), supported by Preservica. Full report available here |
|
Preserving history and ensuring citizen access to digital government records using the cloud: extract from a report undertaken by the IGI (https://iginitiative.com), supported by Preservica. Full report available here |
|
A Practical Approach to Governing 170 Years of Critical Corporate Records: extract from a report undertaken by the IGI (https://iginitiative.com), supported by Preservica. Full report available here |
|
In regulated industries such as financial services, digital archiving can help firms meet specific compliance needs. MiFID II, for example, requires all firms to keep unalterable records of all electronic communications intended to result in or confirm a transaction. The unalterable, date- and time-stamped format of digital archives can also provide organisations with legally admissible records of all online activity, making disputes easier to resolve. Information provided by MirrorWeb |
|
Digital archiving allows brands and public sector organisations to capture a permanent record of web and social media content, protecting it from alteration and unauthorised use. It also ensures that content continues to deliver value long into the future. Big data techniques such as sentiment analysis, for example, could be applied to the archive to understand customer engagement and brand perception over time, and so inform future marketing strategy. Information provided by MirrorWeb |
|
2.5 quintillion bytes of data are created online every single day. To try and conceptualise that, if you laid out 2.5 quintillion one pence coins, they would cover the surface area of Earth five times over. Information provided by MirrorWeb |
|
Over 90% of all the data in the history of the world was generated in the last two years (although that window is shortening!). |
|
52% of links to web pages of government departments quoted in Hansard between 1997 and 2006 were broken by 2007 |
Corporate/Cultural Memory |
Every single minute:
Information provided by MirrorWeb |
|
95 million photos and videos are shared on Instagram every day. Information provided by MirrorWeb |
|
The average size of a web page is approximately 3MB, and the average website is about 50-60MB. The time taken to crawl a website depends on a number of factors, most notably the makeup of the URIs, i.e., how many media files, pages, images, PDFs etc. there are. The other major factor is the structure of the site in terms of links and the CMS used, as this has a significant impact given the current limitations of crawl technology such as Heritrix. Information provided by MirrorWeb |
|
MirrorWeb recently worked with The National Archives to migrate the UK Government Web Archive, including Twitter and YouTube content, to the MirrorWeb cloud platform. It took two weeks to capture and transfer 120TB of data from 72 hard drives at The National Archives to internal hard drives, before transferring and hosting that archive in the cloud. To put that in perspective, 120TB of data is five and a half times the complete film catalogue on Netflix. Information provided by MirrorWeb |
|
The average university website might be around 30-60GB, and this would take between 6 and 20 hours on average to crawl, depending on the platform and the makeup of the content and links within. Information provided by MirrorWeb |
|
With cloud storage, customers do not need to provision extra infrastructure to accommodate growth, which cuts storage overheads and lowers costs. For MirrorWeb to archive 30GB of website data would cost just £650 per year, and a social media account £300 per year. Information provided by MirrorWeb |
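
The migration, crawl and pricing figures quoted above imply some rough average rates, which can be worked out as in the sketch below. The figures themselves come from the text. Everything else is an assumption made for illustration: decimal units (1GB = 10^9 bytes, 1TB = 10^12 bytes), continuous operation over the full two weeks, and the helper name `rate_mb_per_s`. The results are indicative averages, not measured performance.

```python
def rate_mb_per_s(size_bytes: float, seconds: float) -> float:
    """Average throughput in decimal megabytes per second (illustrative helper)."""
    return size_bytes / seconds / 1e6

# Assumed decimal units; the source does not say whether GB/TB are decimal or binary.
GB = 1e9
TB = 1e12
HOUR = 3600
DAY = 24 * HOUR

# 120TB captured and transferred in two weeks (assumes round-the-clock operation).
migration_rate = rate_mb_per_s(120 * TB, 14 * DAY)

# University website crawl: 30-60GB in 6-20 hours.
# Slowest case pairs the smallest site with the longest crawl, and vice versa.
crawl_slowest = rate_mb_per_s(30 * GB, 20 * HOUR)
crawl_fastest = rate_mb_per_s(60 * GB, 6 * HOUR)

# Quoted price: £650 per year to archive 30GB of website data.
cost_per_gb_per_year = 650 / 30

print(f"Migration: about {migration_rate:.1f} MB/s on average")
print(f"Crawl: between {crawl_slowest:.2f} and {crawl_fastest:.2f} MB/s")
print(f"Archiving: about £{cost_per_gb_per_year:.2f} per GB per year")
```

The crawl range is deliberately wide: pairing the extremes of size and duration shows how strongly site structure, rather than raw volume, could drive crawl time.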