Web Crawl

To read more about Web Crawling, see DPC Technology Watch Report:

Monitoring Change in Web and Social Media Content: an Urban Big Data Centre Methods Seminar

Invitation to DPC Members to attend an Urban Big Data Centre (UBDC) Methods Seminar co-hosted by DPC, with talks by Raymond Cha of the Environmental Data & Governance Initiative (EDGI) on "Monitoring US Federal Websites on Climate Change, Energy, and the Environment" and Shawn Walker of Arizona State University on "The Ephemerality of Social Media: How Social Media Changes Over Time and Its Impact on Our Research". Raymond Cha of the Environmental Data & Governance Initiative (EDGI)...

Web Archiving & Preservation Task Force Reconvened

This will be the first reconvened meeting of the Web Archiving & Preservation Task Force, where participants will discuss and agree on the new Terms of Reference and commence discussions on the topics important to their institutions, their collections, and their users. The Task Force was first formed in 2010 in an effort to coordinate national web archiving programmes. In recent years, however, new developments in web archiving have emerged and many more organisations have turned...

Netherlands Institute for Sound and Vision seeks server-side web archiving case studies

The Netherlands Institute for Sound and Vision has begun investigating strategies for the preservation of complex dynamic websites that cannot be captured with current web crawling tools. The team would like to explore whether server-side web archiving could be a promising strategy to deal with this task. However, this is unexplored territory and they are keen to learn how other institutions would approach similar challenges. 

Preserving digital cultural heritage: Better together!

Barbara Signori is Head of e-Helvetica at the Swiss National Library, Bern. The Swiss National Library has a mandate to collect, catalogue, store and disseminate the cultural heritage created in Switzerland and abroad by and about the Swiss. This sounds like a clear enough mission, but dig deeper and this mandate raises all sorts of tough questions, especially in a digital world. First of all, what is digital cultural heritage? Obviously it goes far beyond e-books and...

Web preservation demands access

Daniel Gomes is Arquivo.pt Service Manager for the Foundation for Science and Technology in Portugal. "Collect the web to preserve it?! I don't envy that job." That is a direct quote from my first "real-world" meeting. I was 23 years old, I had just graduated from the University and that was my first job. We were in the year 2000. One year later, we had developed a running prototype to perform selective collection of online publications. It was the first effort to preserve the...

Two early episodes on digital preservation… plus one!

José Borbinha works at INESC-ID – Instituto Superior Técnico (IST) at Lisbon University, Portugal. (Episode 1) When unsuccessful digital preservation can be convenient. The year 1998 was special. In May, the Lisbon World Exposition opened! In June, the “Sixth DELOS Workshop on Preservation of Digital Information” was held in beautiful Tomar. Finally, in October, I became CIO of the National Library of Portugal. In retrospect, 1998 was my definitive commitment to this...

Web Archive of the Peace Process and Post-Conflict (Archivo web de Proceso de Paz y Posconflicto)

Johanna Gallego Gutiérrez is Digital Deposit Manager for Biblioteca Nacional de Colombia, in Bogotá. The Biblioteca Nacional de Colombia has begun building an archive of the web and of static digital resources on the peace process and post-conflict period in Colombia. This initiative aims to collect, safeguard, preserve and disseminate, for present and future generations, the web history of the important moment our country is living through, by means of the...

Lossy Accelerant: Surfeit and Fragment in Digital Collections Archives

Jefferson Bailey is Director of Web Archiving Programs for The Internet Archive in the USA. Archival collections have always been incomplete. Being homogenous, selective groups of records preserved through time, they support attestation and evidentiary consideration only through their longitudinal availability. Multiple appraisal, selection, and processing strategies have developed over the history of the archival endeavor to address the ways in which the archival collection is, by...

Web archiving for all! Web archiving with Webrecorder

Guest blogger Anna Perricci at Rhizome introduces us to the Webrecorder. In her recent post, Sara Day Thomson described how digital preservation can be a conversation stopper at parties and at passport control. I empathize, though for me the puzzlement she describes is a real paradox: as our lives move increasingly online, it seems obvious that some evidence of our collective neuroses, passions and creativities should be preserved. Perhaps the web’s most astonishing feature is the...

A breakthrough year for web archiving in 2016?

Anyone who works with web archives quickly becomes used to the fact that most people have not even heard of them, and even fewer understand what they are and where you might be able to access them. In 2016, however, it seemed as though web archives began to filter into the public consciousness, to move from the technology pages of the more serious newspapers to the political and even cultural sections. In May 2016, for example, the BBC announced plans to close its Food website,...
