  • Book
  • © 2021

The Past Web

Exploring Web Archives

  • Provides practical information about web archives, offers inspiring examples for web archivists and shares recent research results about access methods for exploring preserved information

  • Targets academics and advanced professionals in digital humanities, social sciences, history, media studies, and information or computer science

  • Serves as an initial reference for students in various areas of knowledge by introducing how to explore online history through web archives

Buying options

eBook USD 129.00
Price excludes VAT (USA)
  • ISBN: 978-3-030-63291-5
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
Softcover Book USD 169.99
Price excludes VAT (USA)
Hardcover Book USD 169.99
Price excludes VAT (USA)

Table of contents (22 chapters)

  1. Front Matter

    Pages i-xix
  2. The Era of Information Abundance and Memory Scarcity

    1. Front Matter

      Pages 1-3
    2. The Problem of Web Ephemera

      • Daniela Major
      Pages 5-10
    3. Web Archives Preserve Our Digital Collective Memory

      • Daniela Major, Daniel Gomes
      Pages 11-19
  3. Collecting before it vanishes

    1. Front Matter

      Pages 21-22
    2. Web Archiving in Singapore: The Realities of National Web Archiving

      • Ivy Huey Shin Lee, Shereen Tay
      Pages 33-42
    3. Archiving Social Media: The Case of Twitter

      • Zeynep Pehlivan, Jérôme Thièvre, Thomas Drugeon
      Pages 43-56
    4. Creating Event-Centric Collections from Web Archives

      • Elena Demidova, Thomas Risse
      Pages 57-67
  4. Access methods to analyse the Past web

    1. Front Matter

      Pages 69-70
    2. A Holistic View on Web Archives

      • Helge Holzmann, Wolfgang Nejdl
      Pages 85-99
    3. Interoperability for Accessing Versions of Web Resources with the Memento Protocol

      • Shawn M. Jones, Martin Klein, Herbert Van de Sompel, Michael L. Nelson, Michele C. Weigle
      Pages 101-126
    4. Image Analytics in Web Archives

      • Eric Müller-Budack, Kader Pustu-Iren, Sebastian Diering, Matthias Springstein, Ralph Ewerth
      Pages 141-151
  5. Researching the Past Web

    1. Front Matter

      Pages 153-154
    2. Critical Web Archive Research

      • Anat Ben-David
      Pages 181-188

About this book

This book provides practical information about web archives, offers inspiring examples for web archivists, raises new challenges, and shares recent research results about access methods to explore information from the past preserved by web archives.

The book is structured in six parts. Part 1 advocates for the importance of web archives in preserving our collective memory in the digital era, demonstrates the problem of web ephemera and shows how web archiving activities have been trying to address this challenge. Part 2 then focuses on different strategies for selecting web content to be preserved and on the media types that different web archives host. It provides an overview of efforts to preserve web content, as well as of smaller-scale but high-quality collections of social media or audiovisual content. Next, Part 3 presents examples of initiatives to improve access to archived web information and provides an overview of access mechanisms for web archives, whether designed for use by humans or for automatic access by machines. Part 4 presents research use cases for web archives. It also discusses how to engage more researchers in exploiting web archives and presents inspiring research studies carried out by exploring web archives. Subsequently, Part 5 demonstrates that web archives should become crucial infrastructures for modern connected societies. It makes the case for developing web archives as research infrastructures and presents several inspiring examples of added-value services built on web archives. Lastly, Part 6 reflects on the evolution of the web and the sustainability of web archiving activities. It debates the requirements and challenges for web archives if they are to assume the responsibility of being societal infrastructures that enable the preservation of memory.

This book targets academics and advanced professionals in a broad range of research areas such as digital humanities, social sciences, history, media studies and information or computer science. It also aims to fill the need for a scholarly overview to support lecturers who would like to introduce web archiving into their courses by offering an initial reference for students.

Keywords

  • Web Archiving
  • Digital Libraries
  • Web Searching
  • Web Retrieval
  • Information Science
  • History of the Internet

Reviews

"The Past Web: Exploring Web Archives has become an important part of the web archiving field. It is one of the most important sources in recent years. This book brings together recent academic studies on web archives and presents, through case studies, different information on how web archivists and researchers can access and benefit from web archives." Aykut KAYA, Bursa Uludag University, Prof. Dr. Fuat Sezgin Central Library.

Editors and Affiliations

  • Fundação para a Ciência e a Tecnologia, Lisbon, Portugal

    Daniel Gomes

  • Data Science & Intelligent Systems, Computer Science Institute, University of Bonn, Bonn, Germany

    Elena Demidova

  • School of Advanced Study, University of London, London, UK

    Jane Winters

  • University Library J. C. Senckenberg, Goethe University Frankfurt, Frankfurt am Main, Germany

    Thomas Risse

About the editors

Daniel Gomes is the leader of Arquivo.pt, the Portuguese web archive at the Foundation for Science and Technology (a Portuguese government institution). He started Arquivo.pt as an academic project during his PhD, completed in 2007, and led it to become the research infrastructure he currently manages. During this process, he led several dissemination and communication activities complementary to the technological development and operation of the research infrastructure. From 2012 to 2015, he also managed, in parallel, the web development team of the Foundation for National Scientific Computing. His research interests include user experience, digital preservation, information retrieval and web archiving.

Elena Demidova is the leader of the “Data Science and Intelligent Systems” group at the University of Bonn and a member of the L3S Research Center at the Leibniz University of Hannover, Germany. In the past, she worked as a research group leader at the L3S Research Center and as a Senior Research Fellow at the Web and Internet Science Group at the University of Southampton, UK. Elena had leading roles in several large-scale EU-funded and national projects, most recently including coordination of Cleopatra - a Marie Skłodowska-Curie ITN. Her main research interests are in data analytics, mobility, multilingual data, Open Data, the Web and Semantic Web.

Jane Winters is Chair of Digital Humanities at the School of Advanced Study, University of London. She is a Fellow of the Royal Historical Society, and a member of RESAW (Research Infrastructure for the Study of the Archived Web) and the Advisory Boards of the European Holocaust Research Infrastructure and the Living with Machines project. Her research interests include digital history, web archives, big data for humanities research, peer review in the digital environment, text editing, and open access publishing, and she has led or co-directed a range of digital projects in these areas.

Thomas Risse is head of Electronic Services at the University Library J. C. Senckenberg of the Goethe University Frankfurt. Before joining the Library he was deputy managing director and research group leader at the L3S Research Center, Hannover, Germany. He was coordinator or technical director of several European projects in the area of digital libraries and web archives. Thomas is a member of the Steering Committee of the IEEE International Conference on Data Engineering and of the International Conference on Theory and Practice of Digital Libraries.
