When expectations meet reality: common misconceptions about web archives and challenges for scholars

  • Original Paper
International Journal of Digital Humanities

Abstract

As the study of digital history, politics, and culture emerges as an academic discipline, web archives will play a valuable role as sources of information. Those wishing to engage with web archives will need both specific technical skills and a high-level understanding of how the web works. This paper examines the nature and type of misconceptions that web archivists form when they create and utilise web archives. To carry out this research, the author qualitatively analysed support tickets submitted by web archivists using the Internet Archive’s Archive-It (AIT), the most popular web archiving service. The tickets comprised 2544 interactions between web archivists and AIT support specialists. This paper describes the expectations AIT users bring to web archives, and the differences between their expectations and the realities of the web archiving process. It identifies the most prominent misconceptions AIT users have about both web archives and the web itself, analyses the challenges these misconceptions can pose for researchers, and recommends ways in which these can be addressed.

Acknowledgements

I would like to thank Lori Donovan and Jefferson Bailey of the Internet Archive, without whose help this research would not have been possible.

Author information

Corresponding author

Correspondence to Brenda Reyes Ayala.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Ayala, B.R. When expectations meet reality: common misconceptions about web archives and challenges for scholars. Int J Digit Humanities 2, 89–106 (2021). https://doi.org/10.1007/s42803-021-00034-3

  • DOI: https://doi.org/10.1007/s42803-021-00034-3
