Abstract
The aim of this article is to provide an exploratory analysis of the landscape of web archiving activities in Europe. Our contribution, based on desk research and complemented with data from interviews with representatives of European heritage institutions, provides a descriptive overview of the state of the art in national web archiving in Europe. It is written for a broad interdisciplinary audience, including cultural heritage professionals, IT specialists and managers, and humanities and social science researchers. Both the legal, technical and operational aspects of web archiving and the value of web archives as born-digital primary research resources are explored. In addition to investigating the organisations involved and the scope of their web archiving programmes, the article considers the curatorial aspects of the web archiving process, such as the selection of web content, the tools used and the provision of access and discovery services. Furthermore, general policies related to web archiving programmes are analysed. The article concludes by raising four important issues that digital scholars should consider when using web archives as a historical data source. Whilst this study was limited to a sample of only nine web archives, the article nevertheless offers some useful insights into the technical, legal, curatorial and policy-related aspects of web archiving. Finally, this paper could function as a stepping stone for more extensive and qualitative research.
Notes
For instance, obtaining prior authorisation of right holders, creating new exceptions for reproduction or communication to the public for archiving purposes and obtaining a fair balance between the public interest in preserving information of cultural or historical significance and the interests of rights holders.
This is the case for France with the DADVSI Law (see « Loi n° 2006–961 du 1er août 2006 relative au droit d’auteur et aux droits voisins dans la société de l’information »), for Luxembourg (see « Loi luxembourgeoise du 25 juin 2004 portant réorganisation des instituts culturels de l’Etat »), for the United Kingdom (see « Legal Deposit Libraries (Non-Print Works) Regulations of 5th April 2013 ») and for Denmark (see « Danish Act n° 1439 on Legal Deposit of Published Material of 22nd December 2004 »).
For instance, The Netherlands, Portugal and Switzerland (at the federal level).
Prior authorisation of the right holders is not necessary for websites that have fallen into the public domain or that were made available under a Creative Commons licence (Beunen and Schiphof 2006, p. 16).
This is the approach taken by Arquivo.pt in Portugal (Arquivo.pt, n.d.-c).
Websites are composed of a set of elements, each of which can be protected by copyright (original texts, images, search engine, database, etc.) and each of which may have a different right holder (KB Nederland n.d.-e). Websites can also contain elements protected by other rights, such as trademark law, database right, neighbouring rights and image rights (KB Nederland n.d.-b).
An act for which the consent of the right holders is in principle required.
In France, the DADVSI Law introduced an exception allowing acts of reproduction and communication related to the web legal deposit (see French Heritage Code, art. L132–4 to L132–6). In the United Kingdom, Sections 19 to 31 of the Legal Deposit Libraries (Non-Print Works) Regulations of 5th April 2013 and Section 44A of the Copyright, Designs and Patents Act of 15th November 1988 allow certain activities related to web legal deposit to be carried out without infringing copyright.
For instance, in France, Article L132–2-1 of the French Heritage Code authorises the “Bibliothèque Nationale de France” to turn to domain name management bodies or to the Higher Audiovisual Council to identify the publishers and producers of websites. There is a similar legal provision in Denmark (see Danish Act n° 1439 on Legal Deposit of Published Material of 22nd December 2004, §11).
This is the case in France (see French Heritage Code, art. R132–23-1, II), the United Kingdom (see Legal Deposit Libraries (Non-Print Works) Regulations of 5th April 2013, Section 16 (4)) and Denmark (see Danish Act n° 1439 on Legal Deposit of Published Material of 22nd December 2004, §10).
The Legal Deposit Libraries (Non-Print Works) Regulations 2013
In the case of the National Library of Ireland, this applies only to the web archive collections that were based on a selective policy. Access conditions for the web material collected during the top-level domain crawl that started in 2017 had not yet been defined at the time of the interview.
See Legal Deposit Libraries (Non-Print Works) Regulations of 5th April 2013, Section 23.
See French Heritage Code, art. R132–23-2.
References
Archives Unleashed Project. (2018). The Archives Unleashed Project. Retrieved from http://archivesunleashed.org/. Last accessed on 20/04/2018.
ArchiveSpark GitHub. (2018). Helgeho/ArchiveSpark: An Apache Spark framework for easy data processing, extraction as well as derivation for Web archives and archival collections, developed by the Internet Archive and L3S Research Center. Retrieved from https://github.com/helgeho/ArchiveSpark. Last accessed on 20/04/2018.
Arquivo.pt. (2018). Arquivo.pt (Portuguese web-archive): official playlist. Retrieved from https://www.youtube.com/playlist?list=PLKfzD5UuSdETtSCX_TM02nSP7JDmGFGIE. Last accessed on 12/02/2018.
Arquivo.pt. (n.d.-a). Arquivo.pt. Retrieved from http://www.arquivo.pt. Last accessed on 22/01/2018.
Arquivo.pt. (n.d.-b). Knowledge. Retrieved from https://www.fccn.pt/en/knowledge/arquivo-pt/. Last accessed on 20/10/2017.
Arquivo.pt. (n.d.-c). Crawling and archiving Web content. Retrieved from http://sobre.arquivo.pt/en/help/crawling-and-archiving-web-content/#qe-faq-2416. Last accessed on 20/10/2017.
Arquivo.pt. (n.d.-d). Terms and conditions. Retrieved from http://sobre.arquivo.pt/en/about/terms-and-conditions/. Last accessed on 31/01/2017.
Arquivo.pt. (n.d.-e). What is Arquivo.pt - the Portuguese Web Archive? Retrieved from http://sobre.arquivo.pt/en/help/what-is-arquivo-pt/. Last accessed on 20/10/2017.
Ben-David, A., & Huurdeman, H. (2014). Web archive search as research: Methodological and theoretical implications. Alexandria, 25(1–2), 93–111.
Beunen, A. & Schiphof, T. (2006). Legal aspects of web archiving from a Dutch perspective (report commissioned by the National Library in The Hague).
BnF. (2014). Historique de l’archivage web. Retrieved from http://www.bnf.fr/fr/professionnels/archivage_web_bnf/a.depot_legal_internet_histoire.html#SHDC__Attribute_BlocArticle1BnF. Last accessed on 22/01/2018.
BnF. (2016). BnF Collecte de web (BCWeb). Retrieved from https://collecteweb.bnf.fr/login.html. Last accessed on 04/02/2018.
BnF. (2017a, February). Collectes ciblées de l’internet français. Retrieved from http://www.bnf.fr/fr/collections_et_services/anx_pres/a.collectes_ciblees_arch_internet.html. Last accessed on 16/12/2017.
BnF. (2017b). Internet archives. Retrieved from http://www.bnf.fr/en/collections_and_services/book_press_media/a.internet_archives.html. Last accessed on 21/09/2017.
BnF. (2017c). Guide des archives de l’Internet [Brochure]. Retrieved from http://www.bnf.fr/documents/guide_archives_internet.pdf. Last accessed on 20/09/2017.
BnL. (n.d.). Appel à participation - Bibliothèque nationale de Luxembourg. Retrieved from: http://crawl.bnl.lu/2017/06/appel-a-participation-bibliotheque-nationale-de-luxembourg-web-archive/. Last accessed on 26/01/2018.
British Library. (2017a, April 18). The challenges of web archiving social media [web log message]. Retrieved from http://blogs.bl.uk/webarchive/2017/04/the-challenges-of-web-archiving-social-media.html. Last accessed on 30/10/2017.
British Library. (2017b, May 17). Web Archiving Engagement Manager. Retrieved from https://www.bl.uk/people/experts/jason-webber. Last accessed on 04/02/2018.
British Library. (n.d.-a). UK web archive. Retrieved from https://www.bl.uk/collection-guides/uk-web-archive. Last accessed on 31/10/2017.
British Library. (n.d.-b). Explore the British Library. Non-print legal deposit: FAQs. Retrieved from http://www.bl.uk/catalogues/search/non-print_legal_deposit.html. Last accessed on 31/10/2017.
Brozzler GitHub. (2018). internetarchive/brozzler: brozzler - distributed browser-based web crawler. Retrieved from https://github.com/internetarchive/brozzler. Last accessed on 20/04/2018.
Brügger, N., Laursen, D., & Nielsen, J. (2017). Exploring the domain names of the Danish web. In N. Brügger & R. Schroeder (Eds.), The web as history. Using web archives to understand the past and present (pp. 62–80). London: UCL Press.
BUDDAH, Big UK Domain Data for the Arts and Humanities. (2014). Bursaries. Retrieved from https://buddah.projects.history.ac.uk/news/bursaries/. Last accessed on 04/02/2018.
Chakraborty, A., & Nanni, F. (2017). The changing digital faces of science museums: A diachronic analysis of museum websites. In N. Brügger (Ed.), Web 25. Histories from the first 25 years of the world wide web (pp. 157–174). New York: Peter Lang.
Clarke, N. (2016). JWAT. Retrieved from https://sbforge.org/display/JWAT/JWAT. Last accessed on 20/04/2018.
Costa, M. & Silva, M. (2010). Understanding the information needs of web archive users. In Proceedings of the 10th International Web Archiving Workshop (pp. 9-16).
Costa, M. & Silva, M. (2011). Characterizing search behavior in web archives. In Proceedings of the 1st International Temporal Web Analytics Workshop.
Deswarte, R. (2015). Revealing British euroscepticism in the UK web domain and archive case study. Retrieved from http://sas-space.sas.ac.uk/6103/#undefined. Last accessed on 25/01/2018.
Dooley, J. (2016, October). Metadata to meet user needs. Presented at the OCLC Member Forum. Los Angeles.
Dooley, J. M., Farrell, K. S., Kim, T. & Venlet, J. (2017). Developing web archiving metadata best practices to meet user needs. Journal of Western Archives, 8(2), Art. 5, 15 pp.
Dougherty, M., Meyer, E. T., Madsen, C., van den Heuvel, C., Thomas, A., & Wyatt, S. (2010). Researcher engagement with web Archives: State of the art. London: JISC.
Fbarc GitHub. (2018). justinlittman/fbarc: A commandline tool and Python library for archiving data from Facebook using the Graph API. Retrieved from https://github.com/justinlittman/fbarc. Last accessed on 20/04/2018.
Foo, C. (2016). Welcome to Wpull’s documentation! - Wpull 2.0.1 documentation. Retrieved from https://wpull.readthedocs.io/en/master/#. Last accessed on 20/04/2018.
Free Software Foundation. (2017) Wget - GNU Project - Free Software Foundation. Retrieved from https://www.gnu.org/software/wget/. Last accessed on 20/04/2018.
Gomes, D. (2017a, November 30). Web preservation demands access. Retrieved from http://www.dpconline.org/blog/idpd/web-preservation-demands-access. Last accessed 14/12/2017.
Gomes, D. (2017b, November 24) Personal interview via Zoom with Daniel Gomes /Interviewers: Sally Chambers, Friedel Geeraert, Gerald Haesendonck, Alejandra Michel and Eveline Vlassenroot. [M4A file].
Grab-site GitHub. (2018). ludios/grab-site: The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns. Retrieved from https://github.com/ludios/grab-site. Last accessed on 20/04/2018.
Graff, E. & Sepetjan, S. (2011). Le dépôt légal en France. Les cahiers de la propriété intellectuelle, 2011/1, 179–180.
Harvey, D. R. (2005). Preserving digital materials. München: KG Saur.
Helmond, A., Nieborg, D., & van der Vlist, F. N. (2017). The political economy of social data: A historical analysis of platform–industry partnerships. In Proceedings of the 8th International Conference on Social Media & Society (SMSociety 17) New York: ACM Press. https://doi.org/10.1145/3097286.3097324.
Hockx-Yu, H. (2014). Archiving social media in the context of non-print legal deposit. Paper presented at IFLA, Lyon.
IIPC. (2017). Why archive the web? Retrieved from http://netpreserve.org/web-archiving/. Last accessed on 22/01/2018.
IIPC. (2018). OpenWayback. Retrieved from http://netpreserve.org/web-archiving/openwayback/. Last accessed on 09/02/2018.
ISO. (2017). Information and documentation - WARC file format (ISO 28500:2017).
KB Nederland (n.d.-a) Selectie bij webarchivering. Retrieved from https://www.kb.nl/organisatie/onderzoek-expertise/e-depot-duurzame-opslag/webarchivering/selectie-bij-webarchivering. Last accessed on 19/12/2017.
KB Nederland (n.d.-b). Legal issues. Retrieved from https://www.kb.nl/en/organisation/research-expertise/long-term-usability-of-digital-resources/web-archiving/legal-issues. Last accessed on 22/09/17.
KB Nederland (n.d.-c). Web archiving. Retrieved from https://www.kb.nl/en/organisation/research-expertise/long-term-usability-of-digital-resources/web-archiving. Last accessed on 22/09/17.
KB Nederland (n.d.-d) KB-webarchief: veelgestelde vragen. Retrieved from https://www.kb.nl/organisatie/onderzoek-expertise/e-depot-duurzame-opslag/webarchivering/kb-webarchief-veelgestelde-vragen. Last accessed 08/12/2017.
KB Nederland (n.d.-e) Gebruiksvoorwaarden webarchief Koninklijke Bibliotheek. Retrieved from https://www.kb.nl/bronnen-zoekwijzers/databanken-mede-gemaakt-door-de-kb/webarchief-kb/gebruiksvoorwaarden-webarchief-koninklijke-bibliotheek. Last accessed on 08/12/2017.
Kelly, M. (2017). Web Archiving Integration Layer (WAIL). Retrieved from https://machawk1.github.io/wail/. Last accessed on 20/04/2018.
Koerbin, P. (2017). Revisiting the world wide web as artefact: Case studies in archiving small data for the National Library of Australia’s PANDORA archive. In N. Brügger (Ed.), Web 25. Histories from the first 25 years of the world wide web (pp. 191–206). New York: Peter Lang.
Kremer, I. (2016). About the Time Travel Service. Retrieved from http://timetravel.mementoweb.org/about/. Last accessed on 20/04/2018.
Kunze, S. & Power, B. (n.d.). The 1916 Easter Rising Web Archive Project, p. 2. Retrieved from https://archivedweb.blogs.sas.ac.uk/files/2017/06/RESAW2017-PowerKunze-The_1916_Easter_Rising_web_archive_Project.pdfp_.pdf. Last accessed on 2/11/2017.
Laursen, D., & Møldrup-Dalum, P. (2017). Looking back, looking forward: 10 years of web development to collect, preserve and access the Danish web. In N. Brügger (Ed.), Web 25. Histories from the first 25 years of the world wide web (pp. 207–228). New York: Peter Lang.
Lin, J., Milligan, I., Wiebe, J., & Zhou, A. (2017). Warcbase: Scalable analytics infrastructure for exploring web archives. Journal on Computing and Cultural Heritage, 10(4), 1–30. https://doi.org/10.1145/3097570.
Maemura, E., Becker, C., & Milligan, I. (2016). Understanding computational web archives research methods using research objects. In James Joshi, George Karypis, Ling Liu, et al., 2016 IEEE International Conference on Big Data (Big Data) (pp. 3250–3259).
Masanès, J. (2005). Web archiving methods and approaches: A comparative study. Library Trends, 54(1), 72–90.
Maurer, Y. & Els, B. (2017a, November 24). Personal interview via GoToMeeting with Yves Maurer and Ben Els/Interviewers: Emmanuel Di Pretoro, Friedel Geeraert, Gerald Haesendonck, Eveline Vlassenroot. [M4A file].
Maurer, Y. & Els, B. (2017b, November 24). Written answers given by the Bibliothèque nationale de Luxembourg via Google Docs before the personal interview with Yves Maurer and Ben Els/Interviewers: Emmanuel Di Pretoro, Friedel Geeraert, Gerald Haesendonck, Eveline Vlassenroot.
Moesgaard, J. & Larsen, T. H. (2017a, November 30). Personal interview via GoToMeeting with Jakob Moesgaard & Tue Hejlskov Larsen/Interviewers: Emmanuel Di Pretoro, Friedel Geeraert, Gerald Haesendonck, Sally Chambers and Alejandra Michel.
Moesgaard, J. & Larsen, T. H. (2017b, November 30). Written answers given by the Danish Royal Library via Google Docs before the personal interview with Jakob Moesgaard & Tue Hejlskov Larsen/Interviewers: Emmanuel Di Pretoro, Friedel Geeraert, Gerald Haesendonck, Sally Chambers and Alejandra Michel.
National Archives. (n.d.-a). How to use the web archive. Retrieved from http://www.nationalarchives.gov.uk/webarchive/information/. Last accessed on 19/10/2017.
National Archives. (n.d.-b). UK Government web archive. Retrieved from http://www.nationalarchives.gov.uk/webarchive/. Last accessed on 31/10/2017.
National Library of Ireland. (2017a). NLI Review 2016. Retrieved from https://www.nli.ie/GetAttachment.aspx?id=011e629f-1a5a-4cde-91d7-8a62ccf84bef. Last accessed 9/10/2017.
National Library of Ireland. (2017b). Web Archive FAQ & Resources. Retrieved from https://www.nli.ie/en/web-archive-faq.aspx. Last accessed on 9/10/2017.
National Library of Ireland. (n.d.-a) NLI Web Archive: A record of the online life in Ireland. Retrieved from http://collection.europarchive.org/nli. Last accessed on 1/02/2018.
National Library of Ireland. (n.d.-b). Rights and Reproductions. Retrieved from https://www.nli.ie/en/rights-reproductions.aspx. Last accessed on 31/01/2018.
National Library of Ireland. (n.d.-c). Web Archive. Retrieved from https://www.nli.ie/en/web_archive.aspx. Last accessed on 31/01/2018.
National Library of Ireland. (n.d.-d). Web archive collections. Retrieved from http://www.nli.ie/en/udlist/web-archive-collections.aspx. Last accessed on 20/10/2017.
NCDD. (n.d.). Expertgroep webarchivering. Retrieved from http://www.ncdd.nl/kennis-en-advies/expertgroepen/expertgroep-webarchivering/. Last accessed on 08/12/2017.
Netarkivet.dk. (2016a). Selektive høstninger. Retrieved from http://netarkivet.dk/om-netarkivet/selektive-hostninger_2016/. Last accessed on 31/10/2017.
Netarkivet.dk. (2016b). Adgang til Netarkivet. Retrieved from http://netarkivet.dk/adgang/. Last accessed on 31/01/2018.
Netarkivet.dk. (2017). Brugermanual til Netarkivet. Retrieved from: http://netarkivet.dk/wp-content/uploads/2015/03/Netarkivet_Strategi_Langtidsbevaring_1.0_150115.pdf . Last accessed on 1/02/2018.
Nielsen, J. (2016). Using web archives in research - an introduction. Retrieved from http://www.netlab.dk/wp-content/uploads/2016/10/Nielsen_Using_Web_Archives_in_Research.pdf. Last accessed on 18/01/2018.
Node-warc GitHub. (2018). N0taN3rd/node-warc: Parse And Create Web ARChive (WARC) files with node.js. Retrieved from https://github.com/N0taN3rd/node-warc. Last accessed on 20/04/2018.
Ogden, J., Halford, S. & Carr, L. (2017). Observing web archives. The case for an ethnographic study of web archiving. WebSci. June (25-28). https://doi.org/10.1145/3091478.3091506.
Posthumus, A. & van Luin, J. (2017a, December 6). Personal interview via UC4all with Antal Posthumus and Jeroen van Luin/Interviewers: Eveline Vlassenroot and Friedel Geeraert.
Posthumus, A. & van Luin, J. (2017b, December 6). Written answers given via Google Docs by the National Archive before the personal interview with Antal Posthumus and Jeroen van Luin/Interviewers: Eveline Vlassenroot and Friedel Geeraert.
Pywb GitHub. (2018). webrecorder/pywb: Core Python Web Archiving Toolkit for replay and recording of web archives https://pypi.python.org/pypi/pywb . Retrieved from https://github.com/webrecorder/pywb. Last accessed on 20/04/2018.
RESAW (Research Infrastructure for the Study of Archived Web Materials). (2012). About RESAW. Retrieved from http://resaw.eu/about/. Last accessed on 04/02/2018.
Reyes Ayala, B. (2013). Web archiving bibliography 2013. Texas: UNT Digital Library.
Roche, X. (2018). HTTrack Website Copier. Retrieved from http://www.httrack.com/. Last accessed on 20/04/2018.
Rosenthal, C. (2017, July). NetarchiveSuite. Retrieved from https://sbforge.org/display/NAS/NetarchiveSuite. Last accessed on 20/04/2018.
Ryan, M. (2017, November 16). Personal interview via GoToMeeting with Maria Ryan/Interviewers: Gerald Haesendonck, Alejandra Michel and Eveline Vlassenroot. [M4A file].
Schneider, S. M., & Foot, K. A. (2005). Web sphere analysis: An approach to studying online action. In C. Hine (Ed.), Virtual Methods - Issues in Social Research on the Internet. Oxford: Berg Publishers, 157–171.
Schneider, S., & Foot, K. (2008). Archiving of internet content. In W. Donsbach (Ed.), The international encyclopedia of communication. Oxford: Blackwell. https://doi.org/10.1002/9781405186407.wbieca051.
Schroeder, R., & Brügger, N. (2017). Introduction: The web as history. In N. Brügger & R. Schroeder (Eds.), The web as history. Using web archives to understand the past and present (pp. 1–19). London: UCL Press.
Sierman, B., & Teszelszky, K. (2017). How can we improve our web collection? An evaluation of web archiving at the KB National Library of the Netherlands (2007-2017). Alexandria, 27, 94–107. https://doi.org/10.1177/0955749017725930.
Social Feed Manager. (2018). Social Feed Manager. Retrieved from https://gwu-libraries.github.io/sfm-ui/. Last accessed on 20/04/2018.
Tanésie, P. & Aubry, S. (2017, December 12). Le dépôt légal du web à la BnF : organisation, procédures et outils. Presentation given at the Bibliothèque nationale de France, Paris.
Tanésie, P., Aubry, S., & Wendland, B. (2017, December 12). Personal interview at the BnF with Pascal Tanésie, Sara Aubry & Bert Wendland/Interviewers: Sally Chambers, Rolande Depoortere, Friedel Geeraert, Alejandra Michel, and Eveline Vlassenroot [mp3 file].
Teszelszky, K. (2017a, November 8). Personal interview via GoToMeeting with Kees Teszelszky/Interviewers: Gerald Haesendonck, Alejandra Michel and Eveline Vlassenroot. [M4A file].
Teszelszky, K. (2017b, November 8). Written answers given via Google Docs by the KB Nederland before the personal interview with Kees Teszelszky/Interviewers: Gerald Haesendonck, Alejandra Michel and Eveline Vlassenroot.
The National Archives. (n.d.). UK Government Web Archive. Retrieved from http://www.nationalarchives.gov.uk/webarchive/. Last accessed on 1/02/2018.
Twarc GitHub. (2018). DocNow/twarc: A command line tool (and Python library) for archiving Twitter JSON. Retrieved from https://github.com/DocNow/twarc. Last accessed on 20/04/2018.
UK Web Archive. (n.d.-a). About. Retrieved from https://www.webarchive.org.uk/ukwa/info/about. Last accessed on 30/10/2017.
UK Web Archive. (n.d.-b). Browse. Retrieved from https://www.webarchive.org.uk/ukwa/browse. Last accessed on 1/02/2018.
UK Web Archive. (n.d.-c). Frequently asked questions. Retrieved from https://www.webarchive.org.uk/ukwa/info/faq. Last accessed on 30/10/2017.
UK Web Archive. (n.d.-d). SHINE. Retrieved from https://www.webarchive.org.uk/shine. Last accessed on 05/02/2018.
Van de Sompel, H., Nelson, M.L., Sanderson, R. (2013). RFC 7089: HTTP Framework for Time-Based Access to Resource States—Memento. Retrieved from http://tools.ietf.org/rfc/rfc7089.txt. Last accessed on 20/04/2018.
WALK (Web Archives for Longitudinal Knowledge). (n.d.). Datasets. Retrieved from: http://webarchives.ca/datasets. Last accessed on 04/02/2018.
WARCAT GitHub. (2017). chfoo/warcat: Tool and library for handling Web ARChive (WARC) files. Retrieved from https://github.com/chfoo/warcat. Last accessed on 20/04/2018.
Warcio GitHub. (2017). webrecorder/warcio: Streaming WARC/ARC library for fast web archive IO https://pypi.python.org/pypi/warcio . Retrieved from https://github.com/webrecorder/warcio. Last accessed on 20/04/2018.
Warctools GitHub. (2016). internetarchive/warctools: warctools. Retrieved from https://github.com/internetarchive/warctools. Last accessed on 20/04/2018.
Webber, J. (2017, November 16). Personal interview via GoToMeeting with Jason Webber/Interviewers: Sally Chambers, Gerald Haesendonck, Alejandra Michel and Eveline Vlassenroot. [M4A file].
Weber, M. S. (2017). The tumultuous history of news on the web. In N. Brügger & R. Schroeder (Eds.), The web as history. Using web archives to understand the past and present (pp. 83–100). London: UCL Press.
Webrecorder. (n.d.). Collect & revisit the web. Retrieved from https://webrecorder.io/. Last accessed on 19/02/2019.
Webrecorder Player for Desktop Github. (2018). webrecorder/webrecorderplayer-electron: Webrecorder Player for Desktop (OSX/Windows/Linux). (Built with Electron + Webrecorder). Retrieved from https://github.com/webrecorder/webrecorderplayer-electron. Last accessed on 20/04/2018.
Webster, P. (2017). Users, technologies, organisations: Towards a cultural history of world web archiving. In N. Brügger (Ed.), Web 25. Histories from the first 25 years of the world wide web (pp. 175–190). New York: Peter Lang.
Acknowledgements
The research outlined in this article was conducted in the context of the PROMISE-project. This project received funding from the Belgian Science Policy Office (BELSPO) in December 2016, through their Belgian Research Action through Interdisciplinary Networks (BRAIN) research programme, for a 24-month period. The project was initiated by the Royal Library of Belgium and the State Archives of Belgium and the project consortium also includes the universities of Ghent and Namur and the Information and Documentation School of the Brussels-Brabant Institute of Higher Education (HE2B IESSID). We would like to thank the interviewees and their colleagues for taking the time to answer our many questions.
List of institutions and representatives consulted
- National Library of The Netherlands: Kees Teszelszky (Researcher web archiving, Digital Preservation Department)
- National Archive of The Netherlands: Antal Posthumus (Adviser recordkeeping, Directie Infrastructuur & Advies) and Jeroen van Luin (Acquisition and Maintenance of Digital Archives)
- National Library of France (BnF): Pascal Tanésie (Assistant to the head of the department of digital legal deposit), Sara Aubry (Web Archiving Project Manager, IT department) and Bert Wendland (IT Department)
- National Library of Luxembourg: Yves Maurer (Webarchiving Technical Manager) and Ben Els (Digital Curator)
- The Royal Danish Library: Jakob Moesgaard (Specialkonsulent, Department of Digital Legal Deposit and Preservation) and Tue Hejlskov Larsen (IT analyst)
- The UK National Archives: Tom Storrar (Head of Web Archiving) and Claire Newing (Web Archivist)
- The British Library: Jason Webber (Web Archiving Engagement and Liaison Manager)
- Arquivo.pt.: Daniel Gomes (Head of Arquivo.pt., the Portuguese web-archive, Advanced Services Department)
- National Library of Ireland (NLI): Maria Ryan (Web Archivist)
Cite this article
Vlassenroot, E., Chambers, S., Di Pretoro, E. et al. Web archives as a data resource for digital scholars. Int J Digit Humanities 1, 85–111 (2019). https://doi.org/10.1007/s42803-019-00007-7
DOI: https://doi.org/10.1007/s42803-019-00007-7