Abstract
Over the last two decades, publishing and distributing content on the Web has become a core part of society. This ephemeral content has rapidly become an essential component of the human record. Writing histories of the late 20th and early 21st centuries will require engaging with web archives. The scale of web content and of web archives presents significant challenges for how researchers can access and engage with this material. Digital humanities scholars are advancing computational methods to work with corpora of millions of digitized resources, but to fully engage with the growing content of two decades of web archives, we now require methods to approach and examine billions, ultimately trillions, of incongruous resources. This article examines one seemingly insignificant but fundamental aspect of web design history, the use of tiny transparent images as a tool for layout design, and shows how traces of these files can illustrate future paths for engaging with web archives. This case study has implications for future methods that allow scholars to engage with web archives. It also prompts librarians and archivists to think about web archives as data and about the development of systems, qualitative and quantitative, through which to make this material available.
Notes
See the wiki for OpenWayback at https://github.com/iipc/openwayback/wiki.
See the documentation for pywb at https://pywb.readthedocs.io/en/latest/manual/apps.html#wayback-pywb.
See the file format description at https://www.loc.gov/preservation/digital/formats/fdd/fdd000236.shtml.
See the Internet Archive documentation at https://webarchive.jira.com/wiki/spaces/ARS/pages/90997503/WAT+Overview+and+Technical+Details.
See the UK Web Archive Link Analysis visualization at https://www.webarchive.org.uk/ukwa/visualisation/ukwa.ds.2/linkage and the ongoing Web Archives for Longitudinal Knowledge (WALK) Project, run by partners at the University of Waterloo, the University of Alberta, and York University, at http://webarchives.ca/ for more information.
For final reports from the BUDDAH project, see the blog https://buddah.projects.history.ac.uk/2016/04/.
Additional information
The following research represents the opinions, perspectives and ideas of the authors. It does not necessarily represent the perspectives of any institutions with which they are affiliated.
Cite this article
Owens, T., Thomas, G.H. The invention and dissemination of the spacer gif: implications for the future of access and use of web archives. Int J Digit Humanities 1, 71–84 (2019). https://doi.org/10.1007/s42803-019-00006-8
DOI: https://doi.org/10.1007/s42803-019-00006-8