Encyclopedia of Big Data

Living Edition
Editors: Laurie A. Schintler, Connie L. McNeely

Data Center

  • Mél Hogan
Living reference work entry
DOI: https://doi.org/10.1007/978-3-319-32001-4_299-1

Definition/Introduction

Big Data requires big infrastructure. A data center is largely defined by the industry as a facility with computing infrastructure, storage, and backup power. Its interior is usually designed as rows of racks containing stacked servers (each essentially a motherboard and hard drive). Most data centers are designed with symmetry in mind, alternate between warm and cool aisles, and are dimly lit and noisy. The data center functions as a combination of software and hardware designed to process data requests – to receive, store, and deliver – to “serve” data, such as games, music, emails, and apps, to clients over a network. It has redundant connections to the Internet, is powered by multiple local utilities backed up by diesel generators and battery banks, and is kept at operating temperature by cooling systems. Our ever-growing desire to measure and automate our world has driven a surge in data production, known as Big Data.

Today, the data center is considered the heart and brain of Big Data and the Internet’s networked infrastructure. However, the data center has been defined differently over the past several decades, having undergone many conceptual and material transformations since the general-purpose computer was first imagined, and instantiated, in the 1940s.

Knowledge of the modern-day data center’s precursors is important because each advancement marks an important shift from elements internal to the apparatus to elements external to it, namely, in the conception of storage as memory. Computers, as we now use them, evolved from the mainframe computer as data center; today, the data center supports and serves Big Data and our digital networked communications from afar. Where and how societal data are stored has always been an important social, historical, and political question, as well as one of science and engineering, because the uses and deployments of data can vary based on context, governmental control and motivations, and level of public access.

One of the most important early examples of large-scale data storage – but one that differs from today’s data center in many ways – was ENIAC (Electronic Numerical Integrator and Computer), built in 1946 for the US Army Ballistic Research Laboratory to calculate artillery firing tables. The installation took up 1800 sq. ft. of floor space, weighed 30 t, and was expensive to run, buggy, and very energy intensive. It was kept in use for nearly a decade.

In the 1960s, there was no longer a hard distinction between processing and storage – large mainframes were also data centers. The next two decades saw the beginning and evolution of microcomputers (now called “servers”), which would render the mainframe and data center ostensibly, and if only temporarily, obsolete. Up until that point, mainframe computers used punch cards and punch tape as computer memory, both of which were pioneered by the textile industry for use in mechanized looms. Made possible by the advent of integrated circuits, the 1980s saw a widespread adoption of personal computers in the home and office, relying on cassette tape recorders, and later, floppy disks as machine memory. The mainframe computer was too big and too expensive to run, and so the shift to personal computing seemed to mitigate these issues. The data center would nonetheless see significant growth once again in the 1990s due to the widespread implementation of a new client-server computing model.

Today’s Data Center

Since the popularization of the public Internet in the 1990s, and especially the dot-com bubble from 1997 to 2000, data have exploded as a commodity. To put this commodity into perspective, each minute of every day, more than 200 million emails are sent, more than 2 million Google searches are performed, over 48 h of video is uploaded to YouTube, and more than 4 million posts appear on Facebook. Data are also exploding at the level of real-time data for services like Tinder, Uber, and Airbnb, as well as the budding self-driving car industry, smart city grids and transportation, mass surveillance and monitoring, e-commerce, insurance and healthcare transactions, and – perhaps most significantly today – the implementation of the Internet of Things (IoT), virtual and augmented reality, and gaming. All of these cloud-based services require huge amounts of data storage and energy to operate. However, despite the growing demand for storage – considering that 90% of data have been created in the last 2 years – data remain largely relegated to the realm of the ephemeral and immaterial in the public imaginary, a conception further upheld by the metaphors of “the cloud” and “cloud computing.” Cloud servers are no different from other data centers in terms of their materiality. They differ simply in how they provide data to users. The cloud relies on virtualization and a cluster of computers as its source to break down requests into smaller component parts (to more quickly serve up the whole) without all data (as packets) necessarily following the same physical/geographical path.

For the most part, users cannot access the servers on which their data and content are stored, which means that questions of data sovereignty, access, and ownership are also important threads in the fabric of our modern sociotechnical communication system. The guarded distance foisted between users and their data also disconnects users from a proper understanding of networked culture and of the repercussions of mass digital circulation and consumption. This distance serves companies’ interests insofar as it maintains an illusion of fetching data on demand, in and from no apparent space at all, while also providing a material base that conjures up an efficient and secure system in which we can entrust our digital lives.

In reality, there are actual physical servers in data centers that contain the world’s data (Neilson et al. 2016). The data center is part of a larger communications infrastructure that stores and serves data for ongoing access and retrieval. The success of the apparatus relies on uninterrupted and seamless transactions at increasingly rapid speeds. The data center can take on various forms, emplacements, and purposes; it can be imagined as a landing site (the structure that welcomes terrestrial and undersea fiber-optic cables), or as a closet containing one or two locally maintained servers. But generally speaking, the data center we imagine (if we imagine one at all) is the one put on virtual display by Big Tech companies like Google, Microsoft, Facebook, Apple, and Amazon (Vonderau and Holt 2015). These companies display and curate images of their data centers online and offer virtual tours to highlight their efficiency and design – and increasingly their sustainability goals and commitments to the environment. While these visual representations of data center interiors are vivid, rich, and often highly branded, the data center exteriors are for the most part boxy and nondescript. The sites are generally highly monitored, guarded, and built foremost as a kind of fortress to withstand attacks, intruders, and security breaches.

Because the scale of data centers has grown so large, they are often referred to as server farms, churning over data, day in and day out. Buildings housing data centers can be the size of a few football fields, require millions of gallons of water daily to cool servers, and use as much electricity as a midsize US town. Smaller data centers are often housed in buildings left over and adapted from defunct industry – from underground bunkers to hotels to bakeries to printing houses to shopping malls. Data centers (in the USA) have been built along former trade routes or railroad tracks and are often developed in the confusing context of a new but temporary market stability, itself born of economic downturns in other local industries (Burrington 2015).

Advances have been made in the last 5 years to reduce the environmental impacts of data centers, at the level of energy use in particular, in part by siting data centers in regions with naturally cooler climates and stable power grids (such as the Nordic countries). The location of data centers is ultimately dependent on a confluence of societal factors, of which political stability, the risk of so-called natural disasters, and energy security remain at the top.

Conclusion

Due in part to the secretive nature of the industry and the highly skilled labor of the engineers and programmers involved, scholars interested in Big Data, new media, and networked communications have had to be creative in their interventions. This has been accomplished by drawing attention to the myth of the immaterial as a first step toward engaging everyday users and politicizing the infrastructure by scrutinizing its economic, social, and environmental impacts (Starosielski 2015). The data center has become a site of inquiry for media scholars to explore and counter the widespread myths about the immateriality of “the digital” and cloud computing, its social and environmental impacts, and the political economy and ecology of communications technology more broadly.

Without denying them their technological complexities, data centers, as we now understand them, are crucial components of a physical, geographically located infrastructure that facilitates our daily online interactions on a global scale. Arguably, the initial interest in data centers by scholars was to shed light on the idea of data storage – the locality of files, on servers, in buildings, in nations – and to demonstrate the effects of a scale and speed of communication never before matched in human history. Given the rising importance of including the environment and climate change in academic and political discourse, data centers are also being assessed for their impacts on the environment and the increasing role of Big Tech in managing natural resources. The industry’s consumption of water and electricity, for example, is considered a serious environmental impact because these resources have, until recently, been drawn on unsustainably for the mass upscaling of its operations. Today, it is no longer unusual to see Big Tech manage forests (Facebook), partner with wastewater management plants (Google), use people as human Internet content moderators/filters (Microsoft), or own large swaths of the grid (Amazon) to power data centers. In many ways, the data industry is impacting both landscape and labor conditions in urban, suburban, rural, and northern contexts, each with its own set of values and infrastructural logics about innovation at the limits of the environment (Easterling 2014).

Further Readings

  1. Burrington, I. (2015). How railroad history shaped internet history. The Atlantic, November 24. http://www.theatlantic.com/technology/archive/2015/11/how-railroad-history-shaped-internet-history/417414.
  2. Easterling, K. (2014). Extrastatecraft: The power of infrastructure space. London: Verso.
  3. Neilson, B., Rossiter, N., & Notley, T. (2016). Where’s your data? It’s not actually in the cloud, it’s sitting in a data centre. August 30, 2016. Retrieved 20 Oct 2016, from http://theconversation.com/wheres-your-data-its-not-actually-in-the-cloud-its-sitting-in-a-data-centre-64168.
  4. Starosielski, N. (2015). The undersea network. Durham: Duke University Press Books.
  5. Vonderau, P., & Holt, J. (2015). Where the internet lives: Data centers as cloud infrastructure. In L. Parks & N. Starosielski (Eds.), Signal traffic: Critical studies of media infrastructures. Champaign: University of Illinois Press.

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Department of Communication, Media and Film, University of Calgary, Calgary, Canada