
In prior chapters we discussed the optimization of the computing infrastructure inside the data center for energy efficiency. However, within the framework of the entire data center, this is only part of the energy story.

Data Center Management and Power Distribution

As mentioned in Chapter 1, the infrastructure surrounding the computers in the data center is equally important to consider in the context of overall energy efficiency. If the infrastructure is required to support the computing, it needs to be included in the overall energy equation. Data center infrastructure itself has multiple missions: along with sheltering the computers from hazards like humidity, extreme temperatures, and natural disasters, it provides office space for the engineers and technicians who operate the data center, and it manages the computing resources. The data center infrastructure handles energy delivery to the computing resources and the disposal of waste heat from them. It also fulfills a mission of resiliency by providing both physical security and some form of survivability planning in the event of power outages.

In this chapter, we will touch on many of these aspects, especially as they pertain to energy management in the data center.

Data Center Facilities

Data center facilities vary widely in form, scale, and architecture, depending on local conditions, economics, and data center requirements. For instance, data centers can be housed in large purpose-built structures, as special purpose spaces within existing buildings, or in previously existing buildings adapted for a new purpose.

Large purpose-built data centers—such as those built by large Internet companies like Google, Apple, Facebook, and Microsoft—tend to be located in geographies that provide low-cost power and close proximity to the large population centers they serve (with proximity generally measured by ping times of less than 10–20 milliseconds). These data centers may be built with facility power ranging from approximately 1 to 20 megawatts.

By way of contrast, many special purpose and general purpose data centers are housed within existing buildings. These data centers tend to be smaller and built to serve local or specialized purposes. Although smaller data centers can be built to high energy-efficiency standards, in many cases efficiency is of secondary importance to other considerations such as security, proximity to a specific physical location, or simply convenience.

Among the most interesting facilities are those built in buildings either renovated or adapted from another purpose. For instance, Google recently built a data center inside a converted paper mill in Finland.Footnote 1 Because demand for paper has fallen with shifting reading habits and the increased use of tablet computers, obsolete or excess paper mills, which are already equipped to supply large amounts of electricity and cooling water, can make good candidates for alternative data center sites.Footnote 2

Other data centers have been built inside underground cavesFootnote 3 or on mountain tops,Footnote 4 and some have even been proposed to float on off-shore barges.Footnote 5 In each case, although the physical infrastructure of these facilities is quite different, the need to supply large amounts of electricity and to provide ample capacity for removing that energy as waste heat is a common factor. It is the engineering of the power and cooling infrastructure that really distinguishes the efficiency of the data center.

Power Infrastructure

Although data centers may differ in mission, from providing network edge services, to core compute, to highly secure data processing and storage, in almost all cases they require highly conditioned, uninterruptible power to meet the high availability requirements their customers demand. The power to the data center needs to be highly conditioned to protect the servers, storage systems, and networking equipment in the data center from power transients. Both low-power conditions and power surges can cause equipment reliability issues and extended equipment downtime, depending on duration and severity. Figure 9-1 shows a highly schematic layout of the connection of the electrical grid to the data center. When electrical grid power is interrupted, the uninterruptible power supply (UPS) assumes the load of the data center until the back-up generators can be started and reach full capacity. Due to cost constraints, it’s typical to support only the computing equipment on an uninterruptible basis.

Figure 9-1. Schematic of the connection of the electrical grid to the data center

Power Distribution Efficiency

Although most of this book has concerned itself with the energy efficiency of the servers themselves, a key factor in overall data center efficiency is the energy lost in delivering power to the servers. So-called distribution losses (incurred in bringing power from a remote generation facility to the site of use) can range from a few percent in cases where the load is close to the generating capacity—such as the large data centers in Quincy, Washington, and The Dalles, Oregon, which are within a few miles of hydroelectric energy sources—to 10%–20% when data centers are several hundred miles from electricity generation. Although these losses can be extremely important to overall efficiency, addressing them relies primarily on the siting of the data center facility, which is outside the scope of this book.

Power Conditioning

Power conditioning for the data center is among the most important missions of the facility. The power for the servers needs to be both “clean,” meaning free from spikes or interruptions that might affect the availability of the servers in the data center, and economical, so the data center can achieve its mission at the lowest feasible cost. In this section, we’ll look at the high-level topologies of two ways this can be achieved, in what are called AC and DC power distribution.

In AC power distribution, power to the server is supplied as AC voltage, typically at 208 VAC in the United States. AC power distribution is by far the dominant approach in the industry. The flow of energy from the grid first powers the UPS system and then goes to the rack and row-level power distribution units (PDUs), where power is metered to the individual servers while at the same time protecting adjacent servers from electrical faults at any individual server.

Figure 9-2 shows a typical power distribution diagram for an AC data center. Power from the electrical grid feeds the UPS, which in turn provides clean, uninterrupted power to the rack or row-level PDUs. These units convert power to levels usable by the servers while at the same time protecting the facility from faults at individual servers. The power and voltage conversions are highlighted. The UPS is shown with a bypass to allow more efficient operation.

Figure 9-2. A typical power distribution diagram for an AC data center

One significant concern with this standard topology is the repeated conversion between AC and DC voltages between the grid and the server. Each conversion can result in a sacrifice of several percentage points of efficiency, which contributes to higher electricity costs.

Two of the conversions in the UPS—AC to DC and then back from DC to AC—can be eliminated by using a bypass or, more colloquially, the eco-mode of the UPS. Although there may be concerns about the switchover time between the bypass and battery power in the event of a power failure, these concerns have been largely mitigated by technical developments from suppliers. Typical switchover times are now about one quarter of a power cycle, far below the damage threshold for the IT equipment in the data center. The Green Grid has adopted the use of a bypass as a “best known method.”Footnote 6
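As a quick sanity check on that switchover figure, a quarter of a power cycle converts directly to milliseconds from the grid frequency. The following back-of-the-envelope sketch (assuming a 60 Hz US grid; the hold-up figure in the comment is a typical value, not a number from the text) makes the arithmetic explicit.

```python
# Quarter-cycle UPS bypass switchover time (illustrative arithmetic).
GRID_FREQUENCY_HZ = 60                    # US grid; 50 Hz in much of the world
cycle_ms = 1000 / GRID_FREQUENCY_HZ       # one full power cycle in milliseconds
switchover_ms = cycle_ms / 4              # quarter-cycle transfer time

# Server power supplies typically have hold-up times on the order of
# 10-20 ms (roughly one full cycle), so a ~4 ms transfer leaves margin.
print(f"One {GRID_FREQUENCY_HZ} Hz cycle: {cycle_ms:.1f} ms")   # ~16.7 ms
print(f"Quarter-cycle switchover: {switchover_ms:.1f} ms")      # ~4.2 ms
```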

As an alternative to AC, there is compelling evidence that DC power to the data center can improve overall efficiency.Footnote 7 The primary efficiency gains come from the elimination of inefficient conversions from AC to DC. An example topology is shown in Figure 9-3. Power from the electrical grid feeds a rectifying PDU, which provides 48 VDC to the batteries and the servers. The voltage and AC-to-DC conversion steps are highlighted.

Figure 9-3. A typical power distribution diagram for a DC data center

The DC power infrastructure is inherently simpler than the AC power infrastructure because the number of required conversions, and hence the amount of expensive high-reliability, high-power electrical gear, is reduced. However, the supply and experience base for AC power equipment is much larger, since AC has been and remains the dominant industry standard. For instance, although DC can be, and has been, used safely for years in the communications world, technicians in more traditional data centers are not currently trained to use it. Thus, although there may be some theoretical advantages of one approach over the other, the reality is that both high efficiency and low total cost can be achieved with either approach, provided proper engineering practices, such as the use of the UPS bypass in the AC case, are implemented.

Back-up Systems

Data center services need to be maintained even in the event of a power outage. For this, data centers rely on back-up power systems, generally composed of a short-term and a longer-term backup system. The short-term system is put into place to react quickly to fluctuations in supply and provide power, often for only a few minutes, until the higher-capacity, longer-term backup system can take over the full electrical load of the data center.

Uninterruptible power supplies (UPSs) have long been built using large arrays of lead-acid batteries, and many data centers still use this simple but reliable technology. Lead-acid batteries, which are essentially like the battery in a typical automobile, can be purchased on the open market. The technology is mature and has not changed significantly in years. A downside to using batteries is that they need to be maintained and replaced on a regular basis to provide the intended high reliability of a back-up system.Footnote 8

As a result, data center operators have sought and implemented alternative UPS schemes. One popular alternative is a high-speed rotating flywheel system. In this case, rather than storing energy chemically, as in a battery, energy is stored kinetically in a high-speed, high-mass rotating flywheel. Flywheel systems take up less space than a battery system and require significantly less maintenance. But they also provide only a few seconds (in the range of 10–20, though this can vary depending on the size of the installation and the load) of autonomy, implying that the back-up generators need to start up properly the first time. In addition, the size of the back-up generators needs to be increased slightly because, in addition to running the data center, they must also re-energize the flywheel system in a short amount of time to restore facility back-up.Footnote 9 Flywheel systems have the additional advantage that they run on AC supply, eliminating the need for DC conversion.
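To see why flywheel autonomy lands in the tens of seconds, note that the available energy is the kinetic energy released as the rotor slows from full speed to its minimum usable speed. The sketch below is a minimal sizing illustration; the inertia, speed range, and conversion efficiency are assumptions chosen only to land in the 10–20 second range cited above, not data for any real product.

```python
import math

# Hypothetical flywheel UPS ride-through estimate (all parameters assumed).
INERTIA_KG_M2 = 70.0                 # rotor moment of inertia
RPM_FULL, RPM_MIN = 7700.0, 4000.0   # usable speed range
CONVERSION_EFFICIENCY = 0.90         # electrical conversion losses
IT_LOAD_W = 1.0e6                    # 1 MW critical load

def kinetic_energy_j(inertia_kg_m2: float, rpm: float) -> float:
    """Rotational kinetic energy E = 1/2 * I * omega^2, in joules."""
    omega = 2 * math.pi * rpm / 60.0          # angular speed in rad/s
    return 0.5 * inertia_kg_m2 * omega ** 2

usable_j = (kinetic_energy_j(INERTIA_KG_M2, RPM_FULL)
            - kinetic_energy_j(INERTIA_KG_M2, RPM_MIN)) * CONVERSION_EFFICIENCY

print(f"Ride-through at 1 MW: {usable_j / IT_LOAD_W:.0f} s")  # ~15 s
```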

The most common type of long-term back-up power for a data center is a diesel generator. Diesel generators are very common not only in data centers but also in other critical facilities such as hospitals, so they have a well-understood maintenance record, and repair expertise is easy to find. Typically, diesel generators represent about 10% of the capital cost of a data center. Although some proposals for reducing the cost of the generator capacity through intelligent IT have been made, these have not been widely adopted.Footnote 10

There has been some innovation in data center backup power systems to avoid both the cost and some of the downsides of operating large diesel generating plants (with their concomitant exhaust and noise) on a regular basis. For instance, the Facebook data center in Sweden forgoes about 70% of the typical back-up generator capacity because it is able to take advantage of redundant electrical grids in the area. This approach is elegant and provides very high reliability at a very economical cost. However, redundant electrical grids are not common, so this approach is of limited use in most cases.

Fuel cells are another alternative to diesel generators. Fuel cells convert fuel (typically either hydrogen or methane) to energy in an electrochemical reaction, and they have many advantages over other power sources. In addition to being relatively compact and clean, they can be brought close to the load, eliminating grid and distribution losses. Indeed, some data center operators have considered eliminating grid energy entirely for this reason, reporting favorable overall total cost of ownership under realistic cost assumptions.Footnote 11

The eBay data center in Utah, commissioned in 2013,Footnote 12 gets most of its power from natural gas fuel cells built by Bloom Energy. According to eBay, the fuel cells reduce CO2 emissions about 50% and also increase the reliability of the data center. Indeed, the fuel cells are the primary source of energy for the data center, reducing costs associated with back-up generators and UPS systems.

Cooling Infrastructure

Providing power to the servers is one important side of the data center energy equation. On the opposite side, and equally important, is the removal of all the waste heat generated by the servers. It’s important to note that all the energy used to run the servers (and all the other equipment in the data center) must eventually be removed as waste heat. That implies, for instance, that for a 10 megawatt data center load, exactly 10 megawatts of waste heat need to be dissipated to balance the energy input to the facility.
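This one-for-one balance follows directly from conservation of energy, and it sets the scale of the cooling plant. A minimal illustration for a hypothetical 10 megawatt facility:

```python
# Energy balance for a hypothetical 10 MW IT load: every watt delivered
# to the servers ultimately leaves the facility as heat.
IT_LOAD_MW = 10.0
HOURS_PER_YEAR = 8760

heat_to_reject_mw = IT_LOAD_MW                       # steady-state balance
annual_energy_mwh = IT_LOAD_MW * HOURS_PER_YEAR      # ~87,600 MWh per year

print(f"Continuous heat rejection required: {heat_to_reject_mw:.0f} MW")
print(f"Annual thermal energy to remove: {annual_energy_mwh:,.0f} MWh")
```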

It’s instructive to understand the evolution of data center cooling, since it tells a strong story about advancing the infrastructure side of overall data center efficiency. Typical data centers built in the 1990s relied mainly on central computer room air conditioning (CRAC) units to provide cooling. Warm air coming from racks of servers was pulled into the CRAC units, chilled to temperatures as low as 60 degrees Fahrenheit, and pushed back out into the room through perforated raised-floor tiles. This served adequately, though hot spots did turn up (often due to poor uniformity of air circulation), and these required special treatment. In addition, cold air entering the computer room was immediately mixed with warmer air, reducing the effectiveness of the cooling units.

Figure 9-4 shows the layout of two data center types. The top diagram shows the “ballroom” configuration typical prior to the year 2000. Newer data centers, such as the one shown at the bottom of Figure 9-4, segregate warm and cold air and use local ambient air to economize operations.

Figure 9-4. Two data center types: a typical, old “ballroom” configuration (top) and today’s data center (bottom)

Starting in the early 2000s, engineers realized that segregating warm and cold air (delivering cold air to the server intakes and extracting the warm exhaust air from behind the racks) could improve the efficiency of the cooling units significantly. For instance, a study by T-Systems and Intel found that segregating hot and cold air could reduce the power usage effectiveness (PUE) of a model data center from a rating of 1.8 to around 1.3, a significant reduction in infrastructure energy use.Footnote 13 The work also showed that even small air leaks, if not controlled, could significantly reduce the efficiency of the infrastructure.
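It’s worth translating those PUE figures into infrastructure watts: for a fixed IT load, the overhead power is proportional to (PUE - 1), so the improvement is larger than the headline numbers suggest. The sketch below runs the arithmetic for an assumed 1 MW IT load (the load value is illustrative; only the PUE figures come from the study cited above).

```python
# Infrastructure overhead implied by the cited PUE figures, for an
# assumed 1 MW IT load.
IT_LOAD_KW = 1000.0

for pue in (1.8, 1.3):
    overhead_kw = IT_LOAD_KW * (pue - 1.0)   # non-IT (mostly cooling) power
    print(f"PUE {pue}: {overhead_kw:.0f} kW of infrastructure overhead")

# Overhead falls from 800 kW to 300 kW: roughly a 62% cut in infrastructure
# energy, even though PUE "only" drops from 1.8 to 1.3.
```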

The third major step in the evolution of data center infrastructure was the advent of what is called free-air cooling, the use of outside air to cool the data center. In many climates, outside air temperatures and humidity levels are low enough to effectively cool servers without additional help from air conditioners, provided air flows are maintained. There are many examples of data centers built to these standards, including the famous “chicken coop” data center, which relies on the tendency of warm air to rise so that crosswinds can carry it out of the building.Footnote 14 The industry has studied the capacity of different climates to support natural air cooling, and the Green Grid has published studies considering the effects of an updated ASHRAE standard that allows wider temperature and humidity ranges.Footnote 15

The final stage of data center facility evolution is the advent of the high-temperature data center. Silicon and computer hardware components can tolerate higher temperatures than humans can. Studies have shown that even off-the-shelf components can operate safely at 40°C.Footnote 16 To further reduce cooling costs, there is a push to provide operational capability at 50°C.

Simplified Total Cost Models of Data Center and Compute Infrastructure

A significant amount of work is available on total cost of ownership (TCO) models, including an excellent (and publicly available) one developed by Jonathan Koomey that gives significant detail for determining accurate cost benchmarks.Footnote 17

In this section, rather than providing detailed models, we focus on a higher-level perspective to help you understand the larger trends in cost. There is an inherent danger in using simplified models to make what can be relatively complex business decisions, but using them to shape insight and recognize macroscopic trends can be useful.

Figure 9-5 shows a simplified TCO model of a 10,000 square foot data center. The model calculates the overall TCO of the entire data center, including building, infrastructure, and IT equipment costs. As important as what is in the model is what is not included. Important cost parameters like architectural choices, software licenses, labor and warranty coverage, insurance, and taxes could tip the conclusions of the model substantially and would need to be added for any responsible business decision.

Figure 9-5. A simplified TCO model of a 10,000 square foot data center calculated for nominal values

At the most basic level, a cost model takes into account capital costs (generally assumed to be incurred once and then amortized over some standard depreciation period to annualize the expense) and ongoing operating expenses, which are paid on an as-used basis. Important parameters are highlighted in Figure 9-5. Figure 9-6 shows that operational electrical costs are a sizeable fraction of the overall TCO of a data center. Both the annual electrical energy cost and the electrical and cooling infrastructure cost scale directly with the watts consumed by the servers; thus, a net reduction in server power can save on both operational and capital costs in the data center.
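A minimal version of such a model fits in a few lines: annualized capital costs for infrastructure and IT, plus an energy term that scales with server watts and PUE. The sketch below is illustrative only; every parameter value is an assumption for the sake of the example, not a figure taken from the model of Figure 9-5.

```python
# Simplified annual TCO sketch (all parameter values are assumptions).
def annual_tco_usd(it_load_w: float,
                   infra_cost_per_w: float = 12.0,   # $/W, within the $8-$20 range discussed below
                   it_cost_per_w: float = 6.0,       # $/W of server/network capital
                   infra_lifetime_yr: float = 12.0,  # amortization periods
                   it_lifetime_yr: float = 4.0,
                   pue: float = 1.5,
                   energy_cost_per_kwh: float = 0.07) -> float:
    infra_capex = it_load_w * infra_cost_per_w / infra_lifetime_yr
    it_capex = it_load_w * it_cost_per_w / it_lifetime_yr
    # Energy drawn by the facility = IT watts * PUE, billed per kWh.
    energy_opex = (it_load_w / 1000.0) * pue * 8760 * energy_cost_per_kwh
    return infra_capex + it_capex + energy_opex

# For a 1 MW IT load: note that both the energy term and the infrastructure
# term scale linearly with server watts, as observed above.
print(f"Annual TCO: ${annual_tco_usd(1.0e6):,.0f}")
```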

Figure 9-6. The operational electrical costs are a sizeable fraction of the overall TCO of a data center

The range of electrical infrastructure costs depends primarily on architectural choices, as mentioned earlier. Decisions such as the kind and number of back-up generators, electrical redundancy, layout, and location can all affect costs widely. In general, the range of $8.00 to $20.00 per watt represents reasonable estimated bounds, though deviations both above and below this range are possible.

Data center efficiencies ranging from PUE = 1.1 up to 3.0, where PUE is the power usage effectiveness as defined in Chapter 1, are known in the industry. Generally, the PUE of older facilities is much higher than that of more modern facilities, which use ambient cooling to reduce overhead costs.

Performance per Watt per Dollar

It’s common to hear people concerned about data center costs talk in terms of “performance per watt per dollar,” yet there seems to be no good description of this phrase in the published literature. In this section, we briefly discuss where terms with units of performance per watt per dollar come into the cost of ownership equations.

To understand this, we need to add one more dimension, which we call computational work. Computational work is not the same as physical work, but it can nevertheless be thought of in a similar way: making a physical change on a system (in the case of a computer, on the bits). This physical change requires energy, and thus the energy required to make the change can be equated to work done.

For the sake of the present case, we’ll consider the computational work rate of a data center to be a number, T, of transactions per second. The specific type of transaction isn’t important, and we’ll make the simplifying assumption that these transactions are uniform. If the capacity of a server, called its performance, is p transactions per second, then the number of servers, N, required in the data center is just

N = T/p

Now the total power use of the data center, P_total, will be the power required by the servers plus the power used by the infrastructure, which, expressed in terms of the PUE, is

P_total = N * P_server * PUE

which, upon substituting the relationship above, becomes

P_total = T * P_server * PUE / p

Noting that energy equals power times time, we can annualize the costs by considering the power and the transaction rate, T, over a standard interval of time, which we’ll take to be one year. With some simple algebra, we can express cost efficiency in terms of transactions per dollar of operational energy cost:

T / (P_total * Energy Cost) = p / (P_server * Energy Cost * PUE)

Note that the period of time over which the costs are averaged cancels in both the numerator and the denominator, as we’d expect. The right-hand side has units of performance per watt per dollar. This equation has intuitive appeal: to maximize operational efficiency, you want to maximize the transactions per unit of energy cost. The equation highlights that this is achieved by maximizing system performance, minimizing server power and energy costs, and maximizing data center infrastructure efficiency. Maximizing performance per watt per dollar minimizes the cost per transaction.
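The relationship is easy to put into executable form. The sketch below (with purely illustrative parameter values) evaluates the right-hand side of the equation, converting an energy price quoted in dollars per kilowatt-hour into transactions per dollar of operational energy.

```python
# Transactions per dollar of operational energy: the right-hand side of
# the equation above. All example values are illustrative assumptions.
def transactions_per_energy_dollar(perf_tps: float,
                                   server_power_w: float,
                                   energy_cost_per_kwh: float,
                                   pue: float) -> float:
    """p / (P_server * Energy Cost * PUE), with units made explicit.

    perf_tps:       p, server performance in transactions per second
    server_power_w: P_server, average server power draw in watts
    """
    cost_per_wh = energy_cost_per_kwh / 1000.0   # convert $/kWh to $/Wh
    # Multiply by 3600 s/h so seconds (transactions) and hours (energy) agree.
    return perf_tps * 3600.0 / (server_power_w * cost_per_wh * pue)

# Example: a 5,000 tps server drawing 300 W at $0.07/kWh in a PUE 1.3 facility.
tpd = transactions_per_energy_dollar(5000.0, 300.0, 0.07, 1.3)
print(f"{tpd:,.0f} transactions per dollar of energy")   # ~659 million
```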

Summary

In this chapter we have shown that maximizing data center efficiency resolves to maximizing the energy efficiency of both the servers and the data center infrastructure. Power distribution plays a large role in the management of overall data center efficiency. Although AC dominates the current design and distribution of power to data centers, DC offers equivalent efficiency with fewer power conversions. As data centers move to alternative power sources like solar and fuel cells, which inherently provide DC power, we can expect DC to gain a greater foothold in data center design.

Finally, we have shown that total cost of ownership presents a complex set of trade-offs in optimizing overall data center design, with server performance and efficiency gains being among the most powerful variables in the overall optimization. This optimization reduces to maximizing “performance per watt per dollar” in order to achieve maximum cost efficiency.