14.1 Introduction

During the past few decades, complementary metal-oxide semiconductor (CMOS) technology scaling along Moore’s law has been the semiconductor industry’s classical answer to the ever-increasing demand for lower cost and higher performance [1–4]. However, in the nanometer regime, the pace of transistor scaling has been slowing because of severe short-channel effects, increasing variability, and power/thermal problems [5–7]. Also, as the functionality (transistor count) of a planar (single active-layer) integrated circuit (IC) increases, the complexity of interconnecting the devices increases dramatically and requires a large number of metal layers. Consequently, the performance improvement from transistor scaling cannot be fully exploited and has gradually become constrained by interconnects [8].

Under this scenario, three-dimensional (3D) integration has been proposed as a promising technology to overcome the interconnect bottleneck in future advanced nanoscale ICs [9, 10]. A 3D integration scheme involves monolithic stacking of multiple active layers and considerably reduces the number and average length of the longest global wires seen in traditional planar (2D) chips by providing shorter “vertical” paths for connection (Fig. 14.1). Besides the interconnect performance benefits [9–11], heterogeneous integration of CMOS designs with various non-silicon technologies (SiGe, GaAs, InP, etc.) is expected to be easier to realize in a single chip with a 3D architecture (different active layers) than with existing 2D (planar) chips [9].

Fig. 14.1

Cross-sectional view of a vertically stacked n active-layer IC interconnected by through-wafer vias (TWV). From [12] © 2006 ACM, Inc. Included here by permission

Although 3D technology promises significant benefits, thermal issues are expected to offset the gains from this technology by degrading performance and reliability [9, 13]. When a 2D chip is redesigned into a 3D structure with the same functionality (identical power dissipation in a smaller footprint), the power density increases dramatically and the heat problem becomes worse. Similarly, heat and thermal problems are expected to be exacerbated when many planar (2D) designs are integrated by stacking one layer on top of another [13–15]. Because of the low thermal conductivity of the dielectrics between active layers [16], heat generated in the stacked active layers located away from the heatsink is difficult to remove, which leads to a temperature gradient in the vertical direction of a 3D chip. Therefore, heat and thermal considerations are imperative for determining the practical applicability of 3D technology and for evaluating various 3D design options.

Traditionally, thermal infrared (IR) imaging systems have been used for acquiring thermal profiles of 2D ICs [17, 18]. Such a system offers limited resolution of substrate thermal profiles and is not suitable for 3D designs with multiple layers. Similarly, integrated thermal sensors are commonly employed to ensure that hot-spots do not exceed the specified maximum temperature in high-performance ICs. However, only a limited number of sensors can be integrated into each active layer because of routing and pin-out constraints. Most importantly, these techniques can only provide thermal profiles after fabrication, which is too late for early design optimization. Hence, an accurate chip-level thermal profile estimation methodology is necessary, especially for 3D applications.

In this chapter, we focus on the thermal challenges of 3D (multilayer) ICs and discuss their implications for thermal management and their mitigation methods. First, the impact of heat on device and interconnect reliability is briefly described. Then, an analytical die temperature model of a multilayer IC is reviewed. In addition, the origin of various electrothermal couplings between chip power, substrate (die) temperature, operating frequency, and supply voltage is discussed. Subsequently, an accurate chip-level, leakage-aware methodology for 3D IC thermal profile estimation is illustrated. It self-consistently takes the various electrothermal couplings into consideration and uses a realistic package thermal model that comprehends the different packaging layers and the noncubic structure of the package. Finally, implications of the temperature profiles generated by the proposed methodology for 3D IC power estimation and thermal management are discussed.

14.2 Thermal Effects in 3D ICs in the Nanometer Regime

While continued scaling of CMOS technologies provides substantial benefits in the form of higher transistor packing density, higher circuit performance, and lower cost of ICs, power consumption and power densities (Watts per unit chip area) have been increasing steadily [5, 6]. Moreover, as CMOS has scaled from generation to generation, power dissipation has historically increased proportionately to increasing transistor density and switching speeds. However, with the minimum feature size of the transistor entering the nanometer regime (<100 nm), leakage power has become a significant fraction of the overall chip power [7]. Also, most leakage mechanisms are strongly temperature dependent. This strong coupling between temperature and leakage can cause further increase in total power dissipation. When several active layers are stacked (3D architecture), the heat and thermal problems are expected to be exacerbated [13, 15].

14.2.1 Impact of Heat on Device and Interconnect Reliability

Elevated substrate temperature is widely known to have a strong impact on the performance and lifetime of devices and interconnects under “field”, “accelerated testing”, and “burn-in” conditions. Major back-end and front-end reliability mechanisms, including electromigration (EM), time-dependent dielectric breakdown (TDDB), and negative-bias temperature instability (NBTI), depend strongly on temperature, so higher temperature increases the risk of damaging devices and interconnects even with advanced thermal management technologies [19–21]. Moreover, the increase in the number of interconnect levels and the introduction of low-κ dielectric materials with poor thermal conductivity have made chip-level thermal problems even worse [13, 16, 22]. Hence, there is a critical need to accurately estimate the silicon substrate thermal gradients and temperature profile for the development and thermal management of future generations of high-performance ICs, including 3D chips.

14.2.2 Analytical Average Die Temperature Model

A schematic diagram of a 3D IC with n active layers is illustrated in Fig. 14.2a. Each active layer (including one chip layer and one metallization layer) is separated by the glue layer. The 3D IC can be represented by a first-order equivalent thermal circuit as shown in Fig. 14.2b where P and T denote the power dissipation and the average die temperature of each active layer [13].

Fig. 14.2

(a) A 3D (n layers) IC. Thermal solutions (package and heatsink) are attached to the substrate of chip layer 1. (b) Equivalent thermal circuit used for evaluating the temperature of the 3D stack. P and T denote the power dissipation and the average die temperature of each active layer, respectively. T amb is the ambient temperature. θ ja and θ layer represent the junction-to-ambient and layer-to-layer effective thermal resistance, respectively

Preliminary analysis for estimating the average die temperature of a 3D IC can be carried out by employing the equivalent thermal circuit. Assuming the direction of heat flow is from layer n to layer 1 and to the heatsink, the die temperature of the first layer can be estimated by the following:

$$ T_1 = T_{\textrm{amb}} + \theta _{\textrm{ja}} \cdot \left( {\sum\limits_{k = 1}^n {P_k } } \right), $$
(14.1)

where T amb represents the ambient temperature and θ ja denotes the effective junction-to-ambient thermal resistance. Similarly, the temperature rise (above T amb) of each active layer can be calculated by the following analytical expression, where θ i is θ ja (when i = 1) and θ layer (when i > 1):

$$ \Delta T_j = \sum\limits_{i = 1}^j {\left[ {\theta _i \cdot \left( {\sum\limits_{k = i}^n {P_k } } \right)} \right].} $$
(14.2)
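As an illustration, (14.2) can be evaluated numerically. The following Python sketch is not part of the original model description: the function name `layer_temperature_rises` and all numerical values (four layers dissipating 10 W each, θ ja = 0.5 °C/W, θ layer = 0.2 °C/W) are assumptions chosen only to make the arithmetic easy to follow.

```python
def layer_temperature_rises(powers, theta_ja, theta_layer):
    """Temperature rise above ambient of each active layer, following (14.2).

    powers[0] belongs to layer 1 (next to the package and heat-sink),
    powers[-1] to the uppermost layer n. Powers in W, resistances in degC/W.
    """
    rises = []
    rise = 0.0
    for j in range(len(powers)):
        # theta_i is theta_ja for the first resistance and theta_layer otherwise.
        theta = theta_ja if j == 0 else theta_layer
        # The heat crossing this resistance is everything generated in layers j..n.
        rise += theta * sum(powers[j:])
        rises.append(rise)
    return rises


if __name__ == "__main__":
    # Hypothetical stack: values are illustrative, not taken from the chapter.
    for layer, dT in enumerate(layer_temperature_rises([10.0] * 4, 0.5, 0.2), start=1):
        print(f"layer {layer}: rise of about {dT:.1f} degC above ambient")
```

With these assumed values the rise grows from roughly 20 °C at layer 1 to roughly 32 °C at layer 4. Passing a nonuniform `powers` list shows how the contribution of each layer accumulates through every resistance between it and the ambient.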

According to the first-order equivalent thermal circuit in Fig. 14.2b, heat can only flow toward T amb (package and heat-sink). Thus, the highest temperature is that of the uppermost (nth) layer. From [13], the temperature rise of the uppermost layer is expected to increase with the square of the number of active layers (~n²) under the assumptions of identical power dissipation in each layer and identical thermal resistance between adjacent layers. Hence, heat and thermal problems are expected to worsen (the highest temperature increases quadratically) as the number of active layers in a 3D IC increases [13].
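The quadratic dependence can be recovered directly from (14.2). As a sketch under the stated assumptions, let every layer dissipate the same power P and let every layer-to-layer resistance equal θ layer. Setting j = n in (14.2) then gives

$$ \Delta T_n = \theta _{\textrm{ja}} \cdot nP + \theta _{\textrm{layer}} \cdot P\sum\limits_{i = 2}^n {\left( {n - i + 1} \right)} = \theta _{\textrm{ja}} \cdot nP + \theta _{\textrm{layer}} \cdot P \cdot \frac{{n\left( {n - 1} \right)}}{2}. $$

The first term grows linearly with n, whereas the second term grows as n²/2 and dominates once the stack contains many layers.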

Although the first-order analytical model comprehends the thermal couplings between different active layers, the analysis simply employs a 1D equivalent thermal circuit with constant, average power dissipation at each layer, which arbitrarily imposes the direction of heat transfer. This assumption (average power for each active layer) results in increasingly higher temperatures at the layers that are further away from the heat-sink and can mislead the estimation of temperature in a multiple-layer (3D) IC. As shown in Fig. 14.3, when the power dissipation is nonuniform, the temperature can be lower at a point that is further away from the heat-sink. Moreover, the first-order 1D analysis ignores the electrothermal couplings within each active layer (e.g., correlations between temperature and power), which become critical in nanoscale designs [23]. Finally, as the power dissipation of the entire 3D IC increases, a more detailed consideration of thermal solutions (e.g., package and heat-sink in Fig. 14.2a) must be taken into account (described in the following subsections).

Fig. 14.3

Cross-sectional view of a steady-state temperature profile. The power dissipation of the active layer (defined by two dotted lines) is nonuniform (only one heat source). The boundaries are assumed to be adiabatic except the top (attached to heat-sink)

14.2.3 Origin and Significance of Electrothermal Couplings

Typically, switching power and leakage power are the two major contributors to total chip power dissipation. The short-circuit component is relatively small and can be considered as a constant factor of total power [24, 25].

The switching power results from the charging and discharging of circuit capacitances between different voltage levels and increases with the chip frequency and supply voltage. The leakage power, especially subthreshold leakage, used to be negligible but is rapidly becoming the dominant contributor to the total chip power because it is highly temperature sensitive (being based on thermionic emission) (Fig. 14.4a) and worsens with technology scaling (Fig. 14.4b). Note that gate leakage (tunneling based) is temperature independent and can be mitigated by gate engineering [26]. Also, the junction (diode) leakage is relatively small compared to subthreshold leakage [27].

Fig. 14.4

(a) Transistor off-state leakage current for N-metal-oxide semiconductor field-effect transistor (N-MOSFET) (45- and 90-nm effective channel lengths) based on BSIM3 models as a function of operating temperature. (b) Leakage power dissipation of an NMOS device for different technology nodes based on BSIM3 models showing the impact of temperature. The leakage power dissipation is normalized with respect to (w.r.t) the value at 130-nm node at 25°C. (c) Nominal supply voltage, threshold voltage, and static power based on ITRS’04, as technology scales. From [28] © 2007 IEEE

The subthreshold leakage increases significantly because supply voltage (V dd) scaling necessitates threshold voltage (V th) scaling to maintain the required performance, according to the International Technology Roadmap for Semiconductors (ITRS) prediction (Fig. 14.4c).

In addition, elevated temperature lowers the threshold voltage of the transistor, and thus increases the leakage further [29]. Moreover, since the gap between the wavelength of light for optical lithography and the polysilicon gate length is increasing (Fig. 14.5a) [30], device channel length exhibits a significant amount of within-die variations [31], which in turn, leads to a significant impact on the distribution of leakage as shown in Fig. 14.5b.

Fig. 14.5

(a) Increasing gap between polysilicon gate length and lithographic wavelength for different technology nodes [30]. (b) Distributions of frequency and standby leakage current for different microprocessors on a single wafer. (Courtesy: S. Borkar, Intel) (c) Transistor drive (drain) current for N-MOSFET (45- and 90-nm effective channel lengths) based on BSIM3 models as a function of operating temperature. From [28] © 2007 IEEE

The performance itself depends on temperature due to the dependence of the transistor on-current on operating temperature. Although the threshold voltage decreases at higher operating temperature and partially offsets the performance degradation resulting from the lower carrier mobility, the transistor on-current still decreases at higher operating temperatures (Fig. 14.5c).

The increase in total chip power consumption causes higher die temperature, which further increases subthreshold leakage. Therefore, a strong feedback loop builds up, leading to various electrothermal couplings [23], which had been inconspicuous in earlier generations of ICs. Fig. 14.6 illustrates such electrothermal couplings between performance, power dissipation, supply voltage, threshold voltage, and die temperature.

Fig. 14.6

(a) Models for various metrics are expressed in functional format. Couplings are indicated using broken lines. L nom is the nominal gate length, α is the switching activity, C is the total load capacitance, F is the operating frequency, t ox is the gate oxide thickness, and X j is the junction depth. (b) Electrothermal couplings between different design metrics. As technology scales, the couplings between total power, leakage, and temperature (shown by dotted arrows) become increasingly prominent. From [28] © 2007 IEEE
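The strength of the power-temperature feedback loop can be gauged with a deliberately simplified worked equation. The following is only a sketch, not the model used in this chapter: it assumes a single lumped thermal node with junction-to-ambient resistance θ ja, a temperature-independent switching power P sw, and a temperature-dependent leakage power P leak(T) (both symbols are introduced here for illustration). At steady state,

$$ T = T_{\textrm{amb}} + \theta _{\textrm{ja}} \cdot \left[ {P_{\textrm{sw}} + P_{\textrm{leak}} (T)} \right]. $$

If this relation is iterated, a small temperature perturbation δT in one pass changes the leakage power and returns in the next pass as

$$ \delta T^{(m + 1)} \approx \theta _{\textrm{ja}} \cdot \frac{{dP_{\textrm{leak}} }}{{dT}} \cdot \delta T^{(m)} . $$

Under these assumptions the loop settles to a finite temperature only while θ ja · dP leak/dT < 1. Because leakage rises steeply with temperature, this factor grows with both scaling and poorer cooling, which is one way to see why the couplings become prominent and why thermal runaway is possible.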

14.3 Self-Consistent Temperature Estimation for 3D ICs

As elevated and nonuniform temperature in a 3D IC extensively impacts reliability, performance, and thermal management, acquiring an accurate temperature profile of each active layer is necessary in the early design stage (before the 3D chip is fabricated). In this section, a self-consistent 3D temperature profile estimation methodology is presented. The method incorporates the electrothermal couplings as well as a realistic package thermal model to improve the accuracy of the thermal profile estimation, and it is implemented using one of the widely used efficient algorithms for solving heat diffusion equations.

14.3.1 Typical Chip Package Structure and Heat Transfer Mechanisms

Due to the increase in silicon junction temperature for nanometer-scale technologies, packaging has been transformed from playing the traditional role of a protective mechanical enclosure to a sophisticated thermal management platform [32, 33]. Fig. 14.7 illustrates a cross-sectional view of a typical package structure of a planar high-performance IC including a “flip-chip land grid array” package and a socket that interfaces with the printed circuit board. The die is mounted on a package substrate (carrier).

Fig. 14.7

A typical package assembly (drawing not to scale). Although the package structure is specific for a planar (2D) high-performance IC, the package can be applied to 3D ICs as well. From [28] © 2007 IEEE

Along the main heat transfer path shown in Fig. 14.7, the die and the package substrate are attached to an integrated heat spreader (IHS). The IHS, with a larger area than the die, spreads the nonuniform heat from the die region to the top of the IHS. This improves the heat flux from a smaller die area to a larger surface that serves as the mating surface for the heat-sink. Since the surfaces of these three major components (die, IHS, and heat-sink) are never smooth enough to make perfect contact, they are bonded together with a thermal interface material (TIM) applied between them. The TIM compensates for the poor thermal conduction caused by surface roughness (the conductivity of the TIM is much larger than that of air) and thus enhances the overall thermal performance of the packaging stack-up and cooling mechanisms.

There is a second heat transfer path from the die to the printed circuit board, through the interconnect and dielectric layers, input/output (I/O) pads, and carrier as shown in Fig. 14.7. The thermal resistance of this path (from junction to the printed circuit board) is normally several orders of magnitude higher than that of the major heat transfer path [34]. Therefore, this path can be neglected in the analysis because of the small fraction of heat it can transfer.
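A rough estimate shows why the secondary path carries so little heat. As a sketch, assume the two paths act as parallel thermal resistances between the junction and a common ambient temperature, with θ main for the main path and θ sec for the path through the printed circuit board (both symbols are introduced here for illustration). The fraction of the total heat that flows through the secondary path is then

$$ \frac{{Q_{\textrm{sec}} }}{{Q_{\textrm{total}} }} = \frac{{\theta _{\textrm{main}} }}{{\theta _{\textrm{main}} + \theta _{\textrm{sec}} }}. $$

For example, if θ sec were three orders of magnitude larger than θ main, only about 0.1% of the heat would leave through the board.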

Heat is a form of energy that can be transferred as a result of temperature difference by three different modes: (1) conduction, in which heat passes through the matter itself, (2) convection, in which heat is transferred by relative motion of portions of the heated body, and (3) radiation, in which heat is directly transferred between distant portions of the body by electromagnetic radiation. The effect of radiative heat losses can be neglected (effects of heat conduction and convection are considered) since its influence is negligible when forced convection is employed in most high-performance ICs [35]. The silicon die is the main source of heat generation. Heat can be exchanged and transferred by conduction within the entire packaging stack-up and by convection at the surface of the heat-sink.

14.3.2 Full-Chip Package Thermal Model

Practical packaging structures typically employ a heat spreader and a heat-sink with larger dimensions than the die to improve the thermal performance of the main heat transfer path (the realistic package thermal model is shown in Fig. 14.8a). In practice, the areas of the heat spreader and heat-sink are at least 9× and 30× larger than the area of the die, respectively. Note that the packaging structure not only involves different materials with different thermal properties, but the dimensions of its layers also differ from those of the silicon die, which significantly influences the heat transfer as well as the substrate thermal profile. The cubic package thermal model, on the other hand, refers to a model in which all package layers have identical areas and dimensions.

Fig. 14.8

(a) Side view of a realistic package thermal model indicating different dimensions for each layer. The thickness of different layers and the dimension of the layers are not drawn to scale. Note that the die could be planar or 3D IC. (b) Sketch of the discretization of the thermal packaging stack-up. Each node (circle) represents a discretized cell with a temperature value (T). Each discretized cell has six adjacent cells connected by edges (lines). Relationships between two adjacent cells are governed by (14.5) or (14.6) depending on heat transfer mechanisms. Effective thermal conductivity of cells between two adjacent layers (darker nodes) can be determined by (14.7) since the dimensions of a discretized cell are equal (i.e., dx = dy = dz). From [28] © 2007 IEEE

The temperature profile cannot be solved analytically due to the presence of complex geometry and complicated boundary conditions. Thus, numerical approaches will be employed for thermal profile estimation.

The fundamental physics of heat transfer in a chip is governed by the following 3D heat conduction equation and is subject to heat convection as the boundary condition [36]:

$$ \rho C_p \frac{\partial }{{\partial t}}T(x,y,z,t) = \nabla \cdot \left[ {k(x,y,z,t)\nabla T(x,y,z,t)} \right] + g(x,y,z,t) $$
(14.3)
$$ k(x,y,z,t)\frac{\partial }{{\partial n_i }}T(x,y,z,t) = h\left[{T(x,y,z,t) - T_{\textrm{amb}} } \right] $$
(14.4)

where ρ is the density of the material (kg m⁻³), C p is the specific heat of the material (J kg⁻¹ °C⁻¹), T is the temperature (°C), k is the thermal conductivity of the material (W m⁻¹ °C⁻¹), g is the internal heat generation (W m⁻³), n i is the outward direction normal to the boundary surface, h is the convective heat transfer coefficient (W m⁻² °C⁻¹), and T amb is the temperature of the ambient air surrounding the package, measured at a specified distance sufficiently far away from the surface of the entire package. Note that k is a measure of the ability of the material to conduct heat. Although it varies with temperature, the variation is relatively small within the range of operation [36]. Hence, a constant value of k at the nominal temperature is employed for each material in the packaging structure. Also, for each layer, the thermal conductivity is identical in all directions (i.e., the material of each packaging layer is considered to be isotropic and homogeneous).

The aforementioned partial differential equations and boundary conditions can be rewritten as (14.5) and (14.6) where the temperature (T) is a function of the position (x,y,z) and time (t).

$$ \frac{{\partial T}}{{\partial t}} = \left( {\frac{k}{{\rho C_p }}}\right)\left( {\frac{{\partial ^2 T}}{{\partial x^2 }} +\frac{{\partial ^2 T}}{{\partial y^2 }} + \frac{{\partial ^2T}}{{\partial z^2 }}} \right) + \frac{p}{{\rho C_p }} $$
(14.5)
$$ \frac{{\partial T}}{{\partial n_i }} = \frac{h}{k}\left[ {T - T_{\rm amb} } \right]$$
(14.6)

Electrothermal couplings are incorporated into the thermal model and the parameter p in (14.5) is a function of temperature, time, and the position within the die. Unlike the constant quantity g in (14.3), the parameter p represents the heat generation including electrothermal couplings and is recalculated at each evaluation step in a self-consistent manner.

The entire thermal packaging stack-up (packaging material layers) is discretized based on a typical high-performance package structure as in Fig. 14.7. Relationships between discretized cells are governed by the heat partial differential equations and boundary conditions shown in (14.5) and (14.6). Physical thermal parameters, such as thermal conductivity, density, and specific heat of the different layers, depend on material properties. Note that the dimensions of a discretized cell are chosen to be equal (i.e., dx = dy = dz). Thus, the effective thermal conductivity (k eff) of cells between two adjacent layers, as represented by the darker nodes in Fig. 14.8b between layer 1 and layer 2, can be simply determined by (14.7).

$$ \frac{2}{{k_{\rm eff} }} = \left( {\frac{1}{{k_1 }} + \frac{1}{{k_2 }}} \right),$$
(14.7)

where k 1 and k 2 represent the thermal conductivity of material in layer 1 and layer 2, respectively. A perfect thermal contact between the TIM layer and the adjacent materials is assumed since TIM is applied between two different layers to reduce the thermal contact resistance caused by surface roughness.
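Equation (14.7) is the harmonic mean of the two conductivities, which corresponds to two equally thick half-cells conducting heat in series. The short Python sketch below evaluates it; the function name and the two conductivity values are illustrative assumptions, not data from this chapter.

```python
def effective_conductivity(k1, k2):
    """Effective conductivity of a cell straddling two layers, per (14.7).

    Rearranged from 2/k_eff = 1/k1 + 1/k2, valid for equal cell dimensions.
    """
    return 2.0 * k1 * k2 / (k1 + k2)


if __name__ == "__main__":
    # Hypothetical values in W/(m degC): a good conductor next to a poor one.
    k_good, k_poor = 150.0, 4.0
    print(f"k_eff is about {effective_conductivity(k_good, k_poor):.2f} W/(m degC)")
```

The result can never exceed twice the smaller conductivity, so the poorer conductor controls the heat flow across the interface.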

14.3.3 Numerical Approach and Methodology Overview

Partial differential equations (PDEs) of the general form shown in (14.8) are classified as parabolic PDEs (where φ is a function of x, y, z, and t) [36, 37] and can be solved using the finite difference approximation by two well-known approaches: explicit and implicit methods.

$$ \frac{{\partial \varphi }}{{\partial t}} = \alpha \left({\frac{{\partial ^2 \varphi }}{{\partial x^2 }} + \frac{{\partial ^2\varphi }}{{\partial y^2 }} + \frac{{\partial ^2 \varphi}}{{\partial z^2 }}} \right) $$
(14.8)

The explicit method is simple and straightforward [36, 37]. The explicit method calculates the state of a system at the next time step from the state of the system at the current time. However, in many cases, time steps must be very small to maintain stability; this results in long computation time for a steady-state analysis. In order to overcome the aforementioned disadvantages of the explicit method, the implicit method considers both the current state and the state at the next time step [36, 37] and the stability can be maintained over much larger values of time step. However, this method is more complicated to set up and massive matrix manipulations require a considerable amount of computation memory and runtime for each time step.
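The time-step restriction of the explicit method can be demonstrated on a small example. The sketch below is an illustration only: it marches the 1D form of (14.8) with the standard forward-time, centered-space update, for which the well-known stability limit is α∆t/∆x² ≤ 1/2 (with equal grid spacing in three dimensions the limit tightens to 1/6). The function name, grid size, and step count are assumptions.

```python
import numpy as np


def explicit_peak_after(r, steps=200, n=21):
    """March the 1D heat equation explicitly and return the largest |value|.

    r = alpha * dt / dx**2 is the dimensionless time step. Both end cells are
    held at zero (fixed-temperature boundaries, assumed for brevity).
    """
    T = np.zeros(n)
    T[n // 2] = 1.0  # a single hot cell as the initial condition
    for _ in range(steps):
        T_next = T.copy()
        # New state computed only from the current state (explicit update).
        T_next[1:-1] = T[1:-1] + r * (T[2:] - 2.0 * T[1:-1] + T[:-2])
        T = T_next
    return np.abs(T).max()


if __name__ == "__main__":
    for r in (0.4, 0.6):
        print(f"r = {r}: largest |T| after 200 steps = {explicit_peak_after(r):.3e}")
```

With r = 0.4 the hot spot simply diffuses away, whereas with r = 0.6 the solution oscillates and grows without bound, even though the time step is only 50% larger.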

The alternating direction implicit (ADI) method is a widely used algorithm for the numerical solution of parabolic PDEs involving multiple spatial variables [38, 39]. Its advantage arises from transforming a multidimensional parabolic PDE into a succession of 1D problems. Therefore, no large-scale matrix has to be solved, and the method is easy to implement. The ADI method is thus employed as the core algorithm for solving the heat PDEs with higher computational efficiency. It is important to note that other computationally efficient methods exist, and choosing any one of them over the others does not affect the accuracy of the results.
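To make the idea of "a succession of 1D problems" concrete, the following Python sketch implements one step of the classical Peaceman-Rachford ADI scheme for the 2D form of (14.5). It is a minimal illustration under several assumptions and is not the simulator described in this chapter: the domain is two-dimensional, the material is uniform, the temperature is held at zero (i.e., at ambient) just outside the grid, and all function names are hypothetical. A full 3D solver requires a third sweep and a suitable 3D ADI variant.

```python
import numpy as np


def thomas(a, b, c, d):
    """Solve a tridiagonal system (a: sub-, b: main, c: super-diagonal, d: rhs)."""
    n = len(d)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):  # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def second_diff(T, axis):
    """T[i-1] - 2 T[i] + T[i+1] along one axis, with zeros assumed outside the grid."""
    P = np.pad(T, 1)
    if axis == 0:
        return P[2:, 1:-1] - 2.0 * T + P[:-2, 1:-1]
    return P[1:-1, 2:] - 2.0 * T + P[1:-1, :-2]


def adi_step(T, src, r, dt):
    """Advance T by one time step dt; r = alpha*dt/dx**2 and src = p/(rho*C_p)."""
    n0, n1 = T.shape
    # Half step 1: implicit along axis 0, explicit along axis 1.
    rhs = T + 0.5 * r * second_diff(T, 1) + 0.5 * dt * src
    T_half = np.empty_like(T)
    off, diag = np.full(n0, -0.5 * r), np.full(n0, 1.0 + r)
    for j in range(n1):  # one independent tridiagonal solve per column
        T_half[:, j] = thomas(off, diag, off, rhs[:, j])
    # Half step 2: implicit along axis 1, explicit along axis 0.
    rhs = T_half + 0.5 * r * second_diff(T_half, 0) + 0.5 * dt * src
    T_new = np.empty_like(T)
    off, diag = np.full(n1, -0.5 * r), np.full(n1, 1.0 + r)
    for i in range(n0):  # one independent tridiagonal solve per row
        T_new[i, :] = thomas(off, diag, off, rhs[i, :])
    return T_new


if __name__ == "__main__":
    T = np.zeros((16, 16))
    src = np.ones((16, 16))  # uniform heating, arbitrary units
    for _ in range(300):
        T = adi_step(T, src, r=2.0, dt=2.0)  # r = 2 exceeds the explicit limit
    print(f"peak steady-state rise: {T.max():.3f} (arbitrary units)")
```

Each half step only solves tridiagonal systems of the size of one grid line, so no large matrix is ever assembled, and the step remains stable for values of r well beyond the limit of the explicit method.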

In order to accurately estimate on-chip thermal gradients and the power dissipation profile, a self-consistent temperature profile estimation methodology is proposed with the capability of incorporating precise layout geometry and the power dissipation of individual circuit blocks in a chip [40].

Fig. 14.9 illustrates the overview of the methodology for substrate temperature profile estimation. The chip is partitioned into a mesh according to the information provided by the layout geometry and power distribution map. Nominal power dissipation (including switching and leakage power) for each functional block is used as initial value according to its activity, depending on specific circuit implementation and application. Note that for a 3D IC, each active layer will have different layout geometry and power distribution. Physical parameters such as specific heat, thermal conductivity, and heat transfer coefficient depend on specific packaging material properties and applied cooling techniques. The full-chip realistic package thermal model is then incorporated, which comprehends both vertical and lateral heat transfer paths. Boundary conditions are determined by the operating environment. The simulator uses layout geometry, nominal power dissipation, boundary conditions, and physical thermal/packaging parameters as initial values to formulate PDEs and then solves these equations in a self-consistent manner using the ADI method for every mesh element. The algorithm converts a multiple-dimensional parabolic PDE into a succession of 1D linear equations. The electrothermal couplings are also embedded in the core of the simulator that simultaneously estimates temperature-dependent quantities for each simulation step. Once the difference of the temperature evaluation between two steps is within a certain range, the evaluation stops and the steady-state temperature profile is obtained. However, if the temperature exceeds the maximum criteria (defined by reliability constraints) for certain extreme cases due to poor packaging solutions or high power dissipation, the evaluation will terminate and thermal runaway will be reported.
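The control flow described above can be summarized in a short Python sketch. It is an assumption-laden illustration rather than the simulator itself: the heat solver is passed in as a callable (`advance`), and the demonstration replaces the ADI solver and the block-level power models with a hypothetical single-node thermal model and an assumed exponential leakage law. All parameter values (thermal resistance, heat capacity, leakage coefficient, temperature limit) are illustrative.

```python
import numpy as np


def self_consistent_solve(T0, power_of_T, advance, t_limit, tol=1e-4, max_steps=100_000):
    """Update power and temperature together until steady state or over-temperature."""
    T = np.array(T0, dtype=float)
    for _ in range(max_steps):
        P = power_of_T(T)      # electrothermal coupling: power follows temperature
        T_new = advance(T, P)  # one time step of the heat solver (ADI in the text)
        if T_new.max() > t_limit:
            return T_new, "thermal runaway"
        if np.abs(T_new - T).max() < tol:
            return T_new, "converged"
        T = T_new
    return T, "not converged"


# --- Hypothetical single-node stand-ins for the real power and thermal models ---
T_AMB = 45.0  # ambient temperature in degC (illustrative)


def power_of_T(T, p_sw=93.1, p_leak_ref=2.9, beta=0.02):
    """Switching power plus leakage that grows exponentially with T (assumed law)."""
    return p_sw + p_leak_ref * np.exp(beta * (T - T_AMB))


def make_advance(theta, heat_capacity=10.0, dt=0.1):
    """Explicit update of one thermal node: C dT/dt = P - (T - T_amb)/theta."""
    def advance(T, P):
        return T + dt / heat_capacity * (P - (T - T_AMB) / theta)
    return advance


if __name__ == "__main__":
    for theta in (0.2, 2.0):  # good and poor cooling in degC/W (illustrative)
        T, status = self_consistent_solve([T_AMB], power_of_T, make_advance(theta), t_limit=125.0)
        print(f"theta = {theta} degC/W: {status}, T = {T[0]:.2f} degC")
```

With the well-cooled setting the loop converges a few tenths of a degree above the temperature that the nominal power alone would produce, because the leakage has grown with temperature. With the poorly cooled setting the temperature crosses the assumed limit and the loop terminates with a runaway report, mirroring the two exit conditions described in the text.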

Fig. 14.9

The electrothermally aware methodology for silicon substrate temperature profile estimation. From [40] © 2007 IEEE

The key aspect of the proposed approach as compared to traditional methods is illustrated in Fig. 14.10. Although the entire thermal profile can be obtained by the traditional evaluation, the result is misleading because it ignores the correlation between power and temperature. One might think of applying the traditional evaluation iteratively by updating the temperature-dependent power (as shown by the dotted arrows); however, this dramatically increases the computation time. In addition, once the steady-state temperature has been evaluated without considering the electrothermal couplings, iterations (as shown by the dotted arrows) based on that inaccurate information are meaningless. On the other hand, the proposed self-consistent approach evaluates the steady-state temperature profile by employing the ADI method such that the correlation between the power and the temperature can be incorporated at each time step. Hence, the self-consistent method inherently generates a more accurate power profile, which can then be used to generate an accurate temperature profile by efficient PDE solvers.

Fig. 14.10

Diagram showing the difference between traditional evaluation and the proposed self-consistent substrate thermal profile estimation methodology. Due to the strong interdependence of temperature and leakage power, temperature at the central block (T 0) is not simply a function of the nominal power dissipation within and adjacent to the center block (P 0, P 1, P 2, P 3, P 4) as per traditional analysis. Nominal power distribution should be updated self-consistently with the temperature evaluation (e.g., P i is updated to ). The proposed method evaluates the temperature by incorporating the correlation between power and temperature at each time step (∆t). Note that ∆t is chosen to be much larger than the electrical switching time of gates or logics in the area of concern. From [40] © 2007 IEEE

14.3.4 Setup and Implementation: An Example of a 2D IC Thermal Profile Estimation

A design with a die size of 10 × 10 mm² (discretized into 100 × 100 grid cells) and with power densities per functional block is shown in Fig. 14.11. The power dissipation of the chip or of each functional block depends on the application (workload, activity, etc.). However, in this analysis, the power distribution map is known. The nominal total power consumption of the chip at ambient temperature (45°C) is 96 W (nominal active power = 93.1 W, leakage power = 2.9 W). The short-circuit component is relatively small and is therefore neglected for simplicity. The physical and thermal properties of all packaging layers are evaluated according to a practical packaged high-performance microprocessor [40].
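A few derived quantities help put this setup in perspective. The Python sketch below only performs arithmetic on the numbers quoted above; the variable names are illustrative, and the per-cell figure is an average that ignores the nonuniform power map of Fig. 14.11.

```python
# Quantities quoted in the text for the example design.
die_side_mm = 10.0
cells_per_side = 100
p_active_w = 93.1
p_leak_w = 2.9

# Derived quantities (simple arithmetic, averaged over the whole die).
cell_side_um = die_side_mm / cells_per_side * 1000.0
p_total_w = p_active_w + p_leak_w
die_area_cm2 = (die_side_mm / 10.0) ** 2
avg_density_w_cm2 = p_total_w / die_area_cm2
leak_fraction = p_leak_w / p_total_w
avg_power_per_cell_mw = p_total_w / cells_per_side**2 * 1000.0

if __name__ == "__main__":
    print(f"grid cell side: {cell_side_um:.0f} um")
    print(f"average power density: {avg_density_w_cm2:.0f} W/cm^2")
    print(f"nominal leakage share: {100.0 * leak_fraction:.1f} %")
    print(f"average power per cell: {avg_power_per_cell_mw:.1f} mW")
```

Each grid cell is therefore 100 µm on a side, the die-average power density is about 96 W/cm², and leakage accounts for roughly 3% of the nominal total power at the ambient temperature.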

Fig. 14.11

Functional block layout of a chip. Power densities associated with functional blocks are also shown. The circle encloses a region where blocks have the highest power density. The triangle encloses the functional blocks that have higher leakage power dissipation than all other blocks. From [40] © 2007 IEEE

In order to demonstrate the importance of incorporating electrothermal couplings and a realistic package thermal model for estimating the substrate temperature profile, four different simulation scenarios are compared using the design shown in Fig. 14.11. Although the results of the proposed methodology have not been verified against direct measurements, the method simply ensures self-consistency between power and temperature during each iteration of the PDE solver, which has been validated against industrial-quality computational fluid dynamics (CFD) software [41]. The same heat equations are employed: the inclusion of the electrothermal couplings does not change the fundamental equations governing thermal transport via heat conduction and convection but provides an algorithm to solve the temperature and leakage power self-consistently. Hence, once the core of the solver has been validated against the CFD software, the results of the methodology can be trusted even with the inclusion of the electrothermal couplings.

Although the results are specific to the aforementioned 2D IC, the conclusions are more general. The region indicated by a circle in Fig. 14.11 contains the blocks with the highest power density. In addition, the region indicated by a triangle contains blocks whose leakage power dissipation is 10 times that of the other functional blocks. However, the average power density of the circuit blocks in the triangle is around 60% of the average power density in the circle.

Fig. 14.12 and Fig. 14.13 represent the silicon substrate temperature profiles generated under four different scenarios, respectively:

  1. Traditional method + cubic package thermal model

  2. Traditional method + realistic package thermal model

  3. Self-consistent method + cubic package thermal model

  4. Self-consistent method + realistic package thermal model

Note that all temperature profiles are shown using a constant temperature range (56–66°C) for ease of comparison in Fig. 14.12 and Fig. 14.13.

Fig. 14.12
figure 14_12_148491_1_En

Silicon substrate temperature profile generated by traditional evaluation without considering electrothermal couplings. (a) A cubic package thermal model is employed. Only one hot-spot can be observed. T max is approximately 65.49°C and located in the region with higher power density. T max − T min is approximately 8°C. (b) A realistic package thermal model is employed. Only one hot-spot can be observed. T max is approximately 64.23°C and located in the region with higher power density. T max − T min is approximately 11°C. From [40] © 2007 IEEE

Fig. 14.13
figure 14_13_148491_1_En

Silicon substrate temperature profile generated by the proposed self-consistent method. (a) A cubic package thermal model is employed. Two hot-spots can be observed. The highest temperature (T max) is approximately 65.69°C and located in the region with higher percentage of leakage power. T max − T min is approximately 8°C. (b) A realistic package thermal model is employed. Two hot-spots can be observed. The highest temperature (T max) is approximately 63.81°C. T max − T min is approximately 11°C. From [40] © 2007 IEEE

The impact of electrothermal couplings on the substrate temperature evaluation can easily be observed by comparing Fig. 14.12b and Fig. 14.13b, which both employ the realistic package thermal model (Fig. 14.8a) and the same cooling conditions. The substrate thermal profile (Fig. 14.12b) is generated using a traditional thermal simulator without considering electrothermal couplings. The highest temperature (hot-spot) is approximately 64.23°C and is located in a region with the highest power density (indicated by a circle in Fig. 14.11). However, a different substrate temperature profile (Fig. 14.13b) is obtained by employing the proposed self-consistent methodology. From the temperature profile in Fig. 14.13b, two hot-spots can be observed: one in the region with the highest power density and the other in the region with a higher percentage of leakage power. Unlike the traditional evaluation, the highest temperature is around 63.81°C and is located in the region with a higher percentage of leakage power (indicated by the triangle in Fig. 14.11). Note that the self-consistent methodology comprehends the couplings between power (active and leakage) and temperature. The steady-state power dissipation (active and leakage) is self-consistent with the temperature and may not be equal to the nominal power dissipation.
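The self-consistency between power and temperature can be illustrated with a lumped fixed-point iteration. This is only a sketch under strong assumptions and not the PDE-based solver of the chapter: the whole chip is reduced to a single thermal resistance `r_th`, the active power is held constant, and leakage is assumed to grow exponentially with temperature. The values `r_th = 0.2` K/W and `beta = 0.02` per kelvin are illustrative guesses, not data from [40]; only the 93.1 W, 2.9 W, and 45°C figures come from the text.

```python
import math


def self_consistent_temperature(p_active, p_leak_nominal, t_ambient, r_th,
                                beta=0.02, tol=1e-6, max_iter=200):
    """Iterate leakage power and temperature until they agree.

    Assumed models (not from the text):
      T      = t_ambient + r_th * (p_active + P_leak)
      P_leak = p_leak_nominal * exp(beta * (T - t_ambient))
    Returns (temperature, leakage power, iterations used).
    """
    temperature = t_ambient
    for iteration in range(1, max_iter + 1):
        p_leak = p_leak_nominal * math.exp(beta * (temperature - t_ambient))
        new_temperature = t_ambient + r_th * (p_active + p_leak)
        if abs(new_temperature - temperature) < tol:
            return new_temperature, p_leak, iteration
        temperature = new_temperature
    raise RuntimeError("no convergence within max_iter iterations")


# Traditional estimate: nominal power only, no temperature feedback.
t_traditional = 45.0 + 0.2 * (93.1 + 2.9)

# Self-consistent estimate: leakage is re-evaluated at the operating temperature.
t_sc, p_leak_sc, n_iter = self_consistent_temperature(93.1, 2.9, 45.0, 0.2)
print(f"traditional: {t_traditional:.2f} C")
print(f"self-consistent: {t_sc:.2f} C, leakage {p_leak_sc:.2f} W, {n_iter} iterations")
```

In this lumped form the feedback can only raise the temperature. In the spatially resolved methodology, the same feedback acts cell by cell, which is why leakage-dominated regions can emerge as additional hot-spots.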

As explained in [28], regions with higher switching power density do not necessarily yield a higher temperature due to the various electrothermal couplings. Although the highest temperature values are similar in Fig. 14.12b and Fig. 14.13b, the temperature profile obtained by the self-consistent evaluation shows an additional hot-spot and thus a different temperature distribution. The traditional estimation is clearly misleading in terms of hot-spot count, location, and the overall spatial temperature profile as it neglects the electrothermal couplings between power dissipation and temperature.

The impact of employing two different package thermal models for the cooling path on the temperature profile estimation can be observed by comparing Fig. 14.13a and Fig. 14.13b. For a fair comparison, the layout, power density distribution, and discretization of the die are kept identical. In addition, the physical and thermal properties of each packaging layer material are the same in both models. Fig. 14.13a shows the substrate temperature profile estimated using a cubic (unrealistic) package thermal model. Although the electrothermal couplings are considered, the unrealistic package thermal model underestimates the lateral heat spreading of the packaging layers (particularly in the IHS and heat-sink), and thus results in a higher maximum and average substrate temperature. However, it is also important to note that, although the maximum temperature is lower, the temperature gradient from the hot-spot to the edges of the chip is higher when employing the realistic package thermal model (e.g., T max is 65.69°C in Fig. 14.13a and 63.81°C in Fig. 14.13b; T max − T min in Fig. 14.13a and Fig. 14.13b is about 8°C and 11°C, respectively). Because the realistic package thermal model uses a larger heat spreader and heat-sink, better lateral heat spreading lowers the maximum temperature but lowers the temperatures at the edges of the chip even more. This, in turn, is expected to affect physical design issues such as partitioning and placement schemes for high-performance ICs, including multicore designs.

14.3.5 3D IC Thermal Profile Estimation: Analysis and Implications

In [9, 42, 43, 44], several possible applications for this revolutionary 3D technology have been explored. One of the most promising applications is that of integrating a processor-and-memory system on a single 3D chip. Preliminary thermally aware performance analysis of the 3D processor-memory hierarchy (assuming an average temperature for each active layer) is performed with different benchmarks at different processor frequencies [12]. The impact of thermal constraint on performance of the processor-memory hierarchy is summarized in Fig. 14.14.

Fig. 14.14
figure 14_14_148491_1_En

Execution time per instruction (t exe) and maximum average temperature as a function of operating frequency for 2D and 3D chip (processor-memory hierarchy). (a) Highly memory-intensive application (mcf). (b) Less memory-intensive application (twolf). From [12] © 2006 ACM, Inc. Included here by permission

For an application that is highly memory-intensive (e.g., the mcf benchmark), the execution time per instruction (t exe) is lower for the 3D system (Fig. 14.14a), while for an application that is less memory-intensive (e.g., the twolf benchmark), the difference in execution time between the 2D and 3D systems is negligible (Fig. 14.14b). Moreover, when the system is constrained by a maximum allowable temperature (which arises from reliability concerns), the maximum allowable frequency (f max) of the system is limited. For memory-intensive applications (Fig. 14.14a), even though thermal considerations place a lower limit on f max in 3D, better performance can still be achieved than with the 2D system running at a higher frequency. This is because the 2D system cannot overcome the memory interface bottleneck. On the other hand, for applications that are not memory-intensive (Fig. 14.14b), the system performance is not dominated by memory accesses. Hence, under this scenario, the 2D system, which has a higher f max limit, performs better than the 3D system, which is constrained to operate at a lower frequency [12]. Note that this analysis employs an average temperature for each active layer, which implies that the temperature is higher in layers farther from the heat-sink. However, once a detailed temperature profile is taken into consideration, the average-temperature model may prove misleading for both the temperature estimation and the performance analysis.

As the traditional planar (2D) technology has already been threatened by power and associated thermal problems, the success of 3D integration not only depends on the development of processing technologies but also requires thorough and accurate estimation of thermal profiles in a 3D IC.

Chip-level thermal and reliability issues of planar (2D) IC designs can be comprehended by employing the aforementioned thermal profile estimation methodology while considering the packaging and electrothermal couplings. From Fig. 14.7, the major heat transfer path of a planar (2D) IC is clearly from the active layer to the thermal packaging and heat-sink. However, due to the presence of different power dissipation and distribution of different active layers in a 3D IC, the direction of the heat flow significantly depends on the arrangement and the placement of the active layers.

A generic 3D IC with three active layers is considered in this subsection (one active layer is shown in Fig. 14.11 and the power density maps of the other two active layers are shown in Fig. 14.15). The power dissipation of the active layers in Fig. 14.15a and Fig. 14.15b is assumed to be 20 W and 10 W, respectively. Note that, for practical integration, the thickness of each active layer in a 3D IC (around 50 μm, see Fig. 14.1) is much smaller than that in a planar (2D) IC (several hundred microns). Typically, in order to reduce the thermal resistance between the active layers and the heat-sink, layers with higher power dissipation tend to be placed closer to the heat-sink than layers with lower power dissipation. Thus, in this scenario the layer shown in Fig. 14.11 is attached directly to the package structure, followed by the layer in Fig. 14.15a and then the layer in Fig. 14.15b. Note that, except for the boundary that is attached to the heat-sink and exposed to the ambient, all boundaries are considered adiabatic in the analysis.

Fig. 14.15
figure 14_15_148491_1_En

Functional block layout of two additional active layers (first layer shown in Fig. 14.11) in a 3D IC. Power densities associated with functional blocks are also shown. (a) The nominal total power consumption of the layer at ambient temperature (45°C) is 20 W (nominal active power = 19.6 W, leakage power = 0.4 W). (b) The nominal total power consumption of the layer at ambient temperature (45°C) is 10 W (nominal active power = 9.8 W, leakage power = 0.2 W)

With the same packaging and environment conditions as in the previous subsection, steady-state temperature profiles of the layers in the 3D IC can be estimated by using the self-consistent methodology (Fig. 14.16). The maximum temperatures of the active layers are 64.19°C (layer 1), 63.89°C (layer 2), and 63.47°C (layer 3). Although the power dissipation of layers 2 and 3 is much lower than that of layer 1, their steady-state temperature profiles are influenced and raised by layer 1. As discussed in Section 14.2, employing a first-order analytical thermal model for 3D IC analysis, with the average power and temperature of each active layer, misleads the temperature estimation and overestimates the maximum temperature of layers that are farther from the heat-sink.
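The behavior of a first-order 1D model can be made concrete with a small sketch. The code below is an illustrative series-resistance chain, not the exact formulation of Section 14.2: each layer is collapsed to one node, and all heat is assumed to flow toward the heat-sink through the layers beneath. The resistance values `r_package = 0.15` K/W and `r_interlayer = 0.01` K/W are hypothetical; the layer powers (96 W, 20 W, 10 W) and the 45°C ambient are taken from the text.

```python
def layer_temperatures_1d(powers, r_package, r_interlayer, t_ambient):
    """First-order 1D estimate of one average temperature per layer.

    powers[0] is the layer closest to the heat-sink. Every resistance in the
    chain carries the heat of its own layer plus all layers farther away.
    """
    resistances = [r_package] + [r_interlayer] * (len(powers) - 1)
    temperatures = []
    temperature = t_ambient
    for index, resistance in enumerate(resistances):
        heat_through = sum(powers[index:])   # watts crossing this resistance
        temperature += resistance * heat_through
        temperatures.append(temperature)
    return temperatures


# Layer powers from the text; resistances are assumed for illustration.
temps_graded = layer_temperatures_1d([96.0, 20.0, 10.0], 0.15, 0.01, 45.0)
temps_equal = layer_temperatures_1d([32.0, 32.0, 32.0], 0.15, 0.01, 45.0)
print("graded stack:", [round(t, 2) for t in temps_graded])
print("equal stack: ", [round(t, 2) for t in temps_equal])
```

Because every term added along the chain is positive, this model always places the highest temperature at the layer farthest from the heat-sink, whatever the power assignment. The self-consistent profiles above, where layer 1 is the hottest, show a case that such a model cannot reproduce.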

Fig. 14.16
figure 14_16_148491_1_En

Temperature profiles of the active layers in the 3D IC generated by the proposed self-consistent method. Similar to Fig. 14.2a, layer 1 is directly attached to the thermal solutions (e.g., heat-sink)

14.4 Implications and Opportunities for 3D IC Thermal Management

Unlike planar (2D) ICs, thermal management for 3D ICs requires thorough considerations not only for each active layer (2D level) but also the correlative impacts between active layers (3D level).

At the 2D level, power-reduction techniques and thermal management for conventional planar (2D) ICs, including device-, circuit-, and architecture-level techniques, can be directly employed to reduce power dissipation and thermal gradient. For CMOS technologies, device short-channel effects [29], which lead to higher subthreshold leakage, have been shown to be improved via substrate engineering. For instance, vertically nonuniform doping (retrograde channel profile) enhances inversion layer mobility because of the lower surface doping [45, 46], while laterally nonuniform channel implants (halo doping) reduce threshold voltage roll-off by compensating 2D charge-sharing effects in short-channel transistors [47–49]. Transistor gate-tunneling leakage, which increases with the ever-thinning silicon dioxide gate dielectric [50], can be alleviated by replacing the thin silicon dioxide with a thicker insulating material with a higher dielectric constant (high-κ). In addition, a metal gate electrode has been used to replace the poly-silicon gate for better control of the threshold voltage [51, 52]. Similarly, at the circuit level, low-power design methodologies, including dual- or multi-V dd and V th schemes as well as adaptive body-biasing techniques, can be applied [53]. Transistor gating is also considered for low-power or power-constrained designs. For instance, the clock-gating technique is used to reduce clock tree power dissipation [54]. Power gating and sleep transistor insertion techniques reduce leakage by turning off idle circuitry [55]. Thermally aware placement schemes (within-layer) are also available in the literature to optimize performance and operating temperature [56, 57]. Moreover, in [58], a 3D IC thermal placement method using an iterative force-directed approach is presented. At the architecture level, pipelining and parallel (including multi-core) structures are often implemented in low-power designs. The throughput can be maintained at a lower V dd by parallel implementation. Also, pipelining can reduce power consumption because it allows the switching rate and V dd to be reduced [59]. Note that these methods reduce power consumption at the cost of area, performance, or noise margin penalty.
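The trade-off behind parallel implementation can be sketched with the standard dynamic power relation P = α·C·V dd²·f. The example below is a back-of-the-envelope illustration, not a result from [59]: the activity factor, capacitance, voltages, and the 15% area/routing overhead are all assumed values, and it is assumed that each unit still meets timing at half the clock rate with the reduced supply.

```python
def dynamic_power(activity, capacitance, vdd, frequency):
    """Switching power in watts: P = activity * C * Vdd^2 * f."""
    return activity * capacitance * vdd ** 2 * frequency


def parallel_power(activity, capacitance, vdd_low, frequency, ways, overhead=1.0):
    """Power of `ways` copies, each clocked at frequency / ways.

    Throughput matches the single unit at `frequency`. `overhead` (>= 1)
    models the extra capacitance of duplication and multiplexing (assumed).
    """
    per_unit = dynamic_power(activity, capacitance, vdd_low, frequency / ways)
    return overhead * ways * per_unit


# All numbers below are illustrative assumptions.
baseline = dynamic_power(activity=0.1, capacitance=1e-9, vdd=1.0, frequency=2e9)
two_way = parallel_power(activity=0.1, capacitance=1e-9, vdd_low=0.7,
                         frequency=2e9, ways=2, overhead=1.15)
ratio = two_way / baseline
print(f"baseline: {baseline:.4f} W, two-way parallel: {two_way:.4f} W, ratio: {ratio:.3f}")
```

Under these assumptions the saving comes entirely from the quadratic dependence on V dd, while the duplicated hardware accounts for the area penalty noted in the text.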

Besides the aforementioned techniques, chip cooling has always been considered as an effective knob for power and thermal management [60]. Conventional cooling techniques and thermal management for planar (2D) ICs cannot be directly applied for 3D IC thermal management but require a holistic consideration including all active layers and thermal solutions (packaging, etc.). For instance, the boundary conditions of one active layer are determined by its adjacent layers, and hence, the aforementioned 2D-level (within-layer) placement schemes are required to comprehend the layer-to-layer effects.

Fig. 14.17
figure 14_17_148491_1_En

Temperature profiles of the active layers in the 3D IC generated by the proposed self-consistent method. All layers are identical with the same power distribution and dissipation (one-third as compared to that of Fig. 14.11). Layer 1 is closest to the heat-sink. In this analysis, it is assumed that all boundaries are adiabatic besides the one attached to the heat-sink and exposed to the ambient. The maximum temperature is (a) 58.02°C, (b) 58.51°C, and (c) 58.77°C

As shown in the aforementioned example (Fig. 14.15 and Fig. 14.16), layer 1 has the highest power dissipation, and the highest temperature occurs in layer 1 even though this layer is closest to the heat-sink. However, the temperature profiles and the maximum temperature of the active layers in a 3D IC change when a different arrangement of layers is employed. Here we discuss an example with three identical active layers stacked in a 3D IC. The power distribution of the active layers is similar to Fig. 14.11, but the power dissipation of each functional block in the active layer is one-third of that in Fig. 14.11 (note that stacking three high-power dissipation layers with the same thermal solutions leads to thermal runaway). Fig. 14.17 shows the steady-state temperature profile of these three layers. It can be observed that the highest temperature now occurs at layer 3, which is farthest from the heat-sink.

Moreover, alternative materials with higher thermal conductivities can also improve heat removal of 3D ICs. While thermal and interlayer vias are shown to mitigate thermal problems in 3D ICs [61], employing metallic carbon nanotube (CNT) bundle vias to replace copper at different locations in the interconnect stack shows substantial benefits in controlling the back-end temperature [62]. In practice, CNTs have been fabricated as thermal and source bumps for flip-chip high-power amplifiers [63]. Also, it has been experimentally shown that the thermal conductivity of TIMs can be improved by employing free-standing CNT arrays or combinations of CNT arrays and existing TIMs [64].

14.5 Summary

Three-dimensional integration technology with multiple active layers has been considered a promising candidate to alleviate the interconnect delay problems in nanoscale VLSI circuits and to realize heterogeneous integration in the same chip. As heat and thermal effects already significantly impact reliability and performance in high-performance planar (2D) ICs, thermal problems in 3D ICs are expected to be worse, since a 3D IC stacks multiple active layers and thus inherits the thermal problems of 2D ICs. In this scenario, accurate thermal profile estimation is critical in the early design stage (before the 3D chip is fabricated).

It is shown that a first-order analysis, which simply employs a 1D equivalent thermal circuit with constant, average power dissipation at each active layer, predicts a higher temperature at the layer farthest from the heat-sink and can therefore mislead temperature estimation in a 3D IC. On the other hand, the proposed self-consistent 3D temperature profile estimation methodology incorporates the electrothermal couplings that are increasingly prominent as technology scales. In addition, a realistic package thermal model is considered to improve the accuracy of the thermal profile estimation. The impact of layer stacking on the temperature profile of a 3D IC is also presented.

The 3D thermal profiles are also strongly influenced by the nature of the application running on the chip. Although various techniques for power saving or thermal management for planar (2D) ICs can be applied to 3D ICs, the arrangement of active layers in a 3D IC, as well as 3D thermally aware placement schemes, can strongly influence the steady-state temperature profile of each active layer. Furthermore, the overall thermal conductivity of 3D ICs can be improved by employing higher thermal conductivity materials (e.g., CNTs) between active layers or in the packaging structure.