Reducing Library Characterization Time for Cell-aware Test while Maintaining Test Quality

Cell-aware test (CAT) explicitly targets faults caused by defects inside library cells and thereby improves test quality compared with conventional automatic test pattern generation (ATPG) approaches, which target faults only at the boundaries of library cells. The CAT methodology consists of two stages. In Stage 1, library characterization based on dedicated analog simulation identifies, per cell, which cell-level test pattern detects which cell-internal defect; this detection information is encoded in a defect detection matrix (DDM). In Stage 2, with the DDMs as inputs, cell-aware ATPG generates chip-level test patterns for each circuit design that is built up of interconnected instances of library cells. This paper focuses on Stage 1, library characterization, as both test quality and cost are determined by the set of cell-internal defects identified and simulated in the CAT tool flow. With the aim of achieving the best test quality, we first propose an approach to identify a comprehensive set, referred to as the full set, of potential open- and short-defect locations based on the cell layout. However, the full set of defects can be large even for a single cell, making the time cost of the defect simulation in Stage 1 unaffordable. Subsequently, to reduce the simulation time, we collapse the full set to a compact set of defects, which serves as input to the defect simulation. The full set is stored for diagnosis and failure analysis. By inspecting the simulation results, we propose a method to verify the test quality based on the compact set of defects and, if necessary, to compensate the test quality to the same level as that based on the full set of defects. For 351 combinational library cells in Cadence's GPDK045 45nm library, we simulate only 5.4% of the defects in the full set to achieve the same test quality as with the full set of defects. In total, the simulation time, via linear extrapolation per cell, would be reduced by 96.4% compared with the time based on the full set of defects.

manufacturing steps, ICs are highly susceptible to manufacturing defects, and thus they need to undergo testing to weed out the defective parts before they are shipped to customers. Unfortunately, in practice, tests are not perfect either; not all defective parts are recognized during testing, which causes test escapes. Complex ICs are used in safety-critical applications, such as automotive and healthcare, where defects have profound consequences, and test escapes cannot be tolerated. It has always been and will continue to be a key responsibility of IC test engineers to reduce the fraction of test escapes [6].
Digital logic ICs are designed on the basis of a library of standard cells. Conventional automatic test pattern generation (ATPG) tools target faults on the boundaries of cells, such as stuck-at [7] and transition faults [27]. Intra-cell defects are typically not in the scope of conventional ATPG tools, and hence are only detected fortuitously [12,25,28]. Cell-internal defects cause a significant fraction of test escapes [6]. Cell-aware test (CAT) reduces test escapes by explicitly targeting cell-internal defects [18]. The general CAT methodology consists of two stages. In Stage 1, library characterization, we first determine, for each library cell based on its layout, the set of locations where cell-internal open and short defects might occur, and model the defects using selected resistors. Subsequently, defect characterization utilizes dedicated analog simulation to determine which cell-level test pattern detects which cell-internal defect; the result is encoded in a defect detection matrix (DDM) per cell. An example DDM for an AND2X1 cell is shown in Fig. 1. A DDM is a binary matrix. Its rows correspond to cell patterns, which contain stimulus bits for all cell inputs and the corresponding defect-free response bit on a single cell output. Its columns correspond to the defects that have been identified as potentially occurring in the cell.
A DDM entry denotes whether a cell pattern detects a defect (denoted by '1') or not (left empty). The '1's in each pattern row show which defects are detected by this cell pattern; the '1's in each column indicate the cell patterns detecting the corresponding defect. Stage 2, cell-aware ATPG, uses as inputs the IC netlist with library cells as its building blocks and the set of DDMs for these library cells. For each cell instance in a chip design, an intra-cell defect is covered if there is at least one cell pattern (1) which, according to the DDM, detects this particular defect, and (2) which the cell-aware ATPG tool is able to expand successfully from cell level to chip level. After a cell pattern is expanded successfully, fault simulation identifies all the other covered defects in the pattern expansion path. This paper focuses on Stage 1, library characterization, as it is fundamental to both test quality and test cost of CAT. The set of potential defects should be realistic and complete. Too few defect locations cause the test to miss defects; this negatively impacts the test and product quality. Too many defect locations imply unnecessary time-consuming analog simulations during defect characterization. We propose a library characterization tool flow that covers all cell-internal defects to guarantee the test quality and significantly reduces the defect simulation time while maintaining the test quality.
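To make the DDM structure concrete, the following minimal Python sketch stores a DDM as a mapping from cell patterns to the set of defects they detect. The pattern strings and the defect names D1 to D4 are illustrative placeholders and are not the contents of Fig. 1; the helper names `column` and `detectable` are likewise hypothetical.

```python
# Hypothetical DDM for a 2-input AND cell. Each row key is a cell pattern
# written as "<input bits>/<defect-free output bit>"; each row value is the
# set of defects that the pattern detects (the '1' entries of that row).
ddm = {
    "00/0": {"D1"},
    "01/0": {"D1", "D2"},
    "10/0": {"D3"},
    "11/1": {"D2", "D3"},
}
defects = ["D1", "D2", "D3", "D4"]  # D4 is detected by no pattern


def column(ddm, defect):
    """Return the DDM column of a defect: the set of patterns detecting it."""
    return frozenset(p for p, detected in ddm.items() if defect in detected)


def detectable(ddm, defects):
    """Return the defects that are detected by at least one cell pattern."""
    return [d for d in defects if column(ddm, d)]
```

Reading the matrix row-wise answers "what does this pattern detect", while `column` gives the column-wise view that the defect-equivalence discussion relies on.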
We automatically identify a full set of potential locations of open defects on, and short defects between, both intra-cell interconnects and transistor terminals. During the defect-location identification (DLI) of the full set, we use parasitic extraction (PEX) not only in its conventional role to create accurate analog simulation models of library cell layouts, but also to analyze the cell layouts to identify possible defect locations on and between interconnects. The number of identified defect locations quickly grows large, even for small library cells, making the time cost of the subsequent defect simulation unaffordable. However, many defects in the full set are equivalent with respect to their logic fault behavior, and hence require the same test patterns, i.e., they cause identical DDM columns. To reduce the number of defects to be simulated while maintaining the full-set-based test quality, we first collapse the full set to a subset, the compact set, by grouping equivalent defects and selecting only one defect per group for the compact set, which is used as input to the defect simulation. With the assistance of defect simulation, we verify whether all collapsed defects cause identical DDM columns. An error is flagged if a collapsed group of equivalent defects contains at least one defect that is non-equivalent to the others. A non-equivalent defect in a collapsed defect group causes one DDM column to be missed for this group, which negatively impacts the test quality. Some errors are specific to simulation user settings, and hence cannot be avoided before the simulation. Subsequently, we eliminate errors by identifying and adding the missed DDM columns with additional simulations, thus compensating the test quality to the same level as that based on the full set of defects. The experiment results show that for 351 combinational cells in Cadence's GPDK045 45nm technology, we reduce the simulation time by 96.4% compared with simulating the full set of defects.
This paper extends our work in [11] with a refined PEX cell model, an improved solution for reducing the defect simulation time while maintaining the test quality, and additional experiment results on defect simulation.
The remainder of this paper is organized as follows. Section 2 reviews related prior work. The proposed library characterization tool flow is introduced in Section 3. Section 4 describes the cell model used for the PEX and the general PEX settings for the defect-location identification. Sections 5 and 6 describe how we handle open and short defects, respectively, to reduce the defect simulation time while maintaining the test quality. Section 7 presents experiment results and Section 8 concludes this paper.

Related Prior Work
CAT identifies only those cell patterns that actually contribute to the detection of intra-cell defects; this requires knowledge of which defects to expect. For this, CAT is inspired by layout-level defect-to-fault analysis methods like inductive fault analysis (IFA) [32], which are intractable at chip level, but very feasible when applied to individual library cells. CAT was introduced by Hapke et al. in 2009 [15]. Initially, only hard (low-ohmic) short defects (1Ω) in combinational logic cells were targeted. Later, this was extended to hard (high-ohmic) open defects (1GΩ), weaker short defects (1Ω-20kΩ) [13,16], and sequential cells such as scan flip-flops [14]. Several studies have shown that CAT offers superior test quality and reduces test cost in comparison with n-detect [18,25,28], embedded multi-detect [12,38], or gate-exhaustive test [5,17]. Industrial application has provided experimental evidence of the effectiveness of CAT in reducing test escapes [13,18,19,33,37]. CAT also improves diagnosis accuracy and efficiency by recording the layout location of potential defects [26,36].
The quality improvement promised by CAT critically depends on the details of the DLI and defect characterization steps, as they determine which intra-cell defects will be considered and which cell patterns can actually contribute to the detection of these defects. No prior work discloses exactly which intra-cell defects are modeled and, when the defect set is pruned, how that is done and what the effect is on the compute time for analog defect simulation. Other papers use PEX for DLI, but do not report how the many user controls of a PEX tool were set [14,15,34]; in this paper, we fully specify all these details. Prior publications present different opinions regarding the inclusion of inter-layer shorts: Hapke et al. include them [14,15], while Liu et al. claim they do not occur and hence can be omitted [24]. We support both options, as we have observed that in advanced-node technologies, inter-layer shorts can indeed occur [10], but we do not want to burden a user of more mature technology nodes with intra-cell defects that are very unlikely to occur in these nodes. In addition to intra-cell defects, [14] also includes shorts, opens, stuck-at, and transition faults at the cell I/Os in cell-aware ATPG. In this paper, we demonstrate that these cell-boundary defects are often equivalent to already modeled cell-internal defects. In those cases, our approach does not require them to be explicitly included in the defect set. In this paper, we also assess the defect equivalence from the analog simulation perspective to reduce the number of defects to be simulated. To maintain the test quality, which is the most important test metric, we use the simulation to efficiently verify the defect equivalence.

Library Characterization Flow
The proposed CAT library characterization flow, which invokes several other EDA tools along the way, is depicted in Fig. 2.
In Step 1, PEX with dedicated user settings is performed for each cell in a library. The output is the cell's transistor-level netlist with extracted parasitic resistors and capacitors.
In Step 2, on the basis of the cell's netlist, we determine a full set of defect locations containing the possible open- and short-defect locations. This approach is embedded in Cadence's tool Modus. Parasitic resistors on cell-internal interconnects are all candidate open-defect locations, and parasitic capacitors between cell-internal interconnects are all candidate short-defect locations. Transistor-internal defects are modeled by opens on and shorts between transistor terminals. However, the full set of defect locations can be large even for a single cell, and simulating the full set of defects takes an impractical amount of time. For example, for the AND2X1 cell in Cadence's GPDK045 library, the full set contains 61 open-defect locations and 769 short-defect locations.
To save downstream defect characterization time, in Step 3, we collapse the full set to a compact set based on defect equivalence [11]. As equivalent defects are detected by the same set of cell patterns, which is identified by analog simulation, simulating only one defect per group of equivalent defects is enough. Also, during cell-aware ATPG, if any one of the required cell patterns is expanded successfully, all of the equivalent defects are detected simultaneously. The advantages of the defect collapsing algorithm are that its computing time is negligible compared with the simulation time and that it is generic to all technologies. However, due to the effect of specific simulation user settings and of technology-dependent intra-cell parasitic resistors and capacitors, it is possible that not all the collapsed defects are equivalent, and hence we would miss some defects, which shows up as missing DDM columns. To maintain the test quality based on the full set of defects, we add the missed DDM columns by additional simulations in Step 5.
In Step 4, Modus' defect characterization function adds a resistive value to each defect location in the compact set; this allows us to model hard as well as weak resistive open and short defects [20]. The defects and the defect-free netlist are submitted to the analog simulator. Each defect is injected into the defect-free netlist. For each short defect, we simulate the defective netlist for the exhaustive set of one-cycle cell patterns to detect a static fault; for each open defect, we simulate the defective netlist for a set of two-cycle cell patterns to detect a delay fault. Only if the simulation for the combination of defect d and cell pattern p gives a response that differs from that of the defect-free cell on at least one output of the cell do we mark the DDM entry for defect d and pattern p as 'detected'. For all library cells, all defects are simulated with all possible cell patterns, which generates a DDM per library cell as the output of the defect characterization.
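The detection rule of Step 4 can be sketched as a small loop. This is a minimal illustration for static (one-cycle) patterns only, assuming a callback `simulate` that stands in for the analog simulator; the function names and the toy defect names are hypothetical and are not part of Modus.

```python
from itertools import product


def characterize(simulate, defects, n_inputs):
    """Build a DDM over the exhaustive set of one-cycle cell patterns.

    simulate(defect, pattern) returns a tuple of output bits; defect=None
    denotes the defect-free cell. It is a stand-in for analog simulation.
    """
    ddm = {}
    for pattern in product((0, 1), repeat=n_inputs):
        good = simulate(None, pattern)
        # A defect is 'detected' if at least one output differs.
        ddm[pattern] = {d for d in defects if simulate(d, pattern) != good}
    return ddm


def toy_and2(defect, pattern):
    """Toy AND2 model: two made-up shorts that force the output low or high."""
    if defect == "out_short_vss":
        return (0,)
    if defect == "out_short_vdd":
        return (1,)
    return (pattern[0] & pattern[1],)
```

Calling `characterize(toy_and2, ["out_short_vss", "out_short_vdd"], 2)` yields one row per input combination; open defects would need two-cycle patterns and a delay criterion, which this sketch deliberately leaves out.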

In Step 5, with the assistance of the upstream defect characterization, we verify whether all defects in each collapsed group of defects result in identical DDM columns. If not, we identify the missed DDM columns to compensate the test quality.
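The check performed in Step 5 can be pictured as follows. This is a minimal sketch, assuming that DDM columns are available as sets of detecting patterns; the function name `find_missed_columns` and the data layout are hypothetical, and in the actual flow the columns of non-representative defects come from additional simulations rather than from a precomputed table.

```python
def find_missed_columns(groups, columns):
    """Report DDM columns that a collapsed group would otherwise miss.

    groups:  dict mapping a representative defect to the list of defects
             collapsed into its group (including the representative).
    columns: dict mapping a defect to its DDM column, given as a frozenset
             of detecting cell patterns.
    Returns a dict mapping each representative to the set of distinct
    columns in its group that differ from the representative's column.
    """
    missed = {}
    for rep, members in groups.items():
        extra = {columns[d] for d in members} - {columns[rep]}
        if extra:
            missed[rep] = extra
    return missed
```

An empty result means every collapsed group is consistent; a non-empty result lists exactly the columns that must be added to reach the full-set test quality.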
Defect simulation is time consuming, because it contains three nested loops: over all cells, over all defects, and over all cell patterns. Fortunately, this task needs to be executed only once per library release, while the resulting DDMs can be reused for all IC designs based on the same library. Some defects cannot be detected by any cell pattern. These non-detectable defects do not affect cell functionality even if they are present. Therefore, a lower defect coverage can be a good sign: it indicates that the cell is not easily affected by the potential defects, i.e., that the cell design is robust. Only the detectable defects continue to Stage 2, cell-aware ATPG, as cell-internal faults.

PEX and Its Cell Model
The conventional role of parasitic extraction (PEX) in IC design is to determine the non-ideal electrical behavior of on-chip interconnects. This behavior is typically not part of the original design intent (and hence the term 'parasitic'), but is present nevertheless and therefore needs to be assessed in order to build an electrically accurate simulation model of the circuit in question. The electrical non-idealities in the circuit's interconnects are lumped into resistors R, capacitors C, and inductors L, which are added into the circuit's original netlist [29]. In standard cells, interconnect lengths are very small, and hence the resulting L values are very small and therefore can be ignored [21].
A net that electrically connects two or more terminals is divided into net segments, which are bounded by terminals and/or internal nodes. Below, we present a mathematical description of the cell model as generated by the PEX tool. Let $Cells$ be the set of library cells. Let $Terminals_c$ be the set of net terminals of library cell $c \in Cells$. $Terminals_c$ contains the inputs and outputs of cell $c$ (a.k.a. ports), power ($V_{DD}$) and ground ($V_{SS}$), and the source, drain, gate, and bulk terminals of the transistors in $c$. Let $IntNodes_c$ be the set of all internal net nodes of library cell $c$; internal nodes are the points in a net where the PEX tool starts a new net segment. With $Nodes_c$ we denote the set of all nodes, i.e., $Nodes_c = Terminals_c \cup IntNodes_c$.
The extracted parasitic resistors and capacitors can be represented by two weighted undirected graphs, $G_R = (V, E_R)$ and $G_C = (V, E_C)$, with a common set of vertices $V = Nodes_c$ but different edge functions $E_R = R(i, j)$ and $E_C = C(i, j)$, with $R, C: Nodes_c \times Nodes_c \to \mathbb{R}^+$, where $R(i, j)$ and $C(i, j)$ specify, respectively, the parasitic resistance and capacitance between nodes $i$ and $j$ as extracted by the PEX tool.
We define the abbreviation function $segm: Nodes_c \times Nodes_c \to Bool$ to denote that nodes $i$ and $j$ are electrically connected by a net segment, i.e., that the PEX tool has extracted a parasitic resistor $R(i, j)$ between them. The function $conn\_net: Nodes_c \times Nodes_c \to Bool$ is the transitive closure of $segm$ and denotes that two nodes are electrically connected through zero or more net segments. The function $net: Nodes_c \to \mathcal{P}(Nodes_c)$ yields, for a given node $i$, the net of $i$, i.e., the set of nodes with which $i$ is electrically connected in cell $c$: $net(i) = \{ j \in Nodes_c \mid conn\_net(i, j) \}$. The $net$ function effectively partitions the set $Nodes_c$ into a number of disjoint non-singleton nets, such that the nets together cover $Nodes_c$ and, for all $i, j \in Nodes_c$, either $net(i) = net(j)$ or $net(i) \cap net(j) = \emptyset$. The set of disjoint nets in cell $c$ is $Nets_c = \bigcup_{i \in Nodes_c} \{ net(i) \}$.
Each node $n \in Nodes_c$ resides in a processing layer. Let $Layers$ be the set of processing layers. The layer function $lyr: Nodes_c \to Layers$ gives the layer of a node. For $i, j \in Nodes_c$ with $segm(i, j)$, either (1) $lyr(i) = lyr(j)$ and the net segment between $i$ and $j$ is said to be in $lyr(i)$, or (2) $lyr(i) \neq lyr(j)$ and the net segment between $i$ and $j$ is a vertical interconnection between different layers, called a via or contact. The PEX tool can list the x, y layout coordinates of each node.
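The partition of nodes into nets described above is the set of connected components of the resistor graph. The following sketch computes it with a union-find structure; the function name `nets`, the node names, and the representation of segments as node pairs are assumptions made for illustration and do not reflect the internal data structures of the PEX or DLI tools.

```python
def nets(nodes, segments):
    """Partition nodes into nets.

    nodes:    iterable of node names.
    segments: iterable of (i, j) pairs for which segm(i, j) holds, i.e.
              pairs of nodes joined by an extracted parasitic resistor.
    Returns a set of frozensets, one per net (connected component).
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, halving the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, j in segments:
        parent[find(i)] = find(j)

    groups = {}
    for n in parent:
        groups.setdefault(find(n), set()).add(n)
    return {frozenset(g) for g in groups.values()}
```

Two nodes then satisfy conn_net exactly when they end up in the same frozenset.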
A net can fork to multiple destinations. We define $Forks$ as the set of fork nodes in cell $c$. The function $conn\_branch: Nodes_c \times Nodes_c \to Bool$ denotes that two nodes are part of the same net branch. The function $branch: Nodes_c \to \mathcal{P}(Nodes_c)$ yields, for a given node $i$, the branch of $i$, i.e., the set of nodes that are part of the same net branch as $i$: $branch(i) = \{ j \in Nodes_c \mid conn\_branch(i, j) \}$. The set of all branches in cell $c$ is $Branches_c = \bigcup_{i \in Nodes_c} \{ branch(i) \}$.
We illustrate the model described above with an inverter cell INVX1 from the Cadence GPDK045 cell library [3] as an example in Fig. 3. Figure 3a shows the layout. In the schematic (see Fig. 3c), we marked the twelve terminals in total: one input port (A), one output port (Y), two transistors with four terminals each, $V_{DD}$, and $V_{SS}$. Figure 3a shows that, in the PEX model, each terminal exists in one of four layers: n-diffusion, p-diffusion, poly, and metal-1. $V_{DD}$/$V_{SS}$ and the cell input/output ports are on the metal-1 layer. In addition, the PEX tool has identified eight internal nodes. The total set of 20 nodes partitions into four disjoint nets.
In cells more complex than INVX1, two adjacent transistors in the layout are often designed to share the same diffusion area as their source or drain. However, in the DSPF-format netlist, the PEX tool assigns a separate name to each transistor terminal, even when two different terminal names represent the same diffusion area. Consequently, the interconnection between two such terminals is represented as a virtual 0.001Ω net. Although the 0.001Ω resistor does not physically exist, its resistance is too small to impact the simulation results. The real parasitic resistance of the shared diffusion area is extracted and packaged into the device model, which is also included in the extracted netlist for an accurate simulation. Figure 4 shows an example of a virtual interconnect in the AND2X1 cell, designed as a NAND circuit followed by an inverter, from Cadence's GPDK045 library [3]. The two NMOS transistors in the NAND part share the same diffusion area, and hence there is no physical interconnect between the two NMOS transistors. The interconnection between the terminals S and D shown in the schematic is extracted as a virtual net.
We use Cadence Quantus v18.1 to perform PEX. Figure 5 lists the general PEX settings. The remaining PEX settings specific to short-and open-defect location identification are described in the following two sections. Users should set the input database format generated by a layout versus schematic EDA tool. We use the 'pvs' database generated by Cadence Pegasus (Line 1). We need to extract resistor (R) and capacitor (C) values (Line 2). We use DSPF (Cadence's detailed standard parasitic format) as output format (Line 3). The benefits in the CAT context of DSPF over, for example, the well-known Spice format [35] are that DSPF format [2] features (1) an explicit layer list, (2) explicit listing of nets, and (3) explicit listing of net segments in diffusion layers. For usage during diagnosis, we include as comments into the netlist for the parasitic Rs and Cs: layer information, segment widths, and x, y location coordinates (Lines 4-6).

PEX for Open-defect Location Identification
An open defect on an interconnect might occur between any two connected nodes (i, j) with segm(i, j). We analyze the layout of each library cell by letting the PEX tool extract all net segments and their parasitic resistor values. Each extracted segment segm(i, j) with its parasitic resistor R(i, j) is considered a potential open-defect location on the interconnects. The reasons for letting the PEX tool split a net into multiple segments are (1) a fork in a net with multiple destinations, as open defects on different branches of the fork affect different terminals and hence should be separated; (2) a vertical interconnect from one layer to another (referred to as a via, which in library cells is typically called a "contact"); or (3) a 90° bend ('L shape') within a layer, as contacts and 90° bends are sensitive locations for manufacturing defects. Figure 6 lists the settings of the Quantus user options used to identify the full set of open-defect locations on intra-cell interconnects. In Line 1, we instruct Quantus not to break up segments into multiple segments only because the layout length of the segment exceeds a limit. The two options in Lines 2-3 specify that defect-sensitive circuit elements are included in the PEX output netlist: vias, and 90° bends in nets, the latter being identified by electromagnetic analysis.
Parallel segments (other than parallel vias and contacts) are expected not to occur in the heavily-optimized circuits that standard-cell libraries are (Line 4). The only type of parallel R segments that occur frequently in library cells are parallel vias. The inherent redundancy of multiple parallel vias is meant to increase the yield, and effectively work as one (lower-resistance) parallel connection. Lines 5-6 merge them into one segment with replacement R value. To control complexity, our CAT flow works on the basis of one defect insertion at a time. However, by varying the resistance value of the open defect in the merged via, we can model any number of parallel vias as being defective. We suppress listing of parasitic resistors in dangling segments (Line 7). As the missing R's do not affect the simulation results and open defects in dangling segments would not cause faulty behavior, we utilize this option. This option does not affect the extraction of capacitors to the dangling segment nodes, and therefore it does not affect the fault simulation results.
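As a worked illustration of the merged-via model, the sketch below computes the replacement resistance of parallel vias and shows how a single equivalent open resistance can represent a chosen number of defective vias. It assumes ideal parallel resistors; the function names and all resistance values are made up for the example and are not taken from the GPDK045 library.

```python
def parallel(resistances):
    """Equivalent resistance (ohm) of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def merged_via_with_opens(r_via, n_vias, n_open, r_open):
    """Equivalent resistance of n_vias identical parallel vias when n_open
    of them carry an additional series open-defect resistance r_open."""
    healthy = [r_via] * (n_vias - n_open)
    defective = [r_via + r_open] * n_open
    return parallel(healthy + defective)
```

For two 10Ω vias, the defect-free replacement value is 5Ω; with one via opened by 1GΩ the pair behaves almost like a single 10Ω via, and with both opened the merged segment is itself a hard open. Sweeping the resistance injected at the merged location therefore spans these cases with one defect insertion.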
An R value is indicative of the probability that the corresponding segment will suffer from an open defect: a large R value indicates that the corresponding net segment is thin and/or long, and both increase the probability that the interconnect is affected by an open defect. The parameter min_res in Line 8 filters out parasitic resistors below a specified limit (in Ohm) during PEX. We want the PEX tool to include all resistor segments in the cell model as open-defect locations, and we provide a user-defined option R in the DLI function to filter out open-defect locations with small Rs based on users' requirements. Quantus accepts only min_res values larger than 0; therefore, we specify min_res low enough (i.e., 10^-10 Ω) to guarantee that all segment Rs are extracted.

Full Set of Open-defect Locations
A standard cell is built up from interconnected transistors. We consider open and short defects for both interconnects and transistors. With the dedicated user settings of the PEX tool, parasitic resistors indicate the potential open-defect locations on cell-internal interconnects. For transistors, we adapt transistor-internal defects to transistor-boundary defects. For example, bad doping causes high impedance at source or drain, which can be modeled by adding high-ohmic resistors at source or drain terminal [23]; corrosion of the gate metallisation or poor gate etching results in a loss of the gate controllability, which can be modeled by gate terminal open [8,9]. By default, opens on source, drain, and gate terminals are considered for transistors.
The identification of the full set of open-defect locations is described in Algorithm 1. This algorithm has several controls with which the user can authorize the tool to filter out certain open locations. The default settings for these controls are such that no locations are filtered out and thus the best test quality can be obtained.
Type 2: the branch is part of net $V_{DD}$ or $V_{SS}$ and is also on a power rail (i.e., in the metal-1 layer). On the power rails of standard cells in a circuit design, multiple voltage sources are designed and implemented with contacts from the external power supply to the power rails. A single open defect cannot block the voltage source for the transistors. The concept is shown in Fig. 7. Even though R open1 blocks the voltage source on the left side, the voltage source on the right side still serves. Therefore, the two open defects on the branch that consists of segm(0, 1) and segm(1, 2) do not have a different effect on the circuit.

Open-Defect Equivalence
Type 3: if a branch consists of multiple segments and an open defect on such a branch blocks the signal transitions for all downstream PMOS or NMOS gate terminals, we classify the branch as Type 3. A net can drive single or multiple parallel CMOS transistor gate terminals. An example net structure driving multiple transistors is shown in Fig. 8. The net connecting six transistor drain terminals and six transistor gate terminals is split into 19 branches, whose numbers are marked beside the branches. We use different colors for adjacent branches. Among Branches 7, 10, 11, and 12, which consist of multiple segments, only Branch 10 is of Type 3. On a Type 3 branch, the open-defect location affects the size of the resulting delay, as we explain later.
Type 4: the branch consists of multiple segments, but an open defect on this branch cannot block the signal transitions for all downstream PMOS or NMOS gate terminals. This branch type can be found in standard cells with high drive strength, as cell designers use multiple parallel transistors controlled by the same net to increase the drive strength. These multiple parallel transistors are switched on simultaneously, such that a large conductive current through the transistors quickly charges or discharges the downstream components. In Fig. 8, among Branches 7, 10, 11, and 12, which consist of multiple segments, Branches 11 and 12 are of Type 4.
Open defects at different locations result in different RC networks of a branch. Figure 9a shows an example RC network of a defect-free branch. The capacitive coupling   [30]. If we assume node 4 is closer to the downstream transistor network than node 0, the time constants of the delays due to open1 and open4  (2) the total gate capacitance of the driven transistors [30]. Figure 10 shows the load capacitances after R open1 on Net n2, which is driving an inverter. The two components of the load capacitance are drawn in red and blue, respectively. Eight nodes, internal nodes 0-5 and two gate terminals, split Net n2 into seven segments and three branches. Net n2 has two neighboring nets, n1 and n3. The parasitic capacitances between n2 and the two neighboring nets are distributed over the nodes of n2. $C_i^j$ denotes the capacitance distributed between node i and neighboring net j. $C_i^j$ is calculated by summing the values of the extracted capacitors of which one coupling node is i and the other coupling node is on net j. Two open defects, open1 and open2, are at different locations on the same branch of net n2. During the defect characterization, we inject only one open defect at a time. Assuming the resistances of open1 and open2 are equal, open1 causes a longer delay than open2 due to the extra interconnect capacitors $C_1^{n1}$ and $C_1^{n3}$.
The interconnect capacitance after an open defect is also affected by the logic values on the neighboring nets. A resistive open defect causes a delay fault and hence can be detected by two-cycle test patterns. We perform simulations of a set of two-cycle cell patterns; each of them contains only one cell input transition and makes at least one cell output switch from the first to the second cycle. The two-cycle cell patterns cause either a static or a dynamic logic value on each neighboring net. In the following, we discuss the effect of static and dynamic neighboring nets on the interconnect capacitance after an open defect.
Figure 11b shows the situation in which the logic value on the neighboring net makes the opposite transition to n2. In the first cycle of the cell pattern, due to the voltage difference between Net n2 and Net n1, Net n2 with its high voltage value charges the parasitic capacitors between Nets n1 and n2. In the second cycle, the voltage value on Net n1 is high while the voltage value on Net n2 is low, so the parasitic capacitors between n1 and n2 are first discharged and then charged. Therefore, the equivalent load capacitance between Nets n1 and n2 is twice the extracted capacitance. Consequently, neighboring nets with the opposite transition to the open branch increase the delay size. Similarly, a neighboring net with the same transition as the open branch decreases the delay size. The work in [1] builds an indication function $I_n(p)$ to evaluate the impact of the logic-value states on the |l| neighboring nets. $I_n(p)$ is approximated as in Equation 9.
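The qualitative effect of neighboring-net activity on the load capacitance can be illustrated with a small calculation. The sketch below is not the indication function of [1]; it is a simplified model that assumes idealized coupling factors (1 for a static neighbor, 2 for an opposite transition as stated above, and 0 for a same-direction transition, which is an assumption of this sketch). The names `COUPLING_FACTOR` and `effective_coupling` and all capacitance values are hypothetical.

```python
# Assumed idealized multipliers for the extracted coupling capacitance.
COUPLING_FACTOR = {
    "static": 1.0,    # neighbor holds its logic value
    "opposite": 2.0,  # neighbor switches in the opposite direction
    "same": 0.0,      # neighbor switches in the same direction (assumption)
}


def effective_coupling(coupling_caps, neighbor_states):
    """Effective coupling capacitance (farad) seen behind an open defect.

    coupling_caps:   dict neighbor net -> extracted coupling capacitance.
    neighbor_states: dict neighbor net -> 'static', 'opposite', or 'same'.
    """
    return sum(
        COUPLING_FACTOR[neighbor_states[net]] * cap
        for net, cap in coupling_caps.items()
    )
```

With 20aF to n1 and 10aF to n3, an opposite transition on n1 and a static n3 give 50aF, whereas two static neighbors give 30aF, which is why the same open defect can produce different delay sizes under different cell patterns.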
Given an open defect on a net segment segm(i, j) with |l| neighboring nets, the function open_net: Nodes c × Nodes c (9) The effect of open-defect location on delay size at cell output can be ignored. For example, in Fig. 12

Open-defect Test Quality Verification and Compensation
After the open-defect characterization, we verify, only for each branch of Type 2 described in Section 5.3, whether all open defects cause identical DDM columns. If not, we ensure the best test quality by performing additional simulations to identify all different DDM columns.
This section describes how we handle short defects. We first present how we use the PEX tool to assist the short DLI in Section 6.1. Subsequently, we describe the identification of the full set of short-defect locations in Section 6.2. Section 6.3 analyzes the electrical behavior of short defects to identify equivalent short defects, which is the basis of short-defect collapsing and test quality verification. The algorithms for short-defect collapsing and characterization are described in Sections 6.4 and 6.5, respectively. Section 6.6 verifies and compensates the short-defect test quality.

PEX for Short-defect Location Identification
To identify the full set of short-defect locations between intra-cell interconnects, we let the PEX tool extract the parasitic coupling capacitors between all nodes in the library cell; Fig. 13 shows the associated user options for Quantus. All self-coupling capacitors between nodes on the same net are extracted to obtain an accurate simulation model (Line 1). The DLI script ignores these self-coupling capacitors for short-defect location identification, as we do not want to consider short defects between nodes on a single net. Floating nets are expected not to occur in the heavily optimized circuits that library cells are. However, in case they do occur anyway, the command in Line 2 removes them, as a single short connected to a floating net cannot cause a fault. We want a simple cut-off threshold to filter out capacitors whose nodes are too far apart in the cell's layout. Such far-apart nodes pose a low risk of becoming shorted by a defect that would affect the interconnects. The Quantus MinC function does not provide such a straightforward cut-off filter. Hence, we set MinC as low as possible to extract all capacitors and apply our own cut-off threshold for capacitor values in the Modus DLI script. Quantus does not accept MinC=0, and hence we set MinC=10^-25 Farad, below which Quantus does not extract any capacitance value (Lines 3-5).
The capacitance is indicative of the chance that these nets will indeed suffer from a short defect between the two nodes: a large C value indicates that the corresponding two nets run for a significant length parallel at close distance from each other and therefore have an increased probability to be affected by a short defect. Our DLI approach can identify short-defect locations both within a layer as well as in between physically adjacent layers. Note that the number of node pairs grows quadratically with the number of nodes.

Full Set of Short-defect Locations
For intra-cell interconnects, we guide the PEX tool to extract all parasitic capacitors between net nodes and consider each parasitic capacitor C(i, j) with conn_net(i, j) = False as a potential interconnect short-defect location. For transistors, we model defects inside them as short defects between transistor terminals. Gate-oxide breakdown between gate and source or drain can be exactly modeled by a short resistor between the gate and the source or drain terminal [22]. An improperly large diffusion area of source and drain due to bad doping may cause a short defect between source and drain [9]. By default, we model transistor-terminal shorts between gate-source, gate-drain, and source-drain [8,9,31]. The parasitic capacitors between interconnects and the pairs of transistor terminals constitute the full set of short-defect locations per cell.
Modus' DLI function has the following user controls with which the user can authorize the tool to filter out certain short locations.
- Parameter disableTrTerminalShorts allows the user to explicitly disable the identification of short-defect locations on transistor terminals (default: false).
- TrTerminalShortSet specifies which set of transistor terminal pairs are considered as possible short locations (default: {gate-source, gate-drain, gate-bulk, source-drain}).
- A blocklist specifies layer pairs between which shorts should not be considered. This list is technology dependent. Advanced CMOS technologies with feature sizes below 10nm typically have more layers in their library cells and some of these layers are sensitive to inter-layer shorts [4] (default: empty list).
- Threshold value C th defines that extracted parasitic capacitors will not be identified as short-defect locations if C < C th (default: C th = 0).
The DLI operation to identify the full set of short-defect locations within a library cell is described in Algorithm 5. In Line 02, for cell c, the full set of short-defect locations is initialized. Subsequently, the parasitic Cs are evaluated to identify locations for short defects between intra-cell nets (Lines 03-07). For each net pair a, b ∈ Nets c , if the node pair i ∈ a and j ∈ b is not on the blocklist and C(i, j) ≥ C th holds, we add it to the full set (Lines 05-06). Next, for all transistors, short-defect locations between the terminal pairs in TrTerminalShortSet are identified, unless disabled by the user (Lines 08-12).
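The steps described for Algorithm 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Modus implementation: the function name `full_short_set`, the dictionary-based inputs (`caps`, `net_of`, `layer_of`), and the representation of the blocklist as a set of frozensets of layer names are all hypothetical. The default terminal-pair set used here is the three-pair set named in the text above; it is passed as a parameter so other sets can be substituted.

```python
def full_short_set(caps, net_of, layer_of, transistors, c_th=0.0,
                   blocklist=frozenset(), disable_tr_terminal_shorts=False,
                   tr_terminal_short_set=(("gate", "source"),
                                          ("gate", "drain"),
                                          ("source", "drain"))):
    """Collect candidate short-defect locations for one cell (hypothetical sketch).

    caps:      {(node_i, node_j): extracted coupling capacitance in farad}
    net_of:    {node: net name}
    layer_of:  {node: layer name}
    blocklist: set of frozensets of layer names between which shorts are ignored
    """
    full_set = []
    for (i, j), c in caps.items():
        if net_of[i] == net_of[j]:
            continue  # self-coupling on one net: not a short-defect location
        if frozenset((layer_of[i], layer_of[j])) in blocklist:
            continue  # layer pair excluded by the user
        if c >= c_th:
            full_set.append(("net", i, j))
    if not disable_tr_terminal_shorts:
        for t in transistors:
            for x, y in tr_terminal_short_set:
                full_set.append(("tr", f"{t}#{x}", f"{t}#{y}"))
    return full_set


caps = {("a1", "b1"): 1e-18, ("a1", "a2"): 5e-18, ("a2", "b1"): 1e-20}
net_of = {"a1": "a", "a2": "a", "b1": "b"}
layer_of = {"a1": "m1", "a2": "poly", "b1": "m1"}
shorts = full_short_set(caps, net_of, layer_of, ["Mp0"], c_th=1e-19)
```

With the toy data above, only the a1-b1 capacitor survives the same-net and threshold filters, and three transistor-terminal pairs are added for the single transistor.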

Short-defect Equivalence
A short defect can occur between any two nodes of two disjoint Nets a and b and shorts the whole net pair together, as shown in Fig. 14, which is a partial schematic of a full adder cell in Cadence's GPDK045 library. To test a short defect between a and b, we should first activate the short defect by assigning opposite logic values to a and b. As each cell-internal net is driven by either a pull-up or a pull-down network, logic 1 on a net is obtained by switching on the pull-up network to connect the net to the power net V DD , while logic 0 comes from the ground net V SS through the pull-down network. When a static signal is applied on a net, no current flows on the net, as the logic signal is used to control downstream transistor gates, which are considered high-impedance terminals. Therefore, the consequence of activation is that the short defect creates a conductive path from V DD to V SS , as shown in Fig. 14. If the interconnects are ideal, namely without any parasitic resistance, both nets take an identical logic value, which is interpreted from the voltage division between the transistor networks driving the two shorted nets, assuming the short-defect resistance is small and can be ignored. If the short-defect resistance increases, the voltage drop on the short defect increases as well, but the analysis in this section remains valid. No matter which two nodes of Net a and Net b are shorted and how large R short is, the voltage value at the short location is fixed. All short defects between the two nets cause the same electrical effect and hence are equivalent. Please note that multiple cell patterns may activate a short defect. In Fig. 14
Considering the parasitic resistance, per input combination, the voltage values at different short-defect locations are slightly different.
Fig. 15 Short defects between nets with parasitic resistance
We build a short-defect model that includes the extracted parasitic resistors on nets, shown in Fig. 15. To make the analysis easy to understand, we assume R short is approximately 0 Ω , namely the voltage values at the two shorted nodes are equal. The numbers of nodes on Nets a and b are na and nb, respectively. The short defect can be between node a i on Net a and node b j on Net b if C(a i , b j ) exists in the extracted DSPF netlist, where i ∈ [0..na] and j ∈ [0..nb] . The logic value at a short-defect location is determined by the voltage division between the two resistances on the two sides of the location. In Fig. 15, these two resistances are ( R pull-a + R a ) and ( R pull-b + R b ), where R pull-a and R pull-b are the resistances of the switched-on transistors in the pull networks driving Nets a and b, respectively; R a and R b are the parasitic resistances of Net a and Net b, respectively, that are involved in the conductive path from V DD to V SS due to the short defect. R a is the sum of the resistances from node a 0 to node a i , which is the short-location node. Similarly, R b is the sum of the resistances from node b 0 to the short location b j . A cell pattern causing opposite logic values on Nets a and b determines the conducting transistors in the pull-up and pull-down networks, and therefore determines the network resistances R pull-a and R pull-b . R a and R b are determined by the short-defect location.
Given a cell pattern p activating a short defect between Nets a and b, the voltage values V a and V b at the shorted node a i respectively b j vary with the locations of a i and b j , as their locations affect R a and R b . If the voltage variation makes the interpreted logic value on the nets change between 0 and 1, two groups of equivalent short defects exist between the nets for p.
Typically, the parasitic resistance of interconnects is small, and hence the voltage drop on parasitic resistors is much smaller than that on transistors. As the source-drain voltage of the transistors does not change significantly with the short location, we assume that the transistors in the shorted path can be seen as linear resistors [30]. With this assumption, the larger R a and the smaller R b , the larger the voltage drop on Net a. Therefore, it is more probable that the interpreted logic values at node a 0 to node a i are different. When R a is maximized and R b is minimized by shorting node a na and b 0 , shown in Fig. 16a, the voltage division is determined by the ratio of ( R pull-a + R a ) to R pull-b , since R b = 0 for this location. If a pull-up network drives Net a and a pull-down network drives Net b, the short defect in Fig. 16a leads to the minimum V a and V b , which are most likely interpreted as logic 0; that is, the logic value on Net a is forced to change from logic 1 to 0. If a pull-down network drives Net a and a pull-up network drives Net b, as the voltage drop on Net a is maximized and the voltage drop on Net b is minimized to zero, the short defect in Fig. 16a leads to the maximum V a and V b , namely to the highest probability that the logic values on a and b are 1 and a is still dominated by b. Similarly, Fig. 16b shows the short-defect location that causes the highest probability that Net a dominates Net b. In summary, between Net pair [a, b] the two diagonal short defects shown in Fig. 16a and Fig. 16b determine the range of voltages at short-defect locations for every cell pattern activating the short defects between Nets a and b. If, for all cell patterns, the voltage values at the two diagonal short-defect locations are interpreted as identical logic values, all short defects between a and b are equivalent.
Power net V DD and ground net V SS are voltage sources, and no transistor network drives them. Assuming a short defect occurs between V DD /V SS and some other net, the voltage at the short-defect location is determined by the voltage drop on the parasitic resistance of V DD /V SS . Typically, the parasitic resistance on V DD or V SS is up to several tens of ohms, while the transistor-network resistance is at least several hundred ohms. The voltage drop on V DD or V SS is much smaller than that on the other side of the short defect, and a change of short-defect location cannot lead to a drastic voltage variation at the short-defect location. Therefore, the voltage values at the short-defect locations are always close to the value on V DD or V SS , namely V DD or V SS always dominates the other net driven by transistors.
In summary, the short-defect location affects the voltage value on the shorted nets through the parasitic resistance involved in the shorted path. Typically, transistor-network resistances are much larger than the parasitic resistances, and therefore varying the short-defect location cannot change the logic value on the shorted nets, namely all short defects are equivalent. However, in some specific situations, varying the short-defect location results in some non-equivalent defects, which are discussed later. The range of voltages at the short-defect location is determined by the two diagonal short locations. If the two diagonal short defects between a net pair are equivalent, all shorts between the net pair are equivalent.
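The voltage-division argument can be illustrated with a small numeric sketch. It assumes, as the text does, that the conducting transistor networks behave as linear resistors and that R short is 0 Ω. The function names `short_voltage` and `diagonal_range` and the example resistance values are hypothetical and chosen only for illustration; "up" denotes the net driven by the pull-up network and "down" the net driven by the pull-down network.

```python
def short_voltage(vdd, r_pull_up, r_net_up, r_pull_down, r_net_down):
    """Voltage at the short location for a VDD-to-VSS path with R_short = 0.

    The path is: VDD -> pull-up network -> parasitic R on the pulled-up net
    -> short -> parasitic R on the pulled-down net -> pull-down network -> VSS.
    """
    r_up = r_pull_up + r_net_up
    r_down = r_pull_down + r_net_down
    return vdd * r_down / (r_up + r_down)


def diagonal_range(vdd, r_pull_up, r_pull_down, r_up_net_max, r_down_net_max):
    """Voltage range spanned by the two diagonal short locations."""
    # Maximum parasitic R on the pulled-up net, none on the pulled-down net.
    v_low = short_voltage(vdd, r_pull_up, r_up_net_max, r_pull_down, 0.0)
    # No parasitic R on the pulled-up net, maximum on the pulled-down net.
    v_high = short_voltage(vdd, r_pull_up, 0.0, r_pull_down, r_down_net_max)
    return v_low, v_high


# Illustrative values only: equal 1 kOhm pull networks, 250 Ohm net parasitics.
v_low, v_high = diagonal_range(1.0, 1000.0, 1000.0, 250.0, 250.0)
```

In this example the two diagonal locations straddle half of the supply voltage, which is the situation in which the short location can change the interpreted logic value.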

Short-defect Collapsing
Based on the analysis of short-defect equivalence, the short defects are equivalent for many net pairs. To minimize the number of defects to be simulated, we collapse the full set of short defects into a compact set by identifying only one short-defect location per net pair, and we simulate the compact set. Afterward, we identify the net pairs with non-equivalent short defects and identify the DDM columns for all these short defects to compensate the test quality. The selected location should meet two conditions: (1) the location is based on an extracted parasitic capacitor and (2) compared with the other extracted capacitors, the node pair of this capacitor leads to the maximum parasitic R on one net and the minimum parasitic R on the other net involved in the shorted path from V DD to V SS , and therefore results in either boundary of the voltage variation due to the short location.
The short-defect collapsing algorithm is described in Algorithm 6. As we want to identify at most one short per net pair, we use array netShorts to store which net pairs have already been assigned a short defect; in Line 03, this array is initialized. Subsequently, for each net pair a, b ∈ Nets c that does not have a short identified yet (Line 05), we identify the node pair i ∈ a and j ∈ b in the full set of short-defect locations that leads to the maximal parasitic R of Net a and the minimal R of Net b involved in the shorted path (Lines 08-13). Next, for all transistors, if the user does not disable the transistor-terminal shorts, only the transistor-terminal pairs for which there is not already a short between the two corresponding nets are added to the compact set (Lines 15-20). If the transistor-terminal pair leads to a larger difference of the involved parasitic resistances, we replace the interconnect short location by the transistor-terminal pair (Lines 21-25).
Fig. 17 The number of transistors but not nets increases with the drive strength
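The selection of one representative location per net pair can be sketched as follows. This is a simplified illustration, not Algorithm 6 itself: the name `collapse_shorts`, the input maps, and the scoring by the difference between the accumulated parasitic resistances on the two nets are assumptions made for the example. Transistor-terminal pairs could be fed through the same function as additional node pairs.

```python
def collapse_shorts(full_set, net_of, r_from_driver):
    """Keep one short location per net pair (hypothetical sketch).

    full_set:      iterable of (node_i, node_j) candidate short locations
    net_of:        {node: net name}
    r_from_driver: {node: parasitic R accumulated from the net's node 0}
    Returns {(net_a, net_b): (node_on_a, node_on_b)} with net_a < net_b.
    """
    best = {}
    for i, j in full_set:
        a, b = net_of[i], net_of[j]
        if a == b:
            continue  # nodes on the same net are not short candidates
        if a > b:
            # Canonical ordering so that (a, b) and (b, a) share one entry.
            a, b, i, j = b, a, j, i
        # Prefer maximal R on net a together with minimal R on net b.
        score = r_from_driver[i] - r_from_driver[j]
        if (a, b) not in best or score > best[(a, b)][0]:
            best[(a, b)] = (score, i, j)
    return {pair: (i, j) for pair, (_, i, j) in best.items()}


net_of = {"a0": "a", "a1": "a", "b0": "b", "b1": "b"}
r_from_driver = {"a0": 0.0, "a1": 30.0, "b0": 0.0, "b1": 20.0}
compact = collapse_shorts([("a0", "b1"), ("a1", "b0"), ("b1", "a1")],
                          net_of, r_from_driver)
```

For the toy data, the pair (a1, b0) is selected because it puts the largest parasitic resistance on Net a and none on Net b.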

Short-defect Characterization
Defect characterization is performed with only one short defect per net pair. To verify the test equivalence of short defects, we record more simulation results in addition to the DDM entry: per combination of short defect and cell pattern, the voltage value V p at the selected short-defect location and the current I p in the conductive path from V DD to V SS .

Short-defect Test Quality Verification and Compensation
To check whether all short defects per net pair are equivalent, we should know, for each cell pattern activating the short defects, whether the two diagonal short defects are equivalent. Based on the simulation result of the selected short defect per net pair, we identify the cell patterns for which the short defects between a net pair may be non-equivalent and hence cause different DDM entries. Afterward, we simulate the other diagonal short defect for the net pair with only these suspicious cell patterns. If the two diagonal short defects are non-equivalent, we perform additional simulations for all short defects between the net pair to obtain all different DDM columns and thus maintain the test quality.
With the voltage and current values at the location of each simulated short defect, we calculate V p ∕I p , which is the sum of the resistance of the pull-down network and the parasitic resistance on the corresponding driven net. Similarly, (V DD − V p )∕I p is the pull-up network resistance plus the parasitic resistance on the corresponding driven net. The involved parasitic resistances on the two shorted nets are calculated by processing the extracted DSPF netlist. Therefore, we calculate the transistor-network resistances by subtracting the parasitic resistances from V p ∕I p and (V DD − V p )∕I p . Considering the effect of short-defect location, we classify the relationships between the transistor and parasitic resistances into four cases per cell pattern p that activates the short defect.
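As a worked illustration of this calculation, the sketch below derives the two transistor-network resistances from a recorded voltage and current. The function name `pull_network_resistances` and the numeric values are hypothetical; only the formulas V p ∕I p and (V DD − V p )∕I p and the subtraction of parasitic resistances come from the text.

```python
def pull_network_resistances(vdd, v_p, i_p, r_par_up, r_par_down):
    """Return (R_pull_up, R_pull_down) in ohm from simulated V_p and I_p.

    r_par_up / r_par_down: parasitic resistance (ohm) involved in the shorted
    path on the net driven by the pull-up / pull-down network, taken from
    the extracted netlist.
    """
    r_down_total = v_p / i_p            # pull-down network + its net parasitics
    r_up_total = (vdd - v_p) / i_p      # pull-up network + its net parasitics
    return r_up_total - r_par_up, r_down_total - r_par_down


# Illustrative values: VDD = 1.1 V, V_p = 0.44 V, I_p = 0.1 mA.
r_up, r_down = pull_network_resistances(1.1, 0.44, 1e-4, 100.0, 50.0)
```

Here the totals are 6.6 kΩ on the pull-up side and 4.4 kΩ on the pull-down side, so the transistor networks contribute about 6.5 kΩ and 4.35 kΩ after the parasitics are removed.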
Case 1: R pull−a (p)∕R pull−b (p) > 3/2
If no parasitic resistance is involved in the conductive path caused by a short defect, R pull−a ∕R pull−b > 3/2 results in more than 60% of the voltage dropping on the transistor network driving Net a. If a pull-up network drives Net a, the voltage at the short location V p is smaller than 40%V DD . The logic values on Nets a and b are 0. If a pull-down network drives Net a, V p is larger than 60%V DD . The logic values on a and b are interpreted as 1. Therefore, Net b dominates Net a. Furthermore, the transistor resistances are much larger than the maximum of R a and R b . The variations of R a and R b due to the short-defect location cannot cause the voltage value at the short-defect location to vary between logic 0 and 1. The effect of the short-defect location can be ignored, and hence all short defects between a and b are equivalent for p.
Case 2: R pull−a (p)∕R pull−b (p) < 2/3
Case 2 is similar to Case 1. The voltage value at the short-defect location is also either smaller than 40%V DD or larger than 60%V DD , and the effect of the short-defect location can be ignored and all short defects are equivalent for pattern p. But in this case, Net a dominates Net b.
Case 3: 2/3 ⩽ R pull−a (p)∕R pull−b (p) ⩽ 3/2
If no parasitic resistance is involved in the conductive path caused by a short defect, the voltage value at the short-defect location is in the range of [ 40%V DD , 60%V DD ]. The threshold voltage between logic 0 and 1 is in this range for the CMOS technology. A small voltage variation can change the interpreted logic value. Therefore, the variations of R a and R b may cause different fault effects. In this case, two possible groups of equivalent short defects exist between the net pair [a, b]. One group causes the logic values on both nets to be 1, while the other group causes both nets' logic signals to be 0.
Case 4: parasitic resistance larger than 10% of the transistor-network resistance
In standard cells with high drive strength, multiple parallel transistors are controlled by the same input signal. When these parallel transistors are switched on simultaneously, the signal on their driven net is pulled up or down by these parallel transistors together, and hence the drive strength increases. However, the resistances of the transistor networks offering high drive strength decrease due to the parallel conducting transistors. If the transistor-network resistances are comparable with the parasitic resistance, the short-defect location may influence the logic value on the shorted Nets a and b. For the net pairs whose parasitic resistances are larger than 10% of the transistor-network resistance, we check whether the two diagonal short defects are equivalent.
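The four-way classification can be sketched as a small function. The thresholds 3/2, 2/3, and 10% come from the text; the function name `classify_case`, the choice to compare the larger parasitic resistance against the smaller transistor-network resistance, and the choice to test the high-drive-strength condition first are assumptions made for this illustration.

```python
def classify_case(r_pull_a, r_pull_b, r_par_a, r_par_b, par_fraction=0.10):
    """Classify one (net pair, cell pattern) combination into Case 1..4.

    Cases 1 and 2: one net clearly dominates, short location can be ignored.
    Cases 3 and 4: the short location may change the interpreted logic value.
    """
    # Case 4 (assumed precedence): parasitics are not negligible compared
    # with the transistor-network resistance.
    if max(r_par_a, r_par_b) > par_fraction * min(r_pull_a, r_pull_b):
        return 4
    ratio = r_pull_a / r_pull_b
    if ratio > 3 / 2:
        return 1  # Net b dominates Net a
    if ratio < 2 / 3:
        return 2  # Net a dominates Net b
    return 3      # voltage near the logic threshold


def needs_verification(case):
    """Only Cases 3 and 4 require simulating the other diagonal short."""
    return case in (3, 4)
```

For example, `classify_case(1000.0, 1000.0, 10.0, 10.0)` falls into Case 3, because the two pull networks are balanced and the short-location voltage lies near the logic threshold.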
If Case 1 or 2 applies for a cell pattern, all short defects between the corresponding net pair are equivalent. Only in Cases 3 and 4 are test quality verification and compensation required to maintain the test quality. The process is described in Algorithm 7. For each net pair, we initialize the set nonEquPatterns to store the cell patterns for which the short defects are non-equivalent (Line 03). Based on the voltage value at and the current value through each short-defect location, we calculate the transistor-network resistances per cell pattern p that activates the short defect (Lines 04-13). For each cell pattern for which the effect of the short-defect location cannot be ignored, we simulate the other diagonal short defect, which together with the already simulated diagonal short defect identifies the range of voltage values. Only if the two DDM entries of the two diagonal short defects are different are the short defects between a and b non-equivalent for pattern p, and we store p in the set nonEquPatterns (Lines 18-24). If the short defects are non-equivalent for only one cell pattern, we derive the only two different DDM columns, in which all entries are identical except in the row of pattern p (Lines 26-31). Otherwise, we simulate all short defects with only the identified cell patterns in nonEquPatterns to get all different DDM columns (Lines 33-36). To facilitate the downstream ATPG step, we collapse all identical DDM columns into one (Line 38).
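The control flow described for Algorithm 7 can be sketched for a single net pair as follows. This is an illustrative outline under stated assumptions rather than the algorithm itself: the function `verify_net_pair` and its callback parameters (`needs_check`, `simulate_other_diagonal`, `simulate_all`) are hypothetical stand-ins for the analog simulations, and DDM entries are assumed to be 0 or 1.

```python
def verify_net_pair(patterns, ddm_first, needs_check,
                    simulate_other_diagonal, simulate_all):
    """Return the sorted list of distinct DDM columns for one net pair.

    patterns:  ordered list of cell patterns
    ddm_first: {pattern: 0/1} entries of the already simulated diagonal short
    needs_check(p): True if pattern p falls into Case 3 or 4
    simulate_other_diagonal(p): DDM entry of the other diagonal short for p
    simulate_all(ps): one {pattern: 0/1} dict per short defect, limited to ps
    """
    non_equ = [p for p in patterns
               if needs_check(p) and simulate_other_diagonal(p) != ddm_first[p]]
    columns = {tuple(ddm_first[p] for p in patterns)}
    if len(non_equ) == 1:
        # Exactly two columns exist; they differ only in the row of pattern p.
        flipped = dict(ddm_first)
        flipped[non_equ[0]] = 1 - ddm_first[non_equ[0]]
        columns.add(tuple(flipped[p] for p in patterns))
    elif len(non_equ) > 1:
        # Simulate every short of the net pair, but only for these patterns.
        for partial in simulate_all(non_equ):
            col = dict(ddm_first)
            col.update(partial)
            columns.add(tuple(col[p] for p in patterns))
    # Using a set collapses identical DDM columns into one.
    return sorted(columns)


patterns = ["00", "01", "10", "11"]
ddm_first = {"00": 0, "01": 1, "10": 1, "11": 0}
cols = verify_net_pair(patterns, ddm_first,
                       needs_check=lambda p: p == "01",
                       simulate_other_diagonal=lambda p: 0,
                       simulate_all=lambda ps: [])
```

In the usage example only pattern "01" is suspicious and the other diagonal short gives a different entry, so two columns result without any further simulation.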

Experimental Results
The experimental results reported in this paper are based on 351 combinational cells in Cadence's GPDK045 45nm library [3]. This library is a planar CMOS technology and all cells contain four layers: n-diffusion, p-diffusion, poly, and metal-1. The EDA tool versions used in our experiments are Cadence Pegasus v18.1, Quantus Extraction Solution v18.1, and Modus v20.1. For the full-set DLI, we use the default user settings, which include all possible defect locations and thus guarantee the best test quality; for defect characterization, we set the open-defect resistance to 1GΩ and the short-defect resistance to 0.001Ω . The tolerated delay threshold value is set to 1ns.

GPDK045 CMOS Library
A standard-cell library contains two types of cells: logic cells and physical only cells. Physical only cells, such as tie-up, filler, and decap cells, have no inputs or outputs, hence we cannot perform the logic test that requires applying test stimuli on the inputs and observing responses on the outputs. CAT, as a logic test method, is applied on only logic cells. Typically, a logic cell with base drive strength that is denoted by 'D1' or 'X1' in the cell name contains only transistors that implement the cell's logic function. We refer to these transistors as function transistors. Designers increase the drive strength of cells by adding so-called drive transistors in parallel with the function transistors driving the cell outputs. Therefore, both function and drive transistors are switched on or off by the same input signal and contribute to pull up or pull down the output simultaneously, which increases the drive strength. For a group of cells with the same logic function, the variation of drive strength influences the number of transistors but not the number of nets, which indicates the number of short-defect locations in the compact set is independent of the drive strength. For example, Fig. 17 shows the drive strengths and the number of transistors and nets in all GPDK045 inverter and multiplexer 2-to-1 cells. The number of nets always stays the same for all inverter or multiplexer cells, but the number of transistors increases with the drive strength.
The GPDK045 cell information is shown in Fig. 18. The cells with the most inputs are AOI33, AOI222, OAI33, OAI222, and MUX4-to-1. As they all have six inputs and one output, the number of one-cycle cell patterns is 2^6 = 64 . The MUX4-to-1 cell requires the most (i.e., 128) two-cycle cell patterns in which one input signal change causes an output transition.

Example Cell AND2X1
First, we illustrate our approach with cell AND2X1. The AND2X1 cell has two inputs, A and B, one output Y, and performs a logic-AND function. Three I/Os, power and ground connections, and six transistors with four terminals each give a total of 29 terminals for this cell. As shown in Fig. 19a, these terminals are interconnected by seven nets. Figure 19b shows the layout of AND2X1. The PEX tool partitions the seven nets into 43 segments excluding eight virtual segments, each with its own parasitic resistor. Figure 19c illustrates the PEX partitioning of nets into segments for Net A, which interconnects input terminal A and the gate terminals of Transistors Mp0 and Mn1. PEX introduces four internal nodes on Net A. Nodes A#9 and A#4 mark the transition points from metal-1 and poly-silicon, respectively, to the contact between these two layers. At Node A#3 , the net forks out to the PMOS and NMOS transistors. Node A#5 is introduced to mark the 90° bend point in the 'L'-shape. The dangling segments at the extreme ends of the poly-silicon and metal-1 structures are not reported. Example Net A consists of six segments, partitioned over three branches: {A, A#9, A#4, A#3} , {A#3, Mp0#g} , and {A#3, A#5, Mn1#g} . For each of the three branches, we identify the segment on
Among the 20 branch opens, nine opens cause either no delay or a delay longer than 1 μs (= 1000ns) for all cell patterns. As opens on these nine branches cause a delay much larger than the user-defined detection threshold for delay size, and the load capacitance variation on a branch cannot be 1000× , all opens on these branches will cause a delay larger than 1ns and hence be detected by the same patterns, namely they are equivalent. We do not verify test quality for these branches. The other eleven open defects, which cause a delay smaller than 1 μs and larger than 1ns, are shown in Table 1. The column `Nominal Delay' presents the delays for all cell patterns in the defect-free situation.
Based on the load capacitances after opens per branch, we calculate the delay size caused by the most downstream segment open on each of the eleven branches, shown in Table 2. All the calculated delay sizes in Table 2 are larger than 1ns. Therefore, for all branches, the most upstream and the most downstream segment opens are detected by the same set of cell patterns. All opens on the branches are equivalent, and no additional simulation needs to be performed.
For the branches containing {d3, d3′}, {d4, d4′}, {d5, d5′}, {d6, d6′}, {d8, d8′}, and {d9, d9′}, the delay sizes caused by the most upstream and the most downstream segment opens are identical. The reason can be either that the branch contains only one segment, or that the load capacitance variation caused by the open-defect location is too small to impact the delay size.
Next, we consider short-defect locations. Cell AND2X1 has 58 nodes, which make for (58 choose 2) = 58 × 57 / 2 = 1,653 node pairs. Subtracting the 257 node pairs for which both nodes are part of the same net leaves 1,396 node pairs between disjoint nets; this is the theoretical maximum for the number of short-defect locations. The PEX tool extracts only 745 parasitic capacitors above its user-specified PEX minimum of 10^−25 F. The DLI function considers at most one short-defect location per net pair. Cell AND2X1 has six physical nets, not counting Net n2 connecting Mn0#d and Mn1#s, between which there is no real physical interconnection but just a shared diffusion area. Therefore, we consider (6 choose 2) = 15 net pairs. As they all have extracted parasitic capacitors, for each one we identify one representative short-defect location by selecting the capacitor with the maximum value. For each transistor, identifying short-defect locations between three terminal pairs would result in 18 shorts. However, some transistor-terminal short locations are between the same net pairs and hence are equivalent. Only 15 terminal shorts are non-equivalent to each other, and ten of them are equivalent to the net-pair shorts already identified based on the parasitic capacitors. Therefore, we only need to add five more transistor-terminal short locations to the compact set. Furthermore, three of these shorts (A-Y, B-Y, and A-B) are equivalent to the port shorts at the cell boundary.
In total, 20 of the 21 potential net-pair short locations are considered realistic in our approach; this constitutes a reduction of 97.4% from the 745+24 = 769 possible short locations based on the extracted netlist. The corresponding simulation time is reduced from 74 to only 7 seconds, which is a 90.5% reduction. The 20 identified cell-internal short-defect locations also cover the six stuck-at faults at the cell boundary, i.e., Ports A, B, and Y either stuck-at-zero or stuck-at-one. Table 3 shows the voltages at short-defect locations given by the simulation. The supply voltage is set to 1.1 V based on the library Liberty files. Cell patterns are in the first column. The second to the last columns in the second row give the names of the net pairs between which we inject a diagonal short defect. The shorted net pairs of which one net is V DD or V SS are not listed in Table 3, as V DD or V SS always strongly dominates the other net. It is not necessary to verify the short-defect equivalence for these net pairs. The highlighted items in Table 3 indicate the combinations of net pair and cell pattern that may have multiple detection results when the short-defect location changes, because the voltage value at the short-defect location is close to the threshold of logic 1. For all the other net pairs, the voltage values at the short-defect location are strong logic '1' or '0' and their parasitic resistances are much smaller
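The pair counts quoted for AND2X1 follow from the binomial coefficient and can be checked with a few lines of Python. The variable names are chosen for this illustration only; the input numbers (58 nodes, 257 same-net node pairs, six physical nets) are taken from the text.

```python
from math import comb

nodes = 58
same_net_pairs = 257
physical_nets = 6

node_pairs = comb(nodes, 2)                     # all unordered node pairs
disjoint_pairs = node_pairs - same_net_pairs    # pairs spanning two different nets
net_pairs = comb(physical_nets, 2)              # candidate net pairs for collapsing
```

Because `comb(n, 2)` equals n(n − 1)/2, the number of candidate node pairs grows quadratically with the node count, whereas the number of net pairs depends only on the number of nets.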

45nm Technology Library
The PEX run on all 351 combinational cells of the GPDK045 library [3] Fig. 20.
The simulation time of the compact set is 26.6 hours, which indicates that the simulation based on the full set would take around 67 hours. Of the total 22,306 branches, 74.6% are branches for which the most upstream segment open is detected. Subsequently, we verify the test quality for these 74.6% of branches. The other 25.4% of branch opens cause a delay smaller than the detectable threshold of 1ns and hence are non-detectable. As the most upstream segment opens on these 25.4% of branches cause a delay smaller than 1ns, all the other opens on their branches also cause a delay smaller than 1ns. All opens on these 25.4% of branches are non-detectable and equivalent.
For each branch of the 74.6% of branches, if the most upstream segment open causes a delay larger than the tolerated delay threshold of 1ns and the most downstream segment open leads to a delay smaller than 1ns, not all segment opens on this branch are equivalent. Before calculating the delay sizes of the most downstream opens, we filter out Type 3 branches, on which open defects at different locations may cause different detection results. 42.7% of branch opens cause a delay larger than 1 μs for all required cell patterns, which is 1000× the detectable threshold of 1ns. On these branches, the delay cannot vary by as much as 1000× with the open-defect location. All opens on these branches cause delays larger than 1ns for all required cell patterns, so they are detectable by the same cell patterns and equivalent. We classify the remaining 31.9% of branches according to the four types discussed in Section 5, as shown in Fig. 21. Only 9.93% (= 2,214) Type 3 branches require test quality verification.
For all 2,214 Type 3 branches, we calculate the delay size resulting from the most downstream segment open for each pattern. Figure 22 shows, per branch, only the minimum delay sizes caused by the most upstream and downstream segment opens when applying all cell patterns. The results show that on no branch do the most upstream and downstream segment opens cause different detection results. All segment opens on every branch are equivalent. The branch classification and delay size calculation for the Category 5 branches took 0.86 hours in total. In summary, we reduce the open-defect simulation time by 59.0% while maintaining the best test quality.
The PEX tool extracts 1,278,098 parasitic capacitors (excluding self-capacitances). If we assign shorts on all parasitic capacitors and three short defects per transistor between its terminals, we identify a full set of 1,278,098 + 3 × 6,438 = 1,297,412 short-defect locations. We consider one diagonal short defect per net pair. The library cells together have 22,840 net pairs, of which only 10,430 net pairs have extracted parasitic capacitors. We identify 10,430 net-pair shorts in the compact set. Subsequently, we add 4,235 additional transistor-terminal shorts that are between non-physical interconnects but model transistor-internal defects. The compact set reduces the full set of short locations by 1 − (14,683/(1,278,098 + 3 × 6,438)) = 98.9%. Figure 23 shows the DLI results for shorts per library cell. Unlike what was claimed by [14], there is no need to explicitly add stuck-at faults at the cell boundaries to the candidate defect set, as all stuck-at faults are already included in the compact set as short defects between cell ports and power or ground. Figure 24 shows, per cell, the simulation time spent in defect characterization and in test quality verification and compensation. Assuming we simulate the full set of short defects, for each library cell, the number of simulations equals the product of the short-defect count and the cell pattern count, resulting in 21,788,604 (=100%) simulations. With the approach proposed in this paper, we first simulate only the compact
During short-defect characterization, we use the exhaustive set of one-cycle cell patterns to simulate each short defect and record the voltage and current values at each short-defect location for the downstream test quality verification and compensation. Using these values, we identify the combinations of net pairs and cell patterns that may result in different detection results due to different short-defect locations.
For 1,616 (=15.5%) out of 10,430 net pairs, the short defects are possibly non-equivalent for one or multiple cell patterns. Applying only the cell patterns that can cause different detections, we perform simulations for the other diagonal short defect of the 1,616 net pairs. The total number of additional simulations is 16,232, and the corresponding simulation time is 0.8 hours. After this simulation, we determine 348 (≈3.3%) net pairs between which the two diagonal short defects are non-equivalent. Therefore, between these net pairs, not all short defects are equivalent. Among the 348 net pairs, 148 net pairs have only one cell pattern causing different detections for the two diagonal short defects. For these net pairs, we can already determine the only two DDM columns without additional simulation. For the remaining 240 net pairs, we perform additional simulations for the full set of short defects. These 120,115 simulations took an additional 3.4 hours and identified 823 different DDM columns for these net-pair shorts. In total, we spent 16.7 hours simulating the short defects; 74.9% of the simulation time is for the compact set of short defects, and 25.1% is for the test quality verification and compensation. Compared with the estimated simulation time for the full set, we save 98.5% of the time cost of defect simulation.
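The split of the short-defect simulation time can be reproduced from the reported hours. The sketch below only re-derives the percentages from the numbers in the text (16.7 hours in total, 0.8 and 3.4 hours for the two verification and compensation steps); the variable names are illustrative.

```python
total_hours = 16.7
other_diagonal_hours = 0.8      # simulating the other diagonal short defect
full_set_hours = 3.4            # simulating all shorts of non-equivalent net pairs

verification_hours = other_diagonal_hours + full_set_hours
compact_hours = total_hours - verification_hours

verification_share = round(100 * verification_hours / total_hours, 1)
compact_share = round(100 * compact_hours / total_hours, 1)
```

The remaining time, about 12.5 hours, is what the characterization of the compact set itself accounts for.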

Conclusion
CAT promises to significantly reduce test-escape rates, as it targets cell-internal defects that are covered by conventional test approaches only on a serendipitous basis. The test quality and cost of CAT critically depend on the library characterization. To guarantee the test quality, we propose a PEX-based approach to identify a full set of defect locations. To reduce the simulation time, we minimize the number of defects to be simulated without a negative impact on the test quality. We collapse the full set to a compact set based on defect equivalence. For opens, this means that we consider only one defect per net branch, while for shorts, we consider only one defect per net pair. Even though simulation with the compact set maintains the test quality for a large number of defects, errors can occur in specific situations. On particular branches, the open-defect location affects the resulting delay size and may lead to different detection results, depending on the user-defined delay threshold. Between cell-internal net pairs, the short-defect location affects the voltage, and hence the logic value seen by the downstream circuit. We propose a solution to verify the test quality based on the compact set and, if necessary, to compensate the test quality to the same level as that based on the full set. In our experimental results for Cadence's GPDK045 library, defect collapsing reduces the total number of defect locations to be simulated by 96.6% compared to the number of defects in the full set. To compensate the test quality, we simulate an additional 2% of the defects from the full set and identify 823 more different DDM columns. In theory, we would reduce the simulation time by 96.4% compared with simulating the full set of defects.
Santosh Malagi is working as a software engineer with Cadence as part of their Modus ATPG R&D group. His current area of focus is cell-aware test, and he has played a significant role in developing a cell-aware test methodology for prospective Cadence customers. He has experience in executing all steps involved in cell-aware test, i.e., parasitic extraction, library characterization, and cell-aware ATPG. Santosh recently completed a master's in computer engineering at the Delft University of Technology in the Netherlands, where he became interested in VLSI testing. Previously, he was a research intern on a joint development project on cell-aware test between Cadence and IMEC in Leuven, Belgium.
Joe Swenton earned his B.S. degree in Computer Science from Binghamton University in 1996. He has worked in the EDA field since 1983 for both IBM and Cadence Design Systems, developing solutions for a variety of problems including embedded macro test, stored pattern ATPG, fault analysis and diagnostics. He is presently working in advanced fault modeling of cell internal defects for ATPG and diagnosis.
Jos Huisken is a researcher in the domain of energy-efficient VLSI design. After graduating from the University of Twente, he joined Philips Research to work on digital signal processor design. He has been involved in architectural synthesis for digital signal processors and applied these techniques to the first Digital Audio Broadcast (ETSI-DAB) ICs in the 1990s. Since then he has been driving low-power VLSI design from an architectural point of view. After investigating turbo- and LDPC-decoders, and being involved in creating Silicon Hive, a Philips spinoff company for digital signal processing and compilation, he joined Holst Centre/imec in 2008 to work on ultra-low-power DSP for wireless sensor nodes, specifically for body area networks, with a strong focus on low-voltage and low-power circuit design. After he joined RWTH Aachen in 2013, his research shifted to reliable VLSI design and Design for Error Resilience (DfER). In January 2016 he joined TU/e to give courses on VLSI circuit design, to carry out research in the fields of EEG, ultrasound, and baseband processing, and to continue his research in Design for Error Resilience.
Kees Goossens is Full Professor of Realtime Embedded Systems in the Electronic Systems group of the Department of Electrical Engineering at Eindhoven University of Technology (TU/e). He focuses on composable (virtualized), predictable (real-time), low-power embedded systems, supporting multiple models of computation. At Topic Embedded Products, Goossens works on real-time dependable dynamic partial reconfiguration in FPGAs. Early in his research, he investigated the formal verification of hardware, in particular by using semi-automated proof systems in conjunction with formal semantics of hardware description languages such as ELLA and VHDL. At NXP Semiconductors, Goossens worked on real-time networks on chip for consumer electronics, then on on-chip communication protocols and memory management. He led the team that defined the Aethereal network on chip for consumer electronics. Goossens has been editorial board member for the ACM Transactions on Design Automation of Electronic Systems (TODAES) since 2009 and associate editor for the Springer Journal of Design Automation of Embedded Systems (DAEM) since 2006, and he has been guest editor for several special issues on networks on chip. He is author of 24 patents, and he has published four books and over 175 articles, with four paper awards.