Journal of Hardware and Systems Security, Volume 2, Issue 1, pp 83–96

ESCALATION: Leveraging Logic Masking to Facilitate Path-Delay-Based Hardware Trojan Detection Methods

  • Arash Nejat
  • David Hely
  • Vincent Beroulle

Abstract

Hardware Trojans (HTs), intellectual property (IP) piracy, and overproduction of integrated circuits (ICs) are three threats that may arise in untrusted fabrication foundries. HTs are malicious circuitry changes in the IC layout. They affect side-channels (IC parameters) such as path-delay or power consumption. Therefore, HT detection methods based on side-channel analysis have been proposed. They can detect an HT only if its effects on side-channels are significant relative to the side-channel alterations caused by process and environment variations. IC design modifications at different abstraction levels have been proposed to facilitate HT detection methods after fabrication, such as modifying a circuit to make the paths of the circuit more sensitive to HTs. Such modifications are known as design-for-trust (DfTr). In addition, key-based modifications have been proposed to protect IPs/ICs from IP piracy and IC overproduction. This approach is known as masking or obfuscation, and it modifies a circuit such that it does not work correctly unless a correct key is applied. In this work, we propose a DfTr method based on leveraging the masking approach. It improves HT detection methods based on path-delay analysis. As a matter of fact, the delay of shorter paths varies less than that of longer ones. Therefore, the objective of the proposed DfTr is to generate fake short paths for nets that only belong to long paths. Our layout-level experiments show that the proposed DfTr masks the functionality of circuits and, on average, increases the HT detectability of path-delay-based detection methods by 10%.

Keywords

Hardware security, Design-for-trust, Logic masking, Hardware Trojan detection, IP/IC piracy

1 Introduction

The fabless business model has progressively become the main model within the semiconductor industry over the last two decades. Although this model reduces development costs, it faces security challenges among its roles [1], such as design houses, IP developers, system integrators, and fabrication foundries. Fabrication foundries are a particularly important role because they can take advantage of their access to the ordered layouts and insert a malicious functionality, known as a Hardware Trojan (HT). HTs may cause a failure or leak confidential information [2]. In addition, untrusted fabrication foundries can overproduce the layout or perform reverse engineering on the layout functionality and then extract and pirate the intellectual properties (IPs) of the layout [1].

HT detection is a challenging task [2, 3] because smart and skillful attackers insert hard-to-trigger and well-designed HTs into the ordered layout. In other words, HT attackers try to use low controllable signals for HT activations (i.e., the trigger part of the HT), as well as low observable signals for HT missions (i.e., the payload part of the HT). Such HTs rarely change the circuit outputs. Moreover, well-designed HTs affect the circuit side-channels, such as power or the delays of the paths of a circuit, as little as possible. In such cases, the HT effects on side-channels are not easily distinguishable because side-channels do not have a fixed value and they vary due to process variations (PV) and environment variations (EV). HT detection difficulties impose some design modifications on designers. These modifications can hinder HT attackers or facilitate HT detection methods. This approach is called design-for-trust (DfTr) [3, 4]. For instance, in order to improve path-delay-based HT detection methods, one can modify a circuit so that it has fewer paths having high delay variations. In this case, the choices for an HT attacker to insert a hard-to-detect HT are more limited. Another type of DfTr is to add calibration structures to a circuit in order to accurately measure side-channels and PV effects [4].

Key-based design modifications have been proposed to prevent IP/IC piracy [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]. Their basic approach is to modify a circuit such that it does not properly function without applying a correct key [1]; therefore, the piracy of the circuit is meaningless as long as the key is not revealed. Key-based design modifications change the finite-state machine of circuits, at the register-transfer level (RTL), and/or in the combinational part of circuits, at the gate level (GL). Authors have used different names for this approach according to the implementation in GL or RTL, such as logic/state obfuscation [6, 7, 8], logic/state encryption [10, 13], logic/state locking [11, 12, 13], and logic/state masking [5, 14]. Brice et al. discussed that the masking term is more accurate [15], and it is used hereafter.

The second benefit of key-based modifications is to hinder HT insertion, so they are considered as DfTr. Indeed, during the fabrication stage, if HT designers do not know the correct key and functionality, they have more difficulty designing a hard-to-detect and efficient HT. For instance, a poorly designed HT might be activated by a sequence of events that never happens if the circuit is fed by the correct key [6, 16].

In [17], we proposed a new benefit of key-based modifications. The proposed approach is to leverage logic masking to improve path-delay-based HT detection methods. We called the proposed approach “ESCALATION: rEusing logic maSking to faCilitate pAth-deLay-bAsed Trojan detectION”. ESCALATION masks the functionality (combinational part) of circuits while improving the efficiency of path-delay-based HT detection methods. Apart from logic masking benefits, ESCALATION’s objective is to generate fake short paths for nets that only belong to long paths, because shorter paths have smaller delay variation [18]. In [17], we reported the improvement of HT detection probability (HDP) obtained by the proposed algorithm, for a few technology-mapped circuits. In addition, the masking quality of the proposed approach was compared with one masking algorithm proposed in [6], namely HARPOON.

In this work, we propose and validate an ESCALATION-based algorithm. The aim of this algorithm is to mask the circuit logic such that each net in the modified circuit belongs to at least one short-enough path, where path-delay analysis can be successfully performed. The contributions of this paper, in comparison to our work in [17], are as follows:
  • We detail the proposed algorithm accompanied by three ideas for improvement. The ideas are explained and shown in examples.

  • We compare the results and overheads of the proposed algorithm with trust-driven retiming (TDR) proposed in [18]. TDR has the same objective as the proposed algorithm, but TDR does not make any changes in or mask the original functionality. The comparisons are performed at the gate level because the TDR results and overheads are reported at the gate level.

  • We validate the HT detection efficiency of the proposed algorithm at the layout level. Our experimental results show that our algorithm improves the HDP based on path-delay analysis. It also provides logic masking.

  • We validate the logic masking efficiency of the proposed algorithm. We calculate the Hamming distance (HD) between the outputs of masked circuits under the correct key and under wrong keys. We use HD, as a usual metric, to compare the masking quality of our algorithm with that of previous work. The results show that the proposed algorithm achieves better HD than the random masking presented in the following sections.

The rest of the paper is organized as follows: In Section 2, we survey related work on both HT detection and logic masking methods. In Section 3, we explain the ESCALATION approach and an algorithm based on it. In Section 4, we evaluate the efficiency and the costs of our methodology. The experimental results are given in Section 5. We present layout-level validation in Section 6. Section 7 includes discussions about logic masking in general, and the ESCALATION approach, such as how one can use logic masking in million-gate circuits, and how one can improve HD in the proposed algorithm. Finally, in Section 8, we draw some conclusions and propose future perspectives on this work.

2 Related Previous Work

In this section, the flow of side-channel-based HT detection methods is first explained. Then, previous work on path-delay-based HT detection methods and masking algorithms is reviewed to introduce the concepts used in the proposed algorithm.

2.1 The Flow of Side-Channel-Based Hardware Trojan Detection

HT attackers try to design hard-to-trigger HTs that rarely change the circuit outputs; however, HTs always have effects on side-channels. For instance, the HT in Fig. 1 is activated if a specific combination of signals and state elements happens at the trigger part; and then, the HT changes the value of one or more signals (this is the payload part). Imagine the dotted net in Fig. 1 is a net in the HT-free circuit, and the attacker has replaced it with a new net and gate. This HT adds delays to several paths even when it does not have any activity. The trigger part increases the fan-out loads of the nets that drive the HT inputs, and consequently the delay of these nets. The payload part adds at least one level of gate to the paths that include the HT output(s).
Fig. 1

An abstract model of Hardware Trojan including the trigger and payload parts. The trigger part is fed by the combinational part and some flip-flops. The payload part is an AND gate

In order to identify the HT effects on a side-channel, one needs to generate and apply some test vectors. For instance, the authors of [19, 20] proposed approaches and hardware structures for path-delay measurement and HT detection; they used the test vectors generated by traditional path-delay-fault detection algorithms.

The flow of HT detection methods based on side-channel analyses has four main steps detailed in [2]. First of all, sufficient and efficient input vectors must be generated. This is very challenging because these vectors must be as few as possible and also stimulate all of the circuit elements. For instance, with path-delay analysis, the test vectors must cover a collection of paths in such a way that each net of the circuit belongs to at least one path of the collection. The second step consists in applying the generated test vectors to the ICs and measuring the signals of the targeted side-channel. These signals are used as a fingerprint for each IC. When the targeted side-channel is path-delay, the generated test vectors are applied to the ICs and the selected path-delays are measured. Afterwards, the variation of the targeted side-channel must be calculated. This variation is referred to as the “Golden reference”.

The Golden reference can be derived from a few genuine ICs (Golden ICs). These ICs can be found by randomly selecting a few ICs and then performing destructive reverse engineering (including encapsulating, delayering, imaging, and image processing [2]). Later research has shown that reverse engineering can be avoided by adding some embedded structures to the circuit and using them to obtain the IC fingerprints and side-channel variations. This approach is known as Golden-free HT detection [21]. The final step is to compare the other ICs’ fingerprints with the Golden reference.

2.2 Path-Delay-Based Hardware Trojan Detection

Jin et al. proposed delay fingerprinting in earlier work [22]. They simulated the presence of PV by assuming a 15% variation in the parameters of a 130 nm technology. Their simulation results show that their approach can detect the HTs that they inserted. However, more PV must be taken into account with newer technologies.

The authors of [20, 23] proposed leveraging a path-delay measurement structure, named Shadow Register, to detect HTs. The implementation of Shadow Register includes duplicating the clock signals and all the flip-flops (FF) of the circuit. The input of each original FF feeds the added (shadow) FF. The shadow FFs capture their input using the added clock signal. The added and original clock signals have the same frequency, with a specific phase difference. The phase difference should be equal to the maximum path-delay variation. In this case, if an FF and its shadow have two different values, one can suspect the presence of an HT. The comparison between each FF and its shadow FF requires an additional XOR gate. The results show that the area overhead and resolution of this structure, as well as HT detectability, are very high [20, 23].

Nejat et al. have proposed using another structure with the same objective [19]. This structure calls for just one additional multiplexer for each FF in the circuit. The results show that this structure calls for less area overhead, and lower delay-measurement accuracy in comparison to the shadow register-based measurement approach [19].

Cha et al. have proposed the use of a ring oscillator (RO) in addition to the shadow register structure [20]. In this work, an RO is used as a calibration device. Indeed, accurate calibration is necessary to remove (or at least decrease) process variation effects and then be able to detect HT side-channel effects. Briefly explained, process variations have two different components: die-to-die variation components, which alter side-channels in each instance of an IC; and within-die variation components, which alter side-channels at different points of an IC [24]. For instance, due to die-to-die variations, the frequency of one RO is different in each instance of the IC. Die-to-die variations affect all ROs of one IC in the same way, wherever they are placed. If, due to die-to-die variations, an RO in an IC is X% slower (or faster) than the expected value, one can be sure all ROs (with different sizes and structures) in the IC are X% slower (or faster). Thus, one can calculate the die-to-die variation effect on path-delay using one RO, and then be sure the rest of the path-delay variation is due to within-die variation or HT effects.
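The calibration argument above can be illustrated with a short numerical sketch. It is not the procedure of [20]; it assumes that "X% slower" refers to the RO period and that the same multiplicative factor applies to every path on the die. The function names and the numbers are hypothetical.

```python
def die_to_die_scale(measured_ro_period_ns, nominal_ro_period_ns):
    """Factor by which this die is slower (>1) or faster (<1) than nominal.

    Assumption: one RO is enough, because die-to-die variation scales
    every delay on the same die by the same factor.
    """
    return measured_ro_period_ns / nominal_ro_period_ns


def residual_path_delay(measured_path_delay_ns, nominal_path_delay_ns, scale):
    """Delay left once the die-to-die component is removed.

    What remains is attributed to within-die variation or to an HT.
    """
    return measured_path_delay_ns - nominal_path_delay_ns * scale


# Hypothetical measurements: the RO is 5% slower than expected.
scale = die_to_die_scale(measured_ro_period_ns=1.05, nominal_ro_period_ns=1.00)
extra = residual_path_delay(measured_path_delay_ns=2.20,
                            nominal_path_delay_ns=2.00,
                            scale=scale)
print(f"die-to-die scale: {scale:.2f}, unexplained delay: {extra:.2f} ns")
```

The unexplained delay still has to be compared against the expected within-die variation of the path before an HT can be suspected.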

Shadow registers empower defenders to investigate HTs in shorter paths. The results in [20] show that shorter paths are better choices than long ones for delay-based HT investigations. Due to within-die variations, the delays of two specific gates are different in two different places of an IC. The two gates differ less if they are close to each other [24]. If the gates of a path are placed very far from each other, they are more different and they create a long path. This is the reason for less delay variation in shorter paths. Shekarian et al. theoretically proved this fact [18]. Thus, in order to enhance the HDP of path-delay-based HT detection methods, shorter paths should be investigated instead of longer ones. The experiments in [19] also show that performance-driven technology mapping increases the success of path-delay-analysis-based HT detection methods because it generates shorter paths.

ROs have also been used with the aim of HT detection based on path-delay analysis. With this method, all the circuit gates must belong to at least one RO. RO frequency deviation warns of the presence of HTs. In order to have less area overhead, RO-embedding algorithms aim to cover all of the circuit gates with fewer ROs. Decreasing the number of ROs increases RO lengths [25]. Charles et al. proposed a delay chain, named REBEL, which bypasses circuit FFs in order to measure the delay of different parts of the circuit and then detect HTs [26]. Whereas REBEL and RO-based structures have less area overhead in comparison to the earlier-mentioned structures, they make long paths for investigating HT. Long paths (or long ROs) have more delay variations; thus, their HT detection probabilities (HDP) are lower.

Shekarian et al. proposed a new DfTr approach based on a trust-driven retiming algorithm, named TDR [18]. The aim was to reduce the number of vulnerable points. Vulnerable points are nets that only belong to long paths. For each net (N) of the circuit (C), there are some paths (P) that pass through the net (N). Accordingly, the definition of the most vulnerable net (MVN) of a circuit with n nets is obtained by the function f:

Definition 1

$$ {\displaystyle \begin{array}{l} SP(N)=\operatorname{Min}\left\{\forall P\in C:P\;\mathrm{is}\kern0.17em \mathrm{passing}\;N\right\}\\ {} MSP(C)=\operatorname{Max}\left\{{SP}_i\left({N}_i\right)\left|{SP}_i\&{N}_i\in C,i=1\dots n\right.\right\}\\ {}f: MVN\mapsto MSP\\ {}f=\left\{\exists N:N\in \kern0.5em \mathrm{the}\kern0.17em \mathrm{nets}\kern0.17em \mathrm{of}\kern0.5em MSP,N\notin \kern0.5em \mathrm{the}\kern0.17em \mathrm{nets}\kern0.17em \mathrm{of}\kern0.17em \mathrm{other}\kern0.5em {SP}_i\right\}\end{array}} $$

In Def. 1, the MSP of a circuit is the maximum (longest) path of the shortest-paths (SP). In other words, MSP is the shortest path of the most vulnerable net, and its value is greater than that of the shortest path of other nets.
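As an illustration of Def. 1, the sketch below computes SP for every net of a small made-up netlist and then derives the MSP and the most vulnerable nets. It assumes a unit delay per gate and an acyclic combinational netlist; the netlist, the net naming (a net is named after its driver), and the function names are hypothetical and are not taken from [18].

```python
from functools import lru_cache

# Hypothetical netlist: gate -> list of its drivers (primary inputs or gates).
FANIN = {
    "A": ["PI1"], "B": ["A"], "C": ["B"], "D": ["C"],  # long chain to a PO
    "E": ["PI2", "B"],                                  # short side branch
}
PO_DRIVERS = {"D", "E"}  # gates whose output net is a primary output


def shortest_path_per_net(fanin, po_drivers):
    """Return {net: SP(net)} counted in gates (unit-delay assumption)."""
    fanout = {}
    for gate, drivers in fanin.items():
        for driver in drivers:
            fanout.setdefault(driver, []).append(gate)

    @lru_cache(maxsize=None)
    def dpi(net):
        # Gates between the nearest primary input and this net.
        if net not in fanin:          # the net is a primary input
            return 0
        return 1 + min(dpi(d) for d in fanin[net])

    @lru_cache(maxsize=None)
    def dpo(net):
        # Gates between this net and the nearest primary output.
        best = 0 if net in po_drivers else float("inf")
        for gate in fanout.get(net, []):
            best = min(best, 1 + dpo(gate))
        return best

    nets = set(fanin) | set(fanout)
    return {net: dpi(net) + dpo(net) for net in nets}


def msp_and_vulnerable_nets(sp):
    """MSP is the largest SP; the nets that reach it are the most vulnerable."""
    msp = max(sp.values())
    return msp, sorted(net for net, value in sp.items() if value == msp)


sp = shortest_path_per_net(FANIN, PO_DRIVERS)
print(msp_and_vulnerable_nets(sp))
```

In this toy netlist, nets on the side branch have short paths, while the nets driven by C and D only belong to the four-gate chain and therefore set the MSP.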

The TDR algorithm reduces the MSP value by adding extra FFs. It increases the HDP of path-delay-based HT detection methods but since functionality is not changed and no logic/state masking is generated, it does not prevent IP/IC piracy. ESCALATION, on the other hand, does protect circuits against IP/IC piracy as it uses the logic masking approach. It also uses the potential of masking to improve the HDP of path-delay analysis-based HT detection methods.

2.3 Logic Masking

Key-based modifications have been proposed against IP/IC piracy and overproduction threats. A masked circuit has a masked mode and a functional mode. In the masked mode, the circuit has an incorrect key and consequently generates an incorrect functionality. In the functional mode, the key is correct and the circuit works correctly [1].

Key-based modifications include modifying the finite-state machine (FSM), at RTL, and the combinational part of a circuit, at the gate level. FSM modification is known as state masking (or state obfuscation). The modified FSM has a few more flip-flops (FF) added to extend the state space. It also has a new start state on power up, and one needs to know a specific input sequence to put the circuit back to the original start state. Modifications of the combinational part, known as logic masking, change the state transition graph and the outputs of the circuits. They modify the circuit such that even if an attacker can reach one of the original states, by chance, the circuit outputs and the transition to the next state wrongly occur.

XOR/XNOR gates are used for the majority of proposed logic masking methods. These methods are based on two steps. The first step, common to all combinational masking methods, is a random selection between XOR and XNOR gates (named key-gates). For each added key-gate, the designer also adds one input (named key-input) and connects it to one input of the key-gate. The selected key-gate determines the correct key value. For XOR (XNOR), the correct key is 0 (1). The second step is to connect the other input of each key-gate to one of the nets of the combinational part of the circuit. Instead of the selected net, the key-gate output drives the cell(s) that was (were) driven by the net. The net can be selected using different algorithms and with different objectives [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]. Figure 2 shows a masked circuit. It has extra FFs that expand the state space, and a tamper-resistant memory that stores the key and feeds the key-gates. One XOR and one XNOR key-gate are shown in the figure below.
Fig. 2

A masked circuit including extra FFs to expand the state space, two key-gates (K1 and K2), two key-inputs, and a tamper-resistant memory that feeds the key-gates

The use of MUX primitives as key-gates has been proposed instead of XOR/XNOR gates [10, 11]. The select input of the MUX plays the key-input role. One of the two MUX data inputs is randomly selected and connected to a net of the combinational part. Net selection is done according to the objective of the masking algorithm. The choice of this input pin defines the correct value of the key. The MUX passes the correct input if its select input is fed by the correct value; otherwise, the MUX can generate a wrong value coming from another part of the circuit.
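The behaviour of the two key-gate families described above can be sketched as follows. This is a behavioural model on single bits, not a netlist transformation, and the function names are hypothetical. It only encodes what the text states: an XOR key-gate is transparent for key 0, an XNOR key-gate for key 1, and a MUX key-gate for the select value that picks the pin wired to the true net.

```python
import random


def make_xor_key_gate(rng):
    """Randomly choose XOR or XNOR; return (gate_function, correct_key_bit)."""
    if rng.random() < 0.5:
        return (lambda net, key: net ^ key), 0        # XOR: transparent when key == 0
    return (lambda net, key: 1 - (net ^ key)), 1      # XNOR: transparent when key == 1


def make_mux_key_gate(rng):
    """Randomly choose the data pin for the true net; the select pin is the key."""
    correct_key = rng.randrange(2)

    def mux(true_net, decoy_net, key):
        # The true net sits on pin `correct_key`, the decoy net on the other pin.
        pins = (true_net, decoy_net) if correct_key == 0 else (decoy_net, true_net)
        return pins[key]

    return mux, correct_key


rng = random.Random(1)
gate, good_key = make_xor_key_gate(rng)
print("correct key:", gate(1, good_key), "wrong key:", gate(1, 1 - good_key))
```

With a wrong key, the XOR/XNOR key-gate always inverts its net, whereas the MUX key-gate forwards the decoy net, so its output is wrong only when the decoy and the true net differ.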

Roy et al. have proposed a random net selection for key-gate insertion [27]. We call this method “random masking”. Whereas this approach is very easy to implement, there is no guarantee that the circuit will malfunction for all wrong keys. Muhammad et al. have presented another logic masking method in which key-gates have more effect on each other [12]. They use fault excitation/propagation knowledge to insert the key-gates in such a way that the attackers cannot use the inputs to excite and propagate the corruption effect of one key-gate only. This hinders the attackers’ effort to reveal the key by structural analysis of the masked circuit.

Subhra et al. proposed an algorithm, namely HARPOON, focusing on the fan-in and fan-out cones of a circuit [6]. Their objective was to maximize the mismatch points between a masked circuit and its original circuit. A mismatch point is likely to create an inversion. If an inversion arrives at the circuit outputs, it consequently produces an incorrect output. Obviously, the output failure might depend on input vectors. As a result, the number of failing patterns can be used as a simple metric (criterion) for masking qualification. However, this metric is not very accurate since it counts failed outputs even with single-bit failures.

If a fab that fabricates a masked circuit finds an activated IC including the masked circuit in the market, then the fab can guess the correct key by comparing the masked-layout output bits (for different primary input and key-input vectors) with the correct output bits of the activated IC. The correct output bits, in each trial of the attack, lead the attacker to find the correct key [10]. Thus, an effective masking approach must produce nearly equal numbers of correct and incorrect output bits when it is driven by wrong keys. In other words, the Hamming distance (HD) between the correct and incorrect output bits (when a circuit is fed by incorrect keys) should be close to 50%. Rajendran et al. also mathematically proved that with 50% HD, the attacker faces the most difficult situation for key guesses [10]. In this context, 0% HD means that the masked circuit outputs are always correct and independent of the key value. Conversely, 100% HD means that all output bits are always wrong (and equal to the inversion of the correct value) when wrong keys are applied to the key-inputs, so the correct outputs can be recovered by simple inversion. Thus, neither extreme is suitable for masking.
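The HD metric discussed above can be made concrete with a toy example. The two-output circuit below is hypothetical and chosen only so that the result can be checked by hand: one output carries an XOR key-gate and the other is left unmasked. The sketch enumerates all input vectors exhaustively, which is only feasible for very small circuits.

```python
from itertools import product


def toy_masked_circuit(a, b, key):
    """Hypothetical masked circuit with one XOR key-gate (correct key is 0)."""
    out0 = (a & b) ^ key   # masked output: inverted whenever key == 1
    out1 = a | b           # unmasked output: never affected by the key
    return (out0, out1)


def hamming_distance_percent(circuit, n_inputs, correct_key, wrong_key):
    """Percentage of output bits that differ between the two keys."""
    differing = total = 0
    for vector in product((0, 1), repeat=n_inputs):
        good = circuit(*vector, correct_key)
        bad = circuit(*vector, wrong_key)
        differing += sum(g != w for g, w in zip(good, bad))
        total += len(good)
    return 100.0 * differing / total


print(hamming_distance_percent(toy_masked_circuit, 2, correct_key=0, wrong_key=1))
```

Here exactly one of the two output bits is wrong for every input vector, which gives an HD of 50%. Masking both outputs with XOR key-gates would push the HD to 100%, the other unsuitable extreme described above.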

If a key-input has the wrong value, its key-gate creates a fault (an inversion). The fault can propagate through the circuit. This behaves as if the value at the key-gate location in the original circuit were ‘1’ (‘0’) and there were a stuck-at-‘0’ (‘1’) fault at that location. Based on this similarity, Rajendran et al. proposed an algorithm based on knowledge of fault sensitization and propagation [10]. In their fault-based analysis (FBA) algorithm, a net has a high fault-impact if we can easily sensitize a stuck-at fault on this net, and also easily propagate the fault to the circuit outputs. In each iteration, the proposed algorithm greedily searches all nets to find the one with the highest fault-impact. Instead of fault analysis, Samimi et al. proposed a similar algorithm that directly checks the key-gate effects on the output HD [14]. Both algorithms achieve high HD, but they face a scalability problem because in each iteration, the designer has to apply many random input vectors to calculate the key-gate effect or fault-impact.

Logic masking can be a good DfTr; however, its first goal is to hinder IP/IC pirates. Samimi et al. tried to remove rare signals using key-gates [29]. Such signals barely have transitions during IC operation either in the normal mode or test mode. They are good choices to feed an HT (so that the HT is rarely activated, in particular, during circuit testing). Dupuis et al. proposed the use of AND/NAND/OR/NOR gates instead of XOR/XNOR key-gates in order to decrease the number of low controllable nets [28]. Removing rare and low-controllable signals does not guarantee that HTs would be fully activated and detected, because activation may depend on a lot of nets, with normal activity, such that only one rare combination of them activates the HT. It is noteworthy that if a circuit has no rare or low-controllable signals, making a rare event using normal events can be difficult; and HT attackers probably need a bigger trigger part. The bigger the HT, the more side-channel deviation appears. Thus, removing rare signals hinders HT attackers.

3 Escalation-Based Algorithm

3.1 Escalation

As mentioned in Section 2, an HT might increase the delay of some paths. These paths can include either the payload part of the HT or the nets that drive the trigger part. Note that most of the nets in a circuit usually belong to more than one path. Furthermore, it was already mentioned that the additional delay is less detectable if it is investigated among long paths. Consequently, a net cannot be a suitable choice for the HT attacker if the net belongs to at least one short path. In other words, for each net, the shortest path passing the net is the best option to investigate a potential HT interacting with the net. This path selection approach can be used in all path-delay-based HT detection methods. However, there may be nets whose shortest path is not short enough for high HT detection probability. As a result, in ESCALATION, we aim to generate short fake paths using key-gates for such nets.

As in most logic masking methods, in the ESCALATION approach, modifications are done at the gate level and in the combinational part of circuits. The key-gates used in ESCALATION are XOR/XNOR and MUX cells. The ESCALATION flow has three main steps. In the first step, for each net, the shortest path passing the net is found and selected. The second step is to sort the selected paths and find the longest one (MSP). The MSP has the least HDP among all the selected paths, and it includes the most vulnerable net. The third step is to insert a key-gate such that the most vulnerable net belongs to a new path shorter than the MSP. These three steps continue until all nets belong to at least one short-enough path or until all key-gates are inserted.

3.2 Examples

In order to show how ESCALATION works, and highlight some effective empirical hints, examples are presented below. The examples are at the gate level. The unit delay model is used in these examples. It is thus assumed that each cell in the netlist has 1 delay unit, and the interconnection delays are negligible.

Example 1: A path from a primary input (PI) to a primary output (PO) is shown in Fig. 3a. Imagine that this path is the MSP of a circuit, and the shortest path for net ‘N4’ (the most vulnerable net). In the original circuit, gate ‘G’ immediately precedes ‘PO1’; and net ‘N3’ connects the output of gate ‘C’ to an input of gate ‘D’. Thus, this path in the original circuit has six delay units and runs from primary input ‘PI1’, through the cells {A, B, D, E, F, G}, ending up at primary output ‘PO1’. In this figure, the designer has added the XOR ‘X1’ and the MUX ‘M1’ to make a shorter path for ‘N4’. The new path starts from key-input ‘KI2’, and bypasses gates ‘A’ and ‘B’. In addition, cells ‘F’ and ‘G’ are bypassed by ‘M1’ and a fan-out to ‘N5’, shown by the fan-out ‘F1’. As a result, the new path is generated with four delay units, running from ‘KI2’, through cells ‘X1’, ‘D’, ‘E’, and ‘M1’, and ending up at ‘PO1’.
Fig. 3

Four paths from a primary input to a primary output. a The shortest path of net N4, from PI1 to PO1 in the original circuit, and from KI2 to PO1 in the masked circuit. b The shortest path of nets N2 and N5, from PI2 to PO2 in the original circuit, and an XOR and a MUX that are of no use in the masked circuit because they cannot make shorter paths for N2 and N5. c The shortest path for N3 and N5, from PI3 to PO3 in the original circuit, and an added MUX in the masked circuit that makes a shorter path for N3 but not for N5. d The shortest path of net N4, from PI4 to PO5, and the shortest path for N5, from PI4 to PO4, in the original circuit. X3 makes the shortest paths for both N4 and N5

In order to efficiently insert XOR/XNOR or MUX key-gates, three important ideas are considered in the proposed algorithm:
  1. Inserting an XOR/XNOR key-gate makes a shorter path if the delay of the nearest PI to the target net (Dpi) is greater than the delay of the target net to the nearest PO (Dpo) in the MSP. Otherwise, it is better to use a MUX key-gate.

  2. Each inserted key-gate adds a delay unit (one level of gate) to all the paths that include the inserted key-gate. This may defeat the purpose. In order not to have this problem, the defender should avoid inserting key-gates in selected paths longer than the maximum accepted MSP.

  3. There is often more than one vulnerable net in the circuit. Thus, key-gate insertion should be done in such a way as to decrease the number of vulnerable nets as much as possible.

These problems and ideas are shown in the following examples and illustrated in Fig. 3.

Example 2: assume the path in Fig. 3b, including {PI2, A, B, C, D, E, F, PO2}, is the MSP with six delay units in the original circuit. In addition, nets ‘N2’ and ‘N5’ are the most vulnerable nets. In this figure, ‘X2’ and ‘M2’ are added to make shorter paths for ‘N2’ and ‘N5’. As shown, the insertion of ‘X2’ before ‘B’ does not generate any shorter paths for ‘N2’. Likewise, the insertion of ‘M2’ before a PO and making a fan-out from ‘N6’ does not make any shorter paths for ‘N5’. For a net like ‘N2’, whose Dpi is smaller than its Dpo, a MUX must be used. Otherwise, as for ‘N5’, an XOR/XNOR key-gate must be used.

Example 3: assume that the path in Fig. 3c, including {PI3, A, B, C, D, E, PO3}, is the MSP with five delay units in the original circuit. In addition, nets ‘N3’ and ‘N5’ are the most vulnerable nets. In this case, the insertion of MUX ‘M3’ before PO3 and making a fan-out from ‘N4’ makes a shorter path for ‘N3’ including {PI3, A, B, C, M3, PO3}. It has 4 delay units. But this modification adds a delay unit to the shortest path of ‘N5’. This path now includes {PI3, A, B, C, D, E, M3, PO3} and has 6 delay units. As a result, such POs must not be used.

Example 4: assume that the path in Fig. 3d, including {PI4, A, B, C, D, F, G, PO4}, is the MSP with 6 delay units in the original circuit. In addition, net ‘N5’ is the most vulnerable net. Also assume that the shortest path of ‘N4’ has 5 delay units. It includes {PI4, A, B, C, D, E, PO5}. If ‘X3’ were inserted before ‘F’, a short path with 3 delay units, including {KI6, X3, F, G, PO4}, would be made for ‘N5’, but the shortest path passing ‘N4’ would then become the MSP with 5 delay units. As seen in Fig. 3d, ‘X3’ is instead inserted before ‘D’; thus, the shortest paths for ‘N5’ and ‘N4’ have 4 and 3 delay units, respectively. They include {KI6, X3, D, F, G, PO4} and {KI6, X3, D, E, PO5}. The tip from this example is that it is sometimes better to consider a few paths other than just the MSP: here, it is better to insert the XOR key-gate before ‘D’ to solve the MSP problem, because the shortest path of ‘N4’ is a potential MSP.
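The arithmetic of Example 4 can be checked with a few lines of code. The sketch below applies the unit delay model to the paths listed in the example: every cell counts as one delay unit, and the PI, KI, and PO endpoints count as zero. The helper names and the dictionary layout are hypothetical.

```python
def delay_units(path):
    """Number of cells on a path; PI*, KI* and PO* endpoints carry no delay."""
    return sum(1 for node in path if node[:2] not in ("PI", "KI", "PO"))


def msp(shortest_paths):
    """Longest of the per-net shortest paths."""
    return max(delay_units(path) for path in shortest_paths.values())


# Shortest paths of N5 and N4 as listed in Example 4.
original = {
    "N5": ["PI4", "A", "B", "C", "D", "F", "G", "PO4"],
    "N4": ["PI4", "A", "B", "C", "D", "E", "PO5"],
}
x3_just_before_f = {
    "N5": ["KI6", "X3", "F", "G", "PO4"],
    "N4": ["PI4", "A", "B", "C", "D", "E", "PO5"],   # unchanged by X3
}
x3_as_in_fig_3d = {
    "N5": ["KI6", "X3", "D", "F", "G", "PO4"],
    "N4": ["KI6", "X3", "D", "E", "PO5"],
}

for name, paths in [("original", original),
                    ("X3 just before F", x3_just_before_f),
                    ("X3 as in Fig. 3d", x3_as_in_fig_3d)]:
    print(name, {net: delay_units(p) for net, p in paths.items()}, "MSP =", msp(paths))
```

The placement closest to ‘N5’ gives ‘N5’ the shortest path but leaves an MSP of 5, while the placement of Fig. 3d serves both nets and lowers the MSP to 4.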

3.3 Proposed Algorithm

In this section, we explain an algorithm based on the ESCALATION approach. Algorithm 1 shows the pseudocode of the proposed algorithm. The algorithm takes a gate-level netlist, a number as the key length (the number of key-gates), and an integer as the targeted MSP value (line 1). At first, the shortest path for each net of the circuit is found and stored in the set SIPSet (line 2). To find the shortest path for each net, a breadth-first search (BFS) is done from the target net to the primary inputs and primary outputs.

A structure is used to store the information on the shortest path (SIPInfo) for each net. It includes two pointers to the selected path (SP) and the target net (TN), and also two variables. One variable contains the delay from the net to its nearest PI (Dpi), and the other one contains the delay of the path from the net to its nearest PO (Dpo). The summation of these two variables is the delay of the path (Value). In order to find MSP, the paths in SIPSet must be sorted according to their delay (Value). MSP is stored in MSPSet [0]. In addition, the potential MSPs are gathered in a set named MSPSet (line 3). The potential MSPs are the paths that belong to SIPSet and have delays greater than the targeted delay for the MSP value.
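The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the circuit is a hypothetical fan-out dictionary in which each edge crosses one gate (unit delay model), the names `SIPInfo`, `shortest_paths`, and `potential_msps` are chosen here for readability, and one multi-source BFS from all PIs (and one from all POs) replaces the per-net BFS of the paper, which yields the same Dpi and Dpo values.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class SIPInfo:
    """Shortest-path record of one net (field names are illustrative)."""
    net: str
    dpi: int  # unit delays from the nearest primary input
    dpo: int  # unit delays to the nearest primary output

    @property
    def value(self) -> int:
        return self.dpi + self.dpo


def bfs_levels(sources, neighbours):
    """Multi-source BFS; returns the hop count from the nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in neighbours.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def shortest_paths(fanout, primary_inputs, primary_outputs):
    """Build the SIPSet: one SIPInfo per net reachable from a PI and a PO."""
    fanin = {}
    for u, sinks in fanout.items():
        for v in sinks:
            fanin.setdefault(v, []).append(u)
    dpi = bfs_levels(primary_inputs, fanout)   # forward search
    dpo = bfs_levels(primary_outputs, fanin)   # backward search
    return [SIPInfo(n, dpi[n], dpo[n]) for n in dpi if n in dpo]


def potential_msps(sip_set, targeted_msp):
    """Paths longer than the target, longest first (element 0 is the MSP)."""
    over = [s for s in sip_set if s.value > targeted_msp]
    return sorted(over, key=lambda s: s.value, reverse=True)


# Hypothetical toy netlist: pi1 -> a -> b -> c (a PO), and pi2 -> c.
fanout = {"pi1": ["a"], "a": ["b"], "b": ["c"], "pi2": ["c"]}
sips = shortest_paths(fanout, ["pi1", "pi2"], ["c"])
print(max(s.value for s in sips))  # MSP value of the toy netlist: 3
```

In this toy netlist, net `b` only belongs to the three-gate path, so it is one of the nets that a key-gate insertion would target when the targeted MSP value is 2.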

The next step of the algorithm is a loop that includes the key-gate insertion functions (lines 8 and 10). As shown in example 1, if Dpi is greater than Dpo, many nets belonging to the fan-in cone of the target net can be used to insert an XOR/XNOR key-gate. Likewise, there are many candidates for inserting a MUX key-gate in the fan-out of the targeted net if Dpo is greater than Dpi. Please note that there is no need to search all the nets in the cones. A BFS can be performed in the cones with the maximum DeepSearch according to (1) (line 6):

$$ \mathrm{DeepSearch}=\begin{cases}\mathrm{TargetedMSP}-\mathrm{Dpo}-1 & \mathrm{if}\;\mathrm{Dpi}\ge \mathrm{Dpo}\\ \mathrm{TargetedMSP}-\mathrm{Dpi}-1 & \mathrm{if}\;\mathrm{Dpi}<\mathrm{Dpo}\end{cases} $$
(1)
The XOR insertion function (line 8) is then executed if Dpi is greater than Dpo; otherwise, the MUX insertion function (line 10) is executed. In the cones, the most appropriate net for key-gate insertion is the one that removes the most nets from MSPSet. This is shown in example 2. If an XOR/XNOR key-gate is to be inserted, Dpi should be recalculated for all nets that belong to the fan-out cone of the selected net. Likewise, if a MUX key-gate must be inserted, Dpo should be recalculated for all nets that belong to the fan-in cone of the selected net. There is no need to recalculate all the nets in the cones. A BFS can be performed on the cones with the maximum DeepRecalculate according to (2) (line 11):
$$ \mathrm{DeepRecalculate}=\begin{cases}\mathrm{TargetedMSP}-\mathrm{Dpi}-1 & \mathrm{if}\;\mathrm{Dpi}\ge \mathrm{Dpo}\\ \mathrm{TargetedMSP}-\mathrm{Dpo}-1 & \mathrm{if}\;\mathrm{Dpi}<\mathrm{Dpo}\end{cases} $$
(2)
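As a small illustration, the two depth bounds in (1) and (2) and the key-gate choice described in the text can be written as plain functions. The function names are chosen here for readability and do not come from Algorithm 1; the delays are integers under the unit delay model.

```python
def deep_search(targeted_msp: int, dpi: int, dpo: int) -> int:
    """Maximum BFS depth when looking for an insertion net, as in (1)."""
    if dpi >= dpo:
        return targeted_msp - dpo - 1
    return targeted_msp - dpi - 1


def deep_recalculate(targeted_msp: int, dpi: int, dpo: int) -> int:
    """Maximum BFS depth when refreshing Dpi or Dpo after insertion, as in (2)."""
    if dpi >= dpo:
        return targeted_msp - dpi - 1
    return targeted_msp - dpo - 1


def key_gate_type(dpi: int, dpo: int) -> str:
    """XOR/XNOR when the net is far from the PIs, otherwise a MUX."""
    return "XOR" if dpi > dpo else "MUX"


# Hypothetical net with Dpi = 4 and Dpo = 2, targeted MSP value of 5.
print(key_gate_type(4, 2), deep_search(5, 4, 2), deep_recalculate(5, 4, 2))
```

For this hypothetical net, the search for an insertion point is limited to two levels of the fan-in cone, which shows why the cones explored inside the loop stay small.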

After a key-gate is inserted, SIPSet is updated (line 14), and again, MSP and the potential MSPs of the modified circuit are identified and stored in MSPSet (line 15). The loop is finished when MSPSet is empty or the number of inserted key-gates is more than the key-length (line 5).

The time and memory complexities of the proposed algorithm depend on the BFSs done to find the shortest path for each net. Inside the loop, BFSs are also performed to find the most appropriate net for key-gate insertion, but the number of nets in the cones (considering DeepSearch and DeepRecalculate) is much smaller than the number of nets in the circuit, so their cost is negligible. The other parts of the algorithm have a fixed cost. Assuming we are working with an extracted graph of the circuit, a BFS takes O(b^(d+1)) time and memory [30], where b is the branching factor (equal to the number of nodes in the biggest logic cone) and d is the distance from the starting node (equal to the maximum logic depth in the circuit). As a BFS must be done for each net and in each iteration, the time complexity of ESCALATION is O(nkb^(d+1)), where n and k are the number of nets in the circuit and the number of inserted key-gates, respectively. In a typical case, as n >> b >> d, the ESCALATION complexity can be estimated as O(kn). This means that, for a given key length, the time complexity of the ESCALATION algorithm increases linearly with the size of the circuit.

4 Measuring ESCALATION Efficiency

In the previous section, we presented an algorithm for key-gate insertion based on the ESCALATION approach. Since ESCALATION has two aims, there are two criteria that must be considered. First, how much logic masking is obtained. This is the primary aim in all masking methods. Second, how much the HDP of path-delay-based detection methods can be improved, an important goal of the ESCALATION approach. In addition, the key-gate overheads such as area and circuit performance must be noted.

4.1 Masking Quality

In Section 2, it was mentioned that the number of output failures can be a simple metric to measure masking quality. The more output failures there are, the more masking is obtained. However, a more accurate metric is the average Hamming distance (HD) between the output bits obtained with the correct key vector and those obtained with wrong key vectors. As a result, the masking quality of two circuits obtained by applying two different masking methods to one circuit can be evaluated by comparing these masking metrics.
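The HD metric can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: `toy_circuit` is a hypothetical two-input, two-output circuit with a two-bit key, and the evaluation is exhaustive over all inputs, which is only feasible for tiny circuits (larger ones would need random input patterns).

```python
from itertools import product


def hamming_distance_percent(circuit, n_inputs, correct_key, wrong_keys):
    """Average share of output bits that differ from the correct-key outputs."""
    differing = 0
    total_bits = 0
    for inputs in product((0, 1), repeat=n_inputs):
        golden = circuit(inputs, correct_key)
        for key in wrong_keys:
            masked = circuit(inputs, key)
            differing += sum(g != m for g, m in zip(golden, masked))
            total_bits += len(golden)
    return 100.0 * differing / total_bits


def toy_circuit(inputs, key):
    """Hypothetical masked circuit: AND and OR outputs behind two key-gates."""
    a, b = inputs
    k0, k1 = key
    out0 = (a & b) ^ k0          # corrupted whenever k0 is wrong
    out1 = (a | b) ^ (k0 & k1)   # corrupted only when both key bits are wrong
    return (out0, out1)


hd = hamming_distance_percent(toy_circuit, 2, (0, 0), [(0, 1), (1, 0), (1, 1)])
print(hd)  # 50.0
```

A value of 50% means that, on average, half of the output bits are wrong under a wrong key, which is the balanced HD that a masking method ideally reaches.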

4.2 HDP Calculation

In order to know the HDP improvement when path-delay analysis is used, the HDPs of the original circuit and of its masked circuits must be compared. Moreover, MSP has the least HDP among the selected shortest paths. Hence, the HDP of MSP can be considered as a metric.

In order to calculate the delay variability of MSP and hence HDP, a Golden reference is required. As mentioned in Section II, the Golden reference is the variation of the targeted side-channel without HT effects. It can be obtained by either the Golden ICs or Golden-free approaches. Assume that the Golden reference for path-delay is reached using one of these approaches, and that it follows Gaussian distributions or normal probability distribution functions (PDF). We call it the “Golden PDF”. The Golden PDF is constructed by “μp and σp”, where μp is the mean and σp is the standard deviation, as shown in Fig. 4. Likewise, in the ICs under test, the PDF of path-delay is named “HT-suspected PDF”, and constructed with “μT and σT”, where μT is the mean and σT is the standard deviation. The HDP is calculated by comparing the HT-suspected PDF and the Golden PDF.
Fig. 4

The probability distribution functions of a path with an HT (dashed lines) and without any HT (solid lines). The area of the shaded part is equal to HDP with 0% FPR. The dotted area is FPR

HDP is the probability that the HT-suspected PDF takes a value greater than μp + 3σp. In other words, HDP is equal to the area under the HT-suspected PDF between “μp + 3σp” and “μT + 3σT”, as shown in Fig. 4. The HDP of a path can be formulated by:
$$ HDP= Prob\left(a<x<b\right)={\int}_a^b{f}_x(t) dt. $$
(3)
where ‘x’ is the path-delay and fx is the PDF of the path-delay in the ICs under test (the HT-suspected PDF). ‘a’ and ‘b’ are μp + 3σp and μT + 3σT, respectively. According to the 3-sigma rule, and with ‘b’ approximated as infinity, we obtain
$$ HDP\left({\upmu}_p+3{\sigma}_p<x<\infty \right)={\int}_{\upmu_p+3{\sigma}_p}^{\infty }{f}_x(t) dt. $$
(4)

In addition, according to PDF properties

$$ \begin{aligned} HDP\left({\upmu}_p+3{\sigma}_p<x<\infty \right)&={\int}_{-\infty}^{\infty }{f}_x(t) dt-{\int}_{-\infty}^{\upmu_p+3{\sigma}_p}{f}_x(t) dt\\ &=1-{\int}_{-\infty}^{\upmu_p+3{\sigma}_p}{f}_x(t) dt=1-{F}_x\left({\upmu}_p+3{\sigma}_p\right) \end{aligned} $$
(5)

where in (5), Fx is the cumulative distribution function of the HT-suspected PDF. Equation (5) can be used to calculate the HDP with less than 0.2% error. The use of the 3-sigma rule is the reason for this negligible error. In fact, the Golden PDF takes a value higher than μp + 3σp with a probability of less than 0.002. In other words, fewer than two out of 1000 HT-free ICs are reported as HT-infected ICs. This fraction, illustrated by the dotted area in Fig. 4, is known as the false positive rate (FPR).

FPR is the fraction of HT-free ICs which are reported as HT-infected ICs. A higher FPR can be accepted in order to have a higher HDP. For example, in Fig. 4, HDP is increased to 100% by accepting more FPR, the dotted area. To avoid the high FPR, more accurate and costly HT detection methods, such as layout image processing, must be used. There can be a tradeoff between the costs of FPR, other HT detection methods, and trustworthiness gained.

In order to get a 100% HDP, Fx in eq. (5) should be zero at the detection threshold. As a result, the threshold, which is ‘μp + 3σp’ in eq. (5), should be lowered to ‘μT − 3σT’ of the HT-suspected PDF. The FPR that must then be accepted is given by eq. (6), which is derived in the same way as the HDP equation

$$ FPR=1-{\int}_{-\infty}^{\upmu_T-3{\sigma}_T}{g}_x(t) dt. $$
(6)
where gx is the Golden PDF.

It is noteworthy that the spread of the Golden and HT-suspected PDFs depends on die-to-die and within-die variation. For example, in 45 nm technology, they are 36 and 12%, respectively [31]. These values together make HT detection very difficult. Fortunately, we can decrease them using some calibration structures. For instance, die-to-die effects can be removed from the path-delay analysis using the method proposed in [20].
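Equations (5) and (6) can be evaluated numerically with the closed form of the normal CDF. The sketch below is only an illustration under the Gaussian assumption stated above; the function names and the delay values in the example are hypothetical.

```python
import math


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def hdp(mu_p: float, sigma_p: float, mu_t: float, sigma_t: float) -> float:
    """Eq. (5): HT-suspected probability mass above mu_p + 3*sigma_p."""
    threshold = mu_p + 3.0 * sigma_p
    return 1.0 - normal_cdf(threshold, mu_t, sigma_t)


def required_fpr(mu_p: float, sigma_p: float, mu_t: float, sigma_t: float) -> float:
    """Eq. (6): Golden probability mass above mu_t - 3*sigma_t."""
    threshold = mu_t - 3.0 * sigma_t
    return 1.0 - normal_cdf(threshold, mu_p, sigma_p)


# Hypothetical path: Golden delay 10 +/- 1, HT-suspected delay 13 +/- 1
# (same arbitrary time unit for all four values).
print(hdp(10.0, 1.0, 13.0, 1.0))           # 0.5
print(required_fpr(10.0, 1.0, 13.0, 1.0))  # also 0.5 in this symmetric case
```

When the HT shifts the mean by exactly 3σp, only half of the infected ICs exceed the threshold, so the HDP is 50%; a relatively larger shift, which a shorter path provides for the same HT delay, pushes the HDP toward 100% and the required FPR toward zero.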

5 Experimental Results

5.1 Experiment Setups and Assumptions

The experiments have been carried out on gate-level circuits from ISCAS-89 [32] and ISCAS-85 [33]. First, the circuits were elaborated by VERIFIC API [34]. Then, the proposed algorithm (using the unit delay model) was executed for different targeted MSP values. The algorithm was implemented using VERIFIC API and C++ programming. Afterwards, all the modified circuits were synthesized by Design Compiler [35] and then placed and routed by SOC-Encounter [36]. The NANGATE 45 nm technology was used during the synthesis and physical design [37].

In order to perform a fair comparison between the proposed algorithm and previous works, we tried to use the same experiment flows and assumptions. We compare both MSP reduction and HDP improvement with the results of the TDR algorithm [18]. We also compare the logic masking quality of the algorithm with that of [6, 10], based on the number of output failures and HD. Finally, we report layout-level results.

5.2 HDP Results in Gate Level

Shekarian et al. used the unit delay model in their TDR algorithm [18]. They also reported the HDP improvement and MSP reduction at this level. In this model, zero correlation among the delay variation of cells is assumed, an unrealistic assumption. But the authors tried to make it acceptable by assuming a higher percent of delay variability. They assumed 60% cell delay variation due to process variation.

In order to compare the HT detection improvement (based on path-delay) reached by the proposed ESCALATION algorithm and the TDR algorithm, four sequential circuits with different sizes are considered. Figures 5 and 6 give the results of the experiments. The first empirical observation from these experiments is that for each circuit, there is a minimum MSP value that the proposed algorithm can reach. Going below this minimum value is impossible because adding new key-gates will create new vulnerable nets. Figure 5 shows the four lowest reachable MSP values and their associated area overheads, on the X and Y axes, respectively. Note that the area overhead is calculated after synthesis. This figure shows that there is a direct link between MSP reduction and area overhead; a better MSP value (and therefore a better HDP) requires more key-gates and area overhead.
Fig. 5

The MSP value obtained (by ESCALATION) versus its required area overhead for four MSP values (the minimum reachable MSP value and three bigger values) in four sequential circuits

Fig. 6

Results of TDR and two executions of ESCALATION. ESCALATION1 (ESCALATION2) needs a bit less (more) area overhead than TDR. a MSP values. b Area overheads

Among all MSP values reachable by ESCALATION, the two MSP values for which the ESCALATION area overhead is as close as possible to the TDR area overhead are selected and used in Fig. 6. The ESCALATION executions for obtaining these two MSP values are named ESCALATION1 and ESCALATION2 in Fig. 6. ESCALATION1 (ESCALATION2) needs a bit less (more) area overhead than TDR. Figure 6a shows the MSP values obtained by these two ESCALATION executions and TDR. In Fig. 6b, the area overheads of the two executions and TDR are compared. Shekarian et al. only reported the number of flip-flops (FFs) added by the TDR execution as the area overhead [18]. In fact, the area overhead of the added FFs (including their area and required clock route) is much bigger than the area overhead of key-gates. Thus, we calculate the TDR area overhead by multiplying the reported percentage of added FFs by the percentage of the circuit area occupied by the sequential part. For example, for circuit s9234, the TDR algorithm adds 36 FFs, which equals 17% of the number of FFs in the circuit. As the sequential part of s9234 corresponds to 58% of the circuit area after synthesis, the modified circuit has a 9.9% area overhead. In Fig. 6b, the area overheads of ESCALATION1 and ESCALATION2 are taken from the area reports of Design Compiler [35].

Figure 6 illustrates that the TDR algorithm achieves a slightly smaller MSP value with almost the same area overhead for the three smallest circuits; however, in one circuit (s13207), the ESCALATION algorithm achieves a better MSP reduction with less area overhead. It is to be noted that heuristic algorithms do not always achieve the optimal result. Since the TDR algorithm is a heuristic one, it is difficult to explain why it does not always give better results than the ESCALATION algorithm.
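The s9234 estimate above is a product of two percentages, which the following one-function sketch reproduces. The function name is chosen here for illustration; the two input figures are the ones quoted in the text.

```python
def tdr_area_overhead_percent(added_ff_percent: float,
                              sequential_area_percent: float) -> float:
    """Scale the share of added FFs by the share of area the FFs occupy."""
    return added_ff_percent * sequential_area_percent / 100.0


# s9234: 17% more FFs, sequential part = 58% of the synthesized area.
print(round(tdr_area_overhead_percent(17, 58), 1))  # 9.9
```

This estimate assumes that every added FF costs as much area as an average existing FF, including its share of the clock routing.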

Table 1 reports the results gathered from the circuits masked by the ESCALATION algorithm with an area overhead limit of about 20%. Columns 2, 3, and 4 show three variables for the original circuits: the MSP value, the HDP, and the required FPR for 100% HDP (RFPR), respectively. The same variables are reported for the masked circuits in Columns 5–7. The number of key-gates and the area overhead percentage are given in Columns 8 and 9, respectively. The area overheads are reported by Design Compiler. In both cases, before and after executing the ESCALATION algorithm, an HT (a single AND gate) is inserted in the MSP, and then HDP and RFPR are calculated. Comparing HDP and RFPR in these two cases shows that our algorithm can be an efficient DfTr approach. Indeed, the ESCALATION algorithm reduces the average MSP value by around 60% in relative terms. The average HDP is increased by around 34 percentage points, and the average RFPR is decreased by around 34 percentage points (from 86 to 52% in Table 1).
Table 1

MSP value, HDP, and required FPR (to obtain 100% HDP) before and after performing the ESCALATION algorithm on sequential circuits, accepting 20% area overhead and assuming unit delay model and 60% cell delay variation

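| Circuit | MSP value (original) | HDP (%) (original) | RFPR (%) (original) | MSP value (masked by ESCALATION) | HDP (%) (masked by ESCALATION) | RFPR (%) (masked by ESCALATION) | No. of used key-gates | Area overhead (%) |
|---|---|---|---|---|---|---|---|---|
| S13207 | 35 | 10 | 91 | 14 | 38 | 63 | 741 | 17.7 |
| S35814 | 38 | 8 | 93 | 13 | 41 | 60 | 297 | 17.2 |
| S9234 | 43 | 5 | 96 | 16 | 31 | 70 | 165 | 22 |
| S5378 | 21 | 23 | 78 | 11 | 51 | 50 | 109 | 21 |
| S1423 | 20 | 22 | 78 | 7 | 80 | 21 | 67 | 21.5 |
| S1196 | 18 | 24 | 77 | 10 | 55 | 46 | 31 | 21.1 |
| Average | 30 | 15 | 86 | 12 | 49 | 52 | | 20 |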
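The averages quoted for Table 1 can be rechecked with a few lines of Python. The tuples below copy the per-circuit values of the table; the variable and function names are chosen here for illustration, and the relative MSP reduction is computed on the column averages (averaging the per-circuit reductions gives a slightly lower figure).

```python
# (MSP, HDP %, RFPR %) in the original circuit, then in the masked circuit.
table1 = {
    "S13207": (35, 10, 91, 14, 38, 63),
    "S35814": (38, 8, 93, 13, 41, 60),
    "S9234": (43, 5, 96, 16, 31, 70),
    "S5378": (21, 23, 78, 11, 51, 50),
    "S1423": (20, 22, 78, 7, 80, 21),
    "S1196": (18, 24, 77, 10, 55, 46),
}


def column_mean(index: int) -> float:
    """Average of one column over the six circuits."""
    return sum(row[index] for row in table1.values()) / len(table1)


msp_reduction = 100.0 * (1.0 - column_mean(3) / column_mean(0))  # relative, in %
hdp_gain = column_mean(4) - column_mean(1)                       # percentage points
rfpr_drop = column_mean(2) - column_mean(5)                      # percentage points

print(f"MSP reduction: {msp_reduction:.1f}%")
print(f"HDP gain: {hdp_gain:.1f} points, RFPR drop: {rfpr_drop:.1f} points")
```

Keeping relative changes (MSP) apart from absolute changes in percentage points (HDP, RFPR) avoids mixing the two kinds of percentages when reading the table.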

5.3 Masking Quality

The number of output failures is a simple metric to quantify masking quality. Chakraborty et al. reported the masking quality of their algorithm, HARPOON, with this metric [6]. They used different percentages of area overhead: 5, 10, 15, and 20%. Thus, we executed our algorithm to obtain different MSP values. Then, we selected the four MSP values whose area overheads are closest to the four area overheads considered in [6]. In order to have a better comparison, we divide the number of output failures by the percentage of area overhead. This proportion is shown as normalized output failures in Fig. 7. In this figure, each circuit has an E or H label that stands for the ESCALATION and HARPOON algorithms, respectively. The results in Fig. 7 show that for each circuit, the proportion converges to the same number when the area overhead increases. It means that both algorithms have a similar logic masking quality for higher area overheads (15 or 20% area overhead for two of the three studied circuits). For area overheads of less than 10%, the results vary depending on the circuit. It is noteworthy that, unlike HARPOON, the ESCALATION algorithm also benefits HT detection.
Fig. 7

Output failures versus area overhead for both HARPOON [6] and ESCALATION

In Table 2, we compare the average HD results reached by three different algorithms: random masking [27], our proposed algorithm, and the fault-based-analysis (FBA) algorithm [10]. As seen in the table, for seven out of nine circuits, the results are better for our algorithm than for random masking. Random masking and our algorithm reach on average 12 and 21% HD, respectively. The proposed algorithm cannot compete with the FBA algorithm because FBA aims solely at reaching a high, balanced HD. On the other hand, the ESCALATION algorithm does not have as much time and memory complexity as FBA, which is a greedy algorithm. In Section 7, we discuss how one can improve the HD results of the ESCALATION algorithm.
Table 2

Comparing Hamming distance results of random masking, ESCALATION, and FBA algorithms

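| Circuit | No. of key-gates | Random (%) | ESCALATION (%) | FBA (%) |
|---|---|---|---|---|
| C432 | 17 | 26 | 37 | 50 |
| C499 | 40 | 3 | 22 | 50 |
| C880 | 28 | 16 | 17 | 50 |
| C1355 | 42 | 13 | 25 | 50 |
| C1908 | 28 | 9 | 25 | 50 |
| C3540 | 22 | 15 | 13 | 50 |
| C5315 | 97 | 10 | 20 | 45 |
| C6288 | 27 | 24 | 8 | 50 |
| C7552 | 89 | 8 | 18 | 48 |
| Average | | 12 | 21 | 49 |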

6 Layout-Level Validation

In Section 5, we used the same gate-level assumptions as the authors of [18], to allow a fair comparison with that work. Two assumptions in that work are (1) 60% cell delay variation and (2) no correlation among the delay variation of the cells in a path. In fact, there are components in within-die variation that are physically dependent and correlated [24]. The lack of layout-level information at gate level and RTL forces designers to use simple delay and variation models, as the authors of [18] have done.

In order to achieve more accurate results, we performed experiments at the layout level, post-placement and routing. The experiments include HT insertion in the MSP and HDP calculation. The masked circuits are placed and routed with SOC-Encounter [36]. The shortest path for each net and the MSP of each circuit are then found using TCL scripts. Afterward, an AND gate, as an HT, is inserted in the MSP. Note that an AND gate is the smallest functional HT that one can use. The MSP delays before and after HT insertion are obtained. The HDP of the MSP is calculated according to formula (5).

In the HDP calculation, we consider 12% path-delay variation due to within-die variation, according to [31]. In that work, the authors fabricated ICs with ROs of different lengths. The ROs were inserted in different locations of the layout design. The ICs were fabricated using different layout design styles, in the 45 nm technology. The reports in [31] show that there are around 36 and 12% of path-delay variations due to die-to-die and within-die variation, respectively. Thanks to calibration methods, like [20], we can remove the effects of die-to-die variations from the path-delay analysis. In addition, defenders can use any structure for measuring path-delay, such as the ones given in [19, 20].

In Table 3, the MSP values for the original circuits and their masked circuits are presented in Columns 3 and 6. The masked circuits are obtained by the proposed algorithm. In addition, the HDP of the MSP is reported in Columns 4 and 7. The table shows that four circuits do not need the MSP reduction, as the HT in their MSP is detectable with almost 100% probability. For the five remaining circuits, the proposed algorithm raises the average HDP from 64 to 74%, that is, by 10 percentage points, with the largest gain (22 points) on C6288. According to Table 3, this improvement costs on average 6% performance overhead and 14% area overhead. These overheads may look high, but in reality, we do not need to mask the whole circuit. In the next section, we give examples and further explanations.
Table 3

HDP of MSP in the original and masked circuits; performance and area overhead in layout level

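| Circuit | Performance (original) | MSP (original) | HDP (%) (original) | Performance (masked) | MSP (masked) | HDP (%) (masked) | Performance overhead (%) | No. of used key-gates | Area overhead (%) |
|---|---|---|---|---|---|---|---|---|---|
| C432 | 1.47 ns | 0.45 ns | 93 | 1.47 ns | 0.31 ns | 100 | 0 | 17 | 16 |
| C499 | 0.79 ns | 0.39 ns | 100 | | | | | | |
| C880 | 1.06 ns | 0.32 ns | 100 | | | | | | |
| C1355 | 1.02 ns | 0.40 ns | 99 | | | | | | |
| C1908 | 1.08 ns | 0.32 ns | 100 | | | | | | |
| C3540 | 1.51 ns | 0.47 ns | 90 | 1.66 ns | 0.47 ns | 90 | 10 | 22 | 13 |
| C5315 | 1.11 ns | 0.45 ns | 93 | 1.25 ns | 0.38 ns | 99 | 13 | 97 | 19 |
| C6288 | 4.32 ns | 1.23 ns | 8 | 4.55 ns | 0.81 ns | 30 | 5 | 27 | 3 |
| C7552 | 2.15 ns | 0.77 ns | 34 | 2.19 ns | 0.66 ns | 51 | 2 | 89 | 21 |
| Average | | | 64 | | | 74 | 6 | | 14 |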

Regarding the power overhead of masking methods, it is noteworthy that masked circuits work in their functional mode with the correct key; thus, the added key-gates are totally transparent in this mode. They do not add any transitions in functional mode. Hence, they only contribute leakage power and do not add any dynamic power. The power overhead is not reported in this work because the static power of a few key-gates in 45 nm technology is negligible.

7 Discussions

In this section, we discuss some noteworthy points about the ESCALATION approach and how it can have better results.
  1. ESCALATION is a DfTr approach and hinders HT insertion in two ways. First, it masks a circuit; thus, HT attackers cannot have good knowledge of the original functionality of the circuit. Second, using the ESCALATION approach, if all nets of a circuit belong to at least one short-enough path, the circuit is more sensitive to HT delay effects. In such situations, in order to hide these effects, an HT attacker can increase the drive strength and capacitive load of the cells that precede and follow the HT. Cells having more drive strength and capacitive load consume more power. As a result, increasing the drive strength and capacitive load increases the success probability of power-based HT detection methods. Thus, path-delay and power-based HT detection methods should be combined.
  2. The ESCALATION approach is based on logic masking, and it only modifies the combinational part of circuits at the gate level. As explained in Section 2, logic masking can be used as one step of masking an RTL netlist. It is used to change the state transition graph of an RTL netlist [16]. If we change the combinational part of a sequential circuit, then wrong keys create wrong values in the FFs and also in the POs. Output failures in a masked sequential circuit are the result of wrong keys and (consequently) wrong values in FFs. Thus, key detection in sequential circuits is more difficult than in simple combinational ones.
  3. In order to have better HDP results in the ESCALATION approach, an ESCALATION-based algorithm can be implemented at design-abstraction levels lower than the gate level, such as after performing synthesis, placement, or routing. At the lower levels, there is more information about the delay components of nets and cells; thus, path-delay calculation and finding the shortest path for the nets is more accurate (and certainly more complex) than at the gate level. If designers implement any logic masking method post-placement and routing, they must incrementally perform a logic optimization after inserting all the key-gates. This optimization is done in order to absorb the key-gates into the combinational part of the circuit. For example, if one inverter gate in the original circuit is preceded by an XOR key-gate, a logic optimizer might convert the pair to an XNOR key-gate. As a result, the masked circuit has an XNOR key-gate for which the correct key is ‘0’, although the correct key value of an XNOR key-gate would seem to be ‘1’ at first glance.
  4. In order to improve the HD results in the ESCALATION approach, both HD achievement and MSP reduction can be considered simultaneously. As mentioned in Section 3, the objective of the proposed ESCALATION algorithm is just to reduce the MSP. In each iteration of the proposed algorithm, there might be more than one suitable net for our objective. Among these nets, other objectives can be investigated. For instance, we have often observed that there are a few nets for which key-gate insertion creates a shorter path for the MSP. Currently, one of these nets is selected randomly, but one can investigate which one has a better effect on HD. This would increase the time and memory complexity of the algorithm.
  5. The area overheads reported in the previous section seem high; however, it must be taken into account that there is no need to mask a whole big circuit. HT and IP piracy threats are important in the security-critical parts of circuits. An HT inserted in unmasked parts will not have critical effects. In systems-on-chips, there are many IPs that are freely available, so no one cares about them being stolen. A large circuit can therefore be partitioned into sub-circuits and only the security-critical parts masked. When this is the case, the area overhead is much less than the values reported in this work. For instance, Rajendran et al. masked some small parts of a microprocessor (e.g., thread switch, DMA controller, FP unit, etc.) [10]. As shown in Table 3, in the worst case, we have 21% area overhead, which is substantial. But if the security-critical part of a circuit, which should be masked, occupies just 10% of the circuit area, the area overhead is only 2.1%.
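The key-value remark in item 3 (an XOR key-gate followed by an inverter being merged into an XNOR key-gate whose correct key is still ‘0’) can be checked exhaustively. The sketch below models single-bit gates with hypothetical helper names; it is an illustration, not part of the authors' flow.

```python
def xor_gate(a: int, b: int) -> int:
    return a ^ b


def xnor_gate(a: int, b: int) -> int:
    return 1 - (a ^ b)


def inverter(a: int) -> int:
    return 1 - a


def original(net: int) -> int:
    """Original logic: a single inverter on the net."""
    return inverter(net)


def masked_before_optimization(net: int, key: int) -> int:
    """XOR key-gate inserted in front of the inverter."""
    return inverter(xor_gate(net, key))


def masked_after_optimization(net: int, key: int) -> int:
    """The optimizer merges the XOR key-gate and the inverter into an XNOR."""
    return xnor_gate(net, key)


def correct_keys(masked) -> list:
    """Key values for which the masked logic matches the original logic."""
    return [k for k in (0, 1)
            if all(masked(n, k) == original(n) for n in (0, 1))]


print(correct_keys(masked_before_optimization))  # [0]
print(correct_keys(masked_after_optimization))   # [0]
```

Because the inverter was already part of the original function, the merged XNOR gate keeps ‘0’ as its correct key, so an attacker cannot infer the key bit from the gate type alone.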

     

8 Conclusion and Future Works

In this paper, we have presented a new DfTr approach, called ESCALATION, which leverages logic masking in order to enhance HT detection based on path-delay analysis. Its objective is to reduce the MSP value of the circuit. MSP value reduction is of major interest for HT detection: it increases the HDP of the most vulnerable net. Based on the ESCALATION approach, we proposed an algorithm that identifies the most vulnerable net in the circuit and then inserts a key-gate before or after this net. According to the delay of the target net to the PIs or POs, an XOR or MUX key-gate is used by the algorithm.

Simple formulas for calculating both HDP and RFPR have been proposed and derived. Using the formulas, HDP has been calculated considering a 60% cell delay variation at the gate level. Furthermore, at the layout level, HDP has been calculated considering 12% path-delay variation. The layout-level experiments show that the ESCALATION algorithm improves the HDP of the MSP by 10 percentage points on average.

In addition, the logic masking quality of the ESCALATION algorithm was investigated according to two metrics: the number of failed outputs and the HD of the output bits. We compared the ESCALATION algorithm to the HARPOON algorithm [6]. Experiments show that ESCALATION can reach a good level of logic masking quality, as good as HARPOON’s, by accepting a bit more area overhead. Moreover, the HD of masked circuits using the ESCALATION algorithm was calculated. The results are much better than those attained by random masking [27]. However, they are not as good as the fault-based-analysis (FBA) results [10]. In addition, we have also discussed how to improve the HD of masked circuits obtained by the ESCALATION approach.

Footnotes

  1. Ring oscillators (ROs) generate oscillations. They consist of an odd number of NOT gates (or gates having an inversion function, such as NOR/NAND gates) connected in a loop, with the output of the last gate fed back into the first one.

References

  1. 1.
    Mishra P, Tehranipoor M, Bhunia S (2017) Security and trust vulnerabilities in third-party IPs, In Hardware IP security and trust. Springer, Cham, pp 3–14Google Scholar
  2. 2.
    Agrawal, D., Baktir, S., Karakoyunlu, D., Rohatgi, P., & Sunar, B. (2007). Trojan detection using IC fingerprinting. In Security and privacy, 2007. SP'07. IEEE Symposium on (pp. 296–310). IEEEGoogle Scholar
  3. 3.
    Li H, Liu Q, Zhang J (2016) A survey of hardware Trojan threat and defense. Integr VLSI J 55:426–437CrossRefGoogle Scholar
  4. 4.
    Lecomte M, Fournier J, Maurine P (2017) An on-chip technique to detect hardware Trojans and assist counterfeit identification. IEEE Trans Very Large Scale Integr (VLSI) Syst 25(12):3317–3330CrossRefGoogle Scholar
  5. 5.
    Yu Q, Dofe J, Zhang Y, Frey J (2017) Hardware hardening approaches using camouflaging, encryption, and obfuscation. In: Hardware IP security and trust. Springer, Cham, pp 135–163CrossRefGoogle Scholar
  6. 6.
    Chakraborty RS, Bhunia S (2009) HARPOON: an obfuscation-based SoC design methodology for hardware protection. IEEE Trans Comput Aided Des Integr Circuits Syst 28(10):1493–1502CrossRefGoogle Scholar
  7. 7.
    Dofe, J., & Yu, Q. (2017) Novel dynamic state-deflection method for gate-level design obfuscation. IEEE Trans Comput Aided Des Integr Circuits SystGoogle Scholar
  8. 8.
    Rajendran, J., Pino, Y., Sinanoglu, O., & Karri, R. (2012) Security analysis of logic obfuscation. In Proceedings of the 49th Annual Design Automation Conference (pp. 83–89). ACMGoogle Scholar
  9. 9.
    Zhang J (2016) A practical logic obfuscation technique for hardware security. IEEE Trans Very Large Scale Integr (VLSI) Syst 24(3):1193–1197CrossRefGoogle Scholar
  10. 10.
    Rajendran J, Zhang H, Zhang C, Rose GS, Pino Y, Sinanoglu O, Karri R (2015) Fault analysis-based logic encryption. IEEE Trans Comput 64(2):410–424MathSciNetCrossRefzbMATHGoogle Scholar
  11. 11.
    Plaza SM, Markov IL (2015) Solving the third-shift problem in IC piracy with test-aware logic locking. IEEE Trans Comput Aided Des Integr Circuits Syst 34(6):961–971CrossRefGoogle Scholar
  12. 12.
    Yasin M, Rajendran JJ, Sinanoglu O, Karri R (2016) On improving the security of logic locking. IEEE Trans Comput Aided Des Integr Circuits Syst 35(9):1411–1424CrossRefGoogle Scholar
  13. 13.
    Dutta RG, Guo X, Jin Y (2017) IP trust: the problem and design/validation-based solution. In: Fundamentals of IP and SoC security. Springer, Cham, pp 49–65CrossRefGoogle Scholar
  14. 14.
    Samimi, S. M. S., Aerabi, E., Nejat, A., Fazeli, M., Hely, D., & Beroulle, V. (2016). High output hamming-distance achievement by a greedy logic masking approach. In East-West Design & Test Symposium (EWDTS), 2016 I.E. (pp. 1–4). IEEEGoogle Scholar
  15. 15.
    Colombier B, Bossuet L, Hély D (2017) Logic modification-based IP protection methods: an overview and a proposal, In Foundations of hardware IP protection. Springer, Cham, pp 37–64Google Scholar
  16. 16.
    Chakraborty RS, Bhunia S (2011) Security against hardware Trojan attacks using key-based design obfuscation. J Electron Test 27(6):767–785CrossRefGoogle Scholar
  17. 17.
    Nejat, A., Hely, D., & Beroulle, V. (2016) How logic masking can improve path delay analysis for Hardware Trojan detection. In Computer Design (ICCD), 2016 I.E. 34th International Conference on (pp. 424–427). IEEEGoogle Scholar
  18. 18.
    Shekarian SMH, Zamani MS (2015) Improving hardware Trojan detection by retiming. Microprocess Microsyst 39(3):145–156CrossRefGoogle Scholar
  19. 19.
    Nejat A, Shekarian SMH, Zamani MS (2014) A study on the efficiency of hardware Trojan detection based on path-delay fingerprinting. Microprocess Microsyst 38(3):246–252CrossRefGoogle Scholar
  20. 20.
    Cha, B., & Gupta, S. K. (2013). Trojan detection via delay measurements: a new approach to select paths and vectors to maximize effectiveness and minimize cost. In Proceedings of the conference on design, automation and test in Europe (pp. 1265–1270). EDA ConsortiumGoogle Scholar
  21. 21.
    Hoque T, Narasimhan S, Wang X, Mal-Sarkar S, Bhunia S (2017) Golden-free hardware Trojan detection with high sensitivity under process noise. J Electron Test 33(1):107–124CrossRefGoogle Scholar
  22.
    Jin Y, Makris Y (2008) Hardware Trojan detection using path delay fingerprint. In: Hardware-oriented security and trust, 2008. HOST 2008. IEEE International Workshop on. IEEE, pp 51–57
  23.
    Rai D, Lach J (2009) Performance of delay-based Trojan detection techniques under parameter variations. In: Hardware-oriented security and trust, 2009. HOST'09. IEEE International Workshop on. IEEE, pp 58–65
  24.
    Blaauw D, Chopra K, Srivastava A, Scheffer L (2008) Statistical timing analysis: from basic principles to state of the art. IEEE Trans Comput Aided Des Integr Circuits Syst 27(4):589–607
  25.
    Ferraiuolo A, Zhang X, Tehranipoor M (2012) Experimental analysis of a ring oscillator network for hardware Trojan detection in a 90 nm ASIC. In: Proceedings of the International Conference on Computer-Aided Design. ACM, pp 37–42
  26.
    Lamech C, Plusquellic J (2012) Trojan detection based on delay variations measured using a high-precision, low-overhead embedded test structure. In: Hardware-Oriented Security and Trust (HOST), 2012 IEEE International Symposium on. IEEE, pp 75–82
  27.
    Roy JA, Koushanfar F, Markov IL (2010) Ending piracy of integrated circuits. Computer 43(10):30–38
  28.
    Dupuis S, Ba PS, Di Natale G, Flottes ML, Rouzeyre B (2014) A novel hardware logic encryption technique for thwarting illegal overproduction and hardware trojans. In: On-Line Testing Symposium (IOLTS), 2014 IEEE 20th International. IEEE, pp 49–54
  29.
    Samimi MS, Aerabi E, Kazemi Z, Fazeli M, Patooghy A (2016) Hardware enlightening: nowhere to hide your hardware Trojans! In: On-Line Testing and Robust System Design (IOLTS), 2016 IEEE 22nd International Symposium on. IEEE, pp 251–256
  30.
    Russell SJ, Norvig P, Canny JF, Malik JM, Edwards DD (2003) Artificial intelligence: a modern approach (Vol. 2, No. 9). Prentice Hall, Upper Saddle River
  31.
    Pang LT, Qian K, Spanos CJ, Nikolic B (2009) Measurement and analysis of variability in 45 nm strained-Si CMOS technology. IEEE J Solid State Circuits 44(8):2233–2243
  32.
    The ISCAS-89 Benchmark Circuits. [Online]. Available: http://www.pld.ttu.ee/~maksim/benchmarks/iscas89/
  33.
    The ISCAS-85 Benchmark Circuits. [Online]. Available: http://pld.ttu.ee/~maksim/benchmarks/iscas85/
  34.
    Verific Design Automation Inc. [Online]. Available: http://www.verific.com
  35.
  36.
    Cadence SOC Encounter. [Online]. Available: https://www.cadence.com
  37.
    NanGate: The Standard Cell Library Optimization Company. [Online]. Available: http://www.nangate.com/

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Université Grenoble Alpes, Grenoble INP, LCIS: Laboratoire de Conception et d’Intégration des Systèmes, Valence, France
