Experimental setup
The experimental platform consists of four modules, each of which holds 60 Basys-3 boards, 10 seven-port USB hubs, a Raspberry Pi 2 and a power supply. The USB connection between the Raspberry Pi 2 and the Basys-3 boards powers the FPGAs and provides a JTAG interface for configuring the design bitstream into each FPGA. A UART interface is used to communicate with the configured design and to receive the measurement results. The Raspberry Pi 2 communicates over a local area network (LAN) with a global experiment control server, which also stores the measured data.
The RO frequency was measured indirectly by counting the positive edges of the toggle flip-flop shown in Fig. 1a during an evaluation time D, with evaluation times ranging from 0.50 \(\upmu s\) to 10.00 ms.
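As an illustrative sketch (not the authors' measurement code), the indirect frequency estimate follows directly from the edge count and the evaluation time; the function and variable names below are ours, and the default divider of 2 reflects the assumption that the toggle flip-flop halves the RO frequency.

# Illustrative sketch (not from the paper): estimating the RO frequency from
# the number of positive edges counted during an evaluation window D.
def ro_frequency_hz(edge_count: int, evaluation_time_s: float,
                    divider: int = 2) -> float:
    # The toggle flip-flop is assumed to halve the RO frequency, hence the
    # default divider of 2; set divider=1 if the counter sees the RO directly.
    return divider * edge_count / evaluation_time_s

# Example: 2,500 positive edges counted over a 10 us window -> 500 MHz estimate.
print(ro_frequency_hz(2_500, 10e-6))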
Overall metrics
A number of metrics have been suggested [29] for the evaluation of different PUF designs. Table 3 shows the experimental results on uniqueness, reliability, uniformity, bit-aliasing, correlation and min-entropy for the RO and PicoPUF designs at four different CLB placement locations. Details of the analysis and a comparison are provided in the following subsections.
Table 3 Experimental results of PicoPUF and RO based on slice locations
Uniqueness
Uniqueness represents the ability to distinguish between different devices based on their responses to the same challenge. As the instantiations are identical, the differences between the responses stem entirely from process variations. In order to use the designs as an intrinsic identifier, no two devices should produce the same response, and the responses learnt from a (large) number of devices should not allow an adversary to infer any information about the response of a different device. Uniqueness is measured by the average fractional HD between the responses generated by different pairs of devices. A fractional HD of 0 indicates that all bits of the two strings are identical, and 1 means that all the bits differ. Ideally, the expected fractional HD between any pair of responses is 0.5. The uniqueness experiment was carried out on 217 FPGA devices, yielding a total of 217 responses. Each response comprises 6592 bits generated from 6592 independent single-slice bit cells of a device.
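For illustration, the uniqueness computation can be sketched as follows; this is not the authors' code, and an m-devices-by-n-bits 0/1 response matrix is assumed.

import numpy as np
from itertools import combinations

# Sketch: uniqueness as the mean fractional Hamming distance over all
# distinct device pairs. `responses` is an (m_devices x n_bits) 0/1 array.
def uniqueness(responses: np.ndarray) -> float:
    m, n = responses.shape
    hds = [np.count_nonzero(responses[i] != responses[j]) / n
           for i, j in combinations(range(m), 2)]
    return float(np.mean(hds))   # ideally close to 0.5

# Example with random responses from 217 devices of 6592 bits each.
rng = np.random.default_rng(0)
print(uniqueness(rng.integers(0, 2, size=(217, 6592))))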
In Table 3, the PicoPUF at the LEFT-UPPER location has the best uniqueness of 0.4968 with a small standard deviation (STD) of 0.0124. Therefore, the LEFT-UPPER location of the Xilinx Artix-7 is the best placement for the PicoPUF in terms of uniqueness. Although the uniqueness of the RO PUF is not as good, it is still reasonably high at 0.4895 when the RO is placed at the RIGHT-UPPER location. Conversely, the RIGHT-LOWER location of the Xilinx Artix-7 should be avoided for the RO PUF implemented with single-slice ROs, as it yields the worst uniqueness.
The histogram of the fractional HD for the RO PUF responses over 217 devices is shown in Fig. 4. The mean and STD of the distribution are 0.4805 and 0.0087, respectively. As shown in Fig. 5, the uniqueness of the PicoPUF obtained from the mean of the fractional HD distribution is \(\approx 0.4886\), which is closer to the ideal value of 0.5 than that of the RO. The STD of its distribution is 0.0094. It is interesting to note from Figs. 4 and 5 that the distribution of the fractional Hamming distances of the RO PUF is closer to Gaussian than that of the PicoPUF. This is probably because the distribution of the delay deviations of the wiring among slices is more uniform than the distribution of the delay deviations of the different active elements (e.g., DFFs and NAND gates).
Correlation
Table 3 shows the spatial correlation scores computed based on the method in [20, 21]. The best correlation result (0.0102) is from the PicoPUF in the RIGHT-UPPER location, and the worst (0.0605) is from the RO in the LEFT-LOWER location. The PicoPUF has a lower correlation between devices than the RO. Based on the uniqueness and correlation results, it is recommended to place the PicoPUF and single-slice RO at the *-UPPER locations instead of the *-LOWER locations.
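The following sketch gives one simple proxy for spatial correlation; it is not necessarily the exact method of [20, 21], and the lag-1 ordering of cells by placement location is our assumption.

import numpy as np

# Hedged sketch: lag-1 spatial correlation over the slice placement order.
# For each device, every bit is correlated with the bit of its immediate
# neighbour, and the Pearson coefficient is averaged over all devices.
def lag1_spatial_correlation(responses: np.ndarray) -> float:
    # responses: (m_devices x n_bits) 0/1 array, bits ordered by slice location.
    coeffs = []
    for r in responses.astype(float):
        a, b = r[:-1], r[1:]
        if a.std() > 0 and b.std() > 0:          # avoid division by zero
            coeffs.append(np.corrcoef(a, b)[0, 1])
    return float(np.mean(coeffs))                # ideally close to 0

rng = np.random.default_rng(1)
print(lag1_spatial_correlation(rng.integers(0, 2, size=(217, 6592))))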
Figure 6a shows the correlation results of the RO frequencies at the four different RO locations for 15 different evaluation frequencies. The evaluation frequency is the reciprocal of the evaluation time. The longer the evaluation time for the ROs, the lower the correlation between the devices. The RO placements in the LEFT-UPPER and RIGHT-UPPER locations have lower correlation than those in the LEFT-LOWER and RIGHT-LOWER locations.
Min-entropy
Min-entropy is commonly used as a worst-case measure of the unpredictability and randomness of the outcome of a non-uniform distribution of a secret [7, 22]. The occurrence probabilities of 1 and 0 in the n-bit responses generated from m devices are denoted by p1 and p0, respectively. p1 can be calculated as the fractional Hamming weight of each bit b over the m devices, \(\frac{\mathsf {HW}_b}{m}\), and p0 as \(1-\frac{\mathsf {HW}_b}{m}\). The maximum probability, \(p_{b\,\mathsf {max}} = \hbox {max} (p_0, p_1)\), is used to estimate the min-entropy per bit, as given in Eq. (6) of “Appendix A”.
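The per-bit min-entropy estimate described above can be sketched as follows; this is our illustration (assumed to correspond to Eq. (6) in “Appendix A”), and the variable names are ours.

import numpy as np

# Sketch: average per-bit min-entropy over an (m_devices x n_bits) 0/1 array.
def average_min_entropy(responses: np.ndarray) -> float:
    m = responses.shape[0]
    p1 = responses.sum(axis=0) / m          # occurrence probability of 1 per bit
    p_max = np.maximum(p1, 1.0 - p1)        # worst-case (most likely) outcome
    return float(np.mean(-np.log2(p_max)))  # 1.0 for perfectly unbiased bits

rng = np.random.default_rng(2)
print(average_min_entropy(rng.integers(0, 2, size=(217, 6592))))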
Table 3 presents the min-entropies of the RO and PicoPUF at the four different placement locations. The best and worst min-entropy results of both designs are observed at the LEFT-UPPER and LEFT-LOWER locations, respectively. In particular, the PicoPUF at the LEFT-LOWER location has the worst STD of 0.1278. The results match well with the relative uniqueness at the different locations, which confirms the correlation between uniqueness and min-entropy. Figure 7 shows both the average min-entropy for different numbers of devices and the bit entropy distribution over all locations for the PicoPUF. Its average min-entropy of 0.8225 is higher than that of the RO, which is 0.7825. Previous research [7] indicated that with inadequate sampling, a small number of outliers can bias the evaluation of PUF quality metrics. The result in Fig. 7 confirms this observation. The min-entropy converges only with measurements taken from more than 150 devices.
Effect of locations and evaluation time for the ROs
Figure 8 presents the min-entropy results of the RO frequencies at the different RO locations for 15 RO evaluation frequencies. The longer the RO evaluation time, the greater the min-entropy, but the increase becomes insignificant once the evaluation frequency falls below 974.19 Hz. The ROs at the LEFT-UPPER and RIGHT-UPPER locations have higher min-entropy than those at the other locations. The results again suggest that the *-UPPER locations are the best placement locations for these two types of PUF cell.
Effect of the number of devices
Figure 9 shows the min-entropy results of RO and PicoPUF at the four different locations and over different numbers of devices. It indicates that the larger the number of devices, the higher the min-entropy. It can be seen that approximately 140 devices (\(m \geqslant 140\)) are required in order to minimize the estimation error of the average min-entropy of the design. The min-entropy ranges from 0.7585 to 0.8826 for the PicoPUF as shown in Fig. 9b and from 0.7701 to 0.7945 for the RO as shown in Fig. 9a. Hence, PicoPUF has a broader spread of min-entropy than RO over different placement locations.
Reliability
It is important that the response of each bit cell of the PUF under test is repeatable at all times. The greater the reliability of the raw response, the less costly the error correction. Intra-HD is a popular metric for assessing the reliability of a PUF response: it measures the fractional HD between a reference response and each subsequently measured response.
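A minimal sketch of this intra-HD-based reliability computation is given below; it is not the authors' code, and a single reference response plus r repeated measurements per device are assumed.

import numpy as np

# Sketch: reliability of one device as 1 minus the average fractional
# intra-HD between a reference response and r repeated responses.
def reliability(reference: np.ndarray, repeats: np.ndarray) -> float:
    # reference: (n_bits,) 0/1 array; repeats: (r x n_bits) 0/1 array.
    n = reference.size
    intra_hd = np.mean([np.count_nonzero(reference != rep) / n for rep in repeats])
    return 1.0 - float(intra_hd)             # ideally 1.0 (fully repeatable)

rng = np.random.default_rng(3)
ref = rng.integers(0, 2, size=6592)
noisy = np.where(rng.random((1000, 6592)) < 0.02, 1 - ref, ref)  # ~2% bit flips
print(reliability(ref, noisy))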
To test the reliability, \(r=10,001\) and \(r=1000\) repeated measurements were taken for every response bit of each PicoPUF and RO cell, respectively, on each FPGA. The results in the reliability rows of Table 3 are obtained from \(m \times n = 180 \times 8000 = 1.44M\) response bits for the PicoPUF and \(m \times n = 217 \times 6592 = 1.43M\) response bits for the RO. For the PicoPUF, a significant portion of the response bits are reliable, of which \(44.87\%\) (or 646,128 of 1,440,000) are stable 0’s and \(34.78\%\) (or 500,832 of 1,440,000) are stable 1’s for each of the r acquisitions. For the RO, \(47.75\%\) of the stable response bits (or 687,600 of 1,440,000) are 0’s and \(48.11\%\) of the reproducible bits (or 692,782 of 1,440,000) are 1’s for each of the r acquisitions. Hence, the PicoPUF has approximately a \(10\%\) difference between the numbers of stable 0’s and 1’s. The RO bit cells are more reliable than the PicoPUF bit cells, with a smaller difference of approximately 1–\(3\%\) between the stable 0’s and 1’s.
The heatmap in Fig. 10a shows the mapping of the reliability of each PicoPUF bit from a randomly selected device of the testbed to the corresponding location in the FPGA floorplan. Each box presents the probability of occurrence of a 1 for that response bit over r repeated measurements. It can be seen that the ‘1’ and ‘0’ bits are evenly and randomly distributed. A small number of bits are unreliable, and they are also randomly distributed. The heatmap in Fig. 10b presents similar results for the RO. The missing cells in the middle right of the image correspond to the slices utilized by the MicroBlaze, while the remaining blank spaces do not contain slices because of block RAM (BRAM) or digital signal processing (DSP) blocks. The reliabilities of both the PicoPUF and the RO show no significant dependence on the surrounding paths.
We also evaluated the response reliability against temperature and supply voltage variations for the PicoPUF implemented on 65-nm technology Xilinx Spartan-6 and 28-nm technology Artix-7 FPGAs. The results are plotted in Fig. 11. In this evaluation, the core supply voltage is varied by ±10% from its nominal value. The nominal core voltages of the Xilinx Spartan-6 and Artix-7 FPGAs are 1.2 V and 1.0 V, respectively. The average reliability of the PicoPUF against voltage variation is 94.27% on the Xilinx Spartan-6 and 91.62% on the Artix-7. The specified operating temperature range of both FPGA boards is 0–75\(^{\circ }\hbox {C}\). The average reliability of the PicoPUF over this working temperature range is 95.73% on the Spartan-6 and 96.53% on the Artix-7. The results show that the PicoPUF responses are more sensitive to voltage than to temperature variation, particularly for advanced technology nodes with lower nominal supply voltages and reduced on–off current ratios.
Uniformity
The uniformity metric describes how the response of each device is split between 0s and 1s. It is the expected ‘weight’ or ‘bias’ of the response of a randomly chosen device, calculated by taking the average of all the response bits. An unbiased response has a uniformity of 0.5. The results in Table 3 show that the best uniformity of 0.5019 is achieved by the RO cells at the RIGHT-UPPER locations, and the worst uniformity of 0.4103 by the PicoPUF cells at the LEFT-LOWER locations. Additionally, the RO has better overall uniformity than the PicoPUF. Similar to the uniqueness and correlation results, the uniformity at the *-UPPER locations for both the PicoPUF and the single-slice RO is better than at the *-LOWER locations. Hence, it is recommended to avoid placing these cells at the *-LOWER locations on the Xilinx Artix-7 FPGA.
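For illustration, the uniformity of one device reduces to the fraction of 1s in its response; the sketch below is ours.

import numpy as np

# Sketch: uniformity of one device's response is the fraction of 1s.
def uniformity(response: np.ndarray) -> float:
    return float(np.mean(response))          # ideally 0.5

rng = np.random.default_rng(4)
print(uniformity(rng.integers(0, 2, size=6592)))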
Bit-aliasing
Bit-aliasing investigates each of the response bits individually. This can be done by simply averaging the response bits generated by the cells at the same location across the number of available devices. To ensure that no physical location of the FPGA is strongly biased towards 0 or 1, the expected bit response at each physical location of the target FPGA should be 0.5 for a well-balanced design.
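A minimal sketch of this bit-aliasing computation is given below; it is not the authors' code, and an m-devices-by-n-bits 0/1 response matrix is assumed.

import numpy as np

# Sketch: bit-aliasing averages each bit position across all m devices;
# values far from 0.5 indicate physical locations biased towards 0 or 1.
def bit_aliasing(responses: np.ndarray) -> np.ndarray:
    # responses: (m_devices x n_bits) 0/1 array; returns one value per location.
    return responses.mean(axis=0)

rng = np.random.default_rng(5)
ba = bit_aliasing(rng.integers(0, 2, size=(217, 6592)))
print(ba.min(), ba.mean(), ba.max())         # each entry ideally close to 0.5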
Heatmaps of the bit-aliasing results for the PicoPUF and RO are shown in Fig. 12a, b, respectively. In general, although no single-slice location returns the same value across different devices, a small number of cells are significantly biased. For the RO, skews towards either 1 or 0 are observed in the area adjacent to the clock distribution network of the clock tile, as shown in Fig. 12b. As shown in Table 3, the best bit-aliasing result (0.5019) is from the RO at the RIGHT-UPPER location and the worst (0.4103) is from the PicoPUF at the LEFT-LOWER location. It is therefore suggested to place the single-slice RO at the *-UPPER locations and to keep it away from the clock distribution network where feasible.
Comparison and discussion
Table 4 Comparison of hardware resource consumption and metrics of different PUF designs [30]
PicoPUF and RO
The PicoPUF implemented at the LEFT-UPPER location has the best uniqueness and min-entropy, but it gives the worst uniformity, reliability, bit-aliasing and min-entropy when implemented at the LEFT-LOWER location. Hence, the RIGHT-UPPER location is the best placement choice for implementing the PicoPUF on FPGA. Interestingly, the RO achieves the best reliability when it is implemented at the RIGHT-LOWER location, but it achieves lower reliability and higher uniqueness when it is implemented at the RIGHT-UPPER and LEFT-UPPER locations. Therefore, there is an inevitable trade-off between uniqueness and reliability depending on the placement of the RO on the FPGA. Considering that RO-based designs usually require counters for digitizing the RO frequencies, the PicoPUF is a more lightweight choice than the RO for a design that requires higher uniqueness and fewer hardware resources on FPGA. The RO presented in Fig. 1a and the PicoPUF shown in Fig. 1b have very different response bit generation mechanisms, which leads to differences in their sensitivity to inter- and intra-slice wire delay variations. Unlike a conventional RO PUF, the single-slice RO has only three inverter stages, which makes its frequency more susceptible to wire delay variations within and among slices. Thus, the intra- and inter-slice wire delays of the single-slice RO play a significant role in contributing to the frequency deviation and the spatial correlation of RO frequencies, respectively, in our RO PUF construction. The latter can be averaged out by a longer evaluation time, as indicated in Fig. 6b. On the other hand, the response bit of the PicoPUF is generated from the race condition of the cross-coupled NAND gates and the simultaneous switching of the two DFFs. The response is predominantly influenced by the intra- and inter-slice delay differences of the active elements rather than by the wire delay. For this reason, the PicoPUF can achieve better min-entropy and uniqueness results through route balancing than the RO PUF constructed from single-slice ROs.
Other weak PUFs
A comparison of the resource usage and metrics of the PicoPUF, the RO PUF and previous PUF implementations is shown in Table 4. The SRAM PUF cell, proposed by Guajardo et al. [31], only generates a response upon resetting the memory array. The Latch PUF proposed by Su et al. [32] dissipates low power, but results are only reported for an ASIC implementation. The Flip-flop PUF proposed by Maes et al. [34] is similar to the SRAM PUF in that it uses the power-up reset of flip-flops; however, it has limited entropy and requires post-processing to boost the randomness. The Butterfly PUF proposed by Kumar et al. [35] is suitable for FPGA implementation since it can be built from basic logic gates. It is reported to have 94% reliability over temperature variations, but its reliability over voltage changes is not evaluated. It consumes 130 slices of a Virtex-5 FPGA device for a 64-bit response. The RO PUF proposed by Suh et al. [5] has been implemented on different FPGAs, e.g. Virtex-4 and Spartan-3, and its hardware resource consumption is at least 384 slices for a 64-bit response. To avoid the interdependent response bits of Suh’s RO PUF [5], two independent single-slice ROs of Fig. 1a are used to generate one response bit. For a 128-bit response, this requires \(2 \times 128=256\) slices. Additionally, counters and a comparator are required to compare the number of positive edges of the toggle flip-flops of the two ROs. The length of each counter depends on the evaluation time, which has an impact on the min-entropy of the RO frequencies, as evaluated in Fig. 6b. The PicoPUF design [3] is a lightweight FPGA-based weak PUF compared to these weak PUF designs. In [30], the reliability of the PicoPUF design [3] has been enhanced to almost 100% by a post-characterization process at the expense of a slight degradation in uniqueness. This post-characterized version of the PicoPUF is denoted PicoPUF* in Table 4.