Abstract
Machine learning-based side-channel attacks have recently been introduced to recover secret information from protected software and hardware implementations. Limited research exists for public-key algorithms, especially for non-traditional implementations such as those using the Residue Number System (RNS). Template attacks have proven successful against RNS-based Elliptic Curve Cryptography (ECC), but only when the aligned portion of the traces is used for building templates. In this study, we present a systematic methodology for evaluating ECC cryptosystems with and without countermeasures (both RNS-based and traditional ones) against ML-based side-channel attacks, using two attack models on full-length aligned and unaligned leakages. RNS-based ECC datasets are evaluated with four machine learning classifiers and compared against existing state-of-the-art template attacks. Moreover, we analyze the impact of raw features and of advanced hybrid feature-engineering techniques, and we discuss the metrics and procedures that enable accurate classification on imbalanced datasets. The experimental results demonstrate that, for RNS-based ECC datasets of limited size, simple machine learning algorithms are more efficient than complex deep learning techniques. To the best of our knowledge, this is the first study to present a complete methodology for ML side-channel attacks on public-key algorithms.
Notes
- 1.
We note that the unbalanced dataset was not created on purpose; it resulted from the signal processing applied to the traces. Unbalanced datasets reflect real-life situations, and by using SMOTE we show that valid conclusions can still be drawn. Our methodology is therefore validated even for unbalanced datasets.
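To illustrate the oversampling step, here is a minimal NumPy sketch of SMOTE-style interpolation between minority-class samples (a simplified stand-in, not the implementation used in the paper); the toy data, function name and all parameter values are illustrative.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples by interpolating
    between each base sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                  # exclude self-matches
    neigh = np.argsort(d, axis=1)[:, :min(k, n - 1)]
    synth = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        j = rng.integers(n)                      # random base sample
        nb = X_min[rng.choice(neigh[j])]         # one of its neighbours
        synth[i] = X_min[j] + rng.random() * (nb - X_min[j])
    return synth

# toy imbalanced data: 40 samples of class 0, 8 of class 1
rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 1.0, (40, 4))
X1 = rng.normal(3.0, 1.0, (8, 4))
X1_new = smote_oversample(X1, n_new=32, rng=1)
X_bal = np.vstack([X0, X1, X1_new])              # 40 vs. 40 after oversampling
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside its original feature region instead of being duplicated verbatim.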
References
Inspector SCA tool. https://www.riscure.com/security-tools/inspector-sca. Accessed 08 Feb 2022
National Computational Infrastructure Australia. https://nci.org.au/our-services/supercomputing. Accessed 08 Feb 2022
Andrikos, C., et al.: Location, location, location: revisiting modeling and exploitation for location-based side channel leakages. In: Galbraith, S.D., Moriai, S. (eds.) ASIACRYPT 2019, Part III. LNCS, vol. 11923, pp. 285–314. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-34618-8_10
Archambeau, C., Peeters, E., Standaert, F.-X., Quisquater, J.-J.: Template attacks in principal subspaces. In: Goubin, L., Matsui, M. (eds.) CHES 2006. LNCS, vol. 4249, pp. 1–14. Springer, Heidelberg (2006). https://doi.org/10.1007/11894063_1
Bajard, J.-C., Duquesne, S., Meloni, N.: Combining Montgomery ladder for elliptic curves defined over Fp and RNS representation. Research Report 06041 (2006)
Bajard, J.-C., Eynard, J., Gandino, F.: Fault detection in RNS Montgomery modular multiplication. In: IEEE 21st Symposium on Computer Arithmetic, pp. 119–126. IEEE (2013)
Bajard, J.-C., Imbert, L., Liardet, P.-Y., Teglia, Y.: Leak resistant arithmetic. In: Joye, M., Quisquater, J.-J. (eds.) CHES 2004. LNCS, vol. 3156, pp. 62–75. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28632-5_5
Batina, L., Hogenboom, J., van Woudenberg, J.G.J.: Getting more from PCA: first results of using principal component analysis for extensive power analysis. In: Dunkelman, O. (ed.) CT-RSA 2012. LNCS, vol. 7178, pp. 383–397. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-27954-6_24
Belhumeur, P., Hespanha, J., Kriegman, D.: Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 19, 711–720 (1997)
Benadjila, R., Prouff, E., Strullu, R., Cagli, E., Dumas, C.: Deep learning for side-channel analysis and introduction to ASCAD database. J. Cryptogr. Eng. 11, 163–188 (2019)
Blum, A.L., Langley, P.: Selection of relevant features and examples in machine learning. Artif. Intell. 97(1–2), 245–271 (1997)
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001). https://doi.org/10.1023/A:1010933404324
Cagli, E., Dumas, C., Prouff, E.: Convolutional neural networks with data augmentation against jitter-based countermeasures. In: Fischer, W., Homma, N. (eds.) CHES 2017. LNCS, vol. 10529, pp. 45–68. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66787-4_3
Carbone, M., et al.: Deep learning to evaluate secure RSA implementations. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2019(2), 132–161 (2019)
Chari, S., Rao, J.R., Rohatgi, P.: Template attacks. In: Kaliski, B.S., Koç, K., Paar, C. (eds.) CHES 2002. LNCS, vol. 2523, pp. 13–28. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-36400-5_3
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
Chollet, F.: Keras (2015). https://keras.io
Ciet, M., Neve, M., Peeters, E., Quisquater, J.-J.: Parallel FPGA implementation of RSA with residue number systems - can side-channel threats be avoided? Cryptology ePrint Archive, Report 2004/187 (2004)
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
De Mulder, E., Örs, S.B., Preneel, B., Verbauwhede, I.: Differential power and electromagnetic attacks on a FPGA implementation of elliptic curve cryptosystems. Comput. Electr. Eng. 33(5–6), 367–382 (2007)
Durvaux, F., Standaert, F.-X., Veyrat-Charvillon, N.: How to certify the leakage of a chip? In: Nguyen, P.Q., Oswald, E. (eds.) EUROCRYPT 2014. LNCS, vol. 8441, pp. 459–476. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-642-55220-5_26
Fournaris, A.P.: RNS_LRA_EC_Scalar_Multiplier (2018). https://github.com/afournaris/RNS_LRA_EC_Scalar_Multiplier
Fournaris, A.P., Klaoudatos, N., Sklavos, N., Koulamas, C.: Fault and power analysis attack resistant RNS based edwards curve point multiplication. In: Proceedings of the 2nd Workshop on Cryptography and Security in Computing Systems, CS2 at HiPEAC 2015, Amsterdam, Netherlands, 19–21 January 2015, pp. 43–46 (2015)
Fournaris, A.P., Papachristodoulou, L., Batina, L., Sklavos, N.: Residue number system as a side channel and fault injection attack countermeasure in elliptic curve cryptography. In: 2016 International Conference on Design and Technology of Integrated Systems in Nanoscale Era (DTIS), pp. 1–4 (2016)
Fournaris, A.P., Papachristodoulou, L., Sklavos, N.: Secure and efficient RNS software implementation for elliptic curve cryptography. In: 2017 IEEE European Symposium on Security and Privacy Work., pp. 86–93. IEEE (2017)
Standaert, F.-X., Archambeau, C.: Using subspace-based template attacks to compare and combine power and electromagnetic information leakages. In: Oswald, E., Rohatgi, P. (eds.) CHES 2008. LNCS, vol. 5154, pp. 411–425. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85053-3_26
Genkin, D., Shamir, A., Tromer, E.: RSA key extraction via low-bandwidth acoustic cryptanalysis. In: Garay, J.A., Gennaro, R. (eds.) CRYPTO 2014. LNCS, vol. 8616, pp. 444–461. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44371-2_25
Gilmore, R., Hanley, N., O’Neill, M.: Neural network based attack on a masked implementation of AES. In: 2015 IEEE International Symposium on Hardware Oriented Security and Trust (HOST), pp. 106–111. IEEE (2015)
Golder, A., Das, D., Danial, J., Ghosh, S., Sen, S., Raychowdhury, A.: Practical approaches toward deep-learning-based cross-device power side-channel attack. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 27, 2720–2733 (2019)
Goodwill, G., Jun, B., Jaffe, J., Rohatgi, P.: A testing methodology for side channel resistance validation. In: NIST Noninvasive Attack Testing Workshop (2011)
Perin, G., Imbert, L., Torres, L., Maurine, P.: Attacking randomized exponentiations using unsupervised learning. In: Prouff, E. (ed.) COSADE 2014. LNCS, vol. 8622, pp. 144–160. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10175-0_11
Guillermin, N.: A Coprocessor for Secure and High Speed Modular Arithmetic. IACR Cryptology ePrint Archive (2011)
Hospodar, G., Gierlichs, B., De Mulder, E., Verbauwhede, I., Vandewalle, J.: Machine learning in side-channel analysis: a first study. J. Cryptogr. Eng. 1(4), 293 (2011)
James, G., Witten, D., Hastie, T., Tibshirani, R.: An Introduction to Statistical Learning: With Applications in R, August 2013
John, G.H., Kohavi, R., Pfleger, K.: Irrelevant features and the subset selection problem. In: Proceedings of the Eleventh International Conference on International Conference on Machine Learning, ICML 1994, pp. 121–129, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc. (1994)
Jolliffe, I.: Principal Component Analysis, pp. 1094–1096. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-04898-2_455
Joye, M., Yen, S.-M.: The Montgomery powering ladder. In: Kaliski, B.S., Koç, K., Paar, C. (eds.) CHES 2002. LNCS, vol. 2523, pp. 291–302. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-36400-5_22
Kim, J., Picek, S., Heuser, A., Bhasin, S., Hanjalic, A.: Make some noise: unleashing the power of convolutional neural networks for profiled side-channel analysis. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2019(3), 148–179 (2019)
Kocher, P., Jaffe, J., Jun, B.: Differential power analysis. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-48405-1_25
Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artif. Intell. 97(1–2), 273–324 (1997)
Langley, P., Iba, W.: Average-case analysis of a nearest neighbor algorithm. In: Proceedings of the 13th International Joint Conference on Artificial Intelligence - Volume 2, IJCAI 1993, pp. 889–894, San Francisco, CA, USA (1993). Morgan Kaufmann Publishers Inc
LeCun, Y., Haffner, P., Bottou, L., Bengio, Y.: Object recognition with gradient-based learning. In: Shape, Contour and Grouping in Computer Vision. LNCS, vol. 1681, pp. 319–345. Springer, Heidelberg (1999). https://doi.org/10.1007/3-540-46805-6_19
Maghrebi, H., Portigliatti, T., Prouff, E.: Breaking cryptographic implementations using deep learning techniques. IACR Cryptology ePrint Archive 2016, p. 921 (2016)
Markowitch, O., Lerman, L., Bontempi, G.: Side channel attack: an approach based on machine learning. In: Constructive Side-Channel Analysis and Secure Design, COSADE (2011)
Martins, P., Sousa, L.: The role of non-positional arithmetic on efficient emerging cryptographic algorithms. IEEE Access 8, 59533–59549 (2020)
Masure, L., Dumas, C., Prouff, E.: A comprehensive study of deep learning for side-channel analysis. IACR Trans. Cryptogr. Hardw. Embed. Syst. 348–375 (2020)
Mukhtar, N., Mehrabi, A., Kong, Y., Anjum, A.: Machine-learning-based side-channel evaluation of elliptic-curve cryptographic FPGA processor. Appl. Sci. 9, 64 (2018)
Papachristodoulou, L., Fournaris, A.P., Papagiannopoulos, K., Batina, L.: Practical evaluation of protected residue number system scalar multiplication. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2019(1), 259–282 (2018)
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Picek, S., Heuser, A., Jovic, A., Batina, L.: A systematic evaluation of profiling through focused feature selection. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 27, 2802–2815 (2019)
Picek, S., Heuser, A., Jovic, A., Bhasin, S., Regazzoni, F.: The curse of class imbalance and conflicting metrics with machine learning for side-channel evaluations. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2019(1), 209–237 (2019)
Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015)
Svetnik, V., Liaw, A., Tong, C., Christopher Culberson, J., Sheridan, R.P., Feuston, B.P.: Random forest: a classification and regression tool for compound classification and QSAR modeling. J. Chem. Inf. Comput. Sci. 43(6), 1947–1958 (2003)
Swets, D.L., Weng, J.: Using discriminant eigenfeatures for image retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 18(8), 831–836 (1996)
Weissbart, L., Picek, S., Batina, L.: One trace is all it takes: machine learning-based side-channel attack on EdDSA. In: Bhasin, S., Mendelson, A., Nandi, M. (eds.) SPACE 2019. LNCS, vol. 11947, pp. 86–105. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-35869-3_8
Zaid, G., Bossuet, L., Habrard, A., Venelli, A.: Methodology for efficient CNN architectures in profiling attacks. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2020(1), 1–36 (2019)
Zeng, Z., Gu, D., Liu, J., Guo, Z.: An improved side-channel attack based on support vector machine. In: 2014 Tenth International Conference on Computational Intelligence and Security, pp. 676–680, November 2014
Acknowledgments
We would like to thank the COSADE reviewers for their useful comments and feedback. This work is partially supported by the international Macquarie University Research Excellence Scholarship. This research received funding from the Dutch Research Council (NWO) in the framework of the NWA-Cybersecurity Call for the project Physical Attack Resistance of Cryptographic Algorithms and Circuits with Reduced Time to Market (PROACT, project number NWA.1215.18.014). The work has also received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 871738 (CPSoSaware) and under grant agreement No. 830927 (CONCORDIA).
A Appendix
Machine Learning Algorithms
In this paper, four different classifiers are used to create the training model for the leakage information of a device under test. Each classifier is briefly described below, and the parameters important for profiling SCAs are specified.
Support Vector Machine (SVM). Support Vector Machines are among the most popular algorithms for classification problems across application domains, including side-channel analysis [19, 33, 57]. An SVM separates n-dimensional data with a hyperplane, computing and adjusting coefficients to find the maximum-margin hyperplane that best separates the target classes. Real-world data is often too complex to be separated by a linear hyperplane; for such problems, the training instances are transformed into a higher-dimensional space using kernels. Three kernels are widely used: linear, radial and polynomial. Hyperparameters such as the cost ‘C’ and ‘gamma’ play a vital role in tuning the kernels. Parameter ‘C’ acts as a regularization parameter: it controls the cost of misclassification and thereby the margin distance from the hyperplane. Low values of ‘C’ yield a wider margin with higher bias and lower variance, whereas higher values of ‘C’ yield a narrower margin with lower bias and higher variance. Parameter ‘gamma’ controls the spread of the Gaussian kernel; higher values fit the training data more closely but risk an overfitted model. To find optimal values of ‘C’ and ‘gamma’, grid search or other optimization methods are applied.
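The SVM classification described above can be sketched with scikit-learn. The toy two-class data below merely stands in for per-class leakage features, and the ‘C’ and ‘gamma’ values are illustrative defaults, not the tuned values from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# toy stand-in for leakage features: two classes, 8 features each
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 8)),
               rng.normal(1.5, 1.0, (100, 8))])
y = np.array([0] * 100 + [1] * 100)
Xtr, Xte, ytr, yte = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# RBF kernel: C trades margin width against misclassification cost,
# gamma controls the spread of the Gaussian kernel
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(Xtr, ytr)
acc = clf.score(Xte, yte)
```

In a real profiling attack, `C` and `gamma` would be chosen by grid search rather than fixed as here.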
Random Forest (RF). In a Random Forest, predictions are formed by aggregating a collection of decision trees [12]: the results of the individual trees are combined to predict the final class value. RF uses unpruned trees, avoids over-fitting by design, and reduces the variance error through averaging. Efficient modeling with random forests depends heavily on the number of trees in the forest and the depth of each tree; these two parameters are tuned in this study.
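A minimal scikit-learn sketch of the same idea follows; the two tuned parameters mentioned above map to `n_estimators` and `max_depth`, and the toy data and chosen values are illustrative only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# toy stand-in for leakage features: two classes, 8 features each
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 8)),
               rng.normal(1.5, 1.0, (100, 8))])
y = np.array([0] * 100 + [1] * 100)
Xtr, Xte, ytr, yte = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# n_estimators = number of trees, max_depth = depth of each tree;
# predictions aggregate the votes of all trees in the forest
rf = RandomForestClassifier(n_estimators=200, max_depth=8,
                            random_state=0).fit(Xtr, ytr)
acc = rf.score(Xte, yte)
```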
Multi-layer Perceptron (MLP). A Multi-Layer Perceptron is a basic feed-forward artificial neural network that learns through back-propagation and consists of three types of layers: an input layer, hidden layers, and an output layer [52]. The input layer connects to the input feature variables, and the output layer returns the predicted class value. A non-linear activation function is used to learn patterns from non-linear data. Due to the non-linear nature of side-channel leakages, MLP appears to be a good choice for recovering secret information from the patterns in the signals.
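The three-layer structure can be sketched with scikit-learn’s `MLPClassifier`; the hidden-layer size, activation and toy data below are illustrative assumptions, not the architecture from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# toy stand-in for leakage features: two classes, 8 features each
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 1.0, (100, 8)),
               rng.normal(1.5, 1.0, (100, 8))])
y = np.array([0] * 100 + [1] * 100)
Xtr, Xte, ytr, yte = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# one hidden layer with a non-linear (ReLU) activation,
# trained via back-propagation
mlp = MLPClassifier(hidden_layer_sizes=(32,), activation="relu",
                    max_iter=1000, random_state=0).fit(Xtr, ytr)
acc = mlp.score(Xte, yte)
```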
Convolutional Neural Network (CNN). A Convolutional Neural Network is a type of neural network consisting of convolutional, activation, pooling, and flatten layers. A convolutional layer performs convolution on the input features using filters to recognize patterns in the data [42]. The pooling layer is non-linear; its function is to reduce the spatial size and hence the number of parameters. Fully connected layers then combine the features, as in an MLP. Each layer has hyperparameters that can be optimized for an efficiently trained model, including the learning rate, batch size, number of epochs, optimizer, and activation functions. In addition, a few model hyperparameters can be used to design an efficient architecture. It should be noted that the purpose of this study is not to propose a CNN architecture design, but to analyze and test an existing CNN design on the RNS-based ECC dataset. Therefore, the focus is on tuning the optimization hyperparameters rather than the model hyperparameters.
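To make the convolution and pooling operations concrete, the following pure-NumPy sketch applies one convolutional filter, a ReLU activation and max pooling to a short toy “trace” (an illustration of the layer mechanics only, not the CNN architecture evaluated in the paper).

```python
import numpy as np

def conv1d(x, w):
    """Valid 1-D convolution (cross-correlation, as in CNN libraries)."""
    n = len(x) - len(w) + 1
    return np.array([np.dot(x[i:i + len(w)], w) for i in range(n)])

def relu(x):
    """Non-linear activation: keep positive responses only."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling; truncates a trailing remainder."""
    n = (len(x) // size) * size
    return x[:n].reshape(-1, size).max(axis=1)

# a short toy trace and a difference filter that responds to falling edges
trace = np.array([0., 0., 1., 1., 0., 0., 1., 1.])
kernel = np.array([1., -1.])
feat = max_pool(relu(conv1d(trace, kernel)))   # -> array([0., 1., 0.])
```

Pooling halves the feature length here, which is exactly the parameter reduction mentioned above.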
Hyperparameter Tuning for SVM, RF, MLP and CNN
For the training phase, we tuned the hyperparameters of SVM, RF, MLP and CNN using grid search to obtain the best possible model, as shown in Table 2.
The best parameters chosen for each classification method are shown in Table 3.
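The grid-search step can be sketched as follows with scikit-learn’s `GridSearchCV`, shown here for the SVM; the parameter grid and toy data are illustrative, not the grid from Table 2.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# toy stand-in for leakage features: two classes, 6 features each
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (60, 6)),
               rng.normal(2.0, 1.0, (60, 6))])
y = np.array([0] * 60 + [1] * 60)

# exhaustive search over a small C/gamma grid with 5-fold cross-validation
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5).fit(X, y)
best = search.best_params_      # the combination with the best CV score
```

The same pattern applies to RF and MLP by swapping in their estimators and parameter grids (e.g. `n_estimators`/`max_depth`, `hidden_layer_sizes`).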
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Mukhtar, N., Papachristodoulou, L., Fournaris, A.P., Batina, L., Kong, Y. (2022). Machine-Learning Assisted Side-Channel Attacks on RNS ECC Implementations Using Hybrid Feature Engineering. In: Balasch, J., O’Flynn, C. (eds) Constructive Side-Channel Analysis and Secure Design. COSADE 2022. Lecture Notes in Computer Science, vol 13211. Springer, Cham. https://doi.org/10.1007/978-3-030-99766-3_1
DOI: https://doi.org/10.1007/978-3-030-99766-3_1
Print ISBN: 978-3-030-99765-6
Online ISBN: 978-3-030-99766-3