Abstract
The identification of a cryptosystem has been a challenge for decades. The main objective of this paper is to identify the type of cryptosystem used to encrypt a given text. We explore machine learning to recognize patterns in classical ciphertexts, whose corresponding plaintexts generally have a simple representation. We model the objective as a sequence-to-sequence learning task and address it using Convolutional Neural Networks (CNNs) and state-of-the-art Transformer models. With only a small dataset of roughly 130,000 samples, each pairing a ciphertext with the cryptosystem used to produce it, our model achieves an accuracy of 96.72%, indicating a significantly steeper learning curve than other sequence-to-sequence models. We also show the potential of these models and how they could perform even better if the barriers of resources and computation time were lifted.
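The paper's approach learns cipher-type features with neural models, but the underlying idea — that different classical cryptosystems leave different statistical fingerprints in ciphertext — can be illustrated with a much simpler hand-crafted baseline. The sketch below (illustrative only; the function names, the two toy ciphers, and the 0.055 threshold are our assumptions, not the paper's method) distinguishes monoalphabetic from polyalphabetic ciphers using the index of coincidence:

```python
import string

def caesar(plaintext, shift=3):
    """Monoalphabetic shift cipher: every letter moves by the same offset."""
    return "".join(chr((ord(c) - 65 + shift) % 26 + 65) for c in plaintext)

def vigenere(plaintext, key="LEMON"):
    """Polyalphabetic cipher: the shift cycles through the key letters."""
    return "".join(
        chr((ord(c) - 65 + ord(key[i % len(key)]) - 65) % 26 + 65)
        for i, c in enumerate(plaintext)
    )

def index_of_coincidence(ciphertext):
    """Probability that two randomly drawn letters of the text are equal."""
    n = len(ciphertext)
    counts = [ciphertext.count(c) for c in string.ascii_uppercase]
    return sum(f * (f - 1) for f in counts) / (n * (n - 1))

def classify(ciphertext, threshold=0.055):
    # Monoalphabetic ciphers permute but preserve the plaintext's letter
    # frequencies (English IC is roughly 0.066); polyalphabetic ciphers
    # flatten them toward the uniform value 1/26 (about 0.038).
    ic = index_of_coincidence(ciphertext)
    return "monoalphabetic" if ic > threshold else "polyalphabetic"
```

For example, `vigenere("ATTACKATDAWN")` yields the textbook ciphertext `"LXFOPVEFRNHR"`, and for a sufficiently long English plaintext the index of coincidence separates `caesar` output from `vigenere` output. A learned model such as the paper's CNN/Transformer subsumes this kind of feature automatically, without hand-picking statistics or thresholds.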
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Mukherjee, A. et al. (2023). Detection of Cipher Types Using Machine Learning Techniques. In: Das, A.K., Nayak, J., Naik, B., Vimal, S., Pelusi, D. (eds) Computational Intelligence in Pattern Recognition. CIPR 2022. Lecture Notes in Networks and Systems, vol 725. Springer, Singapore. https://doi.org/10.1007/978-981-99-3734-9_25
DOI: https://doi.org/10.1007/978-981-99-3734-9_25
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-3733-2
Online ISBN: 978-981-99-3734-9
eBook Packages: Intelligent Technologies and Robotics (R0)