Abstract
With the rise of traffic over wide networks, particularly the internet, and of cloud-based transactions and interactions, database security has become important for any organisation. Detecting and preventing both unauthorised external attacks and the abuse of privileges by insiders is an integral part of database security. To that end, we propose Outlier based Intrusion Detection in Databases for User Behaviour Analysis using Weighted Sequential Pattern Mining (BWSPM), a novel method that detects malicious transactions through a sequential flow: outlier detection, followed by behavioural checks in the role-based rule mining component, and finally a user-level behavioural check. In the worst case, a transaction goes through a three-fold security validation that directs the model from generalisation to specification. The Outlier Detection module generates clusters based on the syntactic characteristics of transactions and detects transactions that do not adhere to their closest cluster. Role-level analysis mines rules that capture the dynamic usage of attributes local to each role domain and verifies transactions against these rules. Finally, user behaviour profiling models each user's behaviour from past transactions and flags an incoming transaction that diverges from that profile. Security checks are made at every level so that further analysis of a transaction can be avoided, which reduces the false positive rate and achieves a higher degree of optimisation. Experiments on a dataset generated using the TPC-C (Transaction Processing Performance Council) benchmark gave encouraging results, with an accuracy of around 86.4%.
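The three-stage flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name, threshold, rule format and distance measure below is an assumption made for illustration: Jaccard distance to fixed attribute sets stands in for the clustering step, simple "A must be followed by B" rules stand in for the mined role rules, and a longest-common-subsequence ratio stands in for the user profile. The sketch also assumes that a transaction rejected at one stage is not analysed further, so only the worst case runs all three checks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple


@dataclass(frozen=True)
class Transaction:
    user: str
    role: str
    attributes: Tuple[str, ...]  # attributes touched, in access order


Check = Callable[[Transaction], bool]  # True means the transaction passes


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    # Dynamic-programming longest common subsequence, one row at a time.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def make_outlier_check(clusters: List[Set[str]], max_distance: float) -> Check:
    # Assumed stand-in for clustering: pass if close enough to the nearest cluster.
    def check(txn: Transaction) -> bool:
        attrs = set(txn.attributes)
        return min(jaccard_distance(attrs, c) for c in clusters) <= max_distance
    return check


def make_role_check(rules: Dict[str, List[Tuple[str, str]]]) -> Check:
    # Assumed rule format (a, b): if a is accessed, b must be accessed later.
    def check(txn: Transaction) -> bool:
        attrs = txn.attributes
        for before, after in rules.get(txn.role, []):
            if before in attrs and after not in attrs[attrs.index(before) + 1:]:
                return False
        return True
    return check


def make_user_check(history: Dict[str, List[Tuple[str, ...]]],
                    min_similarity: float) -> Check:
    # Assumed profile: best LCS ratio against the user's past transactions.
    def check(txn: Transaction) -> bool:
        past = history.get(txn.user, [])
        if not past:
            return False  # toy choice: a user without a profile is flagged
        best = max(
            lcs_length(txn.attributes, p) / max(len(txn.attributes), len(p), 1)
            for p in past
        )
        return best >= min_similarity
    return check


def validate(txn: Transaction, stages: List[Tuple[str, Check]]) -> Optional[str]:
    """Return the name of the first stage that rejects txn, or None if all pass."""
    for name, check in stages:
        if not check(txn):
            return name
    return None


# Hypothetical toy data; attribute names only loosely follow TPC-C column naming.
stages = [
    ("outlier", make_outlier_check([{"c_id", "c_balance"}, {"o_id", "o_status"}], 0.5)),
    ("role", make_role_check({"clerk": [("c_id", "c_balance")]})),
    ("user", make_user_check({"alice": [("c_id", "c_balance")]}, 0.6)),
]

if __name__ == "__main__":
    print(validate(Transaction("alice", "clerk", ("c_balance", "c_id")), stages))
```

Ordering the checks from the most general (cluster level) to the most specific (user level) mirrors the generalisation-to-specification flow in the abstract. The returned stage name records which level rejected the transaction.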
Data availability
The datasets generated and/or analysed during the current study are available at http://www.tpc.org/tpcc/default.asp.
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript, and that this research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author information
Contributions
IS: Conceptualization, Methodology, Writing, Validation, Review & Editing. RJ: Conceptualization, Supervision, Validation, Review & Editing
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest and no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
Informed Consent
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Singh, I., Jindal, R. Outlier based intrusion detection in databases for user behaviour analysis using weighted sequential pattern mining. Int. J. Mach. Learn. & Cyber. (2023). https://doi.org/10.1007/s13042-023-02049-4
DOI: https://doi.org/10.1007/s13042-023-02049-4