Abstract
Data have become an important asset for many organizations, companies, and individuals, and the security of the relational databases that hold these data has therefore become a major concern. Standard database security mechanisms, as well as network-based and host-based intrusion detection systems, have proven ineffective at detecting malicious attacks directed specifically at databases. There is thus a pressing need to develop an intrusion detection system (IDS) specifically for the database. In this paper, we propose the use of the random forest (RF) algorithm as the core anomaly detection mechanism, in conjunction with principal component analysis (PCA) for dimension reduction. Experiments show that PCA produces a very compact, meaningful set of features, while RF, a graphical method that is likely to exploit the inherent tree structure of SQL queries, performs consistently well in terms of false positive rate, false negative rate, and time complexity, even as the number of features varies.
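As a rough illustration of the pipeline described above (PCA for dimension reduction followed by a random forest, evaluated by false positive and false negative rates), the following minimal sketch uses scikit-learn on synthetic data. It is not the authors' implementation: the feature vectors, the binary normal/anomalous labels, the number of retained components, and the forest size are all assumptions made for the example, and the abstract does not specify how features are extracted from SQL queries.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: each row stands in for a numeric feature vector
# extracted from one SQL query. These are NOT the paper's features.
n_normal, n_anomalous, n_features = 400, 100, 20
X_normal = rng.normal(loc=0.0, scale=1.0, size=(n_normal, n_features))
X_anomalous = rng.normal(loc=1.5, scale=1.0, size=(n_anomalous, n_features))
X = np.vstack([X_normal, X_anomalous])
# Assumed labelling: 0 = normal access, 1 = anomalous access.
y = np.concatenate([np.zeros(n_normal, dtype=int),
                    np.ones(n_anomalous, dtype=int)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardize, project onto 5 principal components (an arbitrary choice),
# then classify with a random forest of 100 trees (also arbitrary).
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Confusion matrix entries with "anomalous" as the positive class.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
fpr = fp / (fp + tn)  # normal queries wrongly flagged as anomalous
fnr = fn / (fn + tp)  # anomalous queries that went undetected
n_reduced = model.named_steps["pca"].n_components_

print(f"features after PCA: {n_reduced}")
print(f"false positive rate: {fpr:.3f}")
print(f"false negative rate: {fnr:.3f}")
```

Varying `n_components` in this sketch is one simple way to observe how the two error rates respond to the number of retained features, which is the kind of comparison the abstract reports.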
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ronao, C.A., Cho, SB. (2015). Mining SQL Queries to Detect Anomalous Database Access using Random Forest and PCA. In: Ali, M., Kwon, Y., Lee, CH., Kim, J., Kim, Y. (eds) Current Approaches in Applied Artificial Intelligence. IEA/AIE 2015. Lecture Notes in Computer Science, vol 9101. Springer, Cham. https://doi.org/10.1007/978-3-319-19066-2_15
DOI: https://doi.org/10.1007/978-3-319-19066-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19065-5
Online ISBN: 978-3-319-19066-2
eBook Packages: Computer Science (R0)