Bagging-RandomMiner: a one-class classifier for file access-based masquerade detection

  • José Benito Camiña
  • Miguel Angel Medina-Pérez
  • Raúl Monroy
  • Octavio Loyola-González
  • Luis Angel Pereyra Villanueva
  • Luis Carlos González Gurrola
Special Issue Paper


Dependence on personal computers has required the development of security mechanisms to protect the information stored in these devices. There have been different approaches to profile user behavior to protect information from a masquerade attack; one such recent approach is based on user file-access patterns. In this paper, we propose a novel classification ensemble for file access-based masquerade detection. We have successfully validated the hypothesis that a one-class classification approach to file access-based masquerade detection outperforms a multi-class one. In particular, our proposed one-class classifier significantly outperforms several state-of-the-art multi-class classifiers. Our results indicate that one-class classification attains better classification results, even when unknown attacks arise. Additionally, we introduce three new repositories of datasets for the identification of the three main types of attacks reported in the literature, where each training dataset contains no object belonging to the type of attack to be identified. These repositories can be used for testing future classifiers, simulating attacks carried out in a real scenario.
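The one-class setting described above, where a model is trained only on the legitimate user's objects and anything sufficiently unlike them is flagged, can be pictured with a short sketch. This is a generic bagged nearest-neighbor scorer written for illustration only; it is not the Bagging-RandomMiner algorithm. The class name, its parameters, the Euclidean distance, and the toy feature vectors are all assumptions introduced here.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class BaggedOneClassSketch:
    """Hypothetical bagged one-class scorer (illustrative, not the paper's method)."""

    def __init__(self, n_members=10, sample_fraction=0.5, seed=0):
        self.n_members = n_members
        self.sample_fraction = sample_fraction
        self.rng = random.Random(seed)
        self.members = []

    def fit(self, user_objects):
        # Training sees only the legitimate user's objects: no attack data.
        size = max(1, int(len(user_objects) * self.sample_fraction))
        # Bagging: each member holds a bootstrap sample (drawn with replacement).
        self.members = [
            self.rng.choices(user_objects, k=size) for _ in range(self.n_members)
        ]
        return self

    def score(self, obj):
        # Each member reports the distance to its nearest sampled object;
        # the ensemble averages them. Larger means less like the user.
        dists = [
            min(euclidean(obj, ref) for ref in member) for member in self.members
        ]
        return sum(dists) / len(dists)


# Toy feature vectors standing in for a user's file-access behavior (made up).
user = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1), (0.0, 0.2), (0.2, 0.0)]
model = BaggedOneClassSketch(n_members=5, sample_fraction=1.0, seed=1).fit(user)

normal_score = model.score((0.1, 0.15))  # close to the user's own objects
attack_score = model.score((5.0, 5.0))   # far from anything seen in training
```

Because no attack objects are needed at training time, a scorer of this kind can still respond to attack types it has never seen, which is the property the abstract highlights. Turning the score into a decision would require a threshold, which is left out of this sketch.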


One-class classification; Masquerade detection; Information security; User behavior; File access



  • Area under the receiver operating characteristic curve
  • Critical difference
  • False positive
  • False positive detection rate
  • File system
  • File-system navigation
  • Graphical user interface
  • Human–computer interaction
  • Masquerade detection system
  • Masquerade testing set
  • Masquerade training set
  • Most representative object
  • Personal computer
  • Principal component analysis
  • Receiver operating characteristic
  • Testing set
  • True positive
  • True positive detection rate
  • Training set
  • User testing set
  • User training set
  • Zero-false positives
  • One versus all
  • Fivefold cross-validation



We wish to express our gratitude to the members of the GIEE-ML group at Tecnológico de Monterrey for providing useful suggestions and advice on earlier versions of this paper.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • José Benito Camiña (1)
  • Miguel Angel Medina-Pérez (1, corresponding author)
  • Raúl Monroy (1)
  • Octavio Loyola-González (1)
  • Luis Angel Pereyra Villanueva (2)
  • Luis Carlos González Gurrola (2)
  1. School of Science and Engineering, Tecnologico de Monterrey, Atizapán, Mexico
  2. Facultad de Ingeniería, Universidad Autónoma de Chihuahua, Chihuahua, Mexico
