Automatic Security Classification with Lasso

  • Paal E. Engelstad (email author)
  • Hugo Hammer
  • Kyrre Wahl Kongsgård
  • Anis Yazidi
  • Nils Agne Nordbotten
  • Aleksander Bai
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9503)


As the volume of generated information grows, also within security domains, so does the need for tools that can assist with automatic security classification. The state of the art today is the use of simple classification lists (“dirty word lists”) for reactive content checking. In the future, however, we expect both proactive tools for security classification (assisting humans while they create the information object) and reactive tools (i.e., double-checking the content in a guard). This paper demonstrates the use of machine learning with Lasso (Least Absolute Shrinkage and Selection Operator) [1, 2] for both two-class (binary) and multi-class security classification. We also explore the ability of Lasso to create sparse solutions that are easy for humans to analyze and interpret, in contrast to many other machine learning techniques, which lack this explanatory nature.
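To illustrate why sparse solutions are easy to interpret, the following is a minimal sketch of an L1-penalized (Lasso-style) binary classifier, not the implementation used in the paper. It assumes a logistic loss fitted by proximal gradient descent with soft-thresholding, whereas reference [2] describes coordinate descent. The four toy documents, their labels, the function names, and the values of `lam`, `lr`, and `n_iter` are all hypothetical choices made for illustration.

```python
import math

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def fit_l1_logistic(X, y, lam, lr=0.5, n_iter=500):
    """Proximal gradient descent on the mean logistic loss plus lam * ||w||_1.
    The intercept b is left unpenalized (an assumption of this sketch)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        # Residuals p_i - y_i under the current model.
        residual = [
            sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - yi
            for x, yi in zip(X, y)
        ]
        grad_w = [sum(r * x[j] for r, x in zip(residual, X)) / n for j in range(d)]
        grad_b = sum(residual) / n
        # Gradient step followed by soft-thresholding (the L1 proximal step).
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Hypothetical toy corpus: label 1 = classified, label 0 = unclassified.
docs = [
    ("secret missile plan", 1),
    ("secret launch plan", 1),
    ("public lunch plan", 0),
    ("public meeting plan", 0),
]
vocab = sorted({tok for text, _ in docs for tok in text.split()})
# Binary bag-of-words features: 1.0 if the word occurs in the document.
X = [[1.0 if tok in text.split() else 0.0 for tok in vocab] for text, _ in docs]
y = [label for _, label in docs]

w, b = fit_l1_logistic(X, y, lam=0.2)
# Words whose weight survived the penalty; all others are exactly zero.
selected = {tok: wj for tok, wj in zip(vocab, w) if wj != 0.0}
predictions = [
    1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) > 0.5 else 0
    for x in X
]
print(selected)
print(predictions)
```

On this toy corpus only "secret" and "public" keep nonzero weights, so the fitted model can be read much like a short, learned classification list. A larger `lam` removes more words, and a sufficiently large one removes them all.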


Keywords: Classification list; Machine learning; Feature selection; Multiclass; Guard; Multi-layer security; Cross-domain information exchange



This work was partially funded by the University Graduate Center (UNIK).


References

  1. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. Royal. Statist. Soc B. 58(1), 267–288 (1996)
  2. Friedman, J., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33(1), 1–22 (2010)
  3. Nicolls, W.: Implementing company classification policy with the S/MIME security label. RFC 3114, IETF, May 2002
  4. UCDMO: UCDMO cross domain baseline list (2011). Accessed 26 March 2015
  5. Brown, J.D., Charlebois, D.: Security classification using automated learning (scale), DRDC Ottawa CR, Technical Report (2010)
  6. Entezari-Maleki, C., Rezaei, A., Minaei-Bidgoli, B.: Comparison of classification methods based on the type of attributes and sample size. J. Convergence Inf. Technol. 4(3), 94–102 (2009)
  7. Kotsiantis, S.B.: Supervised machine learning: A review of classification techniques. Informatica 31, 249–268 (2007)
  8. Mathkour, H., Touir, A., Al-Sanie, W.: Automatic information classifier using rhetorical structure theory. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds.) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol. 31, pp. 229–236. Springer, Heidelberg (2005)
  9. Clark, K.: Automated security classification. Master’s thesis, Vrije Universiteit (2008)
  10. Digital National Security Archive. Accessed 26 March 2015
  11. ABBYY. Accessed 26 March 2015
  12. Baeza-Yates, R., Ribeiro-Neto, B., et al.: Modern Information Retrieval, vol. 463. ACM Press, New York (1999)

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Paal E. Engelstad (1, 2), email author
  • Hugo Hammer (2)
  • Kyrre Wahl Kongsgård (1)
  • Anis Yazidi (2)
  • Nils Agne Nordbotten (1)
  • Aleksander Bai (2)

  1. Norwegian Defense Research Establishment (FFI), Kjeller, Norway
  2. Oslo and Akershus University College of Applied Sciences (HiOA), Oslo, Norway
