Evaluating the Quantity of Incident-Related Information in an Open Cyber Security Dataset

  • Benjamin Aziz
  • John Arthur Lee
  • Gulsum Akkuzu
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 373)


Data-driven security has become essential for many organisations attempting to tackle cyber security incidents. Whilst the dominant approach to data-driven security remains the mining of private and internal data, there is an increasing trend towards more open data through the sharing of cyber security information and experience over public and community platforms. However, questions remain over the quality and quantity of such open data. In this paper, we present the results of a recent case study that considers how feasible it is to answer a common question in cyber security incident investigations, namely “in an incident, who did what to which asset or victim, and with what result and impact”, for one such open cyber security database.


Cyber security incidents · Open datasets · Quantity of information



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Benjamin Aziz (1, corresponding author)
  • John Arthur Lee (1)
  • Gulsum Akkuzu (1)
  1. School of Computing, University of Portsmouth, Portsmouth, UK
