Semantic Query Federation for Scalable Security Log Analysis

  • Kabul Kurniawan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11155)


The digitalization of business processes increasingly exposes organizations to sophisticated cyber-security threats. To contain attacks and minimize their impact, it is essential to detect them early. This requires analyzing a wide range of log files that may provide clues about malicious activity. However, these logs are typically voluminous, heterogeneous, hard to interpret, and stored in disparate locations, which complicates their analysis. Current approaches to security log analysis mainly rely on regular expressions and statistical indicators and do not directly provide actionable insights to security analysts. To address these limitations, we propose a distributed approach that enables semantic querying of dispersed log sources in large-scale infrastructures. To automatically integrate and reason about security log information, we will leverage linked data technologies and state-of-the-art federated query processing systems. In this proposal, we discuss the research problem, methodology, approach, and evaluation plan for scalable federated semantic security log analysis.


Keywords: Security log analysis · Semantic query federation · Linked data · Semantic reasoning



This work was supported by the Ministry of Education and Culture, Indonesia. Furthermore, support for project SEPSES by the Austrian Science Fund (FWF) and netidee SCIENCE: P 30437-N31 is gratefully acknowledged. I want to thank my supervisors, Prof. A Min Tjoa, Prof. Gerald Quirchmayr, and Dr. Elmar Kiesling, and my colleagues Fajar Ekaputra, Peb Aryan, and Niina Novak for their helpful discussions, comments, and feedback.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Multimedia and Information System Group, University of Vienna, Vienna, Austria
