Abstract
Threat intelligence is vital to implementing cyber security. The automated extraction of relations from open-source threat intelligence can greatly reduce the workload of security analysts. However, automated extraction is hindered by the shortage of labeled training datasets, the low accuracy and recall of existing automated models, and the limited types of relations that can be extracted.
This chapter presents a novel relation extraction framework that employs distant supervision for data annotation and a neural network model for relation extraction. The framework is evaluated by comparing it with several state-of-the-art neural network models. The experimental results demonstrate that it effectively alleviates the data annotation challenges and outperforms the state-of-the-art neural network models.
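As a rough illustration of the distant supervision idea mentioned above, the following minimal sketch labels any sentence that mentions both entities of a known pair with that pair's relation. It is not the chapter's implementation: the seed knowledge base `SEED_KB`, the function `distant_label`, the relation name `uses`, and the example sentences are all hypothetical and were chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical seed knowledge base: (head entity, tail entity) -> relation label.
# These entries are illustrative assumptions, not data from the chapter.
SEED_KB: Dict[Tuple[str, str], str] = {
    ("WannaCry", "EternalBlue"): "uses",
    ("APT28", "X-Agent"): "uses",
}


def distant_label(
    sentences: List[str], kb: Dict[Tuple[str, str], str]
) -> List[Tuple[str, str, str, str]]:
    """Return (sentence, head, tail, relation) for each sentence that
    mentions both entities of a pair found in the knowledge base."""
    labeled = []
    for sentence in sentences:
        lowered = sentence.lower()
        for (head, tail), relation in kb.items():
            # Simple case-insensitive substring match stands in for
            # real entity recognition, which is assumed away here.
            if head.lower() in lowered and tail.lower() in lowered:
                labeled.append((sentence, head, tail, relation))
    return labeled


corpus = [
    "WannaCry spread rapidly by exploiting EternalBlue on unpatched hosts.",
    "Researchers compared WannaCry with earlier ransomware families.",
    "EternalBlue was patched before WannaCry appeared.",
]

labeled = distant_label(corpus, SEED_KB)
for sentence, head, tail, relation in labeled:
    print(f"{relation}({head}, {tail}) <- {sentence}")
```

The third sentence receives the `uses` label even though it does not actually express that relation. This kind of label noise is inherent to distant supervision and is one reason such automatically annotated data is usually paired with a model that can tolerate noisy instances.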
Copyright information
© 2021 IFIP International Federation for Information Processing
About this paper
Cite this paper
Luo, Y., Ao, S., Luo, N., Su, C., Yang, P., Jiang, Z. (2021). EXTRACTING THREAT INTELLIGENCE RELATIONS USING DISTANT SUPERVISION AND NEURAL NETWORKS. In: Peterson, G., Shenoi, S. (eds) Advances in Digital Forensics XVII. DigitalForensics 2021. IFIP Advances in Information and Communication Technology, vol 612. Springer, Cham. https://doi.org/10.1007/978-3-030-88381-2_10
DOI: https://doi.org/10.1007/978-3-030-88381-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88380-5
Online ISBN: 978-3-030-88381-2
eBook Packages: Computer Science, Computer Science (R0)