EXTRACTING THREAT INTELLIGENCE RELATIONS USING DISTANT SUPERVISION AND NEURAL NETWORKS

Luo, Yali; Ao, Shengqin; Luo, Ning; Su, Changxin; Yang, Peian; Jiang, Zhengwei

doi:10.1007/978-3-030-88381-2_10

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 612)

Included in the following conference series: IFIP International Conference on Digital Forensics

Abstract

Threat intelligence is vital to implementing cyber security. The automated extraction of relations from open-source threat intelligence can greatly reduce the workload of security analysts. However, such automation is hindered by the shortage of labeled training datasets, the low accuracy and recall rates of automated models, and the limited types of relations that can be extracted.

This chapter presents a novel relation extraction framework that employs distant supervision for data annotation and a neural network model for relation extraction. The framework is evaluated by comparing it with several state-of-the-art neural network models. The experimental results demonstrate that it effectively alleviates the data annotation challenges and outperforms the state-of-the-art neural network models.
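
The distant supervision idea mentioned above can be illustrated with a minimal sketch. This is not the chapter's implementation, which is not shown on this page; it only demonstrates the generic heuristic in which a sentence mentioning both entities of a known triple is labeled with that triple's relation. The `Triple` type, the `distant_label` function, the seed knowledge base, and all entity and relation names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A (head, relation, tail) fact from a seed knowledge base."""
    head: str
    relation: str
    tail: str


# Hypothetical seed knowledge base; these entries are illustrative only.
KB = [
    Triple("ExampleGroup", "uses", "ExampleMalware"),
    Triple("ExampleMalware", "targets", "ExampleOS"),
]


def distant_label(sentences, kb):
    """Label a sentence with a KB relation when both of its entities appear.

    Matching is a naive case-insensitive substring test (an assumption made
    to keep the sketch short); a real pipeline would use entity recognition.
    """
    labeled = []
    for sent in sentences:
        lowered = sent.lower()
        for t in kb:
            if t.head.lower() in lowered and t.tail.lower() in lowered:
                labeled.append((sent, t.head, t.tail, t.relation))
    return labeled


sentences = [
    "ExampleGroup deployed ExampleMalware against several banks.",
    "ExampleMalware runs only on ExampleOS hosts.",
    "ExampleGroup was first reported in 2015.",
]

labeled = distant_label(sentences, KB)
for sent, head, tail, relation in labeled:
    print(f"{relation}({head}, {tail}) <- {sent}")
```

Labels produced this way are cheap but can be noisy, because a sentence may mention both entities without expressing the relation. A downstream neural model therefore has to tolerate some mislabeled training instances.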

Author information

Authors and Affiliations

Cyber Security at the Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Yali Luo, Shengqin Ao, Ning Luo & Peian Yang
Computer Science at the Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Changxin Su
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Zhengwei Jiang

Corresponding author

Correspondence to Yali Luo.

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH, USA
Gilbert Peterson
Tandy School of Computer Science, University of Tulsa, Tulsa, OK, USA
Sujeet Shenoi

About this paper

Cite this paper

Luo, Y., Ao, S., Luo, N., Su, C., Yang, P., Jiang, Z. (2021). EXTRACTING THREAT INTELLIGENCE RELATIONS USING DISTANT SUPERVISION AND NEURAL NETWORKS. In: Peterson, G., Shenoi, S. (eds) Advances in Digital Forensics XVII. DigitalForensics 2021. IFIP Advances in Information and Communication Technology, vol 612. Springer, Cham. https://doi.org/10.1007/978-3-030-88381-2_10

DOI: https://doi.org/10.1007/978-3-030-88381-2_10
Published: 15 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88380-5
Online ISBN: 978-3-030-88381-2
eBook Packages: Computer Science, Computer Science (R0)
