Abstract
We propose a pipeline for identifying important entities from intelligence reports that constructs a knowledge graph, where nodes correspond to entities of fine-grained types (e.g. traffickers) extracted from the text and edges correspond to extracted relations between entities (e.g. cartel membership). The important entities in intelligence reports then map to central nodes in the knowledge graph. We introduce a novel method that extracts fine-grained entities in a few-shot setting (few labeled examples), given limited resources available to label the frequently changing entity types that intelligence analysts are interested in. It outperforms other state-of-the-art methods. Next, we identify challenges facing previous evaluations of zero-shot (no labeled examples) methods for extracting relations, affecting the step of populating edges. Finally, we explore the utility of the pipeline: given the goal of identifying important entities, we evaluate the impact of relation extraction errors on the identification of central nodes in several real and synthetic networks. The impact of these errors varies significantly by graph topology, suggesting that confidence in measurements based on automatically extracted relations should depend on observed network features.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Cai, E., et al.: From low resource information extraction to identifying influential nodes in knowledge graphs. arXiv preprint arXiv:2401.04915 (2024)
Chen, C.Y., Li, C.T.: ZS-BERT: towards zero-shot relation extraction with attribute representation learning. In: NAACL, pp. 3470–3479 (2021)
Chen, Q., et al.: Enhanced LSTM for natural language inference. In: ACL, pp. 1657–1668 (2017)
Das, S., et al.: CONTaiNER: Few-shot named entity recognition via contrastive learning. In: ACL (2021)
Devlin, J., et al.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: NAACL. Minneapolis, Minnesota (2019)
Ding, N., et al.: Few-NERD: a few-shot named entity recognition dataset. In: ACL-IJCNLP, pp. 3198–3213 (2021)
Gao, T., et al.: FewRel 2.0: towards more challenging few-shot relation classification. In: EMNLP-IJCNLP, pp. 6250–6255 (2019)
Gerdes, L.M., et al.: Assessing the Abu Sayyaf Group’s strategic and learning capacities. Stud. Confl. Terror. 37(3), 267–293 (2014)
Gill, P., et al.: Lethal connections: the determinants of network connections in the Provisional Irish Republican Army, 1970–1998. Int. Interact. 40(1), 52–78 (2014)
Han, X., et al.: FewRel: a large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation. In: EMNLP, pp. 4803–4809 (2018)
Huang, J., et al.: Few-shot named entity recognition: an empirical baseline study. In: EMNLP, pp. 10408–10423 (2021)
Isella, L., et al.: What’s in a crowd? Analysis of face-to-face behavioral networks. J. Theor. Biol. 271(1), 166–180 (2011)
Jo, H., et al.: Vulcan: Automatic extraction and analysis of cyber threat intelligence from unstructured text. Comput. Secur. 120 (2022)
Leitner, E., et al.: Fine-grained named entity recognition in legal documents. In: SEMANTiCS, pp. 272–287 (2019)
Li, J., et al.: Few-shot named entity recognition via meta-learning. IEEE Trans. Knowl. Data Eng. 34(9), 4245–4256 (2020)
Liu, C., Yang, S.: Using text mining to establish knowledge graph from accident/incident reports in risk assessment. Expert Syst. Appl. 207, 117991 (2022)
Liu, M., et al.: LTP: a new active learning strategy for CRF-based named entity recognition. Neural Process. Lett. 54(3), 2433–2454 (2022)
Lothritz, C., et al.: Evaluating pretrained transformer-based models on the task of fine-grained named entity recognition. In: COLING, pp. 3750–3760 (2020)
Lyu, Q., et al.: Zero-shot event extraction via transfer learning: challenges and insights. In: ACL-IJCNLP, pp. 322–332 (2021)
Manning, C.D., et al.: The Stanford CoreNLP natural language processing toolkit. In: ACL, pp. 55–60 (2014)
Mayhew, S., et al.: Named entity recognition with partially annotated training data. In: CoNLL (2019)
Najafi, S., Fyshe, A.: Weakly-supervised questions for zero-shot relation extraction. In: EACL, pp. 3075–3087 (2023)
Newman, M.E.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74(3), 036104 (2006)
Radmard, P., et al.: Subsequence based deep active learning for named entity recognition. In: ACL-IJCNLP, pp. 4310–4321 (2021)
Ren, Y., et al.: CSKG4APT: a cybersecurity knowledge graph for advanced persistent threat organization attribution. IEEE Trans. Knowl. Data Eng. (2022)
Rocktäschel, T., et al.: Reasoning about entailment with neural attention. In: ICLR (2016)
Siddhant, A., Lipton, Z.C.: Deep Bayesian active learning for natural language processing: results of a large-scale empirical study. In: EMNLP, pp. 2904–2909 (2018)
Simek, O., et al.: XLab: early indications and warnings from open source data with application to biological threat. HICSS (2018)
Touvron, H., et al.: LLaMA: open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023)
Tran, V.H., et al.: Improving discriminative learning for zero-shot relation extraction. In: SpaNLP, pp. 1–6 (2022)
Wang, Q., Li, C.: Evaluating risk propagation in renewable energy incidents using ontology-based bayesian networks extracted from news reports. Int. J. Green Energy 19(12), 1290–1305 (2022)
Williams, A., et al.: A broad-coverage challenge corpus for sentence understanding through inference. In: NAACL, pp. 1112–1122 (2018)
Xue, M., et al.: Coarse-to-fine pre-training for named entity recognition. In: EMNLP (2020)
Zhou, B., et al.: MTAAL: multi-task adversarial active learning for medical named entity recognition and normalization. In: AAAI, vol. 35, pp. 14586–14593 (2021)
Acknowledgements
This material is based upon work supported by the Department of Defense under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Department of Defense.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cai, E., Simek, O., Miller, B.A., Sullivan, D., Young, E., Smith, C.L. (2024). From Low Resource Information Extraction to Identifying Influential Nodes in Knowledge Graphs. In: Botta, F., Macedo, M., Barbosa, H., Menezes, R. (eds) Complex Networks XV. CompleNet-Live 2024. Springer Proceedings in Complexity. Springer, Cham. https://doi.org/10.1007/978-3-031-57515-0_2
Download citation
DOI: https://doi.org/10.1007/978-3-031-57515-0_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-57514-3
Online ISBN: 978-3-031-57515-0
eBook Packages: Mathematics and StatisticsMathematics and Statistics (R0)