From Low Resource Information Extraction to Identifying Influential Nodes in Knowledge Graphs

  • Conference paper
Complex Networks XV (CompleNet-Live 2024)

Abstract

We propose a pipeline for identifying important entities in intelligence reports. The pipeline constructs a knowledge graph in which nodes correspond to entities of fine-grained types (e.g., traffickers) extracted from the text and edges correspond to extracted relations between entities (e.g., cartel membership). The important entities in intelligence reports then map to central nodes in the knowledge graph. We introduce a novel method that extracts fine-grained entities in a few-shot setting (few labeled examples), motivated by the limited resources available to label the frequently changing entity types that intelligence analysts are interested in. This method outperforms other state-of-the-art methods. Next, we identify challenges facing previous evaluations of zero-shot (no labeled examples) methods for extracting relations, which affect the step of populating edges. Finally, we explore the utility of the pipeline: given the goal of identifying important entities, we evaluate the impact of relation extraction errors on the identification of central nodes in several real and synthetic networks. The impact of these errors varies significantly by graph topology, suggesting that confidence in measurements based on automatically extracted relations should depend on observed network features.
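
The last step of the abstract, measuring how relation extraction errors affect the identification of central nodes, can be illustrated with a small sketch. This is not the authors' implementation: the triples, the choice of degree centrality, the error model (each relation is missed independently with a fixed probability), and all function names are assumptions made for illustration only.

```python
import random
from collections import defaultdict

# Hypothetical extracted triples: (head entity, relation, tail entity).
TRIPLES = [
    ("A", "member_of", "Cartel"),
    ("B", "member_of", "Cartel"),
    ("C", "member_of", "Cartel"),
    ("A", "contacts", "D"),
    ("D", "contacts", "E"),
]

def build_graph(triples):
    """Build undirected adjacency sets; relation labels are ignored here."""
    adj = defaultdict(set)
    for head, _rel, tail in triples:
        adj[head].add(tail)
        adj[tail].add(head)
    return adj

def top_k_by_degree(adj, k):
    """Return the k highest-degree nodes; ties are broken by node name."""
    ranked = sorted(adj, key=lambda n: (-len(adj[n]), n))
    return ranked[:k]

def drop_edges(triples, miss_rate, rng):
    """Assumed error model: each triple is missed with probability miss_rate."""
    return [t for t in triples if rng.random() >= miss_rate]

def top_k_overlap(triples, k, miss_rate, trials=200, seed=0):
    """Average fraction of the true top-k nodes recovered from noisy graphs."""
    rng = random.Random(seed)
    truth = set(top_k_by_degree(build_graph(triples), k))
    total = 0.0
    for _ in range(trials):
        noisy = build_graph(drop_edges(triples, miss_rate, rng))
        found = set(top_k_by_degree(noisy, k))
        total += len(truth & found) / k
    return total / trials

if __name__ == "__main__":
    for rate in (0.0, 0.2, 0.5):
        print(rate, round(top_k_overlap(TRIPLES, k=2, miss_rate=rate), 3))
```

Running the same comparison on graphs with different topologies (for example, a star versus a ring) is one simple way to see why sensitivity to extraction errors might depend on network structure, as the abstract reports.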

Acknowledgements

This material is based upon work supported by the Department of Defense under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Department of Defense.

Author information

Corresponding author

Correspondence to Erica Cai.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Cai, E., Simek, O., Miller, B.A., Sullivan, D., Young, E., Smith, C.L. (2024). From Low Resource Information Extraction to Identifying Influential Nodes in Knowledge Graphs. In: Botta, F., Macedo, M., Barbosa, H., Menezes, R. (eds) Complex Networks XV. CompleNet-Live 2024. Springer Proceedings in Complexity. Springer, Cham. https://doi.org/10.1007/978-3-031-57515-0_2
