Visualizing Multi-document Semantics via Open Domain Information Extraction

  • Yongpan Sheng
  • Zenglin Xu
  • Yafang Wang
  • Xiangyu Zhang
  • Jia Jia
  • Zhonghui You
  • Gerard de Melo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11053)

Abstract

Faced with the overwhelming amounts of data in the 24/7 stream of new articles appearing online, it is often helpful to consider only the key entities and concepts and their relationships. This is challenging, as relevant connections may be spread across a number of disparate articles and sources. In this paper, we present a system that extracts salient entities, concepts, and their relationships from a set of related documents, discovers connections within and across them, and presents the resulting information in a graph-based visualization. We rely on a series of natural language processing methods, including open-domain information extraction, a special filtering method to maintain only meaningful relationships, and a heuristic to form graphs with a high coverage rate of topic entities and concepts. Our graph visualization then allows users to explore these connections. In our experiments, we rely on a large collection of news crawled from the Web and show how connections within this data can be explored. Code related to this paper is available at: https://shengyp.github.io/vmse.
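The pipeline outlined in the abstract (extract relation triples, keep only the meaningful ones, then form a graph that covers many topic entities) can be illustrated with a small sketch. This is not the authors' implementation: the confidence threshold, the function names `filter_triples`, `build_graph`, and `best_covering_component`, and the toy triples are all assumptions made for illustration. The paper's filtering method and coverage heuristic are replaced here by a simple threshold and a connected-component search.

```python
from collections import defaultdict

def filter_triples(triples, min_conf=0.5):
    """Keep (subject, relation, object) triples above an assumed confidence
    threshold, dropping self-referential ones. A stand-in for the paper's filter."""
    return [(s, r, o) for s, r, o, conf in triples
            if conf >= min_conf and s and o and s != o]

def build_graph(triples):
    """Build an undirected adjacency map: node -> {neighbour: relation label}."""
    graph = defaultdict(dict)
    for s, r, o in triples:
        graph[s][o] = r
        graph[o][s] = r
    return graph

def best_covering_component(graph, topic_entities):
    """Return the connected component that contains the most topic entities."""
    seen, best, best_cover = set(), set(), -1
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # iterative depth-first search
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node].keys() - component)
        seen |= component
        cover = len(component & topic_entities)
        if cover > best_cover:
            best, best_cover = component, cover
    return best

# Hypothetical triples with extractor confidence scores (toy data).
raw = [
    ("Company A", "acquired", "Company B", 0.9),
    ("Company B", "is based in", "City X", 0.8),
    ("Person P", "visited", "City Y", 0.7),
    ("Company A", "is", "Company A", 0.95),      # self-loop, dropped
    ("Company B", "may own", "City Y", 0.2),     # low confidence, dropped
]
topics = {"Company A", "City X", "City Y"}

kept = filter_triples(raw)
component = best_covering_component(build_graph(kept), topics)
coverage = len(component & topics) / len(topics)
```

In this toy run the selected component links Company A, Company B, and City X, covering two of the three assumed topic entities. A real system would likely merge coreferent mentions and connect components across documents before measuring coverage.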

Keywords

  • Multi-document information extraction
  • Graph-based visualization

Notes

Acknowledgments

This paper was partially supported by National Natural Science Foundation of China (Nos. 61572111 and 61876034), and a Fundamental Research Fund for the Central Universities of China (No. ZYGX2016Z003).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Yongpan Sheng (1)
  • Zenglin Xu (1)
  • Yafang Wang (2)
  • Xiangyu Zhang (1)
  • Jia Jia (2)
  • Zhonghui You (1)
  • Gerard de Melo (3), email author

  1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
  2. Shandong University, Jinan, China
  3. Rutgers University, New Brunswick, USA
