Journal of Combinatorial Optimization, Volume 37, Issue 1, pp 293–318

Optimizing model parameter for entity summarization across knowledge graphs

  • Jihong Yan
  • Chen Xu
  • Na Li
  • Ming Gao (Email author)
  • Aoying Zhou
Article

Abstract

Knowledge graphs, which belong to the category of semantic networks, are considered a new method of knowledge representation for health care data. They establish a semantic explanation model for human perception and health care information processing. Each knowledge graph is composed of a massive number of entities and relationships. However, searching for and visualizing the entities and attributes that interest users is arduous, since an entity has many attributes across different knowledge graphs. A natural problem is how to summarize an entity based on multiple knowledge graphs. We propose a three-stage algorithm to solve the problem of entity summarization across knowledge graphs, consisting of candidate generation, knowledge graph linkage, and entity summarization. We propose an unsupervised framework to link different knowledge graphs based on the semantic and structural information of entities. To further reduce the computational cost, we employ a word embedding technique to find semantically similar entities and filter out some pairs of unmatched entities. Finally, we model entity summarization as a personalized ranking problem in a knowledge graph. We conduct a set of experiments to evaluate our proposed method on four real datasets: historical user query data, two English knowledge graphs (YAGO and DBpedia), and an English corpus. Experimental results demonstrate the effectiveness of our proposed method in comparison with baselines.
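
The embedding-based filtering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each entity already has an embedding vector and uses cosine similarity with a fixed cut-off to discard unlikely cross-graph pairs. The function names, the toy vectors, and the threshold of 0.8 are hypothetical choices made for illustration; the abstract does not state the similarity measure or any threshold, so this is not the authors' implementation.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def candidate_pairs(emb_a, emb_b, threshold=0.8):
    """Keep only cross-graph entity pairs whose embeddings are similar enough.

    emb_a, emb_b: dicts mapping entity name -> embedding vector (hypothetical
    inputs; in practice the vectors would come from a trained embedding model).
    threshold: illustrative cut-off, not a value taken from the paper.
    """
    kept = []
    for (name_a, vec_a), (name_b, vec_b) in product(emb_a.items(), emb_b.items()):
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            kept.append((name_a, name_b, score))
    # Highest-scoring pairs first, so a later linkage stage sees them early.
    return sorted(kept, key=lambda t: t[2], reverse=True)


# Toy 3-dimensional vectors, invented for illustration only.
yago = {"Albert_Einstein": [0.9, 0.1, 0.0], "Paris": [0.0, 0.2, 0.9]}
dbpedia = {"Einstein": [0.8, 0.2, 0.1], "Paris_(France)": [0.1, 0.1, 0.95]}

pairs = candidate_pairs(yago, dbpedia)
for a, b, s in pairs:
    print(f"{a} <-> {b}: {s:.3f}")
```

Comparing every pair is quadratic in the number of entities, so a sketch like this only shows the filtering idea; at knowledge-graph scale an approximate nearest-neighbour index would likely be needed.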

Keywords

  • Entity summarization
  • Word embedding
  • Expectation maximization (EM) algorithm
  • Parameter optimization

Notes

Acknowledgements

This work has been supported by the National Key Research and Development Program of China under Grant 2016YFB1000905, NSFC under Grant Nos. U1401256, 61672234, 61402180 and 61472321. The author would also like to thank Key Disciplines of Software Engineering of Shanghai Polytechnic University under Grant No. XXKZD1604.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2017

Authors and Affiliations

  • Jihong Yan (2)
  • Chen Xu (3)
  • Na Li (1)
  • Ming Gao (1, Email author)
  • Aoying Zhou (1)
  1. Institute for Data Science and Engineering, East China Normal University, Shanghai, China
  2. College of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai, China
  3. Information Center, Shanghai Agricultural Committee, Shanghai, China
