CN-DBpedia: A Never-Ending Chinese Knowledge Extraction System

  • Bo Xu
  • Yong Xu
  • Jiaqing Liang
  • Chenhao Xie
  • Bin Liang
  • Wanyun Cui
  • Yanghua Xiao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10351)

Abstract

Great efforts have been dedicated to harvesting knowledge bases from online encyclopedias. These knowledge bases play important roles in enabling machines to understand texts. However, most current knowledge bases are in English, and non-English knowledge bases, especially Chinese ones, are still very rare. Many previous systems that extract knowledge from online encyclopedias, although applicable to building a Chinese knowledge base, still suffer from two challenges. The first is that constructing an ontology and building a supervised knowledge extraction model require great human effort. The second is that knowledge bases are updated very slowly. To address these challenges, we propose a never-ending Chinese knowledge extraction system, CN-DBpedia, which automatically generates a knowledge base that is ever-increasing in size and constantly updated. Specifically, we reduce human costs by reusing the ontology of existing knowledge bases and building an end-to-end fact extraction model. We further propose a smart active update strategy that keeps our knowledge base fresh at little human cost. The 164 million API calls to the published services attest to the success of our system.

Keywords

Knowledge Base · Human Cost · Natural Language Sentence · Exist Knowledge Base · Current Knowledge Base
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Bo Xu (1)
  • Yong Xu (1)
  • Jiaqing Liang (1, 2)
  • Chenhao Xie (1, 2)
  • Bin Liang (1)
  • Wanyun Cui (1)
  • Yanghua Xiao (1, 3)

  1. Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, Shanghai, China
  2. Data Eyes Research, Shanghai, China
  3. Shanghai Internet Big Data Engineering and Technology Center, Shanghai, China
