Combining Topic Model and Co-author Network for KAKEN and DBLP Linking

  • Duy-Hoang Tran
  • Hideaki Takeda
  • Kei Kurakawa
  • Minh-Triet Tran
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7198)


The Web of Data is based on two simple ideas: employing the RDF data model to publish structured data on the Web, and setting explicit RDF links to interlink data items across different data sources. In this paper, we describe our experience in building a link discovery system between KAKEN, a database that provides the latest information on research projects in Japan, and the DBLP Computer Science Bibliography. Using these links, one can navigate from a computer scientist's record in KAKEN to his or her publications in DBLP. The main problem in linking KAKEN researchers to DBLP authors is name disambiguation. We propose combining an LDA-based topic model with a co-author network approach to improve linkage accuracy.
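To make the idea of combining the two signals concrete, the following is a minimal sketch and not the method from the paper, whose scoring details are not given in this abstract. It assumes that each record already carries a topic distribution (for example, inferred by an LDA model) and a set of co-author names, and it ranks same-name DBLP candidates by a weighted sum of topic similarity and co-author overlap. The function names, the `weight` parameter, the record layout, and the sample data are all hypothetical.

```python
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two topic distributions of equal length."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard overlap between two sets of co-author names."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_score(kaken, dblp, weight=0.5):
    """Weighted sum of topic similarity and co-author overlap.

    The linear combination and the default weight are assumptions
    made for illustration only.
    """
    topic = cosine(kaken["topics"], dblp["topics"])
    coauthor = jaccard(kaken["coauthors"], dblp["coauthors"])
    return weight * topic + (1 - weight) * coauthor


def best_match(kaken, candidates, weight=0.5):
    """Return the DBLP candidate with the highest combined score."""
    return max(candidates, key=lambda c: link_score(kaken, c, weight))


# Hypothetical data: one KAKEN researcher and two DBLP authors sharing a name.
researcher = {
    "topics": [0.7, 0.2, 0.1],
    "coauthors": {"A. Sato", "B. Ito"},
}
candidates = [
    {"id": "dblp-1", "topics": [0.6, 0.3, 0.1], "coauthors": {"A. Sato", "C. Mori"}},
    {"id": "dblp-2", "topics": [0.1, 0.1, 0.8], "coauthors": {"D. Abe"}},
]

match = best_match(researcher, candidates)
print(match["id"])
```

In this toy data the first candidate is closer on both signals, so it is selected. In practice the weight and any acceptance threshold would need to be tuned against known links.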


Keywords: Web of Data, LDA Model, Co-author Network, Connected Triple





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Duy-Hoang Tran (1)
  • Hideaki Takeda (2)
  • Kei Kurakawa (2)
  • Minh-Triet Tran (1)

  1. Faculty of Information Technology, University of Science, Ho Chi Minh City, Vietnam
  2. National Institute of Informatics, Japan
