A Chinese expert disambiguation method based on semi-supervised graph clustering

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

To make efficient use of the associated relationships among expert pages, we propose a Chinese expert name disambiguation method based on semi-supervised graph clustering that integrates several kinds of associated relationships. First, correlation features of the expert attributes are extracted through correlation analysis of the expert pages. Second, a similarity matrix between the documents on different expert pages is constructed from the attribute features and the associated relationships of the expert pages. Finally, with the attribute correlations used as semi-supervised constraints, an expert disambiguation model is built with a graph-based clustering approach, and the model is solved with a kernel-based method to achieve expert name disambiguation. Comparative experiments on Chinese expert disambiguation show that semi-supervised graph clustering integrated with the expert-associated relationships gives a much better disambiguation result.
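
The three steps in the abstract can be illustrated with a small sketch. This is not the authors' model: it assumes the similarity matrix is already available, folds hypothetical must-link and cannot-link pairs into it with an arbitrary weight, and then runs a plain kernel k-means loop as one possible kernel-based solver. The function names, the weight of 0.5, the toy similarity values, and the initial labels are all assumptions made for illustration.

```python
def build_kernel(similarity, must_link, cannot_link, weight=0.5):
    """Copy the similarity matrix and fold pairwise constraints into it.

    must_link pairs get a reward, cannot_link pairs a penalty.
    The weight is an illustrative assumption, not a value from the paper.
    """
    kernel = [row[:] for row in similarity]
    for i, j in must_link:
        kernel[i][j] += weight
        kernel[j][i] += weight
    for i, j in cannot_link:
        kernel[i][j] -= weight
        kernel[j][i] -= weight
    return kernel


def kernel_kmeans(kernel, k, init_labels, max_iter=100):
    """Batch kernel k-means on a precomputed kernel matrix."""
    n = len(kernel)
    labels = list(init_labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Mean kernel value inside each cluster (squared norm of its centroid).
        within = [
            sum(kernel[a][b] for a in m for b in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best_cluster, best_cost = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # skip empty clusters
                cross = sum(kernel[i][j] for j in m) / len(m)
                # Squared distance from document i to the centroid of cluster c.
                cost = kernel[i][i] - 2.0 * cross + within[c]
                if cost < best_cost:
                    best_cluster, best_cost = c, cost
            new_labels.append(best_cluster)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Toy data: six pages for one shared name. Pages 0-1 and 3-5 form two clear
# groups; page 2 is only weakly similar to either group.
similarity = [
    [1.0, 0.9, 0.3, 0.1, 0.1, 0.1],
    [0.9, 1.0, 0.3, 0.1, 0.1, 0.1],
    [0.3, 0.3, 1.0, 0.4, 0.4, 0.4],
    [0.1, 0.1, 0.4, 1.0, 0.9, 0.9],
    [0.1, 0.1, 0.4, 0.9, 1.0, 0.9],
    [0.1, 0.1, 0.4, 0.9, 0.9, 1.0],
]
# Hypothetical constraints, e.g. derived from matching or conflicting attributes.
kernel = build_kernel(similarity, must_link=[(0, 2)], cannot_link=[(2, 3)])
labels = kernel_kmeans(kernel, k=2, init_labels=[0, 1, 0, 1, 0, 1])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In this toy run the pages for the first expert (0, 1, 2) and the second expert (3, 4, 5) end up in separate clusters, and both constraints are satisfied. Adding constraint weights can make the matrix indefinite, so a practical version would probably need some correction to keep the kernel positive semidefinite, as well as a more careful initialization.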

Acknowledgments

This work was supported by the National Natural Science Foundation (No. 61175068), the National Innovation Fund for Technology-Based Firms (No. 11C26215305905), the Open Fund of the Software Engineering Key Laboratory of Yunnan Province (No. 2011SE14), and the Ministry of Education's research start-up fund for returned overseas students.

Author information

Corresponding author

Correspondence to Zhengtao Yu.

Cite this article

Jiang, J., Yan, X., Yu, Z. et al. A Chinese expert disambiguation method based on semi-supervised graph clustering. Int. J. Mach. Learn. & Cyber. 6, 197–204 (2015). https://doi.org/10.1007/s13042-014-0255-z
