Chapter

Research and Advanced Technology for Digital Libraries

Volume 5173 of the series Lecture Notes in Computer Science pp 185-196

Author Name Disambiguation for Citations Using Topic and Web Correlation

  • Kai-Hsiang Yang (Institute of Information Science, Academia Sinica)
  • Hsin-Tsung Peng (Institute of Information Science, Academia Sinica)
  • Jian-Yi Jiang (Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)
  • Hahn-Ming Lee (Institute of Information Science, Academia Sinica; Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology)
  • Jan-Ming Ho (Institute of Information Science, Academia Sinica)

Abstract

Today, bibliographic digital libraries play an important role in helping members of the academic community search for novel research. Author disambiguation for citations is a major problem during data integration and cleaning, since author names are often highly ambiguous. To address this problem, we propose two kinds of correlation between citations, Topic Correlation and Web Correlation, which exploit relationships between citations to identify whether two citations with the same author name refer to the same individual. Topic correlation measures the similarity between the research topics of two citations, while Web correlation measures the number of co-occurrences in web pages. We employ a pair-wise grouping algorithm to group citations into clusters. Experimental results show that disambiguation accuracy improves substantially when topic correlation and Web correlation are used, and that Web correlation provides stronger evidence about the authors of citations.
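
The pair-wise grouping idea can be illustrated with a minimal sketch. It is not the chapter's actual method: the abstract does not define the correlation measures or the grouping rule, so everything here is an assumption. Topic correlation is approximated by Jaccard overlap of hypothetical keyword sets, Web correlation by the count of shared page identifiers, and two citations are merged when either score reaches a hypothetical threshold.

```python
from itertools import combinations


def topic_similarity(a, b):
    """Jaccard overlap of keyword sets (assumed stand-in for topic correlation)."""
    union = a["topics"] | b["topics"]
    return len(a["topics"] & b["topics"]) / len(union) if union else 0.0


def web_cooccurrence(a, b):
    """Count of shared web pages (assumed stand-in for Web correlation)."""
    return len(a["pages"] & b["pages"])


def group_citations(citations, topic_threshold=0.5, web_threshold=1):
    """Cluster citations by merging every pair that passes either threshold.

    Thresholds and the either-or merge rule are illustrative assumptions.
    Returns clusters as sorted lists of citation indices.
    """
    parent = list(range(len(citations)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(citations)), 2):
        a, b = citations[i], citations[j]
        if (topic_similarity(a, b) >= topic_threshold
                or web_cooccurrence(a, b) >= web_threshold):
            parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(len(citations)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Each citation is assumed to be a dict with a `topics` set and a `pages` set. Because merging is transitive, two citations with no direct link can still land in the same cluster through a third citation that is linked to both.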

Keywords

Citation clustering; Citation analysis; Author disambiguation