Abstract
Recent years have seen a growing need for advanced bibliometric indicators for individual researchers and research groups, which in turn require author disambiguation. Using the complete population of university professors and researchers in the Canadian province of Québec (N = 13,479), their papers, and the papers authored by their homonyms, this paper provides evidence of regularities in researchers’ publication patterns. It shows how these patterns can be used to automatically assign papers to individuals and remove papers authored by their homonyms. Two types of patterns were found: (1) at the level of individual researchers and (2) at the level of disciplines. Taken together, these patterns allow the construction of an algorithm that provides assignment information for at least one paper for 11,105 (82.4%) of the 13,479 researchers, with a very low percentage of false positives (3.2%).
Keywords
- Individual Researcher
- Bibliometric Data
- Reference Index
- Light Zone
- Publication Pattern
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Notes
- 1.
The recent collection of Scientometrics papers dealing with individual researchers published by Akademiai Kiado (Braun, 2006) illustrates this trend: the study with the highest number of researchers included has fewer than 200. Similarly, notable studies in the sociology of science by Cole and Cole (1973), Merton (1973) and Zuckerman (1977) analyzed small datasets.
- 2.
Physics journals, for instance, which often have very long author lists, provide only the initials of authors’ given names.
- 3.
- 4.
The bibliometric part of their paper used the Scopus database, which, unlike Thomson Reuters’ databases, links names of authors with institutional addresses for papers published since 1996.
- 5.
More details on the classification scheme can be found at: http://www.nsf.gov/statistics/seind06/c5/c5s3.htm#sb1
- 6.
For more details on the CIP, see: http://nces.ed.gov/pubs2002/cip2000/.
- 7.
Thus, 2,256 of Québec’s researchers did not publish any papers during that period, nor did any of their homonyms.
- 8.
The similarity threshold (MinSimilarity) was set at 0.25 in Microsoft SQL Server Integration Services (SSIS). More details on the system can be found at: http://technet.microsoft.com/en-US/library/ms345128(v=SQL.90).aspx
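The thresholding idea in note 8 can be illustrated with a minimal sketch. It does not reproduce the SSIS matching itself: Python's `difflib.SequenceMatcher` stands in for the similarity measure purely as an assumption, and the names `MIN_SIMILARITY`, `similarity` and `match_candidates` are hypothetical. Only the threshold value of 0.25 comes from the note.

```python
from difflib import SequenceMatcher

# Threshold value reported in note 8; everything else here is illustrative.
MIN_SIMILARITY = 0.25


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two strings, ignoring case.

    Assumption: difflib's ratio is a stand-in, not the measure used by SSIS.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_candidates(query: str, candidates: list[str],
                     threshold: float = MIN_SIMILARITY) -> list[tuple[str, float]]:
    """Keep candidates whose similarity to `query` meets the threshold, best first."""
    scored = [(c, similarity(query, c)) for c in candidates]
    kept = [(c, s) for c, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

A low threshold such as 0.25 keeps many weak candidates, so a step of this kind would normally be followed by further filtering or manual validation.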
References
Aksnes, D. W. (2008). When different persons have an identical author name. How frequent are homonyms? Journal of the American Society for Information Science and Technology, 59(5), 838–841.
Aswani, N., Bontcheva, K., & Cunningham, H. (2006). Mining information for instance unification. Lecture Notes in Computer Science, 4273, 329–342.
Barnett, G. A., & Fink, E. L. (2008). Impact of the internet and scholar age distribution on academic citation age. Journal of the American Society for Information Science and Technology, 59(4), 526–534.
Boyack, K. W., & Klavans, R. (2008). Measuring science–technology interaction using rare inventor–author names. Journal of Informetrics, 2(3), 173–182.
Braun, T. (Ed.). (2006). Evaluations of Individual Scientists and Research Institutions: Scientometrics Guidebooks Series. Budapest, Hungary: Akademiai Kiado.
Campbell, D., Picard-Aitken, M., Côté, G., Caruso, J., Valentim, R., Edmonds, S., … Archambault, É. (2010). Bibliometrics as a performance measurement tool for research evaluation: The case of research funded by the National Cancer Institute of Canada. American Journal of Evaluation, 31(1), 66–83.
Cole, J. R., & Cole, S. (1973). Social stratification in science. Chicago, IL: University of Chicago Press.
Cota, R. G., Ferreira, A. A., Nascimento, C., Gonçalves, M. A., & Laender, A. H. F. (2010). An unsupervised heuristic-based hierarchical method for name disambiguation in bibliographic citations. Journal of the American Society for Information Science and Technology, 61(9), 1853–1870.
Egghe, L. (2006). Theory and practice of the g-index. Scientometrics, 69(1), 131–152.
Enserink, M. (2009). Are you ready to become a number? Science, 323, 1662–1664.
Gingras, Y., Larivière, V., Macaluso, B., & Robitaille, J. P. (2008). The effects of aging on researchers’ publication and citation patterns. PLoS One, 3(12), e4048.
Gurney, T., Horlings, E., & van den Besselaar, P. (2012). Author disambiguation using multi-aspect similarity measures. Scientometrics, 91(2), 435–449.
Han, H., Zha, H., & Giles, C. L. (2005). Name disambiguation in author citations using a K-way spectral clustering method. Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (pp. 334–343). Retrieved from http://clgiles.ist.psu.edu/papers/JCDL-2005-K-Way-Spectral-Clustering.pdf.
Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences, 102(46), 16569–16572.
Jensen, P., Rouquier, J. B., Kreimer, P., & Croissant, Y. (2008). Scientists who engage in society perform better academically. Science and Public Policy, 35(7), 527–541.
Kang, I. S., Na, S. H., Lee, S., Jung, H., Kim, P., Sung, W. K., & Lee, J. H. (2009). On co-authorship for author disambiguation. Information Processing and Management, 45(1), 84–97.
Larivière, V., Macaluso, B., Archambault, É., & Gingras, Y. (2010). Which scientific elites? On the concentration of research funds, publications and citations. Research Evaluation, 19(1), 45–53.
Levin, M., Krawczyk, S., Bethard, S., & Jurafsky, D. (2012). Citation-based bootstrapping for large-scale author disambiguation. Journal of the American Society for Information Science and Technology, 63(5), 1030–1047.
Lewison, G. (1996). The frequencies of occurrence of scientific papers with authors of each initial letter and their variation with nationality. Scientometrics, 37(3), 401–416.
Merton, R. K. (1973). The sociology of science: Theoretical and empirical investigations. Chicago, IL: University of Chicago Press.
Reijnhoudt, L., Costas, R., Noyons, E., Börner, K., & Scharnhorst, A. (2013). “Seed + Expand”: A validated methodology for creating high quality publication oeuvres of individual researchers. arXiv preprint arXiv:1301.5177.
Schreiber, M. (2008). A modification of the h-index: The hm-index accounts for multi-authored manuscripts. Journal of Informetrics, 2(3), 211–216.
Smalheiser, N. R., & Torvik, V. I. (2009). Author name disambiguation. In B. Cronin (Ed.), Annual review of information science and technology (Vol. 43, pp. 287–313). Medford, NJ: ASIST and Information Today.
Torvik, V. I., Weeber, M., Swanson, D. R., & Smalheiser, N. R. (2005). Probabilistic similarity metric for Medline records: A model for author name disambiguation. Journal of the American Society for Information Science and Technology, 56(2), 140–158.
Wang, J., Berzins, K., Hicks, D., Melkers, J., Xiao, F., & Pinheiro, D. (2012). A boosted-trees method for name disambiguation. Scientometrics, 93(2), 391–411.
Wooding, S., Wilcox-Jay, K., Lewison, G., & Grant, J. (2006). Co-author inclusion: A novel recursive algorithmic method for dealing with homonyms in bibliometric analysis. Scientometrics, 66(1), 11–21.
Zhang, C. T. (2009). The e-index, complementing the h-index for excess citations. PLoS One, 4(5), e5429.
Zuckerman, H. A. (1977). Scientific elite: Nobel laureates in the United States. New York, NY: Free Press.
Appendices
Appendix 1: List of Disciplines Assigned to Journals
- Arts
  - Fine Arts & Architecture
  - Performing Arts
- Biology
  - Agricultural & Food Sciences
  - Botany
  - Dairy & Animal Science
  - Ecology
  - Entomology
  - General Biology
  - General Zoology
  - Marine Biology & Hydrobiology
  - Miscellaneous Biology
  - Miscellaneous Zoology
- Biomedical Research
  - Anatomy & Morphology
  - Biochemistry & Molecular Biology
  - Biomedical Engineering
  - Biophysics
  - Cellular Biology, Cytology & Histology
  - Embryology
  - General Biomedical Research
  - Genetics & Heredity
  - Microbiology
  - Microscopy
  - Miscellaneous Biomedical Research
  - Nutrition & Dietetic
  - Parasitology
  - Physiology
  - Virology
- Chemistry
  - Analytical Chemistry
  - Applied Chemistry
  - General Chemistry
  - Inorganic & Nuclear Chemistry
  - Organic Chemistry
  - Physical Chemistry
  - Polymers
- Clinical Medicine
  - Addictive Diseases
  - Allergy
  - Anesthesiology
  - Arthritis & Rheumatology
  - Cancer
  - Cardiovascular System
  - Dentistry
  - Dermatology & Venereal Disease
  - Endocrinology
  - Environmental & Occupational Health
  - Fertility
  - Gastroenterology
  - General & Internal Medicine
  - Geriatrics
  - Hematology
  - Immunology
  - Miscellaneous Clinical Medicine
  - Nephrology
  - Neurology & Neurosurgery
  - Obstetrics & Gynecology
  - Ophthalmology
  - Orthopedics
  - Otorhinolaryngology
  - Pathology
  - Pediatrics
  - Pharmacology
  - Pharmacy
  - Psychiatry
  - Radiology & Nuclear Medicine
  - Respiratory System
  - Surgery
  - Tropical Medicine
  - Urology
  - Veterinary Medicine
- Earth and Space
  - Astronomy & Astrophysics
  - Earth & Planetary Science
  - Environmental Science
  - Geology
  - Meteorology & Atmospheric Science
  - Oceanography & Limnology
- Engineering and Technology
  - Aerospace Technology
  - Chemical Engineering
  - Civil Engineering
  - Computers
  - Electrical Engineering & Electronics
  - General Engineering
  - Industrial Engineering
  - Materials Science
  - Mechanical Engineering
  - Metals & Metallurgy
  - Miscellaneous Engineering & Technology
  - Nuclear Technology
  - Operations Research
- Health
  - Geriatrics & Gerontology
  - Health Policy & Services
  - Nursing
  - Public Health
  - Rehabilitation
  - Social Sciences, Biomedical
  - Social Studies of Medicine
  - Speech-Language Pathology and Audiology
- Humanities
  - History
  - Language & Linguistics
  - Literature
  - Miscellaneous Humanities
  - Philosophy
  - Religion
- Mathematics
  - Applied Mathematics
  - General Mathematics
  - Miscellaneous Mathematics
  - Probability & Statistics
- Physics
  - Acoustics
  - Applied Physics
  - Chemical Physics
  - Fluids & Plasmas
  - General Physics
  - Miscellaneous Physics
  - Nuclear & Particle Physics
  - Optics
  - Solid State Physics
- Professional Fields
  - Communication
  - Education
  - Information Science & Library Science
  - Law
  - Management
  - Miscellaneous Professional Field
  - Social Work
- Psychology
  - Behavioral Science & Complementary Psychology
  - Clinical Psychology
  - Developmental & Child Psychology
  - Experimental Psychology
  - General Psychology
  - Human Factors
  - Miscellaneous Psychology
  - Psychoanalysis
  - Social Psychology
- Social Sciences
  - Anthropology and Archaeology
  - Area Studies
  - Criminology
  - Demography
  - Economics
  - General Social Sciences
  - Geography
  - International Relations
  - Miscellaneous Social Sciences
  - Planning & Urban Studies
  - Political Science and Public Administration
  - Science Studies
  - Sociology
Appendix 2: List of Disciplines Assigned to Departments
- Basic Medical Sciences
  - General Medicine
  - Laboratory Medicine
  - Medical Specialties
  - Surgical Specialties
- Business & Management
- Education
- Engineering
  - Chemical Engineering
  - Civil Engineering
  - Electrical & Computer Engineering
  - Mechanical & Industrial Engineering
  - Other Engineering
- Health Sciences
  - Dentistry
  - Kinesiology/Physical Education
  - Nursing
  - Other Health Sciences
  - Public Health & Health Administration
  - Rehabilitation Therapy
- Humanities
  - Fine & Performing Arts
  - Foreign Languages, Literature, & Linguistics
  - Area Studies
  - French/English
  - History
  - Philosophy
  - Religious Studies & Vocations
- Non-health Professional
  - Law & Legal Studies
  - Library & Information Sciences
  - Media & Communication Studies
  - Planning & Architecture
  - Social Work
- Sciences
  - Agricultural & Food Sciences
  - Biology & Botany
  - Chemistry
  - Computer & Information Science
  - Earth & Ocean Sciences
  - Mathematics
  - Physics & Astronomy
  - Resource Management & Forestry
- Social Sciences
  - Anthropology, Archaeology, & Sociology
  - Economics
  - Geography
  - Other Social Sciences & Humanities
  - Political Science
  - Psychology
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Larivière, V., Macaluso, B. (2014). Researchers’ Publication Patterns and Their Use for Author Disambiguation. In: Ding, Y., Rousseau, R., Wolfram, D. (eds) Measuring Scholarly Impact. Springer, Cham. https://doi.org/10.1007/978-3-319-10377-8_7
DOI: https://doi.org/10.1007/978-3-319-10377-8_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10376-1
Online ISBN: 978-3-319-10377-8
eBook Packages: Computer Science (R0)