Researchers’ Publication Patterns and Their Use for Author Disambiguation

Abstract

In recent years, the need for advanced bibliometric indicators for individual researchers and research groups has grown, and such indicators require author disambiguation. Using the complete population of university professors and researchers in the Canadian province of Québec (N = 13,479), their papers, and the papers authored by their homonyms, this paper provides evidence of regularities in researchers’ publication patterns. It shows how these patterns can be used to automatically assign papers to individuals and remove papers authored by their homonyms. Two types of patterns were found: (1) at the level of individual researchers and (2) at the level of disciplines. Taken together, these patterns allow the construction of an algorithm that provides assignment information for at least one paper for 11,105 (82.4%) of the 13,479 researchers, with a very low percentage of false positives (3.2%).
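To make the discipline-level pattern concrete, here is a minimal Python sketch, not the chapter's actual algorithm (which this preview does not reproduce). It assumes a hand-made table of plausible journal disciplines per department discipline and rejects name-matched papers published outside them. The table contents, the `Paper` fields, the function name, and the sample names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical table (not from the chapter): journal disciplines considered
# plausible for each department discipline.
PLAUSIBLE_DISCIPLINES = {
    "Physics & Astronomy": {"Physics", "Earth and Space"},
    "Psychology": {"Psychology", "Clinical Medicine", "Health"},
}


@dataclass(frozen=True)
class Paper:
    paper_id: str
    author_name: str          # name as indexed (hypothetical format)
    journal_discipline: str   # discipline of the publishing journal


def split_by_discipline(name, department, papers):
    """Return (assigned, rejected) papers among those signed with `name`."""
    plausible = PLAUSIBLE_DISCIPLINES.get(department, set())
    assigned, rejected = [], []
    for paper in papers:
        if paper.author_name != name:
            continue  # different name: not a candidate at all
        if paper.journal_discipline in plausible:
            assigned.append(paper)
        else:
            rejected.append(paper)  # possibly written by a homonym
    return assigned, rejected


papers = [
    Paper("p1", "TREMBLAY-M", "Physics"),
    Paper("p2", "TREMBLAY-M", "Clinical Medicine"),
    Paper("p3", "GAGNON-L", "Physics"),
]
assigned, rejected = split_by_discipline("TREMBLAY-M", "Physics & Astronomy", papers)
print([p.paper_id for p in assigned])  # papers kept for the physicist
print([p.paper_id for p in rejected])  # papers set aside as likely homonyms
```

A hard accept/reject rule is used only to keep the sketch short. Patterns at the level of individual researchers, which the abstract also mentions, would need further signals that are not modeled here.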

Keywords

  • Individual Researcher
  • Bibliometric Data
  • Reference Index
  • Light Zone
  • Publication Pattern

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

  1. The recent collection of Scientometrics papers dealing with individual researchers published by Akadémiai Kiadó (Braun, 2006) illustrates this trend: the largest study includes fewer than 200 researchers. Similarly, notable studies in the sociology of science by Cole and Cole (1973), Merton (1973), and Zuckerman (1977) analyzed small datasets.

  2. Physics journals, for instance, which often have very long author lists, provide only the initials of authors’ given names.

  3. See, for example, Gingras, Larivière, Macaluso, and Robitaille (2008) and Larivière, Macaluso, Archambault, and Gingras (2010) for some results based on this population.

  4. The bibliometric part of their paper used the Scopus database, which, contrary to Thomson Reuters’ databases, links names of authors with institutional addresses for papers published since 1996.

  5. More details on the classification scheme can be found at: http://www.nsf.gov/statistics/seind06/c5/c5s3.htm#sb1

  6. For more details on the CIP, see: http://nces.ed.gov/pubs2002/cip2000/.

  7. Thus, 2,256 of Québec’s researchers did not publish any papers during that period, nor did any of their homonyms.

  8. The similarity threshold (MinSimilarity) was set at 0.25 in Microsoft SQL Server Integration Services (SSIS). More details on the system can be found at: http://technet.microsoft.com/en-US/library/ms345128(v=SQL.90).aspx
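The thresholding in note 8 can be illustrated with a short Python sketch. It is not the SSIS matching algorithm: it substitutes the standard library's `difflib.SequenceMatcher` as the similarity measure, so its scores are not comparable with SSIS scores. Only the 0.25 threshold comes from the note; the function names and the example strings are hypothetical.

```python
from difflib import SequenceMatcher

MIN_SIMILARITY = 0.25  # threshold value quoted in note 8


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (difflib ratio, not the SSIS score)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(query, candidates, min_similarity=MIN_SIMILARITY):
    """Return the most similar candidate, or None if none reaches the threshold."""
    scored = [(similarity(query, c), c) for c in candidates]
    score, candidate = max(scored, default=(0.0, None))
    return candidate if score >= min_similarity else None


# Hypothetical strings to match; the preview does not say which fields were compared.
institutions = ["Université de Montréal", "McGill University"]
print(best_match("Universite de Montreal", institutions))
print(best_match("zzz", institutions))
```

A threshold as low as 0.25 accepts fairly distant matches, so a design like this presumably relies on later steps to filter out the false positives it lets through.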

References

  • Aksnes, D. W. (2008). When different persons have an identical author name. How frequent are homonyms? Journal of the American Society for Information Science and Technology, 59(5), 838–841.

  • Aswani, N., Bontcheva, K., & Cunningham, H. (2006). Mining information for instance unification. Lecture Notes in Computer Science, 4273, 329–342.

  • Barnett, G. A., & Fink, E. L. (2008). Impact of the internet and scholar age distribution on academic citation age. Journal of the American Society for Information Science and Technology, 59(4), 526–534.

  • Boyack, K. W., & Klavans, R. (2008). Measuring science–technology interaction using rare inventor–author names. Journal of Informetrics, 2(3), 173–182.

  • Braun, T. (Ed.). (2006). Evaluations of Individual Scientists and Research Institutions: Scientometrics Guidebooks Series. Budapest, Hungary: Akadémiai Kiadó.

  • Campbell, D., Picard-Aitken, M., Côté, G., Caruso, J., Valentim, R., Edmonds, S., … & Archambault, É. (2010). Bibliometrics as a performance measurement tool for research evaluation: The case of research funded by the National Cancer Institute of Canada. American Journal of Evaluation, 31(1), 66–83.

  • Cole, J. R., & Cole, S. (1973). Social stratification in science. Chicago, IL: University of Chicago Press.

  • Cota, R. G., Ferreira, A. A., Nascimento, C., Gonçalves, M. A., & Laender, A. H. F. (2010). An unsupervised heuristic-based hierarchical method for name disambiguation in bibliographic citations. Journal of the American Society for Information Science and Technology, 61(9), 1853–1870.

  • Egghe, L. (2006). Theory and practice of the g-index. Scientometrics, 69(1), 131–152.

  • Enserink, M. (2009). Are you ready to become a number? Science, 323, 1662–1664.

  • Gingras, Y., Larivière, V., Macaluso, B., & Robitaille, J. P. (2008). The effects of aging on researchers’ publication and citation patterns. PLoS One, 3(12), e4048.

  • Gurney, T., Horlings, E., & van den Besselaar, P. (2012). Author disambiguation using multi-aspect similarity measures. Scientometrics, 91(2), 435–449.

  • Han, H., Zha, H., & Giles, C. L. (2005). Name disambiguation in author citations using a K-way spectral clustering method. Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries (pp. 334–343). Retrieved from http://clgiles.ist.psu.edu/papers/JCDL-2005-K-Way-Spectral-Clustering.pdf.

  • Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences, 102(46), 16569–16572.

  • Jensen, P., Rouquier, J. B., Kreimer, P., & Croissant, Y. (2008). Scientists who engage in society perform better academically. Science and Public Policy, 35(7), 527–541.

  • Kang, I. S., Na, S. H., Lee, S., Jung, H., Kim, P., Sung, W. K., & Lee, J. H. (2009). On co-authorship for author disambiguation. Information Processing and Management, 45(1), 84–97.

  • Larivière, V., Macaluso, B., Archambault, É., & Gingras, Y. (2010). Which scientific elites? On the concentration of research funds, publications and citations. Research Evaluation, 19(1), 45–53.

  • Levin, M., Krawczyk, S., Bethard, S., & Jurafsky, D. (2012). Citation-based bootstrapping for large-scale author disambiguation. Journal of the American Society for Information Science and Technology, 63(5), 1030–1047.

  • Lewison, G. (1996). The frequencies of occurrence of scientific papers with authors of each initial letter and their variation with nationality. Scientometrics, 37(3), 401–416.

  • Merton, R. K. (1973). The sociology of science: Theoretical and empirical investigations. Chicago, IL: University of Chicago Press.

  • Reijnhoudt, L., Costas, R., Noyons, E., Börner, K., & Scharnhorst, A. (2013). “Seed + Expand”: A validated methodology for creating high quality publication oeuvres of individual researchers. arXiv preprint arXiv:1301.5177.

  • Schreiber, M. (2008). A modification of the h-index: The hm-index accounts for multi-authored manuscripts. Journal of Informetrics, 2(3), 211–216.

  • Smalheiser, N. R., & Torvik, V. I. (2009). Author name disambiguation. In B. Cronin (Ed.), Annual review of information science and technology (Vol. 43, pp. 287–313). Medford, NJ: ASIST and Information Today.

  • Torvik, V. I., Weeber, M., Swanson, D. R., & Smalheiser, N. R. (2005). Probabilistic similarity metric for Medline records: A model for author name disambiguation. Journal of the American Society for Information Science and Technology, 56(2), 140–158.

  • Wang, J., Berzins, K., Hicks, D., Melkers, J., Xiao, F., & Pinheiro, D. (2012). A boosted-trees method for name disambiguation. Scientometrics, 93(2), 391–411.

  • Wooding, S., Wilcox-Jay, K., Lewison, G., & Grant, J. (2006). Co-author inclusion: A novel recursive algorithmic method for dealing with homonyms in bibliometric analysis. Scientometrics, 66(1), 11–21.

  • Zhang, C. T. (2009). The e-index, complementing the h-index for excess citations. PLoS One, 4(5), e5429.

  • Zuckerman, H. A. (1977). Scientific elite: Nobel laureates in the United States. New York, NY: Free Press.

Author information

Corresponding author

Correspondence to Vincent Larivière.

Appendices

Appendix 1: List of Disciplines Assigned to Journals

  • Arts

    • Fine Arts & Architecture

    • Performing Arts

  • Biology

    • Agricultural & Food Sciences

    • Botany

    • Dairy & Animal Science

    • Ecology

    • Entomology

    • General Biology

    • General Zoology

    • Marine Biology & Hydrobiology

    • Miscellaneous Biology

    • Miscellaneous Zoology

  • Biomedical Research

    • Anatomy & Morphology

    • Biochemistry & Molecular Biology

    • Biomedical Engineering

    • Biophysics

    • Cellular Biology, Cytology & Histology

    • Embryology

    • General Biomedical Research

    • Genetics & Heredity

    • Microbiology

    • Microscopy

    • Miscellaneous Biomedical Research

    • Nutrition & Dietetics

    • Parasitology

    • Physiology

    • Virology

  • Chemistry

    • Analytical Chemistry

    • Applied Chemistry

    • General Chemistry

    • Inorganic & Nuclear Chemistry

    • Organic Chemistry

    • Physical Chemistry

    • Polymers

  • Clinical Medicine

    • Addictive Diseases

    • Allergy

    • Anesthesiology

    • Arthritis & Rheumatology

    • Cancer

    • Cardiovascular System

    • Dentistry

    • Dermatology & Venereal Disease

    • Endocrinology

    • Environmental & Occupational Health

    • Fertility

    • Gastroenterology

    • General & Internal Medicine

    • Geriatrics

    • Hematology

    • Immunology

    • Miscellaneous Clinical Medicine

    • Nephrology

    • Neurology & Neurosurgery

    • Obstetrics & Gynecology

    • Ophthalmology

    • Orthopedics

    • Otorhinolaryngology

    • Pathology

    • Pediatrics

    • Pharmacology

    • Pharmacy

    • Psychiatry

    • Radiology & Nuclear Medicine

    • Respiratory System

    • Surgery

    • Tropical Medicine

    • Urology

    • Veterinary Medicine

  • Earth and Space

    • Astronomy & Astrophysics

    • Earth & Planetary Science

    • Environmental Science

    • Geology

    • Meteorology & Atmospheric Science

    • Oceanography & Limnology

  • Engineering and Technology

    • Aerospace Technology

    • Chemical Engineering

    • Civil Engineering

    • Computers

    • Electrical Engineering & Electronics

    • General Engineering

    • Industrial Engineering

    • Materials Science

    • Mechanical Engineering

    • Metals & Metallurgy

    • Miscellaneous Engineering & Technology

    • Nuclear Technology

    • Operations Research

  • Health

    • Geriatrics & Gerontology

    • Health Policy & Services

    • Nursing

    • Public Health

    • Rehabilitation

    • Social Sciences, Biomedical

    • Social Studies of Medicine

    • Speech-Language Pathology and Audiology

  • Humanities

    • History

    • Language & Linguistics

    • Literature

    • Miscellaneous Humanities

    • Philosophy

    • Religion

  • Mathematics

    • Applied Mathematics

    • General Mathematics

    • Miscellaneous Mathematics

    • Probability & Statistics

  • Physics

    • Acoustics

    • Applied Physics

    • Chemical Physics

    • Fluids & Plasmas

    • General Physics

    • Miscellaneous Physics

    • Nuclear & Particle Physics

    • Optics

    • Solid State Physics

  • Professional Fields

    • Communication

    • Education

    • Information Science & Library Science

    • Law

    • Management

    • Miscellaneous Professional Field

    • Social Work

  • Psychology

    • Behavioral Science & Complementary Psychology

    • Clinical Psychology

    • Developmental & Child Psychology

    • Experimental Psychology

    • General Psychology

    • Human Factors

    • Miscellaneous Psychology

    • Psychoanalysis

    • Social Psychology

  • Social Sciences

    • Anthropology and Archaeology

    • Area Studies

    • Criminology

    • Demography

    • Economics

    • General Social Sciences

    • Geography

    • International Relations

    • Miscellaneous Social Sciences

    • Planning & Urban Studies

    • Political Science and Public Administration

    • Science Studies

    • Sociology

Appendix 2: List of Disciplines Assigned to Departments

  • Basic Medical Sciences

    • General Medicine

    • Laboratory Medicine

    • Medical Specialties

    • Surgical Specialties

  • Business & Management

  • Education

  • Engineering

    • Chemical Engineering

    • Civil Engineering

    • Electrical & Computer Engineering

    • Mechanical & Industrial Engineering

    • Other Engineering

  • Health Sciences

    • Dentistry

    • Kinesiology/Physical Education

    • Nursing

    • Other Health Sciences

    • Public Health & Health Administration

    • Rehabilitation Therapy

  • Humanities

    • Fine & Performing Arts

    • Foreign Languages, Literature, & Linguistics

    • Area Studies

    • French/English

    • History

    • Philosophy

    • Religious Studies & Vocations

  • Non-health Professional

    • Law & Legal Studies

    • Library & Information Sciences

    • Media & Communication Studies

    • Planning & Architecture

    • Social Work

  • Sciences

    • Agricultural & Food Sciences

    • Biology & Botany

    • Chemistry

    • Computer & Information Science

    • Earth & Ocean Sciences

    • Mathematics

    • Physics & Astronomy

    • Resource Management & Forestry

  • Social Sciences

    • Anthropology, Archaeology, & Sociology

    • Economics

    • Geography

    • Other Social Sciences & Humanities

    • Political Science

    • Psychology

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Larivière, V., Macaluso, B. (2014). Researchers’ Publication Patterns and Their Use for Author Disambiguation. In: Ding, Y., Rousseau, R., Wolfram, D. (eds) Measuring Scholarly Impact. Springer, Cham. https://doi.org/10.1007/978-3-319-10377-8_7

  • DOI: https://doi.org/10.1007/978-3-319-10377-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10376-1

  • Online ISBN: 978-3-319-10377-8

  • eBook Packages: Computer Science (R0)