Citation analysis with Microsoft Academic

Abstract

We explore whether and how Microsoft Academic (MA) could be used for bibliometric analyses. First, we examine the Academic Knowledge API (AK API), an interface for accessing MA data, and compare it to Google Scholar (GS). Second, we perform a comparative citation analysis of researchers by normalizing data from MA and Scopus. We find that MA offers structured and rich metadata, which facilitates data retrieval, handling and processing. In addition, the AK API allows retrieving frequency distributions of citations. We consider these features a major advantage of MA over GS. However, we identify four main limitations regarding the available metadata. First, MA does not provide the document type of a publication. Second, the “fields of study” are dynamic, too specific, and the field hierarchies are incoherent. Third, some publications are assigned to incorrect years. Fourth, the metadata of some publications do not include all authors. Nevertheless, we show that an average-based indicator (the journal normalized citation score, JNCS) as well as a distribution-based indicator (percentile rank classes, PR classes) can be calculated with relative ease using MA. Hence, normalization of citation counts is feasible with MA. The citation analyses in MA and Scopus yield consistent results: the JNCS and the PR classes are similar in both databases, and, as a consequence, the evaluation of the researchers’ publication impact is congruent in MA and Scopus. Given MA’s rapid development over the last year, we postulate that it has the potential to be used for full-fledged bibliometric analyses.
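As an illustration of how MA data can be retrieved programmatically, the following minimal sketch queries the Evaluate endpoint of the AK API for the publications of a single author. It is written in Python; the endpoint URL, the query expression syntax, and the attribute names (Ti, Y, CC, J.JN, AA.AuN) follow the publicly documented API, while the subscription key and the example author name are placeholders and are not taken from the study.

```python
import requests

# Placeholder key; the AK API expects a Cognitive Services subscription key
# in the Ocp-Apim-Subscription-Key header.
API_KEY = "YOUR_SUBSCRIPTION_KEY"
EVALUATE_URL = "https://westus.api.cognitive.microsoft.com/academic/v1.0/evaluate"


def get_publications(author_name, count=100):
    """Retrieve title (Ti), year (Y), citation count (CC), journal (J.JN)
    and author names (AA.AuN) for the papers of one author."""
    params = {
        # Query expression: all papers whose normalized author name matches.
        "expr": f"Composite(AA.AuN='{author_name}')",
        "attributes": "Ti,Y,CC,J.JN,AA.AuN",
        "count": count,
    }
    headers = {"Ocp-Apim-Subscription-Key": API_KEY}
    response = requests.get(EVALUATE_URL, params=params, headers=headers)
    response.raise_for_status()
    return response.json().get("entities", [])


if __name__ == "__main__":
    # Example author name (normalized, lower case) chosen for illustration only.
    for paper in get_publications("anne wil harzing"):
        print(paper.get("Y"), paper.get("CC"), paper.get("Ti"))
```

Each entity in the JSON response corresponds to one publication, so citation counts, journals and publication years can be collected directly as input for normalization.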

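To make the two normalized indicators concrete, the sketch below computes a journal normalized citation score and assigns percentile rank classes from raw citation counts. The data structures, the reference-set means and the four class boundaries are illustrative assumptions and do not reproduce the exact scheme applied in the study (cf. Waltman et al. 2011; Bornmann et al. 2013; Waltman and Schreiber 2013).

```python
from statistics import mean


def jncs(papers, reference_means):
    """Journal normalized citation score: divide each paper's citation count
    by the mean citation count of its reference set (same journal and
    publication year), then average the ratios over all papers."""
    ratios = [
        paper["citations"] / reference_means[(paper["journal"], paper["year"])]
        for paper in papers
    ]
    return mean(ratios)


def percentile(citations, reference_counts):
    """Share of papers in the reference set with fewer citations (ties are
    handled naively here; see Waltman and Schreiber 2013 for refinements)."""
    below = sum(1 for c in reference_counts if c < citations)
    return 100 * below / len(reference_counts)


def pr_class(p):
    """Map a percentile to one of four illustrative rank classes; the class
    boundaries are an assumption, not necessarily those used in the study."""
    if p >= 90:
        return "top 10%"
    if p >= 75:
        return "75-90%"
    if p >= 50:
        return "50-75%"
    return "bottom 50%"


# Toy example: two papers of one researcher and the mean citation counts of
# their journal-year reference sets (made-up numbers).
papers = [
    {"journal": "scientometrics", "year": 2015, "citations": 12},
    {"journal": "journal of informetrics", "year": 2014, "citations": 3},
]
reference_means = {
    ("scientometrics", 2015): 6.0,
    ("journal of informetrics", 2014): 4.0,
}
print(jncs(papers, reference_means))                    # (12/6 + 3/4) / 2 = 1.375
print(pr_class(percentile(12, [0, 1, 3, 5, 12, 20])))   # 4 of 6 below -> "50-75%"
```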
Notes

  1. Figures for 2015 are drawn from Sinha et al. (2015) and indicated in brackets.

  2. https://academic.microsoft.com.

  3. https://academicgraph.blob.core.windows.net/graph/index.html.

  4. https://www.microsoft.com/cognitive-services/en-us/academic-knowledge-api.

  5. A description of WoS entities and attributes is available at http://iuni.iu.edu/files/WoS_Documents/Entity_Relationship_Diagram_wos_core.pdf.

  6. https://academicgraphwe.blob.core.windows.net/graph-2016-02-05/FieldsOfStudy.zip.

  7. https://academicgraphwe.blob.core.windows.net/graph-2016-02-05/FieldOfStudyHierarchy.zip.

  8. Data collection and publication of Harzing and Alakangas’ (2016) study took place after the submission of this paper.

References

  1. Bornmann, L., Leydesdorff, L., & Mutz, R. (2013). The use of percentiles and percentile rank classes in the analysis of bibliometric data: Opportunities and limits. Journal of Informetrics, 7(1), 158–165. doi:10.1016/j.joi.2012.10.001.

  2. Bornmann, L., Thor, A., Marx, W., & Schier, H. (2016). The application of bibliometrics to research evaluation in the humanities and social sciences: An exploratory study using normalized Google Scholar data for the publications of a research institute. Journal of the Association for Information Science and Technology, 67(11), 2778–2789. doi:10.1002/asi.23627.

  3. De Domenico, M., Omodei, E., & Arenas, A. (2016). Quantifying the diaspora of knowledge in the last century. arXiv:1604.00696v1.

  4. Harzing, A. W. (2007). Publish or perish. Available from http://www.harzing.com/pop.htm.

  5. Harzing, A. W. (2016). Microsoft Academic (Search): A phoenix arisen from the ashes? Scientometrics, 108(3), 1637–1647. doi:10.1007/s11192-016-2026-y.

  6. Harzing, A. W., & Alakangas, S. (2016). Microsoft Academic: Is the phoenix getting wings? Scientometrics. doi:10.1007/s11192-016-2185-x.

  7. Harzing, A. W., Alakangas, S., & Adams, D. (2014). hIa: An individual annual h-index to accommodate disciplinary and career length differences. Scientometrics, 99(3), 811–821. doi:10.1007/s11192-013-1208-0.

  8. Jacso, P. (2010). Metadata mega mess in Google Scholar. Online Information Review, 34(1), 175–191. doi:10.1108/14684521011024191.

  9. Prins, A. A. M., Costas, R., van Leeuwen, T. N., & Wouters, P. (2016). Using Google Scholar in research evaluation of humanities and social science programs: A comparison with Web of Science data. Research Evaluation, 25(3), 264–270. doi:10.1093/reseval/rvv049.

  10. Rehn, C., Wadskog, D., Gornitzki, C., & Larsson, A. (2014). Bibliometric indicators—definitions and usage at Karolinska Institutet. Stockholm: Karolinska Institutet University Library.

  11. Ribas, S., Ueda, A., Santos, R. L. T., Ribeiro-Neto, B., & Ziviani, N. (2016). Simplified Relative Citation Ratio for static paper ranking. arXiv:1603.01336v1.

  12. Sinha, A., Shen, Z., Song, Y., Ma, H., Eide, D., Hsu, B., & Wang, K. (2015). An overview of Microsoft Academic Service (MAS) and applications. In Proceedings of the 24th International Conference on World Wide Web (WWW’15). Retrieved from http://research.microsoft.com/apps/pubs/default.aspx?id=246609.

  13. Tijssen, R. J. W., Visser, M. S., & van Leeuwen, T. N. (2002). Benchmarking international scientific excellence: Are highly cited research papers an appropriate frame of reference? Scientometrics, 54(3), 381–397. doi:10.1023/a:1016082432660.

  14. Wade, A., Kuasan, W., Yizhou, S., & Gulli, A. (2016). WSDM cup 2016: Entity ranking challenge. In P. N. Bennet, V. Josifovski, J. Neville, & F. Radlinski (Eds.), Proceedings of the Ninth ACM International Conference on Web Search and Data Mining (pp. 593–594). New York, NY: Association for Computing Machinery.

  15. Waltman, L., & Schreiber, M. (2013). On the calculation of percentile-based bibliometric indicators. Journal of the American Society for Information Science and Technology, 64(2), 372–379. doi:10.1002/asi.22775.

  16. Waltman, L., van Eck, N. J., van Leeuwen, T. N., Visser, M. S., & van Raan, A. F. J. (2011). Towards a new crown indicator: Some theoretical considerations. Journal of Informetrics, 5(1), 37–47. doi:10.1016/j.joi.2010.08.001.

  17. Wouters, P., & Costas, R. (2012). Users, narcissism and control—tracking the impact of scholarly publications in the 21st century. In E. Archambault, Y. Gingras, & V. Larivière (Eds.), Proceedings of the 17th International Conference on Science and Technology Indicators. Montréal: Science-Metrix and OST.

Author information

Correspondence to Sven E. Hug.

Cite this article

Hug, S. E., Ochsner, M., & Brändle, M. P. Citation analysis with Microsoft Academic. Scientometrics 111, 371–378 (2017). doi:10.1007/s11192-017-2247-8

Keywords

  • Normalization
  • Citation analysis
  • Percentiles
  • Microsoft Academic
  • Google Scholar
  • Scopus