Topic scientific community in science: a combined perspective of scientific collaboration and topics

Abstract

Scientific communities are clusters of researchers and play important roles in modern science. Studying different forms of scientific communities that either physically or virtually exist is a feasible way to disclose underlying mechanisms of science. From the perspective of complex networks, topology-based communities and topic-based communities reflect scientific collaboration and topical features of science respectively. However, the two features are not isolated but intertwined in scientific practice. This study proposes an approach to detect Topical Scientific Communities (TSCs) with both topology and topic features by applying machine learning techniques and network theory. As an example, the TSCs of the informetrics field are detected, and then the characteristics of these TSCs are analyzed. It is shown that collaboration patterns on the topic level can be revealed by analyzing the static network structure and dynamics of TSCs. Furthermore, cross-topic collaborations at multiple levels could be investigated through TSCs. In addition, TSCs can effectively organize researchers in terms of productivity. Future work will further explore and generalize characteristics of TSCs, and the applications of TSCs to other tasks of studying science.

Notes

  1. One can think of 4 as the value of the parameter t, the minimum number of topics an author can have, in the case where a paper has only one author and that author has published only that paper.

Acknowledgements

We thank the anonymous reviewers for their comments. We also thank Dr. Hong Cui and Dr. Guo Chen for providing suggestions on an earlier version of this paper. This study is supported by the National Natural Science Foundation of China (CN) funded projects under Grant Nos. 71603189, 71420107026, and 71403190.

Author information

Corresponding author

Correspondence to Jin Mao.

Appendix

See Table 6.

Table 6 The 50 research topics and their top terms

About this article

Cite this article

Mao, J., Cao, Y., Lu, K. et al. Topic scientific community in science: a combined perspective of scientific collaboration and topics. Scientometrics 112, 851–875 (2017). https://doi.org/10.1007/s11192-017-2418-7

  • DOI: https://doi.org/10.1007/s11192-017-2418-7

Keywords

  • Scientific community
  • Scientific collaboration
  • Research topic
  • Network
  • Author topic model

Mathematics Subject Classification

  • 05C82
  • 68U15