A study on the author collaboration network in big data*

Information Systems Frontiers

Abstract

To obtain a deeper understanding of the state of collaboration in the big data field, we investigated the author collaboration groups, the core author collaboration groups, and the collaboration trends in big data by combining bibliometric analysis and social network analysis. A total of 4130 papers by 13,759 authors, published during 2011–2015, were collected. The main results indicate that 3483 of the papers (84.33% of all papers) are coauthored, involving 12,016 coauthors (87.33% of all authors), which represents a high level of collaboration. On the other hand, 91.83% of the identified coauthors have published only one paper so far, reflecting a low level of maturity among these authors. Through social network analysis, we observed that the author collaboration network is composed of small author collaboration groups and that the authors are mainly from the computer science & technology field. As an important contribution of our study, we further analyzed the author collaboration network, culminating in the generalization of four subnet modes, which were defined by some papers: ‘dual-core’, ‘complete’, ‘bridge’ and ‘sustainable development’. We found that the dual-core mode corresponds to the stage at which researchers have just begun to study big data, that the complete mode tends toward joint research, that authors in both the dual-core and complete modes are mostly engaged in the same project, and that the bridge mode and the sustainable development mode represent, respectively, the popular and valued directions in the big data field. The results of this study can be useful for researchers interested in finding suitable partners in the big data field. By tracking the core authors and the key author collaboration groups, one can learn about current developments in the big data field and anticipate its development prospects. We therefore expect the results summarized in this paper to contribute to a faster development of the big data field.
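
As an illustration of the kind of measurement the abstract reports, the following minimal Python sketch builds a co-authorship network from per-paper author lists, computes the share of coauthored papers and of coauthoring authors, and extracts collaboration groups as connected components. It is not the authors' tooling or procedure; the function names, the toy input, and the choice of connected components as a stand-in for "collaboration groups" are assumptions made for this example. The last two lines only recompute the percentages quoted in the abstract from its own counts.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_stats(papers):
    """Summarize collaboration in `papers`, a list of author-name lists.

    Hypothetical helper for illustration; the input format is an assumption.
    """
    authors = set()          # every distinct author
    coauthors = set()        # authors with at least one coauthored paper
    coauthored_papers = 0
    graph = defaultdict(set)  # undirected co-authorship adjacency sets

    for names in papers:
        unique = sorted(set(names))
        authors.update(unique)
        if len(unique) > 1:
            coauthored_papers += 1
            coauthors.update(unique)
            # Link every pair of authors who share this paper.
            for a, b in combinations(unique, 2):
                graph[a].add(b)
                graph[b].add(a)

    return {
        "paper_share": coauthored_papers / len(papers),
        "author_share": len(coauthors) / len(authors),
        "graph": dict(graph),
    }


def collaboration_groups(graph):
    """Return connected components of the co-authorship graph as sets."""
    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(group)
    return groups


# Toy data (invented): three coauthored papers and one single-author paper.
toy_papers = [["A", "B"], ["B", "C"], ["D"], ["E", "F"]]
stats = coauthorship_stats(toy_papers)
groups = collaboration_groups(stats["graph"])

# Recompute the abstract's percentages from the counts it states.
abstract_paper_share = round(100 * 3483 / 4130, 2)
abstract_author_share = round(100 * 12016 / 13759, 2)
```

On the toy input, authors A, B and C form one group and E and F another, while D, who has no coauthors, does not appear in the network. A real analysis would also need author-name disambiguation, which this sketch ignores.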

Notes

  1. http://apps.webofknowledge.com

Acknowledgements

The authors are grateful to anonymous referees and editors for their invaluable and insightful comments.

Funding

This work is, in part, financially supported by the National Natural Science Foundation of China (Grant No. 61100197) and the Jiangsu Province Graduate Education Innovation Project (Grant No. KYLX15_0025).

Author information

Corresponding author

Correspondence to Jin Shi.

About this article

Cite this article

Peng, Y., Shi, J., Fantinato, M. et al. A study on the author collaboration network in big data*. Inf Syst Front 19, 1329–1342 (2017). https://doi.org/10.1007/s10796-017-9771-1

  • DOI: https://doi.org/10.1007/s10796-017-9771-1
