Abstract
The use of Open Source Software (OSS) has increased among programmers and computer users over the past fifteen years. OSS communities work as a “Bazaar” where project constructors and end-users meet and look for suitable matches to their skills and requirements. OSS is emerging as a strong competitor to commercial or closed-source software. GitHub is an OSS forge launched in 2008 to simplify code sharing. It is a website and cloud-based service that helps software developers store, manage, track, and control changes to their code. When a GitHub project fails, the time, effort, and resources invested by this large community are lost. There is therefore a need for models that identify the factors contributing to the success of these projects. The massive volume of repository data makes this domain a good candidate for exploratory research using data mining. In this work, the FP-Growth method is used to find the most popular combinations of two programming languages, and the results are validated using the SPSS tool. The outcome of this work benefits the OSS community by saving time and resources.
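As a rough illustration of what mining popular two-language combinations means, the sketch below counts how often each pair of languages occurs together across repositories and keeps the pairs whose support meets a threshold. It is not the authors' pipeline and it is not FP-Growth itself: it uses direct pair counting, which yields the same frequent 2-itemsets that FP-Growth would report at the same minimum support, without building an FP-tree. The function name `frequent_language_pairs`, the toy repository data, and the threshold of 0.4 are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_language_pairs(repos, min_support):
    """Return {(lang_a, lang_b): support} for language pairs whose
    support (fraction of repositories containing both) >= min_support.

    `repos` is a list of per-repository language lists (hypothetical input).
    """
    n = len(repos)
    if n == 0:
        return {}
    counts = Counter()
    for languages in repos:
        # Deduplicate and sort so each pair has a single canonical form.
        for pair in combinations(sorted(set(languages)), 2):
            counts[pair] += 1
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical toy data: each inner list is the set of languages in one repository.
repos = [
    ["JavaScript", "HTML", "CSS"],
    ["Python", "Shell"],
    ["JavaScript", "HTML"],
    ["JavaScript", "CSS", "HTML"],
    ["Python", "JavaScript"],
]

pairs = frequent_language_pairs(repos, min_support=0.4)
for pair, support in sorted(pairs.items(), key=lambda kv: -kv[1]):
    print(pair, round(support, 2))
```

On real data with many languages and longer itemsets, a tree-based miner such as FP-Growth avoids enumerating candidates explicitly. For pairs alone, direct counting is usually adequate and easier to check by hand.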
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Upadhya, K.J., Rao, B.D., Geetha, M. (2022). Discovery of Popular Languages from GitHub Repository: A Data Mining Approach. In: Reddy, V.S., Prasad, V.K., Wang, J., Reddy, K. (eds) Soft Computing and Signal Processing. ICSCSP 2021. Advances in Intelligent Systems and Computing, vol 1413. Springer, Singapore. https://doi.org/10.1007/978-981-16-7088-6_14
DOI: https://doi.org/10.1007/978-981-16-7088-6_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-7087-9
Online ISBN: 978-981-16-7088-6
eBook Packages: Intelligent Technologies and Robotics (R0)