Mining Open Source Software (OSS) Data Using Association Rules Network

  • Sanjay Chawla
  • Bavani Arunasalam
  • Joseph Davis
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2637)


The Open Source Software (OSS) movement has attracted considerable attention in the last few years. In this paper we report the results of mining data acquired from the largest open source software hosting website. In the process we introduce the Association Rules Network (ARN), a (hyper)graphical model for representing a special class of association rules. Using ARNs, we discover important relationships between the attributes of successful OSS projects. We verify and validate these relationships using Factor Analysis, a classical statistical technique related to Singular Value Decomposition (SVD).
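The idea of reading association rules as hyperedges that point at a goal attribute can be sketched in a few lines of Python. This is only an illustrative simplification, not the authors' algorithm: the toy data, the attribute names (`many_devs`, `active_forum`, `has_docs`, `high_downloads`), the thresholds, and the helper names `support`, `rules_for` and `build_network` are all assumptions made for the example. Each rule "antecedent set implies consequent" is kept as one hyperedge, and the network is grown backwards from the goal by treating antecedent items as the next level's consequents.

```python
from itertools import combinations

# Hypothetical toy data: each set lists attributes observed for one project.
PROJECTS = [
    {"many_devs", "active_forum", "high_downloads"},
    {"many_devs", "active_forum", "high_downloads"},
    {"many_devs", "high_downloads"},
    {"many_devs", "has_docs", "high_downloads"},
    {"many_devs", "has_docs"},
    {"has_docs"},
    {"active_forum"},
    {"active_forum"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rules_for(consequent, transactions, min_support=0.25,
              min_confidence=0.6, max_size=2):
    """Return hyperedges (antecedent, consequent, confidence) for one consequent."""
    items = sorted(set().union(*transactions) - {consequent})
    edges = []
    for size in range(1, max_size + 1):
        for antecedent in combinations(items, size):
            joint = support(set(antecedent) | {consequent}, transactions)
            base = support(antecedent, transactions)
            if joint >= min_support and base > 0 and joint / base >= min_confidence:
                edges.append((frozenset(antecedent), consequent, joint / base))
    return edges


def build_network(goal, transactions, depth=2, **thresholds):
    """Chain backwards from goal, one level of hyperedges per iteration."""
    network, frontier, seen = [], [goal], {goal}
    for _ in range(depth):
        level = []
        for node in frontier:
            for antecedent, consequent, conf in rules_for(node, transactions, **thresholds):
                # Keep only edges whose antecedent items are all new nodes,
                # so no edge points back toward the goal (assumed simplification).
                if antecedent.isdisjoint(seen):
                    level.append((antecedent, consequent, conf))
        network.extend(level)
        new_nodes = set().union(*(a for a, _, _ in level)) - seen
        seen |= new_nodes
        frontier = sorted(new_nodes)
    return network


if __name__ == "__main__":
    for antecedent, consequent, conf in build_network("high_downloads", PROJECTS):
        print(sorted(antecedent), "->", consequent, f"(confidence {conf:.2f})")
```

Fixing the goal attribute first keeps the rule set small and directed, which is what makes a graphical reading possible. A full treatment would also need a principled way to prune redundant rules and handle cycles, which this sketch sidesteps by discarding any edge that reuses an already placed node.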


Open Source Software; Association Rule Networks; Hypergraph clustering; Factor Analysis





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Sanjay Chawla (1)
  • Bavani Arunasalam (1)
  • Joseph Davis (1)
  1. Knowledge Management Research Group, School of Information Technologies, University of Sydney, Australia
