Locality Preserving Projection on Source Code Metrics for Improved Software Maintainability

  • Xin Jin
  • Yi Liu
  • Jie Ren
  • Anbang Xu
  • Rongfang Bie
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4304)


Software project managers commonly use various metrics to assist in the design, maintenance, and implementation of large software systems. The ability to predict the quality of a software object can be viewed as a classification problem, where software metrics are the features and expert quality rankings are the class labels. In this paper we propose a Gaussian Mixture Model (GMM) based method for software quality classification and use Locality Preserving Projection (LPP) to improve the classification performance. GMM is a generative model that describes the overall data set as a combination of several Gaussian distributions. LPP is a dimensionality reduction algorithm that preserves the distances between neighboring samples while projecting the data to a lower dimension. Empirical results on a benchmark dataset show that the two methods are effective.
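The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the neighbourhood size `k`, heat-kernel width `t`, diagonal covariances, the per-class decision rule, all function names, and the synthetic "metrics" data are assumptions made here for illustration only.

```python
import numpy as np

def lpp_fit(X, n_components=2, k=5, t=1.0, reg=1e-6):
    """Learn a (d, n_components) LPP projection from samples in the rows of X."""
    n, d = X.shape
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # k-nearest-neighbour graph (column 0 of the sort is normally the sample itself).
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]
    rows = np.repeat(np.arange(n), k)
    W = np.zeros((n, n))
    W[rows, nn.ravel()] = np.exp(-d2[rows, nn.ravel()] / t)  # heat-kernel weights
    W = np.maximum(W, W.T)                                   # symmetrize the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                                # graph Laplacian
    # Solve X^T L X a = lambda X^T D X a by Cholesky whitening of X^T D X.
    A = X.T @ L @ X
    B = X.T @ D @ X + reg * np.eye(d)
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)
    # eigh sorts eigenvalues ascending; the smallest ones preserve locality best.
    return C_inv.T @ vecs[:, :n_components]

def gmm_log_components(X, weights, means, var):
    """Log of weight_k * N(x | mean_k, diag(var_k)) for every sample and component."""
    diff2 = (X[:, None, :] - means[None, :, :]) ** 2 / var[None, :, :]
    log_norm = -0.5 * np.log(2.0 * np.pi * var).sum(axis=1)
    return np.log(weights)[None, :] + log_norm[None, :] - 0.5 * diff2.sum(axis=2)

def gmm_fit(X, n_components=2, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM with plain EM (assumed settings)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    floor = 1e-3 * X.var(axis=0) + 1e-12          # variance floor against collapse
    means = X[rng.choice(n, n_components, replace=False)]
    var = np.tile(X.var(axis=0) + floor, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each sample.
        logp = gmm_log_components(X, weights, means, var)
        resp = np.exp(logp - logp.max(axis=1, keepdims=True))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        var = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, floor)
    return weights, means, var

def gmm_log_likelihood(X, params):
    """Per-sample log-likelihood under the mixture (log-sum-exp over components)."""
    lp = gmm_log_components(X, *params)
    m = lp.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(lp - m).sum(axis=1, keepdims=True))).ravel()

def fit_classifier(X, y, **kw):
    """One GMM per class plus a log prior (an assumed decision rule)."""
    return {c: (np.log(np.mean(y == c)), gmm_fit(X[y == c], **kw)) for c in np.unique(y)}

def predict(model, X):
    classes = list(model)
    scores = np.column_stack(
        [model[c][0] + gmm_log_likelihood(X, model[c][1]) for c in classes])
    return np.array(classes)[scores.argmax(axis=1)]

# Synthetic stand-in for six software metrics and two quality classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (60, 6)), rng.normal(4.0, 1.0, (60, 6))])
y = np.repeat([0, 1], 60)
X = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize each metric
P = lpp_fit(X, n_components=2)                    # 6-D metrics -> 2-D
Z = X @ P
model = fit_classifier(Z, y)
accuracy = np.mean(predict(model, Z) == y)
```

Fitting one mixture per class and picking the class with the highest prior-weighted likelihood is one common way to turn a generative GMM into a classifier; the abstract does not state which rule the paper uses. The accuracy here is measured on the training data of a toy problem and says nothing about the paper's benchmark results.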


Support Vector Machine; Gaussian Mixture Model; Software Quality; Software Metrics; Locality Preserve Projection





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Xin Jin (1)
  • Yi Liu (1)
  • Jie Ren (1)
  • Anbang Xu (2)
  • Rongfang Bie (1)
  1. College of Information Science and Technology, Beijing Normal University, Beijing, P.R. China
  2. Image Processing & Pattern Recognition Laboratory, Beijing Normal University, Beijing, P.R. China
