Support Vector Machines for Regression and Applications to Software Quality Prediction

  • Xin Jin
  • Zhaodong Liu
  • Rongfang Bie
  • Guoxing Zhao
  • Jixin Ma
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3994)


Software metrics are the key tool in software quality management. In this paper, we propose applying support vector machines for regression to software metrics in order to predict software quality. In experiments, we compare this method with other regression techniques such as Multivariate Linear Regression, Conjunctive Rule and Locally Weighted Regression. Results on the benchmark dataset MIS, using mean absolute error and correlation coefficient as regression performance measures, indicate that support vector machine regression is a promising technique for software quality prediction. In addition, our investigation of PCA-based metrics extraction shows that relatively good performance can still be obtained using only the first few principal components (PCs).
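The two performance measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names and the sample values are assumptions made up for illustration (they are not taken from the MIS dataset), and the correlation coefficient is assumed here to be the Pearson correlation between actual and predicted values.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values (assumed definition)."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_p)


# Hypothetical quality values for five modules and a model's predictions;
# these numbers are invented for illustration only.
actual = [2.0, 5.0, 1.0, 8.0, 4.0]
predicted = [3.0, 4.0, 1.0, 7.0, 6.0]

mae = mean_absolute_error(actual, predicted)
corr = correlation_coefficient(actual, predicted)
print(f"MAE = {mae:.3f}, correlation = {corr:.3f}")
```

A lower mean absolute error and a correlation coefficient closer to 1 both indicate predictions that track the actual values more closely, so the two measures are usually read together when comparing regression techniques.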


Support Vector Machine; Support Vector Regression; Multivariate Linear Regression; Software Quality; Linear Kernel
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Xin Jin (1)
  • Zhaodong Liu (1)
  • Rongfang Bie (1)
  • Guoxing Zhao (2, 3)
  • Jixin Ma (3)
  1. College of Information Science and Technology, Beijing Normal University, Beijing, P.R. China
  2. School of Mathematical Sciences, Beijing Normal University, Beijing, P.R. China
  3. School of Computing and Mathematical Science, The University of Greenwich, London, U.K.
