Software process evaluation: a machine learning framework with application to defect management process

Empirical Software Engineering

Abstract

Software process evaluation is important for improving software development and the quality of software products in a software organization. Conventional approaches based on manual qualitative evaluation (e.g., artifact inspection) are deficient in that (i) they are time-consuming, (ii) they usually suffer from authority constraints, and (iii) they are often subjective. To overcome these limitations, this paper presents a novel semi-automated approach to software process evaluation using machine learning techniques. We focus mainly on the procedural aspect of software processes and formulate the problem as a sequence classification task, in which each sequence carries additional information such as time and roles, and we solve it by applying machine learning algorithms. Based on this framework, we define a new quantitative indicator to evaluate the execution of a software process more objectively. To validate the efficacy of our approach, we apply it to evaluate the execution of a defect management (DM) process in nine real industrial software projects. Our empirical results show that the approach is effective and promising in providing a more objective and quantitative measurement for the DM process evaluation task. Furthermore, we conduct a comprehensive empirical study to compare the proposed machine learning approach with an existing conventional approach (i.e., artifact inspection). Finally, we analyze the advantages and disadvantages of both approaches in detail.
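To make the idea of treating process execution as sequence classification more concrete, the following is a minimal sketch, not the method from the paper. It assumes each defect report is reduced to a sequence of state names, uses bigram counts as features, and classifies with a simple nearest-centroid rule under cosine similarity. The state names, the labels "normal" and "abnormal", the training data, and the `normal_rate` summary are all hypothetical illustrations; the indicator actually defined in the paper is not given in this abstract, and additional information such as time and roles is not modeled here.

```python
from collections import Counter
from math import sqrt


def ngram_features(states, n=2):
    """Count contiguous n-grams of states in one execution sequence."""
    return Counter(tuple(states[i:i + n]) for i in range(len(states) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Sum the feature vectors of one class (direction is what matters for cosine)."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def classify(states, centroids, n=2):
    """Return the label whose centroid is most similar to the sequence."""
    feats = ngram_features(states, n)
    return max(centroids, key=lambda label: cosine(feats, centroids[label]))


# Hypothetical labeled defect histories (assumed data, not from the paper).
train = [
    (["new", "assigned", "resolved", "verified", "closed"], "normal"),
    (["new", "assigned", "resolved", "reopened", "resolved", "verified", "closed"], "normal"),
    (["new", "closed"], "abnormal"),
    (["new", "resolved", "closed"], "abnormal"),
]

centroids = {
    label: centroid(ngram_features(seq) for seq, lab in train if lab == label)
    for label in sorted({lab for _, lab in train})
}

# Hypothetical unlabeled sequences from one project.
project = [
    ["new", "assigned", "resolved", "verified", "closed"],
    ["new", "closed"],
    ["new", "assigned", "resolved", "closed"],
]
predictions = [classify(seq, centroids) for seq in project]

# Illustrative summary only: the share of sequences predicted "normal".
normal_rate = predictions.count("normal") / len(predictions)
print(predictions, round(normal_rate, 3))
```

A per-project ratio like this is only one plausible way to turn per-sequence predictions into a single number; any real use would need labeled data from process experts and a properly validated classifier.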

Notes

  1. http://www.sei.cmu.edu/.

  2. http://www.cais.ntu.edu.sg/~chhoi/spe/.

Acknowledgements

This work was supported by Nanyang Technological University SUG Grant M58020016, AcRF Tier 1 Grant RG 35/09 and MOE Academic Tier-1 Grant RG 33/11. We thank Quanxi Mi for sharing the raw data sets and for great help with the comparative study.

Author information

Corresponding author

Correspondence to Ning Chen.

Additional information

Communicated by: Tim Menzies

About this article

Cite this article

Chen, N., Hoi, S.C.H. & Xiao, X. Software process evaluation: a machine learning framework with application to defect management process. Empir Software Eng 19, 1531–1564 (2014). https://doi.org/10.1007/s10664-013-9254-z

  • DOI: https://doi.org/10.1007/s10664-013-9254-z