Abstract
Software process evaluation is important for improving software development and the quality of software products in a software organization. Conventional approaches based on manual qualitative evaluation (e.g., artifact inspection) are deficient in that (i) they are time-consuming, (ii) they are usually subject to authority constraints, and (iii) they are often subjective. To overcome these limitations, this paper presents a novel semi-automated approach to software process evaluation using machine learning techniques. We focus mainly on the procedural aspect of software processes and formulate the problem as a classification task over sequences (with additional information such as time and roles), which we solve by applying machine learning algorithms. Based on this framework, we define a new quantitative indicator to evaluate the execution of a software process more objectively. To validate the efficacy of our approach, we apply it to evaluate the execution of a defect management (DM) process in nine real industrial software projects. Our empirical results show that the approach is effective and promising in providing a more objective and quantitative measurement for the DM process evaluation task. Furthermore, we conduct a comprehensive empirical study to compare the proposed machine learning approach with an existing conventional approach (i.e., artifact inspection). Finally, we analyze the advantages and disadvantages of both approaches in detail.
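As a rough illustration of what treating process executions as a sequence classification task can look like, the sketch below labels a sequence of process events by comparing its n-gram counts with those of labelled examples. Everything in it is an assumption made for illustration and is not taken from the paper: the event names, the labels, the n-gram features, the cosine similarity, and the 1-nearest-neighbour rule. It also ignores the additional information (time, roles) mentioned in the abstract.

```python
from collections import Counter
from math import sqrt


def ngram_counts(seq, n_max=2):
    """Count all contiguous n-grams of length 1..n_max in an event sequence."""
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (Counters)."""
    dot = sum(v * b[k] for k, v in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(seq, training):
    """Return the label of the most similar training sequence (1-nearest neighbour)."""
    feats = ngram_counts(seq)
    best = max(training, key=lambda ex: cosine(feats, ngram_counts(ex[0])))
    return best[1]


# Hypothetical labelled executions of a defect management process.
# The state names and the "normal"/"abnormal" labels are invented for this sketch.
TRAINING = [
    (["new", "assigned", "fixed", "verified", "closed"], "normal"),
    (["new", "assigned", "fixed", "reopened", "fixed", "verified", "closed"], "normal"),
    (["new", "closed"], "abnormal"),
    (["new", "fixed", "closed"], "abnormal"),
]

if __name__ == "__main__":
    print(classify(["new", "assigned", "fixed", "verified", "closed"], TRAINING))
    print(classify(["new", "closed"], TRAINING))
```

A nearest-neighbour rule is used here only because it keeps the example short; any standard classifier could be trained on the same kind of feature vectors.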
Acknowledgements
This work was supported by Nanyang Technological University SUG Grant M58020016, AcRF Tier 1 Grant RG 35/09 and MOE Academic Tier-1 Grant RG 33/11. We thank Quanxi Mi for sharing the raw data sets and for great help with the comparative study.
Additional information
Communicated by: Tim Menzies
Cite this article
Chen, N., Hoi, S.C.H. & Xiao, X. Software process evaluation: a machine learning framework with application to defect management process. Empir Software Eng 19, 1531–1564 (2014). https://doi.org/10.1007/s10664-013-9254-z
DOI: https://doi.org/10.1007/s10664-013-9254-z