Abstract
In this paper, we approach, from a machine learning perspective, the problem of automatically detecting defective software entities (classes and methods) in existing software systems, a problem of major importance during software maintenance and evolution. Identifying faulty entities such as classes, modules, and methods is essential for developers who want to improve the internal quality of a software system. Because defective software entities are hard to identify, machine learning-based classification models continue to be developed for detecting software design defects. We propose a novel method that discovers relational association rules in order to identify design defects and faulty entities in existing software systems. Relational association rules are a particular type of association rule that describe numerical orderings between attributes that commonly occur over a dataset. Experiments on open source software are conducted in order to detect defective classes in object-oriented software systems, and a comparison of our approach with similar existing approaches is provided. The obtained results show that our method is effective for software design defect detection and confirm the potential of our proposal.
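To make the notion of a relational association rule concrete, the following minimal sketch mines only the simplest case: orderings between two attributes that hold in a large fraction of the records. It is an illustration under stated assumptions, not the method evaluated in the paper. The function name `mine_pairwise_relations`, the metric names, the sample values, and the support threshold are all hypothetical.

```python
from itertools import combinations
import operator

# The numerical orderings considered between two attribute values.
RELATIONS = {"<": operator.lt, "=": operator.eq, ">": operator.gt}


def mine_pairwise_relations(rows, names, min_support=0.9):
    """Return (left, relation, right, support) for every ordering between
    two attributes that holds in at least `min_support` of the rows.

    Hypothetical helper: it covers rules of length two only.
    """
    if not rows:
        return []
    rules = []
    for i, j in combinations(range(len(names)), 2):
        for symbol, holds in RELATIONS.items():
            # Support = fraction of records in which the ordering holds.
            matches = sum(1 for row in rows if holds(row[i], row[j]))
            support = matches / len(rows)
            if support >= min_support:
                rules.append((names[i], symbol, names[j], support))
    return rules


# Hypothetical metric vectors, one per class: (methods, attributes, coupling).
names = ["methods", "attributes", "coupling"]
rows = [
    (12, 4, 3),
    (8, 5, 2),
    (20, 6, 9),
    (7, 3, 1),
]

rules = mine_pairwise_relations(rows, names, min_support=1.0)
for left, symbol, right, support in rules:
    print(f"{left} {symbol} {right} (support {support:.2f})")
```

On this toy data, only `methods > attributes` and `methods > coupling` hold in every record. The ordering `attributes > coupling` holds in three of the four records, so it appears only when the threshold is lowered to 0.75 or below.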
Acknowledgments
The authors would like to thank the editor and the anonymous reviewers for their valuable comments and suggestions to improve the paper and the presentation.
Cite this article
Czibula, G., Marian, Z. & Czibula, I.G. Detecting software design defects using relational association rule mining. Knowl Inf Syst 42, 545–577 (2015). https://doi.org/10.1007/s10115-013-0721-z
DOI: https://doi.org/10.1007/s10115-013-0721-z