Abstract
The opinions and perspectives of software developers are highly regarded in software engineering research. The experience and knowledge of software practitioners are frequently sought to validate assumptions and to evaluate software engineering tools, techniques, and methods. However, experimental evidence may reveal further or different insights, and in some cases even contradict developers' perspectives. In this work, we investigate the correlation between software developers' perspectives and experimental evidence about testability smells (i.e., programming practices that may reduce the testability of a software system). Specifically, we first elicit the opinions and perspectives of software developers through a questionnaire survey on a catalog of four testability smells that we curated for this work. We also extend our tool DesigniteJava to automatically detect these smells in order to gather empirical evidence on testability smells. To this end, we conduct a large-scale empirical study on 1,115 Java repositories containing approximately 46 million lines of code to investigate the relationship of testability smells with test quality, the number of tests, and reported bugs. Our results show that testability smells correlate neither with test smells at the class granularity nor with test suite size. Furthermore, we do not find a causal relationship between testability smells and bugs. Moreover, our results highlight that the empirical evidence does not match developers' perspectives on testability smells, suggesting that, despite developers' invaluable experience, their opinions and perspectives might need to be complemented with empirical evidence before being brought into practice. This further confirms the importance of data-driven software engineering, which advocates that all design and development decisions should be supported by data.
Data Availability
The replication package is available on GitHub: https://github.com/SMART-Dal/testability.
Acknowledgements
Maria Kechagia and Federica Sarro are supported by the ERC grant no. 741278 (EPIC).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by: Dietmar Pfahl.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sharma, T., Georgiou, S., Kechagia, M. et al. Investigating developers’ perception on software testability and its effects. Empir Software Eng 28, 120 (2023). https://doi.org/10.1007/s10664-023-10373-0