
Investigating developers’ perception on software testability and its effects


Abstract

The opinions and perspectives of software developers are highly regarded in software engineering research. The experience and knowledge of software practitioners are frequently sought to validate assumptions and to evaluate software engineering tools, techniques, and methods. However, experimental evidence may reveal further or different insights, and in some cases even contradict developers’ perspectives. In this work, we investigate the correlation between software developers’ perspectives and experimental evidence about testability smells (i.e., programming practices that may reduce the testability of a software system). Specifically, we first elicit developers’ opinions and perspectives through a questionnaire survey based on a catalog of four testability smells that we curated for this work. We also extend our tool DesigniteJava to automatically detect these smells and thereby gather empirical evidence about them. To this end, we conduct a large-scale empirical study on 1,115 Java repositories containing approximately 46 million lines of code to investigate the relationship of testability smells with test quality, number of tests, and reported bugs. Our results show that testability smells do not correlate with test smells at the class granularity or with test suite size. Furthermore, we do not find a causal relationship between testability smells and bugs. Moreover, our results highlight that the empirical evidence does not match developers’ perspectives on testability smells. This suggests that, despite developers’ invaluable experience, their opinions and perspectives might need to be complemented with empirical evidence before being brought into practice. This further confirms the importance of data-driven software engineering, which advocates ensuring that all design and development decisions are supported by data.
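The catalog of four testability smells is not reproduced on this page, so the following Java sketch is only an assumed illustration of the kind of programming practice the abstract refers to, not necessarily one of the paper's cataloged smells. A class that hard-wires a concrete collaborator is typically harder to isolate in a unit test than one that receives the collaborator through its constructor; all class and method names below are hypothetical.

import java.util.List;

// Concrete collaborator that would talk to an external system in production.
class RemotePaymentService {
    List<Double> fetchInvoiceAmounts(String customerId) {
        throw new UnsupportedOperationException("network call");
    }
}

// Hard-wired dependency: the collaborator is created inside the class, so a
// unit test cannot substitute a fake and is forced to exercise the real service.
class ReportGeneratorHardWired {
    private final RemotePaymentService payments = new RemotePaymentService();

    double totalFor(String customerId) {
        return payments.fetchInvoiceAmounts(customerId)
                       .stream().mapToDouble(Double::doubleValue).sum();
    }
}

// More testable variant: the collaborator sits behind an interface and is
// injected, so a test can pass an in-memory stub.
interface PaymentSource {
    List<Double> fetchInvoiceAmounts(String customerId);
}

class ReportGenerator {
    private final PaymentSource payments;

    ReportGenerator(PaymentSource payments) {
        this.payments = payments;
    }

    double totalFor(String customerId) {
        return payments.fetchInvoiceAmounts(customerId)
                       .stream().mapToDouble(Double::doubleValue).sum();
    }
}

A detector in this spirit could, for example, flag classes that directly instantiate their collaborators in fields or constructors; whether this matches the detection rules actually implemented in DesigniteJava is an assumption.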


Data Availability

The replication package is available on GitHub: https://github.com/SMART-Dal/testability.

Notes

  1. https://github.com/forcedotcom/wsc/blob/master/src/main/java/com/sforce/ws/wsdl/Binding.java

  2. https://github.com/forcedotcom/wsc/blob/master/src/main/java/com/sforce/async/JobInfo.java

  3. https://github.com/forcedotcom/wsc/blob/master/src/main/java/com/sforce/async/Error.java

  4. bindingURL

  5. https://www.designite-tools.com/blog/understanding-testability-test-smells

  6. https://www.designite-tools.com/designitejava/

  7. https://github.com/j256/ormlite-jdbc

  8. https://github.com/forcedotcom/wsc

  9. https://github.com/magarena/magarena

  10. https://github.com/enonic/xp

  11. https://github.com/rundeck/rundeck

  12. https://github.com/MyRobotLab/myrobotlab

  13. https://github.com/nemerosa/ontrack


Acknowledgements

Maria Kechagia and Federica Sarro are supported by the ERC grant no. 741278 (EPIC).

Author information


Corresponding author

Correspondence to Tushar Sharma.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by: Dietmar Pfahl.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sharma, T., Georgiou, S., Kechagia, M. et al. Investigating developers’ perception on software testability and its effects. Empir Software Eng 28, 120 (2023). https://doi.org/10.1007/s10664-023-10373-0

