
RePOR: Mimicking humans on refactoring tasks. Are we there yet?

Empirical Software Engineering

Abstract

Refactoring is a maintenance activity that aims to improve design quality while preserving the behavior of a system. Several (semi-)automated approaches have been proposed to support developers in this activity, based on the correction of anti-patterns, which are “poor” solutions to recurring design problems. However, little quantitative evidence exists about the impact of automatically refactored code on program comprehension, and about the contexts in which automated refactoring can be as effective as manual refactoring. Leveraging RePOR, an automated refactoring approach based on partial order reduction techniques, we performed an empirical study to investigate whether the structure of automatically refactored code affects the understandability of systems during comprehension tasks. (1) We surveyed 80 developers, asking them to identify, from a set of 20 refactoring changes, whether each change was generated by developers or by a tool, and to rate the refactoring changes according to their design quality; (2) we asked 30 developers to complete code comprehension tasks on 10 systems that were refactored by either a freelancer or an automated refactoring tool. To make the comparison fair, for the subset of refactoring actions that introduce new code entities, only synthetic identifiers were presented to practitioners. We measured developers’ performance using the NASA Task Load Index for their effort, the time that they spent performing the tasks, and their percentages of correct answers. Our findings, despite current technology limitations, show that it is reasonable to expect a refactoring tool to match developer code. Indeed, results show that for 3 out of the 5 anti-pattern types studied, developers could not recognize the origin of the refactoring (i.e., whether it was performed by a human or an automated tool). We also observed that developers do not prefer human refactorings over automated refactorings, except when refactoring Blob classes, and that there is no statistically significant difference between the impact of human refactorings and automated refactorings on code understandability. We conclude that automated refactorings can be as effective as manual refactorings. However, for complex anti-pattern types like the Blob, the perceived quality achieved by developers is slightly higher.
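
To illustrate the kind of behavior-preserving transformation studied here, the listing below sketches an Extract Class refactoring that decomposes a Blob-like class by moving a cohesive reporting responsibility into a new class. The example is hypothetical: the class and method names are invented for illustration and are not taken from the paper, from RePOR’s output, or from the refactored study systems.

    import java.util.ArrayList;
    import java.util.List;

    // Before: a Blob-like class that mixes order bookkeeping with report generation.
    class OrderManager {
        private final List<String> orders = new ArrayList<>();

        void addOrder(String order) {
            orders.add(order);
        }

        // Reporting logic tangled into the same class.
        String buildReport() {
            StringBuilder sb = new StringBuilder("Orders:\n");
            for (String o : orders) {
                sb.append(" - ").append(o).append('\n');
            }
            return sb.toString();
        }
    }

    // After: Extract Class moves the reporting responsibility into its own class.
    class OrderReporter {
        String buildReport(List<String> orders) {
            StringBuilder sb = new StringBuilder("Orders:\n");
            for (String o : orders) {
                sb.append(" - ").append(o).append('\n');
            }
            return sb.toString();
        }
    }

    // The original class now delegates; callers observe the same report text,
    // so the refactoring preserves behavior while improving cohesion.
    class RefactoredOrderManager {
        private final List<String> orders = new ArrayList<>();
        private final OrderReporter reporter = new OrderReporter();

        void addOrder(String order) {
            orders.add(order);
        }

        String buildReport() {
            return reporter.buildReport(orders);
        }
    }

In the study, newly introduced entities were shown to participants with synthetic identifiers precisely so that descriptive names like the ones above could not reveal whether a human or a tool produced the change.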


Notes

  1. Freelancer.com and Guru.com

  2. https://sourceforge.net/

  3. https://github.com/moar82/RefGen

  4. http://www.jotform.com


Author information


Corresponding author

Correspondence to Rodrigo Morales.

Additional information

Communicated by: Vittorio Cortellessa



About this article


Cite this article

Morales, R., Khomh, F. & Antoniol, G. RePOR: Mimicking humans on refactoring tasks. Are we there yet?. Empir Software Eng 25, 2960–2996 (2020). https://doi.org/10.1007/s10664-020-09826-7

