
BlackHorse: creating smart test cases from brittle recorded tests


Testing software that has a GUI is difficult. Manual testing is costly and error-prone, but recorded test cases frequently “break” due to changes in the GUI. Test cases intended to test business logic must therefore be converted to a less “brittle” form to lengthen their useful lifespan. In this paper, we describe BlackHorse, an approach that converts a recorded test case into Java code that bypasses the GUI. The approach was implemented within the testing environment of Research In Motion. We describe the design of the toolset and discuss lessons learned during the course of the project.
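The contrast between a brittle recorded test and a test that bypasses the GUI can be illustrated with a minimal sketch. This is not BlackHorse's generated code: `ContactBook`, `ContactScreen`, and the menu layouts are hypothetical names invented for illustration, and the "recording" is reduced to a single menu index. The sketch only assumes that a recorded test replays GUI positions, while a converted test calls the business logic directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SmartTestSketch {

    /** Hypothetical business logic under test. */
    static class ContactBook {
        private final List<String> names = new ArrayList<>();

        void add(String name) {
            if (name == null || name.isEmpty()) {
                throw new IllegalArgumentException("empty name");
            }
            names.add(name);
        }

        boolean contains(String name) { return names.contains(name); }

        int size() { return names.size(); }
    }

    /** Hypothetical GUI layer: a menu whose item order may change between releases. */
    static class ContactScreen {
        private final ContactBook book;
        private final List<String> menu;

        ContactScreen(ContactBook book, List<String> menu) {
            this.book = book;
            this.menu = menu;
        }

        /** Activates the menu item at the given position, then "types" the text. */
        void selectMenuItem(int index, String typedText) {
            if ("Add Contact".equals(menu.get(index))) {
                book.add(typedText);
            }
        }
    }

    /** Brittle: replays "first menu item, type Alice", so it depends on the menu order. */
    public static boolean recordedTest(List<String> menuLayout) {
        ContactBook book = new ContactBook();
        ContactScreen screen = new ContactScreen(book, menuLayout);
        screen.selectMenuItem(0, "Alice");
        return book.contains("Alice");
    }

    /** Less brittle: exercises the same business logic without touching the GUI. */
    public static boolean smartTest() {
        ContactBook book = new ContactBook();
        book.add("Alice");
        return book.contains("Alice") && book.size() == 1;
    }

    public static void main(String[] args) {
        // Original layout: the recorded test passes.
        System.out.println(recordedTest(Arrays.asList("Add Contact", "Delete Contact"))); // true
        // A new item is inserted at the top: the recorded test breaks, the logic is unchanged.
        System.out.println(recordedTest(Arrays.asList("Search", "Add Contact", "Delete Contact"))); // false
        // The GUI-bypassing test is unaffected by either layout.
        System.out.println(smartTest()); // true
    }
}
```

In this toy setting the recorded test fails after a purely cosmetic menu change even though the business logic is untouched, which is the kind of breakage the abstract describes.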




  1. In some shops, any piece of code that acts as a test case is referred to as a “unit test.” We prefer to reserve this phrase for a method that tests a small unit of source code, such as a single class.

  2. Product and framework names have been given pseudonyms here for confidentiality reasons.

  3. Note that the client therefore always had the ability to record test cases and convert them into sequences of keypresses. Test engineers were strongly discouraged from doing this, since it would simply have led to brittle recorded tests in the form of Java code.




The authors would like to thank Mark Chatterley, Sebastian Elbaum, Ali Hesson, Johanne Leduc, and Lee Manchur for valuable discussions and comments. Thanks also to the anonymous referees of an earlier version of this paper. The work reported in this paper was supported by an Interaction grant and an Engage grant from the Natural Sciences and Engineering Research Council of Canada (NSERC).

Author information



Corresponding author

Correspondence to James H. Andrews.


About this article

Cite this article

Carino, S., Andrews, J.H., Goulding, S. et al. BlackHorse: creating smart test cases from brittle recorded tests. Software Qual J 22, 293–310 (2014).



  • Software testing
  • Test recording and playback
  • Program generation