Online Testing with Reinforcement Learning

  • Margus Veanes
  • Pritam Roy
  • Colin Campbell
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4262)


Online testing is a practical technique in which test derivation and test execution are combined into a single algorithm. In this paper we describe a new online testing algorithm that optimizes the choice of test actions using Reinforcement Learning (RL) techniques. This yields better coverage of system behaviors in less time than a purely random choice of test actions. Online testing with conformance checking is modeled as a \(1\frac{1}{2}\)-player game, or Markov Decision Process (MDP), between the tester as one player and the implementation under test (IUT) as the opponent. Our approach has been implemented in C#, and benchmark results are presented in the paper. The specifications from which the tests are generated are written as model programs in any .NET language, such as C# or VB.
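To make the idea concrete, the following is a minimal sketch (not the paper's algorithm) of an online testing loop in which the tester's choice among enabled controllable actions is guided by a simple RL value estimate: a Q-learning-style update that rewards visiting previously unexplored (state, action) pairs, so the tester is biased toward new behavior instead of choosing purely at random. The `IModelProgram` interface, the reward scheme, and all identifiers are hypothetical and stand in for whatever adapter connects the model program to the IUT.

```csharp
// Minimal sketch of RL-guided online test action selection (illustrative only).
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical abstraction over a model program / IUT adapter.
interface IModelProgram
{
    string CurrentState { get; }
    IList<string> EnabledControllableActions();   // actions the tester may invoke
    string Execute(string action);                // run the action, return the new state
}

class RlOnlineTester
{
    readonly Dictionary<(string state, string action), double> q = new();
    readonly HashSet<(string, string)> visited = new();
    readonly Random rng = new();
    const double Alpha = 0.5;    // learning rate
    const double Gamma = 0.9;    // discount factor
    const double Epsilon = 0.1;  // probability of a purely random action

    public void Run(IModelProgram model, int steps)
    {
        for (int i = 0; i < steps; i++)
        {
            string s = model.CurrentState;
            var actions = model.EnabledControllableActions();
            if (actions.Count == 0) break;

            // Epsilon-greedy choice: mostly pick the action with the highest estimate.
            string a = rng.NextDouble() < Epsilon
                ? actions[rng.Next(actions.Count)]
                : actions.OrderByDescending(x => Q(s, x)).First();

            string s2 = model.Execute(a);

            // Reward 1 the first time a (state, action) pair is exercised, 0 afterwards.
            double reward = visited.Add((s, a)) ? 1.0 : 0.0;

            double best = model.EnabledControllableActions()
                               .Select(x => Q(s2, x)).DefaultIfEmpty(0.0).Max();
            q[(s, a)] = Q(s, a) + Alpha * (reward + Gamma * best - Q(s, a));
        }
    }

    double Q(string s, string a) => q.TryGetValue((s, a), out var v) ? v : 0.0;
}
```

In this sketch conformance checking is left out; a real online tester would also observe the IUT's responses after each controllable action and report a failure when an observation is not allowed by the model program.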


Keywords: Model Program · Reinforcement Learning · Markov Decision Process · \(1\frac{1}{2}\)-Player Game · Labeled Transition System





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Margus Veanes (1)
  • Pritam Roy (2)
  • Colin Campbell (1)

  1. Microsoft Research, Redmond, USA
  2. University of California, Santa Cruz, USA
