
Mining understandable state machine models from embedded code

Empirical Software Engineering

Abstract

Program understanding is a time-consuming and tedious activity for software developers. Manually building abstractions from source code requires in-depth analysis of the code. Automatic extraction of such models is possible, but it cannot derive meaningful abstractions that are not already contained in the code. Automated extraction even has difficulty deciding which aspects of the code are important and which are not. Therefore, interactive semi-automatic approaches are the compromise of choice. In this article, we describe how state machines that describe the behaviour of a function can be extracted from code. The approach is interactive: the user decides which pieces of the potentially relevant information that was identified are actually relevant. This helps to reduce the resulting state machines to an understandable size. However, in their raw form these state machines have transition conditions (guards) that are too complex for humans to understand. Therefore, we also introduce a technique to reduce these guards to an understandable form. The technique combines heuristic logic minimization, exploitation of infeasible paths, and transition priorities. We evaluate the approach on industrial embedded C code, first in a case study with hundreds of extracted state machines, and then in two experiments with professional developers. The results show that the approach is highly effective in making the guards understandable, and that guards reduced by our approach and presented with priorities are easier to understand than guards without priorities. We also show that the overall approach is beneficial for program comprehension. The guard reduction approach itself is quite generic and can also be applied to other problems. We demonstrate this by simplifying mode switch logic.
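To illustrate why transition priorities can shorten guards, the following minimal sketch compares a set of mutually exclusive raw guards with shorter guards that are evaluated in priority order. It is not taken from the article: the transition names `T1` to `T3`, the Boolean inputs `a`, `b`, `c`, and all function names are hypothetical, and the brute-force truth-table check only stands in for the heuristic logic minimization the abstract mentions. The optional `feasible` predicate hints at how infeasible paths could be treated as don't-care cases.

```python
from itertools import product

# Raw guards of three outgoing transitions of one state.
# Hypothetical example: each guard must explicitly exclude the earlier ones.
RAW = [
    ("T1", lambda a, b, c: a),
    ("T2", lambda a, b, c: (not a) and b),
    ("T3", lambda a, b, c: (not a) and (not b) and c),
]

# Candidate reduced guards, meant to be checked in priority order
# (first match wins), so the negated terms become unnecessary.
PRIORITIZED = [
    ("T1", lambda a, b, c: a),
    ("T2", lambda a, b, c: b),
    ("T3", lambda a, b, c: c),
]


def fire_raw(a, b, c):
    """Return the single enabled transition under the raw guards, or None."""
    enabled = [name for name, guard in RAW if guard(a, b, c)]
    assert len(enabled) <= 1, "raw guards are expected to be mutually exclusive"
    return enabled[0] if enabled else None


def fire_prioritized(a, b, c):
    """Return the first transition whose reduced guard holds, or None."""
    for name, guard in PRIORITIZED:
        if guard(a, b, c):
            return name
    return None


def equivalent(feasible=lambda a, b, c: True):
    """Check that both guard sets pick the same transition for every
    feasible input; infeasible inputs are ignored (don't-care cases)."""
    return all(
        fire_raw(*values) == fire_prioritized(*values)
        for values in product([False, True], repeat=3)
        if feasible(*values)
    )
```

Calling `equivalent()` checks all eight input combinations. Passing a predicate such as `lambda a, b, c: not (a and b)` restricts the check to combinations assumed to be feasible.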

Notes

  1. https://www.mathworks.com/products/stateflow.html

  2. https://www.etas.com/en/products/ascet-developer-ascet_md.php

  3. https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/eqntott/espresso-ab-1.0.tar.gz

  4. https://www.etas.com/en/products/scode-analyzer.php

Author information

Corresponding author

Correspondence to Wasim Said.

Additional information

Communicated by: Massimiliano Di Penta and David D. Shepherd

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Software Analysis, Evolution and Reengineering (SANER)

About this article

Cite this article

Said, W., Quante, J. & Koschke, R. Mining understandable state machine models from embedded code. Empir Software Eng 25, 4759–4804 (2020). https://doi.org/10.1007/s10664-020-09865-0

  • DOI: https://doi.org/10.1007/s10664-020-09865-0
