Formal Methods in System Design

, Volume 46, Issue 1, pp 1–41 | Cite as

Generating models of infinite-state communication protocols using regular inference with abstraction

  • Fides Aarts
  • Bengt Jonsson
  • Johan Uijen
  • Frits Vaandrager
Article

Abstract

In order to facilitate model-based verification and validation, effort is underway to develop techniques for generating models of communication system components from observations of their external behavior. Most previous such work has employed regular inference techniques which generate modest-size finite-state models. They typically suppress parameters of messages, although these have a significant impact on control flow in many communication protocols. We present a framework, which adapts regular inference to include data parameters in messages and states for generating components with large or infinite message alphabets. A main idea is to adapt the framework of predicate abstraction, successfully used in formal verification. Since we are in a black-box setting, the abstraction must be supplied externally, using information about how the component manages data parameters. We have implemented our techniques by connecting the LearnLib tool for regular inference with an implementation of session initiation protocol (SIP) in ns-2 and an implementation of transmission control protocol (TCP) in Windows 8, and generated models of SIP and TCP components.

Keywords

Active automata learning Mealy machines Abstraction techniques Communication protocols Session initiation protocol Transmission control protocol 

List of symbols

\(\mathcal{A}\)

Mapper

\(\mathcal{A}_S\)

Symbolic mapper

\(\mathcal{H}\)

Hypothesis (Mealy machine)

\(\mathcal{M}\)

Mealy machine

\(\mathcal{M}_S\)

Symbolic Mealy machine

\(H\)

Set of states of hypothesis

\(I\)

Set of (concrete) input symbols

\(O\)

Set of (concrete) output symbols

\(Q\)

Set of states of a Mealy machine

\(R\)

Set of mapper states

\(T\)

Set of event terms

\(V\)

Set of variables

\(X\)

Set of (abstract) input symbols

\(Y\)

Set of (abstract) output symbols

\(a\)

(Input or output) symbol

\(d\)

Parameter value

\(e\)

Term

\(h\)

State of hypothesis

\(h_0\)

Initial state of hypothesis

\(i\)

(Concrete) input symbol

\(j, k, l, m, n\)

Index

\(o\)

(Concrete) output symbol

\(p\)

Parameter

\(q\)

State of Mealy machine

\(q_0\)

Initial state of Mealy machine

\(r\)

State of mapper

\(r_0\)

Initial state of mapper

\(s\)

Sequence of output symbols

\(t\)

Term

\(u\)

Sequence of input symbols

\(v\)

Variable

\(w\)

Sequence of input and output symbols

\(x\)

Abstract input symbol

\(y\)

Abstract output symbol

\(\alpha _\mathcal{A}\)

Abstraction induced by \(\mathcal{A}\)

\(\gamma _\mathcal{A}\)

Concretization induced by \(\mathcal{A}\)

\(\delta \)

Update function

\(\epsilon \)

Empty sequence

\(\varepsilon \)

Event primitive

\(\xi \)

Valuation

\(\tau _\mathcal{A}\)

Observation abstraction function induced by \(\mathcal{A}\)

\(\varphi \)

Formula

\(\psi \)

Abstraction function

\(\varDelta \)

Set of (symbolic) transitions

\(\varTheta \)

Initial condition

\(\varSigma \)

Event signature

\(\varPsi \)

Set of event abstractions

\(\perp \)

Undefined value

\(\rightarrow \)

Transition relation

\(\Rightarrow \)

Transition relation extended to sequences

\(\equiv \)

Syntactic equality (of terms)

\(\approx \)

Observation equivalence (of Mealy machines)

\(\le \)

Implementation preorder/behavior inclusion (of Mealy machines)

\(\approx _{wb}\)

Observation congruence (of CCS expressions)

References

  1. 1.
    Aarts F, Heidarian F, Kuppens H, Olsen P, Vaandrager FW (2012) Automata learning through counterexample-guided abstraction refinement. In: Giannakopoulou D, Méry D (eds) 18th international symposium on formal methods (FM 2012), Paris, France, August 27–31, 2012. Proceedings, volume 7436 of lecture notes in computer science. Springer, Berlin, pp 10–27. AugustGoogle Scholar
  2. 2.
    Aarts F, Heidarian F, Vaandrager FW (2012) A theory of abstractions for learning interface automata. In: Koutny M, Ulidowski I (eds) 23rd international conference on concurrency theory (CONCUR), Newcastle upon Tyne, UK, September 3–8, 2012. Proceedings, volume 7454 of lecture notes in computer science. Springer, Berlin, pp 240–255Google Scholar
  3. 3.
    Aarts F, Jonsson B, Uijen J (2010) Generating models of infinite-state communication protocols using regular inference with abstraction. In: Petrenko A, Maldonado JC, Simao A (eds) 22nd IFIP international conference on testing software and systems, Natal, Brazil, November 8–10, Proceedings, volume 6435 of lecture notes in computer science. Springer, Berlin, pp 188–204Google Scholar
  4. 4.
    Aarts F, Kuppens H, Tretmans GJ, Vaandrager FW, Verwer S (2012) Learning and testing the bounded retransmission protocol. In: Heinz J, de la Higuera C, and Oates T (eds) Proceedings 11th international conference on grammatical inference (ICGI 2012), September 5–8, 2012. University of Maryland, College Park, USA, volume 21 of JMLR workshop and conference proceedings, pp 4–18Google Scholar
  5. 5.
    Aarts F, Schmaltz J, Vaandrager FW (2010) Inference and abstraction of the biometric passport. In: Margaria T, Steffen B (eds) Leveraging applications of formal methods, verification, and balidation—4th international symposium on leveraging applications, ISoLA 2010, Heraklion, Crete, Greece, October 18–21, 2010, Proceedings, part I, volume 6415 of lecture notes in computer science. Springer, Berlin, pp 673–686Google Scholar
  6. 6.
    Ammons G, Bodik R, Larus J (2002) Mining specifications. In: Proceedings of 29th ACM symposium on principles of programming languages, pp 4–16Google Scholar
  7. 7.
    Angluin D (1987) Learning regular sets from queries and counterexamples. Inf Comput 75(2):87–106CrossRefMATHMathSciNetGoogle Scholar
  8. 8.
    Ball T, Rajamani SK (2002) The SLAM project: debugging system software via static analysis. In: Proceedings of the 29th ACM symposium on principles of programming languages, pp 1–3Google Scholar
  9. 9.
    Berg T, Jonsson B, Raffelt H (2006) Regular inference for state machines with parameters. In: Baresi L, Heckel R (eds) FASE, volume 3922 of lecture notes in computer science. Springer, Berlin, pp 107–121Google Scholar
  10. 10.
    Bergstra JA, Ponse A, Smolka SA (2001) editors. Handbook of process algebra. North-HollandGoogle Scholar
  11. 11.
    Broy M, Jonsson B, Katoen J-P, Leucker M, Pretschner A (2004) editors. Model-based testing of reactive systems, volume 3472 of lecture notes in computer science. Springer, BerlinGoogle Scholar
  12. 12.
    Brun Y, Ernst MD (2004) Finding latent code errors via machine learning over program executions. In: ICSE’04: 26th international conference on software engineringGoogle Scholar
  13. 13.
    Cassel S, Howar F, Jonsson B, Merten M, Steffen B (2011) A succinct canonical register automaton model. In: Bultan T, Hsiung P-A (eds) Automated technology for verification and analysis, 9th international symposium, ATVA 2011, Taipei, Taiwan, October 11–14, 2011. In: Bultan T, Hsiung P-A (eds) Proceedings, volume 6996 of lecture notes in computer science. Springer, Berlin, pp 366–380Google Scholar
  14. 14.
    Yuan CC, Domagoj B, ECR Shin, Song Dawn (2010) Inference and analysis of formal models of botnet command and control protocols. In: Al-Shaer E, Keromytis AD, and Shmatikov V (eds) ACM conference on computer and communications security. ACM, pp 426–439Google Scholar
  15. 15.
    Clarke EM, Grumberg O, Jha S, Lu Y, Veith H (2003) Counterexample-guided abstraction refinement for symbolic model checking. J ACM 50(5):752–794CrossRefMathSciNetGoogle Scholar
  16. 16.
    Cobleigh JM, Giannakopoulou D, Pasareanu CS (2003) Learning assumptions for compositional verification. In: Proceedings of the TACAS ’03, 9th international conference on tools and algorithms for the construction and analysis of systems, volume 2619 of lecture notes in computer science. Springer, Berlin, pp 331–346Google Scholar
  17. 17.
    Fiterău-Broştean P, Janssen R, Vaandrager FW (2014) Learning fragments of the TCP network protocol. In: Lang F, Flammini F (eds) Proceedings 19th international workshop on formal methods for industrial critical systems (FMICS’14), Florence, Italy, volume 8718 of lecture notes in computer science. Springer, Berlin, pp 78–93Google Scholar
  18. 18.
    van Glabbeek RJ (1993) The linear time—branching time spectrum II (the semantics of sequential systems with silent moves). In: Best E (ed) Proceedings CONCUR 93, Hildesheim, Germany, volume 715 of lecture notes in computer science. Springer, BerlinGoogle Scholar
  19. 19.
    Gold EM (1967) Language identification in the limit. Inf Control 10(5):447–474CrossRefMATHGoogle Scholar
  20. 20.
    Grieskamp W, Kicillof N, Stobie K, Braberman V (2011) Model-based quality assurance of protocol documentation: tools and methodology. Softw Test Verif Reliab 21(1):55–71CrossRefGoogle Scholar
  21. 21.
    Grinchtein O (2008) Learning of timed systems. PhD thesis, Dept. of IT, Uppsala University, SwedenGoogle Scholar
  22. 22.
    Grinchtein O, Jonsson B, Leucker M (2004) Learning of event-recording automata. In: Proceedings of the joint conferences FORMATS and FTRTFT, volume 3253 of LNCS, pp 379–396Google Scholar
  23. 23.
    Groce A, Peled D, Yannakakis M (2002) Adaptive model checking. In: Katoen J-P, Stevens P (eds) Proceedings of the TACAS ’02, 8th international conference on tools and algorithms for the construction and analysis of systems, volume 2280 of lecture notes in computer science. Springer, Berlin, pp 357–370Google Scholar
  24. 24.
    Groz R, Li K, Petrenko A, Shahbaz M (2008) Modular system verification by inference, testing and reachability analysis. In: TestCom/FATES, volume 5047 of lecture notes in computer science, pp 216–233Google Scholar
  25. 25.
    Grumberg O, Veith H (eds) (2008) 25 years of model checking: history, achievements, perspectives, volume 5000 of lecture notes in computer science. Springer, BerlinGoogle Scholar
  26. 26.
    Hagerer A, Hungar H, Niese O, Steffen B (2002) Model generation by moderated regular extrapolation. In: Kutsche R-D, Weber H (eds) Proceedings of the FASE ’02, 5th international conference on fundamental approaches to software engineering, volume 2306 of lecture notes in computer science. Springer, Berlin, pp 80–95Google Scholar
  27. 27.
    Henzinger TA, Jhala R, Majumdar R, Sutre G (2002) Lazy abstraction. In: Proceedings of the 29th ACM symposium on principles of programming languages, pp 58–70Google Scholar
  28. 28.
    Howar F, Isberner M, Steffen B, Bauer O, Jonsson B (2012) Inferring semantic interfaces of data structures. In: ISoLA (1): leveraging applications of formal methods, verification and validation. Technologies for mastering change—5th international symposium, ISoLA 2012, Heraklion, Crete, Greece, October 15–18, 2012, Proceedings, part I, volume 7609 of lecture notes in computer science. Springer, Berlin, pp 554–571Google Scholar
  29. 29.
    Howar F, Steffen B, Merten M (2011) Automata learning with automated alphabet abstraction refinement. In: VMCAI, volume 6538 of lecture notes in computer science. Springer, Berlin, pp 263–277Google Scholar
  30. 30.
    Huima A (2007) Implementing conformiq qtronic. In: Petrenko A, Veanes M, Tretmans J, and Grieskamp W (eds) Proceedings of the TestCom/FATES, Tallinn, Estonia, June, 2007, volume 4581 of lecture notes in computer science, pp 1–12Google Scholar
  31. 31.
    Hungar H, Niese O, Steffen B (2003) Domain-specific optimization in automata learning. In: Proceedings of the 15th international conference on computer aided verificationGoogle Scholar
  32. 32.
    Janssen R (2013) Learning a state diagram of TCP using abstraction. Bachelor thesis, ICIS, Radboud University NijmegenGoogle Scholar
  33. 33.
    Jonsson B (1994) Compositional specification and verification of distributed systems. ACM Trans Progr Lang Syst 16(2):259–303CrossRefGoogle Scholar
  34. 34.
    Kearns MJ, Vazirani UV (1994) An introduction to computational learning theory. MIT Press, Cambridge, MAGoogle Scholar
  35. 35.
    Li K, Groz R, Shahbaz M (2006) Integration testing of distributed components based on learning parameterized I/O models. In: Najm E, Pradat-Peyre J-F, Donzeau-Gouge V (eds) FORTE, volume 4229 of lecture notes in computer science, pp 436–450Google Scholar
  36. 36.
    Loiseaux C, Graf S, Sifakis J, Bouajjani A, Bensalem S (1995) Property preserving abstractions for the verification of concurrent systems. Form Methods Syst Des 6(1):11–44CrossRefMATHGoogle Scholar
  37. 37.
    Lorenzoli D, Mariani L, Pezzè M (2008) Automatic generation of software behavioral models. In: Proceedings of the ICSE’08: 30th international conference on software enginering, pp 501–510Google Scholar
  38. 38.
    Mariani L, Pezz M (2007) Dynamic detection of COTS components incompatibility. IEEE Softw 24(5):76–85CrossRefGoogle Scholar
  39. 39.
    Merten M, Howar F, Steffen B, Cassel S, Jonsson B (2012) Demonstrating learning of register automata. In: Flanagan C, König B (eds) Tools and algorithms for the construction and analysis of systems—18th international conference, TACAS 2012, Held as part of the European joint conferences on theory and practice of software, ETAPS 2012, Tallinn, Estonia, March 24–April 1, 2012. Proceedings, volume 7214 of lecture notes in computer science. Springer, Berlin, pp 466–471Google Scholar
  40. 40.
    Merten M, Steffen B, Howar F, Margaria T (2011) Next generation LearnLib. In: Abdulla PA, Leino KRM (eds) TACAS, volume 6605 of lecture notes in computer science. Springer, Berlin, pp 220–223Google Scholar
  41. 41.
    Milner R (1989) Communication and concurrency. Prentice-Hall, Englewood Cliffs, NJMATHGoogle Scholar
  42. 42.
    Mohri M (1997) Finite-state transducers in language and speech processing. Comput Linguist 23(2):269–311MathSciNetGoogle Scholar
  43. 43.
    Niese O (2003) An integrated approach to testing complex systems. Technical report, Dortmund University, Doctoral thesisGoogle Scholar
  44. 44.
    The Network Simulator NS-2. http://www.isi.edu/nsnam/ns/
  45. 45.
    Peled D, Vardi MY, Yannakakis M (1999) Black box checking. In: Wu J, Chanson ST, Gao Q (eds) Formal methods for protocol engineering and distributed systems, FORTE/PSTV. Kluwer, Beijing, pp 225–240CrossRefGoogle Scholar
  46. 46.
    J. Postel (ed) (1981) Transmission control protocol—DARPA internet program protocol specification (RFC 3261), September 1981. http://www.ietf.org/rfc/rfc793.txt
  47. 47.
    Raffelt H, Steffen B, Berg T, Margaria T (2009) LearnLib: a framework for extrapolating behavioral models. STTT 11(5):393–407CrossRefGoogle Scholar
  48. 48.
    Rivest RL, Schapire RE (1993) Inference of finite automata using homing sequences. Inf Comput 103:299–347CrossRefMATHMathSciNetGoogle Scholar
  49. 49.
    Rosenberg J, Schulzrinne H, Camarillo G, Johnston A, Peterson J, Sparks R, Handley M, and Schooler E (2002) SIP: session initiation protocol (RFC 3261), June 2002. http://www.ietf.org/rfc/rfc3261.txt
  50. 50.
    Shahbaz M, Li K, Groz R (2007) Learning and integration of parameterized components through testing. In: Petrenko A, Veanes M, Tretmans J, and Grieskamp W (eds) TestCom/FATES, volume 4581 of lecture notes in computer science. Springer, Berlin, pp 319–334Google Scholar
  51. 51.
    Shu G, Lee D (2007) Testing security properties of protocol implementations - a machine learning based approach. In: Proceedings of the ICDCS’07, 27th IEEE international conference on distributed computing systems, Toronto, Ontario. IEEE Computer SocietyGoogle Scholar
  52. 52.
    Smeenk W (2012) Applying automata learning to complex industrial software. Master thesis, Radboud University Nijmegen, SeptemberGoogle Scholar
  53. 53.
    Stevens WR (1994) TCP/IP illustrated, volume 1: the protocols. Addison Wesley Longman Inc, Reading, MAGoogle Scholar
  54. 54.
    Tretmans J (1992) A formal approach to conformance testing. PhD thesis, University of Twente, DecemberGoogle Scholar
  55. 55.
    Uijen J (2009) Learning models of communication protocols using abstraction techniques. Master thesis, Radboud University Nijmegen and Uppsala University, NovemberGoogle Scholar
  56. 56.
    Veanes M, Campbell C, Grieskamp W, Schulte W, Tillmann W, Nachmanson L (2008) Model-based testing of object-oriented reactive systems with spec explorer. In: Hierons RM, Bowen JP, Harman M (eds) Formal methods and testing, an outcome of the FORTEST network, revised selected papers, volume 4949 of lecture notes in computer science. Springer, Berlin, pp 39–76Google Scholar
  57. 57.
    Veanes M, Hooimeijer P, Livshits B, Molnar D, Bjørner N Symbolic finite state transducers: algorithms and applications. In: Field J, Hicks M (eds) Proceedings of the 39th ACM SIGPLAN-SIGACT symposium on principles of programming languages, POPL 2012, Philadelphia, Pennsylvania, USA, January 22–28, 2012, pp 137–150. ACM, 2012Google Scholar

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Fides Aarts
    • 1
  • Bengt Jonsson
    • 3
  • Johan Uijen
    • 1
    • 2
  • Frits Vaandrager
    • 1
  1. 1.Institute for Computing and Information SciencesRadboud University NijmegenNijmegenThe Netherlands
  2. 2.CGI Nederland B.V.RotterdamThe Netherlands
  3. 3.Department of Computer SystemsUppsala UniversityUppsalaSweden

Personalised recommendations