Automatic Transition System Model Identification for Network Applications from Packet Traces

  • Zeynab Sabahi-Kaviani
  • Fatemeh Ghassemi
  • Fateme Bajelan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10522)


A wide range of network management tasks, such as balancing bandwidth usage, firewalling, anomaly detection, and differentiated traffic pricing, depend on accurate traffic classification. Because network applications are diverse and variable, port-based and statistical signature detection approaches have become inefficient, and behavioral classification approaches have therefore been considered recently. So far, however, there is no automated, general method for obtaining the behavioral models of applications. In this research, we propose an automatic procedure that infers a transition system model from the traffic generated by an application. Our approach is based on passive automata learning theory and the evidence-driven state merging technique, guided by the rules of the network domain. We take the behavior of well-known network protocols into account so that the generated model includes unobserved behaviors and excludes invalid ones as far as possible. To this end, we present a new equivalence relation over the given protocol behaviors that induces proper state merging conditions. This idea makes the time complexity of the algorithm linear rather than exponential. Finally, we use the models of some real applications to evaluate the precision and execution time of our approach.
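
To make the idea of passive learning by state merging concrete, the following is a minimal sketch, not the procedure proposed in the paper. It builds a prefix tree from observed packet traces and then merges states using the classic k-tails heuristic as a stand-in for the paper's protocol-aware equivalence relation, which the abstract does not spell out. The packet labels, the function names (build_pta, k_tails, merge_by_k_tails, accepts), and the choice k=1 are all assumptions made for illustration.

```python
def build_pta(traces):
    """Build a prefix tree: state 0 is the root, one state per distinct prefix."""
    trans = {0: {}}
    for trace in traces:
        state = 0
        for sym in trace:
            if sym not in trans[state]:
                new_state = len(trans)
                trans[state][sym] = new_state
                trans[new_state] = {}
            state = trans[state][sym]
    return trans


def k_tails(trans, state, k):
    """All label sequences of length <= k that can be observed from `state`."""
    tails = {()}
    if k > 0:
        for sym, nxt in trans[state].items():
            tails |= {(sym,) + t for t in k_tails(trans, nxt, k - 1)}
    return frozenset(tails)


def merge_by_k_tails(trans, k=1):
    """Merge states with identical k-tails (a stand-in merging condition).

    Returns the merged transitions as (source, label, target) triples over
    block ids, plus the block id of the initial state. The result may be
    nondeterministic.
    """
    block_ids = {}
    block_of = {}
    for state in trans:
        signature = k_tails(trans, state, k)
        block_of[state] = block_ids.setdefault(signature, len(block_ids))
    edges = {
        (block_of[src], sym, block_of[dst])
        for src, out in trans.items()
        for sym, dst in out.items()
    }
    return edges, block_of[0]


def accepts(edges, start, trace):
    """Check whether `trace` is a path in the merged transition system."""
    current = {start}
    for sym in trace:
        current = {d for (s, label, d) in edges if s in current and label == sym}
    return bool(current)


# Hypothetical, heavily simplified traces with one, two, and three DATA packets.
traces = [
    ["SYN", "ACK", "DATA", "FIN"],
    ["SYN", "ACK", "DATA", "DATA", "FIN"],
    ["SYN", "ACK", "DATA", "DATA", "DATA", "FIN"],
]
pta = build_pta(traces)
edges, start = merge_by_k_tails(pta, k=1)

# A trace with four DATA packets was never observed, but the merged model allows it.
unseen = ["SYN", "ACK", "DATA", "DATA", "DATA", "DATA", "FIN"]
print(accepts(edges, start, unseen))
print(accepts(edges, start, ["ACK", "SYN"]))
```

In this toy run, merging gives the model a DATA self-loop, so it generalizes to an unobserved trace while still rejecting a trace that begins with ACK. A generic heuristic such as k-tails has no knowledge of protocol rules and can also admit invalid behaviors, which is the gap a domain-aware merging condition is meant to close.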



Copyright information

© IFIP International Federation for Information Processing 2017

Authors and Affiliations

  • Zeynab Sabahi-Kaviani (1)
  • Fatemeh Ghassemi (1)
  • Fateme Bajelan (1)
  1. School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
