SISAP 2016: Similarity Search and Applications, pp. 34–47
A Free Energy Foundation of Semantic Similarity in Automata and Languages
Abstract
This paper develops a free energy theory for automata and languages, drawn from physics and including its variational principles. It provides algorithms to compute the energy, as well as efficient algorithms for estimating the nondeterminism in a nondeterministic finite automaton. The theory is then used as a foundation to define a semantic similarity metric for automata and languages. Since automata are a fundamental model for all modern programs while languages are a fundamental model for the programs’ behaviors, we believe that the theory and the metric developed in this paper can be further used for real-world programs as well.
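As a rough illustration of what "computing the energy" can mean, the sketch below uses a standard fact from thermodynamic formalism rather than the paper's own algorithm: for a strongly connected finite graph whose edges carry a potential U, the free energy (pressure) equals the natural logarithm of the Perron eigenvalue of the matrix whose (i, j) entry sums exp(U) over all edges from i to j. The names `build_weight_matrix` and `free_energy`, the transition format, and the iteration count are assumptions made for this example only.

```python
import math

def build_weight_matrix(num_states, transitions):
    """Build M[i][j] = sum of exp(U) over edges i -> j.

    `transitions` is a list of (source, target, potential) triples;
    this format is a hypothetical choice for the sketch.
    """
    m = [[0.0] * num_states for _ in range(num_states)]
    for src, dst, potential in transitions:
        m[src][dst] += math.exp(potential)
    return m

def free_energy(weights, iterations=500):
    """Return ln(Perron eigenvalue) of a nonnegative square matrix.

    Assumes the underlying graph is strongly connected. Power iteration
    is run on (M + I), which is aperiodic even when M is periodic, and
    the shift of 1 is subtracted at the end.
    """
    n = len(weights)
    v = [1.0] * n
    lam = 1.0
    for _ in range(iterations):
        # w = (M + I) v
        w = [v[i] + sum(weights[i][j] * v[j] for j in range(n))
             for i in range(n)]
        lam = max(w)              # estimate of the eigenvalue of M + I
        v = [x / lam for x in w]  # renormalise so max(v) == 1
    return math.log(lam - 1.0)

# Zero potential on every edge: the result is the topological entropy.
golden_mean = build_weight_matrix(2, [(0, 0, 0.0), (0, 1, 0.0), (1, 0, 0.0)])
print(free_energy(golden_mean))
```

With a zero potential the value reduces to the logarithm of the growth rate of the number of paths, which for a deterministic automaton is the information rate of the accepted language in the sense of Chomsky and Miller.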
Keywords
Free Energy, Periodic Orbit, Semantic Similarity, Regular Language, Finite Automaton
Acknowledgements
We would like to thank Jean-Charles Delvenne, David Koslicki, Daniel J. Thompson, Eric Wang, William J. Hutton III, and Ali Saberi for discussions. We would also like to thank the seven referees for suggestions and comments that have improved the presentation of our results.