Abstract
We present a distance metric on formal languages based on the accumulated weight of the words in their symmetric difference. The contribution of an individual word to this weight decreases exponentially in its length, which guarantees that the distance between two languages is a real value between 0 and 1. We show that this distance is computable for regular languages. As an application, we show how a similarity measure derived from a modification of this metric can be used for the automatic grading of certain standard exercises in formal language theory classes.
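The idea can be illustrated with a small sketch. The abstract does not give the exact weighting, so the code below assumes one that fits the description: a word w over an alphabet of size k gets weight (1/2) * (1/(2k))^|w|, so that all words together have weight 1. The DFA encoding (start state, set of accepting states, transition dictionary) and the names `solve` and `language_distance` are hypothetical and chosen for illustration only. The distance is obtained by building the product automaton for the symmetric difference and solving one linear system over its states; this may differ from the construction used in the paper.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve matrix * x = rhs exactly by Gauss-Jordan elimination (cubic time)."""
    n = len(rhs)
    rows = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[n] for row in rows]


def language_distance(dfa_a, dfa_b, alphabet):
    """Accumulated weight of the symmetric difference of two regular languages.

    Assumed weighting (not taken from the paper): (1/2) * (1/(2k))^len(w).
    Each DFA is a triple (start, accepting, delta) and must be complete;
    delta maps (state, symbol) to the successor state.
    """
    start_a, accepting_a, delta_a = dfa_a
    start_b, accepting_b, delta_b = dfa_b
    k = len(alphabet)

    # Enumerate the reachable states of the product automaton.
    start = (start_a, start_b)
    index = {start: 0}
    order = [start]
    for p, q in order:  # the list grows while we iterate over it
        for symbol in alphabet:
            successor = (delta_a[p, symbol], delta_b[q, symbol])
            if successor not in index:
                index[successor] = len(order)
                order.append(successor)

    # x[i] = weight of all words accepted from product state i, which satisfies
    # x[i] = [i accepting] / 2 + (1 / (2k)) * sum of x over the successors of i.
    n = len(order)
    matrix = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    rhs = []
    for i, (p, q) in enumerate(order):
        for symbol in alphabet:
            j = index[(delta_a[p, symbol], delta_b[q, symbol])]
            matrix[i][j] -= Fraction(1, 2 * k)
        # A product state is accepting if exactly one component accepts.
        in_difference = (p in accepting_a) != (q in accepting_b)
        rhs.append(Fraction(1, 2) if in_difference else Fraction(0))

    return solve(matrix, rhs)[0]


# Example: words with an even number of a's versus all words over {a, b}.
even_a = (0, {0}, {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1})
all_words = (0, {0}, {(0, "a"): 0, (0, "b"): 0})
print(language_distance(even_a, all_words, "ab"))
```

Under this assumed weighting, identical languages have distance 0 and a language and its complement have distance 1. Exact rational arithmetic via `Fraction` avoids rounding issues at the cost of speed; a floating-point solver would be the natural choice for larger automata.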
Notes
- 1. Actual matrix multiplication can be done in time \(\mathcal{O}(|Q|^{2.37286})\), cf. e.g. [1]. We state the cubic runtime here for the sake of readability.
- 2.
References
Alman, J., Williams, V.V.: A refined laser method and faster matrix multiplication. In: Proceedings ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, pp. 522–539. SIAM (2021). https://doi.org/10.1137/1.9781611976465.32
Alur, R., D’Antoni, L., Gulwani, S., Kini, D., Viswanathan, M.: Automated grading of DFA constructions. In: Proceedings 23rd International Joint Conference on Artificial Intelligence, IJCAI 2013, pp. 1976–1982. IJCAI/AAAI (2013)
Ashby, F.G., Ennis, D.M.: Similarity measures. Scholarpedia 2(12), 4116 (2007)
Choffrut, C., Pighizzini, G.: Distances between languages and reflexivity of relations. Theor. Comput. Sci. 286(1), 117–138 (2002). https://doi.org/10.1016/S0304-3975(01)00238-9. Mathematical Foundations of Computer Science
Choi, S., Cha, S.H., Tappert, C.: A survey of binary similarity and distance measures. J. Syst. Cybern. Inf. 8 (2009)
Chomsky, N., Miller, G.A.: Finite state languages. Inf. Control. 1(2), 91–112 (1958). https://doi.org/10.1016/S0019-9958(58)90082-2
Combéfis, S.: Automated code assessment for education: review, classification and perspectives on techniques and tools. Software 1(1), 3–30 (2022). https://doi.org/10.3390/software1010002
Cui, C., Dang, Z., Fischer, T.R., Ibarra, O.H.: Similarity in languages and programs. Theor. Comput. Sci. 498, 58–75 (2013). https://doi.org/10.1016/j.tcs.2013.05.040
Furht, B. (ed.): Distance and Similarity Measures, pp. 207–208. Springer, Boston (2006). https://doi.org/10.1007/0-387-30038-4_63
Hopcroft, J.E., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, N. Reading (1979)
Ifenthaler, D.: Measures of similarity. In: Seel, N.M. (ed.) Encyclopedia of the Sciences of Learning, pp. 2147–2150. Springer, New York (2012). https://doi.org/10.1007/978-1-4419-1428-6_503
Kozik, J.: Conditional densities of regular languages. Electr. Notes Theor. Comput. Sci. 140, 67–79 (2005). https://doi.org/10.1016/j.entcs.2005.06.023
Pearson, W.R.: An introduction to sequence similarity (“homology”) searching. Current Protoc. Bioinf. 42(1), 3.1.1–3.1.8 (2013). https://doi.org/10.1002/0471250953.bi0301s42
Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 (1948). https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Bruse, F., Herwig, M., Lange, M. (2022). A Similarity Measure for Formal Languages Based on Convergent Geometric Series. In: Caron, P., Mignot, L. (eds) Implementation and Application of Automata. CIAA 2022. Lecture Notes in Computer Science, vol 13266. Springer, Cham. https://doi.org/10.1007/978-3-031-07469-1_6
DOI: https://doi.org/10.1007/978-3-031-07469-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07468-4
Online ISBN: 978-3-031-07469-1
eBook Packages: Computer Science, Computer Science (R0)