A Similarity Measure for Formal Languages Based on Convergent Geometric Series

Part of the Lecture Notes in Computer Science book series (LNCS, volume 13266)

Abstract

We present a distance metric on formal languages based on the accumulated weight of words in their symmetric difference. The contribution of an individual word to this weight decreases exponentially in its length, guaranteeing the distance between languages to be a real value between 0 and 1. We show that this distance is computable for regular languages. As an application, we show how the similarity measure derived from a modification of this metric can be used in automatic grading of particular standard exercises in formal language theory classes.
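
The abstract does not state the exact weighting, so the following is only an illustrative sketch under an assumed scheme: all words of length n together carry mass (1 - lam) * lam**n, shared equally among them, so that the total over all words is a convergent geometric series with sum 1. The function name `truncated_distance`, the parameter `lam`, and the brute-force enumeration are hypothetical and are not taken from the paper, which computes the distance for regular languages by other means.

```python
from itertools import product
from typing import Callable


def truncated_distance(
    in_l1: Callable[[str], bool],
    in_l2: Callable[[str], bool],
    alphabet: str,
    lam: float = 0.5,
    max_len: int = 12,
) -> float:
    """Approximate a weighted symmetric-difference distance.

    Only words of length <= max_len are enumerated, so the result is a
    lower bound on the distance under the assumed weighting.
    """
    total = 0.0
    for n in range(max_len + 1):
        # Assumed weighting (not from the paper): length n carries total
        # mass (1 - lam) * lam**n, split evenly over len(alphabet)**n words.
        word_weight = (1 - lam) * lam**n / len(alphabet) ** n
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            # A word contributes only if exactly one language contains it.
            if in_l1(word) != in_l2(word):
                total += word_weight
    return total


# Example: words with an even number of a's versus all words over {a, b}.
even_a = lambda w: w.count("a") % 2 == 0
all_words = lambda w: True
print(truncated_distance(even_a, all_words, "ab"))
```

Under this assumed weighting, the words longer than `max_len` carry a total mass of `lam**(max_len + 1)`, which bounds the truncation error and shows why the value always lies between 0 and 1.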

  • Chapter length: 13 pages

Notes

  1. Actual matrix multiplication can be done in time \(\mathcal{O}(|Q|^{2.37286})\), cf. e.g. [1]. We state the cubic runtime here for the sake of readability.

  2. https://github.com/maurice-herwig/wofa.git

References

  1. Alman, J., Williams, V.V.: A refined laser method and faster matrix multiplication. In: Proceedings ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, pp. 522–539. SIAM (2021). https://doi.org/10.1137/1.9781611976465.32

  2. Alur, R., D’Antoni, L., Gulwani, S., Kini, D., Viswanathan, M.: Automated grading of DFA constructions. In: Proceedings 23rd International Joint Conference on Artificial Intelligence, IJCAI 2013, pp. 1976–1982. IJCAI/AAAI (2013)

  3. Ashby, F.G., Ennis, D.M.: Similarity measures. Scholarpedia 2(12), 4116 (2007)

  4. Choffrut, C., Pighizzini, G.: Distances between languages and reflexivity of relations. Theor. Comput. Sci. 286(1), 117–138 (2002). https://doi.org/10.1016/S0304-3975(01)00238-9. Mathematical Foundations of Computer Science

  5. Choi, S., Cha, S.H., Tappert, C.: A survey of binary similarity and distance measures. J. Syst. Cybern. Inf. 8 (2009)

  6. Chomsky, N., Miller, G.A.: Finite state languages. Inf. Control. 1(2), 91–112 (1958). https://doi.org/10.1016/S0019-9958(58)90082-2

  7. Combéfis, S.: Automated code assessment for education: review, classification and perspectives on techniques and tools. Software 1(1), 3–30 (2022). https://doi.org/10.3390/software1010002

  8. Cui, C., Dang, Z., Fischer, T.R., Ibarra, O.H.: Similarity in languages and programs. Theor. Comput. Sci. 498, 58–75 (2013). https://doi.org/10.1016/j.tcs.2013.05.040

  9. Furht, B. (ed.): Distance and Similarity Measures, pp. 207–208. Springer, Boston (2006). https://doi.org/10.1007/0-387-30038-4_63

  10. Hopcroft, J.E., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, N. Reading (1979)

  11. Ifenthaler, D.: Measures of similarity. In: Seel, N.M. (ed.) Encyclopedia of the Sciences of Learning, pp. 2147–2150. Springer, New York (2012). https://doi.org/10.1007/978-1-4419-1428-6_503

  12. Kozik, J.: Conditional densities of regular languages. Electr. Notes Theor. Comput. Sci. 140, 67–79 (2005). https://doi.org/10.1016/j.entcs.2005.06.023

  13. Pearson, W.R.: An introduction to sequence similarity (“homology”) searching. Current Protoc. Bioinf. 42(1), 3.1.1–3.1.8 (2013). https://doi.org/10.1002/0471250953.bi0301s42

  14. Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 (1948). https://doi.org/10.1002/j.1538-7305.1948.tb01338.x

Corresponding author

Correspondence to Florian Bruse.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Bruse, F., Herwig, M., Lange, M. (2022). A Similarity Measure for Formal Languages Based on Convergent Geometric Series. In: Caron, P., Mignot, L. (eds) Implementation and Application of Automata. CIAA 2022. Lecture Notes in Computer Science, vol 13266. Springer, Cham. https://doi.org/10.1007/978-3-031-07469-1_6

  • DOI: https://doi.org/10.1007/978-3-031-07469-1_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07468-4

  • Online ISBN: 978-3-031-07469-1

  • eBook Packages: Computer Science (R0)