A Similarity Measure for Formal Languages Based on Convergent Geometric Series

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13266))

Abstract

We present a distance metric on formal languages based on the accumulated weight of words in their symmetric difference. The contribution of an individual word to this weight decreases exponentially in its length, guaranteeing the distance between languages to be a real value between 0 and 1. We show that this distance is computable for regular languages. As an application, we show how the similarity measure derived from a modification of this metric can be used in automatic grading of particular standard exercises in formal language theory classes.
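The idea of weighting the symmetric difference can be illustrated with a small numerical sketch. The abstract does not state the exact weighting, so the following assumes that all words of length n together carry weight (1 - λ)·λ^n, split evenly among the |Σ|^n words of that length, for a hypothetical parameter λ in (0, 1). The function name `symdiff_distance`, the DFA encoding, and the truncation at `max_len` are illustrative choices and not the authors' implementation; the truncated sum only approximates the limit of the geometric series.

```python
def symdiff_distance(dfa_a, dfa_b, alphabet, lam=0.5, max_len=60):
    """Approximate a weighted symmetric-difference distance of two complete DFAs.

    Each DFA is a triple (start, accepting, delta), where delta maps
    (state, symbol) to the successor state. ASSUMPTION: words of length n
    share the total weight (1 - lam) * lam**n equally.
    """
    start_a, acc_a, delta_a = dfa_a
    start_b, acc_b, delta_b = dfa_b
    k = len(alphabet)
    # mass[(p, q)]: fraction of words of the current length that lead the
    # product automaton from the start pair to the state pair (p, q).
    mass = {(start_a, start_b): 1.0}
    total = 0.0
    for n in range(max_len + 1):
        # Fraction of length-n words accepted by exactly one automaton.
        diff = sum(m for (p, q), m in mass.items()
                   if (p in acc_a) != (q in acc_b))
        total += (1 - lam) * lam ** n * diff
        # Advance every state pair by one symbol, spreading mass evenly.
        nxt = {}
        for (p, q), m in mass.items():
            for s in alphabet:
                key = (delta_a[(p, s)], delta_b[(q, s)])
                nxt[key] = nxt.get(key, 0.0) + m / k
        mass = nxt
    return total


# Example automata over the alphabet {a, b}.
all_words = (0, {0}, {(0, "a"): 0, (0, "b"): 0})
no_words = (0, set(), {(0, "a"): 0, (0, "b"): 0})
even_len = (0, {0}, {(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0})

print(symdiff_distance(all_words, all_words, "ab"))
print(symdiff_distance(all_words, no_words, "ab"))
print(symdiff_distance(even_len, all_words, "ab"))
```

Under this assumed weighting, identical languages have distance 0, a language and its complement approach distance 1, and the even-length language differs from Σ* on exactly the odd-length words, which gives λ/(1 + λ) in the limit.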

Notes

  1. Actual matrix multiplication can be done in time \(\mathcal{O}(|Q|^{2.37286})\), cf. e.g. [1]. We state the cubic runtime here for the sake of readability.

  2. https://github.com/maurice-herwig/wofa.git

References

  1. Alman, J., Williams, V.V.: A refined laser method and faster matrix multiplication. In: Proceedings ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, pp. 522–539. SIAM (2021). https://doi.org/10.1137/1.9781611976465.32

  2. Alur, R., D’Antoni, L., Gulwani, S., Kini, D., Viswanathan, M.: Automated grading of DFA constructions. In: Proceedings 23rd International Joint Conference on Artificial Intelligence, IJCAI 2013, pp. 1976–1982. IJCAI/AAAI (2013)

  3. Ashby, F.G., Ennis, D.M.: Similarity measures. Scholarpedia 2(12), 4116 (2007)

  4. Choffrut, C., Pighizzini, G.: Distances between languages and reflexivity of relations. Theor. Comput. Sci. 286(1), 117–138 (2002). https://doi.org/10.1016/S0304-3975(01)00238-9. Mathematical Foundations of Computer Science

  5. Choi, S., Cha, S.H., Tappert, C.: A survey of binary similarity and distance measures. J. Syst. Cybern. Inf. 8 (2009)

  6. Chomsky, N., Miller, G.A.: Finite state languages. Inf. Control. 1(2), 91–112 (1958). https://doi.org/10.1016/S0019-9958(58)90082-2

  7. Combéfis, S.: Automated code assessment for education: review, classification and perspectives on techniques and tools. Software 1(1), 3–30 (2022). https://doi.org/10.3390/software1010002

  8. Cui, C., Dang, Z., Fischer, T.R., Ibarra, O.H.: Similarity in languages and programs. Theor. Comput. Sci. 498, 58–75 (2013). https://doi.org/10.1016/j.tcs.2013.05.040

  9. Furht, B. (ed.): Distance and Similarity Measures, pp. 207–208. Springer, Boston (2006). https://doi.org/10.1007/0-387-30038-4_63

  10. Hopcroft, J.E., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, N. Reading (1979)

  11. Ifenthaler, D.: Measures of similarity. In: Seel, N.M. (Ed.) Encyclopedia of the Sciences of Learning, pp. 2147–2150. Springer, New York (2012). https://doi.org/10.1007/978-1-4419-1428-6_503

  12. Kozik, J.: Conditional densities of regular languages. Electr. Notes Theor. Comput. Sci. 140, 67–79 (2005). https://doi.org/10.1016/j.entcs.2005.06.023

  13. Pearson, W.R.: An introduction to sequence similarity (“homology”) searching. Current Protoc. Bioinf. 42(1), 3.1.1–3.1.8 (2013). https://doi.org/10.1002/0471250953.bi0301s42

  14. Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 (1948). https://doi.org/10.1002/j.1538-7305.1948.tb01338.x

Author information

Corresponding author

Correspondence to Florian Bruse.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Bruse, F., Herwig, M., Lange, M. (2022). A Similarity Measure for Formal Languages Based on Convergent Geometric Series. In: Caron, P., Mignot, L. (eds) Implementation and Application of Automata. CIAA 2022. Lecture Notes in Computer Science, vol 13266. Springer, Cham. https://doi.org/10.1007/978-3-031-07469-1_6

  • DOI: https://doi.org/10.1007/978-3-031-07469-1_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07468-4

  • Online ISBN: 978-3-031-07469-1

  • eBook Packages: Computer Science (R0)
