Minimization of Symbolic Transducers

  • Olli Saarikivi
  • Margus Veanes
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10427)


Symbolic transducers extend classical finite-state transducers to infinite or large alphabets such as Unicode, and are a popular tool in areas that require reasoning about string transformations where traditional techniques do not scale. We develop the theory of, and an algorithm for, computing quotients of such transducers under indistinguishability-preserving equivalence relations over states, such as bisimulation. We show that the algorithm is a minimization algorithm in the deterministic finite-state case. We evaluate the benefits of the proposed algorithm on real-world stream-processing computations in which symbolic transducers arise from repeated composition.
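To make the idea of a quotient under a state equivalence concrete, the following is a minimal sketch for the simplest setting only: a classical deterministic transducer over an explicit finite alphabet. It is not the algorithm of the paper. Symbolic transducers label transitions with predicates rather than individual symbols, and this sketch does not handle that. The names `bisimulation_classes` and `quotient`, the dictionary encoding of transitions, and the example machine are all assumptions made for illustration. The sketch merges two states only when they agree on finality and, for every input symbol, emit the same output and move to equivalent states.

```python
from typing import Dict, Hashable, List, Set, Tuple

State = Hashable
# Hypothetical encoding: (state, input symbol) -> (output string, next state)
Delta = Dict[Tuple[State, str], Tuple[str, State]]


def bisimulation_classes(states: List[State], alphabet: str,
                         delta: Delta, finals: Set[State]) -> Dict[State, int]:
    """Naive partition refinement; returns a block id for each state."""
    # Initial partition: final versus non-final states.
    block = {q: int(q in finals) for q in states}
    while True:
        # A state's signature is its current block plus, for each symbol,
        # the output it emits and the block of its successor (None if undefined).
        sig = {
            q: (block[q], tuple(
                (delta[(q, a)][0], block[delta[(q, a)][1]])
                if (q, a) in delta else None
                for a in alphabet))
            for q in states
        }
        ids: Dict[tuple, int] = {}
        new_block = {q: ids.setdefault(sig[q], len(ids)) for q in states}
        # Signatures include the old block, so blocks can only split.
        # An unchanged block count therefore means the partition is stable.
        if len(set(new_block.values())) == len(set(block.values())):
            return new_block
        block = new_block


def quotient(states: List[State], alphabet: str, delta: Delta,
             finals: Set[State], initial: State):
    """Build the quotient transducer whose states are the blocks."""
    block = bisimulation_classes(states, alphabet, delta, finals)
    q_delta = {(block[q], a): (out, block[r])
               for (q, a), (out, r) in delta.items()}
    q_finals = {block[q] for q in finals}
    return set(block.values()), q_delta, q_finals, block[initial]


# Hypothetical example: states 0 and 1 behave identically, state 2 differs.
example_delta: Delta = {
    (0, "a"): ("x", 1), (0, "b"): ("y", 2),
    (1, "a"): ("x", 1), (1, "b"): ("y", 2),
    (2, "a"): ("z", 2), (2, "b"): ("z", 2),
}
q_states, q_delta, q_finals, q_init = quotient(
    [0, 1, 2], "ab", example_delta, {2}, 0)
```

In the example, states 0 and 1 collapse into one block, so the quotient has two states. Because the sketch compares outputs step by step, it can miss merges that become possible only after outputs are shifted along transitions.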



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Aalto University and Helsinki Institute for Information Technology HIIT, Helsinki, Finland
  2. Microsoft Research, Redmond, USA
