Single-valued finite transduction
Finite-state transduction is a simple and effective tool for the efficient analysis and transformation of large bodies of text. However, a transduction may yield more than one output for some inputs, which is inconvenient in some applications. In principle, multiple output values can be treated in one of the following ways: (1) Treat any input that leads to multiple outputs as an error. (2) Select the shortest output, and treat the input as an error if ties remain. (3) Select the genealogical minimum of the possible outputs (minimum length, with the lexicographic minimum in case of ties). (4) Select the lexicographic minimum of the possible outputs. In cases (3) and (4), a lexicographic order based on a partially ordered alphabet could be used, with ties resolved as in cases (1) or (2). The naive approach would compute the complete set of outputs and then apply the selection procedure. However, it is possible to combine the selection with the left-to-right computation of the set of outputs, using a straightforward algorithm specified in terms of a *-semiring defined for the selected strategy. The correctness proofs then follow from simple properties of the particular *-semirings. Use of an accessible-configurations construction leads to a direct algorithm for computing a minimum-state, minimum-delay subsequential transducer for a subsequential function presented as the behaviour of a finite transducer. This machine can be proven to be canonical.
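As an illustration of strategy (3), the following Python sketch combines selection of the genealogical minimum with a left-to-right pass over the input. It is not the paper's algorithm: it assumes a totally ordered alphabet (plain character order) and a transducer without empty-input transitions, so no star operation is needed. The transition-table layout and the names `shortlex_min` and `min_output` are hypothetical, chosen only for this example.

```python
from typing import Dict, List, Optional, Set, Tuple

State = int
# Assumed layout: (state, input symbol) -> list of (next state, output string).
Transitions = Dict[Tuple[State, str], List[Tuple[State, str]]]


def shortlex_key(s: str) -> Tuple[int, str]:
    """Genealogical order: compare by length first, then lexicographically."""
    return (len(s), s)


def shortlex_min(a: Optional[str], b: Optional[str]) -> Optional[str]:
    """Semiring 'addition': the smaller of two outputs; None means 'no output'."""
    if a is None:
        return b
    if b is None:
        return a
    return min(a, b, key=shortlex_key)


def min_output(trans: Transitions, start: State, finals: Set[State],
               word: str) -> Optional[str]:
    """Return the genealogically minimal output for `word`, or None if rejected.

    Only one candidate output is kept per state after each input symbol,
    instead of the full set of outputs.
    """
    best: Dict[State, str] = {start: ""}
    for sym in word:
        nxt: Dict[State, str] = {}
        for q, out in best.items():
            for r, emitted in trans.get((q, sym), []):
                # Semiring 'multiplication' is concatenation.
                nxt[r] = shortlex_min(nxt.get(r), out + emitted)
        best = nxt
    result: Optional[str] = None
    for q, out in best.items():
        if q in finals:
            result = shortlex_min(result, out)
    return result


# A small ambiguous transducer (made up for this example).
EXAMPLE: Transitions = {
    (0, "a"): [(0, "x"), (1, "yy")],
    (1, "a"): [(1, "")],
    (0, "b"): [(0, "b")],
    (1, "b"): [(1, "z")],
}

if __name__ == "__main__":
    # The possible outputs for "aa" are "xx", "xyy" and "yy".
    print(min_output(EXAMPLE, 0, {0, 1}, "aa"))
```

Keeping a single candidate per state is sound here because the genealogical order is preserved by concatenation on either side: if u precedes v, then uw precedes vw and wu precedes wv. The plain lexicographic order of strategy (4) lacks this property ("a" precedes "ab", yet "abc" precedes "ac"), so this simple pruning would not carry over to it unchanged.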
Keywords: Total Order, Lexicographic Order, Finite Automaton, Oxford English Dictionary, Free Monoid