Abstract
A weighted finite-state machine with n tapes (n-WFSM) defines a rational relation on n strings. The paper recalls important operations on these relations and an algorithm for their auto-intersection. Through a series of practical applications, it investigates the additional descriptive power of n-WFSMs compared with classical 1- and 2-WFSMs (weighted acceptors and transducers). Some of the presented applications are not feasible with the latter.
Sections 2-4 are based on published results [18,19,20,4], obtained at Xerox Research Centre Europe (XRCE), Meylan, France, through joint work between Jean-Marc Champarnaud (Rouen Univ.), Jason Eisner (Johns Hopkins Univ.), Franck Guingne and Florent Nicart (XRCE and Rouen Univ.), and the author.
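As a rough illustration of the auto-intersection mentioned in the abstract, the sketch below applies the operation to a small finite weighted relation: it keeps only those n-tuples whose strings on two chosen tapes are equal, leaving the weights unchanged. The names `Relation` and `auto_intersection`, the 0-based tape numbering, and the sample data are assumptions made for this example. The algorithm in the paper operates on the machines themselves, which this sketch does not attempt.

```python
from typing import Dict, Tuple

# Hypothetical representation: a finite weighted n-ary relation maps each
# n-tuple of strings to a weight. A real n-WFSM can encode infinite relations.
Relation = Dict[Tuple[str, ...], float]


def auto_intersection(rel: Relation, j: int, k: int) -> Relation:
    """Keep only the tuples whose strings on tapes j and k (0-based) are equal."""
    return {t: w for t, w in rel.items() if t[j] == t[k]}


# Sample 3-tape relation (illustrative data only).
rel: Relation = {
    ("ab", "x", "ab"): 0.5,
    ("ab", "y", "ba"): 0.3,
    ("c", "z", "c"): 0.2,
}

# Constrain tape 0 and tape 2 to carry the same string.
result = auto_intersection(rel, 0, 2)
print(result)
```

Filtering an explicit set of tuples only works because the relation here is finite; it says nothing about how the operation behaves on the automaton level.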
References
Aït-Mokhtar, S., Chanod, J.-P.: Incremental finite-state parsing. In: Proc. 5th Int. Conf. ANLP, Washington, DC, USA, pp. 72–79 (1997)
Beesley, K.R., Karttunen, L.: Finite State Morphology. CSLI Publications, Palo Alto (2003)
Brent, M.: An efficient, probabilistically sound algorithm for segmentation and word discovery. Machine Learning 34, 71–106 (1999)
Champarnaud, J.-M., Guingne, F., Kempe, A., Nicart, F.: Algorithms for the join and auto-intersection of multi-tape weighted finite-state machines. Int. Journal of Foundations of Computer Science 19(2), 453–476 (2008)
Creutz, M., Lagus, K.: Unsupervised models for morpheme segmentation and morphology learning. ACM Transactions on Speech and Language Processing 4(1) (2007)
Eilenberg, S.: Automata, Languages, and Machines, vol. A. Academic Press, San Diego (1974)
Elgot, C.C., Mezei, J.E.: On relations defined by generalized finite automata. IBM Journal of Research and Development 9(1), 47–68 (1965)
Frougny, C., Sakarovitch, J.: Synchronized rational relations of finite and infinite words. Theoretical Computer Science 108(1), 45–82 (1993)
Goldsmith, J.: Unsupervised learning of the morphology of a natural language. Computational Linguistics 27, 153–198 (2001)
Harju, T., Karhumäki, J.: The equivalence problem of multitape finite automata. Theoretical Computer Science 78(2), 347–355 (1991)
Isabelle, P., Kempe, A.: Automatic string alignment for finite-state transducers (2004) (Unpublished work)
Kaplan, R.M., Kay, M.: Regular models of phonological rule systems. Computational Linguistics 20(3), 331–378 (1994)
Karttunen, L., Gaál, T., Kempe, A.: Xerox finite state compiler. Xerox Research Centre Europe, Grenoble, France (1998). Online demo and documentation: http://www.xrce.xerox.com/competencies/content-analysis/fsCompiler/
Kay, M.: Nonconcatenative finite-state morphology. In: Proc. 3rd Int. Conf. EACL, Copenhagen, Denmark, pp. 2–10 (1987)
Kempe, A.: Acronym-meaning extraction from corpora using multitape weighted finite-state machines. Research report 2006/019, Xerox Research Centre Europe, Meylan, France (2006)
Kempe, A.: Viterbi algorithm generalized for n-tape best-path search. In: Proc. 8th Int. Workshop FSMNLP, Pretoria, South Africa (2009)
Kempe, A., Baeijs, C., Gaál, T., Guingne, F., Nicart, F.: WFSC – A new weighted finite state compiler. In: Ibarra, O.H., Dang, Z. (eds.) CIAA 2003. LNCS, vol. 2759, pp. 108–119. Springer, Heidelberg (2003)
Kempe, A., Champarnaud, J.-M., Eisner, J.: A note on join and auto-intersection of n-ary rational relations. In: Watson, B., Cleophas, L. (eds.) Proc. Eindhoven FASTAR Days, Eindhoven, Netherlands, 2004. TU/e CS TR, vol. 04–40, pp. 64–78 (2004)
Kempe, A., Champarnaud, J.-M., Eisner, J., Guingne, F., Nicart, F.: A class of rational n-WFSM auto-intersections. In: Farré, J., Litovsky, I., Schmitz, S. (eds.) CIAA 2005. LNCS, vol. 3845, pp. 266–274. Springer, Heidelberg (2006)
Kempe, A., Champarnaud, J.-M., Guingne, F., Nicart, F.: WFSM auto-intersection and join algorithms. In: Yli-Jyrä, A., Karttunen, L., Karhumäki, J. (eds.) FSMNLP 2005. LNCS (LNAI), vol. 4002, pp. 120–131. Springer, Heidelberg (2006)
Kempe, A., Guingne, F., Nicart, F.: Algorithms for weighted multi-tape automata. Research report 2004/031, Xerox Research Centre Europe, Meylan, France (2004)
Kiraz, G.A.: Linearization of nonlinear lexical representations. In: Coleman, J. (ed.) Proc. 3rd ACL SIG Computational Phonology, Madrid, Spain (1997)
Kiraz, G.A.: Multitiered nonlinear morphology using multitape finite automata: a case study on Syriac and Arabic. Computational Linguistics 26(1), 77–105 (2000)
Kuich, W., Salomaa, A.: Semirings, Automata, Languages. EATCS Monographs on Theoretical Computer Science, vol. 5. Springer, Heidelberg (1986)
Kumar, S., Byrne, W.: A weighted finite state transducer implementation of the alignment template model for statistical machine translation. In: Proc. Int. Conf. HLT-NAACL, Edmonton, Canada, pp. 63–70 (2003)
Mohri, M.: Edit-distance of weighted automata. In: Champarnaud, J.-M., Maurel, D. (eds.) CIAA 2002. LNCS, vol. 2608, pp. 1–23. Springer, Heidelberg (2003)
Mohri, M., Pereira, F.C.N., Riley, M.: A rational design for a weighted finite-state transducer library. In: Wood, D., Yu, S. (eds.) WIA 1997. LNCS, vol. 1436, pp. 144–158. Springer, Heidelberg (1998)
Nicart, F., Champarnaud, J.-M., Csáki, T., Gaál, T., Kempe, A.: Multi-tape automata with symbol classes. In: Ibarra, O.H., Yen, H.-C. (eds.) CIAA 2006. LNCS, vol. 4094, pp. 126–136. Springer, Heidelberg (2006)
Pereira, F.C.N., Riley, M.D.: Speech recognition by composition of weighted finite automata. In: Roche, E., Schabes, Y. (eds.) Finite-State Language Processing, pp. 431–453. MIT Press, Cambridge (1997)
Pirkola, A., Toivonen, J., Keskustalo, H., Visala, K., Järvelin, K.: Fuzzy translation of cross-lingual spelling variants. In: Proc. 26th Annual Int. ACM SIGIR, Toronto, Canada, 2003, pp. 345–352 (2003)
Post, E.: A variant of a recursively unsolvable problem. Bulletin of the American Mathematical Society 52, 264–268 (1946)
Pustejovsky, J., Castaño, J., Cochran, B., Kotecki, M., Morrell, M., Rumshisky, A.: Linguistic knowledge extraction from Medline: Automatic construction of an acronym database. In: Proc. 10th World Congress on Health and Medical Informatics, Medinfo 2001 (2001)
Rabin, M.O., Scott, D.: Finite automata and their decision problems. IBM Journal of Research and Development 3(2), 114–125 (1959)
Rosenberg, A.L.: On n-tape finite state acceptors. In: IEEE Symposium on Foundations of Computer Science (FOCS), pp. 76–81 (1964)
Schwartz, A., Hearst, M.: A simple algorithm for identifying abbreviation definitions in biomedical texts. In: Proc. Pacific Symposium on Biocomputing, PSB-2003 (2003)
Wagner, R.A., Fischer, M.J.: The string-to-string correction problem. Journal of the Association for Computing Machinery 21(1), 168–173 (1974)
Yeates, S., Bainbridge, D., Witten, I.H.: Using compression to identify acronyms in text. In: Proc. Data Compression Conf. (DCC-2000), Snowbird, Utah, USA (2000); Also published in a longer form as Working Paper 00/01, Department of Computer Science, University of Waikato (January 2000)
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kempe, A. (2010). Selected Operations and Applications of n-Tape Weighted Finite-State Machines. In: Yli-Jyrä, A., Kornai, A., Sakarovitch, J., Watson, B. (eds) Finite-State Methods and Natural Language Processing. FSMNLP 2009. Lecture Notes in Computer Science, vol 6062. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14684-8_4
DOI: https://doi.org/10.1007/978-3-642-14684-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14683-1
Online ISBN: 978-3-642-14684-8
eBook Packages: Computer Science, Computer Science (R0)