Abstract
Finite-state processing is typically based on structures that allow efficient indexing and sequential search. However, this “rigid” framework has several disadvantages in natural language processing, especially for non-alphabetical languages. The solution proposed here is to systematically introduce polymorphic programming techniques adapted to particular cases. In this paper we describe the structure of a morphological dictionary implemented with finite-state automata whose nodes use variable (polymorphic) formats. Each node is assigned a format from a predefined set, chosen to reflect the node's utility in corpus processing as measured by a number of graph-theoretic metrics and statistics. Experimental results show that this approach yields a 52% increase in dictionary look-up performance.
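The idea of per-node formats can be illustrated with a minimal sketch. It is not the implementation described in the paper: the two formats (`SparseNode`, `DenseNode`), the out-degree cut-off `DENSE_THRESHOLD`, and the helper names `build`, `freeze` and `lookup` are all assumptions made for illustration. In particular, out-degree stands in here for the graph-theoretic metrics and corpus statistics mentioned in the abstract, whose exact definitions are not given in this summary.

```python
from bisect import bisect_left

DENSE_THRESHOLD = 4  # hypothetical cut-off; the paper derives formats from richer metrics


class SparseNode:
    """Format for nodes with few arcs: sorted labels, binary search."""
    __slots__ = ("labels", "targets", "final")

    def __init__(self, arcs, final):
        items = sorted(arcs.items())
        self.labels = [label for label, _ in items]
        self.targets = [target for _, target in items]
        self.final = final

    def step(self, ch):
        i = bisect_left(self.labels, ch)
        if i < len(self.labels) and self.labels[i] == ch:
            return self.targets[i]
        return None


class DenseNode:
    """Format for nodes with many arcs: direct indexing by code point offset."""
    __slots__ = ("base", "table", "final")

    def __init__(self, arcs, final):
        codes = [ord(label) for label in arcs]
        self.base = min(codes)
        self.table = [None] * (max(codes) - self.base + 1)
        for label, target in arcs.items():
            self.table[ord(label) - self.base] = target
        self.final = final

    def step(self, ch):
        i = ord(ch) - self.base
        return self.table[i] if 0 <= i < len(self.table) else None


def freeze(raw):
    """Convert a dict-based trie node into the format chosen for it."""
    arcs = {ch: freeze(child) for ch, child in raw["arcs"].items()}
    node_format = DenseNode if len(arcs) >= DENSE_THRESHOLD else SparseNode
    return node_format(arcs, raw["final"])


def build(words):
    """Build a plain trie, then assign each node a format."""
    root = {"arcs": {}, "final": False}
    for word in words:
        node = root
        for ch in word:
            node = node["arcs"].setdefault(ch, {"arcs": {}, "final": False})
        node["final"] = True
    return freeze(root)


def lookup(root, word):
    """Walk the automaton; every node answers step() in its own way."""
    node = root
    for ch in word:
        node = node.step(ch)
        if node is None:
            return False
    return node.final
```

The look-up loop never inspects the node type: each format exposes the same `step` method, so a heavily used node can take a representation that costs more memory but is faster to traverse, while rarely visited nodes stay compact.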
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Troussov, A., O’Donovan, B., Koskenniemi, S., Glushnev, N. (2003). Per-node Optimization of Finite-State Mechanisms for Natural Language Processing. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2003. Lecture Notes in Computer Science, vol 2588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36456-0_22
DOI: https://doi.org/10.1007/3-540-36456-0_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00532-2
Online ISBN: 978-3-540-36456-6
eBook Packages: Springer Book Archive