Using Tree Transducers for Grammatical Inference

Sandillon-Rezer, Noémie-Fleur; Moot, Richard

doi:10.1007/978-3-642-22221-4_16

Noémie-Fleur Sandillon-Rezer & Richard Moot

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6736)

Included in the following conference series:

International Conference on Logical Aspects of Computational Linguistics

Abstract

We present a novel way of extracting a categorial grammar from annotated data. Using the sentences from the Paris VII annotated treebank [2] as our starting point, we use a tree transducer to convert the annotated trees from the corpus into categorial grammar derivations.

We describe both the formal aspects and the implementation of the tree transducer, which is a conservative extension of standard tree transducers that allows a compact specification of the transduction rules relevant for our purposes. We also discuss the specific set of transduction rules we use to convert the corpus into AB grammar derivation trees.

Evaluating the resulting tree transducer on the entire corpus, we find that it finds lexical entries for 90.0% of the corpus, though it produces complete derivations for only 75% of all sentences in the corpus.
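
The conversion summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' transducer or rule set: the two rules, the labels SENT, NP, VN, DET and NC, the tuple encoding of trees, and the function names transduce and lexicon are all assumptions chosen for illustration. Each rule matches a node label together with the labels of its two children and, given the category expected for the node, assigns AB categories to the children. A tree that no rule covers raises an error, which loosely mirrors sentences for which no complete derivation is produced.

```python
# Hypothetical rule table (an assumption, not the rule set from the paper).
# Key: (parent label, child labels). Value: function from the parent's
# category to the categories of the left and right child.
# Categories are plain strings; only atomic result categories are handled.
RULES = {
    # Subject NP is the argument; the verb looks for it on its left.
    ("SENT", ("NP", "VN")): lambda cat: ("np", "np\\" + cat),
    # Determiner is the functor; the noun is its argument on the right.
    ("NP", ("DET", "NC")): lambda cat: (cat + "/n", "n"),
}


def transduce(tree, cat):
    """Top-down conversion of an annotated tree into an AB derivation.

    A tree is (label, word) for a preterminal or (label, [left, right]).
    The result is (category, word) or (category, left, right).
    """
    label, children = tree
    if isinstance(children, str):
        # Preterminal: the word receives the category passed down to it.
        return (cat, children)
    key = (label, tuple(child[0] for child in children))
    rule = RULES.get(key)
    if rule is None:
        raise ValueError(f"no transduction rule for {key}")
    left_cat, right_cat = rule(cat)
    left, right = children
    return (cat, transduce(left, left_cat), transduce(right, right_cat))


def lexicon(derivation):
    """Collect (word, category) pairs from the leaves, left to right."""
    if len(derivation) == 2:
        cat, word = derivation
        return [(word, cat)]
    _, left, right = derivation
    return lexicon(left) + lexicon(right)


# Toy input in the assumed encoding: "le chat dort".
tree = ("SENT", [("NP", [("DET", "le"), ("NC", "chat")]), ("VN", "dort")])
derivation = transduce(tree, "s")
print(lexicon(derivation))
```

In this sketch the lexical entries are read off the leaves of the derivation, so a sentence contributes entries only when every node on the way down is covered by some rule.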

References

1. Abeillé, A., Clément, L.: Annotation morpho-syntaxique (2003), http://llf.linguist.jussieu.fr
2. Abeillé, A., Clément, L., Toussenel, F.: Building a treebank for French. In: Treebanks. Kluwer, Dordrecht (2003)
3. Besombes, J., Marion, J.: Learning tree languages from positive examples and membership queries. In: Ben-David, S., Case, J., Maruoka, A. (eds.) ALT 2004. LNCS (LNAI), vol. 3244, pp. 440–453. Springer, Heidelberg (2004)
4. Buszkowski, W., Penn, G.: Categorial grammars determined from linguistic data by unification. Studia Logica 49(4), 431–454 (1990), http://dx.doi.org/10.1007/BF00370157
5. Chomsky, N.: Lectures on government and binding (1981)
6. Clark, S., Curran, J.: Wide-coverage efficient statistical parsing with CCG and log-linear models. Computational Linguistics 33 (2007)
7. Comon, H., Dauchet, M., Jacquemard, F., Lugiez, D., Tison, S., Tommasi, M.: Tree automata techniques and applications (1997), http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.125.6165
8. Costa Florêncio, C.: Consistent identification in the limit of any of the classes k-valued is NP-hard. In: de Groote, P., Morrill, G., Retoré, C. (eds.) LACL 2001. LNCS (LNAI), vol. 2099, pp. 125–138. Springer, Heidelberg (2001)
9. Engelfriet, J., Vogler, H.: The translation power of top-down tree-to-graph transducers. Journal of Computer and System Sciences 49(2) (1993)
10. Gold, E.M.: Language identification in the limit. Information and Control 10(5) (1967)
11. Hockenmaier, J.: Data and models for statistical parsing with combinatory categorial grammar (2003)
12. Hockenmaier, J.: Creating a CCGbank and a wide-coverage CCG lexicon for German. In: Proceedings of COLING/ACL, Sydney (2006)
13. Kanazawa, M.: Learnable Classes of Categorial Grammars. Center for the Study of Language and Information, Stanford University (1998)
14. Knight, K., Graehl, J.: An overview of probabilistic tree transducers for natural language processing. In: Gelbukh, A. (ed.) CICLing 2005. LNCS, vol. 3406, pp. 1–24. Springer, Heidelberg (2005)
15. Kraak, E.: A deductive account of French object clitics. In: Syntax and Semantics, pp. 271–312 (1998)
16. Lambek, J.: The mathematics of sentence structure. The American Mathematical Monthly 65(3), 154–170 (1958), http://www.jstor.org/stable/2310058
17. Levy, R., Andrew, G.: Tregex and Tsurgeon: tools for querying and manipulating tree data structures (2006), http://nlp.stanford.edu/software/tregex.shtml
18. Moortgat, M.: Categorial type logics. In: Handbook of Logic and Language, pp. 93–177 (1997), http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.5803
19. Moot, R.: Automated extraction of type-logical supertags from the Spoken Dutch Corpus. In: Complexity of Lexical Descriptions and its Relevance to Natural Language Processing: A Supertagging Approach (2010)
20. Moot, R.: Semi-automated extraction of a wide-coverage type-logical grammar for French. In: Proceedings TALN 2010, Montreal (2010)
21. Moot, R., Retoré, C.: Les indices pronominaux du français dans les grammaires catégorielles. Lingvisticae Investigationes 29(1), 137–146 (2006)
22. Morrill, G.V.: Type Logical Grammar: Categorial Logic of Signs. Springer, Heidelberg (1994)
23. Sandillon-Rezer, N. (2011), http://www.labri.fr/perso/nfsr/
24. Steedman, M.: The syntactic process (200)

Author information

Authors and Affiliations

Université de Bordeaux LaBRI, 351 cours de la libération, 33400, Talence, France
Noémie-Fleur Sandillon-Rezer & Richard Moot
CNRS, esplanade des Arts et Métiers, 33400, Talence, France
Noémie-Fleur Sandillon-Rezer & Richard Moot
SIGNES (INRIA Bordeaux SW), 351 cours de la libération, 33400, Talence, France
Noémie-Fleur Sandillon-Rezer & Richard Moot

Editor information

Editors and Affiliations

INRIA Nancy – Grand Est, 615, rue du Jardin Botanique, 54602, Villers-lès-Nancy Cedex, France
Sylvain Pogodalla
LIRMM, UMR 5506 - CC 477, Université Montpellier 2, 161 rue Ada, 34095, Montpellier Cedex 5, France
Jean-Philippe Prost

About this paper

Cite this paper

Sandillon-Rezer, NF., Moot, R. (2011). Using Tree Transducers for Grammatical Inference. In: Pogodalla, S., Prost, JP. (eds) Logical Aspects of Computational Linguistics. LACL 2011. Lecture Notes in Computer Science, vol 6736. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22221-4_16

DOI: https://doi.org/10.1007/978-3-642-22221-4_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22220-7
Online ISBN: 978-3-642-22221-4
eBook Packages: Computer Science, Computer Science (R0)