Using Tree Transducers for Grammatical Inference

  • Conference paper
Logical Aspects of Computational Linguistics (LACL 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6736)

Abstract

We present a novel way of extracting a categorial grammar from annotated data. Using the sentences from the Paris VII annotated treebank [2] as our starting point, we use a tree transducer to convert the annotated trees from the corpus into categorial grammar derivations.

We describe both the formal aspects and the implementation of the tree transducer, a conservative extension of standard tree transducers that allows a compact specification of the transduction rules relevant for our purposes. We also discuss the specific set of transduction rules we use to convert the corpus into AB grammar derivation trees.
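As a rough illustration of the idea (not the authors' transducer or rule set), the following Python sketch rewrites a small annotated tree into an AB grammar derivation top-down. The node labels, the two entries in `RULES`, the `ATOMS` mapping and all function names are assumptions made up for this example; the only thing taken from the abstract is the general scheme of rules that turn treebank nodes into functor/argument structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union

# A category is an atom ("np") or a triple written in reading order:
# ("/", result, arg) stands for result/arg, ("\\", arg, result) for arg\result.
Cat = Union[str, tuple]


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None
    cat: Optional[Cat] = None


# Hypothetical binary rules: (parent, child labels) -> index of the functor child.
RULES = {
    ("SENT", ("NP", "VN")): 1,  # the verb consumes the subject on its left
    ("NP", ("DET", "N")): 0,    # the determiner consumes the noun on its right
}
# Hypothetical mapping from argument labels to atomic categories.
ATOMS = {"NP": "np", "N": "n"}


def transduce(node: Node, cat: Cat) -> bool:
    """Annotate node with cat; return True if the whole subtree was derived."""
    node.cat = cat
    if not node.children:
        return True
    key = (node.label, tuple(c.label for c in node.children))
    if key not in RULES:
        return False  # no rule applies: the derivation stays partial
    f = RULES[key]
    arg_cat = ATOMS[node.children[1 - f].label]
    if f == 0:
        cats = [("/", cat, arg_cat), arg_cat]   # functor on the left: cat/arg
    else:
        cats = [arg_cat, ("\\", arg_cat, cat)]  # functor on the right: arg\cat
    # Build the list first so every child is annotated even if one fails.
    return all([transduce(c, k) for c, k in zip(node.children, cats)])


def show(cat: Cat) -> str:
    """Render a category, parenthesising complex sub-categories."""
    if isinstance(cat, str):
        return cat
    op, a, b = cat
    wrap = lambda c: c if isinstance(c, str) else f"({show(c)})"
    return f"{wrap(a)}{op}{wrap(b)}"


def lexicon(node: Node) -> List[Tuple[str, str]]:
    """Collect (word, category) pairs from the annotated leaves."""
    if not node.children:
        return [(node.word, show(node.cat))] if node.cat is not None else []
    return [entry for c in node.children for entry in lexicon(c)]


tree = Node("SENT", [
    Node("NP", [Node("DET", word="le"), Node("N", word="chat")]),
    Node("VN", word="dort"),
])
complete = transduce(tree, "s")
print(complete, lexicon(tree))
```

In this toy setting a sentence counts as completely derived only when a rule matches at every internal node, while leaves that were reached before a failure still contribute lexical entries. This is one way lexical coverage could end up higher than the proportion of complete derivations.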

Evaluating the resulting tree transducer on the entire corpus, we find that it yields lexical entries for 90.0% of the corpus, though it produces complete derivations for only 75% of all sentences in the corpus.

References

  1. Abeillé, A., Clément, L.: Annotation morpho-syntaxique (2003), http://llf.linguist.jussieu.fr

  2. Abeillé, A., Clément, L., Toussenel, F.: Building a treebank for French. In: Treebanks. Kluwer, Dordrecht (2003)

  3. Besombes, J., Marion, J.: Learning tree languages from positive examples and membership queries. In: Ben-David, S., Case, J., Maruoka, A. (eds.) ALT 2004. LNCS (LNAI), vol. 3244, pp. 440–453. Springer, Heidelberg (2004)

  4. Buszkowski, W., Penn, G.: Categorial grammars determined from linguistic data by unification. Studia Logica 49(4), 431–454 (1990), http://dx.doi.org/10.1007/BF00370157

  5. Chomsky, N.: Lectures on government and binding (1981)

  6. Clark, S., Curran, J.: Wide-coverage efficient statistical parsing with CCG and log-linear models. Computational Linguistics 33 (2007)

  7. Comon, H., Dauchet, M., Jacquemard, F., Lugiez, D., Tison, S., Tommasi, M.: Tree automata techniques and applications (1997), http://citeseerx.ist.psu.edu/viewdoc/summary?doi=?doi=10.1.1.125.6165

  8. Costa Florêncio, C.: Consistent identification in the limit of any of the classes k-valued is NP-hard. In: de Groote, P., Morrill, G., Retoré, C. (eds.) LACL 2001. LNCS (LNAI), vol. 2099, pp. 125–138. Springer, Heidelberg (2001)

  9. Engelfriet, J., Vogler, H.: The translation power of top-down tree-to-graph transducers. Journal of Computer and System Sciences 49(2) (1993)

  10. Gold, E.M.: Language identification in the limit. Information and Control 10(5) (1967)

  11. Hockenmaier, J.: Data and models for statistical parsing with combinatory categorial grammar (2003)

  12. Hockenmaier, J.: Creating a CCGbank and a wide-coverage CCG lexicon for German. In: Proceedings of COLING/ACL, Sydney (2006)

  13. Kanazawa, M.: Learnable Classes of Categorial Grammars. Center for the Study of Language and Information, Stanford (1998)

  14. Knight, K., Graehl, J.: An overview of probabilistic tree transducers for natural language processing. In: Gelbukh, A. (ed.) CICLing 2005. LNCS, vol. 3406, pp. 1–24. Springer, Heidelberg (2005)

  15. Kraak, E.: A deductive account of French object clitics. In: Syntax and Semantics, pp. 271–312 (1998)

  16. Lambek, J.: The mathematics of sentence structure. The American Mathematical Monthly 65(3), 154–170 (1958), http://www.jstor.org/stable/2310058

  17. Levy, R., Andrew, G.: Tregex and Tsurgeon: tools for querying and manipulating tree data structures (2006), http://nlp.stanford.edu/software/tregex.shtml

  18. Moortgat, M.: Categorial type logics. In: Handbook of Logic and Language, pp. 93–177 (1997), http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.5803

  19. Moot, R.: Automated extraction of type-logical supertags from the Spoken Dutch Corpus. In: Complexity of Lexical Descriptions and its Relevance to Natural Language Processing: A Supertagging Approach (2010)

  20. Moot, R.: Semi-automated extraction of a wide-coverage type-logical grammar for French. In: Proceedings TALN 2010, Montréal (2010)

  21. Moot, R., Retoré, C.: Les indices pronominaux du français dans les grammaires catégorielles. Lingvisticae Investigationes 29(1), 137–146 (2006)

  22. Morrill, G.V.: Type Logical Grammar: Categorial Logic of Signs. Springer, Heidelberg (1994)

  23. Sandillon-Rezer, N. (2011), http://www.labri.fr/perso/nfsr/

  24. Steedman, M.: The Syntactic Process (2000)

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sandillon-Rezer, NF., Moot, R. (2011). Using Tree Transducers for Grammatical Inference. In: Pogodalla, S., Prost, JP. (eds) Logical Aspects of Computational Linguistics. LACL 2011. Lecture Notes in Computer Science, vol 6736. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22221-4_16

  • DOI: https://doi.org/10.1007/978-3-642-22221-4_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22220-7

  • Online ISBN: 978-3-642-22221-4

  • eBook Packages: Computer Science, Computer Science (R0)
