A Continuum-Based Model of Lexical Acquisition

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9623)

Abstract

The automatic acquisition of verbal constructions is an important issue for natural language processing. In this paper, we take a closer look at two fundamental aspects of verb description: the notion of lexical item and the distinction between arguments and adjuncts. Following up on studies in natural language processing and linguistics, we adopt a double hypothesis: (i) there is a continuum between ambiguity and vagueness, and (ii) there is a continuum between arguments and adjuncts. We provide a complete approach to the acquisition of lexical knowledge about verbal constructions from an untagged news corpus. The approach is evaluated through the analysis of a sample of the 7,000 Japanese verbs automatically described by the system.

Notes

  1. http://nlp.ist.i.kyoto-u.ac.jp/index.php?KNP.

  2. http://taku910.github.io/cabocha/.

  3. http://plata.ar.media.kyoto-u.ac.jp/tool/EDA/home_en.html.

  4. http://taku910.github.io/mecab/.

Acknowledgement

Pierre Marchal’s research has been partially supported by a national “contrat doctoral” from the ministry of research.

Author information

Corresponding author

Correspondence to Pierre Marchal.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Marchal, P., Poibeau, T. (2018). A Continuum-Based Model of Lexical Acquisition. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9623. Springer, Cham. https://doi.org/10.1007/978-3-319-75477-2_12

  • DOI: https://doi.org/10.1007/978-3-319-75477-2_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75476-5

  • Online ISBN: 978-3-319-75477-2

  • eBook Packages: Computer Science, Computer Science (R0)
