Abstract
The automatic acquisition of verbal constructions is an important issue for natural language processing. In this paper, we take a closer look at two fundamental aspects of the description of the verb: the notion of lexical item and the distinction between arguments and adjuncts. Following up on studies in natural language processing and linguistics, we embrace the double hypothesis (i) of a continuum between ambiguity and vagueness, and (ii) of a continuum between arguments and adjuncts. We provide a complete approach to lexical knowledge acquisition of verbal constructions from an untagged news corpus. The approach is evaluated through the analysis of a sample of the 7,000 Japanese verbs automatically described by the system.
Acknowledgement
Pierre Marchal’s research has been partially supported by a national “contrat doctoral” from the ministry of research.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Marchal, P., Poibeau, T. (2018). A Continuum-Based Model of Lexical Acquisition. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9623. Springer, Cham. https://doi.org/10.1007/978-3-319-75477-2_12
DOI: https://doi.org/10.1007/978-3-319-75477-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75476-5
Online ISBN: 978-3-319-75477-2
eBook Packages: Computer Science (R0)