LKR 2008: Large-Scale Knowledge Resources. Construction and Application, pp 252-266
Distant Collocations between Suppositional Adverbs and Clause-Final Modality Forms in Japanese Language Corpora
Abstract
The co-occurrence of modal adverbs and clause-final modality forms in Japanese exhibits strong agreement-like behaviour. We refer to such co-occurrences as distant collocations, a notion that warrants further consideration in corpus linguistics and computational linguistics. In this paper we concentrate on a set of suppositional adverbs and investigate the kinds of clause-final modality forms with which they frequently co-occur. A given group of adverbs is found to collocate predominantly with one group of modality forms (one modality type), but it also co-occurs with other modality types. Analysis of a variety of corpora reveals that associations between particular adverbs and particular modality types are a matter of degree, although in some cases the associations vary across genres. The results are summarized with the help of cluster analysis. We believe that the basic analytical approach of this paper can be extended to similar kinds of collocational behaviour in lexicons and other large-scale knowledge resources, and that it can complement the development of computer-assisted language learning systems.
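The idea that adverb-modality associations are a matter of degree, and that adverbs can be grouped by cluster analysis, can be illustrated with a minimal sketch. The abstract does not state which association measure or clustering method the paper uses, so everything here is an assumption: the counts are invented toy numbers, `adverb_A` to `adverb_C` and `type_1` to `type_3` are hypothetical placeholders, and the Euclidean distance between relative-frequency profiles is only one of many possible choices.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data (NOT from the paper): COUNTS[adverb][modality_type]
# is the number of clauses in which the adverb co-occurs with a clause-final
# form of that modality type.
COUNTS = {
    "adverb_A": {"type_1": 80, "type_2": 15, "type_3": 5},
    "adverb_B": {"type_1": 70, "type_2": 20, "type_3": 10},
    "adverb_C": {"type_1": 10, "type_2": 30, "type_3": 60},
}


def profile(row):
    """Turn raw co-occurrence counts into relative frequencies (sum to 1)."""
    total = sum(row.values())
    return {modality: n / total for modality, n in row.items()}


def distance(p, q):
    """Euclidean distance between two profiles over the same modality types."""
    return sqrt(sum((p[m] - q[m]) ** 2 for m in p))


# Each adverb's profile shows association as a matter of degree:
# one dominant modality type plus smaller shares for the others.
profiles = {adverb: profile(row) for adverb, row in COUNTS.items()}

# The pair of adverbs with the most similar profiles is what the first
# merge step of an agglomerative cluster analysis would join.
closest = min(
    combinations(sorted(profiles), 2),
    key=lambda pair: distance(profiles[pair[0]], profiles[pair[1]]),
)

for adverb, prof in profiles.items():
    print(adverb, {m: round(share, 2) for m, share in prof.items()})
print("first merge:", closest)
```

With real data, the profiles would be computed separately for each corpus, which would make any genre-dependent variation visible as differences between the per-corpus profiles of the same adverb.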
Keywords
suppositional adverbs, clause-final modality, collocations, corpora, Japanese language