Distant Collocations between Suppositional Adverbs and Clause-Final Modality Forms in Japanese Language Corpora

  • Irena Srdanović Erjavec
  • Andrej Bekeš
  • Kikuko Nishina
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4938)

Abstract

The co-occurrence of modal adverbs and clause-final modality forms in Japanese exhibits strong agreement-like behaviour. We refer to such co-occurrences as distant collocations, a notion that warrants further consideration in corpus linguistics and computational linguistics. In this paper we concentrate on a set of suppositional adverbs and investigate the kinds of clause-final modality forms with which they frequently co-occur. A given group of adverbs is found to collocate predominantly with one group of modality forms (one modality type), while also co-occurring with other modality types. An analysis of a variety of corpora shows that associations between particular adverbs and particular modality types are a matter of degree, and that in some cases they vary across genres. The results are summarized with the help of cluster analysis. We believe that the basic analytical approach of this paper can be extended to cover similar kinds of collocational behaviour within lexicons and other large-scale knowledge resources, and can complement the development of computer-assisted language learning systems.
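As a rough illustration of the kind of analysis the abstract describes, the sketch below turns adverb-by-modality-type co-occurrence counts into relative-frequency profiles and groups adverbs with similar profiles. Everything in it is an assumption made for illustration: the adverb and modality-type labels are placeholders, the counts are invented, and single-linkage clustering on Euclidean distance is only one possible choice, since the abstract does not say which clustering method or association measure the authors used.

```python
from itertools import combinations
from math import sqrt

# Invented toy counts (NOT from the paper): how often each placeholder adverb
# co-occurs with each placeholder clause-final modality type.
COUNTS = {
    "adverb_a": {"type_x": 80, "type_y": 15, "type_z": 5},
    "adverb_b": {"type_x": 70, "type_y": 20, "type_z": 10},
    "adverb_c": {"type_x": 10, "type_y": 75, "type_z": 15},
    "adverb_d": {"type_x": 5, "type_y": 25, "type_z": 70},
}
TYPES = ["type_x", "type_y", "type_z"]


def profile(counts):
    """Relative frequency of each modality type for one adverb."""
    total = sum(counts.values())
    return [counts.get(t, 0) / total for t in TYPES]


def distance(p, q):
    """Euclidean distance between two profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cluster(profiles, n_clusters):
    """Single-linkage agglomerative clustering down to n_clusters groups."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters whose closest members are nearest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                distance(profiles[a], profiles[b])
                for a in clusters[ij[0]]
                for b in clusters[ij[1]]
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


PROFILES = {name: profile(c) for name, c in COUNTS.items()}
CLUSTERS = cluster(PROFILES, 3)

for name, p in PROFILES.items():
    print(name, [round(x, 2) for x in p])
print(CLUSTERS)
```

In this toy data each adverb has one dominant modality type but still co-occurs with the others, which is the "matter of degree" pattern the abstract refers to; adverbs with similar profiles end up in the same cluster.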

Keywords

suppositional adverbs, clause-final modality, collocations, corpora, Japanese language

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Irena Srdanović Erjavec (1)
  • Andrej Bekeš (2)
  • Kikuko Nishina (1)
  1. Department of Human System Science, Tokyo Institute of Technology, Meguro-ku, Japan
  2. University of Ljubljana, Faculty of Arts, Ljubljana, Slovenia