Linguistic Ethnography: Identifying Dominant Word Classes in Text

  • Rada Mihalcea
  • Stephen Pulman
Conference paper

DOI: 10.1007/978-3-642-00382-0_48

Part of the Lecture Notes in Computer Science book series (LNCS, volume 5449)
Cite this paper as:
Mihalcea R., Pulman S. (2009) Linguistic Ethnography: Identifying Dominant Word Classes in Text. In: Gelbukh A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2009. Lecture Notes in Computer Science, vol 5449. Springer, Berlin, Heidelberg

Abstract

In this paper, we propose a method for “linguistic ethnography” – a general mechanism for characterising texts with respect to the dominance of certain classes of words. Using humour as a case study, we explore the automatic learning of salient word classes, including semantic classes (e.g., person, animal), psycholinguistic classes (e.g., tentative, cause), and affective load (e.g., anger, happiness). We measure the reliability of the derived word classes and their associated dominance scores by showing significant correlation across different corpora.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Rada Mihalcea (1, 2)
  • Stephen Pulman (2)

  1. Computer Science Department, University of North Texas, USA
  2. Computational Linguistics Group, Oxford University, UK
