Computational Linguistics and Intelligent Text Processing

Volume 7181 of the series Lecture Notes in Computer Science pp 591-602

Building Subjectivity Lexicon(s) from Scratch for Essay Data

  • Beata Beigman Klebanov (Educational Testing Service)
  • Jill Burstein (Educational Testing Service)
  • Nitin Madnani (Educational Testing Service)
  • Adam Faulkner (Graduate Center, The City University of New York)
  • Joel Tetreault (Educational Testing Service)


While a number of subjectivity lexicons are available for research purposes, none can be used commercially. We describe the process of constructing subjectivity lexicon(s) for recognizing sentiment polarity in essays written by test-takers, to be used within a commercial essay-scoring system. We discuss ways of expanding a manually built seed lexicon using dictionary-based information and in-domain and out-of-domain distributional information, as well as using Amazon Mechanical Turk to help “clean up” the expansions. We show the feasibility of constructing a family of subjectivity lexicons from scratch using a combination of methods to attain performance competitive with state-of-the-art research-only lexicons. Furthermore, this is the first use, to our knowledge, of a paraphrase generation system for expanding a subjectivity lexicon.


Keywords: essay writing, sentiment analysis, sentiment polarity, subjectivity lexicon, C5.0, lexicon expansion, paraphrase generation, thesaurus resources