Mapping an Automated Survey Coding Task into a Probabilistic Text Categorization Framework

  • Daniela Giorgetti
  • Irina Prodanof
  • Fabrizio Sebastiani
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2389)

Abstract

This paper describes how to apply a probabilistic Text Categorization method to a new domain, in which the documents are answers to open-ended questionnaires and the codes, viewed as categories, are organized in a hierarchical model. The hierarchical organization of the categories makes it possible to use a training set of reduced size. The system developed in this framework aims to help psychologists evaluate open-ended surveys that inquire about job candidates’ competencies.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Daniela Giorgetti (1)
  • Irina Prodanof (1)
  • Fabrizio Sebastiani (2)
  1. Istituto di Linguistica Computazionale del CNR di Pisa, Italy
  2. Istituto di Elaborazione dell’Informazione del CNR di Pisa, Italy
