PorTAL 2002: Advances in Natural Language Processing, pp. 115-124
Mapping an Automated Survey Coding Task into a Probabilistic Text Categorization Framework
Conference paper
Abstract
This paper describes how to apply a probabilistic Text Categorization method to a new domain in which the documents are answers to open-ended questionnaires and the codes, viewed as categories, are organized in a hierarchical model. Exploiting the hierarchical organization of the categories allows a reduced-size training set to be used. The system developed in this framework is intended to help psychologists evaluate open-ended surveys that inquire about job candidates’ competencies.
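As a rough illustration of the idea, the sketch below codes an answer top-down through a two-level code hierarchy, training one small classifier per node. It assumes a multinomial Naive Bayes classifier with Laplace smoothing, which the abstract does not confirm as the method used; the competency codes, the example answers, and the names `NaiveBayes`, `train_hierarchy`, and `classify` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def __init__(self):
        self.class_docs = Counter()              # documents seen per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            tokens = text.lower().split()
            self.class_docs[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no evidence, so skip them.
        tokens = [t for t in text.lower().split() if t in self.vocab]
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def train_hierarchy(examples):
    """Train one top-level classifier plus one classifier per top-level code.

    examples: list of (answer_text, (top_code, sub_code)) pairs.
    """
    top = NaiveBayes().fit([text for text, _ in examples],
                           [codes[0] for _, codes in examples])
    by_top = defaultdict(list)
    for text, (top_code, sub_code) in examples:
        by_top[top_code].append((text, sub_code))
    subs = {
        top_code: NaiveBayes().fit([t for t, _ in items], [s for _, s in items])
        for top_code, items in by_top.items()
    }
    return top, subs


def classify(top, subs, text):
    """Pick the top-level code first, then a sub-code within that branch."""
    top_code = top.predict(text)
    return top_code, subs[top_code].predict(text)


# Hypothetical training answers with made-up competency codes.
EXAMPLES = [
    ("i set clear goals and met every target", ("achievement", "goal_setting")),
    ("i started the project without being asked", ("achievement", "initiative")),
    ("i helped my colleagues and shared the work", ("teamwork", "cooperation")),
    ("i led the group and guided my colleagues", ("teamwork", "leadership")),
]

top, subs = train_hierarchy(EXAMPLES)
print(classify(top, subs, "i shared the work with my colleagues"))
```

In this sketch each sub-level classifier only has to separate the codes within one branch, so it is trained on that branch's answers alone. That is one plausible reading of how a hierarchy can reduce the amount of training data needed, not necessarily the mechanism used in the paper.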
Copyright information
© Springer-Verlag Berlin Heidelberg 2002