Markov Blankets and Meta-heuristics Search: Sentiment Extraction from Unstructured Texts

  • Edoardo Airoldi
  • Xue Bai
  • Rema Padman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3932)


Extracting sentiments from unstructured text has emerged as an important problem in many disciplines. An accurate method would enable us, for example, to mine online opinions from the Internet and learn customers’ preferences for economic or marketing research, or to gain a strategic advantage. In this paper, we propose a two-stage Bayesian algorithm that captures the dependencies among words and, at the same time, finds a vocabulary that is efficient for extracting sentiments. Experimental results on online movie reviews and online news show that our algorithm selects a parsimonious feature set, with substantially fewer predictor variables than the full data set, and yields better predictions of sentiment orientation than several state-of-the-art machine learning methods. Our findings suggest that sentiments are captured by conditional dependence relations among words, rather than by keywords or high-frequency words.
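
To make the idea of a Markov blanket concrete, the following is a minimal sketch, not the authors' algorithm. It only illustrates the standard definition: in a directed acyclic graph, the Markov blanket of a target node consists of its parents, its children, and the other parents of its children. The toy graph, the word names, and the function `markov_blanket` are hypothetical and chosen purely for illustration; the abstract does not describe how the actual network is learned or searched.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the Markov blanket of `target` in a DAG.

    `parents` maps every node to the set of its parent nodes.
    The blanket is: parents, children, and the children's other parents.
    """
    direct_parents = set(parents.get(target, set()))
    children = {node for node, ps in parents.items() if target in ps}
    spouses: Set[str] = set()
    for child in children:
        spouses |= parents[child]
    blanket = direct_parents | children | spouses
    blanket.discard(target)  # the target is not part of its own blanket
    return blanket


# Hypothetical toy DAG over a sentiment variable and a few words.
# This structure is an assumption for illustration, not a learned network.
toy_dag = {
    "sentiment": {"plot"},
    "boring": {"sentiment", "long"},
    "great": {"sentiment"},
    "long": set(),
    "plot": set(),
    "the": set(),
}

blanket = markov_blanket(toy_dag, "sentiment")
print(sorted(blanket))
```

In this toy graph the word "the" is disconnected from the sentiment variable, so it falls outside the blanket and would be dropped from the vocabulary, while "long" is kept only because it shares the child "boring" with the sentiment variable.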


Bayesian Network; Tabu Search; Directed Acyclic Graph; Tabu List; Sentiment Category
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Edoardo Airoldi (1)
  • Xue Bai (1, 2)
  • Rema Padman (2)
  1. School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
  2. The John Heinz III School of Public Policy and Management, Carnegie Mellon University, Pittsburgh, USA
