A Text Cube Approach to Human, Social and Cultural Behavior in the Twitter Stream

  • Xiong Liu
  • Kaizhi Tang
  • Jeffrey Hancock
  • Jiawei Han
  • Mitchell Song
  • Roger Xu
  • Bob Pokorny
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7812)


Twitter is a microblogging website that has proved useful as a source for analyzing human social behavior, including political sentiment, user influence, and the spread of news. In this paper, we discuss a text cube approach to studying different kinds of human, social and cultural behavior (HSCB) embedded in the Twitter stream. A text cube is a new way to organize data (e.g., Twitter text) in multiple dimensions and multiple hierarchies for efficient information query and visualization. With the HSCB measures defined in a cube, users can view statistical reports and perform online analytical processing (OLAP). In addition to viewing and analyzing Twitter text using cubes and charts, we have added the capability to display the contents of the cube on a heat map, in which the degree of opacity is directly proportional to the value of the behavioral, social or cultural measure. This kind of map allows the analyst to focus attention on hotspots of concern in a region of interest. The text cube architecture also supports the development of data mining models using data taken from the cubes. We provide several case studies to illustrate the text cube approach, including public sentiment in a U.S. city and political sentiment in the Arab Spring.
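As a rough illustration of the ideas in the abstract, the Python sketch below aggregates a numeric measure over cells defined by dimension hierarchies (a roll-up) and then maps each cell to a heat-map opacity in direct proportion to its value. The records, field names (`city`, `state`, `day`, `month`, `sentiment`), scores, and helper functions are hypothetical and are not taken from the paper; the authors' actual cube implementation is not shown here.

```python
from collections import defaultdict

# Hypothetical tweets: dimension values at two hierarchy levels each
# (city -> state, day -> month) plus one numeric measure.
TWEETS = [
    {"city": "Boston", "state": "MA", "day": "2011-02-01", "month": "2011-02", "sentiment": 0.8},
    {"city": "Boston", "state": "MA", "day": "2011-02-02", "month": "2011-02", "sentiment": 0.4},
    {"city": "Springfield", "state": "MA", "day": "2011-02-01", "month": "2011-02", "sentiment": 0.6},
    {"city": "Austin", "state": "TX", "day": "2011-03-05", "month": "2011-03", "sentiment": 0.2},
]


def roll_up(rows, levels, measure="sentiment"):
    """Average `measure` over every cell defined by the chosen hierarchy levels."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        cell = tuple(row[level] for level in levels)
        sums[cell] += row[measure]
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


def opacity(cells):
    """Scale each cell value into [0, 1] in direct proportion to the largest value."""
    top = max(cells.values())
    return {cell: (value / top if top > 0 else 0.0) for cell, value in cells.items()}


# Coarse view: one cell per state.
by_state = roll_up(TWEETS, ["state"])
# Finer view: one cell per (city, month) pair.
by_city_month = roll_up(TWEETS, ["city", "month"])
# Opacity for a state-level heat map.
state_opacity = opacity(by_state)
```

Choosing different entries in `levels` moves between coarser and finer views of the same data, which is the kind of navigation that OLAP on a cube provides; a production system would typically precompute such aggregates rather than scan the rows on every query.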


Keywords: Data Cube, Text Cube, Cultural Behavior, Unstructured Text, Star Schema
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Xiong Liu (1)
  • Kaizhi Tang (1)
  • Jeffrey Hancock (2)
  • Jiawei Han (3)
  • Mitchell Song (1)
  • Roger Xu (1)
  • Bob Pokorny (1)

  1. Intelligent Automation, Inc., USA
  2. Cornell University, USA
  3. University of Illinois at Urbana-Champaign, USA
