TweCoM: Topic and Context Mining from Twitter

  • Luca Cagliero
  • Alessandro Fiori
Part of the Lecture Notes in Social Networks book series (LNSN, volume 6)


Social networks and online communities play a primary role in enabling communication and content sharing (e.g., posts, documents, photos, videos) among Web users. Knowledge discovery from user-generated content is therefore an increasingly appealing research area, and many different approaches have been devoted to it. This chapter proposes the TweCoM (Tweet Context Miner) framework, which mines relevant recurrences from both the content of Twitter messages (i.e., tweets) and the context in which they are posted. The framework combines two main efforts: (i) the automatic generation of taxonomies from both post content and contextual features, and (ii) the extraction of hidden correlations by means of generalized association rule mining. Since generalized association rule mining is commonly driven by user-provided taxonomies, the discovered recurrences are often unsatisfactory. To overcome this issue, two different taxonomy inference procedures are applied, depending on the kind of information. In particular, relationships holding in the context data provided by Twitter are exploited to automatically construct aggregation hierarchies over contextual features, while a hierarchical clustering algorithm is used to build a taxonomy over the most relevant tweet content keywords. To counteract the excessive level of detail of the extracted information, conceptual aggregations (i.e., generalizations) of the concepts hidden in the analyzed data are exploited in the association rule mining process. The extraction of generalized association rules allows high-level recurrences to be discovered by exploiting the extracted taxonomies. Experiments performed on real Twitter posts show the effectiveness and efficiency of the proposed framework in analyzing tweet content and the related context, as well as in highlighting relevant trends in tweet propagation.
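The role of the taxonomy in generalized association rule mining can be illustrated with a minimal sketch. It is not the TweCoM implementation: it assumes the simple strategy of extending each transaction with the ancestors of its items and then mining pairwise rules. The taxonomy, the toy tweets, the thresholds, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical taxonomy (child -> parent) mixing a contextual feature
# (place) and content keywords; not taken from the chapter.
TAXONOMY = {
    "Turin": "Italy",
    "Milan": "Italy",
    "Paris": "France",
    "goal": "sport",
    "match": "sport",
}


def ancestors(item, taxonomy):
    """Return every generalization of `item`, nearest ancestor first."""
    result = []
    while item in taxonomy:
        item = taxonomy[item]
        result.append(item)
    return result


def extend(transaction, taxonomy):
    """Add the ancestors of each item to the transaction."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, taxonomy))
    return extended


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def generalized_rules(transactions, taxonomy, min_sup, min_conf):
    """Mine rules lhs -> rhs between single (possibly generalized) items."""
    extended = [extend(t, taxonomy) for t in transactions]
    items = sorted(set().union(*extended))
    rules = []
    for a, b in combinations(items, 2):
        # An item always co-occurs with its own ancestors: skip such pairs.
        if a in ancestors(b, taxonomy) or b in ancestors(a, taxonomy):
            continue
        sup = support({a, b}, extended)
        if sup < min_sup:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            conf = sup / support({lhs}, extended)
            if conf >= min_conf:
                rules.append((lhs, rhs, sup, conf))
    return rules


# Toy tweets, each reduced to one place and one keyword (assumed data).
TWEETS = [
    {"Turin", "goal"},
    {"Milan", "match"},
    {"Turin", "match"},
    {"Paris", "goal"},
]

RULES = generalized_rules(TWEETS, TAXONOMY, min_sup=0.5, min_conf=0.7)
for lhs, rhs, sup, conf in RULES:
    print(f"{lhs} -> {rhs} (support={sup:.2f}, confidence={conf:.2f})")
```

In this toy data no pair of detailed items (a specific city together with a specific keyword) reaches the support threshold, whereas rules involving generalized items such as Italy and sport do. This is the effect the abstract describes: aggregation over a taxonomy exposes recurrences that are hidden at the most detailed level.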


Keywords: Association Rule; Rule Mining; Frequent Itemsets; Aggregation Function; Association Rule Mining



Copyright information

© Springer-Verlag Wien 2013

Authors and Affiliations

  1. Politecnico di Torino, Torino, Italy
  2. IRC@C: Institute for Cancer Research at Candiolo, Candiolo (TO), Italy
