Extracting Collective Trends from Twitter Using Social-Based Data Mining

  • Gema Bello
  • Héctor Menéndez
  • Shintaro Okazaki
  • David Camacho
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8083)


Social Networks have become an important environment for Collective Trends extraction. The interactions amongst users provide information about their preferences and relationships. This information can be used to measure the influence of ideas or opinions and how they spread within the Network. Currently, one of the most relevant and popular Social Networks is Twitter, which was created to share comments and opinions. The information provided by its users is especially useful in fields and research areas such as marketing. The data are presented as short text strings containing different ideas expressed by real people. With this representation, Data Mining and Text Mining techniques (such as classification and clustering) can be used for knowledge extraction, trying to distinguish the meaning of the opinions. This work focuses on analysing how these techniques can interpret these opinions within the Social Network, using information related to the IKEA® company.


Collective Trends, Social Network, Data Mining, Classification, Clustering, Twitter





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Gema Bello (1)
  • Héctor Menéndez (1)
  • Shintaro Okazaki (2)
  • David Camacho (1)
  1. Departamento de Ingeniería Informática, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid, Spain
  2. Department of Finance and Marketing Research, College of Economics and Business Administration, Universidad Autónoma de Madrid, Madrid, Spain
