Stretching the Life of Twitter Classifiers with Time-Stamped Semantic Graphs

  • Amparo Elizabeth Cano
  • Yulan He
  • Harith Alani
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8797)

Abstract

Social media has become an effective channel for communicating both trends and public opinion on current events. However, the automatic topic classification of social media content poses various challenges. Topic classification is a common technique for automatically capturing themes that emerge from social media streams, but such techniques are sensitive to the evolution of topics when new event-dependent vocabularies start to emerge (e.g., Crimea becoming relevant to War_Conflict during the Ukraine crisis in 2014). Therefore, traditional supervised classification methods, which rely on labelled data, could rapidly become outdated. In this paper we propose a novel transfer learning approach to address the classification task of new data when the only available labelled data belong to a previous epoch. This approach relies on the incorporation of knowledge from DBpedia graphs. Our findings show promising results in understanding how features age, and how semantic features can support the evolution of topic classifiers.
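The cross-epoch problem described in the abstract can be illustrated with a toy sketch. This is not the authors' method: the tweets, the overlap-counting classifier, the `concept:` labels, and the two graph snapshots (`GRAPH_EPOCH_1`, `GRAPH_EPOCH_2`) are all hypothetical stand-ins, chosen only to show how a purely lexical model trained on an earlier epoch can miss new vocabulary such as "crimea", while concept features taken from a later knowledge-graph snapshot may recover the topic.

```python
from collections import Counter, defaultdict

# Hypothetical labelled tweets from an earlier epoch (toy data, not from the paper).
TRAIN_EPOCH_1 = [
    ("rebels attack army base", "War_Conflict"),
    ("army troops shell city", "War_Conflict"),
    ("team wins cup final", "Sports"),
    ("striker scores in final match", "Sports"),
]

# Hypothetical graph snapshots: token -> linked concepts at a given time.
# They stand in for time-stamped DBpedia graphs; the concept names are invented.
GRAPH_EPOCH_1 = {
    "army": ["concept:Military"],
    "troops": ["concept:Military"],
    "team": ["concept:Sport"],
    "striker": ["concept:Sport"],
}
# In the later snapshot, "crimea" has become linked to a military concept.
GRAPH_EPOCH_2 = dict(GRAPH_EPOCH_1, crimea=["concept:Military"])


def featurize(text, graph=None):
    """Return the tokens of a tweet, plus linked concepts if a graph is given."""
    tokens = text.lower().split()
    feats = list(tokens)
    if graph is not None:
        for tok in tokens:
            feats.extend(graph.get(tok, []))
    return feats


def train(examples, graph=None):
    """Count how often each feature occurs under each topic label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(featurize(text, graph))
    return counts


def classify(model, text, graph=None):
    """Pick the topic with the largest feature overlap; None if nothing overlaps."""
    feats = featurize(text, graph)
    scores = {label: sum(c[f] for f in feats) for label, c in model.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


NEW_TWEET = "crimea referendum tomorrow"  # vocabulary unseen in epoch 1

lexical_model = train(TRAIN_EPOCH_1)                   # words only
semantic_model = train(TRAIN_EPOCH_1, GRAPH_EPOCH_1)   # words + concepts

print(classify(lexical_model, NEW_TWEET))                  # no overlap with epoch-1 words
print(classify(semantic_model, NEW_TWEET, GRAPH_EPOCH_2))  # concept feature is shared
```

In this sketch the labelled data never changes; only the graph snapshot used at classification time is newer, which is one simple way to picture how semantic features might extend the useful life of a classifier.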

Keywords

social media, topic detection, DBpedia, concept drift, feature relevance decay


Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Amparo Elizabeth Cano (1)
  • Yulan He (2)
  • Harith Alani (1)

  1. Knowledge Media Institute, Open University, UK
  2. School of Engineering and Applied Science, Aston University, UK
