Natural Language Processing for Social Event Classification

  • Duc-Duy Nguyen
  • Minh-Son Dao
  • Truc-Vien T. Nguyen
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 326)


This paper introduces a simple but effective method for social event detection based mainly on natural language processing. Whereas existing approaches apply typical text-classification methods and disregard language characteristics, the proposed method exploits those characteristics in the text items of social metadata (e.g. title, description and tags) to improve social event detection. First, we analyze the specific characteristics of natural language in social media in order to choose the most suitable features. Second, we employ common natural language processing techniques together with machine learning methods to extract features and perform classification. As a result, we obtained an F1 score higher than those reported in related works that used state-of-the-art methods. The proposed method demonstrates the importance of understanding language characteristics when building social event classification programs. It also offers useful clues for improving existing work on social event detection.
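As a rough illustration of the two ingredients the abstract mentions (text features drawn from title, description and tags, and evaluation by F1 score), the following minimal sketch may help. It is not the authors' implementation: the field-prefixed bag-of-words features, the `metadata_features` and `f1_score` helper names, and the dictionary layout of an item are all assumptions made for this example.

```python
from collections import Counter


def metadata_features(item):
    """Count lowercase tokens per metadata field (hypothetical feature scheme).

    `item` is assumed to be a dict with optional "title", "description"
    and "tags" entries; tags may be a list of strings.
    """
    feats = Counter()
    for field in ("title", "description", "tags"):
        value = item.get(field, "")
        # Join list-valued fields (e.g. tags) into one string before splitting.
        text = " ".join(value) if isinstance(value, (list, tuple)) else value
        for token in text.lower().split():
            # Prefix with the field name so "rock" in a title and "rock"
            # in a tag remain distinct features.
            feats[f"{field}:{token}"] += 1
    return feats


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For instance, `metadata_features({"title": "Rock Concert", "tags": ["music", "rock"]})` yields one count each for `title:rock`, `title:concert`, `tags:music` and `tags:rock`. Feature dictionaries of this kind could then be passed to any standard classifier.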


Keywords: Social Event Classification, Social Event Detection, social network, text classification, language processing, natural language characteristics





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Duc-Duy Nguyen (1)
  • Minh-Son Dao (1)
  • Truc-Vien T. Nguyen (2)

  1. University of Information Technology, Vietnam National University, HCMC, Hanoi, Vietnam
  2. University of Lugano, Lugano, Switzerland
