Abstract
A discussion on social media is most fruitful when its participants are involved in the relevant field. Similarly, when advertising an event, it is useful to find users who are interested in its content. In social networks such as Twitter, which have a large number of users, categorizing users by their interests supports both goals. This paper presents an efficient supervised machine learning approach that categorizes Twitter users, using three types of features (tweet-based, user-based and time-series-based), into six interest categories: Politics, Entertainment, Entrepreneurship, Journalism, Science & Technology and Healthcare. We evaluate the proposed feature set with several traditional classifiers (Support Vector Machines, Naive Bayes, k-Nearest Neighbours, Decision Tree and Logistic Regression) and obtain up to 89.82% classification accuracy. We also propose a design for a real-time Twitter user profiling system, along with a prototype implementation.
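The classification setup described in the abstract can be illustrated with a minimal sketch. The abstract does not define the individual features, so the feature values, the `UserSample` type, the toy data and the function names below are all hypothetical. k-Nearest Neighbours is used only because it is one of the classifiers listed and is short enough to write with the Python standard library; this is not the authors' implementation.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple


@dataclass
class UserSample:
    # All three groups hold made-up numbers; the real feature definitions
    # are not given in the abstract.
    tweet_features: List[float]  # hypothetical tweet-based features
    user_features: List[float]   # hypothetical user-based features
    time_features: List[float]   # hypothetical time-series-based features

    def vector(self) -> List[float]:
        # Concatenate the three feature groups into one feature vector.
        return self.tweet_features + self.user_features + self.time_features


def euclidean(a: List[float], b: List[float]) -> float:
    # Straight-line distance between two equal-length vectors.
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train: List[Tuple[UserSample, str]], query: UserSample, k: int = 3) -> str:
    # Take the k labelled samples closest to the query and return the majority label.
    nearest = sorted(train, key=lambda item: euclidean(item[0].vector(), query.vector()))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train: List[Tuple[UserSample, str]], test: List[Tuple[UserSample, str]], k: int = 3) -> float:
    # Fraction of test samples whose predicted category matches the true one.
    correct = sum(knn_predict(train, sample, k) == label for sample, label in test)
    return correct / len(test)


# Toy training set covering two of the six categories, for illustration only.
TRAIN = [
    (UserSample([0.9, 0.1], [0.2], [0.5, 0.4]), "Politics"),
    (UserSample([0.8, 0.2], [0.3], [0.6, 0.5]), "Politics"),
    (UserSample([0.1, 0.9], [0.7], [0.1, 0.2]), "Healthcare"),
    (UserSample([0.2, 0.8], [0.8], [0.2, 0.1]), "Healthcare"),
]
```

Calling `knn_predict(TRAIN, some_user)` returns a category name, and `accuracy(TRAIN, held_out)` gives the kind of score that a comparison across classifiers would report for each one.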
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Raghuram, M.A., Akshay, K., Chandrasekaran, K. (2016). Efficient User Profiling in Twitter Social Network Using Traditional Classifiers. In: Berretti, S., Thampi, S., Dasgupta, S. (eds) Intelligent Systems Technologies and Applications. Advances in Intelligent Systems and Computing, vol 385. Springer, Cham. https://doi.org/10.1007/978-3-319-23258-4_35
DOI: https://doi.org/10.1007/978-3-319-23258-4_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23257-7
Online ISBN: 978-3-319-23258-4
eBook Packages: Engineering, Engineering (R0)