Twigraph: Discovering and Visualizing Influential Words Between Twitter Profiles

  • Dhanasekar Sundararaman
  • Sudharshan Srinivasan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10540)


Social media use keeps growing, and people are connected with each other like never before, yet these vast connections remain visually unexplored. We propose Twigraph, a methodology for exploring the connections between people through their Twitter profiles. First, we propose a hybrid approach for recommending social media profiles, articles, and advertisements to a user. Profiles are recommended based on the similarity score between the user's profile and the profile under evaluation. The similarity between a set of profiles is investigated by finding the top influential words, that is, the words that cause a high similarity, through an Influence Term Metric computed for each word. Then, we group profiles from domains such as politics, sports, and entertainment by similarity score using a novel clustering algorithm. The connectivity between profiles is visualized using word graphs, which help in finding the words that connect a set of profiles and the profiles that are connected to a word. Finally, we analyze the top influential words over a set of profiles through clustering based on the similarity of those profiles, which makes it possible to break down a Twitter profile with many followers into fine-grained word connections using word graphs. The proposed method was implemented on datasets comprising 1.1 million tweets obtained from Twitter. Experimental results show that the resulting influential words were highly representative of the relationship between two profiles or a set of profiles.
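
The abstract does not define the similarity score or the Influence Term Metric, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes each profile is a bag of words weighted by TF-IDF, that similarity is cosine similarity, and that a word's "influence" on a pair of profiles is its share of that cosine score. The function names and the toy profiles are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(profiles):
    """Map each profile name to an L2-normalised {word: weight} dict.

    `profiles` maps a profile name to its list of tokens. The smoothed
    idf used here is an assumption, not taken from the paper.
    """
    n = len(profiles)
    doc_freq = Counter()
    for tokens in profiles.values():
        doc_freq.update(set(tokens))

    vectors = {}
    for name, tokens in profiles.items():
        if not tokens:
            vectors[name] = {}
            continue
        counts = Counter(tokens)
        vec = {
            w: (c / len(tokens)) * (math.log((1 + n) / (1 + doc_freq[w])) + 1)
            for w, c in counts.items()
        }
        norm = math.sqrt(sum(v * v for v in vec.values()))
        vectors[name] = {w: v / norm for w, v in vec.items()}
    return vectors


def word_contributions(vec_a, vec_b):
    """Per-word terms of the dot product; they sum to the cosine similarity."""
    return {w: vec_a[w] * vec_b[w] for w in vec_a.keys() & vec_b.keys()}


def similarity(vec_a, vec_b):
    """Cosine similarity of two already-normalised vectors."""
    return sum(word_contributions(vec_a, vec_b).values())


def top_influential_words(vec_a, vec_b, k=3):
    """Shared words ranked by how much they add to the similarity (assumed metric)."""
    contrib = word_contributions(vec_a, vec_b)
    return sorted(contrib, key=contrib.get, reverse=True)[:k]


# Hypothetical toy profiles; real input would be tokenised tweets.
profiles = {
    "politician_a": "election vote policy election campaign".split(),
    "politician_b": "election policy vote debate".split(),
    "athlete": "match goal team win match".split(),
}
vectors = tfidf_vectors(profiles)
print(similarity(vectors["politician_a"], vectors["politician_b"]))
print(top_influential_words(vectors["politician_a"], vectors["politician_b"], k=1))
```

Under these assumptions, an edge list for a word graph could be built by linking each top-ranked word to the profiles that contain it; how the paper actually constructs its graphs and clusters is not stated in the abstract.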


Twitter, Clustering, Profile modeling, Profile similarity, Multiple profiles connectivity



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. SSN College of Engineering, Chennai, India
  2. SRM University, Chennai, India
