Mining Climate Change Awareness on Twitter: A PageRank Network Analysis Method

  • Ahmed Abdeen Hamed
  • Asim Zia
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9155)

Abstract

Climate change is one of the greatest unbalancing forces affecting our planet this century. Mining public awareness is an essential step towards assessing current climate policies, dedicating sufficient resources, and constructing new policies for business planning. In this paper, we present an exploratory data mining method that compares two types of networks. The first type is constructed from a set of words collected from a Climate Change corpus, which we treat as ground truth (i.e., the base of comparison). The other type of network is constructed from a reasonably large data set of 72 million tweets; it is used to analyze the public awareness of climate change on Twitter.

The results show that the social language used on Twitter is more complex than single-word expressions. While the term climate and the hashtag (#climate) ranked lower, complex terms such as (“Climate Change”) and (“Climate Engineering”) were more dominant when used as hashtags. More interestingly, we found that the (#ClimateChange) hashtag is the top-ranked term, among all other features, used on Twitter to signal familiarity with climate issues. This is indeed striking evidence that demonstrates a great deal of awareness and provides hope for a better future in dealing with Climate Change issues.
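The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the toy tweets, the keyword set, and the helper names `bigram_edges` and `keyword_hashtag_edges` are all hypothetical, and linking each keyword to every hashtag in the same tweet is an assumed reading of a keyword-hashtag (K-H) network. The `pagerank` function is a plain power-iteration version of the standard algorithm with a damping factor of 0.85.

```python
import re
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over a directed edge list [(src, dst), ...].

    Repeated edges act as weights because they appear more than once
    in the source node's out-link list.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


def bigram_edges(text):
    """Link each word to the word that follows it (a simple bigram network)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return list(zip(tokens, tokens[1:]))


def keyword_hashtag_edges(tweets, keywords):
    """Link each keyword to every hashtag in the same tweet, in both directions.

    Assumption: this co-occurrence rule stands in for the paper's K-H network.
    """
    edges = []
    for tweet in tweets:
        lowered = tweet.lower()
        hashtags = set(re.findall(r"#\w+", lowered))
        # Remove hashtags before tokenizing so "#climate" is not counted as "climate".
        words = set(re.findall(r"[a-z]+", re.sub(r"#\w+", " ", lowered)))
        for keyword in sorted(words & keywords):
            for hashtag in sorted(hashtags):
                edges.append((keyword, hashtag))
                edges.append((hashtag, keyword))
    return edges


# Hypothetical toy data; the paper uses a climate corpus and 72 million tweets.
corpus = "climate change affects climate policy and climate change drives adaptation policy"
tweets = [
    "Act now on climate #ClimateChange",
    "New climate policy announced #ClimateChange #climate",
    "Engineering the climate? #ClimateEngineering",
    "Warming oceans #ClimateChange",
]
keywords = {"climate", "policy", "warming", "engineering"}

bigram_rank = pagerank(bigram_edges(corpus))
kh_rank = pagerank(keyword_hashtag_edges(tweets, keywords))

# Print the three highest-ranked terms of each network for a side-by-side look.
for name, rank in (("bigram", bigram_rank), ("keyword-hashtag", kh_rank)):
    top = sorted(rank.items(), key=lambda item: item[1], reverse=True)[:3]
    print(name, [(term, round(score, 3)) for term, score in top])
```

Ranking both networks with the same function keeps the scores comparable in kind, which is the point of using one network as a base of comparison for the other. On data this small the ordering carries no meaning; the sketch only shows the mechanics.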

Keywords

PageRank; Network analysis; K-H Networks; Bigrams Networks; Measurement; Awareness; Climate change; Twitter

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. EPSCoR, University of Vermont, Burlington, USA
