Exploratory Analysis of Marketing and Non-marketing E-cigarette Themes on Twitter

  • Conference paper
Social Informatics (SocInfo 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10047)

Abstract

Electronic cigarettes (e-cigs) have been gaining popularity and have emerged as a controversial tobacco product since their introduction in the U.S. in 2007. The smoke-free aspect of e-cigs renders them less harmful than conventional cigarettes and is one of the main reasons for their use by people who plan to quit smoking. The U.S. Food and Drug Administration (FDA) introduced new regulations in early May 2016 that went into effect on August 8, 2016. Given this important context, we report the results of a project to identify current themes in e-cig tweets through semantic interpretation of topics generated with topic modeling. Because marketing/advertising tweets constitute almost half of all e-cig tweets, we first build a classifier that distinguishes marketing from non-marketing tweets, based on a hand-built dataset of 1000 tweets. After applying the classifier to a dataset of over a million tweets (collected from 4/2015 to 6/2016), we conduct a preliminary content analysis and run topic models on the two sets of tweets separately, choosing the number of topics for each set using topic coherence. We interpret the results of the topic modeling process by relating the generated topics to specific e-cig themes. We also report on themes identified in geo-tagged tweets from our dataset that were posted at particular places (such as schools and churches), which we identify using the GeoNames API. To our knowledge, this is the first effort that employs topic modeling to identify e-cig themes, both in general and in the context of geo-tagged tweets tied to specific places of interest.
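
As a rough illustration of choosing the number of topics by topic coherence, the sketch below scores each candidate topic with a UMass-style co-occurrence measure and keeps the candidate number of topics with the highest mean score. The UMass measure is swapped in here as one common coherence definition; the abstract does not say which measure the authors used. The function names, the toy tweets, and the candidate top-word lists are all hypothetical.

```python
import math
from itertools import combinations


def umass_coherence(top_words, docs):
    """UMass-style coherence of one topic from document co-occurrence counts."""
    doc_sets = [set(d) for d in docs]
    score = 0.0
    # Pairs (earlier, later) follow the ranking order of top_words.
    for earlier, later in combinations(top_words, 2):
        df_earlier = sum(1 for d in doc_sets if earlier in d)
        df_both = sum(1 for d in doc_sets if earlier in d and later in d)
        if df_earlier == 0:
            continue  # the word never occurs, so the pair is skipped
        score += math.log((df_both + 1) / df_earlier)
    return score


def pick_num_topics(candidates, docs):
    """Return the k whose topics have the highest mean coherence."""
    def mean_coherence(topics):
        return sum(umass_coherence(t, docs) for t in topics) / len(topics)
    return max(candidates, key=lambda k: mean_coherence(candidates[k]))


# Toy tokenized tweets, invented for illustration only.
docs = [
    ["vape", "juice", "flavor", "sale"],
    ["vape", "juice", "discount", "sale"],
    ["quit", "smoking", "health", "vape"],
    ["quit", "smoking", "health", "lungs"],
]

# Hypothetical top-word lists a topic model might return for k = 2 and k = 3.
candidates = {
    2: [["vape", "juice", "sale"], ["quit", "smoking", "health"]],
    3: [
        ["vape", "health", "discount"],
        ["juice", "lungs", "quit"],
        ["sale", "smoking", "flavor"],
    ],
}

best_k = pick_num_topics(candidates, docs)
```

In practice the candidate top-word lists would come from topic models fitted with different numbers of topics on the marketing and non-marketing tweet sets separately.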

Notes

  1. Although achieving a high F-score for the minority class is generally difficult on heavily skewed datasets, such datasets typically lend themselves to building classifiers with high overall accuracy across all classes or a high F-score for the majority class.
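
The point of this note can be checked with a small worked example. The sketch below assumes a hypothetical skewed dataset (950 majority labels, 50 minority labels) and a trivial classifier that always predicts the majority class; the labels, counts, and function names are illustrative and are not taken from the paper.

```python
def f_score(y_true, y_pred, positive):
    """F1 score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical skewed dataset: 950 majority and 50 minority examples.
y_true = ["majority"] * 950 + ["minority"] * 50
# Trivial classifier that always predicts the majority class.
y_pred = ["majority"] * 1000

acc = accuracy(y_true, y_pred)                 # 950 / 1000
f_major = f_score(y_true, y_pred, "majority")  # precision 0.95, recall 1.0
f_minor = f_score(y_true, y_pred, "minority")  # no minority example is found
```

Under these assumptions the overall accuracy is 0.95 and the majority-class F-score is about 0.974, while the minority-class F-score is 0.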


Acknowledgements

We thank anonymous reviewers for constructive criticism that helped improve the presentation of this paper. This research was supported by the National Center for Research Resources and the National Center for Advancing Translational Sciences, US National Institutes of Health (NIH), through Grant UL1TR000117 and the Kentucky Lung Cancer Research Program through Grant PO2-415-1400004000-1. The content of this paper is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Author information

Corresponding author

Correspondence to Ramakanth Kavuluru.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Han, S., Kavuluru, R. (2016). Exploratory Analysis of Marketing and Non-marketing E-cigarette Themes on Twitter. In: Spiro, E., Ahn, YY. (eds) Social Informatics. SocInfo 2016. Lecture Notes in Computer Science, vol 10047. Springer, Cham. https://doi.org/10.1007/978-3-319-47874-6_22

  • DOI: https://doi.org/10.1007/978-3-319-47874-6_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47873-9

  • Online ISBN: 978-3-319-47874-6

  • eBook Packages: Computer Science, Computer Science (R0)
