Chasing the Wrong Cloud: Mapping the 2019 Vaping Epidemic Using Data from Social Media

  • Conference paper
Social, Cultural, and Behavioral Modeling (SBP-BRiMS 2022)

Abstract

The digital trails of activity on social media are valuable for public health because they can reveal risky health behavior, but considerable methodological issues remain in using social media data. One particular source of bias is the presence of automated accounts, or social bots, whose activity may compromise predictive tasks based on social media data. In this work, we collect a corpus of public tweets about electronic vaping and combine them with data from the CDC to predict the incidence of lung injuries by state. We show that the relative volume of tweets about vaping predicts injuries only when likely bot accounts are removed; otherwise, this correlation disappears. We compare the predictive power of these data against survey-based predictions and show that our models achieve the lowest generalization error. These results highlight the importance of bot detection as a data cleaning step and the potential value of social media data in the context of public health.
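As a rough illustration of the cleaning step described in the abstract, the sketch below filters out likely bot accounts, computes each state's share of vaping tweets, and correlates that share with injury counts. It is not the authors' pipeline: the toy records, the injury counts, the 0.5 bot-score cutoff, and the helper names `relative_volume` and `pearson` are all assumptions made up for this example, and "relative volume" is taken here to mean a state's share of all retained tweets.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: one (state, bot_score) pair per tweet.
# Scores and counts are invented purely for illustration.
tweets = (
    [("A", 0.1)] * 1 + [("B", 0.1)] * 2 + [("C", 0.2)] * 3  # likely human
    + [("A", 0.9)] * 6 + [("B", 0.8)] * 1                    # likely bots
)

# Hypothetical injury counts per state (not CDC figures).
injuries = {"A": 10, "B": 20, "C": 30}

BOT_THRESHOLD = 0.5  # assumed cutoff; a real study would need to justify it


def relative_volume(records, threshold=None):
    """Share of tweets per state; drop accounts scoring >= threshold if given."""
    kept = [state for state, score in records
            if threshold is None or score < threshold]
    counts = Counter(kept)
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


states = sorted(injuries)
y = [injuries[s] for s in states]

raw = relative_volume(tweets)                    # bots included
clean = relative_volume(tweets, BOT_THRESHOLD)   # likely bots removed

r_raw = pearson([raw.get(s, 0.0) for s in states], y)
r_clean = pearson([clean.get(s, 0.0) for s in states], y)

print(f"correlation with bots:    {r_raw:.2f}")
print(f"correlation without bots: {r_clean:.2f}")
```

In this contrived example the bot activity is concentrated in the state with the fewest injuries, so the raw tweet share points in the wrong direction until the filter is applied. Real data would call for a proper regression model and out-of-sample evaluation rather than a single correlation.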

Change history

  • 13 December 2022

    In an older version of this paper, there was an error in Figure 2. This has been corrected.

Acknowledgement

The authors would like to thank Filippo Menczer and Kai-Cheng Yang for providing access to the BotometerLite API, and Hunter Morera for help with data collection and coding during the initial part of this project.

Author information

Corresponding author

Correspondence to Parush Gera.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gera, P., Ciampaglia, G.L. (2022). Chasing the Wrong Cloud: Mapping the 2019 Vaping Epidemic Using Data from Social Media. In: Thomson, R., Dancy, C., Pyke, A. (eds) Social, Cultural, and Behavioral Modeling. SBP-BRiMS 2022. Lecture Notes in Computer Science, vol 13558. Springer, Cham. https://doi.org/10.1007/978-3-031-17114-7_1

  • DOI: https://doi.org/10.1007/978-3-031-17114-7_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-17113-0

  • Online ISBN: 978-3-031-17114-7

  • eBook Packages: Computer Science, Computer Science (R0)
