Understanding Behavioral Differences Between Short and Long-Term Drinking Abstainers from Social Media
Alcohol consumption imposes a high cost on society. The journey from regular drinker to successful quitter can be long and hard, fraught with the risk of relapse. Research has shown that certain behavioral changes can help people stay abstinent. The traditional way to study drinking abstainers uses a questionnaire-based approach to collect data from a curated group of people. However, this approach is expensive in both cost and time, and it often yields small datasets with limited diversity. Recently, social media has emerged as a rich data source. Reddit is one such platform, and it hosts a community (‘subreddit’) dedicated to quitting drinking. The discussions in this community date back to 2011 and comprise more than 40,000 posts. This large-scale data is generated by the users themselves and is not constrained by any survey questionnaire. We identify the most predictive factors associated with short-term and long-term abstinence from three feature sets (unigrams, topics and LIWC) using Lasso. Many common patterns manifest across unigrams, topics and LIWC. Whilst topics provide much richer associations between a group of words and the outcome, unigrams and LIWC are good at finding highly predictive individual words and psycholinguistically important words. Combining them, we find many interesting patterns associated with successful attempts by long-term abstainers, as well as many of the common issues faced during the initial period of abstinence.
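To make the feature-selection step concrete, the following is a minimal sketch of Lasso applied to unigram counts. It is an illustration under stated assumptions, not the authors' pipeline: the six posts and their labels are invented toy data, the helper names (`unigram_matrix`, `soft_threshold`, `lasso`) and the penalty value `lam=0.2` are hypothetical, and the least-squares form of the Lasso solved by coordinate descent is assumed because the abstract does not state the exact model or solver.

```python
from collections import Counter


def unigram_matrix(posts):
    """Return (vocabulary, rows), where rows[i][j] counts word j in post i."""
    counts = [Counter(post.lower().split()) for post in posts]
    vocab = sorted({word for c in counts for word in c})
    rows = [[float(c[word]) for word in vocab] for c in counts]
    return vocab, rows


def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso(X, y, lam, n_iter=200):
    """Minimise (1/2n) * ||y - b0 - X b||^2 + lam * ||b||_1.

    Solved by cyclic coordinate descent; the intercept b0 is not penalised.
    """
    n, p = len(X), len(X[0])
    b0 = sum(y) / n
    b = [0.0] * p
    resid = [y[i] - b0 for i in range(n)]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        # Re-centre the intercept on the current residuals.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b0, b


# Invented toy posts (NOT from the subreddit); 0 = short-term, 1 = long-term.
posts = [
    "day three craving beer tonight",
    "first week craving is hard",
    "craving again after one day",
    "one year sober and grateful",
    "two years sober feeling grateful",
    "sober for a year now",
]
labels = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

vocab, X = unigram_matrix(posts)
intercept, coefs = lasso(X, labels, lam=0.2)

# Keep only the unigrams whose coefficients were not shrunk to zero.
selected = {w: round(c, 3) for w, c in zip(vocab, coefs) if abs(c) > 1e-9}
print(selected)
```

On this toy data the penalty zeroes out every unigram except "craving" (negative weight, associated with the short-term label) and "sober" (positive weight, associated with the long-term label). This is the sparsity property that makes Lasso usable as a feature selector. The same idea would extend to topic proportions or LIWC category scores by replacing the columns of `X`.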
Keywords: Feature selection, Health promotion, Reddit, Stop drinking, Abstinence
This work is partially supported by the Telstra-Deakin Centre of Excellence in Big Data and Machine Learning.