Analyzing Online Reviews of Substance Use Disorder Treatment Facilities in the USA Using Machine Learning

INTRODUCTION

Substance use disorder (SUD) continues to have alarming impacts on morbidity and mortality.1 Quality metrics for SUD treatment remain undefined, and little is known about the patient experience of accessing and obtaining care.2 The search for treatment often begins online, where individuals research facilities and are presented with aggregated online reviews. Online reviews are neither validated nor representative, yet their volume and use are expanding. Research into online reviews of healthcare presents an opportunity to identify trends in organic, patient-driven content.3 Informing the development of SUD treatment metrics with patient perspectives is important. We analyzed online reviews of US SUD treatment facilities to identify potential drivers of patient experience.

METHODS

We matched SUD treatment facilities listed by the Substance Abuse and Mental Health Services Administration (SAMHSA) to the corresponding facilities on Yelp, an online review website. Facility records were matched by the shortest Levenshtein distance, an edit-distance measure of string similarity, computed on name and address across SAMHSA and Yelp. Matched pairs were then scored according to the Metaphone algorithm, a measure of phonetic similarity. The matches were manually reviewed and selected for analysis. We excluded non-SUD facilities (e.g., dentists), and similar to prior research, we selected facilities with at least five reviews to maximize the potential for identifying themes.4 We extracted the relative frequency of words and phrases (consisting of two or three consecutive words). We generated 25 latent Dirichlet allocation (LDA) topics using the MALLET implementation.5 LDA uses a dimension reduction procedure to identify latent topics in large quantities of text. Each topic is a list of words that tend to occur together. The distribution of LDA topics was extracted for each facility. Themes were categorized by independent research team review.
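
The record-matching and phrase-frequency steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the field names (name, address), the lowercase and whitespace normalization, the sample records, and the choice to divide each n-gram count by the number of n-grams of the same length are all hypothetical. The Metaphone scoring, manual review, and MALLET topic modeling steps are omitted.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(record: dict) -> str:
    # Assumption: name and address are joined, lowercased, and
    # whitespace-collapsed before comparison.
    return " ".join(f"{record['name']} {record['address']}".lower().split())


def best_match(samhsa_record: dict, yelp_records: list) -> dict:
    """Return the Yelp record with the shortest edit distance to the SAMHSA record."""
    key = normalize(samhsa_record)
    return min(yelp_records, key=lambda r: levenshtein(key, normalize(r)))


def ngram_relative_frequency(text: str, max_n: int = 3) -> dict:
    """Relative frequency of each 1- to max_n-word phrase in a review.
    Assumption: counts are divided by the number of n-grams of the same length."""
    tokens = text.lower().split()
    freqs = {}
    for n in range(1, max_n + 1):
        grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        for gram, count in Counter(grams).items():
            freqs[gram] = count / len(grams)
    return freqs


# Hypothetical records, for illustration only.
samhsa = {"name": "Hope Recovery Center", "address": "12 Main St"}
yelp = [
    {"name": "Hope Recovery Ctr", "address": "12 Main Street"},
    {"name": "Main Street Dental", "address": "14 Main St"},
]
match = best_match(samhsa, yelp)
freqs = ngram_relative_frequency("staff was great staff", max_n=2)
```

In practice a candidate pair chosen this way would still need a distance threshold and the manual review described above, since the closest string is not necessarily the same facility.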

Ordinary least squares regression was used to identify topics associated with facility ratings. Effect size was measured using Pearson’s r. For each topic, we identified the 10 reviews with the highest topic prevalence and assigned a theme. We applied Benjamini-Hochberg correction to p values and considered corrected p < 0.05 to indicate a meaningful correlation.
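
The effect-size and multiple-comparison steps can be illustrated with a short Python sketch. It assumes one topic-prevalence value and one rating per facility; the function names and all numbers are hypothetical, and the regression itself is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.
    Returns one boolean per input p value: True if that hypothesis is
    rejected while controlling the false discovery rate at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p value satisfies p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Hypothetical per-facility values, for illustration only.
topic_prevalence = [0.02, 0.05, 0.08, 0.11, 0.15]
mean_rating = [1.5, 2.0, 3.0, 3.5, 4.5]
r = pearson_r(topic_prevalence, mean_rating)

# Hypothetical p values, one per topic.
significant = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Because the procedure is step-up, a p value that misses its own rank threshold can still be rejected when a larger p value further down the sorted list meets its threshold.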

RESULTS

Of 13,926 SAMHSA-designated SUD facilities, we matched 1857 to Yelp listings with a total of 12,938 online reviews posted over 15 years (09/2005–07/2020); 533 of these facilities had ≥5 reviews (8149 reviews). The mean rating was 2.86 (SD 1.0). Review language correlated with positive or negative ratings is displayed in Figure 1.

Figure 1. Language associated with positive and negative online ratings of substance use disorder treatment facilities. Larger font size indicates a stronger correlation with positive or negative reviews, and darker shading indicates more frequent word usage.

We identified 16 total themes associated with positive and negative reviews (Table 1). The top five themes most correlated with a positive review included the following: long-term recovery (r = 0.66), dedicated staff (r = 0.63), dedication to patients (r = 0.56), group therapy experience (r = 0.43), and inpatient rehabilitation (r = 0.37). The top five themes most correlated with a negative review included the following: professionalism (r = −0.53), phone communication (r = −0.49), overall communication (r = −0.42), wait times in facility (r = −0.40), and management (r = −0.34).

Table 1. Statistical Insights on Differential LDA Topics Across High and Low Facility Ratings. Significance Was Measured Using Paired, Two-Tailed t Test with Benjamini-Hochberg p-correction (p < 0.05)

DISCUSSION

Using machine learning, we investigated narrative content from over 8000 online reviews of US SUD treatment facilities. Research in online reviews has grown to identify potential indicators of patient experience within hospitals, emergency departments, urgent care centers, and nursing facilities.6 This study builds upon prior work to provide a national analysis of reviews pertaining to SUD treatment.4 Consistent with prior research, we focused on facilities with ≥5 reviews in order to maximize the potential for identifying themes across facilities and avoid instances of facilities with infrequent or rare reviews. Future research will need to focus on identifying a threshold number of reviews per facility to provide more local actionable insights.

Themes within positive reviews reveal insights on topics which are difficult to quantify in structured surveys and highlight interpersonal connections such as a focus on long-term recovery or the dedication of staff. These potential drivers of positive reviews can be used to emphasize a facility’s core mission and support broader guiding principles for SUD care. Themes associated with negative reviews emphasize a lack of professionalism and provide actionable patient-driven observations to guide improvements in communication and waiting times. This study demonstrates the ability to analyze a collective “digital patient voice” and provides a patient-centered approach toward informing quality measures for SUD treatment.

Limitations include the retrospective design and the selection and responder biases inherent to online reviews. Online platforms use proprietary software to filter inappropriate, invalid, or inaccurate reviews and do not verify the identity of the individual posting a rating or review. Nonetheless, online platforms provide an unstructured, organic, and accessible venue for patients to share experiences and inform healthcare.

References

  1. Bahorik AL, Satre DD, Kline-Simon AH, Weisner CM, Campbell CI. Alcohol, Cannabis, and Opioid Use Disorders, and Disease Burden in an Integrated Health Care System. J Addict Med 2017;11(1):3–9.

  2. Garnick DW, Horgan CM, Acevedo A, McCorry F, Weisner C. Performance measures for substance use disorders – what research is needed? Addict Sci Clin Pract 2012;7(1):18.

  3. Ranard BL, Werner RM, Antanavicius T, et al. Yelp Reviews Of Hospital Care Can Supplement And Inform Traditional Surveys Of The Patient Experience Of Care. Health Aff (Millwood) 2016;35(4):697–705.

  4. Agarwal AK, Wong V, Pelullo AM, et al. Online Reviews of Specialized Drug Treatment Facilities-Identifying Potential Drivers of High and Low Patient Satisfaction. J Gen Intern Med 2019;

  5. Blei D, Ng A, Jordan M. Latent Dirichlet Allocation. 2001.

  6. Ryskina KL, Andy AU, Manges KA, Foley KA, Werner RM, Merchant RM. Association of Online Consumer Reviews of Skilled Nursing Facilities With Patient Rehospitalization Rates. JAMA Netw Open 2020;3(5):e204682.

Funding

Funding provided by NIH NIDA 1R21DA050761.

Author information

Corresponding author

Correspondence to Anish K. Agarwal MD, MPH, MS.

Ethics declarations

Conflict of Interest

The authors declare that they do not have a conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Agarwal, A.K., Guntuku, S.C., Meisel, Z.F. et al. Analyzing Online Reviews of Substance Use Disorder Treatment Facilities in the USA Using Machine Learning. J GEN INTERN MED 37, 977–980 (2022). https://doi.org/10.1007/s11606-021-06618-7

  • DOI: https://doi.org/10.1007/s11606-021-06618-7