Substance use disorder (SUD) continues to have alarming impacts on morbidity and mortality.1 Quality metrics for SUD treatment remain undefined, and little is known about the patient experience when accessing and obtaining care.2 The search for treatment often begins online, where individuals research facilities and are presented with aggregated online reviews. Online reviews are neither validated nor representative, yet their volume and use are expanding. Research into online reviews of healthcare presents an opportunity to identify trends in organic, patient-driven content.3 Informing the development of SUD treatment metrics with patient perspectives is important. We analyzed online reviews of US SUD treatment facilities to identify potential drivers of patient experience.


We matched SUD treatment facilities listed by the Substance Abuse and Mental Health Services Administration (SAMHSA) to corresponding facilities on Yelp, an online review website. Facility records were matched by minimizing the Levenshtein distance, a string edit-distance measure, between the name and address recorded in SAMHSA and in Yelp. Matched pairs were then scored according to the metaphone algorithm, a measure of phonetic similarity. The matches were manually reviewed and selected for analysis. We excluded non-SUD facilities (e.g., dentists) and, similar to prior research, selected facilities with at least five reviews to maximize the potential for identifying themes.4 We extracted the relative frequency of words and phrases (consisting of two or three consecutive words). We generated 25 latent Dirichlet allocation (LDA) topics using the MALLET implementation.5 LDA uses a dimension reduction procedure to identify latent topics in large quantities of text. Each topic is represented by a list of words that cluster together. The distribution of LDA topics was extracted for each facility. Themes were categorized by independent research team review.
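As an illustration only, the record matching and phrase extraction steps described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the function names, the name-plus-address normalization, and the choice to normalize each n-gram count by the number of n-grams of the same length are all hypothetical, and the phonetic (metaphone) scoring and manual review steps are omitted.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn one string into the other."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # drop ch_a
                               current[j - 1] + 1,       # add ch_b
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def normalize(name: str, address: str) -> str:
    """Hypothetical normalization: lowercase and collapse whitespace."""
    return " ".join(f"{name} {address}".lower().split())


def best_match(samhsa_record, yelp_records):
    """Return (yelp_record, distance) for the candidate with the smallest
    edit distance to the SAMHSA (name, address) pair."""
    key = normalize(*samhsa_record)
    scored = [(levenshtein(key, normalize(*rec)), rec) for rec in yelp_records]
    distance, record = min(scored, key=lambda pair: pair[0])
    return record, distance


def ngram_relative_frequencies(text: str, max_n: int = 3) -> dict:
    """Relative frequency of words (n=1) and phrases of two or three
    consecutive words. Assumption: each count is divided by the total
    number of n-grams of the same length in the text."""
    tokens = text.lower().split()
    frequencies = {}
    for n in range(1, max_n + 1):
        grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        for gram, count in Counter(grams).items():
            frequencies[gram] = count / len(grams)
    return frequencies
```

For example, `best_match(("Hope Recovery Center", "12 Main St"), candidates)` returns the candidate whose combined name and address is closest in edit distance. In practice a distance threshold and a manual check, as the study describes, would still be needed to reject poor matches.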

Ordinary least squares regression was used to identify topics associated with facility ratings. Effect size was measured using Pearson’s r. For each topic, we identified the 10 reviews with the highest topic prevalence and assigned themes. We applied the Benjamini-Hochberg p-correction and considered p < 0.05 to indicate a meaningful correlation.
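The effect size and multiple-comparison steps can be sketched in a few lines. This is a hedged illustration rather than the study's code: the function names are hypothetical, the p-values are assumed to have been computed elsewhere (one per topic), and the correction shown is the standard Benjamini-Hochberg step-up procedure at a false discovery rate of 0.05. With a single predictor, Pearson's r equals the standardized slope of the ordinary least squares fit, which is why it can serve as the effect size for a per-topic regression.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between topic prevalence x and facility rating y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per input p-value: True if the corresponding
    hypothesis is rejected while controlling the false discovery rate
    at alpha (step-up procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected
```

Note that the step-up rule can reject a hypothesis whose own p-value exceeds its rank threshold, provided a larger p-value further down the ranking meets its threshold.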


Of 13,926 SAMHSA-designated SUD facilities, 1857 were matched to 12,938 online reviews posted over 15 years (09/2005–07/2020); 533 facilities had ≥5 reviews (8149 reviews). The mean rating was 2.86 (SD 1.0). Review language correlated with positive or negative ratings is displayed in Figure 1.

Figure 1

Language associated with positive and negative online ratings of substance use disorder treatment facilities. Larger font size represents a stronger correlation with a positive or negative review, and darker shading represents more frequent word usage.

We identified 16 total themes associated with positive and negative reviews (Table 1). The top five themes most correlated with a positive review included the following: long-term recovery (r = 0.66), dedicated staff (r = 0.63), dedication to patients (r = 0.56), group therapy experience (r = 0.43), and inpatient rehabilitation (r = 0.37). The top five themes most correlated with a negative review included the following: professionalism (r = −0.53), phone communication (r = −0.49), overall communication (r = −0.42), wait times in facility (r = −0.40), and management (r = −0.34).

Table 1 Statistical Insights on Differential LDA Topics Across High and Low Facility Ratings. Significance Was Measured Using Paired, Two-Tailed t Test with Benjamini-Hochberg p-correction (p < 0.05)


Using machine learning, we investigated narrative content from over 8000 online reviews of US SUD treatment facilities. Research in online reviews has grown to identify potential indicators of patient experience within hospitals, emergency departments, urgent care centers, and nursing facilities.6 This study builds upon prior work to provide a national analysis of reviews pertaining to SUD treatment.4 Consistent with prior research, we focused on facilities with ≥5 reviews in order to maximize the potential for identifying themes across facilities and avoid instances of facilities with infrequent or rare reviews. Future research will need to focus on identifying a threshold number of reviews per facility to provide more local actionable insights.

Themes within positive reviews reveal insights on topics which are difficult to quantify in structured surveys and highlight interpersonal connections such as a focus on long-term recovery or the dedication of staff. These potential drivers of positive reviews can be used to emphasize a facility’s core mission and support broader guiding principles for SUD care. Themes associated with negative reviews emphasize a lack of professionalism and provide actionable patient-driven observations to guide improvements in communication and waiting times. This study demonstrates the ability to analyze a collective “digital patient voice” and provides a patient-centered approach toward informing quality measures for SUD treatment.

Limitations include the retrospective design and the selection and responder biases inherent to online reviews. Online platforms use proprietary software to filter inappropriate, invalid, or inaccurate reviews and do not verify the identity of the individual posting a rating or review. Nonetheless, online platforms provide an unstructured, organic, and accessible venue for patients to share experiences and inform healthcare.