Background
Physician online ratings are ubiquitous and influential, but they also have their detractors. Given the lack of scientific survey methodology used in online ratings, some health systems have begun to publish their own internal patient-submitted ratings of physicians.
Objective
The purpose of this study was to compare online physician ratings with internal ratings from a large healthcare system.
Design
Retrospective cohort study comparing online ratings with internal ratings from a large healthcare system.
Setting
Kaiser Permanente, a large integrated healthcare delivery system.
Participants
Physicians in the Southern California region of Kaiser Permanente, including all specialties with ambulatory clinic visits.
Main Measures
The primary outcome measure was correlation between online physician ratings and internal ratings from the integrated healthcare delivery system.
Key Results
Of 5438 physicians who met inclusion and exclusion criteria, 4191 (77.1%) were rated both online and internally. The online ratings were based on a mean of 3.5 patient reviews, while the internal ratings were based on a mean of 119 survey returns. The overall correlation between the online and internal ratings was weak (Spearman’s rho 0.23), but increased with the number of reviews used to formulate each online rating.
Conclusions
Physician online ratings did not correlate well with internal ratings from a large integrated healthcare delivery system, although the correlation increased with the number of reviews used to formulate each online rating. Given that many consumers are not aware of the statistical issues associated with small sample sizes, we would recommend that online rating websites refrain from displaying a physician’s rating until the sample size is sufficiently large (for example, at least 15 patient reviews). However, hospitals and health systems may be able to provide better information for patients by publishing the internal ratings of their physicians.
Websites providing user-submitted ratings of goods and services have become ubiquitous and are commonly used by consumers to inform their decision-making. This trend has also spread to healthcare,1,2,3,4,5,6,7,8,9,10,11,12,13,14 as there are now dozens of publicly available websites dedicated to providing online ratings of physicians.11 In a recent survey, 65% of individuals reported being aware of physician rating websites, and 35% reported seeking online physician reviews within the past year.15 Among those who sought physician ratings information online, 35% reported selecting a physician based on good ratings and 37% reported avoiding a physician with bad ratings.15
Reaction to physician rating websites has been mixed. Proponents argue that broad access to patient-submitted ratings promotes transparency, empowers patients to make informed decisions, and provides an impetus for underperforming physicians to improve.16 Detractors, on the other hand, have expressed concern about the websites serving as a forum for disgruntled patients, with the potential to defame physicians and cause them psychological harm.16, 17 Since the websites lack a mechanism of validating reviewer identity, there is also the possibility that multiple reviews could be left by the same individual or that reviews could be submitted by individuals posing as patients, or even by physicians themselves.18 Critics also note that the online ratings have not been found to reflect objective measures of clinical quality.19, 20
Concerns have also been raised regarding the validity of the online ratings, as the sample size used to formulate each rating is often small.3, 5, 8, 21 Detractors also point out that the individuals who post online reviews may not be representative of the population as a whole.16
While the companies that publish these ratings online tend to be the best known, they are not the only entities seeking to assess patient ratings of physicians. Many hospitals, health maintenance organizations (HMOs), and health systems also conduct surveys of patients—for example, via the Press Ganey Medical Practice Survey22 or the Consumer Assessment of Healthcare Providers and Systems Clinician and Group (CG-CAHPS) Survey23—and use the data to generate their own internal ratings of their physicians. Given that these internal ratings are based on validated survey methodologies and include a large number of responses solicited from a broad and random sample of the patient population, it is possible that they could provide a better estimate of each physician’s actual patient satisfaction rating. While some prior reports have sought to assess the correlation between online and internal physician ratings, these have primarily been small-scale studies of academic physicians that were restricted to a limited number of websites24 or a single subspecialty,25 institution,26 or department.27
In this study, we sought to determine the extent to which publicly available online ratings of physicians, which are typically based on a small number of unsolicited reviews, correlate with internal patient-submitted ratings from a large integrated healthcare delivery system. We also examined the association between physician years in practice and patient-submitted rating separately for the online and internal ratings, as younger individuals have been shown to be more likely to leave online reviews,28 and more likely to prefer younger doctors.29
Study Population and Selection Criteria
Kaiser Permanente, the nation’s largest HMO, operates in 8 regions throughout the USA. Southern California is the largest such region, with over 4.5 million members as of 2018.30 For this study, all partnered physicians in the Southern California region of Kaiser Permanente were included (n = 6656), including all specialties with ambulatory clinic visits. Physicians were excluded if demographic information was not available (n = 250), if the physician’s age was over 65 (a mandatory retirement age in the Southern California region of Kaiser Permanente; n = 182), or if a valid internal patient-submitted rating was not available (n = 786; see below). Application of these inclusion and exclusion criteria yielded 5438 physicians for analysis.
Physician demographics were obtained from the healthcare system, including sex, race/ethnicity, board certification, specialty, fellowship training, and medical school graduation year. Physicians in internal medicine, family medicine, general pediatrics, urgent care, and continuing care were categorized as “non-specialists,” while physicians in other fields of medicine were categorized as “specialists.” Years in practice was calculated for each physician as the number of years elapsed since medical school graduation, minus the average duration of each physician’s residency (e.g., 3 years for internal medicine) and, if applicable, fellowship (e.g., 2 years for infectious disease). Years in practice was categorized as 10 or fewer, 11–19, or 20 or more. Since only partnered physicians were included in this study and it typically takes 3 years to achieve partnership in the Kaiser Permanente system, the 10-or-fewer group was primarily composed of physicians with between 3 and 10 years in practice.
Kaiser Permanente routinely solicits feedback on each physician from the patients that he or she sees in clinic. This feedback is collected via a 27-question Kaiser Permanente–specific survey which is sent via mail as well as e-mail to a random sampling of patients following the clinic appointment. Survey completion is voluntary, and all responses are confidential.
For each physician, the internal patient-submitted rating is based on responses to the question, “How would you rate this doctor or health care provider?” Responses to this question are on a 10-point scale ranging from 1 (“Worst doctor or health care provider possible”) to 10 (“Best doctor or health care provider possible”). The internal rating is then calculated as the number of respondents who select 9 or 10 for this question, divided by the total number of responses. (For example, a physician with 92% of responses to this question in the 9–10 range would have an internal patient-submitted rating of 92.0.) In the Southern California region of Kaiser Permanente, internal ratings are considered valid once they are based on 30 or more responses. For the purposes of this study, internal ratings of less than 80.0 were considered to be “low.” The internal ratings utilized in this analysis were based on survey responses received between July 2013 and June 2015, during which time the response rate was approximately 18%.
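The "top-box" calculation described above can be sketched in a few lines of Python. This is an illustration only, with a hypothetical response list; it is not the health system's actual code.

```python
def internal_rating(responses):
    """Top-box internal rating: the percentage of 1-10 responses that are 9 or 10.

    Returns None when fewer than 30 responses are available, mirroring the
    validity threshold described above. Input data are hypothetical.
    """
    if len(responses) < 30:
        return None  # rating not yet considered valid
    top_box = sum(1 for r in responses if r >= 9)
    return round(100.0 * top_box / len(responses), 1)

# Example: 46 of 50 hypothetical respondents answered 9 or 10
responses = [9] * 30 + [10] * 16 + [7] * 4
print(internal_rating(responses))  # 92.0
```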
To find each physician’s online rating(s), web-based searches were conducted during the months of June and July 2016 by 3 individuals who were blinded to the internal ratings (NRU, SYMS, and KCX). These searches were performed via the Google search engine utilizing each physician’s first, middle (if available), and last names as well as the degree (“MD”) and state (“CA”). Matching was performed on the basis of this information, as well as each physician’s medical specialty and practice location. All online ratings identified within the first 20 search results were recorded. For each online rating, we recorded the name of the website, the rating, and the number of patient reviews used to formulate the rating. For each physician, the overall online rating was then calculated as the average of all online ratings found, weighted by the number of reviews used to formulate each rating. All ratings were on a 5-point scale, and ratings below 3 were considered to be “low.”
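The review-count-weighted average used to compute each physician's overall online rating can be sketched as follows; the function name and the per-site figures in the example are hypothetical.

```python
def overall_online_rating(site_ratings):
    """Review-count-weighted average of a physician's ratings across websites.

    site_ratings: list of (rating_on_5_point_scale, number_of_reviews) pairs,
    one per website on which the physician was rated. Data are hypothetical.
    """
    total_reviews = sum(n for _, n in site_ratings)
    if total_reviews == 0:
        return None  # physician not rated online
    weighted = sum(rating * n for rating, n in site_ratings) / total_reviews
    return round(weighted, 1)

# A physician rated 4.5 (4 reviews) on one site and 3.0 (2 reviews) on another:
print(overall_online_rating([(4.5, 4), (3.0, 2)]))  # 4.0
```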
To analyze the correlation between the online and internal ratings, the Spearman rank correlation was calculated. To evaluate the association between years in practice and physician rating, multivariable logistic regression was performed separately for the online and internal ratings, in both instances controlling for physician sex, race/ethnicity, board certification, and specialty (specialist or not). All tests were two-sided, and a p value of < 0.05 was considered significant. Statistical analysis was performed by two members of the research team (K.O. and C.Y.K.) using SAS (version 9; SAS Institute, Cary, NC) and SPSS (version 22; IBM Corporation, Armonk, NY).
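For readers who wish to reproduce this type of analysis, the primary statistic (Spearman rank correlation: the Pearson correlation of the paired ratings' ranks) can be computed as in the sketch below. The data shown are synthetic, and the study itself used SAS and SPSS rather than Python.

```python
def ranks(xs):
    """1-based average ranks, assigning midranks to ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    rk = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1  # extend over a block of tied values
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            rk[order[k]] = midrank
        i = j + 1
    return rk

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Synthetic paired ratings (online on a 5-point scale, internal on 0-100):
online = [4.8, 3.1, 4.5, 2.9, 4.9]
internal = [91.0, 82.0, 88.0, 84.0, 93.0]
print(round(spearman_rho(online, internal), 2))  # 0.9
```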
There was no external source of funding for this study.
All elements of the study were approved by the Kaiser Permanente Institutional Review Board (IRB).
There were 5438 physicians who met all inclusion and exclusion criteria. Forty percent (2195/5438) were female, and the vast majority were board certified (96.0%; 5221/5438) (Table 1). The mean internal rating was 88.0 (median 89.0, standard deviation 6.73). The internal ratings were based on a mean of 119 survey returns over the 2-year period (median 117, standard deviation 45, range 30–342).
Approximately three-quarters of physicians (77.1%; 4191/5438) were found to be rated online, on websites including Vitals.com (n = 3399), Healthgrades.com (n = 2647), UCompareHealthCare.com (n = 1399), RateMDs.com (n = 315), and WebMD.com (n = 285). One thousand five hundred fifty-five physicians were found to be rated on a single website, 1504 on two, 887 on three, 220 on four, 24 on five, and 1 on six. Of physicians who were found to be rated online, the mean online rating was 4.1 out of 5 (median 4.8, standard deviation 1.2). The online ratings were based on a mean of 3.5 reviews (median 2, standard deviation 10.0).
The correlation between the overall online rating and the internal rating was weak (Spearman’s rho 0.23). This correlation increased as the number of reviews used to formulate each online rating increased, reaching 0.42 for online ratings which were based on 15 or more reviews (Table 2). The correlation coefficients for the individual websites ranged from 0.16 to 0.30 (Table 3).
Of 4191 physicians found to be rated online, 22.2% (929/4191) were found to have one or more low online ratings (< 3.0 out of 5). The mean number of reviews used to formulate these low online ratings was 2.8 (median 1, standard deviation 3.7). Of physicians with any low online rating, 81.5% (757/929) had a positive internal rating (≥ 80.0). Of physicians with a low internal rating (< 80.0), 63.6% had no low online ratings (< 3.0 out of 5). In this categorical comparison, there was also poor agreement between the online and internal ratings (κ = 0.113, p < 0.001).
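The kappa statistic for this kind of 2 × 2 comparison (low vs. not-low, online vs. internal) is Cohen's kappa: observed agreement corrected for the agreement expected by chance. The sketch below illustrates the calculation with hypothetical counts, not the study's data.

```python
def cohens_kappa(table):
    """Cohen's kappa for a 2x2 agreement table.

    table[i][j]: count of physicians in online category i and internal
    category j (0 = low rating, 1 = not low). Counts are hypothetical.
    """
    n = sum(sum(row) for row in table)
    p_observed = (table[0][0] + table[1][1]) / n
    # Expected agreement under independence: sum over categories of
    # (row proportion) * (column proportion)
    p_expected = sum(
        (sum(table[i]) / n) * ((table[0][i] + table[1][i]) / n)
        for i in range(2)
    )
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical counts in which low ratings rarely coincide across the two
# systems, yielding weak agreement despite high raw percent agreement:
print(round(cohens_kappa([[60, 400], [340, 3400]]), 3))
```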
Among the internal ratings, physician years in practice was not found to be associated with the likelihood of low ratings. Among the online ratings, however, increasing physician years in practice were found to be associated with a greater likelihood of any low online rating, as well as a low overall online rating (weighted average; Table 4).
Physician online ratings are ubiquitous and influential,15 but they also have their detractors. Critics point out that the online ratings may not accurately reflect patient satisfaction given that the number of reviews used to formulate each rating is often small. In four recent studies on the topic, for example, the average number of reviews used to calculate each physician’s online rating was 2.4,5 3.2,8 2.7,3 and 4.5,21 with nearly half of the ratings based on a single review.8
In our study, the online ratings were based on an average of 3.5 reviews (median 2), and correlation with the internal ratings (which were based on an average of 119 survey returns) was weak. This correlation increased with the number of reviews used to calculate each online rating, reaching 0.42 (“moderate”) for ratings which were based on 15 or more reviews.
While some prior reports have sought to assess the correlation between online and internal physician ratings, these have primarily been small-scale studies of academic physicians that were restricted to a limited number of websites or a single subspecialty, institution, or department. Ryan and colleagues examined online ratings from Vitals.com and Healthgrades.com for 16 otolaryngologists at a single academic medical center and did not find a significant correlation with results from the Press Ganey Medical Practice Survey.27 Chen and associates conducted a similar study among 200 faculty members at the University of Utah and documented similar results.24 Widmer and colleagues studied physicians at the Mayo Clinic, matching 113 physicians with negative online reviews to 113 physicians without negative reviews, and found similar Press Ganey scores in the two groups.26 Ricciardi and colleagues compared publicly available internal ratings with online ratings for 415 orthopaedic surgeons, although no overall correlation was calculated.25
Critics of online physician ratings also point out that the individuals who choose to submit reviews online may not be representative of the general population. For example, prior research has suggested that younger individuals may be more likely to leave online reviews.28 Since younger individuals may also be more likely to prefer younger doctors,29 we sought to examine the association between physician years in practice and patient-submitted ratings separately for the online and internal ratings. In our study, we found that negative online ratings were significantly more common among physicians who had been in practice for a longer period of time. Among the internal ratings, however, no such association was observed. While the relationship between physician age and patient outcomes is uncertain,31, 32 it does appear that physicians with a greater number of years in practice are more likely to have a negative overall online rating than their less experienced colleagues.
While the online rating websites may have their flaws, most stakeholders agree that reviews of physicians have utility in that they enable patients to make more informed decisions regarding their care. Recently, a growing number of hospitals and health systems have begun to publish the internal patient-submitted ratings of their physicians.33 Since these internal ratings are based on a larger number of reviews from a broad and random sample of the patient population, they may provide better information on physicians for patients and consumers. In addition, such a strategy may allow hospitals to improve the perception of their own physicians who have been rated negatively online. In our study, for example, over 80% of physicians with a low online rating were found to have a positive rating internally.
The results of this investigation should be considered in light of our study design. We compared online and internal ratings for a large number of community-based physicians from a wide variety of facilities and specialties, which may increase the generalizability of our findings. Some may question whether patients would feel comfortable rating physicians honestly in surveys returned to the integrated healthcare delivery system, even with assurances of confidentiality. However, the percentage of physicians with low ratings internally (12.6%; 683/5438) was similar to the percentage of physicians who had a low overall rating online (13.1%; 549/4191). Since we do not have demographic data on the patients whose reviews were used to generate the internal Kaiser Permanente ratings, the degree to which these individuals are representative of the patient population cannot be assessed. Because the integrated healthcare delivery system surveys patients randomly after a clinic visit, patients who attend clinic visits more frequently may be oversampled. However, this may still represent an improvement over the online ratings, which do not feature a formal sampling methodology. Finally, it is possible that some online ratings could have been missed by our search strategy. However, the strategy utilized in our study mimics the one commonly used by patients when seeking information on physicians online.
In summary, online physician ratings do not correlate well with internal ratings from a large integrated healthcare delivery system. The correlation between the online and internal ratings increased with the number of reviews used to formulate each online rating, however, suggesting that the weak overall correlation may be related to the small sample sizes used to formulate most online ratings. Given that many patients are not aware of the statistical issues associated with small sample sizes, we would recommend that online rating websites refrain from displaying a physician’s rating until the sample size is sufficiently large (for example, at least 15 patient reviews). However, hospitals and health systems may be able to provide better information for patients by publishing the internal ratings of their physicians.
Atkinson S. Current status of online rating of Australian doctors. Aust J Prim Health. 2014;20(3):222–3.
Bakhsh W, Mesfin A. Online ratings of orthopedic surgeons: analysis of 2185 reviews. Am J Orthop (Belle Mead NJ). 2014;43(8):359–63.
Black EW, Thompson LA, Saliba H, Dawson K, Black NM. An analysis of healthcare providers’ online ratings. Inform Prim Care. 2009;17(4):249–53.
Detz A, Lopez A, Sarkar U. Long-term doctor-patient relationships: patient perspective from online reviews. J Med Internet Res. 2013;15(7):e131.
Ellimoottil C, Hart A, Greco K, Quek ML, Farooq A. Online reviews of 500 urologists. J Urol. 2013;189(6):2269–73.
Emmert M, Meier F. An analysis of online evaluations on a physician rating website: evidence from a German public reporting instrument. J Med Internet Res. 2013;15(8):e157.
Emmert M, Meier F, Heider AK, Durr C, Sander U. What do patients say about their physicians? an analysis of 3000 narrative comments posted on a German physician rating website. Health Policy. 2014;118(1):66–73.
Gao GG, McCullough JS, Agarwal R, Jha AK. A changing landscape of physician quality reporting: analysis of patients’ online ratings of their physicians over a 5-year period. J Med Internet Res. 2012;14(1):e38.
Kadry B, Chu LF, Kadry B, Gammas D, Macario A. Analysis of 4999 online physician ratings indicates that most patients give physicians a favorable rating. J Med Internet Res. 2011;13(4):e95.
Lopez A, Detz A, Ratanawongsa N, Sarkar U. What patients say about their doctors online: a qualitative content analysis. J Gen Intern Med. 2012;27(6):685–92.
Merrell JG, Levy BH, 3rd, Johnson DA. Patient assessments and online ratings of quality care: a “wake-up call” for providers. Am J Gastroenterol. 2013;108(11):1676–85.
Sabin JE. Physician-rating websites. Virtual Mentor. 2013;15(11):932–6.
Segal J, Sacopulos M, Sheets V, Thurston I, Brooks K, Puccia R. Online doctor reviews: do they track surgeon volume, a proxy for quality of care? J Med Internet Res. 2012;14(2):e50.
Wallace BC, Paul MJ, Sarkar U, Trikalinos TA, Dredze M. A large-scale quantitative analysis of latent factors and sentiment in online doctor reviews. J Am Med Inform Assoc. 2014;21(6):1098–103.
Hanauer DA, Zheng K, Singer DC, Gebremariam A, Davis MM. Public awareness, perception, and use of online physician rating sites. JAMA. 2014;311(7):734–5.
Strech D. Ethical principles for physician rating sites. J Med Internet Res. 2011;13(4):e113.
Segal J, Sacopulos MJ, Rivera DJ. Legal remedies for online defamation of physicians. J Leg Med. 2009;30(3):349–88.
Emmert M, Sander U, Pisch F. Eight questions about physician-rating websites: a systematic review. J Med Internet Res. 2013;15(2):e24.
Daskivich TJ, Houman J, Fuller G, Black JT, Kim HL, Spiegel B. Online physician ratings fail to predict actual performance on measures of quality, value, and peer review. J Am Med Inform Assoc. 2018;25(4):401–407.
Okike K, Peter-Bibb TK, Xie KC, Okike ON. Association Between Physician Online Rating and Quality of Care. J Med Internet Res. 2016;18(12):e324.
Randhawa S, Viqar A, Strother J, et al. How Do Patients Rate Their Radiation Oncologists in the Modern Era: An Analysis of Vitals.com. Cureus. 2018;10(9):e3312.
Press Ganey Associates Inc. “Patient Experience - Patient-Centered Care.” Available online at: http://www.pressganey.com/solutions/patient-experience. Last accessed 26 June 2019.
Agency for Healthcare Research and Quality. “CAHPS Clinician & Group Survey.” Available online at: https://www.ahrq.gov/cahps/surveys-guidance/cg/index.html. Last accessed 26 June 2019.
Chen J, Presson A, Zhang C, Ray D, Finlayson S, Glasgow R. Online physician review websites poorly correlate to a validated metric of patient satisfaction. J Surg Res. 2018;227:1–6.
Ricciardi BF, Waddell BS, Nodzo SR, et al. Provider-Initiated Patient Satisfaction Reporting Yields Improved Physician Ratings Relative to Online Rating Websites. Orthopedics. 2017;40(5):304–310.
Widmer RJ, Maurer MJ, Nayar VR, et al. Online Physician Reviews Do Not Reflect Patient Satisfaction Survey Responses. Mayo Clin Proc. 2018;93(4):453–457.
Ryan T, Specht J, Smith S, DelGaudio JM. Does the Press Ganey Survey Correlate to Online Health Grades for a Major Academic Otolaryngology Department? Otolaryngol Head Neck Surg. 2016;155(3):411–5.
Smith A, Anderson M. “Online Shopping and E-Commerce.” Pew Research Center, 2016 December. Available online at: http://assets.pewresearch.org/wp-content/uploads/sites/14/2016/12/16113209/PI_2016.12.19_Online-Shopping_FINAL.pdf. Last accessed 26 June 2019.
McKinstry B, Yang SY. Do patients care about the age of their general practitioner? A questionnaire survey in five practices. Br J Gen Pract. 1994;44(385):349–51.
Kaiser Permanente. “Southern California Fast Facts.” Kaiser Permanente. 2019 March. Available online at: https://about.kaiserpermanente.org/who-we-are/fast-facts/southern-california-fast-facts. Last accessed 26 June 2019.
Tsugawa Y, Jena AB, Orav EJ, et al. Age and sex of surgeons and mortality of older surgical patients: observational study. BMJ. 2018;361:k1343.
Tsugawa Y, Newhouse JP, Zaslavsky AM, Blumenthal DM, Jena AB. Physician age and outcomes in elderly patients in hospital in the US: observational study. BMJ. 2017;357:j1797.
Lee V. Transparency and Trust - Online Patient Reviews of Physicians. N Engl J Med. 2017;376(3):197–199.
The authors would like to thank Mimi Hugh MS MPH and the Kaiser Permanente Southern California Member Appraisal of Physician/Provider Services (MAPPS) Committee for their assistance with this study, as well as all the physicians of the Southern California Permanente Medical Group.
Conflict of Interest
The authors declare that they do not have a conflict of interest.
This study has not been previously presented.
Okike, K., Uhr, N.R., Shin, S.Y.M. et al. A Comparison of Online Physician Ratings and Internal Patient-Submitted Ratings from a Large Healthcare System. J GEN INTERN MED 34, 2575–2579 (2019). https://doi.org/10.1007/s11606-019-05265-3