The present study was motivated by a desire to explore the usefulness of user-generated online content for public opinion research. While the amount of information shared by individuals online has increased dramatically, these data on personal experiences, opinions, and attitudes have been generally underutilized. Our second aim was to determine the level of correspondence between the results obtained from the analysis of UGC and those obtained from conventional survey data. Finally, the third purpose of this study was to provide a practical example of tools and procedures for analyzing UGC. To achieve these three goals, we focused specifically on public attitudes towards healthcare systems. Below we highlight the main findings of this study.
First, the results of the word frequency analysis reflected all three dimensions of healthcare evaluation, namely affordability, accessibility, and quality of health services. The words with the highest frequencies included “cost(s)”, “insurance”, “pay”/“paid”, “doctor(s)”, and “hospital(s)”. This result indicates that most commenters are greatly concerned with the affordability and quality domains, although the accessibility domain is also important. While it is not clear without further analysis what exactly the users mean by “access”, commenters frequently mention “wait times”, indicating that this issue presents a significant barrier to the accessibility of healthcare services.
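To make this step concrete, the sketch below illustrates a basic word frequency computation of the kind reported here; the stopword list, tokenization rule, and variable names are simplifying assumptions for illustration rather than the exact pipeline used in this study.

```python
import re
from collections import Counter

# A small illustrative stopword list; an actual analysis would use a fuller
# list (e.g., from NLTK) and additional cleaning steps.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
             "it", "that", "this", "for", "with", "on", "we", "i", "you", "my"}

def word_frequencies(comments):
    """Tokenize comments, drop stopwords, and count word occurrences."""
    counts = Counter()
    for text in comments:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts

# Hypothetical usage with two toy comments
comments = ["The costs of insurance are too high.",
            "My doctor and the hospital were excellent."]
print(word_frequencies(comments).most_common(10))
```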
Analyzing word frequencies separately for female and male commenters revealed that, although most of the high-frequency words were identical, female users tend to focus more on the affordability of healthcare services, as indicated by such words as “free”, “coverage”, “taxes”, and “need”, while male users tend to prioritize the institutional structure of healthcare systems, using such words as “government” and “private”.
A comparison of high-frequency words related to healthcare between the 10% of commenters with the most positive sentiment scores and the 10% with the least positive sentiment scores showed that, although such factors as insurance, doctors, and costs are relevant to both groups, the most positive group frequently mentioned such aspects as “access”, “free”, and “patients”, while the least positive group was concerned with “private”, “wait times”, “coverage”, and “poor”.
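A sketch of this decile comparison is given below, assuming a VADER-style compound score (consistent with the compound scores discussed in the following paragraphs); the decile cut-off logic and the simple whitespace tokenization are illustrative assumptions rather than the study’s exact procedure.

```python
# Requires nltk and the 'vader_lexicon' resource: nltk.download('vader_lexicon')
from collections import Counter
from nltk.sentiment.vader import SentimentIntensityAnalyzer

def _word_counts(docs):
    """Simple whitespace tokenization; the study's preprocessing may differ."""
    return Counter(word for doc in docs for word in doc.lower().split())

def top_bottom_decile_words(comments, n_words=20):
    """Compare frequent words in the most and least positive 10% of comments."""
    sia = SentimentIntensityAnalyzer()
    ranked = sorted(comments, key=lambda c: sia.polarity_scores(c)["compound"])
    k = max(1, len(ranked) // 10)          # size of one decile
    least_positive, most_positive = ranked[:k], ranked[-k:]
    return (_word_counts(most_positive).most_common(n_words),
            _word_counts(least_positive).most_common(n_words))
```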
Taken together, the results of the word frequency analysis provide support for the importance of institutional factors in determining public satisfaction with healthcare systems. In particular, the manner in which healthcare provision is organized and funded (public or private), as well as certain aspects of quality (number of doctors, number of hospitals) and access (wait times), plays an important role in influencing public opinion about healthcare services. These findings lend additional validity to the institutional indicators collected as part of official healthcare statistics by agencies such as the World Health Organization (WHO) or Eurostat.
The literature on public attitudes towards healthcare systems has shown inconsistent results regarding the influence of gender [8, 17, 28]. Based on the analyzed set of readers’ comments, we found that female commenters tend to have a lower compound sentiment score than male commenters, indicating that women readers are less satisfied with healthcare services. However, it is important to note that the difference between the means of the two groups was not statistically significant. At the same time, in line with the literature on the individual-level determinants of healthcare attitudes, the results of the word frequency analysis confirmed that personal experiences with the healthcare system play an important role in healthcare system evaluation [2].
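A comparison of this kind could be carried out along the following lines; the use of Welch’s t-test is an assumption for illustration, and the exact test applied in this study may differ.

```python
# Sketch of the gender comparison of compound sentiment scores.
from scipy import stats

def compare_gender_sentiment(female_scores, male_scores, alpha=0.05):
    """Compare mean compound scores of female vs. male commenters
    using Welch's t-test (unequal variances)."""
    t_stat, p_value = stats.ttest_ind(female_scores, male_scores, equal_var=False)
    return {"t": t_stat, "p": p_value, "significant": p_value < alpha}
```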
Regarding other individual-level characteristics relevant to healthcare satisfaction, while we were able to include gender in our analysis, it was not possible to investigate the effect of age, given that information on this variable was missing and could not be directly inferred from the user comments. On the other hand, we can assume that the individuals posting on the NYT website have similar socio-demographic profiles in terms of educational level and income. As our analysis shows, despite similar socio-demographic backgrounds, these individuals exhibited divergent opinions about healthcare systems and their comparison. Thus, in contrast to previous studies, this result suggests that this group of commenters should not be treated as homogeneous.
Using sentence-level sentiment analysis, we obtained sentiment scores for 61 countries, distributed across several continents. To ensure the robustness of our findings, we focused on the set of 22 countries that were assigned a sentiment score more than once in the body of comments. Most of these countries belong to the European region, but we also obtained sentiment scores for Australia, Cuba, Israel, and New Zealand. This allowed us to compare countries from different regions across the globe, which has not been possible using cross-national survey data that are often confined to a single region. Interestingly, although European healthcare systems traditionally enjoy high levels of public support compared to other regions [25, 29], in our analysis the top three countries with the most positive sentiment scores were found to be outside the European area (Israel, New Zealand, and Cuba).
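The sentence-level, country-specific scoring described above could be sketched as follows; matching countries by simple string lookup and averaging compound scores are simplifying assumptions, and the study’s actual country detection and aggregation steps may differ.

```python
# Requires nltk with the 'vader_lexicon' and 'punkt' resources downloaded.
from collections import defaultdict
from statistics import mean
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk.tokenize import sent_tokenize

# Excerpt of a country list for illustration only
COUNTRIES = ["Canada", "Germany", "Israel", "Cuba", "New Zealand", "Australia"]

def country_sentiment(comments, min_mentions=2):
    """Score each sentence mentioning a country and average per country,
    keeping only countries scored more than once (as in the robustness check)."""
    sia = SentimentIntensityAnalyzer()
    scores = defaultdict(list)
    for comment in comments:
        for sentence in sent_tokenize(comment):
            for country in COUNTRIES:
                if country.lower() in sentence.lower():
                    scores[country].append(sia.polarity_scores(sentence)["compound"])
    return {c: mean(v) for c, v in scores.items() if len(v) >= min_mentions}
```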
The country-level sentiment analysis for the European region produced results quite similar to those obtained on the basis of aggregated cross-national survey data. The correlation analysis indicated a relatively high correspondence of country sentiment scores with the mean scores from the EQLS (0.70) and a moderate correspondence with the mean scores from the ESS (0.63). This is an encouraging result, considering the ease and low cost of obtaining UGC data, especially in comparison with comparative survey data.
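For completeness, a correlation of this kind could be computed as in the sketch below; the function and variable names are illustrative, and any survey values passed in would need to come from the actual EQLS or ESS aggregates rather than the placeholders assumed here.

```python
# Sketch of correlating country-level sentiment scores with survey means.
from scipy.stats import pearsonr

def sentiment_survey_correlation(country_sentiment, survey_means):
    """Pearson correlation over the countries present in both sources."""
    common = sorted(set(country_sentiment) & set(survey_means))
    x = [country_sentiment[c] for c in common]
    y = [survey_means[c] for c in common]
    r, p = pearsonr(x, y)
    return r, p
```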
We should acknowledge certain limitations of this study that offer several opportunities for future work. First, as with much of the research using user-generated content, we should be cautious in generalizing our results. The majority of the individuals who commented online represent the general NYT readership. As mentioned above, in terms of socio-demographics, this group of commenters is characterized by medium to high levels of education and income. In addition, many of these individuals have either lived or studied in a different country. Thus, the results of this study may not be generalizable to other groups of respondents, particularly those with lower levels of education and income, less experience abroad, and limited experience with the Internet.
Another limitation common to many studies employing user-generated data is that the body of comments may grow while the platform for commenting remains open. Hence, the findings based on the body of comments at a given time may differ when another time frame is analyzed. To illustrate, when we first accessed the data in March 2018, the NYT article “The Best Healthcare System in the World” had 636 comments. As of January 24, 2020, the body of comments had grown to 771 comments, with the last comment made on April 12, 2019, when the administrators closed the comments section for this article. Much of UGC consists of real-time or streaming data and, as such, is generated continuously. Implementing streaming APIs and similar tools will assist with gathering, managing, and analyzing such data as they are generated.
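As an illustration of continuous collection, the polling loop below sketches one way such data could be gathered over time; the endpoint URL and response fields are hypothetical placeholders, and an actual implementation would depend on the specific platform’s API or streaming client.

```python
# Minimal polling sketch for continuously collecting new comments.
import time
import requests

def poll_comments(endpoint, interval_seconds=300, seen=None):
    """Periodically fetch comments and yield only those not seen before."""
    seen = set() if seen is None else seen
    while True:
        response = requests.get(endpoint, timeout=30)
        response.raise_for_status()
        for comment in response.json().get("comments", []):  # hypothetical field
            if comment["id"] not in seen:                     # hypothetical field
                seen.add(comment["id"])
                yield comment
        time.sleep(interval_seconds)
```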
Finally, as mentioned above, user-generated data are often complex and noisy. In this regard, UGC presents challenges similar to those of qualitative data. And much like the analysis of qualitative information, the analysis of UGC often does not lend itself easily to replication, especially when not all steps in the analysis are automated. In this study, as part of the validation step (see footnote 5), we had to conduct a manual check of the results of the sentiment analysis and remove a number of cases to ensure consistency. This was feasible given the small set of comments analyzed but would be difficult and time-consuming with a larger corpus of comments. We hope that the development of better algorithms and analytical tools will ensure greater replicability and reproducibility of UGC analysis in the future.
Despite these challenges, we believe that user-generated online data offer promising opportunities for public opinion research. This study provides only one preliminary examination of how user-generated data can be analyzed, and future opportunities are abundant. For example, future studies may consider analyzing a larger corpus of online comments and/or different online platforms to gather information on healthcare attitudes. While we do not consider online user-generated content a direct competitor to more traditional survey data, we are certain that the richness of such data, combined with appropriate analytical tools, would be of great benefit to researchers of public opinion.