Introduction

Symptom checker apps (SCAs) are eHealth applications designed to support laypeople in assessing their symptoms and receiving recommendations on medically appropriate actions related to their health [1]. Users can input their health-related information into SCAs through a chatbot or search strings, and SCAs retrieve and categorize the input. Some SCAs are advertised as AI-based, and most generate healthcare-related information and recommendations for actions based on user input [2].

Although SCAs are already in use, their impact on healthcare systems remains poorly understood. Recent scoping reviews described ambiguous effects of SCAs [1, 3], indicating that they could either reduce or induce oversupply. The effectiveness of SCAs in delivering adequate and precise information and recommendations must be considered. Additionally, the possible impact of SCAs on healthcare systems depends on several factors, including the characteristics of SCAs and how SCAs are used. Finally, the impact of SCAs on users’ health-related behavior, such as seeking healthcare, must also be considered.

Recent studies have shown that the diagnostic accuracy and triage capabilities of SCAs are highly variable. A recent study reported a triage accuracy for primary conditions varying between 48.8% and 90.1% [1]. A significant disparity in diagnostic accuracy between SCAs and emergency physicians has also been reported: while SCAs correctly identified the primary diagnosis in only 30% of cases, emergency physicians correctly diagnosed 81% of cases [4]. Another study found that medical laypeople still outperformed SCAs [5]. Consequently, SCAs currently struggle to reliably assist patients in navigating healthcare and to provide adequate medical recommendations.

Understanding the impact of SCAs on the healthcare system requires consideration of user characteristics, such as (e)health literacy and attitudes toward technology [3, 6, 7]. Research indicates that SCA users are often female, well-educated, and Caucasian, with health insurance and a regular healthcare provider [8, 9]. Recent studies showed that health literacy levels in Germany have declined over the years, with the reported uncertainty being mainly related to online resources [10]. However, some users found SCAs useful for self-diagnosis and reported positive health effects [11], while others had problems giving and interpreting concrete information on symptom time patterns or severity [12]. Such difficulties may initiate unnecessary healthcare-seeking behavior, although the evidence remains inconclusive [13]. Additionally, increased eHealth literacy may lead to greater subjective trust in SCAs and the ability to critically evaluate their recommendations, but not necessarily to a change in actual trust-based behavior [14]. Lastly, user attitudes toward technology play a significant role, with “tech seekers” being more likely to use SCAs in the future compared to “tech rejectors” and “unsure acceptors” [15]. As with internet research, the usage of SCAs may also magnify preexisting user characteristics associated with unwarranted healthcare-seeking tendencies rather than operating independently [16]. For example, SCAs may worsen hypochondria, similar to how internet research is already known to do among vulnerable patient groups [16].

There is a research gap concerning the influence of concepts such as hypochondria, self-efficacy, technology affinity, and health literacy on the use of SCAs. Therefore, the aim of this explorative study was to identify meaningful predictors for SCA use considering user characteristics.

Methods

An explorative cross-sectional survey was conducted. The survey was available online or as a paper and pencil version. The STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) checklist [17] was applied.

Measurements

Due to the limited literature on SCAs, pilot interviews with SCA users and SCA experts were conducted to ensure a meaningful concept selection for the survey content. In addition, to identify potential characteristics of the user group, we drew on literature related to the use of health applications.

Thus, the following concepts were selected: eHealth literacy [18], hypochondria [19], self-efficacy [20], and affinity for technology [21]. Table 1 presents a comprehensive overview, detailing the reliability, validity, scale, and scoring of the evaluated scales (General Life Satisfaction Short Scale [22], German Version of the eHealth Literacy Scale [23], Whiteley Index [24, 25], General self-efficacy short scale [26], Ultra-Short Scale for Assessing Affinity for Technology Interaction [27]) used in this study.

Furthermore, the presence of chronic diseases and private screen time (as a potential indicator of smartphone use) were assessed. Sociodemographic variables such as age, gender, and school education were also assessed in this study.

Table 1 Overview, detailing the reliability, validity, scale, and scoring of the evaluated scales used in this study

Recruitment

The survey was conducted from November 2020 to June 2021. The sample comprised different recruiting strands to reach a wide variety of participants and ensure a sufficient number of SCA users for the statistical analysis. In the first strand, n = 50,000 German citizens were contacted via mail to participate in the survey. The intended recipients were representatively selected by an external partner (T + R Dialog Marketing (Berlin, Germany) and Acxiom (Neu-Isenburg, Germany)). Further participants were recruited via mailing lists of the University of Tübingen and the University Hospital of Tübingen, social media, and cooperating GP practices. The second strand aimed to reach SCA users only; therefore, participants were only included if they had SCA experience. Targeted advertisements via social media, the social channels of the University Hospital of Tübingen, the homepage of a German newspaper and the social channels of federal health insurance were used to recruit further SCA users.

Data exclusion

We assumed a missing completely at random mechanism (only single values were implausible or missing, omitted by chance). Participants with missing data on the primary outcome were excluded (n = 2). Furthermore, physicians (n = 19) were excluded due to the assumption that their medical knowledge would have a significant influence on SCA usage.

Statistical analysis

The primary outcome variable was whether participants had already used SCAs. Statistical analyses were conducted in different steps. The first step comprised variable selection using a least absolute shrinkage and selection operator (LASSO [28]) regularized logistic regression analysis considering nine predictors (as listed in Table 2). The second-step model comparison involved an intercept-only model, a full model and a model with the selected predictor using conventional logistic regression. In the third step, we utilized the identified predictors to derive parameter estimates and p-values. This process led to the determination of the main analysis parameters. A post-hoc analysis was conducted in the fourth step.
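For orientation, the first step can be written in the standard form of an L1-penalized logistic regression. This is the textbook formulation and a sketch only; the exact parameterization used by the software is an assumption. Here y_i ∈ {0, 1} denotes SCA use, x_i the vector of the p = 9 candidate predictors for participant i, and λ ≥ 0 the regularization parameter.

```latex
\left(\hat{\beta}_0, \hat{\beta}\right)
= \arg\min_{\beta_0,\,\beta}
\left\{
-\frac{1}{n}\sum_{i=1}^{n}
\left[
y_i\left(\beta_0 + x_i^{\top}\beta\right)
- \log\!\left(1 + e^{\beta_0 + x_i^{\top}\beta}\right)
\right]
+ \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\right\}
```

Larger values of λ shrink more coefficients exactly to zero, which is what allows the procedure to act as a variable selection step.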

Propensity score matching

Users and non-users were matched with propensity score matching [29, 30] on an initial set of potential confounders [31]. Confounder covariates included school education and age, as we assumed that we reached a younger and better-educated user population due to our targeted recruiting strategy via social media and university mailing lists. A nearest neighbor matching algorithm [29] was applied. Missing data on the predictors were imputed using a random forest approach [32] that enables the imputation of missing values in mixed-type data (categorical and continuous). Out-of-bag errors were considered [32].
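The idea behind nearest neighbor matching can be illustrated with a minimal Python sketch. It is not the implementation used in this study (the analyses were run in R). The greedy 1:1 strategy without replacement, the processing order, the function name, and all propensity scores below are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def nearest_neighbor_match(
    treated: Dict[str, float],
    controls: Dict[str, float],
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching without replacement.

    `treated` and `controls` map participant IDs to propensity scores.
    Each treated unit (here: SCA user) is paired with the closest
    still-unmatched control (here: non-user).
    """
    available = dict(controls)  # copy so the caller's dict is not mutated
    pairs: List[Tuple[str, str]] = []
    # Assumed ordering: treated units are processed from highest to lowest score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break  # no controls left to match
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        pairs.append((t_id, c_id))
        del available[c_id]  # without replacement: each control is used once
    return pairs


# Hypothetical propensity scores, e.g. from a logistic model of
# SCA use on age and school education.
users = {"u1": 0.80, "u2": 0.35}
non_users = {"n1": 0.30, "n2": 0.78, "n3": 0.55}

matched = nearest_neighbor_match(users, non_users)
print(matched)
```

With 1:1 matching without replacement, the matched sample contains twice as many participants as there are matched users.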

Predictor selection using LASSO regularized logistic regression

The participants were divided into training (70%) and test (30%) data sets. The training data set was used to fit a model on the given data, and the test data set was used to evaluate the model [33]. A 0.632 bootstrap estimator [34] was applied as the resampling method for lambda selection. The sensitivity, specificity, Positive Predictive Value (PPV), Negative Predictive Value (NPV) and accuracy rate were calculated by fitting to the test data set. An overview of the predictors included can be found in Table 2.
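The split and the evaluation measures can be sketched as follows. This is a simplified illustration in Python using only the standard library, not the study's R code. The unstratified random split, the seed, and the confusion-matrix counts are hypothetical assumptions and do not reproduce the study's results.

```python
import random
from typing import Dict, List, Tuple


def split_indices(n: int, train_frac: float, seed: int) -> Tuple[List[int], List[int]]:
    """Shuffle indices 0..n-1 and cut them into a training and a test part."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train_frac)  # rounds down
    return idx[:n_train], idx[n_train:]


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Evaluation measures from a 2x2 confusion matrix (positive class = SCA user)."""
    return {
        "sensitivity": tp / (tp + fn),  # users correctly classified as users
        "specificity": tn / (tn + fp),  # non-users correctly classified as non-users
        "ppv": tp / (tp + fp),          # predicted users who are users
        "npv": tn / (tn + fn),          # predicted non-users who are non-users
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# A 70/30 split of 134 matched participants.
train_idx, test_idx = split_indices(n=134, train_frac=0.7, seed=1)
print(len(train_idx), len(test_idx))

# Hypothetical counts, chosen for illustration only.
metrics = classification_metrics(tp=30, fp=10, tn=40, fn=20)
print(metrics)
```

Rounding the training share down is one possible convention; other software may round differently or stratify the split by outcome.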

Model comparison of an intercept only, a full model and a model with the identified predictor

A conventional logistic regression was fit on the complete matched data set to derive odds ratios (ORs) and confidence intervals (CIs). To identify potential multicollinearity, we employed the Variance Inflation Factor (VIF), which assesses the variance of a coefficient within the full model in comparison to its variance when modeled independently [33]. A VIF value exceeding 5 was considered indicative of significant collinearity [33]. The Akaike information criterion (AIC) of a full model, an intercept-only model and the model with the LASSO-selected predictors was compared to assess model performance. The smaller the AIC, the better the performance of the model [33].
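Both criteria can be illustrated with a short Python sketch. It assumes the usual linear-model formula VIF = 1 / (1 − R²), which for a model with exactly two predictors reduces to the squared Pearson correlation between them, and the definition AIC = 2k − 2 ln L. The statistical software may compute a VIF for a logistic model differently, and all input values below are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1: Sequence[float], x2: Sequence[float]) -> float:
    """VIF = 1 / (1 - R^2); with two predictors, R^2 equals r^2."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical predictor values and log-likelihood.
vif = vif_two_predictors([1, 2, 3, 4], [1, 3, 2, 4])
print(round(vif, 3))
print(aic(log_likelihood=-80.0, n_params=3))
```

Because the AIC adds a penalty of 2 per estimated parameter, a smaller model can have a lower AIC than a larger one even if its log-likelihood is slightly worse.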

Parameter estimators, CI and p-values of the models, including the selected predictors

Parameter estimators were derived from conventional logistic regression. Two sensitivity analyses were conducted to ensure the robustness of the ORs, CIs and p values considering different sample compositions. For the first sensitivity analysis, we applied a different matching algorithm (full optimal matching [29]). For the second sensitivity analysis, we used the whole sample without matching.
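The relationship between a logistic regression coefficient and its OR can be sketched in Python as follows. The Wald-type interval shown here is an assumption made for illustration; the statistical software may instead report profile-likelihood intervals. The coefficient and standard error are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.959964) -> Tuple[float, float, float]:
    """Return (OR, lower, upper) for a logistic regression coefficient.

    OR = exp(beta); the Wald interval is exp(beta -/+ z * se),
    with z = 1.959964 for a 95% confidence level.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient (log odds per one-point increase) and standard error.
or_, lower, upper = odds_ratio_with_ci(beta=math.log(2.0), se=0.1)
print(round(or_, 2), round(lower, 2), round(upper, 2))
```

An OR above 1 whose interval excludes 1 indicates higher odds of the outcome per one-unit increase in the predictor.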

Post hoc Analysis: categorization of the WI

Finally, a post hoc analysis was conducted considering a predictor identified in step 3. The variable was dichotomized to identify clinically relevant persons, and a Pearson’s χ2 test was conducted.
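A Pearson χ2 test on a 2×2 table (user group by dichotomized predictor) can be sketched in Python using only the standard library. The sketch omits a continuity correction; whether one was applied in this study is not stated, so this is an assumption. The counts are hypothetical.

```python
import math
from typing import List, Tuple


def chi_square_2x2(table: List[List[int]]) -> Tuple[float, float]:
    """Pearson chi-square statistic and p-value (df = 1) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With df = 1 the statistic is a squared standard normal variable,
    # so the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows = non-users / users,
# columns = below cut-off / above cut-off.
chi2, p = chi_square_2x2([[20, 10], [10, 20]])
print(round(chi2, 3), round(p, 4))
```

A continuity correction would yield a somewhat smaller statistic and a larger p-value for the same table.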

Data Processing

Data processing and statistical analyses were conducted with R Version 4.1.1 [35, 36] and R Studio Version 1.4 [37].

Results

A total of 869 participants (n = 116 paper-pencil, n = 753 online) completed the survey. As participants were matched 1:1 and 67 users finished the survey, the final analysis included n = 134 participants. The median age of the population was 31 (IQR 24–49), and 67% were female. The matched variables (age and school education) were well balanced between the user and non-user groups (Love Plot Supplemental Fig. 1).

Table 2 describes the matched sample stratified for SCA use, including all predictors used in the LASSO regression. Univariate analyses were conducted for all predictors. In addition to subjectively rated health, hypochondria and self-efficacy showed a significant association with SCA use.

Table 2 Overview of the potential predictors stratified for SCA use and univariate analysis

Identification of meaningful predictors

The training data set comprised 93 participants, and the test data set comprised 41 participants. Nine variables were initially considered for predictor selection, as detailed in Table 2. The selection process, which involved a LASSO regularized logistic regression, identified two variables with nonzero coefficients: hypochondria (WI) and self-efficacy (ASKU). Consequently, these two variables were chosen as predictors in the conventional logistic regression model. The bootstrapped ROC curve for the regularization parameter λ is shown in Fig. 1. Figure 2 shows the LASSO coefficient profiles against log(λ); the model error was minimized at λ = 0.112, at which two variables were selected. Sensitivity, specificity, Negative / Positive Predictive Values (NPV / PPV) and balanced accuracy can be found in Table 3.

Table 3 Model evaluation of the LASSO regression of the test data set

Fig. 1 Bootstrapped ROC curve for λ

Fig. 2 LASSO coefficient profiles against log(λ); λ = 0.112 minimized the model error, at which two variables were selected

Model comparisons

The AIC of the full model was 184.43, and the intercept-only model derived an AIC of 187.76. The logistic regression model based on the results of the LASSO variable selection (WI and ASKU-S) had the lowest AIC of 172.21 and therefore showed an improved performance compared to the full and intercept-only model. The VIF = 1.035 showed no considerable multicollinearity.

Parameter estimators and predictor robustness

Table 4 shows the odds ratios, confidence intervals of the odds ratios, and p values of the logistic regression and its sensitivity analyses comprising the previously identified predictors, hypochondria and self-efficacy. The OR of the predictor hypochondria (WI) showed a similar value (1.24–1.26) for all three models and a significant p value (P < .001). The OR size corresponds to a small effect [38]. The ORs for self-efficacy were not statistically significant in any of the three models (P > .05). Additionally, the variance inflation factor for all models was low (VIF < 1.06), indicating no considerable multicollinearity.

Table 4 Results of the conducted logistic regression and the sensitivity analysis

Post hoc analysis

Pearson’s χ2 test revealed a significant difference between non-users and users among participants with clinically relevant levels of hypochondria on the WI.

Over half of the SCA users had a WI sum score higher than the cut-off of five, indicating clinically relevant hypochondria (Table 5).

Table 5 Post hoc analysis with categorized WI stratified for the user group

Discussion

In this exploratory study, we identified WI-assessed hypochondria as a reliable predictor for SCA use. This predictor showed a consistent effect across all analyses, including the two sensitivity analyses. Furthermore, lower values of self-efficacy assessed with the ASKU-S were identified as a positive predictor for SCA use in the main analysis. The sensitivity analyses did not replicate the effect of this variable; thus, its role remains unclear due to the rather moderate sample size.

Comparison with prior work

Hypochondria was identified as a predictor for SCA use and revealed a stable effect throughout our analyses. Over half of the SCA users had a WI sum score higher than the cut-off of five, indicating clinically relevant hypochondria (Table 5). This level of anxiety may affect a patient’s ability to adequately handle action recommendations and symptom classifications. Thus, these SCA users might be susceptible to the negative effects of SCA use. Hypochondria in the context of SCAs can be classified as cyberchondria, considering the working definition of Vismara [39]. A 2020 study discouraged self-diagnosis using SCAs among cyberchondriac patients and emphasized adjusting expectations accordingly when accessing health information online [40]. Another recent study revealed that some people with high WI (hypochondria) scores felt worse after online symptom checking, while others with low scores felt better [41]. Given this literature and our findings, it appears that patients with health anxiety are less likely to benefit from SCAs, despite being more inclined to use them. The transferability of results from online health-related use to SCAs is important to consider, as these results suggest that prolonged use is associated with increased functional impairment and anxiety both before and after checking [41]. The impact of using SCAs on health-anxious patients remains unclear and warrants further investigation.

Furthermore, we examined self-efficacy as a meaningful predictor since a recent study indicated an association between self-efficacy and the adoption of SCA use [20]. The results of the predictor self-efficacy were ambiguous with differing effect sizes in our analyses. It is still uncertain how much self-efficacy contributes to determining the usage of SCAs.

Affinity for technology was another variable we considered since the literature indicated a potential association [15]. A study that examined SCA user profiles with a latent class analysis revealed that the latent class of “tech seekers” showed the highest odds of using SCA [15]. However, the results in our rather moderate sample do not suggest an association between affinity for technology and SCA use. Reasons for the discrepancy might be the different operationalization of the concepts (e.g., a scale rather than profiles) or the different study populations.

The broad use of SCAs can lead to individual- and systemic-level effects. SCAs could lead to a misuse of health care resources [4], such as users visiting emergency departments too early or too often. As a result, these users in nonurgent conditions put further strain on the health system by possibly increasing costs and taking resources from patients who need emergency care [3, 42]. To mitigate these risks, software developers should provide transparent information about the potential dangers of using SCAs. This information could be presented in the form of an instruction leaflet, available after downloading or using an SCA in a browser. The instruction should clearly state that using SCAs may increase health anxiety. The language used in the instruction should be concise and easy to understand so that users can absorb the information and take appropriate action.

The existing knowledge about SCAs should be used to improve SCA design in the best possible way and implement improvements to minimize the negative effects and strengthen the potential positive effects. Physicians should be trained to consider pre-informed patients and promote dialog. It is necessary to better understand the relationships between cyberchondria, hypochondria, and e-health literacy in the context of SCA use to derive recommendations for systemic interventions and plan targeted and helpful interventions.

Strengths and limitations

In this study, we conducted an industry-independent investigation of SCA users. Furthermore, our research did not limit itself to a single SCA application; instead, we examined the usage patterns across various types of SCA applications, enhancing the generalizability of our findings. Additionally, by matching users and non-users based on age and education, we controlled for these variables, thereby strengthening the reliability of our analysis.

A limitation of this study is the cross-sectional design we employed. This approach restricts our ability to infer causal relationships between variables, as it only provides a snapshot in time, thereby limiting our understanding of the dynamics and directionality of the relationships observed. Additionally, the recruitment strategy of this study led to a younger and better-educated sample, which introduces a potential selection bias. Our study’s moderate sample size, while adequate for exploratory purposes, may not capture the full spectrum of SCA usage characteristics. Conducting this research on a larger scale would be beneficial to validate our findings and identify more nuanced predictors of SCA use. Moreover, our approach of double targeting SCA users to ensure a higher response rate might lead to response bias, potentially resulting in an overrepresentation of the views and behaviors of more engaged or interested users. In light of these limitations, future research should consider longitudinal studies involving more diverse and larger samples.

Conclusions

Hypochondria emerged as a significant predictor of SCA use in our sample, with a consistently stable effect. Over half of the SCA users had clinically relevant hypochondria considering their values on the WI, which may impact their ability to handle SCAs effectively. According to the literature, persons with hypochondria are less likely to benefit from SCAs. These users could be further unsettled by risk-averse triage and unlikely but serious diagnosis suggestions. Software developers should provide transparent information about the potential dangers of using SCAs, including that SCA use may increase health anxiety. Individuals with higher levels of health anxiety (hypochondria) might experience increased anxiety or functional impairment due to SCA use. Users should be cautious of over-relying on SCAs for health information and diagnosis. For healthcare professionals, training in addressing patient concerns arising from SCA use may be beneficial, particularly for managing individuals with high health anxiety. Further, the widespread use of SCAs may lead to the misuse of healthcare resources, with nonurgent cases increasing the burden on emergency services.