Prospective Observational Study on the Prevalence and Diagnostic Value of General Practitioners’ Gut Feelings for Cancer and Serious Diseases

Background General practitioners (GPs) have recognized the presence of gut feelings in their diagnostic process. However, little is known about the frequency or determinants of gut feelings or the diagnostic value of gut feelings for cancer and other serious diseases. Objective To assess the prevalence of gut feelings in general practice, examine their determinants and impact on patient management, and measure their diagnostic value for cancer and other serious diseases. Design This prospective observational study was performed using the Gut Feelings Questionnaire (GFQ). Participants Participants included 155 GPs and 1487 of their patients, from four Spanish provinces. Main Measures Sociodemographic data from patients and GPs; the reasoning style of GPs; the characteristics of the consultation; the presence and kind of gut feeling; the patient’s subsequent contacts with the health system; and new cancer and serious disease diagnoses reported at 2 and 6 months post-consultation. Key Results GPs experienced a gut feeling during 97% of the consultations: a sense of reassurance in 75% of consultations and a sense of alarm in 22% of consultations. A sense of alarm was felt at higher frequency given an older patient, the presence of at least one cancer-associated symptom, or a non-urban setting. GPs took diagnostic action more frequently after a sense of alarm. After 2 months, the sense of alarm had a sensitivity of 59% for cancer and other serious diseases (95% CI 47–71), a specificity of 79% (95% CI 77–82), a positive predictive value of 12% (95% CI 9–16), and a negative predictive value of 98% (95% CI 86–98). Conclusions Gut feelings are consistently present in primary care medicine, and they play a substantial role in a GP’s clinical reasoning and timely diagnosis of serious disease. The sense of alarm must be taken seriously and used to support diagnostic evaluation in patients with a new reason for encounter.


BACKGROUND
Uncertainty around diagnosis is one of the biggest challenges that a clinician faces when caring for a patient. This is particularly relevant for general practitioners (GPs) 1 , whose work is associated with one of the highest perceptions of uncertainty 2 . GPs are confronted with an immense range of symptoms, and in some cases a seemingly minor symptom can indicate a serious diagnosis 3 . For example, most lower-back pain disappears within a year; however, in 1 of 350 patients with backache, the pain will be the guiding sign for a serious diagnosis 4 . This uncertainty forces GPs to optimize the use of their analytical and nonanalytical reasoning tools. In this sense, the use of intuition in medicine has long been recognized as part of the art of medicine and even as representing tacit knowledge essential to good practice 5,6 . Intuitive sensations, called "gut feelings," have been described as a "useful light that goes on suddenly to announce that there is something unusual." 7 GPs have been reported to recognize the existence of gut feelings and consider them a useful tool for decision-making 8,9 , and even a separate track in their clinical reasoning 10 .There are two kinds of gut feelings: a sense of alarm that leads a GP to worry about a patient's health status even if they have not yet found any specific indication; and a sense of reassurance that leads a GP to feel confident about the patient's management and outcome even though they may not be certain about the diagnosis, a sense that everything fits in 8 .
There are many studies regarding the use of gut feelings by GPs, hospital specialists, and nurses 9,[11][12][13][14][15][16][17] . GPs reported using their gut feelings in suspecting cancer 9,[17][18][19][20] and other serious diseases 21,22 . In Denmark and the UK, the GPs' sense of alarm has been accepted as a valid reason for referring a patient to specific pathways of cancer diagnosis 23,24 . It has been suggested that gut feelings' diagnostic values are routine GP's consultations, where serious diseases and cancer have a low prevalence 25 . However, few studies have evaluated the frequency and diagnostic value of gut feelings in primary care consultations 20 . Hjertholm et al. found that the suspicion of cancer and serious disease in primary care consultations had a prevalence of 5.7% and a positive predictive value (PPV) of 9.8% 26 . Donker et al. observed a PPV of 35% for cancer-related gut feelings, and reported that this value increased according to the ages of the patient and the doctor 12 . In order to objectively measure gut feelings, a Gut Feeling Questionnaire (GFQ) was created and validated in a Dutch context 22 ; since then, it has been made available in seven languages 27,28 . The GFQ determines whether a gut feeling has arisen during a consultation. In a study using the GFQ, Barais et al. found that GPs had a gut feeling in 99.16% of consultations concerning patients with dyspnea or chest pain, corresponding to a sense of alarm in 35% of these cases and a sense of reassurance in 65% of the cases. Among patients with dyspnea or chest pain, the presence of a sense of alarm increased the probability of a lifethreatening disease from 20 to 35%, while the presence of a sense of reassurance decreased the probability to 12% 13 .
Non-analytical, intuitive reasoning is a substantial part of the diagnostic process; it induces and guides analytical reasoning and deliberate action. However, the prevalence, diagnostic value, and determinants of gut feelings are not yet fully known. More knowledge of these aspects might lead to a better understanding of the consultation process and help practitioners undertake timely diagnostic evaluation and avoid errors.
This study aimed to assess the prevalence and determinants of gut feelings in general practice, the subsequent management of patients in light of the kind of gut feeling experienced by the GP, and the diagnostic value of gut feelings for cancer and other serious diseases.

METHODS
In this prospective observational study, we used the Spanish and Catalan versions of the GFQ. The work was carried out in primary care centers of four Spanish provinces (Balearic Islands, Madrid, Barcelona, and Lugo) during 2019-2020.
Participants were GPs and their patients. The protocol of the study has already been published 29 .

Participants
GPs were invited to participate during workshops held in the health centers. Those who accepted the invitation were instructed on data gathering. During at least one working day, GPs included consecutive patients with at least one new reason for consultation. Exclusion criteria were consultations with non-residents, terminally ill patients, or patients younger than 18 years old, and consultations for bureaucratic reasons. At the end of each consultation (index consultation), patients were given oral and written information about the study and signed an informed consent document.

Measurements
We collected sociodemographic and practice data on the participating GPs (age, sex, training tasks, rural/non-rural health center, and years with the same list of patients). We used a 4item Likert scale validated by Martínez-Cañabate et al. 30 in her PhD thesis. Each item has 4 possible answers, from completely disagree to completely agree. The scale assesses whether the professional carries out a practice more oriented to the biological (lower scores) or psychosocial (higher scores) sphere 31,32 . GPs completed the Rational-Experiential Inventory (REI) 33 . This Likert scale has 40 items and has been validated in Spanish population. Each item has 5 possible answers, from completely disagree to completely agree. The REI measures rational and experiential thinking styles and includes subscales of selfreported ability and engagement with each thinking style 34 .
After each index consultation, GPs recorded sociodemographic data obtained from the patient and how long they had been on the doctor's list. Data about the visit were recorded, such as the type and consultation duration (longer or shorter than 6 min), the language used, and the presence of cancerassociated symptom(s) [35][36][37][38] (Table 1). Finally, the GP complet-  9,28,39 . Item 1 (repeated at the end as item 11) assesses whether the patient's case elicited a gut feeling in the consultation. Items 2-6 are rated using a 5-point Likert scale that ranges from completely disagree to completely agree. Item 2 concerns the sense of reassurance and items 3-6 relate to the sense of alarm. A sense of alarm is considered present when the answer to item 1 or 11 indicates a sense of alarm, or when the answer to item 1 or 11 is "not applicable" and at least one of the scores of items 3-6 is higher than 3. A sense of reassurance is considered present when the answer to item 1 or 11 indicates a sense of reassurance or when the answer for item 1 or 11 is "not applicable" and the score for item 2 is higher than 3. A gut feeling is considered to be absent when the answers for items 1 and 11 are both "not applicable," none of the scores for items 3-6 is higher than 3/4, and the score for item 2 is lower than 4/5.
Two months and 6 months after the index consultation, we reviewed primary care and hospital clinical records to collect new diagnoses of cancer (except non-melanoma skin cancer) and other serious diseases among the participating patients. Recurrence of cancer in patients considered disease-free at the time of the index consultation was regarded as a new diagnosis. Beginning with the list of serious diseases published by Hjertholm et al. 26 , two researchers independently judged whether a newly diagnosed disease was "serious" or not. When there was disagreement, a third researcher made the final decision. Six months after the index consultation, we also recorded patient contacts with health care services.

Statistical Analysis
We performed a descriptive analysis of all selected variables to describe sample characteristics and the prevalence of gut feelings. A bivariate analysis was carried out, in which the presence of a sense of reassurance or alarm was compared with the characteristics of the GP, patient, and consultation. We used the chi-square test for categorical variables, and Student's Ttest for continuous variables. OR and 95% CI were calculated. A multivariate logistic regression analysis was done to assess the independent relationships between the variables and the kind of gut feeling. Variables with p≤ 0.20 were introduced in the model 40 . We assessed changes in the coefficients at each step to detect confusion, and tested interactions. Sensitivity, specificity, positive and negative predictive values (PPVs, NPVs), and positive and negative likelihood ratios (LR+, LR −) were calculated for the sense of alarm and the sense of reassurance. We assumed that the sense of alarm aims to identify patients with high probability for a serious outcome, while the sense of reassurance aims to identify patients with low probability for a serious outcome. Logistic multivariate analysis was also used to calculate the risk of serious disease depending on the type of gut feeling, adjusted for patient age, sex, visit type, visit duration, and cancer-related symptom(s).
We assessed goodness of fit for every model with the Hosmer-Lemeshow test. Analysis was done with SPSS.v.25.

RESULTS
We invited 272 GPs; of them, 155 participated. The GPs reported on 1487 patients (63.2% female) over 328 working days (see flowchart in Figure 1). Most of the patients were Spanish-born and lived in urban environments; their mean age was 51.9 years. Nearly six of 10 patients presented at least one cancer-associated symptom. The characteristics of the GPs, patients, and consultations are described in Table 2.

Prevalence of Gut Feelings
GPs experienced a gut feeling during 97.1% of the consultations: a sense of reassurance was recorded in 1120 consultations (75.3%) and a sense of alarm was recorded in 324 consultations (21.7%). In 43 consultations, the GFQ did not determine a gut feeling. These cases were excluded from the analysis. The GPs, patients, and consultations characteristics are categorized by the type of GF present in Table 3.
We found no difference in the frequency of reassurance or alarm regarding most of the determinants studied. The sense of alarm was more prevalent, and the sense of reassurance was less prevalent under the following conditions: when GPs' NFC engagement scores (rational reasoning) were higher; in consultations with older patients; when a patient presented at least one cancer-associated symptom; in non-urban areas; or when the language used during the consultation was not the GP's native language. Regarding the features of the consultations, the prevalence for a sense of alarm was higher in consultations that lasted longer than 6 min or with fewer patients seen that day. Our multivariate analysis (see Table 3) confirmed the above-described results found in the bivariate analysis, except for the number of patients visited in the same day. Table 4 shows the actions taken by GPs during the 6 months after the index consultation, categorized by the kind of gut feeling. Patients visited their GP more frequently after GPs experienced a sense of alarm than a sense of reassurance. GPs more frequently ordered laboratory tests, radiological investigations, and primary care procedures after experiencing a sense of alarm and referred more frequently to both outpatient services and the emergency department. There was no difference in patients' sick leave based on the type of feeling experienced by the GP.

Diagnostic Value
The presence of a diagnosis of cancer or serious disease could be evaluated in 1385 patients (see Figure 1). At 2 months after the index consultation, 64 patients (4.6%) had been newly diagnosed with cancer or another serious disease. At 6 months, a total of 116 patients had been newly diagnosed with a serious disease (8.3%; nine with cancer).
Diagnostic values are shown in Table 5. After 2 months, the sense of alarm for cancer or a serious disease had a sensitivity of 59.3%, specificity of 79.4%, a PPV of 12.2%, an NPV of 97.5%; an LR+ of 2.8, and an LR− of 0.5. After 6 months, most of these figures were similar for the sense of alarm, except that the PPV was 18.3% and the NPV was 94.5%.
The adjusted OR for a serious diagnosis after 2 months was 5.3 after a sense of alarm and 0.19 after a sense of reassurance. Six months after the index consultation, the adjusted OR was 3.6 after a sense of alarm and 0.2 after a sense of reassurance.

Summary of Findings
This is the first study seeking to estimate the prevalence and diagnostic value of gut feelings in the consultations of GPs. Our study showed that GPs had a gut feeling almost every time they consulted with a patient for a new reason; these feelings were a sense of reassurance approximately 75% of the time. A more frequent sense of alarm was associated with various determinants, such as the GP being more engaged with analytical reasoning, the patient's age, the practice being located in a non-urban area, the presence of at least one cancerassociated symptom, and incongruence in the native languages of the patient and GP. We also observed that the presence of a sense of alarm increased the number of tests performed and the referrals to secondary care for further investigation. The sense of alarm experienced by the GP increased the possibility that the patient would receive a new diagnosis of cancer or another serious disease by 2 months (adjusted OR 5.3) and 6 months (adjusted OR 3.6) after the initial consultation. This possibility decreased after the GP's perception of a sense of reassurance, with an adjusted OR of 0.19 at 2 months and 0.27 at 6 months. The presence of a sense of alarm increased the likelihood of the diagnosis of a cancer or a serious disease at 2 months from the consultation from 4.6 to 12.3% and from 8.4 to 18.4% at 6 months, while the presence of a sense of reassurance decreased these likelihoods to 2.4% and 5.9%, respectively.

Strengths and Limitations
The GFQ is a validated measure for determining gut feelings. Our prospective design enabled us to obtain accurate and reliable results. The use of primary care and hospital electronic clinical records prevented loss of information, such as unrecorded diagnoses.
We did not reach the estimated sample size of consultations 29 , as 43% of the GPs decided not to participate. Therefore, our data lacked the power needed for us to draw conclusions about the diagnostic value of GFs related exclusively to cancer. The distributions of participant GPs by age, sex, and non-urban vs. urban environment were essentially the same as those previously observed among Spanish GPs 41,42 .
The outbreak of the COVID-19 pandemic during the last months of data collection might have influenced our results. All the index consultations occurred prior to the pandemic, but the follow-up periods for 1/3 of the cases ended after the pandemic started. Thus, although COVID-19 was not a diagnostic possibility during the index consultation, patients were exposed to the new disease during the follow-up period. Moreover, the Spanish National Health System stopped all non-urgent activity during the first months of the pandemic, probably delaying some cancer diagnoses 43,44 . We considered COVID-19 to be a serious disease when patients suffered complications, needed hospitalization, or died. We found 36 confirmed COVID-19 cases among participants, including 35 mild cases and 1 with pneumonia. Otherwise, the prevalence of cancer and the other serious diseases was comparable between the present study and previous relevant reports 26 .

Comparison with Existing Literature
The consistent appearance of gut feelings during the consultations indicates that GPs habitually use intuitive reasoning. The intuitive decision-making system is fast, automatic, effortless, and difficult to control 45,46 . In primary care health centers, which are characterized by massive numbers of consultations and strict time constraints, the contribution of "intuition" to the decision-making process is obvious. The high prevalence of gut feelings involving a sense of reassurance is in line with the low probability of serious disease in primary care, as many complaints are innocent and temporary indispositions 47 .
Studies measuring gut feelings with non-validated tools found a much lower prevalence of the sense of alarm 26,37,48 . The Hawthorne effect 49 , which is how the awareness of being studied may impact the behavior of the study subjects 50 , should be considered a possible source of bias. Although the GPs did not know if their answers to the GFQ would reflect a sense of alarm or a sense of reassurance, they might have changed their behavior and been more suspicious when interpreting the patient's symptoms during their participation in the study, potentially leading to an overestimation of the sense of alarm.
If the high prevalence of the sense of alarm found in our study using the GFQ was an overestimation, then it had inevitably influenced the predictive value. Barais et al. 13 used the GFQ among French GPs in patients consulting for dyspnea or chest pain; the authors observed that gut feelings were present in 99.15% of consultations, with 35% of them representing a sense of alarm and 65% representing a sense of reassurance. The higher prevalence of a sense of alarm, compared to that found our study, can be easily explained because these authors selected patients with dyspnea and chest pain, who have a much higher risk of serious outcome. We assume that the Hawthorne effect is at least partially responsible for our finding that a high proportion of the gut feelings experienced by GPs corresponded to a sense of alarm. To mitigate this effect, several strategies have been proposed that should be considered in future research on gut feelings, such as assuring the participants that the objective of the study is to identify gut feelings without judging the clinician's performance, triangulating the collection of information, and collecting information over long periods of time while discarding the first set of data collected 51 . Another less probable explanation could be that case vignettes from real practice were used to validate the GFQ and perhaps the cut-off values need to be refined in the context of real consultations 39 .
Our results showed that the kind of gut feelings was influenced by some characteristics of the GPs, patients, and/or of GP general practitioner, SD standard deviation, NFC need for cognition, FI faith in intuition the consultations. The style of reasoning (rational or intuitive) did not appear to generally affect the occurrence of gut feelings. However, somewhat to our surprise, GPs prone to rational reasoning had more frequent experiences of a sense of alarm. The Spanish non-urban population is older than the urban population, which could explain the higher prevalence of GPs having a sense of alarm in non-urban areas 42 . The presence of at least one cancer-associated symptom increased the prevalence of a sense of alarm, which is consistent with previous published evidence 36,37 . The sense of alarm may activate the diagnostic process by stimulating a GP to formulate and weigh working hypotheses involving a serious outcome 11 that is, the sense of alarm was associated with a longer consultation. Doctors of different specialties have acknowledged the presence of gut feelings in their diagnostic process 14,15,21 , although they considered that it is more frequent and appropriate among GPs because of the greater number of diagnostic possibilities a GP faces after a patient. Intuition played a greater role and was more widely accepted in specialties like general internal medicine, pediatrics, and psychiatry 15 . These specialties, along with family medicine, are the ones where physicians have a higher perception of uncertainty in their daily work 2 .
We found an increasing number of GP visits, tests, and referrals for patients with whom the GP experienced a sense of alarm. Our results are comparable with those observed by Hjertholm et al. 26 , where the number of GP consultations, primary-care specialist, and diagnostic imaging increased in the 2-month period after a consultation when the GP had a suspicion of serious disease, while the use of hospital services (inpatient and outpatient increased both 2 and 6 months after).
These findings could be expected, as this gut feeling induces the diagnostic process of gathering more data.
Regarding the diagnostic value of gut feelings, other authors also observed increased probabilities of serious disease after a sense of alarm. Hjertholm et al. 26 found that the risk of a diagnosis of cancer or another serious disease was 2.98 higher 2 months after the index consultation in the case of an sense of alarm. Ingeman et al. 52 found that 24% of patients with whom the GP felt a gut feeling of cancer were finally diagnosed with cancer. A meta-analysis on the diagnostic utility of gut feelings in diagnosing cancer in primary care showed that a gut feeling associated with cancer increased the odds of cancer four times 20 . These results justified the decision made in Denmark and the UK to accept the GPs' gut feeling as a valid reason for referring a patient to specific pathways of cancer diagnosis 23,24 .
The value of LR+ of a sense of reassurance (1.95) implies that the pretest probability of serious disease decreases from 8.4 to 5.5%, and thus did not contribute greatly to ruling out cancer or a serious disease, so still the GP has to consider several hypotheses before discarding a serious diagnosis. The LR+ of a sense of alarm (2.8) modified the pretest probability for cancer or serious disease from 8.4 to 18.4%. As proposed by Barais 51 , given the very low prevalence of serious diseases, LR+ values between 2 and 5 could be of interest since they increase the probability of serious disease by 15-30%. Therefore, a sense of alarm should be taken seriously in general practice, and clinicians should follow up patients with an analytical reasoning track.

CONCLUSIONS
The results of our study showed that gut feelings are substantially present in primary care. Gut feelings, especially a sense of alarm, contribute valuably to the diagnostic process and must be taken seriously when seeing patients with a new reason for encounter, which should lead to a diagnostic review. A gut feeling is a substantial part of clinical reasoning and supports GPs in timely diagnosing cancer or other serious diseases. Rational reasoning-prone GPs did not differ from   their intuitive reasoning-prone colleagues with respect to experiencing gut feelings. Medical students must be trained in becoming aware of their own gut feelings and how to deal with them. Further research should focus on the significance of gut feelings related to specific symptoms and signs, and on the factors that could increase the prognostic and diagnostic value of GPs' gut feelings.