A clinical risk matrix for obstructive sleep apnea using Bayesian network approaches

  • Daniela Ferreira-Santos
  • Pedro Pereira Rodrigues
Open Access
Regular Paper


In obstructive sleep apnea, respiratory effort is maintained but ventilation decreases or disappears due to partial or total occlusion of the upper airway. This condition affects about 4% of men and 2% of women worldwide. This study aimed to define an auxiliary diagnostic method that can support the decision to perform polysomnography, based on risk and diagnostic factors. Our sample comprised patients who underwent polysomnography between January and May 2015. Two Bayesian classifiers were used to build the models: Naïve Bayes and Tree Augmented Naïve Bayes, using either 38 variables identified by literature review or a selection of 6 of them. Area under the ROC curve, sensitivity, specificity and predictive values were evaluated using leave-one-out and cross-validation techniques. Of a total of 241 patients, 194 fulfilled the inclusion criteria: 123 (63%) were male, the mean age was 58 years, 66 (34%) patients had a normal result and 128 (66%) a diagnosis of obstructive sleep apnea. The cross-validated AUCs for each model were: NB38: 69.2%; TAN38: 69.0%; NB6: 74.6% and TAN6: 63.6%. In the risk matrix, female gender presented a starting rate of 8%, compared with 20% in male gender, 2.5 times higher. The high (34%) proportion of normal results confirms the need for a pre-evaluation prior to polysomnography, making the search for a validated model to screen patients with suspicion of obstructive sleep apnea essential, especially at primary care level.


Keywords: Obstructive sleep apnea, Risk factors, Diagnosis, Bayesian network, Clinical model, Sensitivity, Specificity

1 Introduction

The substantial medical, social, and economic consequences of untreated obstructive sleep apnea (OSA), the overwhelming number of patients who have escaped clinical detection, and the likelihood of successful treatment strongly justify screening [6]. In this clinical setting, diagnostic models need to have a high sensitivity, as false negatives should be avoided, to prevent excluding patients with moderate or severe OSA from performing polysomnography (PSG) [24, 34], the standard test for the final diagnosis of OSA.

This study aimed to define auxiliary diagnostic methods that can support the decision to perform PSG, based on risk and diagnostic factors, by means of interactive models or a risk matrix. The secondary objectives were to describe the population; develop and validate a Bayesian network-based risk matrix in the study population; optimize referral for PSG; and produce a Bayesian network model for daily use in the primary care setting.

This paper is organized as follows. Section 2 presents the background and related work. Section 3 presents the research methodology. Section 4 gives an overview of the achieved results, and Sects. 5 and 6 interpret the results of the work and provide the main findings and recommendations.

2 Background

Apnea is defined as the complete cessation of airflow for at least 10 s, while a hypopnea is a reduction in airflow (30–50%) that is followed by an awakening or a decrease in oxyhemoglobin saturation (3–4%) [6, 22]. There are 3 types of apneas: central, mixed and obstructive. Central sleep apnea is a reduction in the respiratory effort resulting in reduced or absent ventilation, while mixed apnea begins with central apneas that lead to obstructive events [5, 22]. In OSA, respiratory effort is maintained but ventilation decreases or disappears because of partial or total occlusion of the upper airway [6, 22, 23, 28, 32].

OSA severity is assessed with the apnea–hypopnea index (AHI), obtained through PSG, which is the number of apneas and hypopneas per hour of sleep [22]. Recommendations from the American Academy of Sleep Medicine state that OSA is present when AHI \(\ge \) 5. It can be classified as mild (AHI: 5–15), moderate (AHI: 15–30), or severe (AHI \(\ge \) 30) [6, 7, 13, 22]. Approximately 30% of the general public is affected by a significant sleep problem, often of long standing [39]. OSA affects about 4% of men and at least 2% of women worldwide [5, 7, 13, 18, 39]. The signs, symptoms, and consequences of OSA are a direct result of repetitive upper-airway collapse. This leads to sleep fragmentation, hypoxemia, hypercapnia, marked swings in intrathoracic pressure, and increased sympathetic activity.
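The severity grading above can be written as a small lookup. This is only an illustrative sketch: the function name `classify_ahi` is hypothetical, and the shared boundaries (15 and 30) are assigned to the more severe class, which is an assumption consistent with the categories listed later in Sect. 3.3.

```python
def classify_ahi(ahi: float) -> str:
    """Map an apnea-hypopnea index (events per hour of sleep) to a severity class.

    Assumption: boundary values 15 and 30 fall into the more severe class.
    """
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


if __name__ == "__main__":
    for value in (2, 5, 14.9, 15, 30):
        print(value, classify_ahi(value))
```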

Reported risk factors for developing OSA include different groups of variables, such as demographic, clinical history, comorbidities and even factors collected during the consultation, for example, male gender [1, 10, 16, 20, 21, 22, 27, 30, 36, 42, 43, 44, 45], aging [1, 16, 20, 21, 22, 27, 30, 36, 42, 43, 44, 45], obesity [1, 10, 12, 16, 20, 21, 22, 30, 36, 42, 43, 45], history of smoking [1, 10, 12, 22, 30, 45], increased neck [1, 10, 16, 20, 21, 22, 27, 30, 36, 43, 45] and abdominal [30, 45] circumferences, arterial [1, 12, 16, 20, 22, 30, 36, 43, 44, 45] and pulmonary [12, 30] hypertension, atrial fibrillation [30, 36], stroke [1, 22, 30, 45], myocardial infarction [1, 30, 45], and high-risk driving populations (such as truck drivers) [10, 12, 30, 42].

Diagnostic criteria for OSA are based on clinical signs and symptoms determined during a sleep consultation, which includes a sleep-oriented history, physical examination, and findings identified in an objective exam [13, 28]. It should also include an evaluation, for example, of snoring, witnessed apneas, gasping/choking episodes, daytime sleepiness severity with the Epworth Sleepiness Scale (ESS), nocturia, and morning headaches [22]. The diagnostic methods available are PSG and home testing with portable monitors (which tends to underestimate the severity) [3, 6, 7, 13, 18, 22, 39]. PSG is time consuming, labor intensive, limited to urban areas, costly and faces long waiting lists [18, 22], so many studies have tried to tackle these problems. Rodsutti et al. [35] conducted a study to develop and validate a decision rule (based on risk factors) that would allow prioritization on the waiting list, using univariate analysis and multiple logistic regression, achieving a scoring scheme or color-coded tables for easy clinical application. Sun et al. [40] used three questionnaires to improve sensitivity and specificity for discrimination of moderate to severe OSA, based on a genetic algorithm. Montoya et al. [37], working with several epidemiological and clinical variables, sought alternatives to PSG, using logistic regression analysis and multivariate logistic regression to determine the best model for distinguishing OSA patients from healthy ones. In the end, the work produced an algorithm to predict the AHI of a new patient.

All of the previous studies focused on traditional, simpler methods for decision support. Nowadays, prediction models are generated by artificial intelligence, using decision trees, neural networks, support vector machines, and Bayesian networks [9]. All should have good performance, good ability to handle data entry errors or omissions, transparency of diagnostic knowledge, ability to explain decisions, and the algorithms should be able to reduce the number of tests needed for making a reliable diagnosis [26]. While searching, we found studies that attempted to apply these new techniques in OSA. One study [14] tackled the tedious and time-consuming task of analyzing PSG records, automating both the detection and classification of sleep apneas, through analysis of wavelets and Bayesian neural networks. The other [19] classified patients with possible diagnosis of OSA into groups according to the severity of the disease, using a decision-tree-producing algorithm based on nonlinear analysis of three respiratory signals instead of full PSG. However, none of the approaches found relied only on clinical and demographic variables that could be used earlier in the healthcare process flow, as they require diagnostic data from PSG.

In fact, technologies exist to tackle the problem at later stages, e.g., disease management [31], but there is a clear lack of solutions for the early diagnosis of OSA. Bayesian networks have been used in several clinical domains, especially given their balance between accurate predictions and their specific interpretability in the clinical domain, resembling the human reasoning in a probabilistic way, along with their ability to produce predictions with missing values, presenting high performance in areas like pneumonia and breast cancer [26].

3 Methods

This study was designed according to the Standard for Reporting Diagnostic accuracy studies (STARD) list [4], updated in 2015. Its guiding principle was to select items that, when reported, would help readers to judge the potential for bias in the study, to appraise the applicability of the study findings and the validity of conclusions and recommendations. STARD guidelines have been generally used for adequately validating diagnostic tools.

3.1 Patients

We included all patients referred to perform a polysomnography at the Vila Nova de Gaia/Espinho hospital center sleep laboratory, between January and May 2015. Inclusion criteria were defined as follows: patients aged more than 18 years old and with suspicion of OSA. On the other hand, patients already diagnosed (performing therapeutic studies), patients with suspicion of another sleep disease, patients with severe lung diseases or neurological conditions, and pregnant women were excluded. In case of duplicate exams, the exam with the best sleep efficiency was selected.

3.2 Variables

A literature review on PubMed (April 19, 2015) was performed to define the most relevant variables to be collected from medical and/or sleep laboratory records. The search contained “risk factors”, “sleep apnea, obstructive” and “diagnosis” as MeSH terms, obtaining 1397 articles, from which 48 were used for variable definition. A total of 38 predictive variables were collected: demographic variables: gender, race and age; physical examination: body mass index (BMI), neck (NC) and abdominal circumferences (AC) and craniofacial and upper-airway abnormalities (CFA); clinical history: daytime sleepiness, snoring, witnessed apneas, choking/gasping, refreshing sleep, restless sleep, humor alterations, concentration decrease, morning headaches, decreased libido, motor vehicle crashes, drivers, nocturia, alcohol, smoking, coffee, sedatives, family history/genetics and ESS; comorbidities: atrial fibrillation (AF), stroke, myocardial infarction (MI), arterial and pulmonary hypertension (PHT), congestive heart failure (CHF), diabetes, dyslipidemia, renal failure, hypothyroidism, gastroesophageal reflux, depression and anxiety.

3.3 Data collection and preprocessing

Medical and/or sleep laboratory records were retrospectively collected for the period between January 1, 2015 and May 31, 2015. Clinical information of each patient (39 variables) was extracted from the central clinical records along with the sleep laboratory data, making all the clinical files available. We screened for missing information but, although we had all the records, some predictive variables were not present or described. This raised a problem in the construction of the Bayesian network models, creating the need to make an assumption: when learning the networks’ structure (and only then), if the variable was not present in the records, we assumed it was absent, hence imputing category "No". Model development and validation were performed in this sample. The outcome measure was the clinical diagnosis, obtained from AHI, categorized into normal (AHI < 5) or OSA (AHI\(\ge \) 5).

We performed a preprocessing of the data and all the continuous variables were categorized:
  • BMI (\(< 30\,\hbox {kg}/\hbox {m}^{2}\): normal, \(\ge 30\,\hbox {kg}/\hbox {m}^{2}\): obese);

  • female NC (\(\le 37\,\hbox {cm}\): normal, \(> 37\,\hbox {cm}\): increased); male NC (\(\le 42\,\hbox {cm}\): normal, \(> 42\,\hbox {cm}\): increased);

  • female AC (\(\le 80\) cm: normal, \(>80\) cm: increased); male AC (\(\le 94\,\hbox {cm}\): normal, \(> 94\,\hbox {cm}\): increased);

  • age (\(< 40\), 40–54, 55–69, \(\ge 70\) years);

  • smoking (yes, no, ex-smoker);

  • ESS (0–10: normal, 11–24: daytime sleepiness);

  • AHI (0–4: normal, 5–14: mild, 15–29: moderate, \(\ge \) 30: severe).
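The cut-offs listed above for the predictive variables can be sketched in code as follows. The function names and the string labels for gender are hypothetical choices made for illustration; only the thresholds come from the list above.

```python
def categorize_bmi(bmi: float) -> str:
    # BMI < 30 kg/m^2: normal; >= 30 kg/m^2: obese
    return "obese" if bmi >= 30 else "normal"


def _by_gender(value_cm: float, gender: str, female_max: float, male_max: float) -> str:
    # Shared helper: values above the gender-specific limit count as increased.
    limits = {"female": female_max, "male": male_max}
    if gender not in limits:
        raise ValueError("gender must be 'female' or 'male'")
    return "increased" if value_cm > limits[gender] else "normal"


def categorize_nc(nc_cm: float, gender: str) -> str:
    # Neck circumference: female <= 37 cm, male <= 42 cm are normal
    return _by_gender(nc_cm, gender, 37, 42)


def categorize_ac(ac_cm: float, gender: str) -> str:
    # Abdominal circumference: female <= 80 cm, male <= 94 cm are normal
    return _by_gender(ac_cm, gender, 80, 94)


def categorize_age(age_years: float) -> str:
    if age_years < 40:
        return "<40"
    if age_years < 55:
        return "40-54"
    if age_years < 70:
        return "55-69"
    return ">=70"


def categorize_ess(score: int) -> str:
    # Epworth Sleepiness Scale: 0-10 normal, 11-24 daytime sleepiness
    if not 0 <= score <= 24:
        raise ValueError("ESS score must be between 0 and 24")
    return "daytime sleepiness" if score >= 11 else "normal"
```

For example, a 58-year-old man with a BMI of 29, a neck circumference of 43 cm and an ESS score of 11 would be coded as age "55-69", BMI "normal", NC "increased" and ESS "daytime sleepiness".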

3.4 Bayesian networks

Generally, a Bayesian network represents the joint distribution of a set of variables, in which the independence assumptions between variables are encoded by a directed acyclic graph [29]. Each variable is represented by a node in the graph and depends on the set of variables represented by its parent nodes. This dependence is represented by a conditional probability table that describes the probability distribution of each variable, given its parent variables. Naïve Bayes (NB), which assumes conditional independence among factors given the outcome, and Tree Augmented Naïve Bayes (TAN) [17], which allows each factor to depend on one additional factor, were the Bayesian network classifiers used to build our models. They were chosen given their previous results in other clinical domains [11, 25]. Four models were evaluated and compared, differing in the classifier (NB and TAN) and the number of predictive factors (38 or 6).
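To make the Naïve Bayes factorization concrete, the following minimal sketch computes a posterior probability of OSA from a prior and per-variable conditional probability tables. The function name and data layout are hypothetical. The numbers are crude proportions read from this study's descriptive table (prevalence 66%, male gender 72% vs. 47%, witnessed apneas 75% vs. 60%) and are used only for illustration; they are not the fitted parameters of the NB6 model.

```python
def naive_bayes_posterior(prior, cpts, evidence):
    """Posterior over classes under the Naive Bayes assumption.

    prior:    {class_label: P(class)}
    cpts:     {variable: {class_label: {value: P(value | class)}}}
    evidence: {variable: observed value}; unobserved variables are simply
              left out, which marginalizes them in a Naive Bayes model.
    """
    scores = {}
    for label, p_class in prior.items():
        score = p_class
        for variable, value in evidence.items():
            score *= cpts[variable][label][value]
        scores[label] = score
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


# Illustrative parameters only (crude proportions, not the fitted model).
PRIOR = {"OSA": 0.66, "normal": 0.34}
CPTS = {
    "gender": {
        "OSA": {"male": 0.72, "female": 0.28},
        "normal": {"male": 0.47, "female": 0.53},
    },
    "witnessed_apneas": {
        "OSA": {"yes": 0.75, "no": 0.25},
        "normal": {"yes": 0.60, "no": 0.40},
    },
}

if __name__ == "__main__":
    full = naive_bayes_posterior(PRIOR, CPTS, {"gender": "male", "witnessed_apneas": "yes"})
    partial = naive_bayes_posterior(PRIOR, CPTS, {"gender": "male"})
    print(round(full["OSA"], 3), round(partial["OSA"], 3))
```

A TAN model would differ in that each table may also condition on one other predictive factor, so the product runs over P(value | class, parent value) instead.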

3.5 Statistical analysis

Diagnostic models were defined using a Bayesian network built over the set of available variables, and the area under the ROC curve (AUC) was computed. The models were validated by comparing the AUC in the derivation cohort with those calculated from a leave-one-out and a 10 times twofold stratified cross-validation (for variability assessment with independent training and testing). We used R software to: (a) perform descriptive and associative analysis, using packages gmodels [15], epitools [2] and MASS (modern applied statistics with S) [41]; (b) learn and validate the models, using packages bnlearn [38] and gRain; and (c) analyze AUC, using package pROC [33]. SamIam [8] software was used to visually consult the conditional probabilities, given the outcome.
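The validation scheme can be illustrated with a small sketch. The study used R (bnlearn, gRain, pROC); the Python below is a stand-in, not the authors' code, and the function names are hypothetical. It shows a rank-based AUC and one way to generate the 10 repetitions of a stratified twofold split, where each half serves once for training and once for testing.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a positive case scores higher than a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are needed to compute an AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_stratified_twofold(labels, repeats=10, seed=0):
    """Yield (train_indices, test_indices) pairs; two pairs per repetition."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    for _ in range(repeats):
        fold_a, fold_b = [], []
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            half = len(shuffled) // 2  # split each class separately (stratification)
            fold_a.extend(shuffled[:half])
            fold_b.extend(shuffled[half:])
        yield fold_a, fold_b
        yield fold_b, fold_a


if __name__ == "__main__":
    labels = [1] * 128 + [0] * 66  # class sizes of the derivation cohort
    splits = list(repeated_stratified_twofold(labels))
    print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Leave-one-out validation follows the same pattern with each patient forming a test set of size one.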

The application of the selected models generated in this work can be visualized by means of (a) Bayesian inference (TAN6) and (b) an appropriately defined risk matrix (NB6). The models with selected variables (NB6 and TAN6) were built after applying the Chi-square test (unless otherwise specified) to all 38 variables. The selected variables were those with a significant univariate association with the outcome, considering a 5% significance level, or a 10% significance level if at least 5 patients were observed in each outcome category (Normal or OSA), and for which no quality problems were suspected.
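As an illustration of the univariate screening step, the sketch below computes a Pearson Chi-square statistic for a 2×2 table, using gender versus outcome. The male counts (92 OSA, 31 normal) are taken from the descriptive table; the female counts (36 and 35) are derived by subtraction from the group sizes. The function name is hypothetical, and no continuity correction is applied, so the result may differ slightly from what a given statistical package reports with its default settings.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square test (1 degree of freedom, no continuity correction).

    Table layout:        outcome+  outcome-
        factor present      a         b
        factor absent       c         d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a Chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    stat, p = chi_square_2x2(92, 31, 36, 35)  # male/female by OSA/normal
    print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```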

To choose which variables should be included in the risk matrix, we evaluated the variables with statistical significance and the odds ratios obtained in the multivariate logistic regression; those with higher clinical relevance were chosen as factors. Each cell of the matrix represents the marginal posterior outcome probability estimate for that subgroup of patients. The precision of such estimates is given by a 95% credible interval, computed from a Monte Carlo simulation of one hundred thousand samples from the derived joint probability model (i.e., the NB6). The risk values in each cell of the matrix represent the expected risk for a patient in that subgroup, while the credible interval encloses 95% of risk estimates for patients in that subgroup (i.e., only 5% of patients in that subgroup have a risk estimate outside the credible interval). We believe that this approach is more interesting from the clinical point of view than the usual one, in which a confidence interval (CI) of the expected risk of all patients in each subgroup is computed and presented. To assess the discriminative ability of the risk matrix for the outcome, specific cutoff values were chosen after assessing the AUC of the derivation cohort, aiming at a sensitivity of 95%, to allow a rule-out approach that avoids false negatives.
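One possible reading of the Monte Carlo procedure is sketched below, under stated assumptions. Patients are sampled from a small Naïve Bayes model, each sampled patient receives a risk estimate from all of their variables, and the estimates are then summarized per matrix cell by their mean and their 2.5th and 97.5th percentiles. The model here has only three binary variables with hypothetical parameters, of which two define the matrix cells; it is not the NB6 model, and all names are illustrative.

```python
import random

PRIOR_OSA = 0.66
# Hypothetical P(variable = 1 | class); key 1 is the OSA class, key 0 the normal class.
CPT = {
    "male": {1: 0.72, 0: 0.47},
    "witnessed_apneas": {1: 0.75, 0: 0.60},
    "nocturia": {1: 0.69, 0: 0.50},
}
MATRIX_FACTORS = ("male", "witnessed_apneas")  # variables that define a cell


def posterior_osa(evidence):
    """P(OSA | evidence) under the Naive Bayes factorization."""
    p_osa, p_normal = PRIOR_OSA, 1 - PRIOR_OSA
    for variable, value in evidence.items():
        q_osa, q_normal = CPT[variable][1], CPT[variable][0]
        p_osa *= q_osa if value else 1 - q_osa
        p_normal *= q_normal if value else 1 - q_normal
    return p_osa / (p_osa + p_normal)


def sample_patient(rng):
    """Draw the class first, then each variable given the class."""
    osa = 1 if rng.random() < PRIOR_OSA else 0
    return {v: int(rng.random() < probs[osa]) for v, probs in CPT.items()}


def risk_matrix(n_samples=100_000, seed=0):
    """Return {cell: (mean risk, 2.5th percentile, 97.5th percentile)}."""
    rng = random.Random(seed)
    cells = {}
    for _ in range(n_samples):
        patient = sample_patient(rng)
        cell = tuple(patient[f] for f in MATRIX_FACTORS)
        cells.setdefault(cell, []).append(posterior_osa(patient))
    summary = {}
    for cell, risks in cells.items():
        risks.sort()
        low = risks[int(0.025 * (len(risks) - 1))]
        high = risks[int(0.975 * (len(risks) - 1))]
        summary[cell] = (sum(risks) / len(risks), low, high)
    return summary


if __name__ == "__main__":
    for cell, (mean, low, high) in sorted(risk_matrix().items()):
        print(cell, f"{mean:.2f} [{low:.2f}-{high:.2f}]")
```

In this sketch the cell mean approximates the marginal posterior given only the matrix factors, while the interval reflects how much the remaining variables move individual risk estimates within the cell.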

This study was approved by the Ethics Commission of Vila Nova de Gaia/Espinho hospital center, following the Declaration of Helsinki.

4 Results

4.1 Population characteristics and analyzed outcome

We considered 241 patients for inclusion, of whom 47 were excluded: 7 duplicates, 8 with a missing clinical file, 19 under eighteen years old, and 13 therapeutic studies. Of the 194 patients included, 123 (63%) were male and the mean age was 58 years; 66 patients (34%) had a normal result, with a mean age of 50 years. Of the 128 patients with OSA (66%) (mean age 62 years), 63 (33% of all included patients) were categorized as mild, 32 (16%) as moderate, and 33 (17%) as severe.

Table 1 describes the dataset obtained from the medical and/or sleep laboratory records.

For craniofacial and upper-airway abnormalities, described as part of the physical examination, the percentage of affected patients was higher in the OSA group than in the group without OSA (97 vs. 79%, p value \(\le 0.05\)). Other variables show the same effect, such as witnessed apneas, nocturia, alcohol consumption, atrial fibrillation, stroke, myocardial infarction and driver. The opposite effect emerges in daytime sleepiness (77% in the OSA group vs. 94% in the group without OSA, p value \(\le 0.05\)), concentration decrease and ESS, which contradicts the literature and the inherent meaning of the variables.
Table 1

Descriptive analysis of the derivation cohort before imputation (absolute and relative frequencies are presented and p values result of Chi-square test, unless otherwise specified). Variables with high number of missing values (e.g., snoring) are described with two proportions: the proportion in the real dataset, and the proportion in the imputed dataset


| Variable | Total (\(n=194\)) | OSA (\(n=128\)) | Normal (\(n=66\)) | Crude OR [CI 95%] | p value | Adjusted OR [CI 95%] |
|---|---|---|---|---|---|---|
| Male gender | 123 (63) | 92 (72) | 31 (47) | 2.72 [1.55–5.28] | | 3.72 [1.81–7.90] |
| | 191 (99) | 126 (98) | 65 (99) | | | |
| | 3 (2) | 2 (2) | 1 (2) | 0.51 [0.11–6.68] | | |
| Age, mean (IQR) | 58 (50–67) | 62 (54–70) | 50 (41–61) | 1.08 [1.05–1.11] | \(< 0.001\)* | |
| Age \(< 40\) | 19 (10) | 5 (4) | 14 (21) | | \(< 0.001\) | |
| Age 40–54 | 56 (29) | 29 (23) | 27 (41) | 2.42 [0.93–8.59] | | 2.26 [0.66–8.57] |
| Age 55–69 | 79 (41) | 59 (46) | 20 (30) | 6.56 [2.54–23.04] | | 7.18 [2.17–27.13] |
| Age \(\ge 70\) | 40 (21) | 35 (27) | 5 (8) | 13.61 [4.50–64.37] | | 17.83 [4.18–91.19] |
| BMI, median (IQR) | 29 (26–34) | 30 (26–34) | 29 (26–33) | 1.02 [0.98–1.08] | | |
| BMI \(\ge 30\) (obese) | 91 (47) | 64 (50) | 27 (41) | 1.37 [0.79–2.60] | | |
| NC, median (IQR) | 42 (39–45) | 42 (40–46) | 40 (37–44) | 1.13 [1.05–1.22] | | |
| Increased NC | 107 (55) | 77 (60) | 30 (46) | 1.72 [0.99–3.27] | | 2.01 [1.00–4.09] |
| AC, median (IQR) | 107 (99–113) | 108 (100–114) | 105 (97–111) | 1.02 [1.00–1.05] | | |
| Increased AC | 180 (93) | 120 (94) | 60 (91) | 1.31 [0.52–4.43] | | |
| Snoring | 175 (90)–(98) | 114 (97) | 61 (100) | 0.00 [0.01–5.23] | | |
| Witnessed apneas | 104 (54)–(70) | 72 (75) | 32 (60) | 1.83 [0.96–3.99] | | 1.22 [0.59–2.54] |
| | 76 (39)–(59) | 49 (59) | 27 (60) | 0.90 [0.46–2.01] | | |
| Vehicle crashes | 6 (3)–(8) | 2 (4) | 4 (16) | 0.16 [0.04–1.20] | | |
| Refreshing sleep | 45 (23)–(29) | 31 (32) | 14 (25) | 1.30 [0.67–2.89] | | |
| Humor alterations | 2 (1) | 1 (100) | 1 (100) | | | |
| Nocturia | 70 (36)–(63) | 53 (69) | 17 (50) | 2.00 [0.96–4.94] | | 1.51 [0.71–3.27] |
| Restless sleep | 76 (39)–(77) | 48 (75) | 28 (80) | 0.68 [0.29–2.06] | | |
| Decreased libido | 18 (9)–(95) | 13 (100) | 5 (83) | 2.17 [0.26–210.02] | | |
| Morning headaches | 66 (34)–(59) | 44 (61) | 22 (55) | 1.19 [0.59–2.78] | | |
| Alcohol consumption | 96 (50)–(61) | 65 (66) | 31 (53) | 1.63 [0.89–3.30] | | |
| | 32 (17) | 20 (17) | 12 (19) | 0.78 [0.39–1.85] | | |
| | 56 (29)–(30) | 40 (33) | 16 (25) | | | |
| | 50 (26)–(93) | 27 (96) | 23 (89) | 1.69 [0.37–19.98] | | |
| | 16 (8)–(9) | 13 (12) | 3 (5) | 1.96 [0.70–7.97] | | |
| | 120 (62)–(87) | 76 (88) | 44 (85) | 1.23 [0.52–3.70] | | |
| Daytime sleepiness | 114 (59)–(84) | 65 (77) | 49 (94) | 0.20 [0.07–0.79] | | |
| | 2 (1)–(67) | 0 (0) | 2 (100) | 0.00 [0.00–5.49] | | |
| ESS, median (IQR) | 11 (5–15) | 10 (3–13) | 13 (7–16) | 0.91 [0.87–0.96] | \(< 0.001^+\) | |
| ESS: daytime sleepiness | 102 (53) | 57 (45) | 45 (68) | 0.36 [0.20–0.71] | | |
| Concent. Decrease | 42 (22)–(54) | 20 (43) | 22 (71) | 0.28 [0.12–0.81] | | |
| CFA | 43 (22)–(92) | 32 (97) | 11 (79) | 4.00 [0.87–50.10] | | 2.03 [0.86–5.12] |
| Atrial fibrillation | 15 (8)–(22) | 15 (31) | 0 (0) | 8.82 [1.08–334.34] | | |
| | 17 (9)–(23) | 14 (29) | 3 (12) | 2.14 [0.73–9.46] | | |
| | 14 (7)–(20) | 12 (26) | 2 (9) | 2.22 [0.67–12.47] | | |
| Pulm. Hypertension | 1 (1)–(25) | 1 (50) | 0 (0) | 1.00 [0.11–220.62] | | |
| Cong. Heart Fail. | 20 (10)–(25) | 16 (28) | 4 (17) | 1.52 [0.56–5.83] | | |
| | 47 (24)–(72) | 33 (72) | 14 (74) | 0.79 [0.29–3.02] | | |
| | 99 (51)–(93) | 76 (95) | 23 (89) | 1.90 [0.58–11.03] | | |
| Renal failure | 3 (2)–(30) | 2 (33) | 1 (25) | 0.60 [0.11–15.16] | | |
| | 2 (1)–(33) | 1 (50) | 1 (25) | 0.75 [0.12–45.17] | | |
| Gastroesoph. Reflux | 3 (2)–(21) | 2 (22) | 1 (20) | 0.50 [0.10–10.35] | | |
| | 123 (63)–(94) | 91 (96) | 32 (89) | 2.21 [0.72–11.04] | | |
| | 53 (27)–(95) | 28 (97) | 25 (93) | 1.08 [0.23–15.11] | | |
| AHI: mild | 63 (33) | | | | | |
| AHI: moderate | 32 (16) | | | | | |
| AHI: severe | 33 (17) | | | | | |
OSA obstructive sleep apnea, OR odds ratio, CI confidence interval, IQR interquartile range, BMI body mass index, NC neck circumference, AC abdominal circumference, ESS Epworth somnolence scale, Concent. Decrease concentration decrease, CFA craniofacial and upper-airway abnormalities, MI myocardial infarction, Pulm. Hypertension pulmonary hypertension, Cong. Heart Fail. congestive heart failure, AHI apnea–hypopnea index

\(^\#\)Fisher’s exact test, *Independent T test, \(^+\)Mann–Whitney U test

4.2 Bayesian diagnostic models

To unveil the interdependent relationships between the analyzed outcome (OSA) and the 38 variables considered, Bayesian network-based models were built. R software was used to learn the probabilities from the dataset described in Table 1, using the aforementioned imputation strategy for structure learning. This resulted in four models: NB38, TAN38, NB6 and TAN6, with TAN structures presented in Figs. 1 and 2 and NB structures presented in supplementary figures S1 and S2. The ROC curves of each model are presented in Fig. 3, demonstrating in-sample AUCs of 82% [76–88%] for NB38, 90% [86–94%] for TAN38, 79% [73–86%] for NB6, and 79% [73–86%] for TAN6.

Clinically speaking, interpreting TAN38 is a hard task, given the time needed to apply it in a primary care consultation. TAN6 therefore contains only the relevant variables, that is, gender, witnessed apneas, age, nocturia, CFA, and NC. Although AC, concentration decrease, ESS, and daytime sleepiness would be eligible for the final model, categorized AC lost significance, while the remaining three presented contradictory results, raising suspicion about data collection quality.
Fig. 1

Tree Augmented Naïve Bayes with 38 variables (TAN38) representing the relationship between the outcome (OSA) and each variable, and relationships between predictive factors

Fig. 2

Tree Augmented Naïve Bayes with 6 variables (TAN6) representing the relationship between the outcome (OSA) and each variable, and relationships between predictive factors. The bars within each variable represent the prior marginal probabilities for each variable’s category. Arrows represent association between variables, but do not convey any causal relationship, the association between the outcome and each of the remaining variables being imposed on the model

Fig. 3

Receiver operating characteristics analyses and area under the curve values for NB38, TAN38, NB6 and TAN6, as well as for the internal validation procedures

4.3 Model validation

The Bayesian models were validated following an internal approach, which consisted of two different tests (leave-one-out and 10 times twofold stratified cross-validation). ROC analysis was performed independently for the derivation cohort, and the respective AUCs, along with their 95% CIs, are illustrated in Fig. 3. The AUC values of the leave-one-out nearly overlapped those of the cross-validation. Furthermore, the overall discrimination power was high for both strategies, using leave-one-out. Based on NB6, the best model according to the validation results (Table 2), the following cutoff was determined: values above 32.0% were considered to be a positive test result, i.e., outcome presence. Table 2 presents the performance of the chosen cutoffs for all models, presenting sensitivities of 90.0% [88.2–91.8%] for NB38, 81.9% [77.7–86.0%] for TAN38, 94.1% [92.9–95.4%] for NB6, and 90.2% [88.0–92.4%] for TAN6.
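The way a risk cutoff translates into the validity measures can be sketched as follows. The function name is hypothetical and the toy data in the usage example are invented for illustration; only the 32% cutoff and the rule that values above it count as a positive test come from the text.

```python
def diagnostic_metrics(risks, labels, cutoff=0.32):
    """Validity measures for a rule that calls risk > cutoff a positive test.

    risks:  predicted probabilities of OSA (0 to 1)
    labels: 1 for OSA, 0 for normal
    """
    tp = fp = tn = fn = 0
    for risk, label in zip(risks, labels):
        positive = risk > cutoff
        if positive and label == 1:
            tp += 1
        elif positive and label == 0:
            fp += 1
        elif not positive and label == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


if __name__ == "__main__":
    toy_risks = [0.90, 0.50, 0.20, 0.40, 0.10]  # invented values
    toy_labels = [1, 1, 1, 0, 0]
    print(diagnostic_metrics(toy_risks, toy_labels))
```

Lowering the cutoff raises sensitivity at the expense of specificity, which is the trade-off behind the rule-out approach adopted here.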

4.4 Risk matrix

To determine which variables should be included in the risk matrix, a multivariate logistic regression was carried out using all the independent variables considered. Those that were clinically relevant were selected and included in the final matrix as risk factors for OSA: gender (OR 3.72 [1.81–7.90]), age (\(< 40\): OR ref, 40–54: OR 2.26 [0.66–8.57], 55–69: OR 7.18 [2.17–27.13], \(\ge 70\) years: OR 17.83 [4.18–91.19]), neck circumference (OR 2.01 [1.00–4.09]), and witnessed apneas (OR 1.22 [0.59–2.54]). The risk matrix (Table 3) conveys the risk of having OSA stratified by relevant factors. The highest value (95%) was observed for male patients aged seventy or above, with increased neck circumference and witnessed apneas. The lowest value (8%) was observed in female patients under forty, with normal neck circumference and no witnessed apneas.
Table 2

Validity assessment [%] estimated from 10 times twofold cross-validation

| Model | Accuracy | Sensitivity | Specificity | PPV | NPV | AUC |
|---|---|---|---|---|---|---|
| NB38 | 67.1 [65.6–68.7] | 90.0 [88.2–91.8] | 22.9 [19.0–26.8] | 69.4 [68.4–70.5] | 53.9 [48.3–59.5] | 69.2 [66.6–71.8] |
| TAN38 | 66.9 [64.5–69.3] | 81.9 [77.7–86.0] | 37.9 [31.3–44.5] | 72.1 [70.2–74.0] | 53.1 [48.1–58.1] | 69.0 [66.4–71.6] |
| NB6 | 70.2 [69.2–71.3] | 94.1 [92.9–95.4] | 23.8 [19.9–27.7] | 70.6 [69.7–71.6] | 68.4 [64.2–72.5] | 74.6 [72.9–76.2] |
| TAN6 | 67.5 [66.3–68.6] | 90.2 [88.0–92.4] | 23.5 [20.0–27.0] | 69.6 [68.8–70.4] | 56.6 [52.7–60.5] | 63.6 [60.9–66.3] |

NB38, NB6: Naïve Bayes with 38 or 6 variables; TAN38, TAN6: Tree Augmented Naïve Bayes with 38 or 6 variables; PPV positive predictive value, NPV negative predictive value, AUC area under the ROC curve

5 Discussion

To our knowledge, no one has attempted to analyze risk and diagnostic factors for OSA the way this study does. Our study is one of the first to build and validate risk models for OSA based solely on clinical and demographic variables, which have the key advantage of being easily available and quickly acquired. We focused on the most important risk and diagnostic factors, being aware of the clinical definition of OSA. We obtained a proportion of normal results in 66 patients (34%), revealing a large number of unnecessary exams that are performed every day in Vila Nova de Gaia/Espinho hospital center, with the possibility of this result being generalized to the different hospital centers in the country.

Male gender was more prevalent (63%), agreeing with the literature [1, 10, 16, 20, 21, 22, 27, 30, 36, 42, 43, 44, 45]. Possible explanations are the higher prevalence of craniofacial and upper-airway abnormalities (21%), and also snoring (90%). Also, higher age (> 55 years old) was linked to a higher OSA prevalence. The strata of 55–69 and 70 or more years old had a total of 94 patients (73%), followed by 29 patients (23%) in the 40–54 years old stratum. In previous work, age was not included because it dominated the remaining dependencies, so we have been working on a new approach for dealing with this variable. We performed a new analysis with consequent alterations in the preprocessing levels, making the model more robust and clinically accurate, as in the literature this variable is described as one of the most important risk factors for OSA.

During the physical examination, body mass index, neck, and abdominal circumferences were collected. Regarding obesity, described as an important risk factor for OSA, we found that the percentage of patients in the pathology group having a normal BMI is equal to the percentage of patients with obesity (50%). This could explain why in our study obesity is not included in the selected variables. Analyzing neck and abdominal circumferences, unadjusted for gender, we saw higher percentages of increased values in the OSA group, 60 vs. 46% in NC and 94 vs. 91% in AC. In some studies, neck circumference is described as one of the most important risk factors, but it is not always considered by medical doctors, creating the need to perform more studies with this variable. Craniofacial and upper-airway abnormalities are important in the pathogenesis of OSA, particularly in non-obese patients, and our study demonstrated it. In patients with normal weight, we found that 26 (96%) patients had craniofacial and upper-airway abnormalities. The differences in craniofacial morphology may explain some of the variation in risk of OSA. Only 3% of OSA patients reported not snoring during the night, while 100% of the patients in the group without OSA reported snoring. This might explain why snoring could not be retained as an important risk factor. Depression or anxiety is highly related to sleep problems, becoming nowadays one of the most studied factors, but only 53 (27%) patients had the diagnosis.
Table 3

Risk matrix showing the probability [%] of having obstructive sleep apnea [95% credible interval] (color table online)

Symbols denote: male; female; apneas witnessed; apneas not witnessed. NC, neck circumference; low risk (< 25%, green); medium risk (between 25% and 50%, yellow); high risk (between 50% and 75%, orange); very high risk (> 75%, red)

The clinical definition of OSA includes several diagnostic factors, such as daytime sleepiness. One way to confirm its presence is with ESS. Even though this questionnaire is not specific for OSA, it has often been used in Vila Nova de Gaia/Espinho hospital center. When we analyzed the OSA group, we found a median of 10 (3–13), a normal result for this group (cutoff at 10 points). However, when we analyzed the median in patients without OSA we found a higher value (13 (7–16)). This highlights the possibility that ESS is not adjusted for the pathology. Another common diagnostic factor is witnessed apneas, which were reported for 60% of men against 43% of women. This can be explained by female bed partners having a lower threshold for perceiving and reporting symptoms than male bed partners.

For visual inference with a Bayesian network we chose TAN6. Even though NB6 presents a higher sensitivity (94.1% [92.9–95.4%]), in clinical settings the variables presented in our study are not independent, so we should assume relations among variables, especially when not all variables are available to the physician. The TAN6 has a sensitivity of 90.2% [88.0–92.4%], meaning that 90% of OSA patients would perform PSG, while the remaining 10% would be rejected and referred to follow-up consultation. Further studies are needed to optimize the model and also to externally validate it, preferably with a prospective validation cohort.

Visually, the network model TAN6 is very intuitive. It is a simple and friendly model that can be easily accessed and filled in, helping the diagnosis of this pathology in primary care centers or other facilities that have a gap in its diagnosis. The substructures that arose from this model were also interesting. Gender is related to witnessed apneas, an aspect described by physicians and specialists; likewise, witnessed apneas are associated with age. Additionally, age is associated with nocturia (aging increases the need to use the bathroom more often) and with craniofacial and upper-airway abnormalities (possibility of accidents in adult life and also obesity development). Another common relation described in care is the association between craniofacial and upper-airway abnormalities and increased neck circumference, which was also present in our model.

The final risk results were arranged into a color-coded and user-friendly matrix that constitutes a preliminary but useful tool that can be used by primary care physicians and others in the diagnostic decision-making process. For female gender, the rate ranged from 8% (under 40 years, normal neck circumference and without witnessed apneas) to 86% (70 years or older, increased neck circumference and witnessed apneas). For male gender, the initial rate is 2.5 times higher than for female gender: it ranged from 20% (under 40 years, normal neck circumference and without witnessed apneas) to 95% (70 years or older, increased neck circumference and witnessed apneas). Low risk of having OSA is related to the female gender under 40 years. In contrast, high risk is prevalent in the male gender above 50 years old.

One limitation to fully using the set of predictive variables was the lack of representativeness of some factors, such as vehicle crashes, humor alterations, decreased libido, pulmonary hypertension, congestive heart failure, renal failure, hypothyroidism, gastroesophageal reflux and genetics, which might have led to a bias in the 38-variable models. We also acknowledge that the retrospective nature of the derivation cohort is a limitation of the study and, most of all, that the low specificity of the resulting models makes them somewhat limited. However, we managed to provide a sensitive tool (sensitivity \(>90\%\)) which nonetheless prevents 1 out of 5 healthy individuals from unnecessarily performing a PSG exam (specificity \(>20\%\)), an improvement over current clinical practice. This improvement could also result in financial benefits for the healthcare system. Using a simple back-of-the-envelope calculation, the relatively small district hospital where our cohort was recruited could potentially perform 466 PSG per year (yearly extrapolation of 194 PSG in 5 months). Given a 20% specificity of our model, of the 158 normal PSG (yearly extrapolation of 66 normal PSG in 5 months), 32 would not have been referred to perform PSG. Considering Portuguese mandamus number 207/2017, each PSG is priced at €939.14, leading to a total saving of €30,052.48. Moreover, 31 patients with OSA (yearly extrapolation of 10% of 128 OSA patients in 5 months) would also have their PSG delayed by our approach, representing additional savings of €29,113.34, for a grand total of €59,165.82. Certainly, these 31 patients should have performed PSG, which diminishes the clinical benefit of our proposal. However, a limited inspection of the data supports our belief that these would have been mild or moderate OSA patients, who could perhaps be rescheduled for follow-up consultation and PSG in the subsequent months without serious harm.
Either way, given the current wait list situation, the 63 PSG vacancies would most likely be filled with other patients, with the expected savings corresponding to one and a half months' worth of PSG work. Nevertheless, we defer this discussion to future work, where we will perform a cost-effectiveness analysis (using Monte Carlo simulations) to assess the expected overall impact of our approach.
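
The back-of-the-envelope calculation above can be reproduced with a minimal sketch such as the following. It uses only the figures quoted in the text; the variable and function names are illustrative, and rounding to whole exams at each step is an assumption about how the quoted numbers were obtained, not a description of the authors' procedure. Prices are kept in integer cents to avoid floating-point drift.

```python
# Back-of-the-envelope PSG savings, using only figures quoted in the text.
MONTHS_OBSERVED = 5      # cohort collected over 5 months
PSG_TOTAL = 194          # PSG exams in the cohort
PSG_NORMAL = 66          # normal results
PSG_OSA = 128            # OSA diagnoses
SPECIFICITY = 0.20       # share of normal patients ruled out by the model
MISSED_OSA_RATE = 0.10   # about 1 - sensitivity (sensitivity around 90%)
PRICE_CENTS = 93914      # EUR 939.14 per PSG, stored in cents


def yearly(count_in_period: float) -> float:
    """Scale a count observed over MONTHS_OBSERVED months to 12 months."""
    return count_in_period * 12 / MONTHS_OBSERVED


def euros(cents: int) -> str:
    """Format an integer amount of cents as a euro string."""
    return f"EUR {cents / 100:,.2f}"


# Assumption: each intermediate figure is rounded to a whole number of exams.
yearly_psg = round(yearly(PSG_TOTAL))                    # all PSG per year
yearly_normal = round(yearly(PSG_NORMAL))                # normal PSG per year
avoided_normal = round(SPECIFICITY * yearly_normal)      # normal PSG avoided
delayed_osa = round(yearly(MISSED_OSA_RATE * PSG_OSA))   # OSA PSG delayed

savings_normal_cents = avoided_normal * PRICE_CENTS
savings_osa_cents = delayed_osa * PRICE_CENTS
total_cents = savings_normal_cents + savings_osa_cents

# Freed PSG slots expressed in months of PSG work at the yearly rate.
freed_slots = avoided_normal + delayed_osa
months_of_work = freed_slots / (yearly_psg / 12)

print("Yearly PSG:", yearly_psg)
print("Avoided normal PSG:", avoided_normal, euros(savings_normal_cents))
print("Delayed OSA PSG:", delayed_osa, euros(savings_osa_cents))
print("Total:", euros(total_cents))
print("Freed slots:", freed_slots, f"({months_of_work:.1f} months of PSG work)")
```

Changing `SPECIFICITY` or `MISSED_OSA_RATE` shows how sensitive the estimated savings are to the operating point of the model, which is the kind of variation a full cost-effectiveness analysis would explore more rigorously.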

6 Conclusion

Our study added important knowledge to the state of the art regarding OSA. Moreover, this knowledge is delivered in the form of an intuitive and user-friendly model (TAN6), which can be used by any physician, and in the form of a risk matrix that physicians can use to quantify the probability of having OSA. We present as main risk factors gender, age, neck circumference, and craniofacial and upper-airway abnormalities, and as diagnostic factors witnessed apneas and nocturia. Naïve Bayes was considered the better classifier, while TAN showed advantages for visual inference, illustrating the strengths of Bayesian network models: they deal with missing information and offer a simple graphical representation that shows not only the probabilities given the patient characteristics but also the relationships between variables.

According to our cross-validated evaluation, we expect around 30% of false positives which, although unwanted, is still an improvement over referring all patients at risk to sleep consultation and polysomnography. Moreover, we were able to rule out 25% of healthy patients, which, in our understanding, would alleviate health services by reducing the burden of unneeded consultations and the wait lists for polysomnography, while identifying more than 90% of patients with OSA.

Portugal, like many other countries, does not have a validated method to screen patients with suspicion of OSA before performing polysomnography, so we believe that our models (TAN6 and risk matrix) constitute valid methods. Current work is focused on bringing the model to primary care facilities.


  1. Full review description and references used in this phase are not shown for space reasons but can be provided upon request.



The authors would like to thank the sleep laboratory team of Vila Nova de Gaia/Espinho hospital center, especially Liliana Leite, and its informatics department in the name of Domingos Pereira and Joaquim Pereira, and Matilde Monteiro-Soares for critical review of the manuscript. DFS acknowledges “Fundação para a Ciência e Tecnologia (FCT)”, Portugal under grant number PD/BD/13553/2018. The work of DFS and PPR has been developed under the scope of project NORTE-01-0145-FEDER-000016 (NanoSTIMA), financed by the North Portugal Regional Operational Programme (NORTE-2020), under the PORTUGAL 2020 Partnership Agreement, and through the European Regional Development Fund (ERDF).


References

  1. Al Lawati, N.M., Patel, S.R., Ayas, N.T.: Epidemiology, risk factors, and consequences of obstructive sleep apnea and short sleep duration. Prog. Cardiovasc. Dis. 51(4), 285–293 (2009)
  2. Aragon, T.J.: epitools: epidemiology tools. R Package (2012)
  3. Blondet, M., Yapor, P., Latalladi-Ortega, G., Alicea, E., Torres-Palacios, A., Rodríguez-Cintrón, W.: Prevalence and risk factors for sleep disordered breathing in a Puerto Rican middle-aged population. Sleep Breath. 13(2), 175–180 (2009)
  4. Bossuyt, P.M., Reitsma, J.B., Bruns, D.E., Gatsonis, C.A., Glasziou, P.P., Irwig, L., Lijmer, J.G., Moher, D., Rennie, D., De Vet, H.C.W., Kressel, H.Y., Rifai, N., Golub, R.M., Altman, D.G., Hooft, L., Korevaar, D.A., Cohen, J.F., for the STARD Group: STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. Clin. Chem. 61, 1446–1452 (2015)
  5. Broström, A., Sunnergren, O., Årestedt, K., Johansson, P., Ulander, M., Riegel, B., Svanborg, E.: Factors associated with undiagnosed obstructive sleep apnoea in hypertensive primary care patients. Scand. J. Primary Health Care 30(2), 107–113 (2012)
  6. Chung, S., Jairam, S., Hussain, M.R.G., Shapiro, C.M.: How, what, and why of sleep apnea. Perspectives for primary care physicians. Can. Fam. Phys. 48, 1073–80 (2002)
  7. Corral-Peñafiel, J., Pepin, J.L., Barbe, F.: Ambulatory monitoring in the diagnosis and management of obstructive sleep apnoea syndrome. Eur. Respir. Rev. 22(129), 312–24 (2013)
  8. Darwiche, A.: Modeling and Reasoning with Bayesian Networks. Cambridge University Press, New York (2009)
  9. Darwiche, A.: Bayesian networks. Commun. ACM 53(12), 80–90 (2010)
  10. Davies, R.J., Ali, N.J., Stradling, J.R.: Neck circumference and other clinical features in the diagnosis of the obstructive sleep apnoea syndrome. Thorax 47(2), 101–5 (1992)
  11. Dias, C.C., Rodrigues, P.P., Coelho, R., Santos, P.M., Fernandes, S., Lago, P., Caetano, C., Rodrigues, Â., Portela, F., Oliveira, A., Ministro, P., Cancela, E., Vieira, A.I., Barosa, R., Cotter, J., Carvalho, P., Cremers, I., Trabulo, D., Caldeira, P., Antunes, A., Rosa, I., Moleiro, J., Peixe, P., Herculano, R., Gonçalves, R., Gonçalves, B., Sousa, H.T., Contente, L., Morna, H., Lopes, S., Magro, F.: Development and validation of risk matrices for Crohn’s disease outcomes in patients who underwent early therapeutic interventions. J. Crohn’s Colitis 11(4), 445–453 (2017)
  12. Doghramji, P.P.: Recognition of obstructive sleep apnea and associated excessive sleepiness in primary care. J. Fam. Pract. 57(8 Suppl), S17–23 (2008)
  13. Epstein, L.J., Kristo, D., Strollo, P.J., Friedman, N., Malhotra, A., Patil, S.P., Ramar, K., Rogers, R., Schwab, R.J., Weaver, E.M., Weinstein, M.D.: Adult Obstructive Sleep Apnea Task Force of the American Academy of Sleep Medicine: clinical guideline for the evaluation, management and long-term care of obstructive sleep apnea in adults. J. Clin. Sleep Med. 5(3), 263–276 (2009)
  14. Fontenla-Romero, O., Guijarro-Berdiñas, B., Alonso-Betanzos, A., del Rocío Fraga-Iglesias, A., Moret-Bonillo, V.: A Bayesian Neural Network Approach for Sleep Apnea Classification, pp. 284–293. Springer, Berlin (2003)
  15. Gregory Warnes, A.R., Bolker, B., Lumley, T., Johnson, R.C.: Various R programming tools for model fitting. R Package (2015)
  16. Hoffstein, V., Szalai, J.P.: Predictive value of clinical features in diagnosing obstructive sleep apnea. Sleep 16(2), 118–22 (1993)
  17. Huang, K., King, I., Lyu, M.R.: Constructing a large node Chow-Liu tree based on frequent itemsets. In: ICONIP 2002: Proceedings of 9th International Conference on Neural Information Processing Computing Intelligent E-Age, vol. 1, pp. 498–502 (2002)
  18. Jennum, P., Riha, R.L.: Epidemiology of sleep apnoea/hypopnoea syndrome and sleep-disordered breathing. Eur. Respir. J. 33(4), 907–914 (2009)
  19. Kaimakamis, E., Bratsas, C., Sichletidis, L., Karvounis, C., Maglaveras, N.: Screening of patients with obstructive sleep apnea syndrome using C4.5 algorithm based on non linear analysis of respiratory signals during sleep. In: 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 3465–3469. IEEE (2009)
  20. Kapur, V.: Obstructive sleep apnea: diagnosis, epidemiology, and economics. Respir. Care 55(9), 1155–1167 (2010)
  21. Kohler, M.: Risk factors and treatment for obstructive sleep apnea amongst obese children and adults. Curr. Opin. Allergy Clin. Immunol. 9(1), 4–9 (2009)
  22. Lam, J.C.M., Sharma, S.K., Lam, B.: Obstructive sleep apnoea: definitions, epidemiology & natural history. Indian J. Med. Res. 131, 165–170 (2010)
  23. Lee, W., Nagubadi, S., Kryger, M., Mokhlesi, B.: Epidemiology of obstructive sleep apnea: a population-based perspective. Expert Rev. Respir. Med. 2(3), 349–364 (2008)
  24. Leite, L., Costa-Santos, C., Rodrigues, P.P.: Can we avoid unnecessary polysomnographies in the diagnosis of obstructive sleep apnea? A Bayesian network decision support tool. In: 2014 IEEE 27th International Symposium on Computer-Based Medical Systems, pp. 28–33. IEEE (2014)
  25. Libânio, D., Dinis-Ribeiro, M., Pimentel-Nunes, P., Dias, C., Rodrigues, P.: Predicting outcomes of gastric endoscopic submucosal dissection using a Bayesian approach: a step for individualized risk assessment. Endosc. Int. Open 05(07), E563–E572 (2017)
  26. Lucas, P.J.F., van der Gaag, L.C., Abu-Hanna, A.: Bayesian networks in biomedicine and health-care. Artif. Intell. Med. 30(3), 201–14 (2004)
  27. Manber, R., Armitage, R.: Sex, steroids, and sleep: a review. Sleep 22(5), 540–55 (1999)
  28. Mansfield, D.R., Antic, N.A., McEvoy, R.D.: How to assess, diagnose, refer and treat adult obstructive sleep apnoea: a commentary on the choices. Med. J. Aust. 199(8), S21–6 (2013)
  29. Mitchell, T.: Machine Learning. McGraw-Hill, Singapore (1997)
  30. Pagel, J., Hirshkowitz, M., Doghramji, P., Ballard, R.: Obstructive sleep apnea: recognition and management in primary care. Suppl. J. Fam. Pract. 348(5), S1–S31 (2008)
  31. Rafael-Palou, X., Steblin, A., Vargiu, E.: Remotely supporting patients with obstructive sleep apnea at home. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST (2016)
  32. Robichaud-Hallé, L., Beaudry, M., Fortin, M.: Obstructive sleep apnea and multimorbidity. BMC Pulm. Med. 12(1), 60 (2012)
  33. Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J.C., Müller, M.: pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 12(1), 77 (2011)
  34. Rodrigues, P.P., Santos, D.F., Leite, L.: Obstructive sleep apnea diagnosis: the Bayesian network model revisited. In: 2015 IEEE 28th International Symposium on Computer-Based Medical Systems, pp. 115–120. IEEE (2015)
  35. Rodsutti, J., Hensley, M., Thakkinstian, A., D’Este, C., Attia, J.: A clinical decision rule to prioritize polysomnography in patients with suspected sleep apnea. Sleep 27(4), 694–699 (2004)
  36. Romero, E., Krakow, B., Haynes, P., Ulibarri, V.: Nocturia and snoring: predictive symptoms for obstructive sleep apnea. Sleep Breath. 14(4), 337–343 (2010)
  37. Santaolalla Montoya, F., Iriondo Bedialauneta, J.R.R., Aguirre Larracoechea, U., Martinez Ibargüen, A., Sanchez Del Rey, A., Sanchez Fernandez, J.M.: The predictive value of clinical and epidemiological parameters in the identification of patients with obstructive sleep apnoea (OSA): a clinical prediction algorithm in the evaluation of OSA. Eur. Arch. Oto-Rhinolaryngol. 264(6), 637–43 (2007)
  38. Scutari, M.: Learning Bayesian networks with the bnlearn R Package. J. Stat. Softw. 35(3), 1–22 (2010)
  39. Stores, G.: Clinical diagnosis and misdiagnosis of sleep disorders. J. Neurol. Neurosurg. Psychiatry 78(12), 1293–1297 (2007)
  40. Sun, L.M., Chiu, H.W., Chuang, C.Y., Liu, L.: A prediction model based on an artificial intelligence system for moderate to severe obstructive sleep apnea. Sleep Breath. 15(3), 317–323 (2011)
  41. Venables, W.N., Ripley, B.D.: Modern applied statistics with S. Technometrics 45(1), 111–111 (2003)
  42. Wall, H., Smith, C., Hubbard, R.: Body mass index and obstructive sleep apnoea in the UK: a cross-sectional study of the over-50s. Primary Care Respir. J. 21(4), 371–376 (2012)
  43. Young, T.: Predictors of sleep-disordered breathing in community-dwelling adults. The Sleep Heart Health Study. Arch. Intern. Med. 162(8), 893 (2002)
  44. Young, T., Evans, L., Finn, L., Palta, M.: Estimation of the clinically diagnosed proportion of sleep apnea syndrome in middle-aged men and women. Sleep 20(9), 705–6 (1997)
  45. Young, T., Skatrud, J., Peppard, P.E.: Risk factors for obstructive sleep apnea in adults. J. Am. Med. Assoc. 291(16), 2013–2016 (2004)

Copyright information

© The Author(s) 2018
corrected publication April 2018

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Center for Health Technology and Services Research (CINTESIS), Faculty of Medicine of the University of Porto, Porto, Portugal
  2. Community Medicine, Information and Health Decision Sciences (MEDCIDS) Department, Faculty of Medicine of the University of Porto, Porto, Portugal
