A multicentre prospective observational study comparing arterial blood gas values to those obtained by pulse oximeters used in adult patients attending Australian and New Zealand hospitals
Pulse oximetry is widely used in the clinical setting. The purpose of this validation study was to investigate the level of agreement between oxygen saturations measured by pulse oximeter (SpO2) and arterial blood gas (SaO2) in a range of oximeters in clinical use in Australia and New Zealand.
Paired SpO2 and SaO2 measurements were collected from 400 patients in one Australian and two New Zealand hospitals. The ages of the patients ranged from 18 to 95 years. Bias and limits of agreement were estimated. Sensitivity and specificity for detecting hypoxaemia, defined as SaO2 < 90%, were also estimated.
The majority of participants were recruited from the Outpatient, Ward or High Dependency Unit setting. Bias, oximeter-measured minus arterial blood gas-measured oxygen saturation, was − 1.2%, with limits of agreement − 4.4 to 2.0%. SpO2 was at least 4% lower than SaO2 for 10 (2.5%) of the participants and SpO2 was at least 4% higher than the SaO2 in 3 (0.8%) of the participants. None of the participants with a SpO2 ≥ 92% were hypoxaemic, defined as SaO2 < 90%. There were no clinically significant differences in oximetry accuracy in relation to clinical characteristics or oximeter brand.
In the majority of the participants, pulse oximetry was an accurate method to assess SaO2 and had good performance in detecting hypoxaemia. However, in a small proportion of participants, differences between SaO2 and SpO2 could have clinical relevance in terms of patient monitoring and management. A SpO2 ≥ 92% indicates that hypoxaemia, defined as a SaO2 < 90%, is not present.
Australian and New Zealand Clinical Trials Registry (ACTRN12614001257651). Date of registration: 2/12/2014.
Keywords: Arterial blood gas, Hypoxaemia, Oxygen, Pulse oximeter, Validation
ABG: Arterial blood gas
BTS: British Thoracic Society
HDU: High dependency unit
ICU: Intensive Care Unit
PaCO2: Partial pressure of arterial carbon dioxide
- Pulse Oximeter
Pulse oximeter used to detect oxygen saturation
SaO2: Oxygen saturation measured by arterial blood gas sample
SpO2: Oxygen saturation measured by pulse oximetry
TSANZ: Thoracic Society of Australia and New Zealand
Pulse oximeter measured oxygen saturation (SpO2) is a non-invasive approximation of arterial oxygen saturation, and is considered the fifth vital sign in clinical assessment [1, 2, 3]. In clinical practice monitoring of SpO2 values is required to titrate oxygen therapy to avoid the risks of hypoxaemia and hyperoxaemia [1, 2].
Assessment of agreement between the gold standard arterial blood gas (ABG) measurement of oxygen saturation (SaO2) and SpO2 is essential for the interpretation and use of pulse oximetry values. It is also essential for the development of safe and practical recommendations for SpO2 targets for the titration of oxygen therapy. Overestimation of actual SaO2 may mean clinically relevant hypoxaemia is not detected or treated. Conversely, underestimation of actual SaO2 may result in unnecessary oxygen therapy with the associated risks of hyperoxaemia.
The United States regulatory body, the Food and Drug Administration (FDA), requires the accuracy of pulse oximeters to be tested against SaO2 in healthy adults in laboratory settings [4]. In clinical practice a number of factors influence oximeter accuracy, including the degree of hypoxaemia, hypercapnia, glycosylated haemoglobin (HbA1c), skin pigmentation, movement artefacts, peripheral perfusion and use of nail polish or acrylic nails [3, 5, 6, 7, 8, 9, 10, 11, 12]. Clinical studies report that SpO2 can both overestimate and underestimate SaO2, and the values may have wide limits of agreement [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32]. Oximeter accuracy may also differ by oximeter model [7, 8, 12, 18, 19], and manufacturers are continuously evolving sensor technology and software algorithms. This means previous studies may not be directly relevant to current clinical practice because of the population groups and oximeter models used.
In our recent study investigating the accuracy of oximeters used in Australian and New Zealand Intensive Care Units (ICUs), we demonstrated a mean bias for SaO2 minus SpO2 of only 0.15%, with limits of agreement plus or minus 4.4%. In this study we aimed to investigate the agreement between SaO2 and SpO2 measurements by oximeters currently in use in Australian and New Zealand hospitals outside the critical care setting, either on the ward or in the Emergency (ED), High Dependency Unit (HDU) or outpatient departments. Secondary objectives were to evaluate the diagnostic performance of SpO2 to detect hypoxaemia, and to investigate factors affecting oximeter accuracy.
This multicentre prospective non-experimental observational study compared simultaneous SpO2 and SaO2 measurements in inpatients and outpatients at Westmead Hospital in Australia, and Wellington and Christchurch Hospitals in New Zealand. It was prospectively registered on the Australian and New Zealand Clinical Trials Registry (ACTRN12614001257651). Ethical approval was obtained from the Northern B Ethics Committee in New Zealand (14/NTB/115) and the Western Sydney Local Health District Human Research Ethics Committee in Australia (LNR/14/WMEAD/387).
Patients aged 16 years or older who were to have an ABG measurement as part of routine clinical care were recruited. Full written informed consent was provided in New Zealand by participants, or by next of kin if participants were unable to do so (for example, if they were too unwell). Participants were not recruited if they had a diagnosis of sickle cell anaemia, methaemoglobinemia, carbon monoxide (CO) poisoning, or were previously recruited to the study and had paired SpO2 and SaO2 values successfully recorded. They could also be excluded for any other condition that, at the investigator’s discretion, was believed to present a safety risk or to affect the feasibility of the study or the interpretation of the study results.
Participants were identified in hospital wards and outpatient clinics. Demographic data were recorded. Skin colour was assessed using the Fitzpatrick scale.
SpO2 was measured during a clinically indicated ABG. The oximeter probe was put in place for at least 10 s prior to the ABG, or longer if indicated by manufacturer’s instructions. SpO2 was measured from an earlobe or finger probe, depending on departmental policies and what the staff member responsible for performing oximetry would usually use to monitor that patient. If a finger probe was used it was placed on the index finger on the contra-lateral side to ABG sampling. Where possible, nail polish was removed before measurement.
The SpO2 value recorded was the value on the oximeter when blood was first observed to enter the ABG collection vial. If the participant was receiving supplementary oxygen at the time of the ABG, this was also recorded. Measurements paired with ABG samples subsequently identified to be venous or unusable, e.g. sample too small for analysis, were excluded. The models of oximeter and ABG analyser were recorded. Data recorded from the ABG were SaO2, partial pressure of oxygen (PaO2), partial pressure of carbon dioxide (PaCO2), carboxyhaemoglobin (CoHb), methaemoglobin (MetHb) and HbA1c, if measured as part of clinical practice. Investigators were asked to record whether they had any concerns with oximeter accuracy, such as nail polish that was not removed, poor oximeter signal, or patient movement. Participants for whom there was a reported concern with oximeter accuracy were not excluded from analyses.
Bland Altman plots and estimation of bias and limits of agreement were used to describe the agreement between SpO2 and SaO2 measurement, using SaO2 as the reference standard.
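As an illustration of the agreement statistics described here, the following is a minimal sketch of how bias and 95% limits of agreement can be computed from paired readings. It assumes the conventional Bland Altman definition (mean difference plus or minus 1.96 standard deviations of the differences); the function name and the paired values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(spo2, sao2):
    """Return (bias, lower limit, upper limit) for paired readings.

    Differences are taken as SpO2 minus SaO2, so SaO2 acts as the
    reference standard and a negative bias means SpO2 reads lower.
    """
    if len(spo2) != len(sao2):
        raise ValueError("readings must be paired")
    diffs = [p - a for p, a in zip(spo2, sao2)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    # Conventional 95% limits of agreement: bias +/- 1.96 SD.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired readings (%), invented purely for illustration.
spo2 = [94.0, 91.0, 97.0, 88.0, 95.0, 93.0]
sao2 = [95.5, 92.0, 97.5, 90.5, 95.0, 95.0]

bias, lower, upper = bland_altman(spo2, sao2)
print(f"bias {bias:.2f}%, limits of agreement {lower:.2f}% to {upper:.2f}%")
```

With differences defined this way, a negative bias corresponds to the oximeter underestimating the arterial value.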
The diagnostic performance of SpO2 < 90% to detect hypoxaemia, defined either as a SaO2 < 90% or as a PaO2 < 60 mmHg, was evaluated using contingency tables, with sensitivities and specificities estimated by an exact binomial method for proportions. A post hoc analysis of the ability of SaO2 < 90% to detect a PaO2 < 60 mmHg was performed using the same methods.
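The contingency table approach can be sketched as follows. This is an illustrative example only: it assumes that the exact binomial method corresponds to a Clopper-Pearson interval, which the text does not state explicitly, and the helper names and readings are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target, increasing):
    """Bisection on [0, 1] for the p at which f(p) equals target."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson (exact binomial) interval for x events out of n."""
    lower = 0.0 if x == 0 else _solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    upper = 1.0 if x == n else _solve(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


def diagnostic_performance(index_positive, reference_positive):
    """Counts for sensitivity and specificity from paired boolean results."""
    pairs = list(zip(index_positive, reference_positive))
    tp = sum(i and r for i, r in pairs)
    fn = sum((not i) and r for i, r in pairs)
    tn = sum((not i) and (not r) for i, r in pairs)
    fp = sum(i and (not r) for i, r in pairs)
    return {"sensitivity": (tp, tp + fn), "specificity": (tn, tn + fp)}


# Hypothetical paired readings (%), invented purely for illustration.
spo2 = [88, 91, 95, 89, 93, 97, 86, 92]
sao2 = [87.5, 89.0, 96.0, 91.0, 94.5, 98.0, 85.0, 93.0]

perf = diagnostic_performance([s < 90 for s in spo2], [s < 90 for s in sao2])
for name, (x, n) in perf.items():
    low, high = exact_ci(x, n)
    print(f"{name}: {x}/{n} = {x / n:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Changing the index threshold from 90 to 92 in the first list comprehension gives the corresponding table for SpO2 < 92%.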
Categorical factors assessed for influence on oximeter accuracy
Location of measurement (ED, HDU, ward, or outpatient department)
Position of the oximeter probe (finger or ear)
Recognised condition associated with chronic respiratory failure (chronic obstructive pulmonary disease, obesity hypoventilation syndrome, bronchiectasis, cystic fibrosis, neuromuscular disease and chest wall deformities such as severe kyphoscoliosis)
Current tobacco smoking status (current versus ex or non-smoker)
Skin pigmentation (based on modified Fitzpatrick scale with patient skin colour classified as either: Light (Type I to Type II), Medium (Type III to Type IV) or Dark (Type V to Type VI))
To estimate how much of the difference between SpO2 and SaO2 was attributable to the oximetry device, mixed linear models fitted by restricted maximum likelihood were used to estimate variance components, the associated intra-class correlation coefficients for the effect of oximeters, and best linear unbiased predictors of the effect of individual oximeters.
SAS version 9.4 was used.
The planned sample size of 400 was based on three considerations. Firstly, for the analysis of variables that predict the size of the bias we sought to have between 20 and 40 participants for each degree of freedom in the ANOVA. Based on the six variables, some of which have multiple levels, this required between 200 and 400 participants. Secondly, the estimates of paired SD for the SpO2 to SaO2 difference from patients in a range of clinical settings were 0.55%, 2.1%, and 2.2%. With two equal sized groups of 21 participants there is 80% power, with a type I error rate of 5%, to detect a SpO2 to SaO2 difference of 2% for any of the variables that might predict bias. Thirdly, for estimation of variance components for the different pulse oximeters by best linear unbiased predictors, between 20 and 25 participants per oximeter brand were required, and it was estimated that between 10 and 20 oximeter brands would be used.
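The power statement can be checked approximately with the standard two-group formula. The sketch below assumes a two-sided comparison of two means under a normal approximation, n = 2((z_alpha + z_beta) × SD / difference)^2 per group. The text does not state which formula or software was used for this step, so this is not necessarily the authors' calculation, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(sd, delta, alpha=0.05, power=0.80):
    """Participants per group needed to detect a mean difference `delta`.

    Normal approximation for a two-sided, two-sample comparison of
    means with a common standard deviation `sd`.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# SD estimates quoted in the text (%), with a target difference of 2%.
for sd in (0.55, 2.1, 2.2):
    print(f"SD {sd}%: {n_per_group(sd, delta=2.0)} per group")
```

For the largest SD of 2.2% this approximation gives 19 per group, slightly below the 21 quoted. A calculation based on the t distribution, which is a little more conservative at small sample sizes, would be one possible explanation for the difference.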
Participant characteristics (N = 400)*
Min to max
18.7 to 95.1
Smoking status (n = 399)
Conditions associated with chronic respiratory failure
Hypercapnia** on ABG
At least one:
Hypercapnia** on ABG
Individual conditions associated with chronic respiratory failure
Chronic obstructive pulmonary disease
Obesity hypoventilation syndrome
Chest wall deformity
Peripheral vascular disease
New Zealand Ethnicity
Cook Island Māori
ABG and oximetry data
Mean (SD), min-max*
SpO2 (%): 93.5 (3.8), 72 to 100; 94 (92 to 96)
Participants with SpO2<90% (N, (%))
Concern with SpO2 data accuracy recorded by investigator (N, (%))***
SaO2 (%): 94.7 (3.8), 72.1 to 100; 95.7 (93.2 to 97.1)
Participants with SaO2<90% (N, (%))
PaO2 (mmHg): 74.8 (21.3), 37.9 to 396; 73 (64.7 to 83)
Participants with PaO2<60 mmHg (N, (%))
PaCO2 (mmHg): 40.3 (7.2), 25.4 to 87.2; 39 (35.8 to 43.4)
136.7 (18.8), 67 to 192
137 (126.5 to 149)
All participants (N = 358)
2.1 (1.2), 0 to 7
1.8 (1.4 to 2.3)
Current smokers (N = 40)
3.9 (1.7), 0.9 to 6.7
4.0 (2.4 to 5.4)
Ex smokers (N = 183)
1.9 (0.9), 0.0 to 7.0
1.7 (1.4 to 2.2)
Never smokers (N = 135)
1.9 (0.7), 0.0 to 4.2
1.7 (1.3 to 2.3)
Agreement between SpO2 and SaO2
Detection of hypoxaemia
Diagnostic performance of SpO2 and SaO2
Ability for SpO2 <90% to detect SaO2 <90%
Ability for SpO2 <92% to detect SaO2 <90%
Ability for SpO2 <90% to detect PaO2 <60 mmHg
PaO2 <60 mmHg
Ability for SaO2 <90% to detect PaO2 <60 mmHg
PaO2 <60 mmHg
Factors potentially influencing oximeter accuracy
There was no statistical evidence of an association between SaO2 and the bias between SpO2 and SaO2; Spearman coefficient 0.003, P = 0.94. Of the other factors from Table 1, only a diagnosis of diabetes was identified as a predictor of bias (P = 0.05). The bias in participants with diabetes was − 0.8 (95% limits of agreement − 4.4 to 2.8), and in participants without diabetes it was − 1.2 (− 4.4 to 2.0). Detailed results are presented in the Online Additional File (Additional file 1: Figure S3 and Table S5).
There were at least 14 different oximeter models used. The most common oximeter models used were the Nonin Avant 9700 in 103 participants (26%), Masimo Rainbow Radical 7 in 92 participants (23%) and the Nonin Avant 4000 in 76 participants (19%) (see Additional file 1: Table S1 for all models). The estimated variance components were 0.16 for oximeter brand and 2.48 for residual, resulting in an intra-class correlation coefficient of 0.94. This can be interpreted as approximately 6% of variation in the relationship between SpO2 and SaO2 being due to oximeter brand. Detailed results by oximeter are shown in the Online Additional file 1: Table S6.
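The proportions quoted here follow directly from the two variance components. A short worked check, using only the two estimates reported in the text (the function name is hypothetical):

```python
def variance_shares(brand_var, residual_var):
    """Split total variance into the shares due to brand and residual."""
    total = brand_var + residual_var
    return brand_var / total, residual_var / total


# Variance component estimates reported in the text.
brand_share, residual_share = variance_shares(0.16, 2.48)
print(f"oximeter brand: {brand_share:.1%}, residual: {residual_share:.1%}")
```

The brand share is 0.16 / 2.64, about 6%, and the residual share is 2.48 / 2.64, about 94%, which matches the reported coefficient of 0.94.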
Concern with oximeter accuracy was reported by investigators in 16 patients, nine of whom had nail polish, acrylic nail or double nail. Other causes for concern are presented in the Online Additional file 1: Table S1.
The bias and limits of agreement between SpO2 and SaO2 suggest that pulse oximetry is an accurate method to assess SaO2 in most adult patients in the clinical setting. However, in a small number of participants potentially clinically important differences between SpO2 and SaO2 could affect patient assessment and management. A practical guide that can be derived from these data is that a SpO2 ≥ 92% effectively rules out presence of hypoxaemia, indicated by a SaO2 < 90%. There were no clinically significant differences in oximeter accuracy based on absolute level of SaO2, hospital location, numerous clinical characteristics or oximeter brand.
The magnitude of bias and associated limits of agreement from the range of oximeters in this study suggested that overall they perform at a similar level to, or better than, oximeters used in many of the clinical studies performed in the last 10 years [5, 6, 8, 10, 11, 12, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 30, 31]. This is in keeping with continual improvements in oximeter sensor technology and software by manufacturers over time. Specifically, the bias and limits of agreement for SaO2 minus SpO2 were similar to the values recently obtained in critically unwell patients in the ICU setting (0.15%, limits of agreement plus or minus 4.4%).
The negative bias of − 1.2%, albeit small, meant that the oximeters tended to underestimate SaO2. Such underestimation has the potential to result in a conservative estimate of risk of hypoxaemia and may lead to more liberal oxygen therapy than required. SpO2 underestimated SaO2 by at least 4% in around 3% of participants, and overestimated it by at least 4% in less than 1% of participants. These findings mean that while the oximeters performed well overall, there were still potentially clinically relevant differences in SpO2 and SaO2 in a small proportion of the participants. In the majority of the participants with SpO2 and SaO2 values differing by at least 4% the investigators did not state they had any concerns with oximeter accuracy. This highlights the potential difficulty in identifying when an oximetry value is incorrect and emphasises the importance of guideline recommendations to consider oximetry values in clinical context.
The TSANZ and BTS guidelines for acute oxygen therapy both recommend use of pulse oximetry as a vital sign and tool to titrate oxygen therapy to a target oxygen saturation range. The TSANZ recommends that oxygen is delivered to a SpO2 target range of 92 to 96% in patients not at risk of hypercapnic respiratory failure. This range was developed to reduce the risks of both hyperoxaemia and hypoxaemia, while recognising potential oximeter accuracy limitations. The lower limit of 92% is supported by the finding that a SpO2 of ≥ 92% indicates that hypoxaemia (SaO2 < 90%) is not present. The recommended upper SpO2 limit of 96%, aimed at avoiding hyperoxaemia, is supported by the finding that 12 of the 13 participants with a PaO2 of greater than 100 mmHg had a SpO2 value over 96%.
A SpO2 < 90% had a sensitivity of only 70.5% in identifying a PaO2 < 60 mmHg, while for SaO2 < 90% it was only 54.1%. These values are in keeping with the majority of participants being positioned to the left of the predicted oxygen haemoglobin dissociation curve, so that saturation often remained at or above 90% when PaO2 was below 60 mmHg. In keeping with recommendations by the TSANZ Oximetry Guidelines, these findings highlight the limitations of estimating PaO2 from saturation values, and vice versa.
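The correspondence between the PaO2 < 60 mmHg and SaO2 < 90% thresholds can be illustrated with a standard approximation to the oxygen haemoglobin dissociation curve. The sketch below uses the Severinghaus equation for standard conditions. This is an assumption made for illustration: the text does not state which predicted curve was used, and the function name is hypothetical.

```python
def severinghaus_saturation(pao2_mmhg):
    """Estimated oxygen saturation (%) for a given PaO2 in mmHg.

    Severinghaus approximation to the standard dissociation curve:
    S = 1 / (1 + 23400 / (PO2^3 + 150 * PO2)).
    """
    if pao2_mmhg <= 0:
        raise ValueError("PaO2 must be positive")
    return 100.0 / (1.0 + 23400.0 / (pao2_mmhg**3 + 150.0 * pao2_mmhg))


for pao2 in (40, 60, 80, 100):
    print(f"PaO2 {pao2} mmHg: predicted saturation "
          f"{severinghaus_saturation(pao2):.1f}%")
```

On this curve a PaO2 of 60 mmHg corresponds to a saturation of roughly 90.6%. A patient whose curve is shifted to the left has a higher saturation at the same PaO2, so a saturation threshold of 90% will miss some patients with a PaO2 below 60 mmHg.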
Patients with sickle cell anaemia, methaemoglobinemia, or CO poisoning were excluded from the study and nail polish was removed where possible as these factors are well established to impact on oximeter results. SaO2, oximeter model and the numerous clinical variables were not found to significantly impact on oximeter accuracy. However, it was not possible to evaluate the effect of earlobe oximetry, Fitzpatrick scale V or VI, or ED location on accuracy due to there being only one participant in each of these categories.
This study had the advantage of a multicentre design and use of a range of oximeters routinely available to clinical staff in a variety of hospital settings. A wide range of adult patients were included, both in terms of presenting diagnosis and illness severity. While there was a range of SaO2 values between 72 and 100%, the results cannot be applied to patients with a SaO2 of under 70%, at which oximeter inaccuracy is well recognised. Results may not be applicable to paediatric patients or adult patients in theatre, ICU or ED, especially as a variety of factors specific to these patients have been previously identified as affecting oximeter accuracy [11, 15, 17, 25, 27, 28, 30, 31]. Having only one participant with a Fitzpatrick score of V, and none with VI, meant study findings may not be applicable to patients with higher skin pigmentation. This is especially important as oximeter accuracy has been demonstrated to decrease as pigmentation increases, particularly at lower SaO2 levels and in oximeters of the same brand as some of those used in our study (Masimo Radical and Nonin 9700).
Single oximeter and ABG measurement pairing from each participant were used, which has the advantage of removing potential bias from repeated measures in the same participant. However, this did mean we could not specifically assess the accuracy of SpO2 to detect changes in SaO2 over time.
Overall, the oximeters in this study had good accuracy in determining individual SaO2 values and detecting hypoxaemia in a range of clinical settings. The use of a SpO2 of 92% as the lower boundary for the titration of oxygen therapy was supported by 100% sensitivity for SpO2 < 92% in identifying hypoxaemia (SaO2 < 90%). In a small number of participants discrepancies between SpO2 and SaO2 could have implications for patient assessment and management. This highlights the importance of interpreting SpO2 within clinical context.
Much of the information in this manuscript has been included in J Pilcher’s thesis for completion of a PhD at Victoria University Wellington. Additionally, portions of the data presented have been published as an abstract: Ploen L, Pilcher J, Beckert L, Swanney M, Beasley R. An investigation into the bias of pulse oximeters. Respirology. 2016; 21, 6.
JP, JC, MW, LB and RB contributed to the initial design of the study. JP, LP, SM, GB, JC, LH, SL, and MS acquired data. MW performed statistical analyses. JP, MW and RB analysed/interpreted the data. All authors contributed to the preparation of this manuscript. All authors have seen and approved the manuscript. Please note that MS has passed away.
Ethics approval and consent to participate
Ethical approval was obtained from the Northern B Ethics Committee in New Zealand (14/NTB/115) and the Western Sydney Local Health District Human Research Ethics Committee in Australia (LNR/14/WMEAD/387). Participants or their next of kin provided full written informed consent in New Zealand.
Consent for publication
RB reports grants from Fisher and Paykel healthcare, outside the submitted work. Of note, the Medical Research Institute of New Zealand receives funding from the HRC Independent Research Organisations Capability Fund. JP, JC and RB are members of the Thoracic Society of Australia and New Zealand Adult Oxygen Guidelines Group. RB is a member of the BTS Emergency Oxygen Guideline Group. All other authors declare that they have no competing interests.
- 4. FDA. Pulse Oximeters – Premarket Notification Submissions: Guidance for Industry and Food and Drug Administration Staff. 2013. https://www.fda.gov/downloads/MedicalDevices/DeviceRegulationandGuidance/GuidanceDocuments/UCM081352.pdf. Accessed June 2018.
- 7. Feiner JR, Severinghaus JW. Dark skin decreases the accuracy of pulse oximeters at low oxygen saturation: the effects of oximeter probe type and gender. Int Anesth Res Soc. 2007;105:S18–23.
- 32. Peng L, Yan C, Lu H, Xia Y. Evaluation of analytic and motion-resistant performance of the Mindray 9006 Pulse Oximeter. Med Sci Monit. 2007;13:19–28.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.