Introduction

Respiratory instability is common in extremely premature infants and results in fluctuations in oxygenation. These fluctuations are usually detected by pulse oximetry, which is the most common method to monitor arterial oxygen saturation (SpO2) in extremely premature infants. A study has shown that premature infants receiving supplemental oxygen can spend on average 16% of the time below the prescribed target range of SpO2.1 This is in large part the result of episodes of intermittent hypoxemia (IH) that are commonly observed in infants with increased frequency during the first four weeks after birth.2,3

To measure SpO2, pulse oximeters use red and infrared light transmitted by an emitter and sensed by a photoreceptor to detect the relative proportion of oxygenated and deoxygenated hemoglobin and extract the pulsatile elements of the transmitted light to estimate arterial oxygen saturation. Motion is common in neonates and can have a negative impact on the signal-to-noise ratio of SpO2. Motion artifact can affect the accuracy and signal quality of pulse oximetry, which may lead to erroneous readings and loss of data.4 Motion is often present during or preceding IH episodes. While motion is likely associated with the disturbance in ventilation that leads to IH,5,6 it is also possible that it may cause artifact and lead to falsely low SpO2 values. Some investigators have suggested that SpO2 readings of hypoxemia in the presence of motion may be false.7 However, there are insufficient data to completely rule out that some of these may be true IH episodes. Motion is not always continuous but rather intermittent, which may allow for accurate tracking of the actual arterial oxygen saturation by the pulse oximeter during a hypoxemia episode associated with motion.

Data are lacking on the impact of motion artifact on the reliability of SpO2 during hypoxemia episodes. This gap may influence the responsiveness of the clinical staff when episodes of IH are accompanied by patient activity. Lack of response to a true IH episode may be detrimental, whereas a false IH episode may result in unnecessary exposure to an increase in the fraction of inspired oxygen (FiO2), which is the most common response in a premature infant during IH episodes.

The objective of this study was to evaluate the reliability of pulse oximetry in detecting episodes of IH in the presence of motion by using simultaneously measured SpO2 values and pulse plethysmograph waveforms from two pulse oximeters in the same infant for validation.

Methods

Premature infants of gestational age ≤28 weeks admitted to the Newborn ICU at Holtz Children’s Hospital of the University of Miami/Jackson Memorial Medical Center who were enrolled in the Prematurity Related Ventilatory Control (Pre-Vent) study sponsored by the National Heart, Lung, and Blood Institute of the NIH at this site8 were considered eligible for this ancillary study if they required simultaneous monitoring with two pulse oximeters in two different extremities while receiving supplemental oxygen. Only infants with basal SpO2 within 90–95% who showed a difference in basal SpO2 levels between the two pulse oximeters ≤3% were included in the analysis. The Pre-Vent study was approved by the University of Miami IRB and the Jackson Memorial Clinical Research Office. Enrollment in the Pre-Vent study was done with waiver of consent. All data were deidentified.

All infants were monitored using the neonatal bedside monitors (Mx800 and Mx700, Philips Healthcare, Andover, MA) equipped with SpO2 and pulse plethysmograph (Pleth) measurement capabilities (non-rainbow X2, DSP V4.6.0.2, Masimo Signal Extraction Technology pulse oximeter9 SET®, Masimo, Irvine, CA). The SpO2 averaging time was 10 s. Cardiorespiratory data during routine clinical care were recorded from the serial port using mini computing devices (Intel Stick, Intel, Santa Clara, CA and Kangaroo mobile desktop, Infocus, Irvine, CA) using dedicated software (MediCollector Bedside, Medicollector, Winchester, MA) and by the Philips Data Warehouse System (DWC, Philips Healthcare, Andover, MA) that collects the data in a central file server.

As clinically indicated, two simultaneous SpO2 (SpO2-1 and SpO2-2) measurements were continuously obtained from two different extremities. Pleth was sampled at 125 samples per second. SpO2 data collected using MediCollector software were sampled at 0.9765 samples per second. SpO2 data collected with DWC were sampled at 1.024 samples per second.

The analysis of SpO2 and Pleth waveforms was done by an analytical software tool designed to achieve a non-subjective analysis of the motion artifact during IH episodes. IH episodes of two severities were evaluated. The working definitions of the two severities of IH episodes for purposes of the study were a decline in SpO2 < 90% for ≥10 s and a decline in SpO2 < 80% for ≥10 s, respectively. Episodes with SpO2 < 80% were a subset of those with SpO2 < 90%. The analytical tool allows the user, via a graphical interface, to manually select each hypoxemia episode with a decline <90% for ≥10 s in at least one of the SpO2-1 or SpO2-2 signals. The respective Pleth-1 and Pleth-2 waveforms were automatically analyzed for detection of motion during the episode. Each episode was classified as type A, B, C, or D based on the decline in SpO2 and the temporally associated presence of distortion in the associated Pleth as follows: A (true hypoxemia), both SpO2 show a decrease while only one Pleth shows distortion due to motion, thus indicating that the SpO2 affected by motion is showing a true decline in spite of motion; B (false hypoxemia), one SpO2 shows a decrease and its Pleth shows distortion due to motion while the other SpO2 does not decrease and its Pleth does not show distortion due to motion; C (suspected hypoxemia), both SpO2 show a decrease and both Pleth waveforms show distortion due to motion; and D (true hypoxemia-motion free), both SpO2 show a decrease and both Pleth waveforms show no distortion due to motion. This was repeated for episodes with SpO2 < 80% for ≥10 s. In addition, each episode with SpO2 < 90% was also classified according to the presence of motion artifact in the Pleth waveforms during the pre-episode baseline immediately preceding the start of the decline in SpO2 as well as during the period of the decline in SpO2. The pre-episode baseline, decline in SpO2 and the episode segments are illustrated in Fig. 1.
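
The episode definition and the four-way classification described above can be illustrated with a short Python sketch. This is not the analytical tool used in the study: the function names `find_episodes` and `classify_episode`, the default 1 Hz sample rate, and the automatic (rather than manual) selection of episodes are all assumptions made for illustration. Combinations that the four definitions do not cover return `None`.

```python
def find_episodes(spo2, threshold=90.0, min_duration_s=10.0, sample_rate_hz=1.0):
    """Return (start, end) index pairs where spo2 stays below `threshold`
    for at least `min_duration_s`. `end` is exclusive.
    The 1 Hz default is an assumption; the study sampled SpO2 near 1 Hz."""
    min_samples = int(round(min_duration_s * sample_rate_hz))
    episodes = []
    start = None
    for i, value in enumerate(spo2):
        if value < threshold:
            if start is None:
                start = i  # first sample below threshold
        elif start is not None:
            if i - start >= min_samples:
                episodes.append((start, i))
            start = None
    # Handle a run that is still below threshold at the end of the record.
    if start is not None and len(spo2) - start >= min_samples:
        episodes.append((start, len(spo2)))
    return episodes


def classify_episode(drop_1, drop_2, motion_1, motion_2):
    """Map the four boolean observations to episode type A, B, C, or D.
    drop_n: SpO2-n declined below threshold; motion_n: Pleth-n distorted."""
    if drop_1 and drop_2:
        if motion_1 and motion_2:
            return "C"  # suspected hypoxemia
        if motion_1 or motion_2:
            return "A"  # true hypoxemia, one signal affected by motion
        return "D"      # true hypoxemia, motion free
    if drop_1 and not drop_2 and motion_1 and not motion_2:
        return "B"      # false hypoxemia on signal 1
    if drop_2 and not drop_1 and motion_2 and not motion_1:
        return "B"      # false hypoxemia on signal 2
    return None         # combination not covered by the four definitions
```

For the 80% severity level, the same functions would be called with `threshold=80.0`.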

Fig. 1: Hypoxemia episode Type A (true hypoxemia).

The figure shows a simultaneous decline in both SpO2 signals. Only one Pleth waveform is affected by motion artifact, as indicated by an increase in autocorrelation index above baseline (lower panel, gray), while the other Pleth does not show motion and its autocorrelation index remains at baseline (lower panel, black). The arrows indicate the pre-episode baseline, declining SpO2 and the episode segments that were examined for the presence of motion artifact.

The analytic tool evaluated both Pleth waveforms and detected distortion in the signals due to motion by applying the autocorrelation technique.10 In brief, the autocorrelation technique was used to detect the level of deviation from the original periodic Pleth to identify segments distorted by motion. For this, the autocorrelation coefficient was calculated every 0.5 s from the immediately preceding 2 s Pleth segment and another 2 s Pleth segment that started 2.5 s before. The autocorrelation coefficient ranges between −1.0 and 1.0 and an autocorrelation coefficient approaching zero indicates reduced periodicity. An autocorrelation coefficient that exceeded the zero threshold for ≥2 s was used to indicate motion artifact. The analytic tool was validated by two observers who assessed the presence of motion-related distortion in the Pleth waveforms.
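
A sliding correlation of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the coefficient is taken to be the Pearson correlation between the most recent 2 s window and an equally long window that starts 0.5 s earlier (one reading of "started 2.5 s before"), and the names `pearson`, `sliding_autocorrelation` and `lag_s` are hypothetical. The thresholding step is not included.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.
    Returns 0.0 when either sequence is constant (undefined correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def sliding_autocorrelation(pleth, sample_rate_hz=125.0, window_s=2.0,
                            lag_s=0.5, step_s=0.5):
    """Return a list of (time_s, coefficient) pairs.
    At each step the latest `window_s` of signal is correlated with a window
    of the same length shifted `lag_s` earlier. The lag value is an
    assumption. Durations are rounded to whole samples."""
    window = int(round(window_s * sample_rate_hz))
    lag = int(round(lag_s * sample_rate_hz))
    step = max(1, int(round(step_s * sample_rate_hz)))
    results = []
    end = window + lag  # first index at which both windows are available
    while end <= len(pleth):
        current = pleth[end - window:end]
        earlier = pleth[end - window - lag:end - lag]
        results.append((end / sample_rate_hz, pearson(current, earlier)))
        end += step
    return results
```

With a perfectly periodic input whose period divides the lag, every coefficient is close to 1.0; irregular segments pull the coefficient toward zero.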

For each episode classified as A, B, C, or D the analytic tool subsequently calculated the difference in nadir values between the two SpO2 signals to evaluate changes in the magnitude of the decline of SpO2 into hypoxemia and the cross-correlation coefficient of the two SpO2 signals to evaluate differences in SpO2 profiles over the duration of the episode. A mean value was obtained for all calculated variables from each episode type for each infant. Statistical analysis consisted of within-subject comparisons using General Linear Model (GLM) for Repeated Measures using the IBM SPSS Statistics package Version 27 (International Business Machines Corporation, New York). A p-value of <0.05 was considered statistically significant.
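
The per-episode measures and per-infant averaging can be sketched in Python as below. This is an illustrative sketch only: the cross-correlation coefficient is assumed here to be the zero-lag Pearson correlation of the two SpO2 traces over the episode, which the text does not specify, and the function names and the `(infant_id, episode_type, value)` record layout are hypothetical.

```python
import math
from collections import defaultdict


def nadir_difference(spo2_1, spo2_2):
    """Absolute difference between the lowest values of the two SpO2 traces."""
    return abs(min(spo2_1) - min(spo2_2))


def correlation_coefficient(x, y):
    """Zero-lag Pearson correlation of two equal-length traces (assumption:
    used here as a stand-in for the cross-correlation coefficient).
    Returns 0.0 when either trace is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def mean_by_infant_and_type(records):
    """Average a per-episode value within each (infant, episode type) pair.
    `records` is an iterable of (infant_id, episode_type, value) tuples."""
    groups = defaultdict(list)
    for infant_id, episode_type, value in records:
        groups[(infant_id, episode_type)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}
```

The resulting per-infant means are the kind of values that would then enter a within-subject comparison across episode types.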

Results

Data from 22 infants monitored with two pulse oximeters simultaneously were reviewed. Of these, 2 infants were excluded due to pre- to post-ductal differences in basal SpO2. Data from 20 infants were included in the analysis. These 20 infants were born at a gestational age of [median (interquartile range)] 25 (23–27) weeks and weighed 669 (526–850) g at birth. Nine were female and 11 male. Nine infants were black, 2 white and 9 Hispanic. The postnatal age at the time of study was 13 (8–20) days. At the time of study, 18 infants were receiving invasive mechanical ventilation and 2 non-invasive respiratory support. The FiO2 at the time of study was 0.45 (0.30–0.73). Around the time of study 12 infants had been diagnosed with PDA. At the time of study 5 infants were receiving caffeine and 7 infants were on anticonvulsant medications. These treatments did not change during the period included in the analysis.

In these infants two pulse oximeter probes were placed to evaluate pre- to post-ductal gradient in SpO2. Only those infants who did not show differences in basal SpO2 levels were included. Continuous data periods of 24 to 72 h duration per infant were evaluated. The duration of the data period depended on the time the two SpO2 probes were kept in place for each infant. A total of 283 episodes with SpO2 < 90% for ≥10 s were evaluated, for a mean of 14.1 ± 5.7 episodes per infant. In 158 episodes SpO2 decreased <80% for ≥10 s, for a mean of 7.9 ± 5.5 episodes per infant. These episodes are a subset of those episodes with SpO2 < 90%. Figures 1 to 4 show representative examples of each type of episode as classified by the analytical tool based on the observed decline in each of the SpO2 signals and the presence of motion artifact in their respective Pleth. These examples also illustrate the intermittent occurrence of distortion in Pleth due to motion.

Fig. 2: Hypoxemia episode Type B (false hypoxemia).

In this example, one SpO2 signal declines into hypoxemia while the other one remains stable. The Pleth waveform from the declining SpO2 showed motion detected by an increase in the autocorrelation index above baseline (lower panel, black).

Fig. 3: Hypoxemia episode type C (suspected hypoxemia).

In this example both SpO2 signals declined while motion was detected in both Pleth waveforms by increases in the autocorrelation index above baseline (lower panel, black and gray).

Fig. 4: Hypoxemia episode type D (true hypoxemia-motion free).

The figure shows both SpO2 signals declined simultaneously while none of the Pleth waveforms showed motion artifact. The autocorrelation index from both Pleth remained at baseline (lower panel, black and gray).

Table 1 shows data on the number and proportion of episodes with SpO2 < 90% as classified by the analytic tool according to the presence of motion artifact during the entire episode. As shown in the table, most of these episodes were classified as type C (suspected hypoxemia) with motion artifact identified in both Pleth waveforms while SpO2-1 and SpO2-2 simultaneously declined <90%. More importantly, analysis revealed that episodes type B (false hypoxemia with motion artifact) were significantly less prevalent than those of type A (true hypoxemia with motion artifact). This is the comparison of most interest as the SpO2 signal from a non-moving extremity serves as reference for the SpO2 signal from the moving extremity.

Table 1 Classification of episodes with SpO2 < 90% according to the presence of motion during the entire episode.

Table 1 also provides further insight into the profile of the two SpO2 waveforms according to the episode classification. The mean basal SpO2 values for both SpO2-1 and SpO2-2 signals were comparable between the two pulse oximeters and among the episode types. The cross-correlation coefficient between SpO2-1 and SpO2-2 signals for the entire duration of the episode was significantly lower for episode type B (false hypoxemia). The difference in nadir between the two SpO2 was significantly larger for episodes classified as type B (false hypoxemia), whereas episodes of types A, C, and D had higher correlation coefficients and smaller differences in nadir between the two SpO2 signals, and these measures did not differ significantly among types A, C, and D.

Table 2 shows the number and proportion of episodes with SpO2 < 80% classified by the analytic tool according to the presence of motion artifact during the entire episode. Similar to what was observed in episodes with SpO2 < 90%, most of the episodes with SpO2 < 80% were classified as type C (suspected hypoxemia) and episodes type B (false hypoxemia with motion artifact) were significantly less prevalent than type A (true hypoxemia with motion artifact).

Table 2 Classification of episodes with SpO2 < 80% according to the presence of motion during the entire episode.

Table 3 shows the number and proportion of episodes with SpO2 < 90% classified by the analytic tool according to the presence of motion artifact during the period of decline in SpO2 from baseline to the nadir level. Similar to what was observed when episodes were classified according to the presence of motion during the entire episode, classification during the declining SpO2 period also showed that type B episodes (false hypoxemia with motion artifact) were significantly less prevalent than type A (true hypoxemia with motion artifact).

Table 3 Classification of episodes with SpO2 < 90% according to the presence of motion during the period of declining SpO2.

Table 4 shows the number and proportion of episodes with SpO2 < 90% classified by the analytic tool according to the presence of motion artifact during the pre-episode baseline period. Similar to the classification according to the presence of motion during the entire episode, classification during the pre-episode baseline period showed that type B episodes were significantly less prevalent.

Table 4 Classification of episodes with SpO2 < 90% according to the presence of motion during the pre-episode baseline.

Discussion

Pulse oximetry measurements in extremely premature infants can be affected by artifact due to movement of the extremity where the probe is applied.4,9,11,12 Motion can alter the light traveling to the photoreceptor end of the probe and it can possibly induce venous blood pulsations, which could lead to underestimation of arterial oxygen saturation by SpO2.

The possible effects of motion on SpO2 accuracy are more important in premature infants who present with episodes of IH. This uncertainty can influence the trust of caregivers in SpO2 and thereby influence their responsiveness when SpO2 declines into hypoxemia. In these infants an IH episode is usually attended by a transient increase in FiO2, ventilator settings if the infant is receiving mechanical ventilation, or tactile stimulation if the episode is related to apnea, in an attempt to reduce its severity or duration. The presence of motion due to increased patient activity that is temporally associated with the occurrence of the IH episode could influence the responsiveness of the bedside caregivers to the episode. An artifactually low SpO2 reading accompanied by motion may result in an unnecessary exposure to increased FiO2 or ventilatory support, whereas a low SpO2 reading that reflects true hypoxemia despite motion may expose the infant to prolonged and/or severe hypoxemia.

This is the first study to use clinically obtained data to evaluate the reliability of SpO2 during motion artifact in extremely premature infants. The data provided by this study show that in most of the IH episodes where one probe was affected by motion while the other probe was not, thereby serving as a reference, SpO2 can adequately track hypoxemia (type A vs. B). In those episodes (type A) the difference in nadir level between the two SpO2 signals was small and comparable to the difference in nadir observed in episodes of type D, when neither of the two SpO2 signals was affected by motion. The cross-correlation coefficient between the two SpO2 signals in episodes type A (true hypoxemia during motion) was high and comparable to those of type D (no motion), which indicated the timing of decline and recovery were not affected by the motion artifact. These findings can be explained in part by the fact that motion of the extremities in premature infants is not always continuous but rather intermittent, and therefore it is possible that pulse oximeters can provide SpO2 readings that track the true values of arterial oxygen saturation with sufficient accuracy by including in the averaged SpO2 reading those values measured during intermittent non-motion segments.

This study also showed that in most of the IH episodes that were evaluated motion affected both SpO2 signals, with artifact detected in both Pleth waveforms. These episodes were classified as type C (suspected hypoxemia) because neither SpO2 signal could be used as a reference for the other for analysis purposes. Although there are insufficient data to determine if these episodes represent true hypoxemia, it can be argued based on the differences between type A and B episodes that a considerable proportion of the type C episodes could actually reflect true hypoxemia.

In this study the working definition of IH episodes consisted of a decline in SpO2 below 90%. Although it can be argued that these are mild fluctuations that do not represent actual hypoxemia, it was reassuring to observe that evaluation of more severe episodes with SpO2 declining below 80% showed similar proportions as those <90% when classified according to the presence of motion artifact in the Pleth.

The main reason this cohort of extremely premature infants was included in the analysis was that they were monitored by two pulse oximeters simultaneously. In this cohort this was done clinically to evaluate differences between pre- and post-ductal SpO2 levels. Although the possible influence of pre- to post-ductal differences cannot be completely ruled out, it is likely they had minimal impact. This is because infants included in the analysis had to have similar basal SpO2 levels. Also important to note is the fact that the SpO2 signal affected by motion was not consistently measured in a pre- or a post-ductal extremity. The episodes were classified according to the detected motion in the Pleth regardless of which of the two SpO2 signals was being affected. In other words, in some episodes the pre-ductal probe was affected by motion while the post-ductal probe was not and served as reference, and in other episodes the reverse was true.

Although these findings primarily apply to the pulse oximeter used in this center, they can be cautiously extrapolated to other pulse oximeter devices. This is because the motion artifact is intermittent in nature rather than continuous, which does not appear to affect the ability of a pulse oximeter to track decreasing arterial oxygen saturation in the presence of motion.

An important aspect of the analysis in this study was the use of an analytical tool developed specifically for the study to classify IH episodes into the four types described above. This tool analyzed the Pleth waveforms to detect distortion. This distortion was mainly attributed to motion artifact, which disrupts the transmission and detection of red and infrared light that is used to measure SpO2. It is possible that the distortion in the Pleth waveforms was due to non-arterial blood pulses, but motion is likely the main factor producing those pulses. The use of this analytical tool provided a non-subjective evaluation while reducing the risk of bias.

Although the data analyzed in this study are representative of what occurs clinically in the newborn ICU, these findings need to be validated with a more robust study design using two pulse oximeters in infants without suspicion of pre- to post-ductal differences in SpO2 and including video recordings or use of other techniques to detect motion in the extremity where the pulse oximeter probe was applied.

The findings of this study are significant because a lack of, or excessive, trust in SpO2-detected hypoxemia events may have detrimental effects in extremely premature infants. Pleth waveforms are generally displayed by pulse oximeters, and caregivers often observe the loss in periodicity as a marker of artifact or observe increased patient motion and assume that it is likely the reason for the low SpO2 alarm. Based on the present findings such a response may be erroneous. Lack of response when SpO2 reflects true hypoxemia in the presence of motion may expose the infant to prolonged hypoxemia. This is particularly relevant in view of the findings from large trials indicating prolonged IH episodes are associated with poor neurodevelopmental outcome, more BPD, and severe retinopathy of prematurity in this population.2,3,13,14,15

In conclusion, analysis of SpO2 data collected from two pulse oximeters in the same extremely premature infant shows that in the presence of motion and IH episodes, SpO2 is likely to reflect a true hypoxemia event. The observations in this study were made during routine conditions of standard clinical care in the newborn ICU. These findings should be used as a framework for future studies conducted to further validate the reliability of pulse oximetry in the presence of motion and other sources of artifact under more controlled conditions.