Anesthesia information and management systems (AIMS) are increasingly being adopted in anesthesia practice. Such automated patient records are considered superior to handwritten records as they are less time-consuming and retain comparable or higher accuracy.1-5 An AIMS is not only useful as an instrument to enhance anesthesia record-keeping or support clinical decisions, but it can also be useful as a resource to answer clinical research questions, and in particular to generate hypotheses for further research.6-13

Although the quality of data capturing and storage in the AIMS database is considered highly accurate, not all stored values are necessarily based on reliable measurements. In retrieving data from the monitoring systems, an AIMS cannot determine whether a certain value is a “true” value or an artifact that occurred while measuring the value (e.g., false low oxygen saturation caused by a dislocated pulse oximeter). Consequently, when using the data for research purposes, it may be difficult or even impossible to distinguish between true values and artifacts. Obviously, artifacts may influence research results. If the occurrence rate of artifacts is high, research results based on AIMS data may be unreliable estimates. Moreover, artifacts may introduce bias if certain artifacts are associated exclusively with certain procedures. To prevent the AIMS from storing artifacts as true values, intelligent filtering can be applied during data capturing. Still, this does not completely prevent artifacts from being stored as “real” values.

The incidence of artifacts in AIMS databases and the procedure- or time-specific associations of artifacts that may influence research results have not yet been determined. In this study, we assessed the reliability of AIMS data for research purposes by prospectively estimating the occurrence rate of artifacts in the vital parameter values recorded in our AIMS database. Moreover, we recorded the causes of artifactual values stored in the AIMS.

Methods

This prospective observational study included 86 adult patients who underwent ear-nose-throat (ENT), general, or neurosurgery requiring general anesthesia. The numbers of each type of procedure were allocated to approximate an even distribution of anesthesia time over the three types of surgery. All procedures were performed in a tertiary referral centre (University Medical Center, Utrecht, The Netherlands) within a six-week period in 2010. The need for written informed consent was waived by the Medical Research Ethics Committee of the University Medical Center, Utrecht (final approval October 5th, 2010). According to requirements of Dutch law, the anonymity and confidentiality of routinely collected clinical data were assured.

All patients were monitored by a Datex Ohmeda S/5™ monitoring system (GE Healthcare, Waukesha, WI, USA) that has a built-in filter for artifacts. The heart rate (HR) displayed on the monitor is derived from the electrocardiogram (ECG), the plethysmogram, or the invasive blood pressure (IBP) curve, and it is updated every five seconds by calculating the mean HR over the last ten seconds. The oxygen saturation is derived from the plethysmogram and displayed beat-to-beat. After the monitoring system automatically determines the J-point in the electrocardiogram, the ST-segment deviation (elevation or depression in millimetres) is updated every five seconds by calculating the mean ST-segment over the last eight QRS complexes. The noninvasive blood pressure (NIBP) is displayed each time it is measured, and the IBP curve is displayed beat-to-beat.

Values from the Datex monitoring system are stored automatically in a locally developed AIMS (Vierkleurenpen©, version 1.4.5, 2010) that samples data from the monitoring system every five seconds. To limit the storage of artifacts, data are recorded in the database and displayed every minute, but only after a filter is applied: the median value per minute is calculated and stored for HR, oxygen saturation, ST-segment, and IBP (systolic, diastolic, and mean), while the NIBP (systolic, diastolic, and mean) is recorded every time it is measured (no filtering). Storing the median value per minute was considered an effective method to prevent the AIMS from storing the majority of the artifacts caused, for example, by positioning the patient or electrocautery. Such artifacts are usually short in duration and thus seldom influence the median of the twelve values captured per minute.
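The one-minute median filter described above can be illustrated with a minimal Python sketch. It is not the Vierkleurenpen implementation; the function name `median_per_minute`, the sample values, and the choice to drop incomplete minutes are assumptions made for illustration only.

```python
from statistics import median

SAMPLES_PER_MINUTE = 12  # one sample every five seconds


def median_per_minute(samples):
    """Collapse five-second samples into one stored value per complete minute.

    Assumption: a trailing, incomplete minute is simply ignored; the paper
    does not describe how the AIMS handles missing samples.
    """
    stored = []
    last_start = len(samples) - SAMPLES_PER_MINUTE
    for start in range(0, last_start + 1, SAMPLES_PER_MINUTE):
        window = samples[start:start + SAMPLES_PER_MINUTE]
        stored.append(median(window))
    return stored


# Hypothetical heart-rate samples: a steady rhythm around 72 bpm with a
# 15-second disturbance (three corrupted samples), e.g. from electrocautery.
one_minute = [72, 72, 73, 180, 185, 179, 72, 71, 72, 73, 72, 72]
filtered = median_per_minute(one_minute)
```

In this made-up minute the three corrupted samples do not move the median, whereas a disturbance covering more than half of the twelve samples would be stored as the value for that minute.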

All data for HR, ST-segment, oxygen saturation, NIBP, and IBP were collected automatically by the institutional AIMS. In addition, data were collected manually in the operating room by the first author, who was present for all procedures reported in this investigation and attended to the monitor as well as the values stored in the AIMS. We included three surgical types (ENT, general, and neurosurgical) in our study since artifacts are thought to be influenced by the type and location of surgery.

Data were collected from the time the monitoring system was connected to the patient until the monitor was disconnected. All data stored in the AIMS (i.e., those values displayed on screen) were evaluated as either reliable values or possible artifacts. An artifact was defined as any value that did not reflect the patient’s current physiologic state, as defined below.

Definition of an artifact

We defined an artifact a priori as a value meeting one or more of the following criteria:

  1. For purposes of this investigation, an artifact was defined as any value deviating outside a biologically plausible range (see Table 1). A fixed plausible range was used for HR and oxygen saturation. For the ST-segment, NIBP, and IBP, an individual baseline value was calculated immediately after induction of anesthesia, using all values before induction. The average (NIBP / IBP) or median (ST-segment) of these values was then defined as the normal range.

  2. A value of NIBP or IBP was considered a possible artifact if it deviated ≥ 30% from the preceding value (Table 1).

  3. A value was considered an artifact, even if it was within a previously defined “normal” range, when it was clearly observed as being unreliable based on the investigator’s consultation with the anesthesiologist regarding observations in the operating room (e.g., if the surgeon leaned on the NIBP cuff).

Table 1 Definition of baseline and deviating values

A value that was considered a possible artifact was verified with the attending anesthesiologist or anesthetic nurse as to whether it reflected the patient’s physiologic state. Values considered artifacts were manually recorded together with their causes. The investigation considered only data that were stored in the AIMS; therefore, data filtered out by the Datex monitoring system (e.g., by electrocautery filtering) were not considered.
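The second criterion (a blood pressure value deviating ≥ 30% from the preceding value) lends itself to a simple screening rule. The following Python sketch is illustrative only: the function name `flag_abrupt_changes` and the pressure values are hypothetical, and in the study a flagged value was only a possible artifact until it had been verified with the attending clinician.

```python
def flag_abrupt_changes(values, threshold=0.30):
    """Return the indices of values that deviate from the preceding value
    by at least `threshold`, expressed as a fraction of the preceding value."""
    flagged = []
    for i in range(1, len(values)):
        previous = values[i - 1]
        if previous == 0:
            continue  # relative change is undefined; skipped in this sketch
        if abs(values[i] - previous) / abs(previous) >= threshold:
            flagged.append(i)
    return flagged


# Hypothetical systolic NIBP readings (mmHg) with one implausible drop.
systolic = [120, 118, 80, 119, 121]
possible_artifacts = flag_abrupt_changes(systolic)
```

Because each value is compared only with its predecessor, both the drop to 80 mmHg and the return to 119 mmHg are flagged here; a rule of this kind therefore needs the clinical verification step described above.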

For purposes of this investigation, we expressed the number of artifacts for each parameter as a percentage of the total number of observations with the respective 95% confidence interval (CI). An episode was defined as a period of deviation that included one or more consecutive artifacts. The number of artifacts and deviating values per episode were calculated and expressed as medians with an interquartile range. In addition, frequencies of the different causes for artifacts were counted.
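As an illustration of the episode definition, the sketch below groups consecutive artifact flags into episodes and summarizes the episode lengths as a median with quartiles. The flags are invented, the helper name `episode_lengths` is hypothetical, and the quartile method ("inclusive") is an assumption, since the paper does not state how the interquartile range was computed.

```python
from itertools import groupby
from statistics import median, quantiles


def episode_lengths(is_artifact):
    """Return the length of each run of consecutive True flags.

    One run of one or more consecutive artifacts counts as one episode.
    """
    return [sum(1 for _ in run) for flag, run in groupby(is_artifact) if flag]


# Hypothetical per-minute artifact flags for one parameter in one patient.
flags = [False, True, True, False, False, True, False, True, True, True]
lengths = episode_lengths(flags)  # one entry per episode
median_length = median(lengths)
q1, _, q3 = quantiles(lengths, n=4, method="inclusive")
```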

Outcome

The primary outcome was the percentage of artifacts for each of the included parameters with a 95% CI. Secondary outcomes included the number of values deviating from the predefined baseline value and the percentage of these deviations being caused by artifacts. In addition, we reported the number of episodes across which these values and artifacts were distributed. Finally, we determined the most common causes of artifactual values.
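A percentage with its 95% CI can be computed in several ways, and the paper does not state which interval method was used. The sketch below assumes the Wilson score interval, a common choice for small proportions; the function name `wilson_interval` and the example count are hypothetical and are not taken from the study tables.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 63 artifacts among 2,754 stored values.
low, high = wilson_interval(63, 2754)
print(f"{100 * 63 / 2754:.1f}% (95% CI: {100 * low:.1f} to {100 * high:.1f})")
```

Unlike the simple normal approximation, this interval stays within 0 to 1 when the observed proportion is close to zero, which matters for rarely affected parameters.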

Analysis and statistics

To estimate the percentage of artifacts in the AIMS database, we performed a sample size calculation to determine the minimum number of values to include in the study. In this calculation, we assumed a maximum incidence of artifacts of 4%, and we wanted to rule out an incidence > 5% with 95% certainty, implying that the width of the 95% CI should be a maximum of 1% on each side. Furthermore, we assumed that the probability that each stored value was an artifact was independent of the occurrence of any other artifact, i.e., artifacts within patients were independent.

As the NIBP is the measurement performed least frequently, the sample size calculation was based on the assumption that 4% of the stored NIBP values would be artifacts. The further assumptions that the NIBP was measured at least every five minutes and that the upper limit of the 95% CI would not exceed 5% led to a required minimum of 1,850 NIBP measurements or 9,250 min (1,850 × 5 min) of “anesthesia time”.
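The reasoning behind such a calculation can be sketched with the standard normal-approximation formula for estimating a proportion, n = z² p (1 − p) / d², where d is the CI half-width. This formula is an assumption: the paper does not state which formula it used, so the result of the sketch should not be expected to match the reported 1,850 measurements. The function name `required_n` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_n(p, half_width, confidence=0.95):
    """Minimum number of independent observations so that the
    normal-approximation CI for a proportion p has the given half-width."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil(z * z * p * (1 - p) / half_width ** 2)


# Assumed inputs from the text: 4% expected artifacts, 1% on each side.
n_values = required_n(p=0.04, half_width=0.01)

# With at least one NIBP measurement every five minutes, the required
# number of measurements translates into minutes of anesthesia time.
anesthesia_minutes = n_values * 5
```

The formula relies on the same independence assumption stated above; clustered artifacts would increase the number of observations needed.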

Results

Eighty-six patients were included in the study over a period of 9,534 min of anesthesia time. Heart rate, ST-segment, oxygen saturation, and NIBP were measured in all patients; IBP was measured in 12 patients. Baseline characteristics are shown in Table 2. The mean (standard deviation; SD) age was 55.1 (17.9) yr with differences between specialties: ENT: 60.8 (16.9) yr; general surgery: 52.2 (17.6) yr; and neurosurgery: 49.7 (17.8) yr.

Table 2 Baseline characteristics

The numbers of values stored during the 9,534 min were: HR 9,442, oxygen saturation 9,415, ST-segment 9,026, and NIBP 2,754 (Table 3). Table 3 also provides stratification by specialty and the number of episodes over which the deviating values and artifacts were distributed. Overall, the percentage of artifacts was 0.0 for HR (95% CI: 0.0 to 0.1), 0.3 for oxygen saturation (95% CI: 0.2 to 0.5), 4.7 for ST-segment (95% CI: 4.3 to 5.2), 2.3 for NIBP (95% CI: 1.8 to 2.9), and 14 for IBP values (95% CI: 12 to 15).

Table 3 Results of analysis performed on all values during anesthesia

Artifacts as a percentage of total deviations represented 1.6 for HR (95% CI: 0.4 to 5.7), 24 for oxygen saturation (95% CI: 18 to 32), 83 for ST-segment (95% CI: 76 to 87), 3.3 for NIBP (95% CI: 2.5 to 87), and 27 for IBP values (95% CI: 24 to 31), as shown in the last column of Table 3.

We found that many of the artifacts occurred before incision and after surgical closing. Since these periods are known to be times of extreme variability, a sensitivity analysis was performed that excluded these specific periods (Table 4). The results for HR, oxygen saturation, ST-segment, and NIBP were minimally affected compared with the overall analysis. However, for IBP, the incidence of artifacts decreased from 14% to 3.9%, and the percentage of deviating values that were artifacts decreased from 27% to 4.4% (Table 4). Furthermore, we found that relocation of the ECG electrodes was a major cause of ST-segment artifacts, although it occurred in only four patients undergoing procedures in the thoracic region or procedures requiring the patient to be placed in a prone position (three general surgery patients and one neurosurgery patient). An additional analysis excluding these patients resulted in a decrease in the incidence of ST-segment values being artifacts to 0.8% (95% CI: 0.7 to 1.1) and a decrease in the incidence of deviating ST-segment values being artifacts to 61% (95% CI: 51 to 71).
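A sensitivity analysis of this kind amounts to recomputing the artifact percentage on the subset of values recorded between incision and closure. The Python sketch below uses invented records and a hypothetical helper, `artifact_percentage`; it illustrates the restriction step only and does not reproduce the study data.

```python
def artifact_percentage(records, start_minute=None, end_minute=None):
    """Percentage of artifacts among (minute, is_artifact) records,
    optionally restricted to start_minute <= minute <= end_minute."""
    selected = [
        is_artifact
        for minute, is_artifact in records
        if (start_minute is None or minute >= start_minute)
        and (end_minute is None or minute <= end_minute)
    ]
    if not selected:
        raise ValueError("no values in the selected window")
    return 100.0 * sum(selected) / len(selected)


# Hypothetical IBP record: artifacts cluster around positioning (start)
# and emergence (end); incision at minute 2, closure at minute 7.
records = [(0, True), (1, True), (2, False), (3, False), (4, False),
           (5, False), (6, True), (7, False), (8, True), (9, True)]
overall = artifact_percentage(records)
intraoperative = artifact_percentage(records, start_minute=2, end_minute=7)
```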

Table 4 Results of sensitivity analysis performed on values between surgical incision and closure

Table 5 shows the direction of the deviating values (upwards or downwards) and the number of deviations per episode. Overall, 96% of the HR deviations were upwards (tachycardia), and only 2.7% of NIBP deviations were upwards (hypertensive) compared with 25% of the IBP deviations.

Table 5 Direction of deviations and number of deviations per episode

The most common causes for artifacts in oxygen saturation, ST-segment, NIBP, and IBP were dislocation of the pulse oximeter (65%), relocation of ECG electrodes (83%), manipulation of the blood pressure cuff (84%), and relocation of either the IBP sensor or the patient (53%) (Table 6).

Table 6 Causes of artifacts

Discussion

The aim of this study was to assess the reliability of the data in an AIMS database. We determined that the AIMS database provides reliable data for HR and oxygen saturation, whereas NIBP values show an error rate of up to 4.6%. For ST-segment and IBP values, the incidence of artifacts stored as values varied from 2% to 9% and from 11% to 34%, respectively, depending on the type of surgery. These results were obtained with application of a static one-minute median filter. As many other types of AIMS database filters exist, results may vary per system. This should be taken into account when using data from an AIMS database.

We chose to analyze records from patients undergoing general, ENT, or neurosurgical procedures. We considered these procedures to offer a substantial and meaningful variety of artifact incidences and causes as a consequence of their variations in anatomic location, surgical approach, and patient characteristics. This variation was reflected in our results, which show differences in the incidence of artifacts among the three specialties (Tables 3 and 4). Overall, we consider the incidence of artifacts in HR, oxygen saturation, ST-segment, and NIBP to be an acceptable error rate in an AIMS; hence, in our opinion, data regarding these parameters derived from an AIMS can be used for research purposes. It should be taken into account, though, that approximately 80% of the deviating ST-segment values appeared to be artifacts, as did approximately 25% of the deviating oxygen saturation and IBP values. When we excluded patients whose ECG electrodes were relocated, the overall incidence of ST-segment artifacts decreased to 0.8%; nevertheless, approximately 60% of the deviating values were artifacts. For research using blood pressure values, the study objective has to be considered carefully. The IBP may seem more reliable than the NIBP because it is measured more frequently. On the other hand, the IBP contains more artifacts, mainly at the beginning and end of the procedure. However, by including only the period from incision to surgical closing, the percentage of deviating IBP values that are artifacts can be decreased to 4.4%. How much weight the occurrence of artifacts deserves when using AIMS data depends on the goal of a particular study. When evaluating the causes of these artifacts, it seems that the most frequent ones (e.g., relocation of ECG electrodes, leaning against the blood pressure cuff, and relocation and delayed zeroing of the invasive pressure sensor) can easily be avoided.

It is difficult to compare the results from this study with the available literature. Although some studies have compared the incidence of artifacts in automated records with that in manually entered records, the aim of our study was to determine prospectively the number of artifacts stored in an AIMS database. Moreover, in previous studies, the incidence was expressed as the number of cases (patients) containing artifacts rather than the number of artifacts occurring in a given number of values. Eden et al.3 primarily investigated the potential of an AIMS and the accuracy of its data entry for 4,429 procedures, and they found that 12% of procedures contained at least one HR artifact and 2% contained at least five. Furthermore, they found that 1.5% of the cases contained at least four extreme values (HR < 20 or > 180 and oxygen saturation < 80%), 60% of which were artifacts. Edsall et al.4 compared manual and computerized anesthesia records with respect to time demands and record quality. They found two artifacts (one in oxygen saturation and one in expiratory carbon dioxide) in the computerized records, but they included only five AIMS-recorded patients. Gostt et al.14 developed an algorithm to annotate pulse oximetry artifacts automatically and tested its accuracy in routine surgical procedures. They designed the algorithm to label all oxygen saturation values < 90% in 20 surgical patients. Thirteen values < 90% were found, and nine (69%) of these were artifacts.

With regard to the causes of artifacts, Takla et al.15 provided a list of the most common causes, but they did not quantify them. Görges et al.16 studied the alarms in a medical intensive care unit and classified these as effective, ineffective, or actively ignored. They showed that the number of ineffective and ignored alarms, which can be interpreted as artifacts, could be decreased if the alarm presentation was delayed by 19 sec. This suggests that the duration of a deviation is important in differentiating between true deviations and artifacts. In our study, we made comparable observations, particularly in ST-segment and IBP measurements (Table 5).

In the present study, values deviating from the normal range were defined in accordance with clinically used definitions of tachy- and bradycardia, hypoxia, ST-elevation and ST-depression, and hypo- and hypertension.13 As such, definitions partially include individual baseline values (i.e., values obtained before induction of anesthesia) with a corresponding normal range; a normal range for each parameter in every individual patient was calculated immediately before surgery. The resulting normal range may not be fully representative of the physiologic state of the patient, since patients can be stressed before undergoing surgery. However, we do not think that this influenced the results because we registered artifactual values that deviated from the individual normal range and values that fell within this normal range. A second limitation is that values considered as being artifacts were not evaluated by a reference test to confirm whether they indeed were artifacts; they were only verified immediately with the attending anesthesiologist or anesthetic nurse. Third, we found a high artifact incidence in IBP values (Table 4); however, this was found in a small subset of the investigation (12 out of 86 patients). Importantly, all of these IBP artifacts occurred during positioning of the patient or during emergence from anesthesia, and in most cases, they were due to relocation of the pressure sensor. In addition, 17% (57 values) of the IBP artifacts were caused by a dampened curve, which can be caused by clotting of the arterial catheter, in which case it is considered an artifact, but it can also be caused by cardiogenic shock. In general, when analyzing IBP data, a post hoc analysis can be performed to determine whether the dampened curve was caused by clotting. This can be achieved by comparing the IBP values with the corresponding NIBP values. However, NIBP is mostly measured infrequently if continuous IBP measurements are in use. 
Finally, in our sample size calculation, we assumed that the artifacts would occur independently of each other. However, our results show that artifact episodes within a single patient often contained multiple values, especially in the ST-segment and IBP values, suggesting that the occurrence of artifacts is not “independent”. It can therefore be argued that we underestimated the required “anesthesia time” in our sample size calculation. Nevertheless, our sample size calculation was based on the least frequently measured parameter (NIBP) in which most artifacts actually did occur independently (i.e., episodes contained single artifacts).

In conclusion, we found that storing the median value per minute as a filter when capturing continuous vital parameter values in an AIMS database provides reliable data for HR and oxygen saturation, with artifact rates below 0.5%, and acceptable reliability for NIBP data, with a 2.3% artifact rate. The presence of artifacts should be taken into account in research using vital parameter data from AIMS databases, especially when using IBP values; data should also be checked for reliability. For such research, knowledge about the method of artifact filtering for both the monitoring system and the AIMS is essential, and studies using AIMS data should describe methods of data acquisition, filtering, and storage.