Background

There are no widely accepted universal standards for data quality in health systems research, despite several articles and reports emphasizing their importance [1,2,3,4,5,6,7,8,9,10]. While there are known methods for assessing data quality in patient registries and health information systems, there are few published methodological guidelines for integrating Data Quality Assurance (DQA) protocols into large-scale health systems research trials, especially in resource-limited settings [5, 9, 11,12,13,14]. High-quality data are crucial in health systems research as scientific recommendations based on those data have implications for policy and practice [5, 8].

Error rates in clinical trials have been described in the literature ranging from 2.8% to 26.9% across multiple studies [15,16,17,18,19,20]. There are no minimally acceptable data-quality standards included in US Federal guidelines for clinical research; therefore, researchers establish their own acceptable error rates and measurement methods [10]. Onsite monitoring of clinical trial sites and database audits occur; however, published systematic approaches to field verification of data quality during trial implementation are rare, and their absence limits opportunities to remediate data-quality issues in real time [14, 21, 22]. Clinical trials often require multiple data collection activities, all subject to different sources of error; therefore, DQA activities often must target multiple dimensions of quality [14, 16, 23]. DQA methods must address all possible sources of error in an integrated, systematic, and supportive manner to promote continuous data quality improvement throughout implementation [24, 25].

The BetterBirth Trial is a matched-pair, cluster-randomized controlled trial (RCT) of the BetterBirth Program, which uses coaching-based implementation of the World Health Organization (WHO) Safe Childbirth Checklist to improve quality of facility-based deliveries in Uttar Pradesh, India, and to reduce 7-day maternal and neonatal morbidity and mortality [26]. This complex and large-scale trial includes three sources of data: patient registry, delivery observation, and post-delivery patient-reported outcomes data. In the trial, over 6300 deliveries were observed, and over 153,000 mother-baby pairs across 120 study sites were followed to assess health outcomes [27]. We designed and implemented the Data Quality Monitoring and Improvement System (DQMIS), a robust, multi-component, and integrated DQA mechanism, to ensure high-quality data throughout study implementation. This study aimed to evaluate the DQMIS and its effectiveness for ensuring data quality. In the absence of published approaches to field verification of data quality during trials, here we report the implementation components and results of an integrated DQA system.

Methods

Data collection activities in the BetterBirth Trial

The trial included five data collection activities related to the three sources of data. 1) Essential birth practices performed by birth attendants during deliveries were observed and recorded by facility-based observers. Following observations, 2) the observation data recorded on paper forms were transferred to the electronic data entry app. 3) Patient data, sourced from paper-based facility registers, were extracted by facility-based data collectors and entered into a paper-based study register. Following data extraction, 4) patient data were transferred to the electronic data entry app by facility-based data collectors. Finally, 5) call center staff contacted patients to assess maternal and neonatal mortality and seven maternal morbidities using a standardized questionnaire [27] and entered these data directly into the electronic data collection app.

Design of the DQMIS

We designed the DQMIS to reinforce six dimensions of data quality [28] (Table 1). The DQMIS comprised of five complementary functional components, including: 1) a monitoring and evaluation (M&E) team to support data management and quality; 2) a DQA protocol, including data collection audits and targets, rapid data feedback, and supportive supervision; 3) training on data quality; 4) standard operating procedures (SOPs) for data collection; and 5) an electronic data collection and reporting system (Table 2).

Table 1 Operational definitions for six dimensions of data quality, adapted from Brown W, et al. [28]
Table 2 Functional components of the DQMIS and corresponding dimensions of data quality

Functional components of the DQMIS

Monitoring and evaluation (M&E) team support for data management and quality

Two M&E staff managed operations of the DQMIS across all data collection activities and provided technical assistance and capacity development to supervisory staff in the field. The M&E team was responsible for oversight of all functional components of the DQMIS, including the DQA protocol, organizing trainings, developing and revising SOPs as needed, and providing technical assistance on data collection and report interpretation. The M&E reinforced all six dimensions of data quality throughout the trial.

Standard operating procedures (SOPs) for data collection

Tools were designed and SOPs for each data collection activity were defined prior to study start. All data collection tools were programmed into the electronic component of the data collection system to facilitate automated and scalable data quality monitoring. SOPs included frequency, method, and technique for each data collection activity.

Training

All data collectors and supervisors participated in an 8-day orientation training program focused on implementation of SOPs, data-collection tools, the electronic data collection apps, and reporting system. As a part of this orientation, a 1-day training focused on the functional components of the DQA protocol. Additionally, data collectors engaged in active learning by visiting facilities to learn study implementation processes in the field. Subsequent staff-wide and staff-specific refresher trainings were delivered throughout implementation of the trial.

Electronic data collection and reporting system

We developed a data collection and reporting system to centralize data management for the trial. The system included front-end smartphone and tablet-based electronic data collection applications (based off Dimagi’s open-source CommCare platform) for each data collection tool, a secure cloud-based server for data storage and integrity, and a reporting portal for study operations, including data quality. The reporting system produced data quality reports using pre-defined algorithms and data visualizations to facilitate near real-time feedback on accuracy of trial data.

DQA protocol, including audits, real-time data feedback, supportive supervision

We designed a standardized DQA protocol as an integrated component of the trial to continuously assess and improve data accuracy and reliability throughout implementation. Supervisors performed audits on data collectors to address quality of the five data collection activities. Audits targeted accuracy of data entry, delivery observations, and patient-reported outcomes ascertained by the call center. The auditing process, unique for each data collection activity, required perfect accuracy on a sample of data collected by each data collector in a phased approach. Following orientation, each data collector began an intensive phase of auditing lasting 6 weeks (or longer in case of any difficulty achieving targets). After achieving performance targets of the intensive phase, data collectors graduated into a maintenance phase, with audits repeating every 3 months. No a priori decisions were made regarding the proportion of data in each data collection activity that would be assessed for quality; rather, the data collector’s ability to achieve set targets determined the proportion of data within each data collection activity that was checked for accuracy. Perfect accuracy was required for each performance target; any errors required that the audit be repeated from the beginning (Table 3).

Table 3 Data sources and audit methods

The DQA protocol was supported by rapid, timely, and automatic data quality feedback. Data quality reports were designed to inform supervisors and study management staff of audit results at the level of data collector, including accuracy rates, DQA phase, error trends, target achievement, and data entry delay. Additionally, reports designed for study management presented aggregated accuracy rates and error trends across data collectors. Reports were available within 24 h of audits and accessed via smartphone and tablet. In observance of blinding rules related to observation and outcomes data for certain staff, reports displayed accuracy in green and errors in red, rather than the actual data (Fig. 1).

Fig. 1
figure 1

Data quality accuracy report for patient-reported outcomes

In addition, we designed a supportive supervision model to facilitate data accuracy and reliability (quality improvement (QI)) across all data collection activities. Experienced supervisors were assigned to support specific data collectors in order to build trust and rapport. Utilizing the reporting system, supervisors reviewed audit results on a continual basis to identify target accomplishment and occurrence of errors. Thereafter, immediate onsite support was provided to data collectors. Success was celebrated, and challenges were addressed in a supportive manner. First, supervisors shared accuracy reports with staff to address challenges. Second, sources of error were discussed, whether they were related to data entry, interpretation, or technical aspects of the app. Finally, supervisors and data collectors together devised strategic plans to improve accuracy, which included refresher training, one-on-one support, and peer-to-peer mentorship. The M&E team provided ongoing support to supervisors in this process.

Data analysis

Descriptive statistics were calculated for accuracy results, including proportion of forms evaluated for accuracy, overall accuracy, and accuracy by data collection activity. The proportion of forms evaluated for accuracy was calculated as the number of forms audited out of the total number of forms collected over the same time period (7 November 2014 to 6 September 2016). The percent accuracy was calculated for all forms audited. A form was considered accurate if all questions were consistent between both entries of the form. A form was considered inaccurate if it contained one or more errors. The percent accuracy for forms for each activity was plotted over time by month and assessed for trends. The relative risk of accuracy for each data collection activity by each consecutive form audited was calculated using relative risk regression, clustered by data collector [29, 30]. All statistical analyses were performed using SAS 9.4®.

Results

Data collection staff gradually increased as the volume of data increased over the course of the trial. At their maximum, data collection staff included 32 facility-based observers (26 data collectors, six supervisors), 116 facility-based field workers (78 data collectors, 38 supervisors), and 33 call center staff (26 callers, 6 supervisors, 1 manager).

Completeness, precision, and integrity

These three dimensions of data quality were primarily guaranteed through the back-end design of the data collection and reporting system. All electronic data collection apps included required fields and skip patterns to prevent missing values upon data entry, guaranteeing completeness of all datasets. Data precision was protected through data definitions and field restrictions in the electronic data collection system. The secure cloud-based server certified the integrity of data by preventing data manipulation by any staff.

Timeliness and reliability

Timeliness of data was reinforced by the SOPs for data collection and by routine staff trainings, which emphasized that each data collector enter data from paper-based forms to electronic apps as soon as possible after data collection. For the two data collection activities for which primary data collection was paper-based (data entry of observation checklist, and data entry of patient data into the study register), the mean duration until electronic entry was 0.46 and 2.14 days, respectively. Reliability of data was accomplished through all five functional components of the DQMIS, collectively ensuring consistency in data collection across data collectors.

Proportion and accuracy of trial data audited

Among the five data collection activities, the proportion of forms (case-level data) audited ranged from 2.17% to 39.32%. The DQA protocol resulted in a high overall rate of accuracy across all data collection activities in the trial, with accuracy of each data collection activity ranging from 91.77% to 99.51% (Table 4).

Table 4 Proportion and accuracy of trial data audited (7 Nov 2014 to 6 Sept 2016)

Accuracy of trial data over time

All data collection activities demonstrated an upward trend in accuracy improvement throughout implementation. For example, monthly accuracy of observation of birth attendant practices at observation point (OP)2 increased from 73.68% to 100% (Fig. 2). The accuracy of each question in all data collection activities was also analyzed. Over time, question-level accuracy never decreased. In most instances, question-level accuracy remained high throughout and, in several instances, question-level accuracy improved over time.

Fig. 2
figure 2

Accuracy rate and trend of each data collection activity by month (7 Nov 2014 to 6 Sept 2016). OP observation point

Accuracy of data collectors over time

Data collector accuracy remained high from the first audit through all consecutive audits. A small but significant increase in accuracy was achieved throughout consecutive audits for three of the data collection activities and for three of the four OPs. For the other data collection activities, there was no significant change in data collector accuracy as it remained high throughout the trial. In no case did accuracy decrease among data collectors throughout consecutive auditing (Table 5).

Table 5 Unadjusted trend in accuracy of data collectors over time

Discussion

Our integrated DQMIS resulted in exceptionally high data quality for the trial. Error rates in clinical trials have been reported as high as 26.9%, and could range even higher due to a lack of standardization of data quality measurement [19]. Our overall error rate of 1.67%, as measured by accuracy auditing, provides evidence for the feasibility and effectiveness of integrating DQA into the implementation of health systems research trials. Our DQMIS was successful, despite a steady increase in staff volume, complex and multiple data sources, a vast geographic catchment area across 24 districts, and a large sample size. This success is largely attributable to a number of factors, which we describe below.

Well-designed technology and data collection processes

It is essential to plan for data quality control mechanisms during the design phase of QI and health systems research trials [31]. We guaranteed completeness, precision, and integrity of data throughout implementation of the trial through several layers of quality control. Stringent and deliberate front-end data entry rules prevented data collectors from entering values outside specified ranges or choosing options that contradicted previous responses. Additionally, significant time and resources were dedicated to implementing robust back-end restrictions into the data collection system to prevent data loss or corruption from occurring. The reporting system enabled the study team, based in India and the US, to monitor data collection indicators to ensure consistent data collection processes. As reported elsewhere [5, 31], this forethought and design facilitated a high-quality dataset.

Well-defined SOPs

It has also been acknowledged that SOPs and indicator definitions are essential for reliable and accurate data collection in clinical trials [5, 11, 22, 24]. Prior to data collection in the trial, the study protocol was systematically designed with a focus on ensuring data quality through standardization of processes. Data collection tools were designed with validated questions, pre-tested, and finalized through an iterative process. As a reference for data collectors and supervisors, tool guides were developed which included instructions for how to use instruments, definitions, and interpretation guidelines for each question. Tool guides also reinforced consistency of data collection and entry to ensure reliability. Tool guides were adapted and refined throughout the trial to address definitional and other challenges that arose during data collection. Additionally, SOPs and trainings emphasized the importance of timely data entry, reducing the possibility of lost data or inaccuracy as a result of data entry delay.

Integration into data collection workflow

While methods for assessing data quality in patient registries and health information systems are known, little has been recently published on integrating DQA methods into clinical trial data collection workflows [5, 9, 11,12,13,14, 24]. By integrating the DQA protocol into daily workflows, supervisors had the opportunity to support quality throughout implementation of the study. Assigning challenging targets for the intensive phase and lessening these in the maintenance phase reinforced our integrated and continuous system of quality improvement. Following orientation, each data collector was held to high performance standards, fostered by our supportive supervision model. Once achieving intensive phase targets, data collectors were still held to the same targets, but on a less frequent basis to routinely check and bolster accuracy. The aim was to make data collectors accountable for their own performance quality. In addition, the integrated nature of the DQMIS ensured that the proportion of data checked for quality was adapted to the performance of the data collector. The design of the DQA protocol established that the proportion of data checked for quality should be determined by a data collector’s ability to achieve certain performance targets. Target achievement and ongoing supportive supervision together influenced sustained quality throughout implementation. While the ratio of data collectors to supervisors ranged from 2:1 to 4:1 depending on the data collection activity, future trials should consider data collection volume, geographic scope, and minimum quality standards when determining human resource needs for DQA.

Data feedback paired with supportive supervision

Coaching for QI, when paired with performance monitoring and data feedback, has been shown to be effective in healthcare and other disciplines [32,33,34]. Recognizing this, we designed a complementary supportive supervision and data feedback model for DQA. Our near real-time reporting system facilitated the continuous monitoring of data accuracy. The design of the system, to rapidly analyze and report on audit results, enabled supervisors to promptly provide support to data collectors to improve data quality. Our supportive supervision model placed an emphasis on building capacity and promoting quality instead of penalizing lower performers. Supervisors were trained in coaching and mentorship techniques in order to emphasize strengths and target areas of improvement. Achievement of accuracy targets was celebrated, and improvement strategies were mutually identified between data collectors and supervisors. The combination of timely data feedback and supportive supervision was integral to the success of the DQA protocol.

Impact on data collection

During trial implementation, the DQMIS had multiple impacts on data collection methods and refinement of certain questions. Data quality reports highlighted specific concerns related to facility-based observers’ definitional interpretation of key study variables. In one instance, reports demonstrated low accuracy for the observation checklist item: “Was the following available at the bedside: sterile scissors or blade to cut cord.” Supervisors informed managers and study staff of wide variability in data collectors’ interpretation and definitions of sterility. Given this, study management staff chose to revise this checklist item to: “Was the following available at the bedside: clean scissors or blade to cut cord,” along with comprehensive guidelines on how to interpret whether the items were ‘clean.’ ‘Clean’ was defined as sterilized (directly removed from autoclave or boiler) or having no visible marks (dirt, blood, etc.). Data collectors received training on these changes, and scenario-based role playing helped to test their understanding. Following this, subsequent monthly accuracy rates for this checklist item increased to 100% for the duration of implementation. In the absence of data quality reports, inaccurate and unreliable data collection would have persisted.

Limitations

There are a few limitations to the design and implementation of the DQMIS. First, it is possible that our reliance on the supervisor as the gold standard for delivery observation may have resulted in data incorrectly being considered accurate. There was no other available gold standard, however; therefore, this choice was the most reliable option in the absence of alternatives. Additionally, facility staff not employed by the study entered data in facility registers. For this reason, our DQA is unable to verify the reliability of registration data. We also lack evidence of the cost-effectiveness of the DQMIS. Finally, in order to conduct DQA auditing of facility-based field workers and provide support across the vast geographic size of the study catchment area, the nearly 2:1 ratio of these workers to supervisors was required. This may not be feasible or necessary in other settings.

Conclusions

The findings of this study demonstrate that integrated methods of DQA combined with SOPs, rapid data feedback, and supportive supervision during trial implementation are feasible, effective, and necessary to ensure high-quality data. In the absence of widely disseminated data quality methods and standards for large health systems RCT interventions, we developed the DQMIS to ensure reliability and serve as a model for future trials. Future efforts should focus on standardization of DQA processes and reporting requirements for data quality in health systems research.