Abstract
Adoption and use of real-world data (RWD) for decision-making has been complicated by concerns regarding whether RWD is fit-for-purpose or of sufficient validity to support the creation of credible real-world evidence (RWE). These concerns have taken on greater urgency as regulatory agencies begin to use RWE to inform decisions about treatment effectiveness. Researchers need an efficient and systematic method to screen the quality of RWD sources considered for use in studies of effectiveness and safety. Based on a literature review, we developed a listing of screening criteria that have previously been proposed to assess the quality of RWD sources. We also developed an additional criterion based on Modern Validity Theory. While there has been some convergence of conceptual frameworks to assess data quality (DQ) and there is much agreement on specific assessment criteria, consensus has yet to emerge on how to assess whether a specific RWD source is reliable and fit-for-purpose. To create a user-friendly tool for assessing whether RWD sources may have sufficient quality to support a well-designed RWE study for submission to a regulatory authority, we grouped the quality criteria so as to harmonize published frameworks and to be consistent with how researchers generally evaluate existing RWD sources for research they intend to submit to regulatory agencies. After a comprehensive literature review via PubMed, screening data quality criteria were grouped into five dimensions: authenticity, transparency, relevance, accuracy, and track record. The resultant tool was tested for its response burden using a hypothetical administrative claims data source. Providing responses to the screening criteria required only a few hours of effort by an experienced data source manager. Thus, the tool should not impose an onerous burden on data source providers asked by prospective researchers to provide the required information.
Assessing whether a particular data source is fit-for-purpose will be facilitated by the use of this tool, but it will not be sufficient by itself. Fit-for-purpose judgements will still require further careful consideration based on the context and the specific scientific question of interest. Unlike prior DQ frameworks (DQF), the track record dimension of the tool adds the consideration of experience with RWD sources consistent with Modern Validity Theory. However, the tool does not address issues of study design and analysis that are critical to regulatory agencies in evaluating the robustness and credibility of the real-world evidence generated.
1 Introduction
The use of real-world data (RWD) and real-world evidence (RWE) derived from RWD has been widely adopted by pharmaceutical developers and a variety of decision makers including doctors, payers, health technology authorities, and regulatory agencies (Berger et al. 2015; Berger et al. 2017; Schneeweiss et al. 2016; Berger and Crown 2022; Daniel et al. 2018; Zou et al. 2021). Credible RWE can be created from good-quality RWD from routine practice when investigated within well-designed and well-executed research studies (Schneeweiss et al. 2016; Berger and Crown 2022). Adoption and use of RWD is complicated by concerns regarding whether particular sources of RWD are of “good quality” and “fit-for-purpose”. These concerns have become more urgent as regulatory agencies increasingly use RWD as external comparators for single-arm clinical trials and explore whether non-interventional RWD studies can provide substantial supplementary evidence of treatment effectiveness. While the recent emphasis on data quality (DQ) has focused on the use of RWD for assessing disease burdens and treatment effectiveness, evaluation of DQ and fitness-for-purpose is also required for safety studies. However, expanding the use of RWE in safety evaluation will probably require data sources beyond administrative claims (Dal Pan 2022).
The US Food and Drug Administration’s (FDA) guidance, “Assessing Electronic Health Record and Medical Claims Data to Support Regulatory Decision Making,” states that for all study designs, it is important to ensure the reliability and relevance of data used to help support a regulatory decision (FDA 2021a). Reliability includes data accuracy, completeness, provenance, and traceability; relevance includes key data elements (exposures, outcomes, covariates) and a sufficient number of representative patients for the study. The FDA guidance “Considerations for the Use of Real-World Data and Real-World Evidence to Support Regulatory Decision-Making for Drug and Biological Products” (FDA 2023) emphasizes the need for early consultation with the FDA to ensure the acceptability of study design and analytic plans. With respect to data sources, it states that feasibility of data access is critical: “such evaluations of data sources or databases for feasibility purposes serve as a way for the sponsor and FDA to (1) assess if the data source or database is fit for use to address the research question being posed and (2) estimate the statistical precision of a potential study without evaluating outcomes for treatment arms.”
The European Medicines Agency (EMA) in the European Union (EU) has also issued a draft “Data Quality Framework for EU medicines regulation” (EMA 2022). It defines DQ as fitness for purpose relative to user needs in health research, policy making, and regulation, and the degree to which the data reflect the reality they aim to represent (TEHDS EU 2022). It divides the determinants of DQ into foundational, intrinsic, and question-specific categories. Foundational determinants pertain to the processes and systems through which data are generated, collected, and made available. Intrinsic determinants pertain to aspects inherent to a specific dataset. Question-specific determinants pertain to aspects of DQ that cannot be defined independently of a specific question. It also distinguishes three levels of granularity of DQ: value level, column level, and dataset level. The dimensions (including subdimensions) and metrics of DQ are divided into the following categories: reliability, extensiveness, coherence, timeliness, and relevance.
- Reliability (precision, accuracy, and plausibility) evaluates the degree to which the data correspond to reality.
- Extensiveness (completeness and coverage) evaluates whether the data are sufficient for a particular study.
- Coherence examines the extent to which different parts of a dataset are consistent in representation and meaning. This dimension is subdivided into format coherence, structural coherence, semantic coherence, uniqueness, conformance, and validity.
- Timeliness is defined as the availability of data at the right time for regulatory decision making.
- Relevance is defined as the extent to which a dataset presents the elements required to answer a research question.
TransCelerate has issued a simpler framework entitled “Real-World Data Audit Considerations” that is divided into pillars of relevance, accrual, provenance, completeness, and accuracy (TransCelerate 2022). These frameworks are part of an ongoing dialogue among stakeholders from which international standards for RWD will eventually emerge.
There are a number of efforts to assess the utility of real-world data sources for a variety of purposes and settings. For example, Observational Health Data Sciences and Informatics (OHDSI) has open-source tools, such as ACHILLES and the Data Quality Dashboard, that are being leveraged in the development of the DARWIN (Data Analytics and Real-World Interrogation Network) database in the EU. Other efforts have focused on the quality of prospective registries, which present different DQ issues than the re-use of existing data sources; these include the Registry Evaluation and Quality Standards Tool (REQueST) developed by EUnetHTA (EUnetHTA 2019) and “Registries for Evaluating Patient Outcomes: A User’s Guide: Fourth Edition,” developed by the U.S. Agency for Healthcare Research and Quality (Glicklich et al. 2020).
In a systematic assessment of DQ evaluation, Bian et al. (2020) identified the following DQ dimensions: currency, correctness/accuracy, plausibility, completeness, concordance, comparability, conformance, flexibility, relevance, usability/ease-of-use, security, information loss and degradation, consistency, and understandability/interpretability. They concluded that definitions of DQ dimensions and methods were inconsistent in the literature and called for further work to generate understandable, executable, and reusable DQ measures. To that end, we have developed a user-friendly set of screening criteria to help researchers of varying experience assess whether existing reusable RWD sources may be fit-for-purpose when their objective is to answer questions from regulatory agencies or to support claims regarding the benefits and risks of therapies.
We took our cue on the definition of “fit-for-purpose” from the FDA draft guidance on selecting, developing, or modifying fit-for-purpose clinical outcome assessments (COAs) for patient-focused drug development, which is intended to help sponsors use high-quality measures of patients’ health in medical product development programs. That guidance states that fit-for-purpose in the regulatory context means the same thing as valid within modern validity theory, i.e., validity is “the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests,” and that a clinical outcome assessment is considered fit-for-purpose when “the level of validation associated with a medical product development tool is sufficient to support its context of use” (FDA 2022). While the term validity in epidemiology comprises internal and external validity relating to study design and execution, we designed the RWD screening tool to focus on evaluation of the RWD itself within the larger framework of modern validity theory (Royal 2017).
After all, as Wilkinson notes, “good data management is not a goal in itself, but rather is the key conduit leading to knowledge discovery and innovation, and to subsequent data and knowledge integration and reuse by the community after the data publication process” (Wilkinson 2016). Wilkinson proposed the FAIR principles for the management of RWD generated by public funds (although they are also applicable to datasets created in the private sector) (Wilkinson 2016); data sources should be findable, accessible, interoperable, and reusable. These recommendations are complemented by the recommendations of the Duke-Margolis white paper “Determining Real-World Data’s Fitness for Use and the Role of Reliability” (Mahendraratnam et al. 2019) that explored whether RWD are fit-for-purpose by the application of rigorous verification checks of data integrity.
While experts in modern validity theory have not reached consensus on the attributes of validity, there are basic tenets that most such theorists have adopted (Royal 2017). Validity pertains to the inferences or interpretations made about a set of scores, measures, or, in this case, data sources, as opposed to their intrinsic properties. As applied to the evaluation of RWD sources, this means that they must be considered fit-for-purpose for generating credible RWE through well-designed and well-executed study protocols to inform decision making. Modern validity theory would suggest that the accumulation of evidence should be employed to determine whether this inference regarding RWD quality is adequately supported. Hence, the validity of a data source is a judgement on a continuum onto which new evidence is added; it is assessed as part of a cumulative process because knowledge of multiple factors (e.g., new populations/samples of participants, differing contexts, new knowledge) is gained over time. This element of RWD source evaluation is not specifically recognized in the current recommendations of the FDA and the EMA.
As noted earlier, an obstacle to developing a consensus regarding evaluation of DQ is that many terms have been used to describe its dimensions and elements, and the terminology has been used inconsistently despite efforts at harmonization (Kahn et al. 2016; Bian et al. 2020). Regardless, RWE derived from RWD that focuses on the natural history of disease and adverse effects of treatment has long been considered “valid” by decision-makers. Recently, data validity has become an urgent focus for regulatory initiatives as the use of RWE derived from RWD is being expanded to inform decisions about treatment effectiveness and comparative effectiveness. These decisions demand a greater level of certainty in order to trust study results.
A crucial dimension in assessing the validity of data is transparency, achieved through traceability and accessibility. The FDA has reinforced this point in several recent guidance documents, as has the HMA-EMA (European Union’s Heads of Medicines Agencies-European Medicines Agency) Joint Big Data Taskforce Report (HMA-EMA 2019). The FDA guidance on “Considerations for the Use of Real-World Data and Real-World Evidence to Support Regulatory Decision-Making for Drug and Biological Products” states that “If certain RWD are owned and controlled by other entities, sponsors should have agreements in place with those entities to ensure that relevant patient-level data can be provided to FDA and that source data necessary to verify the RWD are made available for inspection as applicable.” (FDA 2023). The FDA noted in its “Data Standards for Drug and Biological Product Submissions Containing Real-World Data Guidance for Industry” that “during data curation and data transformation, adequate processes should be in place to increase confidence in the resultant data. Documentation of these processes may include but are not limited to electronic documentation (i.e., metadata-driven audit trails, quality control procedures, etc.) of data additions, deletions, or alterations from the source data system to the final study analytic data set(s)” (FDA 2021b).
Interest in the creation of “regulatory-grade” RWD and RWE in the US was spurred by the 21st Century Cures Act. In the EU, there are the Innovative Medicines Initiative (IMI) GetReal project and the HMA-EMA Big Data Joint Taskforce. As noted in the Framework for FDA’s Advancing Real-World Evidence Program, which provides early regulatory engagement to evaluate the potential use of RWE to support a new indication for an already approved drug or to help satisfy post-approval study requirements, the strength of RWE submitted in support of a regulatory decision will depend on its reliability, which encompasses not only transparency in data accrual and quality control but also clinical study methodology and the relevance of the underlying data (FDA 2018, 2023).
The National Institute for Health and Care Excellence (NICE) in the United Kingdom issued a real-world evidence framework in 2022 (NICE 2022). Among the elements addressing data suitability were data provenance and governance, DQ including completeness and accuracy, and data relevance (data content, differences in patients and care settings, sample size, length of follow-up). NICE developed the DataSAT tool, which seeks information on data sources, data linkages, purpose of data collection, description of data collected, time period of data collection, data curation, data specification (e.g., a data dictionary), and data management/quality assurance.
The European Medicines Regulatory Network strategy to 2025 includes the creation of DARWIN (Data Analytics and Real-World Interrogation Network) (Arlett 2020; Arlett et al. 2021); it builds on the HMA-EMA Big Data Joint Taskforce Report (HMA-EMA 2019), which found that RWD is challenged by a lack of standardization, sometimes limited precision and robustness of measurements, missing data, variability in content and measurement processes, unknown quality, and constantly changing datasets. Citing Pacurariu et al. (2018), the report viewed the number of EU databases that currently meet minimum regulatory requirements for content and are readily accessible as “disappointingly low”. The International Coalition of Medicines Regulatory Authorities (ICMRA) has called for global regulators to collaborate on standards for incorporating real-world evidence in decision making (ICMRA 2023).
In developing a user-friendly set of criteria, we attempted to find the right balance between the granularity of requested information and the response burden. We defined the dimensions of data suitability (e.g., sufficiency of quality and fitness-for-purpose) in plain English terms consistent with existing frameworks as discussed above. Although we primarily focus on the US and EU, the tool may have relevance to other jurisdictions as well, with the understanding of various local data privacy protection requirements for data-access.
2 Materials and methods
Our literature review did not focus on tools or frameworks that primarily addressed study design, execution, and/or statistical analysis (e.g., Wang et al. 2021, 2023; Gatto et al. 2022; Gebrye et al. 2023; Campbell et al. 2023). Rather, we focused on frameworks and proposals that addressed the foundational and intrinsic qualities (as defined above by the EMA) of real-world data. Our starting point was widely cited articles such as Kahn et al. (2016) and the well-executed systematic review reported by Bian et al. (2020), which included relevant articles published through February 2020. We supplemented this with a search of PubMed-listed articles through August 31, 2023 using the following inclusion criteria: any language; article type restricted to reviews in one search and to reviews or systematic reviews in the other. There were no exclusion criteria. Boolean operators were used to combine texts or text strings such as “quality”, “real world data”, and “real world evidence”. The PubMed search engine was used to conduct the search twice: once restricted to reviews only, and once allowing either reviews or systematic reviews. In addition, we separately accessed the websites of the FDA and the EMA.
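The two-pass Boolean search described above can be sketched programmatically. The exact query strings used in the review are not reproduced here; the terms and the PubMed [Publication Type] filters below are illustrative assumptions only.

```python
# Sketch of building the kind of Boolean PubMed query described in the text.
# The specific terms and filters are assumptions for demonstration, not the
# authors' actual search strings.

def build_pubmed_query(terms, article_types):
    """Join free-text terms with AND and publication-type filters with OR."""
    term_clause = " AND ".join(f'"{t}"' for t in terms)
    type_clause = " OR ".join(f'"{t}"[Publication Type]' for t in article_types)
    return f"({term_clause}) AND ({type_clause})"

# First pass: reviews only.
reviews_only = build_pubmed_query(
    ["quality", "real world data"], ["Review"])

# Second pass: reviews or systematic reviews.
reviews_or_systematic = build_pubmed_query(
    ["quality", "real world evidence"], ["Review", "Systematic Review"])
```

Such a helper makes the two passes reproducible and easy to document alongside the review protocol.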
The literature was summarized in tables to delineate criteria employed to assess the potential reliability and relevance of RWD. To make our tool user-friendly, we grouped the criteria into five broad dimensions that, based upon our experience, conform to the general categories of considerations that are top-of-mind for most researchers evaluating potential RWD sources. Because of the large number of criteria and the variability in terminology used, we limited the number of criteria within the tool to manage its response burden. Since this is intended as a screening tool for existing RWD sources, and researchers will need to investigate further to determine whether an RWD source is fit for their intended purposes, we did not view this limitation as a problem. The criteria included in this tool are consistent with those in the NICE DataSAT tool, with the exception of our inclusion of criteria regarding the track-record of RWD source use.
3 Results
The framework of the screening criteria included the following dimensions: authenticity, transparency, relevance, accuracy, and track-record. We defined these dimensions as follows:
A. Authenticity: A data source is considered authentic when its provenance is well-documented and its authenticity can be verified against its component data sources as needed.

B. Transparency: A data source is considered transparent when the processes used in data acquisition, curation, editing, and linkage are adequately described.

C. Relevance: A data source is considered relevant when it contains the population of interest in adequate numbers and with sufficient length of follow-up, and when it contains the data elements required to implement the analysis plan of a real-world protocol.

D. Accuracy: A data source is operationally considered accurate when it can be documented that its data elements depict the concepts they are intended to represent, insofar as can be assured by DQ checks (i.e., the data are of sufficient quality to investigate specific research questions). Further, key data elements can be audited as required.

E. Track-record: A data source is considered more trustworthy in general when it can document a track-record of use cases that resulted in the creation of credible real-world evidence.
Using this framework, we developed the ATRAcTR (Authentic Transparent Relevant Accurate Track-Record) screening tool to evaluate real-world data sources (Table 1).
Authenticity is a foundational and necessary requirement for considering the use of an RWD source in a real-world evidence clinical study. This is consistent with FDA and EMA criteria requesting confirmation of data provenance and data traceability. While this may be obvious, recent experience during the COVID pandemic indicates that it cannot be taken for granted (Ledford and Van Noorden 2020). For some commercially available data sources, documentation alone may not be considered adequate; verification via access to the data may be expected by regulators. Provenance documentation should provide information on what data was collected, why it was collected, the sources of the data, how the data was collected, and the timeframe over which it was (or continues to be) collected. Any changes to the database over time should be disclosed (e.g., addition of laboratory results, biomarkers). Disclosure is important to demonstrate that data was appropriately de-identified and/or to describe the procedures in place to protect patient confidentiality. Of course, the provenance of publicly supported datasets may be directly known by regulatory agencies through their participation in their creation (e.g., SENTINEL and FDA, DARWIN and EMA) (Arlett 2020; FDA 2020; EMA 2021).
Transparency in the description of the processes and procedures employed to acquire, curate, transform, edit, and link data is essential to assessing whether planned analyses will be performed on reliable, good-quality data. Documentation should describe what extract, transform, and load (ETL) procedures were used, whether the data were transformed according to specified standards into specified formats (e.g., a common data model), how the data dictionary can be accessed, and whether there were any imputations or adjustments for incorrect or missing data. If more than one data source was combined to create the dataset, the processes for linking them should be described (e.g., unique patient ID, tokenization). If the source data contained free-text content, the processes for extracting information from the free text should be described (e.g., manually, natural language processing). How changes in coding conventions over time (e.g., ICD-9 to ICD-10 via the General Equivalence Mappings of the Centers for Medicare and Medicaid Services (CMS)) were handled should also be described. For data sources that are still adding new data, the latency of data accrual and the refresh cadence should be described. Furthermore, whether the data sources contain any synthetic data should be disclosed, including the reason for its inclusion.
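The tokenization approach to linkage mentioned above can be illustrated with a minimal sketch, assuming deterministic, keyed hashing of normalized identifiers. Real tokenization services use more robust identity resolution and key management; the field names, normalization rules, and key below are hypothetical.

```python
# Minimal sketch of privacy-preserving record linkage via deterministic
# tokenization. The secret key, identifier fields, and normalization are
# illustrative assumptions, not any vendor's actual scheme.
import hashlib
import hmac

SECRET_KEY = b"site-specific-secret"  # hypothetical shared linkage key

def tokenize(first_name, last_name, dob):
    """Derive a stable, non-reversible token from normalized identifiers."""
    normalized = f"{first_name.strip().lower()}|{last_name.strip().lower()}|{dob}"
    return hmac.new(SECRET_KEY, normalized.encode(), hashlib.sha256).hexdigest()

# The same person recorded in two source datasets yields the same token,
# allowing the records to be linked without exchanging raw identifiers.
claims_token = tokenize("Ada", "Lovelace", "1815-12-10")
ehr_token = tokenize(" ada ", "LOVELACE", "1815-12-10")
```

Keyed hashing (rather than a plain hash) is used so that tokens cannot be reproduced by anyone without the key, which is why disclosure of the tokenization process, rather than the key itself, is what transparency requires.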
The relevance of the data source is essential in determining whether it is fit-for-purpose, and assessing relevance is a bespoke process for a particular real-world clinical study. It includes assessing the sample size of the population of interest, the observation period, and the length of follow-up. Documentation must be provided to allow researchers to assess whether the data elements required for analysis are captured (e.g., population, intervention, comparator, outcomes, time, and confounders). Additional useful information is provided by the demographic characteristics of the data, including age, gender, ethnicity, and geography. How this request is interpreted (e.g., overall database, study population) will depend on how the data source provider understands the needs of the researcher. The interpretation of data source representativeness will depend on its intended use. It is useful to disclose whether the data source was designed to be representative of the population or is a convenience sample, and, if the latter, why it can nonetheless represent the population of interest.
We define accuracy broadly here as an assessment of the integrity of the data, most frequently characterized by conducting DQ checks for conformance to a common data model, plausibility, and completeness; it also includes the concepts of reliability, extensiveness, and coherence as discussed in the EU framework. We separated accuracy from transparency because it requires its own distinct review procedures, while recognizing that both are components of reliability and of deciding whether a data source is fit-for-purpose. A substantial literature (Kahn et al. 2016; Blacketer et al. 2021; Dreyer et al. 2010; Dreyer 2018; Girman et al. 2019; Hall et al. 2012; Kahn et al. 2015; Liaw et al. 2021; Miksad and Abernathy 2018; Qualls et al. 2018; Razzaghi et al. 2022; Reynolds et al. 2020; Schmidt et al. 2021; Simon et al. 2022; Zozus et al. 2015) describes the need for DQ checks and their specifics. Individual data curators generally set their own standards, making disclosure important. Because of the longer experience with administrative claims data, there are more generally accepted standards than currently exist for EHR data. Software programs (e.g., open-source R or Python packages or functions, open-source and commercial DQ assessment tools, and platforms such as the Aetion Evidence Platform (Aetion.com) or Instant Health Data (Panalgo.com)) have been developed to make the process operationally feasible (Liaw et al. 2021; Schmidt et al. 2021). Plausibility checks include an assessment of external and internal consistency (e.g., temporal value violations, unexpected distributions or combinations, out-of-range value anomalies, data contradictions, and how null values were handled). It should also be disclosed whether DQ checks were performed to evaluate whether any unique patient contributed data under more than one identifier.
Moreover, it should be disclosed whether there was any interruption or significant changes in data collection and/or processing over time that impacted data continuity (e.g., who contributed data, what data was collected, or how it was processed).
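The kinds of DQ checks discussed above (out-of-range values, null handling, and duplicate patient identifiers) can be sketched in a few lines. The field names and the age threshold below are illustrative assumptions, not part of any published standard or common data model.

```python
# Illustrative sketch of simple plausibility, completeness, and uniqueness
# checks of the kind described in the text. Field names ("patient_id", "age",
# "sex") and the 0-120 age range are assumptions for demonstration.

def run_dq_checks(records):
    """Return counts of basic DQ findings over a list of record dicts."""
    findings = {"out_of_range_age": 0, "missing_sex": 0, "duplicate_ids": 0}
    seen_ids = set()
    for rec in records:
        age = rec.get("age")
        if age is None or not (0 <= age <= 120):   # plausibility: age range
            findings["out_of_range_age"] += 1
        if not rec.get("sex"):                     # completeness: null/empty
            findings["missing_sex"] += 1
        pid = rec["patient_id"]
        if pid in seen_ids:                        # uniqueness: duplicate IDs
            findings["duplicate_ids"] += 1
        seen_ids.add(pid)
    return findings

sample = [
    {"patient_id": "p1", "age": 42, "sex": "F"},
    {"patient_id": "p2", "age": 163, "sex": "M"},   # implausible age
    {"patient_id": "p3", "age": 55, "sex": None},   # missing sex
    {"patient_id": "p1", "age": 42, "sex": "F"},    # duplicate identifier
]
```

In practice such checks are run at scale by the DQ tools cited above; the point of disclosure is that the thresholds and rules applied, which vary by curator, are made explicit.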
As discussed earlier, and consistent with modern validity theory, one gains insight into the general quality of a data source from its performance track-record of use cases in which the data source succeeded or failed to yield credible RWE. The performance track-record would include publications that have employed the data source (especially for research questions similar to the study of interest) as well as a listing of unpublished use cases. If the data source was successfully used in a study designed to emulate a randomized controlled trial (Franklin et al. 2021), this may provide additional confidence in the trustworthiness of the dataset. It may not provide greater confidence across all situations, but it increases the likelihood that the data source will be fit-for-purpose for a variety of scenarios or study designs. Use cases and/or publications that were incorporated in responses to regulatory questions or to support regulatory decisions should be highlighted. For newly available data sources, previous performance history may be meager; however, such datasets can be provisionally viewed as suitable if they provide evidence satisfying the criteria of authenticity, transparency, relevance, and accuracy. As noted earlier, examining the track records of RWD sources has not been explicitly identified in current frameworks; we suspect this reflects the fact that evaluating relevance or fitness-for-purpose is a bespoke judgement relating to the particular decision facing a regulator.
3.1 Pilot-testing on hypothetical data
With the goal of a user-friendly set of criteria, we tested the response burden. A highly experienced author (WHC), who has held leadership positions for US RWD sources (e.g., MarketScan and Optum Clinformatics Data Mart), completed the tool for a hypothetical data source in two to three hours (Appendix A). This suggests that data source providers will not face an onerous response burden. Note that, for a given data source, the information requested remains the same across different studies. Therefore, the response burden for a data source provider will likely decrease across multiple requests as familiarity with the data and the list of DQ criteria grows. Appendix A offers an example of a reasonable granularity of detail that provides a good picture of DQ for screening purposes. We expect that whoever responds to the tool will determine what is reasonable in discussion with researchers seeking access to the data source.
4 Discussion
Regulatory bodies such as the FDA and EMA, as well as others, have turned their attention to RWE to complement the information from randomized controlled trials to assess the benefit-risk profiles of therapeutics. They have long used evidence derived from real-world data sources to assess safety profiles and have begun moving from passive to active surveillance through the use of networks such as SENTINEL. Increasingly, standing cohorts are being leveraged for signal detection and confirmation (Huybrechts et al. 2021).
On a limited basis, RWE has been considered adequate to support labelling changes when randomized controlled trials were not considered feasible (FDA 2023). Recently, RWE was accepted by the FDA as the sole supplementary information to support labeling changes (FDA 2021c). Tacrolimus was approved for prophylaxis of organ rejection in patients receiving lung transplants based upon a Supplemental New Drug Application supported by a non-interventional RWE study. Tacrolimus had previously been approved for use in kidney, liver, and heart transplants. The data source used in the RWE study was the U.S. Scientific Registry of Transplant Recipients. There were minimal concerns regarding relevance since the registry contained data on all lung transplants in the U.S., death and graft failures were adjudicated, and most necessary information was captured. There were low concerns about reliability since RWD management was transparent, the percentage of missing variables was low, there were few non-plausible lengths of hospitalization, and most variables were coded accurately. The FDA determined this non-interventional study with historical controls to be adequate and well-controlled, noting that outcomes of organ rejection and death are virtually certain without therapy; the dramatic effect of treatment helped to preclude bias as an explanation of the results.
With the widespread digitization of medical information, expanding the use of RWD to assess clinical effectiveness has become a priority both in the U.S. and in Europe. Indeed, the FDA has utilized RWE on a limited basis to evaluate the effectiveness of treatments for rare diseases and cancers. However, RWD submitted to support effectiveness claims have frequently been found deficient due to issues of relevance (e.g., representativeness of population, small sample size) and accuracy (e.g., missing data) (Mahendraratnam et al. 2022; Bakker et al. 2022). Government-funded efforts to create “fit-for-purpose” RWD have begun in both the U.S. and Europe, including SENTINEL, PCORNet, and DARWIN. Use of SENTINEL has thus far been restricted to safety issues, but its extension to treatment effectiveness studies is on the table as part of the FDA’s Advancing Real-World Evidence Program, as well as FDA’s Real-World Evidence Program more broadly. Progress in this endeavor will depend on changing the culture, behaviors, and standards applied by researchers and data source providers; in turn, the impetus for such change will come, in large part, from the changing requirements of regulatory bodies as they expand the use of RWE to address perceived weaknesses of real-world observational studies in comparison to randomized controlled trials (Berger and Crown 2022; Simon et al. 2022).
Currently, for commercially available data sources other than Medicare or Medicaid, there is limited disclosure (i.e., transparency) of how datasets are assembled, and little published literature assessing their DQ. Regulators and researchers have had to rely on their experience with specific data sources to judge their reliability rather than on a formal set of criteria. For data sources that are new to regulators and researchers, or for researchers with less experience or knowledge of RWD, it may be difficult to judge data source quality without sufficient disclosure from the data providers.
The screening tool described here will be immediately helpful to researchers in evaluating the quality of existing RWD sources that they intend to reuse to generate new RWE; it will help them understand the data provenance, how patient confidentiality is protected, how data are extracted and curated, what DQ checks have been performed, and the performance history of the data source. Not all data source providers will necessarily be willing or able to provide all of the information requested in ATRAcTR, owing to concerns about protecting intellectual property. However, the more numerous and granular the responses they can provide, the greater the confidence of those requesting access to the dataset that it has the potential to generate regulatory-grade RWE. We note that the RCT-DUPLICATE Initiative (Wang et al. 2022) was able to reproduce the results of a large number of published comparative safety or effectiveness studies using several commonly accessed real-world data sources (e.g., Medicare, MarketScan, Optum Clinformatics Data Mart, and the Clinical Practice Research Datalink); this suggests that the providers of these data sources are likely to be able to provide complete and adequate responses to the ATRAcTR.
We would also caution that using ATRAcTR is not sufficient in and of itself to assess whether a particular RWD source is fit-for-purpose. Fit-for-purpose judgements will still require further careful consideration based on the context and the specific scientific question of interest. Moreover, the set of criteria does not address issues of study design and analysis that are critical to regulatory agencies in evaluating the robustness and credibility of the real-world evidence generated. Unlike prior data quality frameworks, the track record dimension of the tool adds the consideration of experience with RWD sources, consistent with Modern Validity Theory.
There are some limitations in our efforts to develop the ATRAcTR. We did not use a formal process to group quality criteria into dimensions, nor did we further identify subdimensions, unlike the EMA’s DQF approach. We used the EMA DQF as a starting point but sought to reduce the number of dimensions by broadening their scope to make the tool more user-friendly; moreover, the criteria included in the five dimensions cover most of the issues raised by Bian and colleagues. Indeed, the tool’s format is compatible with existing frameworks, with the exception of our addition of reviewing the track record of a particular RWD source. Neither did we perform a formal validation study, since ATRAcTR is intended to be a screening instrument whose responses place researchers in a better position to judge a data source. We did not create a scoring manual recommending cut-off points as benchmark values for high, medium, or low DQ, since no existing benchmarks lend credibility to the idea of a uniform threshold for DQ. Moreover, the RWE generated will be assessed by regulators not only on source DQ and its fitness-for-purpose but also on the appropriateness and rigor of the study design and statistical analysis. In the end, researchers must still judge, using the totality of the DQ assessment results, how to interpret the information received. In addition, we did not try to establish minimum standards for concluding that a specific RWD source is of “good quality” and “fit-for-purpose”. Nevertheless, we are confident that the ATRAcTR will assist researchers in identifying potentially reusable existing RWD sources for regulatory purposes.
Data availability
No new data were created or analyzed in this study. Data sharing is not applicable to this article.
References
Arlett, P.: DARWIN EU (Data Analytics and Real World Interrogation Network). https://www.ema.europa.eu/en/documents/presentation/presentation-proposal-darwin-eu-data-analytics-real-world-interrogation-network-parlett-ema_en.pdf. Accessed 31 May 2023 (2020)
Arlett, P., Kjaer, J., Broich, K., Cooke, E.: Real-world evidence in EU medicines regulation: enabling use and establishing value. Clin. Pharmacol. Ther. 111(1), 21–23 (2021)
Bakker, E., Plueschke, K., Jonker, C., Kurz, X., et al.: Contribution of real-world evidence in European Medicines Agency’s regulatory decision making. Clin. Pharmacol. Ther. 113(1), 135–151 (2022)
Berger, M.L., Axelsen, K., Lipset, C., Gutteridge, A., et al.: Optimizing the leveraging of real world data: how it can improve the development and use of medicines. Value Health 18, 127–130 (2015)
Berger, M.L., Crown, W.: How can we make more rapid progress in the leveraging of real-world evidence by regulatory decision makers? Value Health 25(2), 167–170 (2022)
Berger, M.L., Daniel, G., Frank, K., Hernandez, A., McClellan, M., et al.: A framework for regulatory use of real-world evidence. Duke-Margolis Center for Health Policy. https://healthpolicy.duke.edu/sites/default/files/atoms/files/rwe_white_paper_2017.09.06.pdf. Accessed 31 May 2023 (2017)
Bian, J., Lyu, T., Loiacono, A., Viramontes, M., et al.: Assessing the practice of data quality evaluation in a national clinical data research network through a systematic scoping review in the era of real-world data. J. Am. Med. Inform. Assoc. 27, 1999–2010 (2020)
Blacketer, C., Defalco, F., Ryan, P., Rijnbeek, P.: Increasing trust in real-world evidence through evaluation of observational data quality. J. Am. Med. Inform. Assoc. 28(10), 2251–2257 (2021)
Campbell, U., Honig, N., Gatto, N.: SURF: a screening tool (for sponsors) to evaluate whether using real-world data to support an effectiveness claim in an FDA application has regulatory feasibility. Clin. Pharm. Ther. (2023). https://doi.org/10.1002/cpt.3021
Center for Medicare and Medicaid Services (CMS): Available online: https://www.cms.gov/medicare/coding/icd10/downloads/icd-10_gem_fact_sheet.pdf. Accessed 30 Aug 2023
Dal Pan, G.: The use of real-world data to assess the impact of safety-related regulatory interventions. Clin. Pharmacol. Ther. 111, 98–107 (2022)
Daniel, G., Silcox, C., Bryan, J., McClellan, M., et al.: Characterizing RWD quality and relevancy for regulatory purposes. Duke Margolis Center for Health Policy. https://healthpolicy.duke.edu/publications/characterizing-rwd-quality-and-relevancy-regulatory-purposes-0. Accessed 31 May 2023 (2018)
Dreyer, N.: Advancing a framework for regulatory use of real-world evidence: when real is reliable. Ther. Innov. Regul. Sci. 52(3), 362–436 (2018)
Dreyer, N., Schneeweiss, S., McNeil, B., Berger, M., et al.: GRACE Principles: recognizing high-quality observational studies of comparative effectiveness. Am. J. Manag. Care 16(6), 467–471 (2010)
European Medicines Agencies/Heads of Medicines Agencies, Data Quality Framework for EU medicines regulation. Available online: https://www.ema.europa.eu/en/documents/regulatory-procedural-guideline/data-quality-framework-eu-medicines-regulation_en.pdf. Accessed 31 May 2023 (2022)
European Medicines Agency, EU Big Data Stakeholder Forum. https://www.ema.europa.eu/en/documents/report/report-eu-big-data-stakeholder-forum-2021_en.pdf. Accessed 31 May 2023 (2021)
EUnetTHA, REQueST Tool and its vision paper. https://www.eunethta.eu/request-tool-and-its-vision-paper/. Accessed 30 Aug 2023 (2019)
FDA: Framework for FDA’s real-world evidence program. https://www.fda.gov/media/120060/download. Accessed 31 May 2023 (2018)
FDA: FDA and the Sentinel Operations Center, Standardization and querying of data quality metrics and characteristics for electronic health data, data quality metrics system final report, Version # 5.1, sentinel initiative. https://www.sentinelinitiative.org/sites/default/files/Methods/Standardization_and_Querying_of_Data_Quality_Metrics.pdf. Accessed 31 May 2023 (2020)
FDA: Assessing electronic health record and medical claims data to support regulatory decision making. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/real-world-data-assessing-electronic-health-records-and-medical-claims-data-support-regulatory. Accessed 31 May 2023 (2021)
FDA: Data standards for drug and biological product submissions containing real-world data guidance for industry. https://www.fda.gov/media/153341/download. Accessed 31 May 2023 (2021)
FDA: FDA approves new use of transplant drug based on real-world evidence—7/16/21. https://www.fda.gov/drugs/news-events-human-drugs/fda-approves-new-use-transplant-drug-based-real-world-evidence. Accessed 31 May 2023 (2021)
FDA: Patient-focused drug development: selecting, developing, or modifying fit-for-purpose clinical outcome assessments guidance for industry, food and drug administration staff, and other stakeholders. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/patient-focused-drug-development-selecting-developing-or-modifying-fit-purpose-clinical-outcome. Accessed 31 May 2023 (2022)
FDA: Considerations for the use of real-world data and real-world evidence to support regulatory decision-making for drug and biological products. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-use-real-world-data-and-real-world-evidence-support-regulatory-decision-making-drug. Accessed 29 Aug 2023 (2023)
Franklin, J., Patorno, E., Desai, R., Glynn, R., et al.: Emulating randomized clinical trials with nonrandomized real-world evidence studies: first results from the RCT DUPLICATE initiative. Circulation 143(10), 1002–1013 (2021)
Gatto, N., Campbell, U., Rubenstein, E., Jaksa, A., et al.: The structured process to identify fit-for-purpose data: a data feasibility assessment framework. Clin. Pharmacol. Ther. 111, 122–134 (2022)
Gebrye, T., Fatoye, F., Mbada, C., Hakimi, Z.: A scoping review on quality assessment tools used in systematic reviews and meta-analysis of real-world studies. Rheumatol. Int. 43, 1573–1581 (2023)
Glicklich, R., Leavy, M., Dreyer, N.: Registries for patient outcomes: a user’s guide (Fourth Edition). https://effectivehealthcare.ahrq.gov/sites/default/files/pdf/registries-evaluating-patient-outcomes-4th-edition.pdf. Accessed 30 Aug 2023 (2023)
Girman, C., Ritchey, M., Zhou, W., Dreyer, N.: Considerations in characterizing real-world data relevance and quality for regulatory purposes: a commentary. Pharmacoepidemiol. Drug Saf. 28, 439–442 (2019)
Hall, G., Sauer, B., Bourke, A., Brown, J., et al.: Guidelines for good database selection and use in pharmacoepidemiology research. Pharmacoepidemiol. Drug Saf. 21, 1–10 (2012)
HMA-EMA Joint Big Data Taskforce Phase II Report: Evolving data-driven regulation. https://www.ema.europa.eu/en/documents/other/hma-ema-joint-big-data-taskforce-phase-ii-report-evolving-data-driven-regulation_en.pdf. Accessed 31 May 2023. (2019)
Huybrechts, K., Kulldorff, M., Hernández-Díaz, S., Bateman, B., et al.: Active surveillance of the safety of medications used during pregnancy. Am. J. Epidemiol. 190(6), 1159–1168 (2021)
International Coalition of Medicine Regulatory Authorities, ICMRA Statement on International Collaboration to enable real world evidence (RWE) for regulatory decision-making. https://icmra.info/drupal/sites/default/files/2022-07/icmra_statement_on_rwe.pdf. Accessed 31 May 2023 (2023)
Kahn, M., Brown, J., Chun, A., Davidson, B., et al.: Transparent reporting of data quality in distributed data networks. eGEMs 3(1), 7 (2015). https://doi.org/10.13063/2327-9214.1052
Kahn, M., Callahan, T., Barnard, J., Bauck, A., et al.: A harmonized data quality assessment terminology and framework for the secondary use of electronic health record data. eGEMs 4(1), 18 (2016). https://doi.org/10.13063/2327-9214.1244
Ledford, H., Van Noorden, R.: High-profile coronavirus retractions raise concerns about data oversight. Nature 582(7811), 160 (2020)
Liaw, S., Guan, J., Guo, N., Ansari, S.: Quality assessment of real-world data repositories across the data life cycle: a literature review. J. Am. Med. Inform. Assoc. 28(7), 1591–1599 (2021)
Mahendraratnam, N., Mercon, K., Gill, M., Benzing, L., McClellan, M.: Understanding use of real-world data and real-world evidence to support regulatory decisions on medical product effectiveness. Clin. Pharmacol. Ther. 111(1), 150–154 (2022)
Mahendraratnam, N., Silcox, C., Mercon, K., Kroetsch, A., et al.: Determining real-world data’s fitness for use and the role of reliability. Duke Margolis Center for Health Policy. https://healthpolicy.duke.edu/sites/default/files/2019-11/rwd_reliability.pdf. Accessed 31 May 2023 (2019)
Miksad, R., Abernethy, A.: Harnessing the power of real-world evidence (RWE): a checklist to ensure regulatory-grade data quality. Clin. Pharmacol. Ther. 103(2), 202–205 (2018)
NICE real-world evidence framework. www.nice.org.uk/corporate/ecd9. Accessed 29 Aug 2023 (2022)
Pacurariu, A., Plueschke, K., McGettigan, P., Morales, D., et al.: Electronic healthcare databases in Europe: descriptive analysis of characteristics and potential for use in medicines regulation. BMJ Open 8, e023090 (2018). https://doi.org/10.1136/bmjopen-2018-023090
Qualls, L., Phillips, T., Hammill, B., Topping, J., et al.: Evaluating foundational data quality in the national patient-centered clinical research network (PCORnet®). EGEMS (wash DC) 6(1), 3 (2018). https://doi.org/10.5334/egems.199
Razzaghi, H., Greenberg, J., Bailey, L.: Developing a systematic approach to assessing data quality in secondary use of clinical data based on intended use. Learn Health Sys. 6, e10264 (2022). https://doi.org/10.1002/lrh2.10264
Reynolds, M., Bourke, A., Dreyer, N.: Considerations when evaluating real-world data quality in the context of fitness for purpose. Pharmacoepidemiol. Drug Saf. 29, 1316–1318 (2020)
Royal, K.: Four tenets of modern validity theory for medical education assessment and evaluation. Adv. Med. Educ. Pract. 8, 567–570 (2017)
Schmidt, C., Struckmann, S., Enzenbach, C., Reineke, A., et al.: Facilitating harmonized data quality assessments. A data quality framework for observational health research data collections with software implementations in R. BMC Med. Res. Methodol. 21, 63 (2021). https://doi.org/10.1186/s12874-021-01252-7
Schneeweiss, S., Eichler, H.-G., Garcia-Altes, A., Chinn, C., et al.: Real world data in adaptive biomedical innovation: a framework for generating evidence fit for decision- making. Clin. Pharmacol. Ther. 100, 633–646 (2016)
Simon, G., Bindman, A., Dreyer, N., Platt, R.: When can we trust real-world data to evaluate new medical treatments? Clin. Pharmacol. Ther. 111(1), 24–29 (2022)
Toward the European Health Data Space (TEHDAS EU): European health data space data quality framework, deliverable 6.1 of TEHDAS EU 3rd Health Program (GA: 101035467). https://tehdas.eu/results/tehdas-develops-data-quality-recommendations. Accessed 31 May 2023
TransCelerate RWD Audit Readiness Considerations—Draft for Public Review. Available online: https://www.transceleratebiopharmainc.com/wp-content/uploads/2022/12/RWD-Audit-Readiness-Considerations-Document.pdf. Accessed 31 May 2023 (2022)
Wang, S., Pinheiro, S., Hua, W., Arlett, P., et al.: STaRT-RWE: structured template for planning and reporting on the implementation of real world evidence studies. BMJ 12, 1–8 (2021)
Wang, S., Pottegard, A., Crown, W., Arlett, P., et al.: HARmonized protocol template to enhance reproducibility of hypothesis evaluating real-world evidence studies on treatment effects: a good practices report of a joint ISPE/ISPOR task force. Pharmacoepidemiol. Drug Saf. 32, 44–55 (2023)
Wang, S., Sreedhara, S., Schneeweiss, S., REPEAT Initiative: Reproducibility of real-world evidence using clinical practice data to inform regulatory and coverage decisions. Nat. Commun. 13, 5126 (2022). https://doi.org/10.1038/s41467-022-32310-3
Wilkinson, M.: The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016)
Zou, K.H., Li, J.Z., Salem, L.A., Imperator, J.S., et al.: Harnessing real-world evidence to reduce the burden of noncommunicable disease: health information technology and innovation to generate insights. Health Serv. Outcomes Res. Methodol. 21(1), 8–20 (2021)
Zozus, M., Hammond, W., Green, B., Kahn, M., et al.: Assessing data quality for healthcare systems data used in clinical research (Version 1.0). NIH Collaboratory. https://dcricollab.dcri.duke.edu/sites/NIHKR/KR/Assessing-data-quality_V1%200.pdf. Accessed May 31, 2023 (2015)
Acknowledgements
The authors appreciate the editorial support from Arghya Bhattacharya, PhD and Shanthakumar V, PhD of Viatris. The abstract was presented at the 2023 International Conference on Health Policy Statistics, Health Policy Statistics Section, American Statistical Association.
Funding
This work was funded by Mylan (now Viatris).
Author information
Contributions
All investigators have: (1) made substantial contributions to the conception or design of the work; (2) drafted the work or revised it critically for important intellectual content; (3) provided final approval of the version to be published; and (4) agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Ethics declarations
Conflict of interest
Funding for work by MLB and WHC was provided for by Viatris. JZL and KHZ are employees and stockholders of Viatris.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Berger, M.L., Crown, W.H., Li, J.Z. et al. ATRAcTR (Authentic Transparent Relevant Accurate Track-Record): a screening tool to assess the potential for real-world data sources to support creation of credible real-world evidence for regulatory decision-making. Health Serv Outcomes Res Method (2023). https://doi.org/10.1007/s10742-023-00319-w