Introduction

Knowledge about the benefit-to-harm balance of alternative treatment options is central to high-quality patient care. Traditionally, the experiment [randomized controlled trial (RCT) or natural experiment] has been at the top of the ‘hierarchy of evidence’ as the gold standard for evidence-based medicine, especially for therapeutic choices [1]. Bias-reducing features of RCTs (random treatment assignment with the expectation of zero net confounding at baseline; restriction to uniform patient populations; blinding; and standardized data collection, all combined with underlying statistical theory) are ways to maximize internal validity. In contrast to the traditional hierarchy of evidence [1], the emerging consensus among clinical epidemiologists is to move away from judging a study’s validity based only on its design type [2–5]. This consensus arises from an appreciation that some purported benefits of experimental designs are not always realized in practice (e.g., the baseline prognostic balance achieved by randomization is often upset during follow-up). Nor do the internally valid results of RCTs apply in all settings of routine clinical care because of the inevitable validity–generalizability tradeoffs of RCTs [2–5, 6••, 7–9]. In addition, ethical, practical, and financial considerations dictate that most epidemiologic research be observational, including studies of comparative effectiveness and comparative safety of treatments [10]. Thus, observational studies comparing treatments are increasingly advocated and implemented [6••, 11]. Novel designs that combine advantages of randomized and non-randomized approaches (such as reducing the tradeoff between internal and external validity in pragmatic trials or reliance on new-user designs [12, 13••]) help mitigate the disadvantages of both approaches, aiding the acceptance of non-experimental methods in the clinical research community.
Modern design and analytic approaches to reducing or quantifying systematic errors in observational research include propensity score methods, marginal structural models, instrumental variables, external adjustment, and bias analyses [2, 12, 14–19]. Choosing and correctly implementing an appropriate study design is a prerequisite for subsequent valid application of different analytic techniques.

Although clinicians have routinely compared harms and benefits of treatments for their patients in an informal way, the concept of systematic comparative effectiveness research (CER) is relatively new. For example, the 2008 edition of the Dictionary of Epidemiology did not yet contain an entry for CER [20]. In 2009, the Institute of Medicine defined CER as “generation and synthesis of evidence that compares the benefits or harms of alternative methods to prevent, diagnose, treat, and monitor clinical conditions, or to improve the delivery of care” [21]. CER thus encompasses studies (1) directly or indirectly comparing safety and/or effectiveness of active treatments for the same indication; (2) carried out in routine clinical practice; and (3) aiming to help clinicians, regulators, and policy makers to make evidence-based decisions. In addition to scientific aims, CER studies initiated outside academic institutions may have explicit practical goals, including formulation of guidelines, standards of care, safety regulations, or reimbursement policies [22]. Thus, clinical decision making and policy are much more prominent in planning CER studies in non-academic settings than in conventional investigator-initiated studies in academia [22, 23].

Guidelines relevant to CER have been published by several authorities [8, 9, 24–28], with some of these publications eliciting critique and calls for harmonization [29, 30•]. Investigators embarking on a CER study should start by consulting the Guidelines for Good Pharmacoepidemiology Practice (GPP), maintained by the International Society for Pharmacoepidemiology (ISPE) [30•]. The Good Research for Comparative Effectiveness (GRACE) principles specify the following questions to be considered when assessing study quality [25]: (1) whether the study plans (including research questions, main comparisons, outcomes, etc.) were specified before the study was conducted; (2) whether the study was conducted and analyzed in a manner consistent with good practices and reported in sufficient detail for evaluation and replication; and (3) how valid the interpretation of the CER study is for the population of interest, assuming sound methods and appropriate follow-up.

With these questions in mind, we provide a non-technical overview of essential prerequisites for high-quality CER studies from a clinical epidemiology standpoint, keeping in mind the potentially divergent agendas of investigators and other stakeholders. We discuss the essentials of study planning, implementation, and publication of results, focusing on observational studies that generate evidence addressing different dimensions of the harm–benefit profiles of therapies.

Study Planning

The Stakeholders and the Aim

The aim of a CER study should be clearly and unambiguously defined and should meet criteria for good research, e.g., the FINER [31] or PICOTS [32] criteria. The FINER criteria state that the proposed research should be feasible (in terms of number of patients and sources of data, technical expertise, expenditure of time and money, and manageable scope); interesting (to the clinical community as well as the investigator); novel (in terms of extending and improving previous research); ethical; and relevant (to scientific knowledge, clinical health policies, or future research). The parameters for good research to be considered according to PICOTS include the population (condition(s), disease severity and stage, co-morbidities, and patient demographics), the intervention (dosage, frequency, and method of administration), the comparator (placebo, usual care, or active control), the outcome (morbidity, mortality, or quality of life), the timing (duration of follow-up), and the setting (primary, specialty, inpatient, and co-interventions).

The CER study proposal should also explicitly list study initiators, sponsors, and other stakeholders, and potential conflicts of interest. Stakeholders are individuals, organizations, or communities who have a direct interest in the process and outcomes of a study [22, 23, 33]. Stakeholders who might be involved in a CER study include industry (in voluntary or regulator-imposed post-authorization safety or effectiveness studies [34•]), regulators (e.g., European Medicines Agency (EMA), US Food and Drug Administration), and governments—in different combinations [22]. Patient engagement in reviewing merits of research proposals is becoming increasingly common, and may serve to increase relevance to patient care of CER and clinical research [35].

An investigator contemplating a CER study initiated by a pharmaceutical company should always consider underlying motivations. These could include concern about safety signals emerging from spontaneous reporting, a wish to study disease risk in the general population or in specific groups of patients before a new treatment enters the market, or a regulator-imposed post-authorization monitoring. To eliminate concerns about hidden agendas that might otherwise compromise the integrity of a CER study, any potential conflict of interest among investigators or participating institutions should be fully disclosed.

It is important to note that collaboration with industry does not per se threaten study validity. If there is an agenda (hidden or obvious), university-based researchers are in a better position than for-profit contract research organizations to uphold and enforce principles and procedures protecting study validity. Academically based investigators are backed by institutional mandates for independence and the obligation to publish results of all studies in journals with independent peer review. Unless they are providing direct gainful consultancy services to the pharmaceutical industry, academic researchers are typically salaried employees who do not directly benefit financially from ‘landing’ a lucrative pharmaceutical contract. Since such a contract is executed between institutions rather than individuals, the financial gain of an individual academic investigator is limited (source: Susanne Kudsk, Legal Advisor, Aarhus University, personal communication). As well, conducting a poor study under pressure from a sponsor affects an investigator’s reputation [29]. If experts from academia refuse to collaborate with industry on CER studies, they may be replaced by potentially less skillful, less scrupulous, or less independent investigators [36].

The Contract

Collaboration between academic institutions and regulators, government, and/or industry sponsors should be governed by a professional contract, which is crucial for both the researcher and the sponsor. A contract is a formal agreement establishing the ‘rules of the game’: what is to be done, by whom, when, and at what cost. In international environments, the country whose laws will govern the contract should be clearly specified. A contract serves as a master document to be consulted in case of disputes. It should be executed by the researcher’s institution to avoid conflicts and charges of corruption that could arise, were the researcher to receive payment directly from the sponsor.

The type of contract depends on the sponsor’s role. It can take the form of (1) a grant for investigator-initiated studies with no substantial involvement by the sponsor; (2) a cooperative agreement in which the investigator and the sponsor collaborate on the project and both contribute funding and intellectual content; or (3) a contract for sponsor-initiated studies with substantial involvement by the sponsor.

The contract should regulate the interests of both the investigator (and his/her institution) and the sponsor. It should describe the parties, the purpose of the research, the definition of the project, deliverables, schedule, subcontracting, contributions and obligations of the parties, distribution and transfer of rights, confidentiality, and consequences of ending the collaboration.

The contract must ensure that the researcher and the researcher’s institution are free to use the findings in future research and teaching. The researcher also should have the unrestricted right to publish the research findings. In most cases, the sponsor may require a period of time (e.g., 30 or 60 days) to review and comment on a manuscript arising from contract research before submission for publication. Both parties must be willing to negotiate the manuscript’s content and phrasing, but the researcher should have the final say. In special circumstances, the sponsor may postpone publication for up to 6 months, for instance, to apply for a patent. However, this is a rare occasion in CER, in which timely publication of results with a public health impact has high priority. In addition, publication should not be postponed by adverse event reporting, which is usually not possible or appropriate based on aggregate results from a non-experimental CER study using databases [30•].

Assessing Study Feasibility

CER studies are increasingly conducted using secondary data sources, such as healthcare databases, which rely on routine data collected for other purposes. This raises the question of whether the data relevant to the study aim are measured at all, and measured well, in the candidate data source. A feasibility study conducted ahead of the main effort may help secure data access, estimate study size, or evaluate background rates of the target condition. A feasibility study may also help establish referral and hospitalization patterns to assess the potential role of selection bias or confounding by indication. At our institution, we routinely evaluate the validity of study algorithms before using them in CER studies. For example, we evaluated the validity of algorithms to identify osteonecrosis of the jaw and serious infections [37–39] before conducting regulator-imposed industry-sponsored comparative safety studies of antiresorptive agents [40]. While the validity of the algorithm used to identify serious infection was high in hospital records, the algorithm to identify osteonecrosis of the jaw performed poorly and necessitated primary data collection [41]. Thus, a feasibility study helps estimate whether, and to what extent, existing data must be supplemented with primary data collection. In addition, a pilot study can help in estimating associated costs and in planning appropriate resources. If data from several different databases are to be combined in a CER study, a pilot study may help determine whether all databases measure equally well what they purport to measure. For example, pilot studies may compare estimates of incidence of well-characterized conditions, examine sources of any unexpected variation, and adjust the methodology (see Avillach et al. [42] and Coloma et al. [43] for illustration of this approach).
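As an illustration of the kind of calculation an algorithm validation may involve, the following minimal Python sketch computes the positive predictive value (PPV) of a case-finding algorithm against a reference standard such as medical record review, with a Wilson score confidence interval. The function name and the counts in the usage line are hypothetical and are not taken from the validation studies cited above; the choice of the Wilson interval is likewise an assumption, and other interval methods may be equally appropriate.

```python
import math


def ppv_with_wilson_ci(confirmed, flagged, z=1.96):
    """Return (ppv, lower, upper) for a case-finding algorithm.

    confirmed: algorithm-flagged cases confirmed by the reference standard
    flagged:   all algorithm-flagged cases that were reviewed
    z:         normal quantile (1.96 gives an approximate 95% interval)
    """
    if flagged <= 0 or not 0 <= confirmed <= flagged:
        raise ValueError("need flagged > 0 and 0 <= confirmed <= flagged")
    p = confirmed / flagged
    # Wilson score interval for a binomial proportion
    denom = 1 + z**2 / flagged
    centre = (p + z**2 / (2 * flagged)) / denom
    half = z * math.sqrt(p * (1 - p) / flagged + z**2 / (4 * flagged**2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts: 50 of 100 reviewed algorithm-flagged cases confirmed
ppv, lower, upper = ppv_with_wilson_ci(50, 100)
print(f"PPV {ppv:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

A PPV with a wide interval, or one well below an acceptable threshold fixed in advance, would be one signal that primary data collection may be needed.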

Review of the Skills of Team Members

For a CER study to be well-conducted, the investigator should be mindful of whether the research team covers the spectrum of required expertise and skills. Multidisciplinary CER study teams usually include pharmacoepidemiologists, biostatisticians, pharmacologists, and clinicians. Access to legal advice and project management are also essential to a well-conducted CER study. For multi-institutional studies, it may be efficient to outsource certain administrative or IT tasks. Furthermore, since many comparative effectiveness studies address major and pressing clinical and legal issues, it is important to select participating investigators who can meet tight deadlines without compromising research quality.

International Collaboration

If the required skills and resources are not present within the local team, international collaboration with leading experts in relevant fields can help ensure high quality of a CER study. Moreover, data from a single country/data system may be insufficient to address all study objectives, to achieve sufficient sample size, or to achieve sufficient generalizability. In some instances, collaboration between at least two different countries may be a condition for funding: for example, the EMA routinely requires use of data from two or more EU Member States in its commissioned research [44]. Finally, investigators whose institutional or national policies proscribe direct collaboration with industry may contribute to CER as subcontractors within international collaborations [40]. Decisions about the number of required databases can be formalized in the study protocol, as recently described [45].

Study Implementation

Protocol and Statistical Analysis Plan

After study feasibility is established, study sources identified, study teams assembled, and the contract signed, a study protocol is developed or finalized as the first step of study implementation. Several guidelines for the structure and components of CER protocols have been proposed [13••, 27, 46, 47]. The user guide developed for the United States Agency for Healthcare Research and Quality is comprehensive yet readable and contains contributions by highly reputed experts [13••].

Protocol writers should strive to create a detailed and transparent guide to the conduct of the study. The protocol must define the primary, secondary, and potential exploratory study objectives. Protocol writing is an iterative process that helps raise and address methodological issues. Protocol-related challenges of studies based on multinational secondary data sources require an adequate description of diverse data systems and measurement of study variables extracted from diverse sources (such as general practice-based databases, claims databases, and/or national registries). These sources may have different mechanisms for generating records, which affect data validity and completeness as well as interpretation of results.

In multinational studies, it is crucial to involve all participants in writing and revising the study protocol. In regulator-imposed post-authorization studies, the marketing authorization holder may initiate writing of the protocol according to prespecified formats, working with data custodians in participating countries to harmonize data-generating mechanisms. The protocol should be reviewed by clinicians with relevant expertise and with experience treating patients in a given health system; by statisticians with practical expertise in data-generating mechanisms, data flow, and data architecture; and by epidemiologists who can foresee the implications of data idiosyncrasies for interpretation of results.

For observational studies, including CER studies, the protocol should contain clear provisions for efforts to rule out methodological threats to validity, including selection bias, information bias, confounding, and chance. Use of automated health records (claims, patient and disease registries, medical record databases, and insurance databases) has become a mainstay of CER [8, 9, 25, 48]. Thus, investigators have large amounts of routinely collected data on large numbers of individuals but limited control of data collection. In an era of automated databases, it is essential to consider how selection bias, confounding by indication, data quality, misclassification, and medical surveillance bias are to be handled [49•]. Some traditional epidemiologic ‘mantras’ [50] may not apply in CER settings. One example is the dilution of estimates by non-differential misclassification of exposure, frequently invoked to defend ‘conservative estimates’ in studies of non-pharmaceutical exposures. Dilution of estimates in CER studies is, as in any other study, a potential public health hazard if exposure measurement instruments and definitions are so poor that they lower the strength of a safety signal beyond detection, resulting in continued use of a potentially unsafe agent. CER study protocols must specify ways to avoid dilution of the effect by inclusion of outcome measures that have high specificity. Another example is the challenge of confounding by indication when comparing treated with untreated patients; however, in CER studies comparing two different drugs with the same indication, this problem is often reduced considerably.
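The dilution described above can be made concrete with a small worked calculation. The Python sketch below is illustrative only: it assumes a binary exposure, a cohort with known true risks, and exposure misclassification that is independent of the outcome (non-differential), summarized by a sensitivity and a specificity. The function name and all numbers are hypothetical.

```python
def observed_risk_ratio(n_exposed, n_unexposed, risk_exposed, risk_unexposed,
                        sensitivity, specificity):
    """Expected risk ratio after non-differential exposure misclassification.

    sensitivity: probability that a truly exposed person is classified as exposed
    specificity: probability that a truly unexposed person is classified as unexposed
    Both are assumed to be the same for persons with and without the outcome.
    """
    # Persons classified as exposed: true positives plus false positives
    n_cls_exp = sensitivity * n_exposed + (1 - specificity) * n_unexposed
    cases_cls_exp = (sensitivity * n_exposed * risk_exposed
                     + (1 - specificity) * n_unexposed * risk_unexposed)
    # Persons classified as unexposed: false negatives plus true negatives
    n_cls_unexp = (1 - sensitivity) * n_exposed + specificity * n_unexposed
    cases_cls_unexp = ((1 - sensitivity) * n_exposed * risk_exposed
                       + specificity * n_unexposed * risk_unexposed)
    return (cases_cls_exp / n_cls_exp) / (cases_cls_unexp / n_cls_unexp)


# Hypothetical cohort: 10,000 exposed (risk 2%) and 10,000 unexposed (risk 1%),
# so the true risk ratio is 2.0
rr = observed_risk_ratio(10_000, 10_000, 0.02, 0.01,
                         sensitivity=0.8, specificity=0.9)
print(round(rr, 2))  # 1.6
```

In this hypothetical setting, a true doubling of risk appears as an increase of roughly 60%; with rarer outcomes or poorer exposure measurement, an attenuated signal of this kind could fall below the threshold of detection.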

The planned statistical analysis should be described in sufficient detail in the study protocol. However, the comprehensive description of statistical procedures may require a separate document, the Statistical Analysis Plan (SAP). As the SAP is a guide for the study statistician, he/she should be involved in its preparation and must approve it. The SAP closely follows the study protocol and is developed after the protocol is finalized. The SAP contains a detailed description of sampling and analytic procedures, and many sections of the SAP will be lifted verbatim for use in the statistical analysis section of the study report or a published paper. Analysis of data from different international sources may be country-based or pooled. Development of common data models is quickly becoming the standard approach. Different approaches to combining international data have been described and are beyond the scope of this paper [40–43, 45, 51, 52••, 53, 54••, 55, 56].

Transparency and methodological rigor are necessary features of the protocol and the SAP. The CER protocol must be in place before the study commences. In some situations, e.g., in some regulator-imposed studies, a protocol must be in place before the drug under study enters the market. By definition, such a protocol is not informed by crucial aspects of real-life drug utilization, including whether the drug will be distributed in inpatient or outpatient settings (and therefore measurable in outpatient prescription databases) and how fast drug uptake occurs. Therefore, amendments to the protocol are often necessary as real-life aspects of drug use become apparent. Protocol amendments should be justified, scientifically sound, agreed upon by all study stakeholders, and meticulously documented [57]. CER protocols and all amendments may need approval by a regulator. The EMA publishes the protocols of imposed post-authorization studies in its ENCePP (European Network of Centres for Pharmacoepidemiology and Pharmacovigilance) register of studies [58]. Researchers should consider registration of any CER study; for example, non-ENCePP studies can also be registered in the ENCePP register.

Interacting with the Sponsor

Professional interaction with the sponsor is important in both investigator- and sponsor-initiated studies, depending on contributions agreed on before study initiation. Formal channels of communication (e.g., frequency of investigator meetings, teleconferences, and updates) should be agreed upon in advance. Informal communication with sponsor employees is less regulated. Pharmaceutical companies often have dedicated research, development, and/or safety departments that are separated from the sales department in order to reduce conflicts of interest.

The sponsor may contribute important background knowledge to a CER study, which can be useful in formulating the research question (e.g., nature of potential adverse events from ongoing RCTs). However, during the conduct of the study, communication may be more informative than interactive. While the researcher and the sponsor should share a fundamental interest in improving health for patients, they may have different interests that should be kept in mind during interactions. Respectful communication is required, as research findings should not be influenced by the sponsor. Still, the sponsor may have a particular interest in getting as much information as possible, as research findings may have a major impact on approval, labeling, and sale of the company’s products.

Publication of Results

The publication potential of CER studies is attractive to academia-based researchers and may serve as an important motivator for expert clinicians and methodologists to contribute their efforts. The investigators should be free to publish all results stemming from CER research, and this right should be delineated in the contract. Sponsor employees should co-author the publications, provided they fulfill the authorship criteria [59]. Several scientific publications may stem from a single CER study, with different author constellations. Even if it seems redundant, it is worth circulating the ICMJE (International Committee of Medical Journal Editors) authorship criteria before drafting a manuscript to ensure that all aspiring authors understand and are prepared to fulfill their expected contributions. Results should be transparently reported and judiciously interpreted, including honest discussion of study limitations. Current reporting guidelines [60], especially the STROBE (STrengthening the Reporting of OBservational studies in Epidemiology) statement for observational studies, and the upcoming RECORD (REporting of studies Conducted using Observational Routinely collected Data) guidelines for reporting studies conducted using routinely collected data [61•], will help determine the type of information that needs to be included in the planned report.

Conclusion

In conclusion, methodological rigor, clear rules, transparency in communication, and independence in reporting are the guiding principles of observational CER, with the ultimate goal of improving patient care and public health.