Background

In medicine, patients and their care providers are confronted with numerous decisions that must be made on the basis of an estimated risk or probability that a specific disease or condition is present (diagnostic setting) or that a specific event will occur in the future (prognostic setting) (Figure 1). In the diagnostic setting, the probability that a particular disease is present can be used, for example, to inform the referral of patients for further testing, to initiate treatment directly, or to reassure patients that a serious cause for their symptoms is unlikely. In the prognostic setting, predictions can be used for planning lifestyle or therapeutic decisions based on the risk of developing a particular outcome or state of health within a specific period [1,2]. Such estimates of risk can also be used to risk-stratify participants in therapeutic clinical trials [3,4].

Figure 1

Schematic representation of diagnostic and prognostic prediction modeling studies. The nature of the prediction in diagnosis is estimating the probability that a specific outcome or disease is present (or absent) within an individual, at this point in time—that is, the moment of prediction (T = 0). In prognosis, the prediction is about whether an individual will experience a specific event or outcome within a certain time period. In other words, in diagnostic prediction the interest is in principle a cross-sectional relationship, whereas prognostic prediction involves a longitudinal relationship. Nevertheless, in diagnostic modeling studies, for logistical reasons, a time window between predictor (index test) measurement and the reference standard is often necessary. Ideally, this interval should be as short as possible, and no treatment should be started within it.

In both the diagnostic and prognostic setting, estimates of probabilities are rarely based on a single predictor [5]. Doctors naturally integrate several patient characteristics and symptoms (predictors, test results) to make a prediction (see Figure 2 for differences in common terminology between diagnostic and prognostic studies). Prediction is therefore inherently multivariable. Prediction models (also commonly called “prognostic models,” “risk scores,” or “prediction rules” [6]) are tools that combine multiple predictors by assigning relative weights to each predictor to obtain a risk or probability [1,2]. Well-known prediction models include the Framingham Risk Score [7], Ottawa Ankle Rules [8], EuroSCORE [9], Nottingham Prognostic Index [10], and the Simplified Acute Physiology Score [11].
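
To make the notion of combining weighted predictors concrete, a regression-based prediction model of the kind listed above typically sums the predictor values, each multiplied by its estimated weight, into a linear predictor and then converts this sum into a probability. In generic notation for a logistic regression model (an illustration of the general form only, not the formula of any specific model cited above):

\[
\text{lp} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p,
\qquad
\hat{P}(\text{outcome}) = \frac{1}{1 + e^{-\text{lp}}},
\]

where \(X_1, \ldots, X_p\) are the predictor values, \(\beta_1, \ldots, \beta_p\) their estimated weights (regression coefficients), and \(\beta_0\) the model intercept.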

Figure 2

Similarities and differences between diagnostic and prognostic prediction models.

Prediction model studies

Prediction model studies can be broadly categorized as model development [12], model validation (with or without updating) [13], or a combination of both (Figure 3). Model development studies aim to derive a prediction model by selecting the relevant predictors and combining them statistically into a multivariable model. Logistic and Cox regression are most frequently used for short-term (for example, disease absent vs. present, 30-day mortality) and long-term (for example, 10-year risk) outcomes, respectively [12–17]. Studies may also focus on quantifying the incremental or added predictive value of a specific predictor (for example, a newly discovered one) to a prediction model [18].
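
As a purely illustrative sketch of the development step (not a method prescribed by TRIPOD, and using hypothetical, simulated data and predictor names), fitting a multivariable logistic regression model for a binary short-term outcome might look as follows:

```python
# Illustrative sketch only: developing a multivariable logistic regression
# prediction model for a binary (short-term) outcome.
# All data and predictor names are hypothetical (simulated).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(seed=1)
n = 500

# Hypothetical predictors: age (years), systolic blood pressure (mm Hg), smoking (0/1)
age = rng.normal(60, 10, n)
sbp = rng.normal(140, 20, n)
smoking = rng.integers(0, 2, n)

# Simulate an outcome from an assumed "true" relationship (for illustration only)
true_lp = -8 + 0.06 * age + 0.02 * sbp + 0.7 * smoking
y = rng.binomial(1, 1 / (1 + np.exp(-true_lp)))

# Combine the predictors into a design matrix with an intercept and fit the model;
# the fitted coefficients are the relative weights assigned to each predictor
X = sm.add_constant(np.column_stack([age, sbp, smoking]))
model = sm.Logit(y, X).fit(disp=0)
print(model.params)  # intercept and predictor weights (log odds)
```

A Cox regression model for a long-term (time-to-event) outcome would be developed analogously, with follow-up time and censoring taken into account.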

Figure 3

Types of prediction model studies covered by the TRIPOD statement. D = development data; V = validation data.

Quantifying the predictive ability of a model on the same data from which the model was developed (often referred to as apparent performance) will tend to give an optimistic estimate of performance, owing to overfitting (too few outcome events relative to the number of candidate predictors) and the use of predictor selection strategies [19]. Studies developing new prediction models should therefore always include some form of internal validation to quantify any optimism in the predictive performance (for example, calibration and discrimination) of the developed model. Internal validation techniques use only the original study sample and include such methods as bootstrapping or cross-validation. Internal validation is a necessary part of model development [2]. Overfitting, optimism, and miscalibration may also be addressed and accounted for during the model development by applying shrinkage (for example, heuristic or based on bootstrapping techniques) or penalization procedures (for example, ridge regression or lasso) [20].
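
To illustrate what internal validation can look like in practice, the sketch below (our own example with simulated data, not part of TRIPOD itself) uses bootstrap optimism correction to adjust the apparent c-statistic (area under the ROC curve) of a logistic regression model:

```python
# Illustrative sketch of internal validation by bootstrapping (optimism correction
# of the c-statistic); X and y are hypothetical, simulated development data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def fit_and_apparent_auc(X, y):
    """Fit a logistic model and return its apparent c-statistic and the model.
    Note: sklearn's default L2 penalty is itself a mild form of ridge penalization."""
    model = LogisticRegression(max_iter=1000).fit(X, y)
    return roc_auc_score(y, model.predict_proba(X)[:, 1]), model

rng = np.random.default_rng(seed=2)
X = rng.normal(size=(500, 3))
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ np.array([0.8, 0.5, -0.3])))))

apparent_auc, _ = fit_and_apparent_auc(X, y)

optimism = []
for _ in range(200):                          # bootstrap replicates
    idx = rng.integers(0, len(y), len(y))     # resample individuals with replacement
    boot_auc, boot_model = fit_and_apparent_auc(X[idx], y[idx])
    # Performance of the bootstrap model when applied back to the original sample
    test_auc = roc_auc_score(y, boot_model.predict_proba(X)[:, 1])
    optimism.append(boot_auc - test_auc)

print("apparent c-statistic:          ", round(apparent_auc, 3))
print("optimism-corrected c-statistic:", round(apparent_auc - np.mean(optimism), 3))
```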

After developing a prediction model, it is strongly recommended to evaluate the performance of the model in participant data other than those used for model development. Such external validation requires that, for each individual in the new data set, outcome predictions are made using the original model (that is, the published regression formula) and compared with the observed outcomes [13,14]. External validation may use participant data collected by the same investigators, typically using the same predictor and outcome definitions and measurements, but sampled from a later period (temporal or narrow validation); by other investigators in another hospital or country, sometimes using different definitions and measurements (geographic or broad validation); in similar participants but from an intentionally different setting (for example, a model developed in secondary care and assessed in similar participants selected from primary care); or even in other types of participants (for example, a model developed in adults and assessed in children, or developed for predicting fatal events and assessed for predicting nonfatal events) [13,15,17,21,22]. In the case of poor performance, the model can be updated or adjusted on the basis of the validation data set [13].
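
The following sketch illustrates the core of such an external validation analysis under simplified assumptions: the intercept and coefficients of the "published" model and the new validation data are hypothetical, and only two common performance measures are computed (the c-statistic for discrimination and the calibration slope):

```python
# Illustrative sketch of external validation: applying a previously published
# logistic regression formula to new participant data without refitting it.
# The "published" intercept/coefficients and the validation data are hypothetical.
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

intercept = -7.5
coefs = np.array([0.05, 0.02, 0.6])   # hypothetical weights for age, sbp, smoking

# Hypothetical validation data: age (years), sbp (mm Hg), smoking (0/1), observed outcome
rng = np.random.default_rng(seed=3)
m = 400
X_new = np.column_stack([rng.normal(65, 9, m), rng.normal(150, 18, m), rng.integers(0, 2, m)])
y_new = rng.binomial(1, 1 / (1 + np.exp(-(-8 + X_new @ np.array([0.06, 0.02, 0.7])))))

# Predicted probabilities from the original (published) formula
lp = intercept + X_new @ coefs
p_hat = 1 / (1 + np.exp(-lp))

# Discrimination: c-statistic (area under the ROC curve) in the new data
print("c-statistic:", round(roc_auc_score(y_new, p_hat), 3))

# Calibration slope: regress the observed outcome on the linear predictor;
# a slope close to 1 suggests the original weights transport well to the new data
calibration = sm.Logit(y_new, sm.add_constant(lp)).fit(disp=0)
print("calibration intercept and slope:", calibration.params)
```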

Reporting of multivariable prediction model studies

Studies developing or validating a multivariable prediction model share specific challenges for researchers [6]. Several reviews have evaluated the quality of published reports that describe the development or validation of prediction models [23–28]. For example, Mallett and colleagues [26] examined 47 reports published in 2005 presenting new prediction models in cancer. Reporting was found to be poor, with insufficient information described in all aspects of model development, from descriptions of patient data to statistical modeling methods. Collins and colleagues [24] evaluated the methodological conduct and reporting of 39 reports published before May 2011 describing the development of models to predict prevalent or incident type 2 diabetes. Reporting was also found to be generally poor, with key details on which predictors were examined, the handling and reporting of missing data, and the model-building strategy often poorly described. Bouwmeester and colleagues [23] evaluated 71 reports, published in 2008 in 6 high-impact general medical journals, and likewise observed an overwhelmingly poor level of reporting. These and other reviews provide a clear picture that, across different disease areas and different journals, prediction model studies are generally poorly reported [6,23–27,29]. Furthermore, these reviews have shown that serious deficiencies in the statistical methods, use of small data sets, inappropriate handling of missing data, and lack of validation are common [6,23–27,29]. Such deficiencies ultimately lead to prediction models that are not used, or that should not be used. It is therefore not surprising, and indeed fortunate, that very few prediction models, relative to the large number published, are widely implemented or used in clinical practice [6].

Prediction models in medicine have proliferated in recent years. Health care providers and policy makers are increasingly recommending the use of prediction models within clinical practice guidelines to inform decision making at various stages in the clinical pathway [30,31]. It is a general requirement of the reporting of research that other researchers can, if required, replicate all the steps taken and obtain the same results [32]. It is therefore essential that key details of how a prediction model was developed and validated be clearly reported to enable synthesis and critical appraisal of all relevant information [14,33–36].

Reporting guidelines for prediction model studies: the TRIPOD statement

We describe the development of the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis) Statement, a guideline specifically designed for the reporting of studies developing or validating a multivariable prediction model, whether for diagnostic or prognostic purposes. TRIPOD is not intended for multivariable modeling in etiologic studies or for studies investigating single prognostic factors [37]. Nor is TRIPOD intended for impact studies that quantify the impact of using a prediction model on participant or doctors’ behavior and management, participant health outcomes, or the cost-effectiveness of care, compared with not using the model [13,38].

Reporting guidelines for observational (the STrengthening the Reporting of OBservational studies in Epidemiology [STROBE]) [39], tumor marker (REporting recommendations for tumour MARKer prognostic studies [REMARK]) [37], diagnostic accuracy (STAndards for the Reporting of Diagnostic accuracy studies [STARD]) [40], and genetic risk prediction (Genetic RIsk Prediction Studies [GRIPS]) [41] studies all contain many items that are relevant to studies developing or validating prediction models. However, none of these guidelines are entirely appropriate for prediction model studies. The 2 guidelines most closely related to prediction models are REMARK and GRIPS. However, the focus of the REMARK checklist is primarily on prognostic factors and not prediction models, whereas the GRIPS statement is aimed at risk prediction using genetic risk factors and the specific methodological issues around handling large numbers of genetic variants.

To address a broader range of studies, we developed the TRIPOD guideline: Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis. TRIPOD explicitly covers the development and validation of prediction models for both diagnosis and prognosis, for all medical domains and all types of predictors. TRIPOD also places much more emphasis on validation studies and the reporting requirements for such studies. The reporting of studies evaluating the incremental value of specific predictors, beyond established predictors or even beyond existing prediction models [18,42], also fits entirely within the remit of TRIPOD (see the accompanying explanation and elaboration document [43]).

Developing the TRIPOD statement

We convened a 3-day meeting with an international group of prediction model researchers, including statisticians, epidemiologists, methodologists, health care professionals, and journal editors (from Annals of Internal Medicine, BMJ, Journal of Clinical Epidemiology, and PLoS Medicine) to develop recommendations for the TRIPOD Statement.

We followed published guidance for developing reporting guidelines [44] and established a steering committee (Drs. Collins, Reitsma, Altman, and Moons) to organize and coordinate the development of TRIPOD. We conducted a systematic search of MEDLINE, EMBASE, PsycINFO, and Web of Science to identify any published articles making recommendations on the reporting of multivariable prediction models (or on aspects of developing or validating a prediction model), reviews of published reports of multivariable prediction models that evaluated methodological conduct or reporting, and reviews of methodological conduct and reporting of multivariable models in general. From these studies, a list of 129 possible checklist items was generated. The steering committee then merged related items to create a list of 76 candidate items.

Twenty-five experts with a specific interest in prediction models were invited by e-mail to participate in a Web-based survey and to rate the importance of the 76 candidate checklist items. Respondents (24 of 27) included methodologists, health care professionals, and journal editors. (In addition to the 25 invited experts, the survey was also completed by 2 statistical editors from Annals of Internal Medicine.)

The results of the survey were presented at a 3-day meeting in June 2011 in Oxford, United Kingdom, which was attended by 24 of the 25 invited participants (22 of whom had participated in the survey). During the meeting, each of the 76 candidate checklist items was discussed in turn, and a consensus was reached on whether to retain it, merge it with another item, or omit it. Meeting participants were also asked to suggest additional items. After the meeting, the checklist was revised by the steering committee during numerous face-to-face meetings and circulated to the participants to ensure it reflected the discussions. While making revisions, conscious efforts were made to harmonize our recommendations with other reporting guidelines, and where possible we chose the same or similar wording for items [37,39,41,45,46].

TRIPOD components

The TRIPOD Statement is a checklist of 22 items that we consider essential for good reporting of studies developing or validating multivariable prediction models (Table 1). The items relate to the title and abstract (items 1 and 2), background and objectives (item 3), methods (items 4 through 12), results (items 13 through 17), discussion (items 18 through 20), and other information (items 21 and 22). The TRIPOD Statement covers studies that report solely development [12,15], both development and external validation, or solely external validation (with or without updating) of a prediction model [14] (Figure 3). Therefore, some items are relevant only to studies reporting the development of a prediction model (items 10a, 10b, 14, and 15), and others apply only to studies reporting the (external) validation of a prediction model (items 10c, 10e, 12, 13c, 17, and 19a). All other items are relevant to all types of prediction model development and validation studies. Items relevant only to the development of a prediction model are denoted by D, items relating solely to the validation of a prediction model are denoted by V, and items relating to both types of study are denoted D;V.

Table 1 Checklist of items to include when reporting a study developing or validating a multivariable prediction model for diagnosis or prognosis*

The recommendations within TRIPOD are guidelines only for reporting research and do not prescribe how to develop or validate a prediction model. Furthermore, the checklist is not a quality assessment tool to gauge the quality of a multivariable prediction model.

An ever-increasing number of studies are evaluating the incremental value of specific predictors, beyond established predictors or even beyond existing prediction models [18,42]. The reporting of these studies fits entirely within the remit of TRIPOD (see the accompanying explanation and elaboration document [43]).

The TRIPOD explanation and elaboration document

In addition to the TRIPOD Statement, we produced a supporting explanation and elaboration document [43] in a similar style to those for other reporting guidelines [47–49]. Each checklist item is explained and accompanied by examples of good reporting from published articles. In addition, because many such studies are methodologically weak, we also summarize the qualities of good (and the limitations of less good) studies, regardless of reporting [43]. A comprehensive evidence base from existing systematic reviews of prediction models was used to support and justify the rationale for including and illustrating each checklist item. The development of the explanation and elaboration document was completed after several face-to-face meetings, teleconferences, and iterations among the authors. Additional revisions were made after sharing the document with the whole TRIPOD group before final approval.

Role of the funding source

There was no explicit funding for the development of this checklist and guidance document. The consensus meeting in June 2011 was partially funded by a National Institute for Health Research Senior Investigator Award held by Dr. Altman, Cancer Research UK, and the Netherlands Organization for Scientific Research. Drs. Collins and Altman are funded in part by the Medical Research Council. Dr. Altman is a member of the Medical Research Council Prognosis Research Strategy (PROGRESS) Partnership. The funding sources had no role in the study design, data collection, analysis, preparation of the manuscript, or decision to submit the manuscript for publication.

Discussion

Many reviews have shown that the quality of reporting in published articles describing the development or validation of multivariable prediction models in medicine is poor [23–27,29]. In the absence of detailed and transparent reporting of key study details, it is difficult for the scientific and health care community to objectively judge the strengths and weaknesses of a prediction model study [34,50,51]. The explicit aim of this checklist is to improve the quality of reporting of published prediction model studies. The TRIPOD guideline has been developed to support authors in writing reports describing the development, validation, or updating of prediction models; to aid editors and peer reviewers in reviewing manuscripts submitted for publication; and to help readers in critically appraising published reports.

The TRIPOD Statement does not prescribe how studies developing, validating, or updating prediction models should be undertaken, nor should it be used as a tool for explicitly assessing quality or quantifying risk of bias in such studies [52]. There is, however, an implicit expectation that authors have used an appropriate study design and conducted certain analyses, so that all aspects of model development and validation can be reported. The accompanying explanation and elaboration document describes aspects of good practice for such studies, as well as highlighting some inappropriate approaches that should be avoided [43].

TRIPOD encourages complete and transparent reporting that reflects study design and conduct. It represents the minimum set of information that authors should report to inform the reader about how the study was carried out. We are not suggesting a standardized structure of reporting; rather, authors should ensure that they address all the checklist items somewhere in their article, with sufficient detail and clarity.

We encourage researchers to develop a study protocol, especially for model development studies, and even to register their study in a registry that accommodates observational studies (such as ClinicalTrials.gov) [53,54]. The importance of also publishing protocols for developing or validating prediction models, certainly when conducting a prospective study, is slowly being acknowledged [55,56]. Authors can also include the study protocol when submitting their article for peer review, so that readers can see the rationale for including individuals in the study and judge whether all of the analyses were prespecified.

To help the editorial process; peer reviewers; and, ultimately, readers, we recommend submitting the checklist as an additional file with the report, indicating the pages where information for each item is reported. The TRIPOD reporting template for the checklist can be downloaded from [57].

Announcements and information relating to TRIPOD will be broadcast on the TRIPOD Twitter address (@TRIPODStatement). The Enhancing the QUAlity and Transparency Of health Research (EQUATOR) Network [58] will help disseminate and promote the TRIPOD Statement.

Methodological issues in developing, validating, and updating prediction models evolve. TRIPOD will therefore be periodically reappraised and, if necessary, modified to reflect comments, criticisms, and any new evidence. We encourage readers to make suggestions for future updates so that, ultimately, the quality of prediction model studies will improve.