Personalized diagnosis in suspected myocardial infarction

Background In suspected myocardial infarction (MI), guidelines recommend using high-sensitivity cardiac troponin (hs-cTn)-based approaches. These require fixed assay-specific thresholds and timepoints, without directly integrating clinical information. Using machine-learning techniques including hs-cTn and clinical routine variables, we aimed to build a digital tool to directly estimate the individual probability of MI, allowing for numerous hs-cTn assays.
Methods In 2,575 patients presenting to the emergency department with suspected MI, two ensembles of machine-learning models using single or serial concentrations of six different hs-cTn assays were derived to estimate the individual MI probability (ARTEMIS model). Discriminative performance of the models was assessed using area under the receiver operating characteristic curve (AUC) and logLoss. Model performance was validated in an external cohort with 1,688 patients and tested for global generalizability in 13 international cohorts with 23,411 patients.
Results Eleven routinely available variables including age, sex, cardiovascular risk factors, electrocardiography, and hs-cTn were included in the ARTEMIS models. In the validation and generalization cohorts, excellent discriminative performance was confirmed, superior to hs-cTn only. For the serial hs-cTn measurement model, AUC ranged from 0.92 to 0.98. Good calibration was observed. Using a single hs-cTn measurement, the ARTEMIS model allowed direct rule-out of MI with very high and similar safety but up to tripled efficiency compared to the guideline-recommended strategy.
Conclusion We developed and validated diagnostic models to accurately estimate the individual probability of MI, which allow for variable hs-cTn use and flexible timing of resampling. Their digital application may provide rapid, safe and efficient personalized patient care.
Trial Registration numbers Data from the following cohorts were used for this project: BACC (www.clinicaltrials.gov; NCT02355457), stenoCardia (www.clinicaltrials.gov; NCT03227159), ADAPT-BSN (www.australianclinicaltrials.gov.au; ACTRN12611001069943), IMPACT (www.australianclinicaltrials.gov.au; ACTRN12611000206921), ADAPT-RCT (www.anzctr.org.au; ANZCTR12610000766011), EDACS-RCT (www.anzctr.org.au; ANZCTR12613000745741), DROP-ACS (https://www.umin.ac.jp; UMIN000030668), High-STEACS (www.clinicaltrials.gov; NCT01852123), LUND (www.clinicaltrials.gov; NCT05484544), RAPID-CPU (www.clinicaltrials.gov; NCT03111862), ROMI (www.clinicaltrials.gov; NCT01994577), SAMIE (https://anzctr.org.au; ACTRN12621000053820), SEIGE and SAFETY (www.clinicaltrials.gov; NCT04772157), STOP-CP (www.clinicaltrials.gov; NCT02984436), UTROPIA (www.clinicaltrials.gov; NCT02060760).
Supplementary Information The online version contains supplementary material available at 10.1007/s00392-023-02206-3.


Brief overview on model development and validation
Machine learning models were developed, validated, and generalized to estimate the individual probability of myocardial infarction (MI) in patients presenting to the emergency department (ED) with symptoms potentially indicative of MI. For each of the six investigated hs-cTn assays, two separate models were estimated:
• Single hs-cTn measurement: the model uses troponin information from the first troponin measurement only plus patient-specific data.
• Serial hs-cTn measurement: the model uses troponin information from the first troponin measurement and a second troponin measurement with the same assay obtained at a later time point during ED stay plus patient-specific data.
Details of the estimation approach are provided in the subsequent sections of this supplement. In brief:
• All modeling steps were done with 10-fold cross-validation.
• We first imputed missing data using multiple imputation (1).
• We employed 11 different learning machines for each of the hs-cTn assays within each cross-validation for both single and serial hs-cTn measurements, and a total of 18 clinical variables were initially offered (full models).
• A clinical variable was included if it was selected by at least 5 machines (reduced models). This was thought to improve model stability.
• Finally, we selected the four best-performing non-redundant machines for single hs-cTn measurement as well as the four best-performing non-redundant machines for serial hs-cTn measurements and combined them into a super learner with equal weights, separately for single and serial hs-cTn measurements, to estimate MI probability. Individual MI probabilities estimated by both super learners were expressed as a number theoretically ranging between 0 and 100%.
• The performance of the single and serial hs-cTn measurement super learner models was primarily assessed by logLoss (lower values indicate better performance) and, as secondary performance measures, by the AUC (higher values indicate better performance) as well as the Brier score (lower values indicate better performance).
• We compared the AUC, Brier score and logLoss of the reduced machines to
  o the full machines,
  o machines using only hs-cTn measures without other clinical variables, and
  o machines including estimated glomerular filtration rate (eGFR) using the CKD-EPI formula (2).
• We compared single machines with super learner machines.
• Comparisons were done with the Iman and Davenport version of the Friedman test followed by the Nemenyi post-hoc test (3).
• The single and serial hs-cTn measurement models were validated in the external validation dataset and the generalization datasets after re-calibration using logistic regression with restricted cubic splines. Calibration curves were generated to assess model calibration.
• DerSimonian and Laird random effect meta-analyses were estimated for averaging results between generalization datasets (4).
• Tables and figures were created to demonstrate the diagnostic performance across the spectrum of possible MI probability thresholds in one percent increments. Diagnostic performance measures included negative and positive predictive value (NPV and PPV), sensitivity and specificity, the proportion of patients below or above a given MI probability threshold, as well as the corresponding 30-day incidence of MI or death. These tables and figures could be used to identify patients at low risk of MI suitable for outpatient management or those at high risk who are suitable for inpatient or invasive strategies.
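The threshold tables mentioned in the last bullet can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `rule_out_table`, the convention that a patient counts as test-positive when the estimated probability is at or above the threshold, and the toy data are all assumptions made for illustration.

```python
def rule_out_table(probs, outcomes, thresholds=None):
    """Diagnostic measures per MI probability threshold.

    Assumption: a patient is classified 'positive' if prob >= threshold and
    'negative' (candidate for rule-out) if prob < threshold.
    """
    if thresholds is None:
        thresholds = [t / 100 for t in range(1, 100)]  # one percent increments
    n = len(probs)
    rows = []
    for t in thresholds:
        tp = sum(1 for p, y in zip(probs, outcomes) if p >= t and y == 1)
        fp = sum(1 for p, y in zip(probs, outcomes) if p >= t and y == 0)
        fn = sum(1 for p, y in zip(probs, outcomes) if p < t and y == 1)
        tn = sum(1 for p, y in zip(probs, outcomes) if p < t and y == 0)
        rows.append({
            "threshold": t,
            "sensitivity": tp / (tp + fn) if tp + fn else None,
            "specificity": tn / (tn + fp) if tn + fp else None,
            "npv": tn / (tn + fn) if tn + fn else None,
            "ppv": tp / (tp + fp) if tp + fp else None,
            "prop_below": (tn + fn) / n,  # share of patients below the threshold
        })
    return rows


# Hypothetical toy data, not taken from any study cohort.
probs = [0.005, 0.02, 0.03, 0.10, 0.40, 0.80, 0.95, 0.01]
outcomes = [0, 0, 0, 0, 1, 1, 1, 0]
table = rule_out_table(probs, outcomes)
row_5_percent = table[4]  # entry for the 5% threshold
```

Reading one row of such a table gives the safety (NPV, sensitivity) and efficiency (proportion below the threshold) of a rule-out decision at that threshold.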

Source of Data
The procedure was divided into three steps: 1. derivation, i.e., model development, 2. external validation, and 3. generalization. Five different troponin assays were used for model development and in the cohort used for external validation. One, two or three different troponin assays were measured in the generalization cohorts. Measurements for a sixth troponin assay together with one of the other five troponin assays were available in an additional set of patients from the derivation cohort. Patients with ST-segment elevation MI were excluded from all parts of this study.

Model derivation data
The Biomarkers in Acute Cardiac Care (BACC) study (6) was used for model development. BACC has been described previously (6). Briefly, we included 2719 patients presenting to the ED and chest pain unit of the University Heart Center Hamburg with suspected acute MI. All patients were enrolled between July 2013 and December 2019. The inclusion criteria were a suspected MI, age ≥ 18 years and the ability to provide written informed consent. We excluded all patients with ST-elevation MI (n = 144) from further analyses. An additional n = 135 BACC patients without ST-elevation MI were included for the integration of an additional assay into the machine learning model. The diagnosis was adjudicated in a blinded fashion by two physicians independently according to the fourth Universal Definition of MI. In cases of disagreement, a third physician was consulted and the disagreement was resolved. The BACC study was registered at www.clinicaltrials.gov (NCT02355457).

External validation data
The stenoCardia study (7) was used for model validation. The methodology, follow-up, and adjudication of outcomes in stenoCardia have been reported in detail previously (7).
Between January 2007 and December 2008, 1818 patients with suspected acute coronary syndrome were consecutively enrolled in an observational multicentre cohort at three German tertiary care centers (University Medicine Mainz, University Hospital Hamburg-Eppendorf and Federal Armed Forces Hospital Koblenz). We excluded individuals with ST-segment-elevation MI (n=130) because the electrocardiographic diagnosis requires immediate treatment and biomarker diagnosis plays a minor role. The stenoCardia study was registered at www.clinicaltrials.gov (NCT03227159).

Generalization data
Generalization was done using the cohorts listed in Table 1. A brief description of each cohort is provided below.

ADPs-CH
The ADAPT-CH study was prospectively performed in accordance with the ADAPT-BSN study (see above). The ADAPT-RCT, registered at www.anzctr.org.au (ANZCTR12610000766011), aimed at comparing the effectiveness of a rapid diagnostic pathway with a standard-care diagnostic pathway for the assessment of patients with possible cardiac chest pain in a usual clinical practice setting. ED patients in whom the attending physician was investigating for possible acute coronary syndrome were included. Two senior clinicians adjudicated independently for any major adverse cardiac event within 30 days. A third senior clinician adjudicated any disagreements between the first two clinicians. The ED Assessment of Chest Pain Score - Randomized Controlled Trial (EDACS-RCT, registered at www.anzctr.org.au, ANZCTR12613000745741) aimed to test for the existence and size of any beneficial effect of using the EDACS-ADP protocol in routine clinical care compared with the ADAPT-ADP protocol. SPACE-24 was an observational study with additional sampling time points. Cohort selection and gold standard diagnosis for EDACS-ADP and SPACE-24 were identical to the ADAPT-RCT. All studies were initiated and based in Christchurch Hospital, New Zealand, with the majority of recruitment from this center.

DROP-ACS
The Diagnostics and Reduction of Asian Patients with Acute Coronary Syndrome Cost Analysis Base on the 0/1-h Algorithm Using High-sensitivity Cardiac Troponin (DROP-ACS) study is a prospective, international, multicenter, diagnostic cohort investigation conducted at five sites in two countries (Japan and Taiwan) that began in November 2014. Briefly, we included patients presenting to the ED with chest pain. Patients were enrolled until June 2022. The inclusion criterion was adults (aged 30-89 years) presenting with chest pain related to a suspected cardiac cause. Implementation of the 0/1-h algorithm was left to the discretion of the attending physicians. The exclusion criteria were as follows: (1) STEMI; (2) chronic kidney disease (serum creatinine level > 3 mg/dL); (3) congestive heart failure, defined as the presence of hypoxia and typical pulmonary congestion confirmed on a chest radiograph; (4) ventricular tachycardia. Informed consent was obtained from all patients. The gold standard diagnosis was adjudicated in a blinded fashion by two physicians independently according to the fourth Universal Definition of MI. In cases of disagreement, a third physician refereed. The DROP-ACS study was registered at https://www.umin.ac.jp (UMIN000030668).

FASTEST
The "Fast ASsessment of Thoracic pain in the Emergency department using high-Sensitive Troponins" (FASTEST) study is a prospective multicenter study divided into two phases. The study was performed at six different Swedish centers enrolling unselected patients with acute chest pain presenting to the ED. The primary objective of the study was to determine whether a diagnostic strategy based on early serial measurement of high-sensitive troponins and a simple risk score will reduce the admission rate in patients with symptoms suggestive of ACS. The inclusion criteria were age ≥18 years; chest pain suggestive of ACS with duration of ≥10 min and onset of last episode within 12 hours; willingness to have extra blood samples taken; and written informed consent. The exclusion criteria were ST-segment elevation or new left bundle branch block. All patients underwent standard clinical assessment including repeated measurements of cTn, in phase 1 at ED presentation, 2 hours and 6-24 hours and in phase 2 at presentation, 1 hour and 6-24 hours. At five centers, the Elecsys Troponin T high-sensitive assay (Roche) was used and at one center, the ARCHITECT STAT High Sensitive Troponin-I assay (Abbott). Plasma samples for biobanking were obtained at presentation, after 2h (phase 1), 1h (phase 2) and 6-24 hours. After discharge from the hospital, all patients were followed for cardiac events by telephone contacts and, if necessary, by validation in patient records at 30 days. No patient was lost to follow-up. The index diagnosis and cardiovascular events up to 30 days were adjudicated by independent reviewers.

High-STEACS
The High-Sensitive Troponin in the Evaluation of Acute Coronary Syndrome (High-STEACS) sub-study prospectively enrolled patients with suspected acute coronary syndrome presenting to the Royal Infirmary of Edinburgh, Edinburgh, Scotland (clinical trials registration at www.clinicaltrials.gov, NCT01852123). All patients provided written informed consent. Patients were excluded if they had a previous presentation during the study period or were not resident in Scotland. Patients with ST-elevation MI were not included. The final diagnosis was adjudicated for all patients by two independent physicians, with consensus from a third physician where there was discrepancy. Patients were classified as having type 1 MI, type 2 MI or myocardial injury in accordance with the third universal definition of MI. Any hs-TnI concentration above the sex-specific 99th centile upper reference limit (16ng/L for women, 34ng/L for men) was considered evidence of myocardial necrosis. All patients were followed for at least one year for subsequent MI or cardiac death.

LUND
The Lund Chest Pain Study included patients presenting with nontraumatic chest pain to the ED of Skåne University Hospital in Lund (clinical trials registration at www.clinicaltrials.gov, NCT05484544). In this prospective observational study, clinical data, 1h hs-TnT, and ED physicians' assessment of the patient history and ECG were collected. The primary objective was to evaluate the diagnostic accuracy of the 1h algorithm when used in conjunction with patient history and ECG, to predict MACE within 30 days as compared to using 1h hs-TnT alone. The gold standard was a final adjudicated diagnosis of 30-day MACE, as decided by independent reviews by two cardiologists, and in case of disagreement, by a third cardiologist. The cardiologists were blinded to the 1h hs-TnT. The adjudicated diagnoses were based on all available clinical information from all hospitals in Sweden within 60 days from the index visit, such as patient history and results of blood samples, ECG, echocardiography, stress test, and coronary angiography.

Rapid-CPU
The RAPID-CPU consists of two study populations (registration at www.clinicaltrials.gov, NCT03111862). Consecutive patients presenting to the ED at University hospital Heidelberg, Germany, with acute symptoms suggestive of acute coronary syndrome were included in a retrospective registry study. Consecutive patients with suspected acute coronary syndrome were recruited between 1 July 2016 and 30 June 2017 (cohort 1) and between 31 June and 1 July 2018 (cohort 2). In both populations, a final adjudicated diagnosis was made by two independent cardiologists. A third cardiologist refereed in cases of disagreement. The adjudication was based on all available clinical and imaging parameters (e.g. ECG, Echo, MRI, Angiography), as well as hs-cTnT. The criteria for the diagnosis of MI were based on the fourth universal MI definition.

ROMI
After approval from the research ethics board, the Optimal Troponin Cutoffs for acute coronary syndrome in the ED (ROMI-3: Rule-out MI 3h; registration at www.clinicaltrials.gov, NCT01994577) study prospectively enrolled consecutive adults (18 years or older) who presented to the ED with symptoms suggestive of ACS and for whom the ED physician ordered cardiac troponin. The adjudication process was led by an emergency physician with at least 2 of the study authors independently adjudicating the outcomes with disagreements not resolved by consensus referred to a third reviewer. All outcome adjudicators were blinded to the hs-cTn results, with MI in the study population defined as per the Third Universal Definition with the contemporary Abbott cTnI assay.

SAMIE
The Suspected acute MI in Emergency (SAMIE) study included 2022 patients presenting to the Emergency Department of five Australian hospitals. All patients were enrolled between November 2020 and September 2021. The inclusion criteria were age ≥ 18 years and investigation for acute MI by the treating clinician. Exclusion criteria were presentation with ST-elevation MI and direct transfer for cardiac catheterization, transfer from another hospital, previous enrolment within 30 days, pregnancy, unwillingness or inability to provide informed consent, and recruitment being considered inappropriate (e.g., palliative patient). The gold standard diagnosis was adjudicated in a blinded fashion by one cardiologist according to the fourth Universal Definition of MI. A second cardiologist reviewed all cases of type 2 MI, with a third clinician reviewing any discrepancies. The SAMIE study was registered at https://anzctr.org.au (ACTRN12621000053820).

SEIGE and SAFETY
The SEIGE and SAFETY studies are prospective, observational studies of consecutive patients presenting to the ED in whom hs-cTnI measurements were obtained. The purpose of the studies is to evaluate the clinical performance of the Siemens Atellica VTLi point of care hs-cTnI test system (SEIGE) and the Atellica IM central laboratory hs-cTnI assay (SAFETY) for the diagnosis and rule out of MI in patients presenting to the emergency department in whom serial cTnI measurements are obtained on clinical indication. All participants were enrolled between October 2020 and October 2021 at the Hennepin County Medical Center (Minneapolis, MN, USA). The gold standard diagnosis was adjudicated in a blinded fashion by two physicians independently according to the fourth Universal Definition of MI. The study was registered at www.clinicaltrials.gov (NCT04772157).

STOP-CP
The High Sensitivity Cardiac Troponin T to Optimize Chest Pain Risk Stratification (STOP-CP) study has been described previously (17). Briefly, the study included 1457 patients presenting to the ED across eight sites in the United States with suspected acute coronary syndrome. All patients were enrolled between July 2013 and December 2019. The inclusion criteria were suspected ACS, age ≥ 21 years and the ability to provide written informed consent. Patients were excluded in case of ST-segment elevation MI at ED presentation, systolic blood pressure <90 mm Hg, a life expectancy of <90 days, a noncardiac illness requiring admission, lack of capacity to provide consent, inability to be contacted for follow-up, not speaking English, pregnancy, or prior enrollment in the current study. The gold standard diagnosis was adjudicated in a blinded fashion by two physicians independently according to the fourth Universal Definition of MI. In cases of disagreement, a third physician refereed. The STOP-CP study was registered at www.clinicaltrials.gov (NCT02984436).

UTROPIA
The Use of TROPonin In Acute coronary syndromes (UTROPIA) study is an observational cohort study performed at the Hennepin County Medical Center, Minneapolis, Minnesota, United States. Patients presenting to the emergency department within the defined study period (February 2014 to June 2016) were considered for inclusion. Participants were included if they had two or more cTnI values ordered for any clinical indication with a specimen available for the hs-cTnI assay, were 18 years of age or older, had an ECG done on admission, and agreed to research disclosure. All cases with at least one hs-cTnI >99th percentile were adjudicated according to the Third Universal Definition of MI consensus recommendations by two clinicians following review of all available medical records including 12-lead ECG, echocardiography, angiography, hs-cTnI concentrations, and clinical presentation. Cases with an adjudication discrepancy were reviewed and adjudicated by a third senior clinician. The study was registered at www.clinicaltrials.gov (NCT02060760).

Outcome variable
The primary outcome was acute Non-ST-elevation MI at index presentation (yes/no). Individuals without a diagnosis were excluded.
The secondary outcome was the 30-day incidence of death or MI.

Independent variables (candidate features)
Both troponin-related variables (features; see Table 2) and non-troponin variables, termed patient-specific features, were considered during model development (Table 3). Six different high-sensitivity troponin assays were available for model development (Table 4). Non-high-sensitivity troponin assays did not qualify for model development.
Troponin measurements were log-transformed using the natural logarithm to reduce skewness. For the same reason, odd roots of the troponin rate were examined, and the 5th root was selected. Log troponin measurement features and limit of detection indicators were multiplied by 10 to reduce the magnitude of the coefficients obtained in logistic regression. One troponin measurement was randomly selected in case more than one troponin measurement was available at follow up.
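The transformations described above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the definition of the troponin rate as concentration change per hour is an assumption, since the text does not define it here. An odd root is used because, unlike an even root, it is defined for negative rates (falling troponin).

```python
import math


def log_troponin_feature(concentration):
    """Natural log of a (positive) troponin concentration, multiplied by 10."""
    return 10.0 * math.log(concentration)


def troponin_rate(first, second, hours_between):
    """Assumed definition: change in concentration per hour between samples."""
    return (second - first) / hours_between


def signed_fifth_root(x):
    """5th root that keeps the sign, so falling troponin gives a negative value."""
    return math.copysign(abs(x) ** (1.0 / 5.0), x)


# Hypothetical patient: 5 ng/L at presentation, 69 ng/L two hours later.
rate = troponin_rate(5.0, 69.0, 2.0)   # 32 ng/L per hour
rate_feature = signed_fifth_root(rate)  # approximately 2
```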

Model development (derivation)
A five-step approach was taken for model development using the BACC study:
1. Imputation of missing data: Missing data in BACC were imputed using multiple imputation with mice (19). Five imputed datasets were generated; one randomly selected imputed dataset was kept for analyses. A subject was kept for troponin assay-specific analyses only if at least one measurement of that assay was available for the subject.
2. Full machines: Different learning machines were computed using all candidate features as input. The machine learning methods used in this project are described in Table 5. The machines lr, lrbs, lrus, glmboost, lrrcs, en, mars, gbm, rf, rffs, and svm were computed using the candidate features defined above. Support vector machines (SVM) were only used for methodological comparisons.

Performance measures
The following three measures were calculated to determine the performance of machines and super learners:
• LogLoss (primary measure): the negative mean log-likelihood of the observed outcomes under the estimated probabilities.
• Brier score (secondary measure): the mean squared difference between the estimated probability and the observed outcome.
• AUC (secondary measure): the definition of the AUC has been provided, e.g., by Wang and Guo (21). We stress that the AUC measures classification performance. The variance expression has been provided by Wang and Guo (21). R code for performing the variance calculations is available in Appendix C of Wang and Guo (21). The code is displayed in Section 9 of this document.
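For orientation, the three performance measures can be computed as in the following Python sketch. It uses the standard textbook definitions and is not the authors' R code; in particular, it does not reproduce the AUC variance calculation of Wang and Guo. The clipping constant `eps` and the toy data are assumptions.

```python
import math


def log_loss(y, p, eps=1e-15):
    """Negative mean Bernoulli log-likelihood; p is clipped away from 0 and 1."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return -total / len(y)


def brier_score(y, p):
    """Mean squared difference between estimated probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def auc(y, p):
    """Share of (case, non-case) pairs ranked correctly; ties count one half."""
    cases = [pi for yi, pi in zip(y, p) if yi == 1]
    controls = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in cases for b in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: outcomes and estimated MI probabilities.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(log_loss(y, p), brier_score(y, p), auc(y, p))
```

LogLoss and the Brier score are sensitive to how well the probabilities themselves are calibrated, whereas the AUC depends only on their ranking, which is consistent with the remark above that the AUC measures classification performance.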
Cross-validation procedure
Estimation was done using 10-fold cross-validation (10CV). Specifically, the dataset was randomly divided into v = 10 equally sized parts, the so-called folds. Each fold was removed from the dataset in turn; all machines were fitted using the remaining 9 folds, and their performance was evaluated in the left-out fold. At the end of this procedure, 10 estimates of the machines and their performance measures were obtained. The average over the 10 estimates was taken as the performance measure. Machines were only compared for the same troponin assay, and the same 10 folds were used to allow fair comparison of the machines. When a machine was fitted on the selected 9 folds, all the computations required to obtain the machine were performed inside the 9 folds. In particular, if there were parameters that required tuning, this was done inside the 9 folds, possibly with a new CV inside the 9 folds.
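A minimal Python sketch of this procedure is shown below. It is illustrative only: the function names, the seed, and the toy "machine" that simply predicts the training prevalence are assumptions. The point is that the fold assignment is created once and reused for every machine, and that fitting (including any tuning) sees only the nine training folds.

```python
import random


def make_folds(n, v=10, seed=2023):
    """Assign each of n observations to one of v folds of near-equal size."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    fold_of = [0] * n
    for position, i in enumerate(order):
        fold_of[i] = position % v
    return fold_of


def cross_validate(x, y, fit, score, fold_of):
    """Average the score over folds; fit() only ever sees the training folds."""
    n_folds = max(fold_of) + 1
    scores = []
    for k in range(n_folds):
        train = [i for i, f in enumerate(fold_of) if f != k]
        test = [i for i, f in enumerate(fold_of) if f == k]
        model = fit([x[i] for i in train], [y[i] for i in train])
        predictions = [model(x[i]) for i in test]
        scores.append(score([y[i] for i in test], predictions))
    return sum(scores) / n_folds


def fit_prevalence(x_train, y_train):
    """Toy machine (hypothetical): always predicts the training prevalence."""
    prevalence = sum(y_train) / len(y_train)
    return lambda _features: prevalence


def brier(y_true, p):
    return sum((pi - yi) ** 2 for yi, pi in zip(y_true, p)) / len(y_true)


# Hypothetical toy data with 25% events.
x = list(range(40))
y = [1 if i % 4 == 0 else 0 for i in x]
folds = make_folds(len(x))  # created once, reused for every machine
cv_brier = cross_validate(x, y, fit_prevalence, brier, folds)
```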
3. Feature selection: The number of features was reduced using the CV procedure described above. Six of the full machines were used for the feature selection: lrbs, lrus, glmboost, en, mars and rffs. Since the same 10 CV folds were utilized in all steps of model development, the CV procedure was also used for feature selection. The following criteria were used to select features that were subsequently used in the reduced machines:
• If a feature was selected ≥ 5 times out of the 10 CV folds for ≥ 1 assay model, the feature was kept.
• Identical features were selected across assays. This means that if a feature was selected for a specific assay, it was automatically kept for the other assays.
• Age was kept.
• Sex was kept.
• If a troponin measurement was selected, the corresponding limit of quantification indicator for that measurement was kept, and vice versa.
4. Reduced learning machines: Learning machines were recomputed using the selected features only. Analyses were restricted to machines without intrinsic feature selection. lrbs, lrus and rffs were thus not computed on the reduced feature set.
5. Super learner: The four learning machines with the best LogLoss performance on the reduced feature set across all assays were used to develop super learners (SL) with equal weights (Slew). For the single hs-cTn measurement model, scatterplots showed that the cross-validated MI probabilities of en, glmboost and lr were almost identical. For this reason, only glmboost was selected because it performed slightly better than en and lr. An optimized SL was obtained by weighting the machines in a convex combination estimated with a linear lasso with non-negative, normalized coefficients. The LogLoss and 10CV were used to select the lasso penalization parameter. Slews combining the best two, three, and four reduced machines were developed, and their performance was evaluated using the LogLoss. Their performance was compared with the other reduced machines and the optimized SL. The SL with the best performance across troponin assays was selected as the machine for estimating the probability of AMI.
6. Comparison of learning machines: To demonstrate the adequacy of the approach taken, the performance of the Slew in the reduced model was compared to a) the single machines, b) the Slew of the full models, c) the Slew of the models including troponin only, d) the Slew of full models also including eGFR, e) the Slew of reduced models also including eGFR. For these comparisons across troponin assays, we used the method described by Demšar (22). In brief, we first used the Iman and Davenport modification of the Friedman test (23) as a global test. This was followed by the Nemenyi post-hoc test (24).
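The feature selection criteria of step 3 can be expressed as a small rule, sketched here in Python. The function name, the feature names, and the input layout are hypothetical. In particular, how selections by the six selecting machines were aggregated into one count per assay and feature is not specified above, so the sketch assumes the counts (0 to 10 folds) are already given.

```python
def select_features(selection_counts, always_keep=("age", "sex"),
                    min_folds=5, paired=()):
    """Apply the keep rules across assays.

    selection_counts: {assay: {feature: number of CV folds in which the
    feature was selected}} (assumed to be pre-aggregated).
    paired: (troponin feature, indicator feature) pairs kept together.
    """
    kept = set(always_keep)  # age and sex are always kept
    for counts in selection_counts.values():
        for feature, n_folds in counts.items():
            if n_folds >= min_folds:  # kept if selected for at least one assay
                kept.add(feature)
    for measurement, indicator in paired:
        if measurement in kept or indicator in kept:
            kept.update((measurement, indicator))
    return sorted(kept)  # the same feature set is used for every assay


# Hypothetical counts for two assays.
counts = {
    "assay_A": {"ecg_ischemia": 7, "diabetes": 2, "log_tn0": 10},
    "assay_B": {"ecg_ischemia": 3, "diabetes": 4, "log_tn0": 10, "smoking": 5},
}
selected = select_features(counts, paired=[("log_tn0", "tn0_below_limit")])
```

In this toy example `diabetes` is dropped because it never reaches five folds for any assay, while `smoking` is kept for both assays although it was selected for only one.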
en: Elastic net logistic regression. The elastic net is a mixture of the ridge ($L_2$ penalization) and the lasso ($L_1$ penalization) (26). Like both of these, the elastic net shrinks the coefficients towards zero, and like the lasso it performs feature selection. The elastic net mixing parameter and the penalization parameter were estimated by 10-fold cross-validation (10CV).
gbm: Gradient boosting machine with trees as the base learners. The following hyperparameters were estimated by minimization of the LogLoss using 10CV: depth of the tree, number of trees, learn rate, reduction in the loss function required to split further, proportion of candidate variables sampled at each split, proportion of observations used to fit each tree and minimum number of observations in a terminal node. The model is described in detail below.
rf, ranger (ranger): Random forest (RF) in regression mode with subsampling and maximally selected rank statistics as split rule (28). The number of variables to consider for splitting at each node (mtry), the minimal terminal node size (nodesize) and the sample fraction (sample.fraction) were tuned. The ranger function from the ranger package was used (29).
svm, ksvm (kernlab); tidymodels for tuning: Support vector machines with radial basis function kernel. Minimization of the LogLoss using 10CV was used to select the parameters cost of constraints violations and the inverse kernel width for the radial basis function.
sl: Super learner (SL). The SL merges different machines by assessing their performance and then generating an optimal convex combination of the predicted probabilities from the different machines (31). Ten-fold cross-validation and the LogLoss were used to select the optimal combination.
slew: Super learner with equal weights. This simpler version of the SL uses a convex combination with identical weights.
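The difference between sl and slew comes down to the weights of the convex combination. A minimal Python sketch, with hypothetical function names and made-up probabilities:

```python
def convex_combination(machine_probs, weights):
    """Weighted average of per-machine probabilities for one patient.

    The weights must be non-negative and sum to 1 (convexity), so the result
    is again a probability.
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * p for w, p in zip(weights, machine_probs))


def slew_probability(machine_probs):
    """Super learner with equal weights: the plain mean of the machines."""
    k = len(machine_probs)
    return convex_combination(machine_probs, [1.0 / k] * k)


# Hypothetical MI probabilities from four machines for one patient.
probs_one_patient = [0.10, 0.20, 0.30, 0.40]
p_slew = slew_probability(probs_one_patient)
```

For the optimized sl, the weights would instead be estimated from cross-validated predictions; that estimation step is not shown here.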

Validation of model
The assay-specific models were applied in stenoCardia for validation. Multiple imputation was performed as in the BACC data. Probability estimates were recalibrated using the four different approaches described in Section 6.

Calibration
Four calibration approaches were employed in both the validation and the generalization step: no calibration, logistic calibration, logistic calibration with restricted cubic splines and Elkan's approach to recalibration. The performance was evaluated using the measures described in Section 4, item 2.

Probability estimation - uncalibrated
Probability estimates were obtained for the validation data and for each of the m imputed datasets from generalization. Probability estimates were calculated for the three single machines included in the Slew and the Slew for the single hs-cTn measurement model, and for all four machines included in the Slew and the Slew for the serial hs-cTn measurement model.

Logistic regression with restricted cubic splines (RCS) (logreg-RCS)
To calibrate probability estimates, we used logistic regression with restricted cubic splines (RCS) with k = 4 knots (32). In case of numerical instabilities, we reduced the number of knots. The locations of the knots were chosen using the quantiles of the data (32).

Logistic regression (logreg)
For the sake of comparison with logreg-RCS, we also used logistic regression calibration, i.e., a logistic regression was fitted to the outcome using the linear predictor from the probability of NSTEMI as the explanatory variable.
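Plain logistic calibration can be sketched in Python as below. This is an illustration, not the authors' R implementation: it assumes that the linear predictor is the logit of the estimated probability, fits the intercept and slope by Newton-Raphson, and uses made-up data. The restricted cubic spline variant would replace the single logit term by a spline basis and is not shown.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_calibration(p_old, y, iterations=25):
    """Fit P(y = 1) = expit(a + b * logit(p_old)) by Newton-Raphson."""
    z = [logit(p) for p in p_old]
    a, b = 0.0, 1.0  # start at 'no recalibration'
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for zi, yi in zip(z, y):
            mu = expit(a + b * zi)
            w = mu * (1.0 - mu)
            g_a += yi - mu          # score for the intercept
            g_b += (yi - mu) * zi   # score for the slope
            h_aa += w
            h_ab += w * zi
            h_bb += w * zi * zi
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


def recalibrate(p, a, b):
    return expit(a + b * logit(p))


# Hypothetical uncalibrated probabilities and observed outcomes.
p_old = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.15, 0.85, 0.5]
y_obs = [0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1]
a_hat, b_hat = fit_logistic_calibration(p_old, y_obs)
p_cal = [recalibrate(p, a_hat, b_hat) for p in p_old]
```

After fitting, the mean recalibrated probability equals the observed event proportion in the calibration data, which is a property of the logistic likelihood equations.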

Elkan's approach
The calibrated probability can be estimated using the prevalence of MI in the old and new populations using the update formula (33)
$$p' = P(Y = 1 \mid x) = \pi' \, \frac{p\,(1 - \pi)}{\pi - p\,\pi + \pi' p - \pi\,\pi'},$$
where $\pi$ ($\pi'$) is the prevalence in the old (new) population, and $p$ ($p'$) is the probability estimate in the old (new) population.
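A prevalence-based update of this kind can be sketched in Python as follows, assuming the update formula is the base-rate adjustment from Elkan's work on cost-sensitive learning, p_new = prev_new * (p_old - p_old * prev_old) / (prev_old - p_old * prev_old + prev_new * p_old - prev_old * prev_new). The function and argument names are hypothetical.

```python
def elkan_update(p_old, prev_old, prev_new):
    """Shift a probability estimate from the old to the new prevalence.

    Assumes the prevalence is the only thing that differs between the two
    populations (the distribution of features within cases and within
    non-cases is unchanged).
    """
    numerator = prev_new * (p_old - p_old * prev_old)
    denominator = (prev_old - p_old * prev_old
                   + prev_new * p_old - prev_old * prev_new)
    return numerator / denominator


# Hypothetical example: model derived at 20% prevalence, applied at 10%.
p_shifted = elkan_update(0.30, prev_old=0.20, prev_new=0.10)
```

Two sanity checks follow directly from the formula: with identical prevalences the estimate is unchanged, and an estimate equal to the old prevalence is mapped to the new prevalence.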

Generalization of model
The assay-specific models for the single and serial hs-cTn measurement models were applied to the international cohort studies described in Section 2.3. Calibration analyses were performed as described for stenoCardia.

Performance measures
For each cohort of interest, the LogLoss, Brier score and AUC were calculated. The point estimate and variance of the LogLoss, Brier score and AUC were pooled across the m imputed datasets using Rubin's rules (RR).

Classification performance
The classification performance was quantified by pooling the sensitivity, specificity, as well as the positive and negative predictive values. To quantify the variance of these parameters across the imputed datasets, the binomial variance estimator $\widehat{\mathrm{Var}}(\hat{p}) = \hat{p}(1-\hat{p})/n$ was used.
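Pooling a proportion such as the sensitivity across m imputed datasets can be sketched in Python as below. The sketch assumes that pooling follows Rubin's rules with the binomial variance p(1-p)/n as the within-imputation variance; the function names and numbers are made up for illustration.

```python
def binomial_variance(p_hat, n):
    """Binomial variance estimator p(1 - p) / n for a proportion."""
    return p_hat * (1.0 - p_hat) / n


def rubin_pool(estimates, variances):
    """Rubin's rules: pooled estimate and total variance over m imputations."""
    m = len(estimates)
    pooled = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Hypothetical sensitivities from m = 3 imputed datasets, 100 MI cases each.
sens = [0.90, 0.92, 0.94]
sens_var = [binomial_variance(s, 100) for s in sens]
pooled_sens, pooled_var = rubin_pool(sens, sens_var)
```

The total variance is larger than the average binomial variance because it also carries the between-imputation variability, i.e., the uncertainty due to the missing data.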

Meta-analysis: pooling of estimates across studies
Estimates were pooled across cohorts using random effect (RE) meta-analysis according to DerSimonian and Laird (4).
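The DerSimonian and Laird estimator can be sketched in Python as follows. This is the standard method-of-moments formulation rather than the authors' code; the function name and the toy inputs are assumptions, and at least two cohorts are required.

```python
def dersimonian_laird(estimates, variances):
    """Random effects pooling of per-cohort estimates (needs >= 2 cohorts).

    Returns the pooled estimate, its variance, and the between-cohort
    variance tau^2.
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]  # fixed effect (inverse variance) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)  # truncated at zero
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    return pooled, 1.0 / sum(w_re), tau2


# Hypothetical AUC estimates from two cohorts with equal variances.
pooled_auc, pooled_auc_var, tau2 = dersimonian_laird([0.90, 0.96],
                                                     [0.0001, 0.0001])
```

When the cohort estimates differ by more than their sampling variances would explain, tau^2 becomes positive and the weights are pulled towards equality, which widens the confidence interval of the pooled estimate.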

Plots
For visualization, locally estimated scatterplot smoothing (LOESS) curves relating MI to the calibrated predicted probability were created.

Incorporation of the Siemens Atellica VTLi
While the other five investigational hs-cTn assays were all available at time of model development, the Atellica VTLi was added to the super learners post-hoc by estimating a total least squares regression from the Atellica VTLi hs-cTnI to the Atellica hs-cTnI using 135 subjects from BACC with two troponin measurements available each on both assays. Total least squares (TLS) regression, also termed Deming regression, with intercept and slope was estimated to predict Siemens Atellica troponin concentrations from Siemens Atellica VTLi troponin concentrations. The single and serial hs-cTn measurement models were applied after predicting the Atellica troponin concentrations from the Atellica VTLi hs-cTn concentrations.
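The regression step can be sketched in Python as below. The closed-form Deming slope is standard; the error variance ratio `delta` is set to 1, which corresponds to total least squares with equal error variances on both assays. Whether the authors regressed raw or transformed concentrations is not stated here, so the sketch uses raw values, and all names and numbers are hypothetical.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Deming regression of y on x with intercept and slope.

    delta is the ratio of the error variance in y to that in x; delta = 1
    gives orthogonal (total least squares) regression.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    s_yy = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    diff = s_yy - delta * s_xx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * delta * s_xy ** 2)) / (2.0 * s_xy)
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(x_new, intercept, slope):
    """Predicted concentration on the target assay from the source assay."""
    return intercept + slope * x_new


# Hypothetical paired concentrations (source assay x, target assay y).
x_pairs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pairs = [2.5, 3.0, 3.5, 4.0, 4.5]
b0, b1 = deming_fit(x_pairs, y_pairs)
```

Unlike ordinary least squares, this fit treats both assays as measured with error, which is the usual motivation for Deming regression in method comparison.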

Sample size considerations
Sample size considerations were done for the derivation, validation, and generalization cohorts as well as the sum over all generalization cohorts using the approach of Riley et al. (34). To this end, sample sizes were fixed as well as the proportion of myocardial infarctions. The precision for the MI probability was estimated. In the derivation cohort, the precision was well below 0.015; for the validation cohort, it was 0.0170. Finally, for each cohort from the generalization studies, the precision was between 0.01 and 0.023.

Calculation of the variance of the AUC
The code displayed in this section follows the code provided by Wang and Guo (21).
#mydata: a data frame with a column called "Y" for the binary outcome variable
#myfit: a glm object (