Introduction

Kawasaki disease (KD) is a life-threatening acute vasculitis that diffusely affects multiple organ systems in children. Coronary artery dilatations and aneurysms can occur and represent the most serious KD complications1,2. For most patients, KD is self-limited and lacks the chronic nature of other autoimmune diseases; however, the pathological walls of afflicted vessels show a propensity for forming thromboses and aneurysms3. If KD is untreated or treatment fails, the vasculitis can lead to coronary aneurysm or thrombosis in 20–25% of cases, potentially resulting in ischemic heart disease, myocardial infarction, or death4,5. Clinical trials conducted in the 1980s and 1990s showed that intravenous immunoglobulin (IVIG) treatment dramatically reduced the occurrence of persistent coronary artery aneurysm (CAA), defined primarily by Japanese Ministry of Health criteria6,7. These criteria classified coronary artery diameters ≥3 mm in children <5 years and ≥4 mm in children ≥5 years as abnormal. Echocardiographic detection and definition of significant CAA have improved dramatically over the past 2 to 3 decades. A 2007 study by the National Heart, Lung, and Blood Institute (NHLBI) showed that approximately 18 to 20% of patients had persistent CAA determined using coronary artery z scores8,9. The 2017 American Heart Association (AHA) guidelines adjusted the definition of CAA to a z score ≥2.5, with a z score ≥5 defining a medium to large CAA1. Although these coronary abnormalities show higher prevalence in IVIG-refractory patients, they can still occur in patients who seemingly respond to treatment and show fever resolution. Most children with large aneurysms require daily lifelong anti-coagulation, often with painful twice-daily low molecular weight heparin injections, as warfarin is difficult to maintain within the therapeutic range in children.

A study based on a large-scale Japanese cohort reported that coronary events did not occur in patients with small CAA; however, 5% of patients with medium CAA and 35% of patients with large CAA had coronary events10. North American studies support these data from Japan11. The most severe form, giant coronary artery aneurysm (GCA), is associated with complications such as luminal narrowing, thrombosis, and major cardiac events11,12, and substantially alters quality of life for KD patients. Children with GCA require lifelong anti-coagulation, exercise restrictions, and often treatment for ischemic heart disease such as coronary stenting and/or bypass grafting1. The majority of patients with GCA develop clinically important stenosis from intimal hypertrophy during the late convalescent phase13. Japanese males with coronary artery involvement have a mortality rate 2.4 times higher than expected in the general population14, but the overall impact on KD patients in the U.S. still requires definition. A recent systematic review showed that mid- to large-sized CAA was the most significant risk factor for reduced survival of patients with KD15. Mid- to large-sized CAA showed slower recovery and a worse prognosis than smaller CAA16. Another study reported that the high persistence probability of mid- to large-sized CAA significantly increased the cardiovascular risk at 1 year after KD onset, when approximately two-thirds of the acute myocardial infarction cases occur17.

Predicting increased risk for persistent large (medium to giant) size coronary aneurysms despite IVIG treatment is clinically important for intensification of treatment and disease management. Algorithms combining clinical and lab data in Japanese populations predict risk with respect to persistent CAA18. However, the Japanese algorithms show poor predictive value for risk in North American and European cohorts8,19, and fail even in some Asian populations20,21. Thus, no universal biomarker or algorithm accurately predicts risk for persistent CAA in North America22.

A single-center U.S. retrospective study showed that the presence of early coronary artery dilation is moderately useful in predicting persistent dilation. However, that study did not specifically evaluate larger, higher-risk aneurysms23. Currently available data indicate that KD susceptibility and treatment response depend on an individual patient’s genetic background24,25,26,27,28,29,30,31,32,33. Discrepancies among races or ethnicities also suggest that the pathogenesis of KD might vary34,35. To date, very few studies have focused on genetic risk factors for CAA development in KD patients. We performed the first whole genome sequencing (WGS) association analysis in a cohort of KD patients from a racially diverse North American population exhibiting differences in coronary artery aneurysm formation. We identified multiple loci associated with CAA formation among a pediatric KD population receiving IVIG; these loci can inform risk stratification and could potentially serve as treatment response predictors and as guidance toward new therapeutic targets.

Results

Genetic association analysis between individual SNPs and the risk of large (medium/giant) aneurysm

To identify SNPs associated with KD-associated large (medium/giant) coronary aneurysm (CCA/L), clinical data were linked to whole genome sequencing data in KD patients nested in a clinical cohort, as previously described36. Basic demographics of the study population with CCA/L (N = 91) and no aneurysm (N = 278) are described in Table 1. Principal component analysis confirmed a good match between KD patients with CCA/L and those without any aneurysm. Nevertheless, three principal components (PCs) were adjusted for in the analyses, as estimated in a previous study36. A quantile-quantile plot indicated that population stratification had negligible effects on the statistical results (λ genomic control = 0.955). Several SNPs exhibited suggestive statistical significance (p < 1.0E–05) in the additive genetic model, as shown in the Manhattan plot (Fig. 1). Of all the SNPs examined, rs62154092 in an intergenic region (nearest gene ACTR3BP2) was the most statistically significant (p = 6.32E–08). Among the overall top 10 most significant SNPs, 5 SNPs (all intergenic: rs1424006606, rs1396081550, rs1258107032, rs1379390981, rs1424309393) were on chromosome 20, although in different regions. All SNPs statistically significant at p < 1.0E–04 are listed in Supplementary Table S1. Among the non-intergenic SNPs, rs28730284 upstream of the KLRC2 gene was the most statistically significant (p = 2.20E–07). The top 15 non-intergenic SNPs were mostly intronic SNPs located in ZMAT4 (rs9643846, rs9643847, rs57504215, rs60545202, rs59556769, rs73677451, rs12676292, rs4332118, rs6988966), with others in LOC100127 (non-coding RNA; rs10276547, rs10280266), PTPRD (intronic; rs600075, rs5896385) and TCAF2 (intronic; rs1218424730). The most significant exonic SNPs (rs11259953 and rs11259954) were in the WHAMM gene (Supplementary Table S1). Regional association plots with clusters of SNPs in linkage disequilibrium (LD) on chromosomes 7, 8 and 9 are shown in Fig. 1b–d.
Results of corresponding single SNP association with developing any CAA (N = 233), any persistent CAA (N = 145) for 2 years, and P-CCA/L for 2 years (N = 79) vs no aneurysm (N = 276) are shown in Fig. 2 and Supplementary Table S1.
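The genomic control value reported above (λ = 0.955) is conventionally the ratio of the median observed association test statistic to the median expected under the null. The following is a minimal sketch of that calculation, assuming each SNP p-value comes from a two-sided test with one degree of freedom; the function name and the input list are illustrative and are not taken from the study's pipeline.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 df (about 0.4549),
# obtained as the squared 75th percentile of the standard normal.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_control_lambda(p_values):
    """Return lambda = median(observed chi-square) / median(expected chi-square).

    Each two-sided p-value (0 < p <= 1) is converted back to a 1-df
    chi-square statistic by squaring the matching standard normal quantile.
    """
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical usage with evenly spread (null-like) p-values.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_control_lambda(null_like)
```

Values near 1 indicate little inflation of the test statistics, which is the interpretation given to λ in the text.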

Table 1 Kawasaki Disease Patients With (medium/large, any) and Without Coronary Aneurysm included in the whole genome sequencing analyses
Fig. 1: Overall and regional association results.

a Manhattan plot displaying whole genome sequence association results for large (medium/giant) coronary aneurysm (N = 92) vs no coronary artery aneurysm (N = 276). Negative log10-transformed P values from the logistic regression model (additive model) are plotted on the y-axis and the SNP genomic locations on the x-axis (colors represent different autosomal chromosomes). LocusZoom plots for selected gene regions on (b) chromosome 7, with rs10276547 the most significant SNP in the region, (c) chromosome 8, with rs9643846 the most significant SNP in the region, and (d) chromosome 9, with rs600075 the most significant SNP in the region. The left vertical axis is the –log10 of the p-value, and the horizontal axis is the chromosomal position. Each dot represents a SNP tested for association with large coronary aneurysm. Linkage disequilibrium between the most significant SNP, listed at the top of each plot, and the other SNPs in the plot is shown by the r2 legend in each plot. The right vertical axis is the recombination rate; recombination sites and rates are represented by red curves.

Fig. 2: Circos plot summarizing whole genome sequence associations.

The outer plot shows the association with large aneurysm (N = 92), the next inner plot the association with persistent large aneurysm (N = 79), the next the association with any coronary aneurysm (N = 233), and the fourth the association with any persistent coronary aneurysm, each vs no coronary aneurysm. The innermost plot indicates whether SNPs were associated with 1–4 outcomes at p < 1.0E–05.

Gene mapping

Using three gene mapping strategies (positional mapping, eQTL mapping and chromatin interaction mapping) in FUMA, we mapped the significantly associated variants (P < 1.0E–05) to genes and identified 12 genomic risk loci (Supplementary Table S2) and 48 mapped genes associated with CCA/L (Fig. 3, Supplementary Table S3). None of the genes were mapped by all three strategies. Three genes, NDUFA5, ZMAT4 and MICU2, were mapped by both positional and eQTL mapping. NDUFA5 is located on chromosome 7, and its lead SNP rs34163760 is located in an intron of the gene (P = 4.84E-06). An eQTL analysis showed that with an increasing number of risk alleles of rs34163760, there was a higher mRNA level of NDUFA5 in the esophagus. The CADD score of rs34163760 is 14.39, suggesting a deleterious variant. ZMAT4 is located on chromosome 8, and its lead SNP rs9643846 is located in an intron of the gene (P = 6.57E-07). An eQTL analysis showed that with a decreasing number of risk alleles of rs9643846, there was a higher mRNA level of ZMAT4 in the thyroid. The CADD score of rs9643846 is 15, suggesting a deleterious variant. The third gene, MICU2, is located on chromosome 13, and its lead SNP rs12585631 is located in an intron of the gene (P = 4.11E-06). An eQTL analysis showed that with a decreasing number of risk alleles of rs12585631, there was a higher mRNA level of MICU2 in the thyroid. The CADD score of rs12585631 is 15.22, suggesting a deleterious variant. Several genes on chromosomes 4, 7 and 13 were also mapped through chromatin interactions at those sites (Fig. 3, Supplementary Table S3).

Fig. 3: FUMA circos plots of mapped genes in genomic risk loci.

The outermost layer is the Manhattan plot (only SNPs with P < 0.05 are displayed). Genomic risk loci are highlighted in blue, and the strength of linkage disequilibrium (r2) between each SNP and the lead SNP is given by the following color code: red (r2 > 0.8), orange (r2 > 0.6), green (r2 > 0.4), blue (r2 > 0.2) and gray (r2 ≤ 0.2). Genes are mapped by 3-D chromatin interaction (orange), eQTLs (green), or both (red). a Circos plot for chromosome 13 with lead SNP rs12585631, (b) Circos plot for chromosome 7 with lead SNP rs10276547, and (c) Circos plot for chromosome 4 with lead SNP rs62330192.

Expression patterns of the 48 prioritized genes were estimated in 54 different tissues (Supplementary Table S4, Supplementary Fig. S1). Several of these genes show high expression in aorta and coronary artery tissues.

Genetic risk score (GRS)

The twelve genomic risk loci identified from FUMA yielded an AUC of 0.86. As shown in Fig. 4, based on the empirical distribution of the AUC from the permutation test, eP was <0.0001, suggesting a highly significant genetic risk score from the 12 genomic risk loci. For sensitivity analyses, when the GRS for CCA/L was computed separately in four specific races, AUCs of 0.83, 0.78, 0.81 and 0.97 were obtained among White, Asian, Hispanic and African American KD patients, respectively.

Fig. 4: Empirical curve based on the area under the curve (AUC) from 10,000 permutations.

The x-axis is the AUC value after a random permutation of the outcome variable (medium to large coronary aneurysm, z > 5.0) based on the 12 genomic loci from FUMA in predicting the risk score of having a large aneurysm, and the y-axis is the frequency. The empirical P-value (eP) is the proportion of permutations resulting in a larger AUC than the original data.

Discussion

We analyzed a relatively large North American KD cohort using whole genome sequencing. Treatment with IVIG during the acute phase was an inclusion criterion, so lack of treatment was not a confounding factor. Additionally, persistent coronary artery aneurysm (P-CAA or P-CCA/L) should be considered a failure of IVIG therapy. As noted in Table 1, the vast majority of medium to giant coronary aneurysms persisted. We identified for the most part novel gene loci that appear to have a relationship with coronary artery aneurysm formation and persistence in KD patients. We have used this same WGS strategy to identify genes related to IVIG refractoriness as defined by AHA guidelines1. Prior studies searching for CAA genetic risk variants have used either hypothesis driven strategies or genome-wide association strategy with their inherent limitations. In this study, we specifically used the newest AHA classifications, and sought to determine genetic associations with CCA/L (Z ≥ 5) as these show a lower chance for early regression than do smaller aneurysms (Z ≥ 2.5, but < 5). However, we also found that statistical genetic associations were consistent among all coronary phenotype groups as shown in Table 2 and Fig. 2. This suggests that pathobiology or at least genetic risk is consistent regardless of size of the aneurysm or propensity for regression.

Table 2 Top genes associated with large (medium/giant) coronary aneurysm (CCA/L) (z ≥ 5), persistent CCA/L (P-CCA/L), any coronary artery aneurysm (CAA) (z ≥ 2.5), and persistent CAA (P-CAA)

The most significant SNP related to medium to giant aneurysm was located in an intergenic region, with the closest gene being ACTR3BP2, a pseudogene with unknown function. However, among the top SNPs in gene regions, rs28730284 was located just upstream of the KLRC2 gene and has intriguing potential biological relevance to KD. The KLRC2 gene encodes the C-type lectin NKG2C (Killer cell lectin receptor-2). Natural killer (NK) cells mediate innate immune responses against virally infected and malignant cells37,38,39. NK cell function, such as production of proinflammatory cytokines, depends on a balance between activating and inhibiting signals triggered by multiple surface receptors, including NKG2C40. Polymorphisms in KLRC2 have been shown to influence both function and expression of NK cells. Furthermore, SNPs in KLRC2 are associated with microvascular inflammation during renal graft transplant rejection. Importantly, expression of all NK cell types (CD56++CD16+/−, CD56+CD16+, CD56−CD16+) is reduced in KD patients compared to febrile or non-febrile controls, while CD56−CD16+ NK cell expression was significantly lower in IVIG-resistant patients than in IVIG-responsive patients41.

Multiple intronic SNPs were found in the ZMAT4 gene, which encodes the Zinc Finger Matrin-Type 4 protein. This gene was also identified by FUMA. SNPs within ZMAT4 are associated with diseases such as spinocerebellar ataxia and myopia42, and copy number variations are associated with hematological malignancies43. The function of this particular zinc-finger protein remains undefined, so a potential biological role in KD is unclear. The top two exonic SNPs were in the WHAMM gene. This gene encodes a nucleation-promoting factor that regulates the actin-related protein 2/3 complex. Additionally, we found a SNP (rs1052373) within the MYBPC3 exon region that was marginally significant. MYBPC3 (myosin binding protein C3) function is well established, and mutations are involved in the pathology of hypertrophic cardiomyopathy44,45. This particular SNP has also been associated with athletic endurance46. However, any suggestion of biological relevance for these exonic SNPs to KD would be highly speculative.

Multiple SNPs were also found in regions near Facioscapulohumeral muscular dystrophy (FSHD) region-1 (FRG1DP). FRG1 acts upstream of FGF2, which signals activation of the AKT/ERK signaling axis in endothelial cells. Interestingly, we previously reported that this gene is associated with IVIG response in KD patients. FRG1DP has been linked to angiogenesis and retinal vasculopathy in FSHD patients, including development of microaneurysms. Additionally, altered expression of FRG-1 protein leads to altered angiogenesis in human umbilical vein endothelial cells (HUVECs).

Using FUMA, we found several genes potentially related to CAA/L in IVIG-treated KD patients. MICU2 is a calcium-sensitive regulatory subunit of the mitochondrial calcium uniporter and is important for reducing oxidative stress, particularly in endothelial cells. MICU2−/− mice exhibit abnormal cardiac diastolic relaxation and also develop abdominal aortic aneurysms, which spontaneously rupture with only modest increases in blood pressure. NDUFA5 encodes NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 5, a critical component of mitochondrial respiratory complex I, which facilitates the translocation of protons across the mitochondrial inner membrane. A SNP in NDUFA5 was also a top hit in a small Taiwanese KD study evaluating genetic risk for CAA formation47. Thus, our findings for these two genes regulating the oxidative stress response suggest that mitochondria play a role in the development of CAA and should be a target for future research48.

We previously published a WGS pharmacogenomics analysis of IVIG response in KD using “persistent or recurrent fever” as the benchmark for IVIG resistance36. Data suggest that IVIG resistance or refractoriness is a risk factor for persistent CAA. However, we did not find a notable number of variants that were associated with both coronary aneurysms and IVIG resistance. This lack of overlap in genetic associations between the two outcomes suggests that their biological pathways and mechanisms may differ. Nevertheless, the list of novel genes provides new insights into the pathogenesis of KD.

Genetic analyses of coronary aneurysms in KD are complicated by multiple potential confounding factors. Prior studies, which have identified numerous significant variants associated with coronary aneurysms, did not account for the IVIG therapeutic effect49,50,51. Clinical trials clearly show that IVIG reduces the risk of CAA52. However, those previous genetic studies for the most part do not clarify whether study participants received appropriate IVIG treatment. Thus, those prior cohorts could, and probably do, include patients who did not receive timely IVIG. We used strict criteria for IVIG treatment in our study subjects in accordance with a pharmacogenomics design. Accordingly, our results, obtained using a different design and methodology, did not replicate findings from prior studies such as associations with coronary aneurysms for variants in ITPKC51, KCNN250, NEBL and TUBA3C47, SLC8A153, and the matrix metalloproteinase (MMP) gene family54. Likewise, Huang et al.55 reported that TET mRNA levels were associated with IVIG and DNMT1 mRNA levels with CAA; however, no SNPs in these gene regions were statistically significant in our study.

Two other studies indicated association of TIFAB56 and PLCB157 genes. Although the same SNPs were not replicated in our study, we found several SNPs in these gene regions that were associated with CAA in our study (Table S1). Unlike studies predominantly using Asian populations, our cohort included 4 races (Whites, Asians, African-Americans, and Hispanics). We had fewer cases of African-Americans and Hispanics and we were underpowered to conduct race-specific analyses (Supplementary Table S5). However, in the main combined cohort, we adjusted for three principal components which resulted in λ genomic control = 0.955, suggesting no major confounding factors (Supplementary Figure S2).

We also used the 12 genomic loci from FUMA to build a genetic predictive risk score and test overall prediction of risk for developing CAA/L. The AUC based on these markers is promising, and the observed AUC was considerably higher than the AUCs obtained from 10,000 random permutations of case and control status. Although the sample sizes are small, all race-specific analyses showed similar trends with high AUC. These specific markers need to be validated in different IVIG-treated KD populations; however, there is potential clinical utility in developing a point-of-care assay based on panels of genetic risk markers to predict who will develop CAA/L and/or other sequelae. Accurate prediction can assist in developing a treatment plan during the acute KD phase58.

The main limitation for this study is the lack of a validation cohort. This is a common limitation of clinical trials and pharmacogenomics studies of rare diseases, although our cohort is the largest reported to date with a clear IVIG treatment phenotype. Further validations as well as functional studies of these variants will be needed in the future.

In summary, using WGS we have identified several novel genes and loci, which could have a functional impact on coronary artery response to IVIG in KD. Additionally, these loci could be used in identifying new personalized therapeutic avenues as well as developing an important predictive risk score for persistence of coronary artery aneurysms despite IVIG treatment.

Methods

Study populations and primary outcome

We performed whole genome sequencing in 504 KD patients who were diagnosed and treated with IVIG (2 g/kg as a single infusion) and aspirin36, both according to the American Heart Association (AHA) criteria1,59. All patients included in the study had echocardiography data that assessed for coronary artery aneurysm (CAA) (212 Whites, 75 Asians, 50 Hispanics and 32 Blacks). Coronary artery internal diameters in the left main coronary artery (LMCA), left anterior descending artery (LAD), and right coronary artery (RCA) were obtained by echocardiography. Coronary artery dimensions were normalized for body surface area and converted to z-scores (SDs from a predicted normal mean) based on nonlinear regression equations derived from a normal nonfebrile population60. Echocardiographic data were collected at baseline, at 2 weeks, and at 5–6 weeks or later following fever onset. The echocardiograms obtained at or after 5–6 weeks were assessed for persistence of the CAAs. We used the Boston z-score model1 and, as defined by the AHA, considered a coronary abnormality or aneurysm to be present if z ≥ 2.5. According to AHA guidelines, we categorized patients with a z score ≥ 2.5 at any time point as “CAA”. Persistent CAA (P-CAA) was defined as any aneurysm with z ≥ 2.5 persisting up to 5–6 weeks. We also categorized large CAA (CAA/L) as medium to giant coronary aneurysm with a z score ≥ 5.0 at any time, and persistent CCA/L (P-CCA/L) if an aneurysm with z ≥ 5.0 remained up to 5–6 weeks. Genomic comparisons were made for each of the 4 categories versus those without any aneurysm (z < 2.5).
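The four outcome categories defined above can be written as a small classification rule. The sketch below is illustrative only: it assumes that each echocardiogram has already been reduced to a single maximum z-score across the LMCA, LAD and RCA, and that visits have been split into those before and those at or after 5–6 weeks. The function and key names are hypothetical, not part of the study's software.

```python
from typing import Dict, Sequence

ANEURYSM_Z = 2.5  # AHA threshold for any coronary abnormality or aneurysm
LARGE_Z = 5.0     # threshold for medium to giant aneurysm


def classify_outcomes(early_z: Sequence[float], late_z: Sequence[float]) -> Dict[str, bool]:
    """Classify one patient from per-visit maximum coronary z-scores.

    early_z: z-scores from echocardiograms before 5-6 weeks (assumed input).
    late_z:  z-scores from echocardiograms at or after 5-6 weeks (assumed input).
    """
    all_z = list(early_z) + list(late_z)
    if not all_z:
        raise ValueError("at least one echocardiogram z-score is required")
    peak = max(all_z)
    # Persistence can only be judged when a late echocardiogram exists.
    late_peak = max(late_z) if late_z else float("-inf")
    return {
        "CAA": peak >= ANEURYSM_Z,           # aneurysm at any time point
        "CAA_L": peak >= LARGE_Z,            # medium to giant at any time point
        "P_CAA": late_peak >= ANEURYSM_Z,    # still z >= 2.5 at 5-6 weeks or later
        "P_CAA_L": late_peak >= LARGE_Z,     # still z >= 5.0 at 5-6 weeks or later
        "no_aneurysm": peak < ANEURYSM_Z,    # comparison group
    }


# Hypothetical patient: large aneurysm early that regresses below 5.0 but stays above 2.5.
example = classify_outcomes(early_z=[1.9, 6.2], late_z=[3.4])
```

In this sketch a patient can belong to several case categories at once, which matches the text's description of four overlapping outcomes each compared against the same no-aneurysm group.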

The parent cohort/study and this pharmacogenomic study conformed to the procedures for informed consent (parental permission) approved by institutional review boards at all sponsoring organizations. The pharmacogenomic data management and analysis procedure was approved by the University of Alabama at Birmingham Institutional Review Board (IRB). The study was conducted in accordance with the local legislation and institutional requirements, following the ethical guidelines of the Declaration of Helsinki. Written informed consent for the participation of minors (children) in this study was provided by the participants’ legal guardians/next of kin.

Whole genome sequencing and variant calling

With consent from the parents or legal guardian, whole blood or saliva was collected to extract genomic DNA, as previously described36. PCR-free libraries were generated using the BGI DNBSEQ True PCR-Free platform (Beijing Genomics Institute; Guangdong, Shenzhen, China) and whole genome sequencing was performed on the MGISEQ-2000 instrument (Beijing Genomics Institute; Guangdong, Shenzhen, China) to generate 100 bp paired-end reads, as previously described36. All reads that passed were aligned to the human reference genome (hg38) using Burrows-Wheeler Aligner (BWA) v0.7.17. The average sequencing depth was 30x per individual. The Broad Institute’s Genome Analysis Toolkit (GATK) best practices workflow was used for quality control and informatics pre-processing of the data. Variant-level QC was performed using the Variant Quality Score Recalibration (VQSR) tool from GATK, using the recommended threshold of 99% sensitivity for the “true” variants. As we previously reported, we included 5 duplicate samples, which showed overall high SNP genotype concordance, with a kinship coefficient estimate Φ > 0.497 between duplicates. For the SNPs for which we report associations, concordant genotypes were confirmed between all duplicate samples.

Whole genome sequencing (WGS) association—single-variant analysis

Intensive quality control of the genetic data, including filters on minor allele frequency (MAF), call rate (CR), and p values for Hardy-Weinberg equilibrium (HWE), was applied to remove uncertain SNPs as described previously36. This resulted in 46,718,826 variants; after excluding 21,675,492 singletons, 25,043,334 polymorphic SNPs were included in the analysis. Logistic regression models were fitted using PLINK 1.90 to examine the association of individual autosomal SNPs under an additive model in a case/control design, for the main outcome (medium/giant aneurysm; 91 cases (52 W, 3 AA, 20 His, 16 As) and 278 controls (160 W, 29 AA, 30 His, 59 As)) and the three secondary outcomes (Table 1). Age, gender and three principal components (PCs) of genetic ancestry were adjusted for in the models. Quantile-quantile (QQ) plots and Manhattan plots were produced with the qqman package in R. The crude and adjusted odds ratios (ORs) and 95% confidence intervals (CIs) were also calculated for the top hits using unconditional univariate logistic regression analysis to evaluate the associations between genotypes and medium/giant aneurysm.
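To illustrate what an additive model estimates, the sketch below fits a crude (unadjusted) logistic regression of case status on risk-allele dosage (0, 1 or 2) by Newton-Raphson and reports the per-allele odds ratio with a Wald 95% confidence interval. It is a simplified stand-in rather than the PLINK implementation used in the study: the covariates (age, gender, principal components) are omitted, and all names and the example data are hypothetical.

```python
import math


def _score_and_information(b0, b1, dosages, is_case):
    """Gradient and information matrix of the log-likelihood at (b0, b1)."""
    s0 = s1 = i00 = i01 = i11 = 0.0
    for g, y in zip(dosages, is_case):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * g)))
        w = p * (1.0 - p)
        s0 += y - p
        s1 += (y - p) * g
        i00 += w
        i01 += w * g
        i11 += w * g * g
    return s0, s1, i00, i01, i11


def additive_logistic(dosages, is_case, max_iter=50, tol=1e-10):
    """Fit logit P(case) = b0 + b1 * dosage; return per-allele OR, 95% CI, Wald p."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        s0, s1, i00, i01, i11 = _score_and_information(b0, b1, dosages, is_case)
        det = i00 * i11 - i01 * i01
        if det <= 1e-12:
            raise ValueError("dosage has no variation; model is not identifiable")
        # Newton step: beta += information^{-1} * score (2x2 inverse written out).
        step0 = (i11 * s0 - i01 * s1) / det
        step1 = (i00 * s1 - i01 * s0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    # Standard error of b1 from the information matrix at the final estimate.
    _, _, i00, i01, i11 = _score_and_information(b0, b1, dosages, is_case)
    se = math.sqrt(i00 / (i00 * i11 - i01 * i01))
    z = b1 / se
    return {
        "odds_ratio": math.exp(b1),
        "ci95": (math.exp(b1 - 1.96 * se), math.exp(b1 + 1.96 * se)),
        "p_value": math.erfc(abs(z) / math.sqrt(2.0)),  # two-sided Wald test
    }


# Hypothetical toy data: dosage of the risk allele and case (1) / control (0) status.
toy_dosage = [0, 0, 0, 1, 1, 2, 0, 1, 1, 2, 2, 2]
toy_status = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
toy_fit = additive_logistic(toy_dosage, toy_status)
```

An adjusted odds ratio would extend the same model with additional covariate columns; in practice a statistics package is preferable to hand-written optimization.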

Identification of genes and their roles using FUMA

Functional annotation was conducted in Functional Mapping and Annotation (FUMA) v1.3.061, using variants of interest from the WGS association analysis (p < 1.0 × 10−5 and all variants in r2 < 0.6 with them). Lead SNPs were defined from these independent statistically significant SNPs if pairwise SNPs had r2 < 0.1. The maximum distance between LD blocks to merge into a genomic locus was 250 kb. The genetic data of the mixed population in 1000 Genomes Phase 3 were used as the reference to estimate LD. Three methods were used to map SNPs to genes: (a) physical distance (within a 10-kb window) from known protein-coding genes in the human reference assembly, (b) expression quantitative trait loci (eQTL) variant mapping using62, and (c) 3D chromatin interaction mapping (Hi-C)63. Combined Annotation-Dependent Depletion (CADD) analysis64 with a minimum score of >12.37 (considered to be suggestive of deleteriousness) was used to filter the variants. Annotation of enhancers65 and tissue-specific expression of genes identified via Hi-C and eQTL mapping62 were queried in the FUMA tool and the Genotype-Tissue Expression (GTEx) database (https://gtexportal.org/home/).
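The locus-merging rule described above (LD blocks no more than 250 kb apart are combined into one genomic locus) can be sketched as a simple interval merge. This is an illustrative approximation under stated assumptions, not FUMA's own code: the tuple layout and function name are hypothetical, and each LD block is assumed to be summarized by a chromosome number and start/end base-pair positions.

```python
from typing import List, Tuple

Block = Tuple[int, int, int]  # (chromosome, start_bp, end_bp); assumed layout


def merge_ld_blocks(blocks: List[Block], max_gap_bp: int = 250_000) -> List[Block]:
    """Merge LD blocks on the same chromosome whose gap is at most max_gap_bp."""
    merged: List[Block] = []
    for chrom, start, end in sorted(blocks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap_bp:
            prev_chrom, prev_start, prev_end = merged[-1]
            # Extend the current locus to cover the new block.
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical blocks: the first two on chromosome 7 are 199,900 bp apart and merge.
loci = merge_ld_blocks([
    (7, 100, 200),
    (7, 200_100, 300_000),
    (7, 900_000, 950_000),
    (13, 50, 60),
])
```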

Genetic risk score computation

All genomic risk loci from FUMA were used to estimate a genetic risk score (GRS) by a simple risk allele count method. Discriminative power attributable to the GRS was estimated and compared by plotting receiver operating characteristic (ROC) curves and calculating the area under the curve (AUC) for the case-control samples. The AUC compares the rates of true positives (sensitivity) and false positives (1 − specificity) and assesses the overall performance of genetic risk score models. Next, case and control status was randomly permuted 10,000 times and the AUC was estimated with each pseudo case-control status. An empirical p-value (eP), which is the proportion of AUCs from the randomization distribution of cases and controls that are more extreme than our observed AUC from the actual case (medium/giant aneurysm) and control (no aneurysm) status, was then calculated.
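The procedure above can be sketched in a few lines: count risk alleles per patient, compute the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, and compare the observed AUC with AUCs from label permutations. This is a minimal illustration under assumptions, not the authors' code: the function names, the seed, and the use of "greater than or equal" when counting permutations (slightly conservative relative to "larger than") are choices made here.

```python
import random


def genetic_risk_score(risk_allele_dosages):
    """Unweighted GRS: total number of risk alleles (0, 1 or 2 per locus)."""
    return sum(risk_allele_dosages)


def auc(scores, labels):
    """AUC as P(case score > control score), counting ties as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def empirical_p_value(scores, labels, n_perm=10_000, seed=0):
    """Return (observed AUC, proportion of permuted AUCs >= observed AUC)."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    observed = auc(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between score and status
        if auc(scores, shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm


# Hypothetical toy cohort: rows are patients, columns are dosages at three loci.
toy_genotypes = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [2, 1, 1], [1, 2, 2], [2, 2, 1]]
toy_labels = [0, 0, 0, 1, 1, 1]
toy_scores = [genetic_risk_score(row) for row in toy_genotypes]
toy_auc, toy_ep = empirical_p_value(toy_scores, toy_labels, n_perm=500)
```

The pairwise AUC is quadratic in sample size, which is acceptable for a cohort of a few hundred patients but would be replaced by a rank-based formula for larger data.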

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.