Introduction

Alzheimer’s disease is characterised by the deposition of extracellular β-amyloid plaques (Aβ) and intracellular neurofibrillary tangles (tau). Several PET tracers allow the in-vivo quantification of tau. While the cross-sectional analysis of these tracers for early detection of tau has been widely studied, their longitudinal analysis is limited. Longitudinal tau quantification may provide a useful marker of anti-tau drug efficacy in clinical trials, and different tau PET tracers may provide different sensitivity to longitudinal changes, but without a head-to-head dataset or a carefully designed case-matching procedure, comparing results in different cohorts can be biased. In this study, we aim to minimise this bias by matching subjects in two cohorts imaged using 18F-MK6240 and 18F-flortaucipir.

A recent direct comparison of 18F-flortaucipir and 18F-MK6240 (1) showed that both tracers detect tau in common regions that are typically associated with tau pathology in AD. The two tracers also showed a good SUVR correlation in these regions. The analysis further revealed that 18F-MK6240 exhibited a greater dynamic range, with the authors concluding that this could lead to earlier detection of tau accumulation in longitudinal studies, although a direct longitudinal comparison would be required to support this conclusion.

There have been a number of longitudinal studies of 18F-Flortaucipir looking at tau accumulation in cognitively unimpaired (CU) participants (2), CU/mild cognitive impairment (MCI) (3, 4), MCI (5) and a mix of CU, MCI and Alzheimer’s disease (AD) (6–8). Other studies have also looked at the pattern of accumulation in clinical variants of AD (9) and autosomal AD (10). While most studies find statistically significantly higher rates of tau accumulation at the prodromal stage of the disease, the results at the preclinical stage are mixed, with only two studies (3, 11) finding statistically significantly higher rates of tau accumulation in Aβ positive CU compared with Aβ negative CU.

Given that 18F-MK6240 is a more recent tracer, longitudinal studies using this tracer are sparser, with only two studies looking at the pattern of longitudinal accumulation in CU, MCI and AD (12, 13). Both studies showed that 18F-MK6240 could detect statistically significant tau accumulation in both preclinical and prodromal AD subjects.

Given the interest in using tau imaging in clinical trials, it is becoming increasingly important to compare the various tau tracers in closely matched populations to properly evaluate their ability to measure longitudinal changes. While there is no direct longitudinal comparison of 18F-Flortaucipir and 18F-MK6240, we have case-matched two independent populations based on clinical diagnosis, Centiloid value, Mini Mental State Examination (MMSE) score and age.

Methods

Data were obtained from both the Australian Imaging Biomarkers and Lifestyle study (AIBL) and the ADNI database (https://adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership by the National Institute on Aging, the Food and Drug Administration, private pharmaceutical companies and non-profit organizations. Its primary goal was to test whether neuroimaging such as serial magnetic resonance imaging (MRI) and PET, biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD. A detailed description of the inclusion criteria can be found on the ADNI website (https://www.adni-info.org).

This study was approved by the Austin Health Human Research Ethics Committee (HREC/18/Austin/201). All AIBL participants gave written consent for publication of de-identified data. All ADNI participants gave written informed consent for participation in the ADNI, as approved by the institutional review board at each participating centre.

Participants

A total of 136 participants from the AIBL study imaged using 18F-MK6240 at baseline and follow-up and 273 participants from the ADNI study imaged using 18F-Flortaucipir at baseline and follow-up were considered for this study. To ensure that the two datasets were as comparable as possible, participants were excluded from the analysis if a different scanner was used for the baseline and follow-up tau PET scans or if the delay between baseline and follow-up was less than 10 months or more than 30 months. They were also excluded if an amyloid (Aβ) PET scan was not available within 1 year of the baseline tau PET scan, if MMSE at baseline was not available, or if either the baseline or follow-up clinical diagnosis was not one of CU, MCI or AD. We also excluded participants who were cognitively impaired but amyloid negative.
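
The exclusion criteria above can be summarised as a single eligibility check. The following is a minimal sketch only: the `Candidate` record and its field names are hypothetical, and we assume that "cognitively impaired" refers to an MCI or AD diagnosis at baseline, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional

VALID_DIAGNOSES = {"CU", "MCI", "AD"}


@dataclass
class Candidate:
    """Hypothetical per-participant record; field names are illustrative."""
    baseline_scanner: str
    followup_scanner: str
    months_between_tau_scans: float
    years_amyloid_to_tau: Optional[float]  # None if no amyloid PET exists
    baseline_mmse: Optional[float]
    baseline_dx: str
    followup_dx: str
    amyloid_positive: bool


def is_eligible(c: Candidate) -> bool:
    """Apply the exclusion criteria listed in the Participants section."""
    if c.baseline_scanner != c.followup_scanner:
        return False  # scanner changed between baseline and follow-up
    if not (10 <= c.months_between_tau_scans <= 30):
        return False  # follow-up delay outside the 10 to 30 month window
    if c.years_amyloid_to_tau is None or abs(c.years_amyloid_to_tau) > 1:
        return False  # no amyloid PET within 1 year of baseline tau PET
    if c.baseline_mmse is None:
        return False  # baseline MMSE missing
    if c.baseline_dx not in VALID_DIAGNOSES or c.followup_dx not in VALID_DIAGNOSES:
        return False  # diagnosis other than CU, MCI or AD
    # Assumption: impairment is judged on the baseline diagnosis.
    if c.baseline_dx in {"MCI", "AD"} and not c.amyloid_positive:
        return False  # cognitively impaired but amyloid negative
    return True
```

Applying the checks in this order is a design convenience rather than something prescribed by the study; the outcome does not depend on the order.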

18F-MK6240 acquisitions were performed using two scanners (Philips Gemini TF64 and Siemens Biograph 128 mCT) while 18F-Flortaucipir (18F-FTP) acquisitions were performed using 23 different scanner models. Acquisition details for 18F-MK6240 have been previously described (12) and details for the 18F-FTP acquisition are available on the ADNI website (https://adni.loni.usc.edu/methods/pet-acquisition).

PET Analysis

Both 18F-NAV4694 Aβ PET and 18F-MK6240 tau PET images from the AIBL study were smoothed to a uniform 8 mm FWHM resolution following the ADNI pre-processing pipeline (14), so they would match the resolution of the pre-processed 18F-Flortaucipir, 18F-Florbetapir and 18F-Florbetaben scans downloaded from ADNI.

All PET images were then analysed using CapAIBL, a PET-only quantification method (15). The Aβ PET scans (18F-NAV4694 for AIBL and either 18F-Florbetapir or 18F-Florbetaben for ADNI) were quantified in Centiloids (CL) using our recently reported Non-negative Matrix Factorisation quantification method (16, 17). Amyloid positivity (Aβ+/Aβ−) was defined based on a threshold of 25 CL. Subjects were grouped as cognitively unimpaired amyloid negative (Aβ− CU), Aβ+ CU, mild cognitive impairment amyloid positive (Aβ+ MCI) and Alzheimer’s disease amyloid positive (Aβ+ AD).
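
The grouping rule can be written down compactly. This is an illustrative sketch, not the study code: the function name and the ASCII group labels are hypothetical, and we assume that a value exactly at the threshold counts as positive, which the text does not specify.

```python
from typing import Optional

CL_THRESHOLD = 25.0  # Centiloid threshold for amyloid positivity


def amyloid_group(centiloid: float, diagnosis: str) -> Optional[str]:
    """Return one of the four study groups, or None if not included.

    Assumption: centiloid >= 25 is treated as amyloid positive.
    """
    positive = centiloid >= CL_THRESHOLD
    if diagnosis == "CU":
        return "Abeta+ CU" if positive else "Abeta- CU"
    if diagnosis in ("MCI", "AD"):
        # Cognitively impaired, amyloid-negative participants were excluded.
        return f"Abeta+ {diagnosis}" if positive else None
    return None
```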

The tau PET scans (18F-MK6240 for AIBL and 18F-Flortaucipir for ADNI) were spatially normalised using the CapAIBL PCA-based approach (18) and the cerebellar cortex was used as the reference region to compute the Standardised Uptake Value Ratio (SUVR). Tracer retention was measured in four regions: the mesial (Me), temporoparietal (Te) and rest of the cortex (R) regions (19), as well as a metatemporal region (MT) composed of the entorhinal, inferior/middle temporal, fusiform, parahippocampus and amygdala (3). CapAIBL was also used to generate mean surface projections of tau tracer uptake for each clinical group. To facilitate the interpretation of the results, the surface projections were mirrored and averaged to remove any asymmetry in the datasets which, given the recent head-to-head comparison results (1), is likely participant specific and not representative of the binding properties of the tracer.

No correction for partial volume effect was conducted for either amyloid or tau quantification.

Statistical Analysis

SUVRs were transformed into Z-scores for the cross-sectional analysis. For the longitudinal analysis, the percentage annual change (SUVR%/Year) was used, defined as the annualized difference in SUVR between baseline and follow-up, normalized by the baseline SUVR. T-tests were employed to assess group separation at baseline using the SUVR Z-scores and longitudinally using SUVR%/Year. No correction for multiple comparisons was conducted. A power analysis with 80% power and alpha = 0.05 was also conducted to estimate the number of participants needed to detect a 25% reduction in the annualized rate of tau accumulation in Aβ+ CU for both tau tracers. The 95% confidence intervals were estimated using 10,000 bootstrap samples.
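
To make these definitions concrete, the sketch below computes SUVR%/Year and a per-arm sample size with a bootstrap confidence interval. It is an illustration under stated assumptions, not the analysis code used in the study: the text does not give the sample-size formula, so we assume the standard two-sided, two-sample normal approximation with equal variance in both arms, and a percentile bootstrap over participants. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def annual_percent_change(suvr_baseline, suvr_followup, years):
    """SUVR%/Year: annualized SUVR difference normalized by baseline SUVR."""
    return 100.0 * (suvr_followup - suvr_baseline) / suvr_baseline / years


def sample_size_per_arm(rates, reduction=0.25, alpha=0.05, power=0.80):
    """Participants per arm needed to detect a relative reduction in mean rate.

    Assumed formula: n = 2 * ((z_{1-alpha/2} + z_power) * sd / delta)^2,
    with delta = reduction * mean(rates).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    delta = reduction * mean(rates)
    if delta == 0:
        return math.inf  # no accumulation to reduce
    return math.ceil(2 * ((z_alpha + z_power) * stdev(rates) / delta) ** 2)


def bootstrap_ci(rates, n_boot=10000, seed=0):
    """Percentile 95% CI of the sample size, resampling participants."""
    rng = random.Random(seed)
    estimates = sorted(
        sample_size_per_arm(rng.choices(rates, k=len(rates)))
        for _ in range(n_boot)
    )
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]
```

Because the required sample size scales with the squared ratio of the standard deviation to the mean rate, a tracer with a larger and less variable rate of change in a given region needs far fewer participants.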

Population matching

The two populations were matched in terms of age, Centiloid and MMSE within each of the four groups (Aβ− CU, Aβ+ CU, Aβ+ MCI and Aβ+ AD). The age, Centiloid and MMSE of each participant were first transformed into Z-scores, using the combined Aβ− CU from both AIBL and ADNI as the reference population. For each baseline participant scanned using 18F-MK6240, a weighted distance was computed to every baseline participant scanned using 18F-FTP in the same clinical group. The weighted distance D was defined as follows:

$$D = \frac{\alpha \left| Z(\mathrm{Age}_{MK}) - Z(\mathrm{Age}_{FTP}) \right| + \beta \left| Z(\mathrm{CL}_{MK}) - Z(\mathrm{CL}_{FTP}) \right| + \gamma \left| Z(\mathrm{MMSE}_{MK}) - Z(\mathrm{MMSE}_{FTP}) \right|}{\alpha + \beta + \gamma}$$

Here Z() denotes the Z-score, and α, β and γ are the weights assigned to each Z-score variable. For each 18F-MK6240 baseline scan, the 18F-FTP baseline scan within the same diagnostic group and with the shortest distance D was selected as the best matching candidate. To ensure that the matched scans were as similar as possible, we established a threshold T defining the maximum allowable distance. For any given 18F-MK6240 baseline scan, if the most similar 18F-FTP scan had a distance D greater than T, the 18F-MK6240 scan was deemed unmatchable given the available 18F-FTP scans and was excluded. To ensure that the interval between baseline and follow-up was comparable between both studies, the matched 18F-FTP scan was also excluded if its interval between baseline and follow-up was more than 6 months longer or shorter than that of the matched 18F-MK6240 scan. The procedure was run until all the baseline 18F-MK6240 scans were either matched with a suitable 18F-FTP scan or excluded. In our experiments, we used the weights (α=1, β=1, γ=2) and a threshold T=0.5, which corresponds to half a standard deviation of difference across all three metrics.
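
The matching procedure can be sketched as a greedy nearest-neighbour search. This is a simplified illustration rather than the study implementation: the `Participant` record is hypothetical, and we assume that each 18F-FTP scan is used at most once, that 18F-MK6240 scans are processed in list order, and that 18F-FTP candidates failing the follow-up interval criterion are removed before the closest one is selected. None of these details is specified in the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Participant:
    """Hypothetical record holding pre-computed Z-scores."""
    pid: str
    group: str               # e.g. "CU-", "CU+", "MCI+", "AD+"
    z_age: float
    z_cl: float
    z_mmse: float
    interval_months: float   # baseline to follow-up delay


def weighted_distance(a, b, alpha=1.0, beta=1.0, gamma=2.0):
    """Weighted distance D between two participants, as defined above."""
    total = (alpha * abs(a.z_age - b.z_age)
             + beta * abs(a.z_cl - b.z_cl)
             + gamma * abs(a.z_mmse - b.z_mmse))
    return total / (alpha + beta + gamma)


def match(mk_scans: List[Participant], ftp_scans: List[Participant],
          threshold: float = 0.5,
          max_interval_gap: float = 6.0) -> Dict[str, Optional[str]]:
    """Map each MK6240 participant id to a matched FTP id, or None."""
    available = list(ftp_scans)
    pairs: Dict[str, Optional[str]] = {}
    for mk in mk_scans:
        # Same diagnostic group and comparable follow-up interval only.
        candidates = [f for f in available
                      if f.group == mk.group
                      and abs(f.interval_months - mk.interval_months)
                      <= max_interval_gap]
        best = min(candidates, key=lambda f: weighted_distance(mk, f),
                   default=None)
        if best is None or weighted_distance(mk, best) > threshold:
            pairs[mk.pid] = None          # unmatchable, excluded
        else:
            pairs[mk.pid] = best.pid
            available.remove(best)        # assumption: one-to-one matching
    return pairs
```

With the weights (1, 1, 2), a difference in MMSE contributes twice as much to D as the same difference in age or Centiloid, so the threshold of 0.5 is reached more quickly by cognitive mismatch than by the other two variables.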

Results

Matched populations

At the end of the matching procedure, 114 pairs of participants were matched, including 65 Aβ− CU, 22 Aβ+ CU, 14 Aβ+ MCI and 13 Aβ+ AD. One of the matched 18F-FTP Aβ− CU participants had high uptake in the MT and Te (Z-Score >7). The scan was visually read as being positive and was deemed to be an outlier. It was therefore excluded from all further analysis along with its matched 18F-MK6240 Aβ− CU participant. As a result, 113 pairs of participants remained, including 64 Aβ− CU. As per design, there was no statistically significant difference in baseline MMSE, age or Centiloid, or in the number of years between baseline and follow-up, between the 18F-MK6240 and 18F-FTP participants within each of the four diagnosis groups (Table 1). There was also no difference in CDR Sum of Boxes, ApoE status or gender between the two tracers in each group. The 18F-FTP participants were, however, more educated than their matched 18F-MK6240 participants in all diagnosis groups.

Table 1 Mean (Standard deviation) of MMSE, centiloid, age, number of years between baseline and follow-up scans, CDR SOB, years of education as well as % of female and ApoE E4 for the baseline 18F-MK6240 and 18F-FTP and the corresponding p value

Cross sectional analysis

Using the 113 pairs of matched participants, regional SUVRs at baseline for both tracers were transformed into Z-scores using each tracer’s respective Aβ− CU population as reference. Surface projections of the mean SUVR images for the Aβ− CU at baseline and follow-up are presented in supplementary Figure 1. The Z-scores were then compared between each pair of diagnosis groups using t-tests (Figure 1.a). The corresponding effect sizes are presented in Table 2.
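
The Z-score transform and the group-separation measure can be illustrated as follows. This is a sketch under assumptions: the text does not state which effect size was reported, so Cohen's d with a pooled standard deviation is used here as one common choice, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def z_scores(values, reference):
    """Z-scores of `values` relative to a reference group (e.g. Abeta- CU)."""
    mu, sd = mean(reference), stdev(reference)
    return [(v - mu) / sd for v in values]


def cohens_d(group_a, group_b):
    """Cohen's d with pooled standard deviation (assumed effect size)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2
                  + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
```

Because each tracer is normalised against its own Aβ− CU group, the reference group has a mean Z-score of zero and unit standard deviation by construction, which puts the two tracers on a common scale despite their different SUVR ranges.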

Figure 1

(a) Baseline SUVR Z-Scores and (b) percentage of SUVR change (SUVR%/Year) for 18F-MK6240 and 18F-FTP in the four regions of interest: mesial (Me), metatemporal (MT), temporoparietal (Te) and rest of the cortex (R). (ns: not significant; *: p<0.05; **: p<0.01; ***: p<0.001; ****: p<0.0001). By definition, the mean Aβ− CU Z-score is zero in all areas.

Table 2 Effect size at baseline (SUVR) and using the percentage rate of SUVR increase (SUVR%/Year) between the different clinical groups using 18F-MK6240 or 18F-FTP in the four regions of interest: mesial (Me), metatemporal (MT), temporoparietal (Te) and rest of the cortex (R)

Table 3 Mean (Standard deviation) increase in SUVR/Year for 18F-MK6240 and 18F-FTP in the Aβ+ CU, and the corresponding sample size estimates (95% confidence intervals based on bootstrapping) to detect a 25% reduction in the rate of tau accumulation after 1 year using the mesial (Me), metatemporal (MT) and temporoparietal (Te) regions

Baseline SUVR Z-scores were statistically significantly different between each pair of groups in all regions with both tracers, except between Aβ− CU and Aβ+ CU in the R region (the rest of the cortex, which excludes Me and Te).

When comparing Aβ− CU to Aβ+ CU, both tracers showed statistically significant differences in the Me and MT. However, better group separation was obtained when using 18F-MK6240, with a higher effect size in both the Me (ESMK6240=1.15, ESFTP=0.87) and MT (ESMK6240=LU, ESFTP=0.89). We also observed that the SUVR Z-Scores were typically higher using 18F-MK6240 compared to using 18F-FTP, especially in Aβ+ MCI and Aβ+ AD.

Surface projections of mean SUVR Z-Scores in each subgroup for 18F-MK6240 and 18F-FTP are presented in Figure 2.a. These projections illustrate similar patterns of retention with both tracers, starting in the Me before spreading to the Te and eventually to the rest of the cortex.

Figure 2

(a) Mean SUVR Z-Scores at baseline and (b) mean increase in SUVR%/Year in each subgroup for 18F-MK6240 (Left) and 18F-FTP (Right)

Longitudinal analysis

For each tracer, the 113 baseline/follow-up pairs were used to compute the rate of change, expressed in SUVR%/Year. Using the classification at baseline, the rates were compared between Aβ− CU and Aβ+ CU/MCI/AD as well as between Aβ+ CU and Aβ+ MCI/AD using t-tests (Figure 1.b). The corresponding effect sizes are presented in Table 2.

The rate of SUVR change in Aβ+ CU was statistically significantly higher than that of Aβ− CU in the Me, MT and Te when using 18F-MK6240. No differences between Aβ− and Aβ+ CU were observed with 18F-FTP. The rate of SUVR change in Aβ+ MCI and AD was statistically significantly higher than that of Aβ− CU in the Te and R when using 18F-MK6240, and in the Te with 18F-FTP, but also in the MT for the Aβ+ AD group.

With 18F-MK6240, the rate of SUVR change in the Aβ+ MCI was statistically significantly smaller than that of Aβ+ CU in the Me and MT. No differences between Aβ+ CU and Aβ+ MCI were observed with 18F-FTP. The rate of SUVR change in the Aβ+ AD was statistically significantly higher than the rate in the Aβ+ CU in the R with 18F-MK6240. No differences between Aβ+ CU and Aβ+ AD were observed with 18F-FTP.

We also observed a few AD subjects with large negative rates of accumulation when using 18F-MK6240, which is consistent with a previous report (12) and is likely driven by instability in the reference region in this cohort.

Surface projections of mean SUVR%/Year in each subgroup for 18F-MK6240 and 18F-FTP are presented in Figure 2.b. With 18F-MK6240, these show early accumulation in the Me in the Aβ+ CU, which is not visible when using 18F-FTP. In the Aβ+ MCI, with 18F-MK6240, the increase was strongest in the posterior part of the temporal lobe, the occipital lobe and the superior frontal cortex. With 18F-FTP, the inferior temporal and occipital regions showed the strongest increase. In the Aβ+ AD, 18F-MK6240 showed the largest increase in the temporal pole, parietal, occipital, superior frontal and precuneus. With 18F-FTP, the inferior temporal and occipital regions showed the strongest increase, with a noticeable increase also seen in the parietal, superior frontal and precuneus.

We also conducted a power analysis to estimate the number of Aβ+ CU participants required to detect a 25% reduction in annual change of SUVR in a 2-arm placebo-controlled trial (Table 3). This shows that in the regions of early tau deposition (Me, MT), the number of participants required to detect a 25% reduction in annual change of SUVR using 18F-MK6240 would be almost an order of magnitude smaller than the number required when using 18F-FTP.

Discussion

In this paper, we have compared the first-generation tau tracer 18F-Flortaucipir to the more recently developed tau tracer 18F-MK6240, both cross-sectionally and longitudinally. Since no longitudinal head-to-head dataset is currently available, we matched two independent datasets from AIBL and ADNI, by minimizing the difference in MMSE, Centiloid and age at baseline. Our matching procedure was able to identify 113 pairs of participants who had no significant difference at baseline between those three metrics within any of the diagnosis groups, and no difference in the number of years between the acquisition of the baseline and follow-up scans.

Since there are no published transforms to compare SUVR between 18F-FTP and 18F-MK6240, we opted to convert the SUVRs into Z-Scores, an approach similar to CenTauRz, which we have previously proposed to compare the different tau tracers using unpaired datasets (20). The cross-sectional analysis showed that both tracers were able to identify statistically significant differences at baseline between each diagnostic group. It also showed that the dynamic range of 18F-MK6240 was generally larger than that observed with 18F-FTP, in agreement with the direct head-to-head comparison (1), which showed that the SUVR dynamic range of 18F-MK6240 was almost two-fold greater than that of 18F-FTP. While both 18F-MK6240 and 18F-FTP could identify statistically significant differences between Aβ− CU and Aβ+ CU, the group separation was much larger when using 18F-MK6240. This would indicate that 18F-MK6240 might be able to better detect early tau accumulation.

The longitudinal analysis also showed that 18F-MK6240 could detect a statistically significant increase in the rate of tau accumulation in the Aβ+ CU compared to the Aβ− CU in both the mesial and metatemporal regions, which could not be detected using 18F-FTP in our matched cohort. It is possible that the lack of a statistically significant difference in the rate of accumulation in the Aβ+ CU with 18F-FTP is driven by the selected population, as previous work has shown statistically significant differences between these groups in ADNI scans (11), although those differences were limited to the inferior temporal region, which could be diluted in our larger composite regions. Other groups using similar composite regions, but in different cohorts, had mixed findings, with either a statistically significant difference (3) or no difference (8). Recent work from Knopman and colleagues (2) showed that the rate of tau increase detected using 18F-FTP was much higher in subjects with high Aβ load, and this could explain differences between studies if their respective distributions of Aβ load at baseline are significantly different. One strength of our study is that both tracers were matched for Centiloid at baseline, which should minimize differences in tau accumulation due to differences in Aβ load.

Using 18F-MK6240, the rate of SUVR increase in the mesial region was statistically significantly smaller in Aβ+ MCI than in Aβ+ CU, while it was significantly larger in the rest of the cortex in Aβ+ MCI/AD compared to Aβ+ CU. This pattern follows the expected sequence of tau deposition, which is expected to start in the mesial region, spreading to the inferior and middle temporal gyri (captured by the metatemporal mask) at the preclinical stages of the disease and reaching the rest of the cortex at the prodromal/symptomatic stage of the disease. While a similar trend was observed with 18F-FTP, it failed to reach statistical significance.

The spatial pattern of 18F-MK6240 increase was quite different to the pattern obtained using 18F-FTP, especially in the Aβ+ cognitively impaired groups. Our results indicate that 18F-MK6240 is a more sensitive tracer than 18F-FTP, which is consistent with a previous head-to-head comparison showing 18F-MK6240 having lower non-specific binding than 18F-FTP (1, 21). Furthermore, in vitro studies showed that 18F-MK6240 has higher affinity for paired helical filament tau than 18F-FTP (22). A higher affinity for tau and a lower non-specific PET signal likely allow 18F-MK6240 to detect earlier and more extensive tau deposition in the brain.

The estimated sample sizes required to detect a 25% reduction in annual change of SUVR were much lower when using 18F-MK6240 compared to 18F-FTP. With 18F-MK6240, the numbers were similar for the mesial, metatemporal and temporoparietal regions, whereas 18F-FTP had much higher numbers for the mesial region, likely reflecting the lack of binding in the hippocampus with this tracer compounded by spillover from non-specific off-target binding in the choroid plexus. These results indicate that 18F-MK6240 might be better suited than 18F-FTP for anti-Aβ and/or anti-tau clinical trials in the early stages of AD.

The main limitation of this paper is that we are using case-matched rather than true head-to-head data, and while we have taken many steps to reduce the potential differences between the datasets, we cannot account for other differences that might contribute to some of the differences observed in the results. For instance, 23 different scanners were used for the acquisition of 18F-FTP compared to two for 18F-MK6240, which may increase the variance in the 18F-FTP measurements. A true head-to-head study, similar to the head-to-head scans acquired for the Centiloid study, will be required to confirm those findings.

We also did not take into account possible effects of off-target binding in the meninges spilling into the target regions. While a comparison at baseline showed statistically significant differences in meningeal SUVR between clinical groups with 18F-MK6240, with Aβ+ MCI/AD showing higher meningeal SUVR compared to the CU groups (supplementary Figure 2), there was no difference in the rate of change between any of the groups, and the rates were not statistically significantly different from 0 (supplementary Figure 3). This is consistent with a recent study reporting that the extracerebral uptake was stable over one year (23). Therefore, while 18F-MK6240 retention in the meninges might contribute to some of the differences observed at baseline, it was unlikely to contribute to the group differences observed in our longitudinal analysis.

We also did not consider different reference regions for the two tracers. While some recent work has been looking at the best reference region for 18F-FTP to detect longitudinal changes (6, 11, 24), results using 18F-MK6240 are still limited (12) and require further evaluation. Therefore, we only selected a single reference region that has been frequently used in previous work for both tracers. However, future work in this area is warranted.

Lastly, we did not include partial volume correction (PVC). However, since both tracers were quantified the same way, we do not expect PVC to change the overall conclusions.

Conclusions

We have proposed a framework to match two tau tracers acquired in two independent studies. The cross-sectional and longitudinal analyses revealed that 18F-MK6240 might be better suited to detect early tau accumulation and might be a better tracer for clinical trials at the preclinical stage of AD. While our framework tried to minimise differences between the two populations, a head-to-head longitudinal comparison will be required to confirm these results.