Introduction

The 18-kDa translocator protein (TSPO) has been shown to be a potential target for imaging neuroinflammation using PK-11195 PET [1,2,3]. Recently, a putative antagonist of TSPO, [11C]-N,N-diethyl-2-[2-(4-methoxyphenyl)-5,7-dimethyl-pyrazolo[1,5-a]pyrimidin-3-yl]-acetamide ([11C]DPA-713), was developed concurrently with the TSPO agonist, the fluoro-ethoxy derivative [18F]DPA-714 [4,5,6]. Both [11C]DPA-713 and [18F]DPA-714 were shown to have higher affinity than the first-generation TSPO tracer [11C]-(R)-PK11195 [4, 7, 8]. Several studies have now demonstrated the usefulness of [11C]DPA-713 PET in quantifying neuroinflammation in different diseases, including multiple sclerosis (MS), Parkinson’s disease (PD), and Alzheimer’s disease (AD), in both animal and human studies [4, 9,10,11,12].

In PET, kinetic modeling is often essential for the accurate quantification of tracer uptake and metabolism in the tissue. This often requires measuring the tracer concentration in arterial blood over time. However, this practice is limited by its invasive nature and the associated risks to the subjects, as well as by the hazards of handling blood samples [13]. Inserting arterial lines causes significant discomfort and burden for patients. In practical clinical trial settings, this often proves a key workflow bottleneck and can also adversely influence subject cooperation and accrual [14].

An alternative technique such as an image-derived input function (IDIF) [15, 16] or population-based input function (PBIF) [17, 18] can facilitate the adoption of PET protocols requiring input functions. In brain studies, IDIF is usually deduced from the dynamic images of the carotid arteries and hence is susceptible to partial volume effect [15,16,17]. Previous studies showed the feasibility of PBIF as a robust alternative to IDIF for some radiopharmaceuticals [17, 18]. PBIF is generated by averaging the normalized patient-specific arterial input functions (PSAIFs) deduced from a cohort of subjects. Several normalization techniques have been reported in the literature, for example, traditional scaling using blood samples by correlating the measured plasma activity with the AUC [17]; correlation of the PBIF with PSAIF venous samples [19]; scaling by injected dose and weight [20, 21]; and non-invasive scaling using individual parameters like weight, body surface area (BSA), and lean body mass (LBM) [17]. Many studies have assessed the feasibility of PBIF for kinetic analysis using [18F]FDG [18, 22,23,24,25], yet very few studies involved neuroreceptor PET tracers [17, 26], including TSPO brain studies [19, 21, 27]. To the best of our knowledge, no PET kinetic modeling study has been performed with [11C]DPA-713 using PBIF.

The main aim of this study is to assess the feasibility of using PBIF instead of the patient-specific AIF for [11C]DPA-713 PET kinetic modeling. This was done by first estimating the test-retest repeatability of the [11C]DPA-713 PET imaging in healthy subjects. Based on the test-retest results, we then assessed the reproducibility of kinetic analysis of [11C]DPA-713 dynamic PET images of the brain with PBIF compared to PSAIF in healthy and PD subjects. The effect of PSAIF normalization techniques on the PBIF-based kinetic results was also investigated.

Materials and methods

Subjects

In total, twelve subjects (9 males and 3 females; age 56.6 ± 11.9 years) were recruited from a Parkinson’s disease (PD) dynamic [11C]DPA-713 PET research study. Six additional healthy male subjects (age 42.6 ± 11.2 years) were also included, of whom five underwent test-retest studies to assess the repeatability of DPA kinetics. The inclusion criteria for the PD cohort were a clinical diagnosis of PD of 3 to 12 years’ duration from onset of symptoms, age 30 to 70 years at the time of enrollment, Hoehn and Yahr stages 2–3, and absence of a clinical diagnosis of dementia. Exclusion criteria were treatment with dopamine receptor blocking agents or acetylcholinesterase inhibitors, a history of another significant neurological or major psychiatric disorder, or autoimmune disorders within the past 5 years. For screening purposes, all subjects, including healthy volunteers, had a blood sample (3 mL) collected for TSPO (rs6971) genotype analysis. Three genotypes are defined: low-affinity binders (LAB), mixed-affinity binders (MAB), and high-affinity binders (HAB). Subjects who were low-affinity binders were excluded from participation. Detailed information on all subjects is shown in Supplementary Table S1.

PET measurements and reconstruction

For the PET studies, 526.4 ± 73.6 MBq (14.2 ± 1.9 mCi) of [11C]DPA-713 was administered by intravenous bolus injection, followed by a flush of 10–15 mL of saline solution. PET data were acquired in list-mode format, starting at the time of injection, on a 4-ring Siemens Biograph mCT™ for a total of 90 min. The PET data were reconstructed into 32 dynamic frames (6×10 s, 4×30 s, 3×60 s, 2×120 s, 5×240 s, 12×300 s) using ordered subset expectation maximization (OSEM) with attenuation, scatter, and randoms corrections. Continuous arterial sampling was performed at 15-s intervals for the first 10 min using an automated fraction collector, followed by five additional samples collected at 20, 30, 45, 60, and 90 min, respectively. Each blood sample was weighed and counted using a Wizard® automatic gamma counter (Perkin Elmer), and the activity concentration was then calculated. Blood samples drawn at 5, 10, 20, 30, 45, 60, and 90 min post-injection were also used to estimate metabolite fractions by HPLC analysis. The blood time activity curves (TACs) were finally corrected for metabolites, yielding a metabolite-corrected arterial input function.
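The framing scheme above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the study's processing pipeline; the variable and function names are illustrative assumptions. It expands the stated schedule into frame start, end, and mid-times, which are the time points at which regional TACs are typically sampled.

```python
# Frame schedule as stated in the text: (number of frames, frame duration in seconds).
FRAME_SCHEDULE = [(6, 10), (4, 30), (3, 60), (2, 120), (5, 240), (12, 300)]


def frame_times(schedule):
    """Return a list of (start, end, midpoint) tuples in seconds, one per frame."""
    frames = []
    start = 0
    for count, duration in schedule:
        for _ in range(count):
            end = start + duration
            frames.append((start, end, (start + end) / 2))
            start = end
    return frames


frames = frame_times(FRAME_SCHEDULE)
n_frames = len(frames)                 # number of dynamic frames
total_minutes = frames[-1][1] / 60     # end of the last frame, in minutes
print(n_frames, total_minutes)
```

The schedule sums to 32 frames covering 90 min, in agreement with the acquisition described above.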

Data analysis and kinetic modeling

Each subject underwent a T1-weighted MRI scan. Inter-frame head motion correction was achieved by rigidly co-registering the individual dynamic PET frames to the last 10 min image set using PMOD (version 3.8; PMOD Technologies Ltd). The resulting dynamic image set was then rigidly registered to the T1-MR image set. Brain regions were delineated on the MRI images using the FreeSurfer software [28], the corresponding volumes of interest (VOIs) were overlaid on the co-registered and motion-corrected dynamic PET images, and finally, the corresponding TACs were deduced.

Kinetic modeling was done for each patient using the Logan VT model [29]:

$$ \frac{\int_0^tC(T) dT}{C(t)}={V}_T\frac{\int_0^t{C}_p(T) dT}{C(t)}+\mathrm{constant}\ \left(t\ge {t}^{\ast}\right) $$
(1)

where C(t) is the regional time activity curve (TAC), Cp is the input function, and t* is the time after which the plot of \( \frac{\int_0^tC(T) dT}{C(t)} \) versus \( \frac{\int_0^t{C}_p(T) dT}{C(t)} \) becomes linear. The linearity time was determined using the maximum admissible error criterion as described by Ichise et al. [30]. This criterion automatically searches for the minimum time after which the relative error of every data point in the Logan plot is lower than the given error threshold. A 10% error criterion was used in this study, as also suggested by similar TSPO studies [27]. An example of the Logan fit for a sample patient using both PSAIF and PBIF is shown in Supplementary Figure S7.
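To make Eq. (1) and the error criterion concrete, the sketch below estimates VT from a Logan plot in plain Python. It is an illustration under stated assumptions, not the PMOD implementation used in the study: the integrals use the trapezoidal rule, the line is fitted by ordinary least squares, and t* is taken as the earliest time for which every point of the fitted segment deviates from the regression line by less than the threshold. The synthetic curves follow a one-tissue compartment model with hypothetical rate constants, for which VT = K1/k2.

```python
import math


def cumulative_trapezoid(t, y):
    """Running integral of y over t (trapezoidal rule), starting at 0."""
    out = [0.0]
    for i in range(1, len(t)):
        out.append(out[-1] + 0.5 * (y[i] + y[i - 1]) * (t[i] - t[i - 1]))
    return out


def fit_line(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def logan_vt(t, tissue, plasma, max_err=0.10, min_points=3):
    """Return (VT, t_star) from the Logan plot of Eq. (1).

    Assumed reading of the criterion: t_star is the earliest time for which
    all points from t_star onward lie within max_err (relative) of the fit.
    """
    int_tissue = cumulative_trapezoid(t, tissue)
    int_plasma = cumulative_trapezoid(t, plasma)
    # Skip samples with zero tissue activity (e.g. t = 0) to avoid dividing by zero.
    pts = [(t[i], int_plasma[i] / tissue[i], int_tissue[i] / tissue[i])
           for i in range(len(t)) if tissue[i] > 0]
    for start in range(len(pts) - min_points + 1):
        x = [p[1] for p in pts[start:]]
        y = [p[2] for p in pts[start:]]
        slope, intercept = fit_line(x, y)
        errors = [abs((slope * xi + intercept - yi) / yi) for xi, yi in zip(x, y)]
        if max(errors) < max_err:
            return slope, pts[start][0]
    raise ValueError("no linear segment satisfies the error criterion")


# Synthetic one-tissue data with hypothetical constants; true VT = K1 / k2 = 3.
K1, k2, A, b = 0.3, 0.1, 100.0, 0.05
t = [0.5 * i for i in range(181)]                       # 0 to 90 min
plasma = [A * math.exp(-b * ti) for ti in t]            # mono-exponential input
tissue = [K1 * A / (k2 - b) * (math.exp(-b * ti) - math.exp(-k2 * ti)) for ti in t]
vt, t_star = logan_vt(t, tissue, plasma)
print(round(vt, 2), t_star)
```

For a one-tissue model the Logan plot is linear from the start, so the recovered slope is close to K1/k2 regardless of t*; with real TSPO data the early points curve and the criterion selects a later t*.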

Kinetic analysis was performed using the PSAIFs and then repeated using the PBIFs. For each of the selected brain structures (white matter, cerebellum, thalamus, caudate, putamen, pallidum, brainstem, hippocampus, and amygdala), the total volume of distribution (VT) was estimated with the blood volume fixed to 5%. These brain regions were selected mainly because they show great affinity for [11C]DPA-713 binding.

Test-retest repeatability and reliability

Five healthy volunteers underwent test and retest scans within the same day to assess the reproducibility of the kinetic parameters in the brain structures. Kinetic analysis was carried out for all the selected brain regions, for both the test and retest datasets, using the Logan VT model and the corresponding PSAIFs. The repeatability of VT was assessed using the Bland-Altman analysis [31]:

$$ \% Relative\ Diff,D=\frac{Retest- Test}{\left( Retest+ Test\right)/2}\times 100 $$
(2)
$$ Mean\ Bias=\frac{\sum_{n=1}^ND}{N} $$
(3)

where N is the number of subjects.

The corresponding 95% limits of agreement (LOA) and the coefficient of repeatability (CR) between test and retest were determined using:

$$ LOA= Mean\ Bias\pm 1.96 SD $$
(4)
$$ CR=1.96\times \sqrt{\frac{\sigma^2}{N-1}} $$
(5)

where σ² is the variance of the relative difference, D, between the test and retest estimates. The CR represents the value below which the relative difference between test and retest is expected to lie with a 95% probability [31, 32].
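Equations (2) to (5) can be evaluated directly from paired test and retest values. The following is a minimal sketch with hypothetical VT values (not data from this study). Two choices are assumptions: the SD and variance are taken as sample statistics (N − 1 denominator), and the CR is computed literally as Eq. (5) is written.

```python
import math
import statistics


def bland_altman(test, retest):
    """Percent relative differences with mean bias, 95% LOA and CR (Eqs. 2-5)."""
    d = [(r - t) / ((r + t) / 2) * 100 for t, r in zip(test, retest)]  # Eq. (2)
    n = len(d)
    bias = sum(d) / n                                                  # Eq. (3)
    sd = statistics.stdev(d)               # sample SD of D (assumption)
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)                         # Eq. (4)
    # Eq. (5) taken literally, with sigma^2 as the sample variance of D.
    cr = 1.96 * math.sqrt(statistics.variance(d) / (n - 1))
    return {"diff": d, "bias": bias, "loa": loa, "cr": cr}


# Hypothetical regional VT values for five subjects (illustration only).
test_vt = [3.2, 4.1, 3.8, 4.9, 3.5]
retest_vt = [4.0, 5.2, 4.6, 6.1, 4.4]
result = bland_altman(test_vt, retest_vt)
print(round(result["bias"], 1), [round(v, 1) for v in result["loa"]])
```

Because D is defined relative to the mean of test and retest, a positive bias indicates that retest values are systematically higher than test values.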

Generation of population-based input functions

The individual PSAIFs were fitted using the “tri-exponential” function and then corrected for metabolites, after fitting the metabolite fraction using the “Watabe” function (as incorporated in PMOD). The PSAIFs were also fitted with the “bi-exponential and gamma” function, but the tri-exponential function gave the best fit for all subjects involved in this study. Sample graphs of the fitted PSAIF, metabolite fraction, and the resulting metabolite-corrected PSAIF are shown in Supplementary Figure S1.

The PBIFs were generated from the metabolite-corrected PSAIFs of all the 18 subjects under review (samples in Supplementary Figure S5A). The metabolite-corrected PSAIFs for all subjects were interpolated to the same time grid (with a step of 1 s), and then, their peaks were aligned to the 30 s time point where the majority of the IF peaks occurred. In order to reduce the influence of subject-induced variation on the generated PBIF, each of the metabolite-corrected PSAIFs was normalized separately by three methods: (a) Weightsubject× DoseInjected, (b) the corresponding AUC, and (c) Weightsubject×AUC.

For each subject, the PBIF was generated by averaging the normalized PSAIFs of the other 17 subjects (leave-one-out procedure) [17, 20, 33, 34]. Individual subject IFs were then generated by appropriately scaling the PBIF with the corresponding factor, i.e., (a) Weightsubject×DoseInjected, (b) the corresponding AUC, and (c) Weightsubject×AUC.
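The peak alignment, normalization, and leave-one-out averaging described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the authors' code: curves are assumed to be already interpolated to a common uniform time grid, "normalized by" a factor is taken to mean division by that factor, and "scaling with" a factor is taken to mean multiplication. All function names are hypothetical.

```python
def shift_peak(curve, target_index):
    """Shift a uniformly sampled curve so that its maximum lands on target_index.

    Samples shifted in from outside the grid are padded with the nearest edge value.
    """
    offset = target_index - curve.index(max(curve))
    n = len(curve)
    return [curve[min(max(i - offset, 0), n - 1)] for i in range(n)]


def leave_one_out_pbif(curves, factors, left_out):
    """Average the normalized curves of every subject except left_out."""
    others = [[v / f for v in c]
              for k, (c, f) in enumerate(zip(curves, factors)) if k != left_out]
    return [sum(vals) / len(others) for vals in zip(*others)]


def subject_if_from_pbif(curves, factors, subject):
    """Scale the leave-one-out PBIF back with the subject's own factor."""
    pbif = leave_one_out_pbif(curves, factors, subject)
    return [v * factors[subject] for v in pbif]


# Toy example: three subjects whose curves share one shape but differ in scale.
shape = [0.0, 1.0, 4.0, 2.0, 1.0]
factors = [2.0, 3.0, 5.0]          # e.g. weight x AUC per subject (hypothetical values)
curves = [[f * v for v in shape] for f in factors]
recovered = subject_if_from_pbif(curves, factors, 0)
print(recovered)
```

In this idealized case the normalization removes all between-subject variation, so the rescaled PBIF reproduces the left-out subject's curve; with real data the residual shape differences are what drive the bias in VT.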

Since the normalized PBIF does not have arterial blood samples, the AUC scaling was done by tail-fitting the normalized PBIF and the PSAIF using the time points 30, 45, 60, and 90 min. Additionally, three pseudo-time points (37.5, 52.5, and 75 min) were created as the average of the PSAIF at 30 and 45 min, 45 and 60 min, and 60 and 90 min, respectively. This was done in order to find an optimal time point that minimizes the difference between the original PSAIF AUC and the PBIF AUC obtained by scaling with one (or two) late blood samples, following an approach used in a similar TSPO study [27].
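A simple way to picture the tail scaling and the AUC comparison is given below. This is a sketch under assumptions: the scale factor is taken as the ratio of the mean late blood activity to the mean PBIF activity at the same time points, the AUC is computed with the trapezoidal rule, and all numerical values are hypothetical.

```python
def trapezoid_auc(t, y):
    """Area under the curve by the trapezoidal rule."""
    return sum(0.5 * (y[i] + y[i - 1]) * (t[i] - t[i - 1]) for i in range(1, len(t)))


def pseudo_sample(a, b):
    """Pseudo-time-point value: the mean of two neighbouring blood samples."""
    return 0.5 * (a + b)


def tail_scale_factor(pbif_at_samples, blood_samples):
    """Ratio of mean late blood activity to mean PBIF activity at the same times."""
    mean_blood = sum(blood_samples) / len(blood_samples)
    mean_pbif = sum(pbif_at_samples) / len(pbif_at_samples)
    return mean_blood / mean_pbif


def auc_percent_error(t, psaif, scaled_pbif):
    """Percent difference between the scaled PBIF AUC and the PSAIF AUC."""
    ref = trapezoid_auc(t, psaif)
    return (trapezoid_auc(t, scaled_pbif) - ref) / ref * 100


# Hypothetical curves sampled at 0, 1, 30, 60 and 90 min (illustration only).
t = [0.0, 1.0, 30.0, 60.0, 90.0]
psaif = [0.0, 40.0, 8.0, 4.0, 2.0]
pbif_norm = [0.0, 22.0, 4.0, 2.0, 1.0]

# Scale with the two late samples (60 and 90 min).
scale = tail_scale_factor(pbif_norm[3:], psaif[3:])
scaled_pbif = [scale * v for v in pbif_norm]
error = auc_percent_error(t, psaif, scaled_pbif)
value_75 = pseudo_sample(psaif[3], psaif[4])   # pseudo-sample at 75 min
print(scale, round(error, 2), value_75)
```

Matching the tails fixes the late part of the curve exactly in this toy case, so the remaining AUC error comes entirely from the mismatch around the peak.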

The reproducibility of VT using PBIF was assessed using Bland-Altman analysis, with PSAIF values as gold reference. For each structure, the % relative difference (Relative Diff), D, between the parameters was estimated using:

$$ \% Relative\ Diff,D=\frac{P_{\mathrm{PBIF}}-{P}_{\mathrm{PSAIF}}}{P_{\mathrm{PSAIF}}}\times 100 $$
(6)

where PPBIF and PPSAIF are the kinetic parameters generated by PBIF and PSAIF respectively.

The bias and the corresponding 95% upper and lower LOA were estimated using Eqs. (3) and (4).

Statistical analysis

Data were analyzed using the SPSS (IBM SPSS statistics for windows, version 26.0) and Real statistics (http://www.real-statistics.com/) software. Normality of distribution was tested using the Shapiro-Wilk test. The statistical difference between the three normalization techniques was evaluated using the one-way analysis of variance (ANOVA). The pairwise t-test was also performed as a follow-up test to ANOVA in order to reveal which specific pair of the normalization techniques is significantly different, and Bonferroni correction was applied to correct for the potential error due to multiple testing. In all cases, a P-value < 0.05 was considered to suggest statistical significance.
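The Bonferroni step can be illustrated in a few lines. The sketch below uses the common formulation in which each raw pairwise P-value is multiplied by the number of comparisons and capped at 1; whether the statistical software applied exactly this form is an assumption, and the raw P-values shown are hypothetical, not results from this study.

```python
from itertools import combinations


def bonferroni(p_values):
    """Multiply each raw P-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


methods = ["weight_x_dose", "auc", "weight_x_auc"]
pairs = list(combinations(methods, 2))          # three pairwise comparisons
# Hypothetical raw P-values, for illustration only.
raw = dict(zip(pairs, [0.010, 0.030, 0.400]))
adjusted = bonferroni(raw)
significant = [pair for pair, p in adjusted.items() if p < 0.05]
print(significant)
```

With three normalization techniques there are three pairwise comparisons, so a raw P-value must be below roughly 0.0167 to remain significant at the 0.05 level after correction.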

The major steps involved in this study are represented with a workflow chart in Fig. 1.

Fig. 1 The workflow chart showing the major steps involved in this study

Results

Test-retest repeatability and reliability

The repeatability of the VT estimates for all selected brain regions of interest in the test-retest studies is shown in Table 1 and Fig. 2. Across the brain regions of interest, the mean VT over all the healthy volunteers ranged from 3.18 to 4.91 for the test scans and from 3.68 to 5.92 for the retest scans. The VT estimates in the retest studies exhibited a positive bias (ranging from 20 to 30%) compared to those deduced from the test studies. This bias is systematic: all the differences lie above the zero line. The 95% LOA lies between ~3% and ~70% for all regions.

Table 1 Bland-Altman analysis of the variation in VT estimates between the test and retest

Fig. 2 Bland-Altman plot comparing the test-retest repeatability of VT estimates for all selected brain regions of interest (the datapoints are from both left and right hemispheres). The solid line is the mean % bias between test and retest VT estimate, while the dashed lines represent the %LOA

Comparison between PSAIF and PBIF

Before generating the PBIF from the pool of subjects, we first compared the shape of the PSAIFs between groups (HV versus PD) and genotypes (MAB versus HAB). The results are shown in Fig. 3 and Supplementary Figure S2. Visual inspection of the average IF for HV and PD showed no difference between groups (Fig. 3). Also, the log-transformed plot showed no difference in the peak or tail between groups or genotypes (Supplementary Figure S2).

Fig. 3 Comparison between the averaged PSAIFs of HV and PD subjects, and the resulting PBIF generated from the three normalization techniques. The inner plot shows the zoomed IF over the first 3 min. The standard deviation (SD) is shown for the PBIF generated by normalization with Weightsubject×AUC.

Therefore, the PSAIFs of all eighteen subjects were pooled to generate the PBIF using each of the three normalization techniques. The individual IFs were generated by appropriately scaling the PBIF with the corresponding normalization factor. AUC scaling was done by tail-fitting the normalized PBIF and the PSAIF using the time points 30, 37.5, 45, 52.5, 60, 75, and 90 min. The AUC of the scaled PBIF for each time point was then compared with that of the original PSAIF by evaluating the %error (Fig. 4). Although there is no significant difference in the %error between the different time points, the sample at 75 min yielded the lowest %error (0.53%). Since the 75-min value was actually an average of the blood samples collected at 60 and 90 min, the AUC scaling (i.e., scaling with one (or two) late blood samples) in this study was done by tail-fitting the PBIF and the PSAIF using the time points in the last 30 min (i.e., between 60 and 90 min).

Fig. 4 AUC comparison of the scaled PBIF using the different time points and the original PSAIF. Note that no blood was collected at time points 37.5, 52.5, and 75 min; they were just average time points of 30 and 45 min, 45 and 60 min, and 60 and 90 min respectively

Evaluation of the PBIF and the normalization criteria

Figure 5 shows the % difference and the LOAs between the VT estimates generated by the PSAIF and PBIF for selected brain regions. The comparison is made between the PSAIF and the PBIF generated by each of the three normalization techniques. The mean % difference is −10% for Weightsubject×DoseInjected, +8% for AUC, and +2% for Weightsubject×AUC, while the LOAs lie within ±45% for Weightsubject×DoseInjected, ±50% for AUC, and ±38% for Weightsubject×AUC. The ANOVA shows a significant difference between the results generated by the three normalization techniques for all brain structures except the brainstem (P-value = 0.095). Even for the brainstem, however, the pairwise test shows a significant difference between Weightsubject×DoseInjected and AUC (P-value = 0.034). In all cases, normalizing by Weightsubject×AUC yielded the smallest % bias and variability (% bias within ±2%; LOA within ±38% for all brain regions).

Fig. 5 The % relative difference in VT and the LOAs between PSAIF and PBIF of some specific structures as generated by the three normalization techniques

The mean bias (±SD) between the PSAIF and PBIF for the VT generated by normalization with Weightsubject×AUC is shown in Fig. 6 and Table 2 (for all brain regions).

Fig. 6 Bland-Altman analysis showing the variation in the VT between the PSAIF and PBIF (normalization with Weightsubject×AUC). The values are shown for the HV (blue circles) and the PD (red boxes) groups. The solid line is the mean % bias between PSAIF and PBIF VT estimate, while the dashed lines represent the %LOA

Table 2 Bland-Altman analysis of the variation in VT estimate between the PSAIF and PBIF (normalization with Weightsubject×AUC)

The mean bias for VT lies within ±2%, with amygdala showing the smallest (1.21%) deviation and putamen showing the highest (1.91%). Overall, the 95% LOA for all brain regions lies within ±38%.

We also evaluated the reproducibility of the VT generated with PBIF to that of PSAIF using the estimated t*, and other goodness of fit criteria (AIC, R2, and % standard error (SE)). The result is shown in Supplementary Figure S8. There is no difference between the VTs generated by PBIF and PSAIF based on these criteria.

Agreement of VT between PSAIF and PBIF (based on disease groups and genotype)

Finally, we estimated how well the PBIF-based kinetics replicate the PSAIF-based kinetics based on disease groups (HV versus PD) and genotype (MAB versus HAB). Figure 7 shows the Logan VT values for all subjects calculated with PSAIF and PBIF for the two genotype groups (HAB and MAB). For all brain regions, the average VT values obtained with PBIF are slightly higher than those obtained with PSAIF, but the difference is not significant for either genotype group (%difference = 4.53%, P-value = 0.73 for HAB; and %difference = 0.73%, P-value = 0.96 for MAB). Comparing MAB to HAB, there is a significant reduction in VT with both PSAIF and PBIF. PSAIF showed an average reduction of 40% in VT across the brain regions, while the average reduction with PBIF is 42%. The t-test showed a P-value < 0.01 for all the brain regions, for both PSAIF and PBIF, and on average, the P-value for PBIF is about 50% higher than that for PSAIF.

Fig. 7 Logan VT values for all subjects calculated with PSAIF and PBIF for the two genotype groups (HAB and MAB). Comparing MAB to HAB, there is a significant reduction in VT both with PSAIF and PBIF. PSAIF showed an average reduction of 40% in VT across the brain regions, while the average reduction with PBIF is 42%

Figure 8 shows the difference in Logan VT values calculated with PSAIF and PBIF between HV and PD subjects, and also between the HAB and MAB groups. The average VT values are higher in PD patients than in HV, but not significantly so. PBIF tends to overestimate the difference between PD and HV for HAB (%difference = 32.33%, P-value = 0.32 with PBIF; %difference = 13.28%, P-value = 0.64 with PSAIF). However, this difference is underestimated for MAB (%difference = 6.84%, P-value = 0.77 with PBIF; %difference = 20.92%, P-value = 0.55 with PSAIF).

Fig. 8 Logan VT values between HV and PD subjects and also HAB and MAB groups calculated with PSAIF and PBIF

Discussion

Several studies have shown the feasibility of imaging neuroinflammation in multiple sclerosis (MS), Parkinson’s disease (PD), and Alzheimer’s disease (AD) using [11C]DPA-713 PET to quantify differences between patients and controls [4, 9,10,11,12]. Accurate quantification of tracer uptake and metabolism in the tissue through kinetic modeling often requires blood sampling [13] or some alternative approach such as simplified reference modeling [35, 36], cluster analysis [35, 37], or image-derived input function (IDIF) techniques. The apparent limitations of these approaches [15,16,17, 19, 21, 38] have motivated the exploration of the population-based input function (PBIF) approach as a more quantitatively reliable and less invasive alternative.

In this study, we have assessed the reproducibility of kinetic analysis of [11C]DPA-713 dynamic PET images using PBIF, compared to PSAIF, in a cohort of subjects with Parkinson’s disease and healthy volunteers. The major steps involved in this study are represented in a workflow chart in Fig. 1. The repeatability of the VT estimate was also assessed in a cohort of healthy volunteers that underwent a test-retest [11C]DPA-713 dynamic PET within the same day. Kinetic analysis with PSAIF was determined to be reproducible with PBIF if the corresponding LOA are within those of the test-retest study.

Test-retest repeatability

The test-retest repeatability study of [11C]DPA-713 uptake exhibited a systematic increase in uptake values between test and retest (Fig. 2 and Supplementary Figure S3). The cause of this systematic bias is yet to be fully explored in same-day test-retest repeatability studies, because most repeatability studies are done days or even weeks apart [39, 40]. A few recent studies that performed same-day test-retest scans have reported the same systematic bias and suggested that it could be due to hormone-mediated changes in TSPO expression, tonic changes due to scan-related stress/anxiety, or alteration in blood cholesterol due to food intake between the test and retest scans [39, 41,42,43]. While performing test and retest studies under similar conditions on different days could eliminate this bias [40], other parameters such as alteration in TSPO density due to chronic disease as well as non-disease-related factors may be difficult to control [43].

One potential approach to compensate for the systematic bias between the test and retest studies is normalization by the corresponding kinetic parameters of the gray matter (GM), as suggested by past studies [39, 43,44,45] and also shown in this study (Supplementary Figures S3 and S4). Without GM normalization, the % relative difference between test and retest VT values lies significantly above the zero line for all structures, indicating that retest values are always higher than test values. With GM normalization, however, the % relative difference is symmetric about the zero line. The mean % Diff, the LOA, and the CR are also significantly reduced, thereby improving repeatability. However, past studies involving GM normalization were validated in a clinical population (HIV with associated cognitive deficits) with regional inflammation. We believe there is insufficient justification for using the GM normalization approach in PD cohorts, since neuroinflammation can occur in any brain region, and we therefore elected not to adopt it.

Another possible explanation for the high test-retest bias reported in this study lies in the small regions of interest used and the relatively small number of healthy subjects (n=5) who underwent the test-retest scans. This is in agreement with a TSPO study from Jucaite et al. [39], which showed that the mean bias between test and retest was low in large brain regions (such as the whole brain, cortical gray matter, and white matter) but high in smaller brain regions. They also attributed the large variability to the limited sample size, and this was also confirmed by Plaven-Sigray et al. [46], who estimated the test-retest variability in six healthy controls and obtained a variability of about 21% in VT. In fact, Collste et al. [40] carried out a test-retest study with six subjects examined on the same day and six other subjects examined 2–5 days apart. The % bias was within 14–27% for subjects examined within the same day and 0.2–8.4% for subjects examined 2–5 days apart. We therefore recommend further research to fully understand the effect of small sample size, small brain regions of interest, and the number of days between test and retest scans on test-retest reproducibility.

Generation and evaluation of the PBIF

The PBIF was generated from the PSAIFs of all 18 subjects after comparing the shape of the PSAIFs between groups (HV versus PD) and genotypes (MAB versus HAB). This was motivated by Owen et al. [47, 48], who demonstrated that the second-generation TSPO tracers target two binding sites in humans, which leads to three affinity patterns: low-, high-, and mixed-affinity binders (LABs, HABs, and MABs, respectively). Past research has shown that this variability in binding affinity has a major influence on the kinetic parameters, where the values for HABs could be approximately twice those for MABs [48, 49]. For [11C]-DPA-713 dynamic PET studies, Coughlin et al. [43] argued that those genotypes as well as other unknown physiological factors have varying degrees of influence on the global TSPO changes in the brain, thereby hindering accurate PET analysis, even among individuals with the same genotype. This was also confirmed in other TSPO studies [39, 45, 50]. Our results (Fig. 3 and Supplementary Figure S2), however, showed no significant difference in PSAIF between the groups, and so all subject PSAIFs were included in the generation of the PBIF. This was also in agreement with other studies [27, 51].

Ye et al. [52] opined that the bias in kinetic parameter estimation in direct reconstruction with PBIF depends on the normalization and scaling technique used. In this study, we have assessed and compared three normalization approaches: (a) Weightsubject×DoseInjected, (b) AUC, and (c) Weightsubject×AUC. These normalization techniques were chosen because they reduced the influence of subject-induced variation on the generated PBIF. We also normalized the PSAIFs by different methods (as shown in Supplementary Figure S6 and Table S2). However, since our aim is to reduce patient variability in the generation of the PBIF, we decided to go with the above-stated normalization techniques. An example of the normalized PSAIFs and the resulting PBIF is shown in Supplementary Figure S5. The performance of these techniques was evaluated using the percent relative difference between the PSAIF- and PBIF-derived VT in selected brain regions (Fig. 5). There is a significant difference between the three normalization techniques for all brain structures except the brainstem.

Several normalization techniques have been reported in the literature which include traditional scaling using blood samples by correlating the measured plasma activity at a given time-point with the AUC [17]; correlation of the PBIF with AIF at any time-point using venous samples [19]; by accounting for injected dose and weight [20, 21]; non-invasive scaling using individual parameters like weight, body surface area (BSA), and lean body mass (LBM) [17]. In this study, we have assessed the three aforementioned normalization approaches. Subsequently, a subject IF was deduced by scaling the PBIF by his/her weight and injected dose. In the case of AUC normalization, this was measured after scaling the PBIF by the ratio of the average activity concentration of blood samples acquired over the last 30 min of the dynamic scan (i.e., between 60 and 90 min) and that of the tail of the PBIF over the same time frames. This setting was used as this best minimizes the error between PSAIF AUC and the scaled PBIF (Fig. 4).

Specifically, the AUC difference between PSAIF and PBIF was minimized by scaling the PBIF with an arterial blood value at 75 min, as also recommended by similar past TSPO studies [27, 51]. It has been shown that venous blood samples may in practice be used instead for scaling purposes, since arterial and venous blood tend to reach equilibrium at about 30–45 min post-injection [17]. Although this was not tested in this work, similar TSPO studies have also found that PBIF can be appropriately scaled using one blood sample [27, 51]. Since the ultimate aim is to potentially alleviate the need for arterial blood sampling, the AUC component of the PBIF normalization can be obtained by scaling the normalized PBIF by the ratio of the average activity concentration of blood samples (possibly venous blood) acquired over the last 30 min of the dynamic scan (i.e., between 60 and 90 min) and that of the tail of the PBIF over the same time frames (as was done in this study). However, more relevant clinical studies need to be conducted to establish a correlation between activity concentration in arterial and venous blood samples at these later time points of the scan.

In this study, normalization by Weightsubject×AUC yielded the smallest % bias (within ±2%) and variability (LOAs within ±38%) between PBIF and PSAIF (Fig. 5). VT measured with PBIF showed good reproducibility (LOA of ±38%) with a small positive bias (about 2%) (Fig. 6 and Table 2). These findings are also in agreement with those of Lavisse et al. [19]. As a final note, the reproducibility of the PBIF-based VT estimates compared with the PSAIF-based VT estimates falls well within the test-retest results (Table 1), hence showing the feasibility of [11C]-DPA-713 PET kinetic modeling using PBIF.

PBIF was able to reproduce the PSAIF kinetic results because of the similar patterns in average AIF between disease groups and genotypes (Fig. 3 and Supplementary Figure S2). Although we expect that PBIF cannot exactly reproduce the peak and shape of the PSAIF (as shown in Supplementary Figure S2), having a similar AUC between PSAIF and PBIF will result in less bias in kinetic parameter estimation. That is why previous studies have recommended the Logan VT method as more suitable for PBIF than the 2-tissue compartment model: Logan VT relies on the AUC of the IF and is therefore less sensitive to its shape [20, 53].

A major limitation of this study is the relatively small sample size (n = 18), even though our findings are in agreement with previous results from smaller (n = 9) [19] and larger (n = 42) [20] samples. A common factor among these studies is the normalization of the individual input functions to remove variabilities in the PBIF. This might suggest that the efficiency of the PBIF in accurately estimating the kinetic parameters depends less on the sample size used and more on the normalization. This was also supported by Ye et al. [52], who opined that the bias in kinetic parameter estimation in direct reconstruction with PBIF was mostly due to inaccuracy in normalization and scaling.

Conclusion

This study demonstrated the feasibility of [11C]-DPA-713 PET kinetic modeling using PBIF with Logan graphical analysis, thus potentially alleviating the need for arterial blood sampling. Moreover, it was shown that the optimal result in terms of kinetic parameter accuracy was obtained when the PSAIFs were normalized with Weightsubject×AUC.