Introduction

Positron emission tomography (PET, see Table 1 for all abbreviations in this paper) using 2-[18F]-2-deoxy-D-glucose (FDG) is an established imaging modality in oncology [1]. Although in daily practice visual inspection of FDG PET images is used for diagnosis and assessment of response to therapy, it has been shown that (semi-)quantitative analysis provides an objective complement to visual interpretation of lesions [2–4]. Results of this analysis might be used for individual tailoring of therapy, since increased FDG uptake usually corresponds to an unfavourable course of the disease. Repeated measurements can be used in early response assessment, which is valuable for further individualization of therapy [5]. Generally, lesions are quantified using the standardized uptake value of FDG [SUV, i.e. the FDG activity concentration at a single time point normalized to the administered activity (AA) and a measure for distribution volume such as body weight] and treatment response is assessed by the relative change between a baseline and a follow-up scan during the course of therapy (ΔSUV). Although the results of the first trials investigating the performance of individually tailored therapy based on early response to neoadjuvant treatment have been published [6] and new trials are currently being undertaken, the multiple factors influencing the quantification of glucose metabolism by FDG PET are still under discussion [7–9].
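The definitions of SUV and ΔSUV given above can be written out as a short calculation. The following is a minimal sketch, not a validated implementation: the function names are hypothetical, the administered activity is assumed to be already decay-corrected to the time of the scan, and a tissue density of 1 g∙ml−1 is assumed so that the SUV is dimensionless.

```python
def suv_bw(activity_conc_kbq_per_ml, administered_mbq, body_weight_kg):
    """SUV normalized to body weight (hypothetical helper).

    Assumes the administered activity is decay-corrected to scan time
    and that 1 ml of tissue weighs 1 g.
    """
    # MBq per kg equals kBq per g, so this is the activity per gram of body weight.
    activity_per_gram = administered_mbq / body_weight_kg
    return activity_conc_kbq_per_ml / activity_per_gram


def delta_suv_percent(suv_baseline, suv_followup):
    """Relative change between baseline and follow-up scan, in percent."""
    return 100.0 * (suv_followup - suv_baseline) / suv_baseline


# Illustrative numbers only (not taken from any study).
baseline = suv_bw(20.0, 280.0, 70.0)
followup = suv_bw(12.0, 280.0, 70.0)
print(baseline, followup, delta_suv_percent(baseline, followup))
```

Other normalizations mentioned later in this review, such as lean body mass, would only change the denominator.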

Table 1 Commonly used abbreviations

Besides the influence that these factors have on the quantification of parameters of glucose metabolism, a variety of factors also influence the reproducibility of these parameters [10, 11]. Quantification of glucose metabolism by FDG PET is not only dependent on biological properties of the disease under investigation, but also on methodological aspects of patient preparation, image acquisition, reconstruction, region of interest (ROI) definition and methods of parameter computation. To be able to perform multicentre studies or meta-analyses, but also to apply results of studies in clinical practice, the influence of these factors should be minimized by standardization. This has led to the development of consensus recommendations by the European Organization for Research and Treatment of Cancer (EORTC) [12], the National Cancer Institute (NCI) [13] and the Netherlands Society of Nuclear Medicine (NEDPAS) [14]. The Society of Nuclear Medicine has agreed on procedure guidelines for tumour imaging but concludes that optimal methods for semiquantitative measurements need further elucidation [15].

This review aims to give a theoretical background, illustrated by up-to-date publications, on the methodological factors influencing quantification of FDG PET. It will not merely focus on the semiquantitative parameter SUV, but also include fully quantitative parameters such as the glucose metabolic rate (MRglc) and the pharmacokinetic rate constants of two-compartment model analysis. Hardware issues influencing scanner sensitivity, such as detector crystal material, photon energy window, coincidence timing window, detector ring diameter and axial length of the field of view (FOV), are not addressed in this review. Two further groups of factors are considered outside the scope of this review. The first consists of methodological errors, such as invalid cross-calibration, asynchronous clocks, omission of decay correction for the time period between calibration and start of the PET scan, low precision of plasma glucose measurement, failure to measure residual activity in the infusion system and paravenous infiltration of FDG. The second consists of factors inextricably linked to the non-specific targeting of FDG (e.g. infection, post-radiotherapy inflammation).

Patient preparation and image acquisition

Biological factors affecting quantification

Several biological factors affecting quantification, such as fasting plasma glucose level, uptake period, FDG distribution and clearance, patient motion (breathing) and patient discomfort (stress), all deserve attention at the time of patient preparation, FDG administration and distribution and image acquisition.

Blood glucose level

High blood glucose levels, due to a non-fasting state or diabetes mellitus, interfere with FDG uptake in malignant lesions. The transmembranous glucose transport facilitators (GLUT), albeit overexpressed in many cancers, can be saturated by an excess of unlabelled glucose. This diminishes FDG uptake as glucose and FDG both compete for the binding sites of transporters and enzymes, leading to zero-order kinetics. In patients without any known form of glucose intolerance it is shown in two consecutive scans that the SUV, using body weight as a measure of distribution volume, is significantly lower in the loaded state (serum glucose >8.0 mmol∙l−1) in both head and neck cancer (SUVBW = 6.9 vs 4.0, p < 0.02) [16] and bronchial carcinoma (SUVBW = 5.07 vs 2.84, p < 0.001) [17] compared to the fasting state (<6.0 mmol∙l−1). In contrast to the reduction of tumour uptake, skeletal muscle accumulates more FDG, resulting in blurring of tumour margins and less clear localization of lesions [16]. Also Patlak-based net influx rate constants (Ki) of the lesions decrease markedly in the glucose loaded state (mean −25%, p < 0.05), while MRglc is on average 36% higher. This paradoxical increase in MRglc might be due to the fact that the authors assumed the same value for the relative affinity of the biological system to FDG and glucose (‘lumped constant’, LC) in both states to compute MRglc from Ki, which is likely untrue [18].
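The relation between Ki and MRglc referred to above is commonly written as MRglc = Ki ∙ (plasma glucose concentration) / LC. The sketch below illustrates, under that assumed relation and with purely hypothetical numbers, how MRglc can rise while Ki falls if the same LC is used for both states; the function name and input values are illustrative and are not taken from the cited study.

```python
def mr_glc(ki_ml_per_min_per_ml, plasma_glucose_mmol_per_l, lumped_constant=1.0):
    """Glucose metabolic rate in umol per ml per min (hypothetical helper).

    mmol/l equals umol/ml, so Ki [ml/min/ml] times glucose [umol/ml]
    gives umol/ml/min. The lumped constant is assumed to be known.
    """
    return ki_ml_per_min_per_ml * plasma_glucose_mmol_per_l / lumped_constant


# Hypothetical values: Ki drops by 25% while plasma glucose rises from 5 to 9 mmol/l.
fasting = mr_glc(0.040, 5.0)
loaded = mr_glc(0.030, 9.0)
print(fasting, loaded, loaded / fasting)
```

With a fixed LC the computed MRglc increases in this example even though Ki decreases, which is the kind of paradoxical result that an unchanged LC can produce.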

Hyperglycaemia has a clear impact on visual interpretation of FDG PET images, as it results in a reduced detection rate of malignancies [19], but it also has a major effect on quantification. Therefore, hyperglycaemia should be a reason to reschedule the procedure. Correction of hyperglycaemia by insulin directly prior to the FDG injection is discouraged, because hyperinsulinaemia increases the translocation of GLUT4, thereby rapidly and efficiently shunting FDG to organs with a high density of insulin receptors (e.g. skeletal and cardiac muscles) [19]. When FDG injection is postponed to 1 h after insulin administration (up to a serum glucose below 8.0 mmol∙l−1) in hyperglycaemic diabetic patients, no differences between normoglycaemic non-diabetics and hyperglycaemic (insulin-corrected) diabetics are found in SUV, using lean body mass as measure for distribution volume (SUVLBM), in lungs, liver, muscles, myocardium or suspected pulmonary lesions [20]. However, recently a standardized protocol of intravenous insulin at least 1 h prior to FDG injection led to an unacceptable biodistribution in 25% of diabetic cancer patients (increased muscle uptake and decreased liver uptake). In these patients the interval between insulin and FDG injection was significantly shorter than in the patients with a normal biodistribution (65.7 vs 80.2 min, p < 0.01) [21]. Metformin strongly increases the SUV of the small and large intestines, potentially decreasing the detection sensitivity for malignant lesions, but the influence on lesion quantification is not described [22]. To prevent hyperglycaemia in non-diabetic patients, guidelines [12–14] recommend that patients fast for at least 4 h, but preferably 6 h, before administration of FDG. When blood glucose levels are well outside physiological ranges (e.g. >11 mmol∙l−1 [14]) the scan should be postponed. We use a stricter cut-off for research purposes (8.0 mmol∙l−1).

Uptake period

Typically, PET acquisition is performed 45–60 min after FDG injection, based on the fact that FDG activity concentrations become constant within the first hour after tracer injection in normal tissues. In malignancies, however, it has been shown that for some tumour types FDG concentration continues to increase up to 4–5 h after injection and that constant FDG activity concentrations are rarely reached within the first hour after injection. Between 60 and 90 min post-injection, it is not uncommon for a lesion to further increase SUV as much as one tenth of the value reached 10 h post-injection [23]. Since a further increase in FDG activity concentration 1 h after injection is rare in normal tissues [24], it has been hypothesized that dual-time-point imaging (45 and 90 min post-injection) might improve detection rates of lesions with high glucose metabolism by increasing tumour to background ratios. Therefore, for response monitoring studies using SUVs it is highly important to keep the uptake period of FDG within narrow limits (55–65 min) [13, 14].

FDG distribution and clearance

To optimize the distribution of FDG throughout the body, international guidelines recommend prehydration (0.5 l) of the patient to ensure excretion of FDG from healthy background tissue, which may be further enhanced by furosemide-induced forced diuresis. This might be of special value when the pelvis or kidney regions are of interest [12–14]. Median SUVs of tumours are lower if furosemide is used. The total fraction of excreted activity does not differ if diuretics are used, but early after injection it is significantly higher in the furosemide group, which improves image quality and reduces patient radiation exposure [25]. Therefore, it seems important that the use of (loop) diuretics is taken into account when comparing parameters of glucose metabolism. A practical disadvantage of using forced diuresis during dynamic acquisition is that measures should be taken to avoid premature termination of the acquisition due to urinary urgency.

Patient (periodic) motion

Apart from exercise-induced increased muscle uptake during the uptake period, the effect of motion during acquisition has consequences for lesion localization (e.g. spatial mismatch around the diaphragm due to breathing) and causes smearing of the lesion activity concentration within the volume of movement. Consequently, the lesion metabolic volume is overestimated and the SUV is underestimated. Moreover, tissue inhomogeneity is similarly smeared, leading to loss of spatial heterogeneity. The magnitude of the decrease of recovered activity concentrations depends most markedly on lesion size and amplitude of motion and to a lesser extent on the motion frequency. Recovered activity concentrations can be increased by better lesion volume estimation using a motion correction algorithm. In nine lung cancer patients, such an algorithm reduced the estimated lesion volume by 15%, leading to an increase of the mean SUV in the ROI (SUVmean) by 5% [26]. Other techniques may be applied to improve recovery of activity concentration in periodically moving lesions, such as gated PET/CT (in which data are only acquired during a certain respiratory phase), respiratory correlated dynamic PET/CT (by summing the sinograms of images of a particular breathing phase selected using a point source) or deep inspiration breath-hold PET/CT [27].

During dynamic acquisition, non-periodic patient movement can have major influence on measured parameters, since the lesion can “move into” the ROI usually defined in the last time frame(s). Consequently, activity concentrations of the lesion are underestimated for the period before patient movement due to a mispositioned ROI. In the time frames in which the lesion of interest is visible the ROI can be realigned, but during the early time frames this is often impossible due to noisy images or insufficient tissue to blood contrast. Patient movement therefore should be prevented and monitored during acquisition, the original position should be restored as quickly as possible and any movement should be noted including time at which it occurred.

If CT-based attenuation correction (AC) is used, any spatial mismatch will lead to incorrect amplification of measured activity concentrations, leading to unreliable values for lesion quantification (differences of around 10% in SUV [28, 29] due to breathing, smallest for free breathing CT [28]).

Patient discomfort

Patient stress or cold during the distribution period increases uptake of FDG in muscle and brown adipose tissue (BAT) [30–32]. This can lower detection sensitivity (by lowering contrast) and specificity (by increasing the number of false-positive lesions) of the images and might affect quantification [14, 33]. Factors known to increase FDG uptake in BAT, such as exposure to cold [31, 32, 34–36], should be prevented by exposing patients to thermoneutral conditions. Medication such as the beta-adrenergic blocking agent propranolol [31, 36–39], reserpine [36] and the opiate fentanyl [40] all prevent FDG uptake in BAT, but the effect of benzodiazepines seems doubtful [35, 36, 40, 41]. The NEDPAS guidelines [14] suggest considering the use of benzodiazepines to reduce muscle uptake when tumours in the head and neck region are expected and state that there is no place for them to reduce uptake in BAT.

Technical factors affecting quantification

At the time of data acquisition, several factors influence quantification. These include scan acquisition parameters (such as acquisition mode, scan duration, bed overlap and administered FDG activity), time frame duration in dynamic acquisitions, factors related to attenuation correction (such as motion and the use of CT contrast agents) and other forms of data correction. Most of these settings are based on the performance of the scanner and should be measured according to the NEMA (National Electrical Manufacturers Association) NU2–2007 standard [42] (Table 2).

Table 2 NEMA properties of currently commercially available modern PET/CT scanners

Acquisition parameters

The choices of acquisition parameters are closely interrelated and are based on trade-offs between signal to noise ratio (SNR) and radiation safety, financial issues (e.g. patient throughput, costs of FDG), patient comfort and logistics. Increasing the sensitivity of the PET scanner by using 3-D acquisition (instead of 2-D acquisition using lead or tungsten septa between crystal rings), increasing scan duration, enlarging bed overlap and raising the administered FDG activity (below the amount that results in count rates exceeding the maximum count rate capabilities of the scanner) all improve SNR. Removing the interplane septa from the PET scanner (i.e. changing from 2-D to 3-D mode) typically leads to a four- to eightfold increase in sensitivity but also increases the number of detected scattered and random photons. Generally, the peak noise equivalent count rate (NEC) is obtained at a lower activity concentration in 3-D than in 2-D as expected from the behaviour of the (increased) true coincidence rate. As a result the SNR (for a uniform cylinder \( SNR \propto \sqrt {NEC} \) [43]) will be higher when operating in 3-D versus 2-D mode for the low activity concentration range. Recommendations for administered FDG activity therefore should be based on acquisition mode (2-D or 3-D), scan duration and bed overlap, rather than on pre-determined diagnostic reference levels [12]. The NCI guidelines recommend a total administered FDG activity of 5.18–7.77 MBq∙kg−1 [13], and the NEDPAS recommendations advise 27.5 MBq∙min∙kg−1 (for 2-D and ≤25% bed overlap), 13.8 MBq∙min∙kg−1 (for 3-D and ≤25% bed overlap) or 6.9 MBq∙min∙kg−1 (for 3-D and 50% bed overlap). For a 3-D acquisition of a 70-kg patient with 4 min per bed position (with 25% bed overlap), it recommends approximately 242 MBq (±10%) to be administered, which corresponds to an effective absorbed dose of 4.6 mSv of the FDG (0.019 mSv∙MBq−1 [44]). 
Reducing scan duration therefore necessitates increased administered FDG activity to maintain SNR. Increasing acquisition time effectively maintains the quality (SNR) of FDG PET scans of heavier patients, in agreement with the hypothesis that increasing body weight increases the fractions of random and scatter coincidences [45, 46]. In one study, increasing the administered FDG activity did not improve SNR, which was explained by probable saturation of the count rate capability of the equipment [47].
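The NEDPAS recommendation quoted above can be reproduced with a short calculation. This is a sketch for illustration only: the function names are hypothetical, the factors (27.5, 13.8 and 6.9 MBq∙min∙kg−1) and the dose coefficient (0.019 mSv∙MBq−1) are the values cited in the text, and combinations not listed in the text are deliberately rejected rather than guessed.

```python
def recommended_activity_mbq(weight_kg, minutes_per_bed, mode="3D", bed_overlap_percent=25):
    """Administered FDG activity from the factors quoted in the text (hypothetical helper)."""
    if mode == "2D" and bed_overlap_percent <= 25:
        factor = 27.5  # MBq*min/kg
    elif mode == "3D" and bed_overlap_percent <= 25:
        factor = 13.8
    elif mode == "3D" and bed_overlap_percent == 50:
        factor = 6.9
    else:
        # Combination not covered by the values quoted in the text.
        raise ValueError("no factor quoted for this mode and bed overlap")
    return factor * weight_kg / minutes_per_bed


def effective_dose_msv(activity_mbq, coefficient_msv_per_mbq=0.019):
    """Effective dose from FDG using the coefficient cited in the text."""
    return activity_mbq * coefficient_msv_per_mbq


activity = recommended_activity_mbq(70.0, 4.0, mode="3D", bed_overlap_percent=25)
print(round(activity, 1), round(effective_dose_msv(activity), 1))
```

For the 70-kg patient scanned in 3-D at 4 min per bed position this gives roughly 242 MBq and 4.6 mSv, in line with the numbers stated above; halving the time per bed position doubles the recommended activity.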

Time frame duration

For dynamic acquisition, activity concentration changes are largest during the first minutes of acquisition (bolus transit) but depend on the rate of FDG infusion, distribution and clearance. Therefore, when partitioning list-mode data or using pre-defined time frames, one should consider a high temporal resolution for this period, which is especially important for pharmacokinetic analysis of two-compartment models. Improving temporal resolution inherently reduces the statistical precision per time frame, since the (approximate) Poisson distribution of count statistics (for prompt coincidences) dictates a relative standard deviation proportional to the inverse square root of the observed counts. As the choice of framing and the duration of FDG infusion are related, it is impossible to give general recommendations. Generally, for scanners with low sensitivity, time frame durations should be longer and consequently the rate of FDG infusion slower. The choices of different groups using dynamic acquisition for oncological purposes can be found elsewhere [48–50] (Fig. 1).
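The trade-off between frame duration and count statistics can be made concrete with a small calculation. The sketch below assumes pure Poisson statistics and a constant count rate within a frame; the function names and the count rate used in the example are hypothetical.

```python
import math


def relative_sd(counts):
    """Relative standard deviation of a Poisson-distributed count: 1 / sqrt(N)."""
    return 1.0 / math.sqrt(counts)


def frame_relative_sd(count_rate_cps, frame_duration_s):
    """Relative standard deviation for one time frame at a constant count rate."""
    return relative_sd(count_rate_cps * frame_duration_s)


# Illustrative count rate of 1000 counts per second.
for duration in (5.0, 20.0, 60.0):
    print(duration, frame_relative_sd(1000.0, duration))
```

Shortening a frame by a factor of four doubles the relative standard deviation, which is why short early frames are only worthwhile when scanner sensitivity and infusion rate provide enough counts.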

Fig. 1 Dynamic framing duration in three groups [48–50] performing dynamic oncological FDG PET for pharmacokinetic two-compartment modelling. Abbreviations are explained in Table 2. *Duration of a bolus is a few seconds

Attenuation correction

Correction for photon attenuation is necessary since positron annihilation photons interact with surrounding tissues (mainly Compton scattering, to a lesser extent photoelectric effect). Before the introduction of combined PET/CT, attenuation correction was applied by transmission imaging of the subject with an external rotating isotope source, either positron emitters (e.g. 68Ga using its mother isotope 68Ge) or single photon emitters (e.g. 57Co or 137mBa using its mother isotope 137Cs). Since photon attenuation is dependent on the photon energy, the latter two require scaling to obtain transmission sinograms adapted to 511 keV photons. The attenuation sinogram, the ratio of a transmission scan and a scan without an object in the FOV (blank scan), is used to correct the emission sinogram, which is subsequently reconstructed to an image. Since the introduction of PET/CT, the spatial distribution of the linear attenuation coefficients can be obtained using the CT scanner. Advantages of CT AC are: reduction in total scan time, lower noise in the transmission sinograms and no need to replace isotope sources because of decay. Finally, CT transmission data can be acquired after tracer injection by choosing the energy window around the CT photon peak to exclude the 511 keV photons (radionuclide-based transmission scans suffer from contamination by emission photons unless the transmission data are acquired before the PET tracer is administered to the patient). CT AC, however, leads to extra challenges: the use of polyenergetic Bremsstrahlung photons with energy much lower than 511 keV (X-ray tube potential difference usually ∼130 kV leading to photon energy ranging from 40 to 130 keV, effective energy ∼70–80 keV), artefacts due to spatial mismatches with PET or metal objects, effects of CT-enhancing contrast and artefacts due to truncation and beam hardening [51].

Conversion of the X-ray to 511 keV linear attenuation coefficients can be performed by either segmentation, scaling or dual X-ray imaging. By segmentation, regions of different material types are identified for which linear attenuation coefficients for 511 keV photons are known. Linear scaling of X-ray linear attenuation coefficients can be performed as long as Compton scattering is the major determinant of attenuation. However, at CT photon energies the photoelectric effect plays a significant role for high atomic number elements (e.g. calcium in bone, iodine or barium in contrast agents with atomic numbers of 20, 53 and 56, respectively) in which case hybrid or bilinear scaling is appropriate [51]. Finally, linear attenuation coefficients can be measured at two tube potential differences, enabling the separation of attenuation due to the Compton scattering and the photoelectric effect. In practice, no significant difference in quantification of malignant lesions is found comparing CT-based attenuation correction with segmented AC (SAC) using transmission imaging [28] or between measured (MAC) and segmented (SAC) attenuation correction using transmission imaging [29].
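The bilinear scaling mentioned above can be sketched as a piecewise linear mapping from CT numbers (Hounsfield units, HU) to linear attenuation coefficients at 511 keV. This is an illustration under stated assumptions, not a vendor implementation: the break point at 0 HU and the slope used above water are illustrative values chosen here (in practice they depend on the tube potential), and the function name is hypothetical. The value of about 0.096 cm−1 for water at 511 keV is the commonly quoted figure.

```python
MU_WATER_511 = 0.096  # cm^-1, commonly quoted value for water at 511 keV
BONE_SLOPE = 5.1e-5   # cm^-1 per HU above 0 HU; illustrative assumption, tube-voltage dependent


def hu_to_mu_511(hu):
    """Bilinear conversion of a CT number to a 511 keV attenuation coefficient (sketch)."""
    if hu <= 0:
        # Air-water mixtures: scale linearly from 0 at -1000 HU to water at 0 HU.
        return max(0.0, MU_WATER_511 * (1.0 + hu / 1000.0))
    # Water-bone mixtures: shallower slope, because the photoelectric effect
    # that raises CT numbers contributes little at 511 keV.
    return MU_WATER_511 + BONE_SLOPE * hu


for hu in (-1000, -500, 0, 500, 1000):
    print(hu, round(hu_to_mu_511(hu), 4))
```

A single linear scale through air and water would overestimate attenuation in bone and in contrast-enhanced tissue, which is the reason for the second, shallower segment.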

CT artefacts due to metal objects and attenuation due to iodine-containing CT contrast agents might influence lesion quantification. Multiple studies have found a negligible change of the SUV in a variety of malignancies, varying from +2.8% to +4% [52–55]. This small effect surprisingly disappears after chemotherapy [52]. The SUVmean significantly increases in the (normal) aorta (+15%), kidneys (+13%), liver (+11%), spleen (+10%) and inferior caval vein (+12%) [52]. This might be of relevance when the plasma time-activity concentration curve for pharmacokinetic analysis is image derived (IDIF): a positive bias in the input function leads to underestimation of MRglc. Around metal prostheses, CT AC leads to an underestimation of FDG activity concentration by <6% [28].

Finally, truncation artefacts, due to a smaller transaxial FOV of CT than that of PET, lead to incorrect attenuation correction if there are tissues outside the CT FOV. Therefore, it is recommended to scan a patient with arms up except for head and neck imaging. Usually the degree of truncation is small and therefore no major effects on quantification are expected. Lowering the arms also causes beam hardening (structures of higher density preferentially attenuate the low-energy photons of the polyenergetic X-ray beam) and scatter-induced artefacts. These can influence quantification by up to 11–15% [51].

Other data corrections

Apart from attenuation correction, other corrections must be applied before an image can be quantified. These are: normalization, correction for random coincidences, correction for scattered radiation, correction for dead time and cross-calibration with a dose calibrator or well counter.

Remaining differences in detection sensitivity that have not been corrected in the setup procedure are corrected by normalization. During normalization all detectors are exposed to the same amount of radiation by a rotating source or a cylinder with a uniform activity concentration. The measured counts of each detector pair (line of response, LOR), which should all be equal, yield normalization factors, which are used to scale the number of counts of each detector pair to correct the emission sinogram. Using 3-D acquisition poses a new challenge for this, since the number of LORs is high (order of 10^8); therefore, to obtain enough counts per LOR, long normalization procedures seem necessary. For this reason modified methods have been developed. Instead of using the acquired number of counts in each LOR separately, a factorization into different components is made (component-based normalization). These components are the individual crystal detection efficiencies, geometrical factors which account for differences in the distance between detectors and their exposed face and the position of crystals in a detector block [56, 57].

Random coincidences arise when photons of two unrelated positron annihilations are detected within the coincidence timing window and are recorded as a single pair. This leads to incorrectly positioned LORs and thus adds a relatively uniform background to the reconstructed images. An estimate of the random coincidence rate can be obtained by delaying the coincidence timing window by an interval that is large relative to its width. Delayed coincidences cannot be true or scattered events, since photons of the same positron annihilation will always be detected within a few nanoseconds of each other, but the rate of random coincidences will be the same in the delayed and the original window. This number can be subtracted in real time from the total number of coincidences for the detector pair to correct for randoms. This, however, will increase the statistical noise level of the corrected images, since the variance of the number of random counts is added to the variance of the uncorrected counts. Other methods use ‘smoothed delays’, in which the delayed coincidences are acquired in a separate sinogram that is smoothed afterwards, or estimations based on singles rates [58].
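The noise penalty of delayed-window subtraction can be illustrated numerically. The sketch below assumes independent Poisson counts, so that variances equal means and add on subtraction. It also uses the commonly quoted definition NEC = T² / (T + S + kR), with k = 2 for direct subtraction of delayed coincidences and k = 1 for an essentially noiseless randoms estimate; this formula is not given in the text and the function names are hypothetical.

```python
def variance_after_delayed_subtraction(trues, scatter, randoms):
    """Variance of (prompts - delays) for independent Poisson counts."""
    prompts = trues + scatter + randoms  # variance of prompts equals their mean
    delays = randoms                     # the delayed window contains only randoms
    return prompts + delays              # variances add when subtracting


def noise_equivalent_counts(trues, scatter, randoms, k=2.0):
    """Commonly used NEC definition; k=2 for delayed subtraction, k=1 for noiseless randoms."""
    return trues ** 2 / (trues + scatter + k * randoms)


# Illustrative counts for one acquisition.
t, s, r = 1000.0, 500.0, 500.0
print(variance_after_delayed_subtraction(t, s, r))
print(noise_equivalent_counts(t, s, r, k=2.0), noise_equivalent_counts(t, s, r, k=1.0))
```

In this example a low-noise randoms estimate (for instance smoothed delays or a singles-based estimate) raises the NEC from 400 to 500, which shows why such methods are attractive.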

One or both photons arising from one positron annihilation might be Compton scattered by the tissue within the gantry, leading to mispositioned LORs. This leads to a hazy background activity concentration, generally highest in the centre of the image. The percentage of scattered events detected in 3-D PET might approach 60–70% of all coincidences. The attenuation data can be used to estimate the number of scattered photons, as for 511 keV photons virtually all attenuation is due to the Compton effect [51]. Another method is extrapolation of the projection profiles immediately outside the object (determined from the attenuation data), which only represent scattered photons as almost no positron annihilations occur in air. This scatter distribution profile can be subtracted from the projections prior to image reconstruction [59]. A study of cerebral FDG PET describes that all the pharmacokinetic rate constants of glucose metabolism are overestimated by up to 10–30% due to photon scatter (in decreasing magnitude: K1, Ki, k2, k3 and k4). MRglc is 12–30% higher when no scatter correction is applied [60].

At high count rates, the time the system needs to recover after detecting a photon leads to count losses (dead time). Systems can either be non-paralysable (i.e. any event within the dead time will not be counted) or paralysable (i.e. such an event will, in addition, restart the dead time). Empirical dead time models can be used in which the observed count rate as a function of radioactivity concentration is measured for a range of object sizes at different energy window widths, but other methods are under investigation [61].
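The two idealized dead time behaviours can be compared with the standard textbook models, in which the observed rate m depends on the true rate n and the dead time τ as m = n / (1 + nτ) for a non-paralysable system and m = n ∙ exp(−nτ) for a paralysable system. These formulas are not given in the text, and real scanners are usually described by empirical models as noted above; the function names and the dead time value are hypothetical.

```python
import math


def observed_nonparalysable(true_rate, tau):
    """Observed rate when events during the dead time are simply lost."""
    return true_rate / (1.0 + true_rate * tau)


def observed_paralysable(true_rate, tau):
    """Observed rate when events during the dead time also restart it."""
    return true_rate * math.exp(-true_rate * tau)


def correct_nonparalysable(observed_rate, tau):
    """Invert the non-paralysable model to recover the true rate."""
    return observed_rate / (1.0 - observed_rate * tau)


tau = 1e-6  # s, illustrative dead time
for n in (1e4, 1e5, 1e6, 2e6):
    print(n, observed_nonparalysable(n, tau), observed_paralysable(n, tau))
```

The non-paralysable rate saturates at 1/τ, whereas the paralysable rate peaks at n = 1/τ and then falls, so that a given observed rate can correspond to two different true rates.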

If all previously mentioned corrections are applied, the number of counts per voxel in the reconstructed images is directly proportional to the activity concentration of that voxel. Calibration to absolute concentrations can be performed by scanning e.g. a cylinder containing a uniform solution of known activity concentration. The obtained calibration factor can be used for absolute quantification of the images.

It can be concluded that standardization of PET acquisition is highly important. Images should be acquired in a normoglycaemic (fasting) state. When insulin is used to reverse hyperglycaemia, it should be injected at least 1 h, but preferably longer, before the FDG. The distribution period should be held within narrow limits (55–65 min). The effects of prehydration and diuretics have led to recommendations advocating standardization. For dynamic acquisition, however, the practicability of forced diuresis should be considered. Patient motion leads to underestimation of the measured SUV or MRglc and should therefore be prevented. To prevent muscle uptake and uptake in BAT, the waiting room should be kept warm and the patient should be instructed to minimize exercise. The effect of benzodiazepines is doubtful and recent literature suggests the use of beta-blocking agents when uptake in BAT interferes with interpretation of the images.

The administered FDG activity should be standardized, dependent on acquisition method and patient body weight and kept within narrow limits (<10%). When using CT attenuation correction, quantification in areas around (metal) artefacts should not be performed. Likewise, quantification of lesions in cases of spatial mismatch between PET and CT should not be performed. Even though the effect of contrast agents on SUV may be small, the bias introduced in the blood pool used for pharmacokinetic modelling is large and therefore its use is discouraged. Whenever possible, acquisition should be performed with the arms outside the FOV to prevent effects of truncation, beam hardening and increased photon scattering. For reliable quantification, the scanner should be normalized, (cross-) calibrated and corrections for photon attenuation, randoms, scatter and dead time should be performed.

Tomographic reconstruction

Analytical algorithms (e.g. filtered backprojection, FBP) are almost completely replaced by iterative statistical algorithms (e.g. ordered subset expectation maximization, OSEM) for tomographic reconstruction of acquired coincidence events to quantifiable images. Iterative reconstruction has a number of potential advantages that make it attractive in comparison with analytical methods. Analytical algorithms have the intrinsic, limiting assumption that measured data are perfectly consistent with the object, which is never true in practice due to noise and other physical factors (e.g. attenuation). In contrast, iterative algorithms can incorporate a priori information such as the statistical distribution of the coincidences and position-dependent spatial resolution. Important adjustable parameters for tomographic reconstruction are the matrix size (the number of voxels in the transaxial plane of the reconstructed images) and the reconstruction smoothing filter. For iterative algorithms also the number of iterations (the number of times the estimate of the real object is updated) and the number of subsets (the number of projections updated simultaneously in each iteration, in OSEM iterative reconstruction only) need to be defined.
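The roles of iterations and subsets can be seen in a toy implementation of the OSEM update. This is a minimal sketch in pure Python for a tiny system matrix, without attenuation, scatter, randoms or resolution modelling; the function name, the interleaved choice of subsets and the example data are assumptions made for illustration. With one subset it reduces to the maximum likelihood expectation maximization (MLEM) algorithm.

```python
def osem(system_matrix, measured, n_iterations=10, n_subsets=1):
    """Toy OSEM reconstruction; system_matrix[i][j] links voxel j to projection bin i."""
    n_bins = len(system_matrix)
    n_voxels = len(system_matrix[0])
    estimate = [1.0] * n_voxels  # uniform, strictly positive start image
    # Interleaved subsets of projection bins.
    subsets = [list(range(s, n_bins, n_subsets)) for s in range(n_subsets)]
    for _ in range(n_iterations):
        for subset in subsets:
            # Forward projection of the current estimate for the bins in this subset.
            projected = {
                i: sum(a * x for a, x in zip(system_matrix[i], estimate))
                for i in subset
            }
            for j in range(n_voxels):
                sensitivity = sum(system_matrix[i][j] for i in subset)
                if sensitivity == 0.0:
                    continue  # voxel not seen by this subset
                # Backprojection of the ratio of measured to projected counts.
                backprojected = sum(
                    system_matrix[i][j] * measured[i] / projected[i]
                    for i in subset
                    if projected[i] > 0.0
                )
                estimate[j] *= backprojected / sensitivity
    return estimate


# Two voxels seen by three projection bins; noiseless data from the object [2, 6].
matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
data = [2.0, 6.0, 8.0]
print(osem(matrix, data, n_iterations=2))
print(osem(matrix, data, n_iterations=50))
```

With noiseless, consistent data the estimate approaches the true object as the number of iterations grows; with noisy data, additional iterations increasingly fit the noise, which is why the number of iterations and the smoothing filter have to be chosen together.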

Analytical versus iterative reconstruction

Studies comparing the effect of FBP and OSEM on quantification of glucose metabolism [62–64] all report higher SUVs for OSEM than for FBP. This might not only be due to the reconstruction algorithm but might also be caused by different attenuation correction techniques (SAC with OSEM vs MAC with FBP) and by different reconstruction filters [63, 64]. Similar [65] or slightly lower (2.3%) [66] SUVmean values for different tissues are found with OSEM compared to FBP when the same method of attenuation correction is used. This negligible effect might be due to the higher noise levels in FBP reconstructed images, leading to different ROIs defined on FBP compared to OSEM reconstructed images: when the same ROIs are used on both images, no significant differences are found [11]. Apart from noise, a difference in resolution may also cause the dissimilarity in quantification: higher uptake values and MRglc are found in tumours (+14%), brain (+2 to +4%) and heart (+15 to +21%) for OSEM than for FBP, a difference that is almost completely removed by equalizing the image resolution by smoothing with a 5-mm FWHM Gaussian kernel [67].

When specifically looking at the IDIF of the left ventricle and ascending aorta, good agreement between OSEM and FBP is observed for the first 5 min of the scan, but for the last 30 min of a 1-h scan, OSEM-derived IDIFs result in 30% higher activity concentrations for the ascending aorta compared to those derived by FBP [67]. It was concluded that OSEM causes bias in regions located within a hotter background, which is especially relevant for determination of an IDIF. Activity concentrations of IDIFs in the ascending or descending aorta of FBP reconstructed images are within 5% of arterially sampled values, leading to similar results for Patlak MRglc using either this IDIF or arterial sampling [48, 68].

Parameters of iterative reconstruction

The number of subsets has only a small effect on the SUV in phantom experiments when the product of iterations and subsets is kept constant. A low number of iterations (1 or 2) results in poor recovery of tumour activity concentrations. Further increase of the number of iterations does not improve the accuracy of quantification of glucose metabolism of lesions with an SUV higher than 5, but mainly results in an increase of image noise. When the SUV is lower than 5, large variability in SUV is seen as a function of the number of iterations [29]. In a study of 50 oncological patients, the SUVmean in images reconstructed with 28 subsets and a varying number of iterations was systematically increased. This effect was very small after 5 iterations (<1% change between 5 and 40 iterations) [7].

Image matrix size influences both noise (smaller number of counts per voxel in larger matrices) and spatial resolution in phantom experiments [8]. The recovery of the activity concentration in the spheres of a phantom is better when the matrix size is increased from 128∙128 (voxel size ∼5∙5 mm) to 256∙256 (voxel size ∼2.5∙2.5 mm). This dependency is smaller when the smoothing kernel is larger (8 vs 5 mm full-width at half-maximum, FWHM). This can be explained since the image resolution (5 or 8 mm FWHM) is not at least twice the size of the voxel size (violation of the Nyquist principle), which can be solved by increasing the matrix to 256∙256.
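The sampling argument above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the 640-mm field of view is an assumption inferred from the quoted voxel sizes (128 × ∼5 mm), and the function names are hypothetical.

```python
def voxel_size_mm(fov_mm, matrix):
    """In-plane voxel size for a square matrix covering the field of view."""
    return fov_mm / matrix


def satisfies_sampling(fwhm_mm, voxel_mm):
    """True when the image resolution (FWHM) spans at least two voxels."""
    return fwhm_mm >= 2.0 * voxel_mm


FOV_MM = 640.0  # assumption: inferred from 128 voxels of about 5 mm

checks = {}
for matrix in (128, 256):
    voxel = voxel_size_mm(FOV_MM, matrix)
    for fwhm in (5.0, 8.0):
        checks[(matrix, fwhm)] = satisfies_sampling(fwhm, voxel)
        print(matrix, voxel, fwhm, checks[(matrix, fwhm)])
```

Under these assumptions neither kernel meets the criterion on the 128∙128 matrix, whereas both do on the 256∙256 matrix.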

It can be concluded that iterative reconstruction can be used for quantification in oncology. For pharmacokinetic analysis of two-compartment models, the IDIF of the left ventricle seems to be sensitive to spill-in and the use of the aorta is preferred. Care should be taken not to overiterate, which only adds noise, which is especially cumbersome in pharmacokinetic rate constant estimation. Too few iterations however will lead to loss of high spatial frequency features, such as resolution and heterogeneity. The matrix size should be chosen not to violate the Nyquist criterion.

ROI definition

FDG uptake or metabolism is determined in an ROI, which can be the hottest voxel within the lesion (ROImax), but can also be based on an absolute threshold (e.g. an SUV value of 4.0: ROI4.0), on a relative threshold (e.g. >50% of maximum voxel value within the ROI: ROI50%), on a manually placed fixed volume (e.g. 1 ml sphere: ROIsphere:1 ml) or adaptive (e.g. relative threshold level: ROIRTL). Further refinements can be introduced such as applying a relative threshold of the background subtracted maximum voxel value [e.g. >50%∙(maximum − background) above background: ROI50%(B+max)]. The best ROI is dependent on its goal: when used to quantify tumour FDG uptake or metabolism, the ROI that yields the best correlation with clinical outcome is preferred. In other situations (e.g. radiotherapy target planning), the exact volume and position of the ROI are more important. Since all but the fixed volume ROI are dependent on the maximum voxel value, the shape and size of these ROIs are dependent on all earlier mentioned factors influencing the SNR.
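To make the threshold definitions concrete, the sketch below applies them to a flat list of voxel SUVs. It is an illustration only: the voxel values and function names are hypothetical, and real ROI tools also enforce spatial connectivity, which is ignored here. Note that a threshold of 50% of (maximum − background) above background equals 50% of (background + maximum), which matches the ROI50%(B+max) label.

```python
def roi_mean(voxels, threshold):
    """Mean of all voxels strictly above the threshold; None if no voxel qualifies."""
    selected = [v for v in voxels if v > threshold]
    return sum(selected) / len(selected) if selected else None


def roi_summary(voxels, background, absolute=4.0, fraction=0.5):
    """SUV estimates for several of the ROI definitions described in the text."""
    vmax = max(voxels)
    return {
        "ROImax": vmax,
        "ROIabs": roi_mean(voxels, absolute),
        "ROIrel": roi_mean(voxels, fraction * vmax),
        # background-corrected relative threshold
        "ROIrel_bg": roi_mean(voxels, background + fraction * (vmax - background)),
    }


# Hypothetical voxel SUVs in a box around a lesion; background SUV assumed to be 1.0.
voxels = [1.0, 1.2, 2.0, 3.0, 4.5, 5.2, 8.0, 10.0]
summary = roi_summary(voxels, background=1.0)
print(summary)
```

In this toy example the background-corrected threshold (5.5) excludes one more voxel than the plain 50% threshold (5.0), so its mean lies closer to the maximum voxel value.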

The influence of the ROI definition on the accuracy of the SUV was determined in a phantom study comparing ROI50%, ROI70%, ROI50%(B+max), ROImax and ROIsquare:15 mm [8]. As expected, the recovery coefficient increases for all ROI definitions as a function of the lesion size. There is a strong dependency of SUV on SNR, with a high SNR leading towards a positive bias for the maximum voxel value and thus overestimation of SUVs calculated with an ROI dependent on this maximum voxel value (all but ROIsquare:15 mm). The effects of the ROI method on the accuracy of SUV determination are trivial: ROI70% and ROI50% are about 15 and 30% lower than ROImax, respectively, and ROI50%(B+max) is in between: close to ROI50% for high tumour to background ratios and close to ROI70% for a low tumour to background ratio. Overall, ROI50% seems to be most accurate for high-resolution (noisy) data and ROImax seems to be most accurate for smoothed (low-noise) data. The fixed volume ROI (ROIsquare:15 mm) performs worst since it includes a significant number of non-tumour voxels, especially in smaller lesions [8].

In an ROI definition study of response assessment of lung cancer comparing ROImanual, ROImax, ROIcircle:15 mm, ROI75% and ROI50% it is mentioned that ROIs50% in lesions with low uptake are often discarded on post-chemotherapy scans because of inclusion of non-tumour tissue [11]. Nevertheless, an excellent test-retest reproducibility of the volume of ROI50% on 2 consecutive days is reported (ICC = 0.99). The fixed volume-based ROI (ROIcircle:15 mm) shows best reproducibility with respect to SUVmean (ICC = 0.95), but the reproducibility of ROI50% is also very high (ICC = 0.91).

Due to the partial volume effect (PVE [69]), the isocontour level for proper whole-lesion delineation is lesion size dependent. Previously, in phantom measurements the exponential relation between lesion volume and threshold (percentage of maximum value) was determined for five different contrast levels and used on metastatic lung lesions yielding highly correlated ROI and CT volumes (correlation coefficient: 0.999) [70]. A more sophisticated approach was promoted later in which a 3-D sphere was convolved with a symmetric trivariate Gaussian point spread function [71]. This equation for background-subtracted relative threshold level (RTL) as a function of tumour radius and image resolution is independent of the tumour to background ratio and is used iteratively on PET data until the measured PET volume and threshold match. In patients the technique seems feasible, and the dimensions obtained are similar to those found at pathological examination of liver metastases.

In conclusion, a number of methods for definition of the region of interest are described in the literature. The maximum voxel value is preferable in data with limited noise, does not require specialized algorithms and does not suffer from interobserver variability. A threshold-based ROI can provide reproducible quantification of glucose metabolism with better accuracy in noisy images. The NEDPAS guidelines recommend the use of an ROI41%(B+max), but to increase the (background-corrected) threshold when no meaningful tumour volume definitions are provided in lesions with a low signal to background ratio. They stress that the maximal voxel value should always be noted [14].

Quantification

Quantification of FDG uptake or metabolism of the tissue within the ROI can be performed on several levels of complexity: semiquantitatively or quantitatively (from dynamic FDG PET acquisitions) using pharmacokinetic analysis of two-compartment models.

Semiquantitative methods

The simplest method to quantify tracer uptake is by calculating tumour to non-tumour ratios (T/N) using a reference tissue, which may be difficult to define or may have limited uptake (resulting in a relatively high noise level). Furthermore, uptake in this reference tissue may be influenced by factors such as therapy, and therefore T/N ratios can change without change in tumour biology.

Since tracer uptake is directly related to body volume of the patient and the AA present in the subject at start time of the scan, measured uptake (Bq∙cm−3) should be normalized for these factors. This results in an SUV, which in older papers is referred to as differential absorption (or uptake) ratio (DAR, DUR). The least complicated normalization is by AA and patient body weight (SUVBW). For uniform tracer distribution the SUV is 1.0 and an SUV >1 implies tracer accumulation. As sceptically reviewed by Keyes [72], body composition and habitus are a source of variability since fat has a much lower uptake of FDG than other tissues. Therefore, other definitions for volume of distribution of FDG are proposed. It is observed that SUVBW is still positively correlated to body weight for normal tissues, leading to SUVBW values in heavy patients up to twice those of normal-weight patients. However, the SUVLBM is weight independent [73]. Others [74–76] provide evidence that the overestimation of SUVBW of liver tissue in heavy oncological patients can be prevented using BSA (body surface area) as normalization factor. See Table 3 for definition of various types of SUV.

Table 3 Equations to normalize the SUV

Another factor related to SUV is the plasma glucose level. It is unlikely that the effects of variations in glucose level only hold true for hyperglycaemic conditions. The decreased uptake values during hyperglycaemia can be adjusted for by normalizing the SUV for plasma glucose divided by the population average [100 mg∙dl−1 (≈5.6 mmol∙l−1)] [17]. The NEDPAS guidelines [14] support a normalization for glucose concentration based on plasma glucose concentration (mmol∙l−1) divided by the population average (5.0 mmol∙l−1).
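The body-weight and glucose normalizations described above can be sketched as follows. This is an illustrative calculation, not a reference implementation: the function names and patient values are hypothetical, a tissue density of 1 g∙ml−1 is assumed so that the SUV is dimensionless, and the 18F half-life of 109.77 min is used for the optional decay correction of the AA to scan time.

```python
import math

F18_HALF_LIFE_MIN = 109.77  # physical half-life of 18F in minutes


def decay_corrected_activity(aa_bq, minutes_elapsed):
    """Administered activity remaining after physical decay."""
    return aa_bq * math.exp(-math.log(2.0) * minutes_elapsed / F18_HALF_LIFE_MIN)


def suv_bw(conc_bq_per_ml, aa_bq, weight_kg):
    """SUV normalized to body weight, assuming 1 g of tissue per ml."""
    return conc_bq_per_ml * (weight_kg * 1000.0) / aa_bq


def suv_glucose(suv, glucose_mmol_l, reference_mmol_l=5.0):
    """Glucose-normalized SUV; 5.0 mmol/l is the reference value quoted from [14]."""
    return suv * glucose_mmol_l / reference_mmol_l


# Hypothetical patient: 70 kg, 350 MBq present at scan time, lesion at 20 kBq/ml.
suv = suv_bw(20000.0, 350e6, 70.0)
suv_corrected = suv_glucose(suv, 6.0)
print(suv, suv_corrected)
```

A uniform tracer distribution (activity concentration equal to AA divided by body mass in grams) gives an SUV of 1.0 with this definition, in line with the text.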

Quantitative measures

Analysis of two-compartment models by non-linear regression

Dynamic PET studies provide the opportunity to perform pharmacokinetic analysis of two-compartment models of glucose metabolism. In addition to the PET signal [C PET (t)], the tracer concentration in the arterial blood plasma should also be measured [C plasma (t)], which is a drawback compared to static methods in which only the SUV is of interest. Another drawback of dynamic acquisition is that only one FOV (typically ∼15–20 cm) can be taken into account; therefore, in metastasized disease not all lesions can be quantified simultaneously.

Since deoxyglucose-6-PO4, in contrast to glucose-6-PO4, cannot be catabolized further and does not diffuse across cell membranes, metabolism can be simplified to a two-compartment model (Fig. 2) with four rate constants. Tracer kinetic modelling is based on several key assumptions including the tracer principle (i.e. negligible concentration of the tracer), steady-state assumption (i.e. metabolic processes are at steady state during measurement), tissue homogeneity and instantaneous mixing assumption (i.e. homogeneous tracer distribution within each compartment), linearity assumption (i.e. rate constants are independent of tracer concentration: first-order kinetics) and the tracer dynamic assumption (i.e. the tracer behaves similarly to the substance under investigation) [77]. Moreover, it is assumed that the extraction of FDG from the plasma normally is low enough for the delivery of FDG to be independent of blood flow.

Fig. 2

The two-compartment model for FDG catabolism. C plasma (t) the activity concentration of FDG in the blood plasma, C free (t) the intracellular activity concentration of free FDG, C bound (t) the intracellular activity concentration of FDG-6-PO4, K 1 , k 2 , k 3 and k 4 rate constants (see Table 1), C PET (t) the measured PET signal which is a combination of C free (t), C bound (t) and a fraction (shaded area, V b ) of C plasma (t). The dotted line symbolizes the cell membrane

According to the Michaelis-Menten hypothesis, an intermediate complex is formed between the substrate (S) and the transporter or enzyme, which is then converted to the chemical product (P) with release of the transporter or enzyme. The reaction rates of these processes are described by \( {k_{S \to P}} = {V_{\max }} \cdot \left[ S \right] \cdot {\left( {\left[ S \right] + {K_m}} \right)^{ - 1}} \) in which V max is the maximum rate of the reaction and K m (the Michaelis constant) is that concentration of the substrate ([S]) which leads to 0.5∙Vmax. This is clearly a non-linear relation, but it is still possible to use linear tracer compartment models if an alternative substrate (S*) is competing for the transporter/enzyme, the concentration of which is of a much lower value than of S [78, 79]. In this case, the reaction rate of the original substrate is (approximately) unaltered and the reaction rate of the tracer can be described as: \( {k_{S* \to P*}} \cong V_{\max }^* \cdot \left[ {{S^*}} \right] \cdot {K_m} \cdot {\left( {K_m^* \cdot \left( {\left[ S \right] + {K_m}} \right)} \right)^{ - 1}} \) which is a linear function of [S*] as long as \( \left[ {S*} \right] << \left[ S \right] \). Therefore, the two-compartment model with four pharmacokinetic rate constants (Phelps 4K model) can be expressed by the following differential equations [80]:

$$ \begin{gathered} \frac{{d{C_{free}}(t)}}{{dt}} = {K_1} \cdot {C_{plasma}}(t) - \left( {{k_2} + {k_3}} \right) \cdot {C_{free}}(t) + {k_4} \cdot {C_{bound}}(t) \hfill \\ \Rightarrow {C_{free}}(t) = \frac{{{K_1}}}{{{\alpha_2} - {\alpha_1}}} \cdot \left[ {\left( {{k_4} - {\alpha_1}} \right) \cdot {e^{ - {\alpha_1} \cdot t}} + \left( {{\alpha_2} - {k_4}} \right) \cdot {e^{ - {\alpha_2} \cdot t}}} \right] \otimes {C_{plasma}}(t) \hfill \\ \end{gathered} $$
(1)
$$ \frac{{d{C_{bound}}(t)}}{{dt}} = {k_3} \cdot {C_{free}}(t) - {k_4} \cdot {C_{bound}}(t) \Rightarrow {C_{bound}}(t) = \frac{{{K_1} \cdot {k_3}}}{{{\alpha_2} - {\alpha_1}}} \cdot \left( {{e^{ - {\alpha_1} \cdot t}} - {e^{ - {\alpha_2} \cdot t}}} \right) \otimes {C_{plasma}}(t) $$
(2)

Where ‘\( \otimes \)’ stands for the operation of convolution and \( {\alpha_{1,2}} = \frac{1}{2} \cdot \left[ {{k_2} + {k_3} + {k_4} \mp \sqrt {{{\left( {{k_2} + {k_3} + {k_4}} \right)}^2} - 4 \cdot {k_2} \cdot {k_4}} } \right] \).
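As a numerical sanity check on Eqs. 1 and 2, the sketch below evaluates α1,2 and the two impulse response functions (the expressions that are convolved with C plasma (t)). The rate constants are hypothetical illustration values in min−1, not taken from any study.

```python
import math


def alphas(k2, k3, k4):
    """Macro-parameters alpha_1 (minus sign) and alpha_2 (plus sign)."""
    s = k2 + k3 + k4
    root = math.sqrt(s * s - 4.0 * k2 * k4)
    return 0.5 * (s - root), 0.5 * (s + root)


def impulse_responses(K1, k2, k3, k4, t):
    """Impulse responses of the free and bound compartments at time t."""
    a1, a2 = alphas(k2, k3, k4)
    scale = K1 / (a2 - a1)
    free = scale * ((k4 - a1) * math.exp(-a1 * t) + (a2 - k4) * math.exp(-a2 * t))
    bound = scale * k3 * (math.exp(-a1 * t) - math.exp(-a2 * t))
    return free, bound


K1, k2, k3, k4 = 0.1, 0.3, 0.1, 0.01  # hypothetical values
a1, a2 = alphas(k2, k3, k4)
free0, bound0 = impulse_responses(K1, k2, k3, k4, 0.0)
print(a1, a2, free0, bound0)
```

Two properties are easy to verify: α1 + α2 = k2 + k3 + k4 and α1 ∙ α2 = k2 ∙ k4. At t = 0 the free-compartment response equals K1 and the bound-compartment response is zero, as expected for a tracer that has just entered the tissue.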

The sum of the activity concentrations in both compartments plus a fraction of the plasma activity concentration [V b  · C plasma (t), with Vb the blood volume fraction] are measured within the ROI by the dynamic PET acquisition:

$$ {C_{PET}}(t) = \left( {1 - {V_b}} \right) \cdot \left( {{C_{free}}(t) + {C_{bound}}(t)} \right) + {V_b} \cdot {C_{plasma}}(t) $$
(3)

Therefore, when the plasma input function is known, all five free parameters of FDG metabolism (K1, k2, k3, k4 and Vb) can be estimated using non-linear least squares fitting of C PET (t).

The original irreversible model (Sokoloff 3K model) did not incorporate k4 [81] since the rate of hydrolysis of FDG-6-PO4 by glucose-6-phosphatase activity is negligible in mammalian tissues, except for liver tissue [82]. This is verified in further studies [83], which compare the residual sum of squares of fits with and without k4 by the Akaike Information Criterion [84] and Schwarz Criterion [85]. These statistics reward goodness of fit but depreciate the number of free parameters of the model (the lowest value denotes the model that best explains the data with a minimum of free parameters). Moreover, simulation studies caution that a k4 might result from tissue heterogeneity rather than real dephosphorylation [86]. Therefore, currently most studies use the simplified three-rate constant (Sokoloff 3K) model. With k4 = 0, Eqs. 1 and 2 simplify to:

$$ \frac{{d{C_{free}}(t)}}{{dt}} = {K_1} \cdot {C_{plasma}}(t) - \left( {{k_2} + {k_3}} \right) \cdot {C_{free}}(t) \Rightarrow {C_{free}}(t) = {K_1} \cdot {e^{ - \left( {{k_2} + {k_3}} \right) \cdot t}} \otimes {C_{plasma}}(t) $$
(4)
$$ \frac{{d{C_{bound}}(t)}}{{dt}} = {k_3} \cdot {C_{free}}(t) \Rightarrow {C_{bound}}(t) = \frac{{{K_1} \cdot {k_3}}}{{{k_2} + {k_3}}} \cdot \left( {1 - {e^{ - \left( {{k_2} + {k_3}} \right) \cdot t}}} \right) \otimes {C_{plasma}}(t) $$
(5)

The ratio of the phosphorylation rates of FDG (MRFDG) and glucose (MRglc) equals both the ratio of fluxes between the compartments of free and bound substrates (Eq. 5) and the ratio of the Vmax, Km and concentration of both the intracellular free tracer and the natural substrate:

$$ \frac{{M{R_{FDG}}}}{{M{R_{glc}}}} = \frac{{{k_{3,FDG}} \cdot {C_{free,FDG}}(t)}}{{{k_{3,glc}} \cdot {C_{free,glc}}(t)}} = \frac{{{V_{\max, FDG}} \cdot {K_{m,glc}} \cdot {C_{free,FDG}}(t)}}{{{V_{\max, glc}} \cdot {K_{m,FDG}} \cdot {C_{free,glc}}(t)}} $$
(6)

In contrast to analogue tracers (such as FDG), when direct isotopic substitution labelling of glucose is used (e.g. [11C]-glucose), the Vmax and Km for both substrates are essentially the same and Eq. 6 would reduce to a simple ratio of concentrations of FDG and glucose. In the condition where C plasma (t) of glucose and FDG is constant, Eq. 6 can be written as:

$$ \frac{{M{R_{FDG}} \cdot {{\left( {{C_{plasma,FDG}} \cdot F} \right)}^{ - 1}}}}{{M{R_{glc}} \cdot {{\left( {{C_{plasma,glc}} \cdot F} \right)}^{ - 1}}}} = \frac{{{V_{\max, FDG}} \cdot {K_{m,glc}} \cdot {\lambda_{FDG}}}}{{{V_{\max, glc}} \cdot {K_{m,FDG}} \cdot {\lambda_{glc}}}} = L{C_{FDG}} $$
(7)

In which F is the blood flow and λ are the partition coefficients \( \left( {{C_{free}} \cdot {C_{plasma}}^{ - 1}} \right) \) of both the substrates. This ratio is called the lumped constant of FDG (LCFDG), or the steady-state ratio of the net extraction of FDG to that of glucose at constant plasma levels of both substrates [81]. The full operational equation is often written as: \( L{C_{FDG}} = {V_{\max, FDG}} \cdot {K_{m,glc}} \cdot \lambda \cdot {\left( {{V_{\max, glc}} \cdot {K_{m,FDG}} \cdot \varphi } \right)^{ - 1}} \). In this λ denotes the ratio of partition coefficients and ϕ is the fraction of glucose-6-PO4 that continues down the Embden-Meyerhof pathway (i.e. regular glycolysis), which is normally quite close to 1.0. The value of the LCFDG can be determined by simultaneous measurement of MRglc (e.g. by [11C]-1-glucose) and MRFDG [87], independent determination of all six parameters of the LC or by the ratio of fractional arteriovenous differences for the two substrates [81, 88].

In the steady-state condition of the compartment of free intracellular FDG, in which the flux into this compartment is balanced by the flux out \( \left( {{{\left( {d{C_{free}}(t) \cdot d{t^{ - 1}}} \right)}_{FDG}} = 0} \right) \), the MRglc can be estimated from Eqs. 5 and 7:

$$ M{R_{glc}} = \frac{{d{C_{bound,glc}}(t)}}{{dt}} = \frac{{M{R_{FDG}}}}{{{C_{plasma,FDG}}}} \cdot \frac{{{C_{plasma,glc}}}}{{L{C_{FDG}}}} = {\left( {\frac{{{K_1} \cdot {k_3}}}{{{k_2} + {k_3}}}} \right)_{FDG}} \cdot \frac{{{C_{plasma,glc}}}}{{L{C_{FDG}}}} $$
(8)
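Eq. 8 reduces to two multiplications once the rate constants are known, as the following sketch shows. The rate constants, the plasma glucose level and in particular the lumped constant of 1.0 are placeholder assumptions for illustration; the function names are hypothetical.

```python
def net_influx(K1, k2, k3):
    """Net influx constant Ki = K1*k3/(k2+k3), in ml per min per ml of tissue."""
    return K1 * k3 / (k2 + k3)


def mr_glc(Ki, plasma_glucose_umol_ml, lumped_constant):
    """Metabolic rate of glucose (Eq. 8), in umol per min per ml of tissue."""
    return Ki * plasma_glucose_umol_ml / lumped_constant


# Hypothetical values: K1 in ml/min/ml, k2 and k3 in 1/min.
Ki = net_influx(0.1, 0.3, 0.1)
# 5.0 mmol/l equals 5.0 umol/ml; the lumped constant is assumed to be 1.0.
mr = mr_glc(Ki, 5.0, 1.0)
print(Ki, mr)
```

Because the lumped constant enters as a simple divisor, any uncertainty in its value propagates proportionally into MRglc, whereas Ki itself is unaffected.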

In a study with 20 patients with non-small cell lung carcinoma using the Sokoloff (3K) model, it was shown that parameter estimation by non-linear least squares fitting of 30-min dynamic data yielded essentially the same results for K1, k2, k3 and Ki (R 2 = 0.918, 0.937, 0.785 and 0.924, respectively, with mean relative differences varying 9–25%) compared to a 60-min protocol [89].

Primarily, glucose metabolism (and thus the LCFDG) has been determined in the normal (rat) brain [18, 78, 87]. In this field it has been shown that a single LCFDG value is valid only under the conditions in which it was determined. The LCFDG is dependent on the time between tracer injection and measurement, surely changes in any disease with an enzymatic component and probably varies with regional glucose concentrations, ischaemia, hypoglycaemia, the method of anaesthesia, and the age (adult or developing) and species of the subject under investigation. Due to the heterogeneity of neoplasia, these limitations are far greater and for many tumours the LCFDG is not known. As summarized elsewhere, calculations of MRFDG can be used if one is interested in comparative measurements (e.g. metabolic changes over time, during treatment, between a diseased or normal tissue or between different physiological states), assuming the LCFDG of the tissue under investigation remains unchanged. In cases in which the true MRglc needs to be determined from FDG results, one must measure the LCFDG in the particular experimental setup [18]. Studies of the LCFDG in oncology are very limited and reports of determination of the LCFDG of non-glioma malignant tissues are lacking to this date.

Patlak graphical method

Patlak et al. [90] derived a graphical method (often called Patlak analysis, Patlak-Rutland or Gjedde-Patlak plot) that uses linear regression to analyse pharmacokinetics described by any compartment model with at least one irreversible transport step or reaction (‘trapping’ of FDG due to k4 = 0 min−1). It further assumes that all the reversible compartments must be in equilibrium with the plasma, which in practice only occurs when dC plasma (t)/dt is small enough for these tissue compartments to follow. Combining Eqs. 3–5, with \( {K_i} = \frac{{{K_1} \cdot {k_3}}}{{{k_2} + {k_3}}} \) leads to:

$$ \frac{{{C_{PET}}(t)}}{{{C_{plasma}}(t)}} = \left( {{K_i} \cdot \left( {1 - {V_b}} \right)} \right) \cdot \left( {\frac{{\int_0^t {{C_{plasma}}\left( \tau \right)} d\tau }}{{{C_{plasma}}(t)}}} \right) + \left( {\frac{{\left( {1 - {V_b}} \right) \cdot {K_1} \cdot {k_2}}}{{{{\left( {{k_2} + {k_3}} \right)}^2}}} + {V_b}} \right) $$
(9)

Linear regression of the plot \( {C_{PET}}(t) \cdot {C_{plasma}}{(t)^{ - 1}} \) vs \( \int_0^t {{C_{plasma}}\left( \tau \right)} d\tau \cdot {C_{plasma}}{(t)^{ - 1}} \) (“Patlak space” or “funny time”) results in slope: \( {K_i} \cdot \left( {1 - {V_b}} \right) \), from which, with an estimated Vb, MRglc can be computed. Simplification of the problem of solving differential equations by non-linear optimization to an approach amenable to linear regression avoids many problems inherent in the former approach: sensitivity to noise in the time-activity concentration curves, parameter covariance, local minima in the approximate solution to the differential equations and dependence of parameter estimates on starting guesses. As a trade-off only the Ki, but not the individual kinetic parameters, is estimated. An example is provided in Fig. 3.

Fig. 3

Quantitative analysis of a T2 adenocarcinoma of the right superior lung lobe (left top). Right top, time-activity concentration curves. Left bottom, analysis of two-compartment models by both Sokoloff 3K and Phelps 4K model. Right bottom, Patlak graphical analysis
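The regression in Patlak space (Eq. 9) can be sketched with plain linear least squares. The example is deliberately idealized: the plasma concentration is held constant, a condition under which the 3K tissue curve has a closed form and Eq. 9 becomes exact at late times. The rate constants, Vb, sampling grid and start time of the fit are all hypothetical, and the time axis is assumed to start at t = 0.

```python
import math


def cumulative_trapezoid(t, y):
    """Running integral of y over t by the trapezoidal rule (t[0] taken as 0)."""
    out = [0.0]
    for i in range(1, len(t)):
        out.append(out[-1] + 0.5 * (y[i] + y[i - 1]) * (t[i] - t[i - 1]))
    return out


def patlak_fit(t, c_pet, c_plasma, t_start):
    """Slope and intercept of the Patlak plot using samples with t >= t_start."""
    integral = cumulative_trapezoid(t, c_plasma)
    xs, ys = [], []
    for ti, pet, cp, area in zip(t, c_pet, c_plasma, integral):
        if ti >= t_start and cp > 0.0:
            xs.append(area / cp)  # integral of plasma curve over plasma value
            ys.append(pet / cp)   # PET signal over plasma value
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Idealized simulation: constant plasma level, Sokoloff 3K tissue response.
K1, k2, k3, Vb = 0.1, 0.3, 0.1, 0.05
a = k2 + k3
Ki = K1 * k3 / a
t = [float(i) for i in range(61)]  # one sample per minute, 0 to 60 min
c_plasma = [1.0] * len(t)
c_tissue = [K1 / a * (1.0 - math.exp(-a * ti))
            + Ki * (ti - (1.0 - math.exp(-a * ti)) / a) for ti in t]
c_pet = [(1.0 - Vb) * ct + Vb * cp for ct, cp in zip(c_tissue, c_plasma)]

slope, intercept = patlak_fit(t, c_pet, c_plasma, t_start=20.0)
print(slope, Ki * (1.0 - Vb), intercept)
```

With these inputs the fitted slope approaches Ki ∙ (1 − Vb) and the intercept approaches the constant term of Eq. 9. With a realistic, decaying input function the agreement is only approximate and depends on the chosen start time of the fit.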

Linearity in Patlak space is reached in conditions where the plasma FDG concentration is constant, but since C plasma  (t) continues to decrease due to irreversible cellular uptake and renal clearance, this situation is never fully met. In good approximation, onset of linearity is usually attained 10–15 min after the bolus injection. It is shown that MRglc determined by Patlak analysis over different intervals is highly similar compared to the Sokoloff (3K) model (all R 2 ≥ 0.951). ROIs were defined on the last three time frames (45–60 min), which is a limitation of this study; in shortened protocols, tumour ROIs have to be drawn at earlier time points, in which the contrast is lower, leading to different ROIs which likely will produce less accurate MRglc [83]. Another way of shortening acquisition duration is by combination of an initial 10-min dynamic scan with a single static time frame 56–60 min after tracer injection [91]. Highly similar values for Ki between full dynamic and a shortened dynamic acquisition are described (R 2 of non-linear regression = 0.815). In practice the ROIs need to be repositioned since the patient leaves the gantry between both acquisitions.

Adaptations to pharmacokinetic models

Variations to the previously mentioned pharmacokinetic models have been reported frequently [92–98]. A variation to the Sokoloff 3K model with six parameters uses a reference region outside the tumour to account for the normal tissue within the tumour ROI (2 ROIs 6P model). In simulation studies it was found to adequately describe tumour pharmacokinetic rate constants of two-compartment models regardless of the amount of normal tissue within the ROI. A drawback of the technique is that reference tissue must be available [92].

Many variations to the Patlak graphical method are reported [93–98]. Wong et al. [93] describe a technique in which they create three sequential images (each 15 min duration) starting 10 min after FDG injection combined with three arterialized venous samples taken at the mid-time of each scan. This method pre-empts the need for continued sampling and showed Ki to be within 3–4% of the values obtained by regular Patlak analysis. Another variation computes lesion Ki from the mean of the per voxel Ki for which the correlation coefficient of the Patlak plot is above a threshold [94]. This technique was further adapted by defining a 2-D ROI on a summed image of all correlation coefficient-constrained planes containing the lesion. This ROI was propagated over the planes containing the lesion in which the parameters of interest were determined. The authors conclude that this technique (total lesion evaluation method, TLE) is less ROI dependent and also incorporates therapy-related volume changes and is especially suitable for therapy response monitoring [95]. A simplified kinetic method (SKM) is advocated [96], assuming a uniform input function, based on fitting the input function of control patients with a triexponential decay function. The SKM allows MRglc to be calculated from a static image and one late venous blood sample. It was shown in lung cancer patients that the SKM was an improvement over the SUV and approaches the MRglc, calculated by non-linear least squares. A hybrid method of Patlak and SKM shows less bias and variability than either technique alone. For this method six time frames (each 5 min, 25–55 min post-injection) are acquired. By using every other time frame (3 data points) or every third time frame (2 data points) this method can be used to up to three fields of view [97]. The last method mentioned is based on the relation between Ki and SUV, average plasma clearance rate and the initial distribution volume of FDG (Sadato method). Estimated Ki and SUV are compared with Patlak Ki, leading to the conclusion that the estimated, non-invasively determined Ki is a better indicator of tumour uptake than the SUV [98].

Hoekstra et al. reviewed [99] and compared [83] 34 variations of previously mentioned methods to obtain glucose consumption (2 T/N, 12 SUV, 2 Sadato, 2 SKM, 5 Patlak, 10 TLE and one “2 ROI, 6P” variations) on 30 randomly selected dynamic FDG PET scans in 19 lung cancer patients. Since incorporation of k4 did not improve fits, the Sokoloff 3K model was used as gold standard. The reliability of the gold standard was considered high since the test-retest variability (ICC = 0.95) and intra- and interobserver variability (ICC = 0.98 for both) were small. Of the 34 models tested, 10 met the required minimal correlation with the gold standard (R 2 > 0.95). They concluded that the best simplified options are SUVBSA+glucose (40–60 min or 50–60 min post-injection), Sadato method based on BSA and Patlak graphical analysis (10–60 min post-injection). Similarly, Lammertsma et al. [100] pooled the results of three studies comparing methodological aspects of response monitoring in lung [83], breast [101] and gastro-oesophageal [102] cancer (in total 170 FDG PET studies) and show excellent correlation with the Sokoloff 3K model of Patlak MRglc (ICC = 0.98), SKM (ICC = 0.94), SUVBSA+glucose (ICC = 0.91) and SUVBSA (ICC = 0.91). They conclude that although Patlak MRglc has the best correlation with the gold standard, it remains to be proved that these findings are of clinical relevance. Changes found by SUV estimation may still represent a relevant response [103, 104].

Input functions

For all previously described fully quantitative measures of glucose metabolism, the arterial plasma input function [IF or C plasma (t)] of the tumour should be known. Different approaches to obtain this IF have been described in the literature. Since it is impossible to sample the artery directly vascularizing the tumour, the gold standard is serial arterial sampling of a superficial artery (e.g. the radial artery) after which the activity concentration of the plasma derived by centrifugation can be determined [105]. Alternatives to serial arterial sampling with fewer complications [106] or less radiation burden to the personnel are: (arterialized) venous blood sampling [107], use of an IDIF [48, 108], modelling of a population-based IF [109–111] or extraction of IF using mathematical segmentation methods [112].

To overcome the time-dependent ratio of arterial and venous blood activity concentration, the heated hand procedure, which shunts the arterial blood to the venous system, can be performed [107]. In Patlak analysis, the use of arterialized venous blood shows a net effect of ∼10% overestimation of Ki and ∼5% overestimation of MRglc compared to the gold standard of arterial sampling. IDIFs of the ascending (ICC = 0.98) or abdominal (ICC = 0.96) aorta show better Patlak MRglc correlation than IDIFs of the left ventricle (ICC = 0.94) [48, 108], but due to the PVE the recovery of activity concentrations in the aorta is less than one, causing an underestimation of the FDG activity concentration. A left ventricle IDIF shows positive bias, since the PVE caused a hot myocardium to spill-in activity in later time frames [48] leading to an underestimation of MRglc of 16.2–17.5% [113]. Another drawback of the IDIF is that it is a measure of whole blood activity concentration, which is known to be lower than in plasma and not constant over time [107]. Population-based curves reduce the need for blood sampling and moreover can be used in body areas where IDIFs are not feasible. They can be based on averaging normalized blood curves or on fitting to a proposed equation [109, 110]. Patlak-based MRglc obtained by population-based curves overall show high correlation with the gold standard (R 2 > 0.984) [109]. Usage of mathematical tissue segmentation results in whole blood activity concentrations as well [112, 114], but it leads to an image-derived whole blood IF with similar drawbacks as an IDIF.

For estimation of pharmacokinetic rate constants of two-compartment models, the exact timing (time delay) and shape (dispersion) of the IF are needed. When it is measured by sampling [82], this does not necessarily reflect the supply of the tracer to the lesion. The time delay can be corrected for, but the dispersion of the IF results from the impulse response characteristics of the distributing system of the patient and is therefore difficult to predict [79]. These factors complicate the use of pharmacokinetic analysis of two-compartment models and therefore frequently IDIFs of a large blood pool close to the tumour are used [49, 50]. The effect of time delay and dispersion is of negligible relevance for Patlak MRglc estimation, since the C plasma (t) in the Patlak equation is relatively constant in the period of linear regression and the integral of C plasma (t) is almost not affected.

Concluding, for clinical practice and most research into the value of quantification of FDG PET, the SUVmax and SUVs of isocontour-based ROIs are most relevant. They can be performed whole body, pre-empt the need of an IF and can be obtained from static images. In special situations FDG metabolism can be quantified using dynamic acquisition, preferably using IDIFs of the ascending aorta. Even though these result in absolute quantification of FDG influx, they are limited in calculation of ‘real’ MRglc due to the indefinite value of the LCFDG.

Repeated measures studies

Quantitative FDG PET is not only useful as a single measurement (e.g. for treatment stratification), but repeated measurements before, during and after treatment may be used for early response assessment of therapy. For this purpose a new quantification parameter is introduced, defined as the product of the SUV and tumour volume (total lesion glycolysis, TLG) [115]. Even though the ΔTLG is used in a small number of recent studies [116, 117], this response parameter needs further evaluation before it can be recommended for routine use, since it does not perform better than ΔSUV alone in all studies [116].

Measured tumour response is nearly independent of the ROI definition, since most factors contributing to bias and noise cancel out in calculation of the relative change [8, 11]. However, when the tumour volume changes significantly, the PVE can play a role which cannot be cancelled by measuring relative effects [8]. For matter of reproducibility, threshold-based ROIs are recommended for repeated measures studies, with a test-retest reproducibility of the SUV of 1–13 ± 6–12% [11, 118–120], of Ki of 10 ± 8% and of K1, k2 and k3 of 24–42 ± 13–31% [119]. Due to the magnitude of the day-to-day variation, changes in quantification of SUV ± 15–20% (i.e. 2 standard deviations) are within the reproducibility limits of this method of quantification and therefore should be considered as stable values.
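The response parameters discussed in this section amount to simple arithmetic, sketched below. The function names and patient values are hypothetical, and the 20% limit is an assumption taken from the upper end of the 15–20% reproducibility range quoted above.

```python
def delta_percent(baseline, follow_up):
    """Relative change (%) between a baseline and a follow-up measurement."""
    return 100.0 * (follow_up - baseline) / baseline


def classify_change(delta, limit=20.0):
    """Label a change as stable when it lies within the reproducibility limit."""
    if delta <= -limit:
        return "decrease"
    if delta >= limit:
        return "increase"
    return "stable"


def total_lesion_glycolysis(suv_mean, volume_ml):
    """TLG defined as the product of the SUV and the tumour volume."""
    return suv_mean * volume_ml


# Hypothetical lesion: SUV falls from 8.0 to 5.0, volume from 12 to 9 ml.
d_suv = delta_percent(8.0, 5.0)
d_tlg = delta_percent(total_lesion_glycolysis(8.0, 12.0),
                      total_lesion_glycolysis(5.0, 9.0))
print(d_suv, classify_change(d_suv), d_tlg)
```

A fall from 8.0 to 7.0 (−12.5%) would be labelled stable under this limit, which illustrates why small SUV changes should not be read as response.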

The robustness of variations of the ΔSUV (%) was shown in gastric carcinoma patients neoadjuvantly treated with cisplatin-based chemotherapy [10]. The authors conclude that in gastric carcinomas the prediction of response to chemotherapy on the basis of relative tumour SUV changes is not essentially influenced by any of the methodological variations investigated (incubation period 40 or 90 min, reconstruction method FBP or OSEM and various SUV normalizations). However, phantom experiments reveal a difference between institutions for SUV quantification up to 30%, which can be improved by calibration [29]. This is especially important when baseline and follow-up scans are performed in different settings or institutions, underlining the need for standardization [14].

Conclusion

Apart from hardware issues and sources of error, many methodological and biological factors influence quantification in FDG PET. For multicentre investigations these parameters must therefore be standardized and intercentre calibration has to be performed. In general, the relative simplicity of semiquantitative quantification by SUV seems to outweigh its drawbacks, provided that the process from acquisition to quantification is standardized. Repeated measures studies seem less dependent on most factors influencing quantification, as these cancel out in calculation of relative treatment effects, provided that the methodology at the baseline scan is repeated for the follow-up scan. Pharmacokinetic quantification is a sophisticated and rather complicated method and is therefore mainly applied in a research setting.