Background

Since its introduction as a positron emission tomography (PET) tracer in the early 1970s, [18F]-fluorodeoxyglucose (18F-FDG) has been widely used and now accounts for more than 96% of PET studies worldwide [1]. Although 18F-FDG is mainly used as a radiotracer in oncology, it is not a tumor-specific PET tracer, since its accumulation essentially reflects elevated glucose uptake [2]. Many malignant lesions, in fact, are poorly imaged with 18F-FDG: some because of their slow growth or low metabolic activity, and others because of their location within highly metabolic organs such as the brain and liver [3]. Various alternative PET tracers have been synthesized and evaluated over the last decade to overcome the limitations of 18F-FDG, including tracers based on amino acid metabolism such as l-3-18F-α-methyl tyrosine (18F-FAMT) [1, 4].

18F-FAMT has been validated in several clinical studies as useful for predicting cancer prognosis and for differentiating benign lesions from malignant neoplasms [5,6,7,8,9,10,11,12,13]. The tumor accumulation of 18F-FAMT is exclusively facilitated by the L-type amino acid transporter 1 (LAT1), which is highly upregulated in malignant cells [14]. Unlike other amino acid PET tracers, which are not specific to a single amino acid transporter, 18F-FAMT has an α-methyl moiety that allows it to be transported only by LAT1, making it highly specific for malignancies [15]. Although a handful of clinical studies have investigated its potential in malignant tumor detection, the overall diagnostic performance of 18F-FAMT remains unknown. The present meta-analysis aimed to determine the diagnostic performance of 18F-FAMT PET for the detection and evaluation of malignant lesions in a direct side-by-side comparison with 18F-FDG PET.

Methods

Search strategy and study selection

The design of this study followed the current recommendations for systematic reviews of diagnostic test accuracy studies from the Cochrane Collaboration [16, 17]. Studies evaluating 18F-FAMT PET or PET/CT as a diagnostic tool for the evaluation of malignancy were searched electronically in the PubMed/MEDLINE, Web of Science, ScienceDirect, and Google Scholar databases from the inception of 18F-FAMT to December 2016 without language restriction. The search algorithm was based on a combination of the following terms: 18F-FAMT or 18F-FMT or “alpha-methyltyrosine.” To find more potential studies, we also screened the references of the retrieved studies. Articles without raw clinical data, such as reviews, conference abstracts, editorials, comments, and preclinical, animal, and non-radiopharmaceutical studies, as well as clinical studies with fewer than ten patients, were excluded. The following information was extracted: first author’s name, year of publication, study design, study population, types/subtypes of malignancies, injected dose, imaging parameters, cut-off values of quantitative parameters, study and follow-up period, final diagnosis, and the reference standard.

The clinical studies obtained were subjected to the following inclusion criteria for further analysis: (a) both 18F-FAMT and 18F-FDG were used to differentiate malignant tumors from benign lesions, (b) histopathological analysis and/or close clinical and imaging follow-up were used as reference standards, (c) when data or subsets of data were presented in more than one article, the article with the most detailed/recent data was chosen, and (d) only articles in which at least 10 of the 14 questions in the QUADAS (Quality Assessment Tool for Diagnostic Accuracy Studies) questionnaire were answered ‘yes’ were included [18]. Studies were screened for eligibility, risk of bias, and sources of variation by three authors (AA, AB, RY) independently. Disagreements regarding the eligibility of a study were resolved by consensus.

Meta-analysis

Meta-analysis of the diagnostic performance of 18F-FAMT and 18F-FDG in recognizing malignancies was performed following the current recommendations [17] and was conducted separately for two diagnostic methods: (1) visual assessment and (2) the diagnostic cut-off values applied in each study. From each included study, the numbers of true positives, false positives, true negatives, and false negatives were extracted to construct a 2 × 2 contingency table. If a study lacked clear data to produce such a table, the first author was contacted when possible. These data were presented as forest plots of sensitivity and specificity.
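As an illustration of the 2 × 2 table described above, the following minimal Python sketch derives sensitivity and specificity from the four cell counts. The class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study; the analysis itself was run in R with the ‘mada’ package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of one study's 2 x 2 contingency table (hypothetical helper)."""
    tp: int  # malignant lesion, positive PET result
    fp: int  # benign lesion, positive PET result
    fn: int  # malignant lesion, negative PET result
    tn: int  # benign lesion, negative PET result

    def sensitivity(self) -> float:
        # Proportion of malignant lesions correctly called positive
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of benign lesions correctly called negative
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, for illustration only
study = TwoByTwo(tp=18, fp=3, fn=2, tn=7)
print(f"sensitivity = {study.sensitivity():.2f}")
print(f"specificity = {study.specificity():.2f}")
```

One such pair of values per study and per tracer is what a paired forest plot displays.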

Heterogeneity and between-study variability were evaluated, and subgroup analysis (meta-regression) was used to investigate the source, if any. A Higgins’ inconsistency index (I²) of up to 30% was considered little evidence of heterogeneity. To determine whether different thresholds were used to define positive and negative test results (either explicitly or implicitly), the Spearman ρ between the logit of sensitivity and the logit of 1 − specificity was calculated to assess the presence of a threshold effect. A strong positive correlation (Spearman ρ > 0.6) would suggest the presence of a threshold effect. Whenever possible, a bivariate random-effects meta-analysis model, rather than univariate approaches, was used to obtain summary estimates of sensitivity and specificity across studies.
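The threshold-effect check described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library: the helper names are hypothetical, the per-study estimates are invented, and studies with a sensitivity or specificity of exactly 0 or 1 would need a continuity correction before the logit is taken.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_threshold(sens, spec):
    """Spearman rho between logit(sensitivity) and logit(1 - specificity)."""
    x = [logit(s) for s in sens]
    y = [logit(1.0 - s) for s in spec]
    # Spearman rho is the Pearson correlation of the ranks
    return pearson(ranks(x), ranks(y))


# Hypothetical per-study estimates, for illustration only
sens = [0.70, 0.80, 0.85, 0.90, 0.95]
spec = [0.90, 0.85, 0.75, 0.70, 0.60]
rho = spearman_threshold(sens, spec)
print(f"rho = {rho:.2f}; threshold effect suspected: {rho > 0.6}")
```

In this invented example sensitivity rises as specificity falls in every study, which is the pattern expected when studies differ mainly in their positivity threshold.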

The hierarchical summary receiver operating characteristic (HSROC) curve was plotted following the method of Rücker and Schumacher [19]. The area under the curve (AUC), which is the average true-positive rate over the entire range of false-positive rates, serves as a global measure of test performance, while the diagnostic odds ratio (DOR) is calculated to describe the diagnostic value [20]. Note that the DOR is a single overall indicator of diagnostic performance and is, unlike sensitivity and specificity, independent of any threshold value. Meta-analysis was performed using the ‘mada’ (Meta-Analysis of Diagnostic Accuracy) package in R statistical software version 3.2.2 [21, 22].
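The two summary measures defined above can be illustrated with a short Python sketch. The function names, the counts, and the ROC points are hypothetical; the 0.5 continuity correction applied only when a cell is zero is one common convention and is an assumption here, not a detail stated in the Methods. The trapezoidal area is a generic numerical illustration and is not the Rücker and Schumacher procedure.

```python
import math


def diagnostic_odds_ratio(tp, fp, fn, tn, correction=0.5):
    """DOR = (TP * TN) / (FP * FN) with a 95% CI computed on the log scale.

    Assumption: the continuity correction is added to every cell only
    when at least one cell is zero.
    """
    cells = [tp, fp, fn, tn]
    if any(c == 0 for c in cells):
        tp, fp, fn, tn = (c + correction for c in cells)
    dor = (tp * tn) / (fp * fn)
    se_log = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    low = math.exp(math.log(dor) - 1.96 * se_log)
    high = math.exp(math.log(dor) + 1.96 * se_log)
    return dor, (low, high)


def auc_trapezoid(fpr, tpr):
    """Area under a curve given as points sorted by false-positive rate."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for x0, x1, y0, y1 in zip(fpr, fpr[1:], tpr, tpr[1:])
    )


# Hypothetical counts, for illustration only
dor, ci = diagnostic_odds_ratio(tp=18, fp=3, fn=2, tn=7)
print(f"DOR = {dor:.1f} (95% CI {ci[0]:.1f} to {ci[1]:.1f})")

# Hypothetical ROC points running from (0, 0) to (1, 1)
fpr = [0.0, 0.1, 0.3, 0.6, 1.0]
tpr = [0.0, 0.5, 0.8, 0.95, 1.0]
auc = auc_trapezoid(fpr, tpr)
print(f"AUC = {auc:.4f}")
```

A DOR of 1 corresponds to a test with no discriminative ability, and an AUC of 0.5 to a curve lying on the diagonal.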

Results

Literature search

The systematic search was performed to collect diagnostic test studies using 18F-FAMT and 18F-FDG PET for malignancy detection. The search yielded 65 studies involving 18F-FAMT as PET radiotracer in basic science investigations and clinical studies. There were three radiochemistry studies, nine in vitro and animal studies, four review articles, and 49 clinical studies. Thirty studies among these 49 clinical studies were original articles in which both PET radiotracers were employed. Figure 1 summarizes the systematic study selection.

Fig. 1 The study selection

Study eligibility, quality, and risk of bias

Nine studies that were eligible according to the inclusion criteria (Table 1) were further evaluated with the QUADAS tool. All were prospective studies of good quality (QUADAS scores >10) involving at least 19 patients (patient number range: 19–74) and 21 lesions (lesion number range: 21–75). Overall, the nine eligible studies had a low risk of bias, except in blinding to the index test results (Additional file 1: Table S1). Such blinding was sometimes unavoidable in the clinical workflow, since the histopathological diagnosis is established after the primary surgery or biopsy, while PET imaging is an early step in the workup to establish the clinical diagnosis. In one study, the histopathological (biopsy) diagnosis was known before the PET study was performed [7]. However, this study was later excluded from the meta-analysis (Table 1). The other important potential source of bias was the use of other imaging studies (CT, MRI, or bone scans) and close clinical monitoring as verification methods in one study [5]. However, in this study, only two patients (of 19 patients with a total of 57 lesions) had their lesions diagnosed without any histological examination: one had malignant melanoma in the foot (single lesion), and the other had diffuse malignant melanoma (lesions in the brain and spinal cord).

Table 1 Characteristics of Diagnostic Comparison Studies of 18F-FAMT and 18F-FDG

Six studies were included in the final meta-analysis because individual patient data were available to construct 2 × 2 contingency tables (Table 1 and Additional file 1: Table S2). All studies employed the maximum standardized uptake value (SUVmax) for quantitative interpretation of the PET images. Four explicitly described an SUVmax cut-off value for discriminating between malignant and benign lesions. The SUVmax cut-offs ranged from 1 to 1.45 in the 18F-FAMT studies and from 0.81 to 4.72 in the 18F-FDG studies. The six studies, with a total sample size of 272 patients (278 lesions), covered musculoskeletal tumors [12, 23], fatty tumors [11], maxillofacial tumors [9], lung cancer [24], and several different tumors [5].

Descriptive statistics

Figure 2 shows the paired sensitivity and specificity of 18F-FAMT and 18F-FDG for each study as forest plots. The sensitivity of both radiotracers was homogeneous whether based on visual assessment or on diagnostic cut-off values. Their specificity was heterogeneous based on visual assessment. The Spearman correlation (ρ) between the logit of sensitivity and the logit of 1 − specificity suggests that the accuracy of both radiotracers based on visual assessment may be influenced by threshold effects (ρ ≥ 0.6). However, their accuracy was less affected by threshold effects when diagnostic cut-off values were applied.

Fig. 2 Sensitivity and specificity of 18F-FAMT and 18F-FDG for malignancy detection

Meta-analysis

Because of the small number of studies included, both univariate and bivariate meta-analyses were performed. The bivariate approach is the method currently recommended; however, it cannot handle small sample sizes [17]. Meta-regression or subgroup analysis (to explore the source of heterogeneity) was likewise not applicable because of the limited number of studies.

Table 2 presents the summary estimates from the random-effects univariate analysis. The DORs of 18F-FAMT and 18F-FDG based on visual assessment were 8.90 and 4.63, while those based on diagnostic cut-off values were 13.83 and 7.85, respectively. Between-study heterogeneity was observed only mildly, in the 18F-FAMT studies based on visual assessment (Higgins’ I²: 11.76%, τ²: 1.46), and was not observed in the other analyses.
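For readers unfamiliar with how a pooled DOR, τ², and I² are related, the sketch below pools log DORs with the DerSimonian-Laird random-effects estimator. The choice of estimator is an assumption made for illustration, since the text does not state which estimator was used; the function name and the input values are hypothetical and do not reproduce Table 2.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling of study effects (for example, log DORs).

    Returns the pooled effect, tau^2 (between-study variance), and
    Higgins' I^2 as a percentage.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each study's own variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, tau2, i2


# Hypothetical log DORs and their variances, for illustration only
log_dors = [math.log(d) for d in (6.0, 9.0, 15.0, 8.0)]
variances = [0.50, 0.40, 0.60, 0.45]
pooled, tau2, i2 = dersimonian_laird(log_dors, variances)
print(f"pooled DOR = {math.exp(pooled):.2f}, tau^2 = {tau2:.2f}, I^2 = {i2:.1f}%")
```

Pooling is done on the log scale because the sampling distribution of the log DOR is closer to normal; the pooled value is exponentiated only for reporting.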

Table 2 Summary estimates from univariate meta-analysis

The summary estimates from the random-effects bivariate model are presented in Table 3. There was no significant difference in average sensitivity or specificity between 18F-FAMT and 18F-FDG based on visual assessment (p = 0.181 and 0.207, respectively). However, 18F-FAMT was significantly more specific than 18F-FDG (p < 0.01) based on diagnostic cut-off values. The DORs of 18F-FAMT and 18F-FDG based on visual assessment were 8.33 and 3.88, while those based on diagnostic cut-off values were 16.70 and 8.17, respectively.

Table 3 Summary estimates from bivariate meta-analysis

The HSROC curves comparing diagnostic performance are shown in Fig. 3. The AUCs of 18F-FAMT and 18F-FDG based on visual assessment were 77.4% and 72.8%, while those based on diagnostic cut-off values were 85.6% and 80.2%, respectively. The estimated SROC curves from the bivariate model (Rutter-Gatsonis method) were also plotted as a reference (Fig. 3, dashed lines). The summary operating points of 18F-FAMT lay to the left of those of 18F-FDG in both HSROC comparisons, which indicated that 18F-FAMT was more specific. Meanwhile, the similar heights of the summary operating points on the Y-axis showed that their sensitivities were comparable.

Fig. 3 Summary ROC plots obtained from the bivariate model of the diagnostic performance of 18F-FAMT and 18F-FDG based on (a) visual assessment and (b) diagnostic cut-off value. Oval regions are the 95% confidence regions around the summary operating points. The SROC curves from parametrization according to Rutter and Gatsonis are also presented

Discussion

This meta-analysis summarized the diagnostic performance of 18F-FAMT PET for the detection of various malignancies in six studies with a total of 272 patients (278 lesions). Overall, the included studies had a low risk of bias and good methodological quality based on the QUADAS tool. Our results demonstrated that the diagnostic performance of 18F-FAMT in detecting malignancies is comparable with that of 18F-FDG, by either visual assessment or diagnostic cut-off values. Moreover, the performance of 18F-FAMT was consistent across several tumor types, and all individual diagnostic test studies directly compared the two radiotracers in the same patients in a prospective study design. Additionally, the potential for selection bias can be safely ignored because a sufficient number of lesions was evaluated in each included study (n > 20). Another strength of this meta-analysis is that, even though the number of studies is limited, heterogeneity was not substantial. The mild heterogeneity observed was likely due to threshold effects, which were found in the studies based on visual assessment. However, other potential sources of heterogeneity should not be neglected, since subgroup analysis was not applicable [25]. Publication bias is an important consideration in any meta-analysis. However, the DOR heterogeneity observed in our results precludes the need for a funnel plot asymmetry test [26].

In the current recommendations for meta-analysis of diagnostic test accuracy from the Cochrane Collaboration, the bivariate approach is preferred over the traditional univariate approach [17]. However, guidance on choosing a methodological approach for meta-analyses with small numbers of studies is currently lacking. In this case, Doebler et al. and Takwoingi et al. encouraged the use of univariate approaches that do not pool sensitivities and specificities [21, 27]. Accordingly, both univariate and bivariate methods were applied in the current study, and the diagnostic performance of 18F-FAMT relative to 18F-FDG was consistent under both approaches. The more conservative approach to HSROC estimation (Rücker-Schumacher’s method) also showed a tendency similar to that of the traditional HSROC parametrization (Rutter-Gatsonis’s method) [19].

Despite the limited number of studies included, results of our meta-analysis reflect the natural characteristics of both radiotracers that assess malignant lesions via different metabolic processes. The key feature of 18F-FDG is its superior capability to depict increased metabolic activity reflected by cell glucose consumption. The price of this high sensitivity is the detection accuracy that is prone to being obscured by normal physiological uptake, inflammation, and active benign tumors [2]. In a recent large-size meta-analysis, 18F-FDG PET failed to maintain its diagnostic accuracy for lung cancer in populations with endemic infectious lung disease [28]. 18F-FDG PET was also only moderately accurate for differentiating benign from malignant pleural effusions [29].

In another meta-analysis, whole-body 18F-FDG PET/CT remained superior to conventional imaging in the detection of distant malignancies, regardless of the primary tumor site and type [30]. However, the diagnostic accuracy of a PET radiotracer for lesions in the thorax and abdomen, where most primary lesions are located, is essential. It is well known that the role of 18F-FDG PET in oncology is often mitigated by many pitfalls, including background physiological uptake of major organs [31].

On the other hand, the specific uptake of 18F-FAMT depicts the actual malignant process. 18F-FAMT uptake reflects excessive transport of amino acids via LAT1, which is absent in normal cells and in pathology other than malignancy [15]. However, the trade-off for the high specificity of 18F-FAMT is its relatively small absolute uptake in tumor cells, a consequence of the nature of the LAT1 transporter. The influx of one amino acid substrate into tumor cells via LAT1 is obligatorily coupled to the efflux of another amino acid substrate, resulting in relatively fast clearance of 18F-FAMT from the tumor [14]. Nonetheless, the advantage of 18F-FAMT is its minimal background uptake in all organs except the kidneys and urinary tract, which yields high-contrast images that clearly depict various types of malignancy, including brain tumors [6, 13].

Meta-analyses evaluating the diagnostic performance of 18F-FDG PET in malignancy detection have mostly been limited to a particular cancer type, or to comparisons with conventional imaging (CT or MRI) or hybrid imaging (PET/CT or PET/MRI). Currently, only a few tumor-specific PET radiotracers continue to be investigated in a clinical setting for various types of cancer [32]. 18F-FET is probably the closest to 18F-FAMT in terms of chemical structure, radiochemistry, and clinical applicability. While 18F-FET has higher diagnostic accuracy than 18F-FDG, its effectiveness is limited to brain tumors [33]. l-[methyl-11C]-methionine (11C-MET), the most popular amino acid-based PET radiotracer to date, also has excellent diagnostic accuracy for glioma compared to 18F-FDG [34]. However, both 18F-FET and 11C-MET are also substrates for the LAT2 transporter, which is also expressed in normal cells [14, 35]. The low-kidney-uptake PET tracer anti-1-amino-3-18F-fluorocyclobutane-1-carboxylic acid (18F-FACBC) has recently been meta-analyzed for its accuracy in detecting prostate cancer recurrence. However, the specificity of 18F-FACBC is lower than that of 11C-choline PET and even T2-weighted MRI [36]. Therefore, 18F-FAMT is probably the most versatile oncologic PET radiotracer currently available.

However, there are a few limitations in this study and in 18F-FAMT itself. First, all studies were from a single institution and were therefore potentially affected by publication bias, even though the authors of each study belonged to various departments and evaluated different types of tumors. Although the studies by Watanabe et al. and Tian et al. both focused on musculoskeletal tumors, they were separated by more than a decade, eliminating the possibility of overlapping patients [12, 23]. A study of various tumors by Inoue et al., however, included two patients, with chondrosarcoma and schwannoma, who might also have been included in the Watanabe et al. study, since these studies were from the same period [5, 12]. Unfortunately, this is difficult to confirm. Second, not all types of malignancies were evaluated; in particular, lymphoma, melanoma, and pancreatic and thyroid cancer, which are tumor types for which 18F-FDG PET is recommended to improve diagnostic accuracy [3], were not covered. Tumors in the pelvic area and abdomen were also poorly represented in this study.

Another drawback of the current 18F-FAMT studies is the absence of dynamic PET data. Currently, 18F-FAMT PET scans are performed at 40–60 min post injection. However, phases as early as 5–15 min post injection might show higher tumor detection accuracy for any amino acid PET tracer, considering the bidirectional nature of amino acid transport by their transporters [37]. A dynamic 18F-FAMT PET study in an animal tumor model showed that the tumor-to-muscle uptake ratio is highest at 20 min and remains high at 60 min [38]. However, clinical dynamic PET studies are necessary to determine optimal scan times.

Our current findings emphasize the need for prospective multicenter studies to overcome the limitations of single-center reports. This can only be achieved when the 18F-FAMT synthesis method is optimized and becomes widely used. The current 18F-FAMT radiofluorination method yields a low radioactivity that is sufficient for PET scans of only three to four patients per radiosynthesis [39]. Recently, a modified method of 18F-FAMT synthesis has allowed production to achieve radioactivity high enough for routine use [40]. However, a more practical approach is warranted. The twenty years of anticipation might soon be realized with the recent rapid development of fluorination methods. Of particular interest are the so-called late-stage fluorination methods, which allow optimized synthesis of previously inaccessible PET radiotracers [41]. These novel radiofluorination approaches, which make large-scale synthesis possible, allow reconsideration of promising but underutilized radiotracers such as 18F-FAMT. Hence, revisiting the diagnostic performance of 18F-FAMT is a major step in the quest for an ideal general oncology PET tracer. Once these impediments are resolved, which we foresee shortly, the future may bring increased clinical impact of 18F-FAMT in oncology.

Conclusion

18F-FAMT has diagnostic performance equal to or perhaps even better than 18F-FDG for malignancy detection in several cancer types. Future development in 18F-FAMT radiosynthesis might allow this tracer to be evaluated in other tumor types.