ESR statement on the stepwise development of imaging biomarkers
Development of imaging biomarkers is a structured process in which new biomarkers are discovered, verified, validated and qualified against biological processes and clinical end-points. The validation process concerns not only the determination of sensitivity and specificity but also the measurement of reproducibility. Reproducibility assessments and standardisation of the acquisition and data analysis methods are crucial when imaging biomarkers are used in multicentre trials for assessing response to treatment. Quality control in multicentre trials can be performed with the use of imaging phantoms. The cost-effectiveness of imaging biomarkers also needs to be determined. Many imaging biomarkers are being developed, but there are still unmet needs, for example in the detection of tumour invasiveness.
• Using imaging biomarkers to streamline drug discovery and the assessment of disease progression is a major advance in healthcare.
• The qualification and technical validation of imaging biomarkers pose unique challenges in that accuracy, methods, standardisation and reproducibility must be strictly monitored.
• The clinical value of new biomarkers is of the highest priority in terms of patient management, assessing risk factors and disease prognosis.
Keywords: Imaging biomarkers, Qualification, Validation
There is increasing interest in developing quantitative imaging biomarkers for personalised medicine. Biomarkers are defined as “characteristics that are objectively measured and evaluated as indicators of normal biological processes, pathological processes, or pharmaceutical responses to a therapeutic intervention”. Broadly, biomarkers fall into two categories: bio-specimen biomarkers, including molecular biomarkers and genetic biomarkers, and bio-signal biomarkers or imaging biomarkers. Bio-specimen biomarkers are obtained by removing a sample from a patient. Examples of these molecular biomarkers are genes and proteins detected from fluids or tissue samples. Bio-signal biomarkers remove no material from the patient, but rather detect and analyse an electromagnetic, photonic or acoustic signal emitted by the patient. These imaging biomarkers have the advantage of being non-invasive, spatially resolved and repeatable. They are of particular interest if they can overcome the limitations of the established histological “gold standards”. Indeed, invasive reference examinations, such as biopsy, can be inconclusive, are not representative of the whole tissue (a major limitation when assessing malignant tumours, which are known to be heterogeneous) and carry non-negligible risks of morbidity and mortality.
Genetic biomarkers indicate whether a disease may occur, but they are usually poorly suited to assessing the presence and stage of a disease. Like molecular biomarkers, imaging biomarkers can be used for early detection of diseases, staging and grading, and predicting or assessing the response to treatment. Because of their relatively low cost compared with imaging, molecular biomarkers may be more appropriate than imaging biomarkers for disease screening and early detection. With their high sensitivity, molecular biomarkers could also detect subclinical stages of disease before any morphological or functional change is detectable on imaging. In contrast, imaging biomarkers are often more useful than molecular biomarkers for disease staging and grading and for assessing tumour response, because localised information is crucial.
Before being routinely used in the clinic, imaging biomarkers must be validated. Determining the accuracy implies calculating the sensitivity and specificity of the biomarker when compared with a biological process, such as tumour necrosis, which can be assessed at histopathological examination.
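As a minimal sketch of this accuracy calculation, assume that the biomarker has been dichotomised at a cut-off and that each case has a binary histopathological reference (for instance, necrosis present or absent). The function names, the direction of the cut-off and the example values below are illustrative assumptions, not taken from the studies cited here.

```python
def dichotomise(values, cutoff):
    """Call a case positive when its biomarker value is at or above the cut-off.

    The direction of the comparison is an assumption; some biomarkers are
    positive below the cut-off instead.
    """
    return [v >= cutoff for v in values]


def sensitivity_specificity(calls, reference):
    """Return (sensitivity, specificity) of binary calls against a reference."""
    if len(calls) != len(reference):
        raise ValueError("calls and reference must have the same length")
    pairs = list(zip(calls, reference))
    tp = sum(1 for c, r in pairs if c and r)          # true positives
    fn = sum(1 for c, r in pairs if not c and r)      # false negatives
    tn = sum(1 for c, r in pairs if not c and not r)  # true negatives
    fp = sum(1 for c, r in pairs if c and not r)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference needs both positive and negative cases")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up biomarker values and histopathology results, for illustration only.
values = [1.8, 1.6, 1.1, 0.9, 1.7, 1.0, 1.2, 1.5]
necrosis = [True, True, False, False, True, False, True, False]
sens, spec = sensitivity_specificity(dichotomise(values, cutoff=1.4), necrosis)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these made-up data, three of the four reference-positive cases and three of the four reference-negative cases are classified correctly, so both values are 0.75. In practice the cut-off itself would have to be chosen and validated on independent data.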
This validation process is challenging because changes in tissue properties due to diseases that are measured by imaging, such as the diffusion coefficients at DW-MRI or the mechanical properties at MR elastography, are only indirectly linked to structural changes such as necrosis, cellularity, fibrosis and vascular architecture. Moreover, the functional properties that are measured may be influenced by other co-existing factors, such as inflammation, perfusion, permeability and interstitial pressure. For example, the apparent diffusion coefficient (ADC) is decreased in chronic liver disease. This ADC decrease has been shown to be influenced by increased fibrosis, inflammation and steatosis, as well as by decreased perfusion [6, 7, 8, 9]. Equating what is measured by imaging and what is occurring at the cellular level in tissue is a difficult task because our understanding of the biophysical underpinnings of many imaging biomarkers, such as diffusion measurements of in vivo systems, remains partial [10, 11].
To help in this understanding, pre-validation studies are conducted in animal models of the disease of interest, where histopathological analysis and other invasive reference examinations can be easily conducted. More basic ex vivo research in tissues, phantoms or theoretical models may also help in understanding the relationship between signal formation and the underlying pathophysiology. The transition to patients must then be made, and the biomarker validated once again in small-cohort and then large-cohort clinical studies.
The ultimate goal for an imaging biomarker is to understand its predictivity so well that it can become a surrogate for clinical outcome. One primary end-point in therapy assessment studies is patient survival. No imaging biomarker, not even the familiar “response evaluation criteria in solid tumours” (RECIST), universally employed in oncology drug development, is widely accepted as a surrogate for survival. The RECIST criteria can be used to define time to progression, but an increase in time to progression as a result of therapy is not necessarily a surrogate of improved overall survival, as shown by the Avastin (bevacizumab) story. In 2011, the FDA withdrew approval for the combined use of Avastin and chemotherapy for the treatment of metastatic breast cancer because preliminary licensing was predicated on future demonstration of improvement in survival or quality of life, neither of which was forthcoming when clinical trials were completed.
Surrogacy can only be reliably established with a large number of adequately powered clinical studies using a variety of interventions, and with the aid of meta-analyses. This is a daunting goal, which constitutes the very last step in biomarker qualification.
Repeatability (measurements at short intervals on the same subjects using the same equipment in the same centres) and reproducibility (measurements at short intervals on the same subjects using different facilities in the same and different centres) studies must be conducted for image acquisition and image analysis. These studies have to be performed with the same observer (intra-observer variability) and with different observers (inter-observer variability). Repeatability and reproducibility are particularly important to assess if the imaging biomarkers are to be used in longitudinal studies, for example for treatment follow-up, to ensure that the changes in a parameter are caused by a response to treatment and not by inherent technical or physiological variation. The reproducibility will affect the diagnostic usefulness of the biomarker. As an example, it is known that perfusion parameters are markedly variable. It has therefore been reported that a post-therapy decrease of Ktrans should be at least in the 30–50 % range to represent a significant therapy-induced change, whereas for ADC at DW-MRI a change of 10–20 % would be sufficient. Reproducibility studies are now very often included in scientific papers, as advised by the “standards for reporting of diagnostic accuracy” (STARD) criteria, and should ideally include Bland-Altman plots and coefficients of repeatability [16, 17].
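The Bland-Altman statistics and the coefficient of repeatability mentioned above can be sketched as follows for the simplest design, two measurements per subject. This is a hedged illustration: the function names are hypothetical, the 1.96 factor assumes approximately normally distributed differences, and the repeatability coefficient is computed as 1.96 × √2 × the within-subject standard deviation.

```python
import math
import statistics


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) of agreement for paired data."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def repeatability_coefficient(first, second):
    """Coefficient of repeatability from two measurements per subject.

    Within-subject variance is estimated as sum(d^2) / (2n), and
    RC = 1.96 * sqrt(2) * within-subject SD.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    within_var = sum(d * d for d in diffs) / (2 * len(diffs))
    return 1.96 * math.sqrt(2.0 * within_var)


def exceeds_repeatability(baseline, follow_up, rc):
    """True when the change in one subject is larger than the repeatability coefficient."""
    return abs(follow_up - baseline) > rc


# Made-up test-retest values (same units for both sessions), illustration only.
scan_1 = [1.0, 1.2, 1.1, 1.3]
scan_2 = [1.1, 1.1, 1.2, 1.2]
rc = repeatability_coefficient(scan_1, scan_2)
bias, lower, upper = bland_altman(scan_1, scan_2)
```

A change observed in a single patient between two examinations would then be interpreted as a probable treatment effect only if it exceeds `rc`, which is the reasoning behind the percentage thresholds quoted above.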
Standardisation of image acquisition: similar acquisition parameters should be used across imaging platforms when these parameters affect the results of the biomarker. For example, the calculation of ADC depends on the number and choice of the gradient “b” values. A collaborative paper by Padhani et al. lays the foundation for acquisition standardisation, notably by recommending that monoexponential assessments of ADC should use two b values above 100 s/mm2.
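To make the dependence on the b values concrete, the monoexponential model S(b) = S0 · exp(−b · ADC) can be inverted as in the sketch below. The function names and the b values used in the example (100, 400 and 800 s/mm2) are illustrative assumptions rather than a recommended protocol.

```python
import math


def monoexponential_signal(s0, adc, b):
    """Signal predicted by S(b) = S0 * exp(-b * ADC)."""
    return s0 * math.exp(-b * adc)


def adc_two_point(signal_low_b, signal_high_b, b_low, b_high):
    """ADC in mm2/s from two signals; b values in s/mm2."""
    if b_high <= b_low:
        raise ValueError("b_high must be greater than b_low")
    if signal_low_b <= 0 or signal_high_b <= 0:
        raise ValueError("signals must be positive")
    return math.log(signal_low_b / signal_high_b) / (b_high - b_low)


def adc_log_linear(b_values, signals):
    """ADC from a least-squares line through ln(S) versus b (two or more b values)."""
    if len(b_values) != len(signals) or len(b_values) < 2:
        raise ValueError("need at least two (b, signal) pairs")
    logs = [math.log(s) for s in signals]
    mean_b = sum(b_values) / len(b_values)
    mean_log = sum(logs) / len(logs)
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (l - mean_log) for b, l in zip(b_values, logs))
    if sxx == 0:
        raise ValueError("b values must not all be equal")
    return -sxy / sxx  # the slope of ln(S) against b is -ADC


# Noise-free synthetic signals for a hypothetical tissue with ADC = 1.0e-3 mm2/s.
b_values = [100, 400, 800]
signals = [monoexponential_signal(1000.0, 1.0e-3, b) for b in b_values]
adc = adc_two_point(signals[0], signals[-1], b_values[0], b_values[-1])
```

On noise-free monoexponential data both estimators return the same value. With real data they differ whenever low b values are contaminated by perfusion effects or when noise dominates at high b values, which is why the choice of b values has to be harmonised across centres.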
Moreover, DW-MRI is very sensitive to motion. Motion correction schemes are thus advised for DW-MRI acquisition. However, it is still unclear which scheme is optimal. As an example for upper abdominal studies, some consider that free breathing acquisition produces reliable enough data, even with a better reproducibility than breath-hold, and that a respiratory-triggered scheme produces less reproducible data, while others recommend using tracking-only navigator techniques [19, 20, 21].
Standardisation of image analysis: volume and region of interest (ROI) determinations and parameter calculation (mathematical models) should be standardised. In tumour perfusion imaging, it has been shown that the ROI placements in the vascular input and in the tumour influence the results and reproducibility of the parameter measurements. To take motion into account, rigid and non-rigid registration of images at different time points can be used. In heterogeneous lesions such as tumours, imaging biomarkers are frequently calculated as parametric maps with spatial resolution. We need to define how to handle the histogram that displays the obtained values. Descriptive statistics such as mean value, standard deviation and range can be directly obtained from the histogram. The main drawback of this approach is a clear tendency to underestimate the changes in body tissues and organs, since the values indicative of disease, or of its most relevant manifestations, are averaged out. For this reason, percentiles are used in some settings to obtain a better relationship with the most relevant predictive clinical variables. The optimal type of approach must be defined for each problem (complete histogram, partial histogram in quartiles, partial histogram in deciles). A further approach involves the analysis of the heterogeneity in the spatial distribution of a biomarker provided by its parametric image. To this end, statistics describing the shape of the distribution, such as kurtosis, can be used [23, 24, 25, 26]. Finally, the choice of the mathematical model that is used to calculate the quantitative parameters also has a major influence on the results that are obtained [27, 28]. Standardisation procedures are currently being developed [18, 29, 30]. It is important that standardisation be a collaborative effort of academia and industry. Standardisation of data reporting should also be performed.
For example, to describe the liver elasticity in cirrhosis, different units (Young modulus in kPa, shear modulus in kPa, wave speed in m/s) and different cut-off values are currently used [31, 32, 33]. Standardisation of these data would improve the communication between research groups.
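Two of the standardisation points above lend themselves to a short sketch: summarising the voxel values of a parametric map by histogram descriptors, and converting elasticity results to a common unit. The function names are hypothetical. The unit conversion rests on stated assumptions that may not hold in vivo: tissue treated as purely elastic, isotropic and incompressible (so that Young's modulus E = 3G), a shear modulus G = ρc² derived from the shear wave speed c, and a density ρ of 1000 kg/m3.

```python
import statistics

TISSUE_DENSITY_KG_M3 = 1000.0  # assumed soft-tissue density, close to water


def percentile(values, q):
    """Percentile q (0 to 100) with linear interpolation between ranks."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


def histogram_descriptors(values):
    """Summarise the voxel values of a parametric map inside one ROI."""
    if not values:
        raise ValueError("values must not be empty")
    n = len(values)
    mean = statistics.mean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # population variance
    if m2 == 0:
        raise ValueError("all values are identical; shape statistics are undefined")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "sd": m2 ** 0.5,
        "min": min(values),
        "max": max(values),
        "p10": percentile(values, 10),
        "median": percentile(values, 50),
        "p90": percentile(values, 90),
        "skewness": m3 / m2 ** 1.5,           # asymmetry of the histogram
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # tail weight relative to a normal curve
    }


def shear_modulus_kpa(wave_speed_m_s, density_kg_m3=TISSUE_DENSITY_KG_M3):
    """Shear modulus G = rho * c^2, converted from Pa to kPa."""
    return density_kg_m3 * wave_speed_m_s ** 2 / 1000.0


def young_modulus_kpa(shear_kpa):
    """Young's modulus E = 3 * G under the incompressibility assumption."""
    return 3.0 * shear_kpa
```

Reporting several descriptors side by side (for instance the mean together with the 10th and 90th percentiles) makes it visible when a treatment effect is confined to part of a heterogeneous lesion. Under the assumptions above, a shear wave speed of 2 m/s corresponds to a shear modulus of 4 kPa and a Young's modulus of 12 kPa, which shows how the same tissue can be reported with three different numbers.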
Adequate phantoms could be used to verify, on a day-to-day basis, that the biomarker stays robust and to detect any drift in the machine, acquisition or processing protocol. The advantage of using phantoms is that the sequence can be optimised in detail before being performed in patients (which is particularly suited to CT studies, to limit the radiation dose to the patient), and distribution of the same phantom across imaging platforms allows control of the quality and standardisation of the procedures. Multicentre quality control studies have already been conducted using a simple, ice-water filled, DW-MRI phantom containing tubes of solutions of known diffusion coefficients, which allowed machines and centres to be compared. For ultrasound and CT, phantoms ranging from simple gels with inclusions of different shapes and sizes (for control of tumour size measurement) to complex thoracic models including vasculature inserts (to test perfusion acquisitions) are available [30, 35]. Mechanically induced motion of these phantoms can also be achieved. Another possibility is to simulate images based on computerised phantoms. This computerised phantom dataset can even incorporate deformation information mimicking the respiration of patients.
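A routine phantom check of the kind described above could be sketched as follows, assuming that each tube of the phantom has a known reference value and that a site-defined tolerance is applied. The tube names, reference values and the 5 % tolerance are illustrative assumptions, not values from the cited multicentre studies.

```python
def phantom_qc(measured, reference, tolerance_percent=5.0):
    """Compare measured phantom values with reference values, tube by tube.

    Returns {tube: (deviation in percent, within tolerance?)}.
    """
    report = {}
    for tube, value in measured.items():
        if tube not in reference:
            raise KeyError(f"no reference value for tube {tube!r}")
        ref = reference[tube]
        deviation = 100.0 * (value - ref) / ref
        report[tube] = (deviation, abs(deviation) <= tolerance_percent)
    return report


def failing_tubes(report):
    """Names of the tubes whose deviation exceeds the tolerance."""
    return sorted(tube for tube, (_, ok) in report.items() if not ok)


# Illustrative diffusion coefficients in mm2/s; not taken from any real phantom.
reference = {"water": 1.10e-3, "tube_a": 0.60e-3}
measured = {"water": 1.12e-3, "tube_a": 0.50e-3}
report = phantom_qc(measured, reference)
```

Running the same check on every scanner in a trial, and logging the deviations over time, gives a simple way to spot both a drifting machine and a centre whose protocol deviates from the others.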
When imaging biomarkers are validated for use in drug development studies or clinical trials, several additional points should be considered. First, the imaging biomarker should bring new information on top of existing diagnostic tools or existing risk factors and have the potential to modify patient management. The coronary artery calcium score, one of the most evaluated cardiovascular imaging biomarkers, is not only associated with the risk of future cardiovascular events but also improves the traditional classification of risk by shifting patients from intermediate to high risk categories. It is likely that a panel of biomarkers will be required to achieve the high accuracy required at the clinical level.
Second, the imaging biomarker should be completely non-invasive, so as not to lose the advantage of safe imaging methods over invasive reference examinations. Third, the imaging biomarker should be cost-effective. If the biomarker is to be added to the routine clinical examination without further burdening the public health system with increased costs of care, its diagnostic advantages have to offset its cost. The imaging biomarker should also be easy to implement in the clinic, meaning that the equipment must already exist or be easily available, that no specific expertise should be required from hospital staff, and that the parameter must be easy to measure and interpret. Few guidelines currently exist for imaging biomarker use [40, 41]. Developing, evaluating and implementing guidelines, together with other agencies, may be an important task for the biomarkers subcommittee of the ESR.
Biomarkers also have potential for industry as pharmacodynamic markers and even surrogate endpoints for targeted clinical phase I to III studies. Development of new biomarkers was identified by the FDA as the highest priority for scientific effort to ease the marketing of newly developed drugs.
Development of new biomarkers
Given the difficulties in the qualification and standardisation of existing imaging biomarkers, is there a need to develop additional ones? The answer is yes; for example, in the field of oncology, the palette of reasonably well-understood biomarkers has major gaps. The hallmarks of cancer include sustaining proliferative signalling, evading growth suppressors, resisting cell death, enabling replicative immortality, inducing angiogenesis, activating invasion and metastasis, reprogramming of energy metabolism and evading immune destruction. Regarding angiogenesis, there are useful biomarkers utilising MRI, CT, ultrasound or PET. For drugs affecting the deregulated cellular energetics of the Warburg effect, FDG-PET offers an obvious assessment. For cellularity, proliferation and apoptosis, a joint public-private partnership between the EU and pharmaceutical companies called “Quantitative Imaging in Cancer: Connecting Cellular Processes (QuIC-ConCePT)” is currently devoted to the validation of imaging biomarkers, namely ADC at DW-MRI, [18F]3′-deoxy-3′-fluorothymidine PET (FLT-PET) and isatin-5-sulphonamide PET ([18F]ICMT-11), an apoptosis radiotracer with subnanomolar affinity for caspase-3 [12, 45, 46]. However, we currently do not have good markers for activation of invasion and appearance of metastasis before these events become macroscopically evident. Thus, development of new imaging biomarkers is still needed.
The European Society of Radiology and its related European Institute for Biomedical Imaging Research (EIBIR) should have a relevant role in coordinating future developments of biomarkers and in the assessment and validation of imaging biomarkers as surrogate end points.
This paper was kindly prepared by the ESR Subcommittee on Imaging Biomarkers (Chairperson: Bernard Van Beers. Research Committee Chairperson: Luis Martí-Bonmatí. Members: Marco Essig, Thomas Helbich, Celso Matos, Wiro Niessen, Anwar Padhani, Harriet C. Thoeny, Siegfried Trattnig, Jean-Paul Vallée. Co-opted members: Peter Brader, Nicolas Grenier) on behalf of the European Society of Radiology (ESR) and with the help of Sabrina Doblas, INSERM U773, Paris, France.
It was approved by the ESR Executive Council in December 2012.
- 9. Leitao HS, Doblas S, d’Assignies G, Garteiser P, Daire JL, Paradis V, Geraldes CF, Vilgrain V, Van Beers BE (2012) Fat deposition decreases diffusion parameters at MRI: a study in phantoms and patients with liver steatosis. Eur Radiol 23(2):461–467
- 26. Yang X, Knopp MV (2011) Quantifying tumor vascular heterogeneity with dynamic contrast-enhanced magnetic resonance imaging: a review. J Biomed Biotechnol 732848:1–12
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.