Introduction

One of the most significant advantages of positron emission tomography (PET) over other forms of functional imaging techniques is its capability to quantify absolute regional radiotracer concentration. Therefore, PET can generate quantitative dynamic images of regional radiotracer uptake, resulting in regional measurements of kinetic parameters. Quantification provides the direct relationship between the activity concentration measured in vivo in organs/tissues and the underlying physiological or pharmacokinetic processes occurring in the structure of interest [1]. It correlates the variation over time of the activity concentration to physiologically relevant quantitative parameters. In preclinical studies, quantitative assessment of tissue uptake permits better management of the therapy for an individual animal model and eventually enables assessment of overall response to a therapy in a population of transgenic animals [2]. However, quantitative PET is challenged by the need for appropriate compartmental or kinetic models to derive estimates of such parameters from dynamic PET measurements of regional activity concentrations, which is difficult to achieve in a clinical setting. Moreover, to take full advantage of the quantitative capabilities of PET imaging, patient-specific correction of background and physical degrading factors must be performed.

Automated quantitative assessment of metabolic PET data is appealing given its usefulness in terms of facilitating experimental molecular imaging investigations, since it can reduce variability across institutions and may improve the reliability of image interpretation independent of reader experience. For example, the development of tracer-specific small animal PET probabilistic atlases [3] correlated with anatomical (e.g. MRI) templates enabled automated volume of interest (VOI) or voxel-based analysis of small animal PET data with minimal end-user interaction [4]. One such software tool was developed by Kesner et al. [5] to enable assessment of the biodistribution of PET tracers using small animal PET data. This is achieved though non-rigid registration of a digital mouse model with the animal PET image set followed by automated calculation of tracer concentrations in 22 predefined VOI representing the major organs and remaining whole body. The development of advanced anatomical models including both stylized and more realistic voxel-based mouse [68] and rat [9, 10] models obtained from serial cryosections or dedicated high-resolution small animal CT and MRI scanners will certainly help to support ongoing research in this area [11].

Our objective in this work is to develop and assess the performance of atlas-guided automated analysis of small animal PET data based on computerized anatomical models and retrospective registration-guided methods enabling correct localization and accurate quantification of molecular targets. Our approach is different from the one adopted by Kesner et al. [5] in the sense that we are aiming at developing a fully automated analysis procedure which does not require user interaction and does not rely on the use of external fiducial markers or internal landmarks.

A substantial number of techniques have been proposed to achieve the goal of multimodal medical image registration [12, 13]. However, image registration algorithms widely used in clinical studies have not been well characterized in the small animal setting. A number of investigators focused on the utility of popular image registration techniques in various scenarios and using different imaging technologies and reported various degrees of success [1422]. Some techniques rely on the use of external fiducial markers or specially designed hardware devices [23] to aid the registration process, whereas other approaches rely on fully automated algorithms that do not involve user interaction. Current state-of-the-art image registration techniques allow for automatic image registration through a rigid body transformation, thus ignoring organ deformation. There has also been noticeable progress in non-rigid registration algorithms that can compensate for perceived organ deformation for different imaging modalities or align images from different subjects [24]. However, despite progress made during the last few years, many image registration problems, particularly for small animal imaging, remain unsolved, and this is likely to continue to be an active field of research in the future [25].

The rationale of the automated 3-D image registration procedure used in this work to register PET images of the actual animal to an atlas for automated analysis is that it should be easier to find the correct alignment of anatomical CT images of the animal and those of the atlas than it is for noisy low-resolution PET images. This is achieved through non-rigid registration based on a multi-resolution approach. The transformation parameters obtained from CT to CT registration are then applied to corresponding PET images followed by automated quantitative analysis of tracer uptake in predefined regions.

Materials and methods

Mouse atlas

The Digimouse atlas [7], composed of preregistered slices of PET, CT and cryosection images and the corresponding atlas, was used in this work. The latter results from a segmentation of 21 VOI including the skin, skeleton, eye, medulla, cerebellum, olfactory bulbs, external cerebrum, striatum, heart, the rest of the brain, masseter muscles, lachrymal glands, bladder, testis, stomach, spleen, pancreas, liver, kidneys, adrenal glands and lungs. Two versions of this data set are available: processed images coded in 8-bit (256 grey levels) and raw images coded on 32-bit floating-point format. The use of processed images would lead to loss of information during the registration process owing to the lower dynamic range. For this reason, we have reprocessed the raw images to obtain a data set similar to the processed one but with an isotropic voxel having a size of 0.2 mm and with coding to match typical CT and PET images in terms of dynamic range (16 and 32 bits, respectively) and scale (HU for CT). A representative coronal slice of the resulting images is shown in Fig. 1.

Fig. 1
figure 1

Spatially registered multimodality images for a coronal slice through the Digimouse model. From left to right: X-ray CT, PET, cryosection, segmented atlas and overlay of the atlas onto cryosection

Mouse PET/CT studies

Experimental studies

The sample of laboratory animals included in this work is composed of eight mouse PET/CT studies acquired using three different radiotracers with segmentations performed in our laboratory. Most of these mice had tumour xenografts producing large morphological deformations that increased the difficulty of the image registration process and challenges of automated quantification tasks (Fig. 2):

  • One 18F-fluorodeoxyglucose (FDG) mouse acquired on the FLEX Triumph™ preclinical PET/CT scanner (Gamma Medica-Ideas, Northridge, CA, USA) [26], consisting of 16-bit CT images of 256 × 256 × 512 voxels of 0.17 × 0.17 × 0.17 mm3, and 32-bit PET images of 256 × 256 × 256 voxels of 0.4 × 0.4 × 0.4 mm3.

  • Four mouse studies (three 18F-FDG and one 18F-NaF) acquired on the Siemens MicroFocus scanner kindly provided by the Crump Institute at UCLA composed of CT images of 256 × 256 × 496 voxels of 0.2 × 0.2 × 0.2 mm3 coded in 16 bits and PET images of 128 × 128 × 95 voxels of 0.4 × 0.4 × 0.8 mm3 coded in 32 bits.

  • Three mice from the Applied Sciences Laboratory at Uppsala acquired for 60 min in list-mode format on the FLEX Triumph™ preclinical PET/CT scanner (Gamma Medica-Ideas, Northridge, CA, USA), composed of CT images of 240 × 240 × 63 voxels of 0.25 × 0.25 × 1.175 mm3 coded in 16 bits and PET images of 240 × 240 × 63 voxels of 0.25 × 0.25 × 1.175 mm3 coded in 16 bits. These three mice were injected with a bispecific antibody labelled with 68Ga.

Fig. 2
figure 2

Example of coronal and sagittal slices of PET/CT studies (a-c) and overlay of the atlas onto CT images (d-f) of the experimental mouse studies acquired using 18F-FDG (left), 18F-NaF (middle) and bispecific antibody labelled with 68Ga (right) radiotracers

Since our approach relies on 3-D image registration, segmented images of experimental animal images are needed for objective assessment of the accuracy of the image registration algorithm. Therefore, corresponding images were segmented into seven regions/organs: brain, lungs, heart, kidneys, bladder, skeleton and “remaining organs & skin”. Segmentation of CT images was performed using the ITK-SNAP software [27], which provides flexible tools for semi-automated medical image segmentation. The choice of segmented organs/regions was dictated by the need to have organs located at various portions of the mouse provided that soft tissue contrast of CT images is high enough to enable reliable segmentation.

The size and organ volume variation of this sample of laboratory mice is shown in Table 1. The sizes (e.g. cephalad-caudad length) were calculated from mouse segmentations by calculating the number of transaxial slices containing the mouse and multiplying this number by the slice thickness. The calculation of organ volumes was performed by counting the number of voxels belonging to each organ (example voxels labelled as brain) and multiplying it by the voxel volume. These measurements relied on complete whole-body mouse images for all mice except three where the assessment did not include the head since this part was outside the field of view.

Table 1 Physical characteristics of mice used in the experimental group

Simulated studies

More elaborate analysis under controlled conditions (ground truth known) was carried out using 17 simulated PET/CT data sets. The simulated mouse sample was generated using the Moby digital mouse model [6], which creates two 3-D voxelized phantoms, one containing the activity map and the second the attenuation map (μ-map) at 511 keV. These two voxelized phantoms take into account respiratory and cardiac motions during an acquisition of 120 s.

Mice simulations were performed to produce a normal mouse sample and to complement the relatively small experimental mouse sample size by increasing the overall sample size. Mouse sizes (Table 2) and activity concentrations in the various organs/tissues were randomly chosen for each simulation with a normal probability based on typical biodistribution studies reported in the literature [5]. The sizes of simulated mice were read from the output log file of the Moby software, whereas the volumes were calculated as described earlier for the experimental studies.

Table 2 Physical characteristics of mice used in the simulated group

The Moby software was also used to produce segmented images by modifying the input parameters to simulate tracer uptake in only one organ at a time, leading to an image for each segmented region/organ. All images were then combined using an exclusive operator to avoid having voxels labelled as two different organs, thus producing one unique image corresponding to the mouse segmentation serving as reference. This image is finally corrected by filling holes in the heart and lung regions using the ITK-SNAP software to simulate the Digimouse segmentation. An example of the three voxelized images is shown in Fig. 3.

Fig. 3
figure 3

Coronal views of a voxel-based mouse model generated using the Moby software for simulation of an 18F-FDG study showing from left to right: activity map, attenuation map, segmented image, simulated X-ray projection image, simulated CT image and corresponding simulated PET image

We used the CT projector provided with the Moby mouse software to generate realistic mouse CT images corresponding to the simulated PET data [28]. The projector was used to obtain 360 projections, one per degree, with the characteristics of the cone-beam microCT device used in our laboratory [29]:

  • Object to source distance = 22.2 cm

  • Object to detector distance = 6.8 cm

  • Half-fan angle = 19°

  • Pixel size of 0.2 × 0.2 mm2

CT images were then reconstructed from the simulated projection data with a voxel size of 0.2 × 0.2 × 0.4 mm3 using Feldkamp’s algorithm [30] as implemented in the MATLAB image reconstruction toolbox developed by Dr. Fessler.Footnote 1 PET images of the simulated Moby model were reconstructed with attenuation correction using the Software for Tomographic Image Reconstruction (STIR) package [31] by means of the ordered subsets version of Green’s maximum a posteriori (MAP) using the one-step late algorithm [32] with 24 subsets, 25 sub-iterations and inter-update Metz filtering [33]. Example images of X-ray projections and CT and PET reconstructed slices are shown in Fig. 3.

Automated analysis and qualification

As explained earlier, the proposed approach is based on 3-D non-rigid image registration between actual mice (source or moving image) and the Digimouse atlas (target or fixed image). The main restriction of our method is that PET/CT images (as opposed to PET only images) of the moving mouse are required since the high-resolution anatomical CT image serves as the basis for actual mouse to atlas registration. The second constraint is that similar fields of view for source and target images have to be made available.

Following CT to CT image coregistration, the generated spatial transformation resulting from this procedure is used to transform the moving PET image. This will produce a new PET/CT data set of the actual mouse in the atlas space. Thereafter, atlas segmentation is used to automatically define VOI to measure the radiotracer’s activity concentration in various organs of the actual mouse. Therefore, the image registration process is the key point of this approach. For this reason, the methodology followed and the main parameters used as well as the algorithm’s performance assessment are described in detail below.

Image registration

Image registration was performed using the Elastix software suite [34]. The package consists of a collection of algorithms based on the Insight Toolkit (ITK) libraries that can be run from command line. One of the advantages of this software is that it enables saving the calculated transformation following the coregistration into an ASCII file that can be further used to transform other images using the same transformation parameters by means of the Transformix tool [34]. This software was conceived as a set of modules that can be parameterized individually depending on the needs of the user to handle particular situations. In our case, the following choices were made:

  • All calculations were carried out with floating-point precision.

  • Multi-resolution registration was performed at 5 levels with down-sampling factors of 32, 16, 8, 4, and 2 times, each with first-order B-spline image smoothing using 4,096, 4,096, 2,048, 2,048 and 1,024 iterations, respectively.

  • Normalized correlation coefficient was used as the cost function metric since it was reported to perform slightly better (compared to mutual information) in intra-modality image coregistration [34].

  • The image sampler used to compute the cost function is composed of 2,500 random points that were updated for each iteration.

  • The adaptive stochastic gradient descent was used as the optimization method to adapt the step size sequence.

  • The final interpolation is a third-order B-spline function.

The registration was performed in three steps:

  1. 1.

    Affine registration: mainly used to pre-align both CT images to facilitate the task of the non-rigid registration procedure. It is composed of 12 degrees of freedom, 3 for each of the possible transformations: translation, rotation, scaling and shearing.

  2. 2.

    B-spline non-rigid registration: applied to obtain a good alignment of CT images in size and shape. This step produces very good results for mouse shapes, but in most cases it does not suffice for internal organs. This transformation step allows an infinite number of degrees of freedom.

  3. 3.

    Masked B-spline non-rigid registration: applied to obtain a finer registration of internal organs. This step was added to improve the quality of image registration without spending considerable computational resources outside the mouse surface. We used one unique mask based on the Digimouse atlas shape since, based on previous registration steps, the actual mouse CT image now has the same shape.

Once the three steps are completed, we apply the same transformation parameters to PET images of the same mouse as well as its segmented image to obtain a PET/CT and segmented data set corresponding to the Digimouse space. The performance of the image registration algorithm was assessed using two well-established metrics [35]:

  1. 1.

    Dice coefficient (D coeff ) or mean overlap. This is a volume overlap metric which quantifies the intersection between source and target regions divided by their mean volume:

$$ {D_{{coeff}}} = 2\frac{{\left| {S \cap T} \right|}}{{\left| S \right| + \left| T \right|}} $$
(1)

The Dice coefficient [36] is a special case of kappa statistical coefficient [37] which can be interpreted as follows:

  • Less than 0.20 ➔ poor agreement

  • 0.20 to 0.40 ➔ fair agreement

  • 0.40 to 0.60 ➔ moderate agreement

  • 0.60 to 0.80 ➔ good agreement

  • 0.80 to 1.00 ➔ excellent agreement

  1. 2.

    Hausdorff distance (HD). This is the most frequently used discrepancy measure, which represents the maximum distance one would need to move the boundaries of the source region to completely cover the target region:

$$ H{D_{{S \to T}}} = \mathop{{\max }}\limits_{{s \in S}} \left\{ {\mathop{{\min \left\{ {d\left( {s,t} \right)} \right\}}}\limits_{{t \in T}} } \right\} $$
(2)

where s and t are points inside the source (S) and the target (T) regions, respectively, and d(s,t) is the distance between s and t. The measure of the directional Hausdorff distance HD S→T is oriented and in most of the cases is not equal to HD T→S . For this reason, the generalized Hausdorff distance (HD) is defined as:

$$ \begin{gathered} HD = \max \left\{ {\mathop{{\max }}\limits_{{s \in S}} \left\{ {\mathop{{\min \left\{ {d\left( {s,t} \right)} \right\}}}\limits_{{t \in T}} } \right\},\mathop{{\max }}\limits_{{t \in T}} \left\{ {\mathop{{\min \left\{ {d\left( {t,s} \right)} \right\}}}\limits_{{s \in S}} } \right\}} \right\} \\ = \max \left\{ {H{D_{{S \to T}}},H{D_{{T \to S}}}} \right\} \\ \end{gathered} $$
(3)

PET normalized mean activity measurement

The mean activity concentrations in various regions of the PET images were calculated and normalized to the maximum activity concentration to obtain the normalized mean activities (NMAs). The NMAs from the pre-registered PET images prior to transformation for the seven segmented regions were used as reference. The automated PET analysis results are then compared to reference values by calculating the NMAs in the same seven regions based on the original Digimouse segmentation. The relative error between these two NMAs values is also calculated (in %).

Results

The assessment of the accuracy of the image registration algorithm using well-established metrics is important because it will condition the automated activity quantification procedure. To avoid image segmentation bias, we have measured both metrics (D coeff and HD) using the original Digimouse segmentation and our own segmentation of the Digimouse atlas, similar to the procedure followed for the segmentation of the experimental mouse sample. Statistical analysis performed to compare the means of all the regions using the paired t test for correlated samples is reported in terms of p values (0.05 was used as threshold for statistically significant differences). The analysis of the Dice coefficient revealed that the mean D coeff was significantly improved (p < 0.0001) from 0.4557 to 0.5032 using our segmentation of the Digimouse CT image. However, the improvement of mean Hausdorff distance, from 6.33 to 6.16 mm, was not statistically significant (p = 0.143).

Similar to the procedure followed for assessment of image registration, we have tested if our segmentation of the Digimouse improved the results of the NMA analysis compared to the original Digimouse segmentation. This was performed by comparing the relative difference between quantitative estimates in the original and spatially transformed images using both segmentations of the Digimouse atlas, namely the original segmentation and our own segmentation. The t test for correlated samples revealed that the slight improvement of the mean relative difference (from 32.1 to 30.9 %) was not statistically significant (p = 0.214) when using our segmentation approach. Considering that only the Dice coefficient is influenced by the segmentation, we hypothesize that both Digimouse segmentations are equivalent and will not bias the results. For this reason, the results shown below are all calculated using the original Digimouse segmentation.

Typical image registration results between experimental mouse studies and the Digimouse atlas are shown in Figs. 4 and 5 as representative of the best (mouse 4) and worst (mouse 6) cases, respectively, considering the Dice coefficient as metric for assessment.

Fig. 4
figure 4

Illustration of the best deformable registration example between the Digimouse and one of the experimental mouse studies (mouse 4) showing: a overlay of the Digimouse atlas onto corresponding CT images, b actual 18F-FDG PET/CT mouse study, c mouse study shown in b with overlay of the segmentation onto CT image (seven organs), d CT to CT registration of the Digimouse and actual mouse study shown in c, e overlay of the transformed segmentation (seven organs) using registration parameters obtained in d onto CT image and f transformed PET/CT study using registration parameters obtained in d

Fig. 5
figure 5

Same as Fig. 4 for the worst deformable registration example between the Digimouse and one of the experimental mouse PET/CT studies (mouse 6) acquired using 68Ga-labelled ethylenediaminetetraacetic acid (EDTA)

As discussed earlier, a large variability was expected for the experimental studies given the heterogeneity of the considered sample. To illustrate this variability, Table 3 summarizes the results for the Dice coefficient and the Hausdorff distance for all segmented regions of the experimental and simulated mice. One should note the large difference among Dice coefficients corresponding to best (0.594 ± 0.208) and worst (0.308 ± 0.276) registration results (Figs. 4 and 5).

Table 3 Summary of mouse registration results for the 8 experimental mice and 17 simulated mice studies when using the original Digimouse segmentation

Dice coefficient, Hausdorff distance and normalized mean activity estimates before deformation (prior to image registration) as well as the mean relative difference for the NMAs introduced by the automated quantitative analysis procedure for the seven segmented regions for both experimental and simulated mice studies are shown in Table 4. Some representative results of these metrics are presented in box-and-whisker plots for different regions corresponding to experimental and simulated mouse studies in Fig. 6.

Table 4 Summary of image registration performance metrics and normalized mean activity measurements of original mice (before transformation) and the relative difference between the original and the automated quantification procedure when using the Digimouse segmentation
Fig. 6
figure 6

Box-and-whisker plots of image registration metrics (DC and HD) and automated quantitative analysis (NMA) results for the 8 experimental and 17 simulated mice studies for the 3 and 8 segmented regions, respectively. a Dice coefficients, b Hausdorff distance and c normalized mean activity (NMA) relative difference

Paired t test analysis for independent samples was performed to test the null hypothesis that mean values (D coeff , HD, NMA before transformation and automated NMA relative difference) obtained with the experimental mouse sample is different from the mean value obtained with simulated mice. One can see that statistically significant differences between experimental and simulated samples were found for lungs, bladder and remaining organs & skin regions for Dice coefficient, whereas this was only the case for the brain for Hausdorff distance. With regard to NMA before registration, one can observe that there is evidence of statistically significant differences between experimental and simulated mice activity concentration for kidneys, lungs and remaining organs & skin regions, probably because three different radiotracers were utilized for the experimental mice sample, whereas only FDG was used for simulated studies using the Moby model. Finally, a statistically significant difference is observed only for the kidneys between experimental and simulated mouse sample results.

Based on the foregoing statistical analysis, we can in some cases unify experimental and simulated studies into a unique sample which will be more representative of the performance of the developed automated quantification method. The results of this unified analysis are shown in Table 5, where two estimates are shown for regions where there is a statistically significant difference between experimental and simulated studies. The 95 % confidence interval was also calculated taking into account the sample size.

Table 5 Summary of the pooled performance evaluation of the automated coregistration procedure and quantitative analysis of PET data using the 95 % confidence interval

A similar approach to the one adopted by Kesner et al. [5] was used to assess the accuracy of the automated analysis algorithm, which consists in measuring the so-called harvested standardized uptake value (hSUV) that reflects the actual measurement of the activity concentration in dissected organs of the mice. This metric is then correlated through linear regression to VOI-based analysis. The accuracy of the automated quantitative method is evaluated using the correlation coefficient (R 2) of this regression.

By analogy, we define the harvested NMA (hNMA) metric as the activity concentration used as input in the voxelized Moby phantom, which is correlated to the activity concentration estimated using the automated analysis technique through linear regression analysis. Table 6 shows the mean correlation coefficient calculated for all organs for each individual mouse, whereas Table 7 shows the mean correlation coefficient calculated with all mice for each individual organ. The results obtained by Kesner et al. [5] using their semi-automated algorithm are also shown in both tables. It can be seen that the accuracy of our fully automated approach is comparable to that of Kesner et al.’s semi-automated method.

Table 6 Correlation coefficients (R 2) resulting from linear regression analysis between actual (harvested) and PET-derived activity concentration for simulated studies. The results obtained by Kesner et al. [5] for experimental studies using manual and semi-automated software approaches are also shown
Table 7 Correlation coefficients (R 2) resulting from linear regression analysis between actual (harvested) and PET-derived activity concentration for different regions of simulated mouse studies. The results obtained by Kesner et al. [5] for experimental studies using manual and semi-automated software approaches are also shown

Discussion

Automated quantitative analysis of PET data potentially may play a pivotal role in large-scale clinical trials and longitudinal serial preclinical studies involving the acquisition of a large sample of small animal imaging data. This work proposes and evaluates the performance of an atlas-guided automated quantification algorithm using simulated and experimental studies. The performance of the image registration algorithm is the critical issue in our automated quantification approach. In the experimental sample, we noticed a large variability in the results reported for the various metrics evaluated. For example, Dice coefficient results vary between 0.308 and 0.594 (mean = 0.451) (Table 3). Similar variability was observed for the Hausdorff distance, with values varying from 5.10 to 7.55 mm (mean = 6.33 mm), with larger regions having larger Hausdorff distances (HD Heart  = 0.85 mm; HD Skeleton  = 10.76 mm) (Table 4). In contrast to the experimental sample, the performance of image registration was less widely dispersed for the simulated sample, with a dispersion of Dice coefficients between 0.408 and 0.461 but with a mean (0.447) close to the one achieved for the experimental sample. The same trend was observed for the Hausdorff distance metric (min. = 5.91 mm; max. = 6.49 mm; mean = 6.03 mm). The most probable reason for the difference in dispersion of results is the presence of very large morphological deformations on the experimental mice owing to the presence of tumour xenografts. Nevertheless, one can observe that, in the overall sample composed of simulated and experimental mice, a final moderate agreement was reached regarding the performance of the image registration procedure (D coeff  = 0.448), with a mean mismatch of approximately 6 mm.

The analysis of the image registration procedure on a region-by-region basis (Table 4) reveals again that the algorithm performs better when using simulated compared to experimental mice owing to the presence of tumour xenografts, except for the bladder and to a smaller extent remaining organs & skin, where Dice coefficients of experimental mice are higher than those of simulated mice. One can also see that the brain and remaining organs & skin regions registration show good to excellent agreement (D coeff  > 0.76) for both experimental and simulated samples. Poor agreement was obtained for time varying organs such as the bladder owing to periodic filling and emptying (for experimental and simulated mice) and small structures such as the pancreas (for the simulated sample) (Table 5). A fair to good agreement was found for the other regions (0.289 ≤ D coeff ≤0.657).

The statistical analysis revealed that, in most of the cases, experimental and simulated samples can be combined to obtain a unified larger sample with a higher statistical power. Some exceptions with respect to Dice coefficient are the lungs, likely because of the presence of some poorly aligned cases, since the median shown in Fig. 6 (D coeff  = 0.420) is much closer to the simulated mice results. The second exception is the bladder region, though this was expected owing to the poor performance of the image registration procedure for this organ. The last exception applies to remaining organs & skin region, which presents a good to excellent agreement and where the difference in the image registration results is influenced by the way the regions of interest are defined. It should be noted that in the experimental mouse segmentation remaining organs & skin include liver, pancreas, spleen, stomach and testis regions, owing to the low soft tissue contrast, while in the simulated mouse these consist of independent regions. The only exception with respect to the Hausdorff distance is surprisingly the brain region.

Several studies have reported on various algorithms implementing deformable image registration of actual mice to an atlas. For instance, the method proposed by Baiker et al. [38] is based on manual extraction of the skeleton and lung from source and target mice to be registered with a hierarchical realistic model of the skeletal joint movement. This method achieved Dice coefficients comparable to those reported in this work for kidneys (0.48), liver (0.62) and heart (0.50) regions. Worse results were obtained for the brain (0.75), whereas better results were achieved for the lungs (0.65) and skeleton (0.5). The most remarkable result achieved is the skin correspondence (maximum mismatch below 3.5 mm) from mice that are in different positions. Wang et al. [39, 40] performed image registration of mouse CT images with the Moby phantom, Digimouse atlas and their own statistical atlas constructed using microCT images of 45 mice. The registration is performed only for the torso by manually segmenting high-contrast organs to be registered; low-contrast organ positions are estimated from this transformation. Dice coefficients corresponding to this method are particularly high following registration with the authors’ statistical atlas (D coeff  > 0.70 except for the spleen, D coeff  ≈ 0.45). However, when the registration is performed with the Digimouse atlas, the performance is similar to that achieved in this work for low-contrast organs such as the liver (0.68), spleen (0.38) and kidneys (left kidney ≈ 0.39, right kidney ≈ 0.61), while better results were achieved for high-contrast organs such as the lungs (0.75) and heart (0.68).

When analysing the performance of automated activity quantification, one can see that, similar to the image registration procedure, lower relative errors were obtained for the simulated sample except for the bladder, where the results reflect poor image registration. The same observations were made for the kidneys, probably because of the relatively high activity concentration for the simulated sample, where a small mismatch might result in a non-negligible reduction in the estimated activity concentration due to the small size of the region. Considering the statistical analysis results of the two samples, one can see that a statistically significant difference in the relative error was obtained only for the kidneys. Therefore, all remaining regions can be analysed as a pooled sample. Table 5 shows that in 7 of the 12 analysed regions the relative error of the automated quantification procedure is below 9 %. In particular, the relative error for the brain, lungs and skeleton is below 5 %. The kidneys’ activity concentration estimation error was estimated to be around 20 %, while larger errors were obtained for other regions owing to poor image registration, except remaining organs & skin” region, where an excellent image registration was achieved but with a correspondingly large relative error in terms of tracer uptake quantification (76.8 %). This result is highly influenced by large errors (∼250 %) associated with three experimental studies owing to the presence of tumour xenografts, where the tracer is taken up by one leg belonging to remaining organs & skin region. When deformable registration handles this large deformation, it significantly changes the shape and reduces the volume of the lesions, thus introducing a large difference in tracer uptake estimation. This also explains the high standard deviation in the NMA metric for this region (367 %), since it reflects the analysis of three mouse studies presenting with large distorting tumour xenografts while the remaining five mouse studies present a mean relative error of 20.1 % (SD of 25.7 %).

The comparison of the proposed automated quantification method results with those of Kesner et al. [5] shows that similar performance could be achieved using both methods (Tables 6 and 7). The main strength of our fully automated approach is that only minor preprocessing of the images is required to select the same field of view, while most other approaches rely on partially segmented images and/or the use of external fiducial markers.

Conclusion

We propose a novel atlas-guided approach for automated quantification of small animal PET studies. To this end, preprocessing for adaptation of PET/CT images of the actual mouse study and atlas fields of view is required. Automated quantification is achieved through 3-D image registration between CT images of the actual mouse and the Digimouse atlas. The transformations achieving this are afterwards applied to corresponding PET images to put them in the Digimouse image space. The Digimouse segmentation is used to define the regions of interest for quantitative analysis.

The method proved to provide similar performance compared to other proposed techniques requiring user intervention [5, 39, 40]. Normalized mean activity estimation was preserved between reference and automated measures in most of the regions considered, with relative errors below 10 %. The only regions resulting in higher relative errors are those corresponding to small organs presenting a large variability among mice such as the testicles, bladder, kidneys and pancreas. This can also be explained by physiological respiratory motion, cardiac motion and periodic filling and emptying of the bladder. This was, however, expected from image registration assessment results. Further refinement of the latter procedure is underway.