Voxel-based morphometry in single subjects without a scanner-specific normal database using a convolutional neural network

Objectives Reliable detection of disease-specific atrophy in individual T1w-MRI by voxel-based morphometry (VBM) requires scanner-specific normal databases (NDB), which often are not available. The aim of this retrospective study was to design, train, and test a deep convolutional neural network (CNN) for single-subject VBM without the need for an NDB (CNN-VBM).
Materials and methods The training dataset comprised 8945 T1w scans from 65 different scanners. The gold standard VBM maps were obtained by conventional VBM with a scanner-specific NDB for each of the 65 scanners. CNN-VBM was tested in an independent dataset comprising healthy controls (n = 37) and subjects with Alzheimer’s disease (AD, n = 51) or frontotemporal lobar degeneration (FTLD, n = 30). A scanner-specific NDB for the generation of the gold standard VBM maps was also available for the test set. The technical performance of CNN-VBM was characterized by the Dice coefficient of CNN-VBM maps relative to VBM maps from scanner-specific VBM. For clinical testing, VBM maps were categorized visually according to the clinical diagnoses in the test set by two independent readers, separately for both VBM methods.
Results The VBM maps from CNN-VBM were similar to the scanner-specific VBM maps (median Dice coefficient 0.85, interquartile range [0.81, 0.90]). Overall accuracy of the visual categorization of the VBM maps for the detection of AD or FTLD was 89.8% for CNN-VBM and 89.0% for scanner-specific VBM.
Conclusion CNN-VBM without an NDB performs similarly to conventional VBM in the detection of AD- and FTLD-specific atrophy.
Clinical relevance statement A deep convolutional neural network for voxel-based morphometry eliminates the need for scanner-specific normal databases without relevant performance loss and, therefore, could pave the way for the widespread clinical use of voxel-based morphometry to support the diagnosis of neurodegenerative diseases.
Key Points
• The need for normal databases is a barrier to the widespread use of voxel-based brain morphometry.
• A convolutional neural network achieved performance similar to that of conventional voxel-based morphometry for the detection of atrophy.
• Convolutional neural networks can pave the way for widespread clinical use of voxel-based morphometry.

Supplementary Information The online version contains supplementary material available at 10.1007/s00330-023-10356-1.

smoothing, a voxel-based two-sample t-test of the individual smoothed GM component image against the GM component images of the NDB was carried out, resulting in a statistical t-map. Age and total intracranial volume were taken into account as nuisance covariates. The total intracranial volume was estimated in each T1w-MRI scan by using a 3D-CNN specifically trained for accurate and stable delineation of the total intracranial volume [2; 3].
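A two-sample t-test of one subject against an NDB with nuisance covariates can be written as a general linear model. The following is a minimal sketch under stated assumptions, not the authors' pipeline: images are assumed to be flattened to voxel vectors, the function name `single_subject_tmap` and its arguments are hypothetical, and the sign convention (positive t means lower GM in the subject than in the NDB) is chosen only for illustration.

```python
import numpy as np

def single_subject_tmap(subject, ndb, cov_subject, cov_ndb):
    """Voxel-wise t-test of one subject against a normal database (NDB).

    subject:     (V,)   smoothed GM values of the individual scan
    ndb:         (N, V) smoothed GM values of the N normal scans
    cov_subject: (C,)   nuisance covariates of the subject (e.g. age, TIV)
    cov_ndb:     (N, C) nuisance covariates of the NDB scans
    """
    n = ndb.shape[0]
    y = np.vstack([ndb, subject[None, :]])            # (N+1, V)
    group = np.r_[np.zeros(n), 1.0]                   # 1 marks the subject
    cov = np.vstack([cov_ndb, cov_subject[None, :]])
    cov = cov - cov.mean(axis=0)                      # center for conditioning
    X = np.column_stack([group, np.ones(n + 1), cov]) # design matrix

    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None) # (P, V)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = (resid ** 2).sum(axis=0) / dof           # residual variance per voxel

    # Contrast "NDB minus subject": positive t = lower GM in the subject.
    c = np.zeros(X.shape[1])
    c[0] = -1.0
    var_c = c @ np.linalg.inv(X.T @ X) @ c
    t = (c @ beta) / np.sqrt(sigma2 * var_c)
    return t, dof
```

With a single subject in one group, the residual variance is estimated from the NDB scans only, which is why the size and homogeneity of the NDB matter for the resulting t-map.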

Eur Radiol (2023) Krüger J, Opfer R, Spies L, Hedderich D, Buchert R.

3D-CNN architecture
The architecture of the custom 3D-CNN is shown in Figure 1 in the manuscript. The 3D-CNN follows a fully convolutional encoder-decoder (U-net-like) architecture with 3D convolutions with 3x3x3 kernel size. Residual blocks are used in the encoder [4]. In addition, deep supervision [5] is employed by including additional segmentation layers at several stages in the decoder. The encoder reduces the spatial feature map size four times (using convolution with stride 2) and doubles the feature map number with each reduction. Starting with 16 feature maps of size 160x160x160 in the first layer, this leads to 256 maps of size 10x10x10 in the last encoder layer. The decoder uses convolution layers, followed by nearest-neighbour up-sampling and deep supervision in three layers [5]. For the long-range connections between encoder and decoder, a feature concatenation is employed. Leaky ReLU [6] is used as activation function in each layer. Due to the rather large patches of 160x160x160 voxels, the batch size is 1. Therefore, instance normalization, a special case of group normalization, is used instead of batch normalization.
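The encoder dimensions stated above follow from simple arithmetic. The sketch below only reproduces that bookkeeping; it assumes that each stride-2 convolution halves the spatial size exactly (padding chosen accordingly), and the helper name `encoder_shapes` is hypothetical.

```python
def encoder_shapes(n_maps=16, size=160, n_reductions=4):
    """Return (number of feature maps, edge length) for each encoder level."""
    shapes = [(n_maps, size)]
    for _ in range(n_reductions):
        n_maps *= 2   # feature map number doubles with each reduction
        size //= 2    # stride-2 convolution halves each spatial dimension
        shapes.append((n_maps, size))
    return shapes

for maps, edge in encoder_shapes():
    print(f"{maps:3d} maps of size {edge}x{edge}x{edge}")
```

The last line printed corresponds to the 256 maps of size 10x10x10 in the last encoder layer.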

Data augmentation
Random combinations of the following data augmentation techniques were used during training of the 3D-CNN to further increase the heterogeneity of the training dataset.
▪ Random left-right flipping of the image patches.
▪ Rotation around the x-, y-, or z-axis by an angle randomly chosen between -10 and 10 degrees.
▪ Translation in x-, y-, or z-direction by a distance randomly chosen between -5 and 5 mm.
▪ Voxel-wise adding of Gaussian random noise with zero mean and variance randomly chosen between 0 and 0.0001. The voxel values were normalized between 0 and 1 (this is required for every CNN).
▪ Simulation of a random bias field. A plane was selected randomly within the 3D volume. For all other planes, the distance to the selected plane was computed. Distance values were scaled to the interval [1, d] with d a random number between 1 and 2. Image voxel values were multiplied with the scaled distance of the plane the voxel was located in.
The first three augmentation techniques were applied simultaneously to the input and to the output of the 3D-CNN whereas the last two techniques were applied only to the input image (output unchanged).
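Three of the augmentation steps listed above can be sketched with NumPy alone. This is an illustrative sketch, not the authors' code: the function names are hypothetical, axis 0 is assumed to be the left-right axis, the bias-field plane is assumed to be axis-aligned, and rotation and translation are omitted because they require an interpolation routine.

```python
import numpy as np

def random_flip_lr(image, target, rng):
    """Flip input and output together (axis 0 assumed to be left-right)."""
    if rng.random() < 0.5:
        image, target = image[::-1].copy(), target[::-1].copy()
    return image, target

def add_gaussian_noise(image, rng, max_variance=1e-4):
    """Add zero-mean Gaussian noise; input only, image assumed scaled to [0, 1]."""
    variance = rng.uniform(0.0, max_variance)
    return image + rng.normal(0.0, np.sqrt(variance), size=image.shape)

def random_bias_field(image, rng, axis=0):
    """Multiply each plane by its distance to a random plane, scaled to [1, d]."""
    n = image.shape[axis]
    selected = rng.integers(n)                        # randomly selected plane
    dist = np.abs(np.arange(n) - selected).astype(float)
    d = rng.uniform(1.0, 2.0)                         # upper end of the scaling
    scaled = 1.0 + (d - 1.0) * dist / dist.max()      # values in [1, d]
    shape = [1] * image.ndim
    shape[axis] = n
    return image * scaled.reshape(shape)              # input only
```

A typical call sequence under these assumptions would draw one `np.random.default_rng()` per training run, apply `random_flip_lr` to the input and output pair, and apply the noise and bias-field functions to the input alone.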

Application of the 3D-CNN
For the application of the 3D-CNN to a T1w-MRI scan, eight (2x2x2) evenly distributed overlapping crops of 160x160x160 mm³ were taken. For each crop, the predicted class values were computed and merged into the entire volume by taking the mean values in the overlapping regions. Each of the 4 output maps (corresponding to the 4 parts of the statistical maps from scanner-specific VBM, Figure 1 in the manuscript) contains (probability) values between 0 and 1. The output maps corresponding to the 'low significance' part and to the 'high GM density' part of conventional t-maps were ignored. The output maps corresponding to the 'low extrahippocampal GM density' part and to the 'low hippocampal GM density' part were summed voxel-by-voxel to obtain the (final) CNN-VBM map.
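The crop-and-merge procedure described above can be sketched as follows. This is a minimal sketch under stated assumptions: `predict_fn` stands in for the trained network and is hypothetical, each volume dimension is assumed to lie between one and two crop lengths, and the channel indices of the two 'low GM density' maps are placeholders because the order of the four output maps is not given here.

```python
import numpy as np

def crop_starts(dim, crop):
    """Start indices of the two crops along one axis (first and last position)."""
    if not crop <= dim <= 2 * crop:
        raise ValueError("sketch assumes crop <= dim <= 2 * crop")
    return [0, dim - crop] if dim > crop else [0]

def predict_volume(volume, predict_fn, crop=160, n_classes=4):
    """Apply predict_fn to overlapping crops and average where crops overlap."""
    acc = np.zeros((n_classes,) + volume.shape)
    count = np.zeros(volume.shape)
    for x in crop_starts(volume.shape[0], crop):
        for y in crop_starts(volume.shape[1], crop):
            for z in crop_starts(volume.shape[2], crop):
                sl = (slice(x, x + crop), slice(y, y + crop), slice(z, z + crop))
                # predict_fn returns an array of shape (n_classes, crop, crop, crop)
                acc[(slice(None),) + sl] += predict_fn(volume[sl])
                count[sl] += 1
    return acc / count  # mean over all crops covering each voxel

def cnn_vbm_map(prob_maps, idx_extrahippocampal=2, idx_hippocampal=3):
    """Sum the two 'low GM density' maps; the indices are hypothetical."""
    return prob_maps[idx_extrahippocampal] + prob_maps[idx_hippocampal]
```

Averaging in the overlap avoids visible seams at crop borders, at the cost of running the network eight times per scan.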
The voxel intensities in the CNN-VBM map range between 0 and 1. The threshold corresponding to P=.005 for t-maps from scanner-specific VBM was identified as follows. On a subset of 130 randomly selected cases from the training dataset (2 cases per scanner), the CNN-VBM map was binarized using varying thresholds between 0.1 and 0.85 (step size 0.05). For each threshold on the CNN-VBM map, the Dice similarity coefficient of the resulting binarized map with the corresponding conventional gold standard map from scanner-specific VBM binarized at P=.005 was computed for each of the 130 cases.
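The threshold sweep can be sketched as a search for the threshold with the highest mean Dice coefficient. The code below is an illustration under stated assumptions: the function names are hypothetical, binarization uses a strict "greater than" comparison (the text does not specify this), and the plain Dice coefficient is computed without any smoothing of the binary maps.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient of two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

def best_threshold(cnn_maps, gold_masks, thresholds=None):
    """Return the threshold maximizing the mean Dice over all cases."""
    if thresholds is None:
        # 0.10, 0.15, ..., 0.85 (step size 0.05)
        thresholds = np.round(np.arange(0.10, 0.85 + 1e-9, 0.05), 2)
    mean_dice = [
        float(np.mean([dice(m > t, g) for m, g in zip(cnn_maps, gold_masks)]))
        for t in thresholds
    ]
    best = int(np.argmax(mean_dice))
    return float(thresholds[best]), mean_dice
```

Here `cnn_maps` would hold the CNN-VBM maps of the selected cases and `gold_masks` the corresponding gold standard maps binarized at P=.005.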
Prior to computing the Dice similarity coefficient, the binary maps were smoothed by convolution with an isotropic Gaussian kernel with 10 mm full-width-at-half-maximum and then binarized again (all values > 0) in order to reduce the sensitivity with respect to minor, clinically irrelevant differences. The mean Dice similarity coefficient over the 130 cases was maximal for a threshold of 0.40 (Supplementary Figure 4). This value was used for thresholding the CNN-VBM maps for visual interpretation.

Adduru VR, Michael AM, Helguera M, Baum SA, Moore GJ (2017