A hybrid optimization strategy for registering images with large local deformations and intensity variations

To develop a method for intra-patient registration of pre- and post-contrast abdominal MR images with large local deformations and large intensity variations. A hybrid method is proposed to deal with this problem. It consists of two coupled techniques: (1) descriptor matching (DM) at the original resolution using a discrete optimization strategy to avoid getting trapped in a local minimum; (2) continuous optimization to refine the registration outcome based on autocorrelation of local image structure (ALOST). Our method—called DM-ALOST—has become insensitive to the local uptake of contrast agent by exploiting the mean phase and the phase congruency extracted from the multi-scale monogenic signal. The method was extensively tested on abdominal MR data of 30 patients with Crohn’s disease. DM-ALOST produced significantly larger mean Dice coefficients than two state-of-the-art methods $$({p}<0.05)$$ (p<0.05) . Both qualitative and quantitative tests demonstrated improved registration using the proposed method compared to the state-of-the-art. The DM-ALOST method facilitates measurement of corresponding features from different abdominal MR images, which can aid to assess certain diseases, particularly Crohn’s disease.


Introduction
Image registration has been a key topic in the medical imaging community over the past decades. However, there are still many registration problems that remain challenging. We encountered such a challenge upon registering three-dimensional pre-to post-contrast abdominal images acquired by magnetic resonance imaging (MRI) for quantifying Crohn's disease activity [1]. This registration is far from trivial as peristalsis of the bowel (or organ motion) may cause large local deformations, which makes that the registration algorithm gets trapped in a local minimum as discussed in [2]. Furthermore, the diseased regions of interest are composed of relatively thin bowel structures, which are easily mismatched. Moreover, there can also be large local intensity variations, a.o. due to space-varying contrast uptake and the MRI bias field.
A coarse to fine (or multi-resolution) registration strategy is often used [3] to avoid getting trapped in local minima as much as possible. However, this can be insufficient for large deformations, as we have experienced in the aforementioned case. A coarse to fine registration method can be combined with discrete optimization approaches, such as graph cuts [4] and linear programming [5]. Discrete optimization is typically less sensitive to the initial conditions [5] and suitable to deal with large deformations [2]. However, discrete opti-mization goes at the expense of the registration precision because of quantization effects.
Alternatively, continuous optimization typically produces more precise solutions than discrete approaches. However, continuous optimization is sensitive to the initial conditions and gets more easily trapped in a local minimum. A hybrid optimization strategy was recently proposed to combine the best of both worlds [6]. As such the discrete optimization strategy initially yields a coarse solution to cope with the largest deformations after which continuous optimization locally improves the result in a refined search space. A good review on hybrid image registration can be found in [7].
It can be foreseen that the aforementioned multi-resolution strategy removes fine structures (e.g. the bowel wall) from the image at coarse resolutions. Furthermore, local deformation due to peristalsis is in essence different from the global motion of the larger structures due to breathing. As large local deformations are not correctly estimated at a coarse scale, an ambiguous registration result will occur locally. Thus, registration based on the highest resolution is preferred without any restriction of the search space for discrete optimization. The large displacement optical flow (LDOF) method [8] integrates rich descriptors such as the scale-invariant feature transform (SIFT [9]) and the histograms of oriented gradients (HOG [10]) into variational optic flow to solve the large deformation problem. To use descriptor matching with continuous optimization for our task, we need an image representation and an energy function that are both insensitive to local intensity variations.
Previously, well-known rich descriptors such as SIFT, SURF [11] and HOG were studied for representing image structure. However, none of these can fully handle intensity and contrast variations. Alternatively, images may be transformed into a representation that is insensitive to these signal variations. An example of such a representation for multidimensional signals is the monogenic signal [12]. Thus, a rich descriptor could be calculated based on the monogenic signal representation of these images.
Likewise, there are many energy functions for image registration that tackle the intensity inhomogeneity problem. For example, mutual information (MI) [13,14] is one of the most widely used energy functions. However, originally being a global objective function, it lacks local, i.e. spatial, information on the local structure and cannot cope with large space-variant signal and contrast differences. Although several methods introduce local information into a global objective function [15,16], it has been noticed [17] that finding an accurate correspondence remains difficult, especially due to the many local minima that generally accompany most non-rigid deformation models.
Recently, a method employing a local modality independent neighborhood descriptor (MIND) was proposed [17]. MIND was reported to be a "distinctive" descriptor, which was claimed to be important for registering images with many degrees of freedom [17]. However, MIND is not invariant to large intensity inhomogeneities LCC-Demons was introduced to address the local intensity variation issue [18] by using symmetric local correlation coefficient as energy function. A new descriptor called ALOST is proposed to reduce the influence of intensity inhomogeneities [19]. Methods that transform MRI images into a new image representation can be also used to deal with intensity inhomogeneities [20]. Other methods specifically designed for abdominal DCE-MRI registration use the entire DCE-MRI image sequence to reduce the influence [21,22]. However, all of abovementioned methods are sensitive to large local deformations of an image.
In this paper, we present a hybrid registration method to facilitate the spatial alignment of pre-to post-contrast abdominal MR images. Our work is inspired by the method described in [8] that initially performs descriptor matching using discrete optimization succeeded by a continuous optimization to refine the registration. Different from [8], we perform the descriptor matching based on a representation derived from the monogenic signal rather than on the original image. The ALOST method [19] follows as a continuous optimization step to refine the registration. Therefore, our work contains two main contributions compared to [8,19]. 1. It is insensitive to local contrast changes due to the varying uptake of contrast agent as well as to global signal fluctuations from the MR scanner's bias field. 2. Several elaborate evaluations are performed to assess the performance of our method on bowel images.

Method
Image registration can be treated as an optimization problem [23]. The energy function to be optimized in our experiments consists of three parts, namely a data term, a regularization term and a coupling term: where w = [u, v, w] is the deformation field for each pixel in a 3D moving image, α and β are weighting parameters.
In our proposed hybrid registration framework, we first find a solution to the descriptor matching term through discrete optimization: w 1 = arg min w 2 {E descriptor (w 2 )} (see (5)). Herewith, we aim to estimate the largest deformations in the presence of intensity distortions without sacrificing image resolution. Subsequently, w 1 is used to initialize the minimization (1).
The remainder of the method section is structured as follows. First, we introduce some basic theory of the monogenic Fig. 1 a An abdominal MR image and b the same image deformed only by a simulated local intensity distortion; e, f mean phase images computed from a, respectively, b; c intensity profiles along the red dashed lines in a (red) and b (blue); g intensity profiles along the red dashed lines in d (red) and e (blue); d descriptor matching correspondences using HOG-I; h descriptor matching correspondences using HOG-MP; the red circles correspond to positions in image (a), the green crosses correspond to positions in image (b); each green cross should directly fit in a red circle, i.e. without a displacement, as no spatial deformation was imposed signal on which we will build both parts of the hybrid method. Subsequently, we will define the involved energy terms: the data term, and the coupling term. Finally, we will describe our optimization procedure.

Preliminaries
The mean phase and phase congruency [24] are derived from the analytic representation of a signal based on the Riesz transform [25]. They provide measures that are independent of the signal's amplitude, making them invariant to trends in the signal and variations in contrast. A representative example is shown in Fig. 1. The intensity distortion we added merely concerned the addition of a Gaussian intensity profile centered at a randomly selected point [26]. We plot profiles along a line segment suffering from a large local intensity variation in Fig. 1c, g. In Fig. 1c, the profiles from the original and distorted images clearly deviate because of the intensity distortion. In contrast, this is not the case in Fig. 1g. This example illustrates that the mean phase image is insensitive to local intensity changes.

Definition of the data term
Our data term E ALOST (w) exploits the autocorrelation of LOcal STructural information. It was inspired by the MIND concept [17], which is based on the local sum of squared differences (SSD) between two patches P of image I : where x 1 and x 2 are the centers of the two image patches p of size D × D × D. Notice that expanding the square of (2) reveals the inner product between the two patches, which can also be interpreted as a windowed (hence local) autocorrelation as a function of the difference x 1 −x 2 . The local SSDs are incorporated in a feature vector defined by: with r an offset in a predefined neighborhood R of size R × R×R around position x (spatial coordinator of the image), n a normalization constant (chosen such that the maximum value of the MIND feature at position x is one) and V (I , x) represents the average SSD in a small neighborhood N around In words, the feature vector consists of a 1-D vector of length R× R× R, which represents all correlations between a neighborhood P centered at a voxel (x) and a neighborhood P centered on a voxel (x + r).
We employ the mean phase image I MP and the phase congruency image I PC for image I in (3), since these are both insensitive to local contrast changes. Furthermore, we concatenate the feature vectors for I MP and I PC to form the ALOST representation as The ALOST representation has two important properties: (1) it is insensitive to intensity distortions; (2) it emphasizes 'salient' features (e.g. edges) in the image. For the sake of simplicity, we use in the remaining text the notation ALOST f (x) and ALOST m (x) to represent ALOST feature vectors calculated from the fixed image and the moving image, respectively. As such, the data term E data (w) in (1) can be defined as: The local diffusion regularization is used to avoid the illposedness of the image registration. More details of the ALOST method can be found in [19].

Definition of the descriptor matching and coupling terms
The registration procedure is made robust against getting stuck in a local minimum through descriptor matching in a discrete optimization scheme. To accomplish this, we densely calculate descriptors over both the fixed and moving images at the original resolution (without down-sampling) to preserve the thin bowel structures. Inspired by [8,27], we use the 3D HOG in our descriptor matching step. Therefore, we first calculate 3D gradients [G x , G y , G z ] using a coarse 1-D central difference operator in all three dimensions. Then, these 3D gradients are quantized based on a regular polyhedron with N 0 = 10 orientations (i.e. a icosahedron) to limit the computation time. Furthermore, a Gaussian filter with σ = 0.8 mm is applied to smooth along the polyhedron to reduce quantization effects. The final 3D HOG descriptor for each voxel consists of 10 × 7 entries (10 orientations times 7 cells, the cell of the current voxel plus cells from its 6 cardinal neighbors in 3D space). For building a cell, we empirically set neighborhoods N n = 9.
Notice that the discrete optimization aims to simplify the registration problem after which the continuous optimization follows to refine the registration. Therefore, a highly accurate calculation is not required. At the same time, we found that the HOG was sensitive to large local intensity variations (see see Results section). Accordingly, we propose to calculate HOG on the mean phase image (φ(x)), which describes the type of local structure and is insensitive to the intensity variations.
The descriptor matching term is defined as: where HOG-MP m and HOG-MP f denote the HOG descriptors of the mean phase images derived from the moving image and the fixed image, respectively; the summation is over all positions x i in the fixed image where a HOG descriptor was calculated; x j = x i + w 2 (x i ) is a position in the moving image with such a descriptor. We couple the discrete optimization functional to the continuous one by adding a coupling term to the latter

Optimization procedure
Approximate nearest neighbor (ANN) search method is used for descriptor matching [28]. Compared to exhaustive search method, ANN method allows reduction of the complexity without losing too much accuracy. Similar to [8], we use the ANN method twice (for searching from the moving image to the fixed image and vice versa) to exclude outliers. A continuous optimization strategy is subsequently used for refining the deformation field, after descriptor matching, by minimizing (1). We apply the Gauss-Newton optimization method in order to do so. We use a multi-resolution registration procedure in order to speed up the convergence. Notice that the multi-resolution strategy is only used in the continuous registration step. At that stage the largest deformations have already been resolved.
The "initialization" through w 1 is softly favored through E coupling (w, w 1 ) (see (1)), but we reduce the "influence" of w 1 when estimating w at finer resolutions. Different from [8], the weighting parameter of β is not fixed but decreasing with increasing iteration number of the registration procedure: β(k) = β 0 /k, where β 0 is the initial value and k the iteration number.

Results
The proposed hybrid registration method will be compared against two state-of-the-art registration techniques: the deformable registration via attribute matching and mutualsaliency weighting (DRAMMS), [29] and the MIND [17]. Both methods are based on neighborhood structure descriptors. In total, seven different registration methods will be used in the comparisons. These will be referred to as DRAMMS, DM-O (descriptor matching on original image), DM (descriptor matching on mean phase image), MIND, DM-MIND (descriptor matching plus MIND), ALOST [19] and DM-ALOST (descriptor matching plus ALOST, the hybrid method currently proposed).
The computation time for the ALOST and MIND methods on the abdominal dataset (containing the largest images of size 400*400*100) was less than 10 min on a personal computer equipped with an Intel Core TM 2 Quad Processor Q8400 clocked at 2.66 GHz and 4 GB RAM memory. The DM-ALOST and DM-MIND took around 20 min, which is in the same range as the DRAMMS approach. Figure 1d visualizes the descriptor matching results using HOG on the intensity images, HOG-I, and Fig. 1h pictures the descriptor matching results using HOG on the mean phase images, HOG-MP, between the same regions (middle left side) in Fig. 1a, b. Red circles indicate the positions around which the descriptors are calculated in Fig. 1a and green dots indicate positions in Fig. 1b. Since there were no spatial deformations, an identity correspondence should be found, i.e. each green dot should directly fit in a red circle, without a displacement. However, there is a clearly observable difference in the matching pairs by using the HOG-I descriptor (Fig. 1d) and the HOG-MP descriptor (Fig. 1h). Where the HOG-MP descriptor mostly finds the identity correspondence, the HOG-I descriptor finds correspondences over very large displacements. This outcome emphasizes that HOG-I is very sensitive to intensity variation, whereas HOG-MP is not.

Registration performance on synthetic abdominal images
We evaluated the performance of our registration framework in the presence of a local geometric deformation and a local intensity distortion. We empirically set α = 0.1 for ALOST (same as [19]) and in addition set β 0 = 0.5 for DM-ALOST. In the ALOST paper [19], we have extensively tested different settings for MIND, but found that the optimal settings were the default settings. We adopted this setting for this paper. For DRAMMS, we also tested several settings in this paper, but without too much difference in performance. Therefore, we used the default settings of DRAMMS and MIND in all our experiments. Furthermore, we selected a typical 2D slice (containing diseased bowel) from a 3D image for this synthetic experiment.
We generated 100 differently deformed images with large local large deformation and local intensity distortion from the selected reference image. First, a global geometric deformation with a thin-plate spline (TPS) model (similar to [26], 7 × 7 grid size) was applied to a reference image (Fig. 2a). Second, a local deformation was imposed around one randomly picked point from the image. This deformation was generated by randomly selecting a displacement between 15 to 25 pixels in the x and y direction, respectively. Thereafter, care was taken that the displacement field was continuous by Gaussian filtering, creating a Gaussian distributed displacement field around the picked point. The width of the Gaussian was set to a randomly selected value in the range from 20 to 50 pixels, which effectively controlled the locality of the deformation. Finally, a local intensity distortion was induced by adding a Gaussian-shaped intensity distribution centered at a randomly selected point (again as in [26]). An example is shown in Fig. 2b.
DM-ALOST gave the best overall result on data with only large geometric deformation, although the difference with DM-MIND was not significant (see Fig. 3a). Furthermore, the DM-ALOST outcome performs significantly better than all other approaches as assessed by a two-sided signed rank test ( p < 0.01) on data with large geometric deformations and local intensity distortions (Fig. 3b). One can also see that the results of DM on original image (DM-O) and DM on mean phase (DM) are similar for images without intensity distortion. On the other hands, DM gave significant better ( p < 0.01) results comparing to DM-O for images with intensity distortion. It shows that descriptor matching on mean phase images could give better initialization for the following continuous registration step. Therefore, we will only use DM for the rest of experiments.

Abdominal MR image pre-to post-contrast registration
The last series of experiments assess the suitability of the proposed method to register pre-and post-contrast abdominal MR data of patients with Crohn's disease. These MRI data were taken from a prior study, which has been approved by the medical ethics committee [30]. Thirty out of 33 patients from the prior study gave written consent to use their data for future investigations. The image size of the pre and post-contrast scans was 400 × 400 × 100 voxels with a resolution of 1 × 1 × 2 mm 3 and was acquired in a breath-hold. All patients underwent ileocolonoscopy within Box-and-whisker plots of the root-mean-squared error (RMSE) of the estimated deformation fields prior to (original) and after registration using seven different approaches. a Outcomes on 100 images that only contain a local geometric deformation. b Outcomes on 100 images containing both a local geometric deformation and a local intensity distortion 1 month after the MRI scan was acquired. We used the six methods as described in previous section to register the data (DM-O is excluded). The parameters were the same as in previous section. These settings also proved to yield the best registration result for the abdominal images (data not shown).

Quantitative assessment of the registration performance
Unfortunately, it is not a trivial task to identify landmark points on the bowel wall since there is deformation due to inspiration depth and a continuous deformation over time due to peristalsis between the scans. Therefore, we decided to label relevant 'landmark' regions, namely the bowel segments. A research fellow specialized in imaging Crohn's disease supervised by a radiologist delineated each bowel segment affected by Crohn's disease in the post-contrast MR images. Subsequently, two research fellows, both specialized in imaging Crohn's disease, independently annotated the same region in the pre-contrast images. Polygons were drawn in coronal slices. Each lesion was delineated in all slices in which it was observed. In total 30 lesions were identified in 30 bowel segments. A typical example is shown in Fig. 4. The mean Dice coefficient calculated over the annotations in the pre-contrast MR images was 0.71, reflecting the interobserver agreement. The Dice coefficient was also computed over corresponding pairs of annotations in the pre-and postcontrast images prior to registration and after registration by each of the six methods. The results are summarized in Table 1. One can see that the DM-ALOST outperformed the other methods for the data of both annotators. Particularly, the Wilcoxon signed rank test demonstrated that DM-ALOST was significantly ( p < 0.05) better than the other methods. Notice that the mean Dice coefficients of DM-ALOST, Fig. 4 Annotations on pre-and post-contrast MR images. a-c Reference annotations on the post-contrast MR images made by two research fellow; d-f corresponding annotations on the pre-contrast MR images made by the first research fellow; g-i same images as d-f but now containing the annotations by the second research fellow. Radiologists supervised the research fellows. Noticed that the annotations that were made by the two research fellows are partially different 0.68 and 0.69, closely approximate the inter-observer agreement..

Correlation of manual to semi-automatic measurements of relative contrast enhancement
The manual relative contrast enhancement (RCE) was calculated as defined above using the mean intensities measured over manually delineated regions in the pre-and post-contrast images. Four research fellows each supervised by a different radiologist outlined corresponding regions with the most disease activity in the pre-and post-contrast images of the 30 patients affected by the disease [30]. These regions comprised small focal areas on the order of 100 voxels, as is conventional. These local annotations were made on the most suspicious part of the bowel wall. This is how radiologists conventionally measure RCE. The RCE measurements for these 30 patients were used in previous study [31]. Unfortunately, the annotations for these RCE measurements were not saved. Moreover, one research fellow supervised by a radi-ologist delineated the entire region affected by the disease only on post-contrast MR images. These annotations could be saved and were used to calculate the semi-automatic RCE measurement. The RCE was semi-automatically measured by first calculating the mean intensity over this full region in the post-contrast image and by averaging over the region in the pre-contrast image that was copied from the post-contrast image. These semi-automatic RCE measures were correlated to the manual RCE measures by means of Pearson's correlation. The correlation coefficients thus obtained are shown in Table 2, stratified by averaged over four fellows. Notice that DM-ALOST gave a higher correlation than the other registration methods, although none of the differences were significant. It showed that DM-ALOST gave best registration results among all six methods [32].

Discussion and conclusion
We developed a hybrid registration framework called DM-ALOST to register images with large local deformations Table 1 Mean dice metric (MDM) between annotations on pre-and post-contrast abdominal MR images prior to registration (i.e. no registration), and after registration using, respectively, DRAMMS, DM, MIND, DM-MIND, ALOST and DM-ALOST The numbers printed in bold face are the maxima, i.e. the best results, per row. The standard deviation is shown between brackets combined with large local intensity variations. The main target application was the spatial alignment of 3D abdominal pre-and post-contrast MR images of patients with Crohn's disease. Differences in inspiration depth and peristalsis cause large local geometric deformations, while the administrated contrast agent causes large spatially localized contrast variations. The developed method-DM-ALOST-combines a discrete descriptor matching at the full image resolution with continuous optimization of a powerful representation exploiting the mean phase and the phase congruency. The discrete optimization enabled us to cope with local geometric deformations of small image structures and guides the continuous registration in such a way that it does not get trapped in a local minimum. The use of the mean phase offers the required invariance to large intensity distortions. It is important for our application that especially the diseased parts of the bowel are correctly registered. Previously, we developed a method to automatically segment the diseased bowel wall from post-contrast MR images [33]. An obvious way to enhance the registration might be to give high priority to such regions [34]. However, we have found that this does not lead to a significant improvement, which we attribute to the small size of the regions.
There are several limitations of our work. The optimal weighting parameters were manually selected for each registration task. Although this might bias the reported figures, we found little difference between the manually tuned settings for the different data sets and the different tasks. Indeed, empirical fine-tuning may be needed for particular task. For example, the scale of Gaussian filter should be increased if the SNR of image is too low. Therefore, we do not expect that it influences the ordering of the six methods in terms of performance. Another limitation is that the HOG-MP descriptor is not invariant to non-rigid deformations. For data hampered by very large local non-rigid deformations, the descriptor matching using HOG-MP might fail. In that case, descriptors could be considered such as described in [35,36]. However, that would go at the expense of a higher computation time. Furthermore, incorporating segmentation of the entire bowel (i.e. not only the diseased parts) might be useful, although the segmentation on the pre-contrast MR images is far from trivial.