Background

To achieve the best therapeutic outcome, modern radiotherapy has explored a variety of ways to maximize damage to the tumor while sparing the surrounding normal tissues [1, 35]. Accurate targeting of the tumor plays an important role in successful radiotherapy and motivated the concept of image-guided radiotherapy (IGRT). As one of the key steps in targeting the tumor, the registration between planning CT images and cone-beam CT (CBCT) images has been widely explored, and its techniques have improved significantly since the advent of CBCT.

Image registration techniques generally rely on the information contained in the images themselves. Current image registration methods fall roughly into two categories: geometric feature-based and image grayscale-based [2]. To date, these methods have been widely used to register planning CT and CBCT images. A CT-to-CBCT deformable registration approach that simultaneously corrects deformation and intensity proved robust against CBCT artifacts and intensity inconsistency [3]. A free-form deformable registration algorithm, which resulted in a high correlation between CBCT and the new planning CT, was also successfully applied [4]. Multiscale registration, which decomposes the images to be registered into a series of scales and registers the coarser scales iteratively, was regarded as an effective method for registering CT and daily CBCT images [5].

Registration techniques based on mutual information (MI) belong to the grayscale-based category and have been widely applied to multi-modal and mono-modal registration tasks. A multi-modal retinal image registration based on improved mutual information with adaptive probability density estimation achieved high accuracy and efficiency [6]. Three-dimensional registration techniques based on mutual information can also be applied to the alignment of brain tissues in magnetic resonance imaging time series or PET [7, 8]. A comparison between standard mutual information and normalized mutual information indicated that normalized mutual information is more stable and robust because it is immune to variations in entropy [9]. Mutual information is a very effective strategy for image registration, but the traditional method considers only image intensity and neglects spatial information, which leads to instability under intensity interpolation [10].

For the registration of medical images, spatial information is very important and should be incorporated into grayscale-based registration algorithms. A 3D-2D registration of CT and X-ray images incorporated spatial information in a variational approximation and obtained high registration accuracy [11]. Positions with large gradients usually correspond to tissue transitions, which provide spatial information [12]. For this reason, the wavelet transform was recently applied to image registration [13, 14]. The Daubechies complex wavelet transform, which is shift invariant and provides phase information, was successfully used for the fusion of multimodal medical images [15]. A flexible multiscale and shift-invariant representation of the registered images was first obtained using the stationary wavelet transform, and registration through a pulse-coupled neural network was then performed on the new representation [16, 17].

Some studies incorporated the gradient information of medical images into the mutual information to compensate for the lack of spatial information [18, 19]. These methods produced gradient images using the sum of the squares of the corresponding sub-band coefficients. However, gradient images generated this way lack the diagonal components, leading to a loss of edge information and low registration accuracy. In the registration between planning CT and CBCT images, the noise in CBCT images usually results in poor resolution in low-contrast areas and blurs image edges.

In this paper, we propose a registration method based on the stationary wavelet transform (SWT), which is translation invariant. This translation invariance is conducive to highlighting the edge features of an image and improves the registration accuracy. Experiments showed that our algorithm is robust.

Materials and methods

Materials

For the planning CT images and setup CBCT images in radiotherapy, a Siemens large-aperture CT scanner and a Varian Rapid Arc CBCT system were used for image registration. The CT image parameters are as follows: image matrix 512 × 512; pixel size 1.17 × 1.17 mm. The CBCT image parameters are as follows: image matrix 384 × 384; pixel size 1.27 × 1.27 mm. All participants gave their informed consent, and the Ethics Committee of Beijing Xuanwu Hospital Affiliated to Capital Medical University approved the protocol of this study.

Methods

In our proposed method, the reference image and the floating image were decomposed into three levels using the stationary wavelet transform. The low-frequency components produced at all levels of the decomposition were set to zero, and a gradient image of the original image was obtained by performing the inverse wavelet transform on the remaining high-frequency components. Then, the normalized mutual information, used as the similarity measure, was calculated for both the original images and the gradient images. Finally, a new similarity measure was synthesized with a weighting function. The Powell algorithm was used for multi-parameter optimization to produce the final spatial transformation parameters for the image registration.

Stationary wavelet transform of image

Nason and Silverman introduced the stationary wavelet transform in 1995 [20]. In contrast to orthogonal wavelets, the stationary wavelet transform, also known as the undecimated wavelet transform, is redundant, translation invariant, and provides a closer approximation to the continuous wavelet transform. It is an effective mathematical tool for edge detection [21–24] and retains the local time-frequency characteristics and multi-resolution analysis capability of the wavelet transform. The jth-level decomposition of the SWT is shown in Fig. 1.

Fig. 1
figure 1

Schematic diagram of the jth-level SWT decomposition. The signal \( c_j \) is decomposed into low-frequency components \( c_{j+1} \) and high-frequency components \( d_{j+1} \) by the low-pass and high-pass filters, respectively

The decomposition formulas of SWT are as follows:

$$ \begin{aligned} A_{j,k_1,k_2} &= \sum_{n_1}\sum_{n_2} h_0^{\uparrow 2^j}\left(n_1-2k_1\right)\,h_0^{\uparrow 2^j}\left(n_2-2k_2\right)\,A_{j-1,n_1,n_2} \\ D_{j,k_1,k_2}^1 &= \sum_{n_1}\sum_{n_2} h_0^{\uparrow 2^j}\left(n_1-2k_1\right)\,g_0^{\uparrow 2^j}\left(n_2-2k_2\right)\,A_{j-1,n_1,n_2} \\ D_{j,k_1,k_2}^2 &= \sum_{n_1}\sum_{n_2} g_0^{\uparrow 2^j}\left(n_1-2k_1\right)\,h_0^{\uparrow 2^j}\left(n_2-2k_2\right)\,A_{j-1,n_1,n_2} \\ D_{j,k_1,k_2}^3 &= \sum_{n_1}\sum_{n_2} g_0^{\uparrow 2^j}\left(n_1-2k_1\right)\,g_0^{\uparrow 2^j}\left(n_2-2k_2\right)\,A_{j-1,n_1,n_2} \end{aligned} $$
(1)

where \( A_{j,k_1,k_2} \), \( D_{j,k_1,k_2}^1 \), \( D_{j,k_1,k_2}^2 \) and \( D_{j,k_1,k_2}^3 \) are the low-frequency component (LL), the horizontal high-frequency component (LH), the vertical high-frequency component (HL) and the diagonal high-frequency component (HH) of the stationary wavelet transform, respectively. \( h_0^{\uparrow 2^j} \) and \( g_0^{\uparrow 2^j} \) denote the filters \( h_0 \) and \( g_0 \) upsampled by inserting \( 2^j - 1 \) zeros between adjacent filter coefficients. The corresponding reconstruction algorithm (inverse SWT) is

$$ \begin{aligned} A_{j-1,n_1,n_2} = \frac{1}{4}\sum_{i=0}^{3}\Bigg\{ &\sum_{k_1}\sum_{k_2} h_1\left(n_1-2k_1-i\right)\,h_1\left(n_2-2k_2-i\right)\,A_{j,k_1,k_2} \\ {}+{} &\sum_{k_1}\sum_{k_2} h_1\left(n_1-2k_1-i\right)\,g_1\left(n_2-2k_2-i\right)\,D_{j,k_1,k_2}^1 \\ {}+{} &\sum_{k_1}\sum_{k_2} g_1\left(n_1-2k_1-i\right)\,h_1\left(n_2-2k_2-i\right)\,D_{j,k_1,k_2}^2 \\ {}+{} &\sum_{k_1}\sum_{k_2} g_1\left(n_1-2k_1-i\right)\,g_1\left(n_2-2k_2-i\right)\,D_{j,k_1,k_2}^3 \Bigg\} \end{aligned} $$
(2)

The components of the image after the SWT are shown in Fig. 2.

Fig. 2
figure 2

The images produced by the SWT decomposition. Parts a, b, c and d are the low-frequency components, the horizontal high-frequency components, the vertical high-frequency components and the diagonal components of the stationary wavelet transform, respectively
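As an illustration of this decomposition, the following minimal Python sketch uses the PyWavelets library; the choice of the 'db2' mother wavelet is our assumption, since the wavelet is not specified here (the paper's own implementation was in MATLAB).

```python
import numpy as np
import pywt

def swt_decompose(image, wavelet="db2", level=3):
    """Three-level stationary wavelet decomposition of an image.

    Image sides must be divisible by 2**level (512 x 512 is fine).
    Each returned entry is (cA, (cH, cV, cD)): the LL approximation plus
    the horizontal (LH), vertical (HL) and diagonal (HH) detail sub-bands,
    all kept at the full image resolution (no decimation).
    """
    return pywt.swt2(image, wavelet, level=level)

if __name__ == "__main__":
    img = np.random.rand(512, 512)        # stand-in for a 512 x 512 CT slice
    coeffs = swt_decompose(img)
    cA, (cH, cV, cD) = coeffs[0]
    print(cA.shape, cH.shape, cV.shape, cD.shape)  # all (512, 512)
```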

Synthesized gradient image based on stationary wavelet transform

The high-frequency sub-bands of the wavelet transform highlight the differences between neighboring pixels in an image [25]. A large wavelet coefficient indicates a boundary between two regions of distinct intensity in the original image. The stationary wavelet transform is translation invariant, which helps to identify image edge features. To improve the resolution of edge details, an image with prominent edge features can be reconstructed by applying the inverse SWT to the three groups of detail coefficients (LH, HL, HH). The CBCT image and the gradient image generated with the SWT are shown in Fig. 3.

Fig. 3
figure 3

The CBCT image and the gradient image produced after the SWT. Part a is the original CBCT image and Part b is the gradient image after the SWT
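A minimal sketch of this gradient-image synthesis, under the same assumptions as above: the approximation (LL) coefficients at every level are set to zero and the inverse SWT is applied to the remaining detail sub-bands.

```python
import numpy as np
import pywt

def swt_gradient_image(image, wavelet="db2", level=3):
    """Synthesize a gradient image by zeroing all low-frequency (LL)
    coefficients and reconstructing from the LH/HL/HH detail sub-bands."""
    coeffs = pywt.swt2(image, wavelet, level=level)
    # Zero the approximation at every level; keep all detail sub-bands.
    coeffs = [(np.zeros_like(cA), details) for cA, details in coeffs]
    return pywt.iswt2(coeffs, wavelet)
```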

Similarity measure

In this paper, mutual information is used as the similarity measure. As a basic concept of information theory, mutual information describes the statistical correlation between two systems, or the amount of information about one system contained in another. In multimodality image registration, when the spatial positions of two images are completely aligned, the mutual information, i.e., the information of one image expressed by the other, reaches its maximum. Mutual information can be expressed in terms of entropy, which describes the complexity or uncertainty of a system. The entropy of a system A is defined as

$$ H(A) = -\sum_a p_A(a) \log p_A(a) $$
(3)

The joint entropy of two systems is defined as

$$ H\left(A,B\right) = -\sum_{a,b} p_{A,B}\left(a,b\right) \log p_{A,B}\left(a,b\right). $$
(4)

where \( a \in A \), \( b \in B \), \( p_A(a) \) is the marginal probability density function and \( p_{A,B}(a,b) \) is the joint probability density function. The mutual information between the two systems is thus expressed as

$$ I\left(A,B\right)=H(A)+H(B)-H\left(A,B\right). $$
(5)

Mutual information is sensitive to the amount of overlap between images, and normalized mutual information (NMI) was proposed to overcome this problem. It is defined as

$$ NMI\left(A,B\right)=\frac{H(A)+H(B)}{H\left(A,B\right)}. $$
(6)
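Equations (3)–(6) translate directly into a joint-histogram estimate. The following Python sketch shows one way to compute NMI; the 256-bin histogram is our assumption.

```python
import numpy as np

def normalized_mutual_information(a, b, bins=256):
    """NMI(A, B) = (H(A) + H(B)) / H(A, B), Eq. (6), estimated from a
    joint gray-level histogram of two images of equal size."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()          # joint pdf p_{A,B}(a, b), Eq. (4)
    p_a = p_ab.sum(axis=1)              # marginal pdf p_A(a), Eq. (3)
    p_b = p_ab.sum(axis=0)              # marginal pdf p_B(b)

    def entropy(p):
        p = p[p > 0]                    # skip empty bins to avoid log(0)
        return -np.sum(p * np.log(p))

    return (entropy(p_a) + entropy(p_b)) / entropy(p_ab.ravel())
```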

Registration method based on stationary wavelet transform

We denoted the normalized mutual information of the original images as NMIi and that of the synthesized gradient images based on the SWT as NMIg; both were calculated using Eq. (6). Liu [24] merged original images with gradient information using the method described below, and the new similarity measure is given by:

$$ NMI = f\big(v\left(NMI_i, NMI_g\right)\big)\, NMI_i + \Big(1 - f\big(v\left(NMI_i, NMI_g\right)\big)\Big)\, NMI_g $$
(7)

where

$$ f(v) = \frac{1}{1 + \exp\left(-\left(v - 0.5\right)/T\right)} $$
(8)
$$ v\left(x,y\right) = \frac{x + y}{2} $$
(9)

The weighting function f(v) is shown in Fig. 4, where T is a parameter that controls the shape of f(v). The weighting function is essentially a logistic function, which is saturating, differentiable, nonlinear and bounded, making it well suited for merging the two registration measures. In our experience, T = 0.04 is an appropriate value.

Fig. 4
figure 4

Weighting function f(v). The shape of the weighting function depends on the parameter T; the blue, red and green lines correspond to T values of 0.04, 0.1 and 0.001, respectively
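Equations (7)–(9) are a direct transcription into code; the following sketch continues the Python examples above (the function names are ours).

```python
import numpy as np

T = 0.04  # shape parameter of the logistic weighting function, Eq. (8)

def f(v, T=T):
    """Logistic weighting function of Eq. (8)."""
    return 1.0 / (1.0 + np.exp(-(v - 0.5) / T))

def combined_nmi(nmi_i, nmi_g):
    """New similarity measure of Eq. (7): a weighted blend of the NMI of
    the original images (nmi_i) and of their gradient images (nmi_g)."""
    v = (nmi_i + nmi_g) / 2.0           # Eq. (9)
    w = f(v)
    return w * nmi_i + (1.0 - w) * nmi_g
```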

Our method follows this procedure (a minimal sketch of steps 1–3 appears after the list):

  • Step 1: decompose the reference image R and the floating image F using the stationary wavelet transform;

  • Step 2: set the low-frequency coefficients at all levels of the stationary wavelet decomposition to zero and perform the inverse transform to reconstruct the corresponding gradient images;

  • Step 3: combine the original images and the synthesized gradient images using the weighting function to form a new similarity measure;

  • Step 4: use the Powell multi-parameter optimization algorithm to optimize the spatial transformation parameters (Δx, Δy, θ), which completes the registration.
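Putting steps 1–3 together, the following sketch expresses the similarity measure as a function of a candidate transform (Δx, Δy, θ). The use of scipy.ndimage for the rigid transform, and the choice to build the gradient images once and transform the floating gradient along with F, are our assumptions.

```python
import numpy as np
from scipy import ndimage

# swt_gradient_image, normalized_mutual_information and combined_nmi are
# the sketches defined in the previous sections.

def apply_rigid(image, dx, dy, theta):
    """Rotate by theta (degrees) about the image center, then translate."""
    out = ndimage.rotate(image, theta, reshape=False, order=1)
    return ndimage.shift(out, (dy, dx), order=1)

def similarity(params, ref, flt, ref_grad, flt_grad):
    """Negative combined NMI of Eq. (7) for a candidate (dx, dy, theta)."""
    dx, dy, theta = params
    moved = apply_rigid(flt, dx, dy, theta)
    # Gradient images are built once (steps 1-2) and transformed with F.
    moved_grad = apply_rigid(flt_grad, dx, dy, theta)
    nmi_i = normalized_mutual_information(ref, moved)
    nmi_g = normalized_mutual_information(ref_grad, moved_grad)
    return -combined_nmi(nmi_i, nmi_g)   # negated: optimizers minimize
```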

Optimization algorithm

Two-dimensional image registration is essentially a multi-parameter optimization problem, namely, searching for the three registration parameters (two translations and one rotation) that maximize the mutual information. In this paper, the Powell multi-parameter optimization algorithm and the Brent one-dimensional search algorithm are used to optimize the parameters.
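For illustration, SciPy's 'Powell' minimizer, which combines Powell's direction-set method with Brent line searches, can stand in for this optimizer; the starting point and tolerances below are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Uses swt_gradient_image and similarity from the sketches above.
def register(ref, flt):
    """Estimate (dx, dy, theta) maximizing the combined NMI of Eq. (7)."""
    ref_grad = swt_gradient_image(ref)
    flt_grad = swt_gradient_image(flt)
    res = minimize(similarity, x0=np.zeros(3),
                   args=(ref, flt, ref_grad, flt_grad),
                   method="Powell",
                   options={"xtol": 1e-3, "ftol": 1e-4})
    return res.x                         # (dx, dy, theta)
```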

Results

Our algorithm was implemented in Matlab R2008a. We selected ten medical images as reference images, and the floating images were generated by spatial transformation of the corresponding reference images. Since the preset translations in the X and Y directions and the rotation angle θ (shown in Table 1) were known, the correctness and accuracy of the algorithm could be judged directly: the smaller the gray-level difference between the registered image F and the reference image R, the higher the registration accuracy.

Table 1 Experiment results and error

The root mean square error (RMSE), defined as follows, is employed as the registration error [26–29].

$$ RMSE = \sqrt{\frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[R\left(i,j\right) - F\left(i,j\right)\right]^2}. $$
(10)

The smaller the RMSE, the higher the registration accuracy; for two identical images, RMSE = 0. We took the ten images with preset transformation parameters as the experimental data; the results are shown in Table 1.
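Eq. (10) in code, continuing the sketches above:

```python
import numpy as np

def rmse(R, F):
    """Root mean square gray-level error between two registered images, Eq. (10)."""
    R = np.asarray(R, dtype=float)
    F = np.asarray(F, dtype=float)
    return np.sqrt(np.mean((R - F) ** 2))
```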

Judging from the above experimental results, the proposed registration method is accurate (subpixel accuracy) and robust. However, compared with mutual information registration, the proposed method is time-consuming, because it must calculate not only the mutual information of the original images but also that of the gradient images, which increases the computational load.

The CBCT image is shown in Fig. 3a. Because of their smaller image matrix and larger pixel size, the CBCT images must be up-sampled to the size of the CT image before registration. Because of the considerable initial misalignment between our CT and CBCT images, the CBCT images were first shifted by 80 mm along both the X and Y directions to manually narrow the difference and reduce the subsequent registration time. To test the CT-CBCT registration algorithm, we generated ten images by applying manual spatial transformations to the CBCT images in advance (as shown in Table 2): five were translated linearly in the Y direction, and the other five were translated linearly in the X direction. They were then registered with the corresponding layer of the CT images. If the corresponding translation term of the registration varies linearly and the rotation angle is close to 0°, high registration accuracy is indicated. The registration results are shown in Table 2.

Table 2 Registration results of the CT and CBCT images
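As an illustration of the preprocessing step above, the following sketch resamples a 384 × 384 CBCT slice with 1.27 mm pixels to the 1.17 mm CT pixel size and pads it to the 512 × 512 CT matrix; the zero-padding and the bilinear interpolation order are our assumptions.

```python
import numpy as np
from scipy import ndimage

def resample_cbct_to_ct(cbct, cbct_px=1.27, ct_px=1.17, ct_size=512):
    """Resample a CBCT slice to the CT pixel size, then zero-pad to the CT matrix."""
    zoomed = ndimage.zoom(cbct, zoom=cbct_px / ct_px, order=1)  # 384 -> ~417 px
    pad = ct_size - zoomed.shape[0]
    before = pad // 2
    return np.pad(zoomed, ((before, pad - before),) * 2, mode="constant")
```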

To compare our proposed method with standard mutual information, the linearity of the recovered translations along the X and Y directions is plotted in Fig. 5.

Fig. 5
figure 5

The linearity of the transformation parameters in the X and Y directions. Part a shows the translations in the Y direction and part b shows the translations in the X direction

As can be seen from Table 2 and Fig. 5, with our proposed method the recovered translations in the Y direction for the first five images, and in the X direction for the last five images, vary linearly. Therefore, the proposed method is more robust than traditional mutual information registration. The registration results of CBCT to CT images are shown in Fig. 6.

Fig. 6
figure 6

Results of registering CBCT to CT images. Part a is the original CT image; part b is the CBCT image; part c is the registration result using the conventional MI method; part d is the registration result using our proposed method; part e is the fusion of the CBCT and CT images with the conventional MI method; part f is the fusion of the CBCT and CT images with our proposed method; part g is the difference (f − e) between the CBCT images registered by the two methods

Discussion

Based on mutual information and the stationary wavelet transform, the registration of CT images with manually transformed CT images resulted in a lower RMSE than standard mutual information, which indicates that the algorithm we proposed is more accurate. The registration results of the CT and CBCT images showed that the transformation parameters of our method were more linearly related to the preprocessing parameters along the corresponding directions, which indicates that our method is also more robust. The stationary wavelet transform can be applied to obtain the spatial information of the registered images, and the normalized mutual information can be used as the similarity measure in the registration. The combination of these techniques yielded an effective registration of CT and CBCT, which is indispensable for the accurate localization of the tumor in radiotherapy. On the other hand, compared with other methods, the proposed algorithm calculates the mutual information of both the original images and the gradient images, which increases the computational cost. Shortening the running time of the algorithm will be the focus of our future research.

There are still some limitations in our work. First, we only used gradient information to investigate the registration between CT and CBCT images, so compared with state-of-the-art methods for multimodal image registration (such as the registration of CT, MRI and PET images) [30–34], our proposed method may not perform as well. In future work, we will continue to improve our method and apply it to multimodal images for further evaluation.

Conclusions

In this paper, we proposed a medical image registration algorithm based on the SWT and mutual information. The algorithm synthesizes a gradient image by exploiting the translation invariance of the SWT and incorporates it, through a weighting function, into the mutual information calculation of the original image to obtain a new similarity measure. The proposed method effectively overcomes the weakness of mutual information registration, namely its lack of spatial information. Experimental results showed that the proposed method is robust and accurate. For the registration between planning CT images and setup CBCT images in radiotherapy, the SWT is redundant and translation invariant, which is conducive to identifying sharp variations in the image. Furthermore, image reconstruction based on the SWT tends to highlight edge features and enhances the resolution of edge details. In particular, for noisy CBCT images, more accurate gradient information can be extracted, thereby improving the registration accuracy.