# A feature-based solution for 3D registration of CT and MRI images of human knee

## Abstract

This paper presents a feature-based solution for 3D registration of CT and MRI images of a human knee. It facilitates constructing high-quality models with clear outlining of bone tissues and detailed illustration of soft tissues. The model will be used for analysing the effect of posterior cruciate ligament and anterior cruciate ligament deficiency. The solution consists of preprocessing, feature extraction, transformation parameter estimation and resampling, and blending. In preprocessing, we propose partial preserving and iterative neighbour comparing filtering to help segment bone tissues from MRI images without having to construct a statistical model. Through analysing the characteristics of knee images, tibia and femur are selected as the features, and the algorithm for effectively extracting them is described. To estimate transformation parameters, we propose a method based on the statistical information of projected feature images, including translating according to the projected feature image centroids and calculating the rotation angle by searching and mapping boundary points. We transform the MRI image and then blend it with the CT image by taking the maximum intensity of each pair of corresponding voxels from the two images. At the end of the paper, the registration result is evaluated by computing the Pearson product-moment correlation coefficient of the binarised features, and the accuracy is confirmed.

### Keywords

Feature-based image registration · Magnetic resonance imaging · Computed tomography · Human knee

## 1 Introduction

Human knees are commonly injured and suffer from various conditions. Patients who have suffered significant trauma or chronic osteoarthritis pain are usually asked to undergo an MRI of the knees, as this shows clear and detailed structures of the cartilage, ligaments, and muscles [1]. Meanwhile, CT scans are adopted where the actual bone structures must be clearly outlined. Registering and blending CT and MRI images of the human knee facilitates constructing high-quality models that combine both modalities and provide far more abundant information. This model is constructed for the purpose of analysing the effect of posterior cruciate ligament and anterior cruciate ligament deficiency. It is also considered a potentially useful tool for disease diagnosis, stress analysis, and computer-aided surgery.

This paper first introduces image registration and the typical classification of registration methods, i.e., intensity-based methods and feature-based methods. Then, we review our previous work and analyse the feasibility of applying the mutual information method, one of the state-of-the-art intensity-based methods for multimodal image registration. Concluding that this method has several drawbacks, we consider using a feature-based method instead and point out our main focuses.

In methodology, this paper proposes a feature-based solution for 3D registration of CT and MRI images of the human knee. The solution consists of four steps: preprocessing, feature extraction, transformation parameter estimation and resampling, and blending.

*Preprocessing.* Preprocessing enhances the images so that the features can be extracted more easily; it produces a clear contrast between bone tissues and the other parts of the image. Different approaches are adopted to preprocess CT and MRI images. CT images are preprocessed by thresholding and a hole filling operation. MRI images are preprocessed with a series of operations to increase the contrast between bone tissues and soft tissues as well as to eliminate noise inside bone tissues. Since different parts of an MRI image vary considerably, the parameters of these operations are applied piecewise. An algorithm is proposed for separating MRI slices into four parts.

*Feature extraction.* Tibia and femur, as the two largest bones viewed in the image, are proper choices for anatomical features. They are extracted by thresholding and morphological opening. Morphological opening is performed on a portion of the slices in the image. An algorithm is proposed for identifying whether a slice should be processed with morphological opening or not.

*Transformation parameter estimation and resampling.* Based on the features extracted, transformation parameters are estimated utilising the statistical information of these features. The sensed image is resampled according to the transformation matrix computed.

*Blending.* CT and MRI images are in the end blended by taking the maximum intensity of each corresponding voxel.

## 2 Analysis of methods

Image registration is the process of finding correspondence between all points in two images of a scene. The process spatially aligns the images, making it possible to fuse information in the images. The correspondence is often established by finding a transformation that minimises some distance between the transformed sensed image and the reference image [2].

Over the years, many approaches have been proposed for image registration. These approaches are generally categorised into intensity-based methods (or area-based methods) and feature-based methods.

*Intensity-based methods* [3, 4, 5] operate directly on voxel intensities. The basic principle of these methods is to search for a transformation that optimises a criterion measuring the intensity similarity of corresponding voxels. Among different measures, mutual information has proven to be an excellent choice for cross-modality registrations. This measure assumes the statistical dependence of the voxel intensities is maximal when the images are geometrically aligned.

*Feature-based methods* [6, 7, 8] typically extract distinct anatomical features from images and find the correspondence and transformation between them. Features such as points, curves, and surfaces are often employed in transformation model estimation. These methods can handle complex between-image distortions and are faster than the mutual information method, since no evaluation of a matching criterion over the whole image is needed [9].

Registration of medical images such as brain images often employs the mutual information method, since such images contain few features that are distinctive and easily detectable [10]. The measure only makes a fairly loose assumption that image intensities should have a probabilistic relationship. This assumption holds for brain images, whose statistical dependence is relatively strong. However, directly applying it to knee images is very likely to fail [11]. CT and MRI images differ in many ways. First, an MRI image provides more detailed information (redundant when building a probabilistic relationship) about soft tissues than a CT image. Second, the contrast between bone tissues and soft tissues in a CT image is significant enough that a threshold exists to separate them, while this boundary is ambiguous in an MRI image. Third, the intensities of bone tissues are greater than those of soft tissues in a CT image, while the opposite holds in an MRI image.

In our previous work [12], two-dimensional images are preprocessed into a pair of bone skeletons. This provides images with stronger statistical dependence. The registration is carried out by optimisation of mutual information using Powell’s method [13]. This approach provides a relatively accurate registration result. However, it is time-consuming, as each evaluation of the mutual information criterion involves all the voxels in the images and the number of iterations during optimisation is very large [14]. Adding a third dimension, as with the 3D MRI and CT images in this work, makes the search take even longer.

Now, we consider the feasibility of using feature-based methods. Selecting proper features and extracting them are the keys to the success of feature-based image registration.

The correct alignment of bones is of most concern in the registration of the human knee [12]. In our research, we acquire the image pair with the assistance of a customised leg stand so that the angles of knee bending are guaranteed to be the same and the knees are already aligned along the \(z\) axis (head to foot direction). Bone tissues, unlike soft tissues, are rigid and their physical appearance does not change easily with environment. They are also salient features that can be easily located and identified by the human eye. Tibia and femur are the two largest bones in knee images; hence, they are proper choices for anatomical features in our registration.

Tibia and femur can be extracted from a CT image easily, but they are very difficult to extract from an MRI image. Standard automatic segmentation methods often fail to provide reasonable segmentation, as different tissues have overlapping image intensity values and boundaries between tissues are not clearly separated [15]. Most existing segmentation methods for MRI images of the human knee rely on constructing a statistical model, the accuracy of which is strongly influenced by the amount of input data [1, 15].

Furthermore, the data describing the desired structures are hand-segmented, which can be prohibitively laborious. For these reasons, it is important to find an approach that overcomes the shortcomings of these methods.

## 3 Methodology

### 3.1 Preprocessing

Preprocessing is essential since the images may contain noise and the contrast between bone tissues and other parts may not be significant enough. Preprocessing facilitates the feature extraction step.

#### 3.1.1 Masking

We first present the steps for computing the threshold \(\theta _\mathrm{mask}\). Store the intensity of each voxel in image \(f(x,y,z)\) into vector \(V\). Then, compute the number of bins in the image histogram, \(N_\mathrm{bins} = \lceil \frac{|V|}{S_1}\rceil \), where the value \(S_1\) is chosen based on the number of voxels in the image to generate a moderate resolution of histogram to discover the gap. We test different \(S_1\) with values 10, 100, 1,000, 10,000, and 100,000, and conclude that \(S_1 = 10,000\) is a safe and cost-effective choice.

Define the threshold for identifying the gap as \(\eta _\mathrm{gap} = \frac{|V|}{S_2}\), where \(S_2\) is chosen so that \(\eta _\mathrm{gap}\) is small enough for the gap to be discovered, yet not so small that the gap is missed. We determine \(S_2\) with a method similar to that used for \(S_1\), and set \(S_2 = 20\).
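
The histogram-gap search above can be sketched as follows (Python/NumPy here for illustration; the paper's implementation is in Matlab). How the gap bin maps back to \(\theta _\mathrm{mask}\) is an assumption on our part: we take the centre of the first near-empty bin after the histogram's main peak.

```python
import numpy as np

def mask_threshold(f, S1=10_000, S2=20):
    """Estimate theta_mask from the histogram gap (illustrative sketch).

    S1 controls histogram resolution (N_bins = ceil(|V| / S1));
    S2 sets the gap-detection threshold eta_gap = |V| / S2.
    Assumption: theta_mask is the centre of the first bin after the
    main peak whose count falls below eta_gap.
    """
    V = f.ravel().astype(float)
    n_bins = max(int(np.ceil(V.size / S1)), 2)
    eta_gap = V.size / S2
    counts, edges = np.histogram(V, bins=n_bins)
    peak = int(np.argmax(counts))
    for i in range(peak + 1, len(counts)):
        if counts[i] < eta_gap:               # near-empty bin: the gap
            return 0.5 * (edges[i] + edges[i + 1])
    return edges[-1]                          # fallback: no gap found
```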

#### 3.1.2 Gamma correction

#### 3.1.3 Partial preserving

#### 3.1.4 Iterative neighbour comparing filtering

Iterative neighbour comparing filtering increases the intensities of the voxels other than bone tissues as well as eliminates noise in bone tissues.

This step consists of a number of iterations; we denote the number of iterations as Iter. In each iteration, \(I_\mathrm{v}\) and \(I_\mathrm{s}\) are computed. \(I_\mathrm{v}\) denotes the intensity of a voxel multiplied by \(|\varOmega _\mathrm{v}|\), where \(\varOmega _\mathrm{v}\) denotes the voxel itself and its neighbour voxels. \(I_\mathrm{s}\) denotes the sum of the intensities of the voxels in \(\varOmega _\mathrm{v}\). \(I_\mathrm{v}\) and \(I_\mathrm{s}\) are compared to determine the operation applied to this voxel. If \(I_\mathrm{v}\) and \(I_\mathrm{s}\) are close to each other, the voxel is usually located in a continuous volume and its intensity should be increased. If \(I_\mathrm{v}\) is much greater than \(I_\mathrm{s}\), the voxel is inferred to be isolated and probably noise to be eliminated. If \(I_\mathrm{v}\) is much smaller than \(I_\mathrm{s}\), the intensity of the voxel should remain unchanged, since this happens either when there is a noise voxel with high intensity nearby or when the voxel is located in a continuous volume but has a relatively small intensity compared with its neighbours. A parameter \(\lambda \) is introduced into the comparison to control the filter behaviour.

Then, compute \(I_\mathrm{v} = |\varOmega _\mathrm{v}| \cdot f(v)\) and \(I_\mathrm{s} = \sum _{u \in \varOmega _\mathrm{v}} f(u)\).
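
A minimal sketch of this filter, assuming a cubic neighbourhood, a fixed boost factor `gain`, and a concrete form of the \(\lambda \)-comparison (all assumptions; the paper gives only the qualitative rules):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def neighbour_compare_filter(f, lam=0.4, gain=1.1, iters=3, size=3):
    """One possible reading of iterative neighbour comparing filtering.

    I_v = |Omega_v| * f(v); I_s = sum of intensities in Omega_v.
    Assumed rules: boost when |I_v - I_s| <= lam * I_s (continuous
    volume), zero when I_v > (1 + lam) * I_s (isolated noise),
    leave unchanged otherwise (I_v much smaller than I_s).
    """
    g = f.astype(float).copy()
    n = size ** 3                                 # |Omega_v| for a cube
    for _ in range(iters):
        I_s = uniform_filter(g, size=size, mode='nearest') * n
        I_v = n * g
        close = np.abs(I_v - I_s) <= lam * I_s    # continuous volume
        isolated = I_v > (1 + lam) * I_s          # probable noise
        g = np.where(close, gain * g, g)          # raise continuous voxels
        g = np.where(isolated, 0.0, g)            # eliminate noise
    return g
```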

#### 3.1.5 Grayscale morphological closing

The foregoing steps performed on the MRI image may break the continuity of the voxel intensities of soft tissues. A morphological close operation is employed here to repair these broken voxels. The close operation is a dilation followed by an erosion with the same structuring element for both operations.

Let \(f(x)\) denote a grayscale image and \(b(x)\) the structuring function. The dilation of \(f\) by \(b\) is given by \((f\oplus b)(x)=\sup _{y\in E}[f(y)+b(x-y)]\), and the erosion of \(f\) by \(b\) is given by \((f\ominus b)(x)=\inf _{y\in E}[f(y)-b(y-x)]\), where “sup” denotes the supremum, “inf” denotes the infimum, and \(E\) is the grid. The close operation \(f\cdot b\) is then given by \(f\cdot b=(f\oplus b)\ominus b\). The structuring element \(b\) used here is flat and disc-shaped with a radius of \(R\).
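
Grayscale closing with a flat disc is available off the shelf; a sketch using SciPy (the slice-by-slice application is our assumption, matching the per-slice radius \(R\) in the parameter table):

```python
import numpy as np
from scipy.ndimage import grey_closing

def disc(R):
    """Flat, disc-shaped structuring element of radius R."""
    y, x = np.ogrid[-R:R + 1, -R:R + 1]
    return (x * x + y * y) <= R * R

def close_slices(vol, R=4):
    """Grayscale closing (dilation then erosion with the same disc),
    applied independently to each slice of the volume."""
    return np.stack([grey_closing(s, footprint=disc(R)) for s in vol])
```

Closing fills small dark gaps, which is exactly the "broken voxel" repair wanted here.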

#### 3.1.6 Piecewise processing

Recommended parameters for MRI image slices.

| Part | \(\lambda \) | \(\tau \) | \(sz_{\varOmega }\) | Iter | Prop | \(\phi \) | \(R\) |
|---|---|---|---|---|---|---|---|
| 1 | 0.8 | 4 | 2 | 3 | 0.9 | 0.20 | 5 |
| 2 | 0.4 | 4 | 2 | 3 | 0.9 | 0.45 | 4 |
| 3 | 0.3 | 4 | 2 | 3 | 0.9 | 0.30 | 4 |
| 4 | 0.4 | 4 | 2 | 3 | 0.9 | 0.20 | 7 |

### 3.2 Feature extraction

#### 3.2.1 Thresholding

Thresholding sets the intensities of bone tissues to one and all others to zero. The MRI image is thresholded by setting the voxels of the masked volume whose intensity is nonzero to one and all others to zero. The CT image is thresholded with Otsu’s method, which chooses the threshold minimising the intraclass variance of the thresholded black and white pixels [17].

#### 3.2.2 Connected component selection

After thresholding, the two largest connected components in each image are selected. For MRI images, these connected components are already tibia and femur. For CT images, however, there are still extra bone tissues connected to tibia or femur, i.e., they are also included in these two connected components.
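
Selecting the two largest 3D components is a standard labelling operation; a sketch with SciPy (6-connectivity, SciPy's default for 3D, is our assumption):

```python
import numpy as np
from scipy import ndimage

def two_largest_components(binary_vol):
    """Keep only the two largest 3D connected components
    (intended to be tibia and femur)."""
    labels, n = ndimage.label(binary_vol)
    if n <= 2:
        return binary_vol.astype(bool)
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0                          # ignore the background label
    keep = np.argsort(sizes)[-2:]         # labels of the two largest
    return np.isin(labels, keep)
```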

#### 3.2.3 Morphological opening

This step eliminates the extra bone tissues connected to tibia or femur with a morphological open operation. Morphological opening is the dilation of the erosion of a set \(A\) by a structuring element \(B\); it serves to eliminate relatively small regions. The structuring element employed here is flat and disc-shaped.
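
A per-slice sketch of binary opening with a flat disc (the radius value is illustrative, not from the paper):

```python
import numpy as np
from scipy.ndimage import binary_opening

def disc(R):
    """Flat, disc-shaped structuring element of radius R."""
    y, x = np.ogrid[-R:R + 1, -R:R + 1]
    return (x * x + y * y) <= R * R

def open_slice(mask, R=3):
    """Erosion followed by dilation with the same disc: removes thin
    bridges and regions smaller than the structuring element."""
    return binary_opening(mask, structure=disc(R))
```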

### 3.3 Transformation parameter estimation and resampling

Now that tibia and femur are extracted out of the original images, we can use the features for transformation parameter estimation. This step estimates the parameters required to produce the transformation matrix in resampling the sensed image to the spatial coordinates of the reference image. The transformation here is a typical affine transformation, where an object can translate, rotate, and scale. In this part, parameters for translation, rotation, and scaling are estimated sequentially.

#### 3.3.1 Preparation

Let CT and MRI denote the binarised images of the features. They represent the same range along \(z\) axis and should be scaled so that they contain the same number of slices \(h\).

#### 3.3.2 Translation

This step moves the centroids of the features to the same position and determines parameters \(tx\) and \(ty\), the displacements along the \(x\) and \(y\) axes, respectively.
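
The translation reduces to a difference of feature centroids; a sketch on the projected binarised feature images (rows are taken as \(y\) and columns as \(x\), an assumption about the image layout):

```python
import numpy as np

def centroid_shift(ct_proj, mri_proj):
    """tx, ty: displacement moving the MRI feature centroid onto the
    CT feature centroid, computed on projected binarised features."""
    c_ct = np.array(np.nonzero(ct_proj)).mean(axis=1)    # (row, col)
    c_mri = np.array(np.nonzero(mri_proj)).mean(axis=1)
    ty, tx = c_ct - c_mri       # rows correspond to y, columns to x
    return tx, ty
```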

#### 3.3.3 Rotation

This step calculates rotation angle \(\beta \) based on \({\text {MRI}}_\mathrm{proj}^\prime \) and \({\text {CT}}_\mathrm{proj}^\prime \). It utilises the inherent asymmetry of the point sets.

We first find the edges of \({\text {CT}}_\mathrm{proj}^\prime \) and \({\text {MRI}}_\mathrm{proj}^\prime \) with the Roberts cross operator [18]. The coordinates on the edges are sequentially added into \(V_\mathrm{CT}\) and \(V_\mathrm{MRI}\). The distances between each pixel on the edges and the centroid are calculated and stored in the distance vectors \({\text {dist}}_\mathrm{CT}\) and \({\text {dist}}_\mathrm{MRI}\): \({\text {dist}}_\mathrm{CT} = |V_\mathrm{CT} - P_\mathrm{C}|\) and \({\text {dist}}_\mathrm{MRI} = |V_\mathrm{MRI} - P_\mathrm{C}|\). The distance vector with the greater size is interpolated so that the resampled vector has the same size as the one that is originally smaller. We denote the new distance vectors as \({\text {dist}}_\mathrm{CT}^{\prime }\) and \({\text {dist}}_\mathrm{MRI}^{\prime }\), respectively. Then, \({\text {dist}}_\mathrm{CT}^{\prime }\) is circularly shifted right to find the minimum sum of squared differences between the corresponding entries of \({\text {dist}}_\mathrm{CT}^{\prime }\) and \({\text {dist}}_\mathrm{MRI}^{\prime }\).

Formally, let \([{\text {dist}}_\mathrm{CT}^{\prime }]^{j\mathrm{th}}(i)\) denote the \(i\)th entry of the vector generated by circularly shifting \({\text {dist}}_\mathrm{CT}^{\prime }\) right \(j\) times. Find the shift \(s\) that minimises \(s_{j} = \sum _{i = 1}^{l_\mathrm{dist}}\left( [{\text {dist}}_\mathrm{CT}^{\prime }]^{j\mathrm{th}}(i) - {\text {dist}}_\mathrm{MRI}^{\prime }(i)\right) ^{2}\), where \(l_\mathrm{dist} = |{\text {dist}}_\mathrm{CT}^{\prime }| = |{\text {dist}}_\mathrm{MRI}^{\prime }|\). The corresponding entries of \([{\text {dist}}_\mathrm{CT}^{\prime }]^{s\mathrm{th}}\) and \({\text {dist}}_\mathrm{MRI}^{\prime }\) are mapped back to find the nearest coordinates in \({\text {CT}}_\mathrm{proj}^\prime \) and \({\text {MRI}}_\mathrm{proj}^\prime \). \(l_\mathrm{dist}\) pairs of such matching coordinates are found. Let \(P_\mathrm{CT}\) and \(P_\mathrm{MRI}\) denote the vectors containing the matching coordinates in \({\text {CT}}_\mathrm{proj}^\prime \) and \({\text {MRI}}_\mathrm{proj}^\prime \), respectively.

Then, \(\beta \) is calculated as the median of the angles formed by the matched point pairs: \(\beta = {\text {median}}\left( {\text {angle}}(i)\right) \), where \({\text {angle}}(i) = \arccos \left( \frac{V_{a}(i) \cdot V_{b}(i)}{|V_{a}(i)| |V_{b}(i)|}\right) \), with \(V_{a}(i) = P_\mathrm{CT}(i) - P_\mathrm{C}\) and \(V_{b}(i) = P_\mathrm{MRI}(i) - P_\mathrm{C}\).
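
The core of the matching, the circular-shift search over the distance profiles, can be sketched as follows (a brute-force search over all shifts; the paper does not specify the search strategy):

```python
import numpy as np

def best_circular_shift(d_ct, d_mri):
    """Return the right-shift s of d_ct minimising the sum of squared
    differences against d_mri (profiles assumed equal length)."""
    l = len(d_mri)
    ssd = [np.sum((np.roll(d_ct, j) - d_mri) ** 2) for j in range(l)]
    return int(np.argmin(ssd))
```

Because both profiles trace the same bone contour, the minimising shift aligns corresponding boundary points, which are then mapped back to coordinates for the angle computation.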

#### 3.3.4 Scaling

This step calculates scale factors \(sx\) and \(sy\). They are estimated, independently along each dimension, as the ratio of the mean distances between the centroid and the points on the edges of \({\text {CT}}_\mathrm{proj}^{\prime \prime }\) and \({\text {MRI}}_\mathrm{proj}^{\prime \prime }\): \(sx=\bar{\mathrm{CT}}_{xs} / \bar{\mathrm{MRI}}_{xs},\, sy=\bar{\mathrm{CT}}_{ys} / \bar{\mathrm{MRI}}_{ys}\), where \({\text {CT}}_{xs}\) and \({\text {MRI}}_{xs}\) denote the distances between the edge points and the centroid along the \(x\) axis, and \({\text {CT}}_{ys}\) and \({\text {MRI}}_{ys}\) along the \(y\) axis.
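
With the edge points and shared centroid in hand, the ratio is a one-liner per axis; a sketch (edge points as `(N, 2)` arrays of \((x, y)\) coordinates, an assumption about data layout):

```python
import numpy as np

def scale_factors(ct_edge_pts, mri_edge_pts, centroid):
    """sx, sy: ratio of mean centroid distances along each axis.
    Edge points are (N, 2) arrays of (x, y) coordinates."""
    ct_d = np.abs(ct_edge_pts - centroid).mean(axis=0)    # (mean |dx|, mean |dy|)
    mri_d = np.abs(mri_edge_pts - centroid).mean(axis=0)
    sx, sy = ct_d / mri_d
    return sx, sy
```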

#### 3.3.5 Resampling

Thus, we have the resampled MRI image \({\text {MRI}}^\prime \). For convenience, let \({\text {CT}}^\prime = {\text {CT}}\).

### 3.4 Blending

Blending fuses the information from both modalities. The outcome of this step should have clear outlining of bone tissues and detailed illustration of soft tissues. The intensities of bone tissues are very large in CT images and very small in MRI images, so it is not meaningful to simply average the intensities of two images. Instead, we take the maximum of corresponding voxels in \({\text {CT}}^\prime \) and \({\text {MRI}}^\prime \) to give the blended image \(I_\mathrm{blended}(x,y,z) = {\text {max}}({\text {CT}}^\prime (x,y,z), {\text {MRI}}^\prime (x,y,z))\). This helps to maximise information retention in CT-MRI fusion [19].
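
The voxel-wise maximum is a single NumPy call; keeping the brighter source at each voxel retains bright CT bone and bright MRI soft tissue in one volume:

```python
import numpy as np

def blend(ct, mri):
    """I_blended(x, y, z) = max(CT'(x, y, z), MRI'(x, y, z)):
    the voxel-wise maximum of the two registered volumes."""
    return np.maximum(ct, mri)
```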

## 4 Results and evaluation

We implemented our solution in MATLAB R2012b and ran it on a computer with a 2.13 GHz CPU and 2 GB RAM. The whole computation took 62.3 s. The original matrix sizes of the CT and MRI images are \(176\times 148\times 59\) and \(128\times 128\times 59\), respectively.

To numerically evaluate this result, the Pearson product-moment correlation coefficient of the binarised features (i.e., tibia and femur) of \({\text {CT}}^{\prime }\) and \({\text {MRI}}^{\prime }\) is computed. We evaluate with the binarised features (instead of the whole images) for two reasons. On the one hand, correct alignment of bone tissues is of most concern in this research: bone tissues are rigid and consistent between the images, while the appearance of soft tissues might change. On the other hand, CT and MRI images differ from each other in many ways; the comparison is fairer if only the position information of the features is taken into evaluation, which is why they are binarised.
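
The evaluation metric itself is a sketch away (flattening both binarised volumes and correlating them as vectors):

```python
import numpy as np

def feature_correlation(ct_bin, mri_bin):
    """Pearson product-moment correlation coefficient of the
    binarised feature volumes, flattened to vectors."""
    a = ct_bin.ravel().astype(float)
    b = mri_bin.ravel().astype(float)
    return np.corrcoef(a, b)[0, 1]
```

A coefficient near 1 indicates the binarised tibia/femur masks occupy nearly identical voxel positions after registration.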

## 5 Conclusion

In this paper, we present a feature-based solution for 3D registration of CT and MRI images of the human knee. The experiment shows that our solution is accurate. In our solution, statistical information is used to help extract features, estimate parameters, and improve accuracy. The model provided by this solution will be further used to analyse the effect of posterior cruciate ligament and anterior cruciate ligament deficiency. In future work, we will further verify the robustness of our solution as we acquire new datasets, giving it more generality.

### References

- 1. Fripp, J., Warfield, S.K., Crozier, S., Ourselin, S.: In: Pattern Recognition, 2006. ICPR 2006. 18th International Conference on, vol. 1, pp. 167–170. IEEE (2006)
- 2. Goshtasby, A.A.: Image Registration: Principles, Tools and Methods. Springer, New York (2012)
- 3. Alam, M.M., Howlader, T., Rahman, S.M.: Entropy-based image registration method using the curvelet transform. Signal Image Video Process. 8(3), 491–505 (2014)
- 4. Maes, F., Collignon, A., Vandermeulen, D., Marchal, G., Suetens, P.: Multimodality image registration by maximization of mutual information. IEEE Trans. Med. Imaging 16(2), 187 (1997)
- 5. Kim, J., Fessler, J.A.: Intensity-based image registration using robust correlation coefficients. IEEE Trans. Med. Imaging 23(11), 1430 (2004)
- 6. Bouchiha, R., Besbes, K.: Comparison of local descriptors for automatic remote sensing image registration. Signal Image Video Process. 1–7 (2013)
- 7. Can, A., Stewart, C.V., Roysam, B., Tanenbaum, H.L.: A feature-based technique for joint, linear estimation of high-order image-to-mosaic transformations: mosaicing the curved human retina. IEEE Trans. Pattern Anal. Mach. Intell. 24(3), 347 (2002)
- 8. Dai, X., Khorram, S.: A feature-based image registration algorithm using improved chain-code representation combined with invariant moments. IEEE Trans. Geosci. Remote Sens. 37(5), 2351 (1999)
- 9. Boda, S.: Feature-Based Image Registration. Ph.D. thesis (2009)
- 10. Zitova, B., Flusser, J.: Image registration methods: a survey. Image Vis. Comput. 21(11), 977 (2003)
- 11. Tomaževič, D., Likar, B., Pernuš, F.: Multi-feature mutual information image registration. Image Anal. Stereol. 31(1), 43 (2012)
- 12. Ji, Z., Wei, H.: The registration of knee joint images with preprocessing. Int. J. Image Graphics Signal Process. (IJIGSP) 3(4), 10 (2011)
- 13. Powell, M.J.: An efficient method for finding the minimum of a function of several variables without calculating derivatives. Comput. J. 7(2), 155–162 (1964)
- 14. Pan, X., Zhao, K., Liu, J., Kang, Y.: In: Biomedical Engineering and Informatics (BMEI), 2010 3rd International Conference on, vol. 1, pp. 18–22. IEEE (2010)
- 15. Kapur, T., Beardsley, P., Gibson, S., Grimson, W., Wells, W.: In: Proceedings of IEEE Intl Workshop on Model-Based 3D Image Analysis, pp. 97–106 (1998)
- 16. Poynton, C.: Digital Video and HD: Algorithms and Interfaces. Morgan Kaufmann, Burlington (2012)
- 17. Otsu, N.: A threshold selection method from gray-level histograms. Automatica 11(285–296), 23 (1975)
- 18. Roberts, L.G.: Machine Perception of Three-Dimensional Solids. Technical Report, DTIC Document (1963)
- 19. Shah, P., Srikanth, T., Merchant, S.N.: In: Signal, Image and Video Processing, pp. 1–16 (2013)