3D segmentation of abdominal CT imagery with graphical models, conditional random fields and learning

Machine Vision and Applications

Abstract

Probabilistic graphical models have had a tremendous impact in machine learning, and approaches based on energy function minimization via techniques such as graph cuts are now widely used in image segmentation. However, the free parameters in energy function-based segmentation techniques are often set by hand or using heuristic techniques. In this paper, we explore parameter learning in detail. We show how probabilistic graphical models can be used for segmentation problems, illustrating Markov random fields (MRFs), their discriminative counterparts, conditional random fields (CRFs), and kernel CRFs. We discuss the relationships between energy function formulations, MRFs, CRFs and hybrids based on graphical models, and relate them to key techniques for inference and learning. We then explore a series of novel 3D graphical models and present a series of detailed experiments comparing and contrasting different approaches for the complete volumetric segmentation of multiple organs within computed tomography imagery of the abdominal region. Further, we show how these modeling techniques can be combined with state-of-the-art image features based on histograms of oriented gradients to increase segmentation performance. We explore a wide variety of modeling choices, discuss the importance of, and relationships between, inference and learning techniques, and present experiments using different levels of user interaction. We go on to explore a novel approach to the challenging and important problem of adrenal gland segmentation. We present a 3D CRF formulation and compare it with a novel 3D sparse kernel CRF approach we call a relevance vector random field. The method yields state-of-the-art performance and avoids the need to discretize or cluster input features. We believe our work is the first to provide quantitative comparisons between traditional MRFs with edge-modulated interaction potentials and CRFs for multi-organ abdominal segmentation and the first to explore the 3D adrenal gland segmentation problem. Finally, along with this paper we provide the community with the labeled data used in our experiments.

Notes

  1. http://www.sliver07.org/.

    Fig. 1 (Left) a coarsely registered axial data slice from the liver segmentation data set of [16], and (right) the liver segmentation and our manual segmentation for other anatomical structures. Each shade denotes a different class label. The classes, in decreasing order of brightness, are spleen, gall bladder, right kidney, left kidney, liver, and the background (other tissues) class.

  2. http://www.cs.rochester.edu/~bhole/medicalseg.

References

  1. Bauer, S., Nolte, L.P., Reyes, M.: Fully automatic segmentation of brain tumor images using support vector machine classification in combination with hierarchical conditional random field regularization. In: Fichtinger, G., Martel, A., Peters, T. (eds.) Medical Image Computing and Computer-Assisted Intervention 2011. Lecture Notes in Computer Science, vol. 6893, pp. 354–361 (2011)

  2. Besag, J.: On the statistical analysis of dirty pictures. J. R. Stat. Soc. B-48, 259–302 (1986)

  3. Bhole, C., Morsillo, N., Pal, C.: 3D segmentation in CT imagery with conditional random fields and histograms of oriented gradients. In: The Second International Workshop on Machine Learning in Medical Imaging (MLMI), Medical Image Computing and Computer Assisted Intervention Society 2011 (2011)

  4. Bishop, C.: Pattern Recognition and Machine Learning. Springer, Berlin (2006)

  5. Blake, A., Rother, C., Brown, M., Perez, P., Torr, P.: Interactive image segmentation using an adaptive GMMRF model. In: European Conference on Computer Vision, pp. 428–441 (2004)

  6. Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 23(11), 1222–1239 (2001)

  7. Bresson, X., Vandergheynst, P., Thiran, J.P.: A variational model for object segmentation using boundary information and shape prior driven by the Mumford–Shah functional. Int. J. Comput. Vis. 68(2), 145–162 (2006)

  8. Byrd, R.H., Nocedal, J., Schnabel, R.B.: Representations of quasi-Newton matrices and their use in limited memory methods. Math. Program. 63(1–3), 129–156 (1994)

  9. Chhikara, R.S., Folks, L.: The inverse Gaussian distribution: theory, methodology, and applications. CRC Press, London (1989)

  10. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)

  11. Criminisi, A., Shotton, J., Robertson, D.P., Konukoglu, E.: Regression forests for efficient anatomy detection and localization in CT studies. In: Medical Computer Vision. International Workshop MICCAI, pp. 106–117 (2010)

  12. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 886–893. IEEE Computer Society, New York (2005)

  13. Druck, G., Pal, C., McCallum, A., Zhu, X.: Semi-supervised classification with hybrid generative/discriminative methods. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 280–289. ACM, New York (2007)

  14. Heimann, T., Styner, M., van Ginneken, B. (eds.): 3D Segmentation in the Clinic: A Grand Challenge, pp. 7–15 (2007)

  15. Graf, F., Kriegel, H.P., Schubert, M., Strukelj, M., Cavallaro, A.: Fully automatic detection of the vertebrae in 2D CT images. In: SPIE Medical Imaging, vol. 7962 (2011)

  16. Heimann, T., van Ginneken, B., Styner, M., Arzhaeva, Y., Aurich, V., Bauer, C., Beck, A., Becker, C., Beichel, R., Bekes, G., Bello, F., Binnig, G., Bischof, H., Bornik, A., Cashman, P., Chi, Y., Cordova, A., Dawant, B., Fidrich, M., Furst, J., Furukawa, D., Grenacher, L., Hornegger, J., Kainmueller, D., Kitney, R., Kobatake, H., Lamecker, H., Lange, T., Lee, J., Lennon, B., Li, R., Li, S., Meinzer, H.P., Nemeth, G., Raicu, D., Rau, A., van Rikxoort, E., Rousson, M., Rusko, L., Saddi, K., Schmidt, G., Seghers, D., Shimizu, A., Slagmolen, P., Sorantin, E., Soza, G., Susomboon, R., Waite, J., Wimmer, A., Wolf, I.: Comparison and evaluation of methods for liver segmentation from CT datasets. IEEE Trans. Med. Imaging 28(8), 1251–1265 (2009)

  17. Heskes, T.: Stable fixed points of loopy belief propagation are local minima of the Bethe free energy. In: Neural Information Processing Systems, pp. 343–350. MIT Press, Cambridge (2003)

  18. Hocking, R.: The analysis and selection of variables in linear regression. Biometrics 32(1) (1976)

  19. Huang, C., Darwiche, A.: Inference in belief networks: a procedural guide. Int. J. Approx. Reason. 15(3), 225–263 (1996)

  20. Johnson, P., Horton, K., Fishman, E.: Adrenal mass imaging with multidetector CT: pathologic conditions, pearls, and pitfalls. RadioGraphics 29, 1333–1351 (2009)

  21. Kolmogorov, V., Zabih, R.: What energy functions can be minimized via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 26, 65–81 (2004)

  22. Kschischang, F., Frey, B.J., Loeliger, H.A.: Factor graphs and the sum–product algorithm. IEEE Trans. Inf. Theory 47, 498–519 (2001)

  23. Kumar, S., Hebert, M.: Discriminative random fields: a discriminative framework for contextual interaction in classification. IEEE Int. Conf. Comput. Vis. 2, 1150 (2003)

  24. Lafferty, J., Zhu, X., Liu, Y.: Kernel conditional random fields. In: Twenty-First International Conference on Machine Learning, p. 64. ACM Press, New York, NY, USA (2004)

  25. Lafferty, J.D., McCallum, A., Pereira, F.C.N.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the Eighteenth International Conference on Machine Learning, pp. 282–289. Morgan Kaufmann Publishers Inc., Burlington (2001)

  26. Lee, C.H., Wang, S., Murtha, A., Brown, M.R.G., Greiner, R.: Segmenting brain tumors using pseudo-conditional random fields. In: Medical Image Computing and Computer Assisted Intervention Society, pp. 359–366 (2008)

  27. Ling, H., Zhou, S., Zheng, Y., Georgescu, B., Suehling, M., Comaniciu, D.: Hierarchical, learning-based automatic liver segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2008)

  28. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)

  29. Mayo-Smith, W., Boland, G., Noto, R., Lee, M.: State-of-the-art adrenal imaging. RadioGraphics 21, 995–1012 (2001)

  30. Monaco, J., Madabhushi, A.: Weighted maximum posterior marginals for random fields using an ensemble of conditional densities from multiple Markov chain Monte Carlo simulations. IEEE Trans. Med. Imaging 30(7), 1353–1364 (2011)

  31. Morsillo, N., Pal, C., Nelson, R.: Mining the web for visual concepts. In: Proceedings of the 9th International Workshop on Multimedia Data Mining, pp. 18–25. ACM, New York (2008)

  32. Motwani, K., Adluru, N., Hinrichs, C., Alexander, A.L., Singh, V.: Epitome driven 3-D Diffusion Tensor image segmentation: on extracting specific structures. Adv. Neural Inf. Process. Syst. 23, 1696–1704 (2010)

  33. Murphy, K.P., Weiss, Y., Jordan, M.I.: Loopy belief propagation for approximate inference: an empirical study. In: Proceedings of Uncertainty in AI, pp. 467–475 (1999)

  34. Ng, A.Y., Jordan, M.I.: On discriminative vs. generative classifiers: a comparison of logistic regression and Naive Bayes. Adv. Neural Inf. Process. Syst. 2, 841–848 (2001)

  35. Park, H., Bland, P., Meyer, C.: Construction of an abdominal probabilistic atlas and its application in segmentation. IEEE Trans. Med. Imaging 22(4), 483–492 (2003)

  36. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, Burlington (1988)

  37. Raina, R., Shen, Y., Ng, A.Y., McCallum, A.: Classification with hybrid generative/discriminative models. In: Advances in Neural Information Processing Systems, vol. 16. MIT Press, New York (2003)

  38. Rim, D., Hassan, K., Pal, C.: Semi Supervised Learning in Wild Faces and Videos. In: British Machine Vision Conference (2011)

  39. Scharstein, D., Pal, C.: Learning conditional random fields for stereo. Comput. Vis. Pattern Recognit. 1–8 (2007)

  40. Seifert, S., Barbu, A., Zhou, S.K., Liu, D., Feulner, J., Huber, M., Sühling, M., Cavallaro, A., Comaniciu, D.: Hierarchical parsing and semantic navigation of full body CT data. In: Pluim, J.P.W., Dawant, B.M. (eds.) Proceedings of the SPIE (2009)

  41. Sha, F., Pereira, F.: Shallow parsing with conditional random fields. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pp. 134–141. Association for Computational Linguistics, Morristown, NJ, USA (2003)

  42. Sutton, C., McCallum, A.: An introduction to conditional random fields for relational learning. In: Getoor, L., Taskar, B. (eds.) Introduction to Statistical Relational Learning. MIT Press, Burlington (2007)

  43. Szeliski, R., Zabih, R., Scharstein, D., Veksler, O., Kolmogorov, V., Agarwala, A., Tappen, M., Rother, C.: A comparative study of energy minimization methods for Markov random fields with smoothness-based priors. IEEE Trans. Pattern Anal. Mach. Intell. 30(6), 1068–1080 (2008)

  44. Tappen, M.F., Freeman, W.T.: Comparison of graph cuts with belief propagation for stereo, using identical MRF parameters. In: Proceedings of the Ninth IEEE International Conference on Computer Vision, IEEE Computer Society, p. 900. Washington, DC, USA (2003)

  45. Taskar, B., Guestrin, C., Koller, D.: Max-margin Markov networks. In: Proceedings of Neural Information Processing Systems (2003)

  46. Tipping, M.E.: The Relevance Vector Machine. In: Advances in Neural Information Processing Systems, pp. 652–658 (2000)

  47. Tipping, M.E., Faul, A.: Fast marginal likelihood maximisation for sparse Bayesian models. In: Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, pp. 3–6 (2003)

  48. Tsechpenakis, G., Wang, J., Mayer, B., Metaxas, D.: Coupling CRFs and deformable models for 3D medical image segmentation. In: IEEE International Conference on Computer Vision, pp. 1–8 (2007)

  49. Tsochantaridis, I., Joachims, T., Hofmann, T., Altun, Y.: Large margin methods for structured and interdependent output variables. J. Mach. Learn. Res. 6, 1453–1484 (2005)

  50. Varshney, L.: Abdominal organ segmentation in CT scan images: a survey, pp. 1–3 (2002)

  51. Vishwanathan, S.V.N., Schraudolph, N.N., Schmidt, M.W., Murphy, K.P.: Accelerated training of conditional random fields with stochastic gradient methods. In: Proceedings of the 23rd International Conference on Machine Learning, ACM, New York, NY, USA, pp. 969–976 (2006)

  52. Weinman, J.J., Tran, L., Pal, C.J.: Efficiently learning random fields for stereo vision with sparse message passing. In: European Conference on Computer Vision, pp. 617–630. Springer, Berlin (2008)

  53. Weston, J., Watkins, C.: Support vector machines for multi-class pattern recognition. In: Proceedings of European Symposium on Artificial Neural Networks (1999)

  54. Winn, J., Bishop, C.M.: Variational message passing. J. Mach. Learn. Res. 6, 661–694 (2005)

  55. Xue, J.H., Titterington, D.M.: Comment on “On discriminative vs. generative classifiers: a comparison of logistic regression and Naive Bayes”. Neural Process. Lett. 28(3), 169–187 (2008)

  56. Zhang, Y., Brady, M., Smith, S.: Segmentation of brain MR images through a hidden Markov random field model and the EM algorithm. IEEE Trans. Med. Imaging 20(1), 45–57 (2001)

  57. Zhang, Y., Matuszewski, B.J., Shark, L.K., Moore, C.J.: Medical image segmentation using new hybrid level-set method. In: Proceedings of the 2008 Fifth International Conference BioMedical Visualization: Information Visualization in Medical and Biomedical Informatics, pp. 71–76. IEEE Computer Society, New York (2008)

  58. Zhu, J., Hastie, T.: Kernel logistic regression and the import vector machine. J. Comput. Graph. Stat. 14, 1081–1088 (2001)

Acknowledgments

This research was funded in part by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Clinical and Translational Science Award within the Upstate New York Translational Research Network (UNYTRN) of the Clinical and Translational Science Institute (CTSI), University of Rochester, Carestream, the Center for Emerging and Innovative Sciences (CEIS), a NYSTAR-designated Center for Advanced Technology, and by the National Institutes of Health (NIH) Award R01-DA-034977. We thank Prof. M.F. Reiser, FACR, FRCR, from the Department of Radiology, University of Munich, Germany, for his support.

Author information

Corresponding author

Correspondence to Christopher Pal.

Appendices

Appendix A

We describe the modification of logistic regression that handles class imbalance, which arises because there are more background points than organ points. Grouping the pixels by the class they belong to, we can write the model as follows.

$$\begin{aligned} p(\mathbf{y} | \mathbf{x})&= \prod _{i=1}^C \prod _{j=1}^{N_i} \Bigg( \frac{1}{Z(\mathbf{x}_{ij})} \exp \Bigg( -\sum _{c=1}^{C} \sum _{b=1}^B \lambda _{c,b} \rho _{\lambda }(x_{ij}^\mathrm{int}, y_{ij}) \nonumber \\&\qquad \quad \qquad - \sum _{c=1}^{C} \sum _{l=1}^{L_c} \alpha _{c,l} \rho _{\alpha }(x_{ij}^\mathrm{loc}, y_{ij}) \nonumber \\&\quad \qquad \qquad - \sum _{c=1}^C \sum _{h=1}^H \beta _{c,h} \rho _{\beta }(x_{ij}^\mathrm{hog}, y_{ij})\Bigg) \Bigg) ^{n_i} \end{aligned}$$
(37)

where \(C\) is the total number of class labels, \(B\) is the number of bins per class (we assume that each class has the same number of bins), \(H\) is the number of HOG codewords per class, \(L_c\) is the number of Gaussian distributions used to model class \(c\), \(G\) is the number of gradient bins and \(Z(\mathbf{x}_{ij})\) is the normalization constant. \(N_i\) is the number of pixels belonging to class \(i\) and \(n_i\) is the inverse of the fraction of pixels belonging to class \(i\). Since \(N_i n_i\) equals the total number of pixels for every \(i\), raising each factor to the power \(n_i\) gives every class the same total weight in the likelihood. The details of the feature functions are provided in the main paper.

If \(L\) is the log-likelihood, then the gradient used to update the parameters \(\lambda _{c,b}\) is given as follows:

$$\begin{aligned} \frac{\partial L}{\partial \lambda _{c,b}}&= \sum _{i=1}^C n_i \sum _{j=1}^{N_i} \Bigg( \sum _{y_{ij}^{\prime }} p(y_{ij}^{\prime } | \mathbf{x}_{ij}) \rho _{\lambda }(x_{ij}^\mathrm{int}, y_{ij}^{\prime }) - \rho _{\lambda }(x_{ij}^\mathrm{int}, y_{ij}) \Bigg) \end{aligned}$$
(38)
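
As a rough illustration of the class-weighted gradient, the Python sketch below assumes a simplified per-pixel model: one feature vector per pixel and one weight vector per class, standing in for the intensity, location and HOG feature functions of the main paper. All function names (`class_weights`, `posteriors`, `weighted_log_likelihood`, `weighted_gradient`) are hypothetical and are not taken from the paper. The sketch keeps the \(\exp(-\cdot)\) convention of Eq. (37) and the weights \(n_i\), so the derivative of the log-likelihood is the expected feature value under the model minus the observed one, scaled by \(n_i\).

```python
import math


def class_weights(labels, num_classes):
    """n_i: inverse of the fraction of pixels that belong to class i."""
    total = len(labels)
    counts = [labels.count(c) for c in range(num_classes)]
    return [total / counts[c] if counts[c] else 0.0 for c in range(num_classes)]


def posteriors(theta, feats):
    """p(y = c | x) proportional to exp(-theta_c . x), as in the exp(-...) form."""
    energies = [sum(t * f for t, f in zip(theta_c, feats)) for theta_c in theta]
    lowest = min(energies)  # shift energies for numerical stability
    unnorm = [math.exp(-(e - lowest)) for e in energies]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def weighted_log_likelihood(theta, data, labels, weights):
    """L = sum over pixels of n_{y} * log p(y | x)."""
    return sum(
        weights[y] * math.log(posteriors(theta, x)[y])
        for x, y in zip(data, labels)
    )


def weighted_gradient(theta, data, labels, weights):
    """dL/dtheta[c][f] = sum over pixels of n_{y} * (p(c | x) - [y == c]) * x_f."""
    grad = [[0.0] * len(theta[0]) for _ in theta]
    for x, y in zip(data, labels):
        p = posteriors(theta, x)
        for c in range(len(theta)):
            observed = 1.0 if c == y else 0.0
            for f, value in enumerate(x):
                # expected feature minus observed feature, weighted by n_i
                grad[c][f] += weights[y] * (p[c] - observed) * value
    return grad
```

Comparing `weighted_gradient` with central finite differences of `weighted_log_likelihood` is a quick way to confirm the sign and the weighting before passing the gradient to an optimizer.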

Appendix B

The log-likelihood for the weighted framework is given as follows. There are \(N\) unique location points, which correspond to the rescaled 3D space of the patient volumes. Each point \(n\) can have \(M_n\) copies, which corresponds to multiple patients having the same voxel belong to the same class. This allows us to use a very large number of points without compromising on speed or memory requirements. Let \(X\) denote the points belonging to a particular class. Then, we can write the equation as follows.

$$\begin{aligned} \log p(X | \pi , \mu , \varSigma ) = \sum _{n=1}^{N} \sum _{m=1}^{M_n} \log \sum _{k=1}^{K} \pi _k \mathcal{N }(x_{nm}| \mu _k, \varSigma _k) \end{aligned}$$

\(M_n\) can be thought of as the weight of point \(n\). For a given \(n\), the points \(x_{nm}\) are identical for all \(m\), so we denote them by \(x_n\) and rewrite the equation as follows.

$$\begin{aligned} \log p(X | \pi , \mu , \varSigma ) = \sum _{n=1}^{N} M_n \log \sum _{k=1}^{K} \pi _k \mathcal{N }(x_{n}| \mu _k, \varSigma _k) \end{aligned}$$

The update equations for the parameters are modified (from [4]) as follows.

$$\begin{aligned} \mu _k = \frac{1}{N_k} \sum _{n=1}^{N} M_n \gamma (z_{nk}) x_n \end{aligned}$$

where \(z_{nk}\) is an indicator variable and \(\gamma (z_{nk})\) is the responsibility.

$$\begin{aligned} \gamma (z_{nk})&= \frac{ \pi _k \mathcal{N }(x_{n}| \mu _k, \varSigma _k)}{\sum _{j=1}^{K} \pi _j \mathcal{N }(x_{n}| \mu _j, \varSigma _j)}\\ N_k&= \sum _{n=1}^N M_n \gamma (z_{nk}), \;\;\;\pi _k = \frac{N_k}{ \sum _{n=1}^N M_n}, \;\; \text{ and }\\ \varSigma _k&= \frac{1}{N_k} \sum _{n=1}^{N} M_n \gamma (z_{nk}) (x_n - \mu _k) (x_n - \mu _k)^T \end{aligned}$$
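
The weighted updates above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses one-dimensional points, so a scalar variance replaces the covariance \(\varSigma _k\), and it adds a small variance floor that is not part of the equations. The function names (`normal_pdf`, `gmm_weighted_log_likelihood`, `weighted_em_step`, `fit_weighted_gmm`) are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_weighted_log_likelihood(xs, weights, pis, means, variances):
    """sum_n M_n * log sum_k pi_k N(x_n | mu_k, var_k)."""
    total = 0.0
    for x, m in zip(xs, weights):
        mix = sum(p * normal_pdf(x, mu, v) for p, mu, v in zip(pis, means, variances))
        total += m * math.log(mix)
    return total


def weighted_em_step(xs, weights, pis, means, variances):
    """One EM iteration in which point n counts M_n times."""
    num_components = len(pis)
    # E-step: responsibilities gamma(z_nk); they do not depend on M_n
    resp = []
    for x in xs:
        joint = [pis[k] * normal_pdf(x, means[k], variances[k])
                 for k in range(num_components)]
        norm = sum(joint)
        resp.append([j / norm for j in joint])
    # M-step: every sum over points is weighted by M_n
    total_weight = sum(weights)
    new_pis, new_means, new_vars = [], [], []
    for k in range(num_components):
        n_k = sum(m * r[k] for m, r in zip(weights, resp))
        mu_k = sum(m * r[k] * x for x, m, r in zip(xs, weights, resp)) / n_k
        var_k = sum(m * r[k] * (x - mu_k) ** 2
                    for x, m, r in zip(xs, weights, resp)) / n_k
        new_pis.append(n_k / total_weight)
        new_means.append(mu_k)
        new_vars.append(max(var_k, 1e-6))  # variance floor: an added assumption
    return new_pis, new_means, new_vars


def fit_weighted_gmm(xs, weights, pis, means, variances, iters=20):
    """Run a fixed number of weighted EM iterations from the given start."""
    for _ in range(iters):
        pis, means, variances = weighted_em_step(xs, weights, pis, means, variances)
    return pis, means, variances
```

Running `fit_weighted_gmm` on the unique points with weights \(M_n\) should give the same parameters as running it on the fully duplicated point set with unit weights, while storing only \(N\) points.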

About this article

Cite this article

Bhole, C., Pal, C., Rim, D. et al. 3D segmentation of abdominal CT imagery with graphical models, conditional random fields and learning. Machine Vision and Applications 25, 301–325 (2014). https://doi.org/10.1007/s00138-013-0497-x
