Abstract
Probabilistic graphical models have had a tremendous impact in machine learning, and approaches based on energy function minimization via techniques such as graph cuts are now widely used in image segmentation. However, the free parameters in energy function-based segmentation techniques are often set by hand or using heuristic techniques. In this paper, we explore parameter learning in detail. We use segmentation problems to illustrate Markov random fields (MRFs), their discriminative counterparts, conditional random fields (CRFs), and kernel CRFs. We discuss the relationships between energy function formulations, MRFs, CRFs and hybrids based on graphical models, and how they relate to key techniques for inference and learning. We then explore a series of novel 3D graphical models and present detailed experiments comparing different approaches for the complete volumetric segmentation of multiple organs within computed tomography imagery of the abdominal region. Further, we show how these modeling techniques can be combined with state-of-the-art image features based on histograms of oriented gradients to increase segmentation performance. We explore a wide variety of modeling choices, discuss the importance of, and relationships between, inference and learning techniques, and present experiments using different levels of user interaction. We go on to explore a novel approach to the challenging and important problem of adrenal gland segmentation. We present a 3D CRF formulation and compare it with a novel 3D sparse kernel CRF approach that we call a relevance vector random field. The method yields state-of-the-art performance and avoids the need to discretize or cluster input features.
We believe our work is the first to provide quantitative comparisons between traditional MRFs with edge-modulated interaction potentials and CRFs for multi-organ abdominal segmentation and the first to explore the 3D adrenal gland segmentation problem. Finally, along with this paper we provide the labeled data used for our experiments to the community.
References
Bauer, S., Nolte, L.P., Reyes, M.: Fully automatic segmentation of brain tumor images using support vector machine classification in combination with hierarchical conditional random field regularization. In: Fichtinger, G., Martel, A., Peters, T. (eds.) Medical Image Computing and Computer-Assisted Intervention 2011. Lecture Notes in Computer Science, vol. 6893, pp. 354–361 (2011)
Besag, J.: On the statistical analysis of dirty pictures. J. R. Stat. Soc. B-48, 259–302 (1986)
Bhole, C., Morsillo, N., Pal, C.: 3D segmentation in CT imagery with conditional random fields and histograms of oriented gradients. In: The Second International Workshop on Machine Learning in Medical Imaging (MLMI), Medical Image Computing and Computer Assisted Intervention Society 2011 (2011)
Bishop, C.: Pattern Recognition and Machine Learning. Springer, Berlin (2006)
Blake, A., Rother, C., Brown, M., Perez, P., Torr, P.: Interactive image segmentation using an adaptive GMMRF model. In: European Conference on Computer Vision, pp. 428–441 (2004)
Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 23(11), 1222–1239 (2001)
Bresson, X., Vandergheynst, P., Thiran, J.P.: A variational model for object segmentation using boundary information and shape prior driven by the Mumford–Shah functional. Int. J. Comput. Vis. 68(2), 145–162 (2006)
Byrd, R.H., Nocedal, J., Schnabel, R.B.: Representations of quasi-Newton matrices and their use in limited memory methods. Math. Prog. 63(1–3), 129–156 (1994)
Chhikara, R.S., Folks, L.: The inverse Gaussian distribution: theory, methodology, and applications. CRC Press, London (1989)
Cortes, C., Vapnik, V.: Support vector network. Mach. Learn. 20(3), 273–297 (1995)
Criminisi, A., Shotton, J., Robertson, D.P., Konukoglu, E.: Regression forests for efficient anatomy detection and localization in CT studies. In: Medical Computer Vision. International Workshop MICCAI, pp. 106–117 (2010)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 886–893. IEEE Computer Society, New York (2005)
Druck, G., Pal, C., McCallum, A., Zhu, X.: Semi-supervised classification with hybrid generative/discriminative methods. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 280–289. ACM, New York (2007)
Heimann, T., Styner, M., van Ginneken, B. (eds.): 3D Segmentation in the Clinic: A Grand Challenge, pp. 7–15 (2007)
Graf, F., Kriegel, H.P., Schubert, M., Strukelj, M., Cavallaro, A.: Fully automatic detection of the vertebrae in 2D CT images. In: SPIE Medical Imaging, vol. 7962 (2011)
Heimann, T., van Ginneken, B., Styner, M., Arzhaeva, Y., Aurich, V., Bauer, C., Beck, A., Becker, C., Beichel, R., Bekes, G., Bello, F., Binnig, G., Bischof, H., Bornik, A., Cashman, P., Chi, Y., Cordova, A., Dawant, B., Fidrich, M., Furst, J., Furukawa, D., Grenacher, L., Hornegger, J., Kainmueller, D., Kitney, R., Kobatake, H., Lamecker, H., Lange, T., Lee, J., Lennon, B., Li, R., Li, S., Meinzer, H.P., Nemeth, G., Raicu, D., Rau, A., van Rikxoort, E., Rousson, M., Rusko, L., Saddi, K., Schmidt, G., Seghers, D., Shimizu, A., Slagmolen, P., Sorantin, E., Soza, G., Susomboon, R., Waite, J., Wimmer, A., Wolf, I.: Comparison and evaluation of methods for liver segmentation from CT datasets. IEEE Trans. Med. Imaging 28(8), 1251–1265 (2009)
Heskes, T.: Stable fixed points of loopy belief propagation are local minima of the Bethe free energy. In: Neural Information Processing Systems, pp. 343–350. MIT Press, Cambridge (2003)
Hocking, R.: The analysis and selection of variables in linear regression. Biometrics 32(1), (1976)
Huang, C., Darwiche, A.: Inference in belief networks: a procedural guide. Int. J. Approx. Reason. 15(3), 225–263 (1996)
Johnson, P., Horton, K., Fishman, E.: Adrenal mass imaging with multidetector CT: pathologic conditions, pearls, and pitfalls. RadioGraphics 29, 1333–1351 (2009)
Kolmogorov, V., Zabih, R.: What energy functions can be minimized via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 26, 65–81 (2004)
Kschischang, F., Frey, B.J., Loeliger, H.A.: Factor graphs and the sum–product algorithm. IEEE Trans. Inf. Theory 47, 498–519 (2001)
Kumar, S., Hebert, M.: Discriminative random fields: a discriminative framework for contextual interaction in classification. IEEE Int. Conf. Comput. Vis. 2, 1150 (2003)
Lafferty, J., Zhu, X., Liu, Y.: Kernel conditional random fields. In: Twenty-First International Conference on Machine Learning, p. 64. ACM Press, New York, NY, USA (2004)
Lafferty, J.D., McCallum, A., Pereira, F.C.N.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the Eighteenth International Conference on Machine Learning, pp. 282–289. Morgan Kaufmann Publishers Inc., Burlington (2001)
Lee, C.H., Wang, S., Murtha, A., Brown, M.R.G., Greiner, R.: Segmenting brain tumors using pseudo-conditional random fields. In: Medical Image Computing and Computer Assisted Intervention Society, pp. 359–366 (2008)
Ling, H., Zhou, S., Zheng, Y., Georgescu, B., Suehling, M., Comaniciu, D.: Hierarchical, learning-based automatic liver segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2008)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
Mayo-Smith, W., Boland, G., Noto, R., Lee, M.: State-of-the-art adrenal imaging. RadioGraphics 21, 995–1012 (2001)
Monaco, J., Madabhushi, A.: Weighted maximum posterior marginals for random fields using an ensemble of conditional densities from multiple Markov chain Monte Carlo simulations. IEEE Trans. Med. Imaging 30(7), 1353–1364 (2011)
Morsillo, N., Pal, C., Nelson, R.: Mining the web for visual concepts. In: Proceedings of the 9th International Workshop on Multimedia Data Mining, pp. 18–25. ACM, New York (2008)
Motwani, K., Adluru, N., Hinrichs, C., Alexander, A.L., Singh, V.: Epitome driven 3-D Diffusion Tensor image segmentation: on extracting specific structures. Adv. Neural Inf. Process. Syst. 23, 1696–1704 (2010)
Murphy, K.P., Weiss, Y., Jordan, M.I.: Loopy belief propagation for approximate inference: an empirical study. In: Proceedings of Uncertainty in AI, pp. 467–475 (1999)
Ng, A.Y., Jordan, M.I.: On discriminative vs. generative classifiers: a comparison of logistic regression and Naive Bayes. Adv. Neural Inf. Process. Syst. 2, 841–848 (2001)
Park, H., Bland, P., Meyer, C.: Construction of an abdominal probabilistic atlas and its application in segmentation. IEEE Trans. Med. Imaging 22(4), 483–492 (2003)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, Burlington (1988)
Raina, R., Shen, Y., Ng, A.Y., McCallum, A.: Classification with hybrid generative/discriminative models. In: Advances in Neural Information Processing Systems, vol. 16. MIT Press, New York (2003)
Rim, D., Hassan, K., Pal, C.: Semi Supervised Learning in Wild Faces and Videos. In: British Machine Vision Conference (2011)
Scharstein, D., Pal, C.: Learning conditional random fields for stereo. Comput. Vis. Pattern Recognit. 1–8 (2007)
Seifert, S., Barbu, A., Zhou, S.K., Liu, D., Feulner, J., Huber, M., Sühling, M., Cavallaro, A., Comaniciu, D.: Hierarchical parsing and semantic navigation of full body CT data. In: Pluim, J.P.W., Dawant, B.M. (eds.) Proceedings of the SPIE (2009)
Sha, F., Pereira, F.: Shallow parsing with conditional random fields. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pp. 134–141. Association for Computational Linguistics, Morristown, NJ, USA (2003)
Sutton, C., McCallum, A.: An introduction to conditional random fields for relational learning. In: Getoor, L., Taskar, B. (eds.) Introduction to Statistical Relational Learning. MIT Press, Burlington (2007)
Szeliski, R., Zabih, R., Scharstein, D., Veksler, O., Kolmogorov, V., Agarwala, A., Tappen, M., Rother, C.: A comparative study of energy minimization methods for Markov random fields with smoothness-based priors. IEEE Trans. Pattern Anal. Mach. Intell. 30(6), 1068–1080 (2008)
Tappen, M.F., Freeman, W.T.: Comparison of graph cuts with belief propagation for stereo, using identical MRF parameters. In: Proceedings of the Ninth IEEE International Conference on Computer Vision, IEEE Computer Society, p. 900. Washington, DC, USA (2003)
Taskar, B., Guestrin, C., Koller, D.: Max-margin Markov networks. In: Proceedings of Neural Information Processing Systems (2003)
Tipping, M.E.: The Relevance Vector Machine. In: Advances in Neural Information Processing Systems, pp. 652–658 (2000)
Tipping, M.E., Faul, A.: Fast marginal likelihood maximisation for sparse Bayesian models. In: Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, pp. 3–6 (2003)
Tsechpenakis, G., Wang, J., Mayer, B., Metaxas, D.: Coupling CRFs and deformable models for 3D medical image segmentation. In: IEEE International Conference on Computer Vision, pp. 1–8 (2007)
Tsochantaridis, I., Joachims, T., Hofmann, T., Altun, Y.: Large margin methods for structured and interdependent output variables. J. Mach. Learn. Res. 6, 1453–1484 (2005)
Varshney, L.: Abdominal organ segmentation in CT scan images: a survey, pp. 1–3 (2002)
Vishwanathan, S.V.N., Schraudolph, N.N., Schmidt, M.W., Murphy, K.P.: Accelerated training of conditional random fields with stochastic gradient methods. In: Proceedings of the 23rd International Conference on Machine learning, ACM, New York, NY, USA, pp. 969–976 (2006)
Weinman, J.J., Tran, L., Pal, C.J.: Efficiently learning random fields for stereo vision with sparse message passing. In: European Conference on Computer Vision, pp. 617–630. Springer, Berlin (2008)
Weston, J., Watkins, C.: Support vector machines for multi-class pattern recognition. In: Proceedings of European Symposium on Artificial Neural Networks (1999)
Winn, J., Bishop, C.M.: Variational message passing. J. Mach. Learn. Res. 6, 661–694 (2005)
Xue, J.H., Titterington, D.M.: Comment on “On discriminative vs. generative classifiers: a comparison of logistic regression and Naive Bayes”. Neural Process. Lett. 28(3), 169–187 (2008)
Zhang, Y., Brady, M., Smith, S.: Segmentation of brain MR images through a hidden Markov random field model and the EM algorithm. IEEE Trans. Med. Imaging 20(1), 45–57 (2001)
Zhang, Y., Matuszewski, B.J., Shark, L.K., Moore, C.J.: Medical image segmentation using new hybrid level-set method. In: Proceedings of the 2008 Fifth International Conference BioMedical Visualization: Information Visualization in Medical and Biomedical Informatics, pp. 71–76. IEEE Computer Society, New York (2008)
Zhu, J., Hastie, T.: Kernel logistic regression and the import vector machine. J. Comput. Graph. Stat. 14, 1081–1088 (2001)
Acknowledgments
This research was funded in part by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Clinical and Translational Science Award within the Upstate New York Translational Research Network (UNYTRN) of the Clinical and Translational Science Institute (CTSI), University of Rochester, Carestream, the Center for Emerging and Innovative Sciences (CEIS), a NYSTAR-designated Center for Advanced Technology, and by the National Institutes of Health (NIH) Award R01-DA-034977. We thank Prof. M.F. Reiser, FACR, FRCR, from the Department of Radiology, University of Munich, Germany, for his support.
Appendices
Appendix A
We describe the modification to logistic regression that handles class imbalance, which arises because there are many more background points than organ points. We group the pixels by the class they belong to, and so can write the equation as follows.
where \(C\) is the total number of class labels, \(B\) is the number of bins per class (we assume that each class has the same number of bins), \(H\) is the number of HOG codewords per class, \(L_c\) is the number of Gaussian distributions used to model class \(c\), \(G\) is the number of gradient bins, and \(Z(\mathbf{x})\) is the normalization constant. \(N_i\) is the number of pixels belonging to class \(i\), and \(n_i\) is the inverse of the fraction of pixels belonging to class \(i\). The details of the feature functions are provided in the main paper.
If \(L\) is the log-likelihood, then the update equations for the parameters are given as follows:
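As an illustration of the weighting idea only, the following Python sketch trains a plain multinomial logistic regression in which each pixel's log-likelihood term is scaled by the inverse class frequency. It is not the model of this appendix: the feature functions (intensity bins, HOG codewords, Gaussian components) are replaced by a generic feature matrix, and the function names, the step size, the iteration count and the plain gradient ascent update are assumptions made for the example. The weight follows the definition above, \(n_i = N / N_i\), where \(N\) is the total number of pixels.

```python
import numpy as np

def class_weights(labels, num_classes):
    """Inverse class fraction n_i = N / N_i for each class i."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts.sum() / np.maximum(counts, 1.0)

def softmax(scores):
    """Row-wise softmax, shifted for numerical stability."""
    scores = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

def weighted_log_likelihood(theta, features, labels, weights):
    """Sum over pixels of n_{y} * log p(y | x), with y the pixel's class."""
    probs = softmax(features @ theta)
    per_pixel = np.log(probs[np.arange(len(labels)), labels])
    return float(np.sum(weights[labels] * per_pixel))

def weighted_gradient(theta, features, labels, weights):
    """Gradient of the weighted log-likelihood with respect to theta."""
    probs = softmax(features @ theta)
    onehot = np.eye(theta.shape[1])[labels]
    residual = weights[labels][:, None] * (onehot - probs)
    return features.T @ residual

def fit(features, labels, num_classes, step=0.1, iters=200):
    """Hypothetical trainer: gradient ascent on the weighted objective."""
    theta = np.zeros((features.shape[1], num_classes))
    weights = class_weights(labels, num_classes)
    total = weights[labels].sum()  # normalise so the step size is scale-free
    for _ in range(iters):
        grad = weighted_gradient(theta, features, labels, weights)
        theta += step * grad / total
    return theta, weights
```

With these weights every class contributes the same total weight to the objective, so a dominant background class cannot swamp the gradient contributed by the smaller organ classes.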
Appendix B
The log-likelihood for the weighted framework is given as follows. There are \(N\) unique location points, which correspond to the rescaled 3D space of the patient volumes. Each point \(n\) can have \(M_n\) copies, which corresponds to multiple patients having the same voxel belong to the same class. This allows us to use a very large number of points without compromising on speed or memory requirements. Let \(X\) denote the points belonging to a particular class. Then, we can write the equation as follows.
\(M_n\) can be thought of as the weight of point \(n\). The points \(x_{nm}\) are identical for all \(m\), and so we can rewrite the equation as follows.
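The step described here can be sketched for a Gaussian mixture in the notation of [4]. The number of components \(K\), the mixing coefficients \(\pi_k\), the means \(\mu_k\) and the covariances \(\Sigma_k\) are assumed symbols that the text above does not define, so this is an illustrative form rather than a verbatim statement of the equations of this appendix. Because every copy \(x_{nm}\) equals the same point \(x_n\), the inner sum over copies collapses into a multiplication by \(M_n\):

\[
\ln p(X \mid \pi, \mu, \Sigma)
= \sum_{n=1}^{N} \sum_{m=1}^{M_n} \ln \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_{nm} \mid \mu_k, \Sigma_k)
= \sum_{n=1}^{N} M_n \ln \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)
\]

Only the \(N\) unique points need to be stored and evaluated, which is where the savings in speed and memory come from.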
The update equations for the parameters are modified (from [4]) as follows.
where \(z_{nk}\) is an indicator variable and \(\gamma (z_{nk})\) is the responsibility.
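A minimal Python sketch of one weighted EM iteration is given below, assuming a Gaussian mixture with full covariances as in [4]. The function names, the `reg` term added to the covariance diagonal and the array layout are assumptions for illustration, not the implementation used in the experiments. The only change from the standard updates is that each responsibility \(\gamma(z_{nk})\) is multiplied by the copy count \(M_n\) before it is accumulated.

```python
import numpy as np

def gaussian_pdf(points, mean, cov):
    """Multivariate normal density evaluated at each row of points."""
    dim = points.shape[1]
    diff = points - mean
    solved = np.linalg.solve(cov, diff.T).T
    exponent = -0.5 * np.sum(diff * solved, axis=1)
    norm = np.sqrt((2.0 * np.pi) ** dim * np.linalg.det(cov))
    return np.exp(exponent) / norm

def weighted_gmm_log_likelihood(points, counts, pis, means, covs):
    """sum_n M_n * ln sum_k pi_k N(x_n | mu_k, Sigma_k)."""
    dens = sum(p * gaussian_pdf(points, m, c) for p, m, c in zip(pis, means, covs))
    return float(np.sum(counts * np.log(dens)))

def weighted_em_step(points, counts, pis, means, covs, reg=1e-6):
    """One EM iteration where point n carries weight M_n = counts[n]."""
    num_components = len(pis)
    dim = points.shape[1]
    # E-step: responsibilities gamma(z_nk); unchanged by the weights.
    dens = np.stack(
        [pis[k] * gaussian_pdf(points, means[k], covs[k]) for k in range(num_components)],
        axis=1,
    )
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: every sum over n is weighted by M_n.
    weighted = counts[:, None] * resp            # M_n * gamma(z_nk)
    n_k = weighted.sum(axis=0)                   # effective count per component
    new_means = (weighted.T @ points) / n_k[:, None]
    new_covs = []
    for k in range(num_components):
        diff = points - new_means[k]
        cov = (weighted[:, k, None] * diff).T @ diff / n_k[k]
        new_covs.append(cov + reg * np.eye(dim))  # assumed small ridge for stability
    new_pis = n_k / counts.sum()                 # normalise by the total number of copies
    return new_pis, new_means, np.array(new_covs)
```

Running the step on the unique points with their counts gives the same parameters as running it on the fully replicated data set with unit weights, which is a convenient check on the weighting.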
Cite this article
Bhole, C., Pal, C., Rim, D. et al. 3D segmentation of abdominal CT imagery with graphical models, conditional random fields and learning. Machine Vision and Applications 25, 301–325 (2014). https://doi.org/10.1007/s00138-013-0497-x
DOI: https://doi.org/10.1007/s00138-013-0497-x