Abstract
In this research we address the problem of classifying and labeling regions in a single static natural image. Natural images exhibit strong spatial dependencies, and modeling these dependencies in a principled manner is crucial to achieving good classification accuracy. In this work, we present Discriminative Random Fields (DRFs) to model spatial interactions in images in a discriminative framework based on the concept of Conditional Random Fields proposed by Lafferty et al. (2001). DRFs classify image regions by incorporating neighborhood spatial interactions in the labels as well as in the observed data. The DRF framework offers several advantages over the conventional Markov Random Field (MRF) framework. First, DRFs relax the strong assumption of conditional independence of the observed data, which is generally made in the MRF framework for tractability. This assumption is too restrictive for a large number of applications in computer vision. Second, DRFs derive their classification power from probabilistic discriminative models instead of the generative models used for modeling observations in the MRF framework. Third, the interaction between labels in DRFs is based on the idea of pairwise discrimination of the observed data, which makes it data-adaptive instead of being fixed a priori as in MRFs. Finally, all the parameters in the DRF model are estimated simultaneously from the training data, unlike in the MRF framework, where the likelihood parameters are usually learned separately from the field parameters. We present preliminary experiments on man-made structure detection and binary image restoration tasks, and compare the DRF results with the MRF results.
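The contrast drawn above between the two frameworks can be written down compactly. The abstract itself gives no formulas, so the following is only a sketch under stated assumptions: binary labels x_i in {-1, +1}, an Ising-type label prior for the MRF, and the notation S (image sites), N_i (neighbors of site i), A_i, I_ij, beta and Z(y), all of which are introduced here for illustration.

```latex
% Sketch only: S, N_i, A_i, I_{ij}, \beta and Z(y) are notation assumed here,
% not taken from the abstract. x = {x_i} are the labels, y is the observed image.
\begin{align}
% Conventional MRF posterior: per-site generative likelihoods (the conditional
% independence assumption) times a label prior whose pairwise term ignores the data.
P_{\mathrm{MRF}}(x \mid y) &\propto \prod_{i \in S} p(y_i \mid x_i)\,
  \exp\Big( \sum_{i \in S} \sum_{j \in N_i} \beta\, x_i x_j \Big) \\
% Discriminative (CRF-style) posterior: both the per-site term and the pairwise
% term may depend on the whole observation y.
P_{\mathrm{DRF}}(x \mid y) &= \frac{1}{Z(y)}
  \exp\Big( \sum_{i \in S} A_i(x_i, y)
  + \sum_{i \in S} \sum_{j \in N_i} I_{ij}(x_i, x_j, y) \Big)
\end{align}
```

In this notation, the first two advantages correspond to A_i being a discriminative classifier output that can look at all of y, the third to I_ij depending on y rather than only on the labels, and the fourth to fitting the parameters of A_i and I_ij jointly through the single normalized conditional distribution.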
References
Barrett, W.A. and Petersen, K.D. 2001. Houghing the Hough: Peak collection for detection of corners, junctions and line intersections. In Proc. IEEE Int. Conference on Computer Vision and Pattern Recognition, 2:302–309.
Besag, J. 1986. On the statistical analysis of dirty pictures. Journal of Royal Statistical Soc., B-48:259–302.
Blake, A., Rother, C., Brown, M., Perez, P., and Torr, P. 2004. Interactive image segmentation using an adaptive GMMRF model. In Proc. European Conf. on Computer Vision (ECCV).
Bottou, L. 1991. Une approche théorique de l'apprentissage connexionniste: Applications à la reconnaissance de la parole. Ph.D. thesis, Université de Paris, France.
Bouman, C.A. and Shapiro, M. 1994. A multiscale random field model for Bayesian image segmentation. IEEE Trans. on Image Processing, 3(2):162–177.
Boykov, Y. and Jolly, M-P. 2001. Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In Proc. International Conference on Computer Vision (ICCV), I:105–112.
Cheng, H. and Bouman, C.A. 2001. Multiscale Bayesian segmentation using a trainable context model. IEEE Trans. on Image Processing, 10(4):511–525.
Christmas, W.J., Kittler, J. and Petrou, M. 1995. Structural matching in computer vision using probabilistic relaxation. IEEE Trans. Pattern Anal. Machine Intell., 17(8):749–764.
Collins, M. 2002. Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proc. Conference on Empirical Methods in Natural Language Processing (EMNLP).
Felzenszwalb, P.F. and Huttenlocher, D.P. 2000. Pictorial structures for object recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR'00).
Feng, X., Williams, C.K.I., and Felderhof, S.N. 2002. Combining belief networks and neural networks for scene segmentation. IEEE Trans. Pattern Anal. Machine Intelligence, 24(4):467–483.
Fergus, R., Perona, P., and Zisserman, A. 2003. Object class recognition by unsupervised scale-invariant learning. In Proc. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'03), 2:264–271.
Figueiredo, M.A.T. 2001. Adaptive sparseness using Jeffreys prior. Advances in Neural Information Processing Systems (NIPS).
Figueiredo, M.A.T. and Jain, A.K. 2001. Bayesian learning of sparse classifiers. In Proc. IEEE Int. Conference on Computer Vision and Pattern Recognition, 1:35–41.
Fox, C. and Nicholls, G. 2000. Exact MAP states and expectations from perfect sampling: Greig, Porteous and Seheult revisited. In Proc. Twentieth Int. Workshop on Bayesian Inference and Maximum Entropy Methods in Sci. and Eng.
Geman, S. and Geman, D. 1984. Stochastic relaxation, Gibbs distribution and the Bayesian restoration of images. IEEE Trans. on Patt. Anal. Mach. Intelli., 6:721–741.
Gill, P.E., Murray, W., and Wright, M.H. 1981. Practical Optimization. Academic Press, San Diego.
Greig, D.M., Porteous, B.T., and Seheult, A.H. 1989. Exact maximum a posteriori estimation for binary images. Journal of Royal Statis. Soc., 51(2):271–279.
Guo, C.E., Zhu, S.C., and Wu, Y.N. 2003. Modeling visual patterns by integrating descriptive and generative models. International Journal of Computer Vision, 53(1):5–29.
Hammersley, J.M. and Clifford, P. 1971. Markov fields on finite graphs and lattices. Unpublished.
He, X., Zemel, R., and Carreira-Perpinan, M. 2004. Multiscale conditional random fields for image labelling. IEEE Int. Conf. CVPR.
Hinton, G.E. 2002. Training product of experts by minimizing contrastive divergence. Neural Computation, 14:1771–1800.
Ising, E. 1925. Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31:253–258.
Kittler, J. 1997. Probabilistic relaxation: Potential, relationships and open problems. In Proc. Energy Minimization Methods in Computer Vision and Pattern Recognition, pp. 393–408.
Kittler, J. and Hancock, E.R. 1989. Combining evidence in probabilistic relaxation. Int. Jour. Pattern Recog. Artificial Intelli., 3(1):29–51.
Kittler, J. and Illingworth, J. 1985. Relaxation labeling algorithms — a review. Image and Vision Computing, 3(4):206–216.
Kittler, J. and Pairman, D. 1985. Contextual pattern recognition applied to cloud detection and identification. IEEE Trans. on Geo. and Remote Sensing, 23(6):855–863.
Kolmogorov, V. and Zabih, R. 2002. What energy functions can be minimized via graph cuts? In Proc. European Conf. on Computer Vision, 3:65–81.
Krishnamachari, S. and Chellappa, R. 1996. Delineating buildings by grouping lines with MRFs. IEEE Trans. on Pat. Anal. Mach. Intell., 5(1):164–168.
Kumar, S., August, J., and Hebert, M. 2005. Exploiting inference for approximate parameter learning in discriminative fields: An empirical study. Fourth Int. Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR).
Kumar, S. and Hebert, M. 2003. Discriminative fields for modeling spatial dependencies in natural images. In Advances in Neural Information Processing Systems (NIPS).
Kumar, S. and Hebert, M. 2003. Discriminative random fields: A discriminative framework for contextual interaction in classification. In Proc. IEEE International Conference on Computer Vision (ICCV), 2:1150–1157.
Kumar, S. and Hebert, M. 2003. Man-made structure detection in natural images using a causal multiscale random field. In Proc. IEEE Int. Conf. on Comp. Vision and Pattern Recog. (CVPR), 1:119–126.
Kumar, S., Loui, A.C., and Hebert, M. 2003. An observation-constrained generative approach for probabilistic classification of image regions. Image and Vision Computing, Special Issue on Generative Models Based Vision, 21:87–97.
Lafferty, J., McCallum, A., and Pereira, F. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proc. Int. Conf. on Machine Learning.
Lafferty, J., Zhu, X. and Liu, Y. 2004. Kernel conditional random fields: Representation and clique selection. In Proc. Twenty-First International Conference on Machine Learning (ICML).
Li, S.Z. 2001. Markov Random Field Modeling in Image Analysis. Springer-Verlag, Tokyo.
MacKay, D. 1996. Bayesian non-linear modelling for the 1993 energy prediction competition. In Maximum Entropy and Bayesian Methods, pp. 221–234.
McCullagh, P. and Nelder, J.A. 1987. Generalised Linear Models. Chapman and Hall, London.
Minka, T.P. 2001. Algorithms for Maximum-Likelihood Logistic Regression. Statistics Tech Report 758, Carnegie Mellon University.
Murphy, K., Torralba, A., and Freeman, W.T. 2003. Using the forest to see the trees: A graphical model relating features, objects and scenes. In Advances in Neural Information Processing Systems (NIPS 03).
Ng, A.Y. and Jordan, M.I. 2002. On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes. Advances in Neural Information Processing Systems (NIPS).
Pieczynski, W. and Tebbache, A.N. 2000. Pairwise Markov random fields and its application in textured images segmentation. In Proc. 4th IEEE Southwest Symposium on Image Analysis and Interpretation, pp. 106–110.
Qi, Y., Szummer, M., and Minka, T.P. 2005. Diagram structure recognition by Bayesian conditional random fields. In Proc. International Conference on Computer Vision and Pattern Recognition (CVPR).
Quattoni, A., Collins, M., and Darrell, T. 2004. Conditional random fields for object recognition. Neural Information Processing Systems (NIPS).
Rosenfeld, A., Hummel, R., and Zucker, S. 1976. Scene labeling by relaxation operations. IEEE Trans. Systems, Man, and Cybernetics, SMC-6:420–433.
Rubinstein, Y.D. and Hastie, T. 1997. Discriminative vs informative learning. In Proc. Third Int. Conf. on Knowledge Discovery and Data Mining, pp. 49–53.
Szummer, M. and Qi, Y. 2004. Contextual recognition of hand-drawn diagrams with conditional random fields. Workshop on Frontiers in Handwriting Recognition.
Taskar, B., Guestrin, C., and Koller, D. 2003. Max-margin Markov network. Neural Information Processing Systems Conference (NIPS'03).
Tipping, M. 2000. The relevance vector machine. Advances in Neural Information Processing Systems (NIPS 12), pp. 652–658.
Torralba, A., Murphy, K.P., and Freeman, W.T. 2005. Contextual models for object detection using boosted random fields. Adv. in Neural Information Processing Systems (NIPS).
Waltz, D.L. 1975. Understanding Line Drawing of Scenes with Shadows. In The Psychology of Computer Vision, P.H. Winston, ed. McGraw-Hill, New York.
Wang, Y. and Ji, Q. 2005. A dynamic conditional random field model for object segmentation in image sequences. In Proc. IEEE Int. Conf. on Comp. Vision and Pattern Recog. (CVPR), 1:264–270.
Weber, M., Welling, M., and Perona, P. 2000. Towards automatic discovery of object categories. In Proc. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'00).
Weinman, J., Hanson, A., and McCallum, A. 2004. Sign detection in natural images with conditional random fields. In Proc. of IEEE International Workshop on Machine Learning for Signal Processing.
Williams, C.K.I. and Adams, N.J. 1999. DTs: Dynamic trees. Advances in Neural Information Processing Systems, 11.
Williams, P. 1995. Bayesian regularization and pruning using a Laplacian prior. Neural Computation, 7:117–143.
Wilson, R. and Li, C.T. 2003. A class of discrete multiresolution random fields and its application to image segmentation. IEEE Trans. on Pattern Anal. and Machine Intelli., 25(1):42–56.
Won, C.S. and Derin, H. 1992. Unsupervised segmentation of noisy and textured images using Markov random fields. CVGIP, 54:308–328.
Xiao, G., Brady, M., Noble, J.A., and Zhang, Y. 2002. Segmentation of ultrasound B-mode images with intensity inhomogeneity correction. IEEE Trans. on Medical Imaging, 21(1):48–57.
Additional information
Sanjiv Kumar is currently with Google Research, Pittsburgh, PA, USA. His contact email is: sanjivk@google.com.
Cite this article
Kumar, S., Hebert, M. Discriminative Random Fields. Int J Comput Vision 68, 179–201 (2006). https://doi.org/10.1007/s11263-006-7007-9
DOI: https://doi.org/10.1007/s11263-006-7007-9