Fields of Experts

Abstract

We develop a framework for learning generic, expressive image priors that capture the statistics of natural scenes and can be used for a variety of machine vision tasks. The approach provides a practical method for learning high-order Markov random field (MRF) models with potential functions that extend over large pixel neighborhoods. These clique potentials are modeled using the Product-of-Experts framework, which uses non-linear functions of many linear filter responses. In contrast to previous MRF approaches, all parameters, including the linear filters themselves, are learned from training data. We demonstrate the capabilities of this Field-of-Experts model with two example applications, image denoising and image inpainting, which are implemented using a simple, approximate inference scheme. Although the model is trained on a generic image database and is not tuned toward a specific application, we obtain results that compete with specialized techniques.
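As a rough illustration of how such a prior scores an image, the sketch below sums expert terms over every overlapping patch (clique). It is a minimal sketch under stated assumptions, not the trained model: the Student-t form of the expert, the two hand-picked 2x2 derivative filters, the unit weights in `alphas`, and the function names `filter_responses` and `foe_energy` are all hypothetical choices made for this example. In the model described above, the filters and their weights would be learned from training data.

```python
import math


def filter_responses(image, filt):
    """Correlate one linear filter with every fully overlapping patch (clique)."""
    fh, fw = len(filt), len(filt[0])
    h, w = len(image), len(image[0])
    out = []
    for r in range(h - fh + 1):
        row = []
        for c in range(w - fw + 1):
            s = 0.0
            for i in range(fh):
                for j in range(fw):
                    s += filt[i][j] * image[r + i][c + j]
            row.append(s)
        out.append(row)
    return out


def foe_energy(image, filters, alphas):
    """Energy (negative log of the unnormalized prior), summed over cliques and experts."""
    energy = 0.0
    for filt, alpha in zip(filters, alphas):
        for row in filter_responses(image, filt):
            for resp in row:
                # Assumed Student-t expert: phi(resp) = (1 + resp^2 / 2) ** (-alpha),
                # so its contribution to the energy is alpha * log(1 + resp^2 / 2).
                energy += alpha * math.log(1.0 + 0.5 * resp * resp)
    return energy


# Hypothetical hand-picked filters (horizontal and vertical differences).
filters = [
    [[1.0, -1.0], [0.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
]
alphas = [1.0, 1.0]

# A constant image and a checkerboard-like "noisy" image of the same size.
flat = [[0.5] * 4 for _ in range(4)]
noisy = [[0.5 + 0.3 * ((r + c) % 2) for c in range(4)] for r in range(4)]

flat_energy = foe_energy(flat, filters, alphas)
noisy_energy = foe_energy(noisy, filters, alphas)
```

With these toy filters the constant image has lower energy than the checkerboard image, meaning the prior treats it as more probable. An approximate inference scheme for denoising could exploit this by lowering the energy while staying close to the observed pixels.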

Keywords

Markov random fields · Low-level vision · Image modeling · Learning · Image restoration

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Department of Computer Science, TU Darmstadt, Darmstadt, Germany
  2. Department of Computer Science, Brown University, Providence, USA
