Basic Principles of Annealing for Large Scale Non-Linear Optimization

Online Optimization of Large Scale Systems

Abstract

Computational annealing, a class of optimization heuristics inspired by the statistical physics of phase transitions, has been demonstrated to be highly effective for large, non-linear combinatorial optimization problems. Many applications in computer vision and pattern recognition involve non-linear objective functions with a very large number of discrete and possibly additional continuous variables. Typical examples are clustering, grouping and image segmentation, or assignment problems in motion or stereo analysis or in object recognition. For problems of this type, standard integer programming techniques are not applicable, and one has to resort to optimization heuristics that are fast yet avoid a possibly exponential number of unfavorable local minima. A particularly powerful, generic class of algorithms is provided by simulated or deterministic annealing techniques. Simulated annealing and the Gibbs sampler are discussed first to present the basic concepts; then the theory of deterministic annealing is presented in great detail, and its relation to continuation methods is discussed.
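To make the basic concept concrete, the following is a minimal sketch of simulated annealing with a Metropolis acceptance rule, applied to a toy two-cluster assignment of one-dimensional points. It is an illustration only and is not taken from the chapter: the cost function, the single-label-flip proposal, the geometric cooling schedule, and all names and parameter values (`clustering_cost`, `simulated_annealing`, `t_start`, `t_end`, `cooling`, `sweeps_per_temperature`) are assumptions chosen for brevity.

```python
import math
import random


def clustering_cost(points, labels):
    """Sum of squared deviations from the cluster mean, over both clusters."""
    cost = 0.0
    for k in (0, 1):
        members = [x for x, z in zip(points, labels) if z == k]
        if members:
            mean = sum(members) / len(members)
            cost += sum((x - mean) ** 2 for x in members)
    return cost


def simulated_annealing(points, t_start=10.0, t_end=0.01, cooling=0.95,
                        sweeps_per_temperature=20, seed=0):
    """Anneal binary cluster labels; returns the best labels and cost seen.

    All defaults are illustrative assumptions, not recommended settings.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(2) for _ in points]
    energy = clustering_cost(points, labels)
    best_labels, best_energy = list(labels), energy

    temperature = t_start
    while temperature > t_end:
        for _ in range(sweeps_per_temperature * len(points)):
            i = rng.randrange(len(points))
            labels[i] ^= 1  # propose flipping the label of one point
            new_energy = clustering_cost(points, labels)
            delta = new_energy - energy
            # Metropolis rule: always accept improvements; accept a
            # deterioration with probability exp(-delta / temperature).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                energy = new_energy
                if energy < best_energy:
                    best_labels, best_energy = list(labels), energy
            else:
                labels[i] ^= 1  # rejected: undo the flip
        temperature *= cooling  # geometric cooling (an assumed schedule)
    return best_labels, best_energy


if __name__ == "__main__":
    data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
    found_labels, found_cost = simulated_annealing(data)
    print(found_labels, found_cost)
```

At high temperature almost every flip is accepted, so the chain explores the assignment space freely; as the temperature falls, uphill moves become rare and the search settles into a low-cost assignment. Deterministic annealing replaces this stochastic sampling with deterministic updates of expected assignments at each temperature.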

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Buhmann, J.M., Puzicha, J. (2001). Basic Principles of Annealing for Large Scale Non-Linear Optimization. In: Grötschel, M., Krumke, S.O., Rambau, J. (eds) Online Optimization of Large Scale Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04331-8_36

  • DOI: https://doi.org/10.1007/978-3-662-04331-8_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-07633-6

  • Online ISBN: 978-3-662-04331-8

  • eBook Packages: Springer Book Archive
