Basic Principles of Annealing for Large Scale Non-Linear Optimization

Buhmann, Joachim M.; Puzicha, Jan

doi:10.1007/978-3-662-04331-8_36


Abstract

Computational annealing, a class of optimization heuristics inspired by the statistical physics of phase transitions, has been demonstrated to be highly effective for large, non-linear combinatorial optimization problems. In many applications in computer vision and pattern recognition, one encounters non-linear objective functions with a very large number of discrete and possibly additional continuous variables. Typical examples are clustering, grouping, and image segmentation, as well as assignment problems in motion analysis, stereo analysis, and object recognition. For problems of this type, standard integer programming techniques are not applicable, and one has to resort to optimization heuristics that are fast yet avoid a possibly exponential number of unfavorable local minima. A particularly powerful, generic class of algorithms is provided by simulated or deterministic annealing techniques. Simulated annealing and the Gibbs sampler are discussed first to present the basic concepts; then the theory of deterministic annealing is presented in great detail, and its relation to continuation methods is discussed.
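As a rough illustration of the simulated annealing idea named in the abstract, the following minimal sketch applies a Metropolis acceptance rule with a geometric cooling schedule to a toy one-dimensional clustering problem. It is not taken from the chapter: the function names, the parameter defaults (t_start, t_end, cooling, sweeps_per_temp), the single-site proposal, and the toy data are all assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(cost, n_vars, n_labels, t_start=2.0, t_end=0.01,
                        cooling=0.95, sweeps_per_temp=20, seed=0):
    """Minimize cost(assignment) over discrete label assignments.

    All parameter names and default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    state = [rng.randrange(n_labels) for _ in range(n_vars)]
    current = cost(state)
    best_state, best_cost = list(state), current
    temperature = t_start
    while temperature > t_end:
        for _ in range(sweeps_per_temp * n_vars):
            i = rng.randrange(n_vars)
            old = state[i]
            state[i] = rng.randrange(n_labels)  # propose a single-site change
            proposed = cost(state)
            delta = proposed - current
            # Metropolis rule: accept every improvement; accept an uphill
            # move with probability exp(-delta / temperature).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current = proposed
                if current < best_cost:
                    best_state, best_cost = list(state), current
            else:
                state[i] = old  # reject: restore the previous label
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return best_state, best_cost


def clustering_cost(points, n_labels):
    """Build a cost: sum of squared distances of 1-D points to cluster means."""
    def cost(assignment):
        total = 0.0
        for k in range(n_labels):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                mean = sum(members) / len(members)
                total += sum((p - mean) ** 2 for p in members)
        return total
    return cost


# Toy data: two well-separated groups of points (hypothetical example).
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
labels, value = simulated_annealing(clustering_cost(points, 2), len(points), 2)
```

At high temperature the chain accepts most proposals and explores broadly; as the temperature falls, uphill moves become rare and the state settles into a low-cost assignment. Deterministic annealing, the chapter's main subject, replaces this stochastic sampling with deterministic updates of expected assignments and is not shown here.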

Author information

Authors and Affiliations

Institut für Informatik, Rheinische Friedrich-Wilhelms-Universität Bonn, Germany
Joachim M. Buhmann
Department of Computer Science, University of California, Berkeley, USA
Jan Puzicha

Editor information

Editors and Affiliations

Konrad-Zuse-Zentrum für Informationstechnik Berlin (ZIB), Takustraße 7, 14195, Berlin-Dahlem, Germany
Martin Grötschel, Sven O. Krumke & Jörg Rambau

Cite this chapter

Buhmann, J.M., Puzicha, J. (2001). Basic Principles of Annealing for Large Scale Non-Linear Optimization. In: Grötschel, M., Krumke, S.O., Rambau, J. (eds) Online Optimization of Large Scale Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04331-8_36

DOI: https://doi.org/10.1007/978-3-662-04331-8_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-07633-6
Online ISBN: 978-3-662-04331-8
eBook Packages: Springer Book Archive
