Abstract
Computational annealing, a class of optimization heuristics inspired by the statistical physics of phase transitions, has been demonstrated to be highly effective for large, non-linear combinatorial optimization problems. Many applications in computer vision and pattern recognition involve non-linear objective functions with a very large number of discrete and possibly additional continuous variables. Typical examples are clustering, grouping and image segmentation, or assignment problems in motion analysis, stereo analysis, and object recognition. For problems of this type, standard integer programming techniques are not applicable, and one has to resort to optimization heuristics that are fast yet avoid a possibly exponential number of unfavorable local minima. A particularly powerful, generic class of algorithms is provided by simulated or deterministic annealing techniques. Simulated annealing and the Gibbs sampler are discussed first to present the basic concepts; then the theory of deterministic annealing is presented in detail, and its relation to continuation methods is discussed.
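As a rough illustration of the simulated annealing idea named in the abstract, the following minimal sketch applies Metropolis-style acceptance with a geometric cooling schedule to a toy two-way partitioning problem. It is not taken from the chapter: the function names, the cooling parameters (`t_start`, `t_end`, `alpha`, `sweeps`), the toy weights, and the cost function are all assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(cost, neighbor, state, t_start=2.0, t_end=0.01,
                        alpha=0.95, sweeps=50, seed=0):
    """Generic simulated annealing loop (illustrative sketch only).

    cost(state) returns a number to minimize; neighbor(state, rng) returns
    a randomly perturbed copy of state. The geometric cooling schedule
    t <- alpha * t is an assumption, not a schedule taken from the chapter.
    """
    rng = random.Random(seed)
    current, current_cost = state, cost(state)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(sweeps):
            cand = neighbor(current, rng)
            delta = cost(cand) - current_cost
            # Metropolis rule: always accept improvements; accept a worse
            # candidate with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = cand, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= alpha  # lower the temperature
    return best, best_cost


# Hypothetical toy problem: split the weights into two groups whose sums
# are as close as possible. A state is a tuple of 0/1 group labels.
weights = [8, 7, 6, 5, 4]


def cost(labels):
    group0 = sum(w for w, g in zip(weights, labels) if g == 0)
    group1 = sum(w for w, g in zip(weights, labels) if g == 1)
    return abs(group0 - group1)


def flip_one(labels, rng):
    # Move one randomly chosen item to the other group.
    i = rng.randrange(len(labels))
    return labels[:i] + (1 - labels[i],) + labels[i + 1:]


initial = (0,) * len(weights)
best, best_cost = simulated_annealing(cost, flip_one, initial)
print(best, best_cost)
```

At high temperature the chain accepts almost any move and explores freely; as the temperature drops, uphill moves become rare and the search settles into a low-cost configuration. Deterministic annealing, the chapter's main subject, replaces this stochastic sampling with deterministic updates of expected values and is not shown here.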
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Buhmann, J.M., Puzicha, J. (2001). Basic Principles of Annealing for Large Scale Non-Linear Optimization. In: Grötschel, M., Krumke, S.O., Rambau, J. (eds) Online Optimization of Large Scale Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04331-8_36
DOI: https://doi.org/10.1007/978-3-662-04331-8_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-07633-6
Online ISBN: 978-3-662-04331-8
eBook Packages: Springer Book Archive