Abstract
We examine several methods for data clustering and data classification that are based on minimizing the so-called cluster function and its modifications. These functions are nonsmooth and nonconvex. We use the discrete gradient method for their local minimization, and we also consider combining it with the cutting angle method for global minimization. We present and discuss the results of numerical experiments.
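The abstract does not spell out the cluster function. As a minimal sketch, assuming the commonly used form in which each data point contributes its squared Euclidean distance to the nearest cluster center and the result is averaged over the data, it could be evaluated as follows. The function name, the choice of norm, and the sample data are illustrative assumptions, not taken from the article.

```python
from typing import Sequence

Point = Sequence[float]


def cluster_function(centers: Sequence[Point], data: Sequence[Point]) -> float:
    """Average over data points of the squared Euclidean distance
    to the nearest center (assumed form of the cluster function)."""
    if not centers or not data:
        raise ValueError("centers and data must be non-empty")
    total = 0.0
    for a in data:
        # The inner min over centers is what makes the function nonsmooth;
        # summing such minima over the data makes it nonconvex in general.
        total += min(
            sum((c_i - a_i) ** 2 for c_i, a_i in zip(c, a)) for c in centers
        )
    return total / len(data)


# Hypothetical toy data: two well-separated groups in the plane.
data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
one_center = [(5.0, 1.0)]
two_centers = [(0.0, 1.0), (10.0, 1.0)]

value_one = cluster_function(one_center, data)
value_two = cluster_function(two_centers, data)
```

With this toy data, placing one center in each group gives a much smaller value than a single center in the middle. A local method such as the discrete gradient method would search over the center coordinates to reduce this value.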
Additional information
This research was supported by the Australian Research Council.
Cite this article
Bagirov, A.M., Rubinov, A.M., Soukhoroukova, N.V. et al. Unsupervised and supervised data classification via nonsmooth and global optimization. Top 11, 1–75 (2003). https://doi.org/10.1007/BF02578945
DOI: https://doi.org/10.1007/BF02578945