Journal of Classification, Volume 8, Issue 1, pp 31–44

Large-sample results for optimization-based clustering methods

  • Peter G. Bryant
Article

Abstract

Many common (nonhierarchical) clustering and classification methods are optimization-based methods, in the sense described by Windham (1987) in this Journal. This paper gives some large sample properties for estimates derived by such methods. Under appropriate conditions, such estimates converge with probability one to a limit, and are asymptotically normally distributed around that limiting value. The conditions are satisfied by most of the common examples of optimization-based methods.

Keywords

Classification; Clustering; Maximum likelihood; Asymptotic properties

References

  1. BARDWELL, R. A. (1989), “Asymptotic Behavior of Certain Estimators Under Mild Regularity Conditions,” Ph.D. dissertation, Department of Mathematics, University of Colorado at Boulder.
  2. BOCK, H.-H. (1985), “On Some Significance Tests in Cluster Analysis,” Journal of Classification, 2, 77–108.
  3. BOENTE, G., and FRAIMAN, R. (1988), “On the Asymptotic Behavior of General Maximum Likelihood Estimates for the Nonregular Case Under Nonstandard Conditions,” Biometrika, 75, 45–56.
  4. BRYANT, P. G. (1988), “On Characterizing Optimization-Based Clustering Criteria,” Journal of Classification, 5, 81–84.
  5. BRYANT, P. G., and WILLIAMSON, J. A. (1978), “Asymptotic Behaviour of Classification Maximum Likelihood Estimates,” Biometrika, 65, 273–281.
  6. BRYANT, P. G., and WILLIAMSON, J. A. (1984), “The Asymptotic Distribution of Statistics Derived by Maximizing Sums,” Faculty Working Paper Series number UCD-CBA 1984-3, College of Business and Administration, University of Colorado at Denver.
  7. BRYANT, P. G., and WILLIAMSON, J. A. (1986), “Maximum Likelihood and Classification: A Comparison of Three Approaches,” in Classification as a Tool of Research, Eds. W. Gaul and M. Schader, Amsterdam: North-Holland, 35–45.
  8. DANIELS, H. E. (1961), “The Asymptotic Efficiency of a Maximum Likelihood Estimator,” in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1, Ed. J. Neyman, Berkeley and Los Angeles: University of California Press, 151–163.
  9. DAY, W. H. E., and EDELSBRUNNER, H. (1985), “Investigation of Proportional Link Linkage Clustering Methods,” Journal of Classification, 2, 239–254.
  10. DUPACOVA, J., and WETS, R. (1988), “Asymptotic Behavior of Statistical Estimators and of Optimal Solutions of Stochastic Optimization Problems,” Annals of Statistics, 16, 4, 1517–1549.
  11. FOUTZ, R. V., and SRIVASTAVA, R. C. (1977), “The Performance of the Likelihood Ratio Test When the Model is Incorrect,” Annals of Statistics, 5, 1183–1194.
  12. HARTIGAN, J. A. (1978), “Asymptotic Distributions for Clustering Criteria,” Annals of Statistics, 6, 117–131.
  13. HUBER, P. J. (1967), “The Behavior of Maximum Likelihood Estimates under Non-standard Conditions,” in Proceedings, Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1, Eds. L. M. Le Cam and J. Neyman, Berkeley and Los Angeles: University of California Press, 221–233.
  14. MARRIOTT, F. H. C. (1975), “Separating Mixtures of Normal Distributions,” Biometrics, 31, 767–769.
  15. MARRIOTT, F. H. C. (1982), “Optimization Methods of Cluster Analysis,” Biometrika, 69, 417–421.
  16. POLLARD, D. (1981), “Strong Consistency of k-means Clustering,” Annals of Statistics, 9, 135–140.
  17. POLLARD, D. (1982), “A Central Limit Theorem for k-means Clustering,” Annals of Probability, 10, 919–926.
  18. SCOTT, A. J., and SYMONS, M. J. (1971), “Clustering Methods Based on Likelihood Ratio Criteria,” Biometrics, 27, 387–397.
  19. SYMONS, M. J. (1981), “Clustering Criteria and Multivariate Normal Mixtures,” Biometrics, 37, 35–43.
  20. WHITE, H. (1982), “Maximum Likelihood Estimation of Misspecified Models,” Econometrica, 50, 1–25.
  21. WILLIAMSON, J. A. (1984), “A Note on the Proof by H. E. Daniels of the Asymptotic Efficiency of a Maximum Likelihood Estimator,” Biometrika, 71, 651–653.
  22. WINDHAM, M. P. (1986), “A Unification of Optimization-Based Numerical Classification Algorithms,” in Classification as a Tool of Research, Eds. W. Gaul and M. Schader, Amsterdam: North-Holland, 447–452.
  23. WINDHAM, M. P. (1987), “Parameter Modification for Clustering Criteria,” Journal of Classification, 4, 191–214.
  24. WINDHAM, M. P. (1989), “Statistical Models in Cluster Analysis,” Utah State University, Department of Mathematics and Statistics Research Report May/1989/45.

Copyright information

© Springer-Verlag 1991

Authors and Affiliations

  • Peter G. Bryant
  1. Graduate School of Business Administration, University of Colorado at Denver, Denver, USA
