Abstract
Categorical data of high (but finite) dimensionality generate sparsely populated J-way contingency tables because of finite sample sizes. A model representing such data by a "smooth" low dimensional parametric structure using a "natural" metric would be useful. We discuss a model using a metric determined by convex sets to represent moments of a discrete distribution to order J. The model is shown, from theorems on convex polytopes, to depend only on the linear space spanned by the convex set—it is otherwise measure invariant. We provide an empirical example to illustrate the maximum likelihood estimation of parameters of a particular statistical application (Grade of Membership analysis) of such a model.
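As a rough illustration of the kind of likelihood that Grade of Membership estimation maximizes, the sketch below assumes the commonly stated form of the model, which the abstract itself does not write out: individual i has nonnegative membership scores g[i][k] summing to one over K extreme profiles, and the probability of category l on variable j is the convex combination sum_k g[i][k] * lam[k][j][l]. The names `g`, `lam`, `x` and all numeric values are hypothetical and chosen only for this toy example; the code evaluates the log-likelihood and does not attempt the estimation procedure used in the paper.

```python
import math


def gom_cell_probs(g_i, lam):
    """Return p[j][l] = sum_k g_i[k] * lam[k][j][l] for one individual.

    g_i : membership scores over K profiles (nonnegative, summing to 1).
    lam : lam[k][j][l] is the probability of category l on variable j
          under extreme profile k (each lam[k][j] sums to 1).
    """
    n_profiles = len(g_i)
    n_vars = len(lam[0])
    return [
        [
            sum(g_i[k] * lam[k][j][l] for k in range(n_profiles))
            for l in range(len(lam[0][j]))
        ]
        for j in range(n_vars)
    ]


def gom_log_likelihood(x, g, lam):
    """Log-likelihood of observed categories x[i][j] (category indices)."""
    total = 0.0
    for i, row in enumerate(x):
        p = gom_cell_probs(g[i], lam)
        for j, l in enumerate(row):
            total += math.log(p[j][l])
    return total


# Hypothetical toy values: K = 2 profiles, J = 2 variables
# with 2 and 3 categories respectively.
lam = [
    [[0.9, 0.1], [0.7, 0.2, 0.1]],  # profile 0
    [[0.2, 0.8], [0.1, 0.3, 0.6]],  # profile 1
]
g = [
    [1.0, 0.0],  # individual 0 sits at a vertex (pure profile 0)
    [0.5, 0.5],  # individual 1 is an even mixture of both profiles
]
x = [
    [0, 0],  # observed category indices for individual 0
    [1, 2],  # observed category indices for individual 1
]

loglik = gom_log_likelihood(x, g, lam)
```

Because each individual's cell probabilities are a convex combination of the profile probabilities, they lie inside the polytope whose vertices are the profiles. This is the geometric picture the abstract refers to.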
Cite this article
Woodbury, M.A., Manton, K.G. & Tolley, H.D. Convex Models of High Dimensional Discrete Data. Annals of the Institute of Statistical Mathematics 49, 371–393 (1997). https://doi.org/10.1023/A:1003175232300
DOI: https://doi.org/10.1023/A:1003175232300