The location model is a useful tool in the parametric analysis of mixed continuous and categorical variables. In this model, the continuous variables are assumed to follow a different multivariate normal distribution for each possible combination of categorical variable values. Using this model, a distance between two populations involving mixed variables can be defined. To date, however, no distributional results have been available against which to assess the outcomes of practical applications of this distance. This paper therefore considers the null distribution of the estimated distance for a range of possible situations. No explicit analytical expressions are derived for this distribution, but easily implementable Monte Carlo schemes are described. These are then applied to previously cited examples.
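As a rough illustration of the kind of Monte Carlo scheme the abstract refers to, the following Python sketch simulates the null distribution of an estimated distance under a location model. It is not the paper's own procedure, which is not reproduced here. The sketch assumes a covariance matrix common to all cells and both groups, and it assumes an affinity-based distance of the form Δ² = 2(1 − ρ), with ρ = Σ_m √(p_1m p_2m) exp(−D_m²/8) and D_m² the squared Mahalanobis distance between the two groups' means in cell m. The function names `estimated_distance` and `simulate_null`, and all parameter names, are hypothetical.

```python
import numpy as np


def estimated_distance(x1, z1, x2, z2, n_cells):
    """Plug-in estimate of the assumed affinity-based distance.

    x1, x2: (n, p) arrays of continuous variables for each group.
    z1, z2: (n,) integer arrays giving each unit's categorical cell, 0..n_cells-1.
    Assumption: one covariance matrix shared by all cells and both groups.
    """
    p = x1.shape[1]
    pooled = np.zeros((p, p))
    dof = 0
    means, props = [], []
    for x, z in ((x1, z1), (x2, z2)):
        counts = np.bincount(z, minlength=n_cells)
        cell_means = np.zeros((n_cells, p))
        for m in range(n_cells):
            if counts[m] > 0:
                xm = x[z == m]
                cell_means[m] = xm.mean(axis=0)
                centred = xm - cell_means[m]
                pooled += centred.T @ centred  # within-cell sums of squares
                dof += counts[m] - 1
        means.append(cell_means)
        props.append(counts / counts.sum())
    cov_inv = np.linalg.inv(pooled / dof)  # needs dof >= p

    affinity = 0.0
    for m in range(n_cells):
        # Cells that are empty in either sample contribute nothing.
        if props[0][m] > 0 and props[1][m] > 0:
            d = means[0][m] - means[1][m]
            mahal_sq = d @ cov_inv @ d
            affinity += np.sqrt(props[0][m] * props[1][m]) * np.exp(-mahal_sq / 8.0)
    return np.sqrt(max(0.0, 2.0 * (1.0 - affinity)))


def simulate_null(n1, n2, cell_probs, p, n_rep=500, seed=0):
    """Monte Carlo draws of the estimated distance when both groups
    come from the same location model (the null hypothesis).

    Zero cell means and an identity covariance are used: under the
    common-covariance assumption the estimate is unchanged by shifts
    shared by both groups and by nonsingular linear transformations.
    """
    rng = np.random.default_rng(seed)
    cell_probs = np.asarray(cell_probs, dtype=float)
    n_cells = len(cell_probs)
    draws = np.empty(n_rep)
    for r in range(n_rep):
        z1 = rng.choice(n_cells, size=n1, p=cell_probs)
        z2 = rng.choice(n_cells, size=n2, p=cell_probs)
        x1 = rng.standard_normal((n1, p))
        x2 = rng.standard_normal((n2, p))
        draws[r] = estimated_distance(x1, z1, x2, z2, n_cells)
    return draws
```

In use, an observed distance could be compared with an upper quantile of the simulated values, for example `np.quantile(simulate_null(30, 30, [0.25, 0.25, 0.25, 0.25], p=2), 0.95)`. The cell probabilities are unknown in practice, so any values supplied here (such as pooled sample proportions) are a further assumption of the sketch.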
Krzanowski, W.J. On the null distribution of distance between two groups, using mixed continuous and categorical variables. Journal of Classification 1, 243–253 (1984). https://doi.org/10.1007/BF01890125
- Distance between groups
- Location model
- Mixed variables
- Monte Carlo methods