On the null distribution of distance between two groups, using mixed continuous and categorical variables

Abstract

The location model is a useful tool in parametric analysis of mixed continuous and categorical variables. In this model, the continuous variables are assumed to follow different multivariate normal distributions for each possible combination of categorical variable values. Using this model, a distance between two populations involving mixed variables can be defined. To date, however, no distributional results have been available, against which to assess the outcomes of practical applications of this distance. The null distribution of estimated distance is therefore considered in this paper, for a range of possible situations. No explicit analytical expressions are derived for this distribution, but easily implementable Monte Carlo schemes are described. These are then applied to previously cited examples.
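As a rough illustration of the kind of Monte Carlo scheme the abstract refers to, the sketch below simulates two groups from a single location model (so the null hypothesis of no group difference holds by construction), computes an estimated distance for each replicate, and collects the replicates as an empirical null distribution. Everything specific here is an assumption and is not taken from the paper: the function names, the use of a common within-cell covariance matrix, and the choice of a Matusita-type affinity distance with plug-in estimates. The schemes described in the paper may differ in detail.

```python
import numpy as np


def simulate_group(rng, n, cell_probs, cell_means, cov):
    """Draw n observations from a location model (hypothetical helper).

    Each observation falls in a categorical cell with probability cell_probs[c];
    its continuous part is multivariate normal with mean cell_means[c] and a
    covariance matrix `cov` assumed common to all cells.
    """
    cells = rng.choice(len(cell_probs), size=n, p=cell_probs)
    noise = rng.multivariate_normal(np.zeros(cov.shape[0]), cov, size=n)
    return cells, cell_means[cells] + noise


def estimated_distance(cells1, x1, cells2, x2, n_cells):
    """Plug-in affinity-based distance between two samples (assumed form)."""
    p = x1.shape[1]
    pooled = np.zeros((p, p))
    dof = 0
    means, props = [], []
    for cells, x in ((cells1, x1), (cells2, x2)):
        m = np.zeros((n_cells, p))
        counts = np.bincount(cells, minlength=n_cells)
        for c in range(n_cells):
            xc = x[cells == c]
            if len(xc) > 0:
                m[c] = xc.mean(axis=0)
                resid = xc - m[c]
                pooled += resid.T @ resid  # within-cell sums of squares
                dof += len(xc) - 1
        means.append(m)
        props.append(counts / counts.sum())
    pooled /= dof
    diff = means[0] - means[1]
    # Squared Mahalanobis distance between the group means in each cell.
    d2 = np.einsum("ci,ij,cj->c", diff, np.linalg.inv(pooled), diff)
    # Cells that are empty in either sample get zero weight here.
    affinity = np.sum(np.sqrt(props[0] * props[1]) * np.exp(-d2 / 8.0))
    return float(np.sqrt(max(0.0, 2.0 * (1.0 - affinity))))


def null_distribution(n1, n2, cell_probs, cell_means, cov, n_rep=1000, seed=0):
    """Monte Carlo null distribution: both groups share the same parameters."""
    rng = np.random.default_rng(seed)
    out = np.empty(n_rep)
    for r in range(n_rep):
        c1, x1 = simulate_group(rng, n1, cell_probs, cell_means, cov)
        c2, x2 = simulate_group(rng, n2, cell_probs, cell_means, cov)
        out[r] = estimated_distance(c1, x1, c2, x2, len(cell_probs))
    return out


# Illustrative (made-up) parameters: one binary variable, two continuous ones.
probs = np.array([0.4, 0.6])
cell_means = np.array([[0.0, 0.0], [1.0, -1.0]])
cov = np.eye(2)
null = null_distribution(30, 40, probs, cell_means, cov, n_rep=200, seed=1)
critical_value = np.quantile(null, 0.95)  # upper 5% point of the simulated null
```

An observed distance larger than `critical_value` would then be judged unusual under the null hypothesis at roughly the 5% level. In practice the parameters fed to the simulation would have to be estimated from the data, which is one of the choices this sketch leaves open.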

About this article

Cite this article

Krzanowski, W.J. On the null distribution of distance between two groups, using mixed continuous and categorical variables. Journal of Classification 1, 243–253 (1984). https://doi.org/10.1007/BF01890125

Keywords

  • Distance between groups
  • Location model
  • Mixed variables
  • Monte Carlo methods
  • Simulation