Distance Based Association and Multi-Sample Tests for General Multivariate Data

  • Carles M. Cuadras
Chapter
Part of the Statistics for Industry and Technology book series (SIT)

Abstract

Most multivariate tests assume multivariate normality. This assumption often fails, or some of the variables are non-quantitative. In other cases we may have to deal with a large number of variables. Defining probabilistic models for mixed data is not easy. However, it is always possible to define a measure of distance between two observations. We prove that distances can provide alternative tests for comparing several populations when the data are of general type. This approach is illustrated with three real data examples. We also define and study a measure of association between two data sets and give a Bayesian extension of the so-called distance-based discriminant rule.
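To make the idea of a distance-based multi-sample test concrete, the following is a minimal sketch and not necessarily the statistic studied in the chapter. It assumes a Gower-type dissimilarity for mixed records and uses the within-group sum of squared distances as the test statistic, with a permutation test for the p-value. All function names and the toy data are hypothetical.

```python
import random


def gower_distance(a, b, ranges):
    """Gower-type dissimilarity between two mixed-type records.

    ranges[k] is the range of numeric field k, or None when field k is
    categorical (mismatch counts as 1, match as 0).
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            total += 0.0 if x == y else 1.0
        else:
            total += abs(x - y) / r
    return total / len(ranges)


def within_group_variability(dist, labels):
    """Sum over groups g of (1 / n_g) * sum_{i<j in g} d_ij ** 2."""
    groups = {}
    for idx, g in enumerate(labels):
        groups.setdefault(g, []).append(idx)
    total = 0.0
    for members in groups.values():
        s = sum(dist[i][j] ** 2
                for pos, i in enumerate(members)
                for j in members[pos + 1:])
        total += s / len(members)
    return total


def permutation_test(records, labels, ranges, n_perm=999, seed=0):
    """Permutation p-value: small within-group variability suggests
    that the groups differ. Only the distance matrix is needed."""
    n = len(records)
    dist = [[gower_distance(records[i], records[j], ranges)
             for j in range(n)] for i in range(n)]
    observed = within_group_variability(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count relabellings at least as extreme as the observed one.
        if within_group_variability(dist, shuffled) <= observed + 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (invented for illustration): one numeric, one categorical field.
records = [(1.0, "x"), (1.2, "x"), (0.9, "x"), (1.1, "x"),
           (3.0, "y"), (3.2, "y"), (2.9, "y"), (3.1, "x")]
labels = ["A"] * 4 + ["B"] * 4
ranges = [3.2 - 0.9, None]

observed, p_value = permutation_test(records, labels, ranges)
print(f"within-group variability = {observed:.4f}, p = {p_value:.3f}")
```

Because the total sum of squared distances does not change under relabelling, permuting the within-group term alone is equivalent to permuting a between/within ratio. Any other dissimilarity could be substituted for `gower_distance` without changing the rest of the sketch.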

Keywords and phrases

Statistical distances; multivariate association; discriminant analysis; MANOVA; ANOQE; permutation test; large data sets

Copyright information

© Birkhäuser Boston 2008

Authors and Affiliations

  • Carles M. Cuadras, Department of Statistics, University of Barcelona, Spain
