Abstract
Classification between two populations on the basis of both continuous and binary variables is handled by splitting the problem into separate locations. Given the location specified by the values of the binary variables, discrimination is performed using the continuous variables. The location probability model, with conditional dispersion matrices that are homoscedastic across locations, is adopted for this problem. In this paper, we consider the presence of continuous covariates with heterogeneous location-conditional dispersion matrices. The continuous covariates have equal location-specific means in both populations. Conditional homoscedasticity fails when there is strong interaction between the continuous and binary variables. A plug-in covariance-adjusted rule is constructed and its asymptotic distribution is derived. An asymptotic expansion for the overall error rate is given. The result is extended to include binary covariates.
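The location-splitting idea can be illustrated with a minimal sketch. It is not the plug-in covariance-adjusted rule of the paper: it assumes the parameters are known, omits the covariate adjustment entirely, and simply applies the standard linear discriminant within each location using a location-specific dispersion matrix. The function names (`location_index`, `location_score`, `classify`) and the layout of `params` are hypothetical, chosen only for illustration.

```python
import numpy as np

def location_index(x_binary):
    """Map a binary vector (x_1, ..., x_q) to a location number in 0 .. 2**q - 1."""
    return sum(int(b) << j for j, b in enumerate(x_binary))

def location_score(y, mu1, mu2, sigma, p1, p2):
    """Linear discriminant score within one location.

    mu1, mu2 : means of the continuous variables in populations 1 and 2
    sigma    : dispersion matrix for this location (may differ across locations)
    p1, p2   : probabilities of this location in populations 1 and 2
    A score >= 0 favours population 1.
    """
    y, mu1, mu2 = (np.asarray(a, dtype=float) for a in (y, mu1, mu2))
    w = np.linalg.solve(np.asarray(sigma, dtype=float), mu1 - mu2)
    return float(w @ (y - 0.5 * (mu1 + mu2)) + np.log(p1 / p2))

def classify(x_binary, y, params):
    """Pick the location from the binary variables, then discriminate on y."""
    m = location_index(x_binary)
    return 1 if location_score(y, **params[m]) >= 0 else 2

# Illustrative (made-up) parameters for one binary variable, i.e. two locations.
params = {
    0: dict(mu1=[1.0, 0.0], mu2=[-1.0, 0.0], sigma=np.eye(2), p1=0.5, p2=0.5),
    1: dict(mu1=[0.0, 2.0], mu2=[0.0, -2.0], sigma=2 * np.eye(2), p1=0.5, p2=0.5),
}
label_a = classify([0], [0.5, 0.0], params)
label_b = classify([1], [0.5, -1.0], params)
```

In practice the means, dispersion matrices, and location probabilities would be replaced by sample estimates, which is what makes the resulting rule a plug-in rule.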
Cite this article
Leung, CY. The Covariance Adjusted Location Linear Discriminant Function for Classifying Data with Unequal Dispersion Matrices in Different Locations. Annals of the Institute of Statistical Mathematics 50, 417–431 (1998). https://doi.org/10.1023/A:1003517226502
DOI: https://doi.org/10.1023/A:1003517226502