Abstract
Multivariate kernel density estimation is an important technique in exploratory data analysis. Its utility relies on its ease of interpretation, especially by graphical means. The crucial factor that determines the performance of kernel density estimation is the selection of the bandwidth matrix. Research into optimal bandwidth matrices began with restricted parameterizations of the bandwidth matrix that mimic univariate selectors. These restrictions were progressively relaxed to develop more flexible selectors. In this paper, we propose the first plug-in bandwidth selector with unconstrained parameterizations of both the final and pilot selectors. Until now, the development of unconstrained pilot selectors was hindered by the traditional vectorization of higher-order derivatives, which leads to increasingly intractable matrix algebraic expressions. We resolve this by introducing an alternative vectorization that gives elegant and tractable expressions. This allows us to quantify the asymptotic and finite-sample properties of unconstrained pilot selectors. For target densities with intricate structure (such as multimodality), our unconstrained selectors show the most improvement over the existing plug-in selectors.
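To make the role of the bandwidth matrix concrete, the following is a minimal sketch of a bivariate Gaussian kernel density estimate with a full (unconstrained) 2×2 bandwidth matrix H. It is an illustration only: the function name `gaussian_kde_2d` and the example values of H are assumptions chosen for this sketch, and H is fixed by hand rather than chosen by the plug-in selector proposed in the paper.

```python
import math

def gaussian_kde_2d(x, data, H):
    """Evaluate a bivariate Gaussian kernel density estimate at point x.

    x    : pair (x1, x2) at which to evaluate the estimate
    data : list of observed pairs (X_i1, X_i2)
    H    : symmetric positive-definite 2x2 bandwidth matrix
           [[h11, h12], [h12, h22]]; a nonzero h12 is what an
           unconstrained parameterization permits.
    (Hypothetical helper for illustration; not the authors' code.)
    """
    (h11, h12), (h21, h22) = H
    if abs(h12 - h21) > 1e-12:
        raise ValueError("H must be symmetric")
    det = h11 * h22 - h12 * h12
    if h11 <= 0 or det <= 0:
        raise ValueError("H must be positive definite")

    # Entries of the inverse of the 2x2 matrix H.
    i11, i12, i22 = h22 / det, -h12 / det, h11 / det
    # Normalizing constant of the N(0, H) density in two dimensions.
    norm = 1.0 / (2.0 * math.pi * math.sqrt(det))

    total = 0.0
    for a, b in data:
        d1, d2 = x[0] - a, x[1] - b
        # Quadratic form (x - X_i)^T H^{-1} (x - X_i).
        q = i11 * d1 * d1 + 2.0 * i12 * d1 * d2 + i22 * d2 * d2
        total += norm * math.exp(-0.5 * q)
    return total / len(data)


# Illustrative bandwidth matrices (assumed values, not data-driven).
H_diagonal = [[1.0, 0.0], [0.0, 1.0]]   # constrained: no off-diagonal term
H_full = [[1.0, 0.9], [0.9, 1.0]]       # unconstrained: oriented along y = x

sample = [(0.0, 0.0)]
on_axis = gaussian_kde_2d((1.0, 1.0), sample, H_full)
off_axis = gaussian_kde_2d((1.0, -1.0), sample, H_full)
```

With `H_full`, the kernel mass is concentrated along the line y = x, so `on_axis` exceeds `off_axis`; with `H_diagonal` the two evaluation points would receive the same value. This orientation of the kernels is the extra flexibility an unconstrained bandwidth matrix provides.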
Chacón, J.E., Duong, T. Multivariate plug-in bandwidth selection with unconstrained pilot bandwidth matrices. TEST 19, 375–398 (2010). https://doi.org/10.1007/s11749-009-0168-4
DOI: https://doi.org/10.1007/s11749-009-0168-4
Keywords
- Asymptotic MISE
- Multivariate kernel density estimation
- Plug-in method
- Pre-sphering
- Unconstrained bandwidth selectors