Noncentralities Induced in Regression Diagnostics
Anomalies persist in the use of deletion diagnostics in regression. Tests for outliers under subset deletions utilize the R-Fisher F_I statistics, each having a noncentral F-distribution with noncentrality parameter λ that depends only on shifts at the deleted rows in the index set I. Numerous studies examine empirical outcomes of these diagnostics in random experiments. In contrast, studies here are probabilistic, examining the distributions behind those empirical outcomes and tracking the effects of shifts at nondeleted rows. By allowing shifts at nondeleted rows in a set J, in addition to traditional shifts at deleted rows in I, F_I is shown to have a doubly noncentral F-distribution. By removing the unnecessary restriction that shifts occur only at deleted rows, these findings support constructs akin to power curves in tracking probabilities of masking or swamping as shifts evolve. In addition, “regression effects” among outliers may have unforeseen consequences. A dichotomy of shifts is discovered as projections into the “regressor” and “error” spaces of a model. Hidden shifts at nondeleted rows can obfuscate not only the meanings ascribed to traditional outlier diagnostics, but also those ascribed to subset influence diagnostics corresponding one-to-one with F_I. In short, despite wide usage abetted by software support, deletion diagnostics in current vogue can no longer be recommended to achieve the objectives traditionally cited. Case studies illustrate the debilitating effects of these anomalies in practice, together with conclusions misleading to prospective users.
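The step from a singly to a doubly noncentral F-distribution can be illustrated numerically. The sketch below is not taken from the paper: it assumes the standard mean-shift outlier model y = Xβ + δ + ε with independent N(0, σ²) errors, treats the subset-deletion F statistic as the usual F test for indicator columns at the deleted rows, and uses the convention λ = δ'Pδ/σ² for noncentrality. The function name `noncentralities`, the simulated design, and the shift sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def noncentralities(X, I, delta, sigma=1.0):
    """Numerator and denominator noncentralities of the subset-deletion
    F statistic under an assumed mean-shift model y = X b + delta + e."""
    n, p = X.shape
    # Residual projector M = I_n - X (X'X)^{-1} X', built from a thin QR.
    Q, _ = np.linalg.qr(X)
    M = np.eye(n) - Q @ Q.T
    # Indicator columns, one for each deleted row in I.
    D = np.eye(n)[:, I]
    MD = M @ D
    # Projector onto the part of span(D) orthogonal to the columns of X.
    P1 = MD @ np.linalg.solve(D.T @ MD, MD.T)
    lam_num = float(delta @ P1 @ delta) / sigma**2        # numerator
    lam_den = float(delta @ (M - P1) @ delta) / sigma**2  # denominator
    return lam_num, lam_den

# Hypothetical design: intercept plus one simulated regressor, n = 20.
rng = np.random.default_rng(0)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n)])

I = [0, 1]   # deleted rows
J = [5, 6]   # nondeleted rows carrying hidden shifts

delta_I = np.zeros(n)
delta_I[I] = 3.0            # shifts at deleted rows only
delta_IJ = delta_I.copy()
delta_IJ[J] = 3.0           # additional shifts at nondeleted rows

lam_I = noncentralities(X, I, delta_I)
lam_IJ = noncentralities(X, I, delta_IJ)
print("shifts in I only :", lam_I)
print("shifts in I and J:", lam_IJ)
```

Under these assumptions, shifts confined to I leave the denominator noncentrality at zero up to rounding, which is the classical singly noncentral case. Adding shifts at J makes the denominator noncentrality positive and also alters the numerator noncentrality, since only the part of the shift vector lying outside the column space of X contributes to either quantity.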
Keywords: Subset leverages; Coleverages; Vector outliers; Regression diagnostics
AMS Subject Classification: 62J05; 62J20