Abstract
This paper proposes an influence measure for assessing the effect of deleting a single observation in linear regression. The measure is based on geometric considerations of the sampling distribution of the distance between two estimators of the regression coefficients, one computed with and one computed without the observation in question. The covariance matrix of this sampling distribution plays a key role in deriving the measure. Geometrically, the distance is distributed entirely along the axis associated with the nonnull eigenvalue of the covariance matrix. The deviation of the regression coefficients computed without an observation from those computed with the full data is reflected in this eigenvalue, which can therefore be used to investigate influence. When the distance is normalized by the associated covariance matrix, the result turns out to be the square of the internally studentized residual. Illustrative examples show the effectiveness of the proposed measure. In one of these examples, Cook’s distance does not work well for judging the influence of observations on the least squares estimates of the regression coefficients, so Cook’s distance should not be used blindly.
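The geometric claims in the abstract can be checked numerically. The following is a minimal sketch, not the author's implementation: it assumes the standard deletion identity for ordinary least squares, estimates the error variance by the usual residual mean square, and normalizes along the single nonnull eigenvector (equivalent to a generalized inverse of the rank-one covariance matrix). The function name `deletion_geometry` and its return values are hypothetical, introduced only for illustration.

```python
import numpy as np

def deletion_geometry(X, y):
    """Illustrative sketch (assumed formulas, not the paper's code).

    For each observation i, returns:
      diffs[i]     : beta_hat - beta_hat_(i), the change caused by deleting i
      eigvals[i]   : nonnull eigenvalue of the estimated covariance of diffs[i]
      alignment[i] : |cos| of the angle between diffs[i] and the nonnull axis
      norm_dist[i] : diffs[i] normalized by that covariance
      r2[i]        : squared internally studentized residual
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Leverages h_ii = x_i' (X'X)^{-1} x_i
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    s2 = resid @ resid / (n - p)          # assumed estimate of sigma^2

    A = X @ XtX_inv                        # row i is a_i = (X'X)^{-1} x_i
    # Deletion identity: beta_hat - beta_hat_(i) = a_i * e_i / (1 - h_ii)
    diffs = A * (resid / (1.0 - h))[:, None]

    eigvals = np.empty(n)
    alignment = np.empty(n)
    norm_dist = np.empty(n)
    for i in range(n):
        # Estimated covariance of diffs[i]: rank one, s2 * a_i a_i' / (1 - h_ii)
        C = s2 * np.outer(A[i], A[i]) / (1.0 - h[i])
        w, V = np.linalg.eigh(C)           # eigenvalues in ascending order
        lam, v = w[-1], V[:, -1]           # the only nonnull eigenpair
        eigvals[i] = lam
        alignment[i] = abs(v @ diffs[i]) / np.linalg.norm(diffs[i])
        # Normalizing along the nonnull axis only
        norm_dist[i] = (v @ diffs[i]) ** 2 / lam

    r2 = resid ** 2 / (s2 * (1.0 - h))
    return diffs, eigvals, alignment, norm_dist, r2
```

On simulated data, `alignment` should be numerically equal to one for every observation (the difference lies entirely on the nonnull axis), and `norm_dist` should agree with `r2` up to floating-point error.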
Cite this article
Kim, M.G. Influence measure based on probabilistic behavior of regression estimators. Comput Stat 30, 97–105 (2015). https://doi.org/10.1007/s00180-014-0524-z
DOI: https://doi.org/10.1007/s00180-014-0524-z