Abstract
In this paper we show that estimated sufficient summary plots can be greatly improved when the dimension reduction estimates are adjusted according to minimization of an objective function. The dimension reduction methods primarily considered are ordinary least squares, sliced inverse regression, sliced average variance estimation and principal Hessian directions. Some consideration is also given to minimum average variance estimation. Simulations support the usefulness of the approach, and three data sets are considered with an emphasis on two- and three-dimensional estimated sufficient summary plots.
References
Adragni KP, Raim A (2014) ldr: an R software package for likelihood-based sufficient dimension reduction. J Stat Softw 61:1–21
Castlehouse H (2008) The biogeochemical controls on arsenic mobilisation in a geogenic arsenic rich soil. PhD Dissertation, University of Sheffield
Chang J, Olive DJ (2007) Resistant dimension reduction. Unpublished manuscript, Culver Stockton College and Southern Illinois University. www.math.siu.edu/olive/ppresdr.pdf
Cook RD (1996) Graphics for regressions with a binary response. J Am Stat Assoc 91:983–992
Cook RD (1998a) Principal Hessian directions revisited. J Am Stat Assoc 93:84–100 with comments by Ker-Chau Li and a rejoinder by the author
Cook RD (1998b) Regression graphics: ideas for studying regressions through graphics. Wiley, New York
Cook RD (2007) Fisher lecture: dimension reduction in regression. Stat Sci 22:1–26
Cook RD, Forzani L (2008) Principal fitted components for dimension reduction in regression. Stat Sci 23:485–501
Cook R, Weisberg S (1991) Comment on “Sliced inverse regression for dimension reduction” by K.-C. Li. J Am Stat Assoc 86:328–332
Enz R (1991) Prices and earnings around the globe. Union Bank of Switzerland, Zurich
Garnham AL, Prendergast LA (2013) A note on least squares sensitivity in single-index model estimation and the benefits of response transformations. Electron J Stat 7:1983–2004
Gather U, Hilker T, Becker C (2001) A robustified version of sliced inverse regression. In: Statistics in genetics and in the environmental sciences (Ascona, 1999). Trends Math., Birkhäuser, Basel, pp 147–157
Gather U, Hilker T, Becker C (2002) A note on outlier sensitivity of sliced inverse regression. Statistics 36:271–281
Huber PJ (1964) Robust estimation of a location parameter. Ann Math Stat 35:73–101
Huber PJ (1973) Robust regression: asymptotics, conjectures and Monte Carlo. Ann Stat 1:799–821
Li KC (1991) Sliced inverse regression for dimension reduction. J Am Stat Assoc 86:316–342 with discussion and a rejoinder by the author
Li KC (1992) On principal Hessian directions for data visualization and dimension reduction: another application of Stein’s lemma. J Am Stat Assoc 87:1025–1039
Li KC, Duan N (1989) Regression analysis under link violation. Ann Stat 17:1009–1052
Li L, Li B, Zhu LX (2010) Groupwise dimension reduction. J Am Stat Assoc 105:1188–1201
Lin L (1989) A concordance correlation coefficient to evaluate reproducibility. Biometrics 45:255–268
Liquet B, Saracco J (2012) A graphical tool for selecting the number of slices and the dimension of the model in SIR and SAVE approaches. Comput Stat 27:103–125
Lue HH (2001) A study of sensitivity analysis on the method of principal Hessian directions. Comput Stat 16:109–130
Prendergast LA (2007) Implications of influence function analysis for sliced inverse regression and sliced average variance estimation. Biometrika 94:585–601
Prendergast LA (2008) Trimming influential observations for improved single-index model estimated sufficient summary plots. Comput Stat Data Anal 52:5319–5327
Prendergast LA, Smith JA (2010) Influence functions for dimension reduction methods: an example influence study of principal Hessian direction analysis. Scand J Stat 37(4):588–611
R Core Team (2014) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org
Rousseeuw P, Croux C, Todorov V, Ruckstuhl A, Salibian-Barrera M, Verbeke T, Koller M, Maechler M (2011) robustbase: basic robust statistics. R package version 0.8-0. http://CRAN.R-project.org/package=robustbase
Shaker AJ, Prendergast LA (2011) Iterative application of dimension reduction methods. Electron J Stat 5:1471–1494
Sheather SJ (2009) A modern approach to regression with R. Springer, New York
Tryfos P (1998) Methods for business analysis and forecasting: text & cases. Wiley, New York
Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th ed. Springer, New York. http://www.stats.ox.ac.uk/pub/MASS4
Weisberg S (2002) Dimension reduction regression in R. J Stat Softw 7:1–22
Xia Y, Tong H, Li WK, Zhu LX (2002) An adaptive estimation of dimension reduction space. J R Stat Soc Ser B Stat Methodol 64:363–410
Acknowledgments
The authors are very thankful for the useful comments and suggestions offered by the Editor, Associate Editor and an anonymous referee. The resulting changes have led to a vastly improved manuscript.
Appendix: Computing details
In this section we detail the use of R, which was used to compute all results. The analyses in the previous sections were carried out using the freely available statistical software package R, version 2.13.1 (R Core Team 2014). The tolerance level for exiting the iterative process was set at \(\text {tol}=1\times 10^{-5}\), although a maximum of 100 iterations was also enforced.
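The stopping rule described above can be sketched in R as follows. This is a minimal illustration only: the helper name iterate_until, the update argument and the convergence measure (largest absolute change between successive estimates) are assumptions made for the example and are not taken from the paper.

```r
# Generic stopping rule: stop when successive estimates differ by less than
# `tol`, or after `max_iter` updates, whichever comes first.
# `update` is a hypothetical function mapping the current estimate to the next.
iterate_until <- function(update, init, tol = 1e-5, max_iter = 100) {
  est <- init
  converged <- FALSE
  iter <- 0
  while (iter < max_iter) {
    new_est <- update(est)
    iter <- iter + 1
    # Assumed convergence measure: largest absolute change in any component.
    change <- max(abs(new_est - est))
    est <- new_est
    if (change < tol) {
      converged <- TRUE
      break
    }
  }
  list(estimate = est, iterations = iter, converged = converged)
}

# Toy usage: the fixed-point iteration x <- cos(x) stops on the tolerance,
# while x <- x + 1 never converges and stops on the iteration cap.
fit <- iterate_until(cos, init = 1)
capped <- iterate_until(function(v) v + 1, init = 0)
```

Returning the iteration count and a convergence flag makes it easy to see whether a run stopped on the tolerance or on the iteration cap.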
Initial SIR, SAVE and pHD estimates were obtained using the dr package (Weisberg 2002). For SIR and SAVE we did not specify the slicing parameter \(H\) and instead used the package default of \(H=\max (8,p + 3)\). When robust estimates were obtained in Steps 0.2 and i.2, we used the function rlm from the ‘MASS’ package (Venables and Ripley 2002) with the Huber weight function as shown in (7) and the default choice of \(c=1.345\sigma \), where \(\sigma \) is replaced with a robust estimate of the residual standard deviation (the median absolute deviation estimate). Otherwise ordinary least squares was utilized. For \(\rho (r)=r^2\) (non-robust), non-linear least squares was carried out in Step i.1 using the function nls with the Gauss-Newton algorithm. For the robust equivalent, robust non-linear least squares was employed via the nlrob function from the ‘robustbase’ package (Rousseeuw et al. 2011), again with the Huber weight function. Polynomial transformation of the \(\widehat{\mathbf {{B}}}^\top {\mathbf {x}}_i\)’s is routine in R using the poly function with the optional argument raw set to TRUE, which ensures that the dimension-reduced regressors are transformed into raw polynomial terms rather than orthogonal polynomial terms.
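As a rough illustration of the polynomial transformation and the robust and non-robust linear fits described above, the following sketch uses simulated data. The data, the direction b_hat and the quadratic degree are hypothetical choices for the example, and the rlm call simply relies on the package defaults for the Huber tuning constant and the MAD scale estimate rather than reproducing the exact calls used for the paper.

```r
library(MASS)  # provides rlm(); MASS is a recommended package shipped with R

set.seed(1)
n <- 200
p <- 4
x <- matrix(rnorm(n * p), n, p)

# Hypothetical initial direction estimate (a single index, for simplicity).
b_hat <- c(1, 0.5, 0, 0)
z <- drop(x %*% b_hat)                      # dimension-reduced regressor
y <- 1 + z + 0.5 * z^2 + rnorm(n, sd = 0.2)

# raw = TRUE returns the columns z and z^2 themselves,
# not orthogonal polynomial terms.
P <- poly(z, degree = 2, raw = TRUE)

# Non-robust fit by ordinary least squares.
fit_ols <- lm(y ~ P)

# Robust fit with the Huber psi function (default k = 1.345) and the
# MAD-based scale estimate, both of which are rlm defaults.
fit_rob <- rlm(y ~ P, psi = psi.huber, scale.est = "MAD")

coef_ols <- unname(coef(fit_ols))
coef_rob <- unname(coef(fit_rob))
```

With clean simulated data such as this, the two sets of coefficients should be close; they would be expected to differ more when the response contains outliers.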
With respect to the spline fitting in Sect. 2.3, we carried out the minimization using the nlminb function from the R stats package (available as standard in the R base distribution), whose arguments were the initial estimates based on either OLS, SIR, SAVE or pHD and the objective function to be minimized. A cubic smoothing spline was fitted using the R function smooth.spline.
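A minimal sketch of this kind of minimization, for a single-index model with simulated data, is given below. The objective function rss_spline, the fixed degrees of freedom (df = 6), the normalization of the direction to unit length and the OLS starting value are all assumptions made for illustration; the objective function actually used is the one defined in Sect. 2.3.

```r
set.seed(2)
n <- 300
p <- 3
x <- matrix(rnorm(n * p), n, p)
b_true <- c(1, 1, 0) / sqrt(2)
y <- sin(drop(x %*% b_true)) + rnorm(n, sd = 0.1)

# Hypothetical objective: residual sum of squares from a cubic smoothing
# spline of y on the index x'b. The direction is rescaled to unit length so
# that only its orientation matters; df is fixed (an assumption) so that
# candidate directions are compared at the same amount of smoothing.
rss_spline <- function(b, x, y) {
  b <- b / sqrt(sum(b^2))
  z <- drop(x %*% b)
  fit <- smooth.spline(z, y, df = 6)
  sum((y - predict(fit, z)$y)^2)
}

# Initial estimate from OLS (slope coefficients only).
b_init <- unname(coef(lm(y ~ x))[-1])

# Minimize the objective starting from the initial estimate.
opt <- nlminb(start = b_init, objective = rss_spline, x = x, y = y)
b_opt <- opt$par / sqrt(sum(opt$par^2))
```

Here nlminb relies on numerical derivatives because no gradient is supplied, so the result can be sensitive to the smoothing settings; the sketch is meant to show the call pattern rather than a tuned procedure.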
For MAVE, we used functionality provided in support of the work of Li et al. (2010), found at http://www4.stat.ncsu.edu/~li/software/GroupDR.R. This functionality uses SIR to initialize the estimates, and we used the default tolerance and kernel settings. In the final example, where the PFC method was employed, we used the R package ldr (Adragni and Raim 2014), which includes functionality for several likelihood-based dimension reduction methods. For this example, we used the same arguments in the call to the PFC functionality as were used in the paper.
Cite this article
Prendergast, L.A., Healey, A.F. Improving estimated sufficient summary plots in dimension reduction using minimization criteria based on initial estimates. Comput Stat 31, 899–922 (2016). https://doi.org/10.1007/s00180-015-0614-6
DOI: https://doi.org/10.1007/s00180-015-0614-6