Abstract
One fundamental concept in regression is dimension reduction, the basic idea being to reduce the dimension of the predictor space without loss of information on the regression. To avoid the curse of dimensionality, many methods in this field restrict attention to inverse reduction in the framework of inverse regression. This review focuses on model-based inverse regression. First, we consider sufficient reduction for multivariate count data in different contexts, on the basis of the multinomial distribution and its generalizations. Second, we take a different perspective on model-based inverse reduction. Sufficient reduction is achieved in the dual sample-based space, rather than in the primal predictor-based space. The results extend the known duality between principal component analysis and principal coordinate analysis. Finally, we consider an application of inverse modeling to testing the independence between the microbiome composition and a continuous outcome. An adaptive test is presented based on a dynamic slicing technique.
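As a point of orientation for the duality mentioned above, the following is a minimal numerical sketch of the classical, unsupervised case only: principal component scores computed in the predictor space coincide, up to column signs, with principal coordinates computed in the sample space from Euclidean distances. It is not the supervised reduction developed in the chapter. The data are synthetic, and all variable names (`X`, `pca_scores`, `pcoa_coords`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 3
X = rng.normal(size=(n, p))            # synthetic data: n samples, p predictors
Xc = X - X.mean(axis=0)                # column-centered predictors

# Primal view: PCA from the p x p cross-product matrix Xc' Xc
evals_p, V = np.linalg.eigh(Xc.T @ Xc)
order = np.argsort(evals_p)[::-1]      # sort eigenvalues in decreasing order
evals_p, V = evals_p[order], V[:, order]
pca_scores = Xc @ V                    # n x p matrix of principal component scores

# Dual view: principal coordinate analysis from squared Euclidean distances
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
J = np.eye(n) - np.ones((n, n)) / n    # centering matrix
B = -0.5 * J @ D2 @ J                  # double centering recovers Xc Xc'
evals_d, U = np.linalg.eigh(B)
order = np.argsort(evals_d)[::-1][:p]  # keep the p leading (nonzero) eigenvalues
evals_d, U = evals_d[order], U[:, order]
pcoa_coords = U * np.sqrt(evals_d)     # n x p matrix of principal coordinates

# The two configurations agree up to a sign flip in each column
same_up_to_sign = np.allclose(np.abs(pca_scores), np.abs(pcoa_coords), atol=1e-8)
print(same_up_to_sign)
```

The primal computation works with a p x p matrix and the dual one with an n x n matrix, which is why the sample-based view can be attractive when the number of predictors far exceeds the number of samples.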
Tao Wang was supported in part by National Natural Science Foundation of China (11971017), National Key R&D Program of China (2018YFC0910500), Shanghai Municipal Science and Technology Major Project (2017SHZDZX01), SJTU Trans-med Awards Research Young Faculty Grant (YG2019QNA26, YG2019QNA37), and Neil Shen’s SJTU Medical Research Fund. Lixing Zhu was supported by a grant from the University Grants Council of Hong Kong.
Copyright information
© 2021 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Wang, T., Zhu, L. (2021). Model-Based Inverse Regression and Its Applications. In: Bura, E., Li, B. (eds) Festschrift in Honor of R. Dennis Cook. Springer, Cham. https://doi.org/10.1007/978-3-030-69009-0_6
DOI: https://doi.org/10.1007/978-3-030-69009-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69008-3
Online ISBN: 978-3-030-69009-0
eBook Packages: Mathematics and Statistics (R0)