Recent advances in statistical methodologies in evaluating program for high-dimensional data

Zhan, Ming-feng; Cai, Zong-wu; Fang, Ying; Lin, Ming

doi:10.1007/s11766-022-4489-3

Open access
Published: 17 March 2022

Volume 37, pages 131–146 (2022)

Applied Mathematics-A Journal of Chinese Universities

Abstract

The era of big data brings both opportunities and challenges for developing new statistical methods and models to evaluate social programs, economic policies, and other interventions. This paper provides a comprehensive review of some recent advances in statistical methodologies and models for evaluating programs with high-dimensional data. In particular, four kinds of methods for making valid statistical inferences about treatment effects in high dimensions are addressed. The first is the so-called doubly robust type of estimation, which models the outcome regression and propensity score functions simultaneously. The second is the covariate balance method for constructing treatment effect estimators. The third is the sufficient dimension reduction approach to causal inference. The last uses machine learning procedures, directly or indirectly, to make statistical inferences about treatment effects. Some of these methods and models are closely related to the de-biased Lasso type methods for high-dimensional regression models in the statistical literature. Finally, some future research topics are discussed.
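
To make the first of these ideas concrete, the following is a minimal sketch of a doubly robust (augmented inverse probability weighting) estimate of the average treatment effect on simulated data. It is an illustration only, not the authors' procedure: the low-dimensional setting, the plain least-squares and logistic fits, the simulated data-generating process, the clipping bounds, and the helper names `fit_logistic` and `aipw_ate` are all assumptions made for this example. In a high-dimensional setting, the two nuisance fits would typically be replaced by regularized estimators such as the Lasso.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25):
    """Logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)                       # score of the log-likelihood
        hess = (X * (p * (1.0 - p))[:, None]).T @ X  # observed information
        beta += np.linalg.solve(hess, grad)
    return beta

def aipw_ate(y, t, X):
    """Doubly robust (AIPW) estimate of the average treatment effect.

    Combines an outcome regression for each arm with a propensity score
    model; returns the point estimate and a plug-in standard error.
    """
    Xc = np.column_stack([np.ones(len(y)), X])
    # Propensity score model, clipped away from 0 and 1 (assumed bounds).
    e = 1.0 / (1.0 + np.exp(-Xc @ fit_logistic(Xc, t)))
    e = np.clip(e, 0.01, 0.99)
    # Outcome regressions fitted separately on treated and control units.
    b1 = np.linalg.lstsq(Xc[t == 1], y[t == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(Xc[t == 0], y[t == 0], rcond=None)[0]
    m1, m0 = Xc @ b1, Xc @ b0
    # Regression contrast plus inverse-probability-weighted residual corrections.
    psi = m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1.0 - e)
    return psi.mean(), psi.std(ddof=1) / np.sqrt(len(y))

# Simulated data (hypothetical): the true average treatment effect is 2.0.
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))
e_true = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))
t = rng.binomial(1, e_true)
y = 2.0 * t + X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)

ate, se = aipw_ate(y, t, X)
print(f"AIPW estimate: {ate:.3f} (s.e. {se:.3f})")
```

The estimator remains consistent if either the outcome regressions or the propensity score model is correctly specified, which is the sense in which it is doubly robust.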

Acknowledgement

The authors thank the editor and two anonymous referees for their helpful and constructive comments, which have improved the presentation of this article.

Funding

Supported by the National Natural Science Foundation of China (71631004, 72033008), the National Science Foundation for Distinguished Young Scholars (71625001), and the Science Foundation of the Ministry of Education of China (19YJA910003).

Author information

Authors and Affiliations

Wang Yanan Institute for Studies in Economics and Fujian Key Laboratory of Statistical Sciences, Xiamen University, Xiamen, 361005, China
Ming-feng Zhan, Ying Fang & Ming Lin
Department of Economics, University of Kansas, Lawrence, KS, 66045, USA
Zong-wu Cai
Department of Statistics and Data Science, Xiamen University, Xiamen, 361005, China
Ying Fang & Ming Lin

Corresponding author

Correspondence to Ming Lin.

Rights and permissions

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhan, Mf., Cai, Zw., Fang, Y. et al. Recent advances in statistical methodologies in evaluating program for high-dimensional data. Appl. Math. J. Chin. Univ. 37, 131–146 (2022). https://doi.org/10.1007/s11766-022-4489-3

Received: 21 June 2021
Revised: 20 August 2021
Published: 17 March 2022
Issue Date: March 2022
DOI: https://doi.org/10.1007/s11766-022-4489-3

MR Subject Classification

Keywords