Standard Errors for Regression-Based Causal Effect Estimates in Economics Using Numerical Derivatives

Computational Economics

Abstract

The aim of nearly all empirical studies in economics is to provide scientific evidence that can be used to assess policy-relevant cause and effect. In the context of the general potential outcomes framework, we review how a causal effect parameter can be rigorously but tractably specified, identified and estimated along with its asymptotic standard error (ASE). For cases in which the analytic and computational requirements for calculating the ASE are challenging, we suggest the use of numerical derivatives (ND). We detail the specific type of ND software required for this purpose, and note that it is offered as a feature in most statistical packages. As an illustration, we analyze the causal effect of a wife's high school graduation on family size using the Stata/Mata deriv command. Code for this example is supplied in an appendix.
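As a rough illustration of the idea summarized in the abstract, the following Python sketch computes a delta-method standard error for an average effect, using a central-difference numerical gradient in place of an analytic one. It is not the article's Stata/Mata code. The exponential conditional mean, the data values, the parameter estimates and the covariance matrix are all hypothetical and chosen only for illustration, and every function name is invented here. The sketch covers only the variance component due to parameter estimation; any further term arising from averaging over the sample is left out.

```python
import math


def numerical_gradient(func, theta, rel_step=1e-6):
    """Central-difference gradient of a scalar function evaluated at theta."""
    grad = []
    for j, value in enumerate(theta):
        h = rel_step * max(abs(value), 1.0)  # step scaled to the parameter
        up = list(theta)
        down = list(theta)
        up[j] = value + h
        down[j] = value - h
        grad.append((func(up) - func(down)) / (2.0 * h))
    return grad


def delta_method_se(func, theta_hat, cov):
    """sqrt(g' V g), where g is the numerical gradient of func at theta_hat."""
    grad = numerical_gradient(func, theta_hat)
    k = len(grad)
    variance = sum(
        grad[a] * cov[a][b] * grad[b] for a in range(k) for b in range(k)
    )
    return math.sqrt(variance)


# Hypothetical control-variable values for four sampled individuals.
W = [0.5, 1.0, 1.5, 2.0]


def average_incremental_effect(theta):
    """Sample average of E[Y | X=1, w] - E[Y | X=0, w] under an assumed
    exponential conditional mean exp(b0 + bx*X + bw*w)."""
    b0, bx, bw = theta
    effects = [
        math.exp(b0 + bx + bw * w) - math.exp(b0 + bw * w) for w in W
    ]
    return sum(effects) / len(effects)


# Hypothetical parameter estimates and their estimated covariance matrix.
THETA_HAT = [0.2, -0.3, 0.1]
COV = [
    [0.010, -0.004, -0.002],
    [-0.004, 0.008, 0.000],
    [-0.002, 0.000, 0.003],
]

aie = average_incremental_effect(THETA_HAT)
ase = delta_method_se(average_incremental_effect, THETA_HAT, COV)
t_stat = aie / ase  # asymptotic t-statistic: estimate divided by its ASE

print("AIE estimate:", aie)
print("ASE (numerical gradient):", ase)
print("asymptotic t-statistic:", t_stat)
```

Only the effect function needs to change when the model changes; the gradient and standard-error routines are reused as they are, which is the practical appeal of a numerical derivative over a hand-derived analytic one.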

Notes

  1. See any graduate econometrics text under “marginal effects” or “average partial effects.” For more detailed discussions see Terza (2016a, b, 2017), Basu and Rathouz (2005) and Dowd et al. (2014).

  2. Derivations using asymptotic first principles are also offered by Basu and Rathouz (2005) and Wooldridge (2011, pp. 184–186).

  3. Conway and Maxwell (1962).

  4. X is the global symbol replacing the phrase “presumably causal entity of interest” and Y is the global symbol replacing the phrase “outcome of interest.”

  5. Terza (2020) reviews the GPOF.

  6. Here we use the prefixes “pre-” and “post-” in reference to the change in the X to be hypothetically mandated as part of the relevant counterfactual query.

  7. The DGP is the joint distribution from which the observable data can be sampled.

  8. An entity (e.g., a parameter) is identified if its value would be implied by hypothetical knowledge of relevant aspects of the DGP.

  9. The primitive conditions under which (4) holds are discussed in detail in Terza (2020).

  10. Examples include: the deriv command in Stata/Mata 18® (StataCorp, 2023); the numDeriv package in R (Gilbert and Varadhan, 2022); the NLPFDD subroutine in SAS/IML (SAS Institute, 2023); the hessp and gradp commands in GAUSS (Aptech Systems, 2023); and the DERIVEST suite in MATLAB (D’Errico, 2023).

  11. Here we assume that for the pre- scenario of the relevant counterfactual we are setting the education level for individual ω in the population to be equal to the individual’s observable level at the time of the survey – thus, one may write \({\text{X}}^{\text{c}}\left({\upomega}\right)={\text{X}}\left({\upomega}\right)\). We re-emphasize here, however, that \({\text{X}}^{\text{c}}\left({\upomega}\right)\) is not a random variable.

  12. Methodological works related to the CMP include: Conway and Maxwell (1962), Shmueli et al. (2005), Sellers and Shmueli (2010), Daley and Gaunt (2016), Huang (2017), Forthmann et al. (2020), and Sellers (2023). Applications of the CMP include: Ghorbani et al. (2023), Shirani-Bidabadi et al. (2020), Yan et al. (2019), Lord et al. (2008), Bogomolovas et al. (2023), Wimmer et al. (1994), Borle et al. (2007), and Fraser (2020).

  13. We coded this estimator in Stata/Mata 18® (StataCorp, 2023), relying mainly on the Mata moptimize function.

  14. The full Stata/Mata 18® code is given in the appendix.

  15. The asymptotic t-statistic was calculated as (AIE estimate)/ASE, where ASE is the relevant version of the square root of (15) for the CMP.

References

  • Aptech Systems (2023). GAUSS Command Reference: Mathematical Functions—Differentiation and Integration.

  • Basu, A., & Rathouz, P. J. (2005). Estimating marginal and incremental effects on health outcomes using flexible link and variance function models. Biostatistics, 6, 93–109.

  • Bogomolovas, J., Zhang, Z., Wu, T., & Chen, J. (2023). Automated quantification and statistical assessment of proliferating cardiomyocyte rates in embryonic hearts. American Journal of Physiology. Heart and Circulatory Physiology, 324(3), H288–H292. https://doi.org/10.1152/ajpheart.00483.2022

  • Borle, S., Dholakia, U. M., Singh, S. S., & Westbrook, R. A. (2007). The impact of survey participation on subsequent customer behavior: An empirical investigation. Marketing Science, 26(5), 711–726. https://doi.org/10.1287/mksc.1070.0268

  • Conway, R. W., & Maxwell, W. L. (1962). A queuing model with state dependent service rates. Journal of Industrial Engineering, 12, 132–136.

  • Daley, F., & Gaunt, R. E. (2016). The Conway-Maxwell-Poisson distribution: distributional theory and approximation. Latin American Journal of Probability and Mathematical Statistics, 13(2), 635–658.

  • D’Errico, J. (2023). Adaptive Robust Numerical Differentiation. MATLAB Central File Exchange. https://www.mathworks.com/matlabcentral/fileexchange/13490-adaptive-robust-numerical-differentiation

  • Dowd, B. E., Greene, W. H., & Norton, E. C. (2014). Computation of Standard Errors. Health Services Research, 49, 731–750.

  • Forthmann, B., Gühne, D., & Doebler, P. (2020). Revisiting dispersion in count data item response theory models: The Conway–Maxwell–Poisson counts model. British Journal of Mathematical and Statistical Psychology, 73(S1), 32–50.

  • Fraser, T. (2020). Japan’s resilient, renewable cities: How socioeconomics and local policy drive Japan’s renewable energy transition. Environmental Politics, 29(3), 500–523. https://doi.org/10.1080/09644016.2019.1589037

  • Ghorbani, M., Saffarzadeh, M., & Naderan, A. (2023). Crash prediction modeling for horizontal curves on two-lane, two-way rural highways based on consistency and self-explaining characteristics using zero-truncated data. KSCE Journal of Civil Engineering, 27(8), 3567–3580. https://doi.org/10.1007/s12205-023-0501-6

  • Gilbert, P., & Varadhan, R. (2022). Package ‘numDeriv’. Comprehensive R Archive Network.

  • Huang, A. (2017). Mean-parametrized Conway–Maxwell–Poisson regression models for dispersed counts. Statistical Modelling, 17(6), 359–380.

  • Lord, D., Guikema, S. D., & Geedipally, S. R. (2008). Application of the Conway–Maxwell–Poisson generalized linear model for analyzing motor vehicle crashes. Accident Analysis & Prevention, 40(3), 1123–1134.

  • Newey, W. K., & McFadden, D. L. (1994). Large sample estimation and hypothesis testing. In R. F. Engle & D. L. McFadden (Eds.), Handbook of Econometrics (pp. 2111–2245). Elsevier Science B.V.

  • SAS Institute (2023). SAS/IML User’s Guide, p. 856.

  • Sellers, K. F., & Shmueli, G. (2010). A flexible regression model for count data. The Annals of Applied Statistics, 4(2), 943–961.

  • Sellers, K. F. (2023). The Conway–Maxwell–Poisson Distribution. Cambridge University Press.

  • Shmueli, G., Minka, T. P., Kadane, J. B., Borle, S., & Boatwright, P. (2005). A useful distribution for fitting discrete data: Revival of the Conway–Maxwell–Poisson distribution. Journal of the Royal Statistical Society: Series C (Applied Statistics), 54(1), 127–142.

  • Shirani-Bidabadi, N., Mallipaddi, N., Haleem, K., & Anderson, M. (2020). Developing bicycle-vehicle crash-specific safety performance functions in Alabama using different techniques. Accident Analysis and Prevention, 146, 105735. https://doi.org/10.1016/j.aap.2020.105735

  • StataCorp (2023). Stata: Release 18. Statistical Software, StataCorp LLC.

  • Terza, J. (2016a). Inference using sample means of parametric nonlinear data transformations. Health Services Research, 51, 1109–1113.

  • Terza, J. (2016b). Supplementary appendix to ‘inference using sample means of parametric nonlinear data transformations.’ Health Services Research. https://doi.org/10.1111/1475-6773.12494

  • Terza, J. (2017). Causal effect estimation and inference using Stata. The Stata Journal, 17, 939–961.

  • Terza, J. (2020). Regression-based causal analysis from the potential outcomes perspective. Journal of Econometric Methods, 9(1), 20180030. https://doi.org/10.1515/jem-2018-0030

  • Wang, W., & Famoye, F. (1997). Modeling household fertility decisions with generalized Poisson regression. Journal of Population Economics, 10, 273–283.

  • White, H. (1994). Estimation. Cambridge University Press.

  • Wimmer, G., Köhler, R., Grotjahn, R., & Altmann, G. (1994). Towards a theory of word length distribution. Journal of Quantitative Linguistics, 1(1), 98–106. https://doi.org/10.1080/09296179408590003

  • Wooldridge, J. M. (2010). Econometric analysis of cross section and panel data (2nd ed.). MIT Press.

  • Wooldridge, J. M. (2011). Solutions manual and supplementary materials for econometric analysis of cross section and panel data (2nd ed.). MIT Press.

  • Yan, X. C., Wang, T., Chen, J., Ye, X. F., Yang, Z., & Bai, H. (2019). Analysis of the characteristics and number of bicycle-passenger conflicts at bus stops for improving safety. Sustainability. https://doi.org/10.3390/su11195263

Funding

This work was supported by the National Institute on Alcohol Abuse and Alcoholism under subcontract to the Alcohol Research Group (Center Grant #P50 AA05595-41).

Author information

Corresponding author

Correspondence to Joseph V. Terza.

Ethics declarations

Conflict of interest

The author has no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Full Stata/Mata 18® Code

[Figures d–k: appendix code listing, provided as images in the original and not reproduced here.]

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Terza, J.V. Standard Errors for Regression-Based Causal Effect Estimates in Economics Using Numerical Derivatives. Comput Econ (2024). https://doi.org/10.1007/s10614-024-10565-w

  • DOI: https://doi.org/10.1007/s10614-024-10565-w
