Abstract
The aim of nearly all empirical studies in economics is to provide scientific evidence that can be used to assess policy-relevant cause-and-effect relationships. In the context of the general potential outcomes framework, we review how a causal effect parameter can be rigorously but tractably specified, identified and estimated, along with its asymptotic standard error (ASE). For cases in which the analytic and computational requirements for calculating the ASE are challenging, we suggest the use of numerical derivatives (ND). We detail the specific type of ND software required for this purpose, and note that it is offered as a feature in most statistical packages. As an illustration, we analyze the causal effect of wife's high school graduation on family size using the Stata/Mata deriv command. Code for this example is supplied in an appendix.
Notes
Conway and Maxwell (1962).
X is the global symbol replacing the phrase “presumably causal entity of interest” and Y is the global symbol replacing the phrase “outcome of interest.”
Terza (2020) reviews the GPOF.
Here we use the prefixes “pre-” and “post-” in reference to the change in the X to be hypothetically mandated as part of the relevant counterfactual query.
The DGP is the joint distribution from which the observable data can be sampled.
An entity (e.g., a parameter) is identified if its value would be implied by hypothetical knowledge of relevant aspects of the DGP.
Here we assume that for the pre- scenario of the relevant counterfactual we are setting the education level for individual ω in the population to be equal to the individual’s observable level at the time of the survey – thus, one may write \({\text{X}}^{\text{c}}\left({\upomega}\right)={\text{X}}\left({\upomega}\right)\). We re-emphasize here, however, that \({\text{X}}^{\text{c}}\left({\upomega}\right)\) is not a random variable.
Methodological works related to the CMP include Conway and Maxwell (1962), Shmueli et al. (2005), Sellers and Shmueli (2010), Daley and Gaunt (2016), Huang (2017), Forthmann et al. (2020), and Sellers (2023). Applications of the CMP include Ghorbani et al. (2023), Shirani-Bidabadi et al. (2020), Yan et al. (2019), Lord et al. (2008), Bogomolovas et al. (2023), Wimmer et al. (1994), Borle et al. (2007), and Fraser (2020).
We coded this estimator in Stata/Mata 18® (StataCorp, 2023), relying mainly on the Mata moptimize function.
The full Stata/Mata 18® code is given in the appendix.
The asymptotic t-statistic was calculated as (AIE estimate)/ASE, where ASE is the relevant version of the square root of (15) for the CMP.
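To make the ratio in the preceding note concrete, the following is a minimal Python sketch of how a numerical derivative can stand in for an analytic gradient when computing a standard error and the resulting t-statistic. It is not the author's Stata/Mata code and does not reproduce formula (15), which may include terms omitted here; it applies only the plain delta method. The exponential mean function, the control values `w_values`, the estimates `theta_hat`, and the covariance matrix `vcov` are all hypothetical and chosen purely for illustration.

```python
import math


def central_gradient(f, theta, h=1e-5):
    """Central-difference approximation to the gradient of scalar f at theta."""
    grad = []
    for j in range(len(theta)):
        step = h * max(1.0, abs(theta[j]))  # scale the step to the parameter
        up = list(theta)
        down = list(theta)
        up[j] += step
        down[j] -= step
        grad.append((f(up) - f(down)) / (2.0 * step))
    return grad


def delta_method_se(f, theta_hat, vcov):
    """Plain delta-method standard error: sqrt(g' V g), with g obtained numerically."""
    g = central_gradient(f, theta_hat)
    k = len(g)
    variance = sum(g[i] * vcov[i][j] * g[j] for i in range(k) for j in range(k))
    return math.sqrt(variance)


# Hypothetical inputs (assumptions, not taken from the paper).
w_values = [0.0, 1.0, 2.0]           # one control variable, three observations
theta_hat = [-0.3, 0.2]              # [coefficient on binary X, coefficient on w]
vcov = [[0.04, 0.0], [0.0, 0.01]]    # assumed covariance matrix of theta_hat


def effect(theta):
    """Sample-average change in an exponential mean when X goes from 0 to 1."""
    b_x, b_w = theta
    diffs = [math.exp(b_x + b_w * w) - math.exp(b_w * w) for w in w_values]
    return sum(diffs) / len(diffs)


estimate = effect(theta_hat)
ase = delta_method_se(effect, theta_hat, vcov)
t_stat = estimate / ase  # same ratio as in the note: estimate divided by its ASE
```

The step size is scaled to each parameter's magnitude, a common heuristic; dedicated ND routines in statistical packages typically choose it adaptively.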
References
Aptech Systems (2023). GAUSS Command Reference: Mathematical Functions—Differentiation and Integration.
Basu, A., & Rathouz, P. J. (2005). Estimating marginal and incremental effects on health outcomes using flexible link and variance function models. Biostatistics, 6, 93–109.
Bogomolovas, J., Zhang, Z., Wu, T., & Chen, J. (2023). Automated quantification and statistical assessment of proliferating cardiomyocyte rates in embryonic hearts. American Journal of Physiology. Heart and Circulatory Physiology, 324(3), H288–H292. https://doi.org/10.1152/ajpheart.00483.2022
Borle, S., Dholakia, U. M., Singh, S. S., & Westbrook, R. A. (2007). The impact of survey participation on subsequent customer behavior: An empirical investigation. Marketing Science, 26(5), 711–726. https://doi.org/10.1287/mksc.1070.0268
Conway, R. W., & Maxwell, W. L. (1962). A queuing model with state dependent service rates. Journal of Industrial Engineering, 12, 132–136.
Daley, F., & Gaunt, R. E. (2016). The Conway-Maxwell-Poisson distribution: distributional theory and approximation. Latin American Journal of Probability and Mathematical Statistics, 13(2), 635–658.
D’Errico, J. (2023). Adaptive Robust Numerical Differentiation. MATLAB Central File Exchange. https://www.mathworks.com/matlabcentral/fileexchange/13490-adaptive-robust-numerical-differentiation
Dowd, B. E., Greene, W. H., & Norton, E. C. (2014). Computation of Standard Errors. Health Services Research, 49, 731–750.
Forthmann, B., Gühne, D., & Doebler, P. (2020). Revisiting dispersion in count data item response theory models: The Conway–Maxwell–Poisson counts model. British Journal of Mathematical and Statistical Psychology, 73(S1), 32–50.
Fraser, T. (2020). Japan’s resilient, renewable cities: How socioeconomics and local policy drive Japan’s renewable energy transition. Environmental Politics, 29(3), 500–523. https://doi.org/10.1080/09644016.2019.1589037
Ghorbani, M., Saffarzadeh, M., & Naderan, A. (2023). Crash prediction modeling for horizontal curves on two-lane, two-way rural highways based on consistency and self-explaining characteristics using zero-truncated data. KSCE Journal of Civil Engineering, 27(8), 3567–3580. https://doi.org/10.1007/s12205-023-0501-6
Gilbert, P., & Varadhan, R. (2022). Package ‘numDeriv’. Comprehensive R Archive Network.
Huang, A. (2017). Mean-parametrized Conway–Maxwell–Poisson regression models for dispersed counts. Statistical Modelling, 17(6), 359–380.
Lord, D., Guikema, S. D., & Geedipally, S. R. (2008). Application of the Conway–Maxwell–Poisson generalized linear model for analyzing motor vehicle crashes. Accident Analysis & Prevention, 40(3), 1123–1134.
Newey, W. K., & McFadden, D. L. (1994). Large sample estimation and hypothesis testing. In R. F. Engle & D. L. McFadden (Eds.), Handbook of Econometrics (pp. 2111–2245). Elsevier Science B.V.
SAS Institute (2023). SAS/IML User’s Guide, p. 856.
Sellers, K. F., & Shmueli, G. (2010). A flexible regression model for count data. The Annals of Applied Statistics, 4(2), 943–961.
Sellers, K. F. (2023). The Conway–Maxwell–Poisson Distribution. Cambridge University Press.
Shmueli, G., Minka, T. P., Kadane, J. B., Borle, S., & Boatwright, P. (2005). A useful distribution for fitting discrete data: Revival of the Conway–Maxwell–Poisson distribution. Journal of the Royal Statistical Society: Series C (Applied Statistics), 54(1), 127–142.
Shirani-Bidabadi, N., Mallipaddi, N., Haleem, K., & Anderson, M. (2020). Developing bicycle-vehicle crash-specific safety performance functions in Alabama using different techniques. Accident Analysis and Prevention, 146, 105735. https://doi.org/10.1016/j.aap.2020.105735
StataCorp (2023). Stata: Release 18. Statistical Software, StataCorp LLC.
Terza, J. (2016a). Inference using sample means of parametric nonlinear data transformations. Health Services Research, 51, 1109–1113.
Terza, J. (2016b). Supplementary appendix to ‘inference using sample means of parametric nonlinear data transformations.’ Health Services Research. https://doi.org/10.1111/1475-6773.12494
Terza, J. (2017). Causal effect estimation and inference using Stata. The Stata Journal, 17, 939–961.
Terza, J. (2020). Regression-based causal analysis from the potential outcomes perspective. Journal of Econometric Methods, 9(1), 20180030. https://doi.org/10.1515/jem-2018-0030
Wang, W., & Famoye, F. (1997). Modeling household fertility decisions with generalized Poisson regression. Journal of Population Economics, 10, 273–283.
White, H. (1994). Estimation. Cambridge University Press.
Wimmer, G., Köhler, R., Grotjahn, R., & Altmann, G. (1994). Towards a theory of word length distribution. Journal of Quantitative Linguistics, 1(1), 98–106. https://doi.org/10.1080/09296179408590003
Wooldridge, J. M. (2010). Econometric analysis of cross section and panel data (2nd ed.). MIT Press.
Wooldridge, J. M. (2011). Solutions manual and supplementary materials for econometric analysis of cross section and panel data (2nd ed.). MIT Press.
Yan, X. C., Wang, T., Chen, J., Ye, X. F., Yang, Z., & Bai, H. (2019). Analysis of the characteristics and number of bicycle-passenger conflicts at bus stops for improving safety. Sustainability. https://doi.org/10.3390/su11195263
Funding
This work was supported by the National Institute on Alcohol Abuse and Alcoholism under subcontract to the Alcohol Research Group (Center Grant #P50 AA05595-41).
Ethics declarations
Conflict of interest
The author has no relevant financial or non-financial interests to disclose.
Appendix: Full Stata/Mata 18® Code
About this article
Cite this article
Terza, J.V. Standard Errors for Regression-Based Causal Effect Estimates in Economics Using Numerical Derivatives. Comput Econ (2024). https://doi.org/10.1007/s10614-024-10565-w
DOI: https://doi.org/10.1007/s10614-024-10565-w