# Bounding average treatment effects using linear programming

## Abstract

This paper presents a method of calculating sharp bounds on the average treatment effect using linear programming under identifying assumptions commonly used in the literature. The method makes it straightforward to analyze how sensitive the bounds are to the identifying assumptions and to missing data, which is illustrated in two applications. The first application looks at the effect of parents’ schooling on children’s schooling, and the second studies the effect of a mandatory arrest policy on domestic violence recidivism. The results show that even a mild departure from the identifying assumptions may substantially widen the bounds on average treatment effects. Allowing for a small fraction of the data to be missing also has a large impact on the results.
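
To make the linear programming idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a binary treatment and a binary outcome and imposes no identifying assumptions beyond consistency with the observed data, so it should reproduce the familiar worst-case bounds. The function name `ate_bounds` and the example probabilities are hypothetical. The unknowns are the probabilities of the eight cells of the joint distribution of $$(Y(0), Y(1), D)$$; the equality constraints tie them to the observed distribution of $$(Y, D)$$, and the average treatment effect is a linear objective.

```python
import itertools

import numpy as np
from scipy.optimize import linprog


def ate_bounds(p_obs):
    """Worst-case bounds on E[Y(1) - Y(0)] for binary outcome and treatment.

    p_obs maps (y, d) to the observed probability P(Y = y, D = d).
    This is an illustrative sketch, not the paper's code.
    """
    # Each cell is one value of the latent triple (Y(0), Y(1), D).
    cells = list(itertools.product([0, 1], repeat=3))

    # Objective: the ATE is linear in the cell probabilities.
    c = np.array([y1 - y0 for (y0, y1, d) in cells], dtype=float)

    # Data constraints: the observed outcome is Y = Y(D).
    A_eq, b_eq = [], []
    for (y, d), prob in p_obs.items():
        row = []
        for (y0, y1, dd) in cells:
            realized = y1 if dd == 1 else y0
            row.append(1.0 if (dd == d and realized == y) else 0.0)
        A_eq.append(row)
        b_eq.append(prob)

    # Minimize for the lower bound, maximize (minimize -c) for the upper bound.
    low = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
    high = linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
    return low.fun, -high.fun


# Hypothetical observed distribution of (Y, D).
p_example = {(0, 0): 0.3, (1, 0): 0.2, (0, 1): 0.1, (1, 1): 0.4}
lower, upper = ate_bounds(p_example)
print(round(lower, 3), round(upper, 3))  # approximately -0.3 and 0.7
```

Additional identifying assumptions would enter such a program as further linear equality or inequality constraints on the cell probabilities, which is what makes relaxing them by a small amount easy to explore.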

## Notes

1. Formally, the population forms a probability space $$(I,\mathcal {F},P)$$, where the population of individuals $$I$$ is the sample space, $$\mathcal {F}$$ is a set of events and $$P$$ is a probability measure. Hence, the only source of randomness is the choice of individual. The individual’s behavior is deterministic.

2. Available at http://www.ssc.wisc.edu/wlsresearch/.

3. The assumption that the outcome is a deterministic function of the treatment is intrinsic to the potential outcome framework of Rubin (1974).

4. Ordinary least squares regression analysis assumes exogenous treatment selection, $$\forall t_1, t_2: E[Y(t)|D = t_1] = E[Y(t)|D = t_2]$$, under which the average potential outcome is point identified: $$E[Y(t)] = E[Y(t)| D = t]P(D = t) + E[Y(t)| D \ne t]P(D \ne t) = E[Y(t)| D = t]$$.

5. The confidence sets that cover the whole identified set asymptotically are generally larger and may be preferable for a policymaker concerned with robust decisions, as argued in Henry and Onatski (2012).

6. The estimates of the bounds in the first column of this table are the same as in Table 1 in Lafférs (2013b).

7. This analysis was challenged by Antonovics and Goldberger (2005), who claim that the results are driven by a specific data coding. In a reply, Behrman and Rosenzweig (2005) argue that their story is supported by an additional data source.

8. We used a scalar relaxation parameter $$\alpha _{cMTS}$$ for simplicity. An extension to a vector parameter is straightforward.

9. From the dataset, we get an estimate $$\hat{p}_n$$ of the true and unknown $$p$$, where $$n$$ is the sample size.

10. The inequalities $$g(\bar{p},p,\alpha _{MISS}) \le 0$$ ensure that $$p_{MISS} \in \mathcal {P}$$ ($$p_{MISS}$$ is a proper probability distribution) in the definition (MISS) of the set $$\mathcal {P}_{MISS}$$:

    $$
    \begin{aligned}
    \forall i = 1,\dots ,K: \quad g_i(\bar{p},p,\alpha _{MISS})&= \left( (1-\alpha _{MISS})p_i - \bar{p}_i \right) /\alpha _{MISS}, \\
    g_{K+1}(\bar{p},p,\alpha _{MISS})&= \bar{p}_1 + \dots + \bar{p}_K - 1,\\
    g_{K+2}(\bar{p},p,\alpha _{MISS})&= -(\bar{p}_1 + \dots + \bar{p}_K) + 1.
    \end{aligned}
    $$

11. Figure 5 and Table 4 show that a relaxation of $$\alpha _{cMTS}$$ translates into the upper bound one for one.

12. The MDVE was followed by experiments in six other cities (Atlanta, Charlotte, Colorado Springs, Metro-Dade, Omaha and Milwaukee) with different types of treatment assignments. Sherman et al. (1992) compare the results of five of these experiments. The arrest/non-arrest grouping makes the results comparable with those of experiments that used different research designs.

13. Given that $$Z$$ is randomly assigned, conditioning on $$Z$$ gives us the same assumption as the unconditional one:

    $$
    \begin{aligned}
    &E[Y(t)|V=0 ] = E[Y(t) | Z=t, V = 0] P(Z=t|V=0) = E[Y(t) | Z=t, V = 0] P(Z=t) \\
    &E[Y(t)|V=1 ] = E[Y(t) | Z=t, V = 1] P(Z=t|V=1) = E[Y(t) | Z=t, V = 1] P(Z=t) \\
    &E[Y(t)|Z=t, V=0 ] \le E[Y(t)|Z=t, V=1 ] \iff E[Y(t)|V=0 ] \le E[Y(t)|V=1 ].
    \end{aligned}
    $$

14. For more references, see footnote 2 in Lafférs (2013b).

15. Also, the MIV assumption is made conditional on the value of the treatment assigned ($$Z=z$$), which we highlighted by labeling this assumption as cMIV (conditional MIV).
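
The constraints in note 10 can be checked numerically. The sketch below is illustrative only and rests on two assumptions that are not spelled out in the notes: the first $$K$$ constraints are read componentwise (one inequality per cell $$i$$), and the candidate distribution is interpreted as a mixture $$\bar{p} = (1-\alpha _{MISS})p + \alpha _{MISS}\,p_{MISS}$$ of the observed distribution and the distribution of the missing part. The function names `miss_constraints` and `is_admissible` and all numbers are hypothetical.

```python
import numpy as np


def miss_constraints(p_bar, p, alpha_miss):
    """Return the K + 2 values g(p_bar, p, alpha_miss) from note 10.

    Assumption: the first K constraints are applied cell by cell.
    """
    p_bar = np.asarray(p_bar, dtype=float)
    p = np.asarray(p, dtype=float)
    # g_i <= 0 is equivalent to the implied p_MISS having non-negative mass in cell i.
    g_cells = ((1.0 - alpha_miss) * p - p_bar) / alpha_miss
    # g_{K+1} <= 0 and g_{K+2} <= 0 together force p_bar to sum to one.
    total = p_bar.sum()
    return np.concatenate([g_cells, [total - 1.0, 1.0 - total]])


def is_admissible(p_bar, p, alpha_miss, tol=1e-9):
    """True if all constraints hold up to a small numerical tolerance."""
    return bool(np.all(miss_constraints(p_bar, p, alpha_miss) <= tol))


# Hypothetical observed distribution over K = 3 cells, with 10% of data missing.
p = np.array([0.5, 0.3, 0.2])
alpha = 0.1

# A candidate built as a mixture with all missing mass in the first cell.
p_bar_ok = (1 - alpha) * p + alpha * np.array([1.0, 0.0, 0.0])

# A candidate that would require negative missing mass in the first cell.
p_bar_bad = np.array([0.40, 0.40, 0.20])

print(is_admissible(p_bar_ok, p, alpha))   # True
print(is_admissible(p_bar_bad, p, alpha))  # False
```

Because every constraint is linear in $$\bar{p}$$, the same inequalities could be passed directly to a linear programming solver, with $$\alpha _{MISS}$$ varied to trace out how the bounds respond to the assumed share of missing data.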

## References

1. Antonovics KL, Goldberger AS (2005) Does increasing women’s schooling raise the schooling of the next generation? Comment. Am Econ Rev 95:1738–1744

2. Balke A, Pearl J (1994) Counterfactual probabilities: computational methods, bounds, and applications. In: de Mantaras LR, Poole D (eds) Uncertainty in artificial intelligence, vol 1. Morgan Kaufmann, pp 46–54

3. Balke A, Pearl J (1997) Bounds on treatment effects from studies with imperfect compliance. J Am Stat Assoc 92:1172–1176

4. Behrman JR, Rosenzweig MR (2002) Does increasing women’s schooling raise the schooling of the next generation? Am Econ Rev 92:323–334

5. Behrman JR, Rosenzweig MR (2005) Does increasing women’s schooling raise the schooling of the next generation? Reply. Am Econ Rev 95:1745–1751

6. Berk RA, Sherman LW (1988) Police responses to family violence incidents: an analysis of an experimental design with incomplete randomization. J Am Stat Assoc 83:70–76

7. Carter M (2001) Foundations of mathematical economics. MIT Press, Cambridge

8. Chiburis RC (2010) Bounds on treatment effects using many types of monotonicity. Working paper, Department of Economics, University of Texas at Austin

9. de Haan M (2011) The effect of parents’ schooling on child’s schooling: a nonparametric bounds analysis. J Labor Econ 29:859–892

10. Demuynck T (2015) Bounding average treatment effects: a linear programming approach. Econ Lett 137:75–77

11. Freyberger J, Horowitz JL (2015) Identification and shape restrictions in nonparametric instrumental variables estimation. J Econ 189:41–53

12. Galichon A, Henry M (2009) A test of non-identifying restrictions and confidence regions for partially identified parameters. J Econ 152:186–196

13. Hauser RM (2005) Survey response in the long run: the Wisconsin longitudinal study. Field Methods 17:3–29

14. Henry M, Onatski A (2012) Set coverage and robust policy. Econ Lett 115:256–257

15. Hirano K, Porter JR (2012) Impossibility results for nondifferentiable functionals. Econometrica 80:1769–1790

16. Holmlund H, Lindahl M, Plug E (2011) The causal effect of parents’ schooling on children’s schooling: a comparison of estimation methods. J Econ Literature 49:615–51

17. Honore BE, Tamer E (2006) Bounds on parameters in panel dynamic discrete choice models. Econometrica 74:611–629

18. Imbens GW, Manski CF (2004) Confidence intervals for partially identified parameters. Econometrica 72:1845–1857

19. Kim JH (2014) Identifying the distribution of treatment effects under support restrictions. Available at SSRN

20. Lafférs L (2013a) Inference in partially identified models with discrete variables. Working paper

21. Lafférs L (2013b) A note on bounding average treatment effects. Econ Lett 120:424–428

22. Lafférs L (2017) Identification in models with discrete variables. Comput Econ, forthcoming

23. Manski CF (1990) Nonparametric bounds on treatment effects. Am Econ Rev 80:319–23

24. Manski CF (1995) Identification problems in the social sciences. Harvard University Press, Cambridge

25. Manski CF (1997) Monotone treatment response. Econometrica 65:1311–1334

26. Manski CF (2003) Partial identification of probability distributions. Springer, New York

27. Manski CF (2007) Partial identification of counterfactual choice probabilities. Int Econ Rev 48:1393–1410

28. Manski CF (2008) Partial identification in econometrics. In: Durlauf SN, Blume LE (eds) The new Palgrave dictionary of economics. Palgrave Macmillan, Basingstoke

29. Manski CF, Pepper JV (2000) Monotone instrumental variables, with an application to the returns to schooling. Econometrica 68:997–1012

30. Martin D (1975) On the continuity of the maximum in parametric linear programming. J Optim Theory Appl 17:205–210

31. Munkres JR (2000) Topology, 2nd edn. Prentice Hall, Englewood Cliffs

32. Romano JP, Shaikh AM (2008) Inference for identifiable parameters in partially identified econometric models. J Stat Plan Inference 138:2786–2807

33. Romano JP, Shaikh AM (2010) Inference for the identified set in partially identified econometric models. Econometrica 78:169–211

34. Rubin DB (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol 66:688–701

35. Sherman LW, Schmidt JD, Rogan DP (1992) Policing domestic violence: experiments and dilemmas. Free Press, New York

36. Siddique Z (2013) Partially identified treatment effects under imperfect compliance: the case of domestic violence. J Am Stat Assoc 108:504–513

## Acknowledgements

This research was supported by VEGA Grant 1/0843/17. This paper is a revised chapter from my 2014 dissertation at the Norwegian School of Economics. I would like to thank Monique de Haan for generously providing me with the data used in this paper, as well as Christian Brinch, Andrew Chesher, Christian Dahl, Gernot Doppelhofer, Charles Manski, Peter Molnar, Adam Rosen, Erik Sorensen, Ivan Sutoris and Alexey Tetenov for valuable feedback. Special thanks go to the referees and the editor for carefully reading through the manuscript and for suggesting the second application.

## Author information

### Corresponding author

Correspondence to Lukáš Lafférs.

## Cite this article

Lafférs, L. Bounding average treatment effects using linear programming. Empir Econ 57, 727–767 (2019). https://doi.org/10.1007/s00181-018-1474-z

### Keywords

• Partial identification
• Bounds
• Average treatment effect
• Sensitivity analysis
• Linear programming

### JEL Classification

• C4
• C6
• I2