
Marginal effects in multivariate probit models

Published in Empirical Economics.

Abstract

Estimation of marginal or partial effects of covariates x on various conditional parameters or functionals is often a main target of applied microeconometric analysis. In the specific context of probit models, estimation of partial effects involving outcome probabilities will often be of interest. Such estimation is straightforward in univariate models, and results covering the case of quadrant probability marginal effects in bivariate probit models for jointly distributed outcomes y have previously been described in the literature. This paper’s goals are to extend Greene’s results to encompass the general \(M\ge 2\) multivariate probit context for arbitrary orthant probabilities and to extend these results to models that condition on subvectors of y and to multivariate ordered probit data structures. It is suggested that such partial effects are broadly useful in situations wherein multivariate outcomes are of concern.


Notes

  1. One obvious example is that of conditional product moments \(E\left[ {\prod _{{ j}=1}^{M} {y_{j}^{{b}_{j} } } \left| \mathbf{x} \right. } \right] \) of which conditional covariances may be the most familiar example. In such cases, how \(\sigma _{{i,j}} \left( \mathbf{x} \right) \) varies with conditioning sets x may be of interest in applications (consider, for example, GARCH and related literature).

  2. See also Christofides et al. (1997, 2000).

  3. To streamline the analysis and notation, the x’s will be treated as continuous so that “\(\partial \mathbf{x}\)” calculus can be used. Discrete x’s (e.g., dummy variables, count measures) can be accommodated straightforwardly with the understanding that discrete differences in \(\Pr \left( {y_1 =k_1 ,\ldots ,y_{M} =k_{M} \left| \mathbf{x} \right. } \right) \) due to \(\Delta x_{j} =1\) will be of interest; these can be computed by evaluating \(\Pr \left( {y_1 =k_1 ,\ldots ,y_{M} =k_{M} \left| \mathbf{x} \right. } \right) \) at two different values of \(x_{j}\) and then differencing.

  4. Somewhat informally, the paper uses the term “orthant probability” in reference to the vector of binary outcomes y to refer to the probabilities that the underlying latent random variables that map into the observed binary y (see (4) below) occupy any of the \(2^{{M}}\) orthants in \({\mathbb {R}}^{{M}}\) defined implicitly by k. Some additional notation will also prove useful. Let K be the \(2^{{M}} \times M\) matrix whose rows (arranged arbitrarily) are the \(2^{{M}}\) possible outcome configurations k. Let \({\mathbb {P}}\) be a \(2^{{M}}\)-element set indexing rows of K having typical indexing element p, so that \(\mathbf{k}_\mathbf{p} =\mathbf{K}_{\mathbf{p}{\bullet }} \) will denote a particular (pth) outcome configuration.

  5. This stochastic structure allows for but does not appeal specifically to a common factor error structure for \({\varvec{\varepsilon }}\) in (4). It may be that such an assumption would simplify estimation and, ultimately, computation of the marginal effects.

  6. In applied studies, an explicit formulation of the model of interest as \(\Pr \left( {\mathbf{y}_\mathrm{a} ={\mathbf{k}}_{{p,a}} \left| {\mathbf{y}_{b} =\mathbf{k}_{{p,b}} ,\mathbf{x}} \right. } \right) \) is often absent, and this conditional probability may or may not be the parameter whose marginal effects are of interest. See Greene (1996) for conceptual discussion.

  7. Allowing the \(y_{j} \) to have different numbers of outcomes is straightforward; the assumption of equal numbers of categories across j is made solely to keep notation from becoming unwieldy.

  8. Estimation of the M-variate multivariate ordered probit model can be approached using the methods spelled out in Mullahy (2016).

  9. Of course, for each covariate the sum of the marginal effects across all 32 patterns must be zero.

  10. See Huguenin et al. (2009) for a discussion of other considerations that arise in estimation of MVP models, wherein dimension reduction is a primary consideration.
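The bookkeeping in notes 3 and 4 can be sketched in a few lines. The following is a minimal illustration, not part of the paper: it builds the matrix K of all \(2^{M}\) outcome configurations for \(M=3\), and computes note 3's discrete difference for a dummy covariate in a bivariate (\(M=2\)) probit, using made-up coefficient values that are not estimates from any model.

```python
import itertools
import numpy as np
from scipy.stats import multivariate_normal

# Note 4's matrix K: the 2^M x M matrix whose rows are all possible outcome
# configurations k (here M = 3, rows in lexicographic order).
M = 3
K = np.array(list(itertools.product([0, 1], repeat=M)))
print(K.shape)                  # (8, 3)

# Note 3's discrete difference for a dummy covariate in the M = 2 case:
# Pr(y1 = 1, y2 = 1 | x) evaluated at x_j = 1 and x_j = 0, then differenced.
# The betas and rho below are made-up illustrative values.
beta1, beta2 = np.array([0.2, 0.5]), np.array([-0.1, 0.3])
rho = 0.4

def pr_11(xj):
    """Pr(y1 = 1, y2 = 1) at dummy value xj via the bivariate normal CDF."""
    x = np.array([1.0, xj])                      # constant term plus dummy
    return multivariate_normal(cov=[[1.0, rho], [rho, 1.0]]).cdf(
        [x @ beta1, x @ beta2])

discrete_effect = pr_11(1.0) - pr_11(0.0)
print(discrete_effect)
```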

References

  • Cappellari L, Jenkins SP (2003) Multivariate probit regression using simulated maximum likelihood. Stata J 3(3):278–294


  • Christofides LN, Stengos T, Swidinsky R (1997) On the calculation of marginal effects in the bivariate probit model. Econ Lett 54(3):203–208


  • Christofides LN, Stengos T, Swidinsky R (2000) Corrigendum. Econ Lett 68(3):339


  • Frees EW, Valdez EA (1998) Understanding relationships using copulas. N Am Actuar J 2(1):1–25


  • Greene WH (1996) Marginal effects in the bivariate probit model. NYU Stern School of Business working paper EC-96-11

  • Greene WH (1998) Gender economics courses in liberal arts colleges: further results. J Econ Educ 29(4):291–300


  • Greene WH (2004) Convenient estimators for the panel probit model: further results. Empir Econ 29(1):21–47


  • Greene WH, Hensher DA (2010) Modeling ordered choices: a primer. Cambridge University Press, Cambridge


  • Hajivassiliou V, McFadden D, Ruud P (1996) Simulation of multivariate normal rectangle probabilities and their derivatives: theoretical and computational results. J Econom 72(1–2):85–134


  • Huguenin J, Pelgrin F, Holly A (2009) Estimation of multivariate probit models by exact maximum likelihood. University of Lausanne, IEMS working paper 09-02

  • Mullahy J (2016) Estimation of multivariate probit models via bivariate probit. Stata J 16(1):37–51


  • Rao CR (1973) Linear statistical inference and its applications, 2nd edn. Wiley, New York


  • Trivedi PK, Zimmer DM (2005) Copula modeling: an introduction for practitioners. Found Trends Econom 1(1):1–111



Acknowledgments

Thanks are owed to Kevin Denny, Bill Greene, Jeff Hoch, Alberto Holly, Stephen Jenkins, Mari Palta, Ron Thisted, participants in seminars at University College Dublin, Minnesota, AHRQ, and Wisconsin, and an anonymous referee for their helpful comments, and to Katherine Mullahy for valuable editorial assistance. This work has been supported in part by the Health & Society Scholars Program, by NICHD grant P2C HD047873 to the Center for Demography and Ecology, and by the Robert Wood Johnson Foundation Evidence for Action Program (Grant 73336), all at the University of Wisconsin-Madison. An earlier, longer draft was circulated as NBER W.P. 17588.


Corresponding author

Correspondence to John Mullahy.

Appendices

Appendix 1: Detailed derivations for the general case

For intuition about (2), note that in the \(M{=}2\) case the partial derivative w.r.t. \(u_1 \) of the function \(g\left( {u_1 ,u_2 } \right) \equiv {\partial F\left( {u_1 ,u_2} \right) }\big /{\partial u_2 }\) evaluated at u \(=\) v must, in light of (1), yield the joint density \(f\left( {v_1 ,v_2 } \right) \). One function \(g\left( {v_1 ,v_2 } \right) \) satisfying this is \(g\left( {v_1 ,v_2 } \right) =f_2 \left( {v_2 } \right) \times F\left( {v_1 \left| {v_2 } \right. } \right) \), which is of the form (2); this follows since, at u \(=\) v,

$$\begin{aligned} \frac{\partial f_2 \left( {v_2 } \right) \times F\left( {v_1 \left| {v_2 } \right. } \right) }{\partial v_1 }=f_2 \left( {v_2 } \right) \times \frac{\partial F\left( {v_1 \left| {v_2 } \right. } \right) }{\partial v_1 }=f_2 \left( {v_2 } \right) \times f\left( {v_1 \left| {v_2 } \right. } \right) =f\left( {v_1 ,v_2 } \right) .\nonumber \\ \end{aligned}$$
(8)

By recursion, this result generalizes to \(M>2\) by working backwards from the \(M\)th cross-partial derivative. The general sequence of partial derivatives of \(F\left( {\ldots } \right) \) is (differentiating w.l.o.g. in the order \(j=1,2,\ldots ,M\)):

$$\begin{aligned} \frac{\partial F\left( {\mathbf{v}} \right) }{\partial v_1 }= & {} f_1 \left( {v_1 } \right) \times F_{-1} \left( {v_2 ,\ldots ,v_{M} \left| {v_1 } \right. } \right) \\ \frac{\partial ^{{r}}F\left( {\mathbf{v}} \right) }{\partial v_1 \cdots \partial v_{r} }= & {} f_1 \left( {v_1 } \right) \times \left\{ {\prod \limits _{{{k}}=2}^{r} {f\left( {v_{k} \left| {v_1 ,\ldots ,v_{{k}-1} } \right. } \right) } } \right\} \\ \quad\times & {} F_{-\left\{ {1,\ldots ,{r}} \right\} } \left( {v_{{r}+1} ,\ldots ,v_{M} \left| {v_1 ,\ldots ,v_{r} } \right. } \right) ,~ r=2,\ldots ,M-1 \\ \frac{\partial ^{{M}}F\left( {\mathbf{v}} \right) }{\partial v_1 \cdots \partial {v}_{M} }= & {} f\left( {\mathbf{v}} \right) . \end{aligned}$$

This result is trivial when the \(v_{{j}} \) are mutually independent, in which case all the conditioning arguments are irrelevant.
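The \(M=2\) identity in (8) can be checked numerically without any finite differencing, since both sides have closed forms for the standard bivariate normal. The sketch below (not from the paper; \(\rho \), \(v_1 \), \(v_2 \) are arbitrary values) compares \(f_2 \left( {v_2 } \right) \times f\left( {v_1 \left| {v_2 } \right. } \right) \) with the joint density evaluated directly.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Check of (8) for the standard bivariate normal: f2(v2) * f(v1 | v2), with
# f(v1 | v2) the N(rho*v2, 1 - rho^2) density, must equal the joint density
# f(v1, v2).
rho = 0.6
s = np.sqrt(1.0 - rho**2)           # conditional std. dev. of v1 given v2
v1, v2 = 0.3, -0.5

lhs = norm.pdf(v2) * norm.pdf((v1 - rho * v2) / s) / s   # f2(v2) * f(v1 | v2)
rhs = multivariate_normal(cov=[[1.0, rho], [rho, 1.0]]).pdf([v1, v2])
print(lhs, rhs)   # identical up to floating-point error
```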

Alternatively, (2) can be obtained directly using Leibniz’s rule for differentiation of integrals whose limits depend on the variable of differentiation. Since \(F\left( {\mathbf{v}} \right) =\int _{-\infty }^{v_{M} } \cdots \int _{-\infty }^{v_1 } {f\left( {\mathbf{u}} \right) \hbox {d}u_1 \cdots \hbox {d}u_{M} } \), one can obtain \({\partial F\left( {\mathbf{v}} \right) }\big /{\partial v_{j} }\) by noting that \(v_{j} \) appears in this expression only once, as the upper limit of one integration, so that passing Leibniz’s rule into the integral yields

$$\begin{aligned}&\frac{\partial }{\partial v_{{j}} }\left( {\int _{-\infty }^{v_{{M}} } \cdots \int _{-\infty }^{v_1 } {f\left( {\mathbf{u}} \right) \hbox {d}u_1 \cdots \hbox {d}u_{{M}} } } \right) \\&\quad =\int _{-\infty }^{v_{{M}} } \cdots \int _{-\infty }^{v_{{j}+1} } {\int _{-\infty }^{v_{{j}-1} } \cdots } \int _{-\infty }^{v_1 }\\&\qquad \times \, {\left( {\frac{\partial }{\partial v_{{j}} }\int _{-\infty }^{v_{{j}} } {f\left( {u_1 ,\ldots ,u_{{M}} } \right) \hbox {d}u_{{j}} } } \right) } \hbox {d}u_1 \cdots \hbox {d}u_{{j}-1} \hbox {d}u_{{j}+1} \cdots \hbox {d}u_{{M}} \\&\quad =\int _{-\infty }^{v_{{M}} } \cdots \int _{-\infty }^{v_{{j}+1} } {\int _{-\infty }^{v_{{j}-1} } \cdots } \int _{-\infty }^{v_1 }\\&\qquad \times \,{\left( {f\left( {u_1 ,\ldots ,u_{{j}-1} ,v_{{j}} ,u_{{j}+1} ,\ldots ,u_{{M}} } \right) } \right) } \hbox {d}u_1 \cdots \hbox {d}u_{{j}-1} \hbox {d}u_{{j}+1} \cdots \hbox {d}u_{{M}} \\&\quad =\int _{-\infty }^{v_{{M}} } \cdots \int _{-\infty }^{v_{{j}+1} } {\int _{-\infty }^{v_{{j}-1} } \cdots } \int _{-\infty }^{v_1 }\\&\qquad \times \,{\left( {f\left( {u_1 ,\ldots ,u_{{j}-1} ,u_{{j}+1} ,\ldots ,u_{{M}} \left| {v_{{j}} } \right. } \right) \times f\left( {v_{{j}} } \right) } \right) } \hbox {d}u_1 \cdots \hbox {d}u_{{j}-1} \hbox {d}u_{{j}+1} \cdots \hbox {d}u_{{M}} \\&\quad =f\left( {v_{{j}} } \right) \times \int _{-\infty }^{v_{{M}} } \cdots \int _{-\infty }^{v_{{j}+1} } {\int _{-\infty }^{v_{{j}-1} } \cdots } \int _{-\infty }^{v_1 }\\&\qquad \times \,{\left( {f\left( {u_1 ,\ldots ,u_{{j}-1} ,u_{{j}+1} ,\ldots ,u_{{M}} \left| {v_{{j}} } \right. } \right) } \right) } \hbox {d}u_1 \cdots \hbox {d}u_{{j}-1} \hbox {d}u_{{j}+1} \cdots \hbox {d}u_{{M}} \\&\quad =f\left( {v_{{j}} } \right) \times F\left( {v_1 ,\ldots ,v_{{j}-1} ,v_{{j}+1} ,\ldots ,v_{{M}} \left| {v_{{j}} } \right. } \right) \\ \end{aligned}$$

Analogous results appear in the literature on copula joint distribution functions (Frees and Valdez 1998; Trivedi and Zimmer 2005) in which the joint distribution of y is represented in copula form as

$$\begin{aligned} C\left( {F_1 \left( {y_1 } \right) ,\ldots ,F_{{M}} \left( {y_{{M}} } \right) } \right) =C\left( {u_1 ,\ldots ,u_{{M}} } \right) =F\left( {\mathbf{u}} \right) \end{aligned}$$

with \(F_{{j}} \) denoting the marginal distribution function of \(y_{{j}} \) and the \(u_{{j}} \) being marginally uniform variates. A familiar result in the bivariate copula literature is that \({\partial C\left( {u_1 ,u_2 } \right) }\big /{\partial u_1 }=F\left( {u_2 \left| {u_1 } \right. } \right) \). This is essentially equivalent to (8) since uniform marginal densities satisfy \(f_{{j}} \left( {u_{{j}} } \right) =1\). Note, however, that there are instances in the copula literature in which results like \({\partial F\left( {u_1 ,u_2 } \right) }\big /{\partial u_1 }=F\left( {u_2 \left| {u_1 } \right. } \right) \) are stated. In light of (8), this result does not hold in general unless \(f_{{j}} \left( {u_{{j}} } \right) =1\).

Appendix 2: Detailed derivations for the multivariate probit model

Let \(s_{{jp}} =2k_{{jp}} -1\) so that \(s_{{jp}} \in \left\{ {-1,1} \right\} \), and define correspondingly the \(M\times M\) diagonal transformation matrices \({\mathbf{T}}_{p} =\hbox {diag}\left[ {s_{{jp}} } \right] , p=1,\ldots ,2^{{M}},j=1,\ldots ,M\). Define for each p the transformation \(\mathbf{Q}_{p} =\mathbf{T}_{p} \mathbf{RT}_{p} \) of the original covariance (i.e., correlation) matrix R, so that \(\mathbf{Q}_{{p}}\) is of the form

$$\begin{aligned} \mathbf{Q}_{p} =\left[ {{\begin{array}{cccc} 1&{}\quad {s_{1{p}} s_{2{p}} \rho _{12} }&{}\quad \cdots &{}\quad {s_{1{p}} s_{{Mp}} \rho _{1{M}} } \\ {s_{1{p}} s_{2{p}} \rho _{12} }&{}\quad 1&{}\quad &{}\quad \vdots \\ \vdots &{}\quad &{}\quad \ddots &{}\quad \\ {s_{1{p}} s_{{Mp}} \rho _{1{M}} }&{}\quad \cdots &{}\quad &{}\quad 1 \\ \end{array} }} \right] =\left[ {{\begin{array}{cccc} 1&{}\quad {\tau _{12{p}} }&{}\quad \cdots &{}\quad {\tau _{1{Mp}} } \\ {\tau _{12{p}} }&{}\quad 1&{}\quad &{}\quad \vdots \\ \vdots &{}\quad &{}\quad \ddots &{}\quad \\ {\tau _{1{Mp}} }&{}\quad \cdots &{}\quad &{}\quad 1 \\ \end{array} }} \right] . \end{aligned}$$

The conditional-on-x probability of any particular outcome configuration \(\mathbf{k}_{p} \) is thus given by

$$\begin{aligned}&\Pr \left( {y_1 =k_{1{p}} ,\ldots ,y_{{M}} =k_{{Mp}} \left| \mathbf{x} \right. } \right) =\Phi _{\mathbf{Q}_{p} } \left( {s_{1{p}} \mathbf{x}{\varvec{\upbeta }}_1 ,\ldots ,s_{{Mp}} \mathbf{x}{\varvec{\upbeta }}_{{M}} } \right) \\&\qquad =\Phi _{\mathbf{Q}_{p} } \left( {\alpha _{1{p}} ,\ldots ,\alpha _{{Mp}}} \right) , \end{aligned}$$

where \(\Phi _\mathbf{Q} \) is the cumulative of an \(\hbox {MVN}\left( {\mathbf{0,Q}} \right) \) distribution with density \({\phi }_{\mathbf{Q}} \left( {\ldots } \right) \) and \(\alpha _{{jp}} =s_{{jp}} \mathbf{x}{\varvec{\upbeta }}_{{j}} \). Using the transformed matrices Q in place of the original correlation matrices R streamlines the exposition since for each configuration p, the outcome orthant probability can be described by a joint cumulative rather than by a notationally messy mix of cumulatives and survivor functions. This amounts to a linear change-of-variables operation on \({\varvec{\varepsilon }} =\left[ {\varepsilon _1 ,\ldots ,\varepsilon _{M} } \right] \) of the form \(\mathbf{T}_{p} {\varvec{\varepsilon }} \), which becomes the effective error structure of the model at each p; this transformation works due to the symmetry of the distribution of \({{\varvec{\varepsilon }}}\) around the origin.
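As an illustration of the \(\mathbf{Q}_{p} =\mathbf{T}_{p} \mathbf{RT}_{p} \) construction, the sketch below (with made-up values of R, the \({\varvec{\upbeta }}_{{j}} \), and x for \(M=3\), not estimates from any model) computes every orthant probability \(\Phi _{\mathbf{Q}_{p} } \left( {\alpha _{1{p}} ,\ldots ,\alpha _{{Mp}} } \right) \) and confirms that the \(2^{{M}}\) probabilities sum to one.

```python
import itertools
import numpy as np
from scipy.stats import multivariate_normal

# Orthant probabilities for an M = 3 MVP model. R, B, and x are made-up
# illustrative values.
M = 3
R = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.4, 1.0]])
B = np.array([[0.5, -0.2],          # row j holds beta_j (constant, one slope)
              [0.1,  0.4],
              [-0.3, 0.6]])
x = np.array([1.0, 0.8])            # design row, constant term first

total = 0.0
for k in itertools.product([0, 1], repeat=M):   # the 2^M configurations k_p
    s = 2 * np.array(k) - 1                     # s_jp in {-1, 1}
    Tp = np.diag(s)
    Qp = Tp @ R @ Tp                            # Q_p = T_p R T_p
    alpha = s * (B @ x)                         # alpha_jp = s_jp * x beta_j
    total += multivariate_normal(cov=Qp).cdf(alpha)
print(total)   # the 2^M orthant probabilities sum to 1
```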

To obtain the MVP’s marginal effects, it thus suffices to obtain the particular expressions corresponding to the second line in (3). \(f_{{j}} \left( {c_{{j}} \left( {\varvec{\theta }} \right) } \right) \) is a univariate \(N\left( {0,1} \right) \) density and \(F_{-{j}} \left( {c_1 \left( {\varvec{\theta }} \right) ,\ldots ,c_{{j}-1} \left( {\varvec{\theta }} \right) ,c_{{j}+1} \left( {\varvec{\theta }} \right) ,\ldots ,c_{{M}} \left( {\varvec{\theta }} \right) \left| {c_{{j}} \left( {\varvec{\theta }} \right) } \right. } \right) \) is the cumulative of a conditional (\(M-1\))-variate multivariate normal distribution. The \(c_{{j}} \left( {\varvec{\theta }} \right) \) in (3) are equal to \(s_{{j}} \mathbf{x}{\varvec{\upbeta }}_{{j}} \) in the MVP context, with x playing the role of the “parameter” that is common across outcomes, so that \({\hbox {d}c}_{{j}} \left( {\varvec{\theta }} \right) \big /{\hbox {d}{\varvec{\theta }} }\) is \({\hbox {d}\left( {s_{{j}} \mathbf{x}{\varvec{\upbeta }} _{{j}} } \right) }\big /{\hbox {d}{} \mathbf{x}}=s_{{j}} {\varvec{\upbeta }}_{{j}} \). Substituting into (14) \({\phi }\left( {\ldots } \right) \) for \(f\left( {\ldots } \right) \), \(\Phi \left( {\ldots } \right) \) for \(F\left( {\ldots } \right) \), and \(\alpha _{{jp}} \) for \(c_{{j}} \left( {\varvec{\theta }} \right) \) gives:

$$\begin{aligned}&\frac{\partial \Phi _{\mathbf{Q}_{p} } \left( {\alpha _{1{p}} ,\ldots ,\alpha _{{Mp}} } \right) }{\partial \mathbf{x}}=\sum \limits _{{j}=1}^{M} {\left\{ {\left( {\frac{\partial \Phi _{\mathbf{Q}_{p} } \left( {\alpha _{1{p}} ,\ldots ,\alpha _{{Mp}} } \right) }{\partial \alpha _{{jp}} }} \right) \times \left( {\frac{\partial \alpha _{{jp}} }{\partial \mathbf{x}}} \right) } \right\} } \\&\quad =\sum \limits _{{j}=1}^{M} {\left\{ {\left( {{\phi }\left( {\alpha _{{jp}} } \right) \times \Phi _{\mathbf{Q}_{p} \left\{ {-{j}} \right\} } \left( {\alpha _{1{p}} ,\ldots ,\alpha _{\left( {{j}-1} \right) {p}} ,\alpha _{\left( {{j}+1} \right) {p}} ,\ldots ,\alpha _{{Mp}} \left| {\alpha _{{jp}} } \right. } \right) } \right) \times \left( {s_{{jp}} {\upbeta }_{{j}} } \right) ^{\mathrm{T}}} \right\} .} \\ \end{aligned}$$

Given consistent estimates \(\widehat{\mathbf{B}}\) and \(\widehat{\mathbf{Q}}\), estimation of \({\partial \Phi _{\mathbf{Q}_{p} } \left( {\alpha _{1{p}} ,\ldots ,\alpha _{{Mp}} } \right) }/{\partial \mathbf{x}}\) is complicated only by evaluation of the term \(\Phi _{\mathbf{Q}_{p} \left\{ {-{j}} \right\} } \Big ( \alpha _{1{p}} ,\ldots ,\alpha _{\left( {{j}-1} \right) {p}} ,\alpha _{\left( {{j}+1} \right) {p}} ,\ldots ,\alpha _{{Mp}} \left| {\alpha _{{jp}} } \right. \Big )\). The following result provides a basis for this calculation:

Result: Joint Conditional Distribution of an MVN-Variate, Adapted from Rao (1973) (8a.2.11)

Suppose \(\mathbf{z}=\left[ {z_1 ,\ldots ,z_{{M}} } \right] \sim \hbox {MVN}\left( {\mathbf{0},{\varvec{\Omega }} } \right) \). Partition \({\varvec{\Omega }}\) as \(\left[ {{\begin{array}{ll} {\omega }_{11} &{} {{\varvec{\Omega }}_{12} } \\ {{\varvec{\Omega }}_{21} }&{} {{\varvec{\Omega }}_{22} } \\ \end{array} }} \right] \) with \({\omega }_{11}\) scalar. Then, \({\mathbf{z}}_{-1} =\left[ {z_2 ,\ldots ,z_{{M}} } \right] \) conditional on \(z_1\) is (\(M-1\))-variate \(\hbox {MVN}\left( {{\varvec{\Omega }} _{21} {\omega }_{11}^{-1} z_1 ,\left( {\varvec{\Omega }}_{22} -{\omega }_{11}^{-1} {\varvec{\Omega }}_{21} {{\varvec{\Omega }}_{12} } \right) } \right) \). \(\square \)

This generalizes straightforwardly to \({\mathbf{z}}_{-{j}} =\left[ {z_1 ,\ldots ,z_{{j}-1} ,z_{{j}+1} ,\ldots ,z_{{M}} } \right] \), j=2,...,M, by defining different partitions of \({\varvec{\Omega }}\). In the case of interest here, \({\varvec{\Omega }} =\mathbf{Q}_{p} \) so that \(\omega _{11} =1\). It follows that the joint conditional distribution is

$$\begin{aligned} {\mathbf{z}}_{-1} \left| {z_1 \sim } \right. \hbox {MVN}\left( {\left[ {{\begin{array}{c} {z_1 \tau _{12{p}} } \\ \vdots \\ {z_1 \tau _{1{Mp}} } \\ \end{array} }} \right] ,\left[ {{\begin{array}{cccc} {1-\tau _{12{p}}^2 }&{} {\tau _{23{p}} -\tau _{12{p}} \tau _{13{p}} }&{} \ldots &{} {\tau _{2{Mp}} -\tau _{12{p}} \tau _{1{Mp}} } \\ {\tau _{23{p}} -\tau _{12{p}} \tau _{13{p}} }&{} {1-\tau _{13{p}}^2 }&{} &{} \vdots \\ \vdots &{} &{} \ddots &{} \\ {\tau _{2{Mp}} -\tau _{12{p}} \tau _{1{Mp}} }&{} \ldots &{} &{} {1-\tau _{1{Mp}}^2 } \\ \end{array} }} \right] } \right) ,\nonumber \\ \end{aligned}$$
(9)

again with obvious generalization to the distributions of \({\mathbf{z}}_{-{j}} \left| {z_{{j}} } \right. \), j=2,...,M.

To obtain \(\Phi _{\mathbf{Q}_{p} \left\{ {-{j}} \right\} } \left( {\alpha _{1{p}} ,\ldots ,\alpha _{\left( {{j}-1} \right) {p}} ,\alpha _{\left( {{j}+1} \right) {p}} ,\ldots ,\alpha _{{Mp}} \left| {\alpha _{{jp}} } \right. } \right) \), define the (\(M-1\))-vector of differences

$$\begin{aligned} \Delta _{-{j,p}}= & {} \left[ \left( {\alpha _{1{p}} -\alpha _{{jp}} \tau _{1{jp}} } \right) ,\ldots ,\left( {\alpha _{({j}-1){p}} -\alpha _{{jp}} \tau _{({j}-1){jp}} } \right) ,\right. \nonumber \\&\left. \left( {\alpha _{({j}+1){p}} -\alpha _{{jp}} \tau _{({j}+1){jp}} } \right) ,\ldots ,\left( {\alpha _{{Mp}} -\alpha _{{jp}} \tau _{{Mjp}} } \right) \right] ^{\mathsf{T}}, \end{aligned}$$
(10)

and an \((M{-}1)\times (M{-}1)\) diagonal transformation matrix \({\mathbf{H}}_{{jp}} \,{=}\,\hbox {diag}_{{k}\ne {j}} \left[ {\left( {\sqrt{1-\tau _{{jkp}}^2 }} \right) ^{-1}} \right] \). Let \({\mathbf{L}}_{{jp}} {=}{\mathbf{H}}_{{jp}} {\varvec{\Delta }} _{-{j,p}} \) be the corresponding \((M{-}1)\)-vector of normalized differences. Then, \(\Phi _{\mathbf{Q}_{{p}\left\{ {-{j}} \right\} } } \left( \alpha _{1{p}} ,\ldots ,\alpha _{\left( {{j}-1} \right) {p}} ,\alpha _{\left( {{j}+1} \right) {p}} ,\ldots ,\alpha _{{Mp}} \left| {\alpha _{{jp}} } \right. \right) \) can be computed by referring \(\mathbf{L}_{{jp}}\) to \(\Phi _{\mathbf{z},{\varvec{\Sigma }} } \left( {\ldots } \right) \), which is the cumulative of an \((M{-}1)\)-variate \(\hbox {MVN}\left( {\mathbf{0},{\varvec{\Sigma }} } \right) \) distribution in which the off-diagonals of \({\varvec{\Sigma }} \) may be nonzero. In this instance, \({\varvec{\Sigma }} \) is the variance-covariance matrix of \(\mathbf{L}_{{jp}} \), which is in correlation matrix form having typical off-diagonal (r,c) element \({\left( {\tau _{{rcp}} -\tau _{{jrp}} \tau _{{jcp}} } \right) }\big /{\sqrt{\left( {1-\tau _{{jrp}}^2 } \right) \left( {1-\tau _{{jcp}}^2 } \right) }}\). Let this matrix be denoted \({\mathbf{V}}_{{jp}} \). The results derived in this appendix now provide the basis for computing the quantities of interest in (5).
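Assembling the pieces, the sketch below (with made-up parameter values for \(M=3\), not estimates) computes the marginal-effect vector for each outcome configuration from \({\phi }\left( {\alpha _{{jp}} } \right) \), the normalized differences \(\mathbf{L}_{{jp}} \), and \({\mathbf{V}}_{{jp}} \), and verifies the adding-up property noted in footnote 9: summed across all \(2^{{M}}\) patterns, the effects are zero for every covariate.

```python
import itertools
import numpy as np
from scipy.stats import norm, multivariate_normal

# Marginal effects for an M = 3 MVP model: for each configuration p, the
# gradient in x of Phi_{Q_p}(alpha_p) is the sum over j of
# phi(alpha_jp) * Phi_{V_jp}(L_jp) * s_jp * beta_j.
# R, B, and x are made-up illustrative values.
R = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.4, 1.0]])
B = np.array([[0.5, -0.2],
              [0.1,  0.4],
              [-0.3, 0.6]])
x = np.array([1.0, 0.8])
M = R.shape[0]

def marginal_effect(k):
    s = 2 * np.array(k) - 1
    Qp = np.diag(s) @ R @ np.diag(s)           # Q_p = T_p R T_p
    alpha = s * (B @ x)                        # alpha_jp = s_jp * x beta_j
    grad = np.zeros_like(x)
    for j in range(M):
        keep = [i for i in range(M) if i != j]
        tau = Qp[j, keep]                      # tau_jkp for k != j
        H = 1.0 / np.sqrt(1.0 - tau**2)        # diagonal of H_jp
        L = H * (alpha[keep] - alpha[j] * tau)          # normalized diffs (10)
        V = np.outer(H, H) * (Qp[np.ix_(keep, keep)] - np.outer(tau, tau))
        np.fill_diagonal(V, 1.0)               # V_jp is in correlation form
        cond_cdf = multivariate_normal(cov=V).cdf(L)
        grad += norm.pdf(alpha[j]) * cond_cdf * s[j] * B[j]
    return grad

# Footnote 9's check: effects summed over all 2^M patterns equal zero.
total = sum(marginal_effect(k) for k in itertools.product([0, 1], repeat=M))
print(total)
```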


Cite this article

Mullahy, J. Marginal effects in multivariate probit models. Empir Econ 52, 447–461 (2017). https://doi.org/10.1007/s00181-016-1090-8

