## Abstract

Recent empirical studies of firm-level performance have tested for complementarity among multiple practices. These papers have drawn conclusions from potentially biased estimates of pair-wise interaction effects. We develop a consistent and simple testing framework and compare its performance with these alternatives.


## 1 Introduction

Researchers in the fields of industrial organization and management have long been interested in investigating complementary relations between various organizational practices. Complementarity is understood in this context to exist if the implementation of one practice increases the marginal or incremental return to other practices. Joint implementation of several practices may result in economies of scope (Baumol et al. 1988). The implementation of one practice might also decrease the marginal or incremental return to other practices. This is the case of substitutability (or subadditivity). Examples of studies of complementarity are the relationships between human resource practices and firm strategy (Ichniowski et al. 1997), firms’ internal R&D and external technology sourcing (Arora and Gambardella 1990), process and product innovation (Miravete and Pernias 2004), labor skill and innovation strategies (Leiponen 2005), different government innovation policies (Mohnen and Röller 2005), information technology, workplace reorganization, and new product and service innovations (Black and Lynch 2001; Bresnahan et al. 2002; Caroli and Van Reenen 2001), adoption of different information technologies in emergency health care (Athey and Stern 2002), different types of labor in the determination of trade patterns (Grossman and Maggi 2000) and use of external knowledge across different stages of new product development (Love and Roper 2009).

There are two econometric approaches used to test for complementarity: the “adoption” or “correlation” approach and the “production function” approach (e.g. Athey and Stern 1998). The former has been popular among empirical researchers due to its simplicity (Arora 1996). The adoption approach tests conditional correlations based on the residuals of reduced form regressions of the practices of interest on all exogenous variables. However, although this test can serve as supportive evidence of complementarity, it cannot serve as a definitive test. Estimated correlations between residuals may be the result of common omitted exogenous variables or measurement errors. Even in the case of well-measured correlation between practices, decision makers may not have been sufficiently well informed such that they chose efficiency or output enhancing combinations of practices.

The “production function” approach, in which organizational performance is related to combinations of organizational practices, does not have these drawbacks and can serve as a direct test for complementarity or substitutability.^{Footnote 1} However, no easily executable testing procedure has been available to test for complementarity or substitutability with more than two practices.^{Footnote 2} Studies adopting the production function approach have limited analysis to the estimation of pair-wise interaction effects, either including all pair-wise terms (e.g. Caroli and Van Reenen 2001), or estimating only the pair-wise interaction of interest (e.g. Bresnahan et al. 2002). This approach ignores the impact of additional cross-terms (e.g. a triple term in case of three practices), it examines only a partial expression for the cross derivative and is prone to an omitted variable bias that affects all coefficients. As noted by Athey and Stern (1998), a proper complementarity or substitutability test requires a testing framework that considers the complete set of organizational practices. In this paper we develop such a test based on a multiple-inequality restrictions framework corresponding to a definition of strict supermodularity or submodularity (Milgrom and Roberts 1990). We provide Monte Carlo results comparing the power of this test with the performance of the two pair-wise tests.

## 2 Complementarity and substitutability

We describe the definitions and conditions concerning complementarity and substitutability, both for the case of continuously measured practices and for the case of dichotomous practices. Consider an objective function *f* whose value is determined by the practices \( x_{p} \) (\( p = 1, \ldots ,n \)). In case the practices are measured continuously, the following definition of complementarity holds (e.g. Baumol et al. 1988)^{Footnote 3}:

### Definition 1

*(continuous practices)* Practices \( x_{i} \) and \( x_{j} \) are considered complementary in the function *f* if and only if \( \partial^{2} f/\partial x_{i} \partial x_{j} \ge 0 \) for all values of \( (x_{1} , \ldots ,x_{n} ) \), with the inequality holding strictly for at least one value.

This definition is demanding in the sense of requiring the cross derivative to be non-negative for all possible or observed values of the practices. The definition of substitutability is identical to Definition 1 except that '\( \ge \)' is replaced by '\( \le \)'. We use a cross-term specification of the objective function *f* to test for complementarity or substitutability. The expressions for *n* equal to 2, 3 and 4 are:

\( f = \alpha_{0} + \alpha_{1} x_{1} + \alpha_{2} x_{2} + \alpha_{12} x_{1} x_{2} \)  (1)

\( f = \alpha_{0} + \sum\nolimits_{p = 1}^{3} \alpha_{p} x_{p} + \alpha_{12} x_{1} x_{2} + \alpha_{13} x_{1} x_{3} + \alpha_{23} x_{2} x_{3} + \alpha_{123} x_{1} x_{2} x_{3} \)  (2)

\( f = \alpha_{0} + \sum\nolimits_{p = 1}^{4} \alpha_{p} x_{p} + \sum\nolimits_{i < j} \alpha_{ij} x_{i} x_{j} + \sum\nolimits_{i < j < k} \alpha_{ijk} x_{i} x_{j} x_{k} + \alpha_{1234} x_{1} x_{2} x_{3} x_{4} \)  (3)

The cross-derivatives \( \partial^{2} f/\partial x_{1} \partial x_{2} \) are equal to \( \alpha_{12} \) for Eq. 1, \( \alpha_{12} + \alpha_{123} x_{3} \) for Eq. 2 and \( \alpha_{12} + \alpha_{123} x_{3} + \alpha_{124} x_{4} + \alpha_{1234} x_{3} x_{4} \) for Eq. 3, respectively. This implies that there is complementarity in the case of two practices if \( \alpha_{12} > 0 \). In the case of three practices there are two conditions: \( \alpha_{12} + \alpha_{123} \min (x_{3} ) \ge 0 \) and \( \alpha_{12} + \alpha_{123} \max (x_{3} ) \ge 0 \), with at least one of the inequalities holding strictly. In the case of four practices there are four conditions, using the minimum and maximum of \( x_{3} \) and \( x_{4} \) consecutively. We concentrate on the case of three and four practices, although the arguments can easily be extended to larger numbers of practices. Figure 1 shows the areas of complementarity and substitutability (or neither) in the case of three practices and \( x_{3} \in [0,1] \). The latter can be seen as an adoption rate of a practice, running from 0% (no adoption) to 100% (complete adoption).^{Footnote 4} The areas of complementarity and substitutability include the bold lines but not the origin (0,0).
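For a practice \( x_{3} \) measured on the unit interval, the two boundary conditions translate directly into a classification rule. The following is a minimal sketch (the function name and return labels are ours, not the paper's):

```python
def classify(alpha12, alpha123):
    """Classify the relation between practices 1 and 2 when x3 ranges
    over [0, 1], using the two boundary conditions from the text."""
    lo = alpha12              # cross-derivative at x3 = min(x3) = 0
    hi = alpha12 + alpha123   # cross-derivative at x3 = max(x3) = 1
    if lo >= 0 and hi >= 0 and (lo > 0 or hi > 0):
        return "complementary"
    if lo <= 0 and hi <= 0 and (lo < 0 or hi < 0):
        return "substitutable"
    return "neither"
```

For example, \( \alpha_{12} = 0.5 \), \( \alpha_{123} = -0.2 \) satisfies both inequalities strictly and is classified as complementary, while \( \alpha_{12} = 0.5 \), \( \alpha_{123} = -1.0 \) violates the second inequality and falls in the "neither" region of Figure 1.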

In case the practices take on discrete values (with step size chosen equal to one), we replace the derivative in Definition 1 by a difference. Considering the first two practices, without loss of generality, the following definition holds:

### Definition 2

*(discrete practices)* Practices \( x_{1} \) and \( x_{2} \) are considered complementary in the function *f* if and only if \( f(x_{1} + 1,x_{2} + 1,x_{3} , \ldots ,x_{n} ) + f(x_{1} ,x_{2} ,x_{3} , \ldots ,x_{n} ) \ge f(x_{1} + 1,x_{2} ,x_{3} , \ldots ,x_{n} ) + f(x_{1} ,x_{2} + 1,x_{3} , \ldots ,x_{n} ) \) for all values of \( (x_{1} , \ldots ,x_{n} ) \), with the inequality holding strictly for at least one value.

The case of dichotomously measured practices (a practice is used or not) is a special case of this definition. In that case functions (1), (2), and (3) can also be conveniently rewritten in terms of the possible combinations of practices (cf. Mohnen and Röller 2005). With two practices the collection of possible combinations is defined in the usual binary order as \( D = \{ \,(0,0),\,(0,1),\,(1,0),\,(1,1)\,\} \). We introduce the indicator function \( I_{D = (r,s)} \), equal to one when the combination is \( (r,s) \) and zero otherwise. Similarly, we have \( I_{D = (r,s,t)} \) for the case of three practices. The function *f* is rewritten as:

\( f = \sum\nolimits_{(r,s) \in D} \beta_{rs} I_{D = (r,s)} \)  (4)

\( f = \sum\nolimits_{(r,s,t) \in D} \beta_{rst} I_{D = (r,s,t)} \)  (5)

The conditions for complementarity now correspond to \( \alpha_{12} = f(1,1) - f(1,0) - f(0,1) + f(0,0) = \beta_{11} + \beta_{00} - \beta_{10} - \beta_{01} > 0 \) for two practices, and to \( \alpha_{12} = \beta_{110} + \beta_{000} - \beta_{100} - \beta_{010} \ge 0 \) and \( \alpha_{12} + \alpha_{123} = \beta_{111} + \beta_{001} - \beta_{101} - \beta_{011} \ge 0 \) for three practices, with at least one of the two inequalities holding strictly.
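For dichotomous practices these conditions can be checked directly from the eight performance levels \( \beta_{rst} \). A short sketch (the helper function and example are ours, for illustration only):

```python
from itertools import product

def complementary_three(beta):
    """beta maps each (x1, x2, x3) combination to a performance level.
    Returns True iff both inequality conditions from the text hold,
    with at least one strict (complementarity of practices 1 and 2)."""
    c0 = beta[(1, 1, 0)] + beta[(0, 0, 0)] - beta[(1, 0, 0)] - beta[(0, 1, 0)]  # alpha_12
    c1 = beta[(1, 1, 1)] + beta[(0, 0, 1)] - beta[(1, 0, 1)] - beta[(0, 1, 1)]  # alpha_12 + alpha_123
    return c0 >= 0 and c1 >= 0 and (c0 > 0 or c1 > 0)

# Example: a supermodular performance function f = x1 + x2 + x3 + x1*x2.
beta_sup = {(a, b, c): a + b + c + a * b for a, b, c in product((0, 1), repeat=3)}
```

Applied to `beta_sup`, both differences equal one, so the function reports complementarity; replacing the interaction term by \( -x_{1} x_{2} \) flips both signs and the check fails.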

## 3 The testing procedure

In the case of two practices, the test for global complementarity is a one-sided *t*-test of the null hypothesis \( \alpha_{12} = 0 \) in Eq. 1. In the general case of *n* practices, however, the number of constraints that have to be tested simultaneously is \( 2^{n - 2} \). One approach is to apply statistical tests along the lines of Gouriéroux et al. (1982), Kodde and Palm (1986) and Wolak (1989).^{Footnote 5} This procedure is followed by Mohnen and Röller (2005) for dichotomously measured practices. The critical values of such tests are, however, cumbersome to derive, which limits their applicability. In addition, the test requires software able to perform linear regression under inequality constraints. We propose a simpler procedure, which we explain for three and four practices (for five practices, see the Appendix), all measured in the unit interval [0,1]: \( 0 \le x_{3} ,x_{4} \le 1 \). This also includes the case of dichotomously measured practices. Our procedure is an induced test, in which a combined hypothesis is accepted only if all of the separate hypotheses are accepted (Savin 1980). For three practices we have:

\( y = \alpha_{0} + \alpha_{1} x_{1} + \alpha_{2} x_{2} + \alpha_{3} x_{3} + \alpha_{12} x_{1} x_{2} + \alpha_{13} x_{1} x_{3} + \alpha_{23} x_{2} x_{3} + \alpha_{123} x_{1} x_{2} x_{3} + \varepsilon \)  (6)

where \( \varepsilon \sim {\text{N}}(0,\sigma_{\varepsilon }^{2} ) \). There is complementarity between practices 1 and 2 if \( \alpha_{12} \ge 0 \) and \( \alpha_{12} + \alpha_{123} \ge 0 \), with at least one of the two inequalities holding strictly. We now rewrite Eq. 6 as:

\( y = \alpha_{0} + \alpha_{1} x_{1} + \alpha_{2} x_{2} + \alpha_{3} x_{3} + \alpha_{13} x_{1} x_{3} + \alpha_{23} x_{2} x_{3} + \alpha_{12} (x_{1} x_{2} - x_{1} x_{2} x_{3} ) + (\alpha_{12} + \alpha_{123} )x_{1} x_{2} x_{3} + \varepsilon \)  (7)

The test can now be executed using linear regression and considering the significance of the coefficients of the variables \( x_{1} x_{2} - x_{1} x_{2} x_{3} \) and \( x_{1} x_{2} x_{3} \). Denote the *t*-value of the former by \( t_{1} \) and that of the latter by \( t_{2} \); the new test then indicates complementarity if *either* "\( t_{1} > t_{c} \) and \( t_{2} > - t_{d} \)" *or* "\( t_{1} > - t_{d} \) and \( t_{2} > t_{c} \)", where \( t_{c} \) and \( t_{d} \) are critical *t*-values depending upon the significance level. The test indicates substitutability if *either* "\( t_{1} < - t_{c} \) and \( t_{2} < t_{d} \)" *or* "\( t_{1} < t_{d} \) and \( t_{2} < - t_{c} \)". For four practices we have:

\( y = \alpha_{0} + \sum\nolimits_{p = 1}^{4} \alpha_{p} x_{p} + \sum\nolimits_{i < j} \alpha_{ij} x_{i} x_{j} + \sum\nolimits_{i < j < k} \alpha_{ijk} x_{i} x_{j} x_{k} + \alpha_{1234} x_{1} x_{2} x_{3} x_{4} + \varepsilon \)  (8)

This can be rewritten as:

\( y = \alpha_{0} + \sum\nolimits_{p = 1}^{4} \alpha_{p} x_{p} + \sum\nolimits_{(i,j) \ne (1,2)} \alpha_{ij} x_{i} x_{j} + \alpha_{134} x_{1} x_{3} x_{4} + \alpha_{234} x_{2} x_{3} x_{4} + \alpha_{12} (x_{1} x_{2} + x_{1} x_{2} x_{3} x_{4} - x_{1} x_{2} x_{3} - x_{1} x_{2} x_{4} ) + (\alpha_{12} + \alpha_{123} )(x_{1} x_{2} x_{3} - x_{1} x_{2} x_{3} x_{4} ) + (\alpha_{12} + \alpha_{124} )(x_{1} x_{2} x_{4} - x_{1} x_{2} x_{3} x_{4} ) + (\alpha_{12} + \alpha_{123} + \alpha_{124} + \alpha_{1234} )x_{1} x_{2} x_{3} x_{4} + \varepsilon \)  (9)

The test for complementarity is whether \( \alpha_{12} \ge 0 \), \( \alpha_{12} + \alpha_{123} \ge 0 \), \( \alpha_{12} + \alpha_{124} \ge 0 \) and \( \alpha_{12} + \alpha_{123} + \alpha_{124} + \alpha_{1234} \ge 0 \), with at least one of the four inequalities holding strictly. Hence, we use linear regression and consider the significance of the coefficients of the four variables \( x_{1} x_{2} + x_{1} x_{2} x_{3} x_{4} - x_{1} x_{2} x_{3} - x_{1} x_{2} x_{4} \), \( x_{1} x_{2} x_{3} - x_{1} x_{2} x_{3} x_{4} \), \( x_{1} x_{2} x_{4} - x_{1} x_{2} x_{3} x_{4} \) and \( x_{1} x_{2} x_{3} x_{4} \). Denote the *t*-values of these coefficients by \( t_{1} \), \( t_{2} \), \( t_{3} \) and \( t_{4} \). The test indicates complementarity in case one of the following four conditions holds: \( (t_{1} > t_{c} )\, \wedge \,(t_{2} > - t_{d} )\, \wedge \,(t_{3} > - t_{d} )\, \wedge \,(t_{4} > - t_{d} ) \) or \( (t_{1} > - t_{d} )\, \wedge \,(t_{2} > t_{c} )\, \wedge \,(t_{3} > - t_{d} )\, \wedge \,(t_{4} > - t_{d} ) \) or \( (t_{1} > - t_{d} )\, \wedge \,(t_{2} > - t_{d} )\, \wedge \,(t_{3} > t_{c} )\, \wedge \,(t_{4} > - t_{d} ) \) or \( (t_{1} > - t_{d} )\, \wedge \,(t_{2} > - t_{d} )\, \wedge \,(t_{3} > - t_{d} )\, \wedge \,(t_{4} > t_{c} ) \). Testing for substitutability means that we replace the 'larger than' signs by 'smaller than' signs. The literature on Bonferroni procedures is now relevant for determining the probability of a type I error at the significance level of the combined hypothesis. Given a significance level *A* for the combined hypothesis and a total of \( 2^{n - 2} \) constraints, the (original) Bonferroni procedure suggests a significance level of \( A/2^{n - 2} \) for the separate hypotheses, see e.g. Olejnik et al. (1997), p. 391.^{Footnote 6} This reduces the overall probability of a type I error.
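The three-practice version of the procedure can be sketched as a short regression routine. This is an illustrative implementation, not code from the paper; the function name, defaults and synthetic demo data are ours, and the critical values assume large samples:

```python
import numpy as np

def induced_test_three(y, x1, x2, x3, t_c=1.96, t_d=1.65):
    """Induced test for complementarity/substitutability of practices
    1 and 2 with three practices, via OLS on the transformed regressors
    of the rewritten equation: the coefficient of x1*x2 - x1*x2*x3 is
    alpha_12 and that of x1*x2*x3 is alpha_12 + alpha_123."""
    X = np.column_stack([
        np.ones_like(y), x1, x2, x3,
        x1 * x3, x2 * x3,
        x1 * x2 - x1 * x2 * x3,   # coefficient: alpha_12
        x1 * x2 * x3,             # coefficient: alpha_12 + alpha_123
    ])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - X.shape[1])          # residual variance
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))  # OLS standard errors
    t1, t2 = beta[-2] / se[-2], beta[-1] / se[-1]
    if (t1 > t_c and t2 > -t_d) or (t1 > -t_d and t2 > t_c):
        return "complementary"
    if (t1 < -t_c and t2 < t_d) or (t1 < t_d and t2 < -t_c):
        return "substitutable"
    return "inconclusive"

# Demo on synthetic data with a strongly complementary pair (alpha_12 = 2).
rng = np.random.default_rng(0)
x1, x2, x3 = (rng.random((3, 2000)) > 0.5).astype(float)
y = x1 + x2 + x3 + 2 * x1 * x2 + 0.25 * rng.standard_normal(2000)
```

On the demo data both transformed coefficients equal two with very small standard errors, so the decision rule reports complementarity.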

Our test procedure performs a multiple-restrictions test directly connected to the definition of complementarity and substitutability. We compare its performance with two alternative test procedures used in recent empirical work. The "single cross-term" test procedure incorporates only the cross term of two practices in the estimated equation and infers complementarity from the estimated coefficient of that cross-term (e.g. Bresnahan et al. 2002). The "all cross-term" test follows the same procedure but incorporates all pair-wise cross-terms \( x_{i} x_{j} \), \( i \ne j \), in one equation (e.g. Caroli and Van Reenen 2001). Another recently proposed procedure is that of Mohnen and Röller (2005). This procedure tests for strict complementarity and substitutability (where all 'larger than' and 'smaller than' signs are hypothesized to hold) and is therefore not directly comparable. The procedure is also limited to discrete practices (dummy variables) and, by using the Kodde and Palm (1986) critical values, has a sizeable inconclusive area. Such inconclusive test outcomes become more likely as the number of inequality constraints increases. Furthermore, the test is relatively complicated to execute, requiring optimization under inequality constraints, and is difficult to extend to larger numbers of practices.

The performance function in the case of three practices is given in Eq. 6. The single cross-term test imposes \( \alpha_{13} = \alpha_{23} = \alpha_{123} = 0 \) and judges complementarity to exist if \( \alpha_{12} > 0 \). This is a simple *t*-test. The all cross-term test applies the same criterion but imposes only \( \alpha_{123} = 0 \). Obviously, the "single cross-term" and "all cross-term" tests suffer from omitted-variable bias. However, since these tests involve restricted estimation, the estimators of \( \alpha_{12} \) are likely to have smaller variance (e.g. Judge et al. 1982, chapter 22). In the next section we devise a Monte Carlo experiment to compare the performance of the three test procedures, which face a trade-off between bias and precision. Since almost all empirical studies of complementarity in the literature examine the impact of using a certain practice or not, we focus our Monte Carlo experiment on the case of dichotomous variables.
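For comparison with the routine above, the "single cross-term" alternative amounts to a restricted regression and a one-sided *t*-test. A sketch under the same illustrative conventions (function name, threshold and demo data are ours):

```python
import numpy as np

def single_cross_term_test(y, x1, x2, x3, t_c=1.65):
    """'Single cross-term' test: impose a13 = a23 = a123 = 0, regress y
    on the main effects and x1*x2 only, and infer complementarity from
    the sign and t-value of the x1*x2 coefficient."""
    X = np.column_stack([np.ones_like(y), x1, x2, x3, x1 * x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    t12 = beta[-1] / se[-1]  # t-statistic of the alpha_12 estimate
    if t12 > t_c:
        return "complementary"
    if t12 < -t_c:
        return "substitutable"
    return "inconclusive"

# Demo: independent dummies with a true interaction of 2 and no
# higher-order terms, so the restricted estimate is unbiased here.
rng = np.random.default_rng(1)
x1, x2, x3 = (rng.random((3, 2000)) > 0.5).astype(float)
y = x1 + x2 + x3 + 2 * x1 * x2 + 0.25 * rng.standard_normal(2000)
```

When the omitted terms are truly zero, as in the demo, the restricted estimator is unbiased and precise; the bias discussed in the text arises when \( \alpha_{13} \), \( \alpha_{23} \) or \( \alpha_{123} \) are nonzero.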

## 4 Monte Carlo experiments

The data for our experiments are generated for samples of 1,000 and 5,000 observations. These are common sample sizes when investigating complementarities between organizational practices.^{Footnote 7} We describe the Monte Carlo experimental procedure for three practices. In the first step the coefficients \( \alpha_{1} \) through \( \alpha_{123} \) are randomly and independently drawn from the standard normal distribution and then rounded to whole or half numbers. In the second step, variables \( z_{1} \), \( z_{2} \), \( z_{3} \) are drawn from the multivariate standard normal distribution. Variables \( x_{1} \), \( x_{2} \), \( x_{3} \) are equal to one when \( z_{1} > 0 \), \( z_{2} > 0 \) and \( z_{3} > 0 \), respectively, and zero otherwise. In order to mimic empirical research settings, the correlation structure between the practices is allowed to depend on the presence of complementarity or substitutability. Organizations are more likely to adopt two practices simultaneously if these are complementary. In case the draws of \( \alpha_{1} \) through \( \alpha_{123} \) indicate complementarity, the correlation coefficient between \( x_{1} \) and \( x_{2} \) is set at 0.5, and in case of substitutability at −0.5. The correlation coefficient is set at zero if the draw indicates neither complementarity nor substitutability.^{Footnote 8} Eq. 6 is used to generate data for *y*. For four practices a similar procedure and Eq. 8 are used.
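One draw of this data-generating process can be sketched as follows. The sketch is ours, not the authors' code; in particular, the arcsine mapping from the latent to the binary correlation is a standard device for median-split normal dummies that we assume here to hit the target correlation of the practices:

```python
import numpy as np

def simulate_three(n, alphas, rho_x=0.5, sigma_eps=1.0, seed=0):
    """One draw of a DGP in the spirit of the text: dichotomous
    practices from thresholded latent normals, with the x1-x2
    correlation controlled through the latent correlation."""
    rng = np.random.default_rng(seed)
    # For median-split dummies, corr(x1, x2) = (2/pi)*arcsin(rho_z);
    # invert this to hit the target binary correlation rho_x.
    rho_z = np.sin(np.pi * rho_x / 2)
    cov = np.array([[1.0, rho_z, 0.0], [rho_z, 1.0, 0.0], [0.0, 0.0, 1.0]])
    z = rng.multivariate_normal(np.zeros(3), cov, size=n)
    x1, x2, x3 = (z > 0).astype(float).T
    a0, a1, a2, a3, a12, a13, a23, a123 = alphas
    y = (a0 + a1 * x1 + a2 * x2 + a3 * x3 + a12 * x1 * x2 + a13 * x1 * x3
         + a23 * x2 * x3 + a123 * x1 * x2 * x3
         + sigma_eps * rng.standard_normal(n))
    return y, x1, x2, x3
```

Drawing, say, 5,000 observations with a complementary coefficient vector yields dummy practices whose sample correlation is close to the target value of 0.5.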

The outcomes of the tests are established using 10% two-sided significance levels. This means that the critical level is equal to 1.65 for the pair-wise tests. For the multiple-restrictions test we also use \( t_{d} = 1.65 \), but \( t_{c} \) equal to 1.96 when there are three practices and 2.24 when there are four practices. The latter values follow from the \( A/2^{n - 2} \) formula with *A* equal to 10% and *n* equal to 3 and 4, respectively. The pair-wise tests consider the sign and *t*-statistic of \( \hat{\alpha }_{12} \). The above procedure has been repeated 10,000 times for models with different explanatory power. Tables 1, 2, 3 and 4 present the results of the Monte Carlo experiments for models with three different values of \( \sigma_{\varepsilon } \): 0.25, 1 and 3.5. These correspond to values of R-squared of approximately 90, 50 and 10% in the case of three practices (Tables 1 and 2). The explanatory power is higher in the case of four practices, with R-squared around 95, 67 and 18%, respectively (Tables 3 and 4). Tables 1 and 3 consider 1,000 observations and Tables 2 and 4 consider 5,000 observations. In each of the experiments we compare the results of the tests with the true states of complementarity and substitutability.
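The Bonferroni-adjusted critical values quoted above can be reproduced from the \( A/2^{n - 2} \) formula. The following sketch uses the large-sample normal quantile (an assumption on our part; the function name is ours):

```python
from statistics import NormalDist

def bonferroni_tc(A=0.10, n=3):
    """Two-sided large-sample critical value for each separate
    hypothesis when the combined level A is split equally across
    the 2**(n-2) inequality constraints."""
    level = A / 2 ** (n - 2)              # per-hypothesis significance level
    return NormalDist().inv_cdf(1 - level / 2)
```

With *A* = 10% this gives 1.96 for three practices (level 5% per constraint) and 2.24 for four practices (level 2.5% per constraint), matching the values used in the experiments.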

Our multiple-restrictions test outperforms both the "single cross-term" and "all cross-term" tests in the large majority of cases. Only in the case of a model with a low fit (\( \sigma_{\varepsilon } \) equal to 3.5) and a relatively low number of observations vis-à-vis the number of practices do the pair-wise tests appear to perform better. The pair-wise tests perform especially poorly in the case of four practices; obviously, in that case there are three further conditions beyond \( \alpha_{12} > 0 \). The pair-wise tests also perform relatively poorly in the high explanatory power models (\( \sigma_{\varepsilon } \) equal to 0.25 or 1). Clearly, the problem of bias outweighs the lower variance of \( \hat{\alpha }_{12} \) in those cases. The pair-wise tests perform much better in relative terms for the models with low *R*^{2}. The "single cross-term" test shows the highest percentage of correct predictions, with for example 63.5% in Table 1 and 71.0% in Table 3. Hence, the simpler tests, which restrict some of the parameters to zero, benefit from low variance, although at the expense of some bias. We conclude that our multiple-restrictions test is a clearly improved testing framework for complementarity or substitutability, but only for models in which practices have a noticeable impact on performance. Otherwise, for three practices, pair-wise tests are easily executed alternatives with relatively good predictive power.

## 5 Conclusion

Recent empirical studies of organizational performance have been concerned with establishing potential complementarity between more than two organizational practices adopted simultaneously. These papers have drawn conclusions on the basis of potentially biased estimates of pair-wise interaction effects between such practices. This paper has developed a consistent and simple testing framework based on multiple inequality constraints, deriving from the definition of (strict) supermodularity as suggested by Athey and Stern (1998), and has compared the performance of this test with previously used methods. Monte Carlo results show that this multiple-restrictions test is generally superior, except in models where the practices have little impact on performance.

## Notes

That is, as long as the population of organizations includes a reasonable number of organizations that adopt non-optimal combinations of practices. In addition, omitted organizational practices may bias the test procedure.

Mohnen and Röller (2005) adopt a multiple-inequality restrictions framework, but it is limited to dichotomous variables and their testing framework has the disadvantage of an inconclusive area.

In case all bilateral combinations of practices satisfy complementarity, the objective function is strictly supermodular.

Practices that are differently scaled may be rescaled to the unit interval [0,1]. For example, a practice *x* that can take any real value, positive or negative, can be rescaled as exp(*x*)/(1 + exp(*x*)).

For a Bayesian approach, see Oh (1998).

There are more sophisticated, modified, Bonferroni procedures, see e.g. Olejnik et al. (1997). These may further improve our test procedure, but go beyond the scope of this note.

Examples include Black and Lynch (2001) with a number of observations of about 1,000, Galia and Legros (2004) with about 1,800, Laursen and Foss (2003) with about 1,900, Belderbos et al. (2006) with about 2,000, Bresnahan et al. (2002) with about 2,200, Catozzella and Vivarelli (2007) with about 3,000, Mohnen and Röller (2005) with about 5,500 and Cozzarin and Percival (2006) with about 5,900 observations.

For comparison, we executed similar Monte Carlo simulations with correlation coefficients set at 0.8, −0.8 and 0, respectively and without systematic correlation between the practices. We found only limited changes in the comparative accuracy of the tests. Obviously, tests of complementarity and substitutability perform better when there is lack of multicollinearity among practices.

## References

Arora A (1996) Testing for complementarities in reduced-form regressions: a note. Econ Lett 50:51–55

Arora A, Gambardella A (1990) Complementarity and external linkages: the strategies of the large firms in biotechnology. J Ind Econ 38:361–379

Athey S, Stern S (1998) An empirical framework for testing theories about complementarity in organizational design. NBER working paper no. 6600

Athey S, Stern S (2002) The impact of information technology on emergency health care outcomes. RAND J Econom 33:399–432

Baumol W, Panzar JC, Willig RD (1988) Contestable markets and the theory of industry structure. Harcourt Brace Jovanovich, San Diego

Belderbos R, Carree M, Lokshin B (2006) Complementarity in R&D cooperation strategies. Rev Ind Organ 28:401–426

Black S, Lynch L (2001) How to compete: the impact of workplace practices and information technology on productivity. Rev Econ Stat 83:434–445

Bresnahan T, Brynjolfsson E, Hitt LM (2002) Information technology, workplace organization, and the demand for skilled labor: firm-level evidence. Quart J Econ 117:339–375

Caroli E, Van Reenen J (2001) Skill-biased organizational change? Evidence from a panel of British and French establishments. Quart J Econ 116:1449–1492

Catozzella A, Vivarelli M (2007) The catalyzing role of in-house R&D in fostering the complementarity of innovative inputs, IZA discussion paper 3126

Cozzarin BP, Percival JC (2006) Complementarities between organizational strategies and innovation. Econ Innov New Technol 15:195–217

Galia F, Legros D (2004) Complementarities between obstacles to innovation: evidence from France. Res Policy 33:1185–1199

Gouriéroux C, Holly A, Monfort A (1982) Likelihood ratio test, Wald test, and Kuhn-Tucker test in linear models with inequality constraints on the regression parameters. Econometrica 50:63–80

Grossman G, Maggi G (2000) Diversity and trade. Am Econ Rev 90:1255–1275

Ichniowski C, Shaw K, Prennushi G (1997) The effects of human resource management practices on productivity. Am Econ Rev 87:291–313

Judge G, Hill R, Griffiths W, Lütkepohl H, Lee T-C (1982) Introduction to the theory and practice of econometrics. Wiley, Hoboken

Kodde D, Palm F (1986) Wald criteria for jointly testing equality and inequality restrictions. Econometrica 54:1243–1248

Laursen K, Foss NJ (2003) New human resource management practices, complementarities and the impact on innovation performance. Cambridge J Econ 27:243–263

Leiponen A (2005) Skills and innovation. Int J Indust Organ 23:303–323

Love JH, Roper S (2009) Organizing the innovation process: complementarities in innovation networking. Ind Innov 16:273–290

Milgrom P, Roberts J (1990) The economics of modern manufacturing: technology, strategy, and organization. Am Econ Rev 80:511–528

Miravete EJ, Pernias JC (2004) Innovation complementarity and scale of production CEPR discussion paper no 4483

Mohnen P, Röller L-H (2005) Complementarities in innovation policy. Eur Econ Rev 49:1431–1450

Oh M-S (1998) A Bayes test for simple versus one-sided hypothesis on the mean vector of a multivariate normal distribution. Commun Stat Theory Methods 27:2371–2389

Olejnik S, Li J, Supattathum S, Huberty CJ (1997) Multiple testing and statistical powers with modified Bonferroni procedures. J Educ Behav Stat 22:389–406

Savin NE (1980) The Bonferroni and the Scheffé multiple comparison procedures. Rev Econ Stud 47:255–273

Wolak F (1989) Testing inequality constraints in linear econometric models. J Econom 41:205–235

## Acknowledgments

The authors thank Daron Acemoglu, Pierre Mohnen, Franz Palm and Scott Stern for helpful comments on earlier versions. The usual disclaimer applies.

### Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.


## Appendix: general overview of variables and hypotheses

The following table provides the relevant variables in the regression equation and the related hypotheses, for up to five practices, to allow for easy extension (Table 5).


## About this article

### Cite this article

Carree, M., Lokshin, B. & Belderbos, R. A note on testing for complementarity and substitutability in the case of multiple practices.
*J Prod Anal* **35**, 263–269 (2011). https://doi.org/10.1007/s11123-010-0189-8
