Skip to main content
Log in

Estimation of endogenous firm productivity without instruments: an application to foreign investment

  • Published:
Journal of Productivity Analysis Aims and scope Submit manuscript

Abstract

We consider identification and estimation of the firm-level gross production function with a controlled productivity evolution process and endogenous contemporaneous productivity determinants in the absence of instrumental variables (IVs). We allow the joint determination of firm productivity and its determinants, such as research and development, exports, and foreign direct investment, which is motivated by the recognition that certain unobserved confounders, such as CEO ability and managerial strategies, may simultaneously affect both productivity and these determinants. This assumption is in sharp contrast to those used in previous studies on proxy-variable estimation of production functions, in which productivity processes without determinants or with exogenous/predetermined determinants are often assumed. Since IVs for endogenous productivity determinants are in general difficult to obtain in the production context, we propose an IV-free generalized method of moments (GMM) estimator based on Lewbel et al. (2023) and use higher-order moments for unobserved errors in addition to conventional orthogonality conditions. We demonstrate the finite sample performance of the proposed estimator through Monte Carlo simulations. We also apply the methodology to investigate the impact of foreign equity on the productivity of Chinese manufacturing firms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2

Similar content being viewed by others

Notes

  1. Besides the proxy variable methods, these approaches include adding individual fixed effects to panel data production models (Hoch 1955, 1962; Mundlak 1961; Mundlak and Hoch 1965), applying dynamic panel methods to firm productivity (Arellano and Bond 1991; Blundell and Bond 1998, 2000), and using input prices as IVs for endogenous inputs (Doraszelski and Jaumandreu 2013, 2018). Unfortunately, these methods can be theoretically problematic, unsatisfactory in practice, and/or demanding on data; see reviews provided by Griliches and Mairesse (1995) and Ackerberg et al. (2015).

  2. See more discussions on this point by De Loecker (2013) and Malikov and Zhao (2021). Such modeling inconsistency and resulted biases have also been well documented in another strand of literature – the stochastic frontier analysis (see, e.g., Wang and Schmidt 2002).

  3. One exception is from Malikov and Zhao (2021). Although the authors study the case of the predetermined and exogenous productivity shifter in their baseline model, they allow for possible correlation between the productivity innovation and shifter in the robustness checks. Our paper can be considered as an extension of their work without productivity spillovers.

  4. We consider the identification and the estimation of a more flexible production function in Appendix A.

  5. It is possible to add productivity shifters in their models, which is explicitly discussed in the footnotes of Ackerberg et al. (2015) and Gandhi et al. (2020).

  6. Another way to see that kit and lit are not valid IVs for Zit is from (3.6). According to our model assumptions, we can rewrite (2.2) as (3.6), where the latent ωit is replaced with its proxy function. Then, we can clearly see that both kit and lit are included in this equation and they can not be treated as IVs of Zit.

  7. We assume that there is no serial correlation of ζit. It implies that, as traditionally assumed in the literature, the persistence of ωit is completely controlled by its lagged term (ωit−1) in its autoregressive process. Economically, given the ambiguous nature and inclusive interpretation of ωit, it means the persistent effects of unobserved confounders, such as the CEO’s ability and management practice, on ωit is captured by the AR(1) nature of the process. It is the changes of the CEO’s ability and management practice in period t that will affect the productivity innovation ξit and the unexplained part of productivity determinants εit simultaneously. We consider the situation of heteroskedasticity of \({({\zeta }_{it},{\nu }_{it},{u}_{it})}^{{\prime} }\) in Appendix D.

  8. For example, both kit and lit belong to \({{{{\mathcal{I}}}}}_{it-1}\) since they are predetermined. Conditions \({\mathbb{E}}({\xi }_{it}| {k}_{it},{l}_{it},{\omega }_{it-1})=0\) are used for identification by Gandhi et al. (2020).

  9. From an economic application perspective, one may have a clear prior for the sign of ρ. In Lewbel et al. (2020) example of a returns to schooling model, ρ > 0 is a natural assumption, since it says that unobserved ability that increases (decreases) education outcomes increases (decreases) wages. In our production context, a better (worse) managed firm, for example, may simultaneously have more (less) productivity innovation and attract more (fewer) foreign investors and subsidies. However, when the unobserved confounders/factors are multidimensional and ζit captures their aggregate effects, one should justify the assumption ρ > 0 or ρ < 0 more carefully based on the context of the specific application.

  10. Additional lagged terms may not help identification under our model assumption since they are not informative conditional on the control variables (1, kit, lit, ωit−1).

  11. Let the statistic of focus be denoted by T. We use its empirical distribution of B bootstrap estimates \(\{{\widehat{T}}_{1},\ldots ,{\widehat{T}}_{B}\}\) to estimate its standard error as \(\sqrt{\frac{1}{B-1}\mathop{\sum }\nolimits_{b = 1}^{B}{({\widehat{T}}_{b}-\bar{T})}^{2}}\), where \(\bar{T}=\frac{1}{B}\mathop{\sum }\nolimits_{b = 1}^{B}{\widehat{T}}_{b}\), and estimate its (1 − a) × 100% confidence bonds as intervals between the [a/2 × 100]th and [(1 − a/2) × 100]th percentiles of the bootstrap distribution.

  12. Price variation is ignored for ease of illustration.

  13. We thank an anonymous reviewer for this remark.

  14. We estimate and report \(\log \rho\) due to identification assumption, ρ > 0. We do not report the intercepts since our generated error terms do not have zero mean.

  15. Interestingly, when Zi,t−1 is used, Gandhi et al.’s approach still gives an consistent estimator for βK. Given the DGP, the true \({y}_{it}^{* }\) can be written as

    $${y}_{it}^{* }={\beta }_{K}{k}_{it}+\left({\gamma }_{0}+{\gamma }_{2}{\alpha }_{0}\right)+\left({\gamma }_{1}+{\gamma }_{2}{\alpha }_{1}\right){\omega }_{it-1}+0\cdot {Z}_{it-1}+\left({\varepsilon }_{it}+{\xi }_{it}+{\eta }_{it}\right),$$
    (4.4)

    where Zit−1 \(\left({\varepsilon }_{it}+{\xi }_{it}+{\eta }_{it}\right)\). \({y}_{it}^{* }={\beta }_{K}{k}_{it}+{\gamma }_{0}+{\gamma }_{1}{\omega }_{it-1}+{\gamma }_{2}{Z}_{it-1}+{\xi }_{it}+{\eta }_{it}\) βK.

  16. These results are reported as a placebo test only, because the consistency of our estimator relies on Assumption 2. When Assumption 2 (and other assumptions) holds, the three weight matrices are the same in the population, and the results from using them should be similar. When Assumption 2 is violated, the coefficient estimators are generally inconsistent in the first place, and how the the coefficient and standard error estimators behave is unknown.

  17. The industrial weighted productivity in 2004 is normalized to one for ease of comparison.

  18. The industrial weighted productivity in 2004 is normalized to one for ease of comparison.

  19. We can first put unknown parameters (βM, βMM, βKM, and βLM) and Λ together and get \(\widehat{{\beta }_{M}{{\Lambda }}}\), \(\widehat{{\beta }_{MM}{{\Lambda }}}\), \(\widehat{{\beta }_{KM}{{\Lambda }}}\), and \(\widehat{{\beta }_{LM}{{\Lambda }}}\) from NLS. Since \(\widehat{{{\Lambda }}}=\frac{1}{N}{\sum }_{i}{\sum }_{t}{\widehat{\eta }}_{it}\), we can obtain \({\widehat{\beta }}_{M}\), \({\widehat{\beta }}_{MM}\), \({\widehat{\beta }}_{KM}\), and \({\widehat{\beta }}_{LM}\) by taking their ratios.

References

  • Ackerberg DA, Caves K, Frazer G (2015) Identification properties of recent production function estimators. Econometrica 83:2411–2451

  • Amiti M, Freund C (2010) The anatomy of China’s export growth. In China’s growing role in world trade, 35–56 (University of Chicago Press)

  • Arellano M, Bond S (1991) Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Rev Econ Stud 58:277–297

    Article  Google Scholar 

  • Baltagi BH, Egger PH, Kesina M (2016) Firm-level productivity spillovers in China’s chemical industry: a spatial Hausman-Taylor approach. J Appl Econom 31:214–248

    Article  Google Scholar 

  • Blundell R, Bond S (1998) Initial conditions and moment restrictions in dynamic panel data models. J Econom 87:115–143

    Article  Google Scholar 

  • Blundell R, Bond S (2000) GMM estimation with persistent panel data: an application to production functions. Econom Rev 19:321–340

    Article  Google Scholar 

  • Bratti M, Miranda A (2011) Endogenous treatment effects for count data models with endogenous participation or sample selection. Health Econ 20:1090–1109

    Article  Google Scholar 

  • De Loecker J (2013) Detecting learning by exporting. Am Econ J: Microecon 5:1–21

    Google Scholar 

  • De Loecker J, Goldberg PK, Khandelwal AK, Pavcnik N (2016) Prices, markups, and trade reform. Econometrica 84:445–510

    Article  Google Scholar 

  • Deb P, Trivedi PK (2006) Specification and simulated likelihood estimation of a non-normal treatment-outcome model with selection: application to health care utilization. Econom J 9:307–331

    Article  Google Scholar 

  • Doraszelski U, Jaumandreu J (2013) R&D and productivity: estimating endogenous productivity. Rev Econ Stud 80:1338–1383

    Article  Google Scholar 

  • Doraszelski U, Jaumandreu J (2018) Measuring the bias of technological change. J Political Econ 126:1027–1084

    Article  Google Scholar 

  • Gandhi A, Navarro S, Rivers DA (2020) On the identification of gross output production functions. J Political Econ 128:2973–3016

    Article  Google Scholar 

  • Grieco PL, Li S, Zhang H (2016) Production function estimation with unobserved input price dispersion. Int Econ Rev 57:665–690

    Article  Google Scholar 

  • Griliches Z, Mairesse J (1995) Production functions: the search for identification. NBER Working Paper

  • Hoch I (1955) Estimation of production function parameters and testing for efficiency. Econometrica 23:325–26

    Google Scholar 

  • Hoch I (1962) Estimation of production function parameters combining time-series and cross-section data. Econometrica: J Econom Soc 34–53

  • Hou Z, Jin M, Kumbhakar SC (2020) Productivity spillovers and human capital: a semiparametric varying coefficient approach. Euro J Oper Res 287:317–330

    Article  Google Scholar 

  • Jin M, Zhao S, Kumbhakar SC (2019) Financial constraints and firm productivity: evidence from Chinese manufacturing. Euro J Oper Research 275:1139–1156

    Article  Google Scholar 

  • Keller W (2010) International trade, foreign direct investment, and technology spillovers. In Handbook of the Economics of Innovation, vol. 2, 793–829 (Elsevier)

  • Keller W, Yeaple SR (2009) Multinational enterprises, international trade, and productivity growth: firm-level evidence from the united states. Rev Econ Stat 91:821–831

    Article  Google Scholar 

  • Kim K, Luo Y, Su Y (2019) A robust approach to estimating production functions: replication of the ACF procedure. J Appl Econom 34:612–619

    Article  Google Scholar 

  • Klein R, Vella F (2010) Estimating a class of triangular simultaneous equations models without exclusion restrictions. J Econom 154:154–164

    Article  Google Scholar 

  • Levinsohn J, Petrin A (2003) Estimating production functions using inputs to control for unobservables. Rev Econ Stud 70:317–341

    Article  Google Scholar 

  • Lewbel A (2012) Using heteroscedasticity to identify and estimate mismeasured and endogenous regressor models. J Bus Econ Stat 30:67–80

    Article  Google Scholar 

  • Lewbel A, Schennach SM, Zhang L (2020) Identification of a triangular two equation system without instruments. Tech Rep, cemmap working paper

  • Lewbel A, Schennach SM, Zhang L (2023) Identification of a triangular two equation system without instruments. J Bus Econ Stat 1–35

  • Li M, Jin M, Kumbhakar SC (2022) Do subsidies increase firm productivity? Evidence from Chinese manufacturing enterprises. Euro J Oper Res

  • Malikov E, Zhang J, Zhao S, Kumbhakar SC (2022) Accounting for cross-location technological heterogeneity in the measurement of operations efficiency and productivity. J Oper Manag 68:153–184

    Article  Google Scholar 

  • Malikov E, Zhao S (2021) On the estimation of cross-firm productivity spillovers with an application to FDI. Rev Econ Stat

  • Malikov E, Zhao S, Kumbhakar SC (2020) Estimation of firm-level productivity in the presence of exports: evidence from China’s manufacturing. J Appl Econom 35:457–480

    Article  Google Scholar 

  • Marschak J, Andrews WH (1944) Random simultaneous equations and the theory of production. Econometrica 143–205

  • Melitz MJ (2003) The impact of trade on intra-industry reallocations and aggregate industry productivity. Econometrica 71:1695–1725

    Article  Google Scholar 

  • Mo J, Qiu LD, Zhang H, Dong X (2021) What you import matters for productivity growth: experience from Chinese manufacturing firms. J Dev Econ 152:102677

    Article  Google Scholar 

  • Mundlak Y (1961) Empirical production function free of management bias. J Farm Econ 43:44–56

    Article  Google Scholar 

  • Mundlak Y, Hoch I (1965) Consequences of alternative specifications in estimation of cobb-douglas production functions. Econometrica: J Econom Soc 814–828

  • Olley GS, Pakes A (1996) The dynamics of productivity in the telecommunications equipment industry. Econometrica 64:1263–1297

    Article  Google Scholar 

  • Park S, Gupta S (2012) Handling endogenous regressors by joint estimation using copulas. Mark Sci 31:567–586

    Article  Google Scholar 

  • Rigobon R (2003) Identification through heteroskedasticity. Rev Econ Stat 85:777–792

    Article  Google Scholar 

  • Roberts MJ, Tybout JR (1997) The decision to export in colombia: an empirical model of entry with sunk costs. Am Econ Rev 87:545–564

    Google Scholar 

  • Thorbecke W, Smith G (2010) How would an appreciation of the Renminbi and other East Asian currencies affect China’s exports? Rev Int Econ 18:95–108

    Article  Google Scholar 

  • Tran KC, Tsionas EG (2015) Endogeneity in stochastic frontier models: Copula approach without external instruments. Econ Lett 133:85–88

    Article  Google Scholar 

  • Upward R, Wang Z, Zheng J (2013) Weighing China’s export basket: the domestic content and technology intensity of Chinese exports. J Comp Econ 41:527–543

    Article  Google Scholar 

  • Wang H-J, Schmidt P (2002) One-step and two-step estimation of the effects of exogenous variables on technical efficiency levels. J Prod Anal 18:129–144

    Article  Google Scholar 

  • Xie M (2023) Research on the International Promotion of Chinese Driven by Foreign Trade: Theory and Empirical Evidence (People’s Publishing House)

  • Zhang H (2017) Static and dynamic gains from costly importing of intermediate inputs: evidence from Colombia. Euro Econ Rev 91:118–145

    Article  Google Scholar 

  • Zhang H (2019) Non-neutral technology, firm heterogeneity, and labor demand. J Dev Econ 140:145–168

    Article  Google Scholar 

  • Zhao S, Qian B, Kumbhakar SC (2020) Estimation of productivity and markups with price dispersion: Evidence from Chinese manufacturing during economic transition. South Econ J 87:666–699

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Minjie Huang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publishers note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

1.1 Additional empirical results

This section includes additional empirical results of firm/industry productivity in Chinese chemical manufacturing sector over the period 2004–2006.

Figure 3 provides the boxplots of firm productivity estimated from our model during the sample period and by ownership. It is apparent that both types of firms become more productive over our sample period – their median values and interquartile boxes shift upward. The estimated annual productivity of the foreign firms is on average larger than that of the domestic firms.

Fig. 3
figure 3

Boxplots of Productivity

Figure 4 shows the density plots of estimated productivity by ownership. Foreign firms have a slightly more left-skewed distribution compared with that of domestic firms, although the distribution range of these two types of firms are similar.

Fig. 4
figure 4

Density of Productivity

To further explore the productivity differentials between domestic and foreign firms at different quantiles of the productivity distribution, we estimate the deciles of estimated productivity for domestic and foreign firms respectively, and then calculate their differences. The numeric results of both models are reported in Table 7. Both methods show that foreign equity can bring productivity gains to foreign-invested firms that are located at different places of the productivity distribution; the productivity differential between foreign and domestic firms becomes larger as the decile increases.

Table 7 Productivity Quantile Comparison

We also estimate the following quantile regression which corresponds to the analysis (5.1) in the main text but controls for the plant size (proxied by the number of employees) as well as the region and year effects:

$${Q}_{t}[{\omega }_{it}]={\beta }_{0,\tau }+{\beta }_{1,\tau }{D}_{F}+{\beta }_{2,\tau }{X}_{it}\ \,{{\mbox{for}}}\,\ \tau \in (0,1),$$
(C.1)

where DF is a dummy variable which equals 1 if firm i is a foreign firm at year t and 0 for a domestic firm, X contains the control variables, e.g., the plant size. We let the quantile index τ take values from 0.1 to 0.9 with 0.05 increments. In Figure 5, the solid horizontal line shows the conditional mean productivity difference with its 95% confidence interval denoted by the dash lines. The quantile productivity differentials are all significantly positive and increasing with the productivity quantile, indicating that the productivity divergence between the foreign and the domestic is more prominent among the more productive firms.

Fig. 5
figure 5

Foreign Productivity Differential Estimates across Productivity Quantiles. Notes: The shaded band corresponds to a 95% confidence interval. The solid horizontal line shows the conditional mean, with the 95% confidence interval denoted by dashed lines

Table 8 reports the Olley and Pakes (1996) decomposition of the output-weighted aggregate industry productivity, Φ, following (5.2) which use Gandhi et al. (2020) estimates.Footnote 18 Similar to the findings in the main text, the production distribution shift dominates the effect of market share reallocation efficiency in both firm groups, though foreign firms perform slightly better in terms of the productivity level and reallocation efficiency.

Table 8 Weighted Productivity Decomposition – Gandhi et al. (2020) Estimator

1.2 Heteroskedasticity adapted estimator

We first extend Lemma 1 of Lewbel et al. (2023) to a conditional version. The proof follows the same arguments used by Lewbel et al. (2023) except that everything now is conditional on X, a vector of exogenous variables. Using the same notation as Lewbel et al. (2023), let Q = W − γY = βU + R, P = W − (γ + β)Y = − βV + R, and Y = U + V. The conditional version of the first equation in Lemma 1 of Lewbel et al. (2023) holds trivially, i.e., \({\mathbb{E}}(QPY| X)=0\). For the conditional version of the second equation, the left hand side is

$$\begin{array}{ll}Cov(QP,{Y}^{2}| X)&=Cov\left[(\beta U+R)(-\beta V+R),{(U+V)}^{2}| X\right]\\ &=Cov\left[{R}^{2}+RU\beta -RV\beta -UV{\beta }^{2},{U}^{2}+2UV+{V}^{2}| X\right]\\ &=Cov\left(-UV{\beta }^{2},{U}^{2}+2UV+{V}^{2}| X\right)\\ &=-{\beta }^{2}E\left[UV({U}^{2}+2UV+{V}^{2})| X\right]\\ &=-2{\beta }^{2}E({U}^{2}| X)E({V}^{2}| X),\end{array}$$

where the third equality holds because U, V and R are mutually independent conditional on X, the fourth equality holds because E(UVX) = E(UX)E(VX) = 0, and the last equality holds because E(U3VX) = E(U3X)E(VX) = 0 and E(UV3X) = E(UX)E(V3X) = 0. Meanwhile, notice that

$$\begin{array}{ll}E(QY| X)E(PY| X)\,=E\left[(\beta U+R)(U+V)| X\right]E\left[(-\beta V+R)(U+V)| X\right]\\ \qquad\qquad\qquad\qquad\quad=E\left[\beta {U}^{2}| X\right]E\left[-\beta {V}^{2}| X\right]\\ \qquad\qquad\qquad\qquad\quad=-{\beta }^{2}E({U}^{2}| X)E({V}^{2}| X)\\ \qquad\qquad\qquad\qquad\quad=-{\beta }^{2}{\sigma }_{U}^{2}(X){\sigma }_{V}^{2}(X).\end{array}$$

Hence,

$$Cov(QP,{Y}^{2}| X)=2E(QY| X)E(PY| X).$$
(D.1)

Note that Eq. (D.1) does not imply the following equality regarding unconditional moments

$$Cov(QP,{Y}^{2})=2E(QY)E(PY)$$

because

$$Cov(QP,{Y}^{2})\,\ne \,E\left[Cov(QP,{Y}^{2}| X)\right]$$
$$E(QY)E(PY)\,\ne \,E\left[E(QY| X)E(PY| X)\right]$$

To show that the estimator in our main text has the potential to be generalized to allow for heteroskedasticity, we demonstrate a simple estimator based on Eq. (D.1) in the case where the exogenous variable X is a scalar discrete variable through simulations.

To sketch ideas, rewrite Eq. (D.1) as

$$E\left[QP{Y}^{2}-2E(QY| X)E(PY| X)| X\right]=0.$$
(D.2)

Two issues emerge regarding the moment condition in Eq. (D.2). First, this is a conditional moment condition, and it implies that any functions of X can be used as IVs to convert Eq. (D.2) into a set of unconditional moments. The literature on how to construct the optimal IV then is relevant in selecting the IVs. Secondly, the conditional expectations E(QYX) and E(PYX) in Eq. (D.2) need to be estimated. Some candidate methods include using polynomials of X, using local kernel polynomials, or in the case that X is discrete, simply evaluating the two conditional means by X. Note that the complexity of the second issue compound with the first because finding the optimal IV requires the knowledge of the functional forms of E(QYX) and E(PYX).

Next, We develop a simple extension of the estimator in the main text to the case with a scalar discrete variable X. This simple setup allows us to estimate E(QYX) and E(PYX) non-parametrically using their sample counterparts. We do not dive into the optimal IV issue but simply select the function A(X) = aX/(1 + δX2) as the IV to use for the Eq. (D.2). This functional form is motivated by the heteroskedasticity functional form often used in the weighted least squares. Simulation results show that the proposed extension works.

The simulation design is similar to that in Section 4 except that the models for Z and ω are heteroskedastic. Specifically, the ω process now is

$${\omega }_{it}={\gamma }_{0}+{\gamma }_{1}{\omega }_{it-1}+{\gamma }_{2}{Z}_{it}+{\gamma }_{X}X+{\xi }_{it},$$

and the three error terms are generated according to

$${\zeta }_{it}={\zeta }_{0}+{\zeta }_{1}(1+\delta X),$$
$${u}_{it}={u}_{0}+{u}_{1}(1+\delta X),$$
$${\nu }_{it}={\nu }_{0}+{\nu }_{1}(1+\delta X),$$

where ζ0 ~ LogNormal (−0.5, 1) − e and e is the exponential constant (approximately, 2.718), u0 and ν0 ~ Gumbel − Γ and Γ is the gamma constant (approximately, 0.577), ζ1, u1 and ν1 ~ N (0, 1), and ζ0, u0, ν0, ζ1, u1, ν1 and X are mutually independent. That is, each error term is a sum of a homoskedastic part and a heteroskedastic part. The homoskedastic parts are essentially the same as those used in Section 4 except that they are now demeaned. The heteroskedastic parts are normal distributions of which the variances depends on the exogenous variable X. The coefficient δ controls the degree of heteroskedasticity. In the design reported here, δ = 2 and X ~ Bernoulli(0.5). Other designs where X is still discrete but takes on more than two outcomes (say uniform discrete random variable over {0, 1,.., 10}) have also been investigated, and the results are similar and thus are committed. The true parameter values are βk = 0.25, βm = 0.65, γ0 = 0.2, γ1 = 0.5, γ2 = 0.5, γX = 0.1, ρ = 0.5, θ0 = 0.01, and θ1 = 0.3.

We use discrete X in this simulation design to simplify the estimation of E(e1,ite3,itX) and E(e2,ite3,itX), as our purpose here is to show that the proposed method has the potential to be generalized to cases with heteroskedasticity. With discrete X, E(e1,ite3,itX) and E(e2,ite3,itX) can simply be estimated by their sample analogs. Estimating E(e1,ite3,itX) and E(e2,ite3,itX) when X is continuous is more difficult and deserves dedicated work in future studies.

We select the unity and a(X) = X/(1 + 2X)2 as the instruments for the conditional moment condition in (D.2). We chose them to (naively) capture the general pattern of the heteroskedasticity in the design. In particular, the denominator (1 + 2X)2 in a(X) is motivated by the fact that the variance of the heteroskedastic part of each error term is proportional to (1 + 2X)2. To some extent it mimics the idea of weighted least squares which downweights those less informative observations with larger noises. Deriving the optimal instruments is not easy here as we do not know the closed functional form of E(e1,ite3,itX) and E(e2,ite3,itX). Hence, the resulting GMM estimator is consistent but not efficient. These instruments suggest that we replace the fourth equation in (3.9) with the the following two conditional moments

$$E\left[{e}_{1,it}{e}_{2,it}{e}_{3,it}^{2}-2E({e}_{1,it}{e}_{3,it}| X)E({e}_{2,it}{e}_{3,it}| X)\right]=0,$$
$$E\left\{\left[{e}_{1,it}{e}_{2,it}{e}_{3,it}^{2}-2E({e}_{1,it}{e}_{3,it}| X)E({e}_{2,it}{e}_{3,it}| X)\right]a(X)\right\}=0.$$

It is worth noting that the first moment above is different from the fourth equation in (3.9) because \(E\left[E({e}_{1,it}{e}_{3,it}| X)E({e}_{2,it}{e}_{3,it}| X)\right]\,\ne \,E({e}_{1,it}{e}_{3,it})E({e}_{2,it}{e}_{3,it})\) in general.

The simulation results are reported in Table 9. The Gandhi et al. estimators using Zit still use the tilde notations, and our GMM estimators use the hat notations. The cross-sectional number of observations is 10,000, and the number of time periods is 5. The number of replications in the Monte Carlo simulation is 1000. The finite sample performance of the heteroskedasticity adapted estimator shows that it is robust to heteroskedasticity at least in the current simple setup.

Table 9 Heteroskedasticity Adapted Estimator

As a comparison, the simulation result for the homoskedastic version proposed in the main text is reported in Table 10. The results show large finite sample biases in the estimate for γ2 (see, e.g., \({\hat{\gamma }}_{2}\) for n = 20,000 in Table 10).

Table 10 Homoskedasticity Estimator

1.3 A translog production function

The proposed model in the main text can be modified to identify and estimate production functions that take more flexible functional forms. In this appendix, we consider a translog (parametric log-quadratic) production function, viz.,

$$\begin{array}{ll}{y}_{it}=&{\beta }_{K}{k}_{it}+{\beta }_{L}{l}_{it}+{\beta }_{M}{m}_{it}+\frac{1}{2}{\beta }_{KK}{k}_{it}^{2}+\frac{1}{2}{\beta }_{LL}{l}_{it}^{2}+\frac{1}{2}{\beta }_{MM}{m}_{it}^{2}\\ &+{\beta }_{KL}{k}_{it}{l}_{it}+{\beta }_{KM}{k}_{it}{m}_{it}+{\beta }_{LM}{l}_{it}{m}_{it}+{\omega }_{it}+{\eta }_{it}.\end{array}$$
(A.1)

The first-order condition of the firm’s short-run profit maximization problem suggests that

$$\ln {s}_{M,it}=\ln \left[\left({\beta }_{M}+{\beta }_{MM}{m}_{it}+{\beta }_{KM}{k}_{it}+{\beta }_{LM}{l}_{it}\right){{\Lambda }}\right]-{\eta }_{it}.$$
(A.2)

It is the counterpart of (3.2) in the main text. Since \({\mathbb{E}}({\eta }_{it}| {k}_{it},{l}_{it},{m}_{it})=0\), the above equation can be estimated from the conventional nonlinear least squares (NLS) method.Footnote 19 This is the first-step of our estimation, from which we get \({\widehat{\beta }}_{M}\), \({\widehat{\beta }}_{MM}\), \({\widehat{\beta }}_{KM}\), and \({\widehat{\beta }}_{LM}\).

In the second step, we redefine \({y}_{it}^{* }\) as \({y}_{it}^{* }={y}_{it}-{\beta }_{M}{m}_{it}-\frac{1}{2}{\beta }_{MM}{m}_{it}^{2}-{\beta }_{KM}{k}_{it}{m}_{it}-{\beta }_{LM}{l}_{it}{m}_{it}\). By definition, we have

$${y}_{it}^{* }={\beta }_{K}{k}_{it}+{\beta }_{L}{l}_{it}+\frac{1}{2}{\beta }_{KK}{k}_{it}^{2}+\frac{1}{2}{\beta }_{LL}{l}_{it}^{2}+{\beta }_{KL}{k}_{it}{l}_{it}+{\gamma }_{0}+{\gamma }_{1}{\omega }_{it-1}+{\gamma }_{2}{Z}_{it}+\rho {\zeta }_{it}+{\nu }_{it}+{\eta }_{it}.$$
(A.3)

The above equation can be used to replace the first equation in system (3.7). We still have the second-step estimator defined in (3.10) with the updated \({x}_{it}={(1,{k}_{it},{l}_{it},{k}_{it}^{2},{l}_{it}^{2},{k}_{it}{l}_{it},{\omega }_{it-1})}^{{\prime} }\). Additional lags of capital and labor can be added into xit. For the unobserved ωit−1, we can get its control function from the firm’s first-order condition with the translog production function form, viz.,

$$\begin{array}{r}{\omega }_{it-1}={m}_{it-1}^{* }-{\beta }_{K}{k}_{it-1}-{\beta }_{L}{l}_{it-1}-\frac{1}{2}{\beta }_{KK}{k}_{it-1}^{2}-\frac{1}{2}{\beta }_{LL}{l}_{it-1}^{2}-{\beta }_{KL}{k}_{it-1}{l}_{it-1},\end{array}$$
(A.4)

where

$$\begin{array}{ll}{m}_{it-1}^{* }\,=\,\ln ({P}_{t}^{M}/{P}_{t}^{Y})-\ln \left[\left({\beta }_{M}+{\beta }_{MM}{m}_{it-1}+{\beta }_{KM}{k}_{it-1}+{\beta }_{LM}{l}_{it-1}\right){{\Lambda }}\right]\\ \qquad\qquad+\,(1-{\beta }_{M}){m}_{it-1}-\frac{1}{2}{\beta }_{MM}{m}_{it-1}^{2}-{\beta }_{KM}{k}_{it-1}{m}_{it-1}-{\beta }_{LM}{l}_{it-1}{m}_{it-1}\end{array}$$
(A.5)

and \({m}_{it-1}^{* }\) is a function with parameters that have already been identified in the first step. Using (A.4) and (A.5), we replace the unobserved ωit−1 with observed data, estimated parameters in the first step, and unknown parameters, when the moment conditions involved in (3.10) are constructed.

1.4 A general framework with nonlinear DGPs

For the ease of notation, we assume linearity for the production function in (2.1), the productivity law of motion in (2.2), and the productivity determinant process in (2.3) in the main text. Although these linear equations are widely-used in the literature of production function and productivity estimation, they are not identification assumptions in our paper. To see that, in this appendix, we consider a general framework, in which the above linearity assumptions are relaxed.

Consider a general production process of firm i in time period t as follows:

$${Y}_{it}=F({K}_{it},{L}_{it},{M}_{it};\beta ){e}^{{\omega }_{it}+{\eta }_{it}},$$
(B.1)

where production function F( ⋅ ) is defined with a finite number of unknown parameters β. Apparently, both the Cobb-Douglas and translog production functions discussed previously are included in (B.1). In general, we can write the ω process as

$${\omega }_{it}=h({\omega }_{it-1},{Z}_{it};\gamma )+{\xi }_{it},$$
(B.2)

and the Z process as

$${Z}_{it}=g({\omega }_{it-1};\alpha )+{\epsilon }_{it},$$
(B.3)

where h( ⋅ ; γ) and g( ⋅ ; α) can be nonlinear functions of their respective arguments, and γ and α are two vectors of unknown parameters of finite dimensions. Note that although we allow the endogenous variable Zit to enter function h( ⋅ ) in a nonlinear way, we may not have enough moment conditions for identification if multiple unknown parameters are attached to Zit. For example, empirical studies often use the second-order polynomial functions to approximate smooth but flexible functional forms. In that case, we have \(h(\cdot ;\gamma )={\gamma }_{0}+{\gamma }_{1}{\omega }_{it-1}+{\gamma }_{2}{Z}_{it}+{\gamma }_{3}{\omega }_{it-1}^{2}\) and \(g(\cdot ;\alpha )={\alpha }_{0}+{\alpha }_{1}{\omega }_{it-1}+{\alpha }_{2}{\omega }_{it-1}^{2}\). We drop \({Z}_{it}^{2}\) and Zitωit−1 from a complete polynomial to approximate function h( ⋅ ) since that leads to multiple endogenous variables and we may not have enough moment conditions.

Make use of the first-order condition, and production function (B.1) implies that the counterpart of (3.6) takes the following form:

$${y}_{it}^{* }=\ln {F}^{* }({K}_{it},{L}_{it};{\beta }^{* })+h({\omega }_{it-1},{Z}_{it};\gamma )+{\xi }_{it}+{\eta }_{it},$$
(B.4)

where the superscript * means that the impact of the flexible input Mit has been removed from the production function after the first step estimation. Using the general processes of ωit and Zit, we can also get the counterpart of (3.8) as

$${y}_{it}^{* }=\ln {F}^{* }({K}_{it},{L}_{it};{\beta }^{* })+h({\omega }_{it-1},{Z}_{it};\gamma )+\rho {Z}_{it}-\rho g({\omega }_{it-1};\alpha )-\rho {u}_{it}+{\nu }_{it}+{\eta }_{it}$$
(B.5)

Update the definitions of the error terms e1,it, e2,it, and e3,it, we have

$$\begin{array}{ll}{e}_{1,it}\,\equiv \rho {\zeta }_{it}+{\nu }_{it}+{\eta }_{it}={y}_{it}^{* }-\ln {F}^{* }({K}_{it},{L}_{it};{\beta }^{* })-h({\omega }_{it-1},{Z}_{it};\gamma ),\\ {e}_{2,it}\,\equiv -\rho {u}_{it}+{\nu }_{it}+{\eta }_{it}={y}_{it}^{* }-\ln {F}^{* }({K}_{it},{L}_{it};{\beta }^{* })-h({\omega }_{it-1},{Z}_{it};\gamma )-\rho {Z}_{it}+\rho g({\omega }_{it-1};\alpha ),\\ {e}_{3,it}\,\equiv {\zeta }_{it}+{u}_{it}={Z}_{it}-g({\omega }_{it-1};\alpha ).\end{array}$$

We can show that the moment conditions in (3.9) do not rely on the functional forms of F*( ⋅ ; β*), h( ⋅ ; γ), and g( ⋅ ; α). Specifically, \({\mathbb{E}}({e}_{1,it}\cdot {x}_{it})=0\) and \({\mathbb{E}}({e}_{3,it}\cdot {x}_{it})=0\) directly follow from our structural model assumption that \({({\zeta }_{it},{\nu }_{it},{u}_{it},{\eta }_{it})}^{{\prime} }\) are contemporaneous errors and uncorrelated with predetermined variables in xit. To verify the condition \({\mathbb{E}}({e}_{1,it}\cdot {e}_{2,it}\cdot {e}_{3,it})=0\), we use the definitions of the error terms:

$${\mathbb{E}}({e}_{1,it}\cdot {e}_{2,it}\cdot {e}_{3,it})={\mathbb{E}}\left((\rho {\zeta }_{it}+{\nu }_{it}+{\eta }_{it})(-\rho {u}_{it}+{\nu }_{it}+{\eta }_{it})({\zeta }_{it}+{u}_{it})\right).$$

Since \({({\zeta }_{it},{\nu }_{it},{u}_{it},{\eta }_{it})}^{{\prime} }\) are mutually independent, after removing the parentheses inside the expectation operator, all terms have the expected value of zero. Then, the condition \({\mathbb{E}}({e}_{1,it}\cdot {e}_{2,it}\cdot {e}_{3,it})=0\) holds. For the last condition \({\mathbb{C}}{{{\rm{ov}}}}({e}_{1,it}\cdot {e}_{2,it},\,{e}_{3,it}^{2})-2\,{\mathbb{E}}({e}_{1,it}\cdot {e}_{3,it})\,{\mathbb{E}}({e}_{2,it}\cdot {e}_{3,it})=0\), we can show that

$${\mathbb{C}}{{{\rm{ov}}}}({e}_{1,it}\cdot {e}_{2,it},\,{e}_{3,it}^{2})={\mathbb{C}}{{{\rm{ov}}}}\left((\rho {\zeta }_{it}+{\nu }_{it}+{\eta }_{it})(-\rho {u}_{it}+{\nu }_{it}+{\eta }_{it}),{({\zeta }_{it}+{u}_{it})}^{2}\right)=-2{\rho }^{2}{\mathbb{E}}({\zeta }_{it}^{2}){\mathbb{E}}({u}_{it}^{2}),$$

and

$${\mathbb{E}}({e}_{1,it}\cdot {e}_{3,it})\,{\mathbb{E}}({e}_{2,it}\cdot {e}_{3,it})={\mathbb{E}}\left((\rho {\zeta }_{it}+{\nu }_{it}+{\eta }_{it})({\zeta }_{it}+{u}_{it})\right){\mathbb{E}}\left((-\rho {u}_{it}+{\nu }_{it}+{\eta }_{it})({\zeta }_{it}+{u}_{it})\right)=-{\rho }^{2}{\mathbb{E}}({\zeta }_{it}^{2}){\mathbb{E}}({u}_{it}^{2}).$$

Then, the last condition also holds. The specific functional forms of F*( ⋅ ; β*), h( ⋅ ; γ), and g( ⋅ ; α) play no role in the above proof.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Jia, F., Huang, M. & Zhao, S. Estimation of endogenous firm productivity without instruments: an application to foreign investment. J Prod Anal 61, 135–155 (2024). https://doi.org/10.1007/s11123-023-00709-9

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11123-023-00709-9

JEL classification

Keywords

Navigation