1 Introduction

Semiconductor devices, commonly known as integrated circuits (ICs) or chips, are the brains of almost all modern electronics. Since the release of the first commercial chips in the 1960s, the semiconductor industry has been a driving force in the electronics market. Before the 1980s, most companies in the semiconductor industry operated under the integrated device manufacturer (IDM) business model, in which a single company carries out all stages of production in-house, including research and development (R&D), front-end fabrication, and back-end assembly and test (A&T). As Moore’s law approaches its physical limits (e.g., see Thompson and Parthasarathy 2006), the expense of manufacturing a leading-edge chip has become prohibitive for all but a few IC suppliers. The rising pressure to develop more advanced technologies, accompanied by exponentially increasing costs, has kept the semiconductor industry on its toes and led to the rise of the fabless-foundry business model in the mid-1980s. In the fabless-foundry model, a fabless company focuses on R&D and sales and partners with pure-play foundries for front-end fabrication and with a third group of companies for back-end outsourced semiconductor assembly and test (OSAT) operations. Vertical specialization, combined with the trend toward globalization, has significantly changed the structure of the semiconductor industry over the past few decades.

Whether specialization through the fabless-foundry business model increases productivity is a topic of broad interest in the semiconductor industry (e.g., see Macher 2006; Dibiaggio 2007). On the one hand, vertical disintegration notably reduces the burden of capital expenditure (CAPEX) and has enabled fabless firms to dominate new markets in the semiconductor industry (e.g., see Balconi and Fontana 2011). On the other hand, despite the trend toward specialization driven by fabless start-ups, IDMs have persisted and continue to coexist with fabless firms (e.g., see Kapoor 2013). Much of the research on structural change in the semiconductor industry emphasizes the impact of technological evolution (e.g., see Kapoor and Adner 2012; Pellens and Della Malva 2018). However, few studies focus on the impact of CAPEX or adequately discuss the effect of the business model. This paper uses a two-stage nonparametric conditional approach to explore the operating performance of companies in the global semiconductor industry. The effects of capital investment and business model are handled by conditional efficiency estimators in the first stage, and the idiosyncratic efficiency, representing the Solow residual (e.g., see Mastromarco and Simar 2015), is extracted from the unexplained part of the conditional efficiencies in the second stage.

In the semiconductor industry, the factors determining production costs and operating efficiencies vary substantially across device types and business models. For cutting-edge products such as microprocessors, manufacturing is capital intensive. The ever-increasing costs of building advanced fabrication facilities and the difficulties that arise from slowing progress in node technology create high barriers to entry and favor large IDMs, such as Intel and Texas Instruments, which can forge ahead with the innovative and expensive technologies needed to take chips to the next level. Similarly, for front-end fabrication and back-end A&T, a significant challenge is the heavy CAPEX for cleanrooms and costly equipment. The fabless-foundry model derives efficiencies from the delineation of tasks and specialization. More specifically, foundries and OSATs, both heavily invested in fixed assets, seek to optimize productivity by serving many fabless companies to achieve high capacity utilization. At the same time, niche fabless start-ups or spinoffs, freed from the CAPEX burden, concentrate on R&D to compete with the IDM giants. Hence CAPEX is an indispensable factor in any performance comparison between the IDM model and the fabless-foundry model in the capital-intensive semiconductor industry.

Data envelopment analysis (DEA) is one of the most popular approaches for performance evaluation, and the semiconductor industry is no exception. Among the many studies that have used DEA for efficiency estimation in the semiconductor industry, many focus on a particular business model, while others limit the analysis to a specific country or area. For example, Chu et al. (2008) measure the relative performance of the top 30 global fabless firms, Chen and Chen (2011) analyze the operating performance of foundries in Taiwan, and Lu et al. (2013) explore the relationship between corporate social responsibility and semiconductor companies’ performance in the US. Though these works examine operating efficiencies in the semiconductor industry from different angles, few adopt a global perspective. Furthermore, because of the slow convergence rate of nonparametric DEA estimators, the limited number of observations in these studies, constrained either by geographic boundaries or by a single business model, may increase estimation error. With increasing numbers of inputs and outputs, the slow convergence rate becomes more severe and may lead to unconvincing results in nonparametric frontier estimation (e.g., see Wilson 2018).

This paper addresses the slow convergence rate of the nonparametric approach in two ways. First, the data cover 470 companies in the semiconductor industry from all over the world, which not only enlarges the sample but also captures a complete picture of the highly globalized semiconductor industry. Second, principal component analysis (PCA) for dimension reduction (e.g., see Daraio and Simar 2007a, pp. 148–150) further reduces estimation error. After filtering out the impacts of CAPEX and business model with the conditional nonparametric frontier framework, a flexible nonparametric location-scale model in a second-stage regression identifies the idiosyncratic efficiencies in the semiconductor industry. The estimation results indicate that the vertically integrated IDM business model is constrained heavily by capital investment, and that the vertically specialized fabless-foundry business model helps to improve the mean pure efficiency and mitigate the cyclical fluctuations in the global semiconductor industry.

The paper is organized as follows. Section 2 introduces the nonparametric frontier framework and discusses the diagnostics and test statistics for choosing a suitable estimator. In Sect. 3, we describe the dataset and define the variables for the estimation. The empirical results on the idiosyncratic efficiency and on the impacts of capital investment and business model in the global semiconductor industry are presented in Sect. 4. Section 5 concludes.

2 Methodology

2.1 Nonparametric production frontier

The economic theory of efficiency in production examines how firms transform their inputs into outputs. Consider a production process where a vector of inputs \(x\in {{\mathbb{R}}}_{+}^{p}\) is used to produce a vector of outputs \(y\in {{\mathbb{R}}}_{+}^{q}\). The production set \(\Psi\) is defined as \(\Psi\) = {(x, y) ∣ x can produce y}, and the efficiency score of a particular production plan (x, y) in \(\Psi\) is determined by the distance from (x, y) to the boundary of \(\Psi\). Besides the commonly used input- and output-oriented Debreu-Farrell measures, the hyperbolic measure

$$\gamma (x,y)=\sup \{\gamma \,| \,({\gamma }^{-1}x,\gamma y)\in {{\Psi }}\}$$
(1.1)

is an alternative to selecting either an input- or output-orientation. Note that the hyperbolic measure in (1.1) is the reciprocal of the hyperbolic graph measure by Färe et al. (1985). Wilson (2011) popularizes the hyperbolic measure and derives its asymptotic properties.

In real-world research problems, the attainable set \(\Psi\) is unobserved. Nonparametric methods such as DEA and Free Disposal Hull (FDH) are widely applied to estimate the unobservable production set \(\Psi\). Let \({{{{{{\mathcal{S}}}}}}}_{n}=\{({X}_{i},{Y}_{i})\}\) denote a random sample of n observations of input-output pairs. Using only the free disposability assumption, Deprins et al. (1984) define the FDH estimator of \(\Psi\) as

$${\widehat{{{\Psi }}}}_{{{{{{\rm{FDH}}}}}}}=\mathop{\bigcup}\limits_{({X}_{i},{Y}_{i})\in {{{{{{\mathcal{S}}}}}}}_{n}}\{(x,y)\,| \,x\ge {X}_{i},y\le {Y}_{i}\}.$$
(1.2)

The FDH estimator of γ(x, y), denoted by \({\widehat{\gamma }}_{{{{{{\rm{FDH}}}}}}}(x,y)\), is obtained by replacing \(\Psi\) with \({\widehat{{{\Psi }}}}_{{{{{{\rm{FDH}}}}}}}\) in (1.1). Alternatively, Banker et al. (1984) propose the variable-returns-to-scale DEA (VRS-DEA) estimator of \(\Psi\), defined as

$${\widehat{{{\Psi }}}}_{{{{{{\rm{VRS}}}}}}}=\{(x,y)| \,y\le \mathop{\sum }\limits_{i=1}^{n}{\omega }_{i}{Y}_{i},\,x\ge \mathop{\sum }\limits_{i=1}^{n}{\omega }_{i}{X}_{i},\mathop{\sum }\limits_{i=1}^{n}{\omega }_{i}=1,{\omega }_{i}\ge 0\},$$
(1.3)

where the weights ωi and the constraint \(\mathop{\sum }\nolimits_{i = 1}^{n}{\omega }_{i}=1\) make \({\widehat{{{\Psi }}}}_{{{{{{\rm{VRS}}}}}}}\) the convex hull of \({\widehat{{{\Psi }}}}_{{{{{{\rm{FDH}}}}}}}\). The corresponding VRS-DEA estimator of γ(x, y), denoted by \({\widehat{\gamma }}_{{{{{{\rm{VRS}}}}}}}(x,y)\), is obtained by replacing \(\Psi\) with \({\widehat{{{\Psi }}}}_{{{{{{\rm{VRS}}}}}}}\) in (1.1). Another DEA estimator, the constant-returns-to-scale (CRS) DEA estimator, relies on more restrictive assumptions and is less widely used. Therefore, DEA refers to the VRS-DEA estimator in the rest of the paper.
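To make the estimators concrete, the following sketch computes the hyperbolic FDH efficiency score in (1.1)–(1.2): free disposability implies the closed form in which, for each observation i, the binding value of γ is min(min_k x_k/X_ik, min_l Y_il/y_l), and the FDH score is the largest such value over all observations. The code (Python with numpy) is only an illustrative sketch under our own names and simulated data, not the implementation of any existing package; the hyperbolic VRS-DEA score has no such closed form and is usually obtained by numerical search (e.g., with the FEAR routines).

import numpy as np

def fdh_hyperbolic(x, y, X, Y):
    """Hyperbolic FDH efficiency score of the point (x, y) in (1.1)-(1.2).

    x : (p,) input vector of the evaluated unit
    y : (q,) output vector of the evaluated unit
    X : (n, p) observed inputs, Y : (n, q) observed outputs
    """
    # Largest gamma with gamma^{-1} x >= X_i (input free disposability) ...
    gamma_in = np.min(x / X, axis=1)       # min_k x_k / X_ik for each i
    # ... and gamma y <= Y_i (output free disposability)
    gamma_out = np.min(Y / y, axis=1)      # min_l Y_il / y_l for each i
    # For each observation the binding side is the smaller of the two;
    # the FDH score takes the best (largest) value over all observations.
    return np.max(np.minimum(gamma_in, gamma_out))

# Illustrative use on simulated data (p = 2 inputs, q = 1 output)
rng = np.random.default_rng(1)
X = rng.uniform(1.0, 10.0, size=(100, 2))
Y = np.sqrt(X.sum(axis=1, keepdims=True)) * rng.uniform(0.5, 1.0, size=(100, 1))
scores = np.array([fdh_hyperbolic(X[i], Y[i], X, Y) for i in range(len(X))])

Scores equal to one lie on the estimated frontier, and larger values indicate greater potential for simultaneously contracting inputs and expanding outputs.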

2.2 Conditional efficiency scores

Beyond the combinations (X, Y) of inputs and outputs, there are factors that may influence the production process but are typically not under the control of a firm’s managers. These factors, denoted by \(Z\in {{\mathbb{R}}}^{r}\), are referred to as environmental factors and may reflect differences in ownership, business model, capital investment, etc. Cazals et al. (2002) and Daraio and Simar (2005, 2007a, 2007b) develop a framework to investigate the joint behavior of (X, Y, Z) in probability terms. Defining the conditional attainable set as \(\Psi\)z = {(x, y) ∣ Z = z, x can produce y}, the distribution of (X, Y) conditional on Z = z is denoted by

$${H}_{X,Y| Z}(x,y| z)=\Pr (X\le x,Y\ge y\,| \,Z=z),$$
(1.4)

which gives the probability that a firm facing environmental conditions z dominates the point (x, y). Introducing environmental factors into (1.1) extends the efficiency score γ(x, y) to its conditional counterpart

$$\gamma (x,y| z)=\sup \{\gamma \,| \,{H}_{X,Y| Z}({\gamma }^{-1}x,\gamma y\,| \,z) \, > \, 0\}.$$
(1.5)

Plugging a nonparametric estimator of HX,Y∣Z(x, y∣z) from a sample Sn = {(Xi, Yi, Zi)} into (1.5) yields an estimator of the conditional efficiency scores. Such a nonparametric estimator of HX,Y∣Z(x, y∣z) may be obtained by standard kernel smoothing, for example,

$${\widehat{H}}_{X,Y| Z}(x,y| z)=\frac{\mathop{\sum }\nolimits_{i = 1}^{n}{{{{{\boldsymbol{I}}}}}}({X}_{i}\le x,{Y}_{i}\ge y){K}_{h}({Z}_{i}-z)}{\mathop{\sum }\nolimits_{i = 1}^{n}{K}_{h}({Z}_{i}-z)},$$
(1.6)

where Kh(⋅) is a kernel function with bounded support and h = (h1, …, hr) is a vector of bandwidths, with r the number of environmental variables. It is well known that the selection of the bandwidth h is of critical importance in kernel smoothing. Hall et al. (2004), Bădin et al. (2010, 2012) and Li et al. (2013) propose selecting the optimal bandwidth by least-squares cross-validation (LSCV). Bădin et al. (2019) enhance the LSCV technique for bandwidth selection with a bootstrap approach.
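Because the kernel in (1.6) has bounded support, only observations with Zi close to z receive positive weight, so the conditional FDH score reduces to the unconditional FDH score computed on a localized reference subsample. The following minimal sketch, reusing the fdh_hyperbolic helper from Sect. 2.1, illustrates this for a single continuous environmental variable with a simple window of half-width h; the names and the uniform window are our illustrative assumptions, not the FEAR implementation.

import numpy as np

def conditional_fdh_hyperbolic(x, y, z, X, Y, Z, h):
    """Conditional hyperbolic FDH score gamma_hat(x, y | z) from (1.5)-(1.6)
    for one continuous environmental variable Z and a compact-support kernel.

    With a bounded-support kernel, observations with |Z_i - z| > h get zero
    weight in (1.6), so conditioning amounts to restricting the reference
    sample to the kernel window around z.
    """
    keep = np.abs(Z - z) <= h              # kernel window around z
    if not np.any(keep):
        raise ValueError("bandwidth too small: empty reference window")
    return fdh_hyperbolic(x, y, X[keep], Y[keep])

# Conditional scores for every unit in the sample (Z is an (n,) vector):
# cond_scores = [conditional_fdh_hyperbolic(X[i], Y[i], Z[i], X, Y, Z, h)
#                for i in range(len(X))]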

2.3 Separability condition and second-stage regression

By construction, the conditional attainable set \(\Psi\)z is a subset of \(\Psi\) and \(\Psi\) = ⋃z∈Z \(\Psi\)z. There is a particular case, called the separability condition in Simar and Wilson (2007), where Z has no impact on the boundary of \(\Psi\) and

$${{{\Psi }}}^{z}={{\Psi }}$$
(1.7)

for all z ∈ Z. Simar and Wilson (2007, 2011) emphasize that a naive regression in a second-stage analysis may provide inconsistent estimates if the separability condition is violated. Daraio et al. (2018) demonstrate that standard central limit theorem results do not hold for means of nonparametric conditional efficiency estimators and construct a separability test based on new central limit theorems. Simar and Wilson (2020) enhance the separability test with a bootstrap approach.

If the separability condition in (1.7) does not hold, Bădin et al. (2012, 2014) suggest a flexible nonparametric location-scale model

$$\gamma (X,Y| Z=z)=\mu (z)+\sigma (z)\varepsilon$$
(1.8)

in a second-stage regression, where μ(z) measures the average effect of z on the efficiency and σ(z) provides additional information on the dispersion of the efficiency distribution as a function of z. The error term ε represents the unexplained part of the conditional efficiency measure and is called the idiosyncratic efficiency or pure efficiency. Mastromarco and Simar (2015) further interpret the pure efficiency as a measure of the Solow residual, which proxies for the aggregate effect of factors not included among the production inputs or the conditions Z.
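As a rough illustration of how (1.8) can be taken to data, the sketch below estimates μ(z) and σ(z) with simple Nadaraya-Watson (local-constant) smoothers and backs out the residual ε. The Gaussian kernel, the fixed bandwidth, and the function names are our illustrative assumptions; the cited papers use data-driven bandwidths and more refined smoothers.

import numpy as np

def nw_smoother(z_eval, Z, values, h):
    """Nadaraya-Watson (local-constant) kernel regression of `values` on Z,
    evaluated at the points in z_eval (Gaussian kernel, bandwidth h)."""
    W = np.exp(-0.5 * ((np.asarray(z_eval)[:, None] - Z[None, :]) / h) ** 2)
    W /= W.sum(axis=1, keepdims=True)      # normalize the kernel weights
    return W @ values

def location_scale_residuals(gamma_cond, Z, h):
    """Decompose conditional scores via the location-scale model (1.8):
    gamma(X, Y | Z = z) = mu(z) + sigma(z) * eps; returns mu, sigma, eps."""
    mu_hat = nw_smoother(Z, Z, gamma_cond, h)                    # mu(z)
    var_hat = nw_smoother(Z, Z, (gamma_cond - mu_hat) ** 2, h)   # sigma^2(z)
    sigma_hat = np.sqrt(var_hat)
    eps_hat = (gamma_cond - mu_hat) / sigma_hat                  # pure efficiency
    return mu_hat, sigma_hat, eps_hat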

2.4 Tradeoff between FDH and DEA estimators

The tradeoff between the FDH and DEA estimators is nontrivial. Simar and Wilson (2015) provide a survey of nonparametric frontier models and note that, under appropriate assumptions, the FDH and DEA estimators converge to limiting distributions at rates \({n}^{\frac{1}{p+q}}\) and \({n}^{\frac{2}{p+q+1}}\), respectively. In either case, for a fixed sample size n the convergence rate slows down as the dimensionality (p + q) increases, and estimation errors grow accordingly. This phenomenon is often referred to as the curse of dimensionality. A feasible approach to mitigate the curse of dimensionality is either to increase the sample size n or to reduce the total dimension (p + q). If the sample size n is limited by real-world constraints, including industry structure, geographic restrictions, and high data collection costs, dimension reduction becomes an attractive way to escape the curse of dimensionality.

2.4.1 Diagnosing the curse of dimensionality

The PCA strategy for dimension reduction amounts to mapping the (p × n) matrix X and the (q × n) matrix Y to (1 × n) matrices \({{\Lambda }}^{\prime} {{{{{\boldsymbol{X}}}}}}\) and \({{\Lambda }}^{\prime} {{{{{\boldsymbol{Y}}}}}}\) by pre-multiplying by the first eigenvectors \({{{\Lambda }}}_{{x}_{1}}\) and \({{{\Lambda }}}_{{y}_{1}}\) of the moment matrices \({{{{{\boldsymbol{X}}}}}}{{{{{\boldsymbol{X}}}}}}^{\prime}\) and \({{{{{\boldsymbol{Y}}}}}}{{{{{\boldsymbol{Y}}}}}}^{\prime}\), respectively. Though it is hard to give theoretical guidelines on when dimension reduction should be used, Wilson (2018) proposes three diagnostics for empirical work. The first diagnostic is to compute the effective parametric sample size m. Assume a nonparametric estimator based on n observations has the same level of measurement error as a parametric estimator based on m observations. Equating the nonparametric convergence rate \({n}^{\kappa }\) to the parametric convergence rate \({m}^{\frac{1}{2}}\), we can compute the effective parametric sample size of the nonparametric estimator as \(m\approx \lfloor {n}^{2\kappa }\rceil\), where ⌊a⌉ denotes the integer nearest a. Hence rules of thumb for the minimum sample size in parametric estimation can serve as a comparative guideline in nonparametric estimation.

The second diagnostic is to consider the proportion of the n observations with efficiency scores equal to one. If more than 25%–50% of the observations have efficiency scores equal to one, the estimation results are not convincing. A third diagnostic is to examine the ratios Rx and Ry of the largest eigenvalue of the moment matrices \({{{{{\boldsymbol{X}}}}}}{{{{{\boldsymbol{X}}}}}}^{\prime}\) and \({{{{{\boldsymbol{Y}}}}}}{{{{{\boldsymbol{Y}}}}}}^{\prime}\) to the corresponding sums of eigenvalues of \({{{{{\boldsymbol{X}}}}}}{{{{{\boldsymbol{X}}}}}}^{\prime}\) and \({{{{{\boldsymbol{Y}}}}}}{{{{{\boldsymbol{Y}}}}}}^{\prime}\), respectively. The ratios Rx and Ry measure how close the corresponding moment matrices are to rank one, regardless of the joint distributions of inputs and outputs. For example, if Rx = 0.9, the dimension-reduced matrix \({{\Lambda }}^{\prime} {{{{{\boldsymbol{X}}}}}}\) contains 90% of the independent linear information in the original matrix X. If the diagnostics indicate an excessive number of inputs or outputs, Wilson (2018) proposes standardizing the matrix X or Y before PCA so that the inputs or outputs are on the same scale.
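The three diagnostics are straightforward to compute. The sketch below follows the definitions summarized above and also shows the projection onto the first principal component; the rounding rule, the standardization by row standard deviations, and the function names are our simple illustrative choices.

import numpy as np

def effective_sample_size(n, p, q):
    """Effective parametric sample size m = round(n^(2*kappa)) for the FDH
    rate (kappa = 1/(p+q)) and the VRS-DEA rate (kappa = 2/(p+q+1))."""
    m_fdh = int(round(n ** (2.0 / (p + q))))
    m_dea = int(round(n ** (4.0 / (p + q + 1))))
    return m_fdh, m_dea

def share_of_unit_scores(scores, tol=1e-10):
    """Proportion of observations whose efficiency score equals one."""
    return np.mean(np.abs(np.asarray(scores) - 1.0) < tol)

def eigenvalue_ratio(M, standardize=True):
    """Ratio R of the largest eigenvalue of M M' to the sum of its
    eigenvalues, for a (p x n) input matrix or (q x n) output matrix;
    values near one indicate the moment matrix is close to rank one."""
    if standardize:                        # put the rows on a common scale
        M = M / M.std(axis=1, keepdims=True)
    eigvals = np.linalg.eigvalsh(M @ M.T)
    return eigvals.max() / eigvals.sum()

def first_pc_projection(M):
    """Dimension reduction: project the (p x n) matrix M onto the first
    eigenvector of M M', yielding a (1 x n) aggregate input (or output).
    The projection is defined up to sign; standardize M first if the rows
    are on very different scales."""
    eigvals, eigvecs = np.linalg.eigh(M @ M.T)
    lead = eigvecs[:, np.argmax(eigvals)]
    return lead @ M

For instance, effective_sample_size(125, 4, 2) returns (5, 16), matching the calculation reported in Sect. 3.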

2.4.2 Testing convexity of the attainable set

After the dimension-reduction diagnostics, the choice between the FDH estimator and the DEA estimator can be made by data-driven hypothesis testing. Since \({\widehat{{{\Psi }}}}_{{{{{{\rm{VRS}}}}}}}\) in (1.3) is defined as the convex hull of \({\widehat{{{\Psi }}}}_{{{{{{\rm{FDH}}}}}}}\) in (1.2), Kneip et al. (2015, 2016) construct a test of convexity, using new central limit theorems, for the tradeoff between \({\widehat{{{\Psi }}}}_{{{{{{\rm{FDH}}}}}}}\) and \({\widehat{{{\Psi }}}}_{{{{{{\rm{VRS}}}}}}}\). Specifically, the test statistic τ proposed by Kneip et al. (2016) is computed as

$$\widehat{\tau }=\frac{({\widehat{\mu }}_{{{{{{\rm{FDH}}}}}},{n}_{2}}-{\widehat{\mu }}_{{{{{{\rm{VRS}}}}}},{n}_{1}})-({\widehat{B}}_{{{{{{\rm{FDH}}}}}},{n}_{2}}-{\widehat{B}}_{{{{{{\rm{VRS}}}}}},{n}_{1}})}{\sqrt{\frac{{\widehat{\sigma }}_{{{{{{\rm{FDH}}}}}},{n}_{2}}^{2}}{{n}_{2}}+\frac{{\widehat{\sigma }}_{{{{{{\rm{VRS}}}}}},{n}_{1}}^{2}}{{n}_{1}}}}\mathop{\longrightarrow }\limits^{{{{{{\mathcal{L}}}}}}}N(0,1),$$
(1.9)

where the sample Sn is randomly split into two mutually exclusive and collectively exhaustive subsets Sn1 and Sn2, and \({\widehat{\mu }}_{{{{{{\rm{VRS}}}}}},{n}_{1}}\), \({\widehat{\mu }}_{{{{{{\rm{FDH}}}}}},{n}_{2}}\) and \({\widehat{\sigma }}_{{{{{{\rm{VRS}}}}}},{n}_{1}}^{2}\), \({\widehat{\sigma }}_{{{{{{\rm{FDH}}}}}},{n}_{2}}^{2}\) are the means and variances of the DEA and FDH efficiency estimates on \({{{{{{\mathcal{S}}}}}}}_{{n}_{1}}\) and \({{{{{{\mathcal{S}}}}}}}_{{n}_{2}}\), respectively. The bias-correction terms \({\widehat{B}}_{{{{{{\rm{VRS}}}}}},{n}_{1}}\) and \({\widehat{B}}_{{{{{{\rm{FDH}}}}}},{n}_{2}}\) are computed by generalized jackknife estimation (see Kneip et al. 2016 for details).
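Assuming the subsample efficiency estimates and the generalized-jackknife bias terms have already been obtained (for example with the FEAR routines), the statistic in (1.9) is easy to assemble. The sketch below is illustrative only, and the argument names are ours.

import numpy as np

def convexity_test_statistic(gamma_fdh_n2, gamma_vrs_n1, bias_fdh_n2, bias_vrs_n1):
    """Assemble the Kneip et al. (2016) convexity test statistic in (1.9).

    gamma_fdh_n2 : FDH efficiency estimates on subsample S_{n2}
    gamma_vrs_n1 : VRS-DEA efficiency estimates on subsample S_{n1}
    bias_fdh_n2, bias_vrs_n1 : generalized-jackknife bias estimates
    The returned statistic is compared with standard normal quantiles.
    """
    gamma_fdh_n2 = np.asarray(gamma_fdh_n2, dtype=float)
    gamma_vrs_n1 = np.asarray(gamma_vrs_n1, dtype=float)
    n1, n2 = len(gamma_vrs_n1), len(gamma_fdh_n2)
    mean_diff = gamma_fdh_n2.mean() - gamma_vrs_n1.mean()
    bias_diff = bias_fdh_n2 - bias_vrs_n1
    se = np.sqrt(gamma_fdh_n2.var(ddof=1) / n2 + gamma_vrs_n1.var(ddof=1) / n1)
    return (mean_diff - bias_diff) / se

In practice the computation would be repeated over many random splits and aggregated, as in the bootstrap of Simar and Wilson (2020) discussed below.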

In practice, if the null hypothesis of convexity is rejected, there is evidence that the production set \(\Psi\) is nonconvex, so the FDH estimator is the only consistent estimator. Alternatively, if the null hypothesis of convexity is not rejected, the DEA estimator may be preferred because of its faster convergence rate. In the latter case, a further test of returns to scale can be carried out to choose between the VRS-DEA and CRS-DEA estimators. However, the convexity test in (1.9) depends on randomly splitting the original sample into two independent subsamples \({S}_{{n}_{1}}\) and \({S}_{{n}_{2}}\) to calculate the bias terms, which introduces ambiguity in practice. Simar and Wilson (2020) develop a generalized bootstrap algorithm that eliminates much of this ambiguity by repeating the random splits a large number of times. The FEAR library introduced by Wilson (2008) provides convenient tools for both the convexity test and the separability test.

3 Data and variable specification

The data are collected from the Semiconductors sub-industry of the Compustat database over the 20 years from 1999 to 2018. Since the semiconductor value chain is highly globalized, we combine data from the Compustat North America database and the Compustat Global database to cover companies across the whole industry. We exclude photovoltaic producers, liquid crystal display manufacturers, and light-emitting diode manufacturers from the dataset, restricting the sample to IC companies. The resulting sample includes 5,136 observations on 470 unique companies over 1999–2018. Table 1 breaks down the 5,136 observations in the global semiconductor industry by business model from 1999 to 2018. Over half of the companies are fabless because the entry barriers, which mainly depend on CAPEX, are much lower for fabless firms than for the other business models. At the same time, the number of firms operating under each business model remains relatively stable. After the golden decade of fast growth in the semiconductor industry came to an end in the mid-2000s (e.g., see Flamm 2017), the proportions of firms in each business model gradually stabilized.

Table 1 Number of observations by business model.

We specify p = 4 inputs (labor, measured by the number of employees (X1); cost of goods sold (X2); R&D expenditure (X3); and sales & marketing expenditure (X4)) and q = 2 outputs (revenue (Y1); and shareholders’ equity (Y2)). In addition, we specify r = 2 environmental variables (fixed assets, measured by property, plant, and equipment (Z1); and business model (Z2)). The environmental variable Z1 is a continuous variable representing the level of capital investment; it measures the fixed, tangible assets of a company that cannot quickly be converted into cash. The environmental variable Z2 is a discrete variable that categorizes the four business models, namely IDM, fabless, foundry, and OSAT, into three groups; the foundries and OSATs, both of which are capital intensive, are naturally combined into one group. Table 2 gives summary statistics for the original variables in the 1999–2018 pooled data. Except for the discrete variable Z2, the distributions of the continuous variables are heavily skewed to the right, owing to a few tech giants such as Intel and Qualcomm that dominate the semiconductor industry.

Table 2 Summary statistics for 1999–2018 pooled data.

According to the discussion of the curse of dimensionality in Sect. 2.4.1, it is no surprise that without dimension reduction the effective parametric sample size m for either the FDH or the DEA estimator is small. For example, with n = 125 observations in 1999 the effective parametric sample size m is \(12{5}^{\frac{1}{3}}\approx 5\) for the FDH estimator and \(12{5}^{\frac{4}{7}}\approx 16\) for the DEA estimator without dimension reduction. In contrast, with dimension reduction the effective parametric sample size m increases to \(12{5}^{1}=125\) for the FDH estimator and \(12{5}^{\frac{4}{3}}=625\) for the DEA estimator. In the meantime, the eigensystem analysis in Table 3 indicates that the first principal component of the moment matrices contains most of the independent information (from the lowest ratio of 88.47% for Rx in 2014 to the highest ratio of 99.46% for Ry in 1999) in the p = 4 inputs and q = 2 outputs specified above. The evidence is clear that dimension reduction likely reduces measurement error relative to what would be obtained in the full, six-dimensional space. Therefore, all of the following results are based on the dimension-reduced data.

Table 3 Eigensystem analysis by year.

4 Empirical results

Among studies that use nonparametric frontier approaches to estimate efficiency and benchmark performance in the semiconductor industry, the vast majority choose the DEA estimator without weighing the pros and cons of the FDH and DEA estimators. As discussed in Sect. 2, the drawback of the DEA estimator is that it imposes the convexity assumption on the production set \(\Psi\), while the FDH estimator is consistent regardless of the convexity of \(\Psi\). The statistic τ1 in Table 4 reports the results of the convexity test developed by Kneip et al. (2016) with the bootstrap method of Simar and Wilson (2020). The efficiency scores are calculated with the hyperbolic measure defined in (1.1), and the bootstrap uses 100 splits and 1000 replications to eliminate most of the uncertainty surrounding the random splitting. At the 95% confidence level, the null hypothesis of convexity is rejected in 17 of the 20 annual datasets; the exceptions are 2009, 2011, and 2012. Therefore, we use the FDH estimator for the remainder of the analysis.

Table 4 Test of convexity and test of separability with respect to year.

The statistic τ2 in Table 4 reports the results of the separability test with respect to time, using the approach of Daraio et al. (2018) and the bootstrap method of Simar and Wilson (2020). Taking the year as the unit of time, we treat it as a discrete variable and run the separability test for each pair of adjacent years. The null hypotheses are rejected for all pairs of adjacent years, providing clear evidence that the production frontiers differ across adjacent years. Hence each year is treated individually in the fast-changing semiconductor industry.

Table 5 presents the results of the separability test with respect to the environmental variables Z1 and Z2. The statistic τ3 in Table 5 shows that, conditional on fixed assets Z1, the frontiers differ in over two-thirds of the years, so that naive regressions in a second-stage analysis would not yield consistent estimates. For the separability test with respect to the discrete variable Z2, the statistics τ4–τ6 provide pairwise comparisons of the separability conditions among IDM, fabless, and foundry plus OSAT. In every pairwise comparison, the null hypothesis is rejected for most of the years. This is consistent with the discussion of the business models in Sect. 1: conditional on the business model, the labor-intensive fabless firms, the capital-intensive foundries and OSATs, and the IDMs, which are both labor and capital intensive, face different production frontiers.

Table 5 Test of separability with respect to Z1 and Z2.

Tables 6, 7, 8 present summary statistics of the unconditional and conditional efficiency scores. The efficiency scores are measured by the hyperbolic FDH estimator and estimated on an annual basis because the production frontiers differ across years. For the continuous environmental variable Z1, we use the LSCV criterion proposed by Li et al. (2013) to choose the optimal bandwidth \({h}_{{Z}_{1}}^{* }\) for the kernel smoothing in (1.6). By minimizing

$${{{{{\rm{CV}}}}}}(h)=\frac{\mathop{\sum }\nolimits_{i = 1}^{n}\mathop{\sum }\nolimits_{j\ne i}^{n}{\left[{{{{{\boldsymbol{I}}}}}}({x}_{i}\le {x}_{j},{y}_{i}\ge {y}_{j})-\frac{\frac{1}{n}\mathop{\sum }\nolimits_{k\ne i}^{n}{{{{{\rm{I}}}}}}({x}_{k}\le {x}_{j},{y}_{k}\ge {y}_{j}){K}_{h}({z}_{i},{z}_{k})}{\frac{1}{n-1}\mathop{\sum }\nolimits_{k\ne i}^{n}{K}_{h}({z}_{i},{z}_{k})}\right]}^{2}}{n(n-1)},$$
(2.1)

the optimal bandwidth \({h}_{{Z}_{1}}^{* }\) is found to be 922. For the discrete environmental variable Z2, no smoothing is needed: the conditional efficiency estimation under Z2 is simply the unconditional efficiency estimation on the subset of the sample defined by Z2.
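For reference, the criterion in (2.1) can be transcribed directly at O(n²) cost. The sketch below covers the case of a single continuous Z; the Gaussian kernel and the function names are our illustrative assumptions.

import numpy as np

def lscv_criterion(h, X, Y, Z):
    """Least-squares cross-validation criterion CV(h) of (2.1) for one
    continuous environmental variable Z (Gaussian kernel, illustrative)."""
    n = len(Z)
    # D[i, j] = I(x_i <= x_j, y_i >= y_j): does observation i dominate j?
    D = (np.all(X[:, None, :] <= X[None, :, :], axis=2) &
         np.all(Y[:, None, :] >= Y[None, :, :], axis=2)).astype(float)
    # Kernel weights K_h(z_i, z_k); zero the diagonal for the k != i sums
    K = np.exp(-0.5 * ((Z[:, None] - Z[None, :]) / h) ** 2)
    np.fill_diagonal(K, 0.0)
    # Leave-one-out kernel-weighted estimate of H_{X,Y|Z}(x_j, y_j | z_i)
    num = (K @ D) / n
    den = K.sum(axis=1, keepdims=True) / (n - 1)
    H_loo = num / den
    # Average squared deviation over all pairs with j != i
    resid = (D - H_loo) ** 2
    np.fill_diagonal(resid, 0.0)
    return resid.sum() / (n * (n - 1))

The optimal bandwidth is then the minimizer of CV(h), found for instance with scipy.optimize.minimize_scalar over a plausible range of bandwidths.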

Table 6 Summary statistics for unconditional efficiency scores \(\widehat{\gamma }(x,y)\).
Table 7 Summary statistics for conditional efficiency scores \(\widehat{\gamma }(x,y| {z}_{1})\).
Table 8 Summary statistics for conditional efficiency scores \(\widehat{\gamma }(x,y| {z}_{1},{z}_{2})\).

As shown in Table 5, the separability conditions do not hold. We therefore use the method introduced in Bădin et al. (2012) to disentangle the impacts of the environmental variables into an effect on the shift of the frontier and an effect on the distribution of the inefficiencies. The effect of the environmental variables on the shift of the frontier can be investigated by considering the ratio of conditional to unconditional efficiency scores. Based on the hyperbolic measure, this ratio can be expressed as

$${R}_{h}(x,y| z)=\frac{\gamma (x,y| z)}{\gamma (x,y)}$$
(2.2)

which is bounded by \({R}_{h}(x,y| z)\in \left(0,1\right]\).

Figure 1 displays the effect of fixed assets Z1 on the ratios \({\widehat{R}}_{h}(x,y| {z}_{1})\) and \({\widehat{R}}_{h}(x,y| {z}_{1},{z}_{2})\) over the 20 years from 1999 to 2018. Since the three-dimensional panels in Fig. 1 are not easy to visualize, we compare the mean ratios \(\overline{{R}_{h}}(x,y| {z}_{1})\) and \(\overline{{R}_{h}}(x,y| {z}_{1},{z}_{2})\) for the different business models over time in Fig. 2. In Fig. 2, the mean-ratio curves for IDMs differ notably from those for fabless and for foundry plus OSAT, whether conditioning on Z1 alone or on both Z1 and Z2. Figure 2 thus suggests that the impact of capital investment unambiguously shifts the production frontiers for the highly capital-intensive IDM business model. In contrast, for fabless, foundry, and OSAT firms, capital investment has no clear impact on the production frontiers over time.

Fig. 1

Effects of PP&E (Z1) on ratios \({\widehat{R}}_{h}(x,y| {z}_{1})\) and \({\widehat{R}}_{h}(x,y| {z}_{1},{z}_{2})\) over 1999–2018.

Fig. 2

Mean ratios \(\overline{{R}_{h}}(x,y| {z}_{1})\) and \(\overline{{R}_{h}}(x,y| {z}_{1},{z}_{2})\) over 1999–2018.

The effects of environmental variables on the distribution of the inefficiencies can be investigated by the nonparametric location-scale model in (1.8). Bădin et al. (2012) define the pure efficiency as

$${\widehat{\varepsilon }}_{i}(t)=\frac{{\widehat{\gamma }}_{t}({X}_{it},{Y}_{it}| {Z}_{it})-\widehat{\mu }({Z}_{it},t)}{\widehat{\sigma }({Z}_{it},t)},$$
(2.3)

which is the estimated residual of the second-stage regression for the unbalanced panel data. The pure efficiency \({\widehat{\varepsilon }}_{i}(t)\) in (2.3) is purged of the influence of the environmental factors. Mastromarco and Simar (2015) indicate that the pure efficiency provides a new measure of the Solow residual by proxying for the aggregate effect of factors not included among the production inputs or environmental variables. Specifically, the Solow residual is calculated as

$${\widehat{{{{{{\rm{Solow}}}}}}}}_{i}(t)={e}^{-({\widehat{\varepsilon }}_{i}(t)-\mathop{{{{\min}}}}\limits_{j}{\widehat{\varepsilon }}_{j}(t))},$$
(2.4)

which lies in (0, 1] to ease interpretation.
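Given the pure-efficiency residuals of a given year (for example from the location-scale sketch in Sect. 2.3), the normalization in (2.4) is a one-liner; applying it year by year to the unbalanced panel is the illustrative assumption here.

import numpy as np

def solow_residual(eps):
    """Map the pure-efficiency residuals eps_i(t) of a given year t to the
    Solow-residual scale of (2.4), which lies in (0, 1]; the observation
    with the smallest residual is mapped to one."""
    eps = np.asarray(eps, dtype=float)
    return np.exp(-(eps - eps.min()))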

Table 9 reports the mean Solow residual when filtering out Z1 only and when filtering out both Z1 and Z2. When only the effect of Z1 is filtered out, the mean Solow residuals are relatively high and statistically significant. When the effects of both Z1 and Z2 are filtered out, the mean Solow residuals are relatively low and statistically insignificant. Figure 3 plots the evolution of the mean Solow residuals in Table 9 over time with confidence intervals. Comparing the curve of mean Solow residuals that filters out only the impact of capital investment Z1 with the curve that filters out the impacts of both capital investment Z1 and business model Z2, the former lies at a higher level and fluctuates less over time. Figure 3 thus demonstrates that specialization by business model not only helps to improve the pure efficiency but also helps to mitigate the impact of business cycles in the global semiconductor industry.

Table 9 Mean Solow residual.
Fig. 3

Evolution of mean Solow residual over 1999–2018.

5 Summary and conclusions

It is well known that the barriers to entry are extremely high in the capital-intensive semiconductor industry, especially in the manufacturing segment. Taking advantage of the economic moat created by colossal capital investments and economies of scale, the incumbent IDMs dominated the semiconductor industry for a long time. Nonetheless, with the ever-expanding complexity of ICs and accelerating technology iterations, betting on new technologies and processes to stay ahead of the pack has become a heavy burden even for the dominant IDMs. The rise of the fabless-foundry business model diversifies the financial risks of capital investment across specialized R&D, front-end fabrication, and back-end A&T segments. The decentralized collaboration of the fabless-foundry alliance drastically reduces the barriers to entry into the globalized semiconductor value chain and has led to a flourishing of fabless design houses.

Using a flexible nonparametric conditional frontier approach, we measure the effect of the business model and the constraint of capital investment in the global semiconductor industry. The empirical results are consistent with the pros-and-cons analysis: the entry barriers created by capital investment clearly shift the efficient frontier of the vertically integrated IDMs, but have no clear effect on the efficient frontiers of the fabless-foundry models. The Solow residual estimated in a second-stage nonparametric regression reveals that the mutually beneficial fabless-foundry partnership helps to improve pure operating efficiency and mitigate the impact of business cycles in the global semiconductor industry.

However, the distinction between the IDM model and the fabless-foundry model is fading. Because of the constant and costly need to upgrade manufacturing facilities to keep up with technological advances, several IDMs now contract with foundries to manufacture specific chips while performing the remaining tasks internally. The complementarity between IDMs and fabless-foundry firms resembles a symbiotic relationship in the semiconductor ecosystem, which enhances competitiveness through increasing specialization in each segment of the value chain. This ecosystem improves the overall competitiveness of the global semiconductor industry in capabilities, product diversity, and technological advancement, and enables the entire semiconductor industry to thrive and prosper.