Is the productivity premium of internationalized firms technology-driven?

We ask whether the productivity advantage of internationalized firms documented by the international trade literature can be interpreted most accurately in terms of proximity to the “technological frontier”. We answer in the affirmative using a methodology (based on mixture models) of unbundling technology and total factor productivity (TFP) by estimating “technology-specific” production function parameters. Exploiting detailed data provided by the EFIGE database (a sample of firms distributed across Austria, France, Germany, Hungary, Italy, Spain, and the UK), we find technology gaps (with respect to the frontier) more than three times larger than the TFP gaps on average. We also find sizable technology advantages for firms undertaking foreign direct investment and/or exporting to other European Union countries or to China, for importers of materials, and for firms with competitors in China and the USA. Medium and large firms feature a higher technology premium, which is even higher for firms operating in country-sectors that are more exposed to import competition from China. Younger firms use better technologies but less effectively.


Introduction
Productivity differentials across economic units are commonly interpreted in terms of "technological" heterogeneity, independently of being measured as labor productivity or total factor productivity (TFP). This practice is particularly pervasive in firm level studies, where making technological upgrading more accessible to the less "productive" firms seems to be the main prescription for helping them reduce their distance from the "frontier" and ultimately enhancing aggregate productivity growth.
However, productivity is a broad notion encompassing several hard-to-measure factors, which comprise all that cannot be explained by capital accumulation, in its standard TFP formulation, and even capital accumulation itself if labor productivity is used. Hence, enucleating the relative importance of the technological dimension within the wide notion of productivity is becoming increasingly important, particularly from a policy perspective.
For instance, the OECD (2015) documents labour productivity growth amounting to 3.5% at the global (manufacturing) frontier, against 0.5% for non-frontier firms, over the course of the 2000s (the gap is even more pronounced in the services sector). On the one hand, this report raises questions about the obstacles preventing laggard firms from adopting well-known frontier technologies; on the other, it mentions how, "given the difficulties in measuring technology, the globally most productive firms are also assumed to operate with the globally most advanced technologies but it should be recognized that very technologically advanced firms might not necessarily appear as the globally most productive or the most successful in terms of profits" (OECD 2015, page 33).
However, little empirical effort has been made to investigate the relative importance of the technological dimension of productivity within the general notion of productivity. As Bernard and Jones pointed out more than 20 years ago (Bernard and Jones 1996a), the usual (Solow residual) TFP measures are conceived under the hypothesis that the different economic units (country-sectors, in their case) are characterized by the same factor coefficients. Insofar as differences in factor coefficients mirror differences in technology, this rules out the possibility that different industries in different countries have access to different technologies. From the firm level perspective, this entails neglecting the armchair evidence that an array of technologies is normally available to firms operating in the same industry, with access to each technology varying across firms along several dimensions. The degree of internationalization is one of these dimensions.
In fact, the literature on firms' international exposure has extensively highlighted the positive correlation between a firm's export status and its productivity (called "exceptional exporter performance" by Bernard and Jensen (1999)). This paper explores whether, and to what extent, the productivity advantage of internationalized firms documented by the literature (see Sect. 2), can be understood in terms of technological advantage, rather than know-how and/or ability in using similar technologies.
The technological dimension of productivity is usually not considered explicitly because estimating technology-specific production function parameters, without ex-ante assumptions on the technology used by each firm, seems a daunting task with standard econometrics. This entails that the potential factor parameter differences across technologies flow entirely into a Solow residual (i.e., the difference between observed and predicted output), in which the predicted output is computed on the basis of sector-specific production function parameters (i.e., estimated at the sectoral level).
Our analysis builds on the mixture regression analysis discussed in Battisti et al. (2020) to unbundle technology and TFP by estimating technology-specific production functions. We retrieve technology-specific factor parameters, avoiding any type of ex-ante assumption on the degree of technological sharing across firms and leaving the number of available technologies (i.e., technology groups) unconstrained. This procedure allows us to give a sense of magnitude concerning how much of the productivity-internationalization premium can be attributed to firms' technological choices, on the one hand, and to an idiosyncratic TFP measure which, being net of this "technological" component, can be interpreted in terms of know-how and/or ability to exploit the given technology, on the other. We will focus on the complement (or gap) of the firm's productivity performance over these two productivity dimensions with respect to what can be considered as the best practice: the between-technology TFP gap, defined as the output difference between what a firm could have produced had it chosen the best (output maximizing) available technology and the predicted output of the firm under its actual technology; and the within-technology TFP gap, defined as the difference between the average output of the best 5% performing firms and the actual output of the given firm, within the same technology group.
The analysis takes advantage of the EFIGE-ORBIS database, provided by Bruegel. EFIGE is a survey that reports detailed information on the international operations of almost 15,000 firms in seven European economies (Austria, France, Germany, Hungary, Italy, Spain, and the UK). This is combined with balance sheet data drawn from the ORBIS database, provided by Bureau van Dijk.
The estimation results point to a sectoral number of technologies ranging from one to three, depending on the industry. The first result concerns the average importance of the estimated between-technology and within-technology TFP gaps across firms. Although there is some cross-sector variability, the between-technology gaps are found to be more than three times larger than the within-technology gaps, on average. Interestingly, Battisti et al. (2020) find them to be twice as large. These results suggest that sample selection plays a key role in the findings. The sample used in Battisti et al. (2020) is larger and much less skewed towards central Europe than is the sample used in our study. This seems to support the idea that technological advancement goes hand in hand with international openness. Indeed, this is the key result emerging from our correlation analysis with the degree of international exposure, which indicates a non-significant correlation between international exposure and our within-technology TFP gaps. This complements the finding regarding the strong relevance of the between-technology gaps for most types of internationalization status. The technology advantage is found to be particularly important for firms undertaking foreign direct investment (FDI), for global exporters, importers of materials, and exporters.
Moreover, while the firms operating in country-sectors that are more exposed to competitive pressure from China do not seem to feature relatively higher within-technology TFP gaps, large firms operating in those country-sectors display significant technology advantages. This finding is consistent with the evidence of a positive relationship between competition and innovation provided by, among others, Bloom et al. (2016).
Firm size matters greatly for technology gaps, with larger firms producing with better technologies. Interestingly, age, which is found to significantly correlate with larger within-technology TFP, is associated with smaller technology gaps. This result suggests that technology is more easily accessible to young firms, but also that it takes time to be effectively used by the firm.
We finally address the geographical dimension of the international operations, showing that the technology advantage of being an exporter or being engaged in FDI activities is driven mainly by firms exporting to EU countries and, among the global exporters, by firms exporting to China. Moreover, the advantage is greater for firms importing from within the EU and, finally, for firms competing with the USA and other EU countries. Interestingly, having competitors located in China tends to be associated with higher between-technology gaps, except for large firms, which in this case also enjoy a technological advantage. From a policy perspective, the overall analysis lends support to the common interpretation of the productivity gap among less-internationalized firms in terms of distance from the technological frontier.
The rest of this paper proceeds as follows. Section 2 reviews the related literature. Section 3 briefly describes the study's data. Our estimation steps and results are discussed in Sect. 4, while Sect. 5 analyses the productivity-internationalization relationship. Section 6 concludes the paper.

Related literature
Our focus in the present analysis is on the extent to which cross-firm heterogeneity in productivity can be considered to mirror cross-firm heterogeneity in technology, paying particular attention to the productivity advantage of internationalized firms.
As mentioned, interest in the productivity advantage of internationalized firms started with the finding of a positive correlation between export status and productivity, the "exceptional exporter performance" highlighted by Bernard and Jensen (1999). Causality can run both ways. On the one hand, the relationship can be fostered by a process of "selection into export status", whereby firms that already perform better have a stronger propensity to export than other firms (Tybout 2002). Indeed, exposure to trade is found to force the least productive firms to leave the market (Clerides et al. 1998; Bernard and Jensen 1999; Aw et al. 2000) and to lead to market share reallocation toward the most productive firms (Pavcnik 2002; Bernard et al. 2006). On the other hand, one might think that firm productivity grows as a consequence of a process of "learning by exporting" (Delgado et al. 2002; Van Biesebroeck 2005; Lileeva and Trefler 2010; Serti and Tomasi 2008; De Loecker 2007) or due to technology upgrading effects induced by international competitiveness shocks (e.g., Bustos 2011; Bloom et al. 2016).
Independently of the direction of causality, which is beyond the scope of our analysis, studies have produced comprehensive evidence that productivity is positively related not only to exporting but, more generally, to other forms of international exposure, such as importing, undertaking FDI, and international outsourcing. For instance, Helpman et al. (2004) show that only the most productive among the firms that serve foreign markets engage in FDI. From a policy perspective, Mayer and Ottaviano (2008) conclude that the overall international performance of European countries is essentially driven by a handful of high-performance firms.
In general, the literature shows that the productivity premium seems to grow along with the complexity of international activities. This is clearly shown by Altomonte and Aquilante (2012), who use the same data used in our study to reveal that firms' size and productivity premia are related to complex internationalization strategies such as exporting to a larger number of markets and/or to more distant countries, producing abroad through FDI, and being involved in international outsourcing. Such features seem to influence the patterns of internationalization in remarkably similar ways across countries. Consistent with these findings, the OECD (2015) highlights how "the global productivity frontier is actually comprised of firms from different countries […] They are very much "global firms" in the sense that they operate in different countries (often part of a MNE group), and are interconnected with suppliers/customers from different countries along global value chains (GVCs)." The productivity measure preferred by most of the above studies is TFP, measured along the lines suggested by the recent literature on TFP estimation (Olley and Pakes 1996; Levinsohn and Petrin 2003; Wooldridge 2009; Ackerberg et al. 2015). While this literature has emphasized purging the estimates from the so-called simultaneity bias, we focus on the importance of firm level technology as a determinant of productivity differences among firms, while controlling for simultaneity. The opportunity to disentangle these two dimensions has been highlighted by, among others, Bernard and Jones (1996a, b), using the expression "total technological productivity". Moreover, Collard-Wexler and De Loecker (2014) document important productivity increases in the steel industry, associated with the adoption of a particular technology (i.e., the "minimill" technology), while Doraszelski and Jaumandreu (2013) show that a firm's TFP is stochastically affected by its investment in R&D.¹
From the theoretical perspective, a growing body of literature concentrates on "technology", in terms of both the creation of new technologies and the adoption/diffusion of available technologies, as one of the driving factors commonly included in standard TFP figures.
Studies on the relationship between technology adoption/diffusion and the diffusion of development date back to Gerschenkron (1962), and include Nelson and Phelps (1966) and Howitt (2000). Within this branch of literature, Parente and Prescott (1994) show that differences in barriers to technology adoption, which vary across countries and time, account for the great disparities in income across countries. Other works stress the spatial dimension of the process of technology adoption and diffusion. Desmet and Parente (2010) model the relationship between market size and technological upgrading. Desmet and Rossi-Hansberg (2014) propose a model in which technology diffusion affects economic development because technology diffuses spatially and firms in each location use the best technology they have access to. Comin et al. (2012) propose a theory in which technology diffuses more slowly to locations that are farther away from adoption leaders.²
Other studies point out the importance of the firm level dimension of productivity growth (i.e., changes in the productivity distribution). Gabler and Poschke (2013) and Da Rocha et al. (2017) introduce endogenous establishment-level productivity in their studies on the evolution of aggregate productivity. A similar approach is adopted by a number of studies, in which aggregate growth is fostered by the evolution of the firm level TFP distribution, as induced by trade integration (see Grossman and Helpman (2015) for an early review). This literature explicitly focuses on the role of technological heterogeneity. Among others, Sampson (2016) stresses the role of the technological choice of the entrant firms. Perla and Tonetti (2014) focus on the diffusion of technology from the more to the less productive firms, which allows the TFP distribution to endogenously "evolve" even without the introduction of new technologies. In Perla et al. (2015), firms choose whether to adopt a better technology or not, and trade integration fosters aggregate growth, by increasing the incentives to adopt the better technologies. In Benhabib et al. (2017), firms choose whether to keep producing with their existing technology, adopt a new technology or innovate, but only innovation fosters growth in the long run. Luttmer (2007) focuses on imitation, highlighting that the small size of entrants de facto indicates that imitation is difficult. In Alvarez et al. (2014), the flow of new ideas is the engine of growth: firms acquire new technologies by learning from those with whom they do business, so that trade fosters technology diffusion and aggregate growth by enabling more meeting opportunities.
Two order-of-innovation consequences of the fact that increased exposure to international markets entails fiercer competition are put forward by the literature. On the one hand, increased competition is likely to reduce innovation to the extent that expected profits shrink. On the other hand, import competition, especially from low-cost countries, is likely to force firms to innovate more than they would otherwise. Aghion et al. (2005) reconcile these two effects by deriving and testing a U-shaped relationship between competition and innovation. The firm level adjustment to increased competition along this line is investigated by Gorodnichenko et al. (2010) and Hashmi (2013). Garcia-Marin and Voigtlander (2019) provide evidence that the declining marginal costs at export entry that emerge once the price bias in productivity measurements induced by shrinking markups is accounted for are associated with higher plant-level investment (especially in machinery), which suggests that technology upgrading may be driven by a learning-from-exporting mechanism or by the increased potential benefits of innovation due to access to larger markets. Accordingly, they show that marginal costs drop steeply for plants that are initially less productive.

² Another line of research investigates technology diffusion in more detail using data on specific technologies. In particular, it is worth citing the Cross-country Historical Adoption of Technology (CHAT) dataset, described in Comin and Hobijn (2009), which includes long-run information on the extensive (whether a specific technology is present or not in a given country at a particular time) and intensive (the intensity with which producers or consumers employ a technology, at a given moment in time, scaled by the size of the economy) margins of technology adoption at the country-level on a number of technologies (e.g., tractors, fertilizer, portable cell phones). The CHAT dataset enables, among others, Comin and Hobijn (2010) to explain the very different speeds at which countries recovered after wars, Cervellati et al. (2014) to study the effect of trade liberalization and democratization on technology adoption, and Comin and Mestieri (2018) to explore the general patterns characterizing the diffusion of technologies, how they changed over time, and the key drivers of technology.

Data
This study's analysis is based on a sample of manufacturing firms drawn from the EU-EFIGE/Bruegel-UniCredit dataset³ ("EFIGE" henceforth). This dataset surveys the international activities of almost 15,000 firms in seven European economies (Austria, France, Germany, Hungary, Italy, Spain, and the UK). The survey was conducted in 2010 and covers the period 2007 to 2009. The reported information is cross-sectional. Production function estimation, which requires two years in order to control for simultaneity (see Sect. 4), is carried out over the two years of 2009 and 2010.
The sample, which is representative of firms with more than 10 employees (see Altomonte and Aquilante 2012, for detailed information on the dataset and its representativeness), has been combined by the EFIGE team with balance sheet information from the ORBIS database of Bureau van Dijk (which covers 2001 to 2014).
From the EFIGE-ORBIS dataset, we draw information on the variables used in production function estimation. As well as value added and fixed assets, the survey provides us with quite detailed information on employees, allowing us to separately identify the number of workers in three categories: white collar, skilled blue collar, and unskilled blue collar. Moreover, we are able to observe total (and thus average) wages. Value added, capital and wages are deflated using OECD-STAN country-sector-year deflators. From EFIGE-ORBIS, we also draw information on the other firm characteristics: age and size category (based on the total number of employees).
Data are trimmed at the first and last percentiles of the output per worker and capital per worker distributions. Tables 1 and 2 report the descriptive statistics at the sectoral and country levels, respectively.
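The trimming step can be sketched as follows. This is a minimal illustration that assumes trimming means dropping firms below the 1st or above the 99th percentile of either distribution; the function name `trim_mask` and the toy arrays are hypothetical and are not taken from the EFIGE-ORBIS data.

```python
import numpy as np

def trim_mask(values, lower=1.0, upper=99.0):
    """Boolean mask keeping observations between the given percentiles (inclusive)."""
    lo, hi = np.percentile(values, [lower, upper])
    return (values >= lo) & (values <= hi)

# Toy data (illustrative only): output per worker and capital per worker.
output_per_worker = np.arange(1000, dtype=float)
capital_per_worker = np.arange(1000, dtype=float)[::-1]

# Assumption: a firm is kept only if it survives trimming on BOTH distributions.
keep = trim_mask(output_per_worker) & trim_mask(capital_per_worker)
trimmed_output = output_per_worker[keep]
trimmed_capital = capital_per_worker[keep]
```

Whether the two masks are combined with "and" (as here) or applied one after the other affects the percentile cut-offs slightly, so this choice should be read as an assumption of the sketch.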
The advantage of EFIGE over other firm level data is its detailed information on both labor and firms' international activities.

Unbundling between-technology and within-technology TFP
We disentangle the between- and within-technology dimensions of productivity by using the two-stage estimation detailed in Battisti et al. (2020). The first stage entails estimating the correction factors to be used in the estimation of the production functions in order to tackle the simultaneity issue (Klette and Griliches 1996). In the second stage, technology-specific production function parameters are estimated, through a mixture model that includes the correction factors obtained in the first step. The estimation algorithm identifies the number of technology groups (i.e., the number of production functions) for each sector and the probability of adopting each technology for each firm. Finally, we compute between-technology and within-technology TFP gaps along the lines described below. In its standard TFP interpretation, firm productivity encompasses output differences across firms that are not related to the choice of inputs. Hence, all else being equal, TFP differences are given by two dimensions. The first one is common among firms and is defined over the set of parameters that specify how inputs contribute to generating output. This is a technological dimension characterizing the shape (i.e., slope and intercept) of the production function. The second dimension is firm-specific and reflects the idiosyncratic ability of the firm to transform inputs into output, given the technology (e.g., Bhattacharya et al. 2013).
Before going into the details of each step, let us formalize the unbundling setup. Assume that firm $i$ uses the bundle of inputs $X$ to produce output $Y$ through $Y_i = A_i F(X_i)$. Following the standard approach, the idiosyncratic productivity (i.e., TFP) term $A_i$ can be retrieved by estimating the log-production function $y_i = \alpha + \beta x_i + \varepsilon_i$ at the sectoral level and computing TFP as the Solow residual $a_i \equiv \ln(A_i) = y_i - \hat{y}_i$. In this case, TFP is expressed in relative terms with respect to the average firm in the sector. Let us consider an array of technologies available in each sector, with a number of firms using each technology. Then, the estimated TFP distribution can actually be seen as the result of the overlapping of several TFP distributions, one for each (group of firms using the same) technology. We use $m = 1, \dots, M$ as an index of technology, with $M$ denoting the number of available technologies. We also assume that the technology differs in terms of the production function parameters $(\alpha, \beta)$ and that we are able to estimate them for each technology (we will use mixture models for this). With this information, we are also able to identify the technology among all possibilities that would allow firm $i$ to produce the maximum possible output level given the amount of capital. We refer to this technology as $m_H$. Instead of using the average firm in the sector as a benchmark, as in the standard approach, we can now express the Solow residual in terms of the average firm among the firms using the best technology: $\hat{a}_{i,m_H} = y_i - \hat{y}_{i,m_H}$, with $\hat{y}_{i,m_H}$ denoting firm $i$'s predicted output associated with using the best technology.
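As a point of reference for the standard approach just described, the following minimal sketch computes a Solow residual from a sector-level log-linear regression. The data are simulated and all parameter values are illustrative assumptions; it is not the estimation used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Simulated sector (illustrative assumptions): one log input bundle x and log TFP a_true.
x = rng.normal(2.0, 0.5, n)
a_true = rng.normal(0.0, 0.3, n)
y = 0.5 + 0.7 * x + a_true               # log output

# Sector-level OLS of y on a constant and x.
X = np.column_stack([np.ones(n), x])
coef = np.linalg.lstsq(X, y, rcond=None)[0]
alpha_hat, beta_hat = coef

# Solow residual: observed minus predicted log output.
solow_residual = y - X @ coef
```

Because the regression includes a constant, the residuals average to zero, which is the sense in which this TFP measure is relative to the average firm in the sector.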
Let us now turn to the technology actually used by firm $i$ and assume, for the time being, that it is observable. Using $m_D$ to refer to such technology, the corresponding predicted output $\hat{y}_{i,m_D}$ can be used to obtain the following decomposition of $\hat{a}_{i,m_H}$ into a within- and a between-technology component:

$$\hat{a}_{i,m_H} = \underbrace{(y_i - \hat{y}_{i,m_D})}_{\text{within}} - \underbrace{(\hat{y}_{i,m_H} - \hat{y}_{i,m_D})}_{\text{between}} \tag{1}$$

The "between" component, which is also equal to $\hat{a}_{i,m_D} - \hat{a}_{i,m_H}$, is a measure of technology productivity expressed in terms of the distance between the predicted output associated with the firm's actual technology and the output associated with the output-maximizing technology, given the input levels, had the firm behaved as the "average" firm in the two technology groups. As the difference between predicted outputs, this measure is driven only by technology, through the different production function parameters.
The "within" component provides us with a quantification of the difference $\hat{a}_{i,m_D}$ between the observed output of the firm and the predicted output it would have obtained had it behaved as the "average" (representative) firm in its own technology group (i.e., within the group of firms using technology $m_D$).

Production functions estimation - stage 1 (dealing with simultaneity)

The first step of our method deals with the simultaneity issue. As is well known, productivity estimation at the firm level suffers from endogeneity. The problem stems from the fact that the information incorporated in the productivity term $a_{i,t}$, although unknown to the econometrician, is arguably used by the firm when the decision on the amount of inputs is made. In other words, inputs might be chosen simultaneously with output. This causes the OLS estimated error term to correlate with the amount of factors and thus biases the estimated production function coefficients (i.e., $Cov(K_{i,t}, a_{i,t}) \neq 0$, $Cov(L_{i,t}, a_{i,t}) \neq 0$, with $K_{i,t}$ and $L_{i,t}$ being the total amount of fixed capital and labor employed by firm $i$ at time $t$). Since our mixture regression framework builds on weighted least squares (see next paragraph), our estimated parameters are potentially biased. Moreover, in our case, the technology may be chosen simultaneously with the inputs. This would yield an additional source of simultaneity (i.e., $Cov(K_{i,t}, m_{i,t}) \neq 0$, $Cov(L_{i,t}, m_{i,t}) \neq 0$) and make the bias even more pervasive. However, the widely used semi-parametric solution suggested by Olley and Pakes (1996) as well as subsequent contributions (Levinsohn and Petrin 2003; Wooldridge 2009; Ackerberg et al. 2015) can hardly be applied to our mixture regression framework. Therefore, we deal with simultaneity using the following empirical approach suggested by Battisti et al. (2020).
We assume that the capital and production technology in $t$ are both decided in $t-1$, based on one-time-lagged capital and a vector $Z_t$ of external conditions (i.e., market frictions and other country characteristics such as interest rate and wage differentials). Labor is instead a freely adjustable input and its $t$ level is decided in $t$ on the basis of $K_{i,t}$ and the observed productivity $a_{i,t}$. This leaves us with the system of equations (2) (with the index in square brackets referring to the time period in which the decision is made). Under the assumption that $u^K_{i,t}$ and $u^L_{i,t}$ are i.i.d. errors, $e^K_{i,t}$ can be thought of as "absorb[...] $a_{i,t}$). As measures of the unobserved correlations, the estimated residu[...] can be included as additional regressors in the mixture regressions (Stage 2) to estimate potentially simultaneity-free production function parameters.
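The logic of adding first-stage residuals as regressors can be illustrated with a stylized simulation. The sketch below is not the system in (2): it assumes a single input, lagged capital as the only excluded variable, and made-up parameter values, and it uses plain OLS in both steps rather than three-stage least squares. The helper `ols` and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process (all numbers are illustrative assumptions).
k_lag = rng.normal(0.0, 1.0, n)                      # lagged log capital
a = rng.normal(0.0, 0.5, n)                          # productivity, unobserved by the econometrician
k = 0.8 * k_lag + 0.6 * a + rng.normal(0.0, 0.3, n)  # input choice reacts to productivity
y = 0.4 * k + a + rng.normal(0.0, 0.1, n)            # log output, true input coefficient 0.4

def ols(regressors, target):
    """OLS coefficients, with an intercept as the first element."""
    X = np.column_stack([np.ones(len(target))] + list(regressors))
    return np.linalg.lstsq(X, target, rcond=None)[0]

# Naive OLS: biased upward because Cov(k, a) != 0.
beta_ols = ols([k], y)[1]

# Step 1: regress the input on its lag and keep the residual,
# which absorbs the part of k that is correlated with productivity.
g = ols([k_lag], k)
e_k = k - (g[0] + g[1] * k_lag)

# Step 2: add the residual as an extra regressor in the production function.
beta_cf = ols([k, e_k], y)[1]
```

In this toy setting the corrected coefficient is numerically the two-stage least squares estimate; the correction works only because lagged capital is excluded from the output equation, which is an assumption of the sketch.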
We estimate (2) sector-by-sector through three-stage least squares, including country-year effects (to account for $Z_{c,t}$). The estimation, which requires two time periods, is carried out over the two years of 2009 and 2010, under the hypothesis that the technology to be used in 2010 is decided, together with the amount of capital, in 2009. This choice is consistent with the information on firms' international characteristics used in Sect. 5, as this is collected as a cross section for the years before 2010 (firms were requested to report their engagement in international activities in 2008 or over the 2007-2009 period, depending on the question). At this stage, we do not disentangle across labor categories, as we are interested only in the whole residual to be used in the second step.

Production functions estimation - stage 2

To obtain technology-specific production function parameters, we estimate the labor-intensive production function in Equation (3) sector-by-sector (index $t$ is hereafter omitted), where labor (measured as number of employees) is disaggregated into three categories: white collar ($w_i$), skilled blue collar ($s_i$), and unskilled blue collar ($u_i$). Moreover, $r_i$ is the average wage paid by the firm, used to proxy for the human capital dimension left unexplained by disentangling across labor categories; technology is described by the parameter set $(\alpha_m, \beta^K_m, \beta^W_m, \beta^S_m, \beta^U_m)$, with the TFP term split into the idiosyncratic component $a_i$ and the technology component $\alpha_m$; $FE_s$ are three-digit industry fixed effects; $\eta_i$ is the i.i.d. term; and $\delta_{i,m}$ are observation weights defined in (4).
The probability distribution of $y_i$ is modelled as the weighted average of the $M$ unknown distributions, each with a proper mean ($\mu_m$) and variance ($\sigma^2_m$). The unknown ex-ante probability of belonging to group $m$ (i.e., $\omega_m = \sum_i p_{i,m} / \sum_m \sum_i p_{i,m}$) is dealt with by applying the expectation-maximization (EM) algorithm to the weighted least squares estimation of Equation (3) (see De Sarbo and Cron 1988), where the observation weights are given in (4). The (weighted least squares) estimation starts with random values of $\omega_m$ to estimate (3) and compute posterior probabilities $p_{i,m}$. These are used to update the weights $\omega_m$, and thus the regression coefficients in Eq. (3), in each step of the EM algorithm. The algorithm alternates the "M" step (production function estimation) and the "E" step (probabilities computation) until the log-likelihood converges. Under the assumption that the density functions of Eq. (4) follow a Gaussian distribution, the characterizing parameters are the weighted mean and the variance of the group regressions. We use the Bayesian Information Criterion (BIC) to determine ex post the number $M$ of clusters, as it helps us identify the best $M$ in terms of avoiding a poor description of the population on the one hand and over-fitting on the other (Steele and Raftery 2010). The obtained BIC values are reported in Table 3.⁴

The estimation is carried out cross-sectionally for the year 2010 employing mixture models (McLachlan and Peel 2000) to recover a vector of production function parameters for each technology $m$. Mixture models allow us to avoid any ex-ante assumption on both the degree of technology sharing across firms and the number of available technologies, which are instead provided by the estimation routine. To give an intuition, the algorithm allows us to describe a production function process as being composed of several sub-processes, each of them with a set of parameters and with each firm having its own probability of belonging to one of these sets. Finally, in this framework, a "superior" technology is characterized by a parameter set $(\alpha_m, \beta^K_m, \beta^W_m, \beta^S_m, \beta^U_m)$ such that a higher output is produced at given input and TFP levels.⁵

Table 4 reports the estimated $\alpha$ and $\beta$ parameters for each technology group and sector. As the first column of this table shows, the number of technologies identified by our algorithm tends to be higher than one. Table 5 reports the total probability by technology group (computed as $prob_m = \sum_i p_{i,m}$).
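The alternation of weighted least squares ("M" step) and posterior probabilities ("E" step), followed by a BIC comparison, can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' routine: it uses one regressor, simulated data, no fixed effects or correction terms, and a deterministic start based on pooled OLS residuals instead of the random start described above. The names `em_mixture_regression` and `bic`, and the parameter count used in the BIC, are choices made for this sketch.

```python
import numpy as np

def em_mixture_regression(X, y, M, n_iter=500, tol=1e-8):
    """EM for a Gaussian mixture of M linear regressions (illustrative sketch)."""
    n, k = X.shape
    # Assumed initialisation: split firms into M bins by rank of pooled OLS residual.
    pooled = np.linalg.lstsq(X, y, rcond=None)[0]
    ranks = np.argsort(np.argsort(y - X @ pooled))
    post = np.full((n, M), 0.05)
    post[np.arange(n), (ranks * M) // n] += 1.0
    post /= post.sum(axis=1, keepdims=True)
    prev = -np.inf
    for _ in range(n_iter):
        # "M" step: mixing weights and weighted least squares for each group.
        omega = post.mean(axis=0)
        betas, sigma2 = np.empty((M, k)), np.empty(M)
        for m in range(M):
            w = post[:, m]
            Xw = X * w[:, None]
            betas[m] = np.linalg.solve(X.T @ Xw, Xw.T @ y)
            sigma2[m] = (w * (y - X @ betas[m]) ** 2).sum() / w.sum()
        # "E" step: posterior probabilities p_{i,m} from Gaussian densities.
        resid = y[:, None] - X @ betas.T
        log_dens = np.log(omega) - 0.5 * np.log(2 * np.pi * sigma2) - 0.5 * resid**2 / sigma2
        row_max = log_dens.max(axis=1, keepdims=True)
        log_mix = row_max[:, 0] + np.log(np.exp(log_dens - row_max).sum(axis=1))
        post = np.exp(log_dens - log_mix[:, None])
        loglik = log_mix.sum()
        if abs(loglik - prev) < tol:
            break
        prev = loglik
    return betas, sigma2, omega, post, loglik

def bic(loglik, n, k, M):
    """BIC with M*(k+1) regression/variance parameters and M-1 free mixing weights."""
    return -2.0 * loglik + (M * (k + 1) + (M - 1)) * np.log(n)

# Simulated sector with two technologies (illustrative numbers only).
rng = np.random.default_rng(0)
n = 600
x = rng.uniform(0.0, 4.0, n)
group = rng.integers(0, 2, n)
y = np.where(group == 0, 1.0 + 0.3 * x, 3.0 + 0.8 * x) + rng.normal(0.0, 0.2, n)
X = np.column_stack([np.ones(n), x])

fits = {M: em_mixture_regression(X, y, M) for M in (1, 2)}
bics = {M: bic(fits[M][4], n, X.shape[1], M) for M in fits}
betas, sigma2, omega, post, loglik = fits[2]
total_prob = post.sum(axis=0)            # total probability by technology group
```

The column sums of `post` play the role of the total probability by technology group.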
The total probability gives a sense of the magnitude of each group (i.e., the number of firms in a given group, with each firm accounting for a share equal to its own probability of belonging to that group). None of the groups is empty or close to it. It is worth noting that the final number of technologies is generally lower than that suggested by Table 3. This happens because, although the BIC is used in deciding the best fit in the mixture regressions, we constrain the $\beta$ parameters to be statistically significant (while non-significant $\alpha$ values are set to zero). For each firm, the probability of belonging to a discarded cluster is set to zero and is proportionally reallocated to the other sector clusters.

"Between" and "Within" TFP gaps
Once the production function parameters are estimated for each technology available within each sector, they are used to predict the output levels of the firm under each technology and give the decomposition in (1) an order of magnitude. The locally optimal technology $m_H$ at $x_i = \{k_i, w_i, s_i, u_i\}$ can be identified as the one such that $\hat{y}_{i,m_H}|x_i > \hat{y}_{i,m}|x_i \ \forall m \neq m_H$. Moreover, since the technology actually used by firm $i$ is unobservable, the under-uncertainty equivalent $y_{i,m_D}$ in (1) (i.e., the probabilistic predicted output) can be obtained as $\sum_{m=1}^{m_H} p_{i,m} \cdot \hat{y}_{i,m}$, that is, as a weighted average of the predicted output associated with each technology, with the weights represented by the probabilities $p_{i,m}$ provided by the estimation algorithm.
Hence, we can define the "between-technology" TFP (i.e., BTFP) gap as the difference between firm $i$'s frontier output and firm $i$'s probabilistic predicted output (i.e., the weighted average of the estimated predicted output associated with each possible technology $m$):

$$BTFP\,gap_i = \hat{y}_{i,m_H} - y_{i,m_D}$$

We can define the "within-technology" TFP (i.e., WTFP) gap based on the difference between firm $i$'s observed output and firm $i$'s probabilistic predicted output:

$$WTFP\,gap_i = WTFP_B - WTFP_i, \qquad WTFP_i = y_i - y_{i,m_D}$$

where $WTFP_B$ denotes the average WTFP of the best 5% of firms (in terms of WTFP) in the sector. 6 Table 6 reports the sectoral estimates of both measures, in terms of average and standard deviation (columns 1 to 4).

Table 4 Estimated production function parameters by sector
To obtain more detail about the technological dimension, we also report two additional BTFP indexes. The first one, obtained as $Ratio1_i = \frac{BTFP\,gap_i}{\hat{y}_{i,m_H} - \hat{y}_{i,m_L}}$, where $\hat{y}_{i,m_L}$ is firm $i$'s predicted output associated with using the worst technology, measures the extent to which the industry average BTFP gap is driven by sector-wide differences in the available technologies, as subsumed by the denominator (i.e., the gap based on predicted output using the best versus the worst technology). The second index, Ratio2, gauges the relative importance of the technological dimension. For ease of interpretation, the estimated number of technologies is reported in the last column of Table 6.
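As a rough illustration of how these quantities fit together, the following Python sketch computes the BTFP gap, the WTFP gap, and Ratio1 from predicted outputs and posterior probabilities. The function name `technology_gaps` and its argument layout are hypothetical. The WTFP gap is taken here as the distance from the average WTFP of the best 5% of firms, which is an assumption about sign and form; Ratio2 is left out because its formula is not stated in the text.

```python
import numpy as np

def technology_gaps(y, y_hat, p, top_share=0.05):
    """Illustrative gap measures for one sector.

    y     : (n,)   observed output
    y_hat : (n, M) predicted output of each firm under each technology
    p     : (n, M) probabilities of belonging to each technology (rows sum to one)
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    p = np.asarray(p, dtype=float)
    y_best = y_hat.max(axis=1)             # output under the locally optimal technology
    y_worst = y_hat.min(axis=1)            # output under the worst technology
    y_prob = (p * y_hat).sum(axis=1)       # probabilistic predicted output
    btfp_gap = y_best - y_prob
    wtfp = y - y_prob
    n_top = max(1, int(np.ceil(top_share * len(y))))
    wtfp_best = np.sort(wtfp)[-n_top:].mean()   # average WTFP of the best firms
    wtfp_gap = wtfp_best - wtfp                 # assumed sign: larger means further behind
    spread = y_best - y_worst
    # Ratio1 is undefined when all technologies predict the same output.
    ratio1 = np.divide(btfp_gap, spread,
                       out=np.full_like(btfp_gap, np.nan), where=spread > 0)
    return {"btfp_gap": btfp_gap, "wtfp_gap": wtfp_gap, "ratio1": ratio1}
```

A firm that is assigned to the best technology with probability one has a BTFP gap of zero and a Ratio1 of zero, while a firm assigned with certainty to the worst technology has a Ratio1 of one.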
In general, the contribution of the WTFP and BTFP gaps to cross-firm output differences is quite unbalanced, with technological differences taking the lead as drivers of the output differences across firms, particularly in sectors in which three technologies are obtained (Fd, Pr, Ma, Re). This is highlighted by the high average values of Ratio2.
By widening the output differences associated with the available technologies (see Ratio1), a higher number of technologies tends to increase the BTFP gap on average (but not its dispersion). Figures 1 and 2 report the sectoral distribution of the estimated BTFP and WTFP gaps.
It is worth emphasizing that, in our methodology, the price-bias 7 usually associated with firm level TFP estimates does not affect our BTFP gap measure (under the assumption that firms in different technology clusters do not systematically apply different prices), with all the cross-firm price (i.e., mark-up) dispersion flowing into the WTFP gap term. Moreover, as human capital heterogeneity is controlled for by disentangling across different types of workers and by including the average wage within the firm, we can safely attribute cross-firm BTFP gap heterogeneity to differences in technology in a strict sense (e.g., different machinery and software returns).
Taken together, these results suggest that the technology dimension tends to dominate. At the same time, however, there are many firms in which the output gap is determined mostly by the ability to use the adopted technology.
Next, we discuss how international activities correlate with both dimensions.

Disentangling the productivity-internationalization relationship
This section examines whether the widely documented productivity advantage of international firms can be understood in terms of technological advantage, as is usually done, or whether it is more a matter of know-how and ability to use the given technology. In the first case, we expect firms' involvement in international activities to be correlated more with our BTFP gap measure than with WTFP gaps, and vice versa if technology is not the main driver of the productivity premium documented by the literature.
To this end, we focus on a vector of firm-level measures of international exposure $I_i$ and regress, in different specifications, $B_i$ on $I_i$, a vector of firm-level controls $Z_i$, and country and sector fixed effects $F_c$ and $F_s$. Here $B_i$ is, in turn, standard TFP (to reproduce the standard results in the literature), $BTFP\,gap_i$, and $WTFP\,gap_i$. $\zeta_I$ is the vector of parameters on $I_i$ that, though not entailing causality, reflect the empirical correlation of the productivity dimension of interest with the different types of international exposure.
To measure the international exposure of firms, we exploit the detailed information collected by the EFIGE survey, which is used to classify firms along seven (non-mutually exclusive) categories: exporters, importers of services, importers of materials, FDI undertakers, passive and active outsourcers, and global exporters. A dummy for each category is created. Clearly, if a firm is not involved in any of these activities, all the related dummies are equal to zero, and the firm is considered to be non-active abroad. Firms are considered exporters/importers if they have sold/bought abroad some or all of their own products/services in 2008. With respect to FDI and international outsourcing, the survey includes a question asking whether firms were running at least part of their production activity in another country: firms replying "yes, through direct investment (i.e. foreign affiliates/controlled firms)" are considered to be undertaking FDI, while firms replying "yes, through contracts and arm's length agreements with local firms" are considered to be pursuing an active international outsourcing strategy. Firms replying positively to a question asking whether part of their turnover is made up by sales produced according to a specific order coming from a customer (produced-to-order goods) and indicating that their main customers for the production-to-order activity are other firms located abroad are considered to be pursuing a passive outsourcing strategy. Hence, a passive outsourcer is the counterpart of an active outsourcer in an arm's length transaction. Finally, global exporters are firms that export to countries outside the EU.
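The classification rules above can be made concrete with a small Python sketch. The field names in `SurveyAnswers` are hypothetical stand-ins for the corresponding EFIGE survey questions, not the actual variable names in the database, and the helper `exposure_dummies` is introduced only for illustration.

```python
from dataclasses import dataclass

@dataclass
class SurveyAnswers:
    # Hypothetical field names; each one mirrors a survey question described in the text.
    exports: bool                              # sold products/services abroad
    exports_outside_eu: bool                   # exports to countries outside the EU
    imports_services: bool
    imports_materials: bool
    production_abroad_fdi: bool                # "yes, through direct investment"
    production_abroad_contracts: bool          # "yes, through contracts and arm's length agreements"
    produces_to_order_for_foreign_firms: bool  # main production-to-order customers are firms abroad

def exposure_dummies(a: SurveyAnswers) -> dict:
    """Seven non-mutually-exclusive dummies plus a flag for firms not active abroad."""
    d = {
        "exporter": int(a.exports),
        "importer_services": int(a.imports_services),
        "importer_materials": int(a.imports_materials),
        "fdi": int(a.production_abroad_fdi),
        "active_outsourcer": int(a.production_abroad_contracts),
        "passive_outsourcer": int(a.produces_to_order_for_foreign_firms),
        "global_exporter": int(a.exports and a.exports_outside_eu),
    }
    # A firm is non-active abroad only if all seven dummies are zero.
    d["non_active_abroad"] = int(not any(d.values()))
    return d
```

Because the categories are not mutually exclusive, a single firm can switch on several dummies at once, for example exporter, global exporter, and FDI.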
To set a benchmark, we start by considering the correlation of the seven characteristics above with a standard TFP measure computed following Wooldridge (2009), i.e., without controlling for technological heterogeneity. 8 Table 7 reports the estimation output. The first column reports the coefficients estimated in separate regressions with country and sector fixed effects. The positive sign highlights the higher TFP characterizing the internationalized firms: this is the TFP premium highlighted by the literature in the field (see Altomonte and Aquilante 2012 for consistent estimates on the same data).
In columns 2-4, the seven characteristics (with the non-active-abroad firms set as the excluded category) are considered all together, in a short horse-race aimed at identifying the main markers of the productivity premium. As well as including country and sector fixed effects, we also consider firm age and size; in columns (4) and (5), we also control for how much the country-sector in which the firm operates is under pressure due to import competition (with data drawn from COMTRADE). To do this, we consider the share of imports from China in country-sector total imports. Moreover, four size categories are considered: small (10-19 employees), small-medium (20-49), medium (50-250), and large (>250). Larger firms are significantly more productive, particularly when operating in sectors featuring fiercer import competition from China. However, the exporter status loses significance when more complex specifications are used: the more productive firms seem to be those classified as active outsourcers and importers of services, as well as those involved in FDI activity.
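A specification of this kind can be sketched as an ordinary least squares regression with country and sector dummies. The Python sketch below is only illustrative: the function names are hypothetical, the controls in `Z` (age, size categories, the China import share and its interactions with size) are assumed to be prepared beforehand, and no standard errors are computed.

```python
import numpy as np

def one_hot_drop_first(codes):
    """Dummy columns for a categorical variable, omitting the first level."""
    codes = np.asarray(codes)
    levels = np.unique(codes)
    return (codes[:, None] == levels[None, 1:]).astype(float)

def fit_exposure_regression(B, I, Z, country, sector):
    """OLS of a productivity measure B on exposure dummies I, controls Z and fixed effects.

    Returns the estimated coefficients on the exposure dummies only.
    """
    B = np.asarray(B, dtype=float)
    I = np.asarray(I, dtype=float)
    Z = np.asarray(Z, dtype=float)
    n = len(B)
    X = np.column_stack([
        np.ones(n),                    # intercept
        I,                             # internationalization dummies
        Z,                             # firm-level controls
        one_hot_drop_first(country),   # country fixed effects
        one_hot_drop_first(sector),    # sector fixed effects
    ])
    coef, *_ = np.linalg.lstsq(X, B, rcond=None)
    return coef[1:1 + I.shape[1]]
```

Running the same function with `B` set in turn to standard TFP, the BTFP gap, and the WTFP gap would mirror the comparison across productivity dimensions, with firms that are not active abroad acting as the excluded category when all seven dummies enter together.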
Against the backdrop of these standard results, Tables 8 and 9 show the results on whether and how the significance in analogous regressions changes when standard TFP is replaced by our BTFP gap (see Table 8) and WTFP gap (see Table 9) measures.
Comparing Tables 8 and 9 with Table 7 clearly shows that, while the explanatory power of the country-sector controls is broadly unaffected, the significance of all the internationalization dummies vanishes when WTFP gaps are considered, while the correlation is even stronger with the BTFP gaps. In particular, the technology advantage is found to be particularly important for the firms undertaking FDI, for global exporters, importers of materials and, finally, for exporters.
Firms operating in country-sectors relatively more exposed to competitive pressure from China are not significantly less productive. At the same time, large firms operating in those country-sectors display significant technology advantages, consistent with the evidence provided by Bloom et al. (2016). As for the BTFP gaps, size strongly matters in all specifications, with larger firms found to produce with better technologies. Interestingly, firm age appears to be non-significant in the standard TFP regressions because it acts in opposite directions in terms of between-technology and within-technology productivity dimensions, with younger firms characterized by lower BTFP gaps and higher WTFP gaps. This suggests that technology is more accessible to young firms but also that they need time to accumulate experience in using it.
We next focus on the geographical articulation of the between-technology premium. The results are reported in Tables 10 to 14. The significance of the technology advantage associated with being an exporter is driven by firms exporting to EU countries and, among the global exporters, by firms exporting to China. The same is true for FDI. Regarding importer status, the technology gaps are significantly lower in firms importing from within the EU and from China, and for the importers of services. Finally, in Table 14, we disentangle the China competition effect by exploiting information on the countries in which the firm has competitors. The more technologically advanced firms seem to be those operating in competition with the USA and other EU countries. Interestingly, having competitors located in China tends to be associated with higher technology gaps. This effect persists also when the interaction with size is controlled for (see the last column of Table 14). However, the technology gap is significantly lower when having competitors in China is coupled with a larger size (consistent with the results in column 4).

Conclusions
In the standard TFP concept, the notion of productivity includes several hard-to-measure factors, namely all the differences in output not explained by different input choices. Studying the role played by the technological dimension within this general notion of productivity is gaining increasing interest, particularly from a policy perspective. Cross-firm productivity heterogeneity is commonly understood in terms of "technological" heterogeneity but lacks a clear interpretation. One of the fields in which this issue is particularly pervasive is international economics, where the firms exposed to international markets are usually found to be more productive than domestic firms. Regardless of the direction of causality, this productivity premium is commonly attributed to a superior technology level, although "it should be recognized that very technologically advanced firms might not necessarily appear as the globally most productive or the most successful in terms of profits" (OECD 2015, p. 33). The literature on the technological dimension of productivity is sparse, chiefly because estimating technology-specific production function parameters is a daunting task with standard econometrics, without an ex-ante assumption on the technology used by each firm. In standard estimates (producing sector-specific estimates), the factor parameter differences not accounted for in the estimation flow entirely into the Solow residual. This study has built on the mixture regression analysis discussed in Battisti et al. (2020) to unbundle between-technology and within-technology TFP by estimating technology-specific production functions and to disentangle technological and within-technology productivity gaps across firms. Technology gaps are measures of the output loss associated with not using the best technology, given the level of inputs, while within-technology TFP gaps can be considered measures of the ability to use the given technology.
Our analysis takes advantage of detailed information on labor types (which is key to controlling for human capital) and on firms' international operations (which is exploited to study the productivity-internationalization relationship) provided by the EFIGE database.
The estimation points to a sectoral number of technologies ranging from one to three, depending on the sector. We also find, although with some cross-sector variability, that technology gaps are more than three times larger than within-technology TFP gaps on average. A non-significant relationship between international exposure and within-technology TFP gaps is found, while the relationship with the between-technology gaps is strongly significant. In particular, the technology advantage is found to be particularly important for firms undertaking FDI, for global exporters, for importers of materials, and for exporters. The most technologically advanced firms generally seem to be those operating in competition with the USA and other EU countries. Interestingly, while the firms operating in country-sectors relatively more exposed to competitive pressure from China do not seem to feature relatively higher within-technology productivity gaps, large firms operating in those country-sectors display significant technology advantages. Size strongly matters in the technology adoption process, with larger firms producing with better technologies. Interestingly, age displays the opposite correlations, with younger firms using better technologies but less effectively. This result suggests that technology is more accessible to young firms but that they need time to accumulate experience in using it. From a policy perspective, the overall picture produced by our estimates supports the common interpretation of the productivity gap among less internationalized firms in terms of distance from the technological frontier. In addition, our analysis allows for more granular policy implications than previous studies do because it allows us to disentangle the specific correlation between given firm characteristics on the one hand and both the technology choices and the ability to use the adopted technology on the other hand. Our results suggest that helping young firms learn to use new technologies more quickly is more important than improving the technology adoption process.
Furthermore, as larger firms are found to react more positively to external competition than their smaller counterparts, policy initiatives may consider creating instruments designed specifically to encourage the technological upgrade of smaller internationalized firms (e.g., institutional incentives for technological sharing in firm networks, technological partnerships, shared patent portfolios). While a policy-oriented discussion of our findings is outside the scope of this study, its implications are suggestive and should stimulate further empirical and methodological investigation into the empirical relationships between technology-driven TFP and within-technology productivity.
function of these changes. Insofar as this function is invertible, its inverse may be calculated and plugged into the production function-estimating equation. Olley and Pakes (1996) suggest resorting to investment as a proxy, whereas Levinsohn and Petrin (2003) use intermediates. Doraszelski and Jaumandreu (2012), who extend Olley and Pakes (1996), show that a firm's TFP is stochastically affected by its investment in knowledge (considered in terms of R&D). 10 The implementation of such a semi-parametric approach within our mixture model framework raises identification problems. In particular, the way in which the proxy variable (either investment or materials) reacts to changes in technology and TFP should be specified separately because firms' input choices may be differently correlated with the technology parameters and firms' TFP. Our model of technology adoption developed in Sect. 4 is meant to deal with these two sources of simultaneity separately, without the need to rely on a specific proxy variable.
Similarly to the semi-parametric strategy, our approach to simultaneity entails a two-step estimation. However, instead of trying to include the whole productivity term (as proxied by the inverted investment or intermediates demand function) in the production function, our strategy seeks to take its effect (on the capital and labor choice) into account by controlling for its covariance with labor and capital. We exploit the fact that the correlations between capital and technology as well as between either capital or labor and TFP flow into the residuals of the system of equations in Eq. (2), consisting of the K policy function and static condition for L. Once estimated, these residuals allow us to control for simultaneity in the mixture analysis. As well as controlling for the simultaneity caused by idiosyncratic TFP, this strategy can control for any other factor affecting the input choice through the relationships in Eq. (2). This aspect is particularly relevant in light of the results provided by Gandhi et al. (2018) on the identification problems characterizing the semi-parametric approach. Such problems stem from the circumstance that even when the invertibility conditions (for the proxy variable demand function) identified by the literature are satisfied, it is still possible to construct a continuum of observationally equivalent production functions that cannot be distinguished from the true production function in the data. 11 By not resorting to any proxy variable, our approach is in principle immune to this order of problems.
Second, our timing of input choice is strictly in line with Olley and Pakes (1996), with labor dealt with as a "perfectly variable" input decided at the time of production. While this is also true in Levinsohn and Petrin (2003), Ackerberg et al. (2015) highlight the collinearity problems associated with such decision timing and suggest a slightly different specification in which labor is a "less variable" input than intermediates chosen one "subperiod" before productivity is observed. This implies that differently from Olley and Pakes (1996) and Levinsohn and Petrin (2003), in which the labor and capital coefficients are identified in the first and second steps respectively, the first step is only used to net the output function of the idiosyncratic productivity (and in our case also technology) shock, with both the capital and the labor coefficients retrieved in the second stage. 12 Although we use the same timing of input choice as Olley and Pakes (1996) and Levinsohn and Petrin (2003), our approach is not subject to the Ackerberg et al. (2015) critique, as both coefficients are estimated in the second stage in our case.

First-stage estimation
In this section we report the output of the first stage of our estimation algorithm, based on the system in (2). We also report in Fig. 3 the overall distribution of the estimated first-stage residuals, which are then used in the second stage in order to control for simultaneity.