Introduction

The constant elasticity of substitution (CES) utility functions are commonly used in economic analysis. Since the seminal contributions of Krugman (1979) and, more recently, Melitz (2003), a large majority of studies in the area of international trade have relied on CES utility functions. In empirical studies aiming to estimate the structural parameters of economic relationships, knowledge of the CES preference parameter is crucial. One strand of the existing literature in applied industrial organization is dedicated to estimating firm markup (market power). For example, De Loecker and Warzynski (2012) developed a framework to estimate the markup from firm-level production data and explored the relationship between markups and firm export behaviour. In theoretical models utilizing CES utility and/or production functions, in the presence of monopolistic competition, the CES parameter is linked to markup pricing through firm profit maximization behaviour.

In empirical studies aiming to estimate structural parameters, the CES utility function is regularly employed to generate a demand function, which is then used to determine the profit-maximizing price. In such situations, estimating the preference parameter is typically necessary to identify other underlying structural parameters. For example, Aw et al. (2011) estimated a dynamic structural model involving the R&D and export behaviour of firms in the Taiwanese electronics industry and recovered a CES preference parameter of around 0.84 in the domestic market and 0.83 in the export market. While examining the export behaviour of Colombian manufacturing firms using a Bayesian Monte Carlo Markov chain estimator, Das, Roberts, and Tybout (2007) recovered the CES preference parameter in the 0.88–0.94 range.

One commonly used strategy to identify the CES preference parameter (denoted as ρ) is to assume that the marginal cost (MC) of production for firms is constant and then utilize the link between total variable cost (TVC) and total revenue (TR) implied by profit maximization (described in detail in the next section). TVC and TR can normally be directly observed in most datasets. In this paper, we consider the robustness of this strategy to violations of the constant MC assumption. To this end, we rely on a controlled Monte Carlo experimental environment. We start by defining the degree of violation of the constant MC assumption by using the proportion of firms in the industry that have a constant marginal cost (β). This means that a 1–β proportion of firms do not have a constant marginal cost. A higher β in a sample implies a relatively less severe violation of the constant MC assumption, whereas β = 0 means a complete violation. Using a Monte Carlo experiment, we derive the probability that a β-degree violation of the constant MC assumption would not lead to statistically significant differences between the estimates of ρ in the treatment and control groups as a function of β. This probability can be defined as the power of the constant MC assumption.

Our study has two important implications for empirical research. First, for studies that involve estimating the CES preference parameter using the constant MC assumption, knowledge of the power function of the assumption enables researchers to evaluate the robustness of their estimates. We find that this assumption is not overly restrictive, as it has a high power in recovering the underlying preference parameter. In our experiment, we find that, as long as at least 20 percent of the firms in the sample have a constant MC, the probability that the differences between the estimates of ρ in the treatment and control samples are statistically insignificant is higher than 90 percent. Additionally, if researchers are willing to focus only on comparing the mean, variance, or median between the control and treatment samples, then the proportion of constant MC firms can be further reduced to 10 percent.

Second, the constant MC assumption is widely used in the applied industrial organization and international trade literature, either explicitly or implicitly. For example, studies on firm export behaviour often assume that the MC of production is constant, allowing researchers to separately consider the export market and domestic market behaviour of a firm (see, for example, Demidova et al. 2012; Eaton and Kortum 2002; Melitz 2003). In one strand of the literature dealing with the estimation of firm markup (market power), the constant MC assumption is used to link the markup to observable data. For example, De Loecker and Warzynski (2012) proposed a framework to estimate the markup from firm production data, where the constant MC assumption allows one to express the markup as the ratio of the output elasticity of an input to the expenditure share of the same input in a firm's total sales. In this paper, we examine the power of the constant MC assumption, enabling us to assess the robustness of existing studies relying on this assumption.

In summary, the CES utility functions play a pivotal role in economic analysis, particularly in the fields of international trade and industrial organization. Estimating the CES preference parameter is essential for comprehending firm behaviour, market power dynamics, and underlying structural relationships. Empirical investigations hinge on the assumption of constant marginal cost, a premise that our research rigorously evaluates for its robustness. Our findings offer valuable insights into the dependability of this assumption, thereby enhancing the validity of existing empirical research.

The rest of this paper is organized into five sections. In Sect. "Economic Framework and Identification Strategy", we describe the economic model and associated identification strategy. Sect. "The Experiment" details the experiment's setup, and Sect. "Findings" discusses the findings from the experiment. In Sect. "Estimation Using Real Data", we estimate the CES preference parameter from real data. Sect. "Concluding Remarks" concludes the paper.

Economic Framework and Identification Strategy

Consider a monopolistically competitive market in which heterogeneous firms produce differentiated products and consumers have CES preferences. Consumer utility maximization yields the following Marshallian demand function:

$$ q_{i} = \Phi p_{i}^{\frac{1}{\rho - 1}} $$

where q and p represent the quantity demanded and the price, respectively; Φ represents the aggregate demand, which firms take as exogenous since they are small relative to the market; ρ (0 < ρ < 1) is the CES preference parameter that researchers intend to estimate from the observed firm revenue and cost data; and the subscript i indexes the firms/varieties.
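For reference, the demand exponent can be related to the elasticity of substitution, which is referred to later in the paper. Under the standard CES specification (an assumption here, since the utility function itself is not written out), and using σ as an illustrative symbol for the elasticity of substitution:

$$ \sigma = \frac{1}{1 - \rho }, \qquad \frac{1}{\rho - 1} = - \sigma , \qquad q_{i} = \Phi p_{i}^{ - \sigma } $$

For example, ρ = 0.6 corresponds to σ = 2.5 and ρ = 0.9 corresponds to σ = 10.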

Firms have a cost function TCi(qi) with the marginal cost of production being MCi(qi). Note that both functions are indexed by i, indicating the heterogeneity among firms. Firms set prices of their products to maximize profits, as follows:

$$ \max_{p_{i}} \pi_{i} = \Phi p_{i}^{\frac{\rho }{\rho - 1}} - TC_{i}\left( q_{i} \right) $$

Solving the maximization problem yields the first order condition (FOC) and second order condition (SOC) as follows:

$${p}_{i}=\frac{{MC}_{i}\left({q}_{i}\right)}{\rho }$$
(1)
$$\frac{\partial {MC}_{i}}{\partial {q}_{i}}\frac{{q}_{i}}{{MC}_{i}}>\rho -1$$
(2)

where Eq. (1), which is the first order condition, describes the well-known firm pricing behavior, specifically the practice of setting the price as a markup over its marginal cost of production.

Equation (2) suggests that the quantity elasticity of marginal cost must be larger than ρ – 1.
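The step from the profit function to Eqs. (1) and (2) can be sketched as follows. Differentiating profit with respect to price, with \(q_{i} = \Phi p_{i}^{1/(\rho - 1)}\) substituted into the cost term, gives

$$ \frac{\partial \pi_{i}}{\partial p_{i}} = \frac{\rho }{\rho - 1}\Phi p_{i}^{\frac{1}{\rho - 1}} - MC_{i}\left( q_{i} \right)\frac{1}{\rho - 1}\Phi p_{i}^{\frac{2 - \rho }{\rho - 1}} = 0 \quad \Longrightarrow \quad \rho p_{i} = MC_{i}\left( q_{i} \right) $$

Equivalently, in quantity space, marginal revenue is \(MR_{i} = \rho p_{i}\) (MR is a symbol introduced here for illustration), and the second order condition requires marginal revenue to fall faster than marginal cost at the optimum:

$$ \frac{\partial MR_{i}}{\partial q_{i}} = \rho \left( \rho - 1 \right)\frac{p_{i}}{q_{i}} < \frac{\partial MC_{i}}{\partial q_{i}} \quad \Longrightarrow \quad \frac{\partial MC_{i}}{\partial q_{i}}\frac{q_{i}}{MC_{i}} > \rho - 1 $$

where the last step divides both sides by \(MC_{i}/q_{i} = \rho p_{i}/q_{i} > 0\).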

To identify ρ, one can impose a constant marginal cost restriction on the cost function, specifically MCi = ci. Given this restriction, the optimal price is \({p}_{i}={c}_{i}/\rho \) and the optimal quantity is \({q}_{i}={\rho }^{\frac{1}{1-\rho }}\Phi {c}_{i}^{\frac{1}{\rho -1}}\), so that a firm’s optimal total revenue (TR) is \({TR}_{i}={p}_{i}{q}_{i}=\frac{1}{\rho }{\rho }^{\frac{1}{1-\rho }}\Phi {c}_{i}^{\frac{\rho }{\rho -1}}\) and its total variable cost (TVC) is \({TVC}_{i}={c}_{i}{q}_{i}={\rho }^{\frac{1}{1-\rho }}\Phi {c}_{i}^{\frac{\rho }{\rho -1}}\). Accordingly, there exists a relationship between TVC and TR, implied by firm profit maximization behavior, as follows:

$${TVC}_{i}=\rho {TR}_{i}+{\varepsilon }_{i}$$
(3)

where ε represents a measurement error.

With data on total revenue and variable cost in hand, one can simply regress total variable cost against total revenue to obtain an estimate of ρ. This strategy has been utilized by a number of researchers in the recently popular structural estimation of firm behavior (see for example Aw et al. 2011; Das et al. 2007; Sun 2023).
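The regression in Eq. (3) can be illustrated with a minimal sketch. It simulates constant MC firms without measurement error and regresses TVC on TR through the origin (Eq. (3) has no intercept; whether the cited studies include one is not stated here). The function names, the marginal cost range U(0.5, 1), and the seed are illustrative assumptions.

```python
import random


def simulate_constant_mc_firms(rho, phi, n, seed=0):
    """Generate noise-free TR and TVC for n constant-MC firms (illustrative)."""
    rng = random.Random(seed)
    tr, tvc = [], []
    for _ in range(n):
        c = rng.uniform(0.5, 1.0)            # assumed marginal cost range
        p = c / rho                          # Eq. (1): price is a markup over MC
        q = phi * p ** (1.0 / (rho - 1.0))   # Marshallian demand
        tr.append(p * q)
        tvc.append(c * q)
    return tr, tvc


def ols_through_origin(x, y):
    """Slope of a no-intercept least-squares regression of y on x."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


tr, tvc = simulate_constant_mc_firms(rho=0.8, phi=1.0, n=100)
rho_hat = ols_through_origin(tr, tvc)
```

Without measurement error, the slope equals the true ρ up to floating-point rounding, because TVC is exactly ρ times TR for every constant MC firm.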

Clearly, this identification strategy relies on the assumption of marginal cost being constant with respect to the quantity of output. Nevertheless, how reasonable is this assumption? To explore this, assuming the total cost function is l times differentiable, where l is a positive integer, one can Taylor expand the total cost function at qi = 0, as follows:

$${TC}_{i}\left({q}_{i}\right)\approx {fc}_{i}+{TC}_{i}{\prime}\left(0\right){q}_{i}+\frac{{TC}_{i}^{{\prime}{\prime}}\left(0\right)}{2}{q}_{i}^{2}+\frac{{TC}_{i}^{\left(3\right)}\left(0\right)}{3!}{q}_{i}^{3}+\cdots +\frac{{TC}_{i}^{\left(l\right)}\left(0\right)}{l!}{q}_{i}^{l}$$
(4)

where fc denotes the fixed cost of production and fci = TCi(0).

As l approaches ∞, the approximation becomes exact. Therefore, the constant marginal cost assumption is equivalent to truncating the approximation at l = 1 (first order Taylor expansion). The validity of this assumption depends on the magnitude of the higher-order terms in Eq. (4). Nevertheless, the constant MC assumption, which greatly facilitates economic modeling, can also lead to a good approximation of reality.

Despite the apparent validity of the constant marginal cost assumption, it is still worthwhile to investigate its impact on the identification of the preference parameter. To do so, we first truncate Eq. (4) at l = 3, and assume that each firm in the industry has one of the following three cost functions:

$${TC}_{i}\left({q}_{i}\right)={fc}_{i}+{b}_{i}{q}_{i}$$
$${TC}_{i}\left({q}_{i}\right)={fc}_{i}+{b}_{i}{q}_{i}+{c}_{i}{q}_{i}^{2}$$
$${TC}_{i}\left({q}_{i}\right)={fc}_{i}+{b}_{i}{q}_{i}+{c}_{i}{q}_{i}^{2}+{d}_{i}{q}_{i}^{3}$$

The above cost functions correspond, respectively, to constant, linear, and quadratic marginal costs. The third-order Taylor expansion can approximate the underlying cost function well. Suppose that β proportion of firms in the industry have constant MC (0 ≤ β ≤ 1) and the remaining firms have non-constant MC.

For each value of β (given ρ, Φ, fc, b, c, d), it is possible to generate firm-level data (to which measurement errors can be appended to form realistic sample data) that can be used for empirical evaluation. Clearly, if β = 1, estimation will correctly recover the parameter ρ. Hence, a sample with β = 1 can be used as the control group in our experiment, whereas samples with β < 1 form the treatment group. For example, if we would like to examine whether the identification strategy works when 20% of firms in the industry do not have constant MC (β = 0.8), we replace 20% of the firms in our simulated control sample (where β = 1) with firms that have non-constant MC. Except for these 20% of firms, the control and treatment samples are identical. Hence, the differences in estimated ρ from the two samples can only be due to the differences in β. In the next section, we describe the experiment in detail.
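The construction of a treatment sample from a control sample can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name, the representation of a firm as a tuple, and the random choice of which positions to replace are all assumptions.

```python
import random


def make_treatment_sample(control, nonconstant_pool, beta, seed=0):
    """Replace a (1 - beta) share of control firms with non-constant MC firms.

    All other firms are left untouched, so the two samples differ only in
    the replaced observations.
    """
    rng = random.Random(seed)
    n = len(control)
    n_replace = round((1.0 - beta) * n)
    replace_idx = set(rng.sample(range(n), n_replace))   # positions to swap out
    replacements = iter(rng.sample(nonconstant_pool, n_replace))
    return [next(replacements) if i in replace_idx else firm
            for i, firm in enumerate(control)]


# Toy data: each firm is a (type, id) pair
control = [("constant", i) for i in range(100)]
pool = [("linear", i) for i in range(200)]
treatment = make_treatment_sample(control, pool, beta=0.8)
```

With β = 0.8 and 100 firms, 20 positions hold linear MC firms and the remaining 80 are identical to the control sample.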

The Experiment

To facilitate exposition, we first define three terms that are used in this section. First, in each trial, we generate two sets of observations (i.e., the control and treatment samples). The ordinary least squares (OLS) technique is applied to both the control and treatment samples to recover the preference parameter ρ, where \({\widehat{\rho }}_{control}\) and \({\widehat{\rho }}_{treatment}\) denote the estimated parameter from the control and treatment samples, respectively. Second, each run contains a number of trials. Since each trial generates an estimate of ρ for the control and treatment samples, each run generates distributions of \({\widehat{\rho }}_{control}\) and \({\widehat{\rho }}_{treatment}\). If the identification strategy is robust, the differences between the two distributions would be statistically insignificant. Statistical tests, such as the Kolmogorov–Smirnov test, can be used to test whether the differences between the two distributions are statistically significant. Third, each round contains a number of runs. In each run, tests are conducted to check the differences between the distributions of the control and treatment estimates. If each round contains a sufficiently large number of runs, each round can generate the probability that the statistical tests will fail to detect significant differences between the control and treatment estimates. This probability measures the extent to which the identification strategy is robust.

The hierarchical relationship among the round, the run, and the trial is as follows: round > run > trial. In other words, each round contains many runs and each run contains many trials. For each value of β, an experimental round can be implemented to generate the probability that the identification strategy is robust (which can also be defined as the power of the identification strategy). Therefore, a functional relationship between the power of the identification strategy and the proportion of firms that satisfy the constant MC assumption (i.e., β) can be derived through a multi-round experiment. In the following, we first describe the underlying data generating process and then the details of the experiment setup.
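The round > run > trial hierarchy can be sketched as nested loops. The skeleton below is a hedged illustration: `one_trial` and `distributions_differ` are hypothetical callables standing in for the estimation step and the statistical test, and the toy stand-ins at the bottom exist only to make the sketch runnable.

```python
def run_round(beta, n_runs, n_trials, one_trial, distributions_differ):
    """One round: share of runs in which no significant difference is found."""
    n_insignificant = 0
    for run in range(n_runs):
        control_estimates, treatment_estimates = [], []
        for trial in range(n_trials):
            # Each trial yields one estimate of rho per sample
            rho_control, rho_treatment = one_trial(beta, run, trial)
            control_estimates.append(rho_control)
            treatment_estimates.append(rho_treatment)
        # Each run compares the two distributions of estimates once
        if not distributions_differ(control_estimates, treatment_estimates):
            n_insignificant += 1
    return n_insignificant / n_runs


def run_experiment(betas, n_runs, n_trials, one_trial, distributions_differ):
    """One round per value of beta; returns the estimated power for each beta."""
    return {beta: run_round(beta, n_runs, n_trials, one_trial, distributions_differ)
            for beta in betas}


# Toy stand-ins (purely illustrative, not the paper's estimator or tests)
def toy_trial(beta, run, trial):
    return 0.9, (0.9 if beta >= 0.5 else 0.5)


def toy_test(control, treatment):
    mean_gap = sum(control) / len(control) - sum(treatment) / len(treatment)
    return abs(mean_gap) > 1e-6


powers = run_experiment([0.0, 1.0], n_runs=5, n_trials=4,
                        one_trial=toy_trial, distributions_differ=toy_test)
```

Passing the trial and test steps as arguments keeps the hierarchy separate from the data generating process and from the choice of test.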

Data Generating Process

In each round, we generate pre-measurement error total revenue and variable cost for three types of firms (constant, linear, and quadratic MC firms). Data on these three types of firms are generated as follows.

First, we draw a set of values for the cost function parameters (i.e., fc, b, c, d) from uniform distributions such that \(fc\sim U\left(\underline{fc},\overline{fc}\right)\), \(b\sim U\left(\underline{b},\overline{b}\right)\), \(c\sim U\left(\underline{c},\overline{c}\right)\), and \(d\sim U\left(\underline{d},\overline{d}\right)\) where the upper and lower bars denote the upper and lower bounds of the uniform distributions. Such a process allows firms to be heterogeneous.

Second, given (ρ, Φ, fc, b, c, d), we use Eq. (1) to calculate the optimal price that each of the three types of firms will charge. For constant MC firms, \({p}_{i}={MC}_{i}/\rho \). The optimal price charged by linear MC firms is determined by solving \(2c_{i} \Phi p_{i}^{1/\left( {\rho - 1} \right)} - \rho p_{i} + b_{i} = 0\). The optimal price charged by quadratic MC firms is determined by \(3d_{i} \Phi^{2} p_{i}^{2/\left( {\rho - 1} \right)} + 2c_{i} \Phi p_{i}^{1/\left( {\rho - 1} \right)} - \rho p_{i} + b_{i} = 0\). Once the optimal prices have been determined, we check whether the second-order condition as given by Eq. (2) is satisfied. If the optimal price leads to violation of the second-order condition, then the firm is discarded.
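For linear MC firms, the first order condition \(\rho p_{i} = b_{i} + 2c_{i}q_{i}\) with \(q_{i} = \Phi p_{i}^{1/(\rho - 1)}\) has no closed-form solution in general, so a numerical root finder is needed. The sketch below uses bisection; the choice of solver, the bracketing rule, and the function names are assumptions (the solver actually used in the experiment is not stated). It assumes c ≥ 0, in which case the left-hand side of the condition is strictly decreasing in price and the root is unique.

```python
def solve_price_linear_mc(rho, phi, b, c, tol=1e-12):
    """Solve 2*c*phi*p**(1/(rho-1)) - rho*p + b = 0 for p by bisection (c >= 0)."""
    def foc(p):
        return 2.0 * c * phi * p ** (1.0 / (rho - 1.0)) - rho * p + b

    lo = b / rho                 # constant-MC price; foc(lo) >= 0 when c >= 0
    hi = 2.0 * lo
    while foc(hi) > 0.0:         # expand the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol * hi:    # bisect to a relative tolerance
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def satisfies_soc(rho, b, c, q):
    """Eq. (2): quantity elasticity of marginal cost must exceed rho - 1."""
    mc = b + 2.0 * c * q
    return mc > 0.0 and (2.0 * c * q / mc) > rho - 1.0


rho, phi, b, c = 0.8, 1.0, 0.7, 0.3
p_star = solve_price_linear_mc(rho, phi, b, c)
q_star = phi * p_star ** (1.0 / (rho - 1.0))
```

Starting the bracket at b/ρ avoids evaluating the demand function at prices close to zero, where the negative exponent can overflow.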

Third, given the optimal price (p), we use the Marshallian demand function to determine the corresponding quantity. Given the optimal quantity (q), we determine the total revenue (TR) as price times quantity and the total variable cost (TVC) from the cost function. This allows us to calculate the optimal profit of each firm. Firms with negative profits are excluded from the analysis, as these firms would not be expected to remain in the industry.
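This third step can be sketched as follows. The function names and the dictionary layout are illustrative, and profit is taken to be net of the fixed cost fc, which is an assumption about how the exclusion rule is applied.

```python
def firm_outcomes(rho, phi, p, fc, b, c=0.0, d=0.0):
    """Quantity, TR, TVC and profit at price p for the cubic cost family."""
    q = phi * p ** (1.0 / (rho - 1.0))        # Marshallian demand
    tr = p * q                                # total revenue
    tvc = b * q + c * q ** 2 + d * q ** 3     # total variable cost
    profit = tr - tvc - fc                    # assumed net of fixed cost
    return {"q": q, "TR": tr, "TVC": tvc, "profit": profit}


def keep_viable(firms):
    """Drop firms with negative profit."""
    return [f for f in firms if f["profit"] >= 0.0]


# A constant-MC firm with b = 0.8 and rho = 0.8 charges p = b / rho = 1
low_fc = firm_outcomes(rho=0.8, phi=1.0, p=1.0, fc=0.05, b=0.8)
high_fc = firm_outcomes(rho=0.8, phi=1.0, p=1.0, fc=0.50, b=0.8)
viable = keep_viable([low_fc, high_fc])
```

In this toy case both firms have TVC/TR = 0.8, but only the firm with the lower fixed cost earns a non-negative profit and is retained.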

Experimental Setup

The most crucial parameter in our experiment is β, which represents the proportion of firms in the industry with constant MC. For a given β, we conduct one round of experiments to determine the probability that the differences between the distributions of the treatment and control estimates are statistically insignificant (i.e., the power of the constant MC assumption). Depending on the number of values of β, the experiment may consist of multiple rounds. For instance, if we set β in the range 0 to 1 with an interval of 0.1, we will have 11 rounds in the experiment.

In each round, a set of nobs firm observations is generated, and these observations are used throughout that round. Each round contains nr runs, and each run contains nt trials; we set nr = 100 and nt = 100. In each trial, two measurement errors are drawn from the standard normal distribution: one is appended to total revenue and the other to total variable cost. The control sample is drawn from the set of firms with constant MC, making the identification strategy valid for the control sample. Then, a (1 – β) proportion of firms in the control sample is removed and replaced with firms having either linear MC or a combination of linear and quadratic MC. Thus, the experiment has two types of treatments, and we report the results separately.

For the control and treatment samples, the same measurement errors (generated previously) are appended, respectively, to TVC and TR. Observations with a negative value of TVC or TR are discarded, which is a common practice. Then we regress TVC against TR to obtain the point estimate of ρ for the control and treatment samples, respectively. Any differences between the estimated value from the control sample (\({\widehat{\rho }}_{control}\)) and the true known value of ρ can be attributed to measurement errors. In contrast, the differences between the estimated value from the treatment sample (\({\widehat{\rho }}_{treatment}\)) and ρ are due to both the measurement errors and the degree of violation of the constant MC assumption. Accordingly, \(\left({\widehat{\rho }}_{treatment}-\rho \right)-\left({\widehat{\rho }}_{control}-\rho \right)={\widehat{\rho }}_{treatment}-{\widehat{\rho }}_{control}\) is attributable to the violation of the constant MC assumption. If the violation is unimportant, we should not observe any significant difference between the distributions of \({\widehat{\rho }}_{treatment}\) and \({\widehat{\rho }}_{control}\).

At the end of nt trials, we obtain nt = 100 estimates of \({\widehat{\rho }}_{treatment}\) and \({\widehat{\rho }}_{control}\). In the next stage, we test whether the distribution of the treatment estimates is significantly different from that of the control estimates. We rely on three types of tests.

The first test is the Kolmogorov–Smirnov test, which considers the entire distribution. The second test is the two-sample t-test, which is used to test the equality of the population means and variances. The third test is the two-sided Wilcoxon rank sum test, which is used to test for the equality of two medians. The results of these tests allow us to conclude whether the two distributions are significantly different from each other. The results of these tests, along with the results of the other runs, are used to compute the frequency (and hence the probability) of no significant difference in the two distributions. At the end of nr = 100 runs, we obtain the power of the MC assumption, that is, the estimated probability corresponding to the given value of β. After all rounds, we obtain a set of estimated probabilities and the corresponding βs, to which we can fit a logistic function to estimate a power function of the MC assumption. Note that when β = 1, the power function is expected to take a value of 1.
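Fitting a two-parameter logistic curve to the (β, estimated probability) pairs can be sketched as below. The fitting method used in the paper is not stated, so the least-squares grid search here is an assumption chosen for transparency, and the parameter names `k` (steepness) and `beta0` (midpoint) are illustrative. The data points are synthetic and generated from a known curve.

```python
import math


def logistic(beta, k, beta0):
    """Two-parameter logistic curve 1 / (1 + exp(-k * (beta - beta0)))."""
    return 1.0 / (1.0 + math.exp(-k * (beta - beta0)))


def fit_logistic_grid(betas, powers, k_grid, beta0_grid):
    """Least-squares fit by exhaustive search over a parameter grid."""
    best = None
    for k in k_grid:
        for b0 in beta0_grid:
            sse = sum((logistic(b, k, b0) - y) ** 2
                      for b, y in zip(betas, powers))
            if best is None or sse < best[0]:
                best = (sse, k, b0)
    return best[1], best[2]


# Synthetic example: 31 values of beta and powers from a known curve
betas = [0.01 * n for n in range(31)]
powers = [logistic(b, 40.0, 0.10) for b in betas]
k_hat, beta0_hat = fit_logistic_grid(
    betas, powers,
    k_grid=[float(k) for k in range(10, 61)],
    beta0_grid=[0.01 * j for j in range(31)],
)
```

On real simulation output the estimated probabilities are noisy frequencies, so the fitted parameters would only approximate the underlying curve; a gradient-based nonlinear least-squares routine would be a natural alternative to the grid.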

Matlab, running on Windows on a laptop, is used to implement the experiment, and we provide seeds to the Matlab random number generator for replication.

Findings

As indicated earlier, the experiment has five environmental parameters (i.e., nro, nr, nt, nobs, β), where nro denotes the number of rounds in the experiment, which is the number of β values. We set nro = 31, nr = 100, nt = 100, nobs = 100 and \(\beta \in \left\{ {0.01 \times n,n = 0,1,2, \ldots ,30} \right\}\). The choice of the parameter values is based on both the expected computation time and their possible impact on the outcome of the experiment. For example, since nt = 100, each run generates 100 estimates of ρ in the control and treatment samples, which leads to reasonable distributions of \({\widehat{\rho }}_{control}\) and \({\widehat{\rho }}_{treatment}\). As far as β is concerned, we only focus on the 0–0.3 range, as we find that β > 0.3 generally leads to the power of the constant MC assumption being 1.

The economic environment is captured by the parameters ρ, Φ, fc, b, c, and d. The experiment involves two scenarios. In Scenario 1, the non-constant MC observations in the treatment sample come only from linear MC firms, and <ρ, Φ, fc, b, c, d> = <{0.6, 0.7, 0.8, 0.9}, 1, fc ~ U(0.01, 0.1), b ~ U(0.5, 1), c ~ U(0, 1), 0>. In Scenario 2, the non-constant MC observations in the treatment sample come from both linear and quadratic MC firms, and <ρ, Φ, fc, b, c1, c2, d> = <{0.2, 0.5, 0.9}, 1, fc ~ U(0.01, 0.1), b ~ U(0.5, 1), c1 ~ U(0, 1), c2 ~ U(-0.1, 1), d ~ U(0.01, 0.1)>, where c1 and c2, respectively, are the values of parameter c in the cost functions of linear and quadratic MC firms. In Scenario 1, we do not allow the parameter c to be negative because a negative value of c often results in no solution to the first order conditions. In addition, the law of diminishing returns requires c to be positive. Similarly, in Scenario 2, d is positive. Nevertheless, we do allow for c to be negative in Scenario 2 (which implies that firms have decreasing MC over a certain range of output).

Before we report the power functions of the constant MC assumption for both scenarios, we first present the distributions of \({\widehat{\rho }}_{treatment}\) and \({\widehat{\rho }}_{control}\) in one run in Scenario 1 where ρ = 0.9 and β = 0.2 in Fig. 1. In Fig. 1, the two distributions exhibit only a small difference. For the control estimate, the mean of \({\widehat{\rho }}_{control}\) is 0.8997 with a standard deviation of 0.0021, while the mean of \({\widehat{\rho }}_{treatment}\) is 0.8994 with a standard deviation of 0.0029. The two-sample Kolmogorov–Smirnov test yields a test statistic of 0.13 with a p-value of 0.34, which fails to reject the null hypothesis of no difference between the two distributions at the 1% level of significance. Similarly, the two-sample t-test and the two-sided Wilcoxon rank sum test also fail to reject the null hypothesis of no difference in the means, variances and medians at the 1% level, respectively. Therefore, in this particular run, we conclude that the constant MC assumption indeed identifies the underlying preference parameter. Such a run is repeated 100 times, which allows us to calculate the relative frequency (i.e., the probability) of the situation where constant MC assumption identifies ρ. This percentage can also be interpreted as the power of the constant MC assumption.
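The reported Kolmogorov–Smirnov p-value can be approximately reproduced from the test statistic. The sketch below computes the two-sample statistic from the empirical distribution functions and an asymptotic p-value from the Kolmogorov series with a commonly used finite-sample correction. That the software used in the paper applies exactly this formula is an assumption, as is the group size of 100 estimates (taken from nt = 100); the function names are illustrative.

```python
import math


def ks_statistic(x, y):
    """Largest gap between the two empirical distribution functions."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        while i < n and xs[i] <= v:   # step past all x values <= v
            i += 1
        while j < m and ys[j] <= v:   # step past all y values <= v
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue_asymptotic(d, n, m, terms=100):
    """Asymptotic two-sided p-value with a finite-sample correction."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:                     # series converges slowly; p is effectively 1
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


p_value = ks_pvalue_asymptotic(0.13, 100, 100)
```

With a statistic of 0.13 and 100 estimates in each group, this approximation gives a p-value of roughly 0.34, in line with the value reported for this run.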

Fig. 1 Distribution of \(\hat{\rho }_{treatment}\) and \(\hat{\rho }_{control}\) (ρ = 0.9 and β = 0.2)

Scenario 1

In Scenario 1, the non-constant MC firms in the treatment sample include only the linear MC firms. Figure 2 presents the power function of the constant MC assumption when the CES preference parameter (ρ) is 0.9. The fitted logistic function is as follows:

$$Pr\left(\left.\beta \right|\rho =0.9\right)=\frac{1}{1+{e}^{-38.68\left(\beta -0.12\right)}}$$

Fig. 2 The power function (ρ = 0.9) in experiment 1

When β = 0, i.e., no firm in the sample has constant MC, the constant cost assumption has a power of 0, which means that one cannot recover the underlying preference parameter. However, as β increases, the power appears to increase fairly fast. The power converges to 1 when β approaches 0.25, i.e., if at least 25% of firms in the treatment sample have constant MC, then with a probability of 1 the constant MC assumption will identify the underlying preference parameter. Even for a β of 0.2, the power is as high as over 90 per cent. Note that Fig. 2 is the power function based on the Kolmogorov–Smirnov test of the equality of the ρ estimate distributions of the control and treatment. If one is willing to only compare the mean and variance (or median), then an even smaller β is needed to ensure that the constant MC assumption has a high power. For example, if only the mean and variance of the treatment and control distributions of ρ estimate are compared, a β value of only around 0.075 is sufficient to ensure that the constant MC assumption identifies ρ correctly with a probability of 0.90.
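The values quoted above for the Kolmogorov–Smirnov-based power function can be checked directly from the fitted logistic curve for Scenario 1 with ρ = 0.9 (steepness 38.68, midpoint 0.12). The function names below are illustrative.

```python
import math


def power(beta, k=38.68, beta0=0.12):
    """Fitted power function for Scenario 1 with rho = 0.9."""
    return 1.0 / (1.0 + math.exp(-k * (beta - beta0)))


def beta_for_power(target, k=38.68, beta0=0.12):
    """Invert the logistic curve: the beta at which the power equals target."""
    return beta0 + math.log(target / (1.0 - target)) / k


power_at_0 = power(0.0)            # close to 0
power_at_20 = power(0.20)          # above 0.9
power_at_25 = power(0.25)          # close to 1
beta_90 = beta_for_power(0.90)     # slightly below 0.2
```

The fitted curve gives a power of about 0.01 at β = 0, about 0.96 at β = 0.2, and about 0.99 at β = 0.25; a power of 0.9 is reached at β of roughly 0.18.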

Figure 3 reports three power functions in experiment 1. These power functions correspond to three alternative values of the preference parameter ρ. The top panel in Fig. 3 is the power of the constant MC assumption when ρ = 0.8. We observe an S-shaped curve, which is similar to the power function when ρ = 0.9 (in Fig. 2). The power function converges to 1 as β approaches 0.25. As ρ decreases to 0.7 and then to 0.6, the S-shape of the power function completely disappears (see the bottom panel of Fig. 3). In fact, when ρ = 0.6 in the bottom panel of Fig. 3, irrespective of the number of constant MC firms included in the treatment sample, the constant MC assumption identifies the preference parameter with a probability of 1. Therefore, the power of the constant MC assumption also depends on the underlying preference parameter (ρ). When ρ is low (i.e., when the elasticity of substitution is low), the constant MC assumption has a much higher power compared to the case of high ρ. Our results show that the power of the constant MC assumption approaches 1 for a sufficiently low value of ρ.

Fig. 3 The power function in experiment 1

Scenario 2

In Scenario 2, the 1 – β proportion of observations in the treatment sample includes both linear and quadratic MC firms. We allow for the quadratic MC firms to have decreasing MC over a certain range of output. Figure 4 reports the power function where ρ = 0.9. Once again, we observe an S-shaped curve. The fitted logistic function is as follows:

$$Pr\left(\left.\beta \right|\rho =0.9\right)=\frac{1}{1+{e}^{-37.35\left(\beta -0.11\right)}}$$

Fig. 4 The power function (ρ = 0.9) in experiment 2

As in Scenario 1, shown in Fig. 2 (where non-constant MC firms in the treatment sample include only the linear MC firms), the constant MC assumption is still very powerful in identifying the underlying CES preference parameter (given the true value of the parameter, 0.9) for the following reasons. First, to achieve a power of 0.9, one needs a β of well below 0.2. In other words, including fewer than 20 per cent of firms with constant MC can enable researchers to correctly identify the preference parameter ρ. Second, the threshold value of β required for the power to converge to 1 is also slightly lower (around 0.2). However, in both situations (i.e., in Experiments 1 and 2, where ρ = 0.9), a complete violation of the constant MC assumption (β = 0) always results in a 0 power of identifying ρ correctly. This suggests that if the sample does not include any constant marginal cost firms, then it would be inappropriate to utilize the constant MC assumption.
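As a worked check of the first point, the fitted Scenario 2 curve (steepness 37.35, midpoint 0.11) can be inverted for the β at which the power reaches a target level P; the symbol \(\beta^{*}\) is introduced here for illustration:

$$ \beta^{*}\left( P \right) = 0.11 + \frac{1}{37.35}\ln \frac{P}{1 - P}, \qquad \beta^{*}\left( 0.9 \right) = 0.11 + \frac{\ln 9}{37.35} \approx 0.169 $$

so a power of 0.9 is reached with roughly 17 per cent of firms having constant MC.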

As in Scenario 1, we also construct the power functions for alternative values of ρ. These results are shown in Fig. 5. Figure 5 resembles Fig. 3. As ρ decreases, the S-shape of the power function disappears. For a given β, a decrease in ρ increases the power of the constant MC assumption. For a sufficiently low value of ρ (for example 0.6 in Fig. 5), irrespective of the β value (i.e., even if all firms in the sample have non-constant MC), the constant MC assumption correctly identifies ρ (as indicated by the corresponding probability of 1). This implies that the constant MC assumption is particularly applicable to industries where the elasticity of substitution is low.

Fig. 5 The power function in experiment 2

For a given ρ, not surprisingly, a higher value of β leads to a higher power of the identification strategy. In addition, unless one is examining industries that involve products with a relatively low degree of substitutability (i.e., where ρ is low), the assumption of constant MC would be inappropriate unless the sample includes a certain proportion of constant MC firms.

Estimation Using Real Data

Considering the findings from the experiments, we now turn to real data to estimate the CES preference parameter (ρ). The data consist of firm revenue from the domestic market and total variable costs (the sum of total wages and intermediate inputs) for the year 2007, sourced from the National Bureau of Statistics (NBS) in China. The dataset covers five four-digit industries: instant noodles and other convenient food manufacturing (252 firms), textile and garment manufacturing (6568 firms), cosmetics manufacturing (225 firms), household kitchen appliances manufacturing (306 firms), and computer manufacturing (85 firms). Table 1 presents the estimation results. Notably, the point estimates of ρ and the implied elasticities of substitution appear reasonable and broadly align with estimates in existing studies (for example, see Aw et al. 2011; Das et al. 2007; and Sun 2023).

Table 1 Estimations with real data

Concluding Remarks

In empirical studies involving estimation of structural parameters, to identify the usual CES preference parameter, ρ, a commonly used strategy is to assume that the marginal cost (MC) of production of firms is constant. This assumption allows one to utilize the link between total variable cost and total revenue of firms to recover the CES preference parameter. In this paper, based on Monte Carlo experiments, we examine the robustness of this strategy to violation of the constant MC assumption. To this end, we explore the power of the constant MC assumption in an experimental framework. For a known value of the preference parameter, we first generate a control sample where all firms have constant MC. We then construct the treatment sample by replacing a (1–β) proportion of observations in the control sample with firms that have non-constant MC, where β is the proportion of firms with constant MC. By comparing the estimates of ρ from the treatment and control samples, we are able to estimate the power of the constant MC assumption as a function of β (i.e., the degree of violation of the constant MC assumption).

Results of the Monte Carlo experiments suggest that the constant MC assumption has a high power in correctly identifying the underlying CES preference parameter. In other words, we find that imposing the constant MC assumption leads to a high probability of statistically insignificant differences between the treatment and control estimates. Nevertheless, when imposing the constant MC assumption to identify the CES preference parameter, it is important to keep in mind two things: (i) the degree of substitutability of the products in the industry and (ii) the actual proportion of the firms with constant MC in the industry. We find that the constant MC assumption correctly identifies the underlying preference parameter when the degree of substitutability of the products is not very high. For example, in the case of the pharmaceutical industry, the constant MC assumption will work more effectively than in the case of the non-alcoholic beverage industry. Not surprisingly, we also find that the constant MC assumption is relatively more effective in identifying the CES preference parameter in industries where the actual proportion of constant MC firms is higher. However, the news is not that bad. Our experiments suggest that, unless products within the industry have a very low degree of substitutability, to correctly identify the CES parameter by assuming constant MC, the sample must contain at least 20% of firms that actually have constant MC. We find that if the CES preference parameter is 0.6 or lower (i.e., the elasticity of substitution is 2.5 or lower), irrespective of the proportion of constant MC firms in the sample, the constant MC assumption correctly identifies the underlying preference parameter.

The results of this paper have important implications for researchers. Specifically, researchers relying on the assumption of constant MC to estimate the CES preference parameter should consider industry-specific factors (i.e., product substitutability and the prevalence of firms with constant marginal cost). The analysis presented in this paper shows that while this assumption generally performs well, accuracy requires at least 20% representation of constant MC firms, particularly in industries with high substitutability. However, for CES preference parameters at 0.6 or below, the assumption remains reliable regardless of the proportion of constant MC firms.