
Recurrent Collusion: Cartel Episodes and Overcharges in the South African Cement Market


Cartel cases may involve recurrent collusion, with cartel periods interspersed by periods of greater competition. An empirical model of recurrent collusion must account for different data generating processes during collusive and non-collusive episodes and allow for the dating of such episodes. It should also allow for the possibility of flexible transitions between collusive and non-collusive episodes. This paper proposes a Markov regime-switching model to detect recurrent periods of collusive damages and to estimate cartel effectiveness in any given time period. We use this information to estimate overcharges, which we show are much higher than those suggested by conventional approaches.




  1. Throughout this paper we focus on cement, the binding substance used in construction, and not concrete, which is a composite material consisting of cement and other building aggregates.

  2. See also Kandori (1991), who finds similar results for alternative specifications of the correlation structure of demand shocks.

  3. The overcharge literature typically relies on static OLS models. OLS models provide asymptotically consistent estimators only in the presence of cointegration among the dependent variables, which are often unit root processes. Additionally, the autoregressive distributed lag (ARDL) form is preferable, since it provides a better representation of the dynamic effects; see Boshoff (2015, 228) for discussion.

  4. We first verified that the optimal number of regimes, consistent with the data, is two. The results are reported in “Appendix 2”.

  5. The cement price is a unit root process. Therefore, there is strong first-order persistence and the Markov assumption is appropriate.

  6. Note that this is a single equation framework. A multiple equation framework was also estimated and the results indicated that there is no simultaneity. The results are available upon request.

  7. The main inputs in South African cement production are limestone and lime, coal, shale, silica sand, and gypsum (Lafarge 2018; AfriSam 2016). Limestone constitutes two-thirds of the raw materials used in South African cement manufacturing (Leach 1994). Roughly one and a half tonnes of limestone is required to produce one tonne of cement (Ali 2013).

  8. One interesting three-regime model, for the purposes of studying recurrent collusion in the cement market, would be one whose three regimes can be identified respectively as a legal collusion regime, an illegal collusion regime, and a non-collusive regime. Alternatively, another interesting model would be one whose three regimes can be identified respectively as a collusion regime, a competitive regime (for episodes before 2011), and another competitive regime (post-2011, to signal greater competition). The data clearly support only a two-regime model, which indicates that all collusive episodes are generated by the same regime. Similarly, all non-collusive episodes are generated by the same regime.




We are grateful for the useful comments by the participants and discussants at the BECCLE 2017 and CRESSE 2017 conferences. The paper also benefited from discussion at Stellenbosch University and the University of Cape Town. We thank Joe Harrington, John Connor, Johannes Paha, Maarten Pieter Schinkel, and Maurice Bun for comments on earlier versions.

Author information



Corresponding author

Correspondence to Willem H. Boshoff.


Appendix 1: Methodology–Filter and Smoothing

We start with the conditional log likelihood function of Eq. (5) given by:

$$\begin{aligned} L(\varvec{\theta })=\ \sum ^T_{t=1}\log f(p_t\ |\ \varOmega _{t-1};\ \varvec{\theta })\ , \end{aligned}$$

where \({{\Omega }}_t=\{p_t,\ p_{t-1},\ldots , p_1,p_0,{\varvec{x}}_t,{\varvec{x}}_{t-1},\ldots ,{\varvec{x}}_0\}\) denotes the collection of all the observed variables up to time t, and \(\varvec{\theta }={(\sigma ,\ a_1,\ldots ,a_4,\ {\gamma }_1,\ldots ,{\gamma }_4,\ c_0,\ \omega ,\ p_{11},p_{22})}^{'}\) is a vector of population parameters. Maximum likelihood estimation (MLE) of Eq. (5) requires construction of the conditional density function \(f(p_t\,|\,{{\Omega }}_{t-1};\,\varvec{\theta })\).

Following Hamilton (1989), we construct the conditional densities recursively as follows. Suppose that \(P(S_{t-1}=j\,|\,{{\Omega }}_{t-1};\ \varvec{\theta })\) is known. Given the state variable \(S_t=i\) and the previous observations, the conditional probability density function is given as:

$$\begin{aligned} f(p_t | S_t=i, \varOmega _{t-1};\,\varvec{\theta }) = \frac{1}{\sigma \sqrt{2\pi }} \exp {\left( -\,\frac{\left( p_t-c_i - \sum ^m_{l=1}{a_l}p_{t-l} - \sum ^n_{l=0}{\varvec{\gamma }_l \varvec{x}_{t-l}}\right) ^2}{2\sigma ^2}\right) }\ . \end{aligned}$$

To construct \(f(p_t\,|\,{{\Omega }}_{t-1};\,\varvec{\theta })\), Hamilton (1989) uses the following equation:

$$\begin{aligned} \xi _{i,t-1}&= P(S_t = i | \varOmega _{t-1};\,\varvec{\theta }) = \displaystyle \sum _{j=1}^{2}P(S_t = i | S_{t-1} = j, \varOmega _{t-1} ;\,\varvec{\theta })P(S_{t-1} = j | \varOmega _{t-1} ;\,\varvec{\theta })\nonumber \\&= \displaystyle \sum _{j=1}^{2}p_{ij}P(S_{t-1} = j | \varOmega _{t-1} ;\,\varvec{\theta }) . \end{aligned}$$

Since \(p_{ij}\) is known and \(P(S_{t-1}=j\,|\,{{\Omega }}_{t-1};\varvec{\theta })\) is assumed as given we have \({\xi }_{i,t-1}\). Now to derive \(f(p_t\,|\,{{\Omega }}_{t-1};\ \varvec{\theta })\) we use

$$\begin{aligned} f(p_t | \varOmega _{t-1} ;\,\varvec{\theta }) = \displaystyle \sum _{i=1}^{2}f(p_t | S_t = i, \varOmega _{t-1} ;\varvec{\theta })P(S_t = i | \varOmega _{t-1};\,\varvec{\theta }) . \end{aligned}$$

Substituting (7) into (8) and re-arranging we have

$$\begin{aligned} f(p_t | \varOmega _{t-1};\varvec{\theta }) = \displaystyle \sum ^2_{i=1}{\displaystyle \sum ^2_{j=1}{f(p_t | S_t = i , \varOmega _{t-1}; \varvec{\theta })\,p_{ij}\,P(S_{t-1} = j | \varOmega _{t-1};\,\varvec{\theta })}}\ . \end{aligned}$$

Now that we have \(f(p_t\,|\,{{\Omega }}_{t-1};\varvec{\theta })\), the next step is to update (7) so that we can calculate \(f(p_{t+1}\,|\,{{\Omega }}_t;\varvec{\theta })\) where

$$\begin{aligned} f(p_{t+1} | \varOmega _t;\,\varvec{\theta }) = \displaystyle \sum _{i=1}^{2} f(p_{t+1} | S_t = i, \varOmega _t;\,\varvec{\theta })\xi _{i,t} . \end{aligned}$$

The conditional density function \(f(p_{t+1}\,|\,S_t=i,\ {{\Omega }}_t;\ \varvec{\theta })\) will have the same form as in Eq. (6). Therefore, the only requirement to calculate (10) is \({\xi }_{i,t}=P(S_t=i\,|\,{{\Omega }}_t;\varvec{\theta })\). This is calculated by simply updating \({\xi }_{i,t-1}\) to reflect the information contained in \(p_t\). The update is performed using Bayes’ rule:

$$\begin{aligned} {\xi }_{i,t} = P(S_t=i | \varOmega _t;\,\varvec{\theta }) = \frac{f(p_t | S_t=i,{\varOmega }_{t-1};\,\varvec{\theta }){\xi }_{i,t-1}}{f(p_t |{\varOmega }_{t-1};\varvec{\theta })} . \end{aligned}$$

Therefore, \(f(p_t\,|\,{{\Omega }}_{t-1};\varvec{\theta })\) is obtained for \(t=1,\ 2,\ldots , T\) by assigning a starting value to \(P(S_{t-1}=j\,|\,{{\Omega }}_{t-1};\ \varvec{\theta })\) to initialize the filter and then iterating Eqs. (7)–(11).

The question that remains is how to set \(P(S_{t-1}=j\,|\,{{\Omega }}_{t-1};\ \varvec{\theta })\) to initialize the iterations for the filter. When \(S_t\) is an ergodic Markov chain, the standard procedure is to simply set \(P(S_{t-1}=j\,|\,{{\Omega }}_{t-1};\ \varvec{\theta })\) equal to the unconditional probability \(P(S_0=j)\). The unconditional probabilities are given by

$$\begin{aligned} P(S_0 = 1)= & {} \frac{1-p_{22}}{2-p_{11}-p_{22}} ; \end{aligned}$$
$$\begin{aligned} P(S_0 = 2)= & {} 1 - P(S_0 = 1)= \frac{1-p_{11}}{2-p_{11}-p_{22}} \end{aligned}$$
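The recursion in Eqs. (7)–(11), initialized with the unconditional probabilities above, can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the estimation code used in the paper: the model is reduced to a switching intercept with a single autoregressive lag and no covariates, the function names (`normal_pdf`, `ergodic_probs`, `hamilton_filter`) are hypothetical, regimes are indexed 0 and 1 rather than 1 and 2, and the price series and parameter values are made up. The predicted and filtered probabilities are kept in separate lists to make the two steps of the recursion explicit.

```python
import math

def normal_pdf(x, mean, sigma):
    """Gaussian density with standard deviation sigma, as in Eq. (6)."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def ergodic_probs(p11, p22):
    """Unconditional regime probabilities used to initialize the filter."""
    pi1 = (1.0 - p22) / (2.0 - p11 - p22)
    return [pi1, 1.0 - pi1]

def hamilton_filter(prices, c, a, sigma, p11, p22):
    """Return (log-likelihood, predicted probs, filtered probs).

    Assumed simplified model: p_t = c[i] + a * p_{t-1} + eps_t in regime i.
    trans[i][j] = P(S_t = i | S_{t-1} = j), the convention of Eq. (7).
    """
    trans = [[p11, 1.0 - p22],
             [1.0 - p11, p22]]
    filtered_prev = ergodic_probs(p11, p22)
    loglik, predicted, filtered = 0.0, [], []
    for t in range(1, len(prices)):
        # Prediction step: P(S_t = i | Omega_{t-1}).
        pred = [sum(trans[i][j] * filtered_prev[j] for j in range(2))
                for i in range(2)]
        # Regime-specific densities f(p_t | S_t = i, Omega_{t-1}).
        dens = [normal_pdf(prices[t], c[i] + a * prices[t - 1], sigma)
                for i in range(2)]
        # Mixture density f(p_t | Omega_{t-1}) and its log-likelihood term.
        mix = sum(dens[i] * pred[i] for i in range(2))
        loglik += math.log(mix)
        # Bayes update: P(S_t = i | Omega_t).
        filtered_prev = [dens[i] * pred[i] / mix for i in range(2)]
        predicted.append(pred)
        filtered.append(filtered_prev)
    return loglik, predicted, filtered

# Illustrative, made-up price series (not the paper's data).
prices = [1.0, 1.1, 1.0, 2.1, 2.9, 3.4, 3.6, 2.0, 1.2, 0.9]
loglik, predicted, filtered = hamilton_filter(
    prices, c=[0.5, 2.0], a=0.5, sigma=0.3, p11=0.9, p22=0.8)
```

In an actual estimation, the returned log-likelihood would be passed to a numerical optimizer over the parameter vector; here the parameters are simply fixed for illustration.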

An advantage of the Hamilton filter is that it directly evaluates \(P(S_t=i\,|\,{{\Omega }}_t;\varvec{\theta })\), which is referred to as the “filtered” probability. The estimates of \(P(S_t=i\,|\,{{\Omega }}_t;\varvec{\theta })\) can further be improved by “smoothing”. This is done by using the information set in the final period \({{\Omega }}_T\), in contrast to the filtered estimates that only use the contemporaneous information set \({{\Omega }}_t\). The likelihoods of the data observed in different periods are linked together by the transition probabilities.

Therefore, the likelihood of being, for example, in regime i in period t is improved by using information about the future realisations of \(p_d\), where \(d>t\). A suitable smoothing technique is provided by Kim (1994). The smoothing method requires only a single backward recursion through the data. Kim (1994) shows that the joint probability under the Markov assumption is given by

$$\begin{aligned} P(S_t=i, S_{t+1}=j| {\varOmega }_T;\,\varvec{\theta })&= P(S_t=i | S_{t+1}=j, \varOmega _T;\,\varvec{\theta })P(S_{t+1}=j | \varOmega _T;\,\varvec{\theta })\end{aligned}$$
$$\begin{aligned}&= \frac{P(S_t=i, S_{t+1}=j | \varOmega _t;\,\varvec{\theta })}{P(S_{t+1}=j | \varOmega _t;\,\varvec{\theta })}P(S_{t+1}=j | \varOmega _T;\varvec{\theta }). \end{aligned}$$

To move from (14) to (15), it is important to note that under the Markov assumption, if \(S_{t+1}\) is known, the future data in \(({{\Omega }}_{t+1},\ldots , {{\Omega }}_T)\) will contain no additional information about \(S_t\), so that \(P(S_t=i\,|\,S_{t+1}=j,\ {{\Omega }}_T;\varvec{\theta })=P(S_t=i\,|\,S_{t+1}=j,\ {{\Omega }}_t;\varvec{\theta })\). By marginalizing the joint probability with respect to \(S_{t+1}\), the smoothed probability in period t is then obtained as

$$\begin{aligned} P(S_t=i | \varOmega _T;\varvec{\theta })&= \displaystyle \sum ^2_{j=1}P(S_t=i, S_{t+1}=j | \varOmega _T;\,\varvec{\theta })\nonumber \\&= \sum ^2_{j=1}{\frac{P(S_t=i, S_{t+1}=j | \varOmega _t;\varvec{\theta })}{P(S_{t+1}=j | \varOmega _t;\varvec{\theta })}P(S_{t+1}=j | \varOmega _T;\varvec{\theta })}. \end{aligned}$$
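The backward recursion of Kim (1994) can be sketched as follows. This is a minimal illustration, assuming that filtered probabilities are already available from a forward pass; the function name `kim_smoother`, the zero-based regime indexing, and the hand-picked filtered probabilities are all hypothetical and serve only to show the mechanics.

```python
def kim_smoother(filtered, p11, p22):
    """Backward recursion for the smoothed probabilities P(S_t = i | Omega_T).

    filtered[t][i] = P(S_t = i | Omega_t) from a forward filter.
    trans[j][i]    = P(S_{t+1} = j | S_t = i).
    """
    trans = [[p11, 1.0 - p22],
             [1.0 - p11, p22]]
    n = len(filtered)
    smoothed = [None] * n
    # In the final period the smoothed and filtered probabilities coincide.
    smoothed[-1] = list(filtered[-1])
    for t in range(n - 2, -1, -1):
        # One-step-ahead prediction P(S_{t+1} = j | Omega_t).
        pred_next = [sum(trans[j][i] * filtered[t][i] for i in range(2))
                     for j in range(2)]
        # Joint P(S_t = i, S_{t+1} = j | Omega_t) = trans[j][i] * filtered[t][i];
        # divide by the prediction, weight by the smoothed value, sum over j.
        smoothed[t] = [
            sum(trans[j][i] * filtered[t][i] / pred_next[j] * smoothed[t + 1][j]
                for j in range(2))
            for i in range(2)
        ]
    return smoothed

# Hand-picked filtered probabilities, for illustration only.
filtered = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.1, 0.9]]
smoothed = kim_smoother(filtered, p11=0.9, p22=0.8)
```

In this toy input the later observations favour the second regime, so the smoothed probability of the first regime in the third period falls below its filtered value, which is the sense in which future information revises the regime dating.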

Appendix 2: Motivation and Choice of Markov RS Model

Consider a standard ARDL model of price with the following form:

$$\begin{aligned} p_t=c_0 +\ \sum ^m_{l=1}{a_l}p_{t-l}+\ \sum ^n_{l=0}{{\varvec{\gamma }}_l{\varvec{x}}_{t-l}}+\ {\varepsilon }_t\, \end{aligned}$$

with \({\varepsilon }_t\ \sim \ N\left( 0,{\sigma }^2\right) \), where \(p_t\) denotes price at time t, and \({\varvec{x}}_t\) denotes a vector of demand and cost drivers as shown in Table 3. The residual diagnostics are reported in Table 7.

Table 7 ARDL residual diagnostic tests

The residuals of the ARDL model (Eq. 17) exhibit heteroskedasticity and serial correlation. Such a result is to be expected in the presence of regime changes, since the residuals will no longer be Gaussian. From the diagnostic tests it is evident that the linear functional form of Eq. 17 is unsuitable. This result could be anticipated, given the prior knowledge of the cement cartel and cement price regime shifts. Therefore, standard least-squares estimation of (17), including a dummy variable to capture overcharges, will not give an accurate measure of the true overcharge.

In some specifications the coefficient of the electricity variable had the incorrect sign. Specifically, we found a negative relationship between electricity prices and the price of cement, which is not a sensible conclusion. Inspection of Fig. 5 provides some insight as to why this was the case.

Fig. 5 Cement price and industrial electricity prices

There appear to be time periods for which there is positive correlation and time periods for which there is negative correlation. It is therefore sensible to make the coefficient of electricity regime-dependent. The results confirm this idea: the coefficient for electricity is positive in the collusive regime \((S_t=1)\) and effectively zero in the non-collusive regime \((S_t=2)\).

Appendix 3: Diagnostic Tests

This section reports the diagnostic tests for both the Markov RS model (which generated the transition probabilities for the two regimes) and the final ARDL model that was used to calculate the overcharge. Diagnostic tests for the ARDL model are reported in Table 8. As shown, the ARDL model passes the standard diagnostic tests.

Table 8 ARDL diagnostic tests

The ARDL model relies on the cartel effectiveness measure, which is calculated from the RS model. Performing diagnostic tests for an RS model is more complex than in standard linear models and, until recently, the applied literature has often relied on only a few (if any) of these tests (Breunig et al. 2003; Smith 2008).

A challenge when performing diagnostic tests for an RS model is that the true residuals are unobserved, as they depend on the unobserved state variable. To overcome this issue, we follow the methodology proposed in Maheu and McCurdy (2000), according to which expected residuals are calculated, conditioned on past information. Smoothed values obtained from the Kim filter cannot be used to construct the residuals, as the filter includes future information and, as a result, the current residual will contain future information.
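One plausible way to compute such expected residuals is to weight the regime-specific conditional means by the one-step-ahead (predicted) regime probabilities, which use only information up to the previous period. The sketch below illustrates this construction under that assumption; it is not taken from Maheu and McCurdy (2000) or from the paper's code, and the function name `expected_residuals` and the numbers are hypothetical.

```python
def expected_residuals(prices, regime_means, predicted):
    """Residuals based on the expected conditional mean given past information.

    prices[t]          : observed price p_t
    regime_means[t][i] : conditional mean of p_t in regime i
    predicted[t][i]    : P(S_t = i | Omega_{t-1}), from the forward filter
    """
    residuals = []
    for p_t, means_t, probs_t in zip(prices, regime_means, predicted):
        # Probability-weighted mean across regimes, using past information only.
        expected_price = sum(pr * mu for pr, mu in zip(probs_t, means_t))
        residuals.append(p_t - expected_price)
    return residuals

# Made-up inputs for two periods and two regimes.
prices = [2.0, 3.0]
regime_means = [[1.0, 3.0], [1.5, 3.5]]
predicted = [[0.5, 0.5], [0.2, 0.8]]
resid = expected_residuals(prices, regime_means, predicted)
```

Standard tests for serial correlation or heteroskedasticity could then be applied to the resulting series, possibly after standardizing by the conditional standard deviation.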

Table 9 reports the diagnostic tests that we were able to generate for the RS model. These appear satisfactory. Normality tests on residuals in an RS model are more complicated and the RS model performs less well on our version of these tests. While deviation from normality may be problematic for inference, this is not the main focus of our paper. Therefore, in sum, we are confident of the stability of the model.

Table 9 RS diagnostic tests

Appendix 4: Comparison to Structural Breaks

As discussed, tests for structural breaks face limitations in this example. The recursive residuals (Fig. 6) cross the significance band at various time points and do not give a clear indication of how long these possible breaks affected the price series. While the CUSUM of squares test (Fig. 7) provides a better picture, the result is not as convincing as the probabilities of Fig. 1. The test indicates a break in the model from 2001 to around 2007. These dates do not match our prior knowledge of the cement case, since they would suggest that damages were only observed three years after the cartel was formed and ceased two years before the information exchange was terminated.

Fig. 6 Recursive residuals

Fig. 7 CUSUM of squares
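To make the mechanics behind these two plots concrete, the sketch below computes recursive residuals and the CUSUM of squares path for the simplest possible case, an intercept-only model. This is an assumption made purely for illustration: the paper's tests are based on the full ARDL specification, the function names are hypothetical, and the series is made up. Significance bands are not computed here.

```python
import math

def recursive_residuals(series):
    """Standardized one-step-ahead forecast errors for an intercept-only model."""
    residuals = []
    running_sum = series[0]
    for t in range(1, len(series)):
        prev_mean = running_sum / t        # mean of the first t observations
        scale = math.sqrt(1.0 + 1.0 / t)   # forecast-error standard deviation factor
        residuals.append((series[t] - prev_mean) / scale)
        running_sum += series[t]
    return residuals

def cusum_of_squares(residuals):
    """Cumulative share of the total sum of squared recursive residuals."""
    total = sum(w * w for w in residuals)
    path, running = [], 0.0
    for w in residuals:
        running += w * w
        path.append(running / total)
    return path

# Made-up series with a level shift after the fourth observation.
series = [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0]
w = recursive_residuals(series)
cusq = cusum_of_squares(w)
```

Under parameter stability the path rises roughly linearly from zero to one; in this toy series it stays near zero before the shift and then jumps, which is the pattern a CUSUM of squares plot is designed to reveal.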

The Bai–Perron test in Table 10 treats the break dates as unknown and estimates them, along with the regression coefficients, by least squares. The break points are estimated as 1996Q2, 2005Q2 and 2009Q2. This is certainly a more accurate depiction of the changes in the data generating process than that provided by the recursive residuals and the CUSUM of squares. However, as expected, the test takes 1996Q2 as the first break date. Therefore, a dummy variable constructed on the basis of this test will include the price war during this time as part of the collusive regime and lead to a lower overcharge estimate.

Table 10 Bai–Perron break test
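The idea of estimating an unknown break date by least squares can be illustrated with a deliberately simplified case: a single break in the mean of a series, found by a grid search over candidate dates. This is not the Bai–Perron procedure itself, which handles multiple breaks and general regressors; the function names, the minimum segment length, and the data below are assumptions for illustration only.

```python
def ssr_about_mean(values):
    """Sum of squared deviations from the segment mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_single_break(series, min_segment=2):
    """Return (break index, SSR) minimizing the total SSR of a one-break mean model.

    The break index k splits the series into series[:k] and series[k:].
    """
    best_k, best_ssr = None, float("inf")
    for k in range(min_segment, len(series) - min_segment + 1):
        ssr = ssr_about_mean(series[:k]) + ssr_about_mean(series[k:])
        if ssr < best_ssr:
            best_k, best_ssr = k, ssr
    return best_k, best_ssr

# Made-up series with a level shift after the fourth observation.
series = [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0]
break_index, ssr = best_single_break(series)
```

A break date chosen in this way marks a once-off change in the regression, whereas the regime probabilities of the RS model allow the series to move back and forth between regimes.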


Cite this article

Boshoff, W.H., van Jaarsveld, R. Recurrent Collusion: Cartel Episodes and Overcharges in the South African Cement Market. Rev Ind Organ 54, 353–380 (2019).




  • Cement cartel
  • Collusion detection
  • Markov-switching
  • Overcharge estimation
  • Recurrent collusion