1 Introduction

The transition to a circular economy (CE) is being endorsed by an increasing number of countries and regions around the world. The main global players are China, with its progressive CE legislation, e.g. prohibiting imports of certain plastic waste streams (Brooks et al. 2018; Qi et al. 2016); Japan, with well-developed waste management practices (Sasao 2014); and the European Union (EU). In this study, we focus primarily on the circularity of plastics in the EU, as set out mainly in the European Commission’s (EC) Plastics Strategy (EC 2018). Investigating the circularity of plastics is essential in the transition to a CE. The ubiquity of plastics in the economy is the result of a trend that has continued since the middle of the 20\(^{th}\) century. Indeed, the use of plastics has grown steadily since World War II, mostly because of the unique and desirable properties of the material (OECD 2018). Paradoxically, commonly used plastic waste treatment methods, such as incineration, tend to have high negative environmental impacts and cannot be regarded as circular (Rigamonti et al. 2014).

Recently, the EU has taken action to start closing the material cycle of plastics and to minimize the harmful impacts of the material in general. The EU’s three main action plans or strategies are: (i) the EU’s Single Use Plastics Directive, which was adopted by an overwhelming majority of the Members of the European Parliament (footnote 1) at the end of 2018 (EP 2019). This directive bans certain single-use plastic products. (ii) The REACH regulation (footnote 2), which deals with the use of certain harmful chemicals, e.g. phthalates, in the production of plastics. It is generally accepted that harmful chemicals hamper the transition to circularity, e.g. because health and safety difficulties arise during the recycling process (EC 2020b). (iii) The European Strategy for Plastics in a Circular Economy (EC 2018), which sets out targets and partly regulates the market. Although the existing policies are useful, some markets, e.g. the polyethylene (PE) market, remain poorly regulated. Certain interest groups, e.g. the European Federation of Waste Management and Environmental Services (FEAD), are lobbying, and public opinion is pushing, for further regulation of the market. An example we study in this paper is the corporate lobby for an ‘EU Action on Recycled Content Mandates for Plastics’ (FEAD 2018).

Policies for a CE can either be incentive-based, e.g. subsidising circular technology investments, or regulatory, e.g. forcing investments. In this study, we focus on a regulatory policy that forces the market to transition. An example is the REACH framework, which obliges the market to invest in the use of alternative (harmless) chemicals; if a firm’s production process uses prohibited chemicals, production has to be suspended until the new requirements are complied with. Often, the implementation of policies is driven by the reaction of policymakers to public opinion (Wlezien and Soroka 2012). Therefore, implementation dates are uncertain and, as a consequence, they distort the market. In this research, we assess the impact of uncertain implementation times of such regulatory policies on investments. This type of uncertainty, simply referred to as policy uncertainty in the remainder of the paper, is the only source of uncertainty in our setting that impacts investment decisions. Uncertainty regarding policy content is limited, also in the case of plastics, inter alia by technical feasibility. Both policymakers and firms are aware of these constraints and, as a consequence, policies are set within the narrow boundaries of feasibility (footnote 3).

In order to analyze the influence of an uncertain policy implementation time on investment decisions, we develop a real option model. This type of model allows us to define a stochastic process that represents the uncertainty. The monopolistic firm we consider in the analysis partially observes and learns from public opinion. This opinion is assumed to drive policymakers to change existing policies, i.e. to mandate a minimum use of recycled plastics in production processes. We rely on publications of the EC to parametrize public opinion. Our model offers firms the optimal investment strategy, that is, a strategy that maximizes expected profits in the presence of policy uncertainty. The model is designed for applications in the plastics industry and is applied to the PE market in the EU. Results indicate that firms optimally plan the first investment step so that the timing of the second investment step approximates their projection of the policy implementation time. Only when the first investment causes profit losses before the policy implementation that are greater than the losses due to the partial market exclusion afterwards will firms plan the first investment step at the policy implementation time. Although these results are found for a polymer case study, we stress that the implications of this model transcend this industry.

The remainder of the paper is organized as follows: Sect. 2 provides an overview of the relevant existing literature. Sections 3 and 4 introduce, respectively, the investment model in the presence of policy uncertainty and its solution, i.e. the optimal investment times. Section 5 presents and discusses the results found for the PE case study, and Sect. 6 determines the optimal capacity of the investment steps. Section 7 concludes. All proofs of propositions introduced throughout this study can be found in Appendix A.

2 Literature

A CE entails uncertainty (Linder and Williander 2017). Therefore, investment decisions in a CE setting should be studied with a real option approach. This approach correctly accounts for uncertainty and is flexible with regard to the investment timing and capacity (Dixit and Pindyck 1994). To the best of our knowledge, this is the first study adopting this method to investigate the investment decision in the use of recycled plastics in the presence of policy uncertainty. Policy uncertainty has been studied before in different real option settings. Within this strand of literature, three generations can be identified. The first generation mainly focuses on tax policy uncertainty. One of the first academic studies can be found in Chapter 9 of Dixit and Pindyck (1994), who analyze the influence of a possible tax credit retraction on a fixed-size investment. A similar study was performed by Hassett and Metcalf (1999). They argue that the use of a geometric Brownian motion (GBM) to model policy uncertainty is inferior to the use of a Poisson jump process. Their results show that the influence of policy uncertainty on private investments is highly dependent on how it is modeled. The advantage of a Poisson jump process, they argue, is the sharpness of the jumps, which corresponds to the sudden implementation of policies.

A second generation of literature studies climate change policy uncertainty. This generation mainly focuses on the uncertain carbon price. Yang et al. (2008) analyze the investment in a power plant; the carbon price follows a GBM and influences the profitability of investments. Fuss et al. (2008) investigate an investment option in carbon-saving technology under an uncertain carbon price. They consider a bifurcating carbon price, representing policy changes, and find that increased uncertainty delays investments. Compernolle et al. (2017) analyze the investment in carbon capture technology under an uncertain carbon price, which is modeled with a GBM.

A third and more recent generation of literature distinguishes itself from the first generation by analyzing investments in renewable energy sources. This topic became increasingly important in the 21\(^{st}\) century. These studies typically regard policy uncertainty in the form of a random provision, revision or retraction of a subsidy or support scheme. Boomsma and Linnerud (2015) and Boomsma et al. (2012) use a Markov switching process to model the uncertain discrete changes between the support schemes that governments adopt for renewable energy. For example, once a subsidy is granted, its level is modeled with a GBM, while its retraction is modeled using a Poisson jump process. They find that policy uncertainty regarding the intensity delays investments. Uncertainty regarding the possible retraction can influence the investment timing either way. If the market believes that the decision of retraction will be applied retroactively, investments are delayed, and vice versa. Eryilmaz and Homans (2016) find that higher uncertainty regarding the granting of investment credits in the future speeds up investments today. They consider a 30 percent probability that the investment credit will be retracted, without considering reinstatement in the future. A similar result is found by Chronopoulos et al. (2016): policy uncertainty, in the form of a random provision or retraction of a subsidy, modeled with a Poisson jump process, speeds up investment. However, the installed capacity in the presence of uncertainty will be lower. The investment value is found to be larger when considering stepwise investment instead of lumpy investment; the difference in value is found to be inversely proportional to the intensity of the subsidy (Samadi 2018).

Despite the available and observable information, the aforementioned publications assume private investors’ projections on policy changes to be constant. Literature combining a real option approach with active learning is rather limited. Dalby et al. (2018) present a good overview of the existing literature combining both. To the best of our knowledge, policy uncertainty and active learning have only been considered twice before. Dalby et al. (2018) consider an investment option under policy uncertainty and allow for active learning via Bayesian updating. They study how investment behavior is affected by updating a subjective belief on the timing of a subsidy revision. It is found that investors are less likely to invest when the arrival rate of a policy change increases. An alternative approach was introduced by Pawlina and Kort (2005). They assume the policymaker is influenced by an exogenously driven dynamic, based on the firm’s market value, and that firms know this dynamic too. In their paper, the market value influences the policymaker to retract an investment subsidy. The threshold of the market value at which the subsidy is retracted is unknown to the firm. The firm can, however, make projections on the retraction based on its active learning.

The existing real option literature on policy uncertainty, both including and excluding active learning, regards uncertainty as the intensity or provision (retraction) of an investment-stimulating policy or of a changing carbon price. Such incentive-based policies typically take the form of subsidies, like feed-in tariffs or investment credits. We extend the real option literature on policy uncertainty by studying an uncertain regulatory policy. Regulatory policies are expected to become increasingly important in a CE setting. Both the Ellen MacArthur Foundation (2019) and the EC (2018) concluded that regulatory policies are an effective policy tool to enable the transition to a CE. The best example of the latter is probably the progressive Chinese CE, which has been stimulated by the Chinese government through regulatory policies, e.g. banning imports of certain plastic waste streams (Brooks et al. 2018). The potential of regulatory policies to enable a CE is great. Therefore, more policies of this type are to be expected, increasing the relevance of this work.

3 Model

We consider a profit-maximizing monopolistic firm in a continuous-time setting. The firm has the option to invest; by investing, it becomes more circular. The type of investment can differ. Relevant examples are investing in a new production machine that allows the use of recycled material, e.g. a filtration machine, or training workers to stabilize incoming batches of recycled material. Typically, recycled material needs to be stabilized with additives before it can be used in a production process. The firm faces the risk of being legally required to become more circular at some point in the future, e.g. through the mandatory utilization of a certain fraction of recycled plastics in production processes. Such a policy, compelling the firm to be more circular, is assumed to be implemented at a random future point in time, \(\gamma\). The random time \(\gamma\) is driven by an exogenous stochastic process \(\left\{ L_t: t\ge 0\right\}\) with initial value \(l\). This process represents the public pressure on the policymaker to regulate. If the public pressure, i.e. the value of the process, reaches a critical level \(L^*\), policy implementation follows. However, the critical level \(L^*\) is unknown ex ante to the firm. Similar to Pawlina and Kort (2005), we assume that the policymaker is consistent. If the policy has not been implemented by time \(\phi\), while \({\hat{L}}\) is the highest realization of the process so far, the policy will not be implemented at any time \(u>\phi\), as long as \(L(t)\le {\hat{L}}\) for all \(t\le u\). When \(L(t)\) reaches a new maximum \({\hat{L}}\), \(L(t)\) could be equal to the critical level, \(L^*\), which induces policy implementation. However, if policy implementation does not follow, then it still holds that \(L(t) < L^*\). The firm has thus learned that, as long as \(L<{\hat{L}}\), the policy will not be implemented. After the policy implementation, the investment is a conditio sine qua non to keep production running and, therefore, to have a profit flow.

After investing, extra production steps, higher input prices (Brooks et al. 2019), more production errors and higher quality control costs influence the profit flow negatively. Therefore, we assume that investing lowers the profit by an ex ante known and fixed factor \(\delta \in \left[ 0,1\right]\) (footnote 4). Nevertheless, the firm still has an incentive to invest, as the lack of investment results in zero profit after the policy implementation. Hence, there exists a trade-off between lost profits due to higher production costs before the policy implementation and the risk of losing all profit after the policy implementation and before investment.

Without loss of generality, we assume the production of one unit per time period \(t\), using a combination of input materials \(q_{A}(t)\) and \(q_{B}(t)\), yielding a profit \(P\) that is negatively influenced by the use of \(q_{B}\). It holds that \(q_{A}(t)+q_{B}(t)=1\). The fraction \(q_B\) is the fraction to be regulated at the random implementation time of the policy, e.g. a mandatory fraction of recycled plastics that should be used in production processes. At the beginning of the planning period we assume \(q_B(0)=0\), i.e. no recycled plastics are used for production. Upon introduction, the policy will require a fraction \(\overline{q_B}\) to be used. \(\overline{q_B}\) is known at all times and can be reached by investment. If the installed capacity \(q_B\) falls short of \(\overline{q_B}\) when \(t\ge \gamma\), profit is scaled down by the ratio \(q_B/\overline{q_B}\). We assume that profits cannot grow by overinvesting, i.e. \(q_B\le \overline{q_B}\). The following profit function captures these characteristics, and will therefore be adopted in our model:

$$\begin{aligned} \pi (t)= {\left\{ \begin{array}{ll} P-P\delta q_{B}(t) &{} \text {if } t < \gamma \\ \frac{q_{B}(t)}{\overline{q_B}}\left( P-P\delta q_{B}(t)\right) &{} \text {if } t \ge \gamma \end{array}\right. } \end{aligned}$$
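As an illustration, the profit function of Eq. (1) can be written as a small function. This is a minimal sketch: the function name `profit_flow` and the flag `policy_active` (standing for \(t \ge \gamma\)) are our own, and the default values \(P=1\), \(\delta =0.8\) and \(\overline{q_B}=0.3\) are the ones used in the case study of Sect. 5.

```python
def profit_flow(q_b, policy_active, P=1.0, delta=0.8, q_bar=0.3):
    """Instantaneous profit pi(t) of Eq. (1) for a recycled fraction q_b.

    policy_active is True when t >= gamma (policy implemented), else False.
    """
    if not 0.0 <= q_b <= q_bar:
        raise ValueError("q_b must lie in [0, q_bar] (no overinvestment)")
    base = P - P * delta * q_b        # profit net of the circularity penalty
    if not policy_active:             # t < gamma
        return base
    return (q_b / q_bar) * base       # t >= gamma: scaled by the compliance ratio


for q_b in (0.0, 0.15, 0.3):
    print(q_b, profit_flow(q_b, False), profit_flow(q_b, True))
```

Without any investment the firm earns \(P\) before the policy and nothing afterwards, which is the trade-off described above.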

The required investment to reach the desired capacity \(\overline{q_{B}}\) is executed in \(n\) steps \(\left( n>1, n \in \mathbb {N} \right)\). We exclude the case \(n=1\) because it yields a trivial problem: it is then always optimal to invest at the moment of policy implementation, i.e. \(\tau = \gamma\), where \(\tau\) denotes the investment time. Stepwise investment is a reasonable constraint to impose, as it allows firms to adapt without jeopardizing their entire production. A sudden change of the entire production, without testing or learning, could lead to all products being faulty. As a consequence, stepwise investment is expected to be cheaper than lumpy investment. Therefore, we define an investment cost function that internalizes the learning effects taking place with stepwise investment, in which \(C\) is a positive constant:

$$\begin{aligned} I\left( q_B\right) =C\left( e^{q_B}-1\right) \end{aligned}$$

Equation (2) represents the total cost of an investment step of size \(q_B\). This cost increases exponentially in \(q_B\) because it internalizes the lack of learning effects when investing in large fractions of \(q_B\) per step. Note that two types of learning effects can take place: the learning effects either (i) reduce the investment cost of each investment step, or (ii) only reduce the investment cost of the next investment step. In case the latter holds, Eq. (2) simplifies the experienced investment cost by equally distributing the reduced investment cost over all investment steps. We assume a minimal withholding time, \(\theta\), between the investment steps. Such a withholding time represents the time needed to adapt and incorporate learning effects (Samadi 2018). Allowing the minimal withholding time to be zero in a continuous-time setting yields a trivial problem comparable to the case \(n=1\).
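The convexity of Eq. (2) is what makes stepwise investment cheaper than lumpy investment. The short sketch below compares one step of 0.30 with two steps of 0.15; the value \(C=1\) is a placeholder chosen for illustration only, not a value taken from Table 1.

```python
import math


def investment_cost(q_b, C=1.0):
    """Cost I(q_B) = C (e^{q_B} - 1) of one investment step of size q_b, Eq. (2).

    C = 1.0 is a hypothetical value used only for this illustration.
    """
    return C * (math.exp(q_b) - 1.0)


q_bar = 0.3
lumpy = investment_cost(q_bar)               # a single step of 0.30
stepwise = 2 * investment_cost(q_bar / 2)    # two equal steps of 0.15
print(f"lumpy: {lumpy:.4f}, two equal steps: {stepwise:.4f}")
```

Because \(e^{q}-1\) is convex and zero at \(q=0\), splitting a given capacity into smaller steps always lowers the undiscounted total cost under this cost function.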

We solve the model for a two-step investment with investment steps \(q_{B,1}\) and \(q_{B,2}\), respectively made at \(t=\tau _1\) and \(t=\tau _2\). Since investing in \(q_B\) is essential to comply with regulation, the investment times are decision variables that impact the firm’s profits. Since the policy implementation time, \(\gamma\), is unknown, the profit function cannot be defined a priori. We elaborate on the different possible outcomes in the next paragraph. We do know the fraction of \(q_B\) used by the firm throughout time:

$$\begin{aligned} q_B(t)={\left\{ \begin{array}{ll} 0,&{} t<\tau _1\\ q_{B,1},&{} \tau _1\le t<\tau _2\\ q_{B,1}+q_{B,2},&{}t\ge \tau _2 \end{array}\right. } \end{aligned}$$

With this, the optimization problem (footnote 5) for a two-step investment can be formulated as follows:

$$\begin{aligned} V\left( l,{\hat{L}}\right) =&\sup \limits _{{\tau _1 \ge 0}, {\tau _2 \ge \tau _1+\theta }}E\left[ \int _{0}^{\min \left( \tau _1, \gamma \right) }\pi (t)e^{-rt}dt+\int _{\min \left( \tau _1, \gamma \right) }^{\min (\tau _1+\theta , \gamma )}\pi (t)e^{-rt}dt\right. \nonumber \\&\left. -I(q_{B,1})e^{-r\tau _1}+\int _{\min (\tau _1+\theta , \gamma )}^{\max (\tau _1+\theta , \gamma )}\pi (t)e^{-rt}dt\right. \nonumber \\&+\int _{\max (\tau _1+\theta , \gamma )}^{\tau _2}\pi (t)e^{-rt}dt+\int _{\tau _2}^{+\infty }\pi (t)e^{-rt}dt \left. -I(q_{B,2})e^{-r\tau _2}\right] \end{aligned}$$

Equation (4) represents the firm’s expected profits over the interval \([0, +\infty [\), discounted to the beginning of the planning period, \(t=0\). Note that in Eq. (4), we cannot write the profit function explicitly in terms of Eq. (1), but have to refer to \(\pi (t)\). The reason is that the profit function depends upon \(\gamma\), e.g. over the interval \(\left[ \min \left( \tau _1, \gamma \right) , \max \left( \tau _1+\theta , \gamma \right) \right]\), the profit function is not defined a priori. The two investment times, \(\tau _1\) and \(\tau _2\), are chosen to maximize the firm’s expected value. Initially the profit function is known, \(\pi (0)=P-P\delta q_{B}(0)=P\), since the policy has not arrived yet. The profit function will change at \(t=\min \left( \tau _1, \gamma \right)\). If the policy arrives before the first investment, the profit drops to zero until the firm invests to reach the capacity \(q_{B,1}\). At \(\tau _1\), the firm invests in a fraction \(q_{B,1}\). As a result, profit decreases if the policy has not been implemented yet. The next possible change in profit occurs at \(t = \min \left( \tau _1+\theta , \gamma \right)\), that is, when the minimal withholding time has expired or the policy has arrived. If the minimal withholding time expires before the policy implementation, the profit remains unchanged until the firm decides to undertake the second investment or the policy implementation occurs. The optimization problem takes this situation into account by considering a possible change in the profit over the interval \(t \in \left[ \min \left( \tau _1+\theta , \gamma \right) , \max \left( \tau _1+\theta , \gamma \right) \right]\). Afterwards, it remains to account for the period before and after the second investment.
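The sequence of profit regimes described above can be made concrete for a given realization of \(\gamma\) and given investment times. The sketch below lists the intervals on which the profit flow is constant, combining Eq. (1) and Eq. (3). The helper names are our own, \(\overline{q_B}\) is taken to be \(q_{B,1}+q_{B,2}\), and the parameter defaults are the case-study values \(P=1\), \(\delta =0.8\) and two steps of 0.15.

```python
import math


def q_b_at(t, tau1, tau2, q1, q2):
    """Recycled fraction q_B(t) of Eq. (3)."""
    if t < tau1:
        return 0.0
    return q1 if t < tau2 else q1 + q2


def profit_at(t, gamma, tau1, tau2, q1=0.15, q2=0.15, P=1.0, delta=0.8):
    """Profit flow pi(t) of Eq. (1), assuming q_bar = q1 + q2."""
    q_b = q_b_at(t, tau1, tau2, q1, q2)
    base = P - P * delta * q_b
    return base if t < gamma else (q_b / (q1 + q2)) * base


def regimes(gamma, tau1, tau2, **kw):
    """Return (start, end, profit level) for each interval of constant profit."""
    cuts = sorted({0.0, gamma, tau1, tau2}) + [math.inf]
    return [(a, b, profit_at(a, gamma, tau1, tau2, **kw))
            for a, b in zip(cuts, cuts[1:])]


# Policy arrives before the first investment: zero profit on [gamma, tau1).
for row in regimes(gamma=1.0, tau1=2.0, tau2=2.5):
    print(row)

# Policy arrives after the withholding time: the second step coincides with gamma.
for row in regimes(gamma=4.0, tau1=2.0, tau2=4.0):
    print(row)
```

Each printed row corresponds to one of the integrals in Eq. (4); intervals of zero length simply do not appear.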

For the purpose of illustration, we introduced the simplest case, i.e. a two-step investment, for which we will provide a solution. However, it is straightforward to extend our model and to investigate the investment behavior when \(n>2\). The number of investment steps corresponds to the number of decision moments. At each decision moment, information based on the density of \(\gamma\) is incorporated. Therefore, more investment steps allow the firm to incorporate more information. The optimization problem is thus set up under partial information and learning. On the one hand, the firm observes the pressure on the policymaker to regulate, without having information on the level that triggers the decision to regulate. On the other hand, the firm learns: as time goes by, it gains insight into the responsiveness of the policymaker to public pressure.

The setup of partial information and learning has been adopted in only a limited number of papers in the real option literature. Most of them consider noisy information that is directly related to the project value; see, for instance, Décamps et al. (2005), Pawlina and Kort (2005) and Dalby et al. (2018). In our model, the stochastic process is completely exogenous to the firm. However, it does impact the value of the investment, because it triggers the introduction of new regulation.

4 Model Solution

We start solving the optimization problem by first defining the condition under which a firm is incentivized to invest, i.e. the investment is profitable. Investments are profitable if net cash flows are positive. At the policy implementation time, every firm keeps the choice to either invest or leave the market. Therefore, we define the profitability condition of the investments in Proposition 1.

Proposition 1

The first investment step is profitable if:

$$\begin{aligned} I(q_{B,1}) < \frac{q_{B,1}}{\overline{q_B}}\frac{P-P \delta q_{B,1}}{r} \end{aligned}$$

The second investment step is profitable if:

$$\begin{aligned} I(q_{B,2})<\frac{q_{B, 2}}{\overline{q_{B}}} \frac{P-P\delta \overline{q_B} -P \delta q_{B,1}}{r} \end{aligned}$$

Note that these profitability conditions enable us to easily obtain bounds for \(\overline{q_B}\). These bounds inform policymakers of the point at which their policies start destroying markets. We will further elaborate on these bounds when considering the case study (see Sect. 5).
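The two inequalities of Proposition 1 are straightforward to check numerically. The sketch below is an illustration under assumptions: the function names are our own, \(\overline{q_B}\) is taken to be \(q_{B,1}+q_{B,2}\), and the values of \(r\) and \(C\) are placeholders, since Table 1 is not reproduced here.

```python
import math


def investment_cost(q_b, C):
    """Investment cost I(q_B) = C (e^{q_B} - 1), Eq. (2)."""
    return C * (math.exp(q_b) - 1.0)


def profitability(q1, q2, P, delta, r, C):
    """Return (first_ok, second_ok) for the two inequalities of Proposition 1."""
    q_bar = q1 + q2   # assumption: both steps together reach the mandated fraction
    first_ok = investment_cost(q1, C) < (q1 / q_bar) * (P - P * delta * q1) / r
    second_ok = investment_cost(q2, C) < (
        (q2 / q_bar) * (P - P * delta * q_bar - P * delta * q1) / r
    )
    return first_ok, second_ok


# r = 0.05 and the two values of C are hypothetical.
print(profitability(0.15, 0.15, P=1.0, delta=0.8, r=0.05, C=1.0))
print(profitability(0.15, 0.15, P=1.0, delta=0.8, r=0.05, C=100.0))
```

With a sufficiently large cost constant both conditions fail, which is the situation in which a mandate would push the firm out of the market.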

Proposition 2

The distribution of capacities among \(q_{B,1}\) and \(q_{B,2}\) that allows the largest \(\overline{q_B}\) without violating the profitability conditions, is \(q_{B,1} = q_{B,2}=\frac{\overline{q_B}}{2}\). Therefore, the profitability conditions are not satisfied if:

$$\begin{aligned} \overline{q_{{B}}}\ge \frac{r}{\delta P}\left( \frac{P}{r}-2I\left( \frac{\overline{q_{{B}}}}{2}\right) \right) \end{aligned}$$
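Because \(\overline{q_B}\) appears on both sides of the inequality in Proposition 2, the critical value has to be found numerically. The sketch below uses bisection, which applies because the left-hand side minus the right-hand side is increasing in \(\overline{q_B}\) for positive parameters. The function names and the values of \(r\) and \(C\) are assumptions made for illustration, not values from Table 1.

```python
import math


def investment_cost(q_b, C):
    """Investment cost I(q_B) = C (e^{q_B} - 1), Eq. (2)."""
    return C * (math.exp(q_b) - 1.0)


def bound_gap(q_bar, P, delta, r, C):
    """Left-hand side minus right-hand side of the inequality in Proposition 2."""
    rhs = (r / (delta * P)) * (P / r - 2.0 * investment_cost(q_bar / 2.0, C))
    return q_bar - rhs


def critical_q_bar(P, delta, r, C, lo=0.0, hi=1.0, tol=1e-10):
    """Smallest q_bar in [lo, hi] at which the inequality starts to hold.

    Returns None if the inequality does not hold anywhere on [lo, hi].
    """
    if bound_gap(hi, P, delta, r, C) < 0:
        return None
    if bound_gap(lo, P, delta, r, C) >= 0:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bound_gap(mid, P, delta, r, C) < 0:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical parameters: r = 0.05 and C = 10.
print(critical_q_bar(P=1.0, delta=0.8, r=0.05, C=10.0))
```

A return value of `None` means that, for the chosen parameters, the inequality is not reached for any mandated fraction in the search interval.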

Conditional on the profitability of the investment, we continue solving the model by defining the optimal investment time for the second investment step, \(\tau _2^*\) (footnote 6).

Proposition 3

If the second investment step is profitable:

$$\begin{aligned} \tau _2^* = \max (\gamma , \tau _1^*+\theta ) \end{aligned}$$

We distinguish two situations: (i) \(\gamma \ge \tau _1^*+\theta\), that is, the policy arrives after the first investment step has taken place and after the minimal withholding time has expired; in that case \(\tau _2^*=\gamma\); (ii) \(\gamma < \tau _1^*+\theta\), that is, the policy arrives before the second investment step can take place; in that case \(\tau _2^*=\tau _1^*+\theta\). Figure 1 graphically represents both situations with their respective solutions.

Fig. 1 Optimal investment time of the second investment step

It remains to calculate the optimal time for the first investment step, \(\tau _1^*\). The results presented in Proposition 4 state that \(\tau _1^* \le \gamma\) always holds. We distinguish two cases, depending on whether the firm aims to invest before or at the policy implementation, i.e. \(\tau _1^*<\gamma\) and \(\tau _1^*=\gamma\), respectively. If (i) the minimal withholding time is zero, or (ii) the first investment step is profitable but the profit losses of investing early are greater than the losses due to the partial market exclusion after the policy implementation, the optimal strategy is to invest at the policy implementation time, \(\tau _1^*=\gamma\). If losses due to the partial market exclusion are relatively large, the optimal strategy depends upon the firm’s belief regarding the willingness of the policymaker to regulate. In this sense, the optimal time is a random variable whose density depends on the density of \(\gamma\). Note that this result is atypical when compared with the well-defined thresholds usually found in the real option literature.

Proposition 4

If the withholding time \(\theta = 0\), then:

$$\begin{aligned} \tau ^*_1 = \gamma . \end{aligned}$$

Under the assumption that \(\theta > 0\), the following holds:

  1. if \(\frac{P\delta q_{B,1}}{r}+I(q_{B,1}) > \frac{q_{B,2}}{\overline{q_B}}\frac{P-P\delta \overline{q_B}-P\delta q_{B,1}}{r}-I(q_{B,2})\), then: \(\tau ^*_1 = \gamma ;\)

  2. if \(\frac{P\delta q_{B,1}}{r}+I(q_{B,1}) \le \frac{q_{B,2}}{\overline{q_B}}\frac{P-P\delta \overline{q_B}-P\delta q_{B,1}}{r}-I(q_{B,2})\), then the probability density function (pdf) of \(\tau _1^*\) is given by:

    (a) if \(\theta \ge {\tilde{\theta }}\):

      $$\begin{aligned} f_{\tau _1^*}(t)={\left\{ \begin{array}{ll} f_\gamma (t+\theta )+f_\gamma (t),&{}0<t<\theta \\ f_\gamma (t+\theta ),&{}t>\theta \\ 0,&{}\text {elsewhere} \end{array}\right. }; \end{aligned}$$

    (b) if \(\theta < {\tilde{\theta }}\):

      $$\begin{aligned} f_{\tau _1^*}(t)={\left\{ \begin{array}{ll} P(\gamma \le \theta ),&{}t=0\\ f_\gamma (t+\theta ),&{}t>0\\ 0,&{}\text {elsewhere} \end{array}\right. } \end{aligned}$$

    where \({\tilde{\theta }}= \frac{1}{r}\ln \left( \frac{\frac{q_{B,2}}{\overline{q_B}}\frac{P-P\delta \overline{q_B}-P\delta q_{B,1}}{r}-I(q_{B,2})}{P\delta q_{B,1}/r+I(q_{B,1})}\right)\), and \(f_\gamma\) and \(f_{\tau _1^*}\) represent the pdf of \(\gamma\) and \(\tau _1^*\), respectively.
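The case distinction of Propositions 3 and 4 can be sketched in code as follows. Two assumptions deserve emphasis. First, the mapping from a realization of \(\gamma\) to \(\tau _1^*\) is our reading of the densities above (\(\tau _1^*=\gamma -\theta\) if \(\gamma >\theta\) and \(\tau _1^*=\gamma\) otherwise in case 2(a), and \(\tau _1^*=\max (\gamma -\theta ,0)\) in case 2(b)), useful for simulation rather than as a decision rule. Second, the function names and the values of \(r\) and \(C\) are placeholders, since Table 1 is not reproduced here; the profitability conditions of Proposition 1 are not checked.

```python
import math


def investment_cost(q_b, C):
    """Investment cost I(q_B) = C (e^{q_B} - 1), Eq. (2)."""
    return C * (math.exp(q_b) - 1.0)


def case_sides(q1, q2, P, delta, r, C):
    """Left- and right-hand sides of the inequality separating cases 1 and 2."""
    q_bar = q1 + q2
    lhs = P * delta * q1 / r + investment_cost(q1, C)
    rhs = ((q2 / q_bar) * (P - P * delta * q_bar - P * delta * q1) / r
           - investment_cost(q2, C))
    return lhs, rhs


def theta_tilde(q1, q2, P, delta, r, C):
    """Bound on the withholding time; only defined when the right-hand side is positive."""
    lhs, rhs = case_sides(q1, q2, P, delta, r, C)
    return math.log(rhs / lhs) / r


def optimal_times(gamma, theta, q1, q2, P, delta, r, C):
    """Return (tau1_star, tau2_star) for one realization gamma."""
    lhs, rhs = case_sides(q1, q2, P, delta, r, C)
    if theta == 0 or lhs > rhs:                 # theta = 0, or case 1
        tau1 = gamma
    elif theta >= math.log(rhs / lhs) / r:      # case 2(a)
        tau1 = gamma - theta if gamma > theta else gamma
    else:                                       # case 2(b)
        tau1 = max(gamma - theta, 0.0)
    tau2 = max(gamma, tau1 + theta)             # Proposition 3
    return tau1, tau2


params = dict(q1=0.15, q2=0.15, P=1.0, delta=0.8, r=0.05, C=1.0)  # r, C hypothetical
print(theta_tilde(**params))
print(optimal_times(gamma=3.0, theta=0.5, **params))
```

With these placeholder values, a withholding time of half a year falls below \({\tilde{\theta }}\), so the firm is in case 2(b) and the first step is taken \(\theta\) before the policy arrives.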

Case 1 of Proposition 4 analyzes the optimal time for the first investment step, \(\tau _1^*\), when the investment is profitable, but profit losses before the policy implementation are greater than profit losses after the policy implementation. As a consequence, the firm starts investing at the policy implementation time, \(\gamma\). Case 2 analyzes the optimal time for the first investment step, \(\tau _1^*\), when profit losses before the policy implementation are smaller than profit losses after the policy implementation. In this case, it may be optimal to make the investment decision before the policy implementation. Due to the randomness of the policy implementation time, the outcome does not result in a concrete investment rule (or a first passage time) (footnote 7). Therefore, we provide probabilistic information on \(\tau _1^*\) that helps the firm in its decision process. We note, however, that \({\tilde{\theta }}\) provides a bound on the minimal withholding time, \(\theta\), which determines the optimal strategy in case 2 of Proposition 4. Fast-learning firms, i.e. firms with a small minimal withholding time, are thus more likely to find themselves in case 2b instead of case 2a, and vice versa. Therefore, it is not the investment characteristics, such as the price, but the relative values of \(\theta\) and \({\tilde{\theta }}\) that directly influence \(\tau _1^*\) in case 2.

Proposition 5

Assume that \(\frac{P\delta q_{B,1}}{r}+I(q_{B,1}) \le \frac{q_{B,2}}{\overline{q_B}}\frac{P-P\delta \overline{q_B}-P\delta q_{B,1}}{r}-I(q_{B,2})\). Then the expected value and variance of \(\tau _1^*\) are given by:

  (a) if \(\theta \ge {\tilde{\theta }}\), then

    $$\begin{aligned} E(\tau _1^*)=E(\gamma )-\theta P(\gamma>\theta ),\ Var(\tau _1^*)=Var(\gamma )+\theta ^2 P(\gamma \le \theta )P(\gamma > \theta ) \end{aligned}$$

  (b) if \(\theta < {\tilde{\theta }}\), then

    $$\begin{aligned} E(\tau _1^*)=E(\max (\gamma -\theta ,0)),\ Var(\tau _1^*)=Var(\max (\gamma -\theta ,0)) \end{aligned}$$

Assuming that \(\theta\) is fixed, one can easily see that \(E(\tau _1^*)\) is larger when \(\theta \ge {\tilde{\theta }}\). Additionally, independently of the value of \(\theta\), \(E(\tau _1^*)<E(\gamma )\). Thus, the firm tends to make its first investment before the policy arrives, and sooner when \(\theta < {\tilde{\theta }}\). As a consequence, in the latter case, market supply is less affected because firms are not or only for a short period of time excluded from the market. Therefore, the policymaker prefers case 2b. We note that by adapting \(\overline{q_B}\), the policymaker can influence the occurrence of the applicable case.
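The ordering of the two expected values can be illustrated with a simple example. The sketch below assumes, purely for illustration, that \(\gamma\) is exponentially distributed with rate `lam`; in the model \(\gamma\) is a hitting time of the pressure process, so this distribution is a stand-in. Under that assumption \(E(\gamma )=1/\lambda\), \(P(\gamma >\theta )=e^{-\lambda \theta }\) and \(E(\max (\gamma -\theta ,0))=e^{-\lambda \theta }/\lambda\), and a seeded simulation is used as a cross-check of the expected values only.

```python
import math
import random


def expected_tau1_closed_form(lam, theta):
    """E(tau_1^*) in cases (a) and (b) when gamma ~ Exponential(rate lam)."""
    survival = math.exp(-lam * theta)         # P(gamma > theta)
    case_a = 1.0 / lam - theta * survival     # E(gamma) - theta P(gamma > theta)
    case_b = survival / lam                   # E(max(gamma - theta, 0))
    return case_a, case_b


def expected_tau1_monte_carlo(lam, theta, n=200_000, seed=0):
    """Simulation estimate of the same two expectations."""
    rng = random.Random(seed)
    total_a = total_b = 0.0
    for _ in range(n):
        gamma = rng.expovariate(lam)
        total_a += gamma - theta if gamma > theta else gamma
        total_b += max(gamma - theta, 0.0)
    return total_a / n, total_b / n


lam, theta = 0.2, 0.5   # hypothetical rate; theta as in the case study
print(expected_tau1_closed_form(lam, theta))
print(expected_tau1_monte_carlo(lam, theta))
```

In this example both expectations lie below \(E(\gamma )\), and the expectation in case (a) is the larger of the two, in line with the discussion above.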

To provide practical insights on the optimal strategy, we develop a numerical procedure, based on the Monte Carlo method, to compute the expected value of \(V\left( l,{\hat{L}}\right)\). We proceed as follows.

[Figure a: Monte Carlo procedure for computing the expected value of \(V\left( l,{\hat{L}}\right)\)]
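To give an idea of what a Monte Carlo estimate of this kind involves, the sketch below averages the discounted value of Eq. (4) over simulated policy implementation times for a fixed investment rule. It is a generic illustration under stated assumptions and not the procedure of the figure above: \(\gamma\) is drawn from an exponential distribution instead of being derived from the pressure process, the two investment rules are examples, \(\overline{q_B}=q_{B,1}+q_{B,2}\), and the values of \(r\), \(C\) and the rate are placeholders.

```python
import math
import random


def investment_cost(q_b, C):
    """Investment cost I(q_B) = C (e^{q_B} - 1), Eq. (2)."""
    return C * (math.exp(q_b) - 1.0)


def discounted(level, a, b, r):
    """Integral of level * exp(-r t) over [a, b]; b may be math.inf."""
    return level * (math.exp(-r * a) - math.exp(-r * b)) / r


def path_value(gamma, tau1, tau2, q1, q2, P, delta, r, C):
    """Discounted profits minus investment costs along one realization."""
    q_bar = q1 + q2
    cuts = sorted({0.0, gamma, tau1, tau2}) + [math.inf]
    value = 0.0
    for a, b in zip(cuts, cuts[1:]):
        q_b = 0.0 if a < tau1 else (q1 if a < tau2 else q_bar)   # Eq. (3)
        level = P - P * delta * q_b                              # Eq. (1), t < gamma
        if a >= gamma:
            level *= q_b / q_bar                                 # Eq. (1), t >= gamma
        value += discounted(level, a, b, r)
    value -= investment_cost(q1, C) * math.exp(-r * tau1)
    value -= investment_cost(q2, C) * math.exp(-r * tau2)
    return value


def expected_value(strategy, sample_gamma, n, seed=0, **params):
    """Sample mean and standard error of path_value under a given strategy."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        gamma = sample_gamma(rng)
        tau1, tau2 = strategy(gamma)
        values.append(path_value(gamma, tau1, tau2, **params))
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)


theta = 0.5
params = dict(q1=0.15, q2=0.15, P=1.0, delta=0.8, r=0.05, C=1.0)  # r, C hypothetical


def wait_for_policy(gamma):
    """Invest at the policy date, second step after the withholding time."""
    return gamma, gamma + theta


def anticipate(gamma):
    """First step theta before the policy date, second step per Proposition 3."""
    tau1 = max(gamma - theta, 0.0)
    return tau1, max(gamma, tau1 + theta)


def draw(rng):
    return rng.expovariate(0.2)   # hypothetical distribution of gamma


print(expected_value(wait_for_policy, draw, n=20_000, **params))
print(expected_value(anticipate, draw, n=20_000, **params))
```

Using the same seed for both rules means they are evaluated on the same simulated policy dates, which makes the comparison between them less noisy.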

5 Results and Discussion

5.1 The Polyethylene Case

Lately, EU citizens have urged the EU institutions to take measures that benefit the environment. For plastics in particular, there exists a general concern (footnote 8) about their impact on our environment. Lobby groups, such as FEAD, have launched calls to mandate the use of recycled plastics (FEAD 2018). In this paper we elaborate on a case study that focuses on PE. We study how the uncertain implementation time of a mandatory use of recycled PE impacts the investment behavior of manufacturers. Today, many manufacturers still use virgin PE. The main drivers for this choice are the small spread between procurement prices of virgin and recycled PE (footnote 9) and quality control challenges arising from the use of recycled PE. The latter are caused by fluctuating mixtures of, e.g., pigments, antioxidants and anti-static additives. In order to stabilize the properties of recycled PE, extra production steps are needed, resulting in higher production costs. Due to the rise in production costs and the small spread in procurement prices, net profit is negatively influenced. The combination of these challenges leaves the market with little incentive to use recycled PE (footnote 10). As a consequence, the transition to large-scale utilization can only be triggered by government-driven incentives or regulatory policies.

Pursuant to public opinion and the call of lobbyists, we assume, for this case study, that the EC considers mandating a minimum fraction of 30 percent recycled PE (Footnote 11) in the production of certain PE goods (Footnote 12). The EC will only regulate when public pressure grows and reaches a threshold value. This value, at which the EC issues a regulation, is unknown. However, we know this threshold value is larger than the maximum pressure reached so far (Footnote 13). Moreover, we also know from the Eurobarometer that pressure is growing. This barometer is a collection of reports on public opinion in the EU. Typically, 1000 citizens per Member State are surveyed per report. Since 2007, five (Footnote 14) topical reports on the attitudes of EU citizens toward the environment have been published. The two most recent reports, dating from 2017 (EC 2017b) and 2019 (EC 2020a), have a special focus on plastics. These reports reveal that more than 90 percent of EU citizens think it is important to protect the environment. In 2019, 89 percent of EU citizens were worried about the impact of plastic products on the environment, a 2 percentage point increase compared with 2017. Around one-third is convinced that production and consumption have to change. In 2017, 62 percent blamed the EU institutions for ‘not doing enough to protect the environment’. This fraction increased to 68 percent in 2019. The pressure on policymakers to regulate thus increased by 4.7 percent per year. In response, Commissioner V. Sinkevičius (Footnote 15) acknowledged that the current EC wants to start addressing these concerns with the European Green Deal, which for plastics mainly refers to the 2018 plastics strategy. Nonetheless, in 2019, two-thirds of EU citizens favored the enhancement of plastic recycling and the use of recycled plastics in production. Therefore, public pressure and support to incentivize and regulate recycling, as well as the use of recycled plastics, persist.
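The annual growth figure quoted in this paragraph can be checked with a short calculation. The sketch below assumes that the 4.7 percent is a compound annual growth rate between the 2017 and 2019 Eurobarometer shares (62 and 68 percent); the function name is purely illustrative.

```python
def annual_growth_rate(start_share, end_share, years):
    """Compound annual growth rate between two survey shares."""
    return (end_share / start_share) ** (1.0 / years) - 1.0


# Shares of citizens who say the EU institutions are not doing enough.
rate = annual_growth_rate(0.62, 0.68, 2019 - 2017)
print(f"Annualized growth of public pressure: {rate:.1%}")
```

Under this compounding assumption the result is roughly 4.7 percent per year, which is rounded to a 5 percent drift in the base line parameters.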

We consider two hypotheses to analyze and model the impact of this growing pressure and support: a continuous and a discontinuous process. We find compelling arguments for both types of processes to be used in the model. On the one hand, no single EU citizen has a voice powerful enough to change policies. However, a gradual change in a group’s preferences has a significant effect on policymakers. In this case, we choose to model public opinion with a GBM (Footnote 16). A GBM is a continuous-time stochastic process which takes nonnegative values only. On the other hand, citizens could group their voices and start a European Citizens’ Initiative. If the initiative fulfils the legal requirements, citizens can directly request the EC to exercise its right of initiative. This type of initiative has been taken before in similar matters, e.g. in the process of banning glyphosate (EC 2017a). In such a case, we choose to model the pressure on the policymaker with a Poisson jump process. A Poisson jump process is a stochastic process with jumps that occur at random moments and arrive with a specific intensity. The intensity is defined as the expected number of jumps per unit of time.

5.2 Policy Implementation Timing and Investment Timing

We now present our base line parameter values which we use to generate numerical results. This section will then proceed by introducing the resulting policy implementation time and optimal investment timings for a continuous and a discontinuous growing pressure, respectively. Table 1 summarizes the base line parameter values we use throughout the case study.

Table 1 Base line parameter values

The profit, P, is normalized and equal to 1. According to Proposition 2, the investment would not be profitable, ceteris paribus, when \(P \le 0.25\). After investment in \(q_{B}\), the corresponding profit decreases to \(P \delta q_{B}\). The parameter \(\delta\) internalizes increased production costs and different procurement prices of raw material. Although prices for recycled PE are lower than the prices for virgin PE, the labour costs to, e.g. stabilize the quality of each batch of recycled PE, are higher. Moreover, production processes, e.g. extrusion, take more time with recycled PE, and a higher quality control cost is incurred. We set \(\delta\) equal to 0.8, that is, profit decreases by 20 percent after investing in the use of recycled PE. If \(\delta \ge 2.7\), ceteris paribus, profitability conditions are not fulfilled. Note that Tables 3 and 6 show the sensitivity of results with regard to the parameter \(\delta\). The firm invests in two equal steps of 15 percentage points to reach \(\overline{q_{B}}\), set at 0.3, corresponding with the call of FEAD (2018). In Sect. 6 the distribution of the capacity to be reached, \(\overline{q_{B}}\), is optimized between the first and the second investment steps. Between the investment steps, a minimal withholding time of 0.5 years is imposed, allowing learning effects to take place. The learning pace is firm-specific and determines the solution of case 2 in Proposition 4, see Tables 3 and 6. The policymaker, who prefers case 2b, can influence the applicable case by setting a different fraction of recycled material to be used, \(\overline{q_B}\). Given the base line parameter values, \({\tilde{\theta }}\) becomes negative when \(\overline{q_B} \ge 0.34\). As a consequence, case 1 of Proposition 4 is applicable if the profitability conditions are fulfilled. For \(\overline{q_B} =0.3\), \({\tilde{\theta }}=11.171\). Hence, in our study, case 2a will not be applicable: a learning period of more than 11.171 years is not realistic, and the minimal withholding time, \(\theta\), is 0.5. The investment cost C is derived backwards from a real-world example. When \(C = 30\) and the other base line parameter values hold, the profitability conditions are fulfilled as long as \(\overline{q_B} \le 0.66\). Since this value for \(\overline{q_B}\) approximates values found by Wielders and Bergsma (2007), we set C equal to 30. Note that a sensitivity analysis is performed and its results are shown in Tables 3 and 6. The discount rate is assumed to be equal to 2 percent, reflecting the current low inflation and interest rates in the EU. Note that for case 2, presented in Proposition 4, only the withholding time impacts the optimal strategy directly. All other parameter values presented in Table 1 impact the strategy in case 2 by providing a bound to \(\theta\), that is \({\tilde{\theta }}\).

5.2.1 Continuous Growth of Pressure

We first model the continuous pressure on the policymaker, L(t), by a GBM, represented by the following equation:

$$\begin{aligned} dL(t)=\alpha L(t)dt + \sigma L(t)dz(t) \end{aligned}$$

\(\alpha\) is the deterministic drift rate that represents the growth rate of public pressure. According to the Eurobarometer (EC 2017b, 2020a), this growth rate is approximately 5 percent. \(\sigma\) is the instantaneous standard deviation and dz is the increment of a Wiener process. We assume that the standard deviation is rather high and set it to 10 percent. Our motivation to consider a high volatility is that some situations, like the COVID-19 pandemic, can distract the public and shift attention from the environment to other matters. The current level of the GBM is set equal to the share (68 percent) of citizens that urge the EC to change policies (Footnote 17) (EC 2020a). We assume that the highest level of the GBM, or pressure, reached so far is 0.7. Note that the preference of the public on the use of recycled plastics has increased steadily over the past years. As a consequence, it is reasonable to assume that \({\hat{L}}\) (0.7) has a very similar value to L(t) (0.68). Once a new maximum is reached, we assume the firm knows the probability of a policy change. That follows from the fact that the normal cumulative distribution function (cdf), with mean \(\mu\) and variance \(\omega ^2\), measuring the occurrence probability of a policy change, is known at all times. \(\omega ^2\) is set at 0.01. We assume a moderately responsive EC so that \(\mu =0.8\). These parameters are summarized in Table 2 and a sensitivity analysis is shown in Table 3.
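To make the pressure process concrete, the following sketch simulates GBM paths with the base line values (\(\alpha =0.05\), \(\sigma =0.10\), current level 0.68) and records when a path first reaches a fixed level. It is an illustration only: the function names, the time grid, and the fixed threshold of 0.8 (the assumed mean \(\mu\)) are choices made here, not part of the paper's Algorithm 1.

```python
import math
import random


def simulate_gbm(l0, alpha, sigma, horizon, steps, rng):
    """Simulate one GBM path on an equally spaced grid (exact log-normal steps)."""
    dt = horizon / steps
    path = [l0]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)
        growth = (alpha - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(growth))
    return path


def first_crossing_time(path, level, horizon):
    """Return the first grid time at which the path is at or above `level`, else None."""
    dt = horizon / (len(path) - 1)
    for i, value in enumerate(path):
        if value >= level:
            return i * dt
    return None


# Illustrative Monte Carlo: share of paths reaching 0.8 within three years.
rng = random.Random(42)
n_paths, horizon, steps = 1000, 3.0, 120
hits = 0
for _ in range(n_paths):
    path = simulate_gbm(0.68, 0.05, 0.10, horizon, steps, rng)
    if first_crossing_time(path, 0.8, horizon) is not None:
        hits += 1
print(f"Share of paths reaching 0.8 within {horizon} years: {hits / n_paths:.2f}")
```

Because the threshold is fixed here rather than drawn from the truncated normal, the printed share is not expected to match the probabilities reported in Table 3.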

Table 2 Geometric Brownian motion parameter values

Let \(f_{\gamma \vert L^*= a}(t)\) represent the density function of the first hitting time of a GBM at the level \(L^*=a\), and let \(f_{L^*}(a)\) represent the density function of a normal distribution conditional on the information that \(L^*>{\hat{L}}\). Note that a new high level of \({\hat{L}}\), before the policy is implemented, results in a smaller variance of the conditional normal, which models the firm’s knowledge about the pressure that triggers the implementation of the new policy. Therefore, observing \({\hat{L}}\) allows the firm to learn about the policy implementation risk. The density function of \(\gamma\) can be obtained according to the following equation:

$$\begin{aligned} f_{\gamma }(t)&=\int _{{\hat{L}}}^{+\infty }f_{\gamma \vert L^*= a}(t)f_{L^*}(a)da,\nonumber \\&=\int _{{\hat{L}}}^{+\infty }\frac{e^{-\frac{(a-\mu )^2}{2\omega ^2}-\frac{\left( -t\left( \alpha -\frac{\sigma ^2}{2}\right) +\log \left( \frac{a}{l}\right) \right) ^2}{2t\sigma ^2}}\log \left( \frac{a}{l}\right) }{\pi \sqrt{t^3}\sigma \omega \left( 1-\frac{2}{\sqrt{\pi }}\int _0^{\frac{{\hat{L}}-\mu }{\sqrt{2}\omega }}e^{-s^2}ds\right) } da. \end{aligned}$$

Integrating Eq. (6) over t yields the cdf, which is known to the firm at all times.
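The mixture in Eq. (6) can be evaluated numerically. The sketch below is one possible way to do so with plain trapezoidal integration; it is not the paper's Algorithm 1. It assumes that the current pressure level is \(l=0.68\) and that the upper integration limit can be cut off eight standard deviations above the mean. All function names are illustrative.

```python
import math


def hitting_density(t, a, l, alpha, sigma):
    """First-hitting-time density of a GBM started at l for the level a > l."""
    if t <= 0.0 or a <= l:
        return 0.0
    x = math.log(a / l)
    nu = alpha - 0.5 * sigma ** 2
    scale = sigma * math.sqrt(2.0 * math.pi * t ** 3)
    return x / scale * math.exp(-((x - nu * t) ** 2) / (2.0 * sigma ** 2 * t))


def truncated_normal_density(a, mu, omega, lower):
    """Normal(mu, omega^2) density conditional on the value exceeding `lower`."""
    if a < lower:
        return 0.0
    tail = 0.5 * (1.0 - math.erf((lower - mu) / (math.sqrt(2.0) * omega)))
    core = math.exp(-((a - mu) ** 2) / (2.0 * omega ** 2))
    return core / (omega * math.sqrt(2.0 * math.pi) * tail)


def trapezoid(func, lo, hi, n):
    """Composite trapezoidal rule with n equal intervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h


def policy_time_density(t, l, l_hat, alpha, sigma, mu, omega, n=400):
    """Density of gamma: the hitting density mixed over the unknown threshold."""
    upper = max(l_hat, mu) + 8.0 * omega  # assumed cut-off for the infinite limit

    def integrand(a):
        return (hitting_density(t, a, l, alpha, sigma)
                * truncated_normal_density(a, mu, omega, l_hat))

    return trapezoid(integrand, l_hat, upper, n)


# Base line values from the text; l = 0.68 is the assumed current level.
for t in (0.5, 1.0, 3.0, 5.0):
    value = policy_time_density(t, 0.68, 0.7, 0.05, 0.10, 0.8, 0.1)
    print(f"f_gamma({t}) = {value:.4f}")
```

Integrating `policy_time_density` over t with the same `trapezoid` helper gives an approximation of the cdf. The values produced by this sketch depend on the grid and the cut-off and need not coincide with the figures reported in the tables.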

Numerical results can be obtained by applying Algorithm 1. We split the results into: (i) dynamics caused at the policymaking side, and (ii) dynamics caused at the private investor’s side. We proceed by first showing the dynamics found at the policymaking side. Figure 2 shows the truncated normal pdf of the policy implementation time, \(\gamma\), obtained with the base line parameter values that are shown in Table 2. Our simulation results show that a policy change is most likely to arrive within the first few years. Indeed, for the base line parameter values, our results show that the policy implementation is expected to take place in 3.648 years (Footnote 18). We also find that the expected timing of the policy implementation is sensitive to certain parameter values, e.g. the standard deviation, \(\sigma\). Table 3 shows the likelihood of the policy implementation arriving before one, two, or three years for different parameter values. The second value of each parameter in Table 3 is equal to the base line value for that parameter.

Fig. 2 Probability density function of \(\gamma\) for continuous pressure

Table 3 Sensitivity of \(\gamma\) with respect to continuous pressure

Table 3 indicates that the responsiveness of the EC, \(\mu\), and its associated standard deviation, \(\omega\), have a relatively large impact on the likelihood that the policy implementation time, \(\gamma\), arrives before a certain time, e.g. one year. Take for example \(\Delta P\left[ \gamma \le 1\right] = 21\) percentage points \((0.228-0.018)\) when \(\mu\) shifts from 0.8 to 1. We conclude that the expected policy implementation time, \(\gamma\), is quite sensitive to the policymaker’s responsiveness, \(\mu\), and the associated standard deviation, \(\omega\). Obviously, the expected policy implementation time is delayed when the policymaker is less responsive or when the standard deviation increases. The drift rate, \(\alpha\), and standard deviation, \(\sigma\), of the GBM have a similar but weaker impact, e.g. \(\Delta P\left[ \gamma \le 1\right] = 11\) percentage points when \(\sigma\) shifts from 0.05 to 0.10. This means that a fast growing and uncertain public pressure leads to an earlier expected policy implementation time, \(\gamma\). The larger the difference between the maximum and current state of the GBM or public pressure, respectively \({\hat{L}}\) and L(t), the lower the risk of a policy implementation in the near future. This result follows our intuition since the policymaker is assumed to be consistent.

We continue by introducing the dynamics found at the private investor’s side. These dynamics, based upon the solution found in Proposition 4, are presented in Table 4. Given the base line parameter values, case 2b is always applicable: the inequality presented in Proposition 4 leads us to case 2 and, since \(\theta =0.5\) and \({\tilde{\theta }}=11.171\), we always end up in case 2b. As a consequence, case 2a will never be applicable for these base line parameter values, because a learning period of more than 11.171 years is not realistic. However, by changing the minimal withholding time, \(\theta\), we are able to find examples that result in case 2a. Note that case 1 would be applicable if, ceteris paribus, \(\overline{q_B}>0.33\), which is the case when the EC imposes a more far-reaching policy. Results for case 1 can be obtained by changing the base line parameter values, so that the inequality that is presented in Proposition 4 has a different outcome. To investigate, we consistently change one investment parameter at a time. According to Proposition 4, the inequality’s outcome changes when, ceteris paribus, \(P\le 0.95\), or \(\delta \ge 0.82\), or \(C\ge 31\), or \(r\ge 0.03\). That is, profits are lower, or profits decrease less after investing, or the investment cost is higher, or the discount rate is higher. These changes have in common that they reduce the investment’s net present value, thus making the investment less attractive. Note that for case 1, it is always optimal to make the first investment late, i.e. at the policy implementation time, \(\tau _1^*=\gamma\). The last column represents the expected discounted value of the firm at the beginning of the planning period. This value is sensitive to changing investment parameter values because they impact Eq. (4). For case 2, this value changes similarly.

Table 4 Sensitivity investment characteristics in a continuous pressure setting

Proposition 4 states that for case 2, the optimal investment time of the first investment step does not provide concrete information on when the firm should invest. Therefore, we provide probabilistic information on the distribution of \(\tau _1^*\) in Table 5.

Table 5 Distribution \(\tau _1^*\) in a continuous pressure setting

Table 5 provides probabilistic information on the timing of the first investment step, \(\tau _1^*\), for case 2a \((\theta =20)\) and 2b \((\theta =0.5)\) of Proposition 4, namely the expected value, the variance, quantiles, and the probability that the policy is implemented before the minimal withholding time expires. This table corresponds to the last two rows of Table 4. We find that, for case 2a \((\theta =20)\), there exists a 50 percent probability that the policy implementation time lies in the interval, \(\gamma \in [1.093, 5.370]\). Given that the withholding time is 20 years, \(\theta =20\), it is hardly surprising that the probability of the policy implementation time, \(\gamma\), arriving before the withholding time, \(\theta\), expires is very high (98 percent). As a consequence, firms will most likely decide to invest when the policy arrives. For case 2b \((\theta =0.5)\), there exists a 50 percent probability that the policy implementation time lies in the interval, \(\gamma \in [0.607, 5.059]\). Given that the withholding time is 1/2 year, \(\theta =0.5\), there only exists an 11.6 percent probability that the policy is implemented before the withholding time, \(\theta\), has expired. As a consequence, firms will most likely aim to start investing at \(E(\gamma -\theta )\), so that the second investment step can take place at, or not too far from, the expected policy implementation time.
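The timing logic of case 2 can be written down in a few lines. The sketch below reads the rule as follows: the second step takes place at the policy implementation time if the withholding time has expired by then, and otherwise as soon as it expires, i.e. \(\tau _2=\max (\gamma , \tau _1+\theta )\). This is an interpretation of the text, not a restatement of Proposition 4, and the function names are illustrative.

```python
def second_step_time(tau_1, theta, gamma):
    """Second investment time: policy arrival, or end of withholding if later."""
    return max(gamma, tau_1 + theta)


def targeted_first_step_time(expected_gamma, theta):
    """First-step target E(gamma - theta), floored at the start of planning."""
    return max(0.0, expected_gamma - theta)


# Case 2b with the continuous-pressure base line: E(gamma) = 3.648, theta = 0.5.
tau_1 = targeted_first_step_time(3.648, 0.5)
print("Targeted first step:", round(tau_1, 3))
# If the policy arrives early (gamma = 1.0), the first step is taken at the
# policy time and the second step has to wait for the withholding time.
print("Second step after early policy:", second_step_time(1.0, 0.5, 1.0))
```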

5.2.2 Discontinuous Growth of Pressure

In this subsection, we generate results for the hypothesis in which public pressure follows a discontinuous process. Note that comparing results between the two hypotheses (continuous - discontinuous) is not straightforward because that implies that two different contexts are compared.

The discontinuous pressure on the policymaker, L(t), is assumed to follow a Poisson process with intensity \(\lambda _1\). We choose to set this parameter to 1/15, which means we expect a jump, or a successful Citizens’ Initiative, every 15 years. \(L^*\), that is the level at which the policymaker implements a new policy, follows a zero truncated Poisson distribution with intensity \(\lambda _2\). We set this parameter equal to 1 (Footnote 19), as we expect that only one successful Citizens’ Initiative (Footnote 20) is needed to trigger a policy change. Therefore, the actual level of the distribution does not matter. Since the Poisson process is increasing, the initial condition satisfies \(l={\hat{L}}\). We, therefore, set the initial level equal to zero.

Given the properties of the Poisson process and the analysis above, we have that \(\gamma \vert L^*=n\) is Gamma distributed with parameters n and \(\lambda _1\). Let \(f_{\gamma \vert L^*= n}(t)\) represent the density function of the Gamma distribution and \(f_{L^*}(n)\) the probability mass function of the zero truncated Poisson random variable. Conditional on \(L^*>{\hat{L}}\), the density function of \(\gamma\) can be represented by Eq. (7). Therefore, by observing \({\hat{L}}\) before the policy is implemented, a firm learns about the conditional distribution of \(L^*\), which models the firm’s knowledge about the pressure that triggers the implementation of the new policy. As such, the firm learns about the risk of policy implementation.

$$\begin{aligned} f_{\gamma }(t)&=\sum _{n=1}^{+\infty }f_{\gamma \vert L^*= n}(t)f_{L^*}(n),\nonumber \\&=\frac{e^{-\lambda _1t}t^{-1}}{e^{\lambda _2}-1}\sum _{n=1}^{+\infty }\frac{(t\lambda _1\lambda _2)^n}{n!(n-1)!} \end{aligned}$$

Integrating Eq. (7) over t yields the cdf, which is known to the firm at all times.
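Eq. (7) is straightforward to evaluate numerically. The sketch below sums the series term by term after dividing out the factor \(t^{-1}\), which is algebraically equivalent and keeps the density finite at \(t=0\). It also computes the mean of \(\gamma\) as \(E[L^*]/\lambda _1\), using the mean of the zero truncated Poisson distribution; this closed form is derived here for illustration and is not quoted from the paper. Function names are illustrative.

```python
import math


def policy_time_density(t, lam1, lam2, tol=1e-16, max_terms=500):
    """Density of gamma from Eq. (7), rearranged to avoid dividing by t."""
    if t < 0.0:
        return 0.0
    y = t * lam1 * lam2
    term = 1.0  # c_1 = y**0 / (1! * 0!)
    total = term
    for n in range(1, max_terms):
        term *= y / ((n + 1) * n)  # c_{n+1} = c_n * y / ((n + 1) * n)
        total += term
        if term < tol * total:
            break
    prefactor = math.exp(-lam1 * t) * lam1 * lam2 / (math.exp(lam2) - 1.0)
    return prefactor * total


def expected_policy_time(lam1, lam2):
    """E[gamma] = E[L*] / lam1, with E[L*] the zero truncated Poisson mean."""
    return lam2 * math.exp(lam2) / ((math.exp(lam2) - 1.0) * lam1)


def trapezoid(func, lo, hi, n):
    """Composite trapezoidal rule with n equal intervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h


lam1, lam2 = 1.0 / 15.0, 1.0  # base line intensities
mean = expected_policy_time(lam1, lam2)
print(f"Expected policy implementation time: {mean:.2f} years")
```

For the base line intensities this closed form gives an expected implementation time of about 23.7 years, close to the roughly 24 years reported for the discontinuous setting.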

Similar to the numerical results presented in Sect. 5.2.1, we split the results into: (i) dynamics caused at the policymaking side, and (ii) dynamics caused at the private investor’s side. We proceed by presenting the dynamics at the policymaking side. Figure 3 shows the pdf of the policy implementation time, \(\gamma\), for the base line parameter values, i.e. \(\lambda _1 = 1/15\) and \(\lambda _2 =1\). We find that a policy change is most likely not to arrive soon. For the base line parameter values, the policy implementation is expected in 23.768 years (Footnote 21). This result reflects our observations of the European Citizens’ Initiatives. Several initiatives have been undertaken since the framework’s inception by the Lisbon Treaty of 2007, but few have been successful. Table 6 shows the sensitivity of the likelihood of the policy implementation, \(\gamma\), arriving before one, two, or three years, with regard to the intensities.

Fig. 3 Probability density function of \(\gamma\) for discontinuous pressure

Table 6 Sensitivity of \(\gamma\) with respect to discontinuous pressure

Table 6 shows that the likelihood of the policy implementation, \(\gamma\), arriving before a certain point in time is closely linked to \(\lambda _1\). When \(\lambda _1\) doubles, e.g. from 1/20 to 1/10, the respective probabilities that the policy is implemented before a certain time also almost double. As we expect that one jump is enough to trigger a policy implementation, we choose values for \(\lambda _2\) so that the expected value of the zero truncated Poisson distribution is close to 1.
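The near-doubling can be illustrated with a short calculation. The sketch below uses the identity that a Gamma\((n, \lambda _1)\) waiting time is at most t exactly when a Poisson process with intensity \(\lambda _1\) has made at least n jumps by time t, and mixes this over the zero truncated Poisson threshold. Function names and the truncation of the sum at 60 terms are choices made here; the printed values are not taken from Table 6.

```python
import math


def zero_truncated_poisson_pmf(n, lam2):
    """P[L* = n] for n >= 1 under a zero truncated Poisson distribution."""
    return lam2 ** n / (math.factorial(n) * (math.exp(lam2) - 1.0))


def prob_at_least_n_jumps(n, rate_time):
    """P[N >= n] for N ~ Poisson(rate_time); equals P[Gamma(n, rate) <= t]."""
    below = sum(math.exp(-rate_time) * rate_time ** k / math.factorial(k)
                for k in range(n))
    return 1.0 - below


def prob_policy_before(t, lam1, lam2, n_max=60):
    """P[gamma <= t], mixing over the unknown number of required jumps."""
    return sum(zero_truncated_poisson_pmf(n, lam2)
               * prob_at_least_n_jumps(n, lam1 * t)
               for n in range(1, n_max + 1))


p_slow = prob_policy_before(1.0, 1.0 / 20.0, 1.0)
p_fast = prob_policy_before(1.0, 1.0 / 10.0, 1.0)
print(f"P[gamma <= 1] with lambda_1 = 1/20: {p_slow:.4f}")
print(f"P[gamma <= 1] with lambda_1 = 1/10: {p_fast:.4f}")
print(f"Ratio: {p_fast / p_slow:.2f}")
```

The ratio stays slightly below 2 because the probability of a single jump, \(1-e^{-\lambda _1 t}\), is concave in \(\lambda _1\).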

In Table 7, we present results related to the dynamics found at the private investor’s side. Note that the setup of this table is similar to that of Table 4, and that the dynamics follow the results of Proposition 4. Since Proposition 4 is not influenced by the process that models the public pressure, previously presented interpretations and bounds linked to Table 4 remain the same for Table 7. The main difference between the two tables is that in Table 7 the policy implementation is expected to take place at a later point in time, which increases the project value.

Table 7 Sensitivity investment characteristics in a discontinuous pressure setting

Just like Table 5, Table 8 provides probabilistic information on the timing of the first investment step, \(\tau _1^*\), when parameter values are chosen so that case 2 applies. We find, for case 2a \((\theta =20)\), that there exists a 50 percent probability that the policy implementation time lies in the interval, \(\gamma \in [6.778, 32.803]\). Given that the withholding time is 20 years, \(\theta =20\), there exists a 44.4 percent probability that it has expired before the policy is implemented. As a consequence, firms are moderately incentivized to invest for the first time at the expected policy implementation time. For case 2b \((\theta =0.5)\), we find there exists a 50 percent probability that the policy implementation time lies in the interval, \(\gamma \in [4.720, 17.733]\). Given that the withholding time is 1/2 year, \(\theta =0.5\), there exists a 98.1 percent probability that it has expired before the policy is implemented. As a consequence, firms will most likely invest at \(E(\gamma -\theta )\), so that the second investment step can take place at, or not too much later than, the expected policy implementation time.

Table 8 Distribution \(\tau _1^*\) in a discontinuous pressure setting

6 Investment Capacities as Decision Variable

Throughout this paper, we have assumed an investment in two steps with equal capacity, that is \(q_{B,1}=q_{B,2}\). However, the distribution among \(q_{B,1}\) and \(q_{B,2}\) influences the expected value of the firm at the beginning of the planning period through Eq. (4). Note that Eq. (4) is calculated regardless of which stochastic process applies. Therefore, in this section, we build upon the results found in Sect. 5, and define the following optimization problem:

$$\begin{aligned} \left( \tilde{q_{B,1}}, \tilde{q_{B,2}}\right) &= \underset{q_{B,1},q_{B,2}}{\arg \max } \ V\left( l,{\hat{L}};q_{B,1},q_{B,2}\right) \\ \text {s.t.}&\quad q_{B,1}+q_{B,2}=\overline{q_B}, \\ &\quad q_{B,1}\ge 0,\ q_{B,2} \ge 0. \end{aligned}$$

Recall that for our case study, \(\overline{q_B}=0.3\). This optimization problem is analyzed numerically, since the function \(V\left( l,{\hat{L}};q_{B,1},q_{B,2}\right)\) is highly nonlinear. The nonlinearity is driven by the different strategies that can be followed. Figure 4 shows the sensitivity of \(V\left( l,{\hat{L}};q_{B,1},q_{B,2}\right)\) with respect to \(q_{B,1}\) for \(P=0.6\) (Footnote 22). We find that \(V\left( l,{\hat{L}};q_{B,1},q_{B,2}\right)\) is non-monotonic with respect to \(q_{B,1}\). It follows from this non-monotonic behavior that there exists an optimal distribution among \(q_{B,1}\) and \(q_{B,2}\) that maximizes the value of the investment.
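Because the equality constraint fixes \(q_{B,2}=\overline{q_B}-q_{B,1}\), the problem reduces to a one-dimensional search, which a simple grid search can handle even for a nonlinear objective. The sketch below shows that idea only. The function `toy_value` is a hypothetical placeholder and does not represent Eq. (4); in practice it would be replaced by a routine that evaluates \(V\left( l,{\hat{L}};q_{B,1},q_{B,2}\right)\).

```python
def best_capacity_split(value_fn, total_capacity, steps=300):
    """Grid search over q1 in [0, total]; q2 follows from the equality constraint."""
    best_q1 = 0.0
    best_value = value_fn(0.0, total_capacity)
    for i in range(1, steps + 1):
        q1 = total_capacity * i / steps
        q2 = total_capacity - q1
        value = value_fn(q1, q2)
        if value > best_value:
            best_q1, best_value = q1, value
    return best_q1, total_capacity - best_q1, best_value


def toy_value(q1, q2):
    """HYPOTHETICAL stand-in for the firm value; not the paper's Eq. (4)."""
    return -(q1 - 0.1) ** 2 - 0.5 * q2 ** 2


q1, q2, value = best_capacity_split(toy_value, 0.3)
print(f"q1 = {q1:.3f}, q2 = {q2:.3f}")
```

A grid search is chosen here because it does not rely on the objective being smooth or concave, which matters when the value function switches between strategies.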

Fig. 4 Sensitivity of \(V\left( l,{\hat{L}};q_{B,1},q_{B,2}\right)\) with regard to \(q_{B,1}\)

Table 9 shows this optimal distribution by defining \(\tilde{q_{B,1}}\) for different values of the investment characteristics. We find that \(\tilde{q_{B,1}}\) is decreasing in P, but increasing in \(\delta\), C, and r. We conclude that the firm, when regarding the base line parameter values, will invest in the following optimal capacities for the two investment steps: \(\tilde{q_{B,1}}=0.138\) and \(\tilde{q_{B,2}}=0.162\). Note that, contrary to the optimal investment timings, the optimal capacities, \(\tilde{q_{B,1}}\) and \(\tilde{q_{B,2}}\), are directly influenced by the investment characteristics.

Table 9 Value maximizing \(\tilde{q_{B,1}}\)

These results are driven by the expected policy implementation time, \(\gamma\), which determines the profit function in Eq. (1). In the previous parts of this paper, we discussed how the optimal investment times, \(\tau _1^*\) and \(\tau _2^*\), are determined. In this section, we find that not only the investment times, but also the optimal distribution of the capacity among the investment steps is influenced by policy uncertainty. We obtain that firms choose a small capacity for \(q_{B,1}\), which reduces profits through \(\delta\), and a larger capacity for \(q_{B,2}\), which increases the investment cost for that step. That is because a lower capacity of \(q_{B,1}\) limits the profit reduction before the policy is implemented, but the result is a higher capacity of \(q_{B,2}\), making the corresponding investment costs considerably higher due to the exponential increase of costs in investment quantity. For case 1 of Proposition 4, we seemingly find an equal capacity distribution between the two investment steps \(\left( \tilde{q_{B,1}}=0.150 \right)\). However, there is a difference that is too small to be noticed in our case study. That is because the differences of the capacity distribution for case 1 are driven by the discounted investment costs. Note that the investment steps are separated by only 0.5 years and that the discount rate is 2 percent.
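The size of this discounting effect can be made explicit. Assuming continuous discounting (an assumption made here for illustration), the factor applied to a cost incurred at the second step relative to the first step is:

$$\begin{aligned} e^{-r\theta }=e^{-0.02\times 0.5}=e^{-0.01}\approx 0.990. \end{aligned}$$

Under this assumption, postponing an investment cost from the first to the second step lowers its present value by only about 1 percent, which is in line with the near-equal split found for case 1.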

7 Conclusion

This study develops a real option model that calculates the optimal investment strategy for a stepwise investment in circular plastics (Footnote 23). With this research, we extend the existing literature on policy uncertainty within the field of real options. We develop a model that allows the firm to observe the public pressure on the policymaker to regulate. Based on these observations, the firm makes projections on when the policy will be implemented. The optimal investment strategy can be derived from these projections. The model thus offers a tool for: (i) firms to plan their investment steps, and (ii) policymakers to assess the impact of their behavior and policies, e.g. assessing whether policies destroy the market.

We solve the model for a two-step investment in the use of recycled PE. Using recycled PE is more expensive than using virgin PE. Additionally, the quality of recycled PE is not as stable as the quality of virgin PE. As a consequence, recycled PE is not widely used today. As shown by our study, a regulatory policy that mandates the use of recycled PE can effectively contribute to a CE. However, if that policy is set too strict, i.e. it is not profitable for firms to invest in the use of recycled PE, the market will cease to exist. Therefore, we define the boundary below which the policy needs to be set in order for investments to take place. Conditional on the investments being profitable, we find that the first investment step occurs before or right at the moment of the policy implementation. The timing of the second investment step depends upon the timing of the first investment step. If the minimal withholding time, representing the time the firm needs to learn, has expired when the policy is implemented, the second investment will be made at the policy implementation time. Otherwise, the second investment step will be made as soon as the minimal withholding time has expired.

Our results indicate that expected investment timings are not very sensitive to the investment characteristics. Therefore, we conclude that incentive-based policies accompanying a regulatory policy would only impact investment timings if they were sufficiently strong. If the policymaker wants the market to convert more quickly, the regulatory policy can simply be implemented earlier. If the market receives clear signals that the policy implementation will be brought forward, uncertainty on the market is reduced, minimizing market distortions.

The expected value of the firm at the beginning of the planning period is found to be sensitive to investment characteristics. We are able to determine the optimal distribution of the capacity sizes of the two investment steps that maximizes the firm’s expected value. For our case study, we find that the optimal capacity of the first investment is smaller than the optimal capacity of the second investment.

Future research could, instead of investigating a GBM and a Poisson process separately, look into a combined GBM Poisson process. This would allow both processes to be compared more easily. This paper has been written from the perspective of a monopolistic investor. Relaxing the assumption of a monopoly and, e.g. considering a duopoly, may lead to what is called ‘a war of attrition’ in game theory. If one player invests in the use of recycled plastics, this could take the heat out of the public debate and reduce public pressure. This in turn may postpone the expected policy implementation time and allow the other player to postpone his or her investment. Consequently, both players have an incentive to be the second mover. The policymaker’s optimal time to implement a regulation could also be studied.