Optimal bidding functions for renewable energies in sequential electricity markets

In most modern energy markets, electricity is traded in pay-as-clear auctions. Usually, multiple sequential markets with daily auctions, in which each hourly product is traded separately, coexist. In each market and for each traded hour, each power producer and consumer submits multiple price and volume combinations, called bids. After all bids are submitted by the market participants, the market-clearing price for each hour is published, and the market participants must fulfill their accepted commitments. The corresponding decision problem is particularly difficult to solve for market participants with stochastic supply or demand. We formulate the energy trading problem as a dynamic program and derive the optimal bidding functions analytically via backward recursion. We demonstrate that, for each hour and market, the optimal bidding function is completely defined by two bids. While we focus on power producers with stochastic supply (e.g., wind or solar), our model is applicable to power consumers with stochastic demand, as well. The optimal policy is applicable in most liberalized energy markets, virtually independent of the structure of the underlying electricity price process.


Introduction
The energy supply is one of the fundamental needs of modern society. Meeting this need results in a multitude of decision problems on all planning levels. Strategic decision problems include the setup of conventional power plants or the expansion of the electricity grid. The tactical decision level includes the harvest planning and inventory planning of biomass (e.g., Ying et al. 2020). Decisions on the operative planning level include the profit-maximizing management of energy storages or minimizing the operational cost of microgrids. Since these decision problems are of high practical and scientific importance, multiple literature reviews are dedicated to these topics (e.g., Rahman et al. 2015; Weitzel and Glock 2018; Jin et al. 2020).
One operative decision problem faced by most energy producers is selling the produced electricity in an energy market. To match the electricity supply and demand and guarantee grid stability, electricity is traded in advance. Therefore, market participants must schedule their energy production and consumption some time before the actual delivery. In day-ahead auction markets, the hourly products (commitment for production or consumption in a specific 60-min time slot) are traded more than 24 h ahead of physical delivery. This is especially difficult for power producers with stochastic supply (e.g., wind or solar), as they must trade based on their production forecast. The market participants submit up to $N$ price and volume combinations $(p_i, x_i)$, $i \in \{1, \dots, N\}$, for each hourly product (e.g., $N = 25$ combinations in the Spanish markets). A price and volume combination is called a bid. After the market is cleared, the market participants must fulfill their accepted bids at the hourly market clearing prices (MCP). A buying bid is accepted if the corresponding price is greater than the MCP, while a selling bid is accepted if the corresponding price is less than the MCP. Since these discrete bids map each MCP to a volume, the resulting staircase function is called the bidding function.
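The staircase shape of a bidding function can be illustrated with a minimal Python sketch. It covers selling bids only and applies the acceptance rule stated above (a selling bid is accepted if its price is less than the MCP). The function name `accepted_sell_volume` and the example bids are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

# A bid is a (price in EUR/MWh, volume in MWh) pair; illustrative type alias.
Bid = Tuple[float, float]


def accepted_sell_volume(bids: List[Bid], mcp: float) -> float:
    """Total volume of the selling bids that are accepted at the given MCP."""
    # A selling bid is accepted when its price is below the market clearing price.
    return sum(volume for price, volume in bids if price < mcp)


# Three hypothetical selling bids for one hourly product.
bids = [(10.0, 0.2), (35.0, 0.3), (60.0, 0.5)]

# Evaluating the bids at several MCPs traces out a non-decreasing staircase.
for mcp in (5.0, 40.0, 75.0):
    print(mcp, accepted_sell_volume(bids, mcp))
```

Each additional bid adds one step to the staircase, so a bidding function with few steps needs only few bids.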
As the time for physically delivering the electricity nears, the production forecast becomes increasingly reliable. Therefore, in most energy markets, one or more intraday auction markets exist, which allow the market participants to adjust their commitments. Again, market participants must submit multiple bids for each traded product. This sequence of events is illustrated in Fig. 1.
In this paper, we analytically derive the optimal bidding functions for market participants with stochastic demand or supply in sequential electricity auction markets. These market participants can be utility providers, wind or solar power plants, aggregators, or several of these assets combined into a virtual power plant. In our numerical study, we focus on a wind power producer in the Spanish electricity markets (up to seven sequential markets), but our approach is applicable to most liberalized energy auction markets. In addition, our approach is applicable with most electricity price processes. To demonstrate the gains from the analytically derived optimal policy, we compare it with two benchmark approaches.
Our paper contributes to the existing literature in two ways. First, we formulate the energy trading problem as a dynamic program in which the decision is a bidding function with a continuous domain and solve it analytically. As we demonstrate, the optimal bidding function is a simple decision rule and is completely defined by two discrete bids. Therefore, our approach is applicable in real-world markets. Second, since our approach can be applied with state-of-the-art price processes, we present a strong evaluation tool for the forecasting community to show the financial benefits of an enhanced price forecast.
This paper is structured as follows: Sect. 2 is a brief literature review. In Sect. 3, we formulate the decision problem as a dynamic program and solve it analytically in Sect. 4. Section 5 presents the data for our numerical study in Sect. 6, while Sect. 7 concludes the paper.

Literature review
The profit-maximizing trading in energy markets is a highly active research field. Therefore, multiple literature reviews are dedicated to this topic (e.g., Fathima and Palanisamy 2015; Rahman et al. 2015; Weitzel and Glock 2018). We focus our literature review on the trading in multiple markets. In most countries, multiple electricity markets coexist which are used sequentially by the market participants. Since the integrated or coordinated trading in these markets can dramatically increase the complexity of the decision problem, most authors focus on a single market setting, while other markets are (if at all) only included implicitly (e.g., Jiang and Powell 2015; Gönsch and Hassler 2016; Zhou et al. 2016; Franz et al. 2020; Ghavidel et al. 2020; Finnah and Gönsch 2021). Like in sequential auction markets, a product can be traded multiple times in a continuous intraday market. The key difference is that in continuous intraday markets, the current energy price is observable. While this opportunity is often ignored and products are only traded once, some papers model multiple trades (e.g., Aïd et al. 2016; Bertrand and Papavasiliou 2020; Boukas et al. 2020).
Löhndorf et al. (2013) formulate the decision problem of an owner of a hydro storage with stochastic inflow on the German day-ahead and continuous intraday market as a dynamic program and use an approximate dual dynamic programming approach to solve it. The energy price processes are based on a few fundamentals, such as the mean temperature, total solar power generation, and gas price. To apply their heuristic, the underlying stochastic processes are discretized. Meanwhile, Ding et al. (2015) use stochastic programming to (re-)optimize the management of a wind turbine combined with an energy storage on the Spanish day-ahead, intraday, and real-time market. Here, they employ four different timescales down to one minute to capture the dynamics of the real-time market. Ding et al. (2015) do not optimize all considered markets in an integrated manner but use the different stochastic programs in a receding horizon manner. The authors assume that the next market's price is deterministic. This dramatically decreases the complexity, as the price components of the bids are not needed. Furthermore, Crespo-Vazquez et al. (2018) focus on a wind power producer with an energy storage on two Spanish auction markets and an implicitly modeled balancing market with hourly products; the other auction markets are ignored. The decision problem is formulated as a stochastic program with an optimization horizon of 24 hours. Again, the price component of the bids is not modeled. Heredia et al. (2018) use multistage stochastic programming (MSSP) to optimize the management of a wind turbine combined with an energy storage. The considered power producer maximizes the daily profits on two Spanish auction markets and a reserve power market. Therefore, the Spanish energy markets are not modeled in full complexity. Rintamäki et al. (2020) consider the integrated trading in two energy markets of an energy producer with controllable load. The problem is modeled as a bi-level program. The upper level is a stochastic program that models the integrated trading itself, while the lower-level problems are the dispatching problems on the day-ahead and intraday market. Sequential bidding in a day-ahead and a balancing market is considered in, e.g., Boomsma et al. (2014); Kumbartzky et al. (2017); Kongelf et al. (2019); Mazzi et al. (2019).
Most authors restrict themselves to a small number of markets. This is because the computational burden of MSSPs increases exponentially in the number of markets. Therefore, the Spanish electricity market, which consists of seven sequential markets, cannot be handled in full complexity using this approach. The first to handle all seven Spanish markets are Wozabal and Rameseder (2020).
In contrast with the rest of the literature, the authors model the trading problem for the Spanish electricity market as a dynamic program, in which each decision stage corresponds to one market. Since the computational burden of dynamic programs increases only linearly in the number of stages/markets, Wozabal and Rameseder (2020) can solve the trading problem efficiently but heuristically with an approximate dual dynamic programming approach. Wozabal and Rameseder (2020) propose two model variants: one without updating the power production forecast and one with updates. We refer to the more interesting model with production forecast updates. As no energy storage is modeled, each day can be optimized separately. However, Wozabal and Rameseder (2020) modeled the dynamic program for all 24 h at once, even if the hours could be modeled independently, as well. To determine the bids for each stage (market), different quantities are mapped to the hour-dependent price points selected previously. The authors demonstrate the influence of the underlying price process (e.g., distribution of noise) and include risk aversion by optimizing the nested Conditional Value-at-Risk. Since the state space must include all Markovian features of the underlying price and production processes, Wozabal and Rameseder (2020) use only the most recent market prices and production forecast as state-dependent features. Due to missing production data, the authors focus primarily on a setting without updating the power production forecast in their numerical study.
It is common in the literature that the prices and volumes of the bids are not optimized simultaneously. This is because the simultaneous price and volume decisions result in nonlinear and non-concave decision problems. Therefore, most authors decide on either the prices $p_i$ or the volumes $x_i$, while the counterpart is given by parameters (e.g., Morales et al. (2010); Wozabal and Rameseder (2020)). This reduces the computational burden but leads to sub-optimal decisions. A consequence is that the modeled decision problem must be (partially) replaced by the sample average approximation (SAA), which lowers the quality of the solution again. The computational burden of the SAA formulation increases in the number of samples used, which are needed to capture state-of-the-art stochastic processes with their correlations.
In summary, to the best of our knowledge, no paper solves the energy trading problem in sequential markets analytically. In contrast, for single market settings, analytical solutions exist in the literature (e.g., Kim and Powell (2011); Densing (2013); Aïd et al. (2016)). Kim and Powell (2011) model the hour-ahead market trading of an energy storage combined with a wind farm as a dynamic program but ignore price-volume bids. For the analytical solution, Kim and Powell (2011) need assumptions regarding the stochastic processes of the wind power production and the energy prices. Densing (2013) analytically solves the price-volume bidding of an energy storage in an auction market. For this, the lower and upper bounds of the energy storage are ignored and only the expectation of the storage level is constrained. Aïd et al. (2016) minimize the imbalance cost in the continuous intraday market under stochastic demand, production, and prices. Further, a controllable thermal power plant is integrated. Aïd et al. (2016) analytically solve a relaxed problem that allows negative production. The solution of the relaxed problem is used to solve the non-relaxed problem heuristically.

Dynamic program
In this section, we present our dynamic program based on Wozabal and Rameseder (2020). As these authors note, without storage, each day can be optimized separately. However, Wozabal and Rameseder (2020) optimize all hours jointly. We model our dynamic program for a single hourly product, as these can be optimized independently. Instead of deciding on discrete bids $(p_i, x_i)$, we decide on a bidding function $x(\cdot)$ with a continuous domain that maps each price $p$ to a volume $x(p)$. We demonstrate that the value functions of the trading problem are linear in the market position and can be derived analytically. Additionally, we demonstrate that the optimal bidding functions of the trading problem are defined by two discrete bids. Moreover, our approach is virtually independent of the structure of the underlying stochastic processes and can be applied with state-of-the-art price processes.
We denote the number of markets on which a product can be traded as $T$. The index $t \in \{0, \dots, T\}$ denotes the number of markets on which a product has already been traded.

Exogenous information
The exogenous information $W_{t+1}$ describes the information that becomes known after the power producer trades on market $t$ but before trading on market $t + 1$. After the power producer trades on market $t$, the MCP of this market, $P_{t+1} \in [P^{\min}, P^{\max}]$, is published. $P^{\min}$ and $P^{\max}$ denote the minimum and maximum market prices. In addition, as the time until the physical delivery of the electricity decreases, the power production forecast becomes more reliable. Consequently, the power producer updates the production forecast. We denote the power producer's forecast of the power production during physical delivery, immediately before trading on market $t + 1$, as $Y_{t+1}$. These updates contrast with Wozabal and Rameseder (2020), who update the production forecast only once before trading on the last market $T$. We extend our model by a multidimensional dummy information variable (indexed by $t + 1$), which includes all the information necessary for the underlying price and production processes. This could include the last market prices, load forecasts, natural gas prices, temperature, weather forecasts, and many other factors.
We do not assume specific characteristics of the underlying stochastic processes, so correlation between these can be considered.

State variable
The state $S_t$ includes everything the power producer's decision depends on. The decision depends on the current market position $R_t$ and all known information $W_t$. A positive position $R_t$ is a commitment for delivery, while a negative $R_t$ is a commitment for consumption.

Transition function
The transition of the exogenous part $W_t$ of the state $S_t$ is described by the underlying stochastic processes. The transition of the market position is given by

$$R_{t+1} = R_t + x_t(P_{t+1}).$$

The transition is determined by the chosen bidding function $x_t(\cdot)$ and the stochastic market clearing price $P_{t+1}$. A positive $x_t(P_{t+1})$ corresponds to selling energy, while a negative one corresponds to buying energy.

Action variable
As we model the decision problem with a bidding function with a continuous domain, the power producer's decision is the function $x_t(\cdot)$, which maps each possible MCP $P_{t+1} \in [P^{\min}, P^{\max}]$ to a volume $x_t(P_{t+1})$. In most countries, the bidding function must be non-decreasing (e.g., OMIE (2018)).
Market participants are registered with a maximum and minimum position:

$$R^{\min} \le R_t + x_t(P_{t+1}) \le R^{\max}.$$

For power producers, the maximum position is typically the rated capacity of the power plant.
As we assume that the power producer is a price-taker, we restrict the deviation of the position from the production forecast to prevent excessive speculation, particularly on the later markets, which are usually less liquid:

$$\left| Y_t - \left( R_t + x_t(P_{t+1}) \right) \right| \le c_{t+1},$$

with maximum absolute deviation $c_{t+1}$. This is in contrast to Wozabal and Rameseder (2020), in which the trading decisions are constrained on all but the last market. This is crucial, since the last market is typically the least liquid one. Like Wozabal and Rameseder (2020), we do not allow systematic over- or underproduction; therefore, we set $Y_{T-1} - R_T = 0$, which is a special case of the above restriction ($c_T = 0$).
Combined, the restrictions can be written as

$$X^{\min}_t(Y_t) \le R_t + x_t(P_{t+1}) \le X^{\max}_t(Y_t),$$

with the time- and state-dependent lower bound $X^{\min}_t(Y_t) = \max\{R^{\min}, Y_t - c_{t+1}\}$ and upper bound $X^{\max}_t(Y_t) = \min\{R^{\max}, Y_t + c_{t+1}\}$, where $R^{\max}$ and $R^{\min}$ denote the maximum and minimum position. In summary, the decision $x_t(\cdot)$ is a non-decreasing function that maps each MCP in $[P^{\min}, P^{\max}]$ to a volume respecting these bounds.
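The interplay of the restrictions can be sketched in Python. The sketch assumes one particular reading of the text above: the registered position limits and the maximum deviation from the production forecast both constrain the position after trading, so the tighter of the two applies on each side. All function and parameter names are hypothetical.

```python
def position_bounds(y_forecast, c_next, r_min, r_max):
    """Feasible range of the position after trading (assumed reading of the restrictions)."""
    # The position may deviate from the forecast by at most c_next ...
    lower = max(r_min, y_forecast - c_next)
    # ... and must stay within the registered minimum and maximum position.
    upper = min(r_max, y_forecast + c_next)
    return lower, upper


def volume_bounds(r_t, y_forecast, c_next, r_min=0.0, r_max=1.0):
    """Feasible range of the traded volume, given the current position r_t."""
    lower, upper = position_bounds(y_forecast, c_next, r_min, r_max)
    # Position transition: next position = current position + traded volume.
    return lower - r_t, upper - r_t


# Hypothetical state: position 0.3 MWh, forecast 0.5 MWh, allowed deviation 0.4 MWh.
print(position_bounds(0.5, 0.4, 0.0, 1.0))
print(volume_bounds(0.3, 0.5, 0.4))
```

Under this reading, a negative lower volume bound means that the producer may buy back part of the position, while a positive lower bound means that the producer is forced to sell.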

Contribution function
The contribution function describes the producer's one-stage revenue. Since we use the common price-taker assumption (e.g., Jiang and Powell (2015); Gönsch and Hassler (2016) (2020)), the power producer does not influence the market price, and the contribution function is simply the MCP multiplied by the traded volume, $P_{t+1} \cdot x_t(P_{t+1})$.

Value function
The power producer's optimization problem can be written as the Bellman equation.
subject to the boundary conditions and (9) and $x_t(\cdot)$ non-decreasing. $f_{t+1}(\cdot)$ is the probability distribution function of the market prices $P_{t+1}$. The optimal policy, which is the optimal bidding function $x_t(\cdot)$, depends on the current state $S_t$. The initial state $S_0$ is straightforward, with the initial position $R_0 = 0$ and $P_0 = \emptyset$, as the first market has no previous market.
To ease the analytical solution, we reformulate the Bellman equation.
with and Here, the same constraints must be respected. For the reformulation, one can use (a special case of) Pontryagin's maximum principle. This allows for converting the infinite-dimensional optimization problem into infinitely many one-dimensional optimization problems. The intuition for this is that since the bidding function $x_t(\cdot)$ maps each MCP $P_{t+1}$ to a volume $x_t(P_{t+1})$ and does not influence the system's behavior for other possible outcomes, we can optimize $x_t(P_{t+1})$ for each $P_{t+1}$ individually. This allows us to interchange the maximization step with the expectation with respect to the MCP. Therefore, we can treat the here-and-now decision as a wait-and-see decision.

Analytical solution
In this section, we solve the dynamic program analytically using backward recursion. To accomplish this, we prove that the value functions are linear in the position $R_t$. For this, we define the sets $\mathcal{P}^+_{t+1}(S_t)$ and $\mathcal{P}^-_{t+1}(S_t)$:

$$\mathcal{P}^+_{t+1}(S_t) = \left\{ P_{t+1} \in [P^{\min}, P^{\max}] : \mathbb{E}\left[ P_{t+2} \mid W_t, P_{t+1} \right] > P_{t+1} \right\}, \tag{17}$$

$$\mathcal{P}^-_{t+1}(S_t) = \left\{ P_{t+1} \in [P^{\min}, P^{\max}] : \mathbb{E}\left[ P_{t+2} \mid W_t, P_{t+1} \right] \le P_{t+1} \right\}. \tag{18}$$

The expectations in (17) and (18) are conditioned on the currently known information $W_t$ and the next market's outcome $P_{t+1}$. $\mathcal{P}^+_{t+1}(S_t)$ is the set of prices $P_{t+1}$ for which the expectation of the next price $P_{t+2}$ is greater than $P_{t+1}$. Meanwhile, $\mathcal{P}^-_{t+1}(S_t)$ is the set of prices $P_{t+1}$ for which the expectation of the next price $P_{t+2}$ is less than or equal to $P_{t+1}$.
An intuitive and sufficient (not necessary) condition for Condition 1, which holds for most price processes, is that the expected market price of market $t + 2$ increases less in $P_{t+1}$ than the identity $P_{t+1}$ itself. For the common linear price processes, this sufficient condition is that the regression coefficient of $P_{t+1}$ in the price process of $P_{t+2}$ is less than or equal to one.
Condition 1 holds for most energy price processes due to the mean-reverting behavior of energy prices (e.g., Weron (2014)). If the MCPs on market t + 1 are high, more conventional power producers begin to sell energy on market t + 2 , which lowers the price. Meanwhile, if the MCPs on market t + 1 are low, power producers buy energy on market t + 2 to reduce their position (commitment for production), thereby increasing the price.
Proposition 1 If Condition 1 holds, the value functions can be written as: with an appropriate function g t W t .
For a proof see Appendix 1.
Proposition 2 If Condition 1 holds, the optimal bidding functions are described by two price and volume combinations $(p^i_t, x^i_t)$.
Case a: X min
Proposition 2 is derived from the optimal bidding function $x^*_t(\cdot)$ in the proof of Proposition 1 in Appendix 1 by translating the bidding function into discrete bids to be in line with the market rules. If the conditional expectation of $P_{t+2}$ is continuous (e.g., linear) in $P_{t+1}$, $P^*_{t+1}$ can be derived by solving $\mathbb{E}[P_{t+2} \mid W_t, P^*_{t+1}] = P^*_{t+1}$ with $P^*_{t+1} \in [P^{\min}, P^{\max}]$. If no solution in $[P^{\min}, P^{\max}]$ exists, $P^*_{t+1}$ is the lower bound $P^{\min}$ or the upper bound $P^{\max}$. In the numerical study, we solve this equation analytically. In general, if $\mathbb{E}[P_{t+2} \mid W_t, P^*_{t+1}] = P^*_{t+1}$ cannot be solved in closed form, the threshold $P^*_{t+1}$ can be found by using a simple line search on the $P_{t+1}$ values.
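The threshold computation can be illustrated with a short Python sketch. It assumes a hypothetical linear price process in which the expected MCP of market t + 2 equals `a + b * p` for an MCP `p` on market t + 1, with `b < 1`. For the general case, the sketch uses bisection as one possible line search, which additionally assumes that the difference between the expected next price and the current price changes sign at most once, from positive to non-positive. The names `threshold_linear`, `threshold_bisection`, `a`, and `b` are illustrative and not taken from the paper.

```python
def threshold_linear(a, b, p_min, p_max):
    """Closed-form threshold for the assumed linear model E[P_next | p] = a + b * p, b < 1."""
    if b >= 1:
        raise ValueError("the closed form below requires b < 1")
    p_star = a / (1.0 - b)  # solves a + b * p = p
    # If the solution lies outside the price range, use the nearest bound.
    return min(max(p_star, p_min), p_max)


def threshold_bisection(expected_next, p_min, p_max, tol=1e-6):
    """Bisection on g(p) = expected_next(p) - p, assuming a single sign change."""
    def g(p):
        return expected_next(p) - p

    if g(p_min) <= 0:   # expected next price never exceeds the current price
        return p_min
    if g(p_max) >= 0:   # expected next price never falls below the current price
        return p_max
    lo, hi = p_min, p_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameters; the price bounds are those of the Spanish markets.
print(threshold_linear(10.0, 0.8, 0.0, 180.3))
print(threshold_bisection(lambda p: 10.0 + 0.8 * p, 0.0, 180.3))
```

Both functions fall back to the lower or upper price bound when the equation has no solution inside the feasible price range, mirroring the rule stated above.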
The optimal policy is a simple decision rule: As long as the expected MCP of market $t + 2$ is less than $P_{t+1}$, it is best to sell as much as possible, while as long as the expected MCP of market $t + 2$ is above $P_{t+1}$, it is best to buy as much as possible. This simple rule is defined by (20) and (21). If the power producer is forced to sell (Case b) or buy (Case c) energy due to the stochastic transition of the production forecast, the power producer must use a so-called price-accepting bid. The optimal bidding function is illustrated in Fig. 3. The threshold $P^*_{t+1}$ depends on the conditional distribution of the MCP of market $t + 2$. Therefore, $P^*_{t+1}$ is influenced by the currently known exogenous information $W_t$, including the most recent production forecast $Y_t$, the last market price $P_t$, and all information in the dummy information variable.
Moreover, the optimal bidding function does not depend on information further ahead. Instead, the optimal policy iteratively compares the next two markets. This is a beneficial property, as it reduces the market participant's forecasting effort to the next two markets. The optimal bidding function buys energy on market $t + 1$ for all prices that satisfy $P_{t+1} \le \mathbb{E}[P_{t+2} \mid W_t, P_{t+1}]$. Therefore, the market participant buys the energy cheaper (in expectation) compared to waiting for trading on market $t + 2$ (vice versa for selling energy). Since the position could be closed on the next market, this trading strategy is always beneficial, regardless of information further ahead.
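The decision rule can be sketched as a two-level step function in Python. This is an illustration under stated assumptions, not the exact statement of Proposition 2: `x_min` and `x_max` are assumed to be the lowest and highest feasible position after trading, the tie at the threshold is resolved towards selling, and the translation into market-conform discrete bids is omitted. All names are hypothetical.

```python
def two_level_volume(p, p_star, r_t, x_min, x_max):
    """Traded volume at MCP p: buy down to x_min below the threshold, sell up to x_max otherwise."""
    # Below the threshold the next market is expected to be more expensive: buy as much as possible.
    # At or above the threshold (tie-breaking is an assumption): sell as much as possible.
    target_position = x_max if p >= p_star else x_min
    return target_position - r_t


def is_non_decreasing(values):
    """Check the market rule that the bidding function must be non-decreasing."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Hypothetical state: position 0.3, feasible positions [0.1, 0.9], threshold 50 EUR/MWh.
prices = [0.0, 25.0, 50.0, 75.0, 100.0]
volumes = [two_level_volume(p, 50.0, 0.3, 0.1, 0.9) for p in prices]
print(volumes)
print(is_non_decreasing(volumes))
```

Because the function takes only two distinct values, it can be represented by two price and volume combinations, which is the practical appeal of the result.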

Data for the numerical study
In this section, we specify the parameters of our decision problem that are used in our numerical study in Sect. 6. We benchmark the optimal policy against two alternative approaches over an entire year for the Spanish electricity market via simulation and a backtest.
The Spanish electricity market consists of one day-ahead market (DM) and six intraday markets (IM1 to IM6), though not all products can be traded on all markets. The traded products and market closures are presented in Table 1.
Depending on the hour h ∈ {1, … , 24} , we denote the number of markets on which a product can be traded as T h .
In the Spanish electricity market, market participants are registered as producers or consumers. While for producers, the market position must be non-negative, the position of consumers must be non-positive. In the numerical study, we consider a producer with a wind turbine with a rated capacity of 1 MW, which is a typical-size wind turbine. On average, a Spanish wind farm has a rated capacity of 20 MW installed. Therefore, the maximum position is $R^{\max} = 1$, while the minimum position is $R^{\min} = 0$. We set the parameters for the maximum difference between the position and the production forecast as $c_{t,h} =$
In the Spanish energy markets, the minimum market price is $P^{\min} = 0$ €/MWh, while the maximum market price is $P^{\max} = 180.3$ €/MWh (OMIE 2018). The stochastic process for the update of the power producer's production forecast is given in Sect. 5.1, and the price processes are stated in Sect. 5.2. All explanatory variables in the stochastic processes are governed by the dummy information variable defined in Sect. 3.1 and therefore influence the optimal policy. The alternative trading policies for the numerical study are presented in Sect. 5.3.
We index the days of physical delivery with $d$. We consequently denote the production forecast and MCP of market $t$, hour $h$, and day $d$ as $Y^d_{t,h}$ and $P^d_{t,h}$. Moreover, we denote the optimal price threshold of market $t$, hour $h$, and day $d$ as $P^{d*}_{t,h}$.

Stochastic process for the wind power forecast
For the numerical study, we do not need a stochastic process for the actual wind power production, itself; rather, we need a stochastic process for the wind power producer's forecast of the wind power production. To the best of our knowledge, no suitable time series for the estimation of such a stochastic process is publicly available, so we must generate our own. In Sect. 5.1.1, we construct the time series of the wind power producer's forecast of the wind power production. This time series is used together with historical prices in the backtest in Sect. 6.2. To simulate the bidding behavior of the wind power producer in Sect. 6.1, we estimate the stochastic process for the update of the wind power producer's forecast in Sect. 5.1.2. For this, we use the time series generated in Sect. 5.1.1 as input data.

Production forecast time series
In this section, we construct a time series of the wind power producer's forecast of the actual wind power production. To achieve this, we employ the hourly production data of the wind turbine of Sotavento (2019) and scale the power production data to the considered rated capacity (1 MW). We denote the scaled production as $y^d_h$. Since approximately 1.6 % of the data are missing, we linearly interpolate these missing data with the adjacent productions.
We estimate different stochastic processes with production data that is known immediately before the market $t$ closes (multi-step ahead forecasting). While trading the products with physical delivery on day $d$ on market $t$, we denote the number of realized observations at $d' \le d$ as d t d ′ ; these are presented for $d' = d$ and $d' = d - 1$ in Table 2. For an overview of multi-step ahead forecasting, see Taieb et al. (2012) or Wang et al. (2016).
For the wind power process, we adopt an autoregressive model. See Jeon et al. (2019) for an overview of state-of-the-art wind power processes.
with noise d t,h , regression parameters (⋅) t,h , and the point estimator for future production $Y^d_{t,h}$. While trading the products with physical delivery on day $d$, we use all realized productions of the previous two days ($d$, $d - 1$, $d - 2$). This accounts for short-term weather changes. The last sum includes all realizations at the same hour of the previous 25 days ($d - 3$ to $d - 25$), which accounts for long-term weather effects.
To fit (27) s.t. (28), we use the least absolute shrinkage and selection operator (lasso) and select the penalty factor for the regression parameters with the Bayesian information criterion (BIC). To generate the production forecast time series for the days ranging from 01 November 2018 to 31 October 2019, the data are fitted with a moving window of two years (initial window: 01 November 2016 to 31 October 2018). The production forecast $Y^d_{t,h}$ is always non-negative and less than or equal to the rated capacity of the wind turbine.
We measure the out-of-sample deviation of the forecasted and realized power production for the days from 01 November 2018 to 31 October 2019. For this, the considered measurements are the mean absolute error (MAE), root mean squared error (RMSE), and median absolute deviation (MAD). We do not report the mean absolute percentage error (MAPE), as the wind power production is often zero. The measurements are summarized in Table 5. The measurements are computed using the errors of all hours/products that can be traded in decision stage $t$. Since all measurements decrease in the decision stage, the power production forecast becomes more reliable.
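The error measurements named above can be computed with a few lines of Python. The sketch below is only an illustration on made-up numbers. It assumes that MAD denotes the median of the absolute deviations of the errors from their median, which is one common definition. The function name `forecast_errors` is hypothetical.

```python
import math
import statistics


def forecast_errors(forecast, realized):
    """MAE, RMSE, and MAD of the forecast errors (MAD definition is an assumption)."""
    errors = [f - r for f, r in zip(forecast, realized)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # Median absolute deviation of the errors around their median.
    center = statistics.median(errors)
    mad = statistics.median([abs(e - center) for e in errors])
    return {"MAE": mae, "RMSE": rmse, "MAD": mad}


# Made-up forecasts and realizations in MWh for four hours.
metrics = forecast_errors([0.5, 0.2, 0.9, 0.0], [0.4, 0.2, 0.6, 0.1])
print(metrics)
```

Unlike MAPE, none of these three measures divides by the realized production, so hours with zero production pose no problem.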

Updating the production forecast
In this section, we estimate a stochastic process that models the behavior of the wind power producer's forecast of the actual wind power production. This process is used for the simulation in Sect. 6.1. Here, we use the time series of Sect. 5.1.1 as input data.
For this, we adapt the method of Wozabal and Rameseder (2020) and model the wind power producer's forecast as a non-parametric discrete Markov process. To do so, we cluster the production forecast data for each hour $h$ and each decision stage $t$ via k-means clustering. The cluster centroids define the states of the Markov process. The discrete transition probabilities from a state in stage $t$ to a state in stage $t + 1$ are estimated by the share of data assigned to the corresponding cluster centroids.
We perform the k-means clustering with the city-block norm and 11 clusters on the days ranging from 01 November 2018 to 31 October 2019. The MAE, RMSE, MAPE, and MAD are summarized in Table 6 displayed in Appendix 2. We report the error measurements computed over all decision stages.
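The estimation of the transition probabilities can be sketched in Python once the cluster assignments are available. The sketch assumes that each day has already been assigned a cluster label in stage t and in stage t + 1 (the k-means step itself is not shown). The uniform fallback for states without observations is an assumption made only to keep the example well defined, and all names are hypothetical.

```python
from collections import Counter


def transition_matrix(labels_t, labels_next, n_states):
    """Row-stochastic matrix of empirical transition shares between cluster labels."""
    counts = Counter(zip(labels_t, labels_next))
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        if row_total == 0:
            # Assumption: states that never occur get a uniform row.
            matrix.append([1.0 / n_states] * n_states)
        else:
            # Share of days that move from cluster i to cluster j.
            matrix.append([counts[(i, j)] / row_total for j in range(n_states)])
    return matrix


# Made-up cluster labels of five days in two consecutive decision stages.
matrix = transition_matrix([0, 0, 1, 1, 1], [0, 1, 1, 1, 0], n_states=2)
print(matrix)
```

Each row sums to one, so the matrix can be used directly to sample the forecast state of the next decision stage.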

Price processes
In this section, we state the stochastic processes for the electricity prices. The stochastic process for the day-ahead market is stated in Sect. 5.2.1, while the process for the intraday markets is introduced in Sect. 5.2.2. To estimate the electricity price processes for the days from 01 November 2018 to 31 October 2019, we fit the models on the data of OMIE (2019) with the same moving window as in Sect. 5.1.

Day-ahead market price process
The wind power producer's decision for the day-ahead market depends on the conditioned expectation of the first intraday market's (IM1) prices. As our intraday price process in Sect. 5.2.2 includes the day-ahead market price as explanatory variable, we need a stochastic process for the day-ahead market prices P d 1,h to derive this. We model the day-ahead market prices with the multivariate auto-correlated process proposed by Ziel (2016), which depends on all realized day-ahead market prices of the last week and the day of the week. with regression parameters (⋅) 1,h and noise ̄d 1,h . The stochastic process is fitted using lasso with the BIC. Table 7 in Appendix 2 presents the corresponding out-of-sample error measurements. We do not assume a specific distribution for the noise ̄d 1,h . While simulating the day-ahead market prices, we use bootstrapping. More precisely, we randomly draw a vector of noises out of the residuals, which allows for considering the correlation between the noises of the hours. If a sampled day-ahead market price is below the lower bound P min or exceeds the upper bound P max , the sample is capped. This never happens in the numerical study.
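The bootstrapping step described above can be sketched in Python. The sketch assumes that the residuals are stored as one vector per day (one entry per hour), so that drawing a whole vector preserves the correlation between the hours, and that samples outside the price range are capped. The function name `simulate_prices` and the numbers are hypothetical.

```python
import random


def simulate_prices(point_forecast, residual_vectors, p_min, p_max, rng):
    """Draw one residual vector, add it to the hourly point forecasts, and cap the result."""
    # Drawing a complete vector keeps the dependence between the hourly noises.
    noise = rng.choice(residual_vectors)
    return [min(max(f + e, p_min), p_max) for f, e in zip(point_forecast, noise)]


# Made-up point forecasts for two hours and two historical residual vectors.
point_forecast = [50.0, 40.0]
residual_vectors = [[1.0, -2.0], [200.0, -300.0]]
rng = random.Random(42)  # fixed seed for reproducibility
sample = simulate_prices(point_forecast, residual_vectors, 0.0, 180.3, rng)
print(sample)
```

The second residual vector is deliberately extreme to show the capping at the lower and upper price bounds.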

Intraday market price process
For the intraday market prices, we assume linear dependence on all published market results of the same day.
$H_t$ denotes the first hour traded on the intraday market $t$ and is presented in Table 1. Again, we use lasso with the BIC to estimate the multivariate price model. Condition 1 is never violated. Table 8 in Appendix 2 summarizes the out-of-sample MAE, RMSE, MAPE, and MAD of the estimated process; these indicate that the intraday market prices are much easier to predict than the day-ahead market prices. As with the day-ahead market price process, we employ bootstrapping to draw sample intraday market prices in the numerical study. Again, a sampled intraday market price is capped if the sample is not in the feasible range $[P^{\min}, P^{\max}]$. In the numerical study, this happens in less than 0.003 % of cases.

Alternative trading strategies
To demonstrate the financial benefits of using the analytically derived optimal policy, we compare it with two benchmark approaches: a myopic policy and a rolling horizon policy. To ease the explanation and align with the notation used in Sect. 3, we omit the indices h and d.

Optimal policy:
This is the policy derived in Sect. 4. Here, the price threshold is derived analytically by solving $\mathbb{E}[P_{t+2} \mid W_t, P^*_{t+1}] = P^*_{t+1}$ with $P^*_{t+1} \in [P^{\min}, P^{\max}]$. If no solution exists in $[P^{\min}, P^{\max}]$, $P^*_{t+1}$ is the lower bound $P^{\min}$ or the upper bound $P^{\max}$. This never occurs in the numerical study.
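When the conditional expectation has no convenient closed form, the threshold equation can also be solved numerically. The sketch below uses bisection and rests on two assumptions that are not taken from the paper: the gap between the conditional expectation and the price is decreasing in the price (as one would expect under mean reversion), and the threshold is clamped to the nearer bound when no root lies in the feasible range. The function `cond_exp` is a hypothetical stand-in for the fitted price model.

```python
def price_threshold(cond_exp, p_min, p_max, tol=1e-6):
    """Solve cond_exp(p) == p on [p_min, p_max] by bisection.

    cond_exp(p) stands in for the conditional expectation of the next
    market's price given the current price p. Assumes that
    cond_exp(p) - p is decreasing in p (mean reversion).
    """
    def gap(p):
        return cond_exp(p) - p

    if gap(p_min) <= 0.0:
        return p_min  # root at or below the range: clamp (assumption)
    if gap(p_max) >= 0.0:
        return p_max  # root at or above the range: clamp (assumption)
    lo, hi = p_min, p_max  # invariant: gap(lo) > 0 > gap(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical linear model: expected next price = 10 + 0.5 * p.
threshold = price_threshold(lambda p: 10.0 + 0.5 * p, 0.0, 100.0)
```

For this linear toy model, the root is available in closed form as 10 / (1 - 0.5) = 20, which makes the numerical result easy to check.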

Myopic policy:
The myopic policy sells or buys energy so that the position equals the power producer's forecast $Y_t$ of the power production during physical delivery. For this, the bidding function is a single price-accepting bid whose volume is $Y_t - R_t$. Note that since power producers are obligated to clear predictable deviations from the position, the myopic policy is the simplest practically feasible policy.

Rolling horizon policy:
The rolling horizon policy solves a deterministic optimization problem at each decision stage t, given the current state $S_t$. Again, the bidding function is a single price-accepting bid and is completely defined by a volume. We denote the volume decision of the rolling horizon policy at decision stage t in state $S_t$ by $x^{RH}_t$.
The deterministic optimization problem (32) s.t. (33) to (35) is reoptimized in each decision stage. Thus, in stage t, only the decision $x^{RH}_t$ is used. The expectations in (32) and (34) are conditioned on the currently known information $W_t$. To reflect this with our estimated stochastic processes, we use the law of total expectation. In contrast with the optimal policy, the rolling horizon policy uses information over the full decision horizon. Rolling horizon policies are very popular for solving operations management problems under uncertainty (see Chand et al. 2002 for an overview).
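To illustrate how the law of total expectation yields multi-stage price expectations from a one-step model, consider a deliberately simplified toy process, P_{k+1} = a + b * P_k + noise with zero-mean noise. This is an assumption for illustration only; the estimated processes in this paper depend on all published market results of the same day. By the tower property and linearity, the expectation several stages ahead follows from applying the one-step mean repeatedly.

```python
def expected_price(p_now, steps, a, b):
    """Expected price `steps` stages ahead, given the current price.

    Toy model (assumption): P_{k+1} = a + b * P_k + noise, with
    zero-mean noise. The tower property gives
    E[P_{t+2} | P_t] = a + b * E[P_{t+1} | P_t], and so on.
    """
    expectation = p_now
    for _ in range(steps):
        expectation = a + b * expectation
    return expectation

# Hypothetical parameters: a = 10, b = 0.5, current price 40 EUR/MWh.
two_ahead = expected_price(40.0, steps=2, a=10.0, b=0.5)
```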
Compared to the optimal policy, the rolling horizon policy has the disadvantage that it solves a simplified problem and does not use price-volume bids. A fairer comparison would therefore model price-volume bids in the rolling horizon policy by matching volumes to previously chosen price points, which is a common technique for modeling price-volume bids (see Sect. 2). Overall, we compare the optimal policy only with a myopic policy and a direct look-ahead (rolling horizon) policy. See Powell (2019) for an overview of further techniques to derive a policy. In particular, we do not benchmark against a value function approximation approach such as approximate dual dynamic programming (e.g., Löhndorf et al. 2013; Wozabal and Rameseder 2020). We neither tune the rolling horizon policy nor implement further approaches, for two reasons. First, these approaches are sample based and/or require discretized price processes, which lowers solution quality. Second, these approaches are computationally more expensive than the superior optimal policy.

Numerical study
In the numerical study, we compare the optimal policy with the two benchmark approaches. In Sect. 6.1, we compare the approaches on a multitude of simulated trajectories. Meanwhile, in Sect. 6.2, we demonstrate how the approaches would have performed in the past based on a backtest. In the numerical study, we consider multiple settings.

Basic:
The Basic setting is the setting described in the paper thus far.

Unrestricted:
In the Unrestricted setting, we do not restrict the deviation of the position from the current production forecast. Therefore, in this setting, we set $c_{t,h} = 1$ for $t < T_h$. Relaxing this restriction is problematic for large power producers, which can influence the market prices on the later, less liquid markets. The myopic policy does not benefit from the Unrestricted setting, as this policy always uses $c_{t,h} = 0$.

Limited trading $T_L$:
In this setting, we limit the look-ahead trading of the wind power producer to the first $T_L$ markets. After trading on the first $T_L$ markets with a look-ahead policy (optimal or rolling horizon), the power producer switches to the myopic policy. The Limited trading $T_L$ setting with $T_L = 6$ is equal to the Basic setting. For $T_L = 0$, the power producer always trades with the myopic policy. Limiting the use of the optimal policy can slightly reduce the price forecasting effort of the power producer, as price processes for the later markets might not be needed. However, this does not hold for the rolling horizon policy, which uses the price processes of all markets at each decision stage.
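The switching rule of the Limited trading setting can be written down in a few lines. The sketch assumes that markets are numbered 1 to 6; this indexing and the function name are illustrative choices rather than part of the model description.

```python
def active_policy(t, t_limit, look_ahead="optimal"):
    """Return the policy used on market t (1-based index, an assumption).

    The look-ahead policy is used on the first t_limit markets; afterwards
    the producer switches to the myopic policy.
    """
    if look_ahead not in ("optimal", "rolling_horizon"):
        raise ValueError("unknown look-ahead policy: " + look_ahead)
    return look_ahead if t <= t_limit else "myopic"

# Policy schedule over six markets for a limit of two look-ahead markets.
schedule = [active_policy(t, t_limit=2) for t in range(1, 7)]
```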

Policy evaluation
This section shows the financial benefits of the optimal policy based on the assumed stochastic processes. We simulate 100 trajectories, each of which consists of an entire year. Since we estimated the price processes for 365 days with a moving window, we have a unique set of estimated models for each day. The noise of the multivariate price processes is sampled using bootstrapping. More precisely, for each day and each decision stage t, we randomly draw a vector of noises from the residuals of the same decision stage. The power producer's forecast of the power production follows the estimated transition probabilities. For each approach and each setting, we use the same out-of-sample random numbers and report the mean revenue in euros over the complete year and the standard deviation (SD) of the yearly revenue. For a better comparison, we also report the mean revenue relative to the mean revenue of the optimal policy in the Basic setting, in percent.

Table 3 summarizes the results: it displays the mean and the standard deviation of the yearly revenue for each setting and each policy after 100 simulations, with common random numbers for each policy and setting. The differences between the mean revenues of the policies are significant. The standard deviations are very low (less than 1% of the mean), since a year consists of several thousand trades. This indicates that there is no need for a risk-averse optimization. In the Basic setting, the optimal policy outperforms the rolling horizon policy by more than 3400 EUR, or approximately 3.8 percentage points. Scaled to the average 20 MW wind farm, the increased revenue is greater than 68,000 EUR per year. Compared with the myopic policy (setting Limited trading 0), the optimal policy increases the revenue by 7.2 percentage points, or more than 5500 EUR.
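The reported summary statistics and the scaling to a 20 MW wind farm amount to the following arithmetic. The sketch uses made-up yearly revenues purely for illustration. It also assumes a sample standard deviation, linear scaling of the gain with capacity, and a simulated capacity of 1 MW; the last assumption is inferred from the ratio of the two reported figures and is not stated in this section.

```python
import statistics

def summarize(yearly_revenue, baseline_mean):
    """Mean, sample SD, and mean relative to a baseline (in percent)."""
    mean = statistics.mean(yearly_revenue)
    sd = statistics.stdev(yearly_revenue)  # sample SD (assumption)
    return {"mean": mean, "sd": sd,
            "relative_pct": 100.0 * mean / baseline_mean}

def scale_gain(gain_eur, simulated_mw, target_mw):
    """Scale a revenue gain linearly with installed capacity (assumption)."""
    return gain_eur * target_mw / simulated_mw

# Made-up yearly revenues of three simulated trajectories, in EUR.
summary = summarize([100.0, 102.0, 98.0], baseline_mean=100.0)
# 3400 EUR gain, scaled from an assumed 1 MW farm to a 20 MW farm.
scaled = scale_gain(3400.0, simulated_mw=1.0, target_mw=20.0)
```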
Comparing the look-ahead policies in the Unrestricted setting reveals that the optimal policy benefits more from removing the constraints for the position than does the rolling horizon policy. Since this constraint restricts trading in the later markets the most, this indicates that price-volume bids are especially important in the less liquid and therefore more volatile markets.
The difference between the optimal policy and the rolling horizon policy shrinks as fewer markets are optimized. In the Limited trading 0 setting, both policies are equal to the myopic policy. The value of considering an additional market in the Limited trading $T_L$ setting decreases in $T_L$, for two reasons. First, the number of traded products decreases; second, the constraint for the position is more restrictive on the later markets.

Backtesting
In this section, we repeat the numerical study performed in the previous section but evaluate the policies and settings using real-world data. Since we estimate the price processes and production forecast time series in Sect. 5.1.1 using a moving window, all data in this section are out of sample. Table 4 presents the revenue of the different approaches and settings based on real-world data. Additionally, we report the revenue relative to the revenue of the optimal policy in the Basic setting. In the backtest, the different approaches and settings behave similarly to Sect. 6.1. The optimal policy outperforms the rolling horizon policy in the Basic setting by 2.7 percentage points. Scaled to the average 20 MW wind farm, this would equate to approximately 47,500 EUR. This demonstrates the importance of price-volume bids and is aligned with Wozabal and Rameseder (2020), who applied their heuristic to a setting with and without price-volume bids. There, the price-volume bids increase the revenue by 2.6%.
To consider the actual decisions of the wind power producer with the optimal policy, we plot the price threshold $P^{d*}_{t+1}$ for each hour in Fig. 4. Figure 4 displays the box plots of the price threshold $P^{d*}_{t+1}$ for each hour h over all decision stages $t \le T_h - 2$. We exclude the last market, as the power producer uses a price-accepting bid in decision stage $T_h - 1$. In addition, Fig. 4 includes the hourly average market prices. Here, the average is computed over the entire year and all decision stages $t \le T_h - 2$. The median price thresholds (red lines) are close to the average market price for each hour. For most hours, 50% of the data (blue boxes) are within ±5 €/MWh of the median price threshold, while the whiskers contain the other 50% of the data. This reveals that for $t \le T_h - 2$, the price threshold $P^{d*}_{t+1}$ was never a price-accepting bid.
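The quantities read off such a box plot can be computed directly from the thresholds of one hour. The following sketch uses hypothetical threshold values; the function name, the ±5 €/MWh band parameter, and the choice of the inclusive quantile method are illustrative assumptions.

```python
import statistics

def box_stats(thresholds, band=5.0):
    """Quartiles of the price thresholds and the share of values that lie
    within +/- band (in EUR/MWh) of the median."""
    q1, median, q3 = statistics.quantiles(thresholds, n=4, method="inclusive")
    inside = sum(1 for p in thresholds if abs(p - median) <= band)
    return {"q1": q1, "median": median, "q3": q3,
            "share_within_band": inside / len(thresholds)}

# Hypothetical price thresholds for one hour, in EUR/MWh.
stats = box_stats([30.0, 35.0, 40.0, 45.0, 50.0])
```

The box spans the first to the third quartile and therefore contains the middle half of the thresholds by construction.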

Conclusion
In this paper, we modeled the trading in sequential energy markets for profit-maximizing (cost-minimizing) market participants with exogenous production and consumption as a dynamic program. This model accounts for updating the production/consumption forecast over the entire trading horizon. Apart from the common price-taker assumption, the model does not need specific assumptions about the underlying stochastic processes and can handle state-of-the-art forecasts. We solved the trading problem analytically for price processes that meet a weak condition. This condition holds in practice due to the mean-reverting behavior of energy prices. The optimal policy is a simple decision rule. We compared the optimal policy with two benchmark approaches in different settings via simulation and real-world data. Compared with numerical optimization, the analytically derived optimal policy dramatically reduces the computational burden for the market participants; this is especially beneficial for small market participants, which cannot afford the know-how and infrastructure needed for complex numerical approaches. The simple decision rule is optimal only because of the price-taker assumption, which is common in the literature. While this assumption holds for small market participants, our decision rule cannot be applied by a large share of small market participants at the same time or by price-makers. Therefore, future research should weaken the price-taker assumption, which would lead to nonlinear value functions and nonlinear bidding functions. These nonlinear bidding functions can then be approximated by discrete bids. The approximated bids might not be optimal but should be close to an optimal solution.
Another possible research avenue is to further investigate the benefits from the optimal policy. The optimal policy can be benchmarked against a tuned rolling horizon policy that accounts for price-volume bids and against approximate dynamic programming approaches. Additionally, the optimal policy with continuous price processes can be benchmarked against the optimal policy with discretized price processes, which are often needed for approximate dynamic programming approaches. These investigations can provide valuable insights on how much is lost from discretization and from suboptimal decisions. Especially for similar decision problems without analytical solution, this would answer an important question: Should one design an approximate dynamic programming algorithm that handles price processes well, or should one accept inferior stochastic processes and aim for high solution quality?

Fig. 4 Box plot of price thresholds for each hour

The function $G_{t+1}$ is linear in z. Thus, depending on the sign of the coefficient of z, an optimal solution is either the minimum or the maximum argument. Therefore, an optimal bidding function $x^*_t(P_{t+1}) = \arg\max_z G_{t+1}(S_t, P_{t+1}, z)$ is:

As by assumption $\sup P^+_{t+1}(S_t) \le \inf P^-_{t+1}(S_t)$, the optimal decision $x^*_t(\cdot)$ is nondecreasing in $P_{t+1}$. Therefore, the value function is:

+g t+1 W t+1 |W t , P t+1 dP t+1 = max x t (⋅)
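Because the objective is linear in z, its maximizer over an interval sits at one of the two endpoints, which is why the bidding function reduces to two volume levels separated by a price threshold. The sketch below illustrates this structure. The names, the volume bounds, and the tie-breaking at the threshold are assumptions for illustration; at the threshold itself either endpoint is optimal in the linear case.

```python
def argmax_linear(coef, z_min, z_max):
    """Maximizer of coef * z + const over [z_min, z_max]: an endpoint."""
    return z_max if coef > 0 else z_min

def bid_volume(price, threshold, z_min, z_max):
    """Two-level bidding function, nondecreasing in the price: trade
    z_max at or above the threshold, z_min below it (tie-breaking at
    the threshold is an arbitrary choice of this sketch)."""
    return z_max if price >= threshold else z_min

# Hypothetical values: threshold 40 EUR/MWh, volumes between -1 and 2 MWh.
prices = [20.0, 39.9, 40.0, 60.0]
volumes = [bid_volume(p, 40.0, -1.0, 2.0) for p in prices]
```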

Appendix 2: Error measurements of the estimated stochastic processes
See Tables 5, 6, 7 and 8.
= P max ∫ P min f t+1 P t+1 |W t ⋅ −R t ⋅ P t+1 |W t , P t+1 dP t+1

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.