1 Introduction

The use of financial derivatives for hedging purposes is widespread among non-financial firms and is considered an integral part of the business and investment decisions of many companies. Specifically, the hedging of foreign exchange and interest rate risks is of high relevance for many firms.Footnote 1 Considering the practical importance of hedging, one might guess that firms can rely on a well-developed academic literature for advice on how to hedge. Corporate hedging has indeed attracted considerable attention in the academic literature; however, the focus has mainly been on understanding why firms should hedge and the characteristics of firms that hedge versus those that do not. Considering the theoretical justifications for hedging and the empirical evidence on the value of hedging, surprisingly little is known about how firms should hedge in practice. Questions to which we have only partial answers include: the impact of empirically documented premia, e.g. in the interest rate market, on optimal hedging decisions; the value inherent in an expanded asset universe; whether risks should be partially or fully hedged; and not least, how a model supporting the hedging decision faced by market practitioners can be built considering e.g. transaction costs, trading at market prices, and flexibility in modeling choices. In this study, we take a first step towards addressing these questions by building a stochastic programming framework for determining optimal hedging decisions, given stochastic cash flows in multiple currencies and exposure to foreign exchange and interest rate risks.

The first step in determining optimal hedging decisions is to define the problem, which requires us to understand first of all why firms should hedge. The famous result of Modigliani and Miller (1958) implies that, given the assumption of perfect capital markets, financial decisions such as hedging do not create firm value. The important lesson from this result is not that financial decisions are irrelevant, but that the assumption of perfect capital markets must fail to hold for financial decisions to create firm value. A long strand of literature has suggested different market frictions as rationales for corporate hedging. Such rationales include: costs of financial distress (Smith and Stulz 1985), convex tax functions (Smith and Stulz 1985), costly external financing (Froot et al. 1993), informational asymmetry between firm and shareholders (DeMarzo and Duffie 1991), and more generally, non-linearities in the function describing firm value (Mackay and Moeller 2007).Footnote 2 The main lesson from this strand of literature is that costly states, in which firm value is destroyed due to capital market imperfections, do exist, which implies that firm value is a non-linear function of profits.Footnote 3 Within the firm value maximization paradigm, the objective for the managers of the firm is to maximize the expected firm value which is itself a non-linear function of profits. Unless variation in profits is infinitely costly, the optimal hedging decision is not to minimize variation in profits, but to determine the optimal trade-off between risk and the costs of hedging. This fundamental insight is central in the proposed framework for corporate hedging where in addition to transaction costs, we carefully consider the empirical research on documented premia in the interest rate and foreign exchange markets.

There is an extensive empirical literature that investigates expected interest and foreign exchange rates. The existence of time-varying term premia in the interest rate market is well documented. Two of the most influential contributions are those of Fama and Bliss (1987) and Cochrane and Piazzesi (2005). A common finding in these and other papers on term premia is that some measure of the slope of the term structure predicts excess bond returns. The literature on foreign exchange rate expectations is substantial. From the uncovered interest rate parity (UIP), it follows that the expected future spot rate is equal to the forward rate. However, it has been documented that forward-spot differentials and subsequent changes in the spot rates are uncorrelated, or even negatively correlated (see e.g. Fama 1984; Engel 1996), which strongly rejects the UIP. This failure of the UIP is often referred to as the forward premium puzzle and is what makes currency carry trades, i.e. the investment in high-interest-rate currencies funded by low-interest-rate currencies, profitable on average. A growing literature on carry trades shows that their excess returns are partly a compensation for exposure to crash or skewness risk (see e.g. Brunnermeier et al. 2008), but that carry trades hedged against crash risk remain profitable (see e.g. Jurek 2014). The impact of carry trades on foreign exchange dynamics is significant. However, to predict foreign exchange rates, the simple random walk model remains pre-eminent, as concluded in the survey on foreign exchange rate expectations by Jongen et al. (2008).Footnote 4

An important implication of the empirically observed properties of interest and foreign exchange rates is that the expected return of different hedging instruments varies with respect to both maturity and currency. Hence, the inclusion of hedging instruments spanning over a large set of maturities in all relevant currencies is important not only for the purpose of reducing risk, but is also necessary to enable finding the optimal hedge, considering risk as well as costs. We include in total 66 different hedging instruments in the optimization model. These are yearly spaced currency forward contracts and interest rate swaps in the relevant markets. The hedging decision is complicated not least by the combinatorial property of the problem, where (possibly infinitely) many combinations of financial contracts can be used to control the same risk, but importantly, at different costs.

Theoretical models of interest and foreign exchange rate dynamics that are consistent with empirical findings are key inputs in the proposed framework for optimal hedging. A complication in financial modeling is the fact that theoretical models in general do not provide a perfect fit to market prices, which causes model arbitrage if we allow trading at market prices. However, stochastic programming allows us to determine the optimal allocation of market traded contracts and still consider the important results from theoretical models. The use of market prices offers the opportunity to utilize relative price advantages in the market, and enables the optimal positions to be entered at these prices. Within the stochastic programming framework, we model term structure dynamics using the essentially affine model defined in Duffee (2002), which is flexible enough to capture empirically observed properties of term premia. Consistent with the findings of non-normality for foreign exchange returns (see e.g. Westerfield 1977) and possible conditional skewness documented in the carry-trade literature (see e.g. Brunnermeier et al. 2008), we model foreign exchange rates by a Poisson jump-diffusion model with stochastic volatility. Motivated by the findings in Jongen et al. (2008), we set the drift to zero.

The true objective relating firm value and profits is unknown, and so we must rely on approximations. Popular objective functions used in academic literature as well as in industry include measures of dispersion, tail-risk measures, and concave functions inspired by utility theory. We study specifications of the firm objective of all three types; namely variance, expected shortfall, and mean log profits.

We view the main contributions of this study to be (i) the development of an optimization framework that is robust enough to be used for hedging on real market data, (ii) the investigation of the impact of different objectives on the properties of the optimal hedge, and (iii) the findings on the importance of the asset universe made available to the risk manager, and on the impact of term premia on the optimal hedge.

2 Literature

The literature on foreign exchange risk management predominantly addresses the situation of a known (deterministic) cash flow. The optimal hedge of a single known cash flow, expressed as the ratio between futures contracts and foreign currency exposure, has been shown to be close, but not necessarily equal, to one, as documented in e.g. Ederington (1979). The case of an uncertain future cash flow is addressed in Kerkvliet and Moffett (1991), where the hedge ratio that minimizes the variance of a cash and futures portfolio is derived. However, the result is restricted to the special case of a single cash flow and is based on simplifying assumptions such as no interest rate risk. Another aspect of uncertain cash flows, namely the arrival of new information, is investigated in Eaker and Grant (1985). In their paper, the optimal hedge is determined as the solution to a stochastic dynamic programming problem using forward contracts. Compared to Kerkvliet and Moffett (1991), as well as Eaker and Grant (1985), we consider a more general setting with multi-currency cash flows, interest rate risk, market prices, transaction costs, a larger set of hedging instruments and term premia. However, we do not address the arrival of new information, which would require a multi-stage stochastic programming model, and we obtain no closed-form optimal hedge ratio.
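For reference, the variance-minimizing hedge ratio for a single exposure is commonly written as the covariance between spot and futures price changes divided by the variance of futures price changes. The following Python sketch computes the sample version of that ratio. The function name and the toy price changes are illustrative assumptions; they are not taken from the cited papers or from the framework of this study.

```python
from statistics import mean


def min_variance_hedge_ratio(spot_changes, futures_changes):
    """Sample estimate of Cov(dS, dF) / Var(dF)."""
    if len(spot_changes) != len(futures_changes):
        raise ValueError("series must have equal length")
    ms, mf = mean(spot_changes), mean(futures_changes)
    cov = sum((s - ms) * (f - mf) for s, f in zip(spot_changes, futures_changes))
    var = sum((f - mf) ** 2 for f in futures_changes)
    return cov / var


# Toy data (assumed): futures changes are exactly twice the spot changes,
# so half a futures contract offsets one unit of spot exposure.
spot = [1.0, -2.0, 0.5, 3.0]
futures = [2.0 * s for s in spot]
ratio = min_variance_hedge_ratio(spot, futures)
```

With imperfectly correlated series the ratio generally differs from one, which is in line with the observation above that the optimal hedge is close, but not necessarily equal, to one.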

The paper that we find to be closest to ours is Volosov et al. (2005), in which a stochastic programming model for hedging of foreign exchange exposure in a deterministic single-currency cash flow stream is formulated. Uncertainty is restricted to random fluctuations in future spot and forward rates. As in previous studies, interest rate risk is disregarded and only currency forward contracts are included. Using a multi-objective function, minimizing transaction costs and deviations from treasury targets while maximizing expected terminal wealth, to solve the problem, they find, through ex-post simulations, a considerable improvement in the risk-return profile compared to the no-hedging strategy.

Topaloglou et al. (2002) build a stochastic programming model for international asset allocation. By integrating the hedging decision with the portfolio selection problem, they show an improved risk-return profile from international diversification. In Topaloglou et al. (2008), the international asset allocation problem is addressed in a dynamic setting using a multi-stage stochastic programming model. Through ex-post simulation, they demonstrate the stochastic programming framework to be a flexible and effective decision tool by showing that the risk of international portfolios is considerably reduced with this framework.

3 The hedging problem

In this section we give a general formulation of the corporate hedging problem and present a suitable method for solving this class of decision problems. Within the firm value maximization paradigm the objective for the managers is to maximize the expected value of the firm. However, due to frictions in the markets, firm value itself is a nonlinear function of firm profits. Determining the exact shape of this function is a nontrivial task even in the case of a single firm, and so finding a general functional form for a larger family of firms is likely to be impractical.

In the literature as well as in industry, different approximations of the firm objective are chosen. These show varying concordance with the properties of costs caused by financial frictions. Three popular choices are tail-risk measures (e.g. expected shortfall), measures of dispersion (e.g. variance, absolute semi-deviation), and utility functions (e.g. power utility). These alternative specifications capture different properties of costs resulting from financial frictions. If the cost of financial distress is the only friction, then tail-risk measures are likely to be suitable as they focus on controlling low-profit states. Minimization of variance (standard deviation) is optimal if variation in profits is infinitely costly. This is seldom the case in practice, but measures of variation may provide good decisions in environments where the impact of expected returns and of third and higher moments is negligible. If financial frictions destroy value in bad (low-profit) as well as good (high-profit) states of nature, then a concave function, potentially inspired by utility theory, may provide a good approximation. In our numerical analysis, we work with objective function specifications of all three types: expected shortfall; variance; and the natural logarithm.

Although the nature of the hedging problem is sequential, we work in a single-period framework. The motivation for this is two-fold. First, for dynamic portfolio choice problems with transaction costs, Brown and Smith (2011) show that single-period stochastic programming models can give decisions that are close to optimal. Second, there is a trade-off between the number of stages and the number of outcomes per stage, and we know from e.g. Kaut et al. (2007) that even in a single-period model the required number of scenarios for a stable solution is likely to be large.Footnote 5 Although a multi-stage extension is interesting for future research, multiple stages introduce additional difficulties. To focus on the challenges already inherent in the corporate hedging problem, we therefore restrict this study to a single-period framework.

The optimal hedging decision can now be stated as a single-period optimization problem within the firm value maximization paradigm. For this purpose, let y denote the decision vector, i.e. the number (and type) of financial derivatives to trade, and \(\xi \) a random vector describing the uncertainty in the decision problem. We denote the (stochastic) next-period firm profit by \(\varPi (y,\xi )\), so that the corporate hedging problem can be written as

$$\begin{aligned} \begin{aligned} \underset{y}{\max } \quad&{\text {E}}[f(\varPi (y,\xi ))] \\ \text{ s.t. } \quad&y \in \mathcal {Y}\end{aligned} \end{aligned}$$
(1)

for some concave function f, and with \(\mathcal {Y}\) denoting the set of constraints on the decision variable y. The set \(\mathcal {Y}\) typically contains cash balance constraints and potential restrictions on trading that are dictated by the financial policy of the firm, such as limitations on short-selling.

For the vast majority of real-world applications the hedging problem in (1) is analytically intractable, and so has to be solved with numerical techniques. Stochastic programming provides a suitable framework for this purpose, and solves the problem in (1) by discretizing the distribution of the random vector \(\xi \), e.g. through Monte Carlo simulation. Given the discretization, the population mean in the objective of (1) is approximated by its sample counterpart, an approach known as the sample average approximation. The resulting optimization problem is a (large scale) deterministic non-linear program, commonly referred to as the deterministic equivalent of (1). The main steps in formulating and solving the stochastic programming model are illustrated in Fig. 1.

Fig. 1 Steps involved in formulating a stochastic programming model

When it is available, historical data is usually an important source of guidance on the modeling of uncertainty in stochastic programming applications. For the financial case that we study, observed empirical properties of interest rates and foreign exchange rates are crucial when choosing suitable stochastic processes to model the risk faced by a firm. Scenarios for the uncertain parameters are generated by means of Monte Carlo simulation, which is a popular scenario generation method in stochastic programming.Footnote 6 Given a finite set of possible realizations for the random parameters, the optimal hedge can be determined as the solution to the defined optimization problem.
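The sample average approximation of (1) can be illustrated with a deliberately small Python sketch. Everything specific in it is an assumption made for illustration: a single lognormal risk factor, a single forward contract with hedge ratio y between 0 and 1, f equal to the natural logarithm, and a brute-force grid search in place of a non-linear programming solver. It is not the model studied in this paper.

```python
import math
import random


def sample_scenarios(n, mu=0.0, sigma=0.2, seed=1):
    """Draw n lognormal outcomes for the unhedged profit (assumed distribution)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


def saa_objective(y, scenarios, forward):
    """Sample average of f(Pi(y, xi)) with f = log and
    Pi(y, xi) = (1 - y) * xi + y * forward."""
    total = sum(math.log((1.0 - y) * xi + y * forward) for xi in scenarios)
    return total / len(scenarios)


def solve_by_grid(scenarios, forward, steps=100):
    """Maximize the sample average over y in [0, 1] by grid search."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda y: saa_objective(y, scenarios, forward))


scenarios = sample_scenarios(2000)
fair_forward = sum(scenarios) / len(scenarios)

# Forward priced at the sample mean: hedging removes risk at no expected cost.
y_fair = solve_by_grid(scenarios, fair_forward)

# Forward priced 10% below the sample mean: hedging now has an expected cost.
y_costly = solve_by_grid(scenarios, 0.9 * fair_forward)
```

In this toy setting the full hedge is selected when the forward is priced at the sample mean, whereas a costly forward leads to less than full hedging, which mirrors the trade-off between risk and the cost of hedging discussed above.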

3.1 Problem instance

The specific instance that we study is inspired by a real case of a Swedish company (henceforth referred to as the case company) that serves as a supplier to a US company (referred to as the customer). The supplier contract stipulates that over the following 10 years the customer has the right, but not the obligation, to buy (a limited number of) products at a fixed price in USD. This corresponds to the customer holding a set of American call options on the products, with strike prices equal to the agreed product prices and times to maturity of up to 10 years. The exposure, from what corresponds to written call options, creates large risks for the case company if it is left unhedged.

The manufacturing costs for the case company are paid in SEK, GBP and EUR, while the cash inflows from sales are in USD. Cash in- and outflows are spread over the 10 years covered by the agreement. We model the cash flows in all four currencies as annually spaced at \(1,2,\ldots ,10\) years from the time of the first hedging decision, and we assume a 1-week hedging horizon.

The customer is obliged to notify the case company on a weekly basis about the expected order sizes, but may in the case of new information give updates more frequently. The information about expected future cash flows is used by the case company to monitor the value of the order book. We let \(\mathcal {T}\) denote the set of times for the project and hedge portfolio cash flows, and \({\mathcal {E}} = \{\text{ EUR/SEK }, \text{ GBP/SEK }, \text{ USD/SEK }\}\) the set of exchange rates with SEK as the term currency. Further, we let \(c_b(e)\) determine the base currency of exchange rate \(e \in {\mathcal {E}}\) (e.g. \(c_b(\text {EUR/SEK}) = \text {EUR}\)). The net value of the order book and the hedge portfolio at time t can then be expressed as

$$\begin{aligned} \varPi _t =&\sum _{\tau \in {\mathcal {T}}} d_{\text {SEK},t,\tau } \left( D_{\text {SEK},t,\tau } + \sum _{j=1}^M C_{\text {SEK},t,\tau ,j} \right) \nonumber \\&+ \sum _{e \in {\mathcal {E}}} f_{e,t} \sum _{\tau \in {\mathcal {T}}} d_{c_b(e),t,\tau } \left( D_{c_b(e),t,\tau } + \sum _{j=1}^M C_{c_b(e),t,\tau ,j} \right) , \end{aligned}$$
(2)

where \(f_{e,t}\) is the spot exchange rate \(e \in {\mathcal {E}}\), at time t; \(d_{c_b(e),t,\tau }\) is the discount factor in currency \(c_b(e)\), for time \(\tau \), observed at time t; \(C_{c_b(e),t,\tau ,j}\) is the cash flow forecast in currency \(c_b(e)\), at time \(\tau \), for project \(j=1,\ldots ,M\), given the information obtained up to time t; and \(D_{c_b(e),t,\tau }\) is the cash flow in currency \(c_b(e)\), at time \(\tau \), which results from the hedge portfolio held at time t. The risk faced by the case company derives from the distribution of changes in the net profits, i.e. \(\varDelta \varPi _{t+1} = \varPi _{t+1} - \varPi _t\). Hence, we see from (2) that risk arises from the combined effect of changes in spot exchange rates, movements in the term structures of interest rates, and uncertainty in future cash flows. We include \(2,3,\ldots ,10\) year interest rate swaps in all four markets and \(1,2,\ldots ,10\) year currency forward contracts in the three exchange rates with SEK as the term currency. The hedging decision concerns choosing the portfolio of interest rate swaps and currency forward contracts, generating \(D_{c_b(e),t,\tau }\), so as to control the risk from the uncertain cash flows and thereby create additional firm value.
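As a minimal sketch of how (2) can be evaluated, the following Python code discounts hedge and project cash flows per currency and converts the foreign legs to SEK at the spot rate. The dictionary layout, the function names and all numbers are assumptions made for illustration.

```python
def present_value(discount, hedge_cf, project_cf):
    """Sum over tau of d_tau * (D_tau + sum_j C_tau_j) for one currency."""
    return sum(
        discount[tau] * (hedge_cf[tau] + sum(project_cf[tau]))
        for tau in discount
    )


def net_value_sek(discount, hedge_cf, project_cf, spot):
    """Equation (2): the SEK leg plus each foreign leg converted at spot.

    discount, hedge_cf, project_cf: dicts keyed by currency, then by tau.
    spot: dict mapping a base currency to its SEK price (e.g. "USD" for USD/SEK).
    """
    total = present_value(discount["SEK"], hedge_cf["SEK"], project_cf["SEK"])
    for currency, rate in spot.items():
        total += rate * present_value(
            discount[currency], hedge_cf[currency], project_cf[currency]
        )
    return total


# Toy inputs (assumed): one cash flow date, one SEK project, two USD projects.
discount = {"SEK": {1: 0.90}, "USD": {1: 0.95}}
hedge_cf = {"SEK": {1: 10.0}, "USD": {1: -5.0}}
project_cf = {"SEK": {1: [100.0]}, "USD": {1: [20.0, 5.0]}}

pi_now = net_value_sek(discount, hedge_cf, project_cf, {"USD": 10.0})
pi_next = net_value_sek(discount, hedge_cf, project_cf, {"USD": 11.0})
delta_pi = pi_next - pi_now  # change caused by the spot rate move alone
```

Here only the spot rate changes between the two evaluations; in the full problem the discount factors and the cash flow forecasts move as well.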

4 The optimization framework

In this section, we move on from the general description of stochastic programming models in Fig. 1 and propose a framework for the specific problem of hedging currency and interest rate risk in an environment with uncertain cash flows. The first step in setting up the framework is to define the optimization problem, and given this we can determine the necessary input in terms of scenarios. Next, we make distributional assumptions for the risk factors and choose a suitable scenario generation method. This is followed by parameter estimation for the stochastic processes, and generation of scenarios. Finally, we solve the optimization problem, given the scenarios of the company cash flows and the hedging instruments. An illustration of the framework from the perspective of the flow of data, starting with the collection of historical data and the steps involved in generating scenarios and ending with the determination of the optimal hedge, is presented in Fig. 2.

Fig. 2 Optimization framework for corporate hedging

For the purpose of estimating the dynamics of interest and foreign exchange rates, we use synchronized historical weekly data on forward rate agreements, interest rate swaps, and foreign exchange rates. Parameters for the dynamic models of foreign exchange rates can be estimated directly from the historical data available, while the interest rate models first require complete term structures in each of the four markets to be determined. The reason for this is simply that the interest rate models require historical observations of fixed maturity yields that are not directly observable in the market. Estimation of the complete term structures, if made carefully, also has the important advantage of reducing the impact of noise present in the prices of individual contracts. The choice of the term structure estimation method, and specifically its ability to reduce noise in the raw data, has a big impact on the parameter estimation and thus on the quality of the scenarios generated. More generally, in a stochastic programming framework, the properties of the scenarios are of great importance for the quality of the optimal decision. The parameter estimation is performed separately for each interest rate market and foreign exchange rate, while the scenarios are generated collectively using a joint copula. By separating the estimation of the univariate distributions and the modeling of dependencies, we are able to estimate the multivariate distribution of all risk factors. The dynamic model for the company cash flows is based on distributional assumptions made by the case company. Although some data exists, we do not have enough historical data available to statistically estimate the distribution of the uncertain cash flows. Hence, the best estimate we can obtain is likely to be based on the expert opinions of managers within the company.

Given the scenarios, we can determine the value of all hedging instruments as well as the present value of the cash flows in each state of the future. In addition to the future prices and values, we also need the current market prices and present values, which we determine from term structure estimates and foreign exchange rate observations on the day of the hedging decision.

One of the advantages of the proposed framework is its flexibility, as it allows for replacement of different components, e.g. the assumed stochastic processes, the scenario generation technique, the set of included financial contracts, or the optimization model. Specifically, the framework allows us to study the impact of changing different components on the optimal hedge. One such experiment is to let one or several risk factors be deterministic, something which can help us to understand the importance of modeling uncertainty in the different risk factors.

Next, we define the optimization problem, which is followed by a description of the parameter estimation and the scenario generation for interest rates, foreign exchange rates, and cash flows respectively. The technical details of the pricing of currency forwards and IRSs can be found in “Appendix A”, and the details of the scenario generation are presented in “Appendix B”.

4.1 The stochastic programming model

We define the hedging problem as a two-stage (single-period) stochastic programming model without recourse decisions. Hence, decisions are made only at the initial stage, and the objective is defined over the possible realizations in the second stage. The objective is to maximize the expected firm value given exposure to (stochastic) cash flows, and we study alternative specifications for the function relating firm value and profits. Specifically, we consider the problem of forming a hedge portfolio from an asset universe of currency forward contracts and interest rate swaps. The asset set contains in total 66 different hedging instruments, made up of 30 forward contracts (3 exchange rates, \(1,\ldots ,10\) years) and 36 interest rate swaps (4 markets, \(2,\ldots ,10\) years). In addition to the currency forwards and interest rate swaps, the firm can also choose to hold cash that grows with the risk-free rate over the 1-week hedging horizon. We use the following sets and indices for the purpose of defining the optimization problem,

Sets:

  • \(\mathcal {C}= \{\text{ EUR }, \text{ GBP }, \text{ SEK }, \text{ USD }\}\)    currencies;

  • \({\mathcal {E}} = \{\text{ EUR/SEK }, \text{ GBP/SEK }, \text{ USD/SEK }\}\)    FX rates.

Indices:

  • \(c \in \mathcal {C}\)    currencies;

  • \(e \in {\mathcal {E}}\)    FX rates;

  • \(\tau \in \left\{ 1,\ldots ,10 \right\} \)    years forward in time;

  • \(i \in \left\{ 1,\ldots ,N \right\} \)    scenarios.
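The asset universe implied by these sets and indices can be enumerated in a few lines of Python. The tuple representation and the variable names below are assumptions made for illustration.

```python
from itertools import product

CURRENCIES = ["EUR", "GBP", "SEK", "USD"]
FX_RATES = ["EUR/SEK", "GBP/SEK", "USD/SEK"]

# Currency forwards with maturities of 1 to 10 years in each FX rate.
forwards = [("forward", e, tau) for e, tau in product(FX_RATES, range(1, 11))]

# Interest rate swaps with maturities of 2 to 10 years in each currency.
swaps = [("swap", c, tau) for c, tau in product(CURRENCIES, range(2, 11))]

instruments = forwards + swaps
```

Because long and short positions are modeled with separate non-negative variables, each instrument corresponds to two decision variables in the model.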

The decision variables of the model are the number of long and short contracts in currency forwards and interest rate swaps. Firm profit is an auxiliary variable which we define as the sum of the hedge portfolio, the present value of uncertain cash flows, and the holdings in cash. Auxiliary variables are also used for the purpose of defining expected shortfall. The notation chosen for the decision and auxiliary variables is,

Decision variables:

  • \(x^{F,L}_{e,\tau }\)    number of long forwards in FX rate e with maturity in \(\tau \) years;

  • \(x^{F,S}_{e,\tau }\)    number of short forwards in FX rate e with maturity in \(\tau \) years;

  • \(x^{I,L}_{c,\tau }\)    number of receiver IRSs in currency c with maturity in \(\tau \) years;

  • \(x^{I,S}_{c,\tau }\)    number of payer IRSs in currency c with maturity in \(\tau \) years.

Auxiliary variables:

  • \(z_i\)    firm profit in scenario i;

  • \(y_i\)    excess loss over VaR (at optimum) in scenario i;

  • \(\zeta _\alpha \)    help variable for calculation of expected shortfall (\(=\)VaR at optimum);

  • \(\nu \)    post-decision holdings in cash.

A currency forward is an agreement on a future transaction that is defined to have zero value at the initiation of the contract, and so forward values need to be calculated only in the second stage. The values of interest rate swaps are modeled as the prices of corresponding coupon bonds, and so have non-zero values also at the initial stage.Footnote 7 All prices are expressed in SEK. For the purpose of determining expected shortfall, we calculate the expected profits at the initial stage from the pre-decision holdings in cash, the value of previously traded contracts, and the expected future cash flows. To summarize we have,

Deterministic parameters:

  • R    risk-free growth rate;

  • \(z_0\)    initial (expected) profit;

  • \(P^{I,L}_{c,\tau ,0}\)    initial price of receiver IRSs in currency c, with maturity in \(\tau \) years;

  • \(P^{I,S}_{c,\tau ,0}\)    initial price of payer IRSs in currency c, with maturity in \(\tau \) years;

  • \(\alpha \)    confidence level in calculation of expected shortfall;

  • h    pre-decision holdings in cash.

The scenario data, which together with the probabilities specify the multivariate distribution of the random variables at the end of the model horizon, are the values of long and short positions in the currency forwards and interest rate swaps, and the present value of the cash flows. We collect the present value of the cash flows and (if they exist) the values of previously traded contracts in a single random variable. Market spreads on swap rates and currency forward prices are included in the model, and create the difference in value between long and short positions. All asset values are expressed in SEK. “Appendix A” presents the exact pricing formulas.

Stochastic parameters:

  • \(p_i\)    probability of scenario i;

  • \(P^{F,L}_{e,\tau ,i}\)    value of long forward in FX rate e, with maturity in \(\tau \) years, in scenario i;

  • \(P^{F,S}_{e,\tau ,i}\)    value of short forward in FX rate e, with maturity in \(\tau \) years, in scenario i;

  • \(P^{I,L}_{c,\tau ,i}\)    value of receiver IRSs in currency c, with maturity in \(\tau \) years, in scenario i;

  • \(P^{I,S}_{c,\tau ,i}\)    value of payer IRSs in currency c, with maturity in \(\tau \) years, in scenario i;

  • \(b_i\)    present value of company cash flows in scenario i.

Motivated by the lack of a single theoretically and empirically well-founded firm objective, we study alternative functional specifications which relate firm profits and value. We define the problem for a general objective, \(f(z_i,y_i)\), and present the instances that we study below. With the given notation, the general optimization problem can be formulated as:

$$\begin{aligned}&\max \quad \sum _{i=1}^N p_i f(z_i,y_i) \end{aligned}$$
(3a)
$$\begin{aligned}&\text{ s.t. } \quad \nu = h - \sum _{c \in \mathcal{C}} \sum _{\tau =2}^{10} \left( P_{c,\tau ,0}^{I,L} x_{c,\tau }^{I,L} - P_{c,\tau ,0}^{I,S} x_{c,\tau }^{I,S} \right) \end{aligned}$$
(3b)
$$\begin{aligned}&z_i = \sum _{c \in \mathcal{C}} \sum _{\tau =2}^{10} \left( P_{c,\tau ,i}^{I,L} x_{c,\tau }^{I,L} - P_{c,\tau ,i}^{I,S} x_{c,\tau }^{I,S} \right) \nonumber \\& + \sum _{e \in \mathcal{E}} \sum _{\tau =1}^{10} \left( P_{e,\tau ,i}^{F,L} x_{e,\tau }^{F,L} - P_{e,\tau ,i}^{F,S} x_{e,\tau }^{F,S} \right) + R \nu + b_i&i = 1, \ldots , N \end{aligned}$$
(3c)
$$\begin{aligned}&x_{c,\tau }^{I,L}, x_{c,\tau }^{I,S} \ge 0&c \in \mathcal{C}, \tau = 2,\ldots , 10\end{aligned}$$
(3d)
$$\begin{aligned}&x_{e,\tau }^{F,L}, x_{e,\tau }^{F,S} \ge 0&e \in \mathcal{E}, \tau = 1,\ldots , 10 \end{aligned}$$
(3e)
$$\begin{aligned}&y \in \mathcal {Y}\end{aligned}$$
(3f)
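Constraints (3b) and (3c) can be evaluated for a given set of positions as in the Python sketch below. As a simplification that is our own assumption, all instruments are stored in one list and currency forwards are given an initial price of zero, so that a single sum covers both swaps and forwards. The function names and numbers are hypothetical.

```python
def cash_after_trading(h, p0_long, p0_short, x_long, x_short):
    """(3b): nu = h - sum(P0_long * x_long - P0_short * x_short).

    Forwards enter with initial price zero, so only swaps affect nu.
    """
    traded = sum(
        pl * xl - ps * xs
        for pl, ps, xl, xs in zip(p0_long, p0_short, x_long, x_short)
    )
    return h - traded


def scenario_profit(values_long, values_short, x_long, x_short, growth, nu, b):
    """(3c) for one scenario: hedge value, cash grown at R, plus b_i."""
    hedge = sum(
        vl * xl - vs * xs
        for vl, vs, xl, xs in zip(values_long, values_short, x_long, x_short)
    )
    return hedge + growth * nu + b


# Toy instance (assumed): index 0 is a swap, index 1 is a forward.
x_long, x_short = [2.0, 0.0], [0.0, 3.0]        # non-negative, as in (3d)-(3e)
p0_long, p0_short = [10.2, 0.0], [10.0, 0.0]    # spread between long and short

nu = cash_after_trading(100.0, p0_long, p0_short, x_long, x_short)
z_i = scenario_profit([10.5, 1.0], [10.3, 1.2], x_long, x_short, 1.001, nu, 50.0)
```

Keeping separate price vectors for long and short positions is what lets the market spread act as a transaction cost in the model.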

We determine the cash holding after transactions and the profits in the different scenarios by constraints (3b) and (3c) respectively. To model the transaction costs, we use separate variables for long and short positions, which from their construction are required to be non-negative by (3d) and (3e). Finally, we use the set \(\mathcal {Y}\) in (3f) to model expected shortfall. We study three instances of the optimization model in (3), where we use different functional specifications of the objective.Footnote 8 First, the minimum variance hedge is obtained by setting

$$\begin{aligned} f(z_i) = - \left( z_i - \bar{z} \right) ^2, \end{aligned}$$
(4)

where \(\bar{z}\) is the sample mean of \(z_1,\ldots ,z_N\). Second, we study minimization of expected shortfall. As shown in Rockafellar and Uryasev (2000), this problem can be formulated as a linear program, which with our notation yields

$$\begin{aligned} f(z_i,y_i) = -\left( \zeta _{\alpha } + \frac{1}{1-\alpha } y_i \right) , \end{aligned}$$
(5)

and

$$\begin{aligned} \mathcal {Y}= \left\{ y_i | y_i = \max \left( z_0 R - z_i - \zeta _{\alpha }, 0 \right) ,\quad i = 1,\ldots ,N \right\} , \end{aligned}$$
(6)

where we have defined the loss relative to the initial wealth, \(z_0\), capitalized by the risk-free rate. In addition to these two popular measures of risk, we study the maximization of expected logarithmic profits. Being strictly concave on its whole domain, this objective function may potentially capture the aggregate impact from all financial frictions affecting the firm. Finally, we study the maximization of expected logarithmic profits given constraints on expected shortfall. Through a Lagrangian relaxation, this problem can be equivalently formulated as a trade-off between expected logarithmic profits and expected shortfall by defining

$$\begin{aligned} f(z_i,y_i) = \lambda \ln (z_i) - (1 - \lambda ) \left( \zeta _{\alpha } + \frac{1}{1-\alpha } y_i \right) , \end{aligned}$$
(7)

and with \(\mathcal {Y}\) defined as in (6). Note that \(\lambda = 0\) gives minimization of expected shortfall and \(\lambda = 1\) maximization of expected logarithmic profits. The cases with \(0< \lambda < 1\) correspond to the maximization of expected logarithmic profits given constraints on expected shortfall.
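As a rough illustration of (5)–(7), the sketch below evaluates the shortfall term and the trade-off objective on a given sample of scenario profits. It assumes equally weighted scenarios and that the objective is the scenario average of \(f\). In the optimization model \(\zeta _{\alpha }\) is a decision variable, whereas here it is either passed in or found by enumeration. All function names and toy numbers are hypothetical and not taken from the study.

```python
import numpy as np

def shortfall_term(z, zeta, alpha, z0, R):
    """Sample average of zeta + y_i / (1 - alpha), with y_i defined as in (6)."""
    loss = z0 * R - np.asarray(z, dtype=float)   # loss relative to capitalized initial wealth
    y = np.maximum(loss - zeta, 0.0)
    return zeta + y.mean() / (1.0 - alpha)

def expected_shortfall(z, alpha, z0, R):
    """Minimize the shortfall term over zeta by enumeration.

    The term is piecewise linear and convex in zeta, so for 0 < alpha < 1
    a minimizer can be found among the sample losses (illustrative shortcut,
    an LP solver would be used in practice).
    """
    losses = z0 * R - np.asarray(z, dtype=float)
    return min(shortfall_term(z, zeta, alpha, z0, R) for zeta in losses)

def tradeoff_objective(z, zeta, alpha, lam, z0, R):
    """Sample average of f in (7): lam * ln(z_i) - (1 - lam) * shortfall term."""
    z = np.asarray(z, dtype=float)
    return lam * np.log(z).mean() - (1.0 - lam) * shortfall_term(z, zeta, alpha, z0, R)
```

Setting `lam=0` reduces the objective to the negative shortfall term and `lam=1` to the mean logarithmic profit, which mirrors the two limiting cases noted above.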

4.2 Scenario generation

Scenario generation produces a set of future states for the stochastic parameters in the stochastic programming model, and is used to represent the uncertainty inherent in the model. The stochastic parameters in the stochastic programming model are determined by the interest rate term structures in the four markets, the three exchange rates with SEK as the term currency, and the project cash flows. With three risk factors on each interest rate market, we have in total 16 random variables made up of 12 state variables, 3 exchange rates, and the order size representing the uncertainty in the project cash flows. Scenarios are generated by Monte Carlo simulation with a weekly discretization of the assumed continuous stochastic processes, where a Gaussian copula is used to preserve the historical correlation of the uncertain parameters. The modeling choices and the parameter estimation for the interest, foreign exchange, and cash flow models are discussed below, and a detailed mathematical description of the scenario generation is given in “Appendix B”.
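To make the role of the Gaussian copula concrete, the following minimal sketch draws dependent uniform variates from a given correlation matrix. It is an illustration under our own assumptions, not the procedure of “Appendix B”: the function name, the seed handling, and the use of a Cholesky factor are hypothetical choices.

```python
import math
import numpy as np

def gaussian_copula_uniforms(corr, n_scenarios, seed=0):
    """Draw uniforms whose dependence follows a Gaussian copula with correlation `corr`."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(np.asarray(corr, dtype=float))
    # Correlated standard normals: each row has covariance chol @ chol.T = corr.
    normals = rng.standard_normal((n_scenarios, len(corr))) @ chol.T
    # Map each normal to a uniform through the standard normal cdf.
    std_normal_cdf = np.vectorize(lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))))
    return std_normal_cdf(normals)
```

Each column of the result could then be passed through the inverse marginal cdf of the corresponding risk factor, so that the marginal models are kept while the dependence is taken from the correlation matrix.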

4.2.1 Interest rate modeling

We need an interest rate model that allows us to represent the uncertainty in the term structure while capturing the important term premia. In the choice of model, a first natural requirement is arbitrage-free modeling. Affine term structure models allow this while remaining analytically tractable. An important contribution in the class of affine term structure models is the essentially affine model proposed by Duffee (2002), which is a model carefully specified to be consistent with empirical properties of term premia. Compared to the standard class of affine term structure models, the essentially affine model of Duffee (2002) improves the forecasting potential of term premia by a more flexible specification of the market prices of risk. Within this framework, we are able to model the term structure with term premia included, while still allowing modeling which uses market prices.Footnote 9

Model instance In the class of affine term structure models, a large number of model specifications are possible. As in Duffee (2002), we choose to model three state variables. This choice is motivated by the findings in Litterman and Scheinkman (1991) who show that approximately 95–99% of term structure movements can be explained by only three factors. Among the class of essentially affine models with three state variables, the Gaussian model, in which no state variables affect the instantaneous variance of the state variable vector, offers the greatest flexibility for the estimation of time varying term premia. The Gaussian model is referred to as the \(m=0\) model in Duffee (2002), where results from a statistical comparison of models estimated with three state variables verify that this model is the best at fitting time varying term premia. By imposing normalization on the Gaussian model as in Duffee (2002), which follows the setup of the canonical completely affine models in Dai and Singleton (2000), the dynamics of the model instance used in this study can be written as

$$\begin{aligned} P(X_t,\tau )&= e^{A(\tau ) - B(\tau )^{\prime } X_t}, \end{aligned}$$
(8)
$$\begin{aligned} r_t&= \delta _0 + \delta ^{\prime } X_t, \end{aligned}$$
(9)
$$\begin{aligned} dX_t&= - K X_t dt + dW_t, \end{aligned}$$
(10)

where \(P(X_t,\tau )\) is the zero coupon bond price with time to maturity \(\tau \), \(r_t\) the short rate, and \(X_t \equiv \left( X_{t,1}, X_{t,2}, X_{t,3}\right) ^{\prime }\) the state-price vector at time t. In (8), \(A(\tau )\) and \(B(\tau )\) are scalar and vector valued functions respectively. The short rate process is affine in the state-price vector \(X_t\), with \(\delta _0\) denoting a scalar and \(\delta \in \mathbb {R}^3\). The mean-reversion of the state-price dynamics in (10) is governed by the lower triangular matrix \(K \in \mathbb {R}^{3 \times 3}\), while uncertainty is generated by a three dimensional Brownian motion, \(W_t \equiv \left( W_{t,1},W_{t,2},W_{t,3} \right) ^{\prime }\), under the real-world probability measure.Footnote 10
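The dynamics in (9) and (10) can be simulated with a simple Euler scheme on a weekly grid. The sketch below is only an illustration: the Euler discretization, the function name, and any parameter values passed to it are assumptions and do not correspond to the estimates reported in this study.

```python
import numpy as np

def simulate_short_rate(K, delta0, delta, x0, n_steps, dt=1.0 / 52.0, seed=0):
    """Euler scheme for dX = -K X dt + dW with short rate r = delta0 + delta' X."""
    rng = np.random.default_rng(seed)
    K = np.asarray(K, dtype=float)
    delta = np.asarray(delta, dtype=float)
    x = np.asarray(x0, dtype=float).copy()
    rates = np.empty(n_steps + 1)
    rates[0] = delta0 + delta @ x
    for step in range(1, n_steps + 1):
        dw = rng.standard_normal(x.size) * np.sqrt(dt)   # Brownian increment over dt
        x = x - K @ x * dt + dw                          # mean reversion towards zero
        rates[step] = delta0 + delta @ x
    return rates
```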

Interest rate data Interest rate curves are estimated for Euro (EUR), Pound Sterling (GBP), Swedish Krona (SEK) and US Dollar (USD). Table 1 presents the contracts used to estimate yield curves consistent with market prices.

Table 1 Interest rate data collected from Thomson Reuters Eikon

To estimate stable interest rate curves we use the method of Blomvall (2017),

$$\begin{aligned} \begin{array}{ll} \displaystyle \min _{f(t),z_e} &{} h(f(t)) + \frac{1}{2} z_e^\prime E_e z_e \\ \text{ s.t. } &{} g_e(f(t)) + z_e = \rho \\ &{} f(t) \ge 0 \end{array} \end{aligned}$$
(11)

where a smooth forward rate curve f(t) is estimated, and where deviations, \(z_e\), from market yields, \(\rho \), are penalized with the diagonal weighting matrix \(E_e\). The roughness is measured by h, and \(g_{e,j}\) is a function which transforms the forward rate curve f(t) to a yield for instrument j. The forward rate curve is discretized with one forward rate per day, \(\varDelta _\tau = 1/365\), and roughness is measured as a numerical approximation of the integral of the squared second order derivative of f(t),

$$\begin{aligned} h(f) = \frac{1}{2} \sum _{\tau =1}^{n_f-2} \varphi _\tau \left( \frac{2}{\varDelta _{\tau -1} + \varDelta _\tau } \left( \frac{f_{\tau +1} - f_\tau }{\varDelta _\tau } - \frac{f_\tau - f_{\tau -1}}{\varDelta _{\tau -1}} \right) \right) ^2 \frac{\varDelta _{\tau -1} + \varDelta _\tau }{2}, \end{aligned}$$
(12)

where \(n_f\) is the number of (discretized) forward rates. The main difference to traditional term structure estimation methods is that here, the infinite-dimensional optimization problem is discretized, instead of constraining the forward rate curve to specific parameterizations such as the quartic spline. With a daily discretization, all instruments can be dealt with exactly, and the behavior of the forward rate term structure can be handled by choosing \(\varphi _\tau \). The problem is a large-scale non-linear optimization problem which can be solved efficiently by algorithms such as that of Manzano and Blomvall (2004). The algorithm converges to the same optimum from all of the many starting solutions tried, which indicates that the global optimum is identified.
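The roughness measure in (12) is straightforward to evaluate for a discretized forward rate curve. The sketch below is a direct transcription under the assumption that the forward rates are indexed \(f_0,\ldots ,f_{n_f-1}\) with step lengths \(\varDelta _0,\ldots ,\varDelta _{n_f-2}\). The function and argument names are hypothetical.

```python
import numpy as np

def roughness(f, step_lengths, phi=None):
    """Discrete roughness h(f) as in (12).

    f            : forward rates f_0, ..., f_{n_f - 1}
    step_lengths : spacing between consecutive forward rates (length n_f - 1)
    phi          : weights for the interior points (defaults to 1 everywhere)
    """
    f = np.asarray(f, dtype=float)
    d = np.asarray(step_lengths, dtype=float)
    if phi is None:
        phi = np.ones(f.size - 2)
    right_slope = (f[2:] - f[1:-1]) / d[1:]
    left_slope = (f[1:-1] - f[:-2]) / d[:-1]
    avg_step = 0.5 * (d[:-1] + d[1:])
    second_derivative = (right_slope - left_slope) / avg_step
    # Squared second derivative, weighted and integrated over the grid.
    return 0.5 * np.sum(phi * second_derivative**2 * avg_step)
```

A linear forward rate curve has zero roughness under this measure, while any curvature is penalized in proportion to its squared second derivative.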

To estimate forward rate curves of high quality, a careful choice of the parameter values is generally required. The forward rate curve is more variable in the short end, and to capture this, a time varying \(\varphi _\tau \) is usually preferred. However, for this application, the simple constant choice \(\varphi _\tau = 1\) is sufficient, since only yearly spot rates are required and the large number of contracts in the short end makes the forward rate curve more variable in that region. The weighting of the pricing errors should depend on the amount of noise in the market prices. For the forward rate agreement and interest rate swap market, the noise is of similar level for all instruments, which makes equal weighting appropriate. If too much emphasis is on repricing the instruments, unrealistic forward rate curves will be created. If too little emphasis is on repricing the instruments there will be systematic pricing errors in the instruments, as the forward rate curve becomes too rigid. In extensive tests the choice \(E_e = 100 I\), where I is the identity matrix, has proven to be a good choice that creates realistic forward rate curves that do not contain systematic pricing errors.

Parameter estimation

Following Duffee (2002), we estimate parameters using quasi-maximum likelihood (QML). The estimation is implemented by assuming that yields with maturities 3 months, 5 years and 10 years are measured without error, while yields with maturities 6 months, 3 years and 7 years are assumed to be measured with error. The choice of maturities is motivated by the desire to spread the observations evenly over the term structure.

The QML estimation technique produces a non-convex optimization problem with a large number of local maxima. We implement the optimization problem in AMPL, using IPOPT to determine optimal parameters. Following Duffee (2002), starting solutions are randomly generated from a multivariate normal distribution with diagonal covariance matrix, and with mean and variances set to ’plausible’ values. Using this technique, 10,000 starting solutions are generated for each market. The 20 starting solutions with the highest QML values are then used for optimization in AMPL. The solution on each market with the highest QML value, after optimization in AMPL, is used as model parameters. Table 2 reports estimated parameters for the Gaussian essentially affine term structure models along with the QML values for each market.

Table 2 Parameters from quasi maximum likelihood estimation of the normalized Gaussian essentially affine term structure model in (8)–(10)

We identify four main sources of difficulties in estimating the parameters. First, the QML optimization problem is non-convex, giving rise to a large number of local maxima, some of which produce unrealistic model properties. Different starting solutions are likely to produce diverse parameter estimates with similar QML values but significantly different model properties. We handle this by excluding parameter estimates that produce unrealistic model properties in terms of expected future interest rates, as discussed below. Second, the estimation method used, following Duffee (2002), where three yields (\(n=3\)) are assumed to be measured without error, while the other yields in the data set are assumed to be measured with error, is not economically well justified. Different choices of the set of trusted yields produce different parameter estimates, and there is no justification for a certain choice of the set of trusted yields. However, the estimation method provides a way of taking the information contained in more than n yields into consideration. Due to the risk of an ill-conditioned measurement error covariance matrix, the number of yields with measurement error included is, as in Duffee (2002), restricted to three. Third, the length and the quality of the raw data are critical for the parameter estimation. We use weekly observations of interest rate data on the four markets over a period of approximately 16 years, compared to the monthly observations of U.S. data over a period of almost 47 years used by Duffee (2002) (which is based on McCulloch and Kwon 1993; Bliss 1996). Fourth and finally, the structure of the mean reverting Gaussian essentially affine term structure model implies a long run expected short rate. This varies significantly across local maxima and is independent of the current state, making the expected short rate estimation critically dependent on the sample period used, as discussed below.
The expected short rate at time T conditional on \(r_t\) follows from (9) as

$$\begin{aligned} E[r_T|r_t] = \delta _0 + \delta ^{\prime } E[X_T|X_t], \end{aligned}$$
(13)

where the conditional expectation of \(X_T\) given \(X_t\), derived in Duffee (2002), is given by

$$\begin{aligned} E[X_T|X_t] = \exp {(-K(T-t))}X_t, \end{aligned}$$
(14)

and where \(\exp {(-K(T-t))}\) is the matrix exponential. Stationarity requirements on the feedback matrix K, i.e., requirements of positive eigenvalues for K, imply that \(E[X_T|X_t] \rightarrow 0\) as \(T \rightarrow \infty \). From (13) and (14) we then obtain that \(E[r_T|r_t] \rightarrow \delta _0\) as \(T \rightarrow \infty \). To guarantee realistic model properties, the long run expected short rate is used as a filter in the selection of starting solutions. A comparison of expected future short rates and forward rates for EUR, GBP, SEK and USD on July 26th, 2013, as implied by the parameter estimates in Table 2, is presented in Fig. 3.
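The convergence of the expected short rate to \(\delta _0\) can be checked numerically from (13) and (14). The sketch below computes the matrix exponential by eigendecomposition, which assumes that K is diagonalizable (for instance lower triangular with distinct diagonal entries). The function name and any parameter values used with it are hypothetical and are not the estimates in Table 2.

```python
import numpy as np

def expected_short_rate(K, delta0, delta, x_t, horizon):
    """E[r_T | X_t] = delta0 + delta' exp(-K (T - t)) X_t, following (13)-(14).

    Assumption: K is diagonalizable, so exp(-K h) = V diag(exp(-eig * h)) V^{-1}.
    """
    K = np.asarray(K, dtype=float)
    eigvals, eigvecs = np.linalg.eig(K)
    exp_minus_K = eigvecs @ np.diag(np.exp(-eigvals * horizon)) @ np.linalg.inv(eigvecs)
    value = delta0 + np.asarray(delta, dtype=float) @ exp_minus_K @ np.asarray(x_t, dtype=float)
    return float(np.real(value))
```

With positive eigenvalues of K, evaluating the function for increasing horizons shows the state-dependent term dying out, so that the result approaches `delta0` regardless of `x_t`.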

Fig. 3

The forward rate curves (\(\%\)) in a–d are calculated from observed market prices of FRAs and IRSs on July 26th, 2013 using the model in (11), with \(E_e=100I\) and \(\xi _t=1/365\). Expected short rates are given by (13) using the parameter estimates in Table 2

The main point of interest in Fig. 3 is whether the expected short rates, or the term premia, represented by the difference between forward rates and expected short rates, are reasonable representations of market expectations. There is no universal answer to this question, but there is empirical evidence to help us understand the properties of term premia. The documentation of term premia, which is closely connected to tests of the expectations hypothesis, is a well-investigated area of research. Early references documenting the existence of term premia and rejecting the expectations hypothesis for the U.S. market include Fama and Bliss (1987) and Campbell and Shiller (1991). However, for the short end of the yield curve, there is evidence for term premia being close to zero (see e.g. Longstaff 2000). In a cross-country study by Wright (2011), term premia are documented for ten industrialized countries, with the four markets in this study included.Footnote 11 Wright (2011) finds that term premia have declined globally over the period 1990–2009, and even turned negative in some countries. The term premia decline is to a large extent explained by monetary policy changes during the sample period, which have reduced the uncertainty about future inflation. The supporting evidence for a global movement from high to low interest rate economies with declining term premia, suggests weaknesses in the model defined in Duffee (2002) when using a sample period covering different monetary regimes. The long-run expected short rate, \(\delta _0\), is sensitive to the sample period, as it acts as a mean reverting level for the chosen estimation period. This motivates using the level of \(\delta _0\) as a filter to reduce the effect of the identified model weakness. To summarize, empirical evidence suggests that the forward rate and expected short rate curves in Fig. 3 should almost coincide for short term maturities. However, for long term maturities, non-zero term premia have empirical support. The results produced by the essentially affine term structure model imply that only USD shows a close match between forward rates and expected short rates in the short end. For longer maturities, all markets have positive term premia, expressed as the difference between the forward rates and expected short rates.

4.2.2 Foreign exchange rate modeling

The literature on empirical properties of foreign exchange rates provides evidence that (at least) two important properties should be considered for the purpose of foreign exchange risk management. First, as established in e.g. Westerfield (1977), and later Osler and Savaser (2011), the return distributions of foreign exchange rates exhibit excess kurtosis, i.e. they possess fat tails compared to the normal distribution. Second, the carry trade literature documents that the distributions of foreign exchange rates are conditionally skewed, with significant negative skewness for high-interest-rate currencies and positive skewness for low-interest-rate currencies (see e.g. Brunnermeier et al. 2008). Motivated by these empirical facts, we model the foreign exchange rates by discrete time Poisson jump-diffusion models with stochastic volatility. The dynamics of foreign exchange rate \(e \in {\mathcal {E}}\), \(f_{e,t}\), are given by

$$\begin{aligned} \begin{aligned} f_{e,t+\varDelta t}&= f_{e,t} \exp {\left( \mu _e \varDelta t + \sigma _{e,t} \epsilon _{e,t} \sqrt{\varDelta t} + \alpha _e \sigma _{e,t} \sqrt{z_{e,t}} \xi _{e,t} \sqrt{\varDelta t} \right) }, \\ \sigma _{e,t + \varDelta t}^2&= \beta _{e,0} + \beta _{e,1} \sigma _{e,t}^2 + \beta _{e,2} \frac{\left( \ln \frac{f_{e,t+\varDelta t}}{f_{e,t}} - \gamma _e \varDelta t \right) ^2}{\varDelta t}, \end{aligned} \end{aligned}$$
(15)

where the random variables \(\epsilon _{e,t}, \xi _{e,t} \sim N(0,1)\), \(z_{e,t} \sim Po(\lambda _e \varDelta t)\), and \(\alpha _e\) represents the relative impact of a Poisson jump. The Poisson process describes the arrival of an uncorrelated normally distributed random variable with standard deviation \(\alpha _e \sigma _{e,t} \sqrt{\varDelta t}\), which creates fat tails. The skewness of returns derives from the asymmetric update of variance with \(\gamma _e \varDelta t\). The (conditional) cumulative distribution function (cdf) for the logarithmic returns of exchange rate \(e \in {\mathcal {E}}\) over \([t,t+\varDelta t]\) is given by

$$\begin{aligned} F_{e,t}(x) = \sum _{k=0}^{\infty } F_N\left( \frac{x - \mu _e \varDelta t}{\sigma _{e,t} \sqrt{(1 + \alpha _e^2 k)\varDelta t} }\right) \frac{(\lambda _e \varDelta t)^k}{k!} \exp \left( -\lambda _e \varDelta t \right) , \end{aligned}$$
(16)

where \(F_N\) is the standard normal cdf.
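The cdf in (16) is a Poisson-weighted mixture of normal cdfs and can be evaluated by truncating the infinite sum. The sketch below is a minimal illustration: the truncation at 50 terms and the function name are assumptions, and the parameter values used in any call are hypothetical rather than the estimates of this study.

```python
import math

def jump_diffusion_cdf(x, mu, sigma, alpha, lam, dt, n_terms=50):
    """Truncated evaluation of the Poisson mixture cdf in (16)."""
    jump_intensity = lam * dt
    total = 0.0
    for k in range(n_terms):
        # Conditional on k jumps the log-return is normal with inflated variance.
        std = sigma * math.sqrt((1.0 + alpha**2 * k) * dt)
        normal_cdf = 0.5 * (1.0 + math.erf((x - mu * dt) / (std * math.sqrt(2.0))))
        poisson_weight = math.exp(-jump_intensity) * jump_intensity**k / math.factorial(k)
        total += normal_cdf * poisson_weight
    return total
```

With `lam=0` the function reduces to a normal cdf, and for positive jump intensity the extra mixture components put more probability mass in the tails.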

We use maximum likelihood estimation to determine the parameters of the foreign exchange rate processes in (15). Due to non-convexities, we have solved the problem for a large number of starting solutions to obtain the parameter estimates in Table 3. An interesting point to take from the estimation results is that \(\gamma _e\) is negative for all \(e \in {\mathcal {E}}\). This implies that all exchange rates, expressed with SEK as the term currency, are estimated to have positive skewness. The implication of this is that periods in which SEK weakens coincide with increased volatility. During stressed market conditions there tends to be an increased flow of capital to large currencies (safe havens), which provides economic intuition for these results. The estimated values of \(\alpha _e\) for EUR/SEK and USD/SEK in Table 3 imply that one Poisson jump corresponds to an increased volatility (on that particular day) of approximately \(\sqrt{1+1.38^2} - 1 \approx 70\% \). The parameters \(\lambda _e\) represent the expected number of jumps per year, which is approximately 14 for EUR/SEK and GBP/SEK, and just over 3 for USD/SEK.

Table 3 Parameters for the Poisson jump-diffusion models in (15), which have been determined using maximum likelihood estimation on weekly data over the period from September 19th, 1997 to July 26th, 2013

From a quantile–quantile plot we can investigate how well the Poisson jump-diffusion model fits historical data. Figure 4 illustrates the improved fit when using the Poisson jump-diffusion model compared to a log-normal distribution, exemplified by the EUR/SEK exchange rate.

Fig. 4

Quantile–quantile (QQ) plots constructed from weekly data for EUR/SEK over the period from September 19th, 1997 to July 26th, 2013. a The QQ-plot for the historical log-returns. b Illustrates the fit of the Poisson jump-diffusion model and is constructed by a transformation of historical log-returns using the Poisson jump-diffusion cdf in (16), followed by an inverse transformation using the standard normal cdf

Panel (a) of Fig. 4 clearly illustrates the fat tails usually observed in foreign exchange rates, and we can see from panel (b) that the Poisson jump-diffusion model provides a good description of historical EUR/SEK movements. Based on quantile–quantile plots, the empirical distributions of GBP/SEK and USD/SEK are also significantly better described by the Poisson jump-diffusion model than by the log-normal distribution.

4.2.3 Cash flow modeling

The mechanism underlying uncertainty in the cash flows is the update of the order-size forecast provided by the customer. We model the change in the order-size forecast as a log-normally distributed random variable. This scales the cash flow forecast in currency \(c \in \mathcal {C}\) for time \(\tau \in {\mathcal {T}}\) from project \(j=1,\ldots ,M\) as

$$\begin{aligned} C_{c,t+\varDelta t,\tau ,j} = C_{c,t,\tau ,j} \exp \left( -\frac{\sigma ^2_{\tau ,j}}{2} \varDelta t + \sigma _{\tau ,j} \sqrt{\varDelta t} \xi _{\tau ,j} \right) \end{aligned}$$
(17)

over \([t,t+\varDelta t]\), where \(\sigma _{\tau ,j}\) is the order-size volatility, and \(\xi _{\tau ,j} \sim N(0,1)\). The present value of (expected) future cash flows can be determined as

$$\begin{aligned} b =&\sum _{\tau \in {\mathcal {T}}} d_{\text {SEK},t+\varDelta t,\tau } \left( D_{\text {SEK},t,\tau } + C_{\text {SEK},t+\varDelta t,\tau } \right) \\&+ \sum _{e \in {\mathcal {E}}} f_{e,t+\varDelta t} \sum _{\tau \in {\mathcal {T}}} d_{c_b(e),t+\varDelta t,\tau } \left( D_{c_b(e),t,\tau } + C_{c_b(e),t+\varDelta t,\tau } \right) , \end{aligned}$$

where \(D_{c_b(e),t,\tau }\) are the (deterministic) cash flows resulting from previously traded hedging instruments.
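Returning to the forecast update in (17), the drift correction \(-\sigma ^2_{\tau ,j} \varDelta t / 2\) keeps the expected cash flow unchanged, so that the update only adds uncertainty around the current forecast. The sketch below illustrates this for a single cash flow. The function name, the seed handling, and the numbers used in any call are hypothetical.

```python
import numpy as np

def update_forecast(cash_flow, sigma, dt, n_scenarios, seed=0):
    """Scenario values of a cash flow forecast after one log-normal update as in (17)."""
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal(n_scenarios)
    # The term -0.5 * sigma^2 * dt makes the multiplier have expectation one.
    return cash_flow * np.exp(-0.5 * sigma**2 * dt + sigma * np.sqrt(dt) * xi)
```

Averaging the simulated values over many scenarios should therefore return approximately the original forecast, while `sigma=0` reproduces the deterministic case.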

5 Numerical results

In this section we present numerical results based on the optimization framework in Fig. 2. We start by describing details of the problem instances studied, and then discuss properties of the optimal hedging decisions resulting from in-sample analysis. This is followed by an investigation of the out-of-sample performance for the model.

5.1 Problem details

The aim of the numerical experiments is to investigate properties of the optimal hedge from the stochastic programming model in (3) given the alternative specifications of the objective function in (4), (5), and (7), and with variations in model parameters. We use the in-sample analysis to investigate the properties of the optimal hedge given different firm objectives, varying asset universe, and with and without uncertainty in the cash flows. The aims of the out-of-sample analysis are to test if the results obtained in-sample can be validated, and to investigate the robustness of the model. For the in-sample as well as the out-of-sample analysis, we consider the following variations of model specifications and parameters:

  • objective function defined in terms of variance, (4); expected shortfall (given \(95\%\) confidence level), (5); and mean log project value given constraints on expected shortfall, (7);

  • a single project (\(M=1\)), and order size volatility, \(\sigma _{\tau ,1} = 0\), or \(\sigma _{\tau ,1} = 0.05\) \(\forall \tau \in \mathcal {T}\);

  • asset universe with all assets (66 contracts), 1-year currency forwards (3 contracts), and 1-year currency forwards with 5-year IRSs (7 contracts).

The in-sample analysis is based on optimal hedging decisions determined on July 26th, 2013, with a 1-week hedging horizon. The out-of-sample analysis starts with the optimal solutions from the in-sample analysis and runs over 191 weeks ending on March 27th, 2017. Parameters for the stochastic processes are estimated using weekly data from September 19th, 1997 to July 26th, 2013. Hence, parameter estimates are kept fixed over the whole out-of-sample period. The only exceptions are the GARCH volatilities in the Poisson jump-diffusion models for the foreign exchange rates, which are updated on the basis of realized market returns.

All trading in the hedging instruments induces transaction costs, modeled as a spread of 80 pips (‘percentage in point’) for the currency forwards, and 2 basis points for the interest rate swaps.Footnote 12 We assume that the case company starts with unhedged cash flows and no cash holdings. The initial project value, corresponding to the present value of expected future cash flows, is 1334 million SEK (MSEK) at the date of the (first) hedging decision. We use 10,000 scenarios in all studied hedging problems, a number chosen to maintain simulation efficiency while ensuring in-sample and out-of-sample stability as defined in Kaut et al. (2007).Footnote 13 For the purpose of reducing the variance in the scenarios, we sample with the latin hypercube technique and use antithetic variates (see e.g. Glasserman 2013). We use MATLAB for scenario generation, computations, and data handling, while the optimization problems are modeled in AMPL and solved with CPLEX and IPOPT.
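One possible way to combine latin hypercube sampling with antithetic variates for standard normal draws is sketched below in Python. The study itself uses MATLAB, and the exact combination of the two techniques is not specified here, so the construction, the function name, and the clipping constant are assumptions made for illustration only.

```python
from statistics import NormalDist
import numpy as np

def lhs_antithetic_normals(n_pairs, n_dims, seed=0):
    """Standard normal draws from a Latin hypercube sample, mirrored into antithetic pairs."""
    rng = np.random.default_rng(seed)
    # One uniform draw inside each of n_pairs equal-probability strata, per dimension.
    strata = (np.arange(n_pairs)[:, None] + rng.random((n_pairs, n_dims))) / n_pairs
    # Shuffle the strata independently in each dimension.
    uniforms = np.column_stack([rng.permutation(strata[:, j]) for j in range(n_dims)])
    # Guard against the (practically impossible) endpoint 0 before inverting the cdf.
    uniforms = np.clip(uniforms, 1e-12, 1.0 - 1e-12)
    normals = np.vectorize(NormalDist().inv_cdf)(uniforms)
    return np.vstack([normals, -normals])   # second half holds the antithetic counterparts
```

By construction the sample mean is zero in every dimension, and each dimension is covered evenly across its probability strata, which is the variance reduction effect sought.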

5.2 In-sample analysis

We begin the in-sample analysis by studying the situation faced by the risk manager. This is illustrated in Fig. 5. The blue-yellow bars in panel (e) [and (f)] show the expected cash flows in the four currencies at the time of the hedging decision in units of respective currency. Panels (a–d) illustrate the distribution of the cash flows over a 1-week horizon implied by uncertainty in exchange rates, interest rates, and cash flows. The distributions are presented in terms of the present value in SEK and represent the set of scenarios used in the in-sample analysis.

Fig. 5

Project and hedge portfolio cash flows. a–d The distributions of the present value of the cash flows (in SEK) over a 1-week horizon given stochastic cash flows. The central mark in the boxes is the median, the bottom and top edges represent the 25th and 75th percentiles, and the \(+\) signs beyond the outer marks represent ‘outliers’ (see MATLAB, boxplot()). e, f The expected cash flows in the four currencies (blue-yellow bars) along with the cash flows resulting from the optimal hedge portfolios (variance—black, expected shortfall—red), given deterministic and stochastic cash flows respectively

In the first numerical experiment, we compare the optimal hedge portfolios resulting from the minimization of variance and expected shortfall, given an asset universe with all 66 hedging instruments, and with and without uncertainty in the project cash flows. In this first numerical test, we add position constraints which limit the risk manager from taking both long and short positions in the same contracts.Footnote 14 The motivation for the position constraints is the fact that variance is location-invariant. This implies that the objective is indifferent to the level of the mean, and hence it may be optimal to “destroy” project value by taking long and short positions in the same contracts.Footnote 15 On the other hand, the location-invariance makes variance a suitable benchmark measure in the model set-up with premia and transaction costs, as it is indifferent to the cost of hedging. To highlight structural differences in the two hedging strategies, we present the cash flows resulting from the optimal hedge portfolios together with the project cash flows in panels (e) and (f) of Fig. 5, for the case with deterministic and stochastic cash flows respectively. We can see that for both cases, the variance strategy (black) implies (approximate) cash flow matching, while the ES strategy (red) produces (partly) off-setting exposure only for a subset of the cash flows.

Another way to investigate the properties of the hedging problems studied is to analyze the distributions of the project values. These are presented in Fig. 6. Panels (a) and (c) show the distributions for the unhedged project given deterministic and stochastic cash flows respectively. We see that the risk for the project, if left unhedged, is significant for both cases. The standard deviations are 2.64 and \(2.73 \%\) over the 1-week horizon for the deterministic and stochastic cases respectively. We can also observe that the shape of the distribution changes only slightly as uncertainty (\(\sigma _{\tau ,j} = 5\%\), yearly) is added to the project cash flows. We next examine the distributions implied by the optimal hedge portfolios. These are presented in panels (b) and (d) for the cases with deterministic and stochastic cash flows respectively. Starting with panel (b), we see that the variance strategy is able to eliminate all variation in project value. Given deterministic cash flows, this result is expected as long as the asset universe is rich enough to allow perfect cash flow matching. Examining the ES strategy, we see that the resulting distribution has a non-zero variance, but importantly, a higher project value than the variance strategy in all scenarios. For the case with stochastic cash flows presented in panel (d), neither of the strategies can eliminate all variation in project value. The distribution implied by the variance strategy shows a slightly lower dispersion in project values but is again shifted to the left, relative to the ES strategy, and hence produces a hedge with higher expected costs.

The analysis of cash flow structure and project value distribution undertaken so far suggests that the ES strategy is selective in the choice of hedging instruments, and that the expected cost is an important criterion in determining the optimal hedge. On the other hand, the variance strategy seems to produce hedge portfolios that are more costly, but as expected, imply lower variation in project value.

Fig. 6

Distributions of unhedged and optimally hedged cash flows. a, c The distributions of project value (present value of expected cash flows in SEK) in 1 week given unhedged cash flows, without and with uncertainty in the cash flows respectively. b, d The distributions resulting from minimization of variance and expected shortfall

To further investigate properties of the optimal hedge portfolios, we present in Table 4 information about risk exposure broken down into risk factors, the number of (unique) instruments in the hedge portfolios, and the change in project value implied by the optimal hedges.

Table 4 Risk exposure statistics for the unhedged (UH), variance (Var), and expected shortfall (ES) strategies

The hedgeable risk in the problems studied is composed of currency and interest rate risk. A majority of the interest rate risk can be attributed to shifts in the yield curve and is commonly measured as duration. The currency risk can be eliminated simply by matching the initial exposure, and the risk coming from shifts in the yield curve is hedged by creating a portfolio with off-setting duration. We present properties of the optimal hedge portfolios resulting from variance (Var), and ES minimization along with the unhedged case (UH), i.e. the project cash flows, in Table 4. To capture the risk from shifts in the yield curve, we study the Fisher–Weil durationFootnote 16 which is defined as

$$\begin{aligned} D = \frac{1}{P} \sum _{i=1}^{N} t_i c_i e^{-y_i t_i}, \end{aligned}$$

where \(c_i\) is the cash flow at time \(t_i\), \(y_i\) the corresponding continuously compounded spot rate, and P the present value of the set of cash flows.
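The Fisher–Weil duration can be computed directly from this definition. The sketch below assumes that P is obtained by discounting the same cash flows with the same spot rates, i.e. \(P = \sum _i c_i e^{-y_i t_i}\). The function name and the numbers used in any call are hypothetical.

```python
import numpy as np

def fisher_weil_duration(times, cash_flows, spot_rates):
    """Fisher-Weil duration of a set of cash flows under continuous compounding."""
    t = np.asarray(times, dtype=float)
    c = np.asarray(cash_flows, dtype=float)
    y = np.asarray(spot_rates, dtype=float)
    discounted = c * np.exp(-y * t)          # present value of each cash flow
    return float(np.sum(t * discounted) / np.sum(discounted))
```

For a single cash flow the duration equals its payment time, and for several cash flows it is the present-value-weighted average of the payment times.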

Starting with the variance strategy, we see from Table 4 that: (i) the currency and duration risks are well hedged; (ii) all available contracts are entered; and (iii) the hedge is costly when compared to the ES strategy. On the other hand, the ES strategy leaves exposure to the major risks, enters only a subset of the available hedging instruments, and has low(er) expected costs. These observations confirm the results obtained so far, but raise a new question; namely why it is optimal to remain exposed to the major risks under the ES strategy, specifically given non-zero cash flow volatility. To help answer this question, we present supplementary results in “Appendix C”. Table 6 adds results for cash flow volatility of 1, 3, and \(10\%\), Table 7 presents the composition of the optimal hedge portfolios, and Table 8 shows the expected returns of the different hedging instruments. Based on these results the pattern is clear; with increased cash flow volatility the ES strategy produces larger positions in contracts with (relatively) high expected returns, implying that the (expected) project value increases, but necessarily, also creates residual exposure to the major risks.

To understand the rationale for this, recall first that ES can be reduced by shifting the distribution. Note also that the cash flows are assumed to be uncorrelated with the other risk factors. Hence, with increased cash flow volatility, the marginal contribution to the total risk from foreign exchange and interest rates decreases. This effect allows larger positions in contracts with (relatively) high expected returns, that positively affect ES, to be entered. The optimal hedge portfolio captures the trade-off between the positive effect on ES from shifting the distribution, and the negative effect of more tail-events. Finally, we note that the relative deviation in currency exposure is significantly larger in EUR and GBP than in USD. However, the relevant measure of the effect on project value, and thus on ES, is the absolute deviation.

A lesson to learn from this numerical example is that variance and expected shortfall may produce structurally different hedge portfolios when the set of hedging instruments induces different costs. We have seen that the variance strategy eliminates the major risks, and that this is achieved by trading in a large set of hedging instruments, possibly at a (relatively) high cost. In contrast, the ES strategy sets up a hedge that carefully considers the expected cost, and potentially leaves exposure to major risks. In the problem instance studied, the difference in expected costs between contracts comes from varying term premia, but may more generally be a consequence of e.g. non-homogeneous transaction costs or liquidity premia.

Now we leave the comparison of the variance and ES strategies and go on to analyze optimal hedging given a firm objective defined in terms of mean log-project value, with constraints on ES. We analyze the risk-return profile resulting from this problem formulation and, in addition, investigate the impact of the asset universe available to the risk manager. For this purpose, we study three different cases, with the asset universe containing: (i) 1-year currency forwards (3 assets), (ii) 1-year currency forwards and 5-year IRSs (7 assets), and (iii) 1–10 year forwards and 2–10 year IRSs (66 assets). The first case allows the major risk, i.e. the currency risk, to be hedged, but gives little flexibility in hedging interest rate risk. By adding 5-year IRSs, some, but not all, of the interest rate risk can be hedged. Note also that with only one contract per market and asset type, there is little flexibility for choosing hedging instruments based on their expected cost. The case with all assets is set up to be flexible enough to hedge the currency and interest rate risk while allowing for a selective choice of hedging instruments.

Compared to the previous analysis, we here focus solely on the case with stochastic cash flows (\(\sigma _{\tau ,1} = 0.05\))Footnote 17 and we remove the position constraints that were used to handle the location-invariance of the variance strategy. We study the maximization of mean log project value for different upper limits on ES and present this in terms of the equivalent Lagrangean formulation given in (7), with \(\lambda \in [0,1]\). The extreme case with \(\lambda = 0\) corresponds to the most restrictive limit on ES for which a feasible solution exists, and is equivalent to minimization of ES. The other extreme, namely \(\lambda = 1\), corresponds to a limit on ES that is loose enough to make the constraint redundant, and is equivalent to maximization of the mean log project value. To illustrate the relation between risk and hedging costs for different limits on ES, we present the efficient frontier as the trade-off between ES and the increase in (expected) project value. The results for the full asset universe and the two restricted versions, together with the unhedged case, are presented in Fig. 7.
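As a reading aid, the role of \(\lambda \) can be pictured with a generic weighted-sum scalarization. The form below is an assumption made for illustration and may differ from (7) in notation and scaling; \(V\) denotes the project value at the end of the period and \(x\) the hedging decision, both placeholder symbols introduced here:

\[
\max _{x} \; \lambda \, \mathbb {E}\left[ \ln V(x)\right] - (1-\lambda )\, \mathrm {ES}\left( V(x)\right) , \qquad \lambda \in [0,1].
\]

With \(\lambda = 0\) only the ES term remains, so the problem reduces to minimization of ES; with \(\lambda = 1\) only the mean log term remains, which matches the case where the limit on ES is redundant.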

Fig. 7

The efficient frontier determined using (3), with objective function (7), for \(\lambda = 0,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,0.95,0.98,1\). Project cash flow scenarios are generated using (17) with \(\sigma _{\tau ,1} = 0.05\). The magnified area shows results for \(\lambda \) up to 0.9, 0.95, and 0.98 for the cases with all assets, 7 assets, and 3 assets, respectively. The increase in project value is calculated as the difference between the expected project value in 1 week given the optimal hedging decision, and the project value at the starting date, 1334 MSEK

We see that the efficient frontier generated from the case with access to the full asset universe clearly dominates the restricted cases. Focusing first on the risk reduction potential, we note that, using all the hedging instruments available, ES can be reduced by \(95.3\%\) compared to the unhedged case (from 76.8 to 3.6 MSEK). In the restricted case with 1-year currency forwards only, by comparison, the upper limit on risk reduction in terms of ES is \(62\%\) (to 29.2 MSEK). Adding 5-year IRSs to the menu of hedging instruments allows ES to be reduced by approximately \(69\%\) (to 24.2 MSEK). Examining next the risk-return profiles, we see that access to a richer asset universe offers additional flexibility for the risk manager to choose a cost-efficient hedge. This is illustrated by the increased slope of the efficient frontier as more hedging instruments are added to the asset universe. Note, however, that the upper-right areas of the efficient frontiers, corresponding to unconstrained maximization of mean log-project value, produce portfolios with large risk. For the case with access to the full asset universe, the resulting ES is almost \(34\%\) of the initial project value, which is a risk level that is unlikely to be acceptable to most corporate risk managers. From the perspective of practical risk management, we argue that the relevant formulations are the cases with (low) limits on ES. In the out-of-sample analysis that follows, we will therefore focus on the cases producing low risk in the project, but still investigate the impact of slightly looser limits on ES.
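The quoted risk reductions follow directly from the ES levels reported in the text. A short sketch of the arithmetic, using only those figures (the dictionary keys are descriptive labels chosen here):

```python
unhedged_es = 76.8  # MSEK, unhedged case as reported in the text

# Lowest attainable ES (MSEK) for each asset universe, as reported in the text.
min_es = {
    "1-year forwards": 29.2,
    "1-year forwards + 5-year IRSs": 24.2,
    "full asset universe": 3.6,
}

# Relative reduction in ES compared to the unhedged case, in percent.
reduction_pct = {
    name: 100.0 * (unhedged_es - es) / unhedged_es for name, es in min_es.items()
}

for name, pct in reduction_pct.items():
    print(f"{name}: {pct:.1f}%")
```

The three values come out at 62.0%, 68.5%, and 95.3%, in line with the rounded figures above.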

5.3 Out-of-sample analysis

The in-sample analysis provided insights about properties of the optimal hedge under different model specifications. First, comparing the ES and variance strategies, we observed that the latter produced hedge portfolios containing more assets and with higher expected hedging costs. We also learned from the in-sample analysis that the asset universe made available to the risk manager can have a significant impact on the potential to set up a cost-efficient hedge that reduces risk properly. Finally, we learned that the allowed level of risk can have a substantial impact on the expected hedging cost.

With the out-of-sample analysis, we aim to test (i) whether these findings can be validated on out-of-sample data, and (ii) whether the proposed model is robust enough to be a candidate for practical decision support. The out-of-sample tests are based on the realized foreign exchange and interest rates over the studied period, along with simulated trajectories for the project cash flows.Footnote 18 The procedure starts with the optimal hedging decision determined on July 26th, 2013. We then move forward 1 week in time, and with reference to (3), update the pre-decision holdings in cash, h, and the cash flow vector b. We handle hedging positions as the equivalent set of cash flows and collect them in the vector b along with the expected project cash flows. We assume that the asset universe available to the risk manager is the same in every period, which is a relevant assumption considering how OTC contracts are quoted in the market. Hence, previously entered contracts cannot be traded. The project cash flows are assumed to have fixed dates, and so draw closer with each week that the clock advances.
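The bookkeeping in this roll-forward step can be sketched as follows. This is an illustrative skeleton under stated assumptions, not the implementation used in the study: the `CashFlow` class and `roll_one_week` function are hypothetical names, flows falling due within the week are assumed to settle into cash without interest, and the re-optimization of (3) itself is left out.

```python
from dataclasses import dataclass


@dataclass
class CashFlow:
    currency: str
    weeks_to_payment: int
    amount: float  # positive = inflow, negative = outflow


def roll_one_week(cash, flows, hedge_flows):
    """Advance the clock by one week.

    cash        : dict currency -> cash holding (plays the role of h)
    flows       : cash flows already on the book (plays the role of b)
    hedge_flows : cash flows equivalent to the hedges entered this week
    """
    new_cash = dict(cash)
    new_flows = []
    for cf in flows + hedge_flows:
        if cf.weeks_to_payment <= 1:
            # Assumption: a flow due within the week is paid into cash.
            new_cash[cf.currency] = new_cash.get(cf.currency, 0.0) + cf.amount
        else:
            # Fixed payment dates: the flow is one week closer.
            new_flows.append(
                CashFlow(cf.currency, cf.weeks_to_payment - 1, cf.amount)
            )
    return new_cash, new_flows


cash = {"SEK": 0.0}
flows = [CashFlow("USD", 1, 10.0), CashFlow("EUR", 52, 25.0)]
# Made-up legs of a 1-year EUR/SEK forward, stored as plain cash flows.
hedges = [CashFlow("EUR", 52, -25.0), CashFlow("SEK", 52, 215.0)]
cash, flows = roll_one_week(cash, flows, hedges)
```

Storing hedges as their cash-flow legs means that no separate position ledger is needed, which mirrors the statement that previously entered contracts are not traded again.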

In the analysis that follows, we make comparisons with the in-sample results which are based on the objectives in (4), (5) and (7). As discussed in Sect. 5.2, the location-invariance property makes variance an improper objective in the studied environment with transaction costs, and given access to both long and short positions. As we move from a single period to sequential decisions, position constraints are no longer appropriate, as they limit the flexibility needed to control the risk under changing market conditions. To handle the implications of the location-invariance while still allowing comparisons with the in-sample results, we study instead the mean squared deviation from the (current) project value capitalized by the risk-free rate.Footnote 19
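Why this modification removes the location-invariance can be seen from a standard decomposition. Write \(V\) for the project value at the end of the period and \(T\) for the target, i.e. the current project value capitalized by the risk-free rate (both symbols are placeholders introduced here and are not taken from the model formulation):

\[
\mathbb {E}\left[ (V - T)^2\right] = \mathrm {Var}(V) + \left( \mathbb {E}[V] - T\right) ^2 .
\]

The second term penalizes any shift of the mean away from the target, so positions that leave the variance unchanged but move the expected project value are no longer costless under this objective.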

As in the in-sample analysis, we begin by investigating ES and (modified) variance, and we focus first on the case with deterministic cash flows. Figure 8 shows the realized project value trajectories given the variants of the asset universe studied in the in-sample analysis. We also present the standard deviation, \(\sigma \) (yearly), of the relative changes in project value along each trajectory.
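A yearly volatility of this kind can be computed from a weekly trajectory roughly as sketched below. The square-root-of-52 scaling and the use of the sample standard deviation are assumptions made here, as the exact convention is not stated; `annualized_volatility` is a hypothetical helper name.

```python
import math
import statistics


def annualized_volatility(values, periods_per_year=52):
    """Sample standard deviation of period-on-period relative changes,
    scaled to a yearly figure by sqrt(periods_per_year) (assumed convention)."""
    rel_changes = [values[i + 1] / values[i] - 1.0 for i in range(len(values) - 1)]
    return statistics.stdev(rel_changes) * math.sqrt(periods_per_year)


# Toy weekly trajectory (made-up numbers, MSEK).
trajectory = [1334.0, 1336.1, 1333.5, 1335.0, 1337.2]
print(f"{100 * annualized_volatility(trajectory):.2f}% (yearly)")
```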

Fig. 8

Out-of-sample trajectories of project value resulting from solving (3) sequentially over the out-of-sample period of 191 weeks, given minimization of (a) modified variance, and (b) ES, for different sets of hedging instruments, given deterministic cash flows. The project value volatilities, \(\sigma \) (yearly), are the standard deviations of the relative change in project value over each trajectory

We note first that both strategies produce trajectories with (relatively) low variation in project value, which is a first requirement for the proposed model to be considered robust. As a comparison, the unhedged portfolio value has a volatility of \(11.3\%\) (yearly) over the same period. Examining next the impact of the asset universe, we see that the risk, in terms of variation in project value, decreases significantly as the number of available hedging instruments is increased. This validates the in-sample results on the importance of having access to a large-enough asset universe to properly control risk. Comparing the trajectories from the two strategies pairwise for each asset universe, the ES strategy produces higher project values in the final period in all cases.Footnote 20 Hence, over the studied period, the hedging cost resulting from using the ES strategy would have been lower relative to the variance strategy, irrespective of the asset universe available for hedging. We note, however, that more data is needed to show statistically significant differences in expected hedging costs.

In Fig. 9, we show the absolute nominal values (in SEK) entered in the different hedging instruments over all out-of-sample periods for the case with the full asset universe. The nominal values correspond to the number of contracts entered in each hedging instrument times the prevailing foreign exchange rate in the respective market (with SEK as the term currency).

Fig. 9

The nominal value of entered hedging instruments (in SEK) in period \(1,\ldots ,191\), in the 66 hedging instruments (1–10 year currency forwards, F, and 2–10 year IRSs, I), given minimization of (a) variance, and (b) expected shortfall

Comparing the concentration of bars in panels (a) and (b) of Fig. 9, we see that the variance strategy implies that significantly more instruments are entered to control the risk. The average number of contracts entered per period is 61.2 for the variance strategy while it is only 6.96 for the ES strategy. Hence, the observation in the in-sample analysis that the variance strategy trades in significantly more instruments is validated in this first out-of-sample experiment. As expected, the largest positions are entered in the first period when the cash flows are still unhedged. This is most pronounced for the ES strategy, for which the first-period hedging constitutes \(89.4\%\) of the total nominal hedging value aggregated over all periods.
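The two summary statistics above can be derived from a period-by-instrument table of positions, as in the sketch below. All numbers are made up for illustration, and counting an instrument as entered whenever its position is non-zero is an interpretation adopted here rather than a stated definition.

```python
# Rows are periods, columns are hedging instruments (toy data, not from the study).
contracts = [
    [100.0, -50.0, 0.0],
    [0.0, 5.0, 0.0],
    [0.0, 0.0, -2.0],
]
# Prevailing FX rate per period and instrument, SEK per unit of foreign currency.
fx_rates = [
    [6.5, 8.5, 10.0],
    [6.6, 8.0, 10.1],
    [6.4, 8.2, 10.0],
]

# Absolute nominal value in SEK = |number of contracts| x prevailing FX rate.
nominal = [
    [abs(n) * fx for n, fx in zip(row_n, row_fx)]
    for row_n, row_fx in zip(contracts, fx_rates)
]

# Average number of instruments with a non-zero position per period.
avg_instruments = sum(sum(1 for n in row if n != 0.0) for row in contracts) / len(contracts)

# Share of the total nominal hedging value that is entered in the first period.
per_period = [sum(row) for row in nominal]
first_period_share = per_period[0] / sum(per_period)

print(f"average instruments per period: {avg_instruments:.2f}")
print(f"first-period share of nominal:  {100 * first_period_share:.1f}%")
```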

In the second numerical experiment we add cash flow uncertainty by generating scenario trajectories for the project cash flows, and study these along with the realized market prices on foreign exchange and interest rates. In Table 5, we present summary statistics for the variance and ES strategies given 50 scenario trajectories for the stochastic cash flows.

Table 5 Out-of-sample statistics given uncertain cash flows

Consistent with the results obtained so far, we see that: (i) the proposed optimization model shows evidence of being robust in handling risk; (ii) hedging with the variance strategy induces trading in many more instruments than does the ES strategy; and (iii) the number of hedging instruments made available to the optimization model has a clear impact on the potential to reduce risk.

The final numerical experiment studies the maximization of mean log project value given constraints on ES for the case with deterministic project cash flows. Panel (a) of Fig. 10 presents the out-of-sample trajectories resulting from applying (3) with objective function (7), for different limits on ES. Panel (b) shows the differences in trajectory values relative to the ES minimization case.

Fig. 10

Out-of-sample trajectories resulting from (3) with the objective function in (7). (a) The realized project value trajectories given \(\lambda = 0,0.2,0.3,0.4,0.5\). (b) The performance of the different objectives relative to the ES minimization case (\(\lambda = 0\))

We focus on the cases with relatively strict limits on ES, corresponding to \(\lambda \le 0.5\). Although looser limits on ES (\(\lambda > 0.5\)) produced higher project values for the period studied, the resulting hedges imply increasingly risky portfolios, and are unlikely to be acceptable to a risk manager. We recall from the in-sample analysis that unconstrained maximization of mean log project value is associated with large risks. By comparing the trajectories in Fig. 10, we see from panel (a) that a looser constraint on ES produces a (slightly) higher volatility in project value. On the other hand, as highlighted in panel (b), the slightly increased risk implies a reduced hedging cost, which is consistent with the risk-return profile from the in-sample analysis in Fig. 7. We note again, however, that the amount of data is insufficient to obtain statistically significant differences in expected hedging costs.

6 Concluding remarks

In this study, we have developed a framework for the optimal hedging of foreign exchange and interest rate risk given uncertain cash flows in multiple currencies. We carefully consider the environment faced by the risk manager in that we include transaction costs, the empirically well documented term premia, non-normal foreign exchange rates, and trading at market prices. We study optimal hedging given three alternative objective functions: variance, ES, and mean log profits with a limit on ES.

The numerical results show that (i) the choice of objective function can have significant implications for the composition of the optimal hedge, the resulting risk, and the hedging costs, (ii) the size of the asset universe made available to the risk manager is important for the flexibility to control risk and to set up a cost-efficient hedge, (iii) the expected cost of different hedging instruments, governed by term premia and transaction costs, is a fundamental determinant of the optimal hedge, and (iv) the model is robust when applied to out-of-sample data.

The proposed stochastic programming framework offers important flexibility, in that model components can be easily exchanged and modeling assumptions re-examined, in order to add even more realism to the model. As the framework provides optimal portfolio holdings as well as concrete measures of risk, it has the potential to serve as a useful and flexible decision support tool for risk managers.