A Neural Network Approach to Value R&D Compound American Exchange Option

In this paper we show how the neural network methodology, coupled with the Least Squares Monte Carlo approach, can be very helpful in valuing R&D investment opportunities. As is well known, R&D projects are carried out in a phased manner, with the commencement of each subsequent phase depending on the successful completion of the preceding phase. This is known as a sequential investment, and R&D projects can therefore be considered as compound options. In addition, R&D investments often involve considerable cost uncertainty, so they can be viewed as an exchange option, i.e. a swap of an uncertain investment cost for an uncertain gross project value. Finally, the production investment can be realized at any time before the maturity date, after which the effects of R&D disappear. Consequently, an R&D project can be considered as a compound American exchange option. In this context, the Least Squares Monte Carlo method is a powerful and flexible tool for capital budgeting decisions and for valuing American-type options. Moreover, using the simulated values as "targets", the implementation of a neural network allows us to extend the results to any R&D valuation and to cut down the running time of the Least Squares Monte Carlo simulation.


Introduction
R&D investments are considered an important driving force for the growth of the modern economy. For analysts, it is very important to value these investments taking their uncertainty into account. As is well known, R&D projects are characterized by the sequentiality of their investments and by the flexibility to realize the production investment at any time before the expiration of the R&D innovation. In this scenario, the real option approach can capture these aspects, unlike the Net Present Value (NPV) and the Internal Rate of Return (IRR), which underestimate R&D projects. In particular, R&D projects can be considered as a compound American exchange option (CAEO) in which both the gross project value and the investment cost are uncertain. Papers that deal with exchange option valuation include Margrabe (1978), McDonald and Siegel (1985), Carr (1988), Carr (1995) and Armada et al. (2007). In particular, McDonald and Siegel (1985) value a simple European exchange option, Carr (1988) develops a model to price a compound European exchange option, while Armada et al. (2007) propose a Richardson extrapolation in order to value a simple American exchange option.
These models assume that the assets distribute "dividends" which, in a real options context, are the opportunity costs incurred if an investment project is postponed (Myers 1977).
However, the analytical computation of the CAEO is rather difficult, and it is convenient to implement a numerical method. Numerical approximation is therefore an important task, as witnessed by the contributions of Tilley (1993), Barraquand and Martineau (1995) and Broadie and Glasserman (1997).
The first goal of our paper is to implement a Monte Carlo methodology in order to value a CAEO in the context of R&D investment. To this end, building on Cortelezzi and Villani (2009) and Villani (2014), we apply the Least Squares Monte Carlo (LSM) method proposed by Longstaff and Schwartz (2001) in order to value the CAEO. Although this approach is accurate, the time required to simulate this kind of option is very long. Consequently, the second aim of this paper is to build a neural network architecture based on a Back Propagation (BP) system that uses the simulation results as "targets" in the learning phase. As there is no market valuation of the CAEO, the advantages of this approach are, first of all, the speed and accuracy of the computations and, secondly, the possibility to extend the trained neural network to value any R&D investment project. To benchmark our method, we compare the BP approach with the Radial Basis Function (RBF) and General Regression Neural Network (GRNN) approaches.
Computing power has allowed nonlinear methods to become applicable to modeling and forecasting a host of economic and financial relationships. Neural networks, in particular, have been applied to many of these empirical cases. For instance, Aminian et al. (2006) compare the predictive power of the linear regression model against the fully generalized nonlinear neural network, with the improvement exposing the degree of nonlinearity present in the relationship investigated. Their study uses neural networks as an efficient nonlinear regression technique to assess the validity of linear regression in modeling financial data. Andreou et al. (2006) show that artificial neural network models using the Huber function outperform those optimized with least squares. Eskiizmirliler et al. (2020) approximate the unknown option value function using a trial function, which depends on a neural network solution and satisfies the given boundary conditions of the Black-Scholes equation. Arin and Ozbayoglu (2020) develop hybrid deep-learning-based option pricing models to achieve better pricing compared to Black-Scholes; their results indicate that the proposed models can generate more accurate prices for all option classes. Finally, the RBF method, as a meshless technique, has been suggested to solve the time-fractional Black-Scholes model for the European option pricing problem (Golbabai et al. 2019).
The literature that studies real options in the neural network context is not very extensive. For instance, Ma (2016) uses the real options method to model petroleum exploration and development projects, selects the appropriate option pricing method, analyzes instance data from gas exploration, and points out that the application of the real options method can effectively improve investment project evaluation. Moreover, Taudes et al. (1998) propose to use neural networks to value options by approximating the value function of the dynamic program, taking for each mode of operation the current state as input and yielding the mode to be chosen as output.
The paper is organized as follows. Section 2 presents the structure of an R&D investment and its evaluation in terms of real options, while Sect. 3 illustrates the valuation of the CAEO using the LSM approach. The implementation of the neural network architecture is described in Sect. 4 and some numerical applications are proposed in Sect. 5. Finally, Sect. 6 concludes.

R&D Structure as Real Option
In this section, we present a two-stage R&D investment whose structure is the following: $R$ is the research investment spent at the initial time $t_0 = 0$; $IT$ is the investment technology to develop the innovation, paid at time $t_1$; $D$ is the production investment required to obtain the R&D project's value; and $V$ is the R&D project value. We assume that $IT = qD$ is a proportion $q$ of asset $D$, so that it follows the same stochastic process as $D$, and that the production investment $D$ can be realized between $t_1$ and $T$. In particular, by investing $R$ at time $t_0$, the firm obtains a first investment opportunity that can be valued as a CAEO, denoted by $C(S_k, IT, t_1)$. This option allows the firm to realize the investment technology $IT$ at time $t_1$ and to obtain, as underlying asset, the option to realize the market launch. Let $S_k(V, D, T - t_1)$ denote this option value at time $t_1$, with maturity $T - t_1$ and exercisable $k$ times. In detail, during the market launch, the firm has another investment opportunity: to invest $D$ between $t_1$ and $T$ and to receive the R&D project value $V$. Specifically, using the LSM approach, the firm must decide whether to invest $D$ or to wait at any discrete time $s_k = t_1 + k\Delta t$, for $k = 0, 1, 2, \dots, h$, with $\Delta t = (T - t_1)/h$, where $h$ is the number of discretization steps. In this way we capture the managerial flexibility to invest $D$ before the maturity $T$ and so to realize the R&D cash flows. Figure 1 depicts the R&D investment structure.
We assume that $V$ and $D$ follow the geometric Brownian motions:

$$\frac{dV_t}{V_t} = (\mu_v - \delta_v)\,dt + \sigma_v\,dZ^v_t, \qquad (1)$$

$$\frac{dD_t}{D_t} = (\mu_d - \delta_d)\,dt + \sigma_d\,dZ^d_t, \qquad (2)$$

$$\mathrm{Cov}\left(dZ^v_t, dZ^d_t\right) = \rho_{vd}\,dt, \qquad (3)$$

where $\mu_v$ and $\mu_d$ are the expected rates of return, $\delta_v$ and $\delta_d$ are the corresponding dividend yields, $\sigma_v^2$ and $\sigma_d^2$ are the respective variance rates and $\rho_{vd}$ is the correlation between changes in $V$ and $D$. Here $(Z^v_t)_{t\in[0,T]}$ and $(Z^d_t)_{t\in[0,T]}$ are two Brownian processes defined on a filtered probability space $(\Omega, \mathcal{A}, \{\mathcal{F}_t\}_{t \ge 0}, P)$, where $\Omega$ is the space of all possible outcomes, $\mathcal{A}$ is a sigma-algebra, $P$ is the probability measure and $\{\mathcal{F}_t\}_{t \ge 0}$ is a filtration on $\Omega$. Assuming that the firm keeps a portfolio of activities which allows it to value activities in a risk-neutral way, the dynamics of the assets $V$ and $D$ under the risk-neutral martingale measure $Q$ are given by:

$$\frac{dV_t}{V_t} = (r - \delta_v)\,dt + \sigma_v\,dZ^{*v}_t, \qquad (4)$$

$$\frac{dD_t}{D_t} = (r - \delta_d)\,dt + \sigma_d\,dZ^{*d}_t, \qquad (5)$$

$$\mathrm{Cov}\left(dZ^{*v}_t, dZ^{*d}_t\right) = \rho_{vd}\,dt, \qquad (6)$$

where $r$ is the risk-free interest rate and $Z^{*v}_t$ and $Z^{*d}_t$ are two standard Brownian motions under the probability $Q$ with correlation coefficient $\rho_{vd}$. After some manipulation, we get the equations for the price ratio $P = V/D$ and for $D_T$ under the probability $Q$:

$$P_T = P_0 \exp\left\{\left(\delta_d - \delta_v + \frac{\sigma_d^2 - \sigma_v^2}{2}\right)T + \sigma_v Z^{*v}_T - \sigma_d Z^{*d}_T\right\}, \qquad (7)$$

$$D_T = D_0\, e^{(r - \delta_d)T} \exp(U), \qquad U = -\frac{\sigma_d^2}{2}T + \sigma_d Z^{*d}_T, \qquad (8)$$

where $D_0$ is the value of asset $D$ at the initial time. We can observe that $U \sim N\!\left(-\frac{\sigma_d^2}{2}T,\, \sigma_d\sqrt{T}\right)$ and therefore $\exp(U)$ is log-normally distributed with expectation $E^Q[\exp(U)] = 1$. By Girsanov's theorem, we define a new probability measure $\tilde{Q}$ equivalent to $Q$ whose Radon-Nikodym derivative is:

$$\frac{d\tilde{Q}}{dQ} = \exp(U) = \exp\left\{-\frac{\sigma_d^2}{2}T + \sigma_d Z^{*d}_T\right\}. \qquad (9)$$

Hence, substituting in (8), we can write:

$$D_T = D_0\, e^{(r - \delta_d)T}\, \frac{d\tilde{Q}}{dQ}. \qquad (10)$$

By Girsanov's theorem, the processes:

$$\hat{Z}^d_t = Z^{*d}_t - \sigma_d\, t, \qquad (11)$$

$$\hat{Z}^v_t = Z^{*v}_t - \rho_{vd}\,\sigma_d\, t, \qquad (12)$$

are two Brownian motions under the risk-neutral probability space $(\Omega, \mathcal{A}, \mathcal{F}, \tilde{Q})$, and $Z'_t = \frac{\hat{Z}^v_t - \rho_{vd}\hat{Z}^d_t}{\sqrt{1 - \rho_{vd}^2}}$ is a Brownian motion under $\tilde{Q}$ independent of $\hat{Z}^d$. By using Eqs. (11) and (12), we can now obtain the risk-neutral price ratio $P$:

$$P_t = P_0 \exp\left\{\left(\delta_d - \delta_v - \frac{\sigma^2}{2}\right)t + \sigma \hat{Z}_t\right\}, \qquad (13)$$

where $\sigma = \sqrt{\sigma_v^2 - 2\rho_{vd}\,\sigma_v\sigma_d + \sigma_d^2}$ and $\hat{Z}_t = \frac{\sigma_v \hat{Z}^v_t - \sigma_d \hat{Z}^d_t}{\sigma}$ is a standard Brownian motion under $\tilde{Q}$.
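As an illustration, the price ratio of Eq. (13) can be sampled exactly over a discrete time grid, since its log-increments are independent Gaussians. The following Python sketch is our own rendering (function name and parameter layout are ours, not part of the paper's Matlab implementation):

```python
import numpy as np

def simulate_price_ratio(P0, delta_v, delta_d, sigma_v, sigma_d, rho,
                         T, h, n_paths, seed=0):
    """Exact simulation of the price ratio P = V/D under Q-tilde.

    P follows a GBM with drift (delta_d - delta_v) and total volatility
    sigma = sqrt(sigma_v^2 - 2*rho*sigma_v*sigma_d + sigma_d^2).
    Returns an array of shape (n_paths, h + 1) including the initial value.
    """
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(sigma_v**2 - 2.0 * rho * sigma_v * sigma_d + sigma_d**2)
    dt = T / h
    z = rng.standard_normal((n_paths, h))
    # log-increment of P over each time step, from Eq. (13)
    log_inc = (delta_d - delta_v - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    log_paths = np.concatenate([np.zeros((n_paths, 1)),
                                np.cumsum(log_inc, axis=1)], axis=1)
    return P0 * np.exp(log_paths)
```

A convenient sanity check: with $\sigma_v = \sigma_d$ and $\rho_{vd} = 1$ the total volatility collapses to zero and the ratio evolves deterministically at rate $\delta_d - \delta_v$.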

Valuation of CAEO Using LSM Method
The value of the CAEO can be determined as the expected value of the discounted cash flows under the risk-neutral probability $Q$:

$$C(S_k, IT, t_1) = e^{-r t_1}\, E^Q\!\left[\max\left(S_k(V_{t_1}, D_{t_1}, T - t_1) - IT,\; 0\right)\right]. \qquad (14)$$

Assuming the asset $D$ as numeraire and using Eq. (10), we obtain:

$$C(S_k, IT, t_1) = D_0\, e^{-\delta_d t_1}\, E^{\tilde{Q}}\!\left[\max\left(S_k(P_{t_1}, 1, T - t_1) - q,\; 0\right)\right], \qquad (15)$$

where $IT = q\,D_{t_1}$.
The market launch phase $S_k(P_{t_1}, 1, T - t_1)$ can be analyzed using the LSM method. As in any American option valuation, the optimal exercise decision at any point in time is obtained as the maximum between the immediate exercise value and the expected continuation value. The LSM method allows us to estimate the conditional expectation function at each exercise date and so to have a complete specification of the optimal exercise strategy along each path. The method starts by simulating $n$ price paths of the asset $P_{t_1}$ defined by Eq. (13); let $\hat{P}^i_{t_1}$, for $i = 1, \ldots, n$, denote the simulated prices. Starting from each $i$-th simulated path, we simulate a discretization of Eq. (13) for $k = 1, \ldots, h$; the process is repeated $m$ times over the time horizon $T$. Starting with the last $j$-th price $\hat{P}^{i,j}_T$, for $j = 1, \ldots, m$, the option value at $T$ can be computed as $S_0(\hat{P}^{i,j}_T, 1, 0) = \max(\hat{P}^{i,j}_T - 1, 0)$. Working backward to time $s_{h-1}$, the process is repeated for each $j$-th path. In this case, the expected continuation value can be computed using the analytic expression for a European option, $S_1(\hat{P}^{i,j}_{s_{h-1}}, 1, \Delta t)$. At time $s_{h-1}$, the management must decide whether to invest or not. The value of the option is maximized if the immediate exercise exceeds the continuation value, i.e.:

$$\hat{P}^{i,j}_{s_{h-1}} - 1 \ge S_1(\hat{P}^{i,j}_{s_{h-1}}, 1, \Delta t). \qquad (16)$$

We can then find the critical ratio $P^*_{s_{h-1}}$ that solves the inequality (16). But it is computationally heavy to compute the expected continuation value at all the previous dates and so to determine the critical prices $P^*_{s_k}$, $k = 1, \ldots, h-2$, as shown in Carr (1995). The main contribution of the LSM method is to determine the expected continuation values by regressing the subsequent discounted cash flows on a set of basis functions of the current state variables. As described in Abramowitz and Stegun (1970), common choices of basis functions are the weighted Power, Laguerre, Hermite, Legendre, Chebyshev, Gegenbauer and Jacobi polynomials.
In our paper we consider as basis functions the first three weighted powers. Let $L_w$, $w = 1, 2, 3$, be the basis of functional forms of the state variable $\hat{P}^{i,j}_{s_k}$ that we use as regressors. At time $s_{h-1}$, the least squares regression is equivalent to solving the following problem:

$$\min_{a}\; \sum_{j=1}^{m} \left( Y^{i,j} - \sum_{w=1}^{3} a_w \left(\hat{P}^{i,j}_{s_{h-1}}\right)^w \right)^2, \qquad (17)$$

where $Y^{i,j}$ denotes the discounted subsequent cash flow along path $j$. The optimal $\hat{a} = (\hat{a}_1, \hat{a}_2, \hat{a}_3)$ is then used to estimate the expected continuation value along each path $\hat{P}^{i,j}_{s_{h-1}}$, $j = 1, \ldots, m$:

$$\hat{E}\left[\,Y \mid \hat{P}^{i,j}_{s_{h-1}}\right] = \sum_{w=1}^{3} \hat{a}_w \left(\hat{P}^{i,j}_{s_{h-1}}\right)^w. \qquad (18)$$

After that, the optimal decision for each price path is to choose the maximum between the immediate exercise value and the expected continuation value. Proceeding recursively until time $t_1$, we obtain a final vector of continuation values for each price path $\hat{P}^{i,j}_{s_k}$, which allows us to build a stopping-rule matrix in Matlab that maximizes the value of the American option. As a consequence, the $i$-th option value approximation $\hat{S}^i_k(\hat{P}^i_{t_1}, 1, T - t_1)$ can be determined by averaging all the discounted cash flows generated by the option over all paths $j = 1, \ldots, m$. Finally, it is possible to implement a Monte Carlo simulation to approximate the CAEO:

$$C(S_k, IT, t_1) \approx D_0\, e^{-\delta_d t_1}\, \frac{1}{n} \sum_{i=1}^{n} \max\left(\hat{S}^i_k(\hat{P}^i_{t_1}, 1, T - t_1) - q,\; 0\right).$$

"Appendix A" illustrates the complete Matlab algorithm to value the CAEO. We conclude that, applying the real option methodology, the R&D project will be undertaken at time $t_0$ if $C(S_k, IT, t_1) - R$ is positive; otherwise the investment will be rejected.
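The backward induction just described can be sketched compactly in Python. This is our own illustrative rendering, not the paper's Matlab code: the basis functions are the three powers $P$, $P^2$, $P^3$, the exchange strike is normalized to 1, and the regression is restricted to in-the-money paths (a common implementation choice):

```python
import numpy as np

def lsm_american_exchange(paths, r, dt):
    """LSM value at t1 of the American option max(P - 1, 0), exercisable at
    each of the h discrete dates, regressing on the powers P, P^2, P^3.
    `paths` has shape (m, h + 1); all paths start from the same P at t1."""
    m, cols = paths.shape
    h = cols - 1
    disc = np.exp(-r * dt)
    cash = np.maximum(paths[:, -1] - 1.0, 0.0)      # exercise value at maturity
    for k in range(h - 1, 0, -1):                   # backward induction
        cash *= disc                                # discount one step back
        P = paths[:, k]
        itm = P > 1.0                               # regress on in-the-money paths
        if itm.any():
            X = np.column_stack([P[itm], P[itm]**2, P[itm]**3])
            a_hat, *_ = np.linalg.lstsq(X, cash[itm], rcond=None)
            continuation = X @ a_hat                # estimated continuation value
            exercise = P[itm] - 1.0
            stop = exercise > continuation          # optimal stopping rule
            idx = np.where(itm)[0][stop]
            cash[idx] = exercise[stop]              # exercise on those paths
    value_wait = disc * cash.mean()                 # discount back to t1
    return max(paths[0, 0] - 1.0, value_wait)       # allow exercise at t1 itself
```

Feeding this value into the outer Monte Carlo loop over the $n$ compound-option paths, and averaging $\max(\hat{S}^i_k - q, 0)$, yields the CAEO approximation above.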

Feed-Forward Neural Networks to Value CAEO
In this section we describe the neural network architecture used to value the CAEO: in particular, a BP system in which the input layer is composed of $n = 10$ nodes, one for each model variable, and there is one hidden layer with $p = 6$ nodes, as shown in Fig. 2. Following Eskiizmirliler et al. (2020), we describe the BP neural network. The yellow circles are the ten input parameters described above, the blue ones are the six nodes in the hidden layer and the pink node denotes the CAEO output given by the BP structure. Moreover, the red and green lines denote a negative (inhibitory) and a positive (excitatory) connection respectively, depending on the weights connecting the nodes; the thickness represents the intensity of the link. For the learning phase, the network is parameterized on a sufficiently large number of targets given by the previous Monte Carlo LSM estimations of the CAEO, summarized in Tables 4 and 5. The idea is to use the Monte Carlo values, whose simulation time is very long, as targets in the training phase in order to extend, with the BP neural network, the valuation to any input vector. This approach allows a drastic reduction of the simulation time. For the Monte Carlo approach, we have used a number of discretizations $x = 100$, a number of American simulations $m = 50{,}000$ and $n = 30{,}000$ paths for the compound option. We recall that for each path $i = 1, \ldots, n$ there are $m = 50{,}000$ trajectories to simulate the American exchange option. This increases the simulation time of the CAEO in exchange for better accuracy.
We propose a logistic activation function between the input and hidden layers; its property as a universal approximator is well established (see White 1990). In our model, we have assumed one hidden layer with six nodes and one output layer, i.e. the neural value of the CAEO, with a purelin (linear) activation function. Each node performs computation and transformation operations. In particular, in the hidden layer, the aggregation function used is the sum:

$$a_j = \sum_{i=1}^{10} w_{ij}\, x_i + b_j, \qquad j = 1, \ldots, 6, \qquad (19)$$

where $x_i$ are the input values for $i = 1, \ldots, 10$, $w_{ij}$ are the weights and $b_j$ is a threshold value named bias (for more details see Hecht-Nielsen 1990). As seen in Fig. 2, a feed-forward neural network model including a single hidden layer, which takes inputs from the input layer and produces the weighted sum of inputs added to some bias values as outputs, is preferred to solve the problem effectively. The output produced by each node of the hidden layer is obtained by the logistic activation function:

$$z_j = \frac{1}{1 + e^{-a_j}},$$

and this output becomes the input for the output layer. In the same fashion:

$$y' = g\left(\sum_{j=1}^{6} w'_j\, z_j + b'\right)$$

is the output that the network produces at the end of the first cycle of learning, where $g$ is the purelin activation function and $w'_j$ and $b'$ are the weights and the bias, respectively.
Moreover, BP networks use a learning algorithm based on the conventional gradient descent method, in which the input-output pairs are presented iteratively to the network and the weights are appropriately updated and modified in order to reach the minimum of the mean squared error (MSE) function:

$$E = \frac{1}{K} \sum_{k=1}^{K} \left(y_k - y'_k\right)^2, \qquad (20)$$

where $K$ is the number of input-output pairs from the LSM simulation, $y_k$ is the real output (target) associated with input vector $k$ and $y'_k$ is the neural value. For the numerical solution of the minimization problem defined above, the gradient descent method is considered. Gradient descent is a first-order iterative optimization algorithm for finding the minimum of a function: to find a local minimum, one takes steps proportional to the negative of the gradient (or approximate gradient) of the function at the current point. In particular, the updating of the weights is obtained by back-propagating the value assumed by the function $E$ from the output layer, through the hidden layer, back to the input layer, as:

$$w'_j(q+1) = w'_j(q) - \eta\, \frac{\partial E}{\partial w'_j},$$

where $w'_j(q)$ is the value of the weight at iteration $q$ and $\eta$ is the learning rate, which we set to $\eta = 0.60$. The choice of the learning rate $\eta$ is important, as it plays a vital role in the convergence of the gradient descent algorithm: a low value of $\eta$ causes a long running time, making the algorithm computationally expensive, while large values of $\eta$ generally imply divergence from the solution.
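The training scheme just described (logistic hidden layer, linear output, full-batch gradient descent on the MSE) can be sketched as follows. This is a minimal illustration of ours, not the paper's implementation: the initialization scale and the epoch count are arbitrary choices, and the input dimension is taken from the data rather than fixed at ten:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_bp(X, y, n_hidden=6, eta=0.60, n_epochs=5000, seed=0):
    """Full-batch gradient descent for a one-hidden-layer network:
    logistic hidden units, linear (purelin) output, MSE loss E."""
    rng = np.random.default_rng(seed)
    K, n_in = X.shape
    W = rng.normal(scale=0.5, size=(n_in, n_hidden))   # input -> hidden weights
    b = np.zeros(n_hidden)                             # hidden biases
    w_out = rng.normal(scale=0.5, size=n_hidden)       # hidden -> output weights
    b_out = 0.0
    for _ in range(n_epochs):
        z = sigmoid(X @ W + b)            # hidden outputs z_j
        y_hat = z @ w_out + b_out         # linear output y'
        err = y_hat - y
        # gradients of E = (1/K) * sum(err^2), back-propagated layer by layer
        g_out = 2.0 / K * (z.T @ err)
        g_bout = 2.0 / K * err.sum()
        delta_hidden = np.outer(err, w_out) * z * (1.0 - z)
        g_W = 2.0 / K * (X.T @ delta_hidden)
        g_b = 2.0 / K * delta_hidden.sum(axis=0)
        W -= eta * g_W; b -= eta * g_b                 # w(q+1) = w(q) - eta * dE/dw
        w_out -= eta * g_out; b_out -= eta * g_bout
    return W, b, w_out, b_out

def predict(X, params):
    W, b, w_out, b_out = params
    return sigmoid(X @ W + b) @ w_out + b_out
```

On a simple smooth target the training error drops by orders of magnitude within a few thousand epochs, mirroring the decreasing training-error curve discussed in the numerical results.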

Other Methods: Radial Basis Function (RBF) and General Regression Neural Network (GRNN)
As we have seen, the BP network is one of the most widely used neural networks. It is a multi-layer network which includes at least one hidden layer. First, the input is propagated forward through the network to get the response of the output layer; then, the sensitivities are propagated backward to reduce the error. During this process, the weights in all the hidden layers are modified. As the propagation continues, the weights are continuously adjusted and the precision of the output improves. Radial Basis Function (RBF) networks typically have three layers: an input layer, a hidden layer with a non-linear RBF activation function and a linear output layer.
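A minimal sketch of such a network with Gaussian basis functions follows. This is our own illustration under a common training strategy, namely fixing the centers and widths and fitting only the linear output weights by least squares; the function names are ours:

```python
import numpy as np

def rbf_design_matrix(X, centers, widths):
    """Gaussian activations phi_j(x) = exp(-||x - c_j||^2 / (2 r_j^2))."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * widths ** 2))

def fit_rbf(X, y, centers, widths):
    """Fit the linear output layer y = sum_j w_j phi_j(x) by least squares."""
    Phi = rbf_design_matrix(X, centers, widths)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def rbf_predict(X, centers, widths, w):
    return rbf_design_matrix(X, centers, widths) @ w
```

Taking the centers equal to the training samples themselves, as in the GRNN described below, makes the design matrix square, so the network interpolates the training targets exactly when the matrix is well conditioned.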
The RBF network is a three-layer feed-forward neural network: between the input and the output layers there is a hidden layer. When training, the input vectors are presented to the first layer and fanned out to the hidden layer, where a cluster of RBF functions transforms the input into the output, adjusting the weights from the input to the hidden layer. Then, under the supervision of the target vector, the weights from the hidden layer to the output are adjusted. When used for clustering, the Euclidean distance between the input vectors and the trained weight vectors is calculated, each input sample is assigned to a class, and the output layer collects the samples belonging to the same class into an output vector, the final clustering. The hidden units of the RBF network are formed by the distance between a prototype (center) vector and the input vector, transformed by a non-linear basis function, and the most common form of basis function is the Gaussian. The basic structure of an RBF neural network includes an n-dimensional input layer, a larger-dimensional hidden layer ($p > n$) and the output layer. The typical radial basis function is the Gaussian:

$$\phi_j(x) = \exp\left(-\frac{\|x - c_j\|^2}{2 r_j^2}\right), \qquad j = 1, \ldots, p,$$

where $p$ is the number of neurons in the hidden layer, $\|\cdot\|$ is the Euclidean norm and $c_j$ and $r_j$ are the center and width of the hidden neuron $j$, respectively. The output is given by the following linear transformation:

$$y = \sum_{j=1}^{p} w_j\, \phi_j(x).$$

The Generalized Regression Neural Network (GRNN), suggested by Specht (1991), belongs to the family of RBF networks, with the assumption that the number of neurons in the hidden layer equals the sample size of the training data and that the center of the $i$-th neuron is the $i$-th sample $x_i$. We remark that the GRNN directly produces a predicted value without a training process.

Numerical Results
Figure 3a shows the $K = 100$ LSM data used in training; in particular, their values have been normalized between 0 and 1. The lowest value corresponds to a CAEO value of 1.850 while the highest one is 32.250. In the same picture, the red line depicts the average value of the LSM simulation. It is also interesting to analyze Fig. 3b, which illustrates the evolution of the standard error in the training phase over the learning cycles: the average training error decreases until it reaches the level 0.0034. To illustrate our results, we simulate by the neural network the CAEO value starting from the initial parameter values reported in Table 1. The neural network simulates the CAEO respecting the sensitivities to the several variables: in particular, the CAEO increases when the asset $V$ and the volatilities $\sigma_v$ and $\sigma_d$ rise, while it decreases when the investment costs $D$ and $IT$ grow. The advantage of having a neural network to simulate a CAEO is, first of all, the time needed to obtain the simulated output with a low standard error (Fig. 3 reports the Back Propagation training results). We remark that the LSM Monte Carlo is accurate, with an average standard error of 0.0094, but its simulation time is very long. Another advantage of the neural network is that it describes the influence that each variable has on the CAEO value. As reported in Table 2, the most important parameters are the volatility of the gross project $\sigma_v$, the gross project value $V$ and the volatility $\sigma_d$. It is also possible to verify, as shown in Fig. 4, that most of the results are accurate: they are in fact almost all arranged on the fit line. The correlation coefficient $R$ of the linear fit ($y = ax$) is 0.999, giving an almost perfect fit, something of course expected since this data set was used for the training of the network. The very good fitting values indicate that the training was done very well. These results verify the ability of BP neural networks to recognize the implicit relationships between input and output variables.

Finally, to appreciate the goodness of the BP method, we compare it with the RBF and GRNN networks. Some significant results are summarized in Table 3. To evaluate the goodness of each network, the MSE (see Eq. 20) and the Mean Absolute Percentage Error (MAPE), defined as

$$\mathrm{MAPE} = \frac{1}{K} \sum_{k=1}^{K} \left| \frac{y'_k - y_k}{y_k} \right|,$$

are proposed. As we can see from the results in Table 3, the RBF and the GRNN seem to underestimate the CAEO value with respect to BP. The three methods analyzed all have good predictive power, even if the MSE and MAPE are slightly higher for RBF and GRNN than for BP. It is evident that the BP network provides much better predictions than the other types of neural networks. As regards the latter two, it is difficult to establish which one behaves best, since the accuracy of their predictions is fairly uniform; what we can say, however, is that the GRNN network is the one that behaves worst.
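For reference, the two error measures used in the comparison can be computed as follows (a straightforward sketch; the variable names are ours):

```python
import numpy as np

def mse(targets, outputs):
    """Mean squared error, as in Eq. (20)."""
    targets, outputs = np.asarray(targets, float), np.asarray(outputs, float)
    return np.mean((targets - outputs) ** 2)

def mape(targets, outputs):
    """Mean absolute percentage error: (1/K) * sum |(y'_k - y_k) / y_k|."""
    targets, outputs = np.asarray(targets, float), np.asarray(outputs, float)
    return np.mean(np.abs((outputs - targets) / targets))
```

Both are averaged over the $K$ validation pairs; the MAPE is scale-free, which makes it convenient when CAEO values span a wide range, as in our data set.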

Conclusions
In this paper, we have shown how the neural network methodology, combined with the LSM, can be used to evaluate R&D projects. In particular, an R&D opportunity is a sequential investment and can therefore be considered as a compound option. We have assumed the managerial flexibility to realize the production investment $D$ before the maturity $T$ in order to benefit from the R&D cash flows, so an R&D project can be viewed as a compound American exchange option (CAEO), which allows us to couple both the sequential frame and the managerial flexibility of an R&D investment. We have presented two main contributions. The first is that the LSM method makes it possible to determine the expected continuation value by regressing the discounted cash flows on the simple powers of the variable $P$, and so to avoid the effort of computing the critical prices $P^*_{s_k}$, $k = 1, \ldots, h-2$; however, this approach requires a long time to value a CAEO. The second contribution is the construction of a neural network based on the BP architecture that uses the LSM simulation results as "targets" in the learning phase. As there is no market valuation of the CAEO, we have seen that the advantages of this approach are the speed and accuracy of the computations and, moreover, the possibility of extending the trained neural network to value any R&D investment project. Finally, we have compared the BP results with those obtained from the RBF and GRNN approaches: based on the MSE and MAPE, the BP provides much better predictions.

Appendix: Matlab algorithm
In this appendix we present, first of all, the Matlab algorithm for the LSM method: