
Modelling, Forecasting and Trading the Crack: A Sliding Window Approach to Training Neural Networks

Artificial Intelligence in Financial Markets

Abstract

The aim of this analysis is to expand on earlier work by Dunis et al. (Modelling and trading the gasoline crack spread: A non-linear story. Derivatives Use, Trading & Regulation, 12(1–2), 126–145, 2005), who modelled the Crack Spread from 1 January 1995 to 1 January 2005. This chapter provides a more sophisticated approach to the non-linear modelling of the ‘Crack’. The selected trading period covers 777 trading days, starting on 9 April 2010 and ending on 28 March 2013. The proposed model combines a particle swarm optimizer (PSO) with a radial basis function (RBF) neural network (NN), trained daily using sliding windows of 380 and 500 days. Performance is benchmarked against a multi-layer perceptron (MLP) NN using the same training protocol. The outputs of the neural networks provide forecasts for one-day-ahead trading simulations. To model the spread, an expansive universe of 59 inputs across different asset classes is used. This empirical application is the first time five autoregressive moving average (ARMA) models and two generalized autoregressive conditional heteroscedasticity (GARCH) volatility models have been included in the input dataset as a mixed-model approach to training the NNs. Other significant contributions include the sliding window approach to training and estimating the NN models and the use of two fitness functions.

Experimental results reveal that the sliding window approach to modelling the Crack Spread is effective when using 380- and 500-day training periods; sliding windows of less than 380 days produced unsatisfactory trading performance and reduced statistical accuracy. The PSO RBF model trained over 380 days is superior in both trading performance and statistical accuracy compared to its peers. As the volatility and maximum drawdown of each unfiltered model were unattractive, a threshold confirmation filter is employed.
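
A minimal sketch of the sliding-window protocol just described, assuming a generic fit/predict estimator interface and pandas-aligned inputs (all names here are illustrative, not the authors' code):

```python
import pandas as pd

def sliding_window_forecasts(X: pd.DataFrame, y: pd.Series,
                             make_model, window: int = 380) -> pd.Series:
    """Retrain daily on the trailing `window` observations and emit a
    one-day-ahead forecast, as in the 380/500-day protocol above.
    `make_model` stands in for the PSO RBF (or MLP) estimator; any
    object exposing fit/predict works."""
    out = {}
    for t in range(window, len(y)):
        model = make_model()  # a fresh model is re-trained each trading day
        model.fit(X.iloc[t - window:t], y.iloc[t - window:t])
        out[y.index[t]] = model.predict(X.iloc[[t]])[0]  # next-day forecast
    return pd.Series(out)
```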

This threshold confirmation filter trades only when the forecasted return exceeds an optimized threshold, so that only forecasts of stronger conviction produce trading signals. The filter aims to reduce maximum drawdown and volatility by trading less frequently and only during times of greater predicted change. Ultimately, the confirmation filter improves the risk-return profile of each model and also significantly reduces transaction costs.
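
The filter rule itself reduces to a comparison of each forecast against the optimized threshold ‘x’ reported in note 7; a minimal sketch, with ‘x’ expressed as a decimal return:

```python
import numpy as np

def confirmation_filter(forecasts: np.ndarray, x: float) -> np.ndarray:
    """Map forecasted returns to positions: +1 long, -1 short, 0 out of
    the market. A trade is taken only when |forecast| exceeds x."""
    signals = np.sign(forecasts)
    signals[np.abs(forecasts) < x] = 0  # ignore weak-conviction forecasts
    return signals

# Example with the RBF threshold from note 7 (x = 0.20 % = 0.0020):
# confirmation_filter(np.array([0.0031, -0.0007, -0.0045]), 0.0020)
# -> array([ 1.,  0., -1.])
```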


Notes

  1. This was seen to be less environmentally friendly than its alternative, ethanol. As a result, the new blend now comprises 10 % ethanol.

  2. For this application, a total of 30 hidden neurons was used during the MLP training process.

  3. This activation function is non-monotonic, which makes it difficult for the weights to vary sufficiently from their initial positions and can result in a much larger number of local minima in the error surface [14].

  4. For the purpose of forecasting, the proposed PSO RBF model utilizes a constant hidden layer of ten neurons. Tests were conducted in which the algorithm searched for the ‘optimal’ number of hidden neurons; these tests selected far more than ten neurons, and as a result the PSO RBF was found to ‘over-fit’ the data in most cases. This can be checked by observing the best-weights output and comparing training with fewer fixed neurons against what the algorithm would use if tasked with identifying the ‘optimal’ number of neurons. With this in mind, a number of experiments were run using varying numbers of hidden neurons. All of the PSO RBF parameters are provided in Appendix A4, and the best weights for each of the models are included in Appendix A5.
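
For concreteness, a sketch of the forward pass of a Gaussian RBF network with such a fixed hidden layer; this is a generic formulation, not the authors' exact parameterization, and in the proposed model the centres, widths and output weights would be the quantities tuned by the PSO:

```python
import numpy as np

def rbf_forward(x, centers, widths, weights, bias=0.0):
    """Output of a Gaussian RBF network for one input vector `x`.
    centers: (10, n_inputs), widths: (10,), weights: (10,)."""
    dist2 = np.sum((centers - x) ** 2, axis=1)      # squared distance to each centre
    hidden = np.exp(-dist2 / (2.0 * widths ** 2))   # Gaussian activations
    return float(hidden @ weights) + bias           # linear output layer
```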

  5. The number of hidden neurons is multiplied by 10^-2 because the simplicity of the derived neural network is of secondary importance compared with the other two objectives (maximizing the annualized return and minimizing the MSE).
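
A minimal sketch of a composite fitness of this form, assuming the three objectives are combined additively (the exact weighting of the return and MSE terms is not reproduced here):

```python
def fitness(annualized_return: float, mse: float, n_hidden: int) -> float:
    """Composite PSO fitness (higher is better): reward annualized return,
    penalize MSE, and lightly penalize network size, with the neuron
    count scaled by 10**-2 as described in note 5."""
    return annualized_return - mse - 1e-2 * n_hidden
```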

  6. Intel Core i5 processors were used during both the backtesting and forecasting phases. Furthermore, in order to reduce estimation time, four out of the five cores were utilized via the Parallel Computing Toolbox in MATLAB 2011.

  7. For the RBF 380- and 500-day models, the ‘x’ parameter = 0.20 %. For the MLP 380-day model, the ‘x’ parameter = 1.90 %, and for the MLP 500-day model, ‘x’ = 1.45 %.

References

  1. Dunis, C. L., Laws, J., & Evans, B. (2005). Modelling and trading the gasoline crack spread: A non-linear story. Derivatives Use, Trading & Regulation, 12(1–2), 126–145.

  2. Butterworth, D., & Holmes, P. (2002). Inter-market spread trading: Evidence from UK index futures markets. Applied Financial Economics, 12(11), 783–791.

  3. Dunis, C. L., Laws, J., & Evans, B. (2006). Modelling and trading the soybean-oil crush spread with recurrent and higher order networks: A comparative analysis. Neural Network World, 13(3/6), 193–213.

  4. Dunis, C. L., Laws, J., & Middleton, P. W. (2011). Modelling and trading the corn/ethanol crush spread with neural networks. CIBEF Working Paper, Liverpool Business School. Available at www.cibef.com.

  5. Chen, L. H. C., Finney, M., & Lai, K. S. (2005). A threshold cointegration analysis of asymmetric price transmission from crude oil to gasoline prices. Economics Letters, 89, 233–239.

  6. Enders, W., & Granger, C. W. J. (1998). Unit-root tests and asymmetric adjustment with an example using the term structure of interest rates. Journal of Business and Economic Statistics, 16(3), 304–311.

  7. Kaastra, I., & Boyd, M. (1996). Designing a neural network for forecasting financial and economic time series. Neurocomputing, 10(3), 215.

  8. Tsai, C. F., & Wang, S. P. (2009). Stock price forecasting by hybrid machine learning techniques. Proceedings of the International Multi-Conference of Engineers and Computer Scientists, 1, 755–760.

  9. Chang, P. C., Wang, Y. W., & Yang, W. N. (2004). An investigation of the hybrid forecasting models for stock price variation in Taiwan. Journal of the Chinese Institute of Industrial Engineering, 21(4), 358–368.

  10. Thawornwong, S., & Enke, D. (2004). The adaptive selection of financial and economic variables for use with artificial neural networks. Neurocomputing, 56, 205–232.

  11. Cortez, P., Rocha, M., & Neves, J. (2001). Evolving time series forecasting neural network models. In Proceedings of the 3rd International Symposium on Adaptive Systems: Evolutionary Computation and Probabilistic Graphical Models (ISAS 2001) (pp. 84–91).

  12. Mettenheim, H. J. von, & Breitner, M. H. (2012). Forecasting daily highs and lows of liquid assets with neural networks. Operations Research Proceedings, Selected Papers of the Annual International Conference of the German Operations Research Society, Hannover.

  13. Dunis, C. L., Laws, J., & Karathanassopoulos, A. (2011). Modelling and trading the Greek stock market with mixed neural network models. Applied Financial Economics, 21(23), 1793–1808.

  14. Sopena, J. M., Romero, E., & Alquezar, R. (1999). Neural networks with periodic and monotonic activation functions: A comparative study in classification problems. In Proceedings of the 9th International Conference on Artificial Neural Networks.

  15. Park, J. W., Harley, R. G., & Venayagamoorthy, G. K. (2002). Comparison of MLP and RBF neural networks using deviation signals for on-line identification of a synchronous generator. Proceedings of the Power Engineering Society Winter Meeting, IEEE, 1(27–31), 274–279.

  16. Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Proceedings of the IEEE International Conference on Neural Networks, 4, 1942–1948.

  17. Chen, Z., & Qian, P. (2009). Application of PSO-RBF neural network in network intrusion detection. Intelligent Information Technology Application, 1, 362–364.

  18. Parsopoulos, K. E., & Vrahatis, M. N. (2010). Particle swarm optimization and intelligence: Advances and applications. IGI Global, 149–164.

  19. Sermpinis, G., Theofilatos, K., Karathanasopoulos, A., Georgopoulos, E., & Dunis, C. (2013). Forecasting foreign exchange rates with adaptive neural networks using radial-basis functions and particle swarm optimization. European Journal of Operational Research, 225(3), 528–540.

  20. Li, J., & Xiao, X. (2008). Multi-swarm and multi-best particle swarm optimization algorithm. In Proceedings of the 7th World Congress on Intelligent Control and Automation, China (pp. 6281–6286).

  21. Mohaghegi, S., Valle, Y., Venayagamoorthy, G., & Harley, R. (2005). A comparison of PSO and backpropagation for training RBF neural networks for identification of a power system with STATCOM. In Proceedings of the IEEE Swarm Intelligence Symposium (pp. 381–384).

  22. Eberhart, R. C., Simpson, P. K., & Dobbins, R. W. (1996). Computational intelligence PC tools. Boston, MA: Academic Press Professional.

  23. Dunis, C. L. (1996). The economic value of neural network systems for exchange rate forecasting. Neural Network World, 1, 43–55.

  24. Makridakis, S. (1989). Why combining works? International Journal of Forecasting, 5, 601–603.

  25. Clemen, R. T. (1989). Combining forecasts: A review and annotated bibliography. International Journal of Forecasting, 5, 559–583.

  26. Newbold, P., & Granger, C. W. J. (1974). Experience with forecasting univariate time series and the combination of forecasts (with discussion). Journal of the Royal Statistical Society, Series A, 137, 131–164.

  27. Palm, F. C., & Zellner, A. (1992). To combine or not to combine? Issues of combining forecasts. Journal of Forecasting, 11, 687–701.


Author information

Correspondence to Christian L. Dunis.

Appendix

1.1 Performance Measures

See Table 3.7

Table 3.7 Statistical and trading performance measures

1.2 Supplementary Information

See Table 3.8

Table 3.8 The refiner’s market capitalization

1.3 ARMA Equations and Estimations

Autoregressive moving average (ARMA) models assume that the future value of a time series is governed by its historical values (the autoregressive component) and by previous residual values (the moving average component). A typical ARMA model takes the form of Eq. (3.15).

$$ {Y}_t={\phi}_0+{\phi}_1{Y}_{t-1}+{\phi}_2{Y}_{t-2}+\dots +{\phi}_p{Y}_{t-p}+{\varepsilon}_t-{w}_1{\varepsilon}_{t-1}-{w}_2{\varepsilon}_{t-2}-\dots -{w}_q{\varepsilon}_{t-q} $$
(3.15)

where:

$Y_t$ is the dependent variable at time $t$;

$Y_{t-1}, Y_{t-2}, \dots, Y_{t-p}$ are the lagged dependent variables;

$\phi_0, \phi_1, \phi_2, \dots, \phi_p$ are regression coefficients;

$\varepsilon_t$ is the residual term;

$\varepsilon_{t-1}, \varepsilon_{t-2}, \dots, \varepsilon_{t-q}$ are previous values of the residual; and

$w_1, w_2, \dots, w_q$ are the weights.
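
For illustration, a one-step-ahead forecast from Eq. (3.15) follows directly from the fitted coefficients by setting the expectation of the current residual to zero; a minimal sketch (the function and argument names are illustrative):

```python
import numpy as np

def arma_one_step(phi0, phi, w, y_hist, eps_hist):
    """One-step-ahead ARMA(p, q) forecast per Eq. (3.15), with E[eps_t] = 0.
    y_hist:   (Y_{t-1}, ..., Y_{t-p}), most recent first
    eps_hist: (eps_{t-1}, ..., eps_{t-q}), most recent first"""
    return phi0 + np.dot(phi, y_hist) - np.dot(w, eps_hist)
```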

Using a correlogram as a guide over the training and test sub-periods, the restricted ARMA models below were selected to trade each spread. All coefficients were found to be significant at the 95 % confidence level; therefore, the null hypothesis that all coefficients (except the constant) are not significantly different from zero is rejected at the 95 % confidence level (Table 3.9).

Table 3.9 ARMA equations

1.3.1 GARCH Equations and Estimations

Each of the GARCH models, (16,16) and (15,15), is deemed stable and significant at the 95 % confidence level. Following the initial estimation of significant terms, a squared-residuals test, a Jarque-Bera test and an ARCH test were all conducted to check the reliability of the residuals. For the sake of brevity, outputs from these tests are not included; they can be obtained on request from the corresponding author. Autocorrelation is absent from both models, and as a result returns derived from each model were used as inputs during the training of the proposed PSO RBF neural network (Table 3.10).

Table 3.10 GARCH model # 1

Observation

The AR(1), AR(2), AR(10), AR(16), MA(1), MA(2), MA(10) and MA(16) terms are all deemed significant at the 95 % confidence level. The model is also deemed stable because the sum of the GARCH(-1) and RESID(-1)^2 coefficients is less than 1: 0.896013 + 0.083038 = 0.979051 (Table 3.11).

Table 3.11 GARCH model # 2

Observation

The AR(1), AR(4), AR(15), MA(1), MA(4) and MA(15) terms are all deemed significant at the 95 % confidence level. The model is also deemed stationary because the sum of the GARCH(-1) and RESID(-1)^2 coefficients is less than 1: 0.883565 + 0.091396 = 0.974961.
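
The stability check in both observations reduces to a one-line persistence calculation; a minimal sketch reproducing the two sums quoted above:

```python
def garch_persistence(arch_coef: float, garch_coef: float) -> float:
    """Sum of the RESID(-1)^2 (ARCH) and GARCH(-1) coefficients; the
    model is covariance-stationary when this sum is below one."""
    return arch_coef + garch_coef

print(garch_persistence(0.083038, 0.896013))  # model 1 -> 0.979051
print(garch_persistence(0.091396, 0.883565))  # model 2 -> 0.974961
```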

1.4 PSO Parameters

See Tables 3.12 and 3.13

Table 3.12 PSO RBF parameters
Table 3.13 Neural characteristics

1.5 Best Weights over the Training Windows

See Tables 3.14 and 3.15

Table 3.14 Best weights obtained from the 380-day training window
Table 3.15 Best weights obtained from the 500-day training window

Copyright information

© 2016 The Author(s)

About this chapter

Cite this chapter

Dunis, C.L., Middleton, P.W., Theofilatos, K., Karathanasopoulos, A. (2016). Modelling, Forecasting and Trading the Crack: A Sliding Window Approach to Training Neural Networks. In: Dunis, C., Middleton, P., Karathanasopolous, A., Theofilatos, K. (eds) Artificial Intelligence in Financial Markets. New Developments in Quantitative Trading and Investment. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-137-48880-0_3

  • DOI: https://doi.org/10.1057/978-1-137-48880-0_3

  • Publisher Name: Palgrave Macmillan, London

  • Print ISBN: 978-1-137-48879-4

  • Online ISBN: 978-1-137-48880-0

  • eBook Packages: Economics and Finance (R0)
