
Modelling, Forecasting and Trading the Crack: A Sliding Window Approach to Training Neural Networks

Artificial Intelligence in Financial Markets

Abstract

The aim of this analysis is to expand on earlier work by Dunis et al. (Modelling and trading the gasoline crack spread: A non-linear story. Derivatives Use, Trading & Regulation, 12(1–2), 126–145, 2005), who modelled the Crack Spread from 1 January 1995 to 1 January 2005. This chapter provides a more sophisticated approach to the non-linear modelling of the ‘Crack’. The selected trading period covers 777 trading days, starting on 9 April 2010 and ending on 28 March 2013. The proposed model combines a particle swarm optimizer (PSO) with a radial basis function (RBF) neural network (NN), trained daily using sliding windows of 380 and 500 days. Performance is benchmarked against a multi-layer perceptron (MLP) NN using the same training protocol. The outputs of the neural networks provide forecasts for one-day-ahead trading simulations. To model the spread, an expansive universe of 59 inputs across different asset classes is used. This empirical application is the first time five autoregressive moving average (ARMA) models and two generalized autoregressive conditional heteroscedasticity (GARCH) volatility models have been included in the input dataset as a mixed-model approach to training the NNs. Other significant contributions include the sliding window approach to training and estimating the NN models and the use of two fitness functions.

Experimental results reveal that the sliding window approach to modelling the Crack Spread is effective when using 380- and 500-day training periods; sliding windows of less than 380 days produced unsatisfactory trading performance and reduced statistical accuracy. The PSO RBF model trained over 380 days is superior in both trading performance and statistical accuracy compared to its peers. As the volatility and maximum drawdown of each unfiltered model were unattractive, a threshold confirmation filter is employed.
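
A minimal sketch of the sliding-window protocol just described, assuming a generic fit/predict estimator interface and pandas-aligned inputs (all names here are illustrative, not the authors' code):

```python
import pandas as pd

def sliding_window_forecasts(X: pd.DataFrame, y: pd.Series,
                             make_model, window: int = 380) -> pd.Series:
    """Retrain daily on the trailing `window` observations and emit a
    one-day-ahead forecast, as in the 380/500-day protocol above.
    `make_model` stands in for the PSO RBF (or MLP) estimator; any
    object exposing fit/predict works."""
    out = {}
    for t in range(window, len(y)):
        model = make_model()  # a fresh model is re-trained each trading day
        model.fit(X.iloc[t - window:t], y.iloc[t - window:t])
        out[y.index[t]] = model.predict(X.iloc[[t]])[0]  # next-day forecast
    return pd.Series(out)
```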

This threshold confirmation filter trades only when the forecasted return exceeds an optimized threshold, so that only forecasts of stronger conviction produce trading signals. The filter aims to reduce maximum drawdown and volatility by trading less frequently and only during times of greater predicted change. Ultimately, the confirmation filter improves the risk-return profile of each model and also significantly reduces transaction costs.
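
The filter rule itself reduces to a comparison of each forecast against the optimized threshold ‘x’ reported in note 7; a minimal sketch, with ‘x’ expressed as a decimal return:

```python
import numpy as np

def confirmation_filter(forecasts: np.ndarray, x: float) -> np.ndarray:
    """Map forecasted returns to positions: +1 long, -1 short, 0 out of
    the market. A trade is taken only when |forecast| exceeds x."""
    signals = np.sign(forecasts)
    signals[np.abs(forecasts) < x] = 0  # ignore weak-conviction forecasts
    return signals

# Example with the RBF threshold from note 7 (x = 0.20 % = 0.0020):
# confirmation_filter(np.array([0.0031, -0.0007, -0.0045]), 0.0020)
# -> array([ 1.,  0., -1.])
```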


Notes

  1. This was seen to be less environmentally friendly than its alternative, ethanol. As a result, the new blend now comprises 10 % ethanol.

  2. For this application, a total of 30 hidden neurons was used during the MLP training process.

  3. This activation function is non-monotonic, which makes it difficult for the weights to vary sufficiently from their initial positions and can result in a much larger number of local minima in the error surface [14].

  4. For the purpose of forecasting, the proposed PSO RBF model utilizes a constant hidden layer of ten neurons. Tests were conducted in which the algorithm searched for the ‘optimal’ number of hidden neurons; these tests selected far more than ten neurons, and as a result the PSO RBF was found to ‘over-fit’ the data in most cases. This can be checked by observing the best-weights output and comparing training with fewer fixed neurons against what the algorithm would use if tasked with identifying the ‘optimal’ number of neurons. With this in mind, a number of experiments were run using varying numbers of hidden neurons. All of the PSO RBF parameters are provided in Appendix A4, and the best weights for each of the models are included in Appendix A5.
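
For concreteness, a sketch of the forward pass of a Gaussian RBF network with such a fixed hidden layer; this is a generic formulation, not the authors' exact parameterization, and in the proposed model the centres, widths and output weights would be the quantities tuned by the PSO:

```python
import numpy as np

def rbf_forward(x, centers, widths, weights, bias=0.0):
    """Output of a Gaussian RBF network for one input vector `x`.
    centers: (10, n_inputs), widths: (10,), weights: (10,)."""
    dist2 = np.sum((centers - x) ** 2, axis=1)      # squared distance to each centre
    hidden = np.exp(-dist2 / (2.0 * widths ** 2))   # Gaussian activations
    return float(hidden @ weights) + bias           # linear output layer
```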

  5. The number of hidden neurons is multiplied by 10^-2 because the simplicity of the derived neural network is of secondary importance compared with the other two objectives (maximizing the annualized return and minimizing the MSE).
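
A minimal sketch of a composite fitness of this form, assuming the three objectives are combined additively (the exact weighting of the return and MSE terms is not reproduced here):

```python
def fitness(annualized_return: float, mse: float, n_hidden: int) -> float:
    """Composite PSO fitness (higher is better): reward annualized return,
    penalize MSE, and lightly penalize network size, with the neuron
    count scaled by 10**-2 as described in note 5."""
    return annualized_return - mse - 1e-2 * n_hidden
```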

  6. Intel Core i5 processors were used during both the backtesting and forecasting phases. Furthermore, in order to reduce estimation time, four out of the five cores were utilized via the Parallel Computing Toolbox in MATLAB 2011.

  7. For the RBF 380- and 500-day models, the ‘x’ parameter = 0.20 %. For the MLP 380-day model, the ‘x’ parameter = 1.90 %, and for the MLP 500-day model, ‘x’ = 1.45 %.

References

  1. Dunis, C. L., Laws, J., & Evans, B. (2005). Modelling and trading the gasoline crack spread: A non-linear story. Derivatives Use, Trading & Regulation, 12(1–2), 126–145.

  2. Butterworth, D., & Holmes, P. (2002). Inter-market spread trading: Evidence from UK index futures markets. Applied Financial Economics, 12(11), 783–791.

  3. Dunis, C. L., Laws, J., & Evans, B. (2006). Modelling and trading the soybean-oil crush spread with recurrent and higher order networks: A comparative analysis. Neural Network World, 13(3/6), 193–213.

  4. Dunis, C. L., Laws, J., & Middleton, P. W. (2011). Modelling and trading the corn/ethanol crush spread with neural networks. CIBEF Working Paper, Liverpool Business School. Available at www.cibef.com.

  5. Chen, L. H. C., Finney, M., & Lai, K. S. (2005). A threshold cointegration analysis of asymmetric price transmission from crude oil to gasoline prices. Economics Letters, 89, 233–239.

  6. Enders, W., & Granger, C. W. J. (1998). Unit-root tests and asymmetric adjustment with an example using the term structure of interest rates. Journal of Business and Economic Statistics, 16(3), 304–311.

  7. Kaastra, I., & Boyd, M. (1996). Designing a neural network for forecasting financial and economic time series. Neurocomputing, 10(3), 215.

  8. Tsai, C. F., & Wang, S. P. (2009). Stock price forecasting by hybrid machine learning techniques. Proceedings of the International Multi-Conference of Engineers and Computer Scientists, 1, 755–760.

  9. Chang, P. C., Wang, Y. W., & Yang, W. N. (2004). An investigation of the hybrid forecasting models for stock price variation in Taiwan. Journal of the Chinese Institute of Industrial Engineering, 21(4), 358–368.

  10. Thawornwong, S., & Enke, D. (2004). The adaptive selection of financial and economic variables for use with artificial neural networks. Neurocomputing, 56, 205–232.

  11. Cortez, P., Rocha, M., & Neves, J. (2001). Evolving time series forecasting neural network models. In Proceedings of the 3rd International Symposium on Adaptive Systems: Evolutionary Computation and Probabilistic Graphical Models (ISAS 2001) (pp. 84–91).

  12. Mettenheim, H. J. von, & Breitner, M. H. (2012). Forecasting daily highs and lows of liquid assets with neural networks. Operations Research Proceedings, Selected Papers of the Annual International Conference of the German Operations Research Society, Hannover.

  13. Dunis, C. L., Laws, J., & Karathanassopoulos, A. (2011). Modelling and trading the Greek stock market with mixed neural network models. Applied Financial Economics, 21(23), 1793–1808.

  14. Sopena, J. M., Romero, E., & Alquezar, R. (1999). Neural networks with periodic and monotonic activation functions: A comparative study in classification problems. In Proceedings of the 9th International Conference on Artificial Neural Networks.

  15. Park, J. W., Harley, R. G., & Venayagamoorthy, G. K. (2002). Comparison of MLP and RBF neural networks using deviation signals for on-line identification of a synchronous generator. Proceedings of the Power Engineering Society Winter Meeting, IEEE, 1(27–31), 274–279.

  16. Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Proceedings of the IEEE International Conference on Neural Networks, 4, 1942–1948.

  17. Chen, Z., & Qian, P. (2009). Application of PSO-RBF neural network in network intrusion detection. Intelligent Information Technology Application, 1, 362–364.

  18. Parsopoulos, K. E., & Vrahatis, M. N. (2010). Particle swarm optimization and intelligence: Advances and applications. IGI Global, 149–164.

  19. Sermpinis, G., Theofilatos, K., Karathanasopoulos, A., Georgopoulos, E., & Dunis, C. (2013). Forecasting foreign exchange rates with adaptive neural networks using radial-basis functions and particle swarm optimization. European Journal of Operational Research, 225(3), 528–540.

  20. Li, J., & Xiao, X. (2008). Multi-swarm and multi-best particle swarm optimization algorithm. In Proceedings of the 7th World Congress on Intelligent Control and Automation, China (pp. 6281–6286).

  21. Mohaghegi, S., Valle, Y., Venayagamoorthy, G., & Harley, R. (2005). A comparison of PSO and backpropagation for training RBF neural networks for identification of a power system with STATCOM. In Proceedings of the IEEE Swarm Intelligence Symposium (pp. 381–384).

  22. Eberhart, R. C., Simpson, P. K., & Dobbins, R. W. (1996). Computational intelligence PC tools. Boston, MA: Academic Press Professional.

  23. Dunis, C. L. (1996). The economic value of neural network systems for exchange rate forecasting. Neural Network World, 1, 43–55.

  24. Makridakis, S. (1989). Why combining works? International Journal of Forecasting, 5, 601–603.

  25. Clemen, R. T. (1989). Combining forecasts: A review and annotated bibliography. International Journal of Forecasting, 5, 559–583.

  26. Newbold, P., & Granger, C. W. J. (1974). Experience with forecasting univariate time series and the combination of forecasts (with discussion). Journal of the Royal Statistical Society, Series A, 137, 131–164.

  27. Palm, F. C., & Zellner, A. (1992). To combine or not to combine? Issues of combining forecasts. Journal of Forecasting, 11, 687–701.


Author information

Correspondence to Christian L. Dunis.

Appendix

1.1 Performance Measures

See Table 3.7

Table 3.7 Statistical and trading performance measures

1.2 Supplementary Information

See Table 3.8

Table 3.8 The refiner’s market capitalization

1.3 ARMA Equations and Estimations

Autoregressive moving average (ARMA) models assume that the future value of a time series is governed by its historical values (the autoregressive component) and by previous residual values (the moving average component). A typical ARMA model takes the form of Eq. (3.15).

$$ {Y}_t={\phi}_0+{\phi}_1{Y}_{t-1}+{\phi}_2{Y}_{t-2}+\dots +{\phi}_p{Y}_{t-p}+{\varepsilon}_t-{w}_1{\varepsilon}_{t-1}-{w}_2{\varepsilon}_{t-2}-\dots -{w}_q{\varepsilon}_{t-q} $$
(3.15)

where:

$Y_t$ is the dependent variable at time $t$;

$Y_{t-1}, Y_{t-2}, \dots, Y_{t-p}$ are the lagged dependent variables;

$\phi_0, \phi_1, \phi_2, \dots, \phi_p$ are regression coefficients;

$\varepsilon_t$ is the residual term;

$\varepsilon_{t-1}, \varepsilon_{t-2}, \dots, \varepsilon_{t-q}$ are previous values of the residual; and

$w_1, w_2, \dots, w_q$ are the weights.
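
For illustration, a one-step-ahead forecast from Eq. (3.15) follows directly from the fitted coefficients by setting the expectation of the current residual to zero; a minimal sketch (the function and argument names are illustrative):

```python
import numpy as np

def arma_one_step(phi0, phi, w, y_hist, eps_hist):
    """One-step-ahead ARMA(p, q) forecast per Eq. (3.15), with E[eps_t] = 0.
    y_hist:   (Y_{t-1}, ..., Y_{t-p}), most recent first
    eps_hist: (eps_{t-1}, ..., eps_{t-q}), most recent first"""
    return phi0 + np.dot(phi, y_hist) - np.dot(w, eps_hist)
```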

Using a correlogram as a guide over the training and test sub-periods, the restricted ARMA models below were selected to trade each spread. All coefficients were found to be significant at the 95 % confidence level; therefore, the null hypothesis that all coefficients (except the constant) are not significantly different from zero is rejected at the 95 % confidence level (Table 3.9).

Table 3.9 ARMA equations

1.3.1 GARCH Equations and Estimations

Each of the GARCH models, (16,16) and (15,15), is deemed stable and significant at the 95 % confidence level. Following the initial estimation of significant terms, a squared-residuals test, a Jarque-Bera test and an ARCH test were all conducted to check the reliability of the residuals. For the sake of brevity, outputs from these tests are not included; they can be obtained on request from the corresponding author. Autocorrelation is absent from both models, and as a result returns derived from each model were used as inputs during the training of the proposed PSO RBF neural network (Table 3.10).

Table 3.10 GARCH model # 1

Observation

The AR(1), AR(2), AR(10), AR(16), MA(1), MA(2), MA(10) and MA(16) terms are all deemed significant at the 95 % confidence level. The model is also deemed stable because the sum of the GARCH(-1) and RESID(-1)^2 coefficients is less than 1: 0.896013 + 0.083038 = 0.979051 (Table 3.11).

Table 3.11 GARCH model # 2

Observation

The AR(1), AR(4), AR(15), MA(1), MA(4) and MA(15) terms are all deemed significant at the 95 % confidence level. The model is also deemed stationary because the sum of the GARCH(-1) and RESID(-1)^2 coefficients is less than 1: 0.883565 + 0.091396 = 0.974961.
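
The stability check in both observations reduces to a one-line persistence calculation; a minimal sketch reproducing the two sums quoted above:

```python
def garch_persistence(arch_coef: float, garch_coef: float) -> float:
    """Sum of the RESID(-1)^2 (ARCH) and GARCH(-1) coefficients; the
    model is covariance-stationary when this sum is below one."""
    return arch_coef + garch_coef

print(garch_persistence(0.083038, 0.896013))  # model 1 -> 0.979051
print(garch_persistence(0.091396, 0.883565))  # model 2 -> 0.974961
```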

1.4 PSO Parameters

See Tables 3.12 and 3.13

Table 3.12 PSO RBF parameters
Table 3.13 Neural characteristics

1.5 Best Weights over the Training Windows

See Tables 3.14 and 3.15

Table 3.14 Best weights obtained from the 380-day training window
Table 3.15 Best weights obtained from the 500-day training window

Copyright information

© 2016 The Author(s)

About this chapter

Cite this chapter

Dunis, C.L., Middleton, P.W., Theofilatos, K., Karathanasopoulos, A. (2016). Modelling, Forecasting and Trading the Crack: A Sliding Window Approach to Training Neural Networks. In: Dunis, C., Middleton, P., Karathanasopolous, A., Theofilatos, K. (eds) Artificial Intelligence in Financial Markets. New Developments in Quantitative Trading and Investment. Palgrave Macmillan, London. https://doi.org/10.1057/978-1-137-48880-0_3

  • DOI: https://doi.org/10.1057/978-1-137-48880-0_3

  • Publisher Name: Palgrave Macmillan, London

  • Print ISBN: 978-1-137-48879-4

  • Online ISBN: 978-1-137-48880-0

  • eBook Packages: Economics and Finance (R0)
