Abstract
Financial forecasting is an example of a signal processing problem which is challenging due to small sample sizes, high noise, non-stationarity, and non-linearity. Neural networks have been very successful in a number of signal processing applications. We discuss fundamental limitations and inherent difficulties when using neural networks for the processing of high noise, small sample size signals. We introduce a new intelligent signal processing method which addresses these difficulties. The proposed method converts the signal into a symbolic representation with a self-organizing map, and performs grammatical inference with recurrent neural networks. We apply the method to the prediction of daily foreign exchange rates, addressing difficulties with non-stationarity, overfitting, and unequal a priori class probabilities, and we find significant predictability in comprehensive experiments covering 5 different foreign exchange rates. The method correctly predicts the direction of change for the next day with an error rate of 47.1%. The error rate reduces to around 40% when rejecting examples where the system has low confidence in its prediction. We show that the symbolic representation aids the extraction of symbolic knowledge from the trained recurrent neural networks in the form of deterministic finite state automata. These automata explain the operation of the system and are often relatively simple. Automata rules related to well known behavior such as trend following and mean reversal are extracted.
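The symbolic-conversion step described above can be illustrated with a minimal sketch. Here a tiny one-dimensional self-organizing map quantizes daily returns into a small alphabet of symbols, which would then be fed to a recurrent network for grammatical inference. This is an illustrative reconstruction, not the authors' implementation: the map size, learning-rate schedule, and toy price series are all assumptions.

```python
import math
import random

def som_1d(values, n_symbols=3, epochs=50, lr0=0.5, seed=0):
    """Train a tiny 1-D self-organizing map on scalar returns and
    return the learned codebook (one weight per symbol), sorted."""
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    # initialize codebook weights evenly across the data range
    w = [lo + (hi - lo) * (i + 0.5) / n_symbols for i in range(n_symbols)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)              # decaying learning rate
        radius = 1.0 * (1 - epoch / epochs) + 1e-9   # shrinking neighborhood
        for x in rng.sample(values, len(values)):
            # find the best-matching unit (nearest codebook entry)
            bmu = min(range(n_symbols), key=lambda i: abs(w[i] - x))
            for i in range(n_symbols):
                d = abs(i - bmu)
                # Gaussian neighborhood: the BMU moves most, neighbors less
                h = math.exp(-d * d / (2 * radius * radius))
                w[i] += lr * h * (x - w[i])
    return sorted(w)

def encode(values, codebook):
    """Map each return to the index of its nearest codebook entry."""
    return [min(range(len(codebook)), key=lambda i: abs(codebook[i] - x))
            for x in values]

# Toy price series (hypothetical data, for illustration only)
prices = [1.00, 1.01, 0.99, 1.02, 1.03, 1.01, 1.00, 1.04]
returns = [(b - a) / a for a, b in zip(prices, prices[1:])]
codebook = som_1d(returns, n_symbols=3)
symbols = encode(returns, codebook)
print(symbols)  # symbol sequence (0..2) that would be fed to the RNN
```

The key design point mirrored from the abstract is that the continuous, noisy return series is reduced to a short symbol sequence before any sequence learning takes place, which both reduces the effect of noise and makes the later extraction of finite state automata rules natural.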
Giles, C.L., Lawrence, S. & Tsoi, A.C. Noisy Time Series Prediction using Recurrent Neural Networks and Grammatical Inference. Machine Learning 44, 161–183 (2001). https://doi.org/10.1023/A:1010884214864