Data Mining for Algorithmic Asset Management

Statistical arbitrage refers to a class of algorithmic trading systems implementing data mining strategies. In this chapter we describe a computational framework for statistical arbitrage based on support vector regression. The algorithm learns the fair price of the security under management by minimizing a regularized ε-insensitive loss function in an on-line fashion, using the most recent market information acquired by means of streaming financial data. The difficult issue of adaptive learning in non-stationary environments is addressed by adopting an ensemble learning approach, where a meta-algorithm strategically combines the opinion of a pool of experts. Experimental results based on nearly seven years of historical data for the iShare S&P 500 ETF demonstrate that satisfactory risk-adjusted returns can be achieved by the data mining system even after transaction costs.
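The two ingredients named in the abstract, a regularized ε-insensitive loss and a pool of experts combined by a meta-algorithm, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the chapter's implementation: the model is taken to be linear, the names `eps_insensitive_loss`, `regularized_objective` and `ExpertPool` are hypothetical, and the experts are combined with a generic exponentially weighted average, which may differ from the meta-algorithm the authors use. The on-line solver itself is not reproduced.

```python
import math


def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the eps-wide tube around the target, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def regularized_objective(w, b, xs, ys, eps, C):
    """0.5 * ||w||^2 + C * sum of eps-insensitive losses (linear model assumed)."""
    reg = 0.5 * sum(wi * wi for wi in w)
    fit = sum(
        eps_insensitive_loss(y, sum(wi * xi for wi, xi in zip(w, x)) + b, eps)
        for x, y in zip(xs, ys)
    )
    return reg + C * fit


class ExpertPool:
    """Exponentially weighted average of expert predictions (illustrative only)."""

    def __init__(self, n_experts, eta=1.0, eps=0.0):
        self.weights = [1.0 / n_experts] * n_experts  # start from uniform weights
        self.eta = eta  # learning rate of the meta-algorithm (assumed value)
        self.eps = eps  # tube width used when scoring each expert

    def combine(self, predictions):
        """Pooled estimate: weighted average of the experts' predictions."""
        return sum(w * p for w, p in zip(self.weights, predictions))

    def update(self, predictions, outcome):
        """Shrink each expert's weight according to its latest loss, then renormalize."""
        scaled = [
            w * math.exp(-self.eta * eps_insensitive_loss(outcome, p, self.eps))
            for w, p in zip(self.weights, predictions)
        ]
        total = sum(scaled)
        self.weights = [s / total for s in scaled]
```

In a streaming setting, each expert could be a regression model with its own hyperparameters; `combine` would be called when a new data point arrives and `update` once the corresponding price is observed, so that experts that have recently been accurate gain influence.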

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Department of Mathematics, Imperial College London, Porto Alegre, London
