Journal of Signal Processing Systems, Volume 77, Issue 1–2, pp 117–129

Pipelined HAC Estimation Engines for Multivariate Time Series

  • Ce Guo
  • Wayne Luk


Abstract

Heteroskedasticity and autocorrelation consistent (HAC) covariance matrix estimation, or HAC estimation for short, is one of the most important techniques in time series analysis and forecasting. It serves as a powerful analytical tool for hypothesis testing and model verification. However, HAC estimation for long and high-dimensional time series is computationally expensive. This paper describes a pipeline-friendly HAC estimation algorithm derived from a mathematical specification by applying transformations that eliminate conditionals, parallelise arithmetic, and promote data reuse in computation. We discuss an initial hardware architecture for the proposed algorithm, and propose two optimised architectures that improve the worst-case performance. Experimental systems based on the proposed architectures demonstrate high performance, especially for long time series. One experimental system achieves up to 12 times speedup over an optimised software implementation running on 12 CPU cores.
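The classical (non-pipelined) HAC computation that the paper accelerates can be sketched in software as follows. This is a minimal Newey-West-style sketch with a Bartlett kernel; the function name `hac_newey_west`, the variable names, and the fixed user-supplied lag are illustrative assumptions, not the paper's pipelined formulation or lag-selection procedure.

```python
import numpy as np

def hac_newey_west(v, lag):
    """Newey-West HAC covariance estimate for a T x k matrix of
    moment vectors v (assumed already demeaned), using the
    Bartlett kernel, which guarantees positive semi-definiteness."""
    T, k = v.shape
    # Lag-0 sample autocovariance
    S = v.T @ v / T
    for j in range(1, lag + 1):
        w = 1.0 - j / (lag + 1)           # Bartlett kernel weight
        gamma = v[j:].T @ v[:-j] / T      # lag-j sample autocovariance
        S += w * (gamma + gamma.T)        # symmetrise each lag's term
    return S
```

The nested dependence on every lag-j autocovariance is what makes the naive computation expensive for long, high-dimensional series, and it is the loop structure that the hardware pipelines restructure.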


Keywords: Time series · HAC estimation · Big data · Acceleration engine · FPGA



Acknowledgements

The authors would like to thank the anonymous reviewers for their constructive comments. This work is supported in part by the China Scholarship Council, by the European Union Seventh Framework Programme under grant agreement numbers 257906, 287804, and 318521, by UK EPSRC, by the Maxeler University Programme, and by Xilinx.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Computing, Imperial College London, London, UK
