Computational Statistics, Volume 31, Issue 4, pp 1305–1325

Numerically stable, scalable formulas for parallel and online computation of higher-order multivariate central moments with arbitrary weights

  • Philippe Pébay
  • Timothy B. Terriberry
  • Hemanth Kolla
  • Janine Bennett
Original Paper

Abstract

Formulas for incremental or parallel computation of second-order central moments have long been known, and recent extensions of these formulas to univariate and multivariate moments of arbitrary order have been developed. Such formulas are of key importance in scenarios where incremental results are required and in parallel and distributed systems where communication costs are high. We survey these recent results and improve them with arbitrary-order, numerically stable one-pass formulas, which we further extend with weighted and compound variants. We also develop a generalized correction factor for standard two-pass algorithms that maintains accuracy over nearly the full representable range of the input, avoiding the need for extended-precision arithmetic. We then empirically examine algorithm correctness for pairwise update formulas up to order four, as well as condition number and relative error bounds for eight different central moment formulas, each up to degree six, to address the trade-offs between numerical accuracy and speed of the various algorithms. Finally, we demonstrate the use of the most elaborate of these formulas, using the compound moments, in a practical large-scale scientific application.
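
As an illustration of the long-known second-order case mentioned at the start of the abstract, the following minimal Python sketch shows a one-pass incremental update and a pairwise merge of the mean and the sum of squared deviations. It uses the classical updates usually attributed to Welford (incremental) and to Chan, Golub and LeVeque (pairwise); it is not the arbitrary-order, weighted, or compound formulation developed in the paper. The names `Moments`, `push`, `merge`, and `from_values` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Moments:
    """Running count, mean, and second-order central sum (hypothetical helper)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def push(self, x: float) -> None:
        # One-pass incremental update: fold a single observation in.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)


def merge(a: Moments, b: Moments) -> Moments:
    # Pairwise update: combine the summaries of two disjoint data sets
    # without revisiting the underlying observations.
    n = a.n + b.n
    if n == 0:
        return Moments()
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Moments(n, mean, m2)


def from_values(values) -> Moments:
    # Build a summary from an iterable using only incremental updates.
    acc = Moments()
    for x in values:
        acc.push(x)
    return acc
```

In a parallel setting, each worker would build its own `Moments` summary over its share of the data, and only the three numbers per summary would need to be communicated before calling `merge`. The sample variance follows as `m2 / (n - 1)` when `n > 1`.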

Keywords

Descriptive statistics · Statistical moments · Parallel computing · Large data analysis

Copyright information

© Springer-Verlag Berlin Heidelberg (outside the USA) 2016

Authors and Affiliations

  • Philippe Pébay (1)
  • Timothy B. Terriberry (2)
  • Hemanth Kolla (3)
  • Janine Bennett (4)

  1. Sandia National Laboratories, Livermore, USA
  2. The Xiph.Org Foundation, Arlington, USA
  3. Sandia National Laboratories, Livermore, USA
  4. Sandia National Laboratories, Livermore, USA
