Numerically stable, scalable formulas for parallel and online computation of higher-order multivariate central moments with arbitrary weights

Pébay, Philippe; Terriberry, Timothy B.; Kolla, Hemanth; Bennett, Janine

doi:10.1007/s00180-015-0637-z

Original Paper
Published: 29 March 2016

Volume 31, pages 1305–1325, (2016)

Computational Statistics

Philippe Pébay¹,
Timothy B. Terriberry²,
Hemanth Kolla³ &
Janine Bennett⁴

Abstract

Formulas for incremental or parallel computation of second-order central moments have long been known, and recent extensions of these formulas to univariate and multivariate moments of arbitrary order have been developed. Such formulas are of key importance in scenarios where incremental results are required and in parallel and distributed systems where communication costs are high. We survey these recent results and improve them with arbitrary-order, numerically stable one-pass formulas, which we further extend with weighted and compound variants. We also develop a generalized correction factor for standard two-pass algorithms that maintains accuracy over nearly the full representable range of the input, avoiding the need for extended-precision arithmetic. We then empirically examine algorithm correctness for pairwise update formulas up to order four, as well as condition number and relative error bounds for eight different central moment formulas, each up to degree six, to address the trade-offs between numerical accuracy and speed of the various algorithms. Finally, we demonstrate the most elaborate of the above-mentioned formulas by applying compound moments to a practical large-scale scientific application.
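
To make the idea of a pairwise update concrete, the following is a minimal sketch of the classical unweighted, univariate merge of two partial summaries up to order four. It is not the arbitrary-order, weighted, or multivariate formulation developed in the article; the names `Moments`, `from_value`, `merge`, and `summarize` are hypothetical and chosen only for illustration. Each summary stores the count, the mean, and the sums of powers of deviations from the mean (`m2`, `m3`, `m4`), so two summaries can be combined without revisiting the data.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Moments:
    """Partial summary: count, mean, and sums of (x - mean)**p for p = 2, 3, 4."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0
    m3: float = 0.0
    m4: float = 0.0


def from_value(x):
    """Summary of a single observation (all centered sums are zero)."""
    return Moments(1, float(x))


def merge(a, b):
    """Combine two summaries using the classical pairwise update formulas."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    d = b.mean - a.mean  # difference of the two partial means
    mean = a.mean + d * b.n / n
    m2 = a.m2 + b.m2 + d * d * a.n * b.n / n
    m3 = (a.m3 + b.m3
          + d ** 3 * a.n * b.n * (a.n - b.n) / n ** 2
          + 3.0 * d * (a.n * b.m2 - b.n * a.m2) / n)
    m4 = (a.m4 + b.m4
          + d ** 4 * a.n * b.n * (a.n ** 2 - a.n * b.n + b.n ** 2) / n ** 3
          + 6.0 * d * d * (a.n ** 2 * b.m2 + b.n ** 2 * a.m2) / n ** 2
          + 4.0 * d * (a.n * b.m3 - b.n * a.m3) / n)
    return Moments(n, mean, m2, m3, m4)


def summarize(values):
    """Online (one-pass) computation: merge one singleton summary at a time."""
    return reduce(merge, map(from_value, values), Moments())


if __name__ == "__main__":
    # Two "workers" summarize disjoint chunks; the results are then merged.
    left = summarize([1.0, 2.0])
    right = summarize([4.0])
    total = merge(left, right)
    print(total.n, total.mean, total.m2 / total.n)  # count, mean, population variance
```

The online case falls out of the parallel one: adding a single observation is a merge with a one-element summary, which is why `summarize` is written as a fold over `merge`. In a distributed setting only the five numbers in each summary need to be communicated, regardless of chunk size.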

Author information

Authors and Affiliations

Sandia National Laboratories, MS 9159, P.O. Box 969, Livermore, CA, 94551, USA
Philippe Pébay
The Xiph.Org Foundation, 2521 S. Oxford St., Arlington, VA, 22206, USA
Timothy B. Terriberry
Sandia National Laboratories, MS 9158, P.O. Box 969, Livermore, CA, 94551, USA
Hemanth Kolla
Sandia National Laboratories, MS 9152, P.O. Box 969, Livermore, CA, 94551, USA
Janine Bennett

Corresponding author

Correspondence to Philippe Pébay.

Additional information

Dedicated to the Memory of Dr Timothy J. Baker (1948–2006).

Philippe Pébay, Hemanth Kolla and Janine Bennett: These authors were supported by the United States Department of Energy, Office of Science, Office of Defense, and Sandia LDRD Program. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed-Martin Company, for the United States Department of Energy under Contract DE-AC04-94-AL85000.

About this article

Cite this article

Pébay, P., Terriberry, T.B., Kolla, H. et al. Numerically stable, scalable formulas for parallel and online computation of higher-order multivariate central moments with arbitrary weights. Comput Stat 31, 1305–1325 (2016). https://doi.org/10.1007/s00180-015-0637-z

Received: 26 January 2015
Accepted: 09 December 2015
Published: 29 March 2016
Issue Date: December 2016
DOI: https://doi.org/10.1007/s00180-015-0637-z
