Abstract
Formulas for the incremental or parallel computation of second-order central moments have long been known, and they have recently been extended to univariate and multivariate moments of arbitrary order. Such formulas are of key importance where incremental results are required, and in parallel and distributed systems where communication costs are high. We survey these recent results and improve on them with arbitrary-order, numerically stable one-pass formulas, which we further extend with weighted and compound variants. We also develop a generalized correction factor for standard two-pass algorithms that maintains accuracy over nearly the full representable range of the input, avoiding the need for extended-precision arithmetic. We then empirically examine algorithm correctness for pairwise update formulas up to order four, as well as condition numbers and relative error bounds for eight different central moment formulas, each up to degree six, to address the trade-offs between numerical accuracy and speed among the various algorithms. Finally, we demonstrate the most elaborate of these formulas by using the compound moments in a practical large-scale scientific application.
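To make the notion of a pairwise update formula concrete, the following is a minimal sketch of the classical second-order case only: two partial results, each holding a count, a mean, and a sum of squared deviations, are combined without revisiting the data. It is not the arbitrary-order, weighted, or compound formulas developed in the article, and the names `Moments`, `merge`, and `from_values` are illustrative assumptions rather than anything defined there.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Moments:
    """Partial result for one chunk of data (hypothetical container)."""
    n: int = 0          # number of observations
    mean: float = 0.0   # running mean
    m2: float = 0.0     # sum of squared deviations from the mean


def merge(a: Moments, b: Moments) -> Moments:
    """Combine two partial results with the second-order pairwise update."""
    n = a.n + b.n
    if n == 0:
        return Moments()
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Moments(n, mean, m2)


def from_values(values: Iterable[float]) -> Moments:
    """One-pass accumulation: merge each value in as a chunk of size one."""
    acc = Moments()
    for x in values:
        acc = merge(acc, Moments(1, float(x), 0.0))
    return acc


# Values with a large common offset, where the naive sum-of-squares
# formula would lose most of its significant digits.
data = [1e9 + 4.0, 1e9 + 7.0, 1e9 + 13.0, 1e9 + 16.0]
whole = from_values(data)
halves = merge(from_values(data[:2]), from_values(data[2:]))
sample_variance = whole.m2 / (whole.n - 1)
```

Because `merge` is the only operation, the same function serves both the online case (chunks of size one) and the parallel case (one partial result per process, combined in any order), which is why such formulas suit settings where communication is expensive.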
Additional information
Dedicated to the Memory of Dr Timothy J. Baker (1948–2006).
Philippe Pébay, Hemanth Kolla and Janine Bennett: These authors were supported by the United States Department of Energy, Office of Science, Office of Defense, and Sandia LDRD Program. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed-Martin Company, for the United States Department of Energy under Contract DE-AC04-94-AL85000.
Cite this article
Pébay, P., Terriberry, T.B., Kolla, H. et al. Numerically stable, scalable formulas for parallel and online computation of higher-order multivariate central moments with arbitrary weights. Comput Stat 31, 1305–1325 (2016). https://doi.org/10.1007/s00180-015-0637-z
DOI: https://doi.org/10.1007/s00180-015-0637-z