As Good as GOLD: Gram–Schmidt Orthogonalization by Another Name

Hunter, Michael D.

doi:10.1007/s11336-016-9511-3

Published: 20 September 2016

Psychometrika, Volume 81, pages 969–991 (2016)

Michael D. Hunter¹

Abstract

Generalized orthogonal linear derivative (GOLD) estimates were proposed to correct a problem of correlated estimation errors in generalized local linear approximation (GLLA). This paper shows that GOLD estimates are related to GLLA estimates by the Gram–Schmidt orthogonalization process. Analytical work suggests that GLLA estimates are derivatives of an approximating polynomial and GOLD estimates are linear combinations of these derivatives. A series of simulation studies then further investigates and tests the analytical properties derived. The first study shows that when approximating or smoothing noisy data, GLLA outperforms GOLD, but when interpolating noisy data GOLD outperforms GLLA. The second study shows that when data are not noisy, GLLA always outperforms GOLD in terms of derivative estimation. Thus, when data can be smoothed or are not noisy, GLLA is preferred whereas when they cannot then GOLD is preferred. The last studies show situations where GOLD can produce biased estimates. In spite of these possible shortcomings of GOLD to produce accurate and unbiased estimates, GOLD may still provide adequate or improved model estimation because of its orthogonal error structure. However, GOLD should not be used purely for derivative estimation because the error covariance structure is irrelevant in this case. Future research should attempt to find orthogonal polynomial derivative estimators that produce accurate and unbiased derivatives with an orthogonal error structure.

Notes

Some confusion may have arisen from some of the terminology used for derivative estimation. Time-delays and embeddings sound exotic and potentially mysterious. The fact that the terms originate from the study of nonlinear and chaotic dynamics in physics (Abarbanel, 1996; Abarbanel, Brown, Sidorowich, & Tsimring, 1993) and the embedding theorems from Whitney (1936) in topological mathematics might contribute to this misunderstanding. When conjoined with the practice of estimating derivatives, which has roots in chemical spectroscopy (Savitzky & Golay, 1964), latent growth curves (McArdle & Epstein, 1987), and latent differential equations (Boker et al., 2004), it appears that some number of missteps are inevitable.
Recall that because $\varvec{Q}$ is orthogonal, $\varvec{Q}^\mathsf{T} = \varvec{Q}^{-1}$.
Although the error is generated with standard deviation (SD) equal to $0.25 \times SD_\mathrm{true}$, the SD of the error relative to the fitted observations is not 0.25. It is closer to 0.5. This is because the observations are polynomials. Polynomials explode at the tails and this drives the SD up. But the fitting does not occur in the tails, only at the middle of an 11-point window. So the SD of the fitted observations is much smaller than the SD of the total observations. This means the signal-to-noise ratio (of variances) for the fitted observations is about $\mathrm{mean}(10 \log_{10}(SNRv)) = 4.61$ dB, not $10 \log_{10}(16) = 12.04$ dB. In other words, for the data fitted here, the error does not account for 1/17 (approximately 6%) of the total variance. Rather, it accounts for about $\mathrm{mean}(NR2) = 28.8$% of the total variance. To say it a third time, the raw signal-to-noise ratio is not 16, but rather is closer to 4.
Note that in Table 1 the 3rd and 4th order derivative results are identical. This is necessarily true when using a fourth-order polynomial. In general, the highest two derivative orders possible for a polynomial will always be identical across GLLA and GOLD.
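The nominal figures quoted in the note on error SD above (a variance ratio of 16, 12.04 dB, and an error share of 1/17) follow from the 0.25 SD ratio by simple arithmetic. The sketch below only reproduces that nominal arithmetic; it assumes signal and error are independent so that their variances add, and the variable names are illustrative rather than taken from the article. It does not reproduce the simulation-based values (4.61 dB, 28.8%).

```python
import math

# Nominal design value from the note: error SD is 0.25 times the true-signal SD.
sd_ratio = 0.25

# Signal-to-noise ratio of variances: (SD_true / SD_error)^2.
snr_var = 1.0 / sd_ratio ** 2

# The same ratio expressed in decibels.
snr_db = 10.0 * math.log10(snr_var)

# Share of total variance due to error, assuming independent signal and error
# so that total variance = signal variance + error variance.
noise_share = 1.0 / (1.0 + snr_var)

print(snr_var, round(snr_db, 2), round(noise_share, 3))
```

Under these assumptions the error share is 1/(1 + 16) = 1/17, which is the roughly 6% figure the note contrasts with the larger share observed for the fitted window.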

References

Abarbanel, H. D. I. (1996). Analysis of observed chaotic data. New York: Springer.
Abarbanel, H. D. I., Brown, R., Sidorowich, J. J., & Tsimring, L. S. (1993). The analysis of observed chaotic data in physical systems. Reviews of Modern Physics, 65(4), 1331–1392. doi:10.1103/RevModPhys.65.1331.
Bisconti, T. L., Bergeman, C. S., & Boker, S. M. (2004). Emotional well-being in recently bereaved widows: A dynamical systems approach. Journal of Gerontology, 59B, 158–167.
Bisconti, T. L., Bergeman, C. S., & Boker, S. M. (2006). Social support as a predictor of variability: An examination of the adjustment trajectories of recent widows. Psychology and Aging, 21, 590–599.
Björck, Å. (1967). Solving linear least squares problems by Gram–Schmidt orthogonalization. BIT, 7, 1–21.
Boker, S. M., Deboeck, P. R., Edler, C., & Keel, P. K. (2009). Generalized local linear approximation of derivatives from time series. In S.-M. Chow, E. Ferrer, & F. Hsieh (Eds.), Statistical methods for modeling human dynamics: An interdisciplinary dialogue. Boca Raton, FL: Taylor and Francis.
Boker, S. M., & Graham, J. (1998). A dynamical systems analysis of adolescent substance abuse. Multivariate Behavioral Research, 33, 479–507. doi:10.1207/s15327906mbr3304_3.
Boker, S. M., Leibenluft, E., Deboeck, P. R., Virk, G., & Postolache, T. T. (2008). Mood oscillations and coupling between mood and weather in patients with rapid cycling bipolar disorder. International Journal of Child Health and Human Development, 1(2), 181–203.
Boker, S. M., Montpetit, M. A., Hunter, M. D., & Bergeman, C. S. (2010). Modeling resilience with differential equations. In P. C. M. Molenaar & K. Newell (Eds.), Individual pathways of change: Statistical models for analyzing learning and development (pp. 183–206). Washington, DC: American Psychological Association. doi:10.1037/12140-011.
Boker, S. M., Neale, M. C., & Rausch, J. (2004). Latent differential equation modeling with multivariate multi-occasion indicators. In K. van Montfort, H. Oud, & A. Satorra (Eds.), Recent developments on structural equation models: Theory and applications (pp. 151–174). Dordrecht: Kluwer Academic Publishers. doi:10.1007/978-1-4020-1958-6_9.
Boker, S. M., & Nesselroade, J. R. (2002). A method for modeling the intrinsic dynamics of intraindividual variability: Recovering the parameters of simulated oscillators in multi-wave panel data. Multivariate Behavioral Research, 37, 127–160.
Casdagli, M., Eubank, S., Farmer, J. D., & Gibson, J. (1991). State space reconstruction in the presence of noise. Physica D, 51, 52–98. doi:10.1016/0167-2789(91)90222-U.
Chow, S., Ram, N., Boker, S. M., Fujita, F., & Clore, G. (2005). Emotion as a thermostat: Representing emotion regulation using a damped oscillator model. Emotion, 5, 208–225.
Deboeck, P. R. (2010). Estimating dynamical systems: Derivative estimation hints from Sir Ronald A Fisher. Multivariate Behavioral Research, 45, 725–745. doi:10.1080/00273171.2010.498294.
Estabrook, R. (2015). Evaluating measurement of dynamic constructs: Defining a measurement model of derivatives. Psychological Methods, 20(1), 117–141. doi:10.1037/a0034523.
Fisher, R. A. (1925). The influence of rainfall on the yield of wheat at Rothamsted. Philosophical Transactions of the Royal Society of London, Series B, Containing Papers of a Biological Character, 213, 89–142.
Giona, M., Lentini, F., & Cimagalli, V. (1991). Functional reconstruction and local prediction of chaotic time series. Physical Review A, 44(6), 3496. doi:10.1103/PhysRevA.44.3496.
Hamaker, E. L., Nesselroade, J. R., & Molenaar, P. C. M. (2007). The integrated trait-state model. Journal of Research in Personality, 41, 295–315. doi:10.1016/j.jrp.2006.04.003.
Landau, R. H., Páez, M. J., & Bordeianu, C. C. (2007). Computational physics: Problem solving with computers (2nd ed.). Weinheim: Wiley-VCH.
Lay, D. C. (2003). Linear algebra and its applications (3rd ed.). Boston, MA: Addison Wesley.
Leon, S. J. (2006). Linear algebra with applications (7th ed.). Upper Saddle River, NJ: Prentice Hall.
McArdle, J. J., & Epstein, D. (1987). Latent growth curves within developmental structural equation models. Child Development, 58, 110–133.
Molenaar, P. C. M., & Newell, K. M. (2003). Direct fit of theoretical model of phase transition in oscillatory finger motions. British Journal of Mathematical and Statistical Psychology, 56, 199–214. doi:10.1348/000711003770480002.
Narula, S. C. (1979). Orthogonal polynomial regression. International Statistical Review, 47(1), 31–36.
Oud, J. H. L., & Folmer, H. (2011). Reply to Steele & Ferrer: Modeling oscillation, approximately or exactly? Multivariate Behavioral Research, 46(6), 985–993. doi:10.1080/00273171.2011.625306.
Savitzky, A., & Golay, M. J. E. (1964). Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, 36(9), 1627–1639.
Song, H., & Ferrer, E. (2009). State-space modeling of dynamic psychological processes via the Kalman smoother algorithm: Rationale, finite sample properties, and applications. Structural Equation Modeling, 16, 338–363. doi:10.1080/10705510902751432.
Steele, J. S., & Ferrer, E. (2011). Latent differential equation modeling of self-regulatory and coregulatory affective processes. Multivariate Behavioral Research, 46(6), 956–984. doi:10.1080/00273171.2011.625305.
Trail, J. B., Collins, L. M., Rivera, D. E., Li, R., Piper, M. E., & Baker, T. B. (2014). Functional data analysis for dynamical system identification of behavioral processes. Psychological Methods, 19, 175–187. doi:10.1037/a0034035.
Whitney, H. (1936). Differentiable manifolds. The Annals of Mathematics, 37(3), 645–680. Retrieved from http://www.jstor.org/stable/1968482.
Whittaker, E. T., & Robinson, G. (1924). The calculus of observations: A treatise on numerical mathematics (vol. 36) (No. 9). London.
Yang, M., & Chow, S. (2010). Using state-space model with regime switching to represent the dynamics of facial electromyography (EMG) data. Psychometrika, 75, 744–771. doi:10.1007/s11336-010-9176-2.
Zentall, S. R., Boker, S. M., & Braungart-Rieker, J. M. (2006, June). Mother-infant synchrony: A dynamical systems approach. In Proceedings of the Fifth International Conference on Development and Learning.
Zheng, Y., Wiebe, R. P., Cleveland, H. H., Molenaar, P. C., & Harris, K. S. (2013). An idiographic examination of day-to-day patterns of substance use craving, negative affect, and tobacco use among young adults in recovery. Multivariate Behavioral Research, 48(2), 241–266. doi:10.1080/00273171.2013.763012.

Acknowledgments

The author is grateful to Joseph L. Rodgers for helpful comments on earlier drafts of this article, and to the reviewers and associate editor for their invaluable feedback.

Author information

Authors and Affiliations

Department of Pediatrics, University of Oklahoma Health Sciences Center, Oklahoma City, OK, 73104 , USA
Michael D. Hunter

Corresponding author

Correspondence to Michael D. Hunter.

Appendix: Mathematical Background and a Lemma

1.1 Background

Three items of sometimes unfamiliar mathematics will be vitally important: the dot product, vector projection, and the Gram–Schmidt orthogonalization process. For vectors $\varvec{x} = \left( x_1 ,~ x_2 ,~ x_3 ,\ldots , x_n \right) ^\mathsf{T}$ and $\varvec{y} = \left( y_1 ,~ y_2 ,~ y_3 , \ldots , y_n \right) ^\mathsf{T}$, the dot product, or scalar product, of the two vectors is written

$$\begin{aligned} \underbrace{ \varvec{x} \bullet \varvec{y} }_\text {Dot} = \underbrace{ \varvec{x}^\mathsf{T} \varvec{y} }_\text {Matrix} = \underbrace{ x_1 y_1 + x_2 y_2 + x_3 y_3 + \cdots + x_n y_n }_\text {Concrete Summation} = \underbrace{ \sum _{i=1}^n x_i y_i }_\text {Full Summation} \end{aligned}$$

(55)

The projection of $\varvec{x}$ in the direction of $\varvec{y}$ is defined as

$$\begin{aligned} \text {proj}_{\varvec{y}} (\varvec{x}) = \dfrac{\varvec{x} \bullet \varvec{y}}{\varvec{y} \bullet \varvec{y}} \varvec{y} \end{aligned}$$

(56)

It can be derived without much difficulty, but the idea of vector projection is most easily shown graphically as in Figure 3.
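As a complement to the graphical intuition, the dot product and the projection defined above can be sketched in a few lines of Python. This is a minimal illustration using plain lists; the function names `dot` and `proj` and the example vectors are assumptions made here, not part of the original article.

```python
def dot(x, y):
    """Dot product of two equal-length vectors: sum of x_i * y_i."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(xi * yi for xi, yi in zip(x, y))


def proj(x, y):
    """Projection of x in the direction of y: ((x . y) / (y . y)) * y."""
    yy = dot(y, y)
    if yy == 0:
        raise ValueError("cannot project onto the zero vector")
    scale = dot(x, y) / yy
    return [scale * yi for yi in y]


# Illustrative vectors (hypothetical values chosen for easy checking).
x = [3.0, 4.0]
y = [1.0, 0.0]

p = proj(x, y)                                # part of x lying along y
residual = [xi - pi for xi, pi in zip(x, p)]  # part of x orthogonal to y

print(p, residual, dot(residual, y))
```

The residual after removing the projection is orthogonal to $\varvec{y}$, which is the property the Gram–Schmidt process exploits repeatedly.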

In many situations, it can be useful to have an orthogonal set of vectors spanning a space of interest. The Gram–Schmidt orthogonalization process begins with an arbitrary set of basis vectors and produces an orthogonal set of basis vectors spanning the same space. Gram–Schmidt orthogonalization can be used to solve least squares problems (Björck, 1967) and is often covered in introductory linear algebra textbooks (e.g., Lay, 2003; Leon, 2006).

If $\varvec{u}_1, \varvec{u}_2, \varvec{u}_3, \ldots , \varvec{u}_m$ are the original basis vectors, then $\varvec{u'}_1, \varvec{u'}_2, \varvec{u'}_3, \ldots , \varvec{u'}_m$ are the new orthogonal basis vectors defined by

$$\begin{aligned} \varvec{u'}_1&= \varvec{u}_1 \end{aligned}$$

(57)

$$\begin{aligned} \varvec{u'}_2&= \varvec{u}_2 - \text {proj}_{\varvec{u'}_1} (\varvec{u}_2) \end{aligned}$$

(58)

$$\begin{aligned} \varvec{u'}_3&= \varvec{u}_3 - \text {proj}_{\varvec{u'}_1} (\varvec{u}_3) - \text {proj}_{\varvec{u'}_2} (\varvec{u}_3) \end{aligned}$$

(59)

$$\begin{aligned} \vdots \nonumber \\ \varvec{u'}_m&= \varvec{u}_m - \sum _{i=1}^{m-1} \text {proj}_{\varvec{u'}_i} (\varvec{u}_m). \end{aligned}$$

(60)

So the $k^{th}$ new basis vector is

$$\begin{aligned} \varvec{u'}_k = \varvec{u}_k - \sum _{i=1}^{k-1} \text {proj}_{\varvec{u'}_i} (\varvec{u}_k) = \varvec{u}_k - \sum _{i=1}^{k-1} \dfrac{ \varvec{u}_k \bullet \varvec{u'}_i }{ \varvec{u'}_i \bullet \varvec{u'}_i} \varvec{u'}_i. \end{aligned}$$

(61)

Figure 4 illustrates a two-dimensional example of the Gram–Schmidt orthogonalization process. Beginning with a nonorthogonal set of basis vectors $\{\varvec{u}_1, \varvec{u}_2\}$ that spans the two-dimensional space, the Gram–Schmidt procedure produces a new set of orthogonal basis vectors $\{\varvec{u'}_1, \varvec{u'}_2\}$ that spans the same space.
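The recursion for the $k^{th}$ new basis vector can also be written as a short routine. The sketch below is a direct transcription of the classical (unnormalized) Gram–Schmidt process, assuming the input vectors are linearly independent; the function name `gram_schmidt` and the two-dimensional example are illustrative choices, not taken from the article. For ill-conditioned inputs, a modified Gram–Schmidt or QR routine from a numerical library would usually be the safer choice.

```python
def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def gram_schmidt(vectors):
    """Classical Gram-Schmidt without normalization.

    Each new vector is the original vector minus its projections onto all
    previously constructed orthogonal vectors. Assumes linear independence.
    """
    ortho = []
    for u in vectors:
        u_new = list(u)
        for q in ortho:
            # Projection coefficient uses the ORIGINAL u, as in the
            # classical formulation: (u . q) / (q . q).
            coef = dot(u, q) / dot(q, q)
            u_new = [a - coef * b for a, b in zip(u_new, q)]
        ortho.append(u_new)
    return ortho


# A nonorthogonal basis of the plane (hypothetical example values).
u1 = [2.0, 0.0]
u2 = [1.0, 3.0]

o1, o2 = gram_schmidt([u1, u2])
print(o1, o2, dot(o1, o2))
```

Here the first vector is kept as is, and the second loses its component along the first, leaving two orthogonal vectors that span the same plane.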

1.2 Lemma Regarding Orthogonal Vectors and Projections

Let $\varvec{u}_1, \ldots , \varvec{u}_m$ be any full rank set of $m$ vectors, and let $\varvec{u'}_1, \ldots , \varvec{u'}_m$ be the set of orthogonalized vectors produced by applying the Gram–Schmidt orthogonalization process to $\varvec{u}_1, \ldots , \varvec{u}_m$. We want to show that for any $k \in \{ 1, 2, 3, \ldots , m \}, ~~ \varvec{u'}_k \bullet \varvec{u'}_k = \varvec{u'}_k \bullet \varvec{u}_k$. By substitution based on the Gram–Schmidt definition and that of vector projection,

$$\begin{aligned} \varvec{u'}_k \bullet \varvec{u'}_k = \varvec{u'}_k \bullet \left( \varvec{u}_k - \sum _{i=1}^{k-1} \text {proj}_{\varvec{u'}_i} (\varvec{u}_k) \right) = \varvec{u'}_k \bullet \left( \varvec{u}_k - \sum _{i=1}^{k-1} \dfrac{ \varvec{u}_k \bullet \varvec{u'}_i }{ \varvec{u'}_i \bullet \varvec{u'}_i} \varvec{u'}_i \right) . \end{aligned}$$

(62)

And because the dot product follows a distributive law

$$\begin{aligned} = \varvec{u'}_k \bullet \varvec{u}_k - \varvec{u'}_k \bullet \left( \sum _{i=1}^{k-1} \dfrac{ \varvec{u}_k \bullet \varvec{u'}_i }{ \varvec{u'}_i \bullet \varvec{u'}_i} \varvec{u'}_i \right) . \end{aligned}$$

(63)

Again, applying the distributive law and expanding the summation,

$$\begin{aligned} = \varvec{u'}_k \bullet \varvec{u}_k - \underbrace{\varvec{u'}_k \bullet \left( \dfrac{ \varvec{u}_k \bullet \varvec{u'}_1 }{ \varvec{u'}_1 \bullet \varvec{u'}_1} \varvec{u'}_1 \right) - \varvec{u'}_k \bullet \left( \dfrac{ \varvec{u}_k \bullet \varvec{u'}_2 }{ \varvec{u'}_2 \bullet \varvec{u'}_2} \varvec{u'}_2 \right) - \ldots - \varvec{u'}_k \bullet \left( \dfrac{ \varvec{u}_k \bullet \varvec{u'}_{k-1} }{ \varvec{u'}_{k-1} \bullet \varvec{u'}_{k-1}} \varvec{u'}_{k-1} \right) }_{\text {Want to show this is } 0}. \end{aligned}$$

(64)

Now rearranging terms

$$\begin{aligned}&= \varvec{u'}_k \bullet \varvec{u}_k - \left( \dfrac{ \varvec{u}_k \bullet \varvec{u'}_1 }{ \varvec{u'}_1 \bullet \varvec{u'}_1} \right) \underbrace{ \left( \varvec{u'}_k \bullet \varvec{u'}_1 \right) }_{=0} - \left( \dfrac{ \varvec{u}_k \bullet \varvec{u'}_2 }{ \varvec{u'}_2 \bullet \varvec{u'}_2} \right) \underbrace{ \left( \varvec{u'}_k \bullet \varvec{u'}_2 \right) }_{=0}\nonumber \\&\quad - \ldots - \left( \dfrac{ \varvec{u}_k \bullet \varvec{u'}_{k-1} }{ \varvec{u'}_{k-1} \bullet \varvec{u'}_{k-1}} \right) \underbrace{ \left( \varvec{u'}_k \bullet \varvec{u'}_{k-1} \right) }_{=0}. \end{aligned}$$

(65)

The underbraced terms are all zero because we know that the new Gram–Schmidt basis vectors, $\varvec{u'}_i$, are orthogonal. So finally,

$$\begin{aligned} = \varvec{u'}_k \bullet \varvec{u}_k. \end{aligned}$$

(66)

We have thus shown that $\varvec{u'}_k \bullet \varvec{u'}_k = \varvec{u'}_k \bullet \varvec{u}_k$, and the proof is complete.

$\square $
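The lemma can be checked numerically on a small example. The sketch below uses exact rational arithmetic (Python's `fractions.Fraction`) so that the equality $\varvec{u'}_k \bullet \varvec{u'}_k = \varvec{u'}_k \bullet \varvec{u}_k$ can be tested without floating-point tolerance. The three example vectors and the helper names are assumptions made for illustration; a check on one example is of course not a substitute for the proof.

```python
from fractions import Fraction


def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def gram_schmidt(vectors):
    """Classical Gram-Schmidt without normalization (assumes full rank)."""
    ortho = []
    for u in vectors:
        u_new = list(u)
        for q in ortho:
            coef = dot(u, q) / dot(q, q)
            u_new = [a - coef * b for a, b in zip(u_new, q)]
        ortho.append(u_new)
    return ortho


# A full rank set of three vectors in three dimensions (hypothetical values).
raw = ([1, 1, 0], [1, 0, 1], [0, 1, 1])
us = [[Fraction(v) for v in row] for row in raw]
ups = gram_schmidt(us)

# Lemma: u'_k . u'_k equals u'_k . u_k for every k.
lemma_holds = all(dot(up, up) == dot(up, u) for up, u in zip(ups, us))

# The new vectors should also be mutually orthogonal.
orthogonal = all(
    dot(ups[i], ups[j]) == 0
    for i in range(len(ups))
    for j in range(i + 1, len(ups))
)

print(lemma_holds, orthogonal)
```

Working with fractions keeps every intermediate value exact, so both checks compare with `==` rather than within a tolerance.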

About this article

Cite this article

Hunter, M.D. As Good as GOLD: Gram–Schmidt Orthogonalization by Another Name. Psychometrika 81, 969–991 (2016). https://doi.org/10.1007/s11336-016-9511-3

Received: 05 December 2013
Published: 20 September 2016
Issue Date: December 2016
DOI: https://doi.org/10.1007/s11336-016-9511-3
