Abstract
The BFGS procedure is classically used to estimate the parameters of a recursive path analysis model. In practice, BFGS presents no convergence problems; however, to date, no proof of its convergence is available. The present paper introduces an alternative procedure and establishes its convergence properties. Numerical experiments are presented, suggesting that the proposed alternative converges faster than the BFGS procedure.
References
Anderson T (1969) Statistical inference for covariance matrices with linear structure. In: Krishnaiah PR (ed) Multivariate analysis-II. Academic, New York, pp 55–66
Bollen K (1989) Structural equations with latent variables. Wiley, New York
Breckler S (1990) Applications of correlation structure modeling in psychology: cause for concern? Psychol Bull 107:260–273
Dai YH (2003) Convergence properties of the BFGS algorithm. SIAM J Optim 13:693–701
Duncan O (1966) Path analysis: sociological examples. Am J Sociol 72:1–16
Eisenhauer N, Bowker M, Grace J, Powell K (2015) From patterns to causal understanding: structural equation modeling (SEM) in soil ecology. Pedobiologia 58(2):65–72
El Hadri Z, Hanafi M (2015) The finite iterative method for calculating the correlation matrix implied by a recursive path model. Electron J Appl Stat Anal 08(01):84–99
El Hadri Z, Iaousse M, Hanafi M, Dolce P, El Kettani Y (2020) Properties of the correlation matrix implied by a recursive path model obtained using the finite iterative method. Electron J Appl Stat Anal 13(02):413–435. https://doi.org/10.1285/i20705948v13n2p413
Hauser R (1975) Education, occupation, and earnings. Academic Press, New York
Iaousse M, Hmimou A, El Hadri Z, El Kettani Y (2020) On the computation of the correlation matrix implied by a recursive path model. In: 6th edition of the International Conference on Optimization and Applications (ICOA 2020)
Jöreskog K (1970) A general method for the analysis of covariance structures. Biometrika 57:239–251
Lee SY (2007) Structural equation modelling: a Bayesian approach. Wiley, New York
Nocedal J, Wright S (2006) Numerical optimization, 2nd edn. Springer, New York
Pugesek R (2003) Structural equation modeling applications in ecological and evolutionary biology. Cambridge University Press, New York
Schumacker R, Lomax R (2004) A beginner’s guide to structural equation modeling. Lawrence Erlbaum Associates, Mahwah
Wolfe P (1969) Convergence conditions for ascent methods. SIAM Rev 11:226–235
Wright S (1921) Correlation and causation. J Agric Res 20(17):557–585
Wright S (1923) The theory of path coefficients: a reply to Niles’ criticism. Genetics 8:239–255
Wright S (1934) The method of path coefficients. Ann Math Stat 5:161–215
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest. The authors declare that no funds, grants, or other supports were received during the preparation of this manuscript. The authors have no relevant financial or non-financial interests to disclose.
Additional information
Communicated by Yutaka Kano.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1
Proof of Lemma 1
To prove Lemma 1, it is sufficient to prove that \(\widehat{\mathbf{R }}({\varvec{\theta }})\) is affine with respect to each parameter. Let \(1\le t\le T\).
1. If \(t\le n_{{\varvec{\Phi }}}\), then \(\theta _t\) is an element of \({\varvec{\Phi }}\). The argument follows the steps of the FIM algorithm.
   - Initialization: The FIM algorithm gives \(\widehat{\mathbf{R }}({\varvec{\theta }})_{1:p,1:p}={\varvec{\Phi }}\). Since \({\varvec{\Phi }}\) is affine with respect to each of its elements, it is in particular affine with respect to \(\theta _t\). Consequently, \(\widehat{\mathbf{R }}({\varvec{\theta }})_{1:p,1:p}\) is affine with respect to \(\theta _t\).
   - Step 1: Let i be an integer such that \(1\le i\le q\), and suppose that the block \(\widehat{\mathbf{R }}({\varvec{\theta }})_{1:p+i-1,1:p+i-1}\) is affine with respect to \(\theta _t\). Since \(\theta _t\) is an element of \({\varvec{\Phi }}\), the matrix \(\mathbf{A }\) is constant with respect to \(\theta _t\). As a consequence, \(\widehat{\mathbf{R }}({\varvec{\theta }})_{p+i,1:p+i-1}=\mathbf{A }_{i,1:p+i-1}\widehat{\mathbf{R }}({\varvec{\theta }})_{1:p+i-1,1:p+i-1}\) is affine with respect to \(\theta _t\).
   - Step 2: Since \(\widehat{\mathbf{R }}({\varvec{\theta }})_{1:p+i-1,p+i}=\widehat{\mathbf{R }}({\varvec{\theta }})_{p+i,1:p+i-1}'\), the block \(\widehat{\mathbf{R }}({\varvec{\theta }})_{1:p+i-1,p+i}\) is affine with respect to \(\theta _t\).
   - Step 3: Since \(\widehat{\mathbf{R }}({\varvec{\theta }})_{p+i,p+i}=1\) is constant, it is affine with respect to \(\theta _t\). As a result, the block \(\widehat{\mathbf{R }}({\varvec{\theta }})_{1:p+i,1:p+i}\) is affine with respect to \(\theta _t\). By induction on i, the whole matrix \(\widehat{\mathbf{R }}({\varvec{\theta }})\) is affine with respect to \(\theta _t\).
2. If \(t> n_{{\varvec{\Phi }}}\), then \(\theta _t\) is an element of \(\mathbf{A }\), and the proof of this case is given in El Hadri et al. (2020).
As a consequence, \(\widehat{\mathbf{R }}({\varvec{\theta }})\) can be decomposed as given in (1). \(\square\)
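The affine property can also be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the FIM recursion consists only of the initialization and Steps 1 to 3 quoted in the proof, and that \(\mathbf{A }\) is stored as a \(q\times (p+q)\) array whose row i is used only in its first \(p+i-1\) columns. The function name `fim_correlation` and the numerical values are hypothetical. If \(\widehat{\mathbf{R }}\) is affine in one parameter, its second difference along that parameter must vanish.

```python
import numpy as np

def fim_correlation(phi, A):
    """Implied correlation matrix built from the initialization and Steps 1-3.

    phi : (p, p) correlation matrix of the exogenous variables (assumed layout).
    A   : (q, p+q) path-coefficient array; row i uses only its first p+i columns
          in 0-based indexing (assumed layout, hypothetical storage choice).
    """
    p, q = phi.shape[0], A.shape[0]
    R = np.zeros((p + q, p + q))
    R[:p, :p] = phi                      # Initialization
    for i in range(q):                   # 0-based i is i+1 in the text
        m = p + i                        # size of the block already built
        row = A[i, :m] @ R[:m, :m]       # Step 1
        R[m, :m] = row
        R[:m, m] = row                   # Step 2 (symmetry)
        R[m, m] = 1.0                    # Step 3
    return R

# Hypothetical example with p = 2 exogenous and q = 2 endogenous variables.
phi0 = np.array([[1.0, 0.3], [0.3, 1.0]])
A0 = np.array([[0.5, 0.2, 0.0, 0.0],
               [0.1, 0.4, 0.3, 0.0]])

def R_of_A(value):
    """R as a function of the single path coefficient A[0, 1]."""
    A = A0.copy()
    A[0, 1] = value
    return fim_correlation(phi0, A)

def R_of_phi(value):
    """R as a function of the single correlation phi[0, 1] = phi[1, 0]."""
    phi = phi0.copy()
    phi[0, 1] = phi[1, 0] = value
    return fim_correlation(phi, A0)

# Second differences vanish (up to rounding) for an affine function.
second_diff_A = R_of_A(2.0) - 2.0 * R_of_A(1.0) + R_of_A(0.0)
second_diff_phi = R_of_phi(0.8) - 2.0 * R_of_phi(0.4) + R_of_phi(0.0)
print(np.allclose(second_diff_A, 0.0), np.allclose(second_diff_phi, 0.0))
```

The check varies one parameter at a time because the lemma claims affinity with respect to each parameter separately, not joint affinity in \({\varvec{\theta }}\).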
Appendix 2
Proof of Lemma 2
Let \(t\in \{1,\ldots ,T\}\),
1. If \(t\le n_{{\varvec{\Phi }}}\), then \(\theta _t\) is an element of \({\varvec{\Phi }}\) and there exist \(k,j \in \{1,\ldots ,p\}\) such that \(j\ne k\) and \(\theta _t={\varvec{\Phi }}_{kj}\). The initialization step of FIM gives
$$\begin{aligned} \widehat{\mathbf{R }}_{k,j}({\varvec{\theta }})={\varvec{\Phi }}_{kj}=\theta _t. \end{aligned}\tag{26}$$
Moreover, from Theorem 1, we get
$$\begin{aligned} \widehat{\mathbf{R }}_{k,j}({\varvec{\theta }})=\left[ \mathbf{M }({\varvec{\theta }}_{(-t)})\right] _{k,j}+\theta _t\left[ \mathbf{N }({\varvec{\theta }}_{(-t)})\right] _{k,j}. \end{aligned}\tag{27}$$
Identifying (26) and (27) gives \(\left[ \mathbf{N }({\varvec{\theta }}_{(-t)})\right] _{k,j}=1\), and by symmetry \(\left[ \mathbf{N }({\varvec{\theta }}_{(-t)})\right] _{j,k}=1\). As a consequence, \(\parallel \mathbf{N }({\varvec{\theta }}_{(-t)})\parallel _F^2\ge \left[ \mathbf{N }({\varvec{\theta }}_{(-t)})\right] _{k,j}^2+\left[ \mathbf{N }({\varvec{\theta }}_{(-t)})\right] _{j,k}^2=2.\)
2. If \(t> n_{{\varvec{\Phi }}}\), then \(\theta _t\) is an element of \(\mathbf{A }\) and there exist \(k\in \{1,\ldots ,q\}\) and \(j\in \{1,\ldots ,p+k-1\}\) such that \(\theta _t=\mathbf{A }_{kj}\). Step 1 of FIM gives \(\widehat{\mathbf{R }}_{p+k,j}({\varvec{\theta }})=\mathbf{A }_{k,1:p+k-1}\widehat{\mathbf{R }}_{1:p+k-1,j}({\varvec{\theta }})\). Thus, \(\widehat{\mathbf{R }}_{p+k,j}({\varvec{\theta }})=\sum _{l=1}^{p+k-1}\mathbf{A }_{k,l}\widehat{\mathbf{R }}_{l,j}({\varvec{\theta }})\). Isolating the term \(l=j\), and using \(\mathbf{A }_{kj}=\theta _t\) and \(\widehat{\mathbf{R }}_{j,j}({\varvec{\theta }})=1\), we obtain
$$\begin{aligned} \widehat{\mathbf{R }}_{p+k,j}({\varvec{\theta }})=\theta _t+\sum _{l=1,l\ne j}^{p+k-1}\mathbf{A }_{k,l}\widehat{\mathbf{R }}_{l,j} ({\varvec{\theta }}), \end{aligned}\tag{28}$$
where the remaining sum does not depend on \(\theta _t\). Moreover, from Theorem 1, we get
$$\begin{aligned} \widehat{\mathbf{R }}_{p+k,j}({\varvec{\theta }})=\left[ \mathbf{M }({\varvec{\theta }}_{(-t)})\right] _{p+k,j}+\theta _t\left[ \mathbf{N }({\varvec{\theta }}_{(-t)})\right] _{p+k,j}. \end{aligned}\tag{29}$$
Thus, by identification of (28) and (29), we get \(\left[ \mathbf{N }({\varvec{\theta }}_{(-t)})\right] _{p+k,j}=1\), and by symmetry \(\left[ \mathbf{N }({\varvec{\theta }}_{(-t)})\right] _{j,p+k}=1\). As a consequence, \(\parallel \mathbf{N }({\varvec{\theta }}_{(-t)}) \parallel _F^2\ge \left[ \mathbf{N }({\varvec{\theta }}_{(-t)})\right] ^2_{p+k,j}+\left[ \mathbf{N }({\varvec{\theta }}_{(-t)})\right] ^2_{j,p+k}=2.\)
\(\square\)
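The bound \(\parallel \mathbf{N }({\varvec{\theta }}_{(-t)})\parallel _F^2\ge 2\) can be illustrated on a small example. This is a sketch under assumptions rather than the authors' code: the FIM recursion is taken to be exactly the initialization and Steps 1 to 3 used in the proofs, \(\mathbf{A }\) is stored as a hypothetical \(q\times (p+q)\) array, and all numerical values are made up. Because \(\widehat{\mathbf{R }}\) is affine in \(\theta _t\), the matrix \(\mathbf{N }\) is recovered as the difference between \(\widehat{\mathbf{R }}\) evaluated at \(\theta _t=1\) and at \(\theta _t=0\).

```python
import numpy as np

def fim_correlation(phi, A):
    """Implied correlation matrix from the initialization and Steps 1-3 (assumed form)."""
    p, q = phi.shape[0], A.shape[0]
    R = np.zeros((p + q, p + q))
    R[:p, :p] = phi                      # Initialization
    for i in range(q):
        m = p + i
        row = A[i, :m] @ R[:m, :m]       # Step 1
        R[m, :m] = row
        R[:m, m] = row                   # Step 2
        R[m, m] = 1.0                    # Step 3
    return R

# Hypothetical model with p = 2 and q = 2.
p = 2
phi0 = np.array([[1.0, 0.3], [0.3, 1.0]])
A0 = np.array([[0.5, 0.2, 0.0, 0.0],
               [0.1, 0.4, 0.3, 0.0]])

def slope_wrt_A(k, j):
    """N for theta_t = A[k, j] (0-based), computed as R(theta_t=1) - R(theta_t=0)."""
    A_one, A_zero = A0.copy(), A0.copy()
    A_one[k, j], A_zero[k, j] = 1.0, 0.0
    return fim_correlation(phi0, A_one) - fim_correlation(phi0, A_zero)

def slope_wrt_phi(k, j):
    """N for theta_t = phi[k, j] = phi[j, k] (0-based), same differencing."""
    phi_one, phi_zero = phi0.copy(), phi0.copy()
    phi_one[k, j] = phi_one[j, k] = 1.0
    phi_zero[k, j] = phi_zero[j, k] = 0.0
    return fim_correlation(phi_one, A0) - fim_correlation(phi_zero, A0)

N_A = slope_wrt_A(0, 1)        # case 2 of the proof: entry (p+k, j) should be 1
N_phi = slope_wrt_phi(0, 1)    # case 1 of the proof: entry (k, j) should be 1

print(N_A[p + 0, 1], N_A[1, p + 0], np.sum(N_A ** 2) >= 2.0)
print(N_phi[0, 1], N_phi[1, 0], np.sum(N_phi ** 2) >= 2.0)
```

Evaluating at 0 and 1 is only a device for reading off the slope; the intermediate matrices need not be valid correlation matrices.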
About this article
Cite this article
El Hadri, Z., Sahli, A. & Hanafi, M. Simple and fast convergent procedure to estimate recursive path analysis model. Behaviormetrika 50, 317–333 (2023). https://doi.org/10.1007/s41237-022-00181-z
DOI: https://doi.org/10.1007/s41237-022-00181-z