
Truncated Estimators for a Precision Matrix

Mathematical Methods of Statistics


In this paper, we estimate the precision matrix \({\Sigma}^{-1}\) of a Gaussian multivariate linear regression model through its canonical form \(({Z}^{T},{U}^{T})^{T}\), where \(Z\) and \(U\) are respectively an \(m\times p\) and an \(n\times p\) matrix. This problem is addressed under the data-based loss function \(\textrm{tr}\ [({\hat{\Sigma}}^{-1}-{\Sigma}^{-1})S]^{2}\), where \({\hat{\Sigma}}^{-1}\) estimates \({\Sigma}^{-1}\), for any ordering of \(m\), \(n\) and \(p\), in a unified approach. We derive estimators which, besides the information contained in the sample covariance matrix \(S={U}^{T}U\), use the information contained in the sample mean \(Z\). We provide conditions under which these estimators improve over the usual estimators \(a{S}^{+}\), where \(a\) is a positive constant and \({S}^{+}\) is the Moore–Penrose inverse of \(S\). Thanks to the role of \(Z\), such estimators are in turn improved by their truncated versions.
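As a rough numerical illustration of the objects named in the abstract, the following sketch evaluates the data-based loss for the usual estimator \(a{S}^{+}\) in a singular case (\(p>n\)). The function names, the dimensions, the identity covariance and the choice \(a=n\vee p\) (the constant derived in the appendix) are assumptions made only for this example.

```python
import numpy as np

def data_based_loss(prec_hat, prec, S):
    """Data-based loss tr[((prec_hat - prec) S)^2]."""
    M = (prec_hat - prec) @ S
    return float(np.trace(M @ M))

def usual_estimator(U, a):
    """Return (a * S^+, S), with S = U^T U and S^+ its Moore-Penrose inverse."""
    S = U.T @ U
    return a * np.linalg.pinv(S), S

rng = np.random.default_rng(0)
n, p = 5, 8                      # p > n, so S is singular (hypothetical sizes)
Sigma = np.eye(p)                # assumed true covariance, illustration only
U = rng.standard_normal((n, p))  # rows drawn from N_p(0, Sigma)

prec_hat, S = usual_estimator(U, a=max(n, p))
loss = data_based_loss(prec_hat, np.linalg.inv(Sigma), S)
print(f"loss of a S^+ with a = n v p: {loss:.3f}")
```

Because \(S\) is positive semi-definite, the loss is non-negative even though it is written as the trace of a squared non-symmetric matrix.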


Fig. 1


  1. This means that \(Q^{-}\) is a generalized inverse of \(Q\), that is, \(QQ^{-}Q=Q\), and that \(Q\) is a generalized inverse of \(Q^{-}\), that is, \(Q^{-}QQ^{-}=Q^{-}\). This can be checked thanks to the semi-orthogonality of \(O\) and \(H\): \(QQ^{-}Q=HL^{1/2}O{O}^{T}L^{-1/2}{H}^{T}HL^{1/2}O=Q\) and \(Q^{-}QQ^{-}={O}^{T}L^{-1/2}{H}^{T}HL^{1/2}O{O}^{T}L^{-1/2}{H}^{T}=Q^{-}\).


  1. D. Boukehil, D. Fourdrinier, F. Mezoued, and W. E. Strawderman, ‘‘Estimation of the inverse scatter matrix for a scale mixture of Wishart matrices under Efron-Morris type losses,’’ J. Statist. Plann. Inference 215, 368–387 (2021).

  2. S. Canu and D. Fourdrinier, ‘‘Unbiased risk estimates for matrix estimation in the elliptical case,’’ J. Multivar. Anal. 158, 60–72 (2017).

  3. D. Fourdrinier and W. E. Strawderman, ‘‘Robust minimax Stein estimation under invariant data-based loss for spherically and elliptically symmetric distributions,’’ Metrika 78 (4), 461–484 (2015).

  4. D. Fourdrinier, F. Mezoued, and M. T. Wells, ‘‘Estimation of the inverse scatter matrix of an elliptically symmetric distribution,’’ J. Multivar. Anal. 143, 32–55 (2016).

  5. D. Fourdrinier, A. M. Haddouche, and F. Mezoued, ‘‘Covariance matrix estimation under data-based loss,’’ Statistics and Probability Letters 177, 109160 (2021).

  6. G. Golub and C. van Loan, Matrix Computations (JHU Press, 3rd ed., 1996).

  7. A. M. Haddouche, D. Fourdrinier, and F. Mezoued, ‘‘Scale matrix estimation of an elliptically symmetric distribution in high and low dimensions,’’ J. Multivar. Anal. 181, 104680 (2021).

  8. L. R. Haff, ‘‘Estimation of the inverse covariance matrix: Random mixtures of the inverse Wishart matrix and the identity,’’ Ann. Statist. 7 (6), 1264–1276 (1979).

  9. T. Kubokawa and M. Srivastava, ‘‘Estimating the covariance matrix: A new approach,’’ J. Multivar. Anal. 86 (1), 28–47 (2003).

  10. T. Kubokawa and M. Srivastava, ‘‘Estimation of the precision matrix of a singular Wishart distribution and its application in high-dimensional data,’’ J. Multivar. Anal. 99 (9), 1906–1928 (2008).

  11. T. Kubokawa and M. Tsai, ‘‘Estimation of covariance matrices in fixed and mixed effects linear models,’’ J. Multivar. Anal. 97 (10), 2242–2261 (2006).

  12. B. K. Sinha and M. Ghosh, ‘‘Inadmissibility of the best equivariant estimators of the variance-covariance matrix, the precision matrix, and the generalized variance under entropy loss,’’ Statistics and Decisions 5 (3–4), 201–228 (1987).

  13. M. S. Srivastava, ‘‘Singular Wishart and multivariate Beta distributions,’’ Ann. Statist. 31 (5), 1537–1560 (2003).

  14. C. Stein, ‘‘Inadmissibility of the usual estimator for the variance of a normal distribution with unknown mean,’’ Ann. Inst. Stat. Math. 16 (1), 155–160 (1964).

  15. A. Takemura, ‘‘An orthogonally invariant minimax estimator of the covariance matrix of a multivariate normal population,’’ Tsukuba J. Math. 8, 367–376 (1984).

  16. H. Tsukuma and T. Kubokawa, ‘‘A unified approach to estimating a normal mean matrix in high and low dimensions,’’ J. Multivar. Anal. 139, 312–328 (2015).

  17. H. Tsukuma and T. Kubokawa, ‘‘Unified improvements in estimation of a normal covariance matrix in high and low dimensions,’’ J. Multivar. Anal. 143, 233–248 (2016).

  18. H. Tsukuma and T. Kubokawa, ‘‘Multivariate linear model and group invariance,’’ in: Shrinkage Estimation for Mean and Covariance Matrices (Springer, 2020), pp. 27–33.


We thank Pr. Stéphane Canu (INSA de Rouen Normandie) for helpful discussions about the proof of Lemma A.3. We are grateful to two anonymous referees for their careful reading which allowed us to write an improved version of this paper.


This work was supported by ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.

Author information

Authors and Affiliations


Corresponding authors

Correspondence to Anis M. Haddouche or Dominique Fourdrinier.

Ethics declarations

The authors of this work declare that they have no conflicts of interest.

Additional information

Publisher’s Note.

Allerton Press remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



We provide here materials for the proofs of Theorems 1 and 2 where we place ourselves in the context of Lemma A.3 at the end of this appendix.

A.1. A Stein–Haff Identity

A key tool for the proof of these theorems is the Stein–Haff identity given in the following lemma.

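Lemma A.1 (Equation (4.4) of Tsukuma and Kubokawa [17]). Let \(Z\) and \(U\) be two random matrices with respective dimensions \(m\times p\) and \(n\times p\) such that \(({Z}^{T},{U}^{T})^{T}\) has joint density

$$(z,u)\mapsto(2\pi)^{-(m+n)p/2}\,|\Sigma|^{-(m+n)/2}\exp\left\{-\frac{1}{2}\textrm{tr}\left[\Sigma^{-1}\left\{{(z-\theta)}^{T}(z-\theta)+{u}^{T}u\right\}\right]\right\}. \tag{A.1}$$
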
Let \(Q\) and \(\Lambda\) be as in (A.7) below and let \(\Phi(\Lambda)=\textrm{diag}\left(\phi_{1}(\Lambda),\dots,\phi_{i}(\Lambda),\dots,\phi_{k}(\Lambda)\right)\) be a \(k\times k\) diagonal matrix such that, for any \(i=1,\dots,k\), the function \(\lambda_{i}\mapsto\phi_{i}\left(\lambda_{1},\dots,\lambda_{i},\dots,\lambda_{k}\right)\) is absolutely continuous.

Assuming that \(E_{\theta,\Sigma}{}\big{[}|\textrm{tr}(\Sigma^{-1}Q\Phi Q^{\top})|\big{]}<\infty\), where \(E_{\theta,\Sigma}{}\) denotes the expectation with respect to (A.1), we have

$$E_{\theta,\Sigma}{}\left[\textrm{tr}(\Sigma^{-1}Q\Phi Q^{\top})\right]=E_{\theta,\Sigma}{}\left[\sum_{i=1}^{k}\Bigg\{\alpha_{i}\phi_{i}-2\lambda_{i}\frac{\partial\phi_{i}}{\partial\lambda_{i}}-2\sum_{j>i}^{k}\frac{\phi_{i}-\phi_{j}}{\lambda_{i}-\lambda_{j}}\lambda_{j}\Bigg\}\right], \tag{A.2}$$

where

$$\alpha_{i}=|n-p|+2i-1,\quad i=1,\ldots,k.$$

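Remark A.3. Throughout this paper, we use the modified version of identity (A.2) given in (A.4) below. This identity is established as follows. For \(j>i=1,\dots,k\), we have

$$\frac{\phi_{i}-\phi_{j}}{\lambda_{i}-\lambda_{j}}\lambda_{j}=\frac{\lambda_{i}\phi_{i}-\lambda_{j}\phi_{j}}{\lambda_{i}-\lambda_{j}}-\phi_{i}. \tag{A.3}$$

Then substituting (A.3) into (A.2) gives

$$E_{\theta,\Sigma}{}\left[\textrm{tr}(\Sigma^{-1}Q\Phi Q^{\top})\right]=E_{\theta,\Sigma}{}\left[\sum_{i=1}^{k}\Big\{(n\vee p-n\wedge p+2k-1)\phi_{i}-2\lambda_{i}\frac{\partial\phi_{i}}{\partial\lambda_{i}}-2\sum_{j>i}\frac{\lambda_{i}\phi_{i}-\lambda_{j}\phi_{j}}{\lambda_{i}-\lambda_{j}}\Big\}\right] \tag{A.4}$$

since, the inner sum over \(j>i\) having \(k-i\) terms,

$$\alpha_{i}+2(k-i)=n\vee p-n\wedge p+2k-1.$$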
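The passage from (A.2) to (A.4) is purely algebraic, so it can be sanity-checked numerically. The sketch below compares the two bracketed sums, leaving out the derivative term that is common to both, for arbitrary values of \(\phi_{i}\) and distinct \(\lambda_{i}\). The function names and the numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np

def bracket_a2(phi, lam, n, p):
    """Sum over i of alpha_i*phi_i - 2*sum_{j>i} (phi_i - phi_j)/(lam_i - lam_j)*lam_j."""
    k = len(lam)
    total = 0.0
    for i in range(k):
        alpha_i = abs(n - p) + 2 * (i + 1) - 1      # i is 0-based in the code
        cross = sum((phi[i] - phi[j]) / (lam[i] - lam[j]) * lam[j]
                    for j in range(i + 1, k))
        total += alpha_i * phi[i] - 2.0 * cross
    return total

def bracket_a4(phi, lam, n, p):
    """Same sum written with the constant n v p - n ^ p + 2k - 1, as in (A.4)."""
    k = len(lam)
    const = max(n, p) - min(n, p) + 2 * k - 1
    total = 0.0
    for i in range(k):
        cross = sum((lam[i] * phi[i] - lam[j] * phi[j]) / (lam[i] - lam[j])
                    for j in range(i + 1, k))
        total += const * phi[i] - 2.0 * cross
    return total

rng = np.random.default_rng(1)
n, p, k = 6, 9, 4                                   # hypothetical sizes
lam = np.sort(rng.uniform(0.5, 5.0, size=k))[::-1]  # distinct, decreasing
phi = rng.uniform(0.0, 2.0, size=k)

a2_value = bracket_a2(phi, lam, n, p)
a4_value = bracket_a4(phi, lam, n, p)
print(a2_value, a4_value)
```

The two printed values agree up to floating-point rounding, which reflects the elementary identity \(\lambda_{i}\phi_{i}-\lambda_{j}\phi_{j}=(\lambda_{i}-\lambda_{j})\phi_{i}+\lambda_{j}(\phi_{i}-\phi_{j})\).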

A.2. The Case where \(k=n\wedge p\)

Note that, as mentioned at the beginning of Subsection 2.2, when \(k=n\wedge p\), the estimators in (8) can be rewritten as \({\hat{\Sigma}}^{-1}_{\Phi}={(Q^{-})}^{T}\Phi Q^{-}\) where \(\Phi=a_{o}\left(I_{k}+\Psi\right)\). Although it is not used in the rest of the article, we give, for completeness, the risk of such estimators in the following proposition.

Proposition A.1. Let \(Q\) and \(\Lambda\) be as in (A.7) below and let \(\Phi=\textrm{diag}\big(\phi_{1}(\Lambda),\dots,\phi_{i}(\Lambda),\dots,\phi_{k}(\Lambda)\big)\) be a \(k\times k\) diagonal matrix such that, for any \(i=1,\dots,k\), the function \(\lambda_{i}\mapsto\phi_{i}\left(\lambda_{1},\dots,\lambda_{i},\dots,\lambda_{k}\right)\) is absolutely continuous and non-negative.

The risk function in \((3)\) of the estimator \({\hat{\Sigma}}^{-1}_{\Phi}={(Q^{-})}^{T}\Phi Q^{-}\) is given by

$$R\Big({\Sigma}^{-1},{\hat{\Sigma}}^{-1}_{\Phi}\Big)=E_{\theta,\Sigma}{}\left[\sum_{i=1}^{k}\left\{\phi_{i}^{2}-2(n\vee p-n\wedge p+2k-1)\phi_{i}+4\lambda_{i}\frac{\partial\phi_{i}}{\partial\lambda_{i}}+4\sum_{j>i}\frac{\lambda_{i}\phi_{i}-\lambda_{j}\phi_{j}}{\lambda_{i}-\lambda_{j}}\right\}\right]+E_{\theta,\Sigma}{}\left[\textrm{tr}\left(\Sigma^{-1}S\right)^{2}\right].$$

Remark A.4. When \(m>p>n\), for the class of orthogonally invariant estimators, the risk expression in Proposition A.1 parallels the one associated with the data-based loss \(\textrm{tr}[({\hat{\Sigma}}^{-1}-{\Sigma}^{-1})^{2}S^{2}]\) provided by Kubokawa and Srivastava [10] in their Proposition 2.12, where the role of the \(\lambda_{i}\)’s is played by the \(l^{-1}_{i}\)’s. They showed that these estimators, which depend only on \(S\), improve over the optimal estimator in (4).

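Proof. According to the loss function (2), the risk function of any estimator of the form \({\hat{\Sigma}}^{-1}_{\Phi}={(Q^{-})}^{T}\Phi Q^{-}\) is expressed as

$$R\Big({\Sigma}^{-1},{\hat{\Sigma}}^{-1}_{\Phi}\Big)=E_{\theta,\Sigma}{}\left[\textrm{tr}\left(\left\{{(Q^{-})}^{T}\Phi Q^{-}-\Sigma^{-1}\right\}S\right)^{2}\right]$$
$${}=E_{\theta,\Sigma}{}\left[\textrm{tr}\left({(Q^{-})}^{T}\Phi Q^{-}S\right)^{2}\right]+E_{\theta,\Sigma}{}\left[\textrm{tr}\left(\Sigma^{-1}S\right)^{2}\right]-2E_{\theta,\Sigma}{}\left[\textrm{tr}\left({\Sigma}^{-1}S{(Q^{-})}^{T}\Phi Q^{-}S\right)\right].$$

Using the fact that \(S{(Q^{-})}^{T}=Q\), that \(Q^{-}S={Q}^{T}\) and that \(Q^{-}{Q}=I_{k}\), we have

$$R\Big({\Sigma}^{-1},{\hat{\Sigma}}^{-1}_{\Phi}\Big)=E_{\theta,\Sigma}{}\left[\textrm{tr}\left(\Phi^{2}\right)\right]-2E_{\theta,\Sigma}{}\left[\textrm{tr}\left(\Sigma^{-1}Q\Phi{Q}^{T}\right)\right]+E_{\theta,\Sigma}{}\left[\textrm{tr}\left(\Sigma^{-1}S\right)^{2}\right].$$
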
Applying the Stein–Haff type identity in (A.4) to the second term in the right-hand side of the last equality gives the desired result. \(\Box\)
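The two trace simplifications used in this proof hold realization by realization, so they can be checked on simulated data. The sketch below builds \(Q^{-}\) and \(Q\) as in Lemma A.3 and verifies that the loss of \({(Q^{-})}^{T}\Phi Q^{-}\) splits as \(\textrm{tr}(\Phi^{2})-2\,\textrm{tr}(\Sigma^{-1}Q\Phi{Q}^{T})+\textrm{tr}(\Sigma^{-1}S)^{2}\). The helper name `build_q`, the dimensions, the diagonal entries of \(\Phi\) and the precision matrix are assumptions chosen only for illustration.

```python
import numpy as np

def build_q(Z, U):
    """Return (Q_minus, Q, S) with Q_minus = O^T L^{-1/2} H^T and Q = H L^{1/2} O."""
    S = U.T @ U
    r = min(U.shape)                        # rank of S, assumed equal to n ^ p
    evals, evecs = np.linalg.eigh(S)
    top = np.argsort(evals)[::-1][:r]       # indices of the r positive eigenvalues
    L, H = evals[top], evecs[:, top]
    # Thin SVD of Z H L^{-1/2}; the rows of Ot are the columns of O.
    _, _, Ot = np.linalg.svd((Z @ H) / np.sqrt(L), full_matrices=False)
    O = Ot.T
    return O.T @ (H / np.sqrt(L)).T, (H * np.sqrt(L)) @ O, S

rng = np.random.default_rng(2)
m, n, p = 7, 4, 6                           # hypothetical sizes with k = n ^ p
Z = rng.standard_normal((m, p))
U = rng.standard_normal((n, p))
Q_minus, Q, S = build_q(Z, U)

k = Q.shape[1]
phi = rng.uniform(0.1, 2.0, size=k)         # arbitrary non-negative phi_i
Phi = np.diag(phi)
B = rng.standard_normal((p, p))
prec = np.linalg.inv(B @ B.T + np.eye(p))   # assumed "true" precision matrix

est = Q_minus.T @ Phi @ Q_minus
D = (est - prec) @ S
loss = np.trace(D @ D)
split = (np.sum(phi**2)
         - 2.0 * np.trace(prec @ Q @ Phi @ Q.T)
         + np.trace(prec @ S @ prec @ S))
print(loss, split)
```

Taking expectations of the middle term is where the Stein–Haff type identity enters; the code only checks the deterministic part of the argument.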

Remark A.5. The loss function in (2) makes it possible to eliminate the matrix \(Q\) from the expression of the risk of \({\hat{\Sigma}}^{-1}_{\Phi}\), through the fact that \(Q^{-}\) is a left inverse of \(Q\) and thanks to the Stein–Haff type identity in (A.4). Such a simplification does not occur with the usual quadratic loss \(\textrm{tr}[({\hat{\Sigma}}^{-1}-{\Sigma}^{-1})^{2}]\), nor with the data-based losses \(\textrm{tr}[({\hat{\Sigma}}^{-1}-{\Sigma}^{-1})^{2}S^{r}]\) used by Kubokawa and Srivastava [10] for \(r=1,2\). The choice of our loss function works in favor of our group of truncated estimators in (15) in the sense that, thanks to the presence of the statistic \(S\), it highlights that these estimators improve over the usual estimators \(a{S}^{+}\).

A.3. Determination of the Optimal Constant ‘‘\(a_{o}\)’’

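Here, we prove the statement in (4). The risk of \(a{S}^{+}\) equals

$$R\Big({\Sigma}^{-1},a{S}^{+}\Big)=E_{\theta,\Sigma}{}\left[\textrm{tr}\Big(\big\{a{S}^{+}-\Sigma^{-1}\big\}S\Big)^{2}\right]$$
$${}=a^{2}(n\wedge p)-2aE_{\theta,\Sigma}{}\left[\textrm{tr}\big(\Sigma^{-1}S\big)\right]+E_{\theta,\Sigma}{}\left[\textrm{tr}\Big(\Sigma^{-1}S\Big)^{2}\right],$$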

where we used the fact that \({S}^{+}S{S}^{+}S={S}^{+}S\), \(\textrm{tr}\big{(}{S}^{+}S\big{)}=n\wedge p\) and \(S{S}^{+}S=S\). Clearly, this polynomial in \(a\) is minimized for

$$a=a_{o}=\frac{E_{\theta,\Sigma}{}\left[\textrm{tr}\big(\Sigma^{-1}S\big)\right]}{n\wedge p}. \tag{A.5}$$

Note that this expression does not depend on \(\theta\) or \(\Sigma\). Indeed, according to Lemma 2.1 of Haddouche et al. [7], if \(G(z,s)\) is a \(p\times p\) matrix function such that, for any fixed \(z\in\mathbb{R}^{m\times p}\), \(G(z,s)\) is weakly differentiable with respect to \(s\in\mathbb{R}^{p\times p}\) and such that \(E_{\theta,\Sigma}{}\big[|\textrm{tr}(\Sigma^{-1}G(Z,S))|\big]<\infty\), we have

$$E_{\theta,\Sigma}{}\Big[\textrm{tr}\big(\Sigma^{-1}G(Z,S)\big)\Big]=E_{\theta,\Sigma}{}\Big[\textrm{tr}\Big(2S{S}^{+}\mathcal{D}_{s}\{S{S}^{+}G(Z,S)\}^{T}+(n-n\wedge p-1){S}^{+}G(Z,S)\Big)\Big], \tag{A.6}$$

where \(\mathcal{D}_{s}\{\cdot\}\) is the Haff operator whose generic element is \(\frac{1}{2}(1+\delta_{ij})\frac{\partial}{\partial S_{ij}},\) with \(\delta_{ij}=1\) if \(i=j\) and \(\delta_{ij}=0\) if \(i\neq j\). Using (A.6) with \(G(Z,S)=S\) it follows that

$$E_{\theta,\Sigma}{}\left[\textrm{tr}\big(\Sigma^{-1}S\big)\right]=E_{\theta,\Sigma}{}\left[\textrm{tr}\left(2S{S}^{+}\mathcal{D}_{s}\{S\}+(n-n\wedge p-1){S}^{+}S\right)\right].$$

Now, according to Lemma A.6 of Haddouche et al. [7], we have

$$2\,\textrm{tr}\big(S{S}^{+}\mathcal{D}_{s}\{S\}\big)=(p+1)\,\textrm{tr}\big(S{S}^{+}\big),$$

so that, as \(\textrm{tr}\big{(}S{S}^{+}\big{)}=n\wedge p\), it follows that

$$E_{\theta,\Sigma}{}\left[\textrm{tr}\big{(}\Sigma^{-1}S\big{)}\right]=(p+1)(n\wedge p)+(n-n\wedge p-1)(n\wedge p)=(n\vee p)(n\wedge p).$$

Hence, according to (A.5), \(a_{o}=n\vee p\).
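The value \(a_{o}=n\vee p\) can be illustrated by simulation. The following sketch estimates \(E_{\theta,\Sigma}[\textrm{tr}(\Sigma^{-1}S)]\) by Monte Carlo for a covariance matrix that is not the identity and divides by \(n\wedge p\), as in (A.5). The dimensions, the number of replicates, the seed and the covariance matrix are hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, reps = 4, 7, 20000                   # hypothetical sizes and replicates

# Hypothetical covariance matrix, chosen only for illustration.
B = rng.standard_normal((p, p))
Sigma = B @ B.T + p * np.eye(p)
Sigma_inv = np.linalg.inv(Sigma)
C = np.linalg.cholesky(Sigma)              # Sigma = C C^T

G = rng.standard_normal((reps, n, p))      # independent standard normal entries
U = G @ C.T                                # each row of U[r] is N_p(0, Sigma)

# tr(Sigma^{-1} U^T U) for every replicate, without forming S explicitly.
tr_vals = np.einsum("rni,ij,rnj->r", U, Sigma_inv, U)
mc_mean = tr_vals.mean()
a_o_mc = mc_mean / min(n, p)

print(f"Monte Carlo E[tr(Sigma^-1 S)] ~ {mc_mean:.2f}, (n v p)(n ^ p) = {n * p}")
print(f"Monte Carlo a_o ~ {a_o_mc:.2f}, n v p = {max(n, p)}")
```

Changing `Sigma` leaves the estimate essentially unchanged, in line with the statement that the optimal constant depends on neither \(\theta\) nor \(\Sigma\).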

Remark A.6. The optimal constant \(a_{o}=n\vee p\) with respect to our data-based loss and the one in Kubokawa and Srivastava [10] are identical when \(p>n\). This is not surprising since these data-based losses coincide for the class of orthogonally invariant estimators such as \({\hat{\Sigma}}_{a}^{-1}\), as mentioned in Remark 2.

A.4. A Fundamental Inequality

The following lemma is a key tool for the proof of Theorem 2.

Lemma A.2 (Theorem 3.1 of Tsukuma and Kubokawa [17]). Let \(Q\) and \(\Lambda\) be as in (A.7) below. Let also \(\Phi=\textrm{diag}\big(\phi_{1}(\Lambda),\dots,\phi_{i}(\Lambda),\dots,\phi_{k}(\Lambda)\big)\) be a \(k\times k\) diagonal matrix such that, for any \(i=1,\dots,k\), the function \(\lambda_{i}\mapsto\phi_{i}\left(\lambda_{1},\dots,\lambda_{i},\dots,\lambda_{k}\right)\) is absolutely continuous and non-negative. Then

$$E_{\theta,\Sigma}{}\left[\textrm{tr}({\Sigma}^{-1}Q\Phi{Q}^{T})\right]\geq(m+n\vee p)E_{\theta,\Sigma}{}\left[\textrm{tr}(\Phi(I_{k}+\Lambda)^{-1})\right].$$

A.5. A Simultaneous Diagonalization Result

Here, we give a simultaneous diagonalization lemma of two matrices in the context of (5) and (7).

Lemma A.3 (Simultaneous diagonalization). For \(Z\) and \(U\) two matrices with dimensions \(m\times p\) and \(n\times p\), respectively, let \(S={U}^{T}U\) with \(\textrm{rank}(S)=n\wedge p\) and \(W={Z}^{T}Z\) with \(\textrm{rank}(W)=m\wedge p\). Let \(S=HL{H}^{T}\) be the eigenvalue decomposition of \(S\), where \(L=\textrm{diag}(l_{1}>\dots>l_{i}>\dots>l_{n\wedge p}>0)\) is the \((n\wedge p)\times(n\wedge p)\) diagonal matrix of the positive eigenvalues of \(S\) and \(H\) is the \(p\times(n\wedge p)\) semi-orthogonal matrix (\({H}^{T}H=I_{n\wedge p}\)) of the corresponding eigenvectors. Let also \(k=n\wedge m\wedge p\).

Then there exists a \(k\times p\) matrix \(Q^{-}\) such that the following simultaneous diagonalization of \(W\) and \(S\) holds

$$Q^{-}W{(Q^{-})}^{T}=\Lambda\quad\text{and}\quad Q^{-}S{(Q^{-})}^{T}=I_{k}, \tag{A.7}$$

where the \(k\times k\) diagonal matrix \(\Lambda\) intervenes in the thin singular value decomposition of \(ZHL^{-1/2}\), that is,

$$ZHL^{-1/2}=R\Lambda^{1/2}{O}^{T}, \tag{A.8}$$

with \(\Lambda=\textrm{diag}(\lambda_{1},\dots,\lambda_{k})\), \(O\) an \((n\wedge p)\times k\) semi-orthogonal matrix (\({O}^{T}O=I_{k}\)) and \(R\) an \(m\times k\) semi-orthogonal matrix (\({R}^{T}R=I_{k}\)). More precisely, we have \(Q^{-}={O}^{T}L^{-1/2}{H}^{T}\).

Besides, \(Q^{-}\) is a reflexive generalized inverse (see footnote 1) of \(Q=HL^{1/2}O\), these being respectively \(k\times p\) and \(p\times k\) matrices. Also, \(Q^{-}\) is a left inverse of \(Q\), that is,

$$Q^{-}Q=I_{k}. \tag{A.9}$$

In addition, we have

$$S{(Q^{-})}^{T}=Q\quad\text{and}\quad Q^{-}S={Q}^{T}, \tag{A.10}$$



where \({S}^{+}=HL^{-1}{H}^{T}\) is the Moore–Penrose inverse of \(S\).
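The construction in Lemma A.3 can be sketched numerically with an eigendecomposition followed by a thin singular value decomposition. The function name `simultaneous_diagonalization`, the dimensions and the random inputs are assumptions for this illustration, which takes \(m<n<p\) so that \(k=m\) is strictly smaller than \(n\wedge p\).

```python
import numpy as np

def simultaneous_diagonalization(Z, U):
    """Return (Q_minus, Q, Lam, S, W) following the construction of Lemma A.3."""
    m, p = Z.shape
    n = U.shape[0]
    r, k = min(n, p), min(m, n, p)
    S, W = U.T @ U, Z.T @ Z
    evals, evecs = np.linalg.eigh(S)
    top = np.argsort(evals)[::-1][:r]        # positive eigenvalues, decreasing
    L, H = evals[top], evecs[:, top]         # S = H diag(L) H^T
    # Thin SVD: Z H L^{-1/2} = R diag(sv) O^T, so Lambda = sv^2.
    _, sv, Ot = np.linalg.svd((Z @ H) / np.sqrt(L), full_matrices=False)
    O, Lam = Ot[:k, :].T, sv[:k] ** 2
    Q_minus = O.T @ (H / np.sqrt(L)).T       # O^T L^{-1/2} H^T, size k x p
    Q = (H * np.sqrt(L)) @ O                 # H L^{1/2} O, size p x k
    return Q_minus, Q, Lam, S, W

rng = np.random.default_rng(4)
m, n, p = 3, 5, 8                            # hypothetical sizes, k = 3
Z = rng.standard_normal((m, p))
U = rng.standard_normal((n, p))
Q_minus, Q, Lam, S, W = simultaneous_diagonalization(Z, U)
k = len(Lam)

checks = {
    "(A.7) W": np.allclose(Q_minus @ W @ Q_minus.T, np.diag(Lam), atol=1e-8),
    "(A.7) S": np.allclose(Q_minus @ S @ Q_minus.T, np.eye(k), atol=1e-8),
    "(A.9)": np.allclose(Q_minus @ Q, np.eye(k), atol=1e-8),
    "(A.10)": np.allclose(S @ Q_minus.T, Q, atol=1e-8)
              and np.allclose(Q_minus @ S, Q.T, atol=1e-8),
    "reflexive": np.allclose(Q @ Q_minus @ Q, Q, atol=1e-8)
                 and np.allclose(Q_minus @ Q @ Q_minus, Q_minus, atol=1e-8),
}
print(checks)
```

With generic random inputs all checks evaluate to `True`; the rank assumptions of the lemma hold almost surely for Gaussian draws.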

Proof. Through the expressions of \(Q^{-}\) and \(W\) we have

$$Q^{-}W{(Q^{-})}^{T}={O}^{T}L^{-1/2}{H}^{T}{Z}^{T}ZHL^{-1/2}O$$
$${}={O}^{T}O\Lambda^{1/2}{R}^{T}R\Lambda^{1/2}{O}^{T}O\quad\text{according to (A.8)}$$
$${}=\Lambda\quad\text{thanks to the semi-orthogonality of}\ R\ \text{and}\ O.$$

This is the first diagonalization in (A.7). As for the second diagonalization in (A.7), we can write, according to the expression of \(Q^{-}\) and to the eigenvalue decomposition of \(S\),

$$Q^{-}S{(Q^{-})}^{T}={O}^{T}L^{-1/2}{H}^{T}HL{H}^{T}HL^{-1/2}O={O}^{T}O=I_{k}$$

thanks to the semi-orthogonality of \(H\) and \(O\).

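Now, with the same semi-orthogonality arguments, (A.9) follows immediately from

$$Q^{-}Q={O}^{T}L^{-1/2}{H}^{T}HL^{1/2}O={O}^{T}O=I_{k}$$

and (A.10) from

$$S{(Q^{-})}^{T}=HL{H}^{T}HL^{-1/2}O=HL^{1/2}O=Q\quad\text{and}\quad Q^{-}S={O}^{T}L^{-1/2}{H}^{T}HL{H}^{T}={O}^{T}L^{1/2}{H}^{T}={Q}^{T}.$$
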

Finally, Equalities (A.11) and (A.12) are obtained as follows






Cite this article

Haddouche, A.M., Fourdrinier, D. Truncated Estimators for a Precision Matrix. Math. Meth. Stat. 33, 12–25 (2024).
