
A new algorithm for fitting semi-parametric variance regression models

  • Original paper
  • Published in: Computational Statistics

Abstract

Variance regression allows for heterogeneous variance, or heteroscedasticity, by incorporating a regression model into the variance. This paper uses a variant of the expectation–maximisation algorithm to develop a new method for fitting additive variance regression models that allow for regression in both the mean and the variance. The algorithm is easily extended to allow for B-spline bases, thus allowing the incorporation of a semi-parametric model in both the mean and the variance. Although there are existing methods for fitting these types of models, this new algorithm provides a reliable alternative that is not susceptible to the numerical instability that can arise in this constrained estimation context. We apply the algorithm in a series of simulation studies and to illustrative data. The simulations show that the algorithm can recover the true model in a variety of scenarios. We also study automatic selection of model complexity using information criteria, and show that the Akaike information criterion is useful for choosing the optimal number of knots in a B-spline model. An R package is available for implementing these methods.





Author information

Corresponding author

Correspondence to Kristy P. Robledo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


Appendices

1.1 Fisher information matrix

If \({\varvec{\theta }}= (\beta _0,\beta _1,..., \beta _P, \alpha _0,\alpha _1, ..., \alpha _Q)\), with a total of \(W = P+Q+2\) parameters, then the information matrix is the \(W\times W\) matrix of negative second partial derivatives of the log-likelihood function.

The log-likelihood for our general model discussed in the previous section is

$$\begin{aligned} \ell ({{\varvec{\theta }}})&=-\frac{n}{2} \log (2\pi )-\frac{1}{2} \sum \nolimits _{i=1}^n \log \left( \alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq} \right) \nonumber \\&\quad -\frac{1}{2} \sum \nolimits _{i=1}^n \dfrac{\left( X_i-\beta _0- \sum \nolimits _{p=1}^P \beta _p z_{ip}\right) ^2}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} . \end{aligned}$$
(10)
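As a numerical companion to (10), the sketch below evaluates this log-likelihood directly. It uses Python/NumPy rather than the paper's R package, and the names (`loglik`, `Z`, `Xv`) are illustrative assumptions: `Z` and `Xv` denote design matrices for the mean and variance whose first column is ones, so that `beta[0]` corresponds to \(\beta _0\) and `alpha[0]` to \(\alpha _0\).

```python
import numpy as np

def loglik(beta, alpha, X, Z, Xv):
    """Log-likelihood (10): X_i ~ N(Z_i beta, Xv_i alpha),
    with the variance linear in its covariates."""
    mu = Z @ beta    # mean: beta_0 + sum_p beta_p z_ip
    v = Xv @ alpha   # variance: alpha_0 + sum_q alpha_q x_iq
    n = len(X)
    return (-0.5 * n * np.log(2 * np.pi)
            - 0.5 * np.sum(np.log(v))
            - 0.5 * np.sum((X - mu) ** 2 / v))
```

Note that `v` must remain positive at the evaluated parameter values; this positivity constraint is exactly what makes the estimation problem constrained.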

If we partially differentiate (10) with respect to \(\beta _0\), we get the following likelihood equation

$$\begin{aligned} \dfrac{\partial }{\partial \beta _0}\ell ({{\varvec{\theta }}})&= \sum \nolimits _{i=1}^n \dfrac{X_i-\beta _0-\sum \nolimits _{p=1}^P \beta _p z_{ip}}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}}. \end{aligned}$$

Similarly, differentiating with respect to each of the \(\beta _p\) parameters gives

$$\begin{aligned} \dfrac{\partial }{\partial \beta _p}\ell ({{\varvec{\theta }}})&= \sum \nolimits _{i=1}^n \dfrac{z_{ip}\left( X_i-\beta _0-\sum \nolimits _{p=1}^P \beta _p z_{ip}\right) }{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} . \end{aligned}$$

For the likelihood equation for \(\alpha _0\), we get

$$\begin{aligned} \dfrac{\partial }{\partial \alpha _0}\ell ({{\varvec{\theta }}})&=-\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{1}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} +\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{\left( X_i-\beta _0-\sum \nolimits _{p=1}^P \beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq} \right) ^2}, \end{aligned}$$

and then for each of the \(\alpha _q\) parameters we have

$$\begin{aligned} \dfrac{\partial }{\partial \alpha _q}\ell ({{\varvec{\theta }}})&=-\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{iq}}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} +\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{iq} \left( X_i-\beta _0-\sum \nolimits _{p=1}^P \beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}\right) ^2}. \end{aligned}$$
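The likelihood equations above can be collected into a single score vector. A minimal sketch under the same assumptions as before (`Z` and `Xv` are design matrices with a leading column of ones; names are illustrative, not from the R package):

```python
import numpy as np

def score(beta, alpha, X, Z, Xv):
    """Score vector matching the likelihood equations above."""
    r = X - Z @ beta   # residuals X_i - beta_0 - sum_p beta_p z_ip
    v = Xv @ alpha     # variance alpha_0 + sum_q alpha_q x_iq
    # d l / d beta_p = sum_i z_ip r_i / v_i
    s_beta = Z.T @ (r / v)
    # d l / d alpha_q = (1/2) sum_i x_iq (r_i^2 / v_i^2 - 1 / v_i)
    s_alpha = 0.5 * Xv.T @ (r ** 2 / v ** 2 - 1 / v)
    return np.concatenate([s_beta, s_alpha])
```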

Now, taking the second derivatives we obtain the following \((P+1) \times (P+1)\) matrix for the \({\varvec{\beta }}\) parameters. We refer to this matrix as \({\varvec{B}}=\left[ B_{ij}\right] \):

$$\begin{aligned} B_{00}= & {} -\dfrac{\partial ^2}{\partial \beta _0^2}\ell ({{\varvec{\theta }}}) = \sum \nolimits _{i=1}^n \dfrac{1}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} \\ B_{01}=B_{10}= & {} -\dfrac{\partial ^2}{\partial \beta _0\,\partial \beta _1}\ell ({{\varvec{\theta }}}) =\sum \nolimits _{i=1}^n \dfrac{z_{i1}}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} \\ B_{11}= & {} -\dfrac{\partial ^2}{\partial \beta _1^2}\ell ({\varvec{\theta }}) = \sum \nolimits _{i=1}^n \dfrac{z_{i1}^2}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}} \\&\vdots \\ B_{PP}= & {} -\dfrac{\partial ^2}{\partial \beta _P^2}\ell ({\varvec{\theta }}) = \sum \nolimits _{i=1}^n \dfrac{z_{iP}^2}{\alpha _0+\sum \nolimits _{q=1}^Q \alpha _q x_{iq}}. \end{aligned}$$

Now, the partial derivatives for the \(\alpha _q\) parameters form a \((Q+1) \times (Q+1)\) matrix \({\varvec{A}}\), where \({\varvec{A}}=[A_{ij}]\):

$$\begin{aligned} A_{00}&=-\frac{\partial ^2}{\partial \alpha _0^2}\ell ({\varvec{\theta }})\\&=-\frac{1}{2}\sum \nolimits _{i=1}^n \frac{1}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}+\sum \nolimits _{i=1}^n \frac{\left( X_i-\beta _0-\sum \nolimits _{p=1}^P\beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq} \right) ^3} \\ A_{01}&=-\frac{\partial ^2}{\partial \alpha _0\,\partial \alpha _1}\ell ({{\varvec{\theta }}}) =- \frac{1}{2}\sum \nolimits _{i=1}^n \frac{x_{i1}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}+\sum \nolimits _{i=1}^n \frac{ x_{i1}\left( X_i-\beta _0-\sum \nolimits _{p=1}^P\beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^3} \\ A_{11}&=-\frac{\partial ^2}{\partial \alpha _1^2}\ell ({{\varvec{\theta }}})\\&=-\frac{1}{2}\sum \nolimits _{i=1}^n \frac{x_{i1}^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}+\sum \nolimits _{i=1}^n \frac{x_{i1}^2 \left( X_i-\beta _0-\sum \nolimits _{p=1}^P\beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^3} \\&\vdots \\ A_{QQ}&=-\frac{\partial ^2}{\partial \alpha _Q^2}\ell ({{\varvec{\theta }}})\\&=- \frac{1}{2}\sum \nolimits _{i=1}^n \frac{x_{iQ}^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}+\sum \nolimits _{i=1}^n \frac{x_{iQ}^2\left( X_i-\beta _0-\sum \nolimits _{p=1}^P\beta _p z_{ip}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^3} . \end{aligned}$$
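These second-derivative expressions assemble into the negative Hessian blocks \({\varvec{B}}\) and \({\varvec{A}}\) (the observed information). A sketch under the same illustrative conventions as before:

```python
import numpy as np

def observed_information(beta, alpha, X, Z, Xv):
    """Negative Hessian blocks B (mean) and A (variance), as derived above."""
    r = X - Z @ beta
    v = Xv @ alpha
    # B_jk = sum_i z_ij z_ik / v_i  (no residual terms appear in the beta block)
    B = Z.T @ (Z / v[:, None])
    # A_jk = sum_i x_ij x_ik (r_i^2 / v_i^3 - 1 / (2 v_i^2))
    w = r ** 2 / v ** 3 - 0.5 / v ** 2
    A = Xv.T @ (Xv * w[:, None])
    return B, A
```

Note that the \({\varvec{B}}\) block contains no residuals, so it coincides with its expectation, whereas the \({\varvec{A}}\) block does not until the expectation below is taken.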

The mixed partial derivatives involving both \(\beta _p\) and \(\alpha _q\) parameters have expectation zero, and thus the expected information matrix is block diagonal:

$$\begin{aligned} \left[ \begin{array}{cc} {\varvec{B}} &{} {\varvec{0}}^T\\ {\varvec{0}} &{} {\varvec{A}}\\ \end{array} \right] , \end{aligned}$$

where \({\varvec{0}}\) is a \((Q+1) \times (P+1)\) matrix of zeroes. The mean component of the expected information matrix, \({\varvec{B}}\), is

$$\begin{aligned} \left[ \begin{array}{cccc} \sum \nolimits _{i=1}^n \dfrac{1}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}} &{}\sum \nolimits _{i=1}^n \dfrac{z_{i1}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}} &{} \cdots &{} \sum \nolimits _{i=1}^n \dfrac{z_{iP}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}} \\ \\ \sum \nolimits _{i=1}^n \dfrac{z_{i1}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}}&{} \sum \nolimits _{i=1}^n \dfrac{\left( z_{i1}\right) ^2}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}} &{}\cdots &{}\sum \nolimits _{i=1}^n \dfrac{z_{i1}z_{iP}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}}\\ \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \\ \sum \nolimits _{i=1}^n \dfrac{z_{iP}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}}&{} \sum \nolimits _{i=1}^n \dfrac{z_{i1}z_{iP}}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}} &{}\cdots &{}\sum \nolimits _{i=1}^n \dfrac{\left( z_{iP}\right) ^2}{\alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}}\\ \end{array} \right] , \end{aligned}$$

while the variance component of the expected information matrix, \({\varvec{A}}\), is

$$\begin{aligned} \left[ \begin{array}{cccc} \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{1}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{}\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{i1}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{} \cdots &{} \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{iQ}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}\\ \\ \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{i1}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{}\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{\left( x_{i1}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{} \cdots &{} \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{i1}x_{iQ}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}\\ \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ \\ \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{iQ}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{}\dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{x_{i1}x_{iQ}}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2} &{} \cdots &{} \dfrac{1}{2} \sum \nolimits _{i=1}^n \dfrac{\left( x_{iQ}\right) ^2}{\left( \alpha _0+\sum \nolimits _{q=1}^Q\alpha _q x_{iq}\right) ^2}\\ \end{array} \right] . \end{aligned}$$

Lastly, each block is inverted to obtain the variance-covariance matrix of the corresponding parameters; the standard errors are the square roots of its diagonal elements.
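Putting the pieces together, the final step can be sketched as follows. Using \(E[(X_i-\mu _i)^2]=v_i\), the expected blocks reduce to \(B_{jk}=\sum _i z_{ij}z_{ik}/v_i\) and \(A_{jk}=\tfrac{1}{2}\sum _i x_{ij}x_{ik}/v_i^2\), matching the two matrices displayed above; the names below are again illustrative:

```python
import numpy as np

def expected_information_se(alpha, Z, Xv):
    """Standard errors from the block-diagonal expected information.

    B_jk = sum_i z_ij z_ik / v_i,  A_jk = (1/2) sum_i x_ij x_ik / v_i^2,
    with v_i = alpha_0 + sum_q alpha_q x_iq."""
    v = Xv @ alpha
    B = Z.T @ (Z / v[:, None])               # mean block
    A = 0.5 * Xv.T @ (Xv / v[:, None] ** 2)  # variance block
    # invert each block; SEs are square roots of the diagonals
    se_beta = np.sqrt(np.diag(np.linalg.inv(B)))
    se_alpha = np.sqrt(np.diag(np.linalg.inv(A)))
    return se_beta, se_alpha
```

As a sanity check, for an intercept-only model with \(v_i=\alpha _0\) this recovers the textbook standard errors \(\sqrt{\alpha _0/n}\) for the mean and \(\alpha _0\sqrt{2/n}\) for the variance.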


About this article


Cite this article

Robledo, K.P., Marschner, I.C. A new algorithm for fitting semi-parametric variance regression models. Comput Stat 36, 2313–2335 (2021). https://doi.org/10.1007/s00180-021-01067-6
