Abstract
Models that capture symmetries present in the data have been widely used in different applications, with early examples from psychometric and medical research. The aim of this article is to study a random effects model focusing on the covariance structure that is block circular symmetric. Useful results are obtained for the spectra of these structured matrices.
1 Introduction
Real populations of interest in various research areas, such as medicine, biology, and social studies, often exhibit hierarchical structures. For instance, in educational research, students are grouped within classes and classes are grouped within schools; in medical studies, patients are nested within doctors and doctors are nested within hospitals; in breeding studies, offspring are grouped by sire and sires are grouped within some spatial factor (region); in political studies, voters are grouped within districts and districts are grouped within cities; in demographic studies, children are grouped within families and families are grouped within a macro-context such as neighborhoods or ethnic communities. It has been recognized that such grouping induces dependence between population units within clusters and, hence, statistical models based upon the independence assumption become invalid.
Mixed linear models are routinely used for data analysis when the data exhibit dependence and/or various sources of variation can be identified, e.g., repeated measures, longitudinal and hierarchical data. In general, the mixed linear model has the following form
where \(\varvec{Y}\): \(n\times 1\) is a response vector, \(\varvec{X}\): \(n\times p\) is a known design matrix, \(\varvec{\beta }:p\times 1\) is a vector of fixed effects, \(\varvec{\gamma }: k\times 1\) is a vector of random effects with a known incidence matrix \(\varvec{Z}:n\times k\), and \(\varvec{\epsilon }:n\times 1\) is a vector of random errors. It is assumed that \(\varvec{\gamma }\sim N(\mathbf {0},\varvec{G})\), \(\varvec{\epsilon }\sim N(\mathbf {0},\varvec{R})\), and \(Cov(\varvec{\gamma },\varvec{\epsilon })=\mathbf {0}\). Hence, \(\varvec{Y}\) is normally distributed with expectation \(\varvec{X}\varvec{\beta }\) and covariance matrix \(\varvec{\Sigma }=\varvec{Z}\varvec{G}\varvec{Z}'+\varvec{R}\).
In this article, we consider a two-factor nested model. Let \(\varvec{\gamma }: n_2\times 1\) and \(\varvec{\xi }: n_2n_1\times 1\) be two vectors of random effects, where \(\varvec{\xi }\) is nested within the factor \(\varvec{\gamma }\), and let \(\varvec{\epsilon }: n_2n_1\times 1\) be the vector of random errors. Further, it is assumed that \(\varvec{\gamma }\sim N(\mathbf {0}, \varvec{\Sigma }_1)\), \(\varvec{\xi }\sim N(\mathbf {0}, \varvec{\Sigma }_2)\) and \(\varvec{\epsilon }\sim N(\mathbf {0}, \sigma ^2\varvec{I}_{n_2n_1})\). Thus, the model in (1) becomes:
where \(\varvec{\beta }\) and \(\varvec{X}\) are defined as before, \(\mathbf {1}_{n_i}\) and \(\varvec{I}_{n_i}\) denote a column vector of size \(n_i\) with all elements equal to 1 and the identity matrix of order \(n_i\), respectively, and the symbol \(\otimes \) stands for the Kronecker product. Hence,
The two-level hierarchical (nested) model (2) can, for example, be used to model data comprising the petal length and petal width measurements of Kalanchoe flowers collected on n plants from the same greenhouse (Liang et al. 2015). From each plant, \(n_2\) Kalanchoe flowers have been randomly chosen, all of them having four petals (\(n_1=4\)). The correlation between the observations on any two petals within a single flower is supposed to be a function of the number of petals between them, since the arrangement of petals within each Kalanchoe flower is circular. Therefore, a covariance matrix with circular structure is applied to describe the intra-flower correlation, while the inter-flower correlation is described using a compound symmetric covariance matrix. Hence, the mixed linear model (2) used to fit the data becomes:
where \(\varvec{y}_i:12\times 1\) is a vector of observations on plant i, \(i=1,\ldots ,n\), \(\mu \) is a general mean, \(\varvec{\gamma }:3\times 1\) is the vector of inter-plant random effects, \(\varvec{\xi }:12\times 1\) is the vector of intra-plant random effects and \(\varvec{\epsilon }\) is the vector of random errors. Furthermore, \(\varvec{\gamma }\), \(\varvec{\xi }\) and \(\varvec{\epsilon }\) are assumed to be independently distributed as \(N(\mathbf {0},\varvec{\Sigma }_1)\), \(N(\mathbf {0},\varvec{\Sigma }_2)\) and \(N(\mathbf {0},\sigma ^2\varvec{I}_{12})\), respectively, with \(\varvec{\Sigma }_1\) having a compound symmetric structure and \(\varvec{\Sigma }_2\) exhibiting a circular pattern.
As mentioned above, the presence of a hierarchical structure generally implies dependence within groups of observations. The dependence structure, which is described via the covariance matrices, can exhibit special patterns, for example an intraclass correlation pattern. Nowadays, the interest in studying various patterned covariance structures is increasing; see, e.g., Srivastava et al. (2008), Klein and Zezula (2009), Leiva and Roy (2010), Liang et al. (2015), Roy et al. (2018), Kopčová and Žežula (2020). The reason is that unstructured covariance matrices may not be suitable for modelling the error structure in general. The number of unknown parameters in a \(p \times p\) unstructured covariance matrix is \(p(p+1)/2\). A parsimonious version of a covariance matrix may be both useful and meaningful when modelling data, especially for small sample sizes. For example, a \(p \times p\) symmetric circular Toeplitz matrix has only \([p/2]+1\) unknown parameters, where \([\bullet ]\) stands for the integer part. Furthermore, in longitudinal studies, the number of covariance parameters to be estimated grows rapidly with the number of measurement occasions and may approach, or even exceed, the number of subjects enrolled in the study (Fitzmaurice et al. 2004). In such situations it is common to impose some structure on the covariance matrix, e.g., an autoregressive or banded structure, in order to reduce the number of unknown parameters. If we have tenable prior knowledge about the true covariance structures of the random variables in the model, incorporating this knowledge may increase the reliability of the estimation procedure. For example, Ohlson and von Rosen (2010) studied linearly structured covariance matrices in a classical growth curve model. Since the variance of the estimator of the mean parameter \(\mu \) is usually a function of the covariance matrix, it is crucial to have a correct assumption about the covariance.
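The parameter saving mentioned above can be made concrete with a small sketch (the values of p are purely illustrative):

```python
# Parameter counts for a p x p covariance matrix: unstructured versus
# symmetric circular Toeplitz (illustrative values of p).
def n_unstructured(p):
    return p * (p + 1) // 2          # p(p+1)/2 parameters

def n_sc_toeplitz(p):
    return p // 2 + 1                # [p/2] + 1 parameters

for p in (4, 10, 20):
    print(p, n_unstructured(p), n_sc_toeplitz(p))
```

For p = 10, the unstructured matrix has 55 parameters versus 6 under the symmetric circular Toeplitz structure.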
Furthermore, an appropriate covariance structure also plays an important role in statistical diagnostics, such as outlier detection and influential observation identification, see, e.g., Pan and Fang (2002).
In this work we will study model (2) with a covariance structure that is block circular symmetric. The circular symmetric model was considered by Olkin and Press (1969). They provided MLEs for the parameters in such models and constructed likelihood ratio tests (LRTs) for different types of symmetry in the covariance matrix as well as tests concerning the mean structure. Olkin (1973) extended the circular symmetric model to the case where circular symmetry appears in blocks, the blocks themselves being unstructured. For this model, the covariance structure was studied and various LRTs were obtained.
The presence of symmetry in the data at one or several levels yields a patterned dependence structure within or between the corresponding levels in the model (Dawid 1988). Symmetry here means, for example, that the units within a given group are exchangeable (Draper et al. 1993), i.e., the dependence between neighboring units remains the same (invariant) after re-arrangement of the units. Perlman (1987) discussed and summarized results related to group symmetry models. These are linear models for which the covariance structure of \(\varvec{Y}\) is assumed to satisfy certain symmetry restrictions, namely \(D(\varvec{Y})=D(\varvec{Q}\varvec{Y})=\varvec{Q}D(\varvec{Y})\varvec{Q}'\) for some orthogonal matrices, where \(D(\bullet )\) stands for the covariance matrix. Properties of some patterned covariance matrices arising under different symmetry restrictions in balanced mixed models have been studied in Nahtman (2006), Nahtman and von Rosen (2008) and von Rosen (2011).
The aim of this article is to extend models that are circular symmetric in blocks (Olkin 1973), so-called dihedral block symmetry models. We prove that when both circular symmetry and exchangeability are present, these models have specific patterned blocks. We follow up and, in a certain sense, combine the results obtained in Nahtman (2006) and Nahtman and von Rosen (2008) concerning the covariance structures in model (2). We shall obtain expressions for the spectra of block circular symmetric covariance matrices which take the block structure into account.
The organization of the article is as follows. At the end of this section we give some examples concerning circular symmetry models. Section 2 states the preliminaries and presents some definitions and spectral properties of symmetric circular Toeplitz matrices. In Sect. 3 symmetry restrictions that yield the block circular symmetric covariance structure are studied. Section 4 provides the results concerning the spectra of block circular symmetric matrices. Section 5 comprises concluding remarks.
1.1 Some examples of circular symmetry models
Circular (block) symmetry models have been utilized in situations where there is a spatial circular layout for one factor and another factor satisfies the property of exchangeability.
Example 1
In a signal processing problem, Olkin and Press (1969) and Olkin (1973) studied a signal received from a point source (satellite) located at the geocenter of a regular polygon of n sides. Identical signal receivers were placed at the n vertices, and the signal received at the ith vertex was characterized by k components, e.g. \(\varvec{x}_i=(x_{i1},\ldots ,x_{ik})'\). Assuming that the signal strength was the same in all directions along the vertices, the covariance matrices between the signals had a circular symmetric structure, i.e. \(Cov(\varvec{x}_i,\varvec{x}_{i+j})=\varvec{\Sigma }_j=\varvec{\Sigma }_{n-j}\), \(j=0,1,\ldots ,n\). Additionally, it might be possible to have a more general data structure, which contains another symmetric (with exchangeable categories) space factor (e.g., region), so that the data has the circulant property in the receiver (vertices) dimension and a symmetric pattern in the spatial dimension.
Example 2
In some public health studies (see Hartley and Naik 2001), the disease incidence rates of (relatively homogeneous) city sectors placed around the city center may be circularly correlated. Additionally, if there are \(n_2\) sectors within \(n_1\) cities in the data, and \(Y_{ij}\) denotes the disease incidence rate in the ith sector of the jth city, then the covariance matrix of \(Y_{ij}\) exhibits circular block symmetry when cities are exchangeable, \(i=1,\ldots ,n_2\), \(j=1,\ldots ,n_1\). Similarly, during an outbreak of a disease, the disease incidence rate in any sector around the initial etiological agent may be correlated with those of adjacent sectors. Given the exchangeability of cities, this pattern of covariance structure is appropriate.
Example 3
Gotway and Cressie (1990) described a data set concerning soil-water infiltration which can be adapted to our context with some modifications. As the location varies across the field, the ability of water to infiltrate the soil varies spatially, so that locations nearby are more alike with regard to infiltration than those far apart. Soil-water-infiltration measurements \(Y_{ij}\) (uniresponse) or \(Y_{ijk}\) (multiresponse) were made at \(n_2\) locations contained in \(n_1\) towns, which may be assumed to be exchangeable by our prior knowledge.
In Examples 1–3, the circular property occurs at the lowest level while the exchangeability is at the highest level.
2 Preliminaries
In this section, we will give some important definitions and provide useful results concerning certain patterned matrices which will be used in the sequel. The concept of invariance is important throughout this work.
Definition 1
The covariance matrix \(D(\varvec{\xi })\) of a random variable \(\varvec{\xi }\) is called invariant with respect to the transformation \(\varvec{Q}\) if \(D(\varvec{\xi })=D(\varvec{Q}\varvec{\xi })\), i.e. \(D(\varvec{\xi })=\varvec{Q}D(\varvec{\xi })\varvec{Q}'\), and \(\varvec{Q}\) is an orthogonal matrix.
Next we will introduce specific matrices which are essential here and discuss their properties.
Definition 2
A permutation matrix (P-matrix) is an orthogonal matrix whose columns can be obtained by permuting the columns of the identity matrix.
Definition 3
An orthogonal matrix \(\varvec{SP}=(p_{ij}): n \times n\) is a shift permutation matrix (SP-matrix) if
where \(\mathbf {1}_{(.)}\) is the indicator function.
Definition 4
A matrix \(\varvec{T}=(t_{ij})\) with \([n/2]+1\) distinct elements, where [.] stands for the integer part, and for \(i,j=1,\ldots ,n\),
is called a symmetric circular Toeplitz matrix (SC-Toeplitz matrix). The matrix \(\varvec{T}: n \times n\) has the form
An alternative way to define SC-Toeplitz matrix \(\varvec{T}\), see Olkin (1973), is given by
Definition 5
A symmetric circular matrix SC(n, k) is defined in the following way:
or equivalently
where \(k = 1,\ldots ,[n/2]\).
For notational convenience denote \(SC(n,0)=\varvec{I}_n\).
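The matrices of Definitions 4 and 5 can be sketched numerically; the following is a minimal sketch in numpy, assuming the standard circular-distance reading of Definition 4 (entry (i, j) of \(\varvec{T}\) depends only on \(\min (|i-j|, n-|i-j|)\)); the helper names are ours:

```python
import numpy as np

def sc(n, k):
    """SC(n, k): 0/1 matrix whose (i, j)-entry is 1 iff the circular
    distance min(|i - j|, n - |i - j|) equals k; SC(n, 0) = I_n."""
    i, j = np.indices((n, n))
    d = np.minimum(np.abs(i - j), n - np.abs(i - j))
    return (d == k).astype(float)

def sc_toeplitz(n, taus):
    """SC-Toeplitz matrix T = sum_k tau_k SC(n, k), k = 0, ..., [n/2]."""
    assert len(taus) == n // 2 + 1
    return sum(t * sc(n, k) for k, t in enumerate(taus))

# For n = 4 and (tau_0, tau_1, tau_2) = (3, 1, 0.5), the first row of T
# is (tau_0, tau_1, tau_2, tau_1) and T is symmetric.
T = sc_toeplitz(4, [3.0, 1.0, 0.5])
```

This mirrors the representation of \(\varvec{T}\) as a linear combination of the SC(n, k) matrices used throughout Sect. 2.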
Example 2.1
For \(n=4\), the matrices given in Definitions 2–4 are the following
It is easy to see that
This way of representing a SC-Toeplitz matrix can be useful when deriving MLEs for model (2); see Olkin and Press (1969) and Olkin (1973).
The spectral properties of SC-Toeplitz matrices can, for example, be found in Basilevsky (1983). Nahtman and von Rosen (2008) gave some additional results concerning multiplicities of the eigenvalues of such matrices.
Lemma 1
Let \(\varvec{T}: n \times n\) be a SC-Toeplitz matrix and let \(\lambda _h, h=1,\ldots ,n\), be an eigenvalue of \(\varvec{T}\).
(i) If n is odd, then
It follows that \(\lambda _h=\lambda _{n-h},\;h=1,\ldots ,n-1\), and there is only one eigenvalue, \(\lambda _n\), of multiplicity 1; all other eigenvalues are of multiplicity 2.
If n is even, then
It follows that \(\lambda _h=\lambda _{n-h}\) for \(h\ne n,n/2\); there are only two eigenvalues, \(\lambda _n\) and \(\lambda _{n/2}\), of multiplicity 1, and all other eigenvalues are of multiplicity 2.
(ii) The number of distinct eigenvalues of a SC-Toeplitz matrix is \(\left[ \frac{n}{2}\right] +1\).
(iii) A set of eigenvectors \((\varvec{v}_1,\ldots ,\varvec{v}_n)\) corresponding to the eigenvalues \(\lambda _1,\ldots ,\lambda _n\), is defined by
Furthermore, Lemma 1 also provides the eigenvalues and eigenvectors of the matrix SC(n, k) given in Definition 5. An important observation is that the eigenvectors of a SC-Toeplitz matrix \(\varvec{T}\) in (7) do not depend on the elements of \(\varvec{T}\). A consequence of this result is the following.
Theorem 1
Any two SC-Toeplitz matrices of the same size commute.
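Lemma 1(ii) and Theorem 1 can be checked numerically. Below is a small sketch with hypothetical parameter values; the helper re-implements the SC-Toeplitz construction via circular distances:

```python
import numpy as np

def sc_toeplitz(n, taus):
    """Assumed helper: T = sum_k tau_k SC(n, k) via circular distances."""
    i, j = np.indices((n, n))
    d = np.minimum(np.abs(i - j), n - np.abs(i - j))
    return sum(t * (d == k) for k, t in enumerate(taus)).astype(float)

n = 4
A = sc_toeplitz(n, [3.0, 1.0, 0.5])   # hypothetical tau values
B = sc_toeplitz(n, [2.0, 0.4, 0.1])

# Theorem 1: two SC-Toeplitz matrices of the same size commute.
commute = np.allclose(A @ B, B @ A)   # True

# Lemma 1(ii): [n/2] + 1 distinct eigenvalues (3 for n = 4).
distinct = set(np.round(np.linalg.eigvalsh(A), 10))
print(commute, len(distinct))
```

For the chosen values, A has eigenvalues 5.5, 2.5, 2.5, 1.5, i.e., three distinct eigenvalues, with 2.5 of multiplicity 2, in agreement with Lemma 1.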
Another important result, which will be used in Sect. 4, is presented in the next lemma; see Nahtman (2006) for its proof.
Lemma 2
Let \(\varvec{J}_n=\mathbf {1}_n\mathbf {1}_n'\). The matrix \(\varvec{\Sigma }=(a-b)\varvec{I}_n+b\varvec{J}_n\) has two distinct eigenvalues, \(\lambda _0=a-b\) and \(\lambda _1=a+(n-1)b\) of multiplicities \(n-1\) and 1, respectively.
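Lemma 2 is easy to verify numerically; the following sketch uses the hypothetical values a = 2, b = 0.5, n = 5:

```python
import numpy as np

# Lemma 2 for a = 2, b = 0.5, n = 5 (hypothetical values):
a, b, n = 2.0, 0.5, 5
S = (a - b) * np.eye(n) + b * np.ones((n, n))   # (a - b) I_n + b J_n
eig = np.sort(np.linalg.eigvalsh(S))
# n - 1 eigenvalues equal to a - b = 1.5 and one equal to a + (n-1)b = 4.0
print(eig)
```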
3 Block circular symmetric covariance matrices
As mentioned above, the presence of symmetry in the data at one or several levels yields a patterned dependence structure within or between the corresponding levels (Dawid 1988). In this section we shall obtain symmetry restrictions that yield the block circular symmetric covariance structures.
Let us consider model (2). We are specifically interested in the covariance matrices of the observation vector \(\varvec{Y}=(Y_{ij})\) and random factors in this model under circular symmetry. A crucial assumption will be that if we permute or rotate the levels of one factor (i.e. permute or rotate the ith- or the jth-index in \(Y_{ij}\)), the others will not be affected. This leads to the concept of marginal invariance (see Nahtman 2006), i.e. levels within a factor can be permuted or shifted without any changes in the covariance structure of the model.
Thus, a symmetry model belongs to a family of models where the covariance matrix \(\varvec{\Sigma }\) remains invariant under a finite group \(\mathcal {G}\) of orthogonal transformations (see Perlman 1987). In what follows, we say that \(\varvec{\Sigma }\) is \(\mathcal {G}\)-invariant. Definition 6 provides a more formal definition.
Definition 6
Symmetry models determined by the group \(\mathcal {G}\) comprise a family of models with positive definite covariance matrices that are \(\mathcal {G}\)-invariant, i.e.
The intraclass correlation model and the circular symmetry model are examples of symmetry models.
Let us define the following (finite) groups of orthogonal transformations:
The following symmetry models can be considered.
(i) Symmetry model with complete block symmetry covariance matrices
implies that the covariance matrix \(\varvec{\Sigma }\) remains invariant under all permutations of the corresponding factor levels. Here, all the covariance matrices are of the form
where both \(\varvec{A}\) and \(\varvec{B}\) are compound symmetry matrices. Nahtman (2006, Theorem 2.2) proved that \(\mathcal {G}_1\)-invariance implies the specific structure given in (14).
(ii) Symmetry model with circular (dihedral) block symmetry covariance matrices
Here, the covariance structure remains invariant under all rotations (and reflections) of the corresponding factor levels. For example, when there are four blocks in the covariance matrix, it has the following form (Perlman 1987):
where \(\varvec{A}\), \(\varvec{B}\), and \(\varvec{C}\) are SC-Toeplitz matrices given by (3) (Nahtman and von Rosen 2008, Theorem 3.3). These models have been studied and applied intensively during the last decades (see, for example, Olkin and Press 1969; Olkin 1973; Marin and Dhorne 2002, 2003; Liang et al. 2015; Marques and Coelho 2018).
The novelty of our work is the study of symmetry models with \(\mathcal {G}_2\)- and \(\mathcal {G}_3\)-invariant covariance matrices. We shall show that a symmetry model determined by the group \(\mathcal {G}_2\) or \(\mathcal {G}_3\) is a special case of (i) or (ii), respectively, with the additional feature that the blocks in the covariance matrix \(\varvec{\Sigma }\) have another pattern. Thus, in \(\varvec{\Sigma }\) compound symmetry and circular symmetry appear simultaneously. We also show how the symmetry models determined by the groups \(\mathcal {G}_2\) and \(\mathcal {G}_3\) are related to each other.
The following should be especially noted: it is important to distinguish between full invariance and partial invariance. Full invariance concerns the covariance matrix \(D(\varvec{Y})\) of the observation vector \(\varvec{Y}\), implying invariance for all factors in a model. Partial invariance concerns the covariance matrices of some (but not all) factors in the model.
Nahtman (2006) and Nahtman and von Rosen (2008) gave the following two results regarding the invariant covariance matrix of the main effect \(\varvec{\gamma }\) in model (2). Let \(\varvec{SP}\) be a SP-matrix and \(\varvec{P}\) be a P-matrix.
Theorem 2
(Nahtman 2006) The covariance matrix \(\varvec{\Sigma }_1: n_2\times n_2\) of the factor \(\varvec{\gamma }\) is invariant with respect to all permutations \(\varvec{P}\) if and only if it has the following structure:
where \(c_0\) and \(c_1\) are constants.
Theorem 3
(Nahtman and von Rosen 2008) The covariance matrix \(\varvec{\Sigma }_1: n_2\times n_2\) of the factor \(\varvec{\gamma }\) is shift invariant with respect to all shift permutations \(\varvec{SP}\), if and only if it has the following structure:
where the matrices \(SC(n_2,k),\;k=0,\ldots ,[n_2/2]\), are given by Definition 5, and \(\tau _k,\;k=0,\ldots ,[n_2/2]\), are constants.
The next theorems reveal the structure of the covariance matrix of the factor representing the 2nd-order interaction effects \(\varvec{\xi }\) in model (2), which is invariant with respect to \(\mathcal {G}_2\) or \(\mathcal {G}_3\).
Theorem 4
The matrix \(D(\varvec{\xi })=\varvec{\Sigma }_{2}: n_2n_1 \times n_2n_1\) in model (2) is invariant with respect to all orthogonal transformations defined by \(\varvec{P}_{21}= \varvec{P}\otimes \varvec{SP}\), if and only if it has the following structure:
where \(\tau _{k}\) and \(\tau _{k+[n_1/2]+1}\) are constants, and matrices \(SC(n_1,k)\) are defined in Definition 5, \(k=0,\ldots ,[n_1/2]\).
Remark 3.1
To emphasize the block-symmetric structure of a \(\mathcal {G}_2\)-invariant matrix \(\varvec{\Sigma }_{2}\) given in (19), \(\varvec{\Sigma }_{2}\) can be presented as
where \(\varvec{\Sigma }^{(1)}=\sum _{k=0}^{[n_1/2]}\tau _{k}SC(n_1,k),\) \(\varvec{\Sigma }^{(2)}=\sum _{k=0}^{[n_1/2]}\tau _{k+[n_1/2]+1}SC(n_1,k).\)
The number of distinct elements of \(\varvec{\Sigma }_{2}\) given in (20) is \(2(\left[ n_1/2\right] +1)\).
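The invariance stated in Theorem 4 can be checked numerically for the block form in (20). The sketch below uses hypothetical \(\tau \) values; the sc helper re-implements Definition 5 via circular distances, and SP is taken as a cyclic rotation matrix:

```python
import numpy as np

def sc(n, k):
    """Assumed helper for SC(n, k): indicator of circular distance k."""
    i, j = np.indices((n, n))
    d = np.minimum(np.abs(i - j), n - np.abs(i - j))
    return (d == k).astype(float)

n2, n1 = 3, 4
tau = [2.0, 0.6, 0.3, 0.4, 0.2, 0.1]  # hypothetical parameter values
S1 = sum(tau[k] * sc(n1, k) for k in range(n1 // 2 + 1))
S2 = sum(tau[k + n1 // 2 + 1] * sc(n1, k) for k in range(n1 // 2 + 1))

# Sigma_2 = I ⊗ Sigma^(1) + (J - I) ⊗ Sigma^(2), as in (20)
Sigma2 = (np.kron(np.eye(n2), S1)
          + np.kron(np.ones((n2, n2)) - np.eye(n2), S2))

P = np.eye(n2)[[1, 0, 2]]              # a permutation of the n2 levels
SP = np.roll(np.eye(n1), 1, axis=0)    # a cyclic shift of the n1 levels
Q = np.kron(P, SP)                     # P ⊗ SP
print(np.allclose(Q @ Sigma2 @ Q.T, Sigma2))  # invariance holds
```

Any permutation P and any shift SP can be substituted; the invariance follows since the SC-Toeplitz blocks are invariant under shift conjugation and the compound symmetric outer structure is invariant under permutation conjugation.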
The next example illustrates a \(\mathcal {G}_2\)-invariant covariance matrix.
Example 3.1
For \(n_2=4, n_1=4\), we have the following covariance matrix of the 2nd-order interaction effect of \(\varvec{\xi }\) in model (2):
Since \(n_2=4\) and \(n_1=4\), there are three distinct elements in each of \(\varvec{A}\) and \(\varvec{B}\). Next, we obtain the structure of the covariance matrix which is \(\mathcal {G}_3\)-invariant.
Theorem 5
The matrix \(D(\varvec{\xi })=\varvec{\Sigma }_{2}: n_2n_1 \times n_2n_1\) is invariant with respect to all orthogonal transformations defined by \(\varvec{P}_{12}=\varvec{SP}\otimes \varvec{P}\) if and only if it has the following structure:
where \(\varvec{\Sigma }^{(k)}=\tau _{k}\varvec{I}_{n_1}+\tau _{k+[n_2/2]+1}(\varvec{J}_{n_1}-\varvec{I}_{n_1})\), \(\tau _{k}\) and \(\tau _{k+[n_2/2]+1}\) are constants, and \(SC(n_2,k)\) is a SC-matrix given in Definition 5.
Remark 3.2
A \(\mathcal {G}_3\)-invariant covariance matrix \(\varvec{\Sigma }_{2}\) has the following block structure:
where \(\varvec{\Sigma }^{(k)}=\tau _{k}\varvec{I}_{n_1}+\tau _{k+[n_2/2]+1}(\varvec{J}_{n_1}-\varvec{I}_{n_1})\). The number of distinct elements of \(\varvec{\Sigma }_{2}\) is \(2(\left[ n_2/2\right] +1).\)
In the next example \(\mathcal {G}_3\)-invariant \(\varvec{\Sigma }_{2}\) will be illustrated when \(n_2=4\) and \(n_1=4\).
Example 3.2
Let \(n_2=4, n_1=4\), then according to (23)
Each of the matrices \(\varvec{A}\), \(\varvec{B}\) and \(\varvec{C}\) has two distinct elements. Correspondingly, there are six distinct elements in \(\varvec{\Sigma }_{2}\).
It is interesting to observe that the \(\mathcal {G}_2\)-invariant matrix \(\varvec{\Sigma }_{2}: 16\times 16\) in (21) has a different structure from the \(\mathcal {G}_3\)-invariant matrix \(\varvec{\Sigma }_{2}: 16\times 16\) in (24). One is block compound symmetric with SC-Toeplitz blocks (denoted by \(\varvec{\Sigma }_{BCS-T}\)), the other is block SC-Toeplitz with compound symmetric blocks (denoted by \(\varvec{\Sigma }_{BCT-CS}\)). The transformations \(\varvec{P}_{12}\) and \(\varvec{P}_{21}\) affect only the indices of the response vector \(\varvec{Y}=(y_{ij})\), and the question is whether the labeling of the observations \(y_{ij}\) affects the covariance structure of the model. The answer is negative. The relationship between the two covariance structures, obtained in Theorems 4 and 5, respectively, is presented in the theorem below.
In the following theorem the commutation matrix is used. This matrix has, among other properties, that of switching the order of the matrices in a Kronecker product. For the definition and properties of the commutation matrix we refer to Magnus and Neudecker (1986).
Theorem 6
By rearranging the observations in the response vector \(\varvec{Y}\) in model (2), the covariance matrix \(\varvec{\Sigma }_{BCS-T}\) given in (19) can be transformed into the covariance matrix \(\varvec{\Sigma }_{BCT-CS}\) given in (22), i.e. \(\varvec{\Sigma }_{BCT-CS}=\varvec{K}_{n_1,n_2}\varvec{\Sigma }_{BCS-T}\varvec{K}_{n_1,n_2}'\), where \(\varvec{K}_{n_1,n_2}: n_2n_1\times n_2n_1\) is the commutation matrix \(\varvec{K}_{n_1,n_2}=\sum _{i=1}^{n_1}\sum _{j=1}^{n_2}(\varvec{e}_i\varvec{d}_j')\otimes (\varvec{d}_j \varvec{e}_i')\), \(\varvec{e}_i\) being the ith column vector of \(\varvec{I}_{n_1}\) and \(\varvec{d}_j\) the jth column vector of \(\varvec{I}_{n_2}\).
Proof
From Theorem 4 we have
Using the following property of the Kronecker product \((c\varvec{A})\otimes \varvec{B}=\varvec{A}\otimes (c\varvec{B}),\) where c is an arbitrary scalar, we have
where \(\varvec{\Sigma }^{(k)}=\tau _{k}\varvec{I}_{n_2}+\tau _{k+[n_1/2]+1}(\varvec{J}_{n_2}-\varvec{I}_{n_2}), k=0,\ldots ,[n_1/2].\) Moreover, let \(\varvec{Y}=\left( y_{11},y_{12},\ldots ,y_{1n_1},\ldots ,y_{n_21},y_{n_22},\ldots ,y_{n_2n_1}\right) '\). Applying \(\varvec{K}_{n_1,n_2}\) to \(\varvec{Y}\) yields,
i.e., the labeling of the components of \(\varvec{Y}\) is changed.
With the help of the commutation matrix, we can interchange the elements of the Kronecker product, namely,
and the structure of \(\varvec{\Sigma }_{BCT-CS}\) in Theorem 5 is obtained.
If the covariance matrix has the structure \(\varvec{\Sigma }_{BCT-CS}\), using the commutation matrix \(\varvec{K}_{n_2,n_1}\), we obtain the same structure as in Theorem 4, i.e.,
\(\square \)
We use a simple example to demonstrate the result of Theorem 6.
Example 3.3
Let \(\varvec{\Sigma }_{BCS-T}\) have the structure given in Theorem 4. For \(n_2=3,\;n_1=4\), let
and the structure of the covariance matrix \(\varvec{\Sigma }_{BCS-T}\) is illustrated in Example 3.1. According to Theorem 6, there exists a commutation matrix \(\varvec{K}_{4,3}\) such that
and \(\varvec{\Sigma }_{BCT-CS}=\varvec{K}_{4,3}\varvec{\Sigma }_{BCS-T}\varvec{K}'_{4,3}\). The example shows that (21) and (24) reflect the dependence structure of the same data, which, however, arises from different labelings of the factor levels.
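The commutation matrix of Theorem 6 can be constructed directly from its defining sum and its factor-switching property verified numerically. A sketch (the matrices A and B below are arbitrary stand-ins for the outer and inner factors of the Kronecker product):

```python
import numpy as np

def commutation(n1, n2):
    """K_{n1,n2} = sum_{i,j} (e_i d_j') ⊗ (d_j e_i'), as in Theorem 6."""
    K = np.zeros((n1 * n2, n1 * n2))
    for i in range(n1):
        for j in range(n2):
            E = np.zeros((n1, n2))
            E[i, j] = 1.0            # E = e_i d_j', so E.T = d_j e_i'
            K += np.kron(E, E.T)
    return K

n1, n2 = 4, 3
rng = np.random.default_rng(1)
A = rng.standard_normal((n2, n2))    # outer (n2 x n2) factor
B = rng.standard_normal((n1, n1))    # inner (n1 x n1) factor
K = commutation(n1, n2)

# K is orthogonal and switches the factors of the Kronecker product:
print(np.allclose(K @ K.T, np.eye(n1 * n2)))
print(np.allclose(K @ np.kron(A, B) @ K.T, np.kron(B, A)))
```

Applying this with the compound symmetric outer factor and SC-Toeplitz inner factor of (20) reproduces the transformation \(\varvec{\Sigma }_{BCT-CS}=\varvec{K}_{n_1,n_2}\varvec{\Sigma }_{BCS-T}\varvec{K}_{n_1,n_2}'\) used in the proof of Theorem 6.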
4 Spectra of \(\mathcal {G}_2\) and \(\mathcal {G}_3\)-invariant matrices
In this section we study the spectra of the covariance matrices \(\varvec{\Sigma }_{2}\) given in Theorems 4 and 5, respectively. The novelty of our results is that we use the eigenvalues of the blocks which constitute the corresponding matrices, as presented in (20) and (23), instead of calculating the eigenvalues directly from the elements of \(\varvec{\Sigma }_{2}\). Here, the concept of commutativity is important, since if two normal matrices commute then they have a joint eigenspace and can be diagonalized simultaneously (see e.g. Kollo and von Rosen 2005, Chapter 1). The multiplicities of the eigenvalues and the number of distinct eigenvalues will also be given.
Theorem 7
Let the covariance matrix \(\varvec{\Sigma }_{2}: n_2n_1\times n_2n_1\) be \(\mathcal {G}_2\)-invariant and have the structure given in (20). Let \(\lambda _h^{(i)}\) be the eigenvalue of \(\varvec{\Sigma }^{(i)}: n_1\times n_1\) with multiplicity \(m_h\), \(i=1,2\), \(h=1,\ldots ,[n_1/2]+1\). The spectrum of \(\varvec{\Sigma }_{2}\) consists of the eigenvalues \(\lambda _h^{(1)}-\lambda _h^{(2)}\), each of multiplicity \((n_2-1)m_h\), and \(\lambda _h^{(1)}+(n_2-1)\lambda _h^{(2)}\), each of multiplicity \(m_h\). The number of distinct eigenvalues is \(2([n_1/2]+1)\).
Proof
The SC-matrices \(SC(n_i,k_i),\) \(k_i=0,\ldots ,[n_i/2]\) commute. So \(\varvec{\Sigma }^{(1)}\) and \(\varvec{\Sigma }^{(2)}\) commute as well, and they have a joint eigenspace. Hence, there exists an orthogonal matrix \(\varvec{V}_2\), such that \(\varvec{V}_2'\varvec{\Sigma }^{(1)}\varvec{V}_2=\varvec{\Lambda }^{(1)}\) and \(\varvec{V}_2'\varvec{\Sigma }^{(2)}\varvec{V}_2=\varvec{\Lambda }^{(2)}\), where \(\varvec{\Lambda }^{(i)}=diag(\lambda _1^{(i)},\ldots ,\lambda _{n_1}^{(i)}),\;i=1,2\). Furthermore, \(\varvec{I}_{n_2} \otimes \varvec{\Sigma }^{(1)}\) and \((\varvec{J}_{n_2}-\varvec{I}_{n_2})\otimes \varvec{\Sigma }^{(2)}\) also commute. Define the orthogonal matrix \(\varvec{V}_1=(n_2^{-1/2}\mathbf {1}_{n_2}\vdots \varvec{H})\), where \(\varvec{H}\): \(n_2\times (n_2-1),\) satisfies \(\varvec{H}'\mathbf {1}_{n_2}=\mathbf {0}\) and \(\varvec{H}'\varvec{H}=\varvec{I}_{n_2-1}.\) Then \(\varvec{V}_1'\varvec{J}_{n_2}\varvec{V}_1\) is the following diagonal matrix: \(\varvec{V}_1'\varvec{J}_{n_2}\varvec{V}_1=diag\left\{ n_2,\mathbf {0}_{n_2-1}\right\} .\) Let \(\varvec{V}=\varvec{V}_1\otimes \varvec{V}_2\), then using the property of the Kronecker product \((\varvec{A}\otimes \varvec{B})(\varvec{C}\otimes \varvec{D})=\varvec{A}\varvec{C}\otimes \varvec{B}\varvec{D}\), we have
where \(diag\left\{ n_2-1, -\varvec{I}_{n_2-1} \right\} \) denotes a diagonal matrix. The matrix obtained in (25) is diagonal, and the elements in \(\varvec{\Lambda }^{(1)}\) and \(\varvec{\Lambda }^{(2)}\), as well as their multiplicities, are obtained from Lemma 1. We know that there are \(\left[ \frac{n_1}{2}\right] +1\) distinct eigenvalues in \(\varvec{\Lambda }^{(i)},\;i=1,2\). From the diagonal matrix in (25), the number of distinct eigenvalues of \(\varvec{\Sigma }_{2}\) is obtained. \(\square \)
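Theorem 7 can be checked numerically by pairing the eigenvalues of the two blocks through their common (Fourier) eigenvectors; for a circulant matrix these are obtained from the discrete Fourier transform of the first row. A sketch with hypothetical \(\tau \) values (the helper re-implements the SC-Toeplitz construction):

```python
import numpy as np

def sc_toeplitz(n, taus):
    """Assumed helper: T = sum_k tau_k SC(n, k) via circular distances."""
    i, j = np.indices((n, n))
    d = np.minimum(np.abs(i - j), n - np.abs(i - j))
    return sum(t * (d == k) for k, t in enumerate(taus)).astype(float)

n1, n2 = 4, 3
S1 = sc_toeplitz(n1, [3.0, 1.0, 0.5])   # hypothetical tau values
S2 = sc_toeplitz(n1, [1.0, 0.3, 0.2])
Sigma2 = (np.kron(np.eye(n2), S1)
          + np.kron(np.ones((n2, n2)) - np.eye(n2), S2))

# Eigenvalues of the blocks, paired through the shared Fourier eigenvectors
# (real, since the blocks are symmetric circulant):
l1 = np.fft.fft(S1[0]).real
l2 = np.fft.fft(S2[0]).real

# Theorem 7: l1 - l2 appears n2 - 1 times each, l1 + (n2 - 1) l2 once each.
predicted = np.sort(np.concatenate([np.tile(l1 - l2, n2 - 1),
                                    l1 + (n2 - 1) * l2]))
print(np.allclose(predicted, np.sort(np.linalg.eigvalsh(Sigma2))))
```

For these values the spectrum has \(2([n_1/2]+1)=6\) distinct eigenvalues, as the theorem states.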
Now we illustrate the results obtained in Theorem 7 with two examples.
Example 4.1
Let \(\varvec{\Sigma }_{2}=\varvec{I}_{3}\otimes \varvec{\Sigma }^{(1)}+(\varvec{J}_{3}-\varvec{I}_{3})\otimes \varvec{\Sigma }^{(2)},\) where \(\varvec{\Sigma }^{(1)}=\sum _{k_1=0}^2\tau _{k_1}SC(4,k_1)\) and \(\varvec{\Sigma }^{(2)}=\sum _{k_1=0}^2\tau _{k_1+3}SC(4,k_1).\)
The block \(\varvec{\Sigma }^{(1)}: 4\times 4\), is a SC-Toeplitz matrix with three distinct eigenvalues:
with multiplicities \(m_1=2\), \(m_2=1\) and \(m_3=1\), respectively.
Similarly, the block \(\varvec{\Sigma }^{(2)}: 4\times 4\), is a SC-Toeplitz matrix with three distinct eigenvalues:
with the same multiplicities \(m_h,\;h=1,\ldots ,3,\) as in \(\varvec{\Sigma }^{(1)}\).
Let \(m_{\lambda _i}\) denote the multiplicity of the eigenvalue \(\lambda _i\). The distinct eigenvalues of \(\varvec{\Sigma }_{2}: 12\times 12\) are:
Example 4.2
Let \(\varvec{\Sigma }_{2}=\varvec{I}_{3}\otimes \varvec{\Sigma }^{(1)}+(\varvec{J}_{3}-\varvec{I}_{3})\otimes \varvec{\Sigma }^{(2)},\) where \(\varvec{\Sigma }^{(1)}=\sum _{k_1=0}^1\tau _{k_1}SC(3,k_1)\) and \(\varvec{\Sigma }^{(2)}=\sum _{k_1=0}^1\tau _{k_1+2}SC(3,k_1).\)
Both blocks \(\varvec{\Sigma }^{(1)}: 3\times 3\) and \(\varvec{\Sigma }^{(2)}: 3\times 3\) are SC-Toeplitz matrices. The distinct eigenvalues are:
The distinct eigenvalues of \(\varvec{\Sigma }_{2}: 9\times 9\) are:
Note that the spectrum of \(\varvec{\Sigma }_{2}\) (\(\varvec{\Sigma }_{BCT-CS}\)) given in (23) is the same as that of \(\varvec{\Sigma }_{2}\) (\(\varvec{\Sigma }_{BCS-T}\)) in (20); it also follows from Theorem 6 that \(\varvec{\Sigma }_{BCT-CS}\) and \(\varvec{\Sigma }_{BCS-T}\) are similar matrices, i.e., \(\varvec{\Sigma }_{BCT-CS}=\varvec{K}_{n_1,n_2}\varvec{\Sigma }_{BCS-T}\varvec{K}'_{n_1,n_2}\), where \(\varvec{K}_{n_1,n_2}\) is an orthogonal matrix. The characteristic equation is given by the following determinant,
5 Concluding remarks
In practice, a symmetry model is applied to a data set in which specific symmetry relations can be identified (Viana 2008). We have derived the covariance structures under invariance related to two groups of orthogonal transformations (permutations and rotations). In mixed linear models, particular patterns of the covariance matrices reflect how the data share common characteristics in different hierarchies. This is important when performing estimation and testing. When estimating the fixed effects, the imposed structure can usually improve the precision of the fixed effects estimator. Furthermore, misspecification of the covariance structure may result in misleading inference about the fixed effects. Thus, it is also necessary to discuss different hypotheses about the covariance structure in order to verify the model (Jensen 1988). In addition, the existence of explicit MLEs for such symmetry models should be studied; for example, Szatrowski (1980) and Ohlson and von Rosen (2010) provided explicit MLEs for some patterned covariance structures. Our study of the spectral properties can be used to obtain explicit MLEs of a covariance matrix with block circular symmetric structure and to discuss the existence of such explicit MLEs.
In this article, we only considered a model with two random factors, which is common in empirical studies; it could be of interest to study cases with more factors, where higher-order interactions will be involved. For example, when investigating random effects models with s random factors, the potential structure of the data might be identified by considering different groups of symmetry transformations, i.e., when different symmetry patterns are observed in different hierarchies.
References
Basilevsky A (1983) Applied matrix algebra in the statistical sciences. North-Holland, New York
Dawid AP (1988) Symmetry models and hypotheses for structured data layouts. J Roy Stat Soc B 50:1–34
Draper D, Hodges J, Mallows C, Pregibon D (1993) Exchangeability and data analysis. J R Stat Soc A 156:9–37
Fitzmaurice GM, Laird NM, Ware JH (2004) Applied longitudinal analysis. Wiley, Hoboken
Gotway CA, Cressie NA (1990) A spatial analysis of variance applied to soil-water infiltration. Water Resour Res 26:2695–2703
Hartley AM, Naik DN (2001) Estimation of familial correlations under autoregressive circular covariance. Commun Stat Theory Methods 30:1811–1828
Jensen ST (1988) Covariance hypotheses which are linear in both the covariance and the inverse covariance. Ann Stat 16:302–322
Klein D, Zezula I (2009) The maximum likelihood estimators in the growth curve model with serial covariance structure. J Stat Plan Inference 139:3270–3276
Kollo T, von Rosen D (2005) Advanced multivariate statistics with matrices. Springer, Dordrecht
Kopčová V, Žežula I (2020) On intraclass structure estimation in the growth curve model. Stat Pap 61:1085–1106
Leiva R, Roy A (2010) Linear discrimination for multi-level multivariate data with separable means and jointly equicorrelated covariance structure. J Stat Plan Inference 141:1910–1924
Liang Y, von Rosen D, von Rosen T (2015) On estimation in hierarchical models with block circular covariance structures. Ann Inst Stat Math 67:773–791
Magnus JR, Neudecker H (1986) Symmetry, 0–1 matrices and Jacobians: a review. Econ Theory 2:157–190
Marin JM, Dhorne T (2002) Linear Toeplitz covariance structure models with optimal estimators of variance components. Linear Algebra Appl 354:195–212
Marin JM, Dhorne T (2003) Optimal quadratic unbiased estimation for models with linear Toeplitz covariance structure. Statistics 37:85–99
Marques FJ, Coelho CA (2018) The simultaneous test of equality and circularity of several covariance matrices. J Stat Theory Pract 12:861–885
Nahtman T (2006) Marginal permutation invariant covariance matrices with applications to linear models. Linear Algebra Appl 417:183–210
Nahtman T, von Rosen D (2008) Shift permutation invariance in linear random factor models. Math Methods Stat 17:173–185
Ohlson M, von Rosen D (2010) Explicit estimators of parameters in the growth curve model with linearly structured covariance matrices. J Multivar Anal 101:1284–1295
Olkin I (1973) Testing and estimation for structures which are circularly symmetric in blocks. In: Kabe DG, Gupta RP (eds) Multivariate statistical inference. North-Holland, Amsterdam, pp 183–195
Olkin I, Press S (1969) Testing and estimation for a circular stationary model. Ann Math Stat 40:1358–1373
Pan JX, Fang KT (2002) Growth curve models and statistical diagnostics. Springer, New York
Perlman MD (1987) Group symmetry covariance models. Stat Sci 2:421–425
Roy A, Filipiak K, Klein D (2018) Testing a block exchangeable covariance matrix. Statistics 52:393–408
Srivastava MS, von Rosen T, von Rosen D (2008) Models with a Kronecker product covariance structure: estimation and testing. Math Methods Stat 17:357–370
Szatrowski TH (1980) Necessary and sufficient conditions for explicit solutions in the multivariate normal estimation problem for patterned means and covariances. Ann Stat 8:802–810
Viana M (2008) Symmetry studies: an introduction to the analysis of structured data in applications. Cambridge University Press, New York
von Rosen T (2011) On the inverse of certain block structured matrices generated by linear combinations of Kronecker products. Linear Multilinear Algebra 59:595–606
Acknowledgements
The authors would like to thank the two anonymous reviewers for providing useful suggestions and valuable recommendations that led to a much improved version of the article. The authors gratefully acknowledge The Swedish Research Council, grant 2017-03003. Yuli Liang’s research is partly supported by the strategic funding of Örebro University.
Funding
Open access funding provided by Örebro University.
Appendices
Appendix A: Proof of Theorem 4
Proof
Let \(N=n_2n_1\) and \(\varvec{P}_{21}\in \mathcal {G}_2\), given by (11). The matrix \(\varvec{\Sigma }_{2}\) can be written as
where \(\varvec{e}_{k}\), \(\varvec{e}_{l}\) are the kth and the lth columns of the identity matrix \(\varvec{I}_N\), respectively. We can define the element \(\sigma _{kl}\) of \(\varvec{\Sigma }_2\) in a more informative way. Observe that one can write \(\varvec{e}_k=\varvec{e}_{2,i_2}\otimes \varvec{e}_{1,i_1}\) and \(\varvec{e}_l'=\varvec{e}_{2,j_2}'\otimes \varvec{e}_{1,j_1}'\), where \(\varvec{e}_{h,i_h}\) is the \(i_h\)th column of the identity matrix \(\varvec{I}_{n_h}, h=1,2,\) and \(\sigma _{kl}=\sigma _{(i_2,i_1)(j_2,j_1)}=Cov(\xi _{i_2i_1},\xi _{j_2j_1})\), where \(k=(i_2-1)n_1+i_1\) and \(l=(j_2-1)n_1+j_1.\)
Hence, using the following property of the Kronecker product,
we can express \(\varvec{\Sigma }_{2}\) in the following way:
The \(\mathcal {G}_2\)-invariance implies \(\varvec{P}_{21}\varvec{\Sigma }_{2}\varvec{P}_{21}'=\varvec{\Sigma }_{2}\), for all \(\varvec{P}_{21}\in \mathcal {G}_2\). Therefore,
Since \(\varvec{P}\) is a P-matrix, it acts on the components of \(\varvec{\xi }=(\xi _{ij})\) via index i, which is associated with the corresponding factor levels of \(\varvec{\gamma }\), \(i=1,\ldots ,n_2,\;j=1,\ldots ,n_1\). For the term \(\varvec{P}\varvec{e}_{2,i_2}\varvec{e}_{2,j_2}'\varvec{P}',\) the invariance of \(\varvec{\Sigma }_{2}\) implies that in (26) we may define constants
where \(i_1,j_1=1,\ldots ,n_1,\;i_2,j_2=1,\ldots ,n_2.\) Thus, (26) becomes
The SP-matrix \(\varvec{SP}\) acts on the components of \(\varvec{\xi }=(\xi _{ij})\) via index j, which is nested within \(\varvec{\gamma }\) by assumption. We can express (27) in the following way:
By the invariance of \(\varvec{\Sigma }_{2}\) with respect to the term \(\varvec{SP}\varvec{e}_{1,i_1}\varvec{e}_{1,j_1}'\varvec{SP}'\), we may define constants
Hence, we have the following structure for \(\varvec{\Sigma }_{2}:\)
This establishes the structure in (19), i.e., the “only if” part of the theorem. The “if” part follows from the structure of \(\varvec{\Sigma }_{2},\) since
followed by Theorem 3,
and
The proof is completed. \(\square \)
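The two algebraic ingredients used repeatedly in this proof, namely the Kronecker indexing \(\varvec{e}_k=\varvec{e}_{2,i_2}\otimes \varvec{e}_{1,i_1}\) with \(k=(i_2-1)n_1+i_1\), and the mixed-product property used to pull \(\varvec{P}\) and \(\varvec{SP}\) through each Kronecker term, can be verified numerically. A minimal sketch (dimensions chosen arbitrarily for illustration):

```python
import numpy as np

n1, n2 = 3, 4
N = n2 * n1

def e(n, i):
    """i-th standard basis column of the identity I_n (1-based, as in the proof)."""
    v = np.zeros((n, 1))
    v[i - 1, 0] = 1.0
    return v

# Check the index map k = (i_2 - 1) n_1 + i_1 and e_k = e_{2,i2} (x) e_{1,i1}.
for i2 in range(1, n2 + 1):
    for i1 in range(1, n1 + 1):
        k = (i2 - 1) * n1 + i1
        assert np.array_equal(e(N, k), np.kron(e(n2, i2), e(n1, i1)))

# Mixed-product property (A (x) B)(C (x) D) = (AC) (x) (BD), which lets the
# permutations act separately on the two indices of each Kronecker term.
rng = np.random.default_rng(0)
A, C = rng.standard_normal((n2, n2)), rng.standard_normal((n2, n2))
B, D = rng.standard_normal((n1, n1)), rng.standard_normal((n1, n1))
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
```

This is why \(\varvec{P}_{21}=\varvec{P}\otimes \varvec{SP}\) conjugates each term \(\varvec{e}_{2,i_2}\varvec{e}_{2,j_2}'\otimes \varvec{e}_{1,i_1}\varvec{e}_{1,j_1}'\) factor-wise, so the invariance condition can be resolved one index at a time.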
Appendix B: Proof of Theorem 5
Proof
We use the same technique as in the proof of Theorem 4. Under the condition \(\varvec{P}_{12}\varvec{\Sigma }_{2}\varvec{P}_{12}'=\varvec{\Sigma }_{2}\), for all \(\varvec{P}_{12}\in \mathcal {G}_3\), and using the same representation of \(\varvec{\Sigma }_{2}\) as in the proof of Theorem 4, we have
Denoting \(\sigma _{1(i_2)(j_2)}=\sigma _{(i_2,i_1),(j_2,i_1)}\) for \(\forall i_1=j_1;\;\forall i_2,j_2\) and \(\sigma _{2(i_2)(j_2)}=\sigma _{(i_2,i_1),(j_2,j_1)}\) for \(\forall i_1\ne j_1;\;\forall i_2,j_2,\) we have
Let us now define \(\tau _{k}=\sigma _{1(i_2)(j_2)}, \forall |i_2-j_2|=k,n_2-k;\forall i_1=j_1,\) and \(\tau _{k+[n_2/2]+1}=\sigma _{2(i_2)(j_2)}, \forall |i_2-j_2|=k,n_2-k;\forall i_1\ne j_1.\) Thus, (28) becomes
and (22) is obtained. Due to the structure of \(\varvec{\Sigma }_{2}\), it is straightforward to show that \(\varvec{P}_{12}\varvec{\Sigma }_{2}\varvec{P}'_{12}=\varvec{\Sigma }_{2}\). \(\square \)
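As a numerical sanity check of this invariance, the sketch below builds \(\varvec{\Sigma }_{2}\) from the \(\tau \)-parametrization in the proof (the \(\tau \) values and the inner permutation are illustrative assumptions, and \(\varvec{P}_{12}\) is taken to act as a shift on the \(n_2\) outer levels combined with an arbitrary permutation of the \(n_1\) inner levels):

```python
import numpy as np

n1, n2 = 3, 4
h = n2 // 2  # [n_2/2]

def circ_indicator(n, k):
    """Symmetric 0/1 circulant with ones where |i - j| is k or n - k."""
    return np.array([[1.0 if min(abs(i - j), n - abs(i - j)) == k else 0.0
                      for j in range(n)] for i in range(n)])

I1, J1 = np.eye(n1), np.ones((n1, n1))

# Hypothetical tau values, length 2([n_2/2] + 1), for illustration only:
# tau_k multiplies the i_1 = j_1 entries, tau_{k+h+1} the i_1 != j_1 entries.
tau = np.array([5.0, 1.2, 0.7, 2.0, 0.4, 0.3])

Sigma = sum(np.kron(circ_indicator(n2, k),
                    tau[k] * I1 + tau[k + h + 1] * (J1 - I1))
            for k in range(h + 1))

# Shift permutation on the n2 outer levels ...
SP = np.roll(np.eye(n2), 1, axis=0)
# ... combined with an arbitrary permutation of the n1 inner levels.
P = np.eye(n1)[[2, 0, 1]]
P12 = np.kron(SP, P)

# The circulant-between / compound-symmetric-within structure is invariant.
assert np.allclose(P12 @ Sigma @ P12.T, Sigma)
```

The check works because each indicator circulant commutes with the shift, while \(\tau _k\varvec{I}+\tau _{k+[n_2/2]+1}(\varvec{J}-\varvec{I})\) is invariant under any permutation, mirroring the two reduction steps in the proof.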
Liang, Y., von Rosen, D. & von Rosen, T. On properties of Toeplitz-type covariance matrices in models with nested random effects. Stat Papers 62, 2509–2528 (2021). https://doi.org/10.1007/s00362-020-01202-3