On properties of Toeplitz-type covariance matrices in models with nested random effects

Models that capture symmetries present in the data have been widely used in different applications, with early examples from psychometric and medical research. The aim of this article is to study a random effects model with a covariance structure that is block circular symmetric. Useful results are obtained for the spectra of these structured matrices.


Introduction
Real populations of interest in various research areas, such as medicine, biology and social studies, often exhibit hierarchical structures. For instance, in educational research, students are grouped within classes and classes are grouped within schools; in medical studies, patients are nested within doctors and doctors are nested within hospitals; in breeding studies, offspring are grouped by sire and sires are grouped within some spatial factor (region); in political studies, voters are grouped within districts and districts are grouped within cities; in demographic studies, children are grouped within families and families are grouped within a macro-context such as neighborhoods or ethnic communities. It has been recognized that such grouping induces dependence between population units within different clusters, and hence statistical models based upon an independence assumption become invalid.
Mixed linear models are routinely used for data analysis when the data exhibit dependence and/or various sources of variation can be identified, e.g., repeated measures, longitudinal and hierarchical data. In general, the mixed linear model has the following form:

Y = Xβ + Zγ + ε,    (1)

where Y : n × 1 is a response vector, X : n × p is a known design matrix, β : p × 1 is a vector of fixed effects, γ : k × 1 is a vector of random effects with a known incidence matrix Z : n × k, and ε : n × 1 is a vector of random errors. It is assumed that γ ∼ N(0, G), ε ∼ N(0, R), and Cov(γ, ε) = 0. Hence, Y is normally distributed with expectation Xβ and covariance matrix Σ = ZGZ′ + R.
In this article, we consider a two-factor nested model. Let γ : n_2 × 1 and ξ : n_2 n_1 × 1 be two vectors of random effects, where ξ is nested within the factor γ, and let ε : n_2 n_1 × 1 be the vector of random errors. Further, it is assumed that γ ∼ N(0, Σ_1), ξ ∼ N(0, Σ_2) and ε ∼ N(0, σ^2 I_{n_2 n_1}). Thus, the model in (1) becomes:

Y = Xβ + (I_{n_2} ⊗ 1_{n_1})γ + (I_{n_2} ⊗ I_{n_1})ξ + (I_{n_2} ⊗ I_{n_1})ε,    (2)

where β and X are defined as before, 1_{n_i} and I_{n_i} denote a column vector of size n_i with all elements equal to 1 and the identity matrix of order n_i, respectively, and the symbol ⊗ stands for the Kronecker product. Hence, Y ∼ N(Xβ, Σ), where Σ = Z_1 Σ_1 Z_1′ + Σ_2 + σ^2 I_{n_2 n_1} and Z_1 = I_{n_2} ⊗ 1_{n_1}.
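As a numerical sketch (in Python, with arbitrary illustrative dimensions and parameter values that are not part of the model specification), the covariance matrix Σ = Z_1 Σ_1 Z_1′ + Σ_2 + σ^2 I of model (2) can be assembled via Kronecker products:

```python
import numpy as np

n2, n1 = 3, 4                     # illustrative numbers of levels
rng = np.random.default_rng(0)

# Arbitrary positive definite covariance matrices for the random effects
A = rng.standard_normal((n2, n2))
Sigma1 = A @ A.T + n2 * np.eye(n2)
B = rng.standard_normal((n2 * n1, n2 * n1))
Sigma2 = B @ B.T + n2 * n1 * np.eye(n2 * n1)
sigma2 = 1.5                      # error variance

# Incidence matrix of the first random factor: Z1 = I_{n2} kron 1_{n1}
Z1 = np.kron(np.eye(n2), np.ones((n1, 1)))

# Covariance matrix of Y in model (2)
Sigma = Z1 @ Sigma1 @ Z1.T + Sigma2 + sigma2 * np.eye(n2 * n1)
```

The resulting Σ is symmetric positive definite of order n_2 n_1 = 12 in this sketch.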
The two-level hierarchical (nested) model (2) can, for example, be used to model data comprising the petal length and petal width measurements of Kalanchoe flowers collected on n plants from the same greenhouse (Liang et al. 2015). From each plant, n_2 = 3 Kalanchoe flowers were randomly chosen, all of them having four petals (n_1 = 4). The correlation between the observations on any two petals within a single flower is supposed to be a function of the number of petals between them, since the arrangement of petals is circular within each Kalanchoe flower. Therefore, to describe the intra-flower correlation, a covariance matrix with circular structure is applied. The inter-flower correlation is described using a compound symmetric covariance matrix. Hence, the mixed linear model (2) used to fit the data becomes:

y_i = μ 1_{12} + (I_3 ⊗ 1_4)γ + ξ + ε,   i = 1, . . . , n,

where y_i : 12 × 1 is the vector of observations on plant i, μ is a general mean, γ : 3 × 1 is the vector of inter-plant random effects, ξ : 12 × 1 is the vector of intra-plant random effects and ε is the vector of random errors. Furthermore, γ, ξ and ε are assumed to be independently distributed as N(0, Σ_1), N(0, Σ_2) and N(0, σ^2 I_{12}), respectively, with Σ_1 having a compound symmetric structure and Σ_2 exhibiting a circular pattern. As mentioned above, the presence of a hierarchical structure generally implies dependence within groups of observations. The dependence structure, which is described via the covariance matrices, can exhibit special patterns, for example an intraclass correlation pattern. Nowadays, interest in studying various patterned covariance structures is increasing, see e.g. Srivastava et al. (2008), Klein and Zezula (2009), Leiva and Roy (2010), Liang et al. (2015), Roy et al. (2018), Kopčová and Žežula (2020). The reason is that unstructured covariance matrices may not be suitable for modelling the error structure in general. The number of unknown parameters in a p × p unstructured covariance matrix is p(p + 1)/2.
A parsimonious version of a covariance matrix may be both useful and meaningful when modelling data, especially for small sample sizes. For example, in a p × p symmetric circular Toeplitz matrix there are only [p/2] + 1 unknown parameters, where [•] stands for the integer part. Furthermore, in longitudinal studies, the number of covariance parameters to be estimated grows rapidly with the number of measured occasions and may approach or even exceed the number of subjects enrolled in the study (Fitzmaurice et al. 2004). In such situations it is common to impose some structure on the covariance matrix, e.g., an autoregressive or banded structure, in order to reduce the number of unknown parameters. If we have tenable prior knowledge about the true covariance structures of the random variables in the model, incorporating this knowledge may increase the reliability of the estimation procedure. For example, Ohlson and von Rosen (2010) studied linearly structured covariance matrices in a classical growth curve model. Since the variance of the estimator of the mean parameter μ is usually a function of the covariance matrix, it is crucial to have a correct assumption about the covariance. Furthermore, an appropriate covariance structure also plays an important role in statistical diagnostics, such as outlier detection and influential observation identification, see, e.g., Pan and Fang (2002). In this work we will study model (2) with a covariance structure that is block circular symmetric. The circular symmetric model was considered by Olkin and Press (1969). They provided MLEs for the parameters in such models, and constructed likelihood ratio tests (LRTs) for testing different types of symmetry in the covariance matrix as well as tests concerning the mean structure. Olkin (1973) extended the circular symmetric model to the case where circular symmetry appears in blocks, with the blocks left unstructured.
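The parameter count [p/2] + 1 can be made concrete with a small sketch (Python; the construction below builds a symmetric circular Toeplitz matrix from its free parameters, using arbitrary illustrative values):

```python
import numpy as np

def sc_toeplitz(n, t):
    """n x n symmetric circular Toeplitz matrix from its [n/2] + 1
    free parameters t = (t_0, ..., t_[n/2])."""
    assert len(t) == n // 2 + 1
    d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    # entry (i, j) depends only on the circular distance min(|i-j|, n-|i-j|)
    return np.asarray(t, dtype=float)[np.minimum(d, n - d)]

p = 5
T = sc_toeplitz(p, [2.0, 0.8, 0.3])
# [p/2] + 1 = 3 parameters, versus p(p + 1)/2 = 15 for an unstructured matrix
```

The matrix is symmetric and invariant under cyclic shifts of the indices, which is the circular Toeplitz property used throughout the article.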
For this model, the covariance structure was studied and various LRTs were obtained. The presence of symmetry in the data at one or several levels yields a patterned dependence structure within or between the corresponding levels in the model (Dawid 1988). Symmetry here means, for example, that the units within a given group are exchangeable (Draper et al. 1993), i.e., the dependence between neighboring units remains the same (invariant) after a re-arrangement of units. Perlman (1987) discussed and summarized results related to group symmetry models. These are linear models in which the covariance structure of Y is assumed to satisfy certain symmetry restrictions, namely D(Y) = D(QY) = QD(Y)Q′ for some orthogonal matrices Q, where D(•) stands for the covariance matrix. Properties of some patterned covariance matrices arising under different symmetry restrictions in balanced mixed models have been studied in Nahtman (2006), Nahtman and von Rosen (2008) and von Rosen (2011). The aim of this article is to extend models that are circular symmetric in blocks (Olkin 1973), so-called dihedral block symmetry models. We prove that when both circular symmetry and exchangeability are present, these models have specific patterned blocks. We will follow up and, in a certain sense, combine the results obtained in Nahtman (2006) and Nahtman and von Rosen (2008) concerning the covariance structures in model (2). We shall obtain expressions for the spectra of block circular symmetric covariance matrices which take the block structure into account. The organization of the article is as follows. At the end of this section we give some examples concerning circular symmetry models. Section 2 states the preliminaries and presents some definitions and spectral properties of symmetric circular Toeplitz matrices. In Sect. 3 symmetry restrictions that yield the block circular symmetric covariance structure are studied.
Section 4 provides the results concerning the spectra of block circular symmetric matrices. Section 5 comprises concluding remarks.

Some examples of circular symmetry models
Circular (block) symmetry models have been utilized in situations when there is a spatial circular layout on one factor and another factor satisfies the property of exchangeability.
Example 1 In a signal processing problem, Olkin and Press (1969) and Olkin (1973) studied a signal received from a point source (satellite) located at the geocenter of a regular polygon of n sides. Identical signal receivers were placed at the n vertices, and the signal received at the ith vertex was characterized by k components, e.g. x_i = (x_{i1}, . . . , x_{ik})′. Assuming that the signal strength was the same in all directions along the vertices, the covariance matrices between the signals had a circular symmetric structure, i.e. Cov(x_i, x_{i+j}) = Σ_j = Σ_{n−j}, j = 0, 1, . . . , n. Additionally, it might be possible to have a more general data structure, which contains another symmetric (with exchangeable categories) spatial factor (e.g., region), so that the data have the circulant property in the receiver (vertex) dimension and a symmetric pattern in the spatial dimension.

Example 2
In some public health studies (see Hartley and Naik 2001), the disease incidence rates of (relatively homogeneous) city sectors placed around the city center may be circularly correlated. Additionally, if there are n_2 sectors within n_1 cities in the data, and Y_{ij} denotes the disease incidence rate in the ith sector of the jth city, then the covariance matrix of Y_{ij} exhibits circular block symmetry when cities are exchangeable, i = 1, . . . , n_2, j = 1, . . . , n_1. Similarly, during an outbreak of a disease, the disease incidence rate in any sector around the initial etiological agent may be correlated with those in adjacent sectors. When cities are exchangeable, this pattern of covariance structure is appropriate.
Example 3 Gotway and Cressie (1990) described a data set concerning soil-water infiltration, which can be incorporated in our context with some modifications. As the location varies across the field, the ability of water to infiltrate the soil varies spatially, so that locations nearby are more alike with regard to infiltration than those far apart. Soil-water infiltration measurements Y_{ij} (uniresponse) or Y_{ijk} (multiresponse) were made at n_2 locations contained in n_1 towns, which may be assumed to be exchangeable according to our prior knowledge.
In Examples 1–3, the circular property occurs at the lowest level while the exchangeability is at the highest level.

Preliminaries
In this section, we give some important definitions and provide useful results concerning certain patterned matrices which will be used in the sequel. The concept of invariance is important throughout this work.

Definition 1 A matrix Σ is said to be invariant with respect to the transformation defined by Q if QΣQ′ = Σ, where Σ is a covariance matrix and Q is an orthogonal matrix.
Next we will introduce specific matrices which are essential here and discuss their properties.
Definition 2 A permutation matrix (P-matrix) is an orthogonal matrix whose columns can be obtained by permuting the columns of the identity matrix.

Definition 3 An orthogonal matrix SP : n × n is called a shift permutation matrix (SP-matrix) if its elements are given by (SP)_{ij} = 1(j − i ≡ 1 mod n), i, j = 1, . . . , n, where 1(·) is the indicator function.

Definition 4 A matrix T : n × n with elements t_{ij} = t_{min(|i−j|, n−|i−j|)}, i, j = 1, . . . , n, determined by the parameters t_0, t_1, . . . , t_{[n/2]}, where [•] stands for the integer part, is called a symmetric circular Toeplitz matrix (SC-Toeplitz matrix). The matrix T : n × n thus has first row (t_0, t_1, . . . , t_{[n/2]}, . . . , t_2, t_1).

An alternative way to define an SC-Toeplitz matrix T, see Olkin (1973), is given by Definition 5.

Definition 5 A symmetric circular matrix SC(n, k), k = 1, . . . , [n/2], is the n × n matrix with elements (SC(n, k))_{ij} = 1(min(|i − j|, n − |i − j|) = k), i, j = 1, . . . , n. For notational convenience denote SC(n, 0) = I_n. With this notation, T = Σ_{k=0}^{[n/2]} t_k SC(n, k).
Example 2.1 For n = 4, the SC-Toeplitz matrix given in Definition 4 is

T =
[ t_0 t_1 t_2 t_1 ]
[ t_1 t_0 t_1 t_2 ]
[ t_2 t_1 t_0 t_1 ]
[ t_1 t_2 t_1 t_0 ]

It is easy to see that T = t_0 SC(4, 0) + t_1 SC(4, 1) + t_2 SC(4, 2). This way of representing an SC-Toeplitz matrix can be useful when deriving MLEs for model (2), see Olkin and Press (1969) and Olkin (1973). The spectral properties of SC-Toeplitz matrices can, for example, be found in Basilevsky (1983). Nahtman and von Rosen (2008) gave some additional results concerning the multiplicities of the eigenvalues of such matrices.
Lemma 1 Let T : n × n be an SC-Toeplitz matrix and let λ_h, h = 1, . . . , n, denote its eigenvalues.
(i) If n is odd, then λ_h = t_0 + 2 Σ_{k=1}^{(n−1)/2} t_k cos(2πkh/n), h = 1, . . . , n. It follows that λ_h = λ_{n−h}, h = 1, . . . , n − 1, and there is only one eigenvalue, λ_n, which has multiplicity 1; all other eigenvalues are of multiplicity 2. If n is even, then λ_h = t_0 + 2 Σ_{k=1}^{n/2−1} t_k cos(2πkh/n) + (−1)^h t_{n/2}, h = 1, . . . , n. It follows that, for h ≠ n, n/2, λ_h = λ_{n−h}; there are only two eigenvalues, λ_n and λ_{n/2}, which have multiplicity 1, and all other eigenvalues are of multiplicity 2.
(ii) The number of distinct eigenvalues of an SC-Toeplitz matrix is [n/2] + 1.
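Lemma 1 can be checked numerically. The sketch below (Python, with arbitrary illustrative parameters) computes the eigenvalues of SC-Toeplitz matrices for odd and even n via the circulant cosine formula and confirms the count [n/2] + 1 of distinct eigenvalues:

```python
import numpy as np

def sc_toeplitz(n, t):
    # n x n symmetric circular Toeplitz matrix from t = (t_0, ..., t_[n/2])
    d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return np.asarray(t, dtype=float)[np.minimum(d, n - d)]

for n, t in [(5, [3.0, 1.0, 0.5]), (6, [3.0, 1.0, 0.5, 0.2])]:
    T = sc_toeplitz(n, t)
    eig = np.linalg.eigvalsh(T)
    # closed form: lambda_h = sum_j t_{min(j, n-j)} cos(2*pi*j*h/n)
    c = T[0]
    j = np.arange(n)
    lam = np.array([np.sum(c * np.cos(2 * np.pi * j * h / n)) for h in range(n)])
    assert np.allclose(np.sort(lam), np.sort(eig))
    # [n/2] + 1 distinct eigenvalues, as stated in Lemma 1 (ii)
    assert len(np.unique(np.round(eig, 8))) == n // 2 + 1
```

For generic parameter values the multiplicity pattern of Lemma 1 (one simple eigenvalue for odd n, two for even n, all others double) is visible directly in the rounded spectrum.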
Furthermore, Lemma 1 also provides the eigenvalues and eigenvectors of the matrix SC(n, k) given in Definition 5. An important observation is that the eigenvectors of an SC-Toeplitz matrix T in (7) do not depend on the elements of T. A consequence of this result is the following.

Theorem 1 Any two SC-Toeplitz matrices of the same order commute.
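A quick numerical check of Theorem 1 (Python, with random illustrative parameters): two SC-Toeplitz matrices of the same order commute, since both are circulant and share the discrete Fourier eigenvectors.

```python
import numpy as np

def sc_toeplitz(n, t):
    d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return np.asarray(t, dtype=float)[np.minimum(d, n - d)]

rng = np.random.default_rng(1)
n = 7
T1 = sc_toeplitz(n, rng.standard_normal(n // 2 + 1))
T2 = sc_toeplitz(n, rng.standard_normal(n // 2 + 1))
assert np.allclose(T1 @ T2, T2 @ T1)   # commutativity
```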
Another important result which will be used in Sect. 4, is presented in the next lemma, see Nahtman (2006) together with its proof.
Lemma 2 Let J_n = 1_n 1_n′. The matrix Σ = (a − b)I_n + bJ_n has two distinct eigenvalues, λ_0 = a − b and λ_1 = a + (n − 1)b, of multiplicities n − 1 and 1, respectively.
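Lemma 2 admits a one-line numerical verification (Python, arbitrary illustrative a, b, n):

```python
import numpy as np

# Sigma = (a - b) I_n + b J_n has eigenvalue a - b with multiplicity n - 1
# and eigenvalue a + (n - 1) b with multiplicity 1 (eigenvector 1_n).
a, b, n = 2.0, 0.7, 6
Sigma = (a - b) * np.eye(n) + b * np.ones((n, n))
eig = np.sort(np.linalg.eigvalsh(Sigma))
assert np.allclose(eig[:-1], a - b)           # n - 1 copies of a - b
assert np.isclose(eig[-1], a + (n - 1) * b)   # one copy of a + (n - 1) b
```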

Block circular symmetric covariance matrices
As mentioned above, the presence of symmetry in the data at one or several levels yields a patterned dependence structure within or between the corresponding levels (Dawid 1988). In this section we shall obtain symmetry restrictions that yield the block circular symmetric covariance structures. Let us consider model (2). We are specifically interested in the covariance matrices of the observation vector Y = (Y_{ij}) and of the random factors in this model under circular symmetry. A crucial assumption will be that if we permute or rotate the levels of one factor (i.e. permute or rotate the ith or the jth index in Y_{ij}), the others will not be affected. This leads to the concept of marginal invariance (see Nahtman 2006), i.e. levels within a factor can be permuted or shifted without any changes in the covariance structure of the model. Thus, a symmetry model belongs to a family of models where the covariance matrix remains invariant under a finite group G of orthogonal transformations (see Perlman 1987). In what follows, we say that Σ is G-invariant. Definition 6 provides a more formal definition.
Definition 6 Symmetry models determined by the group G comprise a family of models with positive definite covariance matrices Σ that are G-invariant, i.e. QΣQ′ = Σ for all Q ∈ G.

The intraclass correlation model and the circular symmetry model are examples of symmetry models. Let us define the following (finite) groups of orthogonal transformations: G_1, generated by permutations of the levels of both factors; G_2, generated by permutations of the levels of the first factor combined with rotations (shifts) of the levels of the second factor; and G_3, generated by rotations of the levels of the first factor combined with permutations of the levels of the second factor. The following symmetry models can be considered.
(i) Symmetry model with complete block symmetry covariance matrices. Here the covariance matrix remains invariant under all permutations of the corresponding factor levels, and all the covariance matrices are of the form (14): Σ = I_{n_2} ⊗ A + (J_{n_2} − I_{n_2}) ⊗ B, where both A and B are compound symmetric matrices. Nahtman (2006, Theorem 2.2) proved that G_1-invariance implies the specific structure given in (14).
(ii) Symmetry model with circular (dihedral) block symmetry covariance matrices. Here, the covariance structure remains invariant under all rotations (and reflections) of the corresponding factor levels. For example, when there are four blocks in the covariance matrix, it has the following form (Perlman 1987):

Σ =
[ A B C B ]
[ B A B C ]
[ C B A B ]
[ B C B A ]

where A, B, and C are SC-Toeplitz matrices given by (3) (Nahtman and von Rosen 2008, Theorem 3.3). These models have been studied and applied intensively during the last decades (see, for example, Olkin and Press 1969; Olkin 1973; Dhorne 2002, 2003; Liang et al. 2015; Marques and Coelho 2018). The novelty of our work is the study of symmetry models with G_2- and G_3-invariant covariance matrices. We shall show that a symmetry model determined by group G_2 or G_3 is a special case of (i) or (ii), respectively, with the additional feature that the blocks in the covariance matrix have another pattern. Thus Σ reflects that compound symmetry and circular symmetry appear simultaneously. We also show how the symmetry models determined by the groups G_2 and G_3 are related to each other. The following should be specially noted: it is important to distinguish between full invariance and partial invariance. Full invariance concerns the covariance matrix D(Y) of the observation vector Y, implying invariance for all factors in the model. Partial invariance concerns the covariance matrices of some (but not all) factors in the model. Nahtman (2006) and Nahtman and von Rosen (2008) gave the two following results regarding the invariant covariance matrix of the main effect γ in model (2). Let SP be an SP-matrix and P a P-matrix.
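The four-block circular pattern above can be checked numerically. In the Python sketch below (arbitrary illustrative entries; the 0–1 matrices SC(n, k) of Definition 5 place the blocks A, B, C in their circular positions), the matrix is invariant under a cyclic shift of the block levels:

```python
import numpy as np

def sc_toeplitz(n, t):
    d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return np.asarray(t, dtype=float)[np.minimum(d, n - d)]

def sc01(n, k):
    # the 0-1 matrix SC(n, k): ones where the circular distance equals k
    d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return (np.minimum(d, n - d) == k).astype(float)

n = 4                                   # four blocks, as in the displayed form
A = sc_toeplitz(n, [5.0, 1.0, 0.4])     # diagonal blocks
B = sc_toeplitz(n, [0.9, 0.5, 0.2])     # blocks at circular distance 1
C = sc_toeplitz(n, [0.6, 0.3, 0.1])     # blocks at circular distance 2

# Sigma has blocks A, B, C, B arranged in a circular (dihedral) pattern
Sigma = np.kron(np.eye(n), A) + np.kron(sc01(n, 1), B) + np.kron(sc01(n, 2), C)

S = np.roll(np.eye(n), 1, axis=0)       # cyclic shift of the block levels
Q = np.kron(S, np.eye(n))
assert np.allclose(Q @ Sigma @ Q.T, Sigma)
```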
Theorem 2 (Nahtman 2006) The covariance matrix Σ_1 : n_2 × n_2 of the factor γ is invariant with respect to all permutations P if and only if it has the following structure: Σ_1 = c_0 I_{n_2} + c_1 (J_{n_2} − I_{n_2}), where c_0 and c_1 are constants.
The next theorems reveal the structure of the covariance matrix Σ_2 of the factor representing the second-order interaction effects ξ in model (2), which is invariant with respect to G_2 or G_3.
Example 3.1 For n_2 = 4, n_1 = 4, we have the following covariance matrix of the 2nd-order interaction effect ξ in model (2):

Σ_2 =
[ A B B B ]
[ B A B B ]
[ B B A B ]
[ B B B A ]

Since n_2 = 4 and n_1 = 4, there are [4/2] + 1 = 3 distinct elements in each of A and B. Next, we obtain the structure of the covariance matrix which is G_3-invariant.
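Under our reading of G_2 (permutations of the first factor's levels combined with cyclic shifts of the second factor's levels; this combination is an assumption of the sketch, as are the numerical entries), the structure in Example 3.1 can be verified numerically in Python:

```python
import numpy as np

def sc_toeplitz(n, t):
    d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return np.asarray(t, dtype=float)[np.minimum(d, n - d)]

n2 = n1 = 4
A = sc_toeplitz(n1, [4.0, 1.0, 0.3])   # 3 distinct elements, as in the text
B = sc_toeplitz(n1, [0.8, 0.4, 0.1])
J = np.ones((n2, n2))
# block compound symmetry with SC-Toeplitz blocks: A diagonal, B off-diagonal
Sigma2 = np.kron(np.eye(n2), A) + np.kron(J - np.eye(n2), B)

P = np.eye(n2)[[2, 0, 3, 1]]           # an arbitrary permutation matrix
SP = np.roll(np.eye(n1), 1, axis=0)    # shift (cyclic) permutation matrix
Q = np.kron(P, SP)
assert np.allclose(Q @ Sigma2 @ Q.T, Sigma2)   # G2-invariance
```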
It is interesting to observe that the G_2-invariant matrix Σ_2 : 16 × 16 in (21) has a different structure from the G_3-invariant matrix Σ_2 : 16 × 16 in (24). One is block compound symmetric with SC-Toeplitz blocks (denoted Σ_{BCS−T}), the other is block SC-Toeplitz with compound symmetric blocks (denoted Σ_{BCT−CS}). The transformations P_12 and P_21 only affect the indices of a response vector Y = (y_{ij}), and the question is whether the labeling of the y_{ij} (observations) affects the covariance structure of the model. The answer is negative. The relationship between the two covariance structures, obtained in Theorems 4 and 5, respectively, is presented in the theorem below, in which the commutation matrix is used. Among other properties, this matrix switches the order of the matrices in a Kronecker product. For the definition and properties of the commutation matrix we refer to Magnus and Neudecker (1986).

Theorem 6 By a rearrangement of the observations in the response vector Y in model (2), the covariance matrix Σ_{BCS−T} given in (19) can be transformed into the covariance matrix Σ_{BCT−CS} given in (22), i.e. Σ_{BCT−CS} = K_{n_1,n_2} Σ_{BCS−T} K′_{n_1,n_2}, where K_{n_1,n_2} : n_2 n_1 × n_2 n_1 is the commutation matrix.

Proof Using the property of the Kronecker product (cA) ⊗ B = A ⊗ (cB), where c is an arbitrary scalar, together with K_{n_1,n_2}(A ⊗ B)K′_{n_1,n_2} = B ⊗ A, the result follows by applying the commutation matrix to each term of Σ_{BCS−T} in (19).
We use a simple example to demonstrate the result of Theorem 6.
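As a numerical sketch of Theorem 6 (Python, with arbitrary SC-Toeplitz entries; `commutation` is a helper constructing the commutation matrix in the column-major vec convention):

```python
import numpy as np

def sc_toeplitz(n, t):
    d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return np.asarray(t, dtype=float)[np.minimum(d, n - d)]

def commutation(m, n):
    """K with K @ vec(X) = vec(X.T) (column-major vec) for X of shape (m, n)."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            K[i * n + j, j * m + i] = 1.0
    return K

n2, n1 = 3, 4
A = sc_toeplitz(n1, [5.0, 1.0, 0.4])    # SC-Toeplitz blocks
B = sc_toeplitz(n1, [0.9, 0.5, 0.2])
J = np.ones((n2, n2))
bcs_t = np.kron(np.eye(n2), A) + np.kron(J - np.eye(n2), B)   # Sigma_BCS-T
bct_cs = np.kron(A, np.eye(n2)) + np.kron(B, J - np.eye(n2))  # Sigma_BCT-CS

# K (X kron Y) K' = Y kron X, applied term by term
K = commutation(n1, n2)
assert np.allclose(K @ bcs_t @ K.T, bct_cs)
```

The commutation matrix is orthogonal, so the two structures are similar and hence share the same spectrum.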

Spectra of G_2- and G_3-invariant matrices
In this section we study the spectra of the covariance matrices Σ_2 given in Theorem 4 and Theorem 5, respectively. The novelty of our results is that we use the eigenvalues of the blocks which constitute the corresponding matrices, as presented in (20) and (23), instead of directly calculating the eigenvalues from the elements of Σ_2. Here, the concept of commutativity is important, since if two normal matrices commute then they have a joint eigenspace and can be diagonalized simultaneously (see e.g. Kollo and von Rosen 2005, Chapter 1). The multiplicities of the eigenvalues and the number of distinct eigenvalues will also be given.

Theorem 7 Let Σ_2 = I_{n_2} ⊗ Δ^(1) + (J_{n_2} − I_{n_2}) ⊗ Δ^(2) be the G_2-invariant covariance matrix given in (20), where Δ^(1) and Δ^(2) are SC-Toeplitz matrices, and let λ_h^(i), h = 1, . . . , n_1, denote the eigenvalues of Δ^(i), i = 1, 2, paired through the common eigenvectors. Then the eigenvalues of Σ_2 are λ_h^(1) + (n_2 − 1)λ_h^(2), each appearing once, and λ_h^(1) − λ_h^(2), each appearing n_2 − 1 times, h = 1, . . . , n_1.

Proof The SC-matrices SC(n_i, k_i), k_i = 0, . . . , [n_i/2], commute. Hence Δ^(1) and Δ^(2) commute as well, and they have a joint eigenspace. Thus, there exists an orthogonal matrix V_2 such that V_2′ Δ^(i) V_2 = D_i, where D_i is diagonal and contains the eigenvalues of Δ^(i), i = 1, 2. Furthermore, I_{n_2} ⊗ Δ^(1) and (J_{n_2} − I_{n_2}) ⊗ Δ^(2) also commute. Define the orthogonal matrix V_1 = (n_2^{−1/2} 1_{n_2} : H), where H : n_2 × (n_2 − 1) satisfies H′1_{n_2} = 0 and H′H = I_{n_2−1}. Then V_1′ J_{n_2} V_1 is the diagonal matrix diag(n_2, 0_{n_2−1}). Let V = V_1 ⊗ V_2; then, using the property of the Kronecker product (A ⊗ B)(C ⊗ D) = AC ⊗ BD, we have

V′ Σ_2 V = I_{n_2} ⊗ D_1 + diag(n_2 − 1, −I_{n_2−1}) ⊗ D_2,    (25)

where diag(n_2 − 1, −I_{n_2−1}) denotes a diagonal matrix. The matrix obtained in (25) is diagonal, and its elements, together with their multiplicities, follow from the eigenvalues of Δ^(1) and Δ^(2) given in Lemma 1. We know that there are [n_1/2] + 1 distinct eigenvalues in Δ^(i), i = 1, 2. From the diagonal matrix in (25), the number of distinct eigenvalues of Σ_2 is obtained.

Now we illustrate the results obtained in Theorem 7 with two examples.

Example 4.1 Let n_2 = 3 and n_1 = 4. The block Δ^(1) : 4 × 4 is an SC-Toeplitz matrix with three distinct eigenvalues, with multiplicities m_1 = 2, m_2 = 1 and m_3 = 1, respectively. Similarly, the block Δ^(2) : 4 × 4 is an SC-Toeplitz matrix with three distinct eigenvalues, with the same multiplicities m_h, h = 1, . . . , 3, as for Δ^(1). Let m_λ denote the multiplicity of the eigenvalue λ. The distinct eigenvalues of Σ_2 : 12 × 12 are λ_h^(1) + 2λ_h^(2), with multiplicity m_h, and λ_h^(1) − λ_h^(2), with multiplicity 2m_h, h = 1, 2, 3.

Example 4.2 Let n_2 = 3 and n_1 = 3. Both blocks Δ^(1) : 3 × 3 and Δ^(2) : 3 × 3 are SC-Toeplitz matrices with two distinct eigenvalues, of multiplicities m_1 = 2 and m_2 = 1. The distinct eigenvalues of Σ_2 : 9 × 9 are λ_h^(1) + 2λ_h^(2), with multiplicity m_h, and λ_h^(1) − λ_h^(2), with multiplicity 2m_h, h = 1, 2.

Note The spectrum of Σ_2 (Σ_{BCT−CS}), given in (23), is the same as that of Σ_2 (Σ_{BCS−T}) in (20); it also follows from Theorem 6 that Σ_{BCT−CS} and Σ_{BCS−T} are similar matrices, i.e. Σ_{BCT−CS} = K_{n_1,n_2} Σ_{BCS−T} K′_{n_1,n_2}, where K_{n_1,n_2} is an orthogonal matrix. The characteristic equation is given by the determinant

det(Σ_2 − λI) = det(Δ^(1) + (n_2 − 1)Δ^(2) − λI_{n_1}) · det(Δ^(1) − Δ^(2) − λI_{n_1})^{n_2−1} = 0.
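The spectrum described in Theorem 7 can be checked numerically. The Python sketch below (arbitrary illustrative SC-Toeplitz blocks, with n_2 = 3, n_1 = 4 as in Example 4.1) compares the predicted eigenvalues λ_h^(1) + (n_2 − 1)λ_h^(2) and λ_h^(1) − λ_h^(2), paired through the common Fourier eigenvectors, with a direct eigendecomposition:

```python
import numpy as np

def sc_toeplitz(n, t):
    d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return np.asarray(t, dtype=float)[np.minimum(d, n - d)]

n2, n1 = 3, 4
D1 = sc_toeplitz(n1, [5.0, 1.0, 0.5])
D2 = sc_toeplitz(n1, [1.0, 0.4, 0.1])
J = np.ones((n2, n2))
Sigma2 = np.kron(np.eye(n2), D1) + np.kron(J - np.eye(n2), D2)

# block eigenvalues via the circulant cosine formula (same ordering h for both)
h = np.arange(n1)
lam = lambda D: np.array([np.sum(D[0] * np.cos(2 * np.pi * k * h / n1))
                          for k in range(n1)])
l1, l2 = lam(D1), lam(D2)
# predicted spectrum: l1 + (n2-1) l2 once, l1 - l2 repeated n2 - 1 times
predicted = np.concatenate([l1 + (n2 - 1) * l2, np.tile(l1 - l2, n2 - 1)])
assert np.allclose(np.sort(predicted), np.sort(np.linalg.eigvalsh(Sigma2)))
```

For these parameter values there are 2([n_1/2] + 1) = 6 distinct eigenvalues, in agreement with the count obtained from (25).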

Concluding remarks
In practice, a symmetry model is applied to a data set in which specific symmetry relations can be identified (Viana 2008). We have derived the covariance structures under invariance with respect to two groups of orthogonal transformations (permutations and rotations). In mixed linear models, particular patterns of the covariance matrices reflect how the data share common characteristics across different hierarchies. This is important when performing estimation and testing. When estimating the fixed effects, the imposed structure can usually improve the precision of the fixed effects estimator. Furthermore, there is a risk that misspecification of the covariance structure results in misleading inference about the fixed effects. Thus, it is also necessary to consider different hypotheses about the covariance structure in order to verify the model (Jensen 1988). In addition, the existence of explicit MLEs for such symmetry models should be studied; for example, Szatrowski (1980) and Ohlson and von Rosen (2010) provided explicit MLEs for some patterned covariance structures. Our study of the spectral properties can be used to obtain explicit MLEs of a covariance matrix which has a block circular symmetric structure, and to discuss the existence of explicit MLEs. In this article, we only considered a model with two random factors, which is common in empirical studies; it could be of interest to study cases with more factors. In such cases, higher order interactions will be involved. For example, when investigating random effects models with s random factors, potential structures in the data might be identified by considering different groups of symmetry transformations, i.e., when different symmetry patterns are observed in different hierarchies.
Funding Open access funding provided by Örebro University.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix A: Proof of Theorem 4
Proof Let N = n_2 n_1 and let P_21 ∈ G_2 be given by (11). The matrix Σ_2 can be written as Σ_2 = Σ_{k=1}^{N} Σ_{l=1}^{N} σ_{kl} e_k e_l′, where e_k and e_l are the kth and lth columns of the identity matrix I_N, respectively. We can define the element σ_{kl} of Σ_2 in a more informative way. Observe that one can write e_k = e_{2,i_2} ⊗ e_{1,i_1} and e_l = e_{2,j_2} ⊗ e_{1,j_1}, where e_{h,i_h} is the i_h th column of the identity matrix I_{n_h}, h = 1, 2, and σ_{kl} = σ_{(i_2,i_1)(j_2,j_1)} = Cov(ξ_{i_2 i_1}, ξ_{j_2 j_1}), where k = (i_2 − 1)n_1 + i_1 and l = (j_2 − 1)n_1 + j_1.
Hence, we have the following structure for Σ_2:

Σ_2 = I_{n_2} ⊗ (τ_0 I_{n_1} + Σ_{k=1}^{[n_1/2]} τ_k SC(n_1, k)) + (J_{n_2} − I_{n_2}) ⊗ (τ_{[n_1/2]+1} I_{n_1} + Σ_{k=1}^{[n_1/2]} τ_{[n_1/2]+1+k} SC(n_1, k)).

The structure in (19) is obtained, which implies that the "only if" part of the theorem is true. The "if" part follows from the structure of Σ_2, since every matrix of the form above satisfies P_21 Σ_2 P_21′ = Σ_2. The proof is complete.