Matrix compression along isogenic blocks

A matrix-compression algorithm is derived from a novel isogenic block decomposition for square matrices. The resulting compression and inflation operations possess strong functorial and spectral-permanence properties. The basic observation that Hadamard entrywise functional calculus preserves isogenic blocks has already proved to be of paramount importance for thresholding large correlation matrices. The proposed isogenic stratification of the set of complex matrices bears similarities to the Schubert cell stratification of a homogeneous algebraic manifold. An array of potential applications to current investigations in computational matrix analysis is briefly mentioned, touching concepts such as symmetric statistical models, hierarchical matrices and coherent matrix organization induced by partition trees.

1. Introduction

1.1. Prelude. In a previous paper [4] we were concerned with the study of entrywise positivity preservers acting on the cone of N × N positive semidefinite matrices. By a classical observation of Loewner, as developed by Horn in [16, Theorem 1.2], if the real-valued function f defined on the positive half-line preserves positive semidefiniteness for all N × N matrices via entrywise Hadamard calculus (for brevity, we say that f preserves positivity), then f is necessarily of class C (N −3) and has non-negative derivatives up to this order. Moreover, the derivatives of order up to N − 1 are non-negative whenever they exist. In the case where f is known to be analytic, preserving positivity only for rank-one N × N matrices already implies that the first N non-zero Taylor coefficients of f are strictly positive. In particular, this holds when f is a polynomial. Henceforth A •k denotes the entrywise (fractional) power of a matrix A.
The central result in our previous work [4] (subsequently refined by one of us with Tao [19]) is a closed form for the greatest lower bound for the only possible negative coefficient c M in a linear pencil p with real entrywise powers, such that p(A) is positive semidefinite for any positive semidefinite N × N matrix A with entries in a specific domain in C. The threshold value for the coefficient c M was obtained via a combinatorial formula involving Schur polynomials.
A more natural matrix-theoretic approach to obtaining the greatest lower bound for c M in (1.1) is to study the Rayleigh quotient R(A, u) of two quadratic forms. Finding the maximum of R with respect to both u and A indeed yields the critical threshold for preserving positivity mentioned above; we refer the reader to [4, Section 4] and [19, Proposition 11.2]. One first obtains a closed-form expression for R A := sup u∈R N R(A, u) and then maximizes with respect to A. However, a significant difficulty arises when using this approach: the function A → R A is not continuous on the set of N × N positive semidefinite matrices.
As shown in [4], discontinuities occur precisely where constant-block structures emerge in A. The discovery of this phenomenon led the authors to investigate, in [4, Section 5], the simultaneous kernel ∩ k≥0 A •k of the Hadamard powers of an arbitrary positive semidefinite matrix A, and a related Schubert cell-type stratification of the set of all positive semidefinite matrices of a given size.
A series of remarkable properties of this stratification is unveiled in the present work. In order to facilitate access to the concepts and notation in subsequent sections, we start by explaining informally the main ideas with the help of a simple example. Consider a 6 × 6 real symmetric matrix A with constant blocks of sizes 3, 2 and 1, determined by a partition π of the index set {1, . . ., 6}.
Given any matrix A ′ ∈ C N ×N , there is a unique coarsest partition π ′ of {1, . . ., N } such that A ′ is constant on the blocks determined by π ′ . We denote by S π ′ the set of all matrices having this constant-block structure. This naturally yields a stratification of the space of complex matrices, where the disjoint union is taken over all partitions of {1, . . ., N }. We refer to each set S π ′ as a stratum, and call the above decomposition the isogenic block stratification of C N ×N . We explore several applications of the stratification below. A natural operation is the compression Σ ↓ π , which collapses each constant block to a single entry: with A and π as above, this produces a 3 × 3 matrix. This straightforward transformation has some remarkable and very useful properties. For example, the weighted compression D π 1/2 Σ ↓ π (A) D π 1/2 has the same nonzero eigenvalues, counting multiplicities, as the original matrix A, where the matrix D π = diag(3, 2, 1) simply records the sizes of the blocks of π. The spectrum of A has 3 additional zero eigenvalues, corresponding to its kernel.
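A minimal numerical sketch of this compression and its spectral behaviour, in Python/NumPy; the 6 × 6 matrix below is a hypothetical stand-in (with constant blocks of sizes 3, 2 and 1) for the example in the text, not the original display.

```python
import numpy as np

# Hypothetical partition pi of {0,...,5} with blocks of sizes 3, 2, 1, and a
# symmetric 3x3 "compressed" matrix B holding one value per block pair.
blocks = [[0, 1, 2], [3, 4], [5]]
B = np.array([[2.0, 1.0, 0.5],
              [1.0, 3.0, 1.5],
              [0.5, 1.5, 1.0]])

# Indicator (weight) matrix W of the partition; inflation A = W B W^T is the
# 6x6 matrix constant on each block.
W = np.zeros((6, 3))
for j, I in enumerate(blocks):
    W[I, j] = 1.0
A = W @ B @ W.T

# Compression by blockwise averaging recovers B.
sizes = np.array([3.0, 2.0, 1.0])               # D_pi = diag(3, 2, 1)
C = (W.T @ A @ W) / np.outer(sizes, sizes)

# Spectral permanence: the nonzero eigenvalues of the 6x6 matrix A are those
# of the weighted 3x3 compression D^(1/2) C D^(1/2); A has 3 extra zeros.
Dh = np.diag(np.sqrt(sizes))
ev_small = np.linalg.eigvalsh(Dh @ C @ Dh)
ev_big = np.linalg.eigvalsh(A)
```

Comparing the sorted spectra confirms that `ev_big` consists of `ev_small` together with three zero eigenvalues.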
An even more striking use of compression is the computation of the Moore-Penrose generalized inverse. In general, there is a fast method to compute the pseudo-inverse of a constant-block matrix: compress the matrix, compute the pseudo-inverse of the resulting small matrix, and inflate it back, where the inflation map Σ ↑ π expands a matrix into one with blocks which are constant for the index-set partition π.
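This fast pseudo-inverse computation can be sketched numerically as follows; the block structure and entries are hypothetical, and the weighted compression used here is the one introduced later, in Section 5.

```python
import numpy as np

# Hypothetical constant-block matrix A with block sizes 3, 2, 1.
blocks = [[0, 1, 2], [3, 4], [5]]
sizes = np.array([3.0, 2.0, 1.0])
W = np.zeros((6, 3))
for j, I in enumerate(blocks):
    W[I, j] = 1.0
B = np.array([[2.0, 1.0, 0.5],
              [1.0, 3.0, 1.5],
              [0.5, 1.5, 1.0]])
A = W @ B @ W.T

# Weighted compression D^(1/2) * (blockwise average of A) * D^(1/2) ...
Dh = np.diag(np.sqrt(sizes))
Dhi = np.diag(1.0 / np.sqrt(sizes))
small = Dh @ ((W.T @ A @ W) / np.outer(sizes, sizes)) @ Dh

# ... pseudo-invert the small 3x3 matrix, then inflate with inverse weights.
A_pinv = W @ (Dhi @ np.linalg.pinv(small) @ Dhi) @ W.T
```

The result agrees with `np.linalg.pinv(A)` computed directly on the 6 × 6 matrix, but the expensive step is performed on a 3 × 3 matrix only.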
While the isogenic block structure considered in the present article arose from the entrywise matrix operations which preserve positivity, and is highly relevant to them, its simplicity and versatility invite the exploration of its potential utility beyond pure mathematics, in other areas of much current interest. Compression, fast computation and analysis of structured matrices are of great importance at present. For example, large positive semidefinite matrices make an important appearance in the analysis of big data, where they occur as covariance or correlation matrices of random vectors. In such settings, the rows and columns of the matrix are ordered according to the ordering of the underlying variables. Entrywise operations such as thresholding are natural in that setting (see [23], for example), as are grouping and averaging variables, and the related block operations on the corresponding matrices. We conclude the paper by indicating several such applications.
On the other hand, two notable connections with established areas of mathematics should be mentioned. First, the partition (1.3) of C N ×N into subsets S π (and more generally, as in Proposition 2.6) is akin to the stratification of the flag variety in Schubert calculus. It, too, emerges from a group action, namely that of G N ×N acting entrywise on C N ×N .
Second, the matrix compression and inflation operations we focus on are a very specific instance of a conditional expectation map. Originating in probability theory, conditional expectations were generalized to the noncommutative setting of operator algebras, with benefits that are hard to overestimate. From the ample bibliography on the subject we mention only an inspired, very recent survey [6], which presents Blecher and Read's extension of positivity beyond the well-known C*-algebraic setting.
In the present work, aimed at a large audience of practitioners of matrix analysis, we do not pursue in detail these two paths.
1.2. Summary of contents. In Section 2, we study the stratification of complex matrices with blocks that lie in single orbits under the action of a multiplicative subgroup G of the group of units C × , which generalises the case G = {1} discussed above. In Section 3, we show that the same stratification naturally emerges from studying simultaneous entrywise powers or functions of matrices that are positive semidefinite; in fact, we do not even require the positivity of determinants of size 4 or more.
In Section 4, we investigate the compression Σ ↓ π (A), which collapses every block of the matrix A to a single entry, equal to the average over that block. Compression and its right inverse, inflation, provide an effective tool for operating inside each stratum or its closure. Section 5 contains results that describe how compression and inflation relate to spectra and the functional calculus. These exploit the observation that a weighted version of the compression map provides a *-isomorphism of unital C*-algebras between the stratum given by a partition with m elements and C m×m .
The final section, Section 6, is devoted to providing examples and links to other areas of matrix analysis. Although our study was prompted by the structure of matrices arising in the statistics of big data, the isogenic stratification we introduce here is relevant to sparse-matrix compression procedures and related fast computational tools. Indeed, the averaging on isogenic blocks we propose is a simple and efficient method of eliminating the redundancy of operations involving this class of structured matrices. The rapidly developing theory of hierarchical matrices [12, 7, 14] is a natural framework where our study finds deep resonances. A second setting where isogenic stratifications are very natural is that of coherent matrix organization [10, 11, 9, 20]. An immediate consequence of our technique is the quick computation of spectra and singular values for structured matrices; other possible applications are the efficient solution of large linear systems and certifying the stability of evolution semigroups.

1.3. Acknowledgements. The authors extend their thanks to the International Centre for Mathematical Sciences, Edinburgh, where part of this work was carried out, and to an anonymous referee, for their helpful comments. D.G. was partially supported by a University of Delaware Research Foundation grant, by a Simons Foundation collaboration grant for mathematicians, and by a University of Delaware Research Foundation Strategic Initiative grant. A.K. was partially supported by Ramanujan Fellowship SB/S2/RJN-121/2017, MATRICS grant MTR/2017/000295, and SwarnaJayanti Fellowship grants SB/SJF/2019-20/14 and DST/SJF/MS/2019/3 from SERB and DST (Govt. of India), by grant F.510/25/CAS-II/2018(SAP-I) from UGC (Govt. of India), by a Tata Trusts travel grant, and by a Young Investigator Award from the Infosys Foundation. M.P. was partially supported by a Simons Foundation collaboration grant.
1.4. List of symbols. We collect below some notation introduced and used throughout the text.
• D(0, ρ) is the closed disc in C with radius ρ centered at the origin, S 1 is the unit circle in C, and C × is the set of non-zero complex numbers; more generally, the group of units of a unital commutative ring R is denoted by R × .
• 1 M ×N is the M × N matrix with each entry equal to 1, whereas Id N is the N × N identity matrix.
• f [A] is the matrix obtained by applying the function f to each entry of the matrix A.
• A •α is the matrix obtained from A by taking the αth power of each entry, whenever this is well defined. We set 0 0 := 1.
• Σ ↓ π (A) is the m × m matrix obtained by averaging the N × N matrix A over each block determined by the m-element partition π.
• Σ ↑ π (B) is the N × N matrix given by inflating each entry of the m × m matrix B to a constant block subordinate to the m-element partition π.
• Θ ↑ π (B) := Σ ↑ π (D π −1/2 B D π −1/2 ) is the weighted inflation (see Definition 5.1) of the m × m matrix B.
• P N (Ω) is the set of N × N positive semidefinite matrices with entries in the set Ω.

2. Isogenic stratification of complex matrices
In order to define the isogenic stratification, we begin with a summary of results appearing in our earlier work.

Theorem 2.1. Let G be a multiplicative subgroup of C × and let A ∈ P N (C).
(1) Suppose {I 1 , . . ., I m } is a partition of {1, . . ., N } satisfying the following two conditions.
(a) Each diagonal block A I j of A is a submatrix having rank at most one, so that A I j = u j u j * for some vector u j .
(b) The entries of each diagonal block A I j lie in a single G-orbit.
Then there exists a unique matrix C = (c ij ) m i,j=1 , with c ij = 0 whenever u i = 0 or u j = 0, such that A is a block matrix with A I i ×I j = c ij u i u j * for all i, j. Moreover, the entries of each off-diagonal block of A also lie in a single G-orbit. Furthermore, the matrix C ∈ P m (D(0, 1)), and the matrices A and C have equal rank.
(2) Consider moreover the following condition.
(c) The diagonal blocks of A have maximal size; that is, no diagonal block is contained in a larger diagonal block of rank at most one.
There exists a partition {I 1 , . . ., I m } such that (a), (b) and (c) hold, and such a partition is unique up to relabelling of the indices.

Theorem 2.1 naturally leads to Schubert-cell-type stratifications of the cone P N (C), and some properties of these stratifications were studied in [4]. Here, we explore a coarser form of partitioning on the whole set of N × N complex matrices, based solely on G-equivariance.

Definition 2.2. We denote by (Π N , ≺) the poset of all partitions of the set {1, . . ., N }, ordered so that π ′ ≺ π if and only if π is a refinement of π ′ .
The poset Π N is a lattice. Given π, π ′ ∈ Π N , let π ∧ π ′ and π ∨ π ′ denote the meet and join of π and π ′ , respectively; see [27, Example 3.10.4], but note that our ordering is opposite to the one employed there. For example, the meet and join of the partitions {{1, 2}, {3}} and {{1}, {2, 3}} are {{1, 2, 3}} and {{1}, {2}, {3}}, respectively.

Theorem 2.3. Fix integers M, N ≥ 1 and a multiplicative subgroup G of units in a unital commutative ring R. Given A ∈ R M ×N , there exists a unique minimal pair of partitions (π min , ̟ min ) ∈ Π M × Π N such that the entries of each block of A determined by a block of π min and a block of ̟ min lie in a single G-orbit. If M = N and A is symmetric, or if R = C, G ⊂ S 1 and A is Hermitian, then π min = ̟ min .

Proof. The non-trivial part is to establish uniqueness. To do so, we claim that if (π 1 , ̟ 1 ) and (π 2 , ̟ 2 ) ∈ Π M × Π N satisfy the property in the assertion, then so does (π 1 ∧ π 2 , ̟ 1 ∧ ̟ 2 ). The meet π 1 ∧ π 2 can be constructed as follows: connect i and i ′ ∈ {1, . . ., M } by an edge if they lie in the same block of π 1 or π 2 ; this defines a graph with vertex set {1, . . ., M } whose connected components yield the blocks of the partition π 1 ∧ π 2 . Denote this equivalence relation by i ∼ i ′ in π 1 ∧ π 2 . Now suppose i ∼ i ′ in π 1 ∧ π 2 and, similarly, j ∼ j ′ in ̟ 1 ∧ ̟ 2 , so there are paths joining them, each of whose edges lies in a block of π 1 or π 2 , and ̟ 1 or ̟ 2 , respectively. We claim that a ij ∈ Ga i ′ j ′ . Indeed, moving along the two paths one step at a time, each step stays within a single G-orbit, and this proves the claim. Now suppose M = N and A is symmetric. Then the pair (π min , ̟ min ) works for A, and (̟ min , π min ) works for A T = A, whence the above analysis shows that π min ∧ ̟ min works for both the rows and the columns of A. By minimality, it follows that π min = π min ∧ ̟ min = ̟ min .
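The graph construction of the meet used in the proof above translates directly into code. The following sketch (with zero-based indices and hypothetical partitions) computes the connected components with a simple union-find.

```python
# Meet of two partitions of {0, ..., n-1}: connect indices sharing a block of
# either partition, then read off connected components (union-find).
def meet(p1, p2):
    n = 1 + max(i for blk in list(p1) + list(p2) for i in blk)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for blk in list(p1) + list(p2):
        for i in blk[1:]:
            parent[find(i)] = find(blk[0])

    comps = {}
    for i in range(n):
        comps.setdefault(find(i), []).append(i)
    return sorted(sorted(c) for c in comps.values())

p1 = [[0, 1], [2, 3], [4]]
p2 = [[0], [1, 2], [3], [4]]
result = meet(p1, p2)    # blocks {0, 1, 2, 3} and {4}
```

With the paper's ordering (coarser partitions are smaller), this returns the finest common coarsening of the two inputs.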
Finally, suppose R = C, G ⊂ S 1 , and A is Hermitian. We claim that (̟ min , π min ) works for A as well as (π min , ̟ min ); by the previous paragraph, this gives the result. To show the claim, note that if (π, ̟) works for A, then it also works for the entrywise complex conjugate of A, since G ⊂ S 1 is closed under conjugation; but this conjugate equals A T because A is Hermitian, and (π, ̟) working for A T is the same as (̟, π) working for A.

Theorem 2.3 has a useful "symmetric" version for square matrices, in the sense that the same partition is used for both the rows and the columns. The proof is similar to that of Theorem 2.3 and is hence omitted.

Proposition 2.4. Fix an integer N ≥ 1, and a multiplicative subgroup G of units in a unital commutative ring R. Given A ∈ R N ×N , there exists a unique minimal partition π = {I 1 , . . ., I m } ∈ Π N such that the entries of the block submatrix A I i ×I j lie in a single G-orbit, for all i, j ∈ {1, . . ., m}.
The following definitions follow naturally from the previous proposition. Our focus henceforth is on complex matrices with blocks which are orbits of a fixed multiplicative subgroup G of S 1 , and primarily the case G = {1}, where the entries in any given block are identical.

Definition 2.5. Given a matrix A ∈ C N ×N and a multiplicative group G ⊂ S 1 , let π G (A) ∈ Π N be the partition provided by Proposition 2.4 for the matrix A.
Conversely, for a given partition π ∈ Π N , define the stratum S G π to be the set of all matrices A ∈ C N ×N with π G (A) = π. The proof of the following proposition is immediate.
Proposition 2.6. Given N ≥ 1 and a multiplicative subgroup G of S 1 , there is a natural stratification of the set of N × N complex matrices into the disjoint strata S G π , where π runs over Π N . When C N ×N is equipped with its usual topology, the closure of the stratum S G π contains S G π ′ for every coarsening π ′ ≺ π; for G = {1}, this closure is precisely the union of these strata.
Equation (2.1) may be seen as akin to the Schubert-cell decomposition of the flag variety corresponding to a semisimple Lie group.
Definition 2.7. The family of stratifications given by Proposition 2.6 will be referred to as G-block stratifications of the space C N ×N .

Remark 2.8. There is an important distinction between the stratification of C N ×N considered here and that for the cone P N (C) considered previously. In Theorem 2.1, the partition was defined to have the property that the diagonal blocks of A ∈ P N (C) have rank at most one. However, for a general matrix A ∈ C N ×N , this extra property need not hold for π G (A), unless either G = {1}, or A ∈ P N (C) and G ⊂ S 1 . In fact, as shown in [5, Proposition 4.6], in the latter case the requirement in Theorem 2.1 that A is positive semidefinite may be relaxed by requiring A to be 3-PMP: every principal minor of size at most 3 × 3 is non-negative.

3. PMP matrices and simultaneous kernels of entrywise powers
Henceforth, we will focus on constant-block stratifications, which we call isogenic, and we work mainly with square complex matrices.
We begin by noting how, for a large class of Hermitian matrices, the partition π {1} (A) introduced in Definition 2.5 emerges naturally from the study of simultaneous kernels of entrywise powers of A.
Theorem 3.2 (see [5]). Fix a Hermitian matrix A ∈ C N ×N that is 3-PMP, and let π := π {1} (A) = {I 1 , . . ., I m }. The following spaces are equal.
(1) The simultaneous kernel of A •n for n = 0, 1, . . ., N − 1.
(2) The simultaneous kernel of A •n for all n ≥ 0.
(3) The simultaneous kernel of the block-diagonal matrices with diagonal blocks (A I j ×I j ) •n , for all n ≥ 0.
(4) The kernel of J π := Σ m j=1 1 I j ×I j .
This equality of kernels need not hold for matrices that are not 3-PMP.
Our immediate goal is to show that a similar result holds for arbitrary real powers.

Theorem 3.3. Fix a Hermitian matrix A ∈ C N ×N that is 3-PMP and has positive entries, let π := π {1} (A) = {I 1 , . . ., I m }, and fix distinct real exponents n 1 , . . ., n N . The following spaces are equal.
(1) The simultaneous kernel of A •n j for j = 1, . . ., N .
(2) The simultaneous kernel of A •α for all α ∈ R.
(3) The simultaneous kernel of the block-diagonal matrices with diagonal blocks (A I j ×I j ) •α , for all α ∈ R.
(4) The kernel of the matrix J π := Σ m j=1 1 I j ×I j .
The same holds when A has non-negative entries with at least one zero entry, as long as n 1 = 0 and α is taken to be non-negative in (2) and (3).
These equalities of kernels need not hold for matrices that are not 3-PMP.
Remark 3.4. One subtlety in adapting the proof in [5] of Theorem 3.2 to these variants is that a key matrix identity there relates the entrywise powers of A to block-diagonal matrices via diagonal matrices D M,j composed of certain Schur polynomials evaluated at the rows of A. If one tries to generalize this identity from the exponents {0, 1, . . ., N − 1} to arbitrary real powers, as in [19], the entries of the diagonal matrices become ratios of generalized Vandermonde determinants, and such ratios are not defined for all A. As a result, the above identity does not admit a uniform generalization, so we cannot naively adapt the previous proof to show that the subspace in (2) contains that in (1). There is a similar issue when α is not greater than n N , which cannot be resolved by modifying the arguments in [5].
In light of the preceding remark, it is heartening that all three variants above follow from an even stronger result, which holds over arbitrary subsets of C and for more general functions than powers. To state this result, we introduce the following notion, over an arbitrary commutative ring.

Definition 3.5. Let X be a non-empty set and R a unital commutative ring. A set F of functions from X to R has full determinantal rank over X if, for any k distinct points x 1 , . . ., x k ∈ X, where k ≤ |F|, there exist functions f 1 , . . ., f k ∈ F such that the determinant det(f i (x j )) is not a zero divisor.
If R is a field, X is a finite set and F has at least |X| elements, then the family F has full determinantal rank over X if and only if the generalized matrix (f (x)) x∈X,f ∈F has full rank. If, however, the family F has fewer than |X| elements, then this is not the case; for example, if F contains a single function f which is the indicator function of a point in X, and X contains at least two points, then the matrix (f (x)) x∈X has full rank but a zero entry, and thus F does not have full determinantal rank over X.
Example 3.6. We now give several examples of such families of functions, the first three of which correspond precisely to the results above.
(1) If X is a finite set of complex numbers with k elements, then the family {z j−1 : j = 1, . . ., k} has full determinantal rank over X. Thus {z n : n = 0, 1, 2, . . .} has full determinantal rank over any subset of C.
(2) If X is a finite set of positive real numbers with k elements, then {x n j : j = 1, . . ., k} has full determinantal rank over X for any choice of distinct real exponents n 1 , . . ., n k . Thus {x α : α ∈ R} has full determinantal rank over any subset of (0, ∞).
(3) If X is a finite set of real numbers with k elements, then families of exponentials also qualify: for distinct α 1 , . . ., α k , the matrix M = (exp(2α i x j )) is invertible by the preceding example, applied to the distinct positive numbers exp(2x j ).
Classes of functions with the separation property of Definition 3.5 are well studied in approximation theory, under the name of Chebyshev systems [18].
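Example 3.6(1) can be checked numerically: over a finite set of distinct points, every evaluation matrix for the monomials is a Vandermonde matrix, whose determinant vanishes only when two points coincide. The points below are hypothetical.

```python
import numpy as np
from itertools import combinations

# Full determinantal rank of {z^j : j >= 0} over a finite X: for every k
# distinct points there are k monomials whose evaluation matrix is invertible.
X = [2.0, -1.0, 0.5, 3.0]
for k in range(1, len(X) + 1):
    for pts in combinations(X, k):
        V = np.vander(pts, k)            # k x k Vandermonde matrix
        assert abs(np.linalg.det(V)) > 1e-9
```

Since the points are real and distinct, the Vandermonde determinants (products of pairwise differences) are bounded away from zero, so every subset passes the test.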
We now provide a theorem which contains all three results described above.
Theorem 3.7. Fix a Hermitian matrix A ∈ C N ×N that is 3-PMP, where N ≥ 1. Suppose π := π {1} (A) = {I 1 , . . ., I m } and π ′ = {I ′ 1 , . . ., I ′ m ′ } is any partition refined by π. Let X denote the set of entries of A and suppose F is a family of complex-valued functions on X that has full determinantal rank over the entries of each row of A. The following spaces are equal.
(1) The simultaneous kernel of f [A] for all f ∈ F.
(2) The simultaneous kernel of f [A] for all functions f : X → C.
(3) The simultaneous kernel of the block-diagonal matrices with diagonal blocks f [A I ′ j ×I ′ j ], j = 1, . . ., m ′ , for all functions f : X → C.
(4) The kernel of J π := Σ m j=1 1 I j ×I j .
This equality of kernels need not hold for matrices that are not 3-PMP.
The proof of Theorem 3.7 relies on the following strengthening of a result obtained in the proof of [5, Theorem 5.1]; in that setting, the ring R is taken to be a field and F = {1, x, . . ., x m−1 }.

Proposition 3.8. Let R be a unital commutative ring, and suppose the matrix B ∈ R m×m , where m ≥ 1, is such that no diagonal entry recurs in its row: b jj ≠ b jk whenever j ≠ k. If the family F has full determinantal rank over the entries in each row of B, then the simultaneous kernel ∩ f ∈F ker f [B] is zero.

Proof. We show the result by induction on m, with the case m = 1 being immediate. For the inductive step, we claim that if u ∈ ∩ f ∈F ker f [B] then u 1 = 0. This reduces the problem to showing that the trailing principal (m − 1) × (m − 1) submatrix of B has zero simultaneous kernel, whence we are done by the induction hypothesis; note that if F has full determinantal rank over a set X, then it does so over any subset of X.
To show the claim, let r T = (r 1 , . . ., r m ) be the first row of B, and apply Theorem 2.3 with M = 1, N = m and G = {1} to obtain a minimal partition ̟ min = {J 1 , . . ., J k } such that r T is constant on each block. Since r 1 ≠ r j for j = 2, . . ., m, we may take J 1 = {1} without loss of generality. Let s ∈ R k be the compression of r obtained by deleting repeated entries, so that s j = r l for any l ∈ J j , with s 1 = r 1 = b 11 . As s has distinct entries by construction, there exist f 1 , . . ., f k ∈ F such that det C is not a zero divisor, where C := (f i (s j )) k i,j=1 . Let v ∈ R k be defined by setting v j := Σ l∈J j u l , and note that v 1 = u 1 . It follows that the first entry of f i [B]u equals Σ k j=1 f i (s j )v j = 0 for i = 1, . . ., k, so that Cv = 0. By Cramer's rule, it follows that det(C)v = 0 in R k , whence v = 0 by the hypotheses. In particular, we have that u 1 = v 1 = 0, as desired.
We can now prove the general theorem described above.
Proof of Theorem 3.7. Let V 1 , . . ., V 4 be the subspaces described in parts (1) to (4) of the statement of the theorem. We will show a chain of inclusions.

Note first that v ∈ V 4 if and only if Σ l∈I j v l = 0 for j = 1, . . ., m. Since each matrix f [A] is constant on the blocks determined by π, it follows that V 4 ⊂ V 2 and, similarly, V 4 ⊂ V 3 ; clearly V 2 ⊂ V 1 . We now claim that the inclusion V 1 ⊂ V 4 gives the result. Firstly, we then have that V 4 ⊂ V 2 ⊂ V 1 ⊂ V 4 , so that V 1 = V 2 = V 4 ; and secondly, this also gives the inclusion V 3 ⊂ V 4 , in which case V 3 = V 4 as well. For the latter, note that V 3 is the direct sum over j of the simultaneous kernels of the diagonal blocks f [A I ′ j ×I ′ j ], where the intersection is taken over the set of all functions from X to C; applying the equality V 1 = V 4 to each diagonal block A I ′ j ×I ′ j , with the partition π ∩ I ′ j in place of π, gives the inclusion as required.

Thus, it remains to show that V 1 ⊂ V 4 . We proceed as in [5]: given u in the simultaneous kernel V 1 , compressing each matrix f [A] and the vector u along the blocks of π produces matrices satisfying the hypotheses of Proposition 3.8, since π is the minimal partition of A and A is 3-PMP. The compressed vector, whose jth entry is Σ l∈I j u l , therefore vanishes by Proposition 3.8, so u ∈ ker J π = V 4 , as required. This concludes the proof of the chain of inclusions.

4. Inflation and compression
In the present section we continue to explore isogenic stratification. The proof of Theorem 3.7 used the compression operator Σ ↓ π , which will be one of the main characters in the new act.
Suppose A, B ∈ S {1} π , so that these matrices are constant on the blocks defined by the partition π. Then we may write A = Σ m i,j=1 a ij 1 I i ×I j and B = Σ m i,j=1 b ij 1 I i ×I j , where 1 I i ×I j is the N × N matrix with 1 in each entry of the I i × I j block and 0 elsewhere. Hence both the ordinary product AB and the entrywise product of A and B are again linear combinations of the matrices 1 I i ×I j , so the closure of every stratum is a subalgebra of C N ×N for both the usual and entrywise multiplication. The isogenic stratification of the space C N ×N is thus not merely into linear spaces of matrices, but into subalgebras.
Next we focus on the compression operation of a fixed stratum to a lower-dimensional space. To simplify notation, we henceforth write S π for S {1} π whenever there is no danger of confusion.

Definition 4.1. For i, j ∈ {1, . . ., m}, let E ij denote the elementary matrix with (i, j) entry equal to 1 and all other entries 0, and recall that 1 I i ×I j is the N × N matrix with 1 in each entry of the I i × I j block and 0 elsewhere.
(1) Define the linear inflation map Σ ↑ π : C m×m → C N ×N as the linear extension of E ij ↦ 1 I i ×I j , and note that the range of Σ ↑ π is the closure of S π .
(2) Define the linear compression map Σ ↓ π : C N ×N → C m×m so that the image B = Σ ↓ π (A) is such that b ij is the arithmetic mean of the entries of A I i ×I j , for i, j = 1, . . ., m.
Our next result shows that we may, in the entrywise setting, compress the matrices in a given stratum and work with the resulting smaller matrices with no loss of information.

Theorem 4.2. Let the closure of S π and C m×m be equipped with the entrywise product, so that the units for this product are 1 N ×N and 1 m×m , respectively. The maps Σ ↓ π and Σ ↑ π between these two algebras are mutually inverse, rank-preserving isomorphisms of unital *-algebras. Moreover, a matrix A in the closure of S π is positive semidefinite if and only if Σ ↓ π (A) is.
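The content of Theorem 4.2 is easy to verify numerically. The sketch below uses a hypothetical 5 × 5 partition and checks the Hadamard-product homomorphism and rank-preservation properties.

```python
import numpy as np

# Inflation and compression for the partition pi = {{0,1}, {2,3,4}}.
blocks = [[0, 1], [2, 3, 4]]
W = np.zeros((5, 2))
for j, I in enumerate(blocks):
    W[I, j] = 1.0
sizes = W.sum(axis=0)

def inflate(B):
    return W @ B @ W.T

def compress(A):
    return (W.T @ A @ W) / np.outer(sizes, sizes)   # blockwise averages

B1 = np.array([[1.0, 2.0], [2.0, 0.5]])
B2 = np.array([[3.0, -1.0], [-1.0, 4.0]])
A1, A2 = inflate(B1), inflate(B2)

assert np.allclose(compress(A1), B1)                 # mutually inverse on S_pi
assert np.allclose(compress(A1 * A2), B1 * B2)       # Hadamard homomorphism
assert np.linalg.matrix_rank(A1) == np.linalg.matrix_rank(B1)   # rank preserved
```

Here `*` is NumPy's entrywise product, matching the Hadamard product in the theorem.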
Towards the proof of this result, we first study the inflation and compression operators. To this aim, some new terminology will be useful.

Definition 4.3.
(1) Define the weight matrix W π ∈ C N ×m to have (i, j) entry 1 if i ∈ I j and 0 otherwise, and let D π := W π * W π = diag(|I 1 |, . . ., |I m |) be the diagonal matrix recording the block sizes. When the rows of W π are ordered so that the indices in I 1 are first, then the indices in I 2 , and so on, then W π is block diagonal, with jth diagonal block the all-ones column 1 |I j |×1 .
(2) For any coarsening π ′ ≺ π, define the partition π ′ ↓ ∈ Π m so that the blocks of π ′ ↓ are made up of those indices of blocks in π to be combined to form the blocks of π ′ .

Several important properties of the operators Σ ↓ π and Σ ↑ π are summarized in Proposition 4.4.

Proposition 4.4.
(1) The map π ′ → π ′ ↓ is a bijection between the set of all coarsenings of π in Π N and the set Π m .
(2) For all A ∈ C m×m , Σ ↑ π (A) = W π A W π * . (4.2) Moreover, Σ ↑ π is a bijection from C m×m onto the closure of S π , sending the stratum of C m×m associated to a partition in Π m onto the stratum of C N ×N associated to the corresponding coarsening of π.
(3) For all A ∈ C m×m , Σ ↓ π (Σ ↑ π (A)) = A. (4.3) Moreover, Σ ↓ π restricted to the closure of S π is a bijection onto C m×m , being the inverse map of Σ ↑ π .
(4) The linear maps Σ ↑ π and Σ ↓ π are compatible with matrix multiplication in the following sense: Σ ↑ π (A)Σ ↑ π (B) = Σ ↑ π (A D π B) for all A, B ∈ C m×m , (4.4) and Σ ↓ π (A) D π Σ ↓ π (B) = Σ ↓ π (AB) for all A, B in the closure of S π . (4.5)

Proof. Part (1) readily follows from the definitions. To see that (4.2) holds, it suffices by linearity to show that W π E ij W π * = 1 I i ×I j for each elementary matrix E ij , but this is immediate. It is also clear that Σ ↓ π and Σ ↑ π are mutually inverse bijections between the closure of S π and C m×m . The other assertions of (2) are straightforward.
To show that (4.3) holds, it once again suffices to take A to be an arbitrary elementary matrix; the calculation is then straightforward. That Σ ↓ π and Σ ↑ π are mutually inverse between the closure of S π and C m×m has already been discussed. Finally, equation (4.4) is verified by using (4.2) and the definition of D π . To see (4.5), note that, by (4.4), Σ ↑ π (Σ ↓ π (A) D π Σ ↓ π (B)) = Σ ↑ π (Σ ↓ π (A)) Σ ↑ π (Σ ↓ π (B)) = AB, and applying Σ ↓ π to both sides concludes the proof.
In other words, the map Σ ↑ π • Σ ↓ π is a conditional expectation on C N ×N corresponding to the σ-algebra generated by the blocks which define the stratum S π .
These properties of inflation and compression maps help demonstrate the * -isomorphism claimed above.
Proof of Theorem 4.2. By Proposition 4.4, it suffices to show the results only for Σ ↑ π . This map is linear and multiplicative for the entrywise product, by definition. Equation (4.2) shows that Σ ↑ π commutes with taking the adjoint, and also that Σ ↑ π preserves rank, since W π has full rank. Finally, that positivity is preserved follows immediately from (4.2).

5. Spectral permanence
Theorem 4.2 shows that the map Σ ↓ π is a positivity-preserving *-algebra isomorphism for the entrywise product, whence the entrywise calculus is transported to a lower-dimensional space of matrices. However, it is not immediately apparent if the usual holomorphic functional calculus in C m×m can be transported up to the closure of each stratum. We now show how this can be accomplished with the help of different, weighted inflation and compression maps.
As in the previous Section 4, we fix a partition π = {I 1 , . . ., I m } ∈ Π N , where N ≥ 1.

Definition 5.1. Define linear operators Θ ↓ π : C N ×N → C m×m and Θ ↑ π : C m×m → C N ×N by Θ ↓ π (A) := D π 1/2 Σ ↓ π (A) D π 1/2 and Θ ↑ π (B) := Σ ↑ π (D π −1/2 B D π −1/2 ).

Theorem 5.2. The maps Θ ↓ π and Θ ↑ π are mutually inverse, rank-preserving isomorphisms between the unital *-algebras given by the closure of S π and by C m×m , each equipped with the usual matrix multiplication. Furthermore, a matrix A ∈ S π is positive semidefinite if and only if Θ ↓ π (A) is.
Proof. That Θ ↓ π and Θ ↑ π are linear, bijective, *-equivariant, and preserve rank follows from the corresponding properties of Σ ↓ π and Σ ↑ π from Theorem 4.2, since D π is positive definite. Moreover, it is easily shown that Θ ↓ π and Θ ↑ π are mutually inverse between the closure of S π and C m×m , since Σ ↓ π and Σ ↑ π are. Furthermore, if A, B ∈ S π , then, by (4.5), Θ ↓ π (A)Θ ↓ π (B) = D π 1/2 Σ ↓ π (A) D π Σ ↓ π (B) D π 1/2 = D π 1/2 Σ ↓ π (AB) D π 1/2 = Θ ↓ π (AB). Thus the maps Θ ↓ π and Θ ↑ π are algebra homomorphisms for the usual matrix multiplication, and that they take the units to one another is readily verified; note that the closure of S π has unit Θ ↑ π (Id m ) = Σ ↑ π (D π −1 ). The final assertion about preserving positivity is immediate.
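The *-isomorphism of Theorem 5.2 for the ordinary matrix product can likewise be checked numerically; the partition and the matrix entries below are hypothetical.

```python
import numpy as np

# Weighted compression/inflation for pi = {{0,1,2}, {3,4}}:
#   theta_down(A) = D^(1/2) (blockwise average of A) D^(1/2),
#   theta_up(B)   = inflation of D^(-1/2) B D^(-1/2).
blocks = [[0, 1, 2], [3, 4]]
W = np.zeros((5, 2))
for j, I in enumerate(blocks):
    W[I, j] = 1.0
sizes = W.sum(axis=0)
Dh = np.diag(np.sqrt(sizes))
Dhi = np.diag(1.0 / np.sqrt(sizes))

def theta_down(A):
    return Dh @ ((W.T @ A @ W) / np.outer(sizes, sizes)) @ Dh

def theta_up(B):
    return W @ (Dhi @ B @ Dhi) @ W.T

B1 = np.array([[1.0, 2.0], [0.0, 3.0]])
B2 = np.array([[0.5, 1.0], [2.0, -1.0]])
A1, A2 = theta_up(B1), theta_up(B2)

assert np.allclose(theta_down(A1), B1)               # mutually inverse
assert np.allclose(theta_down(A1 @ A2), B1 @ B2)     # multiplicative
```

Unlike the unweighted maps of Section 4, these are homomorphisms for the ordinary (not entrywise) product, which is what the assertions exercise.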
Remark 5.3. We note a simple consequence of Theorem 5.2: if A ∈ S π , then its Moore-Penrose pseudo-inverse is A † = Θ ↑ π (Θ ↓ π (A) † ), since A = Θ ↑ π (Θ ↓ π (A)) and Θ ↑ π is a *-algebra homomorphism.

We now extend the compression and inflation operators to act on vectors as well as matrices.
Remark 5.6. We collect here some further properties of the maps Σ ↓ π , Σ ↑ π , Θ ↓ π , and Θ ↑ π . (1) If A ∈ S π and u ∈ C N , then Au is constant on the blocks of the partition π; that is, Au ∈ im Σ ↑ π = im Θ ↑ π , the range of these two operators acting from C m to C N .

Our next result shows that the maps Θ ↑ π and Θ ↓ π preserve eigenvalues, up to the possible addition or removal of 0.
Proposition 5.7. The following spectral permanence holds for all A ∈ S π and B ∈ C m×m : σ(Θ ↑ π (B)) ∪ {0} = σ(B) ∪ {0} (5.1) and σ(Θ ↓ π (A)) ∪ {0} = σ(A) ∪ {0}. (5.2) In particular, Θ ↓ π : S π → C m×m and Θ ↑ π : C m×m → S π preserve the spectral radius. Furthermore, applying Θ ↑ π to B adds a zero eigenvalue of geometric multiplicity N − m to σ(B), whereas applying Θ ↓ π to A reduces the geometric multiplicity of the zero eigenvalue of A by N − m.
Proof. The identity (5.1) follows from the multiplicativity of Θ ↑ π and the correspondence of eigenvectors under inflation. The proof of (5.2) is similar, with the help of Proposition 5.5 and the fact that if u ∈ im Θ ↑ π and Θ ↓ π (u) = 0, then u = 0. The last claim follows immediately by the rank-nullity theorem.
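Proposition 5.7 can be illustrated numerically with a general (non-Hermitian) matrix B; the partition below is hypothetical.

```python
import numpy as np

# Inflate a general 3x3 matrix B to a 6x6 matrix in S_pi and compare spectra.
blocks = [[0, 1], [2, 3, 4], [5]]
W = np.zeros((6, 3))
for j, I in enumerate(blocks):
    W[I, j] = 1.0
Dhi = np.diag(1.0 / np.sqrt(W.sum(axis=0)))

B = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 2.0]])            # deliberately non-Hermitian
A = W @ (Dhi @ B @ Dhi) @ W.T              # Theta_up(B)

# The spectrum of A is that of B plus N - m = 3 extra zero eigenvalues.
ev_A = np.sort_complex(np.linalg.eigvals(A))
ev_B = np.sort_complex(np.linalg.eigvals(B))
expected = np.sort_complex(np.concatenate([np.zeros(3), ev_B]))
```

The trace is also preserved under inflation, since Dhi cancels the block sizes exactly.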
Remark 5.8. Let A be a complex associative algebra with multiplicative identity 1 A . Then the spectrum of a ∈ A is defined as σ(a; A) := {λ ∈ C : λ1 A − a is not invertible in A}. In the notation of Proposition 5.7, σ(A) = σ(A; C N ×N ) for any A ∈ C N ×N , where C N ×N is equipped with the usual matrix product.
The following theorem shows that the holomorphic functional calculus naturally transfers between C m×m and S π . Its proof follows immediately from the fact that Θ ↓ π is an algebra isomorphism.
Theorem 5.9. Given A ∈ S π and B ∈ C m×m , the resolvents of A and of Θ ↓ π (A), and of B and of Θ ↑ π (B), correspond to one another under Θ ↑ π and Θ ↓ π . Thus, the holomorphic functional calculus transfers between C m×m and S π : if A ∈ S π and f is holomorphic on an open set containing σ(A; S π ), then f (A; S π ) = Θ ↑ π (f (Θ ↓ π (A))).

Remark 5.10. It follows from Theorem 5.2 that analogues of the general linear group GL m , the unitary group U m , and the permutation group S m exist inside the stratum S π . Furthermore, the notions of nilpotent, Hermitian, and positive semidefinite matrices are preserved in S π via Θ ↑ π . Hence analogues of the Bruhat, Cholesky, and polar decompositions can also be defined on S π , and all of the respective factors live inside the same stratum.
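A sketch of Theorem 5.9 with a hypothetical partition: since Θ ↑ π is an algebra homomorphism that does not map Id m to Id N , the transfer is cleanest for holomorphic f with f(0) = 0; below we take the polynomial f(z) = z³ − 2z.

```python
import numpy as np

blocks = [[0, 1, 2], [3, 4]]
W = np.zeros((5, 2))
for j, I in enumerate(blocks):
    W[I, j] = 1.0
sizes = W.sum(axis=0)
Dh = np.diag(np.sqrt(sizes))
Dhi = np.diag(1.0 / np.sqrt(sizes))

B = np.array([[1.0, 0.5], [0.5, 2.0]])
A = W @ (Dhi @ B @ Dhi) @ W.T                       # A = Theta_up(B) in S_pi

def f(M):                                           # f(z) = z^3 - 2z, f(0) = 0
    return M @ M @ M - 2 * M

small = Dh @ ((W.T @ A @ W) / np.outer(sizes, sizes)) @ Dh   # Theta_down(A)
via_compression = W @ (Dhi @ f(small) @ Dhi) @ W.T           # Theta_up(f(...))
```

Applying f to the 2 × 2 compression and inflating back coincides with computing f(A) directly on the 5 × 5 matrix, as the assertions below confirm.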
We conclude this section with some remarks on the situation for S G π with more general G ⊂ C × . A key feature in the definition of the compression operator Σ ↓ π , and so of Θ ↓ π , was the unique decomposition of a single rank-one block matrix 1 I i ×I j as the outer product of two indicator vectors. For more general G, the block matrices do not possess such a decomposition. Moreover, each stratum and its closure are no longer closed under multiplication. For example, if G = {±1}, N = 2, and π = {{1, 2}} is the minimum partition, then the square of a matrix in S G π need not lie in the closure of S G π .

6. Ramifications
We collect in this last section several observations revealing natural links between the compression and inflation operations derived from the isogenic stratification of the matrix space and recent advances, or classical examples of current interest, in numerical matrix analysis.

6.1. Symmetric statistical models. We briefly discuss a related setting of statistical models and covariance matrices which exhibit symmetry with respect to a group of permutations; see [26] and the references therein. The authors there fix a subgroup G of the symmetric group S N , and consider the set W G of matrices invariant under simultaneous permutation of their rows and columns by the elements of G. In the framework of the present article we consider permutation groups associated to a partition π = {I 1 , . . ., I m } of {1, . . ., N }, that is, products of the symmetric groups of the individual blocks. By contrast, in [26] the authors work with more general subgroups of S N . Furthermore, the matrices in a given stratum S {1} π have diagonal blocks with equal entries, as the permutations act separately on the rows and columns of C N ×N , whereas the diagonal blocks of a square matrix in W G may have different diagonal and off-diagonal entries.
Having acknowledged these differences, we now discuss the setting of [26] from the viewpoint adopted above. The first step is to establish the existence of a suitable partition associated with a given matrix.

Proposition 6.1. Fix a unital commutative ring R and a multiplicative subgroup G ⊂ R × . Given a matrix A ∈ R N ×N , where N ≥ 1, there is a unique minimal partition ̟ min = {I 1 , . . ., I m } ∈ Π N satisfying the following two properties.
(1) If i, j ∈ {1, . . ., m} are distinct, then all entries of the block A I i ×I j lie in a single G-orbit.
(2) If i ∈ {1, . . ., m}, then all diagonal entries of the block A I i ×I i lie in a single G-orbit, as do all off-diagonal entries.
Proof. As in the proof of Theorem 2.3, the key step is to show that if ̟ 1 and ̟ 2 satisfy both of the properties given above, then ̟ 1 ∨ ̟ 2 does too. This proceeds as in that proof, and the only part which is not immediate is the case of off-diagonal elements of a diagonal block, with one lying above the diagonal and the other below. However, property (2) gives that a pq ∈ Ga qp if p and q are distinct and lie in the same block of ̟ 1 or ̟ 2 , so one may "cross the diagonal".
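For the simplest choice G = {1}, a single G-orbit is a single value, so properties (1) and (2) of Proposition 6.1 amount to constancy conditions on blocks. The following sketch (the helper name `satisfies_props` is ours, and the matrix is an ad hoc example) checks a candidate partition against these properties:

```python
import numpy as np

def satisfies_props(A, parts):
    """Check properties (1)-(2) of Proposition 6.1 for G = {1}:
    off-diagonal blocks are constant, and each diagonal block has
    constant diagonal entries and constant off-diagonal entries."""
    for a, I in enumerate(parts):
        for b, J in enumerate(parts):
            block = A[np.ix_(I, J)]
            if a != b:
                if not np.allclose(block, block.flat[0]):      # property (1)
                    return False
            else:
                diag = np.diag(block)
                off = block[~np.eye(len(I), dtype=bool)]
                if not np.allclose(diag, diag[0]):             # property (2), diagonal
                    return False
                if off.size and not np.allclose(off, off[0]):  # property (2), off-diagonal
                    return False
    return True

# A 4 x 4 matrix with the required constancy for the partition {1,2},{3,4}
A = np.array([[2., 5., 7., 7.],
              [5., 2., 7., 7.],
              [7., 7., 3., 9.],
              [7., 7., 9., 3.]])
print(satisfies_props(A, [[0, 1], [2, 3]]))   # True
print(satisfies_props(A, [[0, 1, 2, 3]]))     # False: diagonal entries 2 and 3 differ
```

The minimal partition ̟ G (A) is then the coarsest partition passing this test, which exists by the join-stability argument in the proof.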
Denote the partition in Proposition 6.1 by ̟ G (A) and, as above, define for each partition ̟ the corresponding set of matrices. There results a natural stratification of C N ×N from Proposition 6.1; thus C N ×N indeed admits a stratification, analogous to the situation above. However, the reason we do not proceed further along these lines is the lack of a rank-preserving equivalence of the stratum C {1} ̟ with any lower-dimensional space. Indeed, if ̟ = {I 1 , . . ., I m } and A ̟ := 1 I 1 ×I 1 + • • • + 1 I m ×I m , then the sum of A ̟ with any positive multiple of the N × N identity matrix is an element of C {1} ̟ with full rank.

6.2. Block correlation matrices and group kernels. A popular approach for constructing probabilistic models involving categorical data consists of grouping the input levels in such a way that correlation is constant within each group and across groups [2, 8, 17, 24]. In particular, the rapidly developing area of group kernels exploits this idea to obtain useful Gaussian processes for categorical variables [22, 25]. This approach yields parsimonious covariance models that can be naturally analyzed in the framework of the current paper. To elaborate, consider a categorical problem involving a potentially large number of levels N , partitioned into a typically small number of groups m of sizes n 1 , . . ., n m , so that n 1 + • • • + n m = N . Assuming the within-group correlations and the between-group correlations are constant, the associated covariance matrix A can be written in block form, where the diagonal blocks A ii ∈ R n i ×n i are compound-symmetry matrices of the form A ii = Id n i + c ii (1 n i ×n i − Id n i ) and the off-diagonal blocks A ij ∈ R n i ×n j are of the form A ij = c ij 1 n i ×n j . As shown in [24, 25], positive definiteness of A is equivalent to the positivity of the much smaller compressed matrix Σ ↓ π (A), where π is the partition of {1, . . ., N } associated with the block structure of A. Earlier versions of this result may be found in [8] and [17].
Theorem 6.3. Let A = (A ij ) m i,j=1 be as above, with 0 < c ij < 1 for i, j = 1, . . ., m. Then A is positive definite if and only if its compression Σ ↓ π (A) is positive definite.

In fact, Theorem 6.3 is a consequence of the following more general result. Recall the Loewner order: if A and B are Hermitian complex matrices of the same size, then A ≥ B if and only if A − B is positive semidefinite.

(1) The matrix B is positive semidefinite.
(2) The compression Σ ↓ π (B) is positive semidefinite.
Note that Theorem 6.3 follows from Theorem 6.4 because Id n i + c ii (1 n i ×I n i − Id n i ) − n i −1 (1 + (n i − 1)c ii ) 1 n i ×n i = (1 − c ii )(Id n i − n i −1 1 n i ×n i ) ≥ 0 whenever c ii < 1. More generally, the same result holds if the diagonal blocks have an appropriate compound-symmetry form. For completeness, we provide a self-contained proof of Theorem 6.4 in the language developed above.

Proof. Suppose (2) holds. Then C := Σ ↓ π (B) is positive semidefinite by assumption, and so B = Σ ↑ π (C) + D is positive semidefinite. This shows that (2) =⇒ (1).
For the positive-definite version, that (1) =⇒ (2) is again clear, with the help of (4.3) and the fact that W π there has full rank. For the converse, suppose C := Σ ↓ π (B) is positive definite and note that B = Σ ↑ π (C) + D is the sum of positive semidefinite matrices. Thus, if v is such that v T Bv = 0 then v lies in the kernels of both D and Σ ↑ π (C). Since each B ii is invertible, and rank is subadditive, the kernel of each diagonal block of D is at most one-dimensional. Hence, the kernel of D is spanned by vectors of the form Σ ↑ π (e i ) for i ∈ I, where e 1 , . . ., e m is the canonical basis of R m and I ⊂ {1, . . ., m}. However, by Proposition 5.5 (5), no non-trivial such vector lies in the kernel of Σ ↑ π (C), since C and D π are invertible. Hence v = 0 and B is positive definite.
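The equivalence of Theorem 6.3 can be illustrated numerically. The sketch below assumes that the weighted compression Σ ↓ π takes each block to its arithmetic mean scaled by √(n i n j ), a normalization consistent with the orthonormality statements of Proposition 5.5; the helper names and the numerical values are ours:

```python
import numpy as np

def compress(A, sizes):
    # Weighted compression: entry (i, j) is sqrt(n_i * n_j) times the
    # arithmetic mean of the block A_{I_i x I_j} (assumed normalization).
    idx = np.cumsum([0] + sizes)
    m = len(sizes)
    C = np.empty((m, m))
    for i in range(m):
        for j in range(m):
            C[i, j] = (np.sqrt(sizes[i] * sizes[j])
                       * A[idx[i]:idx[i+1], idx[j]:idx[j+1]].mean())
    return C

def block_corr(sizes, c):
    # Block correlation matrix: diagonal blocks Id + c_ii (J - Id),
    # off-diagonal blocks c_ij J.
    idx = np.cumsum([0] + sizes)
    A = np.empty((idx[-1], idx[-1]))
    for i in range(len(sizes)):
        for j in range(len(sizes)):
            A[idx[i]:idx[i+1], idx[j]:idx[j+1]] = c[i][j]
    np.fill_diagonal(A, 1.0)
    return A

sizes = [2, 3]
A = block_corr(sizes, [[0.5, 0.2], [0.2, 0.4]])
pd_full = np.linalg.eigvalsh(A).min() > 0
pd_comp = np.linalg.eigvalsh(compress(A, sizes)).min() > 0
print(pd_full, pd_comp)    # both True: positive definiteness transfers

B = block_corr(sizes, [[0.1, 0.95], [0.95, 0.1]])
pd_full2 = np.linalg.eigvalsh(B).min() > 0
pd_comp2 = np.linalg.eigvalsh(compress(B, sizes)).min() > 0
print(pd_full2, pd_comp2)  # both False: failure also transfers
```

Note that the full matrices are 5 × 5 while the compressed tests involve only 2 × 2 matrices.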
An explicit expression for the eigendecomposition of the block matrices featuring in Theorem 6.3 was obtained in [8] and [17], as well as in [2] for the case where the diagonal blocks are of the form A ii = d i Id n i + c ii (1 n i ×n i − Id n i ). We provide a proof of the last result using our stratification language and some results from Section 5.

Theorem 6.5 ([2, Theorem 1]). Let A = (A ij ) m i,j=1 be a real matrix with blocks as above. Then A = QDQ T , where Q is an orthogonal matrix and D is the diagonal matrix of eigenvalues described below.

Proof. As above, let A ii denote the arithmetic mean of the entries of the block A ii , so that A ii = n i −1 (d i + (n i − 1)c ii ) for i = 1, . . ., m, and let π = {I 1 , . . ., I m } be the partition associated with the block structure of A.
We claim that the spectrum of A consists of the eigenvalues of the compressed matrix A ′ together with the values d j − c jj , where d j − c jj has multiplicity n j − 1 for j = 1, . . ., m. If this holds then, since A and D are symmetric and have the same spectrum, there exists an orthogonal matrix Q such that A = QDQ T . It remains to show the claim (6.4), and we do so by explicitly producing an orthonormal eigenbasis. First, let V j be the orthogonal complement of 1 I j in R I j , padded by zeros to form a subspace of R N . A direct calculation shows that every non-zero vector in V j is an eigenvector of A with eigenvalue d j − c jj . As the subspaces V 1 , . . ., V m are pairwise orthogonal, this eigenvalue has multiplicity n j − 1 as required.
Next, let w 1 , . . ., w m ∈ R m be an orthonormal eigenbasis for A ′ , with eigenvalues λ 1 , . . ., λ m . Then {Θ ↑ π (w 1 ), . . ., Θ ↑ π (w m )} is an orthonormal set, by Proposition 5.5 (3), which lies in the span of {1 I 1 , . . ., 1 I m } by definition, and so is orthogonal to V i for i = 1, . . ., m. Furthermore, if v ∈ R m then Proposition 5.5 (5) applies. Hence, by (6.3) and Proposition 5.5 (5), we conclude that each Θ ↑ π (w i ) is an eigenvector of A with eigenvalue λ i , which completes the eigenbasis.

In the same spirit as here, several important matrix calculations involving the matrix A, including powers, exponential, logarithm, and Gaussian log-likelihood, can be performed by working with the compressed matrix A ′ . See [2] for more details.

Consider first a positive semidefinite Hankel matrix H = (c j+k ) N j,k=0 with real entries. It is well known (see, for example, [1]) that there exists at least one positive measure µ on the real line having the entries c j as power moments. Then, if H belongs to a stratum other than the topmost, S π ∨ , there are distinct indices j and k for which the corresponding entries coincide, so the measure µ is supported by at most three points, −1, 0 and 1. It follows for N ≥ 2 that H has rank at most 3 and extends uniquely to an infinite positive semidefinite Hankel matrix, whereas if all the entries of H coincide, the measure µ is a point mass at 1.

A similar situation arises from the analysis of a positive semidefinite Toeplitz matrix T = (s k−j ) N j,k=0 . Here, we assume the entries to be complex, and the positivity of T implies that s 0 ≥ 0 and s −j = s j (j = 0, . . ., N ).
The solution to the truncated trigonometric moment problem guarantees a positive measure ν on [−π, π) such that s j = ∫ e ijθ dν(θ) for j = 0, . . ., N . If, as before, T has some non-trivial isogenic block structure, then there exists an index j satisfying ∫ |1 − e ijθ | 2 dν(θ) = 0, so the measure ν is the sum of at most j point masses, situated on the vertices of the regular polygon determined by the equation z j = 1. Thus, the matrix T is degenerate and extends uniquely, by periodicity, to an infinite positive semidefinite Toeplitz matrix.
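The Hankel rigidity described above can be observed directly: the moments of a measure µ = a δ −1 + b δ 0 + c δ 1 generate a positive semidefinite Hankel matrix of rank at most 3. A small numerical sketch (our own construction, not taken from [1]):

```python
import numpy as np

# Moments c_j of the measure mu = a*delta_{-1} + b*delta_0 + c*delta_1:
# c_0 = a + b + c, and c_j = a*(-1)**j + c for j >= 1.
a, b, c = 0.3, 0.5, 0.2
N = 6
moments = [a + b + c] + [a * (-1) ** j + c for j in range(1, 2 * N + 1)]
H = np.array([[moments[j + k] for k in range(N + 1)] for j in range(N + 1)])

# H is a sum of three rank-one moment matrices, hence has rank at most 3,
# and is positive semidefinite since a, b, c > 0.
print(np.linalg.matrix_rank(H))               # 3
print(np.linalg.eigvalsh(H).min() >= -1e-10)  # True
```

The same matrix extends to an infinite positive semidefinite Hankel matrix simply by continuing the moment sequence.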
6.4. Stability of semigroups. The simple observation that the spectral radius of a square matrix belonging to the closure of a certain stratum is preserved by the push-down operation Θ ↓ π immediately resonates with stability criteria for evolution semigroups. To be more specific, consider the time-invariant linear homogeneous system of differential equations du/dt = Au(t), where A is an element of C N ×N and u : [0, ∞) → C N . As is well known, the solution u(t) = exp(tA)u(0) (6.5) decays exponentially as t → ∞, for arbitrary initial data u(0), if and only if the spectrum of A lies in the open left half-plane. In this case, the system is called asymptotically stable.
A great deal of work has been done to establish asymptotic-stability criteria for linear systems, in both the time-invariant and the time-dependent cases. A relatively early work in this area is [15], where the distance from a stable matrix to the set of unstable matrices is described definitively.
The isogenic block stratification introduced in the present article can be used in some cases to simplify the verification of stability. If the Cayley transform X = (A + I)(A − I) −1 of A belongs to a closed stratum S π and the spectrum of X belongs to the open unit disk, then so does the spectrum of the compression Θ ↓ π (X). Furthermore, the converse is true, which gives the following asymptotic-stability criterion.

Proposition 6.6. Suppose the Cayley transform X of the generator A of the linear system (6.5) belongs to the stratum S π . The system is asymptotically stable if and only if the spectral radius of the compression Θ ↓ π (X) is less than 1.
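Proposition 6.6 rests on the fact that the Cayley transform X = (A + I)(A − I) −1 maps eigenvalues in the open left half-plane to eigenvalues in the open unit disk. The following sketch checks this spectral criterion on two ad hoc matrices, without the compression step:

```python
import numpy as np

def cayley(A):
    """Cayley transform X = (A + I)(A - I)^{-1}; an eigenvalue z of A
    becomes (z + 1)/(z - 1), which lies in the open unit disk exactly
    when Re z < 0."""
    I = np.eye(A.shape[0])
    return (A + I) @ np.linalg.inv(A - I)

def spectral_radius(X):
    return np.abs(np.linalg.eigvals(X)).max()

stable = np.array([[-1.0, 2.0], [0.0, -3.0]])   # eigenvalues -1, -3
unstable = np.array([[0.5, 1.0], [0.0, -2.0]])  # eigenvalue 0.5 > 0

print(spectral_radius(cayley(stable)) < 1)      # True: asymptotically stable
print(spectral_radius(cayley(unstable)) < 1)    # False
```

When A additionally lies in a stratum, the spectral radius of X may be computed from the much smaller matrix Θ ↓ π (X), by spectral permanence.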
Much more involved, but well studied due to its important applications to control theory, is the case of a time-varying linear system du/dt = A(t)u(t).
As before, it may be beneficial to examine the compression of the Cayley transform of the family of matrices A(t), but we do not enter into the details here.
One of the challenges of modern stability theory for switched dynamical systems is the computation of the joint numerical radius of a tuple of non-commuting matrices. More precisely, if A 1 , A 2 , . . ., A n are complex N × N matrices, one wants to certify that the joint radius is strictly less than 1. While the general case of the corresponding weak inequality is known to be undecidable, sufficient conditions for the strict inequality are known and widely used; see [21] and references therein. Once again, being fortunate enough to have the matrices A 1 , A 2 , . . ., A n in a high-codimension isogenic stratum may considerably simplify, via compression, the verification of such criteria.
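One standard sufficient certificate of this kind (a norm bound via submultiplicativity, not the specific criteria of [21]) is that the maximum of ∥P∥ 1/k over all products P of length k be less than 1 for some k; this quantity bounds the joint spectral radius from above. A sketch:

```python
import numpy as np
from itertools import product

def jsr_upper_bound(mats, k):
    """Upper bound max ||A_{i_1} ... A_{i_k}||^{1/k} on the joint spectral
    radius, valid by submultiplicativity of the spectral norm."""
    best = 0.0
    for word in product(mats, repeat=k):
        P = np.linalg.multi_dot(word) if k > 1 else word[0]
        best = max(best, np.linalg.norm(P, 2))
    return best ** (1.0 / k)

# Two ad hoc contractive matrices (our example data)
A1 = np.array([[0.4, 0.3], [0.0, 0.2]])
A2 = np.array([[0.1, 0.0], [0.5, 0.3]])

print(jsr_upper_bound([A1, A2], 3) < 1)   # True: certifies stability
```

The cost grows as n k products, which is precisely where compression to a lower-dimensional stratum would pay off.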
6.5. Hierarchical matrices. With its origins in the theory of numerical approximation of integral equations, a novel chapter of applied linear algebra has emerged in the last two decades with the concept of a hierarchical matrix at its center [3, 13]. The philosophy behind this powerful tool is to exploit, via effective numerical schemes, the sparsity and hidden redundancy in a large matrix of relatively low rank. Starting with the factorization of a large matrix into a product of lower-rank rectangular matrices of smaller size, systematic studies of cost-minimizing algorithms for matrix multiplication, polar decomposition, Cholesky decomposition and much more were devised by Hackbusch, his disciples and an increasing number of followers. The foundational work [12] is complemented by the informative survey [14].
The matrices belonging to the closure S π of a high-codimension stratum defined by the partition π = {I 1 , I 2 , . . ., I m } of {1, . . ., N } are hierarchical in an obvious way. Moreover, the compression Θ ↓ π : S π → C m×m naturally aligns with the main concepts of hierarchical matrix theory. We indicate here a few common trends.
The isogenic structure of A ∈ S π is reflected by the factorization A = W π BW * π , where B is an m × m matrix and W π ∈ C N ×m ; see Proposition 4.4. Thus every element of S π can be decomposed with a universal left factor W π and a right factor depending on m 2 complex parameters. Moreover, the weighted compression Θ ↓ π built on the same data (the partition π and the parameters determining each isogenic block) is a * -algebra homomorphism. Specifically, a matrix A ∈ S π has a multiplicative decomposition through W π , and the correspondence A → Θ ↓ π (A) preserves all matrix operations.

The lifting of an LU decomposition is similarly not quite immediate. As before, let B be the compression of the matrix A ∈ S π and suppose that B = LU , where L is a lower-triangular matrix and U is upper triangular. Then the liftings Θ ↑ π (L) and Θ ↑ π (U ) will only be block triangular, being elements of S π . To obtain a genuine LU decomposition, one has to depart from the isogenic structure and split the products of rank-one matrices in a consistent pattern; for example, one can factor the product of k × n and n × m isogenic matrices accordingly. Finally, on the lighter side, we state the following theorem to give a demonstration of the general paradigm for results one can derive using the compression-inflation procedure described above.

6.6. Coherent matrix organization. Complex multivariate data, such as those arising from the measurement of neuronal structures, cannot be handled without regularization and compression of the corresponding large matrices. Among the many techniques to analyze and transform such matrices, the method of coherent matrix organization proposed by Gavish and Coifman [10] stands out for its elegance and universality. These authors propose organizing a matrix using a natural metric which clusters the entries by proximity; the emerging tree of finer and finer partitions offers a canonical compression by averaging along strata, with tight control of the resulting approximation error.
The isogenic stratification of a matrix and the compression maps discussed above align perfectly with Gavish and Coifman's idea. Our purely theoretical setting is an extremal one, in the sense that we employ the discrete metric for clustering. However, it is notable that the isogenic structure and the associated inflation map offer a rapid matrix-completion algorithm within a prescribed stratum, in the spirit of recent advances in coherent matrix organization; see, for instance, [9, 11, 20]. We do not expand here on the clear consequences of this non-accidental similarity.
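The compression calculus used throughout this section is easy to experiment with. The sketch below models matrices of the closed stratum (for G = {1}) by constant-block matrices and checks that the weighted compression is multiplicative; the √(n i n j ) block-mean normalization is our assumption, chosen to match the orthonormality in Proposition 5.5:

```python
import numpy as np

def compress(A, sizes):
    # Weighted compression: sqrt(n_i * n_j) times each block mean
    # (assumed normalization).
    idx = np.cumsum([0] + sizes)
    m = len(sizes)
    return np.array([[np.sqrt(sizes[i] * sizes[j])
                      * A[idx[i]:idx[i+1], idx[j]:idx[j+1]].mean()
                      for j in range(m)] for i in range(m)])

def constant_blocks(vals, sizes):
    # Matrix whose (i, j) block is the constant vals[i][j]
    return np.block([[vals[i][j] * np.ones((sizes[i], sizes[j]))
                      for j in range(len(sizes))] for i in range(len(sizes))])

sizes = [2, 3]
A1 = constant_blocks([[1.0, 2.0], [0.5, -1.0]], sizes)
A2 = constant_blocks([[0.0, 1.5], [2.0, 1.0]], sizes)

# The weighted compression is multiplicative on constant-block matrices
lhs = compress(A1 @ A2, sizes)
rhs = compress(A1, sizes) @ compress(A2, sizes)
print(np.allclose(lhs, rhs))   # True
```

This is the homomorphism property in miniature: 5 × 5 products are faithfully tracked by 2 × 2 products.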

(3) Suppose (a)-(c) hold and G = C × . Then the off-diagonal entries of C lie in the open disc D(0, 1).
(4) If G ⊂ S 1 , then blocks in a single G-orbit automatically have rank at most one.

Theorem 6.4 ([24, Theorem 1]). Let B = (B ij ) m i,j=1 be an arbitrary Hermitian block matrix with B ij ∈ R n i ×n j for i, j = 1, . . ., m. Denote by π the partition of {1, . . ., n 1 + • • • + n m } associated with the block structure of B and let B ii := Σ ↓ π (B ii ) ∈ R be the arithmetic mean of the entries of the block B ii . If B ii ≥ B ii 1 n i ×n i (i = 1, . . ., m), then the following are equivalent.

6.3. Transversal matrix structures. The classical structures of Hankel and Toeplitz matrices interact with the isogenic block stratification in a very rigid manner, as we explain below.

Proposition 6.7. If the matrix A ∈ S π , then the compression Θ ↓ π (A) has the same spectrum and singular values as A, with the possible exception of the value zero.

Proof. Proposition 5.7 gives that σ(A) \ {0} = σ(Θ ↓ π (A)) \ {0}. Moreover, the singular values of A are the eigenvalues of its modulus |A| = √(A * A), and A * A and Θ ↓ π (A) * Θ ↓ π (A) have the same non-zero eigenvalues.

Lifting the polar decomposition is slightly more subtle. Let B = Θ ↓ π (A), where A ∈ S π . The polar decomposition B = U B |B| lifts immediately to A = V A |A|, but V A := Θ ↑ π (U B ) is only a partial isometry. There is, however, a unitary matrix U A ∈ C N ×N such that U A |A| = V A |A|. The initial and final spaces of V A are equal and contain the range of A, being spanned by {1 I k : k = 1, . . ., m}. If W is the orthogonal projection onto the orthogonal complement of this space, then U A = V A + W is as desired.
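A numerical sanity check of Proposition 6.7, again modelling matrices of S π by constant-block matrices and taking the compression to be the √(n i n j )-weighted block mean (our assumed normalization):

```python
import numpy as np

def compress(A, sizes):
    # Weighted compression: sqrt(n_i * n_j) times each block mean
    # (assumed normalization).
    idx = np.cumsum([0] + sizes)
    m = len(sizes)
    return np.array([[np.sqrt(sizes[i] * sizes[j])
                      * A[idx[i]:idx[i+1], idx[j]:idx[j+1]].mean()
                      for j in range(m)] for i in range(m)])

sizes = [2, 3]
A = np.block([[1.0 * np.ones((2, 2)), 2.0 * np.ones((2, 3))],
              [0.5 * np.ones((3, 2)), -1.0 * np.ones((3, 3))]])
B = compress(A, sizes)

def nonzero(values):
    # Sorted absolute values, with numerical zeros discarded
    return sorted(x for x in np.abs(values) if x > 1e-9)

# Same non-zero spectrum ...
print(np.allclose(nonzero(np.linalg.eigvals(A)), nonzero(np.linalg.eigvals(B))))
# ... and the same non-zero singular values
print(np.allclose(nonzero(np.linalg.svd(A, compute_uv=False)),
                  nonzero(np.linalg.svd(B, compute_uv=False))))
```

Both checks print True: the 5 × 5 matrix A and its 2 × 2 compression share all spectral data except the eigenvalue and singular value zero.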

Theorem 6.8. Let A, B and C be real matrices belonging to S π , where π is a partition of {1, 2, . . ., N }. The classical equations of Lyapunov, Sylvester and Riccati, AX + XA T = C, AX + XB = C and AX + XA T − XBX = C, have a solution X ∈ C N ×N , with X = X T in the Riccati case, if and only if the compressed equations have a solution in C m×m .
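In the Sylvester case, Theorem 6.8 is effective: one solves the small compressed equation and inflates the solution back. The sketch below models S π by constant-block matrices, takes the compression to be the √(n i n j )-weighted block mean with inflation as its adjoint, and solves the compressed equation by vectorization (all conventions and numerical values are our assumptions):

```python
import numpy as np

sizes = [2, 3]
idx = np.cumsum([0] + sizes)

def compress(A):
    # sqrt(n_i * n_j) times each block mean (assumed normalization)
    m = len(sizes)
    return np.array([[np.sqrt(sizes[i] * sizes[j])
                      * A[idx[i]:idx[i+1], idx[j]:idx[j+1]].mean()
                      for j in range(m)] for i in range(m)])

def inflate(B):
    # Adjoint inflation: block (i, j) is B_ij / sqrt(n_i * n_j) times all-ones
    return np.block([[B[i, j] / np.sqrt(sizes[i] * sizes[j])
                      * np.ones((sizes[i], sizes[j]))
                      for j in range(len(sizes))] for i in range(len(sizes))])

# Constant-block data A, B, C modelling elements of the closed stratum
A = inflate(np.array([[3.0, 1.0], [0.5, 2.0]]))
B = inflate(np.array([[4.0, 0.0], [1.0, 5.0]]))
C = inflate(np.array([[1.0, 2.0], [0.0, 1.0]]))

# Solve the compressed 2 x 2 Sylvester equation A'X' + X'B' = C' by
# vectorization: (I (x) A' + B'^T (x) I) vec(X') = vec(C')
Ac, Bc, Cc = compress(A), compress(B), compress(C)
m = Ac.shape[0]
M = np.kron(np.eye(m), Ac) + np.kron(Bc.T, np.eye(m))
Xc = np.linalg.solve(M, Cc.flatten('F')).reshape((m, m), order='F')

# Inflating the compressed solution solves the full 5 x 5 equation
X = inflate(Xc)
print(np.allclose(A @ X + X @ B, C))   # True
```

The lifted solution works because the inflation, with this normalization, is multiplicative and additive on compressed data.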
Fix integers M , N ≥ 1, a unital commutative ring R, and a multiplicative subgroup G ⊂ R × . Given any matrix A ∈ R M ×N , there exist unique minimal (that is, coarsest) partitions π min = {I 1 , . . ., I m } ∈ Π M and ̟ min = {J 1 , . . ., J n } ∈ Π N such that the entries of the block submatrix A I i ×J j lie in a single G-orbit, for all i ∈ {1, . . ., m} and j ∈ {1, . . ., n}. Now suppose M = N . If either A is symmetric, or R = C, G ⊂ S 1 , and A is Hermitian, then π min = ̟ min .