Mathematics of Control, Signals, and Systems, Volume 26, Issue 4, pp 481–518

Universal regular control for generic semilinear systems

Original Article


We consider discrete-time projective semilinear control systems \(\xi _{t+1} = A(u_t) \cdot \xi _t\), where the states \(\xi _t\) are in projective space \(\mathbb {R}\hbox {P}^{d-1}\), inputs \(u_t\) are in a manifold \(\mathcal {U}\) of arbitrary finite dimension, and \(A :\mathcal {U}\rightarrow \hbox {GL}(d,\mathbb {R})\) is a differentiable mapping. An input sequence \((u_0,\ldots ,u_{N-1})\) is called universally regular if for any initial state \(\xi _0 \in \mathbb {R}\hbox {P}^{d-1}\), the derivative of the time-\(N\) state with respect to the inputs is onto. In this paper, we deal with the universal regularity of constant input sequences \((u_0, \ldots , u_0)\). Our main result states that generically in the space of such systems, for sufficiently large \(N\), all constant inputs of length \(N\) are universally regular, with the exception of a discrete set. More precisely, the conclusion holds for a \(C^2\)-open and \(C^\infty \)-dense set of maps \(A\), and \(N\) only depends on \(d\) and on the dimension of \(\mathcal {U}\). We also show that the inputs on that discrete set are nearly universally regular; indeed, there is a unique non-regular initial state, and its corank is 1. In order to establish the result, we study the spaces of bilinear control systems. We show that the codimension of the set of systems for which the zero input is not universally regular coincides with the dimension of the control space. The proof is based on careful matrix analysis and some elementary algebraic geometry. Then the main result follows by applying standard transversality theorems.


Keywords: Discrete-time systems · Semilinear systems · Bilinear systems · Universal regular control

1 Introduction

1.1 Basic definitions and some questions

Consider discrete-time control systems of the form:
$$\begin{aligned} x_{t+1} = F(x_t,u_t), \qquad (t = 0,1,2, \ldots ) \end{aligned} \tag{1.1}$$
where \(F :\mathcal {X}\times \mathcal {U}\rightarrow \mathcal {X}\) is a map. We will always assume that the space \(\mathcal {X}\) of states and the space \(\mathcal {U}\) of controls are manifolds and that the map \(F\) is continuously differentiable.
A sequence \((x_0, \ldots , x_N ; u_0, \ldots , u_{N-1})\) satisfying (1.1) is called a trajectory of length \(N\); it is uniquely determined by the initial state \(x_0\) and the input \((u_0,\ldots ,u_{N-1})\). Let \(\phi _N\) denote the time-\(N\) transition map, which gives the final state as a function of the initial state and the input:
$$\begin{aligned} x_N = \phi _N(x_0; u_0, \ldots , u_{N-1}). \end{aligned}$$
We say that the system (1.1) is accessible from \(x_0\) in time \(N\) if the set \(\phi _N(\{x_0\} \,\times \, \mathcal {U}^N)\) of final states that can be reached from the initial state \(x_0\) has nonempty interior.

The implicit function theorem gives a sufficient condition for accessibility. If the derivative of the map \(\phi _N(x_0; \cdot )\) at input \((u_0,\ldots ,u_{N-1})\) is an onto linear map, then we say that the trajectory determined by \((x_0; u_0, \ldots , u_{N-1})\) is regular. So the existence of such a regular trajectory implies that the system is accessible from \(x_0\) in time \(N\).
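As a sketch of why regularity is a first-order condition, the chain rule gives the partial derivative of \(\phi _N\) with respect to the \(t\)-th control along a trajectory \((x_0, \ldots , x_N ; u_0, \ldots , u_{N-1})\). The symbols \(D_x F\) and \(D_u F\) (the partial derivatives of \(F\) with respect to the state and the control) are notation introduced here for illustration only:
$$\begin{aligned} \frac{\partial \phi _N}{\partial u_t}(x_0; u_0, \ldots , u_{N-1}) = D_x F(x_{N-1},u_{N-1}) \circ \cdots \circ D_x F(x_{t+1},u_{t+1}) \circ D_u F(x_t,u_t), \qquad 0 \leqslant t \leqslant N-1, \end{aligned}$$
where for \(t = N-1\) the composition of the \(D_x F\) factors is empty. The trajectory is regular exactly when the images of these \(N\) linear maps together span the tangent space of \(\mathcal {X}\) at \(x_N\).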

Let us call an input \((u_0, \ldots , u_{N-1})\) universally regular if for every \(x_0 \in \mathcal {X}\), the trajectory determined by \((x_0; u_0, \ldots , u_{N-1})\) is regular; otherwise, the input is called singular.

The concept of universal regularity is central in this paper; it was introduced by Sontag in [13] in the context of continuous-time control systems. The discrete-time analog was considered by Sontag and Wirth in [14]. They showed that if the system (1.1) is accessible from every initial condition \(x_0\) in uniform time \(N\), then universally regular inputs do exist, provided one assumes the map \(F\) to be analytic. In fact, under those hypotheses, they showed that universally regular inputs are abundant: in the space of inputs of sufficiently large length, singular ones form a set of positive codimension.

In this paper, we are interested in control systems (1.1) where the next state \(x_{t+1}\) depends linearly on the previous state \(x_t\) (but nonlinearly on \(u_t\), in general). This means that the state space is \(\mathbb {K}^{d}\), where \(\mathbb {K}\) is either \(\mathbb {R}\) or \(\mathbb {C}\) and that (1.1) now takes the form:
$$\begin{aligned} x_{t+1} = A(u_t) \cdot x_t, \qquad \hbox {where } A :\mathcal {U}\rightarrow \hbox {Mat}_{d \times d}(\mathbb {K}). \end{aligned} \tag{1.3}$$
Following [4], we call this a semilinear control system.
In the case that the map \(A\) above takes values in the set \(\hbox {GL}(d,\mathbb {K})\) of invertible matrices of size \(d \geqslant 2\), we consider the corresponding projectivized control system:
$$\begin{aligned} \xi _{t+1} = A(u_t) \cdot \xi _t, \end{aligned} \tag{1.4}$$
where the states \(\xi _t\) take values in the projective space \(\mathbb {K}\hbox {P}^{d-1} = \mathbb {K}^d_* / \mathbb {K}_*\). We call this a projective semilinear control system. The projectivized system is also a useful tool for the study of the original system (1.3): see e.g., [5, 15].

Universally regular inputs for projective semilinear control systems were first considered by Wirth in [15]. Under his working hypotheses, the existence and abundance of such inputs is guaranteed by the aforementioned result of [14]; then he uses universally regular inputs to obtain global controllability properties.

The purpose of this paper is to establish results on the existence and abundance of universally regular inputs for projective semilinear control systems. Unlike [14, 15], we will not necessarily assume our systems to be analytic. Let us consider systems (1.4) with \(\mathbb {K}=\mathbb {R}\) and \(A :\mathcal {U}\rightarrow \hbox {GL}(d,\mathbb {R})\), a map of class \(C^r\), for some fixed \(r\geqslant 1\). To compensate for less rigidity, we do not try to obtain results that work for all \(C^r\) maps \(A\), but only for generic ones, i.e., those maps in a residual (dense \(G_\delta \)) subset, or, even better, in an open dense subset.

To make things more precise, assume \(\mathcal {U}\) is a \(C^\infty \) (real) manifold without boundary. All manifolds are assumed to be Hausdorff paracompact with a countable base of open sets, and of finite dimension. We will always consider the space \(C^r(\mathcal {U}, \hbox {GL}(d,\mathbb {R}))\) endowed with the strong \(C^r\) topology (which coincides with the usual uniform \(C^r\) topology in the case that \(\mathcal {U}\) is compact).

Hence, the first question we pose is this:

Taking \(N\) sufficiently large, is it true that for \(C^r\)-generic maps \(A\), the set of universally regular inputs in \(\mathcal {U}^N\) is itself generic?

It turns out that this question has a positive answer. Actually, in a work in preparation, we show that for generic maps \(A\), all inputs in \(\mathcal {U}^N\) are universally regular, except for those in a stratified closed set of positive codimension. So another natural question is this:

For fixed parameters \(d\), \(\dim \mathcal {U}\), \(N\), and \(r\), what is the minimum codimension of the set of singular inputs in \(\mathcal {U}^N\) that can occur for \(C^r\)-generic maps \(A :\mathcal {U}\rightarrow \hbox {GL}(d,\mathbb {R})\)?

In full generality, this question seems to be very difficult. A simpler setting would be to restrict to non-resonant inputs, namely those inputs \((u_0,\ldots ,u_{N-1})\) such that \(u_i \ne u_j\) whenever \(i \ne j\). In this paper, we consider the most resonant case. Define a constant input of length \(N\) as an element of \(\mathcal {U}^N\) of the form \((u_0, u_0, \ldots , u_0)\). We propose to study the universal regularity of inputs of this form.

1.2 The main result

We prove that generically the singular constant inputs form a very small set:

Theorem 1.1

Given \(d\geqslant 2\) and \(m \geqslant 1\), there exists an integer \(N\) with \(1 \leqslant N \leqslant d^2\) such that the following properties hold. Let \(\mathcal {U}\) be a smooth \(m\)-dimensional manifold without boundary. Then there exists a \(C^2\)-open \(C^\infty \)-dense subset \(\mathcal {O}\) of \(C^2(\mathcal {U}, \hbox {GL}(d,\mathbb {R}))\) such that for every system (1.4) with \(A \in \mathcal {O}\), all constant inputs of length \(N\) are universally regular, except for those in a zero-dimensional (i.e., discrete) set.

By saying that a subset \(\mathcal {O}\) of \(C^2(\mathcal {U}, \hbox {GL}(d,\mathbb {R}))\) is \(C^\infty \)-dense, we mean that for all \(r \geqslant 2\), the intersection of \(\mathcal {O}\) with \(C^r(\mathcal {U}, \hbox {GL}(d,\mathbb {R}))\) is dense in \(C^r(\mathcal {U}, \hbox {GL}(d,\mathbb {R}))\).

It is remarkable that the generic dimension of the set of singular constant inputs (namely, 0) does not depend on the dimension \(m\) of the control space \(\mathcal {U}\), nor on the dimension \(d-1\) of the state space. A partial explanation for this phenomenon is the following: First, the obstruction to universal regularity of the input \((u,u,\ldots ,u)\) is the combined degeneracy of the matrix \(A(u)\) and of the derivatives of \(A\) at \(u\). If \(m\) is small, then the image of the generic map \(A\) will avoid too degenerate matrices, which increases the chances of obtaining universal regularity. If \(m\) is large, then more degenerate matrices \(A(u)\) will inevitably appear; however, the large number of control parameters compensates, so universal control is still likely.

The singular inputs that appear in Theorem 1.1 are not only rare; we also show that they are “almost” universally regular:

Theorem 1.2

(Addendum to Theorem 1.1) The set \(\mathcal {O}\subset C^2(\mathcal {U},\hbox {GL}(d,\mathbb {R}))\) in Theorem 1.1 can be taken with the following additional properties: If \(A \in \mathcal {O}\) and a constant input \((u,\ldots ,u)\) of length \(N\) is singular then:
  1. There is a single direction \(\xi _0 \in \mathbb {R}\hbox {P}^{d-1}\) for which the corresponding trajectory of system (1.4) is not regular.

  2. The derivative of the map \(\phi _N(\xi _0; \cdot )\) at input \((u,\ldots ,u)\) has corank 1.


To sum up, for generic systems (1.4), the universal regularity of constant inputs can fail only in the weakest possible way: there is at most one non-regular state, which can be moved in all directions but one.

In Appendix E (ESM), we describe precisely the singular inputs that appear in Theorem 1.2. We show that these singular inputs can be unremovable by perturbations, and therefore, Theorem 1.1 is optimal in the sense that there are \(C^2\)-open (actually even \(C^1\)-open) sets of maps \(A\) for which the set of singular constant inputs is nonempty. Also, by \(C^1\)-perturbing any \(A\) in those \(C^2\)-open sets, one can obtain an infinite number of singular constant inputs. In particular, the set \(\mathcal {O}\) in the statement of Theorem 1.1 is not \(C^1\)-open in general.

1.3 Reduction to the study of the set of poor data

The bulk of the proof of Theorem 1.1 consists of the computation of the dimension of certain canonical sets, as we now explain.

We fix \(A :\mathcal {U}\rightarrow \hbox {GL}(d,\mathbb {K})\) and consider the projective semilinear system (1.4). By the chain rule, the universal regularity of an input \((u_0, u_1, \ldots , u_{N-1})\) depends only on the 1-jets of \(A\) at points \(u_0\), ..., \(u_{N-1}\), i.e., on the first order Taylor approximations of \(A\) around those points.

Let us discuss the case of constant inputs \((u_0, \ldots , u_0)\). If we take local coordinates such that \(u_0 = 0\) and replace the matrix map \(A :\mathcal {U}\rightarrow \hbox {GL}(d,\mathbb {K})\) by its linear approximation, system (1.4) becomes:
$$\begin{aligned} \xi _{t+1} = \left( A + \sum \limits _{j=1}^m u_{t,j} C_j \right) \xi _t, \quad (t = 0, 1, 2, \ldots ) , \end{aligned} \tag{1.5}$$
where \(A = A(u_0)\) and \(C_1\), ..., \(C_m\) are the partial derivatives at \(u_0=0\). This is the projectivization of a bilinear control system (see [6]). For these systems, the zero input is a distinguished one and the focus of more attention.
To study system (1.5), it is actually more convenient to consider normalized derivatives \(B_j = C_j A^{-1}\), which intrinsically take values in the Lie algebra \(\mathfrak {gl}(d,\mathbb {K})\). Consider the matrix datum \(\mathbf {A}= (A, B_{1}, \ldots , B_{m})\). We will explain how the universal regularity of the zero input is expressed in linear algebraic terms. Recall that the adjoint operator of \(A\) acts on \(\mathfrak {gl}(d,\mathbb {K})\) by the formula \(\hbox {Ad}_A(B) = A B A^{-1}\). Consider the linear subspace \(\Lambda _N(\mathbf {A})\) of \(\mathfrak {gl}(d,\mathbb {K})\) spanned by the matrices
$$\begin{aligned} \hbox {Id}\quad \hbox {and} \quad (\hbox {Ad}_A)^i(B_j), \quad (i=0,\ldots ,N-1, j=1,\ldots , m). \end{aligned}$$
(The identity matrix appears because of the projectivization.) This is nothing but the reachable set from 0 for the linear control system \((\hbox {Ad}_A, \hbox {Id}, B_1, \ldots , B_m)\). Then:

Proposition 1.3

The constant input \((0,\ldots ,0)\) of length \(N\) is universally regular for system (1.5) if and only if the space \(\Lambda _N(\mathbf {A})\) is transitive.

Here we say that a subspace of \(d \times d\) matrices with entries in the field \(\mathbb {K}\) is transitive if it acts transitively on the set \(\mathbb {K}^d_*\) of nonzero vectors.
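To make the criterion of Proposition 1.3 concrete, the following minimal Python sketch builds the generators of \(\Lambda _N(\mathbf {A})\) and computes \(\dim [\Lambda _N(\mathbf {A}) \cdot v]\) for a given vector \(v\), using exact rational arithmetic. The function names (`lambda_generators`, `orbit_dim`) and the sample datum are assumptions made here for illustration; they do not come from the paper. Note that evaluating finitely many vectors can only detect a failure of transitivity; it does not prove transitivity.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse(M):
    """Gauss-Jordan inverse over the rationals (M is assumed invertible)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination on a copy."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(rows[0]) if rows else 0):
        p = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if p is None:
            continue
        rows[rk], rows[p] = rows[p], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][c] != 0:
                f = rows[r][c] / rows[rk][c]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def lambda_generators(A, Bs, N):
    """Generators Id and Ad_A^i(B_j), 0 <= i < N, of the space Lambda_N."""
    d = len(A)
    A_inv = inverse(A)
    gens = [[[Fraction(int(i == j)) for j in range(d)] for i in range(d)]]
    for B in Bs:
        M = [[Fraction(x) for x in row] for row in B]
        for _ in range(N):
            gens.append(M)
            M = matmul(matmul(A, M), A_inv)  # Ad_A(M) = A M A^{-1}
    return gens

def orbit_dim(gens, v):
    """Dimension of the subspace {L v : L in span(gens)}."""
    d = len(v)
    return rank([[sum(G[i][j] * v[j] for j in range(d)) for i in range(d)]
                 for G in gens])

# Hypothetical datum with d = 2 and m = 1 (chosen only for illustration).
A = [[1, 0], [0, 2]]
B = [[0, 1], [1, 0]]
dim_N1 = orbit_dim(lambda_generators(A, [B], 1), [1, 1])  # 1: Lambda_1 is not transitive
dim_N2 = orbit_dim(lambda_generators(A, [B], 2), [1, 1])  # 2: the defect disappears for N = 2
```

In this example \(\Lambda _1\) is spanned by \(\hbox {Id}\) and \(B\), which both map \((1,1)\) to itself, so the length-1 constant input fails to be universally regular, while adding \(\hbox {Ad}_A(B)\) removes the defect at that vector.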

Clearly, the spaces \(\Lambda _N(\mathbf {A})\) form a nested sequence that stabilizes to a space \(\Lambda (\mathbf {A})\) at some time \(N \leqslant d^2\). If \(\Lambda (\mathbf {A})\) is transitive, then the datum \(\mathbf {A}\) is called rich; otherwise it is called poor. Let \(\mathcal {P}_m^{(\mathbb {K})} = \mathcal {P}_{m,d}^{(\mathbb {K})}\) denote the set of poor data. A major part of our work is to study these sets. We prove:

Theorem 1.4

The set \(\mathcal {P}_{m}^{(\mathbb {R})}\) is closed and semialgebraic, and its codimension in \(\hbox {GL}(d,\mathbb {R}) \times (\mathfrak {gl}(d,\mathbb {R}))^m\) is \(m\).

Theorem 1.5

The set \(\mathcal {P}_{m}^{(\mathbb {C})}\) is algebraic, and its (complex) codimension in \(\hbox {GL}(d,\mathbb {C}) \times (\mathfrak {gl}(d,\mathbb {C}))^m\) is \(m\).

So Theorems 1.4 and 1.5 say how frequent universal regularity of the zero input is in the space of projective bilinear control systems (1.5).

1.4 Overview of the proofs

Theorem 1.1 follows rather directly from Theorem 1.4 by applying standard results from transversality theory. More precisely, the fact that the set \(\mathcal {P}_m^{(\mathbb {R})}\) is semialgebraic implies that it has a canonical stratification. This permits us to apply Thom’s jet transversality theorem and obtain Theorem 1.1.

On the other hand, Theorem 1.4 follows from its complex version Theorem 1.5 by simple abstract arguments.

Thus, everything is based on Theorem 1.5. One part of the result is easily obtained: we give examples of small disks of codimension \(m\) formed by poor data, so concluding that the codimension of \(\mathcal {P}_{m}^{(\mathbb {C})}\) is at most \(m\).

To prove the other inequality, one could try to exhibit an explicit codimension \(m\) set containing all poor data. For \(m=1\), this task is feasible (and we actually perform it, because with these conditions we can actually check universal regularity in concrete examples). However, for \(m=2\) already the task would be very laborious, and to expect to find a general solution seems unrealistic.

Our actual approach to prove the lower bound on the codimension of \(\mathcal {P}_{m}^{(\mathbb {C})}\) is indirect. Crudely speaking, after careful matrix computations, we find some sets in the complement of \(\mathcal {P}_{m}^{(\mathbb {C})}\) that are reasonably “large” (basically in terms of dimension). Then, by using some abstract results of algebraic geometry, we are able to show that \(\mathcal {P}_{m}^{(\mathbb {C})}\) is “small,” thus proving the other half of Theorem 1.5.

Let us give more detail about this strategy. We decompose the set \(\mathcal {P}_m = \mathcal {P}_{m}^{(\mathbb {C})}\) into fibers:
$$\begin{aligned} \mathcal {P}_m =\bigcup _{A \in \hbox {GL}(d,\mathbb {C})} \{A\} \times \mathcal {P}_m(A) , \qquad \mathcal {P}_m(A) \subset [\mathfrak {gl}(d,\mathbb {C})]^m. \end{aligned}$$
It is not very difficult to show that for generic \(A\) in \(\hbox {GL}(d,\mathbb {C})\), the fiber \(\mathcal {P}_m(A)\) has precisely the wanted codimension \(m\). However, for degenerate matrices \(A\), the fiber \(\mathcal {P}_m(A)\) may be much bigger. (For example, one can show that if \(A\) is a homothety and \(m \leqslant 2d-3\), then \(\mathcal {P}_m(A)\) is the whole \([\mathfrak {gl}(d,\mathbb {C})]^m\).) In order to show that \({{\mathrm{codim }}}\mathcal {P}_m \geqslant m\), we need to make sure that those degenerate matrices do not form a large set. More precisely, we show that:
$$\begin{aligned} \forall k\in \{0,\ldots ,m\},\,{{\mathrm{codim }}}\big \{ A \in \hbox {GL}(d,\mathbb {C}) ; \; {{\mathrm{codim }}}\mathcal {P}_m(A) \leqslant m-k \big \} \geqslant k. \end{aligned} \tag{1.6}$$
Let us explain how we prove (1.6). In order to estimate the dimension of \(\mathcal {P}_m(A)\) for any matrix \(A \in \hbox {GL}(d,\mathbb {C})\), we consider a quantity \(r = r(A)\) which is the least number such that a rich datum of the form \((A,C_1,\ldots ,C_r)\) exists. In particular, if \(r = r(A) \leqslant m\), then the following affine space
$$\begin{aligned} \big \{ (C_1, C_2, \ldots , C_r , B_{r+1}, \ldots , B_m ) ; \; B_j \in \mathfrak {gl}(d,\mathbb {C}) \big \} \end{aligned} \tag{1.7}$$
is contained in the complement of \(\mathcal {P}_m(A)\).
In certain situations, if two algebraic subsets have large enough dimensions, then they necessarily intersect; for example, two algebraic curves in the complex projective plane \(\mathbb {C}\hbox {P}^2\) always intersect. This kind of phenomenon happens here: the dimension of the affine space (1.7) forces a lower bound for the codimension of \(\mathcal {P}_m(A)\), namely:
$$\begin{aligned} {{\mathrm{codim }}}\mathcal {P}_m(A) \geqslant m+1-r(A). \end{aligned} \tag{1.8}$$
So we need to show that matrices \(A\) with large \(r(A)\) are rare. A careful matrix analysis provides an upper bound to \(r(A)\) based on the numbers and sizes of the Jordan blocks of \(A\), and on the occasional algebraic relations between the eigenvalues. This bound together with (1.8) implies (1.6) and therefore concludes the proof of Theorem 1.5.

In fact, the results of this analysis are even better, and we conclude that the codimension inequality (1.6) is strict when \(k \geqslant 1\). This implies that poor data \((A, B_1, \ldots , B_m)\) for which the matrix \(A\) is degenerate form a subset of \(\mathcal {P}_{m}^{(\mathbb {C})}\) with strictly bigger codimension. Thus, we can show that the poor data that appear generically are well behaved, which leads to Theorem 1.2.

1.5 Holomorphic setting

In the case of complex matrices (i.e., \(\mathbb {K}= \mathbb {C}\)), we have a corresponding version of Theorem 1.1 where the maps \(A\) are holomorphic. Given an open subset \(\mathcal {U}\subset \mathbb {C}^m\), we denote by \(\mathcal {H}(\mathcal {U}, \hbox {GL}(d,\mathbb {C}))\) the set of holomorphic mappings \(A :\mathcal {U}\rightarrow \hbox {GL}(d,\mathbb {C})\) endowed with the usual topology of uniform convergence on compact sets.

Theorem 1.6

Given integers \(d\geqslant 2\) and \(m \geqslant 1\), there exists an integer \(N\geqslant 1\) with the following properties. Let \(\mathcal {U}\subset \mathbb {C}^m\) be open, and let \(K\subset \mathcal {U}\) be compact. Then there exists an open and dense subset \(\mathcal {O}\) of \(\mathcal {H}(\mathcal {U}, \hbox {GL}(d,\mathbb {C}))\) such that for any \(A \in \mathcal {O}\) the constant inputs in \(K^N\) are all universally regular for the system (1.4), except for a finite subset.

We have the straightforward corollary:

Corollary 1.7

Given integers \(d\geqslant 2\) and \(m \geqslant 1\), there exists an integer \(N\geqslant 1\) with the following properties. Let \(\mathcal {U}\subset \mathbb {C}^m\) be an open subset. There exists a residual subset \(\mathcal {R}\) of \(\mathcal {H}(\mathcal {U}, \hbox {GL}(d,\mathbb {C}))\) such that for any \(A \in \mathcal {R}\) the constant inputs in \(\mathcal {U}^N\) are all universally regular for the system (1.4), except for a discrete subset.

1.6 Directions for future research

One can also study universal regularity of periodic inputs of higher period. Using our results for constant inputs, it is not difficult to derive some (non-sharp) codimension bounds for singular periodic inputs for generic systems. However, for highly resonant non-periodic inputs, we have no idea how to obtain reasonable dimension estimates.

To obtain good estimates for the codimension of non-resonant, singular inputs for generic systems is relatively simpler from the point of view of matrix computations, but needs more sophisticated transversality theorems (e.g., multijet transversality). Since highly resonant inputs have large codimension themselves, it seems possible to obtain reasonably good codimension estimates for general inputs for generic systems.

Another interesting direction of research is to consider other Lie groups of matrices.

1.7 Organization of the paper

Section 2 contains some basic results about transitivity of spaces of matrices and its relation to universal regularity. We also obtain the easy parts of Theorems 1.4 and 1.5, namely (semi)algebraicity and the upper codimension inequalities.

In Sect. 3, we introduce the concept of rigidity, which is related to the quantity \(r(A)\) mentioned above. We state the central rigidity estimates (Theorem 3.6), which consist of two parts. The first and easier part is proved in Sect. 3 itself, while the whole of Sect. 4 is devoted to the proof of the second part.

Section 5 starts with some preliminaries in elementary algebraic geometry. Then we use the rigidity estimates to prove Theorem 1.5, following the strategy outlined above (§ 1.4). Theorem 1.4 follows easily. We also obtain a lemma that is needed for the proof of Theorem 1.2.

In Sect. 6, we deduce Theorem 1.1 from previous results and standard theorems on stratifications and transversality.

The paper also has some appendices (Electronic Supplementary Material):

Appendix A (ESM) basically reobtains the major results in the special case \(m=1\), where we actually gain additional information of practical value: as mentioned in Sect. 1.4, it is possible to describe explicitly what 1-jets the map \(A\) should avoid in order to satisfy the conclusions of Theorems 1.1 and 1.2. The arguments necessary for the \(m=1\) case are much simpler and more elementary than those in Sects. 3, 4 and 5. Therefore, that appendix is also useful to give the reader some intuition about the general problem, and as a source of examples. Appendix A (ESM) is written in a slightly informal way, and it can be read after Sect. 2 (though the final part requires Lemma 3.1).

Appendix B (ESM) contains the proofs of necessary algebraic–geometric results, especially the one that allows us to obtain estimate (1.8).

Appendix C (ESM) reviews the necessary concepts and results on stratifications and proves a prerequisite transversality proposition.

In Appendix D (ESM), we apply Theorem 1.5 to prove a version of Theorem 1.1 for holomorphic mappings.

In Appendix E (ESM), we study the singular constant inputs of generic type, proving Theorem 1.2 and the other assertions made at the end of Sect. 1.2 concerning the sharpness of Theorem 1.1. We also discuss the generic validity of some control-theoretic properties related to accessibility and regularity.

2 Preliminary facts on the poor data

In this section, we review some basic properties related to poorness and prove the easy inequalities in Theorems 1.4 and 1.5.

2.1 Transitive spaces

Let \(E\) and \(F\) be finite-dimensional vector spaces over the field \(\mathbb {K}\). Let \(\mathcal {L}(E,F)\) be the space of linear maps from \(E\) to \(F\). A vector subspace \(\Lambda \) of \(\mathcal {L}(E,F)\) is called transitive if for every \(v \in E {\backslash } \{0\}\), we have \(\Lambda \cdot v = F\), where \(\Lambda \cdot v = \{ L(v) ; \; L \in \Lambda \}\).

Under the identification \(\mathcal {L}(\mathbb {K}^n, \mathbb {K}^m) = \hbox {Mat}_{m \times n}(\mathbb {K})\), we may also speak of transitive spaces of matrices.

The following examples illustrate the concept; they will also be needed in later considerations.

Example 2.1

Recall that a Toeplitz matrix, resp. a Hankel matrix, is a matrix of the form \((t_{j-i})_{i,j}\), resp. \((h_{i+j})_{i,j}\); that is, its entries are constant along each diagonal, resp. each anti-diagonal. The set of Toeplitz matrices and the set of complex Hankel matrices constitute examples of transitive subspaces of \(\mathfrak {gl}(d,\mathbb {K})\). Transitivity of the Toeplitz space is a particular case of Example 2.2, and transitivity of the Hankel space follows from Remark 2.3. For \(\mathbb {K}= \mathbb {C}\), these spaces are optimal, in the sense that they have the least possible dimension; see [1].
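As an illustration of the definition of transitivity, the following Python sketch spot-checks that the Toeplitz and Hankel spaces send every nonzero vector of a small test grid onto the whole space. The helper names (`toeplitz_basis`, `hankel_basis`, `orbit_dim`), the choice \(d = 3\), and the test grid are assumptions made here; a finite check of this kind is only a sanity test, not a proof.

```python
from fractions import Fraction
from itertools import product

def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination on a copy."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(rows[0]) if rows else 0):
        p = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if p is None:
            continue
        rows[rk], rows[p] = rows[p], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][c] != 0:
                f = rows[r][c] / rows[rk][c]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def toeplitz_basis(d):
    """One basis matrix per diagonal j - i = k, for k = -(d-1), ..., d-1."""
    return [[[int(j - i == k) for j in range(d)] for i in range(d)]
            for k in range(-(d - 1), d)]

def hankel_basis(d):
    """One basis matrix per anti-diagonal i + j = s, for s = 0, ..., 2d-2."""
    return [[[int(i + j == s) for j in range(d)] for i in range(d)]
            for s in range(2 * d - 1)]

def orbit_dim(basis, v):
    """Dimension of {L v : L in span(basis)}."""
    d = len(v)
    return rank([[sum(M[i][j] * v[j] for j in range(d)) for i in range(d)]
                 for M in basis])

d = 3
test_vectors = [v for v in product((-1, 0, 1), repeat=d) if any(v)]
toeplitz_ok = all(orbit_dim(toeplitz_basis(d), v) == d for v in test_vectors)
hankel_ok = all(orbit_dim(hankel_basis(d), v) == d for v in test_vectors)
```

Both spaces have dimension \(2d-1\) (here 5), and both flags evaluate to `True` on this grid, since each basis matrix acts on \(v\) as a (possibly reversed) shift.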

Example 2.2

A generalized Toeplitz space is a subspace \(\Lambda \) of \(\hbox {Mat}_{d\times d}(\mathbb {K})\) (where \(d \geqslant 2\)) with the following property: For any two matrix entries \((i_1,j_1)\) and \((i_2,j_2)\) which are not in the same diagonal (i.e., \(i_1-j_1 \ne i_2-j_2\)), the linear map \((b_{i,j})_{i,j} \in \Lambda \mapsto (b_{i_1,j_1},b_{i_2,j_2}) \in \mathbb {K}^2\) is onto. Equivalently, a space is generalized Toeplitz if it can be defined by a number of linear relations between the matrix coefficients so that each relation involves only the entries on the same diagonal, and so that the relations do not force any matrix entry to be zero. We will prove later (see Sect. 3.3) that every generalized Toeplitz space is transitive.

Remark 2.3

If \(\Lambda \) is a transitive subspace of \(\mathcal {L}(E,F)\) and \(P \in \mathcal {L}(E,E)\), \(Q \in \mathcal {L}(F,F)\) are invertible operators, then \(Q \cdot \Lambda \cdot P := \{QLP ; \; L \in \Lambda \}\) is a transitive subspace of \(\mathcal {L}(E,F)\).
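For instance, the Hankel space of Example 2.1 is obtained from the Toeplitz space in this way. As a sketch, let \(J\) denote the \(d \times d\) reversal permutation matrix (notation introduced here for illustration), with \(J_{i,j} = 1\) if \(i+j = d+1\) and \(J_{i,j} = 0\) otherwise, and let \(T = (t_{j-i})_{i,j}\) be a Toeplitz matrix. Then
$$\begin{aligned} (TJ)_{i,j} = \sum \limits _{k=1}^d T_{i,k} J_{k,j} = T_{i,d+1-j} = t_{d+1-i-j}, \end{aligned}$$
which depends only on \(i+j\), so \(TJ\) is a Hankel matrix. Conversely, since \(J^2 = \hbox {Id}\), every Hankel matrix \(H\) equals \((HJ)J\) with \(HJ\) a Toeplitz matrix. Hence the remark applies with one of the two invertible operators equal to \(J\) and the other equal to the identity.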

Let us see that transitivity is a semialgebraic or algebraic property, according to the field. Recall that:
  • A subset of \(\mathbb {K}^n\) is called algebraic if it is expressed by polynomial equations with coefficients in \(\mathbb {K}\).

  • A subset of \(\mathbb {R}^n\) is called semialgebraic if it is the union of finitely many sets, each of them defined by finitely many real polynomial equations and inequalities (see [2, 3]).

Proposition 2.4

Let \(\mathcal {N}_{m,n,k}^{(\mathbb {K})}\) be the set of \((B_1, \ldots , B_k) \in [\hbox {Mat}_{m \times n}(\mathbb {K})]^k=\mathbb {K}^{mnk}\) such that \({{\mathrm{span }}}\{B_1, \ldots , B_k\}\) is not transitive. Then:
  1. The set \(\mathcal {N}_{m,n,k}^{(\mathbb {R})}\) is semialgebraic.

  2. The set \(\mathcal {N}_{m,n,k}^{(\mathbb {C})}\) is algebraic.



Proof

Consider the set of \((B_1, \ldots , B_k, v) \in [\hbox {Mat}_{m \times n}(\mathbb {K})]^k \times \mathbb {K}^n_*\) such that
$$\begin{aligned} {{\mathrm{span }}}\{B_1, \ldots , B_k\}\cdot v \ne \mathbb {K}^m . \end{aligned}$$
For \(\mathbb {K}= \mathbb {R}\), this is a semialgebraic set, because it is expressed by the vanishing of certain determinants plus the condition \(v \ne 0\). Projecting this set along the \(\mathbb {R}^n_*\) fiber we obtain \(\mathcal {N}_{m,n,k}^{(\mathbb {R})}\); so, by the Tarski–Seidenberg theorem (see [2, p. 60] or [3, p. 26]), this set is semialgebraic, proving part 1.
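The determinant condition used here can be spelled out as follows; the matrix \(M(v)\) is notation introduced for this illustration only. Collect the vectors \(B_i v\) as columns:
$$\begin{aligned} M(v) = \big [ B_1 v \;\big |\; \cdots \;\big |\; B_k v \big ] \in \hbox {Mat}_{m \times k}(\mathbb {K}), \qquad {{\mathrm{span }}}\{B_1, \ldots , B_k\}\cdot v = \hbox {column space of } M(v). \end{aligned}$$
Hence \({{\mathrm{span }}}\{B_1, \ldots , B_k\}\cdot v \ne \mathbb {K}^m\) if and only if \({{\mathrm{rank }}}M(v) < m\), that is, if and only if every \(m \times m\) minor of \(M(v)\) vanishes (the condition is automatic when \(k < m\)). Each such minor is a polynomial in the entries of \(B_1\), ..., \(B_k\) and \(v\).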

To see part 2, we take \(\mathbb {K}=\mathbb {C}\) and projectivize the \(\mathbb {C}^n_*\) fiber, obtaining an algebraic subset of \([\hbox {Mat}_{m \times n}(\mathbb {C})]^k \times \mathbb {C}\hbox {P}^{n-1}\) whose projection along the \(\mathbb {C}\hbox {P}^{n-1}\) fiber is \(\mathcal {N}_{m,n,k}^{(\mathbb {C})}\). So part 2 follows from the fact that projections along projective fibers take algebraic sets to algebraic sets (see [12, p. 58]).

Complex transitivity of real matrices is a stronger property than real transitivity:

Proposition 2.5

The real part of \(\mathcal {N}_{m,n,k}^{(\mathbb {C})}\) (that is, its intersection with \([\hbox {Mat}_{m \times n}(\mathbb {R})]^k\)) contains \(\mathcal {N}_{m,n,k}^{(\mathbb {R})}\).

The proof is an easy exercise.

2.2 Universal regularity for constant inputs and richness

In this subsection, we prove Proposition 1.3; in fact we prove a more precise result, and also fix some notation.

Given a linear operator \(H :E \rightarrow E\), where \(E\) is a finite-dimensional vector space over the field \(\mathbb {K}\), and vectors \(v_1\), ..., \(v_m \in E\), we denote by \(\mathfrak {R}^N_H(v_1,\ldots ,v_m)\) the space spanned by the family of vectors \(H^t(v_i)\), where \(1\leqslant i \leqslant m\) and \(0 \leqslant t < N\). In other words, \(\mathfrak {R}^N_H(v_1,\ldots ,v_m)\) is the reachable set from 0 of the linear control system
$$\begin{aligned} \xi _{t+1} = H\xi _t + \sum \limits _i u_{t,i} v_i . \end{aligned}$$
The sequence of spaces \(\mathfrak {R}^N_H(v_1,\ldots ,v_m)\) is nested nondecreasing and thus stabilizes to a space \(\mathfrak {R}_H(v_1,\ldots ,v_m)\) after \(N \leqslant \dim E\) steps.
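The stabilization of these reachable spaces can be observed numerically. The following Python sketch (function names and sample operators are assumptions chosen for illustration, not taken from the text) computes \(\dim \mathfrak {R}^N_H(v_1,\ldots ,v_m)\) for \(N = 1, 2, \ldots \) with exact rational arithmetic.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination on a copy."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(rows[0]) if rows else 0):
        p = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if p is None:
            continue
        rows[rk], rows[p] = rows[p], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][c] != 0:
                f = rows[r][c] / rows[rk][c]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def matvec(H, v):
    """Apply the matrix H (list of rows) to the vector v."""
    return [sum(H[i][j] * v[j] for j in range(len(v))) for i in range(len(H))]

def reachable_dims(H, vectors, n_max):
    """dims[N-1] = dim span{H^t v_i : 0 <= t < N}, for N = 1, ..., n_max."""
    dims, family = [], []
    current = [list(v) for v in vectors]
    for _ in range(n_max):
        family.extend(current)          # add the vectors H^t v_i for the new t
        dims.append(rank(family))
        current = [matvec(H, v) for v in current]
    return dims

# A diagonal operator with a repeated eigenvalue: the space stabilizes early.
H_diag = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
dims_diag = reachable_dims(H_diag, [[1, 1, 1]], 4)   # [1, 2, 2, 2]

# A nilpotent shift: the space grows until it fills the whole of K^3.
H_shift = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
dims_shift = reachable_dims(H_shift, [[0, 0, 1]], 4)  # [1, 2, 3, 3]
```

In both cases the dimensions are nondecreasing and constant from some \(N \leqslant 3\) onward, in agreement with the bound by the dimension of the ambient space.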

If \(A :\mathcal {U}\rightarrow \hbox {GL}(d,\mathbb {K})\) is a differentiable map, then the normalized derivative of \(A\) at a point \(u\) is the linear map \(T_u \mathcal {U}\rightarrow \mathfrak {gl}(d,\mathbb {K})\) given by \(h \mapsto (DA(u)\cdot h)\circ A^{-1}(u)\).

Let \(\phi _N(\xi _0,\hat{u})\) be the state \(\xi _N\in \mathbb {K}\hbox {P}^{d-1}\) of the system (1.4) determined by the initial state \(\xi _0\) and the input sequence \(\hat{u}\in \mathcal {U}^N\). Let \(\partial _2 \phi _N(\xi _0,\hat{u})\) be the derivative of the map \(\phi _N(\xi _0, \cdot )\) at \(\hat{u}\).

Fix a constant input \(\hat{u}=(u,\ldots , u)\in \mathcal {U}^N\), and local coordinates on \(\mathcal {U}\) around \(u\). Let \(B_j\) be the normalized partial derivative of the map \(A\) at \(u\) with respect to the \(j\)th coordinate. Consider the datum \(\mathbf {A}=(A,B_1,\ldots , B_m)\), where \(A = A(u)\). Define the following subspace of \(\mathfrak {gl}(d,\mathbb {K})\):
$$\begin{aligned} \Lambda _N(\mathbf {A}) = \mathfrak {R}_{\hbox {Ad}_A}^N (\hbox {Id}, B_1, \ldots , B_m), \end{aligned}$$
where \(\hbox {Ad}_A(B) = ABA^{-1}\).

Proposition 2.6

For all \(\xi _0\in \mathbb {K}\hbox {P}^{d-1}\) and any \(x_0\in \mathbb {K}^d{\backslash } \{0\}\) representing \(\xi _0\),
$$\begin{aligned} {{\mathrm{rank }}}\partial _2 \phi _N (\xi _0,\hat{u}) = \dim \left[ \Lambda _N(\mathbf {A})\cdot (A^N x_0)\right] -1. \end{aligned}$$

In particular (since \(A =A(u)\) is invertible), the input \(\hat{u}\) is universally regular if and only if \(\Lambda _N(\mathbf {A})\) is a transitive space, which is the statement of Proposition 1.3.

Proof of Proposition 2.6

Let \(\xi _0 = [x_0]\), where \(x_0\in \mathbb {K}^d_*\). Let \(\psi _N(x_0,\hat{u})\) be the final state of the non-projectivized system (1.3) determined by the initial state \(x_0\) and by the sequence of controls \(\hat{u}\in \mathcal {U}^N\). Using local coordinates with \(u\) in the origin, we have the following first order approximation for \(\hat{u}\simeq 0\):
$$\begin{aligned} \psi _N(x_0,\hat{u})&\simeq A^N x_0 + \mathop {\sum \limits _{1\leqslant j\leqslant m}}\limits _{0 \leqslant t<N} u_{t,j}A^{N-t-1}B_{j} A^{t+1} x_0 \\&= \left( \hbox {Id}+ \mathop {\sum \limits _{1\leqslant j\leqslant m}}\limits _{0 \leqslant n < N} u_{N-1-n,j}\hbox {Ad}_A^n(B_{j})\right) x_N , \end{aligned}$$
where \(x_N = \psi _N(x_0,0) = A^N x_0\). Therefore, the image of \(\partial _2 \psi _N(x_0,\hat{u})\) is the following subspace of \(T_{A^N x_0} \mathbb {K}^d\):
$$\begin{aligned} V = \left( \mathop {\mathop {{{\mathrm{span }}}}\limits _{1\leqslant j\leqslant m}}\limits _{0\leqslant n<N}\hbox {Ad}_A^n B_{j} \right) \cdot x_N , \end{aligned}$$
The image of \(\partial _2 \phi _N(\xi _0,\hat{u})\) equals \(D\pi (x_N)(V)\), where \(\pi : \mathbb {K}^d_* \rightarrow \mathbb {K}\hbox {P}^{d-1}\) is the canonical projection. Notice that \({{\mathrm{Ker }}}D\pi (x) = \mathbb {K}x\) for any \(x \in \mathbb {K}^d_*\). It follows that
$$\begin{aligned} {{\mathrm{rank }}}\partial _2 \phi _N(\xi _0,\hat{u})&= \dim \left[ D\pi (x_N) (V) \right] \\&= \dim \left[ D\pi (x_N) \big (\mathbb {K}x_N + V\big ) \right] = \dim [\mathbb {K}x_N + V] - 1 \end{aligned}$$
Since \(\mathbb {K}x_N + V = \Lambda _N(\mathbf {A})\cdot x_N\), the proposition is proved.\(\square \)
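The right-hand side of Proposition 2.6 is easy to evaluate on small examples. The following is a minimal sketch, assuming that \(A\) is diagonal with rational entries and that \(\Lambda _N(\mathbf {A})\) is spanned by \(\hbox {Id}\) and the matrices \(\hbox {Ad}_A^n(B_j)\) with \(0 \leqslant n < N\); the function names are hypothetical and not part of the paper.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over the rationals, by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def ad_diag(lams, B):
    """Ad_A(B) = A B A^{-1} for A = Diag(lams) (assumption: A is diagonal)."""
    d = len(lams)
    return [[Fraction(lams[i]) / Fraction(lams[j]) * B[i][j] for j in range(d)]
            for i in range(d)]

def lambda_generators(lams, Bs, N):
    """Spanning set of Lambda_N: Id and Ad_A^n(B_j) for 0 <= n < N."""
    d = len(lams)
    gens = [[[Fraction(int(i == j)) for j in range(d)] for i in range(d)]]
    for B in Bs:
        cur = B
        for _ in range(N):
            gens.append(cur)
            cur = ad_diag(lams, cur)
    return gens

def rank_rhs(lams, Bs, N, x0):
    """dim[Lambda_N . (A^N x0)] - 1, the right-hand side of Proposition 2.6."""
    xN = [Fraction(l) ** N * x for l, x in zip(lams, x0)]
    d = len(xN)
    images = [[sum(L[i][j] * xN[j] for j in range(d)) for i in range(d)]
              for L in lambda_generators(lams, Bs, N)]
    return rank(images) - 1

B = [[0, 1], [1, 0]]
# N = 1: the state x_1 = (1, 1) is fixed by B, so the rank drops to 0.
r1 = rank_rhs([1, 2], [B], 1, [1, Fraction(1, 2)])
# N = 2: Ad_A(B) moves x_2 = (1, 1), so the rank is d - 1 = 1.
r2 = rank_rhs([1, 2], [B], 2, [1, Fraction(1, 4)])
```

In this toy example the constant input of length 1 fails to be regular at one initial state, while length 2 already gives full rank there.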

2.3 The sets of poor data

For emphasis, we repeat the definition already given in the introduction: The datum \(\mathbf {A}= (A, B_1, \ldots , B_m) \in \hbox {GL}(d,\mathbb {K}) \times [\mathfrak {gl}(d,\mathbb {K})]^m\) is rich if the space \(\Lambda (\mathbf {A}) = \Lambda _{d^2}(\mathbf {A})\) is transitive, and poor otherwise. The concept in fact depends on the field under consideration. The set of such poor data is denoted by \(\mathcal {P}_{m,d}^{(\mathbb {K})}\).

It follows immediately from Proposition 2.4 that \(\mathcal {P}_{m,d}^{(\mathbb {R})}\) is a closed and semialgebraic subset of \(\hbox {GL}(d,\mathbb {R}) \times [\mathfrak {gl}(d,\mathbb {R})]^m\) and \(\mathcal {P}_{m,d}^{(\mathbb {C})}\) is an algebraic subset of \(\hbox {GL}(d,\mathbb {C}) \times [\mathfrak {gl}(d,\mathbb {C})]^m\). This proves part of Theorems 1.4 and 1.5.

Also, by Proposition 2.5 the real poor data are contained in the real part of the complex poor data, i.e.,
$$\begin{aligned} \mathcal {P}_{m,d}^{(\mathbb {R})} \subset \mathcal {P}_{m,d}^{(\mathbb {C})} \cap \big [ \hbox {GL}(d,\mathbb {R}) \times [\mathfrak {gl}(d,\mathbb {R})]^m \big ]. \end{aligned}$$
For later use, we note that the sets of poor data are saturated in the sense of the following definition: A set \(\mathcal {Z}\subset [\hbox {Mat}_{d\times d}(\mathbb {K})]^{1+m}\) will be called saturated if \((A, B_1, \ldots , B_m) \in \mathcal {Z}\) implies that:
  • for all \(P \in \hbox {GL}(d,\mathbb {K})\) we have \((P^{-1}AP, P^{-1}B_1 P, \ldots , P^{-1}B_m P) \in \mathcal {Z}\);

  • for all \(Q \!=\! (q_{ij}) \!\in \! \hbox {GL}(m,\mathbb {K})\), letting \(B'_i \!=\! \sum \nolimits _j q_{ij} B_j\), we have \((A, B_1', \ldots , B_m')\!\in \!\mathcal {Z}\).

2.4 The easy codimension inequalities of Theorems 1.4 and 1.5

Here we will discuss the simplest examples of poor data.

To begin, notice that if \(A \in \hbox {GL}(d,\mathbb {C})\) is diagonalizable, then so is \(\hbox {Ad}_A\). Indeed, assume without loss of generality that \(A = \hbox {Diag}(\lambda _1, \ldots , \lambda _d)\). Consider the basis \(\{E_{i,j} ; \; i,j \in \{1,\ldots ,d\}\} \) of \(\mathfrak {gl}(d,\mathbb {C})\), where
$$\begin{aligned} E_{i,j} \hbox { is the matrix whose only nonzero entry is a 1 in the } (i,j) \hbox { position}. \end{aligned}$$
Then \(\hbox {Ad}_A (E_{i,j}) = \lambda _i \lambda _j^{-1} E_{i,j}\). So if \(f\) is a polynomial and \(B = (b_{ij})\), then
$$\begin{aligned} \hbox {the }(i,j)\hbox {-entry of the matrix } (f(\hbox {Ad}_A))(B) \hbox { is } f(\lambda _i\lambda _j^{-1})b_{ij}. \end{aligned}$$
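The entrywise formula above can be checked against a direct iteration of \(\hbox {Ad}_A\). The following is a minimal sketch, assuming a diagonal \(A\) with rational eigenvalues and a polynomial \(f\) given by its list of coefficients; the helper names are hypothetical.

```python
from fractions import Fraction

def ad_diag(lams, B):
    """Ad_A(B) = A B A^{-1} for A = Diag(lams)."""
    d = len(lams)
    return [[Fraction(lams[i]) / Fraction(lams[j]) * B[i][j] for j in range(d)]
            for i in range(d)]

def poly_of_ad(coeffs, lams, B):
    """(f(Ad_A))(B) computed by iterating Ad_A, with f(x) = sum coeffs[n] x^n."""
    d = len(lams)
    total = [[Fraction(0)] * d for _ in range(d)]
    cur = [[Fraction(x) for x in row] for row in B]
    for c in coeffs:
        total = [[total[i][j] + c * cur[i][j] for j in range(d)] for i in range(d)]
        cur = ad_diag(lams, cur)
    return total

def poly_entrywise(coeffs, lams, B):
    """Closed form: the (i,j) entry is f(lams[i]/lams[j]) * b_ij."""
    d = len(lams)
    def f(x):
        return sum(c * x ** n for n, c in enumerate(coeffs))
    return [[f(Fraction(lams[i]) / Fraction(lams[j])) * B[i][j] for j in range(d)]
            for i in range(d)]

lams = [1, 2, 3]
B = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
coeffs = [1, -2, 0, 5]          # f(x) = 1 - 2x + 5x^3
direct = poly_of_ad(coeffs, lams, B)
closed = poly_entrywise(coeffs, lams, B)
```

Both computations agree entry by entry; for instance the \((1,2)\) entry is \(f(1/2)\, b_{12}\).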
The datum \(\mathbf {A}= (A, B_1, \ldots , B_m) \in \hbox {GL}(d,\mathbb {K}) \times \mathfrak {gl}(d,\mathbb {K})^m\) is called conspicuously poor if there exists a change of bases \(P \in \hbox {GL}(d,\mathbb {K})\) such that:
  • the matrix \(P^{-1} A P\) is diagonal;

  • the matrices \(P^{-1} B_k P\) have a zero entry in a common off-diagonal position; more precisely, there are indices \(i_0\), \(j_0 \in \{1,\ldots ,d\}\) with \(i_0 \ne j_0\) such that for each \(k \in \{1,\ldots ,m\}\), the \((i_0,j_0)\) entry of the matrix \(P^{-1} B_k P\) vanishes.

(As in the definition of poorness, the concept depends on the field \(\mathbb {K}\).)

Lemma 2.7

Conspicuously poor data are poor.


Let \(\mathbf {A}= (A, B_1, \ldots , B_m)\) be conspicuously poor. With a change of basis, we can assume that \(A\) is diagonal. Let \((e_1,\ldots ,e_d)\) be the canonical basis of \(\mathbb {K}^d\). Let \((i_0,j_0)\) be the entry position where all \(B_k\)’s have a zero entry. By (2.4), all matrices in the space \(\Lambda (\mathbf {A}) = \mathfrak {R}_{\hbox {Ad}_A}(\hbox {Id}, B_1, \ldots , B_m)\) have a zero entry in the \((i_0,j_0)\) position. In particular, there is no \(L \in \Lambda (\mathbf {A})\) such that \(L \cdot e_{j_0} = e_{i_0}\), showing that this space is not transitive.
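The mechanism behind Lemma 2.7 can be observed on a small datum. The following is a minimal sketch, assuming \(A\) is already diagonal with rational eigenvalues and using zero-based indices; the function names are hypothetical.

```python
from fractions import Fraction

def lambda_generators(lams, Bs):
    """Id together with Ad_A^n(B_k) for 0 <= n < d^2, where A = Diag(lams)."""
    d = len(lams)
    gens = [[[Fraction(int(i == j)) for j in range(d)] for i in range(d)]]
    for B in Bs:
        for n in range(d * d):
            gens.append([[(Fraction(lams[i]) / Fraction(lams[j])) ** n * B[i][j]
                          for j in range(d)] for i in range(d)])
    return gens

def blocked_position(lams, Bs, i0, j0):
    """True if every generator of Lambda(A) vanishes at the off-diagonal
    position (i0, j0); then no L in Lambda(A) maps e_{j0} to e_{i0}."""
    return all(L[i0][j0] == 0 for L in lambda_generators(lams, Bs))

lams = [1, 2, 3]
B1 = [[1, 2, 0], [3, 4, 5], [6, 7, 8]]   # zero at position (0, 2)
B2 = [[1, 1, 0], [1, 1, 1], [1, 1, 1]]   # zero at the same position
```

Here `blocked_position(lams, [B1, B2], 0, 2)` holds, so this datum is conspicuously poor, whereas no such obstruction appears at position `(2, 0)`.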

The converse of this lemma is certainly false. (Many examples appear in Appendix A (ESM); see also Example 3.5.) However, we will see in § A.1 that the converse holds for generic \(A\).

We will use Lemma 2.7 to prove the easy codimension inequalities for Theorems 1.4 and 1.5; first we need to recall the following:

Proposition 2.8

Suppose \(A \in \hbox {Mat}_{d\times d}(\mathbb {K})\) is diagonalizable over \(\mathbb {K}\) and with simple eigenvalues only. Then there is a neighborhood of \(A\) where the eigenvalues vary smoothly, and where the eigenvectors can be chosen to vary smoothly.

Proposition 2.9

(Easy half of Theorems 1.4 and 1.5) For both \(\mathbb {K}=\mathbb {R}\) and \(\mathbb {K}=\mathbb {C}\), we have \({{\mathrm{codim }}}_{\mathbb {K}} \mathcal {P}^{(\mathbb {K})}_{m,d} \leqslant m\).


Using Proposition 2.8, we can exhibit smoothly embedded disks of codimension \(m\) inside \(\hbox {GL}(d,\mathbb {K}) \times \mathfrak {gl}(d,\mathbb {K})^m\) formed by conspicuously poor data.\(\square \)

3 Rigidity

The aim of this section is to state Theorem 3.6 and prove its first part. Along the way, we will establish several lemmas which will be reused in the proof of the second part of the theorem in Sect. 4.

3.1 Acyclicity

Consider a linear operator \(H :E \rightarrow E\), where \(E\) is a finite-dimensional complex vector space. The acyclicity of \(H\) is defined as the least number \(n\) of vectors \(v_1\), ..., \(v_n \in E\) such that \(\mathfrak {R}_H(v_1, \ldots , v_n) = E\). We denote \(n = {{\mathrm{acyc }}}H\). If \(n = 1\), then \(H\) is called a cyclic operator, and \(v_1\) is called a cyclic vector.

Lemma 3.1

Let \(E\) be a finite-dimensional complex vector space and let \(H :E \rightarrow E\) be a linear operator. Assume that \(E_1\), ..., \(E_k \subset E\) are \(H\)-invariant subspaces and that the spectra of \(H|E_i\) (\(1 \leqslant i \leqslant k\)) are pairwise disjoint. If \(v_1 \in E_1\), ..., \(v_k \in E_k\), then
$$\begin{aligned} \mathfrak {R}_H(v_1, \ldots , v_k) = \mathfrak {R}_H(v_1 + \cdots + v_k) . \end{aligned}$$


View \(E\) as a module over the ring of polynomials \(\mathbb {C}[x]\) by defining \(xv=H(v)\) for \(v \in E\). Then the lemma follows from [11, Theorem 6.4].

The geometric multiplicity of an eigenvalue \(\lambda \) of \(H\) is the dimension of the kernel of \(H - \lambda \hbox {Id}\) (or, equivalently, the number of corresponding Jordan blocks).

Proposition 3.2

The acyclicity of an operator equals the maximum of the geometric multiplicities of its eigenvalues.


This follows from the Primary Cyclic Decomposition Theorem together with Lemma 3.1. \(\square \)

Remark 3.3

The operators which interest us most are \(H = \hbox {Ad}_A\), where \(A \in \hbox {GL}(d,\mathbb {C})\). It is useful to observe that the geometric multiplicity of 1 as an eigenvalue of \(\hbox {Ad}_A\) equals the codimension of the conjugacy class of \(A\) inside \(\hbox {GL}(d,\mathbb {C})\). To prove this, consider the map \(\Psi _A :\hbox {GL}(d,\mathbb {C}) \rightarrow \hbox {GL}(d,\mathbb {C})\) given by \(\Psi _A(X) = \hbox {Ad}_X(A)\). The derivative at \(X=\hbox {Id}\) is \(H \mapsto HA-AH\); so \({{\mathrm{Ker }}}D\Psi _A (\hbox {Id}) = {{\mathrm{Ker }}}(\hbox {Ad}_A - \hbox {id})\). Therefore, when \(X=\hbox {Id}\), the corank of \(D\Psi _A (X)\) (that is, the dimension of its kernel) equals the geometric multiplicity of 1 as an eigenvalue of \(\hbox {Ad}_A\). To see that this is true for any \(X\), notice that \(\Psi _A = \Psi _{\hbox {Ad}_X(A)} \circ R_{X^{-1}}\) (where \(R\) denotes a right-multiplication diffeomorphism of \(\hbox {GL}(d,\mathbb {C})\)).

We will see later (Lemma 4.11) that 1 is the eigenvalue of \(\hbox {Ad}_A\) with the biggest geometric multiplicity. By Proposition 3.2, we conclude that \({{\mathrm{acyc }}}\hbox {Ad}_A\) equals the codimension of the conjugacy class of \(A\).
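The kernel of \(H \mapsto HA - AH\) in Remark 3.3 can be computed directly for small matrices. The following is a minimal sketch over the rationals (an assumption made only to keep the arithmetic exact); `commutator_nullity` is a hypothetical helper name.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over the rationals, by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def commutator_nullity(A):
    """dim Ker(H -> HA - AH), evaluated on the basis E_ij of gl(d)."""
    d = len(A)
    images = []
    for i in range(d):
        for j in range(d):
            img = [[Fraction(0)] * d for _ in range(d)]
            for k in range(d):
                img[i][k] += A[j][k]   # (E_ij A)[i][k] = A[j][k]
                img[k][j] -= A[k][i]   # (A E_ij)[k][j] = A[k][i]
            images.append([x for row in img for x in row])
    return d * d - rank(images)

simple = commutator_nullity([[1, 0, 0], [0, 2, 0], [0, 0, 3]])
repeated = commutator_nullity([[1, 0, 0], [0, 1, 0], [0, 0, 2]])
jordan = commutator_nullity([[1, 1], [0, 1]])
```

For a matrix with simple eigenvalues the kernel has dimension \(d\), while a repeated eigenvalue enlarges it, in agreement with the codimension of the conjugacy class.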

3.2 Definition of rigidity, and the main rigidity estimate

Let \(E\) and \(F\) be finite-dimensional complex vector spaces. Let \(H\) be a linear operator acting on the space \(\mathcal {L}(E,F)\). We define the rigidity of \(H\), denoted \({{\mathrm{rig }}}H\), as the least \(n\) such that there exist \(L_1\), ..., \(L_n \in \mathcal {L}(E,F)\) so that \(\mathfrak {R}_H(L_1 , \ldots , L_n)\) is transitive. Therefore,
$$\begin{aligned} 1 \leqslant {{\mathrm{rig }}}H \leqslant {{\mathrm{acyc }}}H . \end{aligned}$$
For technical reasons, we also define a modified rigidity of \(H\), denoted \({{\mathrm{rig }}}_+ H\). The definition is the same, with the difference that if \(E = F\), then \(L_1\) is required to be the identity map in \(\mathcal {L}(E,E)\). Of course,
$$\begin{aligned} {{\mathrm{rig }}}H \leqslant {{\mathrm{rig }}}_+ H \leqslant {{\mathrm{rig }}}H + 1. \end{aligned}$$
We want to give a reasonably good estimate of the modified rigidity of \(\hbox {Ad}_A\) for any fixed \(A \in \hbox {GL}(d,\mathbb {C})\). (This will be achieved in Lemma 4.14.) We assume that \(d \geqslant 2\); so \({{\mathrm{rig }}}_+ \hbox {Ad}_A \geqslant 2\). The next example shows that “most” matrices \(A\) have the lowest possible \({{\mathrm{rig }}}_+ \hbox {Ad}_A\).

Example 3.4

If \(A\in \hbox {GL}(d,\mathbb {C})\) is unconstrained (see § A.1), then \({{\mathrm{rig }}}_+ \hbox {Ad}_A = 2\). Indeed, if we take a matrix \(B \in \mathfrak {gl}(d,\mathbb {C})\) whose expression in the basis that diagonalizes \(A\) has no zeros off the diagonal, then, by Lemma A.1, the space \(\Lambda (A,B) = \mathfrak {R}_{\hbox {Ad}_A}(\hbox {Id},B)\) is transitive.

More generally, if \(A\in \hbox {GL}(d,\mathbb {C})\) is little constrained (see Appendix A in ESM), then it follows from Proposition A.3 that \({{\mathrm{rig }}}_+ \hbox {Ad}_A = 2\).

Example 3.5

Consider \(A = \hbox {Diag}(1, \alpha , \alpha ^2)\) where \(\alpha = e^{2\pi i /3}\). (In the terminology of § A.1, \(A\) has constraints of type 1.) Since \(\hbox {Ad}_A^3\) is the identity, we have \(\dim \mathfrak {R}_{\hbox {Ad}_A}(\hbox {Id},B) \leqslant 4\) for any \(B \in \mathfrak {gl}(3,\mathbb {C})\). By the result of Azoff [1] already mentioned at Example 2.1, the minimum dimension of a transitive subspace of \(\mathfrak {gl}(3,\mathbb {C})\) is 5. This shows that \({{\mathrm{rig }}}_+ \hbox {Ad}_A \geqslant 3\). (Actually, equality holds, as we will see in Example 3.9 below.)
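The dimension bound in Example 3.5 can be observed numerically. The following is a minimal sketch in floating-point arithmetic with a tolerance-based rank (both are assumptions of this illustration, not part of the paper's argument); the helper names are hypothetical.

```python
import cmath

ALPHA = cmath.exp(2j * cmath.pi / 3)

def ad(B):
    """Ad_A(B) for A = Diag(1, alpha, alpha^2): entry (i,j) is scaled by alpha^(i-j)."""
    return [[ALPHA ** (i - j) * B[i][j] for j in range(3)] for i in range(3)]

def rank(vectors, tol=1e-9):
    """Numerical rank by Gauss-Jordan elimination with partial pivoting."""
    rows = [[complex(x) for x in v] for v in vectors]
    rk = 0
    for col in range(len(rows[0])):
        if rk == len(rows):
            break
        piv = max(range(rk, len(rows)), key=lambda i: abs(rows[i][col]))
        if abs(rows[piv][col]) < tol:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(len(rows)):
            if i != rk:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def dim_lambda(B):
    """Dimension of the span of Id and Ad_A^n(B) for 0 <= n < 9."""
    gens = [[[1, 0, 0], [0, 1, 0], [0, 0, 1]]]
    cur = B
    for _ in range(9):
        gens.append(cur)
        cur = ad(cur)
    return rank([[x for row in L for x in row] for L in gens])

B_generic = [[1, 3, 5], [2, 4, 6], [3, 5, 7]]
B_ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

For `B_generic` the dimension is 4, the largest value allowed by \(\hbox {Ad}_A^3 = \hbox {id}\); for `B_ones` it drops to 3 because \(\hbox {Id}\) already lies in the span of the iterates of \(B\). In both cases it stays below 5.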

Let \(T\) be the set of roots of unity. Define an equivalence relation \(\asymp \) on the set \(\mathbb {C}^*\) of nonzero complex numbers by:
$$\begin{aligned} \lambda \asymp \lambda ' \Leftrightarrow \lambda / \lambda ' \in T. \end{aligned}$$
We also say that \(\lambda \), \(\lambda '\) are equivalent mod \(T\).
For \(A \in \hbox {GL}(d,\mathbb {C})\), we denote
$$\begin{aligned} c(A) := \hbox {number of different classes mod T of the eigenvalues of }A. \end{aligned}$$
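For eigenvalues whose arguments are rational multiples of \(2\pi \), the number \(c(A)\) is easy to compute. The following is a minimal sketch under that assumption: each eigenvalue is encoded as a hypothetical pair (modulus, argument\(/2\pi \)) with rational argument fraction, so that two eigenvalues are equivalent mod \(T\) exactly when their moduli coincide.

```python
from fractions import Fraction

def equivalent_mod_T(z, w):
    """z, w = (modulus, angle / (2*pi)) with rational angle fractions.
    The quotient z/w has a rational angle fraction, so it is a root of
    unity if and only if its modulus equals 1."""
    return Fraction(z[0]) / Fraction(w[0]) == 1

def count_classes(eigenvalues):
    """c(A): number of classes mod T among the listed eigenvalues."""
    reps = []
    for z in eigenvalues:
        if not any(equivalent_mod_T(z, w) for w in reps):
            reps.append(z)
    return len(reps)

# A = Diag(1, alpha, alpha^2) with alpha = exp(2 pi i / 3)
cube_roots = [(1, Fraction(0)), (1, Fraction(1, 3)), (1, Fraction(2, 3))]
# Three eigenvalues with pairwise different moduli
generic = [(1, Fraction(0)), (2, Fraction(0)), (3, Fraction(1, 5))]
```

The first list gives \(c(A) = 1\) and the second gives \(c(A) = 3 = d\). Eigenvalues with irrational argument fractions are outside the scope of this encoding.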
We now state a technical result which has a central role in our proofs, as explained informally in Sect. 1.4:

Theorem 3.6

Let \(d \geqslant 2\) and \(A \in \hbox {GL}(d,\mathbb {C})\). Then:
  1. If \(c(A) = d\), then \({{\mathrm{rig }}}_+ \hbox {Ad}_A = 2\).

  2. If \(c(A) < d\), then \({{\mathrm{rig }}}_+ \hbox {Ad}_A \leqslant {{\mathrm{acyc }}}\hbox {Ad}_A - c(A) + 1\).


Remark 3.7

When \(c(A) = d\), we have \({{\mathrm{acyc }}}\hbox {Ad}_A = d\) (this will follow from Lemma 4.11); so the conclusion of part 2 does not hold in this case.

Remark 3.8

The conditions of \(A\) being unconstrained and \(A\) having \(c(A)=d\) both mean that \(A\) is “non-degenerate.” Both of them imply small rigidity, according to Example 3.4 and part 1 of Theorem 3.6. It is important, however, not to confuse the two properties; in fact, neither implies the other.

Example 3.9

Consider again \(A\) as in Example 3.5. The eigenvalues of \(\hbox {Ad}_A\) are 1, \(\alpha \), and \(\alpha ^2\), each with multiplicity 3; so Proposition 3.2 gives \({{\mathrm{acyc }}}\hbox {Ad}_A = 3\). Since \(c(A) = 1\), Theorem 3.6 tells us that \({{\mathrm{rig }}}_+ \hbox {Ad}_A \leqslant 3\), which is actually sharp.
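The multiplicities in Example 3.9 can be counted by working with exponents of \(\alpha \). The following is a minimal sketch, assuming \(A = \hbox {Diag}(\alpha ^{e_1}, \ldots , \alpha ^{e_d})\) with \(\alpha \) a primitive \(n\)th root of unity, so that \(\hbox {Ad}_A\) is diagonalizable with eigenvalue \(\alpha ^{e_i - e_j}\) on \(E_{i,j}\); the names are hypothetical.

```python
from collections import Counter

def ad_spectrum(exponents, n):
    """Multiplicity of each eigenvalue alpha^k of Ad_A, indexed by k mod n."""
    return Counter((ei - ej) % n for ei in exponents for ej in exponents)

mult = ad_spectrum([0, 1, 2], 3)   # A = Diag(1, alpha, alpha^2)
acyc = max(mult.values())          # Proposition 3.2, Ad_A being diagonalizable
c_A = 1                            # all eigenvalues of A are roots of unity
bound = acyc - c_A + 1             # right-hand side of part 2 of Theorem 3.6
```

Each of the three eigenvalues has multiplicity 3, and the resulting bound is 3.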

The proof of part 1 of Theorem 3.6 will be given in § 3.5 after a few preliminaries (Sect. 3.3 and 3.4). These preliminaries are also used in the proof of the harder part 2, which will be given in Sect. 4.

3.3 A criterion for transitivity

We will show the transitivity of certain spaces of matrices that remotely resemble Toeplitz matrices.

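Let \(t\) and \(s\) be positive integers. Let \(\mathcal {R}_1\) be a partition of the interval \([1,t] = \{1, \ldots , t\}\) into intervals, and let \(\mathcal {R}_2\) be a partition of \([1,s]\) into intervals. Let \(\mathcal {R}\) be the product partition. We will be interested in matrices \(M = (m_{i,j})\) of the following special form (3.3):
$$\begin{aligned} M = \begin{pmatrix} * &{} 0 &{} 0 \\ * &{} M_{\mathtt {R}} &{} 0 \\ * &{} * &{} * \end{pmatrix}, \end{aligned}$$
where \(\mathtt {R}\) is an element of the product partition \(\mathcal {R}\), and \(M_{\mathtt {R}}\) is the submatrix \((m_{i,j})_{(i,j)\in \mathtt {R}}\). The blocks marked 0 (those above \(\mathtt {R}\) or to the right of \(\mathtt {R}\)) vanish, and the blocks marked \(*\) are arbitrary.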
Let \(\Lambda \) be a vector space of \(t \times s\) matrices. For each \(\mathtt {R}\in \mathcal {R}\), say of size \(k \times \ell \), we define the following space of matrices:
$$\begin{aligned} \Lambda ^{[\mathtt {R}]} = \big \{N \in \hbox {Mat}_{k \times \ell }(\mathbb {C});\exists M \in \Lambda \hbox { of the form (3.3) with }M_\mathtt {R}= N \big \}. \end{aligned}$$
We regard \(\Lambda \) as a subspace of \(\mathcal {L}(\mathbb {C}^s,\mathbb {C}^t)\). If the rectangle \(\mathtt {R}\) is \([p, p+k-1] \times [q, q+\ell -1]\), we regard the space \(\Lambda ^{[\mathtt {R}]}\) as a subspace of
$$\begin{aligned} \mathcal {L} \big ( \{0\}^{q-1} \times \mathbb {C}^\ell \times \{0\}^{s-q-\ell +1}, \{0\}^{p-1} \times \mathbb {C}^k \times \{0\}^{t-p-k+1} \big ). \end{aligned}$$

Lemma 3.10

Assume that \(\Lambda ^{[\mathtt {R}]}\) is transitive for each \(\mathtt {R}\in \mathcal {R}\). Then \(\Lambda \) is transitive.

An interesting feature of the lemma which will be useful later is that it can be applied recursively. Before giving the proof of the lemma, we illustrate its usefulness by showing the transitivity of generalized Toeplitz spaces:

Proof of Example 2.2

Consider the partition of \([1,d]^2\) into \(1 \times 1\) “rectangles.” If \(\Lambda \) is a generalized Toeplitz space then \(\Lambda ^{[\mathtt {R}]} = \hbox {Mat}_{1\times 1}(\mathbb {C}) = \mathbb {C}\) for each rectangle \(\mathtt {R}\). These are transitive spaces, so Lemma 3.10 implies that \(\Lambda \) is transitive. \(\square \)

Before proving Lemma 3.10, notice the following dual characterization of transitivity, whose proof is immediate:

Lemma 3.11

A subspace \(\Lambda \subset \mathcal {L}(\mathbb {C}^s,\mathbb {C}^t)\) is transitive iff for any nonzero vector \(u \in \mathbb {C}^s\) and any nonzero linear functional \(\phi \in (\mathbb {C}^t)^*\) there exists \(M \in \Lambda \) such that \(\phi (M \cdot u) \ne 0\).

Proof of Lemma 3.10

Take any nonzero vector \(u = (u_1,\ldots ,u_s)\) in \(\mathbb {C}^s\) and a non-zero functional \(\phi (v_1,\ldots ,v_t) = \sum _{i=1}^t \phi _i v_i\) in \((\mathbb {C}^t)^*\). By Lemma 3.11, we need to show that there exists \(M = (x_{ij}) \in \Lambda \) such that
$$\begin{aligned} \phi (M \cdot u) = \sum \limits _{i=1}^t \sum \limits _{j=1}^s \phi _i x_{ij} u_j \end{aligned}$$
is nonzero.

Let \(j_0\) be the least index such that \(u_{j_0} \ne 0\), and let \(i_0\) be the greatest index such that \(\phi _{i_0} \ne 0\). Let \(\mathtt {R}\) be the element of \(\mathcal {R}\) that contains \((i_0,j_0)\). Notice that if \(M\) is of the form (3.3), then the \((i,j)\)-entries of \(M\) that are above left (resp. below right) of \(\mathtt {R}\) do not contribute to the sum (3.5), because \(u_j\) (resp. \(\phi _i\)) vanishes. That is, \(\phi (M \cdot u)\) depends only on \(M_{\mathtt {R}}\) and is given by \(\sum _{(i,j)\in \mathtt {R}} \phi _i x_{ij} u_j\). Since \(\Lambda ^{[\mathtt {R}]}\) is transitive, by Lemma 3.11 there is a choice of a matrix \(M \in \Lambda \) of the form (3.3) so that \(\phi (M \cdot u) \ne 0\). So we are done.

3.4 Preorder in the complex plane

We consider the set \(\mathbb {C}_* / T\) of equivalence classes of the relation (3.1). Since \(T\) is the torsion subgroup of \(\mathbb {C}_*\), the quotient \(\mathbb {C}_* / T\) is an abelian torsion-free group.

Proposition 3.12

There exists a multiplication-invariant total order \(\preccurlyeq \) on \(\mathbb {C}_* / T\).

The proposition follows from a result of Levi [10], but nevertheless let us give a direct proof:


There is an isomorphism between \(\mathbb {R}\oplus (\mathbb {R}/ \mathbb {Q})\) and \(\mathbb {C}_*/T\), namely \((x,y) \mapsto \exp (x + 2\pi i y)\). So it suffices to find a multiplication-invariant order in \(\mathbb {R}/\mathbb {Q}\) (and then take the lexicographic order). Take a Hamel basis \(B\) of the \(\mathbb {Q}\)-vector space \(\mathbb {R}\) so that \(1 \in B\). Then \(\mathbb {R}/\mathbb {Q}\) is a direct sum of abelian groups \(\bigoplus _{x\in B, x \ne 1} x\mathbb {Q}\). Order each \(x\mathbb {Q}\) in the usual way and take any total order on \(B\). Then the induced lexicographic order on \(\mathbb {R}/\mathbb {Q}\) is multiplication-invariant, and the proof is concluded.\(\square \)

Let \([z]\in \mathbb {C}_*/T\) denote the equivalence class of \(z\in \mathbb {C}_*\). Let us extend the notation, writing \(z \preccurlyeq z'\) if \([z] \preccurlyeq [z']\). Then \(\preccurlyeq \) becomes a multiplication-invariant total preorder on \(\mathbb {C}_*\) that induces the equivalence relation \(\asymp \). In other words, for all \(z\), \(z'\), \(z'' \in \mathbb {C}_*\) we have:
  • \(z \preccurlyeq z'\) or \(z' \preccurlyeq z\);

  • \(z \preccurlyeq z'\) and \(z' \preccurlyeq z\)\(\Longleftrightarrow \)\(z \asymp z'\);

  • \(z \preccurlyeq z'\) and \(z' \preccurlyeq z''\)\(\Longrightarrow \)\(z \preccurlyeq z''\);

  • \(z \preccurlyeq z'\)\(\Longrightarrow \)\(z z'' \preccurlyeq z' z''\).

It follows that:
  • \(z \preccurlyeq z'\)\(\Longrightarrow \)\((z')^{-1} \preccurlyeq z^{-1}\).

We write \(z \prec z'\) when \(z \preccurlyeq z'\) and \(z \not \asymp z'\).
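The listed properties can be tested on a concrete instance. The following is a minimal sketch restricted to nonzero complex numbers whose argument is a rational multiple of \(2\pi \), encoded as hypothetical pairs (modulus, argument\(/2\pi \)); on this subgroup, comparing moduli gives a preorder with the required properties. The general construction with a Hamel basis is not implemented here.

```python
from fractions import Fraction
from itertools import product

def mul(z, w):
    """Product of two numbers encoded as (modulus, angle / (2*pi))."""
    return (z[0] * w[0], (z[1] + w[1]) % 1)

def inv(z):
    """Inverse in the same encoding."""
    return (1 / z[0], (-z[1]) % 1)

def preceq(z, w):
    """Preorder on the rational-angle subgroup: compare moduli."""
    return z[0] <= w[0]

def equiv(z, w):
    """Equivalence mod T on the rational-angle subgroup: equal moduli."""
    return z[0] == w[0]

def check_axioms(sample):
    """Check totality, compatibility with the equivalence relation,
    transitivity, multiplication invariance and the inversion rule."""
    for z, w, v in product(sample, repeat=3):
        assert preceq(z, w) or preceq(w, z)
        assert (preceq(z, w) and preceq(w, z)) == equiv(z, w)
        if preceq(z, w) and preceq(w, v):
            assert preceq(z, v)
        if preceq(z, w):
            assert preceq(mul(z, v), mul(w, v))
            assert preceq(inv(w), inv(z))
    return True

sample = [(Fraction(1), Fraction(1, 4)), (Fraction(1), Fraction(7, 12)),
          (Fraction(2), Fraction(1, 12)), (Fraction(1, 2), Fraction(0))]
```

Calling `check_axioms(sample)` runs through all triples of the sample and raises an assertion error if any property fails.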

3.5 Proof of the easy part of Theorem 3.6

Proof of part 1 of Theorem 3.6

If \(c(A)=d\), then in particular all eigenvalues are different and so the matrix \(A\) is diagonalizable. So with a change of basis, we can assume that \(A = \hbox {Diag}(\lambda _1, \ldots , \lambda _d)\). We can also assume that the eigenvalues are increasing with respect to the preorder introduced in Sect. 3.4:
$$\begin{aligned} \lambda _1 \prec \lambda _2 \prec \cdots \prec \lambda _d . \end{aligned}$$
Fix any matrix \(B = (b_{ij})\) with only nonzero entries and consider the space \(\Lambda = \mathfrak {R}_{\hbox {Ad}_A}(B)\), which is described by (2.4). We will use Lemma 3.10 to show that \(\Lambda \) is transitive. Let \(\mathcal {R}\) be the partition of \([1,d]^2\) into \(1 \times 1\) rectangles. Given a cell \(\mathtt {R}= \{(i_0,j_0)\} \in \mathcal {R}\) and a coefficient \(t \in \mathbb {C}\), there exists a polynomial \(f\) such that \(f(\lambda _i \lambda _j^{-1})\) equals \(t\) if \(\lambda _i \lambda _j^{-1} = \lambda _{i_0} \lambda _{j_0}^{-1}\) and equals 0 otherwise. Because the eigenvalues are ordered, \(M = f(\hbox {Ad}_A)\cdot B\) is a matrix in \(\Lambda \) of the form (3.3). Also, \(M_{\mathtt {R}} = (t\, b_{i_0 j_0})\), where \(b_{i_0 j_0} \ne 0\) and \(t\) is arbitrary. So \(\Lambda ^{[\mathtt {R}]} = \mathbb {C}\), which is transitive. This shows that \({{\mathrm{rig }}}\hbox {Ad}_A = 1\), and \({{\mathrm{rig }}}_+ \hbox {Ad}_A \leqslant 2\). Thus, as \(d\geqslant 2\), we have \({{\mathrm{rig }}}_+ \hbox {Ad}_A = 2\). \(\square \)
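The polynomial \(f\) used in this proof can be produced by Lagrange interpolation. The following is a minimal sketch, assuming real rational eigenvalues with increasing moduli (so that the order by modulus plays the role of \(\prec \)) and zero-based indices; the helper names are hypothetical. The check `has_corner_form` tests the zero pattern that the proof of Lemma 3.10 relies on: all entries weakly above and weakly to the right of the chosen cell, other than the cell itself, vanish.

```python
from fractions import Fraction

def selector(lams, i0, j0, t):
    """Lagrange polynomial f with f(target) = t and f(r) = 0 at every other
    ratio r = lams[i]/lams[j], where target = lams[i0]/lams[j0]."""
    ratios = {Fraction(a) / Fraction(b) for a in lams for b in lams}
    target = Fraction(lams[i0]) / Fraction(lams[j0])
    def f(x):
        value = Fraction(t)
        for r in ratios - {target}:
            value *= (x - r) / (target - r)
        return value
    return f

def apply_f_of_ad(lams, B, f):
    """(f(Ad_A))(B) for A = Diag(lams): entry (i,j) is f(lams[i]/lams[j]) * b_ij."""
    d = len(lams)
    return [[f(Fraction(lams[i]) / Fraction(lams[j])) * B[i][j] for j in range(d)]
            for i in range(d)]

def has_corner_form(M, i0, j0):
    """Entries in rows <= i0 and columns >= j0, except (i0, j0), must vanish."""
    d = len(M)
    return all(M[i][j] == 0 for i in range(i0 + 1) for j in range(j0, d)
               if (i, j) != (i0, j0))

lams = [1, 2, 4]                       # increasing moduli, so c(A) = d = 3
B = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # only nonzero entries
results = {(i0, j0): apply_f_of_ad(lams, B, selector(lams, i0, j0, 5))
           for i0 in range(3) for j0 in range(3)}
```

Note that the ratio \(2\) occurs at two positions, \((2,1)\) and \((3,2)\) in the paper's one-based indexing; the second occurrence falls in the unconstrained part of the matrix, which is why the ordering of the eigenvalues matters.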

4 Proof of the hard part of the rigidity estimate

This section is wholly devoted to proving part 2 of Theorem 3.6. In the course of the proof, we need to introduce some terminology and to establish several intermediate results. None of these are used in the rest of the paper, apart from a simple consequence, which is Remark 4.12.

4.1 The normal form

Let \(A \in \hbox {GL}(d,\mathbb {C})\). In order to describe the estimate on \({{\mathrm{rig }}}_+ \hbox {Ad}_A\), we need to put \(A\) in a certain normal form, which we now explain. Fix a preorder \(\preccurlyeq \) on \(\mathbb {C}_*\) as in § 3.4.

List the eigenvalues of \(A\) without repetitions as
$$\begin{aligned} \lambda _1 \preccurlyeq \cdots \preccurlyeq \lambda _r \end{aligned}$$
Write each eigenvalue in polar coordinates:
$$\begin{aligned} \lambda _k = \rho _k \exp ( \theta _k \sqrt{-1}), \quad \hbox {where } \rho _k > 0 \text{ and } 0 \leqslant \theta _k < 2 \pi . \end{aligned}$$
Up to reordering, we may assume
$$\begin{aligned} \left. \begin{array}{l} \lambda _k \asymp \lambda _\ell \\ k < \ell \end{array} \right\} \Rightarrow \theta _k < \theta _\ell . \end{aligned}$$
With a change of basis, we can assume that \(A\) has Jordan form:
$$\begin{aligned} A = \begin{pmatrix} A_1 &{} &{} \\ &{} \ddots &{} \\ &{} &{} A_r \end{pmatrix}, \quad A_k = \begin{pmatrix} J_{t_{k,1}}(\lambda _k) &{} &{} \\ &{} \ddots &{} \\ &{} &{} J_{t_{k,\tau _k}}(\lambda _k) \end{pmatrix}, \end{aligned}$$
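where \(t_{k,1} + \cdots + t_{k,\tau _k} = s_k\) is the multiplicity of the eigenvalue \(\lambda _k\), and \(J_t(\lambda )\) is the following \(t \times t\) Jordan block (4.3):
$$\begin{aligned} J_t(\lambda ) = \begin{pmatrix} \lambda &{} 1 &{} &{} \\ &{} \lambda &{} \ddots &{} \\ &{} &{} \ddots &{} 1 \\ &{} &{} &{} \lambda \end{pmatrix}. \end{aligned}$$
The matrix \(A\) will be fixed from now on.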

4.2 Rectangular partitions

This subsection contains several definitions that will be fundamental in all arguments until the end of the section. We will define certain subregions of the set \(\{1,\dots ,d\}^2\) of matrix entry positions that depend on the normal form of the matrix \(A\). Later we will see they are related to \(\hbox {Ad}_A\)-invariant subspaces. Those regions will be c-rectangles, e-rectangles, and j-rectangles (where c stands for classes of eigenvalues, e for eigenvalues and j for Jordan blocks). Regions will have some numerical attributes (banners and weights) coming from their geometry and from the eigenvalues of \(A\) they will be associated with. Those attributes will be related to numerical invariants of \(\hbox {Ad}_A\) (eigenvalues and geometric multiplicities), but we use different names so that we remember their geometric meaning and so that they are not mistaken for the corresponding invariants of \(A\). We also introduce positional attributes of the regions (arguments and latitudes) which will be fundamental later in the proofs of our rigidity estimates.

Recall \(A\) is a matrix in normal form as explained in Sect. 4.1. Define three partitions \(\mathcal {P}_\mathrm{c}\), \(\mathcal {P}_\mathrm{e}\), \(\mathcal {P}_\mathrm{j}\) of the set \([1,d] = \{1,\ldots ,d\}\) into intervals:
  • The partition \(\mathcal {P}_\mathrm{c}\) corresponds to equivalence classes of eigenvalues under the relation \(\asymp \), that is, the right endpoints of its atoms are the numbers \(s_1 + \cdots + s_k\) where \(k=r\) or \(k\) is such that \(\lambda _k \prec \lambda _{k+1}\).

  • The partition \(\mathcal {P}_\mathrm{e}\) corresponds to eigenvalues: the right endpoints of its atoms are the numbers \(s_1 + \cdots + s_k\), where \(1 \leqslant k \leqslant r\). So \(\mathcal {P}_\mathrm{e}\) refines \(\mathcal {P}_\mathrm{c}\).

  • The partition \(\mathcal {P}_\mathrm{j}\) corresponds to Jordan blocks: the right endpoints of its atoms are the numbers \(s_1 + \cdots + s_{k-1} + t_{k,1} + \cdots + t_{k,\ell }\), where \(1 \leqslant k \leqslant r\) and \(1 \leqslant \ell \leqslant \tau _k\). So \(\mathcal {P}_\mathrm{j}\) refines \(\mathcal {P}_\mathrm{e}\).

For \(*= \mathrm c\), e, j, let \(\mathcal {P}_*^2\) be the partition of the square \([1,d]^2\) into rectangles that are products of atoms of \(\mathcal {P}_*\). The elements of \(\mathcal {P}_\mathrm{c}^2\) are called c-rectangles, the elements of \(\mathcal {P}_\mathrm{e}^2\) are called e-rectangles, and the elements of \(\mathcal {P}_\mathrm{j}^2\) are called j-rectangles. Thus, the square \([1,d]^2\) is a disjoint union of c-rectangles, each of which is a disjoint union of e-rectangles, each of which is a disjoint union of j-rectangles.

Example 4.1

Suppose \(d=17\), \(A\) has \(r=5\) eigenvalues
$$\begin{aligned} \lambda _1 \!=\! \exp { \tfrac{1}{2}\pi i} , \quad \lambda _2 \!=\! \exp { \tfrac{7}{6}\pi i} , \quad \lambda _3 \!=\! \exp {\tfrac{11}{6}\pi i} , \quad \lambda _4 \!=\! 2\exp { \tfrac{1}{6}\pi i} , \quad \lambda _5 \!=\! 2\exp { \tfrac{5}{6}\pi i} \end{aligned}$$
with \(\lambda _1 \asymp \lambda _2 \asymp \lambda _3 \prec \lambda _4 \asymp \lambda _5\) and respective Jordan blocks of sizes 4, 2, 1; 3, 2; 2; 2; 1. Then there are 4 c-rectangles, 25 e-rectangles, and 64 j-rectangles. See Fig. 1.
Fig. 1

The partitions of the square \([1,d]^2\) corresponding to Example 4.1. Thick (resp., thin, dashed) lines represent c-rectangles (resp., e-, j-) borders. Weight and latitude of each j-rectangle inside a selected e-rectangle are indicated. The weight of each e-rectangle is recorded in its upper left corner, along with a symbolic representation of its banner. There are three banner classes (\(\bullet = [1]\), \(\mathord {\Downarrow }=[2]\) and \(\mathord {\Uparrow }=[1/2]\)), each of them with 3 different banners. The e-rectangles with negative arguments are marked with \(\ominus \)
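The counts stated in Example 4.1 and in the caption of Fig. 1 can be verified mechanically. The following is a minimal sketch; the encoding of each eigenvalue as a triple (modulus, argument\(/\pi \), Jordan block sizes) is a hypothetical convention chosen for this illustration, with the data transcribed from the example.

```python
from fractions import Fraction

# (modulus, argument / pi, Jordan block sizes) for lambda_1, ..., lambda_5
eigs = [
    (1, Fraction(1, 2), [4, 2, 1]),
    (1, Fraction(7, 6), [3, 2]),
    (1, Fraction(11, 6), [2]),
    (2, Fraction(1, 6), [2]),
    (2, Fraction(5, 6), [1]),
]

d = sum(sum(blocks) for _, _, blocks in eigs)
n_classes = len({rho for rho, _, _ in eigs})      # atoms of P_c
n_eigs = len(eigs)                                # atoms of P_e
n_blocks = sum(len(blocks) for _, _, blocks in eigs)  # atoms of P_j
counts = (n_classes ** 2, n_eigs ** 2, n_blocks ** 2)

# Banner of the e-rectangle with row eigenvalue lambda_k and column
# eigenvalue lambda_l: lambda_k^{-1} lambda_l, stored as
# (modulus, argument / pi reduced mod 2).
banners = {(Fraction(rl, rk), (al - ak) % 2)
           for rk, ak, _ in eigs for rl, al, _ in eigs}
banners_per_class = {m: sum(1 for mod, _ in banners if mod == m)
                     for m in {mod for mod, _ in banners}}
```

The three banner classes correspond to the moduli \(1\), \(2\) and \(1/2\), and each of them contains three distinct banners.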

For each e-rectangle, we define its row eigenvalue and its column eigenvalue in the obvious way: If an e-rectangle \(\mathtt {E}\) equals \(I_k \times I_\ell \) where \(I_k\) and \(I_\ell \) are intervals with right endpoints \(s_1 + \cdots + s_k\) and \(s_1 + \cdots + s_\ell \), respectively, then the row eigenvalue of \(\mathtt {E}\) is \(\lambda _k\) and the column eigenvalue of \(\mathtt {E}\) is \(\lambda _\ell \). The row and column eigenvalues of a j-rectangle \(\mathtt {J}\) are defined, respectively, as the row and column eigenvalues of the e-rectangle that contains it.

Let \(\mathtt {E}\) be an e-rectangle with row eigenvalue \(\lambda _k\) and column eigenvalue \(\lambda _\ell \). The banner of \(\mathtt {E}\) is defined by \(\lambda _k^{-1}\lambda _\ell \). The argument of the e-rectangle is the quantity \(\theta _\ell - \theta _k \in (-2\pi ,2\pi )\). It coincides modulo \(2\pi \) with the argument of the banner, but it contains more information than the argument of the banner.

Each j-rectangle \(\mathtt {J}\) has an address of the type “\(i\)th row, \(j\)th column, e-rectangle \(\mathtt {E}\)”; then the latitude of the j-rectangle \(\mathtt {J}\) within the e-rectangle \(\mathtt {E}\) is defined as \(j-i\). See an example in Fig. 1.

If two e-rectangles lie in the same c-rectangle, then their banners are equivalent mod \(T\). Thus, every c-rectangle has a well-defined banner class in \(\mathbb {C}^*/T\).

If a j-rectangle, e-rectangle, or c-rectangle intersects the diagonal \(\{(1,1), \ldots , (d,d)\}\), then we call it equatorial. Equatorial regions are always square. The row and column eigenvalues of an equatorial e-rectangle coincide, and therefore every equatorial e-rectangle has banner 1.

The weight of a j-rectangle is defined as the minimum of its sides. The weight of a union \(R\) of j-rectangles in \([1,d]^2\) is defined as the sum of the weights of those j-rectangles. We denote it by \({{\mathrm{wgt }}}R\). We can in particular consider the weights of e and c-rectangles, and of the complete square \([1,d]^2\).

Let us notice some facts on the location of the banners (which will be useful to apply Lemma 3.10):

Lemma 4.2

Let \(\mathtt {E}\) be an e-rectangle in a c-rectangle \(\mathtt {C}\). Consider the divisions of the square \([1,d]^2\) and the c-rectangle \(\mathtt {C}\) as in Fig. 2.
Fig. 2

The divisions of \([1,d]^2\) and \(\mathtt {C}\) in Lemma 4.2

Let \(\beta \) be the banner of the e-rectangle \(\mathtt {E}\), and let \([\beta ]\) be the banner class of the c-rectangle \(\mathtt {C}\). Then:
  1. All the c-rectangles with banner class \([\beta ]\) are inside the rectangles marked with \(\times \).

  2. If the e-rectangle \(\mathtt {E}\) has nonnegative (resp. negative) argument, then all the e-rectangles with nonnegative (resp. negative) argument and with the same banner \(\beta \) are inside the rectangles marked with \(*\).



In view of the ordering of the eigenvalues (4.1), the banner class increases strictly (with respect to the order \(\prec \), of course) when we move rightwards or upwards to another c-rectangle. So Claim (1) follows.

The argument of an e-rectangle takes values in the interval \((-2\pi ,2\pi )\). It increases strictly by moving rightwards or upwards inside \(\mathtt {C}\). If two e-rectangles in the same c-rectangle have both nonnegative or negative argument, then they have the same banner if and only if they have the same argument. So Claim (2) follows.

4.3 The action of the adjoint of \(A\)

Given any \(d\times d\) matrix \(X = (x_{i,j})\) and a j-rectangle, e-rectangle or c-rectangle \(\mathtt {R}= [p,p+t-1] \times [q, q+s-1]\) we define the submatrix of \(X\) corresponding to \(\mathtt {R}\) as \((x_{i,j})_{(i,j) \in \mathtt {R}}\). We regard the space of \(\mathtt {R}\)-submatrices as \(\mathcal {L} \big ( \{0\}^{q-1} \times \mathbb {C}^s \times \{0\}^{d-q-s+1} , \{0\}^{p-1} \times \mathbb {C}^t \times \{0\}^{d-p-t+1} \big )\), or as the set of \(d \times d\) matrices whose entries outside \(\mathtt {R}\) are all zero. Such spaces are denoted by \(\mathtt {R}^\square \), and are invariant under \(\hbox {Ad}_A\). Indeed, if \(\mathtt {R}= \mathtt {J}\) is a j-rectangle, then identifying \(\mathtt {J}^\square \) with \(\hbox {Mat}_{t\times s}(\mathbb {C})\), the action of \(\hbox {Ad}_A | \mathtt {J}^\square \) is given by
$$\begin{aligned} X \mapsto J_{t}(\lambda _k) \cdot X \cdot J_{s}(\lambda _\ell )^{-1}, \end{aligned}$$
where \(\lambda _k\) and \(\lambda _\ell \) are, respectively, the row and the column eigenvalues of \(\mathtt {J}\) and \(J\) denotes Jordan blocks as defined by (4.3).

Lemma 4.3

For each j-rectangle \(\mathtt {J}\), the only eigenvalue of \(\hbox {Ad}_A | \mathtt {J}^\square \) is the banner of the e-rectangle that contains \(\mathtt {J}\). Moreover, the geometric multiplicity of the eigenvalue is the weight of the j-rectangle.


The matrix of the linear operator \(\hbox {Ad}_A | \mathtt {J}^\square \) can be described using the Kronecker product: see [9, Lemma 4.3.1]. The Jordan form of this operator is then described by [9, Theorem 4.3.17(a)]. The assertions of the lemma follow.

Some immediate consequences are the following:
  • The eigenvalues of \(\hbox {Ad}_A\) are the banners of e-rectangles.

  • The geometric multiplicity of the eigenvalue \(\beta \) for \(\hbox {Ad}_A\) is the total weight of e-rectangles of banner \(\beta \).
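These two consequences give a purely combinatorial recipe for the geometric multiplicities of \(\hbox {Ad}_A\). The following is a minimal sketch, using the hypothetical encoding of each eigenvalue of \(A\) as (modulus, argument\(/\pi \), Jordan block sizes) and the definitions of weight and banner from Sect. 4.2.

```python
from fractions import Fraction
from collections import defaultdict

def e_weight(row_blocks, col_blocks):
    """Weight of an e-rectangle: sum over its j-rectangles of the minimum side."""
    return sum(min(a, b) for a in row_blocks for b in col_blocks)

def banner_weights(eigs):
    """Total weight of the e-rectangles carrying each banner."""
    total = defaultdict(int)
    for rk, ak, bk in eigs:        # row eigenvalue lambda_k
        for rl, al, bl in eigs:    # column eigenvalue lambda_l
            banner = (Fraction(rl) / Fraction(rk), (al - ak) % 2)
            total[banner] += e_weight(bk, bl)
    return dict(total)

# A = Diag(1, alpha, alpha^2) from Example 3.5
ex35 = [(1, Fraction(0), [1]), (1, Fraction(2, 3), [1]), (1, Fraction(4, 3), [1])]
# The matrix of Example 4.1
ex41 = [(1, Fraction(1, 2), [4, 2, 1]), (1, Fraction(7, 6), [3, 2]),
        (1, Fraction(11, 6), [2]), (2, Fraction(1, 6), [2]), (2, Fraction(5, 6), [1])]

w35 = banner_weights(ex35)
w41 = banner_weights(ex41)
```

For Example 3.5 every banner has total weight 3, matching Example 3.9. For Example 4.1 the banner 1 has total weight 29, which is also the largest total weight among all banners.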

If \(\mathtt {R}\) is an equatorial j-rectangle, e-rectangle, or c-rectangle, we will refer to the \(d\times d\)-matrix in \(\mathtt {R}^\square \) whose \(\mathtt {R}\)-submatrix is the identity as the identity on \(\mathtt {R}^\square \). The following observation will be useful:

Lemma 4.4

If \(\mathtt {J}\) is an equatorial j-rectangle, then the identity on \(\mathtt {J}^\square \) is an eigenvector of the operator \(\hbox {Ad}_A | \mathtt {J}^\square \) corresponding to a Jordan block of size \(1 \times 1\).


Suppose \(\mathtt {J}\) has size \(t \times t\) and row (or column) eigenvalue \(\lambda \). Assume that the claim is false. This means that there exists a matrix \(X \in \hbox {Mat}_{t \times t}(\mathbb {C})\) such that \(J_t(\lambda ) X J_t(\lambda )^{-1} = X + \hbox {Id}\), which is impossible because \(X\) and \(X + \hbox {Id}\) have different spectra.

4.4 Rigidity estimates for j-rectangles and e-rectangles

Lemma 4.5

For any j-rectangle \(\mathtt {J}\), we have \({{\mathrm{rig }}}_+ (\hbox {Ad}_A | \mathtt {J}^\square ) \leqslant {{\mathrm{wgt }}}\mathtt {J}\).


By Lemma 4.3 (and Proposition 3.2), \(\hbox {Ad}_A|\mathtt {J}^\square \) has acyclicity \(n = {{\mathrm{wgt }}}\mathtt {J}\), that is, there are matrices \(X_1\), ..., \(X_n \in \mathtt {J}^\square \) such that \(\mathfrak {R}_{\hbox {Ad}_A}(X_1, \ldots , X_n)\) is the whole \(\mathtt {J}^\square \) (and, in particular, is transitive in \(\mathtt {J}^\square \)). So \({{\mathrm{rig }}}(\hbox {Ad}_A|\mathtt {J}^\square ) \leqslant n\), which proves the lemma for non-equatorial j-rectangles.

If \(\mathtt {J}\) is an equatorial j-rectangle then, by Lemma 4.4, \(\mathtt {J}^\square \) splits invariantly into two subspaces, one of them spanned by the identity matrix on \(\mathtt {J}^\square \). So we can choose the matrices \(X_i\) above so that \(X_1\) is the identity. This shows that \({{\mathrm{rig }}}_+ (\hbox {Ad}_A|\mathtt {J}^\square ) \leqslant n\).

In all that follows, we adopt the convention \(\max \varnothing = 0\).

Lemma 4.6

For any e-rectangle \(\mathtt {E}\),
$$\begin{aligned} {{\mathrm{rig }}}_+ (\hbox {Ad}_A | \mathtt {E}^\square ) \leqslant \sum \limits _{\ell \,\mathrm{latitude}} \mathop {\mathop {\max }\limits _{\mathtt {J}\subset \mathtt {E}\, \mathrm{is}\,\mathrm{a}\,\mathrm{j}\hbox {-}\mathrm{rectangle}}}\limits _{\mathrm{with}\,\mathrm{latitude} \ell } {{\mathrm{rig }}}_+ (\hbox {Ad}_A | \mathtt {J}^\square ) . \end{aligned}$$


For each j-rectangle \(\mathtt {J}\) contained in \(\mathtt {E}\), let \(r(\mathtt {J}) = {{\mathrm{rig }}}_+ (\hbox {Ad}_A|\mathtt {J}^\square )\). Take matrices \(X_{\mathtt {J}, 1}\), ..., \(X_{\mathtt {J}, r(\mathtt {J})}\) such that \(\Lambda _\mathtt {J}:= \mathfrak {R}_{\hbox {Ad}_A} \big ( X_{\mathtt {J}, 1}, \ldots , X_{\mathtt {J}, r(\mathtt {J})} \big )\) is a transitive subspace of \(\mathtt {J}^\square \), and \(X_{\mathtt {J},1}\) is the identity matrix in \(\mathtt {J}^\square \) if \(\mathtt {J}\) is an equatorial \(\mathrm{j}\)-rectangle. Define \(X_{\mathtt {J}, i} = 0\) for \(i>r(\mathtt {J})\). For each latitude \(\ell \), let \(n_\ell \) be the maximum of \(r(\mathtt {J})\) over the j-rectangles \(\mathtt {J}\) of \(\mathtt {E}\) with latitude \(\ell \), and let
$$\begin{aligned} Y_{\ell , i} = \mathop {\mathop \sum \limits _{\mathtt {J}\subset \mathtt {E}\,\mathrm{is}\,\mathrm{a}\,\mathrm{j}\hbox {-}\mathrm{rectangle}}}\limits _{\mathrm{with}\,\mathrm{latitude}\,\ell } X_{\mathtt {J}, i} , \quad \hbox {for }1 \leqslant i \leqslant n_\ell . \end{aligned}$$
Notice that if \(\mathtt {E}\) is an equatorial e-rectangle then \(Y_{0,1}\) is the identity matrix in \(\mathtt {E}^\square \). Consider the space
$$\begin{aligned} \Delta = \mathfrak {R}_{\hbox {Ad}_A} \big \{Y_{\ell ,i} ; \; \ell \hbox { is a latitude, } 1 \leqslant i \leqslant n_\ell \big \}. \end{aligned}$$
We claim that for every j-rectangle \(\mathtt {J}\) in \(\mathtt {E}\) and for every \(M \in \Lambda _\mathtt {J}\), we can find some \(N \in \Delta \) with the following properties:
  • the submatrix \(N_\mathtt {J}\) equals \(M\);

  • for every j-rectangle \(\mathtt {J}'\) in \(\mathtt {E}\) that has a different latitude than \(\mathtt {J}\), the submatrix \(N_{\mathtt {J}'}\) vanishes.

Indeed, if \(M = \sum _{i=1}^{r(\mathtt {J})} f_i(\hbox {Ad}_A) X_{\mathtt {J},i}\) for certain polynomials \(f_i\), we simply take \(N = \sum _{i=1}^{r(\mathtt {J})} f_i(\hbox {Ad}_A) Y_{\ell ,i}\), where \(\ell \) is the latitude of \(\mathtt {J}\).

In notation (3.4), the claim we have just proved means that \(\Delta ^{[\mathtt {J}]} \supset \Lambda _\mathtt {J}\). So we can apply Lemma 3.10 and conclude that \(\Delta \) is a transitive subspace of \(\mathtt {E}^\square \). Therefore, \({{\mathrm{rig }}}_+ (\hbox {Ad}_A | \mathtt {E}^\square ) \leqslant \sum n_\ell \), as we wanted to show.

Example 4.7

Using Lemmas 4.5 and 4.6, we see that the e-rectangle \(\mathtt {E}\) whose j-rectangle weights are indicated in Fig. 1 has \({{\mathrm{rig }}}_+ (\hbox {Ad}_A | \mathtt {E}^\square ) \leqslant 5\).

In fact, we will not use Lemmas 4.5 and 4.6 directly, but only the following immediate consequence:

Lemma 4.8

For every e-rectangle \(\mathtt {E}\) we have \({{\mathrm{rig }}}_+ (\hbox {Ad}_A | \mathtt {E}^\square ) \leqslant {{\mathrm{wgt }}}\mathtt {E}\). The inequality is strict if \(\mathtt {E}\) has more than one row of j-rectangles and more than one column of j-rectangles.

4.5 Comparison of weights

If \(\mathtt {R}\) is a j-rectangle, e-rectangle or c-rectangle, we define its row projection \(\pi _{\mathrm{r}}(\mathtt {R})\) as the unique equatorial j-rectangle, e-rectangle or c-rectangle (respectively) that is in the same row as \(\mathtt {R}\). Analogously, we define the column projection \(\pi _{\mathrm{c}}(\mathtt {R})\).

Lemma 4.9

For any e-rectangle \(\mathtt {E}\), we have
$$\begin{aligned} {{\mathrm{wgt }}}\mathtt {E}\leqslant \frac{{{\mathrm{wgt }}}\pi _{\mathrm{r}}(\mathtt {E}) + {{\mathrm{wgt }}}\pi _{\mathrm{c}}(\mathtt {E})}{2} . \end{aligned}$$
Moreover, if equality holds then the number of rows of j-rectangles for \(\mathtt {E}\) equals the number of columns of j-rectangles.

This is a clear consequence of the abstract lemma below, taking \(x_\alpha \), \(\alpha \in F_0\) (resp. \(\alpha \in F_1\)) as the sequence of heights (resp. widths) of j-rectangles in \(\mathtt {E}\), counting repetitions.

Lemma 4.10

Let \(F\) be a nonempty finite set, and let \(x_\alpha \) be positive numbers indexed by \(\alpha \in F\). Take any partition \(F = F_0 \sqcup F_1\), where \(\sqcup \) stands for disjoint union. For \(\epsilon \), \(\delta \in \{0,1\}\), let
$$\begin{aligned} \Sigma _{\epsilon \delta } = \sum \limits _{(\alpha ,\beta ) \in F_\epsilon \times F_\delta } \min (x_\alpha , x_\beta ) . \end{aligned}$$
Then
$$\begin{aligned} \Sigma _{01} = \Sigma _{10} \leqslant \frac{\Sigma _{00} + \Sigma _{11}}{2} . \end{aligned}$$
Moreover, equality implies that \(F_0\) and \(F_1\) have the same cardinality.


We will in fact prove the stronger fact:
$$\begin{aligned} \Sigma _{00} - 2\Sigma _{01} + \Sigma _{11} \geqslant \big (|F_0| -|F_1|\big )^2 \min \limits _{\alpha \in F} x_\alpha , \end{aligned}$$
where \(|\mathord {\cdot }|\) denotes set cardinality. The proof is by induction on \(|F|\). It clearly holds for \(|F|=1\). Fix some \(n\) and assume that (4.4) always holds when \(|F| = n\). Take a set \(F\) with \(|F|=n+1\), and take positive numbers \(x_\alpha \), \(\alpha \in F\). We can assume that \(F=\{1, \ldots , n+1\}\) and that \(x_1 \geqslant \cdots \geqslant x_{n+1}\). Take any partition \(F = F_0 \sqcup F_1\). Without loss of generality, assume that \(n+1 \in F_0\). Apply the induction hypothesis to \(F'=\{1, \ldots , n\}\), obtaining
$$\begin{aligned} \Sigma '_{00} - 2\Sigma '_{01} + \Sigma '_{11} \geqslant \big (|F_0| - 1 - |F_1|\big )^2 x_n. \end{aligned}$$
We have
$$\begin{aligned} \Sigma _{00} = \Sigma _{00}' + \big ( 2|F_0| -1 \big ) x_{n+1} , \quad \Sigma _{01} = \Sigma '_{01} + |F_1| x_{n+1} , \quad \hbox {and} \quad \Sigma _{11} = \Sigma '_{11} , \end{aligned}$$
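Writing \(k = |F_0| - |F_1|\) and using \(x_n \geqslant x_{n+1}\), we obtain \(\Sigma _{00} - 2\Sigma _{01} + \Sigma _{11} \geqslant (k-1)^2 x_{n+1} + (2k-1) x_{n+1} = k^2 x_{n+1}\), so (4.4) follows.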
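As a sanity check, inequality (4.4) can be verified exhaustively on small inputs. The sketch below is only an illustration with hypothetical function names. It enumerates all partitions \(F = F_0 \sqcup F_1\) and all value assignments from a small set of positive integers.

```python
from itertools import product

def sigma(xs, ys):
    """Sum of min(a, b) over all pairs (a, b) in xs x ys."""
    return sum(min(a, b) for a in xs for b in ys)

def gap_and_bound(x, labels):
    """Return (Sigma_00 - 2*Sigma_01 + Sigma_11, (|F_0| - |F_1|)^2 * min x).

    x lists the positive numbers x_alpha; labels[i] in {0, 1} says whether
    index i belongs to F_0 or F_1.
    """
    f0 = [v for v, e in zip(x, labels) if e == 0]
    f1 = [v for v, e in zip(x, labels) if e == 1]
    gap = sigma(f0, f0) - 2 * sigma(f0, f1) + sigma(f1, f1)
    bound = (len(f0) - len(f1)) ** 2 * min(x)
    return gap, bound

def check_all(max_size=5, values=(1, 2, 3)):
    """Check gap >= bound for every x in values^n and every partition, n <= max_size."""
    for n in range(1, max_size + 1):
        for x in product(values, repeat=n):
            for labels in product((0, 1), repeat=n):
                gap, bound = gap_and_bound(x, labels)
                if gap < bound:
                    return False
    return True
```

In the setting of Lemma 4.9, taking heights \((2, 1)\) as \(F_0\) and a single width \(3\) as \(F_1\) gives \(\Sigma _{00} = 5\), \(\Sigma _{01} = 3\), \(\Sigma _{11} = 3\), so the gap is \(2\) and the lower bound is \(1\).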

If \(\mathtt {R}\) is a c-rectangle or the entire square \([1,d]^2\), let \({{\mathrm{wgt }}}_1 \mathtt {R}\) denote the sum of the weights of the e-rectangles in \(\mathtt {R}\) with banner 1.

Let us give the following useful consequence of Lemma 4.9:

Lemma 4.11

\({{\mathrm{acyc }}}\hbox {Ad}_A = {{\mathrm{wgt }}}_1 [1,d]^2\).


By Proposition 3.2, \({{\mathrm{acyc }}}\hbox {Ad}_A\) is the maximum of the geometric multiplicities of the eigenvalues of \(\hbox {Ad}_A\). Those eigenvalues are the banners \(\beta \), and the geometric multiplicity of each \(\beta \) is the total weight with banner \(\beta \). Thus, to prove the lemma we have to show that banner 1 has the largest total weight.

Let \(\beta \) be a banner. Then, using Lemma 4.9,
$$\begin{aligned} \mathop {\mathop {\sum }\limits _{\mathtt {E}\,\mathrm{is}\,\mathrm{an}\,\mathrm{e}\hbox {-}\mathrm{rectangle}}}\limits _{\mathrm{with}\,\mathrm{banner}\,\beta } {{\mathrm{wgt }}}\mathtt {E}\leqslant \frac{1}{2} \mathop {\mathop {\sum }\limits _{\mathtt {E}\,\mathrm{is}\,\mathrm{an}\,\mathrm{e}\hbox {-}\mathrm{rectangle}}}\limits _{\mathrm{with}\,\mathrm{banner}\,\beta } {{\mathrm{wgt }}}\pi _{\mathrm{r}}(\mathtt {E}) + \frac{1}{2} \mathop {\mathop {\sum }\limits _{\mathtt {E}\,{\mathrm{is}\,\mathrm{an}\,\mathrm{e}\hbox {-}\mathrm{rectangle}}}}\limits _{\mathrm{with}\,\mathrm{banner}\,\beta } {{\mathrm{wgt }}}\pi _{\mathrm{c}}(\mathtt {E}) . \end{aligned}$$
Since no two e-rectangles in the same row (resp. column) can have the same banner, the restriction of \(\pi _{\mathrm{r}}\) (resp. \(\pi _{\mathrm{c}}\)) to the set of e-rectangles with banner \(\beta \) is a one-to-one map. This allows us to conclude.

Remark 4.12

The Jordan type of a matrix \(A \in \hbox {Mat}_{d \times d}(\mathbb {C})\) consists of the following data:
  1. The number of different eigenvalues.

  2. For each eigenvalue, the number of Jordan blocks and their sizes.

It follows from Lemma 4.11 that these data are sufficient to determine \({{\mathrm{acyc }}}\hbox {Ad}_A\).
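To illustrate the remark, here is a hedged sketch that computes \({{\mathrm{acyc }}}\hbox {Ad}_A\) from a Jordan type and cross-checks it against a direct computation. It assumes that the weight of a \(t \times s\) j-rectangle is \(\min (t,s)\) and that banner 1 corresponds to equal row and column eigenvalues, so that \({{\mathrm{wgt }}}_1 [1,d]^2\) is the sum of \(\min (n_i, n_j)\) over pairs of Jordan blocks with the same eigenvalue. The cross-check uses the fact that the eigenspace of \(\hbox {Ad}_A\) for the eigenvalue 1 is the commutant \(\{X ; \; AX = XA\}\). Function names are hypothetical.

```python
from fractions import Fraction

def acyc_from_jordan_type(jordan_type):
    """jordan_type: one list of Jordan block sizes per distinct eigenvalue."""
    return sum(min(a, b) for sizes in jordan_type for a in sizes for b in sizes)

def block_diag_jordan(eigenvalues, jordan_type):
    """Block-diagonal matrix of Jordan blocks (ones on the superdiagonal; assumed convention)."""
    d = sum(sum(sizes) for sizes in jordan_type)
    A = [[Fraction(0)] * d for _ in range(d)]
    pos = 0
    for lam, sizes in zip(eigenvalues, jordan_type):
        for n in sizes:
            for i in range(n):
                A[pos + i][pos + i] = Fraction(lam)
                if i + 1 < n:
                    A[pos + i][pos + i + 1] = Fraction(1)
            pos += n
    return A

def rank(M):
    """Rank by exact Gaussian elimination."""
    M = [row[:] for row in M]
    rk = 0
    for c in range(len(M[0])):
        p = next((r for r in range(rk, len(M)) if M[r][c] != 0), None)
        if p is None:
            continue
        M[rk], M[p] = M[p], M[rk]
        for r in range(rk + 1, len(M)):
            f = M[r][c] / M[rk][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[rk])]
        rk += 1
    return rk

def commutant_dimension(A):
    """dim {X : AX = XA}, i.e. the geometric multiplicity of the eigenvalue 1 of Ad_A."""
    d = len(A)
    index = [(i, j) for i in range(d) for j in range(d)]
    rows = [[(A[i][k] if j == l else Fraction(0)) - (A[l][j] if i == k else Fraction(0))
             for (k, l) in index] for (i, j) in index]
    return d * d - rank(rows)
```

For the Jordan type with block sizes \((2,1)\) for one eigenvalue and \((3)\) for another, the formula gives \((2+1+1+1) + 3 = 8\); the eigenvalues themselves (here chosen as 2 and 5 for the cross-check) do not enter the formula.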

4.6 Rigidity estimate for c-rectangles

Lemma 4.13

For any c-rectangle \(\mathtt {C}\),
$$\begin{aligned} {{\mathrm{rig }}}_+ (\hbox {Ad}_A | \mathtt {C}^\square ) \leqslant \frac{{{\mathrm{wgt }}}_1 \pi _{\mathrm{r}}(\mathtt {C}) + {{\mathrm{wgt }}}_1 \pi _{\mathrm{c}}(\mathtt {C})}{2} . \end{aligned}$$

In order to prove this lemma, it is convenient to consider separately the cases of non-equatorial and equatorial c-rectangles.

Proof of Lemma 4.13

when \(\mathtt {C}\) is non-equatorial. For each banner \(\beta \) in \(\mathtt {C}\), let \(n_\beta \) (resp. \(s_\beta \)) be the maximum of \({{\mathrm{rig }}}_+(\hbox {Ad}_A | \mathtt {E}^\square )\) over nonnegative (resp. negative) argument e-rectangles \(\mathtt {E}\) in \(\mathtt {C}\) with banner \(\beta \). For each e-rectangle \(\mathtt {E}\) with banner \(\beta \), choose matrices \(X_{\mathtt {E},1}\), ..., \(X_{\mathtt {E}, n_\beta + s_\beta } \in \mathtt {E}^\square \) such that:
  • \(\Lambda _\mathtt {E}:= \mathfrak {R}_{\hbox {Ad}_A} ( X_{\mathtt {E},1}, \ldots , X_{\mathtt {E}, n_\beta + s_\beta })\) is a transitive subspace of \(\mathtt {E}^\square \);

  • if \(\mathtt {E}\) has negative argument then \(X_{\mathtt {E},1} = X_{\mathtt {E},2} = \cdots = X_{\mathtt {E},n_\beta } = 0\);

  • if \(\mathtt {E}\) has nonnegative argument then \(X_{\mathtt {E},n_\beta +1} = \cdots = X_{\mathtt {E},n_\beta +s_\beta } = 0\).

Also, let \(X_{\mathtt {E}, j} = 0\) for \(j>n_\beta +s_\beta \).
Next, define
$$\begin{aligned} Y_{\beta , j} = \mathop {\mathop {\sum }\limits _{\mathtt {E}\,{\mathrm{is}\,\mathrm{an}\,\mathrm{e}\hbox {-}\mathrm{rectangle}}}}\limits _{\mathrm{of}\,\mathtt {C}\,\mathrm{with}\,\mathrm{banner}\,\beta } X_{\mathtt {E}, j} \end{aligned}$$
and
$$\begin{aligned} Z_j = \sum \limits _{\beta \hbox { banner on }\mathtt {C}} Y_{\beta , j} . \end{aligned}$$
Consider the space
$$\begin{aligned} \Delta = \mathfrak {R}_{\hbox {Ad}_A} (Z_1, \ldots , Z_m), \quad \hbox {where} \quad m = \max _{\beta \hbox { banner on }\mathtt {C}} (n_\beta + s_\beta ) \end{aligned}$$
It follows from Lemma 3.1 that
$$\begin{aligned} \Delta = \mathfrak {R}_{\hbox {Ad}_A} \big \{Y_{\beta ,j} ; \; \beta \hbox { is a banner, } 1 \leqslant j \leqslant n_\beta + s_\beta \big \}. \end{aligned}$$
Recall notation (3.4). We claim that
$$\begin{aligned} \Lambda _\mathtt {E}\subset \Delta ^{[\mathtt {E}]}. \end{aligned}$$
Indeed, given \(M \in \Lambda _{\mathtt {E}}\), write \(M = \sum _j f_j(\hbox {Ad}_A) X_{\mathtt {E}, j}\), where the \(f_j\)’s are polynomials and \(f_j \equiv 0\) whenever \(X_{\mathtt {E},j}=0\). Consider \(N = \sum _j f_j(\hbox {Ad}_A) Y_{\beta , j}\), where \(\beta \) is the banner of \(\mathtt {E}\). Then it follows from Lemma 4.2 (part 2) that \(N \in \Delta ^{[\mathtt {E}]}\). This shows (4.7). So, by Lemma 3.10, \(\Delta \) is a transitive subspace of \(\mathtt {C}^\square \), showing that \({{\mathrm{rig }}}_+ (\hbox {Ad}_A | \mathtt {C}^\square ) \leqslant m\).
To complete the proof of the lemma in the non-equatorial case, we show that
$$\begin{aligned} m \leqslant \frac{{{\mathrm{wgt }}}_1 \pi _{\mathrm{r}}(\mathtt {C}) + {{\mathrm{wgt }}}_1 \pi _{\mathrm{c}}(\mathtt {C})}{2} . \end{aligned}$$
Let \(\beta \) be the banner for which \(n_\beta + s_\beta \) attains the maximum \(m\). If \(n_\beta >0\), let \(\mathtt {E}_+\) be a nonnegative argument e-rectangle in \(\mathtt {C}\) with banner \(\beta \) and \({{\mathrm{rig }}}_+(\hbox {Ad}_A | \mathtt {E}_+^\square ) = n_\beta \). If \(s_\beta >0\), let \(\mathtt {E}_-\) be a negative argument e-rectangle in \(\mathtt {C}\) with banner \(\beta \) and \({{\mathrm{rig }}}_+(\hbox {Ad}_A | \mathtt {E}_-^\square ) = s_\beta \). Assume for the moment that both e-rectangles exist. Let \(\mathtt {E}_1\), \(\mathtt {E}_2\), \(\mathtt {E}_3\), \(\mathtt {E}_4\) be projected equatorial e-rectangles as in Fig. 3.
Fig. 3

The case of non-equatorial c-rectangles: \(\mathtt {E}_1 = \pi _{\mathrm{r}}(\mathtt {E}_+)\), \(\mathtt {E}_2 = \pi _{\mathrm{r}}(\mathtt {E}_-)\), \(\mathtt {E}_3 = \pi _{\mathrm{c}}(\mathtt {E}_-)\), \(\mathtt {E}_4 = \pi _{\mathrm{c}}(\mathtt {E}_+)\)

Then
$$\begin{aligned} m&= {{\mathrm{rig }}}_+ (\hbox {Ad}_A | \mathtt {E}_+^\square ) + {{\mathrm{rig }}}_+ (\hbox {Ad}_A | \mathtt {E}_-^\square )\mathop {\leqslant }\limits ^{\hbox {(i)}} {{\mathrm{wgt }}}\mathtt {E}_+ + {{\mathrm{wgt }}}\mathtt {E}_- \\&\mathop {\leqslant }\limits ^{\hbox {(ii)}} \tfrac{1}{2}\big ( {{\mathrm{wgt }}}\mathtt {E}_1 + \cdots + {{\mathrm{wgt }}}\mathtt {E}_4 \big )\leqslant \tfrac{1}{2}\big ({{\mathrm{wgt }}}_1 \pi _{\mathrm{r}}(\mathtt {C}) + {{\mathrm{wgt }}}_1 \pi _{\mathrm{c}}(\mathtt {C}) \big ), \end{aligned}$$
where (i) and (ii) follow, respectively, from Lemmas 4.8 and 4.9. This proves (4.8) in this case. If there is no nonnegative argument e-rectangle or no negative argument e-rectangle within \(\mathtt {C}\) with banner \(\beta \), then the proof of (4.8) is easier.

So the lemma is proved for non-equatorial \(\mathtt {C}\).

We now consider equatorial c-rectangles. There is a special kind of c-rectangle for which the proof of the rigidity estimate has to follow a different strategy. A c-rectangle is called exceptional if it has only the banners 1 and \(-1\) (so it is equatorial and has 4 e-rectangles), each e-rectangle has a single j-rectangle, and all j-rectangles have the same weight.

Proof of Lemma 4.13

when \(\mathtt {C}\) is equatorial non-exceptional. As in the previous case, let \(n_\beta \) (resp. \(s_\beta \)) be the maximum of \({{\mathrm{rig }}}_+(\hbox {Ad}_A|\mathtt {E}^\square )\) over the nonnegative (resp. negative) argument e-rectangles \(\mathtt {E}\) in \(\mathtt {C}\) with banner \(\beta \).

We claim that
$$\begin{aligned} n_\beta + s_\beta < {{\mathrm{wgt }}}_1 \mathtt {C}\quad \hbox {for all banners }\beta \ne 1\hbox { in }\mathtt {C}. \end{aligned}$$
Let us postpone the proof of this inequality and see how to conclude.
Let \(M={{\mathrm{wgt }}}_1 \mathtt {C}\). In view of Lemma 4.8 and relation (4.9), for each e-rectangle \(\mathtt {E}\) in \(\mathtt {C}\) we can take matrices \(X_{\mathtt {E}, 1}\), ..., \(X_{\mathtt {E}, M} \in \mathtt {E}^\square \) such that:
  • \(\Lambda _\mathtt {E}:= \mathfrak {R}_{\hbox {Ad}_A} (X_{\mathtt {E}, 1}, \ldots , X_{\mathtt {E}, M})\) is a transitive subspace of \(\mathtt {E}^\square \);

  • \(X_{\mathtt {E}, M} = 0\) if \(\mathtt {E}\) is non-equatorial;

  • \(X_{\mathtt {E}, M}\) is the identity in \(\mathtt {E}^\square \) if \(\mathtt {E}\) is equatorial.

Then define matrices \(Z_j\) as before: by (4.5) and (4.6). Here we have that \(Z_M\) is the identity matrix in \(\mathtt {C}^\square \). As before, \(\mathfrak {R}_{\hbox {Ad}_A}(Z_1, \ldots , Z_M)\) is a transitive subspace of \(\mathtt {C}^\square \). Hence, \({{\mathrm{rig }}}_+(\hbox {Ad}_A|\mathtt {C}^\square ) \leqslant M = {{\mathrm{wgt }}}_1 \mathtt {C}\), as desired.
Now let us prove (4.9). Consider a banner \(\beta \ne 1\) in \(\mathtt {C}\). Let \(\mathtt {E}_+\) (resp. \(\mathtt {E}_-\)) be a nonnegative (resp. negative) argument e-rectangle within \(\mathtt {C}\) with banner \(\beta \) and of maximal weight; assume for the moment that both e-rectangles exist. Let \(\mathtt {E}_1\), \(\mathtt {E}_2\), \(\mathtt {E}_3\), \(\mathtt {E}_4\) be projected equatorial e-rectangles as in Fig. 4. Then
$$\begin{aligned} n_\beta + s_\beta&= {{\mathrm{rig }}}_+(\hbox {Ad}_A | \mathtt {E}_+^\square ) + {{\mathrm{rig }}}_+(\hbox {Ad}_A | \mathtt {E}_-^\square ) \\&\mathop {\leqslant }\limits ^{\hbox {(4.10)}} {{\mathrm{wgt }}}\mathtt {E}_+ + {{\mathrm{wgt }}}\mathtt {E}_- \\&\mathop {\leqslant }\limits ^{\hbox {(4.11)}} \tfrac{1}{2}\big ( {{\mathrm{wgt }}}\mathtt {E}_1 + \cdots + {{\mathrm{wgt }}}\mathtt {E}_4 \big ) \\&\mathop {\leqslant }\limits ^{\hbox {(4.12)}} {{\mathrm{wgt }}}_1 \mathtt {C}. \end{aligned}$$
Inequality (4.10) follows from Lemma 4.8, inequality (4.11) follows from Lemma 4.9, and inequality (4.12) holds because the e-rectangles \(\mathtt {E}_1\), ..., \(\mathtt {E}_4\) are equatorial, and any e-rectangle can appear at most twice in this list. So
$$\begin{aligned} n_\beta + s_\beta \leqslant {{\mathrm{wgt }}}_1 \mathtt {C}. \end{aligned}$$
In the case that there is no nonnegative argument e-rectangle or no negative argument e-rectangle with banner \(\beta \) (i.e., \(n_\beta \) or \(s_\beta \) vanishes), a simpler argument shows that strict inequality holds in (4.13).

Now assume by contradiction that (4.9) does not hold. Then we must have equality in (4.13). By what we have just seen, both e-rectangles \(\mathtt {E}_+\) and \(\mathtt {E}_-\) above exist. Then the inequalities in (4.10)–(4.12) become equalities. Since (4.12) is an equality, there must be exactly two equatorial e-rectangles in \(\mathtt {C}\). So the non-equatorial banner \(\beta \) satisfies \(\beta ^{-1} = \beta \), that is, \(\beta =-1\). Since (4.11) is an equality, it follows from Lemma 4.9 that both non-equatorial e-rectangles have the same number of j-rectangles in each column and each row. So there is some \(\ell \) such that all four e-rectangles in \(\mathtt {C}\) have \(\ell \) rows of j-rectangles and \(\ell \) columns of j-rectangles. Since (4.10) is an equality, Lemma 4.8 implies that \(\ell =1\). That is, \(\mathtt {C}\) is an exceptional c-rectangle, a situation which we excluded a priori. This contradiction proves (4.9) and Lemma 4.13 in the present case. \(\square \)

Fig. 4

The case of equatorial non-exceptional c-rectangles: \(\mathtt {E}_1 = \pi _{\mathrm{c}}(\mathtt {E}_-)\), \(\mathtt {E}_2 = \pi _{\mathrm{r}}(\mathtt {E}_+)\), \(\mathtt {E}_3 = \pi _{\mathrm{c}}(\mathtt {E}_+)\), \(\mathtt {E}_4 = \pi _{\mathrm{r}}(\mathtt {E}_-)\). It is possible that \(\mathtt {E}_1 = \mathtt {E}_2\) or \(\mathtt {E}_3 = \mathtt {E}_4\)

Let us now deal with exceptional c-rectangles. In all the previous cases, the transitive subspace we found had some vaguely Toeplitz form. For exceptional c-rectangles, however, this strategy is not efficient. What we are going to do is to find a transitive space of vaguely Hankel form, namely the following:
$$\begin{aligned} \Lambda _k = \left\{ \begin{pmatrix} P &{} M \\ M &{} N \end{pmatrix}; \; M, N, P\hbox { are }k\times k\hbox { matrices} \right\} . \end{aligned}$$
Notice that \(\Lambda _k = S_k \cdot \Gamma _k\), where
$$\begin{aligned} S_k = \begin{pmatrix} 0 &{} \hbox {Id}\\ \hbox {Id}&{} 0 \end{pmatrix} \quad \hbox {and} \quad \Gamma _k = \left\{ \begin{pmatrix} M &{} N \\ P &{} M \end{pmatrix}; \; M, N, P\hbox { are }k\times k\hbox { matrices} \right\} . \end{aligned}$$
Since \(\Gamma _k\) is a generalized Toeplitz space, it follows from Remark 2.3 that \(\Lambda _k\) is transitive.
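The identity \(\Lambda _k = S_k \cdot \Gamma _k\) amounts to a row-block swap, which the following small sketch makes explicit. It only checks the block identity on concrete matrices; it does not test transitivity. The helper names are hypothetical.

```python
def block(a, b, c, d):
    """Assemble the 2k x 2k matrix [[a, b], [c, d]] from k x k blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def swap_matrix(k):
    """The matrix S_k = [[0, Id], [Id, 0]]."""
    zero = [[0] * k for _ in range(k)]
    ident = [[int(i == j) for j in range(k)] for i in range(k)]
    return block(zero, ident, ident, zero)

def hankel_from_toeplitz(M, N, P):
    """Left-multiply the element [[M, N], [P, M]] of Gamma_k by S_k."""
    k = len(M)
    return matmul(swap_matrix(k), block(M, N, P, M))
```

Left multiplication by \(S_k\) exchanges the two block rows, so the element of \(\Gamma _k\) with blocks \(M, N, P\) is sent to the element of \(\Lambda _k\) with the same blocks.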

Proof of Lemma 4.13

when \(\mathtt {C}\) is exceptional. If \(\mathtt {C}\) is exceptional then it has size \(2k \times 2k\) for some \(k\), and the operator \(\hbox {Ad}_A | \mathtt {C}^\square \) is given by \(X \mapsto \hbox {Ad}_L(X)\), where
$$\begin{aligned} L = \begin{pmatrix} J &{} 0 \\ 0 &{} -J \end{pmatrix}, \quad \hbox {and }J = J_k(1)\hbox { is the Jordan block (4.3).} \end{aligned}$$
Let \(V\) be the unique \(\hbox {Ad}_J\)-invariant subspace of \(\hbox {Mat}_{k \times k}(\mathbb {C})\) that has codimension 1 and does not contain the identity matrix (which exists by Lemma 4.4). Take matrices \(X_1\), ..., \(X_k \in \hbox {Mat}_{k \times k}(\mathbb {C})\) such that \(X_1 = \hbox {Id}\) and \(V = \mathfrak {R}_{\hbox {Ad}_J} (X_2, \ldots , X_k)\). Define \(Y_1\), ..., \(Y_k \in \hbox {Mat}_{2k \times 2k} (\mathbb {C})\) by
$$\begin{aligned} Y_1 = \begin{pmatrix} \hbox {Id}&{} 0 \\ 0 &{} \hbox {Id}\end{pmatrix}, \quad Y_j = \begin{pmatrix} X_j &{} 0 \\ 0 &{} 0 \end{pmatrix}\quad \hbox { for }2 \leqslant j \leqslant k, \end{aligned}$$
so that
$$\begin{aligned} \mathfrak {R}_{\hbox {Ad}_L}(Y_1,\ldots ,Y_k) = \left\{ \begin{pmatrix} x\hbox {Id}+ Z &{} 0 \\ 0 &{} x \hbox {Id}\end{pmatrix} ; \; x \in \mathbb {C}, \ Z \in V \right\} . \end{aligned}$$
For \(j=k+1\), ..., \(2k\), define
$$\begin{aligned} Y_j = \begin{pmatrix} 0 &{} X_{j-k} \\ X_{j-k} &{} X_{j-k} \end{pmatrix}. \end{aligned}$$
Then, by Lemma 3.1,
$$\begin{aligned} \mathfrak {R}_{\hbox {Ad}_L}(Y_{k+1},\ldots ,Y_{2k}) = \left\{ \begin{pmatrix} 0 &{} M \\ M &{} N \end{pmatrix} ; \; M, \ N \in \hbox {Mat}_{k \times k}(\mathbb {C}) \right\} . \end{aligned}$$
Therefore, \(\mathfrak {R}_{\hbox {Ad}_L}(Y_1,\ldots ,Y_{2k})\) is the transitive space given by (4.14). Since \(Y_1\) is the identity on \(\mathtt {C}\), this shows that \({{\mathrm{rig }}}_+(\hbox {Ad}_A|\mathtt {C}^\square ) \leqslant 2k = {{\mathrm{wgt }}}_1 \mathtt {C}\), concluding the proof of Lemma 4.13.

4.7 The final rigidity estimate

Let \(c = c(A)\) be the number of equivalence classes mod \(T\) of eigenvalues of \(A\).

Lemma 4.14

If \(c<d\) then
$$\begin{aligned} {{\mathrm{rig }}}_+ \hbox {Ad}_A \leqslant {{\mathrm{wgt }}}_1 [1,d]^2 - c + 1 . \end{aligned}$$


Let \(m = {{\mathrm{wgt }}}_1 [1,d]^2 - c + 1\). For each c-rectangle \(\mathtt {C}\), let
$$\begin{aligned} r(\mathtt {C}) = \big \lfloor \tfrac{1}{2} ({{\mathrm{wgt }}}_1 \pi _{\mathrm{r}}(\mathtt {C}) + {{\mathrm{wgt }}}_1 \pi _{\mathrm{c}}(\mathtt {C})) \big \rfloor . \end{aligned}$$
We claim that
$$\begin{aligned} r(\mathtt {C}) \leqslant {\left\{ \begin{array}{ll} m &{}\hbox {if }\mathtt {C}\hbox { is an equatorial c-rectangle,} \\ m-1 &{}\hbox {if }\mathtt {C}\hbox { is a non-equatorial c-rectangle.} \end{array}\right. } \end{aligned}$$
Let us postpone the proof of this inequality and see how it implies the lemma.
In view of Lemma 4.13 and relation (4.15), for each c-rectangle \(\mathtt {C}\) we can take matrices \(X_{\mathtt {C}, 1}\), ..., \(X_{\mathtt {C}, m} \in \mathtt {C}^\square \) such that:
  • \(\Lambda _\mathtt {C}:= \mathfrak {R}_{\hbox {Ad}_A} (X_{\mathtt {C}, 1}, \ldots , X_{\mathtt {C}, m})\) is a transitive subspace of \(\mathtt {C}^\square \);

  • \(X_{\mathtt {C}, m} = 0\) if \(\mathtt {C}\) is non-equatorial;

  • \(X_{\mathtt {C}, m}\) is the identity in \(\mathtt {C}^\square \) if \(\mathtt {C}\) is equatorial.

Define matrices:
$$\begin{aligned} Y_{\alpha , j}&= \mathop {\mathop {\sum }\limits _{\mathtt {C}\,\mathrm{is}\,\mathrm{a}\,\mathrm{c}\hbox {-}\mathrm{rectangle}}}\limits _{\mathrm{with}\,\mathrm{banner}\,\mathrm{class}\,\alpha } X_{\mathtt {C}, j} \qquad (\alpha \,\hbox {is a banner class}, \ 1 \leqslant j \leqslant m), \\ Z_j&= \sum _{\alpha \,\mathrm{is}\,\mathrm{a}\,\mathrm{banner}\,\mathrm{class}} Y_{\alpha , j} \qquad (1 \leqslant j \leqslant m). \end{aligned}$$
So \(Z_m\) is the \(d \times d\) identity matrix. Consider the space
$$\begin{aligned} \Delta = \mathfrak {R}_{\hbox {Ad}_A} (Z_1, \ldots , Z_m). \end{aligned}$$
It follows from Lemma 3.1 that
$$\begin{aligned} \Delta = \mathfrak {R}_{\hbox {Ad}_A} \big \{ Y_{\alpha ,j} ; \; \alpha \hbox { is a banner class, } 1 \leqslant j \leqslant m \big \}. \end{aligned}$$
We claim that for every c-rectangle \(\mathtt {C}\),
$$\begin{aligned} \Lambda _\mathtt {C}\subset \Delta ^{[\mathtt {C}]}. \end{aligned}$$
Indeed, if \(M \in \Lambda _\mathtt {C}\) then we can write \(M = \sum _j f_j(\hbox {Ad}_A) X_{\mathtt {C},j}\), where the \(f_j\)’s are polynomials. Consider \(N = \sum _j f_j(\hbox {Ad}_A) Y_{\alpha ,j}\), where \(\alpha \) is the banner class of \(\mathtt {C}\). It follows from Lemma 4.2 (part 1) that \(N \in \Delta ^{[\mathtt {C}]}\). This proves (4.16). So, by Lemma 3.10, \(\Delta \) is a transitive subspace of \(\hbox {Mat}_{d \times d}(\mathbb {C})\), showing that \({{\mathrm{rig }}}_+ \hbox {Ad}_A \leqslant m\).

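To conclude the proof, we have to show estimate (4.15). First consider an equatorial c-rectangle \(\mathtt {C}\), so that \(r(\mathtt {C}) = {{\mathrm{wgt }}}_1 \mathtt {C}\). Since there are \(c\) equatorial c-rectangles, and each of them has a nonzero \({{\mathrm{wgt }}}_1\) value, the other \(c-1\) equatorial c-rectangles contribute at least \(c-1\) to \({{\mathrm{wgt }}}_1 [1,d]^2\). We conclude that \(r(\mathtt {C}) \leqslant {{\mathrm{wgt }}}_1 [1,d]^2 - (c-1) = m\), as claimed.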

Now take a non-equatorial \(\mathtt {C}\). Applying what we have just proved for the equatorial c-rectangles \(\pi _{\mathrm{r}}(\mathtt {C})\) and \(\pi _{\mathrm{c}}(\mathtt {C})\), we conclude that \(r(\mathtt {C}) \leqslant m\). Now assume that (4.15) does not hold for \(\mathtt {C}\), that is, \(r(\mathtt {C}) = m\). Then
$$\begin{aligned} {{\mathrm{wgt }}}_1 \pi _{\mathrm{r}}(\mathtt {C}) = {{\mathrm{wgt }}}_1 \pi _{\mathrm{c}}(\mathtt {C}) = m = {{\mathrm{wgt }}}_1 [1,d]^2 - c + 1. \end{aligned}$$
Since \({{\mathrm{wgt }}}_1 [1,d]^2 \geqslant {{\mathrm{wgt }}}_1 \pi _{\mathrm{r}}(\mathtt {C}) + {{\mathrm{wgt }}}_1 \pi _{\mathrm{c}}(\mathtt {C}) + c - 2\), we have \(m=1\) and \({{\mathrm{wgt }}}_1 [1,d]^2 = c\). This means that \({{\mathrm{wgt }}}_1 \tilde{\mathtt {C}} = 1\) for all equatorial c-rectangles \(\tilde{\mathtt {C}}\), which is only possible if \(c=d\). However, this case was excluded by hypothesis.

This proves (4.15) and hence Lemma 4.14.

Example 4.15

If \(A\) is the matrix of Example 4.1, then Lemma 4.14 gives the estimate \({{\mathrm{rig }}}_+ \hbox {Ad}_A \leqslant 28\). A more careful analysis (going through the proofs of the lemmas) would give \({{\mathrm{rig }}}_+ \hbox {Ad}_A \leqslant 7\) (see Example 4.7). \(\square \)

Proof of part 2 of Theorem 3.6

Apply Lemmas 4.14 and 4.11.

5 Proof of the hard part of the codimension \(m\) theorem

We showed in Proposition 2.9 that \({{\mathrm{codim }}}\mathcal {P}_m^{(\mathbb {K})} \leqslant m\). In this section, we will prove the reverse inequalities. More precisely, we will first prove Theorem 1.5 and then deduce Theorem 1.4 from it.

5.1 Preliminaries on elementary algebraic geometry

5.1.1 Quasiprojective varieties

An algebraic subset of \(\mathbb {C}^n\) is also called an affine variety. A projective variety is a subset of \(\mathbb {C}\hbox {P}^n\) that can be expressed as the zero set of a family of homogeneous polynomials in \(n+1\) variables. The Zariski topology on an (affine or projective) variety \(X\) is the topology whose closed sets are the (affine or projective) subvarieties of \(X\).

An open subset \(U\) of a projective variety \(X\) is called a quasiprojective variety. We consider in \(U\) the induced Zariski topology. The affine space \(\mathbb {C}^n\) can be identified with a quasiprojective variety, namely its image under the embedding \((z_1, \ldots , z_n) \mapsto (1: z_1 : \cdots : z_n)\).

If \(X\) and \(Y\) are quasiprojective varieties, then the product \(X \times Y\) can be identified with a quasiprojective variety, namely its image under the Segre embedding; see [12, § 5.1].

Recall the following property from [12, p. 58]:

Proposition 5.1

If \(X\) is a projective variety and \(Y\) is a quasiprojective variety, then the projection \(p :X \times Y \rightarrow Y\) takes Zariski closed sets to Zariski closed sets.

A quasiprojective variety is called irreducible if it cannot be written as a nontrivial union of two quasiprojective varieties (that is, a union in which neither contains the other).

5.1.2 Dimension

The dimension \(\dim X\) of an irreducible quasiprojective variety \(X\) may be defined in various equivalent ways (see for instance [8, p. 133ff]). It will be sufficient for us to know that there exists an (intrinsically defined) subvariety \(Y\) of the singular points of \(X\) such that in a neighborhood of each point of \(X {\backslash } Y\), the set \(X\) is a complex submanifold of dimension (in the classical sense of differential geometry) \(\dim X\); moreover, each irreducible component of \(Y\) has dimension strictly less than \(\dim X\).

The dimension of a general quasiprojective variety is by definition the maximum of the dimensions of the irreducible components.

The following lemma is useful to estimate the codimension of an algebraic set \(X\) from information about the fibers of a certain projection \(\pi :X \rightarrow Y\).

Lemma 5.2

Let \(Y\) be a quasiprojective variety. Let \(X \subset Y \times \mathbb {C}\hbox {P}^n\) be a nonempty algebraically closed set. Let \(\pi :X \rightarrow Y\) be the projection along \(\mathbb {C}\hbox {P}^n\). Then:
  1. For each \(j \geqslant 0\), the set
    $$\begin{aligned} C_j = \{ y \in \pi (X) ; \; {{\mathrm{codim }}}\pi ^{-1}(y) \leqslant j \} \end{aligned}$$
    is algebraically closed in \(Y\).
  2. The dimension of \(X\) is given in terms of the dimensions of the \(C_j\)’s by:
    $$\begin{aligned} {{\mathrm{codim }}}X = \min \limits _{j ; \; C_j \ne \varnothing } \big ( j + {{\mathrm{codim }}}C_j \big ) . \end{aligned}$$

In the above, the codimensions of \(\pi ^{-1}(y)\), \(X\) and \(C_j\) are taken with respect to \(\mathbb {C}\hbox {P}^n\), \(Y \times \mathbb {C}\hbox {P}^n\) and \(Y\), respectively. The proof of the lemma is given in Appendix B (ESM).

Remark 5.3

Lemma 5.2 works with the same statement if \(\mathbb {C}\hbox {P}^{n}\) is replaced by \(\mathbb {C}^{n+1}\), provided one assumes that \(X \subset Y \times \mathbb {C}^{n+1}\) is homogeneous in the second factor (i.e., \((y,z) \in X\) implies \((y,tz)\in X\) for every \(t\in \mathbb {C}\)). Indeed, this follows from the fact that the projection \(\mathbb {C}^{n+1}{\backslash }\{0\} \rightarrow \mathbb {C}\hbox {P}^{n}\) preserves codimension of homogeneous sets.

5.1.3 Dimension estimates for sets of vector subspaces

If \(M \in \hbox {Mat}_{n \times m}(\mathbb {K})\), let \({{\mathrm{col }}}M \subset \mathbb {K}^n\) denote the column space of \(M\). A set \(X \subset \hbox {Mat}_{n \times m}(\mathbb {K})\) is called column-invariant if
$$\begin{aligned} \left. \begin{array}{c} M \in X \\ N \in \hbox {Mat}_{n \times m}(\mathbb {K})\\ {{\mathrm{col }}}M = {{\mathrm{col }}}N \end{array} \right\} \ \Rightarrow \ N \in X. \end{aligned}$$
So a column-invariant set \(X\) is characterized by its set of column spaces. We enlarge the latter set by including also subspaces, thus defining:
$$\begin{aligned}{}[\![X ]\!]:= \big \{ E \hbox { subspace of } \mathbb {K}^n ; \; E \subset {{\mathrm{col }}}M \hbox { for some } M \in X \big \}. \end{aligned}$$
In Appendix B (ESM) we prove:

Theorem 5.4

Let \(X \subset \hbox {Mat}_{n \times m}(\mathbb {C})\) be an algebraically closed, column-invariant set. Suppose \(E\) is a vector subspace of \(\mathbb {C}^n\) that does not belong to \([\![X ]\!]\). Then
$$\begin{aligned} {{\mathrm{codim }}}X \geqslant m + 1 - \dim E . \end{aligned}$$

5.1.4 The real part of an algebraic set

Let \(X\) be an algebraically closed subset of \(\mathbb {C}^n\). The real part of \(X\) is defined as \(X \cap \mathbb {R}^n\). This is an algebraically closed subset of \(\mathbb {R}^n\). Indeed, if \(f_1,\ldots , f_k\) in \(\mathbb {C}[T_1,\ldots , T_n]\) generate the corresponding ideal, write each \(f_j = g_j + i h_j\), where \(g_j\) and \(h_j\) have real coefficients; then \(X \cap \mathbb {R}^n\) is the common zero set in \(\mathbb {R}^n\) of the polynomials \(g_j\) and \(h_j\).
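The replacement of complex generators by their real and imaginary parts can be illustrated on a toy example. The sketch below is an illustration only: the polynomial \(f(x,y) = (x^2 + y^2 - 25) + i(x - y - 1)\), the dictionary representation (exponent tuples mapped to coefficients) and the function names are all assumptions made for this example.

```python
def split_real_imag(poly):
    """Split a polynomial with complex coefficients as poly = g + i*h, g and h real."""
    g = {e: complex(c).real for e, c in poly.items() if complex(c).real != 0}
    h = {e: complex(c).imag for e, c in poly.items() if complex(c).imag != 0}
    return g, h

def evaluate(poly, point):
    """Evaluate a polynomial given as {exponent tuple: coefficient} at a point."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total

def is_real_zero(poly, point):
    """For a real point, poly vanishes iff both its real and imaginary parts vanish."""
    g, h = split_real_imag(poly)
    return evaluate(g, point) == 0 and evaluate(h, point) == 0

# f(x, y) = (x^2 + y^2 - 25) + i*(x - y - 1), a hypothetical example.
f = {(2, 0): 1 + 0j, (0, 2): 1 + 0j, (1, 0): 1j, (0, 1): -1j, (0, 0): -25 - 1j}
```

Here the real zeros of \(f\) are the intersection points of the circle \(x^2 + y^2 = 25\) with the line \(x - y = 1\), namely \((4,3)\) and \((-3,-4)\); the point \((5,0)\) lies on the circle but not on the line, so it is not a zero of \(f\).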

As in the complex case, there are many equivalent algebraic–geometric definitions of dimensions of real algebraic or semialgebraic sets. We just point out that a real algebraic or semialgebraic set admits a stratification into real manifolds such that the maximal differential–geometric dimension of the strata coincides with the algebraic–geometric dimension (see [2, § 3.4] or [3, p. 50]).

The following is an immediate consequence of [2, Prop. 3.3.2]:

Proposition 5.5

If \(X\) is an algebraically closed subset of \(\mathbb {C}^n\), then \(\dim _\mathbb {R}(X \cap \mathbb {R}^n) \leqslant \dim _\mathbb {C}X\).
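The inequality in Proposition 5.5 can be strict. The following small example (added for illustration, not part of the original argument) shows a complex curve whose real part is a single point, next to a hyperplane for which equality holds:
$$\begin{aligned} X&= \big \{ (z_1,z_2) \in \mathbb {C}^2 ; \; z_1^2 + z_2^2 = 0 \big \} = \{ z_2 = i z_1 \} \cup \{ z_2 = -i z_1 \},&\dim _\mathbb {C}X&= 1, \\ X \cap \mathbb {R}^2&= \{ (0,0) \},&\dim _\mathbb {R}(X \cap \mathbb {R}^2)&= 0, \\ X'&= \big \{ (z_1,z_2) \in \mathbb {C}^2 ; \; z_1 = 0 \big \}, \quad X' \cap \mathbb {R}^2 = \{0\} \times \mathbb {R},&\dim _\mathbb {R}(X' \cap \mathbb {R}^2)&= 1 = \dim _\mathbb {C}X' . \end{aligned}$$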

5.2 Rigidity and the dimension of the poor fibers

For simplicity of notation, let us write \(\mathcal {P}_m = \mathcal {P}_m^{(\mathbb {C})}\). Also, for \(A \in \hbox {GL}(d,\mathbb {C})\), write:
$$\begin{aligned} r(A) := {{\mathrm{rig }}}_+ \hbox {Ad}_A - 1 . \end{aligned}$$
We decompose the set \(\mathcal {P}_m\) of poor data in fibers:
$$\begin{aligned} \mathcal {P}_m = \bigcup _{A \in \hbox {GL}(d,\mathbb {C})} \{A\} \times \mathcal {P}_m(A), \quad \hbox {where} \quad \mathcal {P}_m(A) \subset \mathfrak {gl}(d,\mathbb {C})^m . \end{aligned}$$

Lemma 5.6

For any \(A \in \hbox {GL}(d,\mathbb {C})\), the codimension of \(\mathcal {P}_m(A)\) in \(\mathfrak {gl}(d,\mathbb {C})^m\) is at least \(m + 1 - r(A)\).

The lemma follows easily from Theorem 5.4 above:


Proof of Lemma 5.6

Fix \(A \in \hbox {GL}(d,\mathbb {C})\) and write \(r=r(A)\). We can assume that \(r \leqslant m\), otherwise there is nothing to prove. By definition, there exists an \(r\)-dimensional subspace \(E \subset \mathfrak {gl}(d,\mathbb {C})\) such that \(\mathfrak {R}_{\hbox {Ad}_A}(\hbox {Id}\vee E)\) is transitive. Identify \(\mathfrak {gl}(d,\mathbb {C})\) with \(\mathbb {C}^{d^2}\) and thus regard \(\mathcal {P}_m(A)\) as a subset of \(\hbox {Mat}_{d^2 \times m} (\mathbb {C})\). Since the set \(\mathcal {P}_m\) is algebraically closed and saturated (recall § 2.3), the fiber \(\mathcal {P}_m(A)\) is algebraically closed and column-invariant, as required by Theorem 5.4. In the notation (5.2), we have \(E \not \in [\![\mathcal {P}_m(A) ]\!]\). So Theorem 5.4 gives \({{\mathrm{codim }}}\mathcal {P}_m(A) \geqslant m + 1 - \dim E = m + 1 - r\), which is the desired estimate.

5.3 How rare is high rigidity?

For simplicity of notation, let us write:
$$\begin{aligned} a(A) := {{\mathrm{acyc }}}\hbox {Ad}_A \quad \hbox {for }A \in \hbox {GL}(d,\mathbb {C}). \end{aligned}$$
So Theorem 3.6 says that \(r(A) \leqslant a(A)-c(A)\) provided \(c(A) < d\).

Lemma 5.7

For any integer \(k \geqslant 1\), the set
$$\begin{aligned} M_k = \big \{ A \in \hbox {GL}(d,\mathbb {C}) ; \; r(A) \geqslant k \big \} \end{aligned}$$
is algebraically closed in \(\hbox {GL}(d,\mathbb {C})\); moreover if \(M_k \ne \varnothing \) then
$$\begin{aligned} {{\mathrm{codim }}}M_k {\left\{ \begin{array}{ll} = 0 &{}\hbox {if }k=1,\\ \geqslant k &{}\hbox {if }k \geqslant 2. \end{array}\right. } \end{aligned}$$

Lemma 5.7 is basically a consequence of Theorem 3.6, using the following construction:

Lemma 5.8

There is a family \(\mathcal {G}(A)\) of subsets of \(\hbox {GL}(d,\mathbb {C})\), indexed by \(A\in \hbox {GL}(d,\mathbb {C})\), such that the following properties hold:
  1. Each \(\mathcal {G}(A)\) contains \(A\).

  2. Each \(\mathcal {G}(A)\) is an immersed manifold of codimension \(a(A)-c(A)\).

  3. There are only countably many different sets \(\mathcal {G}(A)\).



Proof of Lemma 5.8

Fix any \(A \in \hbox {GL}(d,\mathbb {C})\). Then \(A\) is conjugate to a matrix in Jordan form:
$$\begin{aligned} \tilde{A} = \left( \begin{array}{c@{\quad }c@{\quad }c} J_{t_1}(\lambda _1) &{} &{} \\ &{} \ddots &{} \\ &{} &{} J_{t_n}(\lambda _n) \end{array} \right) , \end{aligned}$$
where \(J_t(\lambda )\) denotes the Jordan block as in (4.3). Let \(U\) be the set of matrices of the form
$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c} J_{t_1}(\mu _1) &{} &{} \\ &{} \ddots &{} \\ &{} &{} J_{t_n}(\mu _n) \end{array} \right) , \end{aligned}$$
where \(\mu _1\), ..., \(\mu _n\) are nonzero complex numbers such that
$$\begin{aligned} \lambda _i = \lambda _j \,\Leftrightarrow \,\mu _i = \mu _j \qquad \hbox {and} \qquad \lambda _i \asymp \lambda _j \,\Leftrightarrow \,\frac{\lambda _i}{\lambda _j} = \frac{\mu _i}{\mu _j} . \end{aligned}$$
Then \(U\) is an embedded submanifold of \(\hbox {GL}(d,\mathbb {C})\) of dimension \(c(A)\). Every \(Y \in U\) has the same Jordan type as \(A\), and so, by Remark 4.12, \(a(Y) = a(A)\). We define the set \(\mathcal {G}(A)\) as the image of the map \(\Psi = \Psi _A :\hbox {GL}(d,\mathbb {C}) \times U \rightarrow \hbox {GL}(d,\mathbb {C})\) given by \(\Psi (X,Y) = \hbox {Ad}_X (Y)\). Notice that \(\mathcal {G}(A)\) does not depend on the choice of \(\tilde{A}\). Actually \(\mathcal {G}(A)\) is characterized by the sizes of the Jordan blocks \(t_1\), ..., \(t_n\), the pairs \((i,j)\) such that \(\lambda _i \asymp \lambda _j\) and the corresponding roots of unity; in particular there are countably many such sets \(\mathcal {G}(A)\).
Let us check that property 2 holds. Let \(\partial _1 \Psi \) and \(\partial _2 \Psi \) denote the partial derivatives with respect to \(X\) and \(Y\), respectively. As we have seen in Remark 3.3, the rank of \(\partial _1 \Psi (X,Y)\) is equal to \(d^2 - a(Y) = d^2 - a(A)\) for every \((X,Y)\). On the other hand, \(\partial _2 \Psi (X,Y)\) is one-to-one and therefore of rank \(c(A)\). We claim that
$$\begin{aligned} (\hbox {image of } \partial _1 \Psi (X,Y)) \cap (\hbox {image of } \partial _2 \Psi (X,Y)) = \{0\} . \end{aligned} \tag{5.4}$$
To see this, consider the map \(F :\hbox {Mat}_{d\times d}(\mathbb {C}) \rightarrow \mathbb {C}^d\) that associates with each matrix the coefficients of its characteristic polynomial. Then \(\partial _1 (F \circ \Psi )(X,Y) = 0\), while \(\partial _2 (F \circ \Psi )(X,Y)\) is one-to-one. So (5.4) follows. As a result, at every point the rank of the derivative of \(\Psi \) is equal to the sum of the ranks of the partial derivatives, that is, \(d^2 - a(A) + c(A)\). Therefore, by the Rank Theorem, the image of \(\Psi \) is an immersed manifold of codimension \(a(A) - c(A)\).
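A small worked instance of property 2 may help. It is only a sketch, and it relies on readings that this section does not restate: that \(\lambda _i \asymp \lambda _j\) means that \(\lambda _i / \lambda _j\) is a root of unity, that \(c(A)\) counts the \(\asymp \)-classes of eigenvalues (consistent with \(\dim U = c(A)\)), and that \(a(A)\) is the dimension of the centralizer of \(A\) (consistent with the rank of \(\partial _1 \Psi \) being \(d^2 - a(A)\)). Take \(d=2\) and \(A = \mathrm{diag}(\lambda , -\lambda )\) with \(\lambda \ne 0\):
$$\begin{aligned} U&= \big \{ \mathrm{diag}(\mu , -\mu ) ; \; \mu \in \mathbb {C}{\backslash }\{0\} \big \},&\dim U&= c(A) = 1, \\ \{ X ; \; XA = AX \}&= \{ \hbox {diagonal matrices} \},&a(A)&= 2, \\ \mathcal {G}(A)&= \big \{ Z \in \hbox {GL}(2,\mathbb {C}) ; \; \mathrm{tr}\, Z = 0 \big \},&{{\mathrm{codim }}}\mathcal {G}(A)&= 1 = a(A) - c(A) . \end{aligned}$$
The description of \(\mathcal {G}(A)\) uses that a traceless invertible \(2 \times 2\) matrix has characteristic polynomial \(t^2 + \det Z\), hence two distinct eigenvalues \(\pm \mu \) with \(\mu \ne 0\), and is therefore conjugate to an element of \(U\).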

Proof of Lemma 5.7

If \(k=1\), then \(M_1 = \hbox {GL}(d,\mathbb {C})\) (since \(d \geqslant 2\)), so there is nothing to prove. Consider \(k \geqslant 2\). We have already shown in Sect. 2.3 that \(\mathcal {P}_{k-1}\) is algebraically closed. Since \(r(A) \geqslant k\) means that no subspace of dimension at most \(k-1\) is rich for \(A\), we have \(M_k = \{ A \in \hbox {GL}(d,\mathbb {C}) ; \; \forall \hat{X} \in \mathfrak {gl}(d,\mathbb {C})^{k-1},\,(A, \hat{X}) \in \mathcal {P}_{k-1} \}\), and it is evident that \(M_k\) is algebraically closed as well. We are left to estimate its dimension.

Take a nonsingular point \(A_0\) of \(M_k\) where the local dimension is maximal. Let \(D\) be the intersection of \(M_k\) with a small neighborhood of \(A_0\); it is an embedded disk. Each \(A \in D\) has \(r(A) \geqslant k \geqslant 2\); therefore, by (both parts of) Theorem 3.6, we have \(a(A) - c(A) \geqslant r(A) \geqslant k\). So, in terms of the sets from Lemma 5.8,
$$\begin{aligned} D \subset \bigcup _{A \hbox { s.t. } a(A) - c(A) \geqslant k} \mathcal {G}(A). \end{aligned}$$
The right-hand side is a countable union of immersed manifolds of codimension at least \(k\). It follows (e.g., by the Baire category theorem) that \(D\) (and hence \(M_k\)) has codimension at least \(k\).

5.4 Proof of Theorems 1.5 and 1.4

We apply Lemmas 5.6 and 5.7 to prove one of our major results:

Proof of Theorem 1.5

The set \(\mathcal {P}_m \subset \hbox {GL}(d,\mathbb {C}) \times [\mathfrak {gl}(d,\mathbb {C})]^m\) is homogeneous in the second factor. Using Lemma 5.2 together with Remark 5.3, we obtain that the sets
$$\begin{aligned} C_j = \big \{A \in \hbox {GL}(d,\mathbb {C}) ; \; {{\mathrm{codim }}}\mathcal {P}_m (A) \leqslant j \big \} \end{aligned}$$
are algebraically closed in \(\hbox {GL}(d,\mathbb {C})\), and
$$\begin{aligned} {{\mathrm{codim }}}\mathcal {P}_m = \min \limits _{j ; \; C_j \ne \varnothing } \big ( j + {{\mathrm{codim }}}C_j \big ) . \end{aligned}$$
By Lemma 5.6, we have \(C_j \subset M_{m+1-j}\). Therefore, by Lemma 5.7,
$$\begin{aligned} C_j \ne \varnothing \quad \Rightarrow \quad {{\mathrm{codim }}}C_j {\left\{ \begin{array}{ll} \geqslant 0 &{}\hbox {if }j=m, \\ \geqslant m-j+1 &{}\hbox {if }j \leqslant m-1. \end{array}\right. } \end{aligned} \tag{5.6}$$
So \({{\mathrm{codim }}}\mathcal {P}_m \geqslant m\), as we wanted to show.

The proof above only used that \({{\mathrm{codim }}}C_j \geqslant m-j\). On the other hand, using the full power of (5.6) we obtain:

Scholium 5.9

The set of poor data in “fat fibers,” namely
$$\begin{aligned} \mathcal {F}_m := \big \{ (A, B_1, \ldots , B_m) \in \mathcal {P}_m^{(\mathbb {C})} ; \; {{\mathrm{codim }}}\mathcal {P}_m(A) \leqslant m-1 \big \}, \end{aligned}$$
has codimension at least \(m+1\) in \(\hbox {GL}(d,\mathbb {C}) \times [\mathfrak {gl}(d,\mathbb {C})]^m\).


Proof of Scholium 5.9

The projection of \(\mathcal {F}_m\) on \(\hbox {GL}(d,\mathbb {C})\) is \(C_{m-1}\). Use Lemma 5.2 (together with Remark 5.3) and (5.6).
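The two dimension counts behind the bound \({{\mathrm{codim }}}\mathcal {P}_m \geqslant m\) and Scholium 5.9 can be written out term by term. This is only a restatement of the estimates above, with the case \(j > m\) added for completeness (there the term is bounded below by \(j\) alone):
$$\begin{aligned} j + {{\mathrm{codim }}}C_j&\geqslant {\left\{ \begin{array}{ll} j \geqslant m+1 &{}\hbox {if }j \geqslant m+1, \\ m + 0 = m &{}\hbox {if }j = m, \\ j + (m-j+1) = m+1 &{}\hbox {if }j \leqslant m-1, \end{array}\right. } \\ {{\mathrm{codim }}}\mathcal {P}_m&= \min \limits _{j ; \; C_j \ne \varnothing } \big ( j + {{\mathrm{codim }}}C_j \big ) \geqslant m, \\ {{\mathrm{codim }}}\mathcal {F}_m&= \min \limits _{j \leqslant m-1 ; \; C_j \ne \varnothing } \big ( j + {{\mathrm{codim }}}C_j \big ) \geqslant m+1 . \end{aligned}$$
So the only fibers that can contribute codimension exactly \(m\) are those with \({{\mathrm{codim }}}\mathcal {P}_m(A) = m\).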

Next, let us consider the real case:

Proof of Theorem 1.4

The real part of \(\mathcal {P}^{(\mathbb {C})}_m\) is a real algebraic set which, in view of Proposition 5.5, has codimension at least \(m\). Recall from Sect. 2.3 that this set contains the semialgebraic set \(\mathcal {P}^{(\mathbb {R})}_m\), which therefore has codimension at least \(m\). Since we already knew from Proposition 2.9 that \({{\mathrm{codim }}}\mathcal {P}^{(\mathbb {R})}_m \leqslant m\), the theorem is proved.

6 Proof of the main result

We now use Theorem 1.4 and transversality theorems to prove our main result. For precise definitions and statements on the objects used in this section, see Appendix C (ESM).

A stratification is a filtration by closed subsets of a smooth manifold \(X\)
$$\begin{aligned} \Sigma = \Sigma _n \supset \Sigma _{n-1} \supset \cdots \supset \Sigma _0 \end{aligned}$$
such that for each \(i\), the set \(\Gamma _i= \Sigma _i {\backslash } \Sigma _{i-1}\) (where \(\Sigma _{-1}:=\varnothing \)) is a smooth submanifold of \(X\) without boundary, and the dimension of \(\Gamma _i\) increases strictly with \(i\).

We say that a \(C^1\)-map is transverse to that stratification if it is transverse to each of the submanifolds \(\Gamma _i\). There are explicit, so-called Whitney conditions that guarantee that a stratification behaves nicely with respect to transversality, as the next proposition shows. A stratification satisfying those conditions is called a Whitney stratification. By the classical Theorem C.1 stated in Appendix C (in ESM) (see for instance [7]), any semialgebraic subset of an affine space admits a canonical Whitney stratification.

We refer the reader to Appendix C (ESM) for the definitions of jets, jet extensions and for a proof of the following:

Proposition 6.1

Let \(X\), \(Y\) be \(C^\infty \)-manifolds without boundary. Let \(\Sigma \) be a Whitney stratified closed subset of the set of 1-jets from \(X\) to \(Y\). Then the set of maps \(f \in C^2(X,Y)\) whose 1-jet extension \(j^1f\) is transverse to \(\Sigma \) is \(C^2\)-open and \(C^\infty \)-dense in \(C^2(X,Y)\) (i.e., its intersection with \(C^r(X,Y)\) is \(C^r\)-dense, for every \(2\leqslant r\leqslant \infty \)).

By Theorem 1.4, \(\mathcal {P}_m^{(\mathbb {R})}\) is a closed semialgebraic subset of \(\hbox {GL}(d,\mathbb {R}) \times \mathfrak {gl}(d,\mathbb {R})^m\) of codimension \(m\). The closure \(\overline{\mathcal {P}_m^{(\mathbb {R})}}\) of \(\mathcal {P}_m^{(\mathbb {R})}\) in \([\hbox {Mat}_{d\times d}(\mathbb {R})]^{1+m}\) is a closed semialgebraic set of the affine space \([\hbox {Mat}_{d\times d}(\mathbb {R})]^{1+m}\). As mentioned above, it admits a canonical Whitney stratification
$$\begin{aligned} \overline{\mathcal {P}_m^{(\mathbb {R})}} = \hat{\Gamma }_n \supset \cdots \supset \hat{\Gamma }_0 . \end{aligned}$$
The differentiable codimension of that stratification is also \(m\). By locality of the Whitney conditions (see Proposition C.2 of Appendix C in ESM), this stratification restricts to a Whitney stratification of codimension \(m\):
$$\begin{aligned} \mathcal {P}_{m}^{(\mathbb {R})} = \Gamma _n \supset \cdots \supset \Gamma _0 . \end{aligned} \tag{6.1}$$
Since that stratification of \(\overline{\mathcal {P}_m^{(\mathbb {R})}}\) is canonical, the stratification (6.1) is invariant under polynomial automorphisms of \(\hbox {GL}(d,\mathbb {R}) \times \mathfrak {gl}(d,\mathbb {R})^m\) that preserve \(\mathcal {P}_{m}^{(\mathbb {R})}\).

Proof of Theorem 1.1

Let \(\mathcal {U}\) be a smooth manifold without boundary and of dimension \(m\). Given local coordinates on an open set \(U\subset \mathcal {U}\), the set \(J^1(U,\hbox {GL}(d,\mathbb {R}))\) of 1-jets from \(U\) to \(\hbox {GL}(d,\mathbb {R})\) may be identified with the set
$$\begin{aligned} U \times \hbox {GL}(d,\mathbb {R}) \times \mathfrak {gl}(d,\mathbb {R})^m. \end{aligned}$$
Indeed, a jet \(\mathbf {J}\) represented by a pair \((u,A)\) can be identified with the point
$$\begin{aligned} (u,A(u),B_1, \ldots ,B_m)\in U \times \hbox {GL}(d,\mathbb {R}) \times \mathfrak {gl}(d,\mathbb {R})^m, \end{aligned}$$
where \(B_i\in \hbox {Mat}_{d \times d}(\mathbb {R})\) is the normalized derivative of \(A\) at \(u\), along the \(i\)th coordinate. Let us say that the 1-jet \(\mathbf {J}\) is rich if the datum \(\mathbf {A}= (A(u),B_1, \ldots ,B_m)\) is rich, or equivalently, if for sufficiently large \(N\), the input \((u,\ldots ,u)\in \mathcal {U}^N\) is universally regular for the system (1.4). If the jet is not rich then it is called poor.
Define a filtration
$$\begin{aligned} \Sigma _n \supset \cdots \supset \Sigma _0 \end{aligned} \tag{6.2}$$
of the set of poor jets from \(\mathcal {U}\) to \(\hbox {GL}(d,\mathbb {R})\) as follows: a jet \(\mathbf {J}\) represented as above in local coordinates by \((u,A(u),B_1, \ldots ,B_m)\) belongs to \(\Sigma _i\) if and only if \((A(u), B_1, \ldots ,B_m)\) belongs to the set \(\Gamma _i\) in (6.1). We need to check that this definition does not depend on the choice of the local coordinates. Indeed, this follows from \(\mathcal {P}_m^{(\mathbb {R})}\) being a saturated set (see Sect. 2.3) and from the invariance of (6.1) by polynomial automorphisms.
We claim that the filtration (6.2) is a Whitney stratification of codimension \(m\). Indeed, the intersection of the filtration with the open subset \(J^1(U,\hbox {GL}(d,\mathbb {R}))\) of \(J^1(\mathcal {U},\hbox {GL}(d,\mathbb {R}))\) is identified (through a smooth diffeomorphism) with the filtration
$$\begin{aligned} U \times \Gamma _n \supset \cdots \supset U \times \Gamma _0. \end{aligned}$$
Such a filtration is still a Whitney stratification (see Proposition C.2 of Appendix C in ESM) of codimension \(m\) in \(J^1(U,\hbox {GL}(d,\mathbb {R}))\approx U \times \hbox {GL}(d,\mathbb {R}) \times \mathfrak {gl}(d,\mathbb {R})^m\). Covering \(\mathcal {U}\) by open sets \(U\), we deduce that (6.2) is a Whitney stratification of codimension \(m\) in \(J^1(\mathcal {U},\hbox {GL}(d,\mathbb {R}))\).

Applying Proposition 6.1, we obtain a \(C^2\)-open \(C^\infty \)-dense set \(\mathcal {O}\subset C^2(\mathcal {U},\hbox {GL}(d,\mathbb {R}))\) formed by maps \(A\) whose 1-jet extension \(j^1 A\) is transverse to the stratification (6.2) of the set of poor jets. Since the codimension of the stratification equals the dimension of \(\mathcal {U}\), if \(A \in \mathcal {O}\), then the points \(u\) for which \(j^1 A(u)\) is poor form a 0-dimensional set. This proves Theorem 1.1.
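The last step is a dimension count, which can be sketched as follows. It assumes only the standard fact that the preimage of a submanifold of codimension \(c\) under a \(C^1\)-map transverse to it is either empty or a submanifold of codimension \(c\). Write \(S_i = \Sigma _i {\backslash } \Sigma _{i-1}\) for the strata of the set of poor jets, each of codimension at least \(m = \dim \mathcal {U}\) in \(J^1(\mathcal {U},\hbox {GL}(d,\mathbb {R}))\). For \(A \in \mathcal {O}\):
$$\begin{aligned} {{\mathrm{codim }}}S_i> m&\quad \Rightarrow \quad (j^1 A)^{-1}(S_i) = \varnothing , \\ {{\mathrm{codim }}}S_i = m&\quad \Rightarrow \quad \dim \, (j^1 A)^{-1}(S_i) = \dim \mathcal {U}- {{\mathrm{codim }}}S_i = m - m = 0 \quad \hbox {(if nonempty)}. \end{aligned}$$
In the first case transversality at a point of the preimage would require the image of the derivative of \(j^1 A\), of dimension at most \(m\), to span a complement of dimension larger than \(m\), which is impossible.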



Acknowledgments

We are grateful for the hospitality of Institute Mittag–Leffler, where this work began to take form. We thank R. Potrie, L. San Martin, S. Tikhomirov, and C. Tomei for valuable discussions. We thank the referees for corrections, references to the literature, and other suggestions that helped to improve the exposition.

Supplementary material

498_2014_126_MOESM1_ESM.pdf (469 kb)


References

  1. Azoff EA (1986) On finite rank operators and preannihilators, vol 64, no 357. Mem. Amer. Math. Soc.
  2. Benedetti R, Risler J-J (1990) Real algebraic and semi-algebraic sets. Hermann, Paris
  3. Bochnak J, Coste M, Roy M-F (1998) Real algebraic geometry. Springer, Berlin
  4. Colonius F, Kliemann W (1993) Linear control semigroups acting on projective space. J Dynam Differ Eqs 5(3):495–528
  5. Colonius F, Kliemann W (2000) The dynamics of control. Birkhäuser, Boston, MA
  6. Elliott DL (2009) Bilinear control systems. Springer, Dordrecht
  7. Gibson CG, Wirthmüller K, du Plessis AA, Looijenga EJN (1976) Topological stability of smooth mappings. Lecture Notes in Mathematics, vol. 552. Springer, Berlin
  8. Harris J (1992) Algebraic geometry: a first course. Springer, New York
  9. Horn RA, Johnson CR (1994) Topics in matrix analysis. Corrected reprint of the 1991 original. Cambridge University Press, Cambridge
  10. Levi FW (1942) Ordered groups. Proc Indian Acad Sci 16:256–263
  11. Roman S (2008) Advanced linear algebra, 3rd edn. Springer, New York
  12. Shafarevich IG (1994) Basic algebraic geometry, vol 1, 2nd edn. Springer, Berlin
  13. Sontag ED (1992) Universal nonsingular controls. Syst Control Lett 19(3):221–224. Errata: Ibid, 20 (1993), no. 1, 77
  14. Sontag ED, Wirth FR (1998) Remarks on universal nonsingular controls for discrete-time systems. Syst Control Lett 33(2):81–88
  15. Wirth F (1998) Dynamics of time-varying discrete-time linear systems: spectral theory and the projected system. SIAM J Control Optim 36(2):447–487

Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  1. Facultad de Matemáticas, Pontificia Universidad Católica de Chile, Santiago, Chile
  2. Institut de Mathématiques de Bordeaux, Université Bordeaux I, Bordeaux, France
