1 Introduction

Let \(\mathfrak {g}\) be a semisimple Lie algebra, with Iwasawa decomposition \(\mathfrak {g} = \mathfrak {k} \oplus \mathfrak {a} \oplus \mathfrak {n}\), where \(\mathfrak {k}\) is compact, \(\mathfrak {a}\) is abelian, and \(\mathfrak {n}\) is nilpotent, and let \(\mathfrak {m}\) be the centraliser of \(\mathfrak {a}\) in \(\mathfrak {k}\). Then \(\mathfrak {n} = \sum _{\gamma \in \Sigma ^+} \mathfrak {g}_\gamma \), where \(\Sigma ^+\) is the set of positive restricted roots and \(\mathfrak {g}_\gamma \) is the restricted root space corresponding to the restricted root \(\gamma \).

We study the derivations of \(\mathfrak {n}\) which preserve its root space decomposition, that is, the derivations D such that \(D(\mathfrak {g}_\gamma ) \subseteq \mathfrak {g}_\gamma \) for each positive restricted root \(\gamma \). By definition and the Jacobi identity, if \(W \in \mathfrak {m} \oplus \mathfrak {a}\), then \([H,W] = 0\) for all \(H \in \mathfrak {a}\), and so \([H, [W,X]] = [W, [H,X]] = \gamma (H) [W,X]\) for all \(X \in \mathfrak {g}_\gamma \); thus \({\text {ad}}(W)\) preserves the root space decomposition. The main point of this paper is that, unless \(\mathfrak {g}\) contains a simple summand of the form \(\mathfrak {so}(n,1)\) or \(\mathfrak {su}(n,1)\), every root-space-preserving derivation D of \(\mathfrak {n}\) is of the form \({\text {ad}}(W)\), where \(W \in \mathfrak {m} \oplus \mathfrak {a}\), and more precisely the symmetric part of D is in \({\text {ad}}(\mathfrak {a})\) while the skew-symmetric part of D is in \({\text {ad}}(\mathfrak {m})\).

The root-space-preserving derivations are known for real rank-one simple Lie algebras. Indeed, Korányi [16] showed that in the rank-one case, \(\mathfrak {n}\) is an H-type Lie algebra, and the Lie algebra of derivations and the automorphism group of an H-type algebra were found by Riehm [20] and Saal [21]. Our work may be viewed as a development of the ideas of these authors.

Here is our strategy. First, the algebra \(\mathfrak {g}\) splits into a sum of simple ideals, and we show that it suffices to consider the case where \(\mathfrak {g}\) is simple. As noted, the real rank-one case is understood; we treat the algebras of real rank two by considering the possible restricted root systems separately, and using a detailed analysis of several H-type algebras contained in \(\mathfrak {n}\) in each case. A key argument at this stage is showing that each derivation is the sum of a symmetric and a skew-symmetric derivation. To treat the general case, we again show that it suffices to treat symmetric and skew-symmetric derivations separately. The symmetric derivations act by scalars on each restricted root space, and belong to \({\text {ad}}(\mathfrak {a})\), but the skew-symmetric derivations are trickier. We handle these by introducing identities \((D_{\gamma ,\delta })\) and \((E_{\gamma , \delta })\), where \(\gamma , \delta \in \Sigma ^+\), which involve the interplay of a derivation with the Cartan involution \(\theta \), and show that these identities characterise derivations in \({\text {ad}}(\mathfrak {m})\) in simple Lie algebras of arbitrary real rank. We prove these identities for the simple algebras of higher rank by reducing to subalgebras of real rank one and two, and then using our analysis of these algebras.

This paper is a step towards the classification of the derivations and automorphisms of \(\mathfrak {n}\), which is interesting for a variety of reasons. One reason is to find the derivations of (minimal) parabolic subalgebras of semisimple Lie algebras, which has been a lively field in recent years; see, for example, Chen [2] and Wang and Yu [22]. Every derivation of a parabolic subalgebra induces a derivation of its nilradical; if we can show that these are given by Lie multiplication by elements of the subalgebra, then we are well on the way to finding the derivations of the whole subalgebra. Another reason is the question of classification of nilpotent Lie algebras: in general, this is an impossibly tedious matter, but one might hope to do better with algebras with lots of symmetry; to see whether this is viable, we need to understand some examples.

A third reason comes from harmonic analysis on the simply connected nilpotent Lie group associated with \(\mathfrak {n}\), which has applications in diverse areas, including theoretical physics and linear partial differential equations; to carry out this analysis, it is important to understand the symmetries of the group; see, for example, the work of Folland [12].

A fourth reason for studying the automorphisms of \(\mathfrak {n}\) is the theory of quasiconformal mappings of “Carnot groups”. Indeed, as defined by Pansu [19], the derivative of a quasiconformal mapping of an Iwasawa N group is an automorphism, and restrictions on the automorphisms give rise to restrictions on the quasiconformal mappings. Further, it was shown by Yamaguchi [24], using the theory of Tanaka prolongations and the Borel–Bott–Weil theorem, and Cowling et al. [8], using more elementary arguments, that the space of “multicontact mappings”, that is, mappings whose differentials preserve the simple root spaces, is finite-dimensional when all the derivations that preserve the root spaces are of the form \({\text {ad}}(\mathfrak {m} \oplus \mathfrak {a})\). The result presented here leads to the same conclusion in an even simpler way. Indeed, unless \(\mathfrak {n}\) has dimension 1 or 2, the Tanaka prolongation of \(\mathfrak {n}\) through \({\text {ad}}(\mathfrak {m}\oplus \mathfrak {a})\) is finite-dimensional (see Ottazzi and Warhurst [18]), and this implies that multicontact mappings form a finite-dimensional Lie group. While a more abstract approach, for instance via cohomology, might well also establish our main result for Iwasawa algebras, our more concrete analysis also provides tools that should apply to more general algebras that do not arise as subalgebras of semisimple algebras.

It is also of interest to consider derivations that preserve the grading of \(\mathfrak {n}\), that is, the subspaces \(\sum _{\alpha } \mathfrak {g}_\alpha \) where we sum over all \(\alpha \) of the same height, and to consider derivations of nilradicals of more general parabolic algebras; we will return to these questions in future work.

This paper is organised as follows. In Sect. 2 we analyse the derivations of an H-type algebra. We start by showing that every derivation is the sum of a symmetric derivation and a skew-symmetric derivation; we then describe symmetric and skew-symmetric derivations separately.

In Sect. 3, we consider real semisimple Lie algebras. First, we reduce matters to the case of simple Lie algebras, and then we show that these contain various H-type algebras. We also see how the geometry of root systems is reflected in the structure of various subalgebras of \(\mathfrak {g}\). Most of the ideas behind this section may be found in Ciatti [3,4,5,6,7].

In Sect. 4, we consider the grading of a semisimple Lie algebra associated with a choice of positive roots, and grading-preserving derivations of \(\mathfrak {g}\), of \(\mathfrak {m} \oplus \mathfrak {a} \oplus \mathfrak {n}\) and of \(\mathfrak {n}\). We find a simple Lie algebraic criterion for a skew-symmetric grading-preserving derivation of \(\mathfrak {n}\) to extend to a derivation of \(\mathfrak {g}\); this extended derivation is not only grading preserving but also root space preserving.

Finally, in Sect. 5 we apply the results of Sects. 2 and 3 to the study of the derivations of \(\mathfrak {n}\) that preserve the root space decomposition. These are sums of symmetric derivations and skew-symmetric derivations. The main idea is to show that our assertion is true when the real rank of \(\mathfrak {g}\) is 1 or 2, and then apply this result to the rank-two subalgebras of a general simple Lie algebra, deducing from these the full result.

Main Theorem

If no simple summand of \(\mathfrak {g}\) is isomorphic to \(\mathfrak {so}(n, 1)\) or \(\mathfrak {su}(n,1)\) for any n, then all the derivations of \(\mathfrak {n}\) that preserve the root spaces are of the form \({\text {ad}}(W)\), where \(W\in \mathfrak {m} \oplus \mathfrak {a}\). Otherwise, there are derivations of \(\mathfrak {n}\) that preserve the root spaces that do not arise in this way.

2 Derivations of an H-type Lie algebra

In this section, we first define H-type Lie algebras, which arose in the work of Kaplan [14], and then describe their derivations. These are always the sum of a symmetric derivation and a skew-symmetric derivation. In Corollary 2.6, skew-symmetric derivations are decomposed as the sum of two components, one of which is trivial on the centre. The symmetric derivations are classified in Corollary 2.8 by a diagonalisation process.

Let \(\mathfrak {h}\) be a two-step nilpotent Lie algebra, endowed with an inner product \(\left\langle \cdot , \cdot \right\rangle \). We denote by \(\mathfrak {z}\) the centre of \(\mathfrak {h}\) and by \(\mathfrak {v}\) the orthogonal complement of \(\mathfrak {z}\); given a subspace \(\mathfrak {s}\) of \(\mathfrak {h}\), we write \(I_{\mathfrak {s}}\) for the identity map on \(\mathfrak {s}\). Then

$$\begin{aligned} \mathfrak {h} = \mathfrak {v} \oplus \mathfrak {z} \end{aligned}$$

(throughout this paper, \(\oplus \) denotes a vector space direct sum; in general, the summands need not be Lie algebras). For each Z in \(\mathfrak {z}\), we define \(J_Z\) in \({{\mathrm{End}}}(\mathfrak {v})\) by

$$\begin{aligned} \left\langle J_Z X, Y \right\rangle = \left\langle Z, [X,Y] \right\rangle \quad \forall X, Y \in \mathfrak {v}. \end{aligned}$$
(2.1)

Then \(J_Z\) is skew-symmetric by construction, that is, \(J_Z^{\mathsf {T}}= -J_Z\), where \({}^{\mathsf {T}}\) denotes the transpose relative to the inner product. We say that \(\mathfrak {h}\) is of Heisenberg type, or just H-type, when

$$\begin{aligned} J_Z^2 = -\Vert Z \Vert ^2 I_{\mathfrak {v}} \end{aligned}$$
(2.2)

for all \(Z \in \mathfrak {z}\). Equivalently, for each \(X \in \mathfrak {v}\) of length 1, the map \({\text {ad}}(X)\) is an isometry from \(\ker ({\text {ad}}(X))^\perp \) onto \(\mathfrak {z}\). For the rest of this section, we assume that \(\mathfrak {h}\) is an H-type algebra.

By polarisation, (2.2) implies that

$$\begin{aligned} J_Z J_{Z'} + J_{Z'} J_Z = - 2 \left\langle Z, Z' \right\rangle I_{\mathfrak {v}} \qquad \forall Z, Z' \in \mathfrak {z} . \end{aligned}$$
(2.3)

Thus the \(J_Z\) generate a Clifford algebra.
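To make the definitions concrete, here is a minimal numerical sketch (our illustration, not part of the paper's argument): the quaternionic Heisenberg algebra, with \(\mathfrak {v} = \mathbb {H} \cong \mathbb {R}^4\) and \(\mathfrak {z} = {\text {Im}}\,\mathbb {H} \cong \mathbb {R}^3\), where \(J_Z\) is left multiplication by the imaginary quaternion Z, satisfies (2.2) and (2.3). The matrices `Li`, `Lj`, `Lk` below are the left-multiplication operators in the basis \(1, i, j, k\).

```python
import numpy as np

# Left multiplication by i, j, k on the quaternions H = R^4 (basis 1, i, j, k).
Li = np.array([[0,-1,0,0],[1,0,0,0],[0,0,0,-1],[0,0,1,0]], float)
Lj = np.array([[0,0,-1,0],[0,0,0,1],[1,0,0,0],[0,-1,0,0]], float)
Lk = np.array([[0,0,0,-1],[0,0,-1,0],[0,1,0,0],[1,0,0,0]], float)

def J(Z):
    """J_Z on v = H, for Z in z = Im H identified with R^3."""
    return Z[0]*Li + Z[1]*Lj + Z[2]*Lk

rng = np.random.default_rng(0)
Z, Zp = rng.standard_normal(3), rng.standard_normal(3)

# J_Z is skew-symmetric, as required below (2.1).
assert np.allclose(J(Z).T, -J(Z))

# The H-type identity (2.2): J_Z^2 = -||Z||^2 I.
assert np.allclose(J(Z) @ J(Z), -(Z @ Z) * np.eye(4))

# The polarised identity (2.3): J_Z J_Z' + J_Z' J_Z = -2 <Z, Z'> I.
assert np.allclose(J(Z) @ J(Zp) + J(Zp) @ J(Z), -2 * (Z @ Zp) * np.eye(4))
```

Since these \(J_Z\) anticommute pairwise and square to \(-\Vert Z \Vert ^2 I\), they realise a representation of the Clifford algebra of \(\mathbb {R}^3\), as the text describes.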

Recall that a derivation of a Lie algebra \(\mathfrak {h}\) is a linear endomorphism \(D: \mathfrak {h} \rightarrow \mathfrak {h}\) such that

$$\begin{aligned} D\left( \left[ X,Y\right] \right) = \left[ DX, Y\right] + \left[ X, DY\right] \quad \forall X, Y \in \mathfrak {h}; \end{aligned}$$

every derivation of a Lie algebra automatically preserves the centre. We say that a linear endomorphism D of \(\mathfrak {h}\) preserves the grading if \(D(\mathfrak {v}) \subseteq \mathfrak {v}\) and \(D(\mathfrak {z}) \subseteq \mathfrak {z}\), and write \(\mathcal {D}(\mathfrak {h})\) for the Lie algebra of all grading-preserving derivations of \(\mathfrak {h}\). We denote by \(\mathcal {D}^{\mathrm {sym}}(\mathfrak {h})\) the subspace of \(\mathcal {D}(\mathfrak {h})\) of all symmetric derivations and by \(\mathcal {D}^{\mathrm {skew}}(\mathfrak {h})\) the Lie subalgebra of all skew-symmetric derivations. We also write \(\mathcal {D}^{\mathrm {sym}}_0(\mathfrak {h})\) and \(\mathcal {D}^{\mathrm {skew}}_0(\mathfrak {h})\) for the subspaces of these spaces of derivations that vanish on \(\mathfrak {z}\).

Proposition 2.1

Let D be a grading-preserving linear endomorphism of \(\mathfrak {h}\). Then D is a derivation if and only if

$$\begin{aligned} J_{D^{\mathsf {T}}Z} = D^{\mathsf {T}}J_Z + J_Z D \quad \forall Z \in \mathfrak {z}. \end{aligned}$$
(2.4)

Suppose moreover that \(D \vert _{\mathfrak {z}} = 0\). If D is skew-symmetric, then D is a derivation if and only if D commutes with all the \(J_Z\); if D is symmetric, then D is a derivation if and only if D anticommutes with all the \(J_Z\).

Proof

From (2.1), it follows that D is a derivation if and only if, for all Z in \(\mathfrak {z}\) and X, Y in \(\mathfrak {v}\),

$$\begin{aligned} \left\langle J_{D^{\mathsf {T}}Z} X, Y \right\rangle&= \left\langle D^{\mathsf {T}}Z, \left[ X, Y \right] \right\rangle = \left\langle Z, D \left[ X, Y \right] \right\rangle \\&= \left\langle Z, \left[ D X, Y \right] \right\rangle + \left\langle Z, \left[ X, D Y \right] \right\rangle \\&= \left\langle J_Z {D X}, Y \right\rangle + \left\langle D^{\mathsf {T}}J_Z X, Y \right\rangle , \end{aligned}$$

proving the result. \(\square \)
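A sketch of the second statement of Proposition 2.1 in a quaternionic model (\(\mathfrak {v} = \mathbb {H} \cong \mathbb {R}^4\), \(\mathfrak {z} = {\text {Im}}\,\mathbb {H} \cong \mathbb {R}^3\), \(J_Z\) = left multiplication by Z; our own illustration, not from the paper): right multiplication by i is skew-symmetric and commutes with every left multiplication, so extending it by zero on the centre gives a skew-symmetric derivation. The bracket is recovered from (2.1).

```python
import numpy as np

# Left multiplications by i, j, k on H = R^4 (basis 1, i, j, k).
Li = np.array([[0,-1,0,0],[1,0,0,0],[0,0,0,-1],[0,0,1,0]], float)
Lj = np.array([[0,0,-1,0],[0,0,0,1],[1,0,0,0],[0,-1,0,0]], float)
Lk = np.array([[0,0,0,-1],[0,0,-1,0],[0,1,0,0],[1,0,0,0]], float)
L = [Li, Lj, Lk]

def bracket(X, Y):
    """[X, Y] in z = R^3, recovered from (2.1): <Z, [X,Y]> = <J_Z X, Y>."""
    return np.array([(Lm @ X) @ Y for Lm in L])

# Right multiplication by i: skew-symmetric, commutes with Li, Lj, Lk.
Ri = np.array([[0,-1,0,0],[1,0,0,0],[0,0,0,1],[0,0,-1,0]], float)
assert np.allclose(Ri.T, -Ri)
for Lm in L:
    assert np.allclose(Ri @ Lm, Lm @ Ri)

# D = Ri on v, 0 on z; since [X,Y] is central and D kills z, the derivation
# identity D[X,Y] = [DX,Y] + [X,DY] reduces to 0 = [Ri X, Y] + [X, Ri Y].
rng = np.random.default_rng(1)
X, Y = rng.standard_normal(4), rng.standard_normal(4)
assert np.allclose(bracket(Ri @ X, Y) + bracket(X, Ri @ Y), 0)
```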

The next result is known, but we give a proof for completeness.

Lemma 2.2

(Riehm [20]) For every pair of orthogonal vectors \(Z'\) and \(Z''\) in \(\mathfrak {z}\), the grading-preserving linear map \(\Phi _{Z' Z''}\), defined by

$$\begin{aligned} \Phi _{Z' Z''} (X+Z) = J_{Z'} J_{Z''} X + 2\left\langle Z', Z \right\rangle Z'' - 2\left\langle Z'', Z \right\rangle Z' \end{aligned}$$

for all \(Z \in \mathfrak {z}\) and all \(X \in \mathfrak {v}\), is a skew-symmetric derivation of \(\mathfrak {h}\).

Proof

It is evident that \(\Phi _{Z' Z''}\) is skew-symmetric. By Proposition 2.1, it suffices to show that

$$\begin{aligned} J_{\Phi _{Z' Z''} (Z)} X = \Phi _{Z' Z''} J_Z X - J_Z \Phi _{Z' Z''} X \quad \forall X \in \mathfrak {v}. \end{aligned}$$

We consider the right-hand side of the equation, and use (2.3):

$$\begin{aligned} J_{Z'}J_{Z''} J_Z X - J_Z J_{Z'}J_{Z''}X&= - 2 \left\langle Z ,{Z''}\right\rangle J_{Z'}X - J_{Z'} J_ZJ_{Z''}X\\&\qquad + 2 \left\langle Z ,{Z'}\right\rangle J_{Z''}X + J_{Z'} J_ZJ_{Z''}X\\&= 2\left\langle Z', Z \right\rangle J_{Z''}X - 2\left\langle Z'', Z \right\rangle J_{Z'}X\\&= J_{2\left\langle Z', Z \right\rangle Z'' - 2\left\langle Z'', Z \right\rangle Z'}X\\&= J_{\Phi _{Z' Z''} (Z)}X , \end{aligned}$$

as required. \(\square \)

We define \(\mathcal {R}(\mathfrak {h})\) to be the vector subspace of \(\mathcal {D}(\mathfrak {h})\) of all grading-preserving derivations of \(\mathfrak {h}\) spanned by the \(\Phi _{Z'Z''}\). As observed by Riehm [20], the subspace \(\mathcal {R}(\mathfrak {h})\) is a subalgebra of \(\mathcal {D}(\mathfrak {h})\). To see this, we take an orthonormal basis \(\{ Z_1, \ldots , Z_m \}\) for \(\mathfrak {z}\), and write \(\Phi _{ij}\) in place of \(\Phi _{Z_iZ_j}\). Since \(\Phi _{Z'Z''}\) depends linearly on \(Z'\) and on \(Z''\), every element of \(\mathcal {R}(\mathfrak {h})\) is a linear combination of the \(\Phi _{ij}\). Moreover,

$$\begin{aligned} \Phi _{ij} \Phi _{kl} - \Phi _{kl} \Phi _{ij} = {\left\{ \begin{array}{ll} 0 &{}\text {if } \{i,j\} \cap \{k,l\} = \varnothing , \\ 2\Phi _{jl} &{}\text {if } i=k, \end{array}\right. } \end{aligned}$$

which shows that \(\mathcal {R}(\mathfrak {h})\) is closed under taking commutators. We omit the proof of these commutation relations, as we do not need this result.
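Although we omit the proof, the commutation relations are easy to test numerically in a quaternionic model (\(\mathfrak {v} = \mathbb {H}\), \(\mathfrak {z} = {\text {Im}}\,\mathbb {H}\), \(J_Z\) = left multiplication by Z; an illustration of ours, with \(\mathfrak {h} = \mathbb {R}^7\), \(\mathfrak {v}\) the first four coordinates):

```python
import numpy as np

# Left multiplications by i, j, k on H = R^4.
Li = np.array([[0,-1,0,0],[1,0,0,0],[0,0,0,-1],[0,0,1,0]], float)
Lj = np.array([[0,0,-1,0],[0,0,0,1],[1,0,0,0],[0,-1,0,0]], float)
Lk = np.array([[0,0,0,-1],[0,0,-1,0],[0,1,0,0],[1,0,0,0]], float)
L = [Li, Lj, Lk]

def J(Z):
    return Z[0]*Li + Z[1]*Lj + Z[2]*Lk

def Phi(Zp, Zpp):
    """Riehm's map Phi_{Z'Z''} as a 7x7 block matrix on h = v + z."""
    M = np.zeros((7, 7))
    M[:4, :4] = J(Zp) @ J(Zpp)                            # J_{Z'} J_{Z''} on v
    M[4:, 4:] = 2*(np.outer(Zpp, Zp) - np.outer(Zp, Zpp)) # action on z
    return M

def bracket7(U, V):
    """Lie bracket on h = R^7: depends only on the v-parts, lands in z."""
    X, Y = U[:4], V[:4]
    W = np.zeros(7)
    W[4:] = [(Lm @ X) @ Y for Lm in L]
    return W

e = np.eye(3)
rng = np.random.default_rng(2)
U, V = rng.standard_normal(7), rng.standard_normal(7)

# Phi_{Z'Z''} is a skew-symmetric derivation (Lemma 2.2) ...
P = Phi(e[0], e[1])
assert np.allclose(P.T, -P)
assert np.allclose(P @ bracket7(U, V), bracket7(P @ U, V) + bracket7(U, P @ V))

# ... and, with common first index, [Phi_12, Phi_13] = 2 Phi_23.
P12, P13, P23 = Phi(e[0], e[1]), Phi(e[0], e[2]), Phi(e[1], e[2])
assert np.allclose(P12 @ P13 - P13 @ P12, 2*P23)
```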

Corollary 2.3

Suppose that D is a grading-preserving derivation of \(\mathfrak {h}\). Then we may write D as \(D_0 + D_1\), where \(D_0 \in \mathcal {D}(\mathfrak {h})\) and \(D_0\vert _{\mathfrak {z}}\) is symmetric, and \(D_1 \in \mathcal {R}(\mathfrak {h})\).

Proof

The skew-symmetric part of the restriction \(D|_\mathfrak {z}\) of D to \(\mathfrak {z}\) decomposes as a linear combination of the \(\Phi _{ij}|_\mathfrak {z}\) defined above; we take \(D_1\) to be the same linear combination of the \(\Phi _{ij}\), and \(D_0\) to be \(D - D_1\). The result follows immediately. \(\square \)

Corollary 2.4

Suppose that \(D \in \mathcal {D}(\mathfrak {h})\) and \(D\vert _{\mathfrak {z}}\) is symmetric. Then \(D^{\mathsf {T}}\in \mathcal {D}(\mathfrak {h})\).

Proof

Since \(D\vert _{\mathfrak {z}}\) is symmetric, it is diagonalisable. Take an eigenvector Z in \(\mathfrak {z}\) with eigenvalue \(2\mu \). By Proposition 2.1,

$$\begin{aligned} 2\mu J_Z = J_{DZ} = J_{D^{\mathsf {T}}Z} = D^{\mathsf {T}}J_Z + J_Z D, \end{aligned}$$

whence multiplication on both sides by \(J_Z\) gives

$$\begin{aligned} -2\mu \Vert Z\Vert ^2 J_Z = -\Vert Z\Vert ^2 J_ZD^{\mathsf {T}}- \Vert Z\Vert ^2 D J_Z, \end{aligned}$$

and

$$\begin{aligned} J_{DZ} = D J_Z + J_ZD^{\mathsf {T}}. \end{aligned}$$

This holds for all eigenvectors Z of \(D\vert _{\mathfrak {z}}\), and so for all \(Z \in \mathfrak {z}\) by linearity; hence \(D^{\mathsf {T}}\) is a derivation, again by Proposition 2.1. \(\square \)

Corollary 2.5

Suppose that D is a grading-preserving endomorphism of \(\mathfrak {h}\). Then \(D \in \mathcal {D}(\mathfrak {h})\) if and only if \(D^{\mathsf {T}}\in \mathcal {D}(\mathfrak {h})\). Hence if \(D \in \mathcal {D}(\mathfrak {h})\), then we may write D as \(D^a+D^s\), where \(D^s \in \mathcal {D}^{\mathrm {sym}}(\mathfrak {h})\) and \(D^a\in \mathcal {D}^{\mathrm {skew}}(\mathfrak {h})\).

Proof

For the first part, it suffices to suppose that \(D \in \mathcal {D}(\mathfrak {h})\) and show that \(D^{\mathsf {T}}\in \mathcal {D}(\mathfrak {h})\). In light of Corollary 2.3, by subtracting off an element of \(\mathcal {R}(\mathfrak {h})\) if necessary, we may assume that \(D\vert _{\mathfrak {z}}\) is symmetric. It then follows from Corollary 2.4 that \(D^{\mathsf {T}}\in \mathcal {D}(\mathfrak {h})\), as required.

For the second part of the corollary, take

$$\begin{aligned} D^s = \frac{1}{2} \left( D+D^{\mathsf {T}}\right) \quad \mathrm{and}\quad D^a = \frac{1}{2} \left( D-D^{\mathsf {T}}\right) ; \end{aligned}$$

the conclusion is obvious. \(\square \)

Hence, to describe the elements of \(\mathcal {D}(\mathfrak {h})\), we can study symmetric and skew-symmetric derivations separately. First we consider the skew-symmetric derivations.

Corollary 2.6

Each D in \(\mathcal {D}^{\mathrm {skew}}(\mathfrak {h})\) decomposes as a sum \(D_0 +R\), where \(D_0 \in \mathcal {D}^{\mathrm {skew}}_0(\mathfrak {h})\) and \(R \in \mathcal {R}(\mathfrak {h})\). In particular, \(D_0 \vert _{\mathfrak {v}}\) commutes with all the maps \(J_Z\).

Proof

This is a consequence of Corollary 2.3 and Proposition 2.1. \(\square \)

Now we consider a symmetric derivation D, which is diagonalisable with real eigenvalues. Since D preserves \(\mathfrak {v}\) and \(\mathfrak {z}\), these spaces decompose into eigenspaces for D. We take \(\mathfrak {v}_{\lambda }\) to be the eigenspace of \(D\vert _{\mathfrak {v}}\) for the eigenvalue \(\lambda \), and, given a subspace \(\mathfrak {s}\) of \(\mathfrak {h}\), we write \(P_{\mathfrak {s}}\) for the orthogonal projection of \(\mathfrak {h}\) onto \(\mathfrak {s}\).

Proposition 2.7

Suppose that \(D\in \mathcal {D}^{\mathrm {sym}}(\mathfrak {h})\). Then \(D\vert _{\mathfrak {z}} = 2\mu I_{\mathfrak {z}}\) for some \(\mu \) in \(\mathbb {R}\). Moreover, if \(X \in \mathfrak {v}_\lambda \), then

$$\begin{aligned} D J_Z X = ( 2\mu - \lambda ) J_Z X \quad \text {and}\quad D J_Z J_{Z'} X = \lambda J_Z J_{Z'} X \end{aligned}$$
(2.5)

for all Z and \(Z'\) in \(\mathfrak {z}\).

Proof

Fix an orthonormal basis \(\{ Z_1, \dots , Z_m \}\) of \(\mathfrak {z}\) such that \(D Z_i = 2\mu _i Z_i\) when \(i = 1, \dots , m\), with \(\mu _i \in \mathbb R\). From (2.4), it follows that

$$\begin{aligned} D J_{Z_i} X = J_{D Z_i} X - J_{Z_i} D X = (2\mu _i - \lambda ) J_{Z_i} X \end{aligned}$$

when \(i = 1, \dots , m\), and the first formula of (2.5) is established, and similarly,

$$\begin{aligned} D J_{Z_i} J_{Z_k} X = (2\mu _i - 2\mu _k + \lambda ) J_{Z_i} J_{Z_k} X \end{aligned}$$
(2.6)

when \(i, k = 1, \dots , m\).

If \(\dim (\mathfrak {z}) = 1\), then \(D\vert _{\mathfrak {z}} = 2\mu I_{\mathfrak {z}}\) for some \(\mu \) in \(\mathbb {R}\) and the second formula of (2.5) is trivial, so we suppose henceforth that \(\dim (\mathfrak {z}) > 1\). By interchanging i and k in (2.6), we see that

$$\begin{aligned} D J_{Z_k} J_{Z_i} X = ( 2\mu _k - 2\mu _i + \lambda ) J_{Z_k} J_{Z_i} X , \end{aligned}$$

which yields

$$\begin{aligned} D J_{Z_i} J_{Z_k} X = (2\mu _k - 2\mu _i + \lambda ) J_{Z_i} J_{Z_k} X , \end{aligned}$$

when \(i \ne k\), since \(J_{Z_i} J_{Z_k} = - J_{Z_k} J_{Z_i}\) by (2.3). This equality, compared with (2.6), shows that \(\mu _i = \mu _k\), and the proposition follows. \(\square \)

Corollary 2.8

Let D be a derivation in \(\mathcal {D}^{\mathrm {sym}}(\mathfrak {h})\). Denote by \(2\mu \) the eigenvalue of D on \(\mathfrak {z}\), by \(\{ \lambda _1, \dots , \lambda _r \}\) the distinct eigenvalues of D on \(\mathfrak {v}\), listed in decreasing order, and by \(\mathfrak {v}_i\) the corresponding eigenspaces. Then \(\lambda _i + \lambda _{r+1-i} = 2\mu \), and we may write

$$\begin{aligned} D = \mu \bigl ( 2 P_{\mathfrak {z}} + P_{\mathfrak {v}} \bigr ) + \sum _{i = 1}^{\lfloor r/2 \rfloor } \left( \lambda _i - \mu \right) \bigl ( P_{\mathfrak {v}_i} - P_{\mathfrak {v}_{r+1-i}} \bigr ); \end{aligned}$$

all the maps \(\bigl ( P_{\mathfrak {v}_i} - P_{\mathfrak {v}_{r+1-i}} \bigr )\) and \(2 P_{\mathfrak {z}} + P_{\mathfrak {v}} \) are derivations.

Proof

This follows from Propositions 2.7 and 2.1. \(\square \)
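For a worked instance of Corollary 2.8 (our own illustration, not from the paper), take the complex Heisenberg algebra \(\mathbb {C}^2 \oplus \mathbb {R}\): conjugation C on \(\mathfrak {v}\) is symmetric and anticommutes with \(J\), so by Proposition 2.1 it is a symmetric derivation vanishing on \(\mathfrak {z}\); adding the dilation \(\mu (2P_{\mathfrak {z}} + P_{\mathfrak {v}})\) gives a symmetric derivation with eigenvalues \(\mu \pm 1\) on \(\mathfrak {v}\), which pair up to \(2\mu \) as the corollary predicts. The value of \(\mu \) below is an arbitrary choice.

```python
import numpy as np

# Complex Heisenberg algebra: v = C^2 = R^4, z = R; J_t = t J0, J0 = mult by i.
J0 = np.kron(np.eye(2), np.array([[0., -1.], [1., 0.]]))

def bracket(X, Y):
    """[X, Y] in z = R, recovered from (2.1)."""
    return (J0 @ X) @ Y

# C = complex conjugation on each C factor: symmetric, anticommutes with J0.
C = np.diag([1., -1., 1., -1.])
assert np.allclose(C @ J0, -J0 @ C)

mu = 0.7                      # arbitrary
Dv = mu*np.eye(4) + C         # D on v: eigenvalues mu + 1 and mu - 1
Dz = 2*mu                     # D on z

rng = np.random.default_rng(3)
X, Y = rng.standard_normal(4), rng.standard_normal(4)
# The derivation identity D[X,Y] = [DX,Y] + [X,DY]:
assert np.isclose(Dz * bracket(X, Y), bracket(Dv @ X, Y) + bracket(X, Dv @ Y))

# Eigenvalues on v pair up as in Corollary 2.8: lambda_i + lambda_{r+1-i} = 2 mu.
lam = np.sort(np.linalg.eigvalsh(Dv))
assert np.isclose(lam[0] + lam[-1], 2*mu)
```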

3 Structure of semisimple Lie algebras

In this section, we describe the restricted root structure and the standard Iwasawa and Bruhat decompositions of a semisimple Lie algebra. Then we exhibit a number of H-type subalgebras of the Iwasawa \(\mathfrak {n}\) subalgebra. Next, we analyse the structure of \(\mathfrak {g}\) and \(\mathfrak {n}\) in more detail.

3.1 Semisimple Lie algebras of the noncompact type

Take a real semisimple Lie algebra \(\mathfrak {g}\) with Killing form B and Cartan involution \(\theta \), and let \(\mathfrak {k} \oplus \mathfrak {p}\) be the corresponding Cartan decomposition of \(\mathfrak {g}\). Fix a maximal subalgebra \(\mathfrak {a}\) of \(\mathfrak {p}\); its dimension is known as the real rank of \(\mathfrak {g}\). Given an element \(\alpha \) of \({{\mathrm{Hom}}}(\mathfrak {a}, \mathbb {R})\), we define the (possibly trivial) subspace \(\mathfrak {g}_{\alpha }\) of \(\mathfrak {g}\) by

$$\begin{aligned} \mathfrak {g}_{\alpha } = \left\{ X \in \mathfrak {g} : [ H, X ] = \alpha (H) X, \ \forall H \in \mathfrak {a} \right\} . \end{aligned}$$

Then \(\alpha \) is said to be a restricted root if \(\alpha \ne 0\) and \(\mathfrak {g}_{\alpha } \ne \{0\}\). We denote by \(\Sigma \) the restricted root system, that is, the set of all restricted roots. Note that \([\mathfrak {g}_\alpha , \mathfrak {g}_\beta ] \subseteq \mathfrak {g}_{\alpha +\beta }\) for all \(\alpha , \beta \in {{\mathrm{Hom}}}(\mathfrak {a}, \mathbb {R})\), because \({\text {ad}}(H)\) is a derivation for each \(H \in \mathfrak {a}\). Hence if \(\alpha \) and \(\beta \) are roots, then \(\alpha +\beta \) is also a root, unless \(\alpha +\beta = 0\) or \([\mathfrak {g}_\alpha , \mathfrak {g}_\beta ] =\{0\}\). Since \(\mathfrak {a}\) is \(\theta \)-invariant, so is \(\mathfrak {g}_0\), and it follows easily that \(\mathfrak {g}_0 = \mathfrak {m} \oplus \mathfrak {a}\), where \(\mathfrak {m} = \mathfrak {g}_0 \cap \mathfrak {k}\). Then

$$\begin{aligned} \mathfrak {g} = \mathfrak {m} \oplus \mathfrak {a} \oplus \sum _{\alpha \in \Sigma } \mathfrak {g}_{\alpha } . \end{aligned}$$

Henceforth, in this paper, unless stated explicitly otherwise, we write rank and root rather than real rank and restricted root for brevity; this should not create any confusion.

We recall that \(\Sigma \) is said to be decomposable if \(\Sigma = \Sigma _1 \cup \Sigma _2\), where \(\Sigma _1\) and \(\Sigma _2\) are disjoint nontrivial subsets of \(\Sigma \) and \(\left\langle \gamma , \delta \right\rangle = 0\) for all \(\gamma \in \Sigma _1\) and all \(\delta \in \Sigma _2\), and indecomposable otherwise. It is standard (see, for instance, Helgason [13] or Knapp [15]) that \(\Sigma \) is indecomposable if and only if \(\mathfrak {g}\) is simple, that is, cannot be written as a direct sum of nontrivial pairwise commuting ideals. We recall also that \(\Sigma \) is said to be reduced if the only multiples of a root \(\gamma \) that also lie in \(\Sigma \) are \(\pm \gamma \).

A Weyl chamber is a maximal open subset of \(\mathfrak {a}\) in which no root vanishes. We choose one of these, C say, and say that a root \(\gamma \) is positive, and write \(\gamma \in \Sigma ^+\), when \(\gamma (H) >0\) for all \(H \in C\). Then \(\Sigma ^+\) is closed under addition and \(\Sigma = \Sigma ^+ \cup (-\Sigma ^+)\). We write \(\Delta \) for the smallest subset of \(\Sigma ^+\) such that the boundary of C is a subset of the set \(\bigcup _{\alpha \in \Delta } \{ H \in \mathfrak {a} : \alpha (H) = 0\}\); the roots in \(\Delta \) are called simple. Set

$$\begin{aligned} \mathfrak {n} = \sum _{\alpha \in \Sigma ^+} \mathfrak {g}_{\alpha } . \end{aligned}$$

Then we obtain the Bruhat decomposition of \(\mathfrak {g}\), namely,

$$\begin{aligned} \mathfrak {g} = \theta \mathfrak {n} \oplus \mathfrak {m} \oplus \mathfrak {a} \oplus \mathfrak {n} . \end{aligned}$$

Each root \(\gamma \) in \(\Sigma ^+\) may be written uniquely as a sum \(\sum _{\alpha \in \Delta } n_\alpha \alpha \), where each \(n_\alpha \) is a nonnegative integer. The positive integer \(\sum _{\alpha \in \Delta } n_\alpha \) is called the height of \(\gamma \), and written \({{\mathrm{height}}}(\gamma )\). Clearly the height of a simple root is 1, and moreover

$$\begin{aligned} {{\mathrm{height}}}(\gamma +\delta ) = {{\mathrm{height}}}(\gamma ) + {{\mathrm{height}}}(\delta ) \end{aligned}$$

for all \(\gamma , \delta \in \Sigma ^+\) such that \(\gamma +\delta \in \Sigma ^+\). Then \(\mathfrak {n}\) is graded by height; more precisely, we may write \(\mathfrak {n} = \sum _{h \in \mathbb {Z}^+} \mathfrak {g}_h\), where \([\mathfrak {g}_h, \mathfrak {g}_{k} ] \subseteq \mathfrak {g}_{h+k}\).
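As a small sketch of the height grading (our illustration, not part of the text), take the root system \(G_2\), with positive roots written in simple-root coordinates, \(\alpha \) short and \(\beta \) long; the heights are \(1, 1, 2, 3, 4, 5\), and height is additive whenever a sum of positive roots is again a root, so the bracket respects the layers \(\mathfrak {g}_h\).

```python
# Positive roots of G2 in simple-root coordinates (alpha, beta).
pos = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 1), (3, 2)]

def height(g):
    """height(gamma) = sum of the coefficients over the simple roots."""
    return sum(g)

# Whenever gamma + delta is again a positive root, its height is the sum of
# the heights; this is what makes [g_h, g_k] land in g_{h+k}.
for g in pos:
    for d in pos:
        s = (g[0] + d[0], g[1] + d[1])
        if s in pos:
            assert height(s) == height(g) + height(d)

# The grading layers g_1, ..., g_5 collect the roots of each height.
layers = {h: [g for g in pos if height(g) == h] for h in range(1, 6)}
assert [len(layers[h]) for h in range(1, 6)] == [2, 1, 1, 1, 1]
```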

3.2 Reduction to the simple case

Our first simplification is a reduction of the problem to the case of the Iwasawa \(\mathfrak {n}\) subalgebra of a simple Lie algebra \(\mathfrak {g}\).

Proposition 3.1

Suppose that \(\mathfrak {g} = \mathfrak {g}^1 \oplus \mathfrak {g}^2 \oplus \dots \oplus \mathfrak {g}^J\), where \(J>1\) and each \(\mathfrak {g}^j\) is a nontrivial simple ideal, and that \(\mathfrak {n} = \mathfrak {n}^1 \oplus \mathfrak {n}^2 \oplus \dots \oplus \mathfrak {n}^J\) is the corresponding decomposition of \(\mathfrak {n}\) into subalgebras. Then

$$\begin{aligned} \mathcal {D}(\mathfrak {n}) = \sum _{j=1}^J \mathcal {D}(\mathfrak {n}^j). \end{aligned}$$

Remark 3.2

This is to be interpreted in the sense that each root-space-preserving derivation of \(\mathfrak {n}\) preserves each of the subalgebras \(\mathfrak {n}^j\), and the restriction to each subalgebra is a root-space-preserving derivation thereof, and vice versa.

Some of the simple summands \(\mathfrak {g}^j\) may be compact. In this case, the corresponding space \(\mathfrak {n}^j\) is \(\{0\}\); we define \(\mathcal {D}(\{0\}) = \{0\}\).

Proof

Since D in \(\mathcal {D}(\mathfrak {n})\) preserves the root spaces, it preserves each \(\mathfrak {g}_\alpha \) and hence each \(\mathfrak {n}^j\). So one direction of the assertion is proved. The other is obvious. \(\square \)

Remark 3.3

If we replace the root-space-preserving assumption by a grading-preserving assumption, and add the hypothesis that no summand is isomorphic to \(\mathfrak {so}(n,1)\) for any n, then the result still holds. Indeed, when there is no abelian summand, \(\mathfrak {n}\) is “totally nonabelian” in the language of Cowling and Ottazzi [11], and the conclusion follows from [11, Corollary 2.4].

3.3 The simple case

In light of Proposition 3.1, we may and shall assume that \(\mathfrak {g}\) is simple in the rest of this paper.

Two observations underpin our approach to the study of derivations. First, derivations are local, in the sense that if D is a root-space-preserving linear endomorphism of \(\mathfrak {n}\), then linearity implies that D is a derivation if and only if

$$\begin{aligned} D\left[ X,Y\right] = \left[ DX,Y\right] + \left[ X,DY\right] \qquad \forall X \in \mathfrak {g}_\gamma \quad \forall Y \in \mathfrak {g}_\delta , \end{aligned}$$

as \(\gamma \) and \(\delta \) range over \(\Sigma ^+\). This identity holds trivially if \(\gamma +\delta \) is not a root, for then both sides are 0. If \(\gamma +\delta \) is a root, then the subalgebra \(\mathfrak {n}^{\{\gamma , \delta \}}\), defined by

$$\begin{aligned} \mathfrak {n}^{\{\gamma , \delta \}} = \sum _{\epsilon \in \Sigma ^+ \cap \left( \mathbb {R}\gamma +\mathbb {R}\delta \right) } \mathfrak {g}_\epsilon , \end{aligned}$$

is the Iwasawa \(\mathfrak {n}\) subalgebra of a simple subalgebra of \(\mathfrak {g}\), whose rank is 1 if \(\gamma \) and \(\delta \) are proportional and 2 otherwise. Then we can understand D provided we understand its restriction to Iwasawa \(\mathfrak {n}\) algebras of simple Lie algebras of rank one and rank two.

The second observation is that \(\mathfrak {n}\) may be equipped with a natural inner product so that, in the rank-one case, \(\mathfrak {n}\) itself is an H-type algebra, while in the rank-two cases, \(\mathfrak {n}\) has many H-type subalgebras. We will use what we know about the derivations of H-type algebras, but first we need to find H-type subalgebras of \(\mathfrak {n}\).

3.4 Subalgebras of \(\mathfrak {n}\) of H-type

If \(c > 0\), then the symmetric bilinear form \(\left\langle \cdot , \cdot \right\rangle \) on \(\mathfrak {g}\), given by

$$\begin{aligned} \left\langle X, Y \right\rangle = - c B(X, \theta Y) , \end{aligned}$$
(3.1)

is an inner product, which induces an inner product on the dual of \(\mathfrak {a}\), also written \(\left\langle \cdot , \cdot \right\rangle \); we denote the corresponding norms by \(\Vert \cdot \Vert \). We fix c so that the length of the longest roots is \(\sqrt{2}\). In the vector space decomposition \(\mathfrak {m} \oplus \mathfrak {a} \oplus \sum _{\alpha \in \Sigma } \mathfrak {g}_\alpha \) of \(\mathfrak {g}\), the distinct summands are orthogonal.

Now the Killing form satisfies the well-known identity

$$\begin{aligned} B\left( \left[ Z,X\right] ,Y\right) + B\left( X, \left[ Z, Y\right] \right) = 0 \quad \forall X,Y,Z \in \mathfrak {g}, \end{aligned}$$

and so \({\text {ad}}(Y)^{\mathsf {T}}= - {\text {ad}}(\theta Y)\), that is,

$$\begin{aligned} \left\langle X, \left[ Y, Z \right] \right\rangle = - \left\langle \left[ \theta Y , X\right] , Z \right\rangle \quad \forall X,Y,Z \in \mathfrak {g}. \end{aligned}$$
(3.2)

If \(\gamma \in \Sigma \) and \(X,Y \in \mathfrak {g}_\gamma \), then \([\theta X, Y] \in \mathfrak {g}_0\). Further, for all \(H \in \mathfrak {a}\),

$$\begin{aligned} \left\langle H, \left[ \theta X, Y\right] \right\rangle = - \left\langle \left[ X, H\right] , Y \right\rangle = \left\langle \left[ H, X\right] , Y \right\rangle = \gamma (H) \left\langle X, Y\right\rangle . \end{aligned}$$
(3.3)

On the one hand, if \(X \perp Y\), then \(\left\langle H, [\theta X, Y] \right\rangle = 0\), and so \([\theta X, Y] \in \mathfrak {a}^\perp \), whence \([\theta X, Y] \in \mathfrak {m}\). On the other hand, \(\theta [\theta X,X] = - [\theta X, X]\), so \([\theta X, X] \in \mathfrak {a}\). We write \(H_\gamma \) for the unique element of \(\mathfrak {a}\) such that \(\delta (H_\gamma ) = \left\langle \delta ,\gamma \right\rangle \) for all \(\delta \in {{\mathrm{Hom}}}(\mathfrak {a}, \mathbb {R})\), or equivalently for all \(\delta \in \Sigma \). Now (3.3) implies that

$$\begin{aligned} \delta \left( \left[ \theta X, X\right] \right) = \left\langle \delta , \gamma \right\rangle \Vert X\Vert ^2 \quad \mathrm{and}\quad \left[ \theta X, X\right] = \Vert X\Vert ^2 H_\gamma \end{aligned}$$
(3.4)

for all \(X \in \mathfrak {g}_\gamma \). For future purposes, note that

$$\begin{aligned} \mathfrak {a} = \sum _{\alpha \in \Sigma ^+} \mathbb {R}H_\alpha ; \end{aligned}$$
(3.5)

in general, this sum is not direct.
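The identities (3.1)–(3.4) can be tested numerically in the simplest example, \(\mathfrak {g} = \mathfrak {sl}(2, \mathbb {R})\) (our own sketch, not from the paper; here \(\theta X = -X^{\mathsf {T}}\), the single positive root \(\alpha \) satisfies \(\alpha (H) = 2\) for \(H = {\text {diag}}(1,-1)\), and the value \(c = 1/4\) is forced by the normalisation \(\Vert \alpha \Vert = \sqrt{2}\), since \(B(H, H) = 8\)):

```python
import numpy as np

# g = sl(2, R) with basis H, E, F; theta X = -X^T.
H = np.array([[1., 0.], [0., -1.]])
E = np.array([[0., 1.], [0., 0.]])
F = np.array([[0., 0.], [1., 0.]])
basis = [H, E, F]

def br(X, Y): return X @ Y - Y @ X
def theta(X): return -X.T

def ad(X):
    """Matrix of ad(X) in the basis H, E, F (a 2x2 traceless W = aH + bE + cF
    has coordinates a = W[0,0], b = W[0,1], c = W[1,0])."""
    cols = [[br(X, B)[0, 0], br(X, B)[0, 1], br(X, B)[1, 0]] for B in basis]
    return np.array(cols).T

def Killing(X, Y): return np.trace(ad(X) @ ad(Y))

c = 0.25
def ip(X, Y): return -c * Killing(X, theta(Y))   # the inner product (3.1)

# With c = 1/4: <H, H> = 2, so ||alpha||^2 = alpha(H)^2 / <H, H> = 2, and
# H_alpha = H, since alpha(H_alpha) must equal <alpha, alpha> = 2.
assert np.isclose(ip(H, H), 2.0)
# ||E|| = 1, and (3.4): [theta E, E] = ||E||^2 H_alpha = H.
assert np.isclose(ip(E, E), 1.0)
assert np.allclose(br(theta(E), E), H)

# (3.2): <X, [Y, Z]> = -<[theta Y, X], Z>, on a random triple.
rng = np.random.default_rng(4)
def rand_sl2():
    a, b, d = rng.standard_normal(3)
    return a*H + b*E + d*F
X, Y, Z = rand_sl2(), rand_sl2(), rand_sl2()
assert np.isclose(ip(X, br(Y, Z)), -ip(br(theta(Y), X), Z))
```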

Our next results allow us to find various subalgebras of \(\mathfrak {n}\) that are H-type algebras, or nearly so.

Lemma 3.4

Suppose that \(\gamma \), \(\delta \), and \(\gamma +\delta \) are positive roots. For all Z in \(\mathfrak {g}_{\gamma +\delta }\), we define the linear operator \(J_Z\) on \(\mathfrak {g}_\gamma \oplus \mathfrak {g}_\delta \) by

$$\begin{aligned} J_Z = {\text {ad}}(Z) \circ \theta . \end{aligned}$$
(3.6)

Then \(J_Z\) maps \(\mathfrak {g}_\gamma \) into \(\mathfrak {g}_\delta \) and \(\mathfrak {g}_\delta \) into \(\mathfrak {g}_\gamma \); further

$$\begin{aligned} \left\langle J_Z X, Y \right\rangle = \left\langle Z, \left[ X,Y\right] \right\rangle \quad \forall X, Y \in \mathfrak {g}_\gamma \oplus \mathfrak {g}_\delta . \end{aligned}$$
(3.7)

Proof

The mapping properties of \(J_Z\) are consequences of the orthogonality of distinct root spaces, while (3.7) follows from the definition of \(J_Z\) and (3.2). \(\square \)
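Spelled out (a routine expansion, included for the reader's convenience): for \(X \in \mathfrak {g}_\gamma \) and \(Y \in \mathfrak {g}_\delta \),

```latex
\langle J_Z X, Y \rangle
  = \langle [Z, \theta X], Y \rangle
  = - \langle [\theta X, Z], Y \rangle
  = \langle Z, [X, Y] \rangle ,
```

the last step being (3.2) applied with \(Z, X, Y\) in place of \(X, Y, Z\); and since \(\theta X \in \mathfrak {g}_{-\gamma }\), the bracket \([Z,\theta X]\) lies in \(\mathfrak {g}_{\delta }\), which gives the mapping property.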

Lemma 3.5

Suppose that \(\gamma \), \(\delta \), and \(\gamma +\delta \) are positive roots, and that \(J_Z\) is defined as in Lemma 3.4. Suppose also that neither \(2\gamma +\delta \) nor \(\gamma +2\delta \) is a root. Then

$$\begin{aligned} \left[ X,J_Z X\right] = \left\langle \gamma +\delta ,\gamma \right\rangle \Vert X\Vert ^2 Z \quad \forall X \in \mathfrak {g}_\gamma \oplus \mathfrak {g}_\delta \end{aligned}$$
(3.8)

and

$$\begin{aligned} J_Z ^2 X = - \left\langle \gamma +\delta ,\gamma \right\rangle \Vert Z\Vert ^2 X \quad \forall X \in \mathfrak {g}_\gamma \oplus \mathfrak {g}_\delta . \end{aligned}$$
(3.9)

Thus if \(Z \ne 0\), then \(J_Z\) is a linear isomorphism of \(\mathfrak {g}_\gamma \oplus \mathfrak {g}_\delta \) that exchanges \(\mathfrak {g}_\delta \) and \(\mathfrak {g}_\gamma \). Moreover, if \(\gamma = \delta \), then \(\mathfrak {g}_{\gamma } \oplus \mathfrak {g}_{2\gamma }\) is an H-type algebra, while if neither \(2\gamma \) nor \(2\delta \) is a root, then \(\mathfrak {g}_{\gamma }\oplus \mathfrak {g}_{\delta } \oplus \mathfrak {g}_{\gamma +\delta }\) is an H-type algebra.

Proof

When \(2\gamma +\delta \) is not a root, \([X,Z] = 0\) for all Z in \(\mathfrak {g}_{\gamma +\delta }\) and all X in \(\mathfrak {g}_\gamma \). Hence, from the Jacobi identity and (3.4),

$$\begin{aligned} {[}X, J_Z X]&=\left[ X, \left[ Z,\theta X\right] \right] \\&= \left[ \left[ X,Z\right] , \theta X\right] + \left[ Z, \left[ X,\theta X\right] \right] \\&=\left( \gamma +\delta \right) \left( \left[ \theta X,X\right] \right) Z\\&=\left\langle \gamma +\delta ,\gamma \right\rangle \Vert X\Vert ^2 Z, \end{aligned}$$

and similarly,

$$\begin{aligned} J_Z \left( J_Z X\right)&= \left[ Z, \theta \left[ Z,\theta X\right] \right] \\&= \left[ Z, \left[ \theta Z, X\right] \right] \\&= \left[ X, \left[ \theta Z, Z\right] \right] +\left[ \theta Z, \left[ Z, X\right] \right] \\&= - \left\langle \gamma +\delta ,\gamma \right\rangle \Vert Z\Vert ^2 X. \end{aligned}$$

By exchanging the role of \(\gamma \) and \(\delta \) in the last two formulae, we see that

$$\begin{aligned} \left[ Y, J_Z Y\right] =\left\langle \gamma +\delta ,\delta \right\rangle \Vert Y\Vert ^2 Z \end{aligned}$$

and

$$\begin{aligned} J_Z \left( J_Z Y\right) = -\left\langle \gamma +\delta ,\delta \right\rangle \Vert Z\Vert ^2 Y \end{aligned}$$

for all \(Z \in \mathfrak {g}_{\gamma +\delta }\) and all \(Y \in \mathfrak {g}_\delta \) when \(\gamma +2\delta \) is not a root. Hence, granted the equality \(\left\langle \gamma +\delta ,\gamma \right\rangle = \left\langle \gamma +\delta ,\delta \right\rangle \) established in the next paragraph, (3.8) and (3.9) are proved, and \(J_Z\) is a linear isomorphism from \(\mathfrak {g}_\gamma \oplus \mathfrak {g}_\delta \) onto \(\mathfrak {g}_\delta \oplus \mathfrak {g}_\gamma \) when \(Z \ne 0\).

Either \(\gamma = \delta \) or the roots \(\gamma \) and \(\delta \) span a root system of rank 2. By inspection of the possibilities, we see that the hypotheses that \(\gamma \), \(\delta \), and \(\gamma +\delta \) are roots and \(2\gamma +\delta \) and \(\gamma +2\delta \) are not roots imply that \(\Vert \gamma \Vert = \Vert \delta \Vert \) and \(\left\langle \gamma +\delta ,\delta \right\rangle = \left\langle \gamma +\delta ,\gamma \right\rangle > 0\). Now (3.8) and (3.9) follow immediately. Further, if \(\gamma = \delta \) or neither \(2\gamma \) nor \(2\delta \) is a root, then \(\left\langle \gamma +\delta ,\gamma \right\rangle = 1\), and so

$$\begin{aligned} J_Z^2 X = -\Vert Z\Vert ^2 X \quad \forall X \in \mathfrak {g}_\gamma \oplus \mathfrak {g}_\delta \quad \forall Z \in \mathfrak {g}_{\gamma +\delta }, \end{aligned}$$

as required. \(\square \)

Remark 3.6

We have just shown that the Iwasawa \(\mathfrak {n}\) algebras of real rank one simple Lie algebras are H-type. Further, inspection of the root systems of rank one and two shows that if \(\gamma \), \(\delta \), and \(\gamma +\delta \) are roots and \(2\gamma +\delta \) and \(\gamma +2\delta \) are not roots, then either \(2\gamma \) and \(2(\gamma +\delta )\) are both roots, or neither is a root.

Corollary 3.7

Suppose that \(\gamma \), \(\delta \), and \(\gamma +\delta \) are positive roots, and that neither \(2\gamma +\delta \) nor \(\gamma +2\delta \) is a root. If D is a root-space-preserving derivation of \(\mathfrak {g}\) whose restriction to \(\mathfrak {g}_{\gamma +\delta }\) is symmetric, then

$$\begin{aligned} D^{\mathsf {T}}[X,Y] = \left[ D^{\mathsf {T}}X, Y\right] + \left[ X, D^{\mathsf {T}}Y\right] \qquad \forall X \in \mathfrak {g}_\gamma \quad \forall Y \in \mathfrak {g}_\delta . \end{aligned}$$
(3.10)

Proof

The proof is a mild generalisation of the proof of Corollary 2.4. Observe first that if E is a root-space-preserving linear endomorphism of \(\mathfrak {g}_{\gamma } \oplus \mathfrak {g}_{\delta } \oplus \mathfrak {g}_{\gamma +\delta }\), then \(E[X,Y] = [EX,Y]+[X,EY]\) if and only if

$$\begin{aligned} \begin{aligned}{} \left\langle J_{E^{\mathsf {T}}Z} X, Y \right\rangle&= \left\langle E^{\mathsf {T}}Z, \left[ X, Y \right] \right\rangle \\&= \left\langle Z, E \left[ X, Y \right] \right\rangle \\&= \left\langle Z, \left[ E X, Y \right] \right\rangle + \left\langle Z, \left[ X, E Y \right] \right\rangle \\&= \left\langle J_Z {E X}, Y \right\rangle + \left\langle E^{\mathsf {T}}J_Z X, Y \right\rangle , \end{aligned} \end{aligned}$$
(3.11)

for all \(X \in \mathfrak {g}_{\gamma }\), all \(Y \in \mathfrak {g}_{\delta }\), and all \(Z \in \mathfrak {g}_{\gamma +\delta }\).

Since \(D\vert _{\mathfrak {g}_{\gamma +\delta }}\) is symmetric, it is diagonalisable. Take an eigenvector Z in \(\mathfrak {g}_{\gamma +\delta }\) with eigenvalue \(2\mu \). By (3.11),

$$\begin{aligned} 2\mu J_Z = J_{DZ} = J_{D^{\mathsf {T}}Z} = D^{\mathsf {T}}J_Z + J_Z D, \end{aligned}$$

whence composition on both sides by \(J_Z\) gives

$$\begin{aligned} -2\mu \Vert Z\Vert ^2 J_Z = -\Vert Z\Vert ^2 J_ZD^{\mathsf {T}}- \Vert Z\Vert ^2 D J_Z, \end{aligned}$$

and

$$\begin{aligned} J_{DZ} = D J_Z + J_ZD^{\mathsf {T}}. \end{aligned}$$

This holds for all eigenvectors Z of \(D\vert _{\mathfrak {g}_{\gamma +\delta }}\), and so for all \(Z \in \mathfrak {g}_{\gamma +\delta }\) by linearity, so (3.10) holds by (3.11). \(\square \)
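To make the final step explicit (this merely unwinds (3.11)): condition (3.11) for the endomorphism \(E = D^{\mathsf {T}}\) reads, as an operator identity on \(\mathfrak {g}_\gamma \),

```latex
J_{(D^{\mathsf{T}})^{\mathsf{T}} Z} = J_{DZ} = D\, J_Z + J_Z\, D^{\mathsf{T}}
  \qquad \forall Z \in \mathfrak{g}_{\gamma+\delta},
```

which is the identity obtained above; hence \(D^{\mathsf {T}}\) satisfies the derivation property (3.10).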

We are supposing that \(\mathfrak {g}\) is simple, so \(\Sigma \) is indecomposable. In particular, \(\Sigma \) contains just one highest root (see Bourbaki [1, p. 165, Proposition 25]), which we denote by \(\omega \). We fix the constant c in (3.1) by requiring that \(\Vert \omega \Vert ^2 = 2\). Then for each \(\gamma \in \Sigma \), the number \(\left\langle \gamma , \omega \right\rangle \) is one of \(\pm 2\), \(\pm 1\) and 0; further, it is \(\pm 2\) if and only if \(\gamma = \pm \omega \).

Define

$$\begin{aligned} \Sigma _1 = \{ \gamma \in \Sigma :\left\langle \gamma , \omega \right\rangle = 1 \} \quad \mathrm{and} \quad \Sigma _0 = \{ \gamma \in \Sigma :\left\langle \gamma , \omega \right\rangle = 0 \} , \end{aligned}$$

and write \(\Sigma _{0}^+\) for \(\Sigma ^+ \cap \Sigma _{0}\). Then, by Ciatti [6, Lemma 2.1],

$$\begin{aligned} \Sigma ^+ = \Sigma _{0}^+ \cup \Sigma _1 \cup \{ \omega \} . \end{aligned}$$

Further, define

$$\begin{aligned} \mathfrak {v} = \sum _{\gamma \in \Sigma _1} \mathfrak {g}_{\gamma } , \quad \mathfrak {h} = \mathfrak {v} \oplus \mathfrak {g}_{\omega } \quad \mathrm{and}\quad \mathfrak {n}_0 = \sum _{\gamma \in \Sigma _{0}^+} \mathfrak {g}_{\gamma } ; \end{aligned}$$

then

$$\begin{aligned} \mathfrak {n} = \mathfrak {n}_0 \oplus \mathfrak {v} \oplus \mathfrak {g}_{\omega } = \mathfrak {n}_0 \oplus \mathfrak {h} . \end{aligned}$$
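For a concrete instance of this decomposition (an illustration of our own, with a Gram matrix for the \(G_2\) simple roots chosen so that \(\Vert \omega \Vert ^2 = 2\)), one can tabulate \(\left\langle \gamma , \omega \right\rangle \) over \(\Sigma ^+\):

```python
from fractions import Fraction as F

# Gram matrix of the simple roots of G_2 (alpha short, beta long), in the
# normalisation with <omega, omega> = 2 for the highest root
# omega = 3*alpha + 2*beta.  The matrix entries are our choice of coordinates.
G = [[F(2, 3), F(-1)], [F(-1), F(2)]]

def ip(u, v):
    """Inner product of u = (m, n) ~ m*alpha + n*beta with v, via G."""
    return sum(G[i][j] * u[i] * v[j] for i in range(2) for j in range(2))

pos_roots = {                       # positive roots of G_2 as pairs (m, n)
    "alpha": (1, 0), "beta": (0, 1), "alpha+beta": (1, 1),
    "2alpha+beta": (2, 1), "3alpha+beta": (3, 1), "3alpha+2beta": (3, 2),
}
omega = pos_roots["3alpha+2beta"]
assert ip(omega, omega) == 2        # the chosen normalisation

pairing = {name: ip(gamma, omega) for name, gamma in pos_roots.items()}
Sigma0_plus = sorted(n for n, v in pairing.items() if v == 0)
Sigma1 = sorted(n for n, v in pairing.items() if v == 1)
```

In particular \(\Sigma _{0}^+ = \{\alpha \}\) and \(\Sigma _1 = \{\beta , \alpha +\beta , 2\alpha +\beta , 3\alpha +\beta \}\) here, so \(\Sigma ^+\) is indeed partitioned as \(\Sigma _{0}^+ \cup \Sigma _1 \cup \{\omega \}\).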

Following Ciatti [6], for Z in \(\mathfrak {g}_{\omega }\), we define the operator \(J_Z :\mathfrak {v} \rightarrow \mathfrak {v}\) by

$$\begin{aligned} J_Z X = [ Z, \theta X ] . \end{aligned}$$
(3.12)

Then by definition and (3.2),

$$\begin{aligned} \left\langle J_Z X, Y \right\rangle = \left\langle \left[ Z, \theta X\right] ,Y \right\rangle = \left\langle Z, \left[ X,Y\right] \right\rangle \qquad \forall X, Y \in \mathfrak {v}. \end{aligned}$$

Lemma 3.8

(Ciatti [3]) The pair \((\mathfrak {v}\oplus \mathfrak {g}_{\omega } , \left\langle \cdot , \cdot \right\rangle )\) is an H-type algebra with centre \(\mathfrak {g}_{\omega }\), that is, \([\mathfrak {v}, \mathfrak {v}] = \mathfrak {g}_\omega \) and

$$\begin{aligned} J_Z^2 X = - \Vert Z \Vert ^2 X \end{aligned}$$
(3.13)

for all Z in \(\mathfrak {g}_{\omega }\) and X in \(\mathfrak {v}\).

Proof

This follows from Lemma 3.5. \(\square \)

Now we list some H-type subalgebras of the Iwasawa \(\mathfrak {n}\) algebras of rank-two simple Lie algebras. When the root system is of type \(A_2\), then \(\mathfrak {n}\) is itself an H-type algebra. When the root system is of type \(B_2\), say \(\Sigma ^+ = \{ \alpha , \beta , \alpha +\beta , 2\alpha +\beta \}\), then \(\mathfrak {g}_{\alpha } \oplus \mathfrak {g}_{\alpha +\beta } \oplus \mathfrak {g}_{2\alpha +\beta }\) is an H-type subalgebra (and moreover \(\mathfrak {g}_{\beta } \oplus \mathfrak {g}_{\alpha +\beta } \oplus \mathfrak {g}_{2\alpha +\beta }\) is abelian and hence a degenerate H-type algebra too). When the root system is of type \(BC_2\), say \(\Sigma ^+ = \{ \alpha , 2\alpha , \beta , \alpha +\beta , 2\alpha +\beta , 2\alpha +2\beta \}\), then \(\mathfrak {g}_{\alpha } \oplus \mathfrak {g}_{2\alpha }\) and \(\mathfrak {g}_{\beta } \oplus \mathfrak {g}_{\alpha +\beta } \oplus \mathfrak {g}_{2\alpha +\beta } \oplus \mathfrak {g}_{2\alpha +2\beta }\) are H-type subalgebras, and \(\mathfrak {g}_{\alpha } \oplus \mathfrak {g}_{\alpha +\beta } \oplus \mathfrak {g}_{2\alpha +\beta }\) is close to an H-type subalgebra (see Lemma 3.5). Finally, when the root system is of type \(G_2\), say \(\Sigma ^+ = \{ \alpha , \beta , \alpha +\beta , 2\alpha +\beta , 3\alpha +\beta , 3\alpha +2\beta \}\), then \(\mathfrak {g}_{\alpha } \oplus \mathfrak {g}_{2\alpha +\beta }\oplus \mathfrak {g}_{3\alpha +\beta }\) and \(\mathfrak {g}_{\beta } \oplus \mathfrak {g}_{\alpha +\beta } \oplus \mathfrak {g}_{2\alpha +\beta } \oplus \mathfrak {g}_{3\alpha +\beta }\oplus \mathfrak {g}_{3\alpha +2\beta }\) are H-type subalgebras.
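The bracket patterns behind this list can be sanity-checked at the level of roots alone; this verifies only which brackets can be nonzero, not the H-type inner-product identities. A sketch in Python, with roots coded as coefficient pairs of our own devising:

```python
# Roots are pairs (m, n) meaning m*alpha + n*beta.

def closed_under_brackets(S, positive_roots):
    """True if every sum of two roots of S is in S or is not a root at all."""
    sums = {(mu[0] + nu[0], mu[1] + nu[1]) for mu in S for nu in S}
    return all(s in S or s not in positive_roots for s in sums)

def abelian_pattern(S, positive_roots):
    """True if no sum of two roots of S is a root (all brackets vanish)."""
    return all((mu[0] + nu[0], mu[1] + nu[1]) not in positive_roots
               for mu in S for nu in S)

B2 = {(1, 0), (0, 1), (1, 1), (2, 1)}
assert closed_under_brackets({(1, 0), (1, 1), (2, 1)}, B2)   # H-type subalgebra
assert abelian_pattern({(0, 1), (1, 1), (2, 1)}, B2)         # the abelian one

BC2 = {(1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (2, 2)}
assert closed_under_brackets({(1, 0), (2, 0)}, BC2)
assert closed_under_brackets({(0, 1), (1, 1), (2, 1), (2, 2)}, BC2)

G2 = {(1, 0), (0, 1), (1, 1), (2, 1), (3, 1), (3, 2)}
assert closed_under_brackets({(1, 0), (2, 1), (3, 1)}, G2)
assert closed_under_brackets({(0, 1), (1, 1), (2, 1), (3, 1), (3, 2)}, G2)
```

Each listed direct sum of root spaces is thus at least closed under brackets, as a subalgebra must be.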

3.5 The fine structure of \(\mathfrak {g}\)

We now study \(\mathfrak {g}\) in more detail.

Lemma 3.9

Suppose that \(\gamma \), \(\delta \), and \(\gamma +\delta \) are positive roots, and that \(\gamma -\delta \) and \(\gamma +2\delta \) are not roots. If \(U \in \mathfrak {g}_\delta {\setminus }\{0\}\), then

$$\begin{aligned} \left\{ [U,X] : X \in \mathfrak {g}_{\gamma } \right\} = \mathfrak {g}_{\gamma +\delta }\quad \mathrm{and}\quad \left\{ [\theta U,Y] : Y \in \mathfrak {g}_{\gamma +\delta }\right\} = \mathfrak {g}_{\gamma }. \end{aligned}$$
(3.14)

Consequently, \(\dim (\mathfrak {g}_{\gamma }) = \dim (\mathfrak {g}_{\gamma +\delta })\) and \({\text {ad}}(U)\) is bijective from \(\mathfrak {g}_{\gamma }\) to \(\mathfrak {g}_{\gamma +\delta }\).

Proof

The hypotheses imply that \(\left\langle \gamma +\delta , \delta \right\rangle = \frac{1}{2} \left\langle \delta , \delta \right\rangle \ne 0\). Evidently, if \(Y \in \mathfrak {g}_{\gamma +\delta }\), then \([\theta U, Y ] \in \mathfrak {g}_{\gamma }\) and

$$\begin{aligned} \left[ U , [\theta U, Y]\right] = \left[ [U , \theta U], Y\right] + \left[ \theta U, [U ,Y]\right] = (\gamma +\delta )\left( [U , \theta U]\right) Y = - \left\langle \gamma +\delta , \delta \right\rangle \Vert U \Vert ^2 Y \end{aligned}$$

by (3.4), and it follows that Y is in the range of \({\text {ad}}(U)\). This proves the left hand formula of (3.14) and hence \(\dim (\mathfrak {g}_{\gamma }) \ge \dim (\mathfrak {g}_{\gamma +\delta })\). The right-hand formula and the opposite inequality \(\dim (\mathfrak {g}_{\gamma }) \le \dim (\mathfrak {g}_{\gamma +\delta })\) may be shown similarly.

The bijectivity of \({\text {ad}}(U)\), and of \({\text {ad}}(\theta U)\), follow. \(\square \)
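The displayed computation also furnishes an explicit inverse (a reformulation of ours, writing \(c\) for \(\left\langle \gamma +\delta , \delta \right\rangle \Vert U \Vert ^2 \ne 0\)):

```latex
{\mathrm{ad}}(U)^{-1}\, Y = -\frac{1}{c}\, [\theta U, Y]
  \qquad \forall Y \in \mathfrak{g}_{\gamma+\delta},
```

since the display shows that \(\left[ U, [\theta U, Y]\right] = -c\,Y\).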

Lemma 3.10

Suppose that \(\gamma \), \(\delta \), \(\gamma +\delta \) and \(\gamma +2\delta \) are positive roots, and that \(\gamma -\delta \) and \(\gamma +3\delta \) are not roots. If \(U \in \mathfrak {g}_\gamma {\setminus }\{0\}\) and \(X \in \mathfrak {g}_\delta {\setminus }\{0\}\), then

$$\begin{aligned} {[}U,X] \ne 0 \quad \mathrm{and}\quad [[U,X],X] \ne 0. \end{aligned}$$

Proof

First, \([U, \theta X] = 0\) since \(\gamma - \delta \) is not a root.

The hypotheses imply that \(\gamma \) and \(\delta \) span a root subsystem of type \(B_2\) or \(BC_2\), whence \(2\gamma +\delta \) is not a root, and that \(\left\langle \gamma + \delta , \delta \right\rangle = 0\) while \(\left\langle \gamma , \delta \right\rangle \ne 0\) (see Bourbaki [1, p. 148, Théorème 1]). Now \([U,X] \ne 0\), by Lemma 3.9 with the roles of \(\gamma \) and \(\delta \) exchanged.

Next, by the Jacobi identity and the facts that \(\left\langle \gamma + \delta , \delta \right\rangle =0\) and \([U, \theta X] = 0\),

$$\begin{aligned} \left[ \left[ [U,X],X\right] ,\theta X\right]&= \left[ [\theta X,X],[U,X]\right] + \left[ \left[ [U,X],\theta X\right] ,X\right] \\&= (\gamma + \delta )\left( [\theta X,X]\right) [U,X]+ \left[ \left[ [U,X],\theta X\right] ,X\right] \\&= \left\langle \gamma + \delta , \delta \right\rangle \Vert X\Vert ^2 [U,X] + \left[ \left[ U,[X, \theta X]\right] ,X\right] + \left[ \left[ [U,\theta X],X\right] ,X\right] \\&= \left\langle \gamma , \delta \right\rangle \Vert X\Vert ^2 [U,X]\ne 0, \end{aligned}$$

which ensures that \([[U,X],X] \ne 0\) if neither U nor X is 0. \(\square \)

We are going to analyse general simple Lie algebras by looking carefully at subalgebras of rank 1 or 2. Given a subset \(\mathrm {E}\) of \(\Sigma ^+\), we write \(\mathfrak {g}^{\mathrm {E}}\) for the subalgebra of \(\mathfrak {g}\) generated by the root spaces \(\mathfrak {g}_{\epsilon }\), where \(\epsilon \) ranges over \(\Sigma \cap {\text {span}}\,\mathrm {E}\).

We define, for any root \(\gamma \), \(\mathfrak {m}^{\{\gamma \}} = \mathfrak {m} \cap \mathfrak {g}^{\{\gamma \}}\) and

$$\begin{aligned} \mathfrak {m}^{\gamma } = {\text {span}}\left\{ [ X, \theta Y ] :X, Y \in \mathfrak {g}_{\gamma }, \left\langle X, Y \right\rangle = 0 \right\} . \end{aligned}$$

Lemma 3.11

Suppose that \(\gamma \), \(\delta \) and \(\gamma +\delta \) are positive roots. Then

$$\begin{aligned} \left\{ [X, Y] : X \in \mathfrak {g}_\gamma , \ Y \in \mathfrak {g}_\delta \right\} = \mathfrak {g}_{\gamma +\delta } ; \\ \left\{ U \in \mathfrak {g}_\gamma : {\text {ad}}(U)\vert _{\mathfrak {g}_\delta } = 0 \right\} = \{0\}. \end{aligned}$$

Proof

We observe that \({\text {ad}}(\mathfrak {m} \oplus \mathfrak {a})\) is irreducible on \(\mathfrak {g}_{\epsilon }\) for any positive root \(\epsilon \). Indeed, we know that \({\text {ad}}(\mathfrak {m} \oplus \mathfrak {a})\) maps \(\mathfrak {g}_{\epsilon }\) into itself, while from Kostant’s double transitivity theorem [17], \({\text {ad}}(\mathfrak {m}^{\{\epsilon \}})\) is transitive on the unit sphere in \(\mathfrak {g}_{\epsilon }\), whence \({\text {ad}}(\mathfrak {m} \oplus \mathfrak {a})\) takes any nonzero vector in \(\mathfrak {g}_{\epsilon }\) to any other nonzero vector. (For an alternative approach to this, see Cowling et al. [9]).

It follows that the subspace \([\mathfrak {m} \oplus \mathfrak {a}, [\mathfrak {g}_\gamma , \mathfrak {g}_\delta ] ]\) is either \(\mathfrak {g}_{\gamma +\delta }\) or \(\{0\}\). Hence, to prove the first equality, it suffices to show that

$$\begin{aligned} {[}\mathfrak {g}_\gamma , \mathfrak {g}_\delta ] \ne \{0\}. \end{aligned}$$
(3.15)

To do this, we consider the subset \((\mathbb {Z}\gamma + \mathbb {Z}\delta ) \cap \Sigma \) of \(\Sigma \), which is a root system in its own right.

If this root subsystem is of rank one, then necessarily \(\delta =\gamma \), and \(\gamma + \delta = 2\gamma \). In this case, \(\mathfrak {g}_\gamma \oplus \mathfrak {g}_{2\gamma } \) is an H-type algebra, and (3.15) follows.

If the root system is of type \(A_2\), then we are done, since \(\mathfrak {g}_\gamma \oplus \mathfrak {g}_\delta \oplus \mathfrak {g}_{\gamma +\delta } \) is an H-type algebra.

If the root system is of type \(B_2\) or \(BC_2\), then (3.15) follows from Lemma 3.10.

If the root system is of type \(G_2\), then the algebra is split or complex, and in this case the result is well known.

Finally, suppose that \(U \in \mathfrak {g}_\gamma {\setminus } \{0\}\) and \([U,X ] = 0\) for all \(X \in \mathfrak {g}_\delta \). Then

$$\begin{aligned} {[}{\text {ad}}(W) U, X] = {\text {ad}}(W) [U,X] - [U, {\text {ad}}(W) X] = 0 \end{aligned}$$

for all \(X \in \mathfrak {g}_\delta \) and all \(W \in \mathfrak {m} \oplus \mathfrak {a}\), and hence \([V,X ] = 0\) for all \(V \in \mathfrak {g}_{\gamma }\) and all \(X \in \mathfrak {g}_\delta \), which is impossible. \(\square \)

Lemma 3.12

The following hold:

(i)

    if \(\gamma \) is a root, then \(\mathfrak {m}^{-\gamma } = \mathfrak {m}^{\gamma }\),

(ii)

    if \(\gamma \) is a root, then \([\mathfrak {m} , \mathfrak {m}^{\gamma }] \subseteq \mathfrak {m}^{\gamma } \),

(iii)

    if \(\gamma \), \(\delta \) and \(\epsilon \) are roots and \(\epsilon \in \mathbb {Z}\gamma + \mathbb {Z}\delta \), then \(\mathfrak {m}^{\epsilon } \subseteq \mathfrak {m}^\gamma + \mathfrak {m}^\delta \),

(iv)

    if \(\gamma \) is a root, then \(\mathfrak {m}^{\gamma } \subseteq \mathfrak {m}^{\{\gamma \}}\), with equality if \(\gamma /2\) is not a root,

(v)

    \(\mathfrak {m} = \sum _{\gamma \in \Delta } \mathfrak {m}^\gamma = \sum _{\gamma \in \Sigma ^+} \mathfrak {m}^\gamma \).

Proof

Observe that if \(X, Y \in \mathfrak {g}_{-\gamma }\), then \([X, \theta Y] = - [ \theta Y, \theta (\theta X)]\), and \(\theta Y, \theta X \in \mathfrak {g}_{\gamma }\), so (i) holds.

Now we prove (ii). If \(Z \in \mathfrak {m}\) and \(X, Y \in \mathfrak {g}_\gamma \), then

$$\begin{aligned} \begin{aligned}{} \left[ Z, [X, \theta Y]\right] = \left[ [Z, X], \theta Y\right] + \left[ X, [Z, \theta Y]\right] . \end{aligned} \end{aligned}$$

Both summands lie in \(\mathfrak {m}^{\gamma }\). Thus \(\mathfrak {m}^{\gamma }\) is an ideal in \(\mathfrak {m}\), and in particular, is a subalgebra.

Next, we prove (iii). First, if \(\gamma \), \(\delta \) and \(\gamma +\delta \) are roots and \(W, Z \in \mathfrak {g}_{\gamma +\delta }\), then there exist \(X \in \mathfrak {g}_\gamma \) and \(Y \in \mathfrak {g}_\delta \) such that \([X,Y] = Z\), by Lemma 3.11. Then

$$\begin{aligned} \begin{aligned}{} [W, \theta Z] = \left[ W, [\theta X,\theta Y]\right]&= \left[ [W, \theta X],\theta Y\right] + \left[ \theta X,[W,\theta Y]\right] \\&= \left[ [W, \theta X],\theta Y\right] - \left[ [W,\theta Y], \theta X\right] \in \mathfrak {m}^{\gamma } + \mathfrak {m}^{\delta }. \end{aligned} \end{aligned}$$

To prove (iii), we use (i) and the observation above repeatedly.

To prove (iv), observe that if \(2\gamma \) and \(\frac{1}{2}\gamma \) are not roots, then \(\mathfrak {g}_{-\gamma } \oplus \mathfrak {m}^{\gamma } \oplus \mathbb {R}H_\gamma \oplus \mathfrak {g}_{\gamma }\) coincides with the subalgebra \(\mathfrak {g}^{\{\gamma \}}\), whence \(\mathfrak {m}^{\gamma } = \mathfrak {m}^{\{\gamma \}}\). Similarly, if \(\gamma \) and \(2\gamma \) are both roots, then \(\mathfrak {m}^{2\gamma } \subseteq \mathfrak {m}^\gamma \), by (iii), so \(\mathfrak {g}_{-2\gamma } \oplus \mathfrak {g}_{-\gamma } \oplus \mathfrak {m}^{\gamma } \oplus \mathbb {R}H_\gamma \oplus \mathfrak {g}_{\gamma } \oplus \mathfrak {g}_{2\gamma }\) coincides with the subalgebra \(\mathfrak {g}^{\{\gamma \}}\), and again \(\mathfrak {m}^{\gamma } = \mathfrak {m}^{\{\gamma \}}\). Finally, if \(\gamma \) and \(\frac{1}{2}\gamma \) are both roots, then \(\mathfrak {m}^{\gamma } \subseteq \mathfrak {m}^{\gamma /2}\), and \(\mathfrak {m}^{\gamma } \subseteq \mathfrak {m}^{\{\gamma \}}\). The latter inclusion is strict when \(\mathfrak {g}^{\{\gamma \}}\) is \(\mathfrak {su}(n,1)\) or \(\mathfrak {sp}(n,1)\) (where \(n>1\) in both cases).

To prove (v), we use (i) and (iii) repeatedly. \(\square \)

4 Derivations of semisimple Lie algebras

In this section, we discuss the height of roots and the associated grading of the Lie algebra \(\mathfrak {g}\), and prove a number of results on height-preserving derivations. Then we prove a localisation result for derivations of \(\mathfrak {g}\). Our final result is a necessary and sufficient condition for a skew-symmetric root-space-preserving derivation of \(\mathfrak {n}\) to be of the form \({\text {ad}}(Z)\) for some \(Z \in \mathfrak {m}\).

Lemma 4.1

Suppose that \(W \in \mathfrak {g}_0\). Then \(W = 0 \) if and only if \({\text {ad}}(W) \vert _{\mathfrak {g}_\beta } = 0\) for all \(\beta \in \Delta \).

Proof

One implication is obvious. To prove the other, suppose that \({\text {ad}}(W) \vert _{\mathfrak {g}_\beta } = 0\) for all \(\beta \in \Delta \). Then \({\text {ad}}(W)\) vanishes on \(\mathfrak {n}\), whence \({\text {ad}}(\theta W)\) also vanishes on \(\mathfrak {n}\) since \({\text {ad}}(\theta W) = - {\text {ad}}(W)^{\mathsf {T}}\).

Now if \(X \in \mathfrak {n}\), then

$$\begin{aligned} {[}W , \theta X ] = \theta [ \theta W , X ] = 0. \end{aligned}$$

Since \(\mathfrak {g}\) is simple, \(\mathfrak {n} \oplus \theta \mathfrak {n}\) generates \(\mathfrak {g}\) as a Lie algebra, so the derivation \({\text {ad}}(W)\) vanishes on all of \(\mathfrak {g}\); thus W lies in the centre of \(\mathfrak {g}\), which is trivial, and we conclude that \(W = 0\). \(\square \)

We are interested in the derivations D of \(\mathfrak {n}\) that preserve the root space structure, that is, are such that \(D(\mathfrak {g}_\alpha ) \subseteq \mathfrak {g}_\alpha \) for all \(\alpha \in \Sigma ^+\). We write \(\mathcal {D}(\mathfrak {n})\) for the space of these mappings.

Recall that the height of the positive root \(\alpha \), written \({{\mathrm{height}}}(\alpha )\), is defined to be \(\sum _{j=1}^{r} n_j\), where \(\alpha = \sum _{j=1}^r n_j \alpha _j\) and \(\alpha _j \in \Delta \). Note that there is a unique element \(H_0\) of \(\mathfrak {a}\) such that \([H_0,X] = {{\mathrm{height}}}(\alpha ) X\) for all \(X \in \mathfrak {g}_\alpha \) and all \(\alpha \in \Sigma ^+\). We may extend the height function to all roots: we set \({{\mathrm{height}}}(\gamma ) = h\) when \([H_0,X] = hX\) for all \(X \in \mathfrak {g}_\gamma \). When h is a nonzero integer, we write \(\mathfrak {g}_{h}\) for \(\sum _{\gamma } \mathfrak {g}_\gamma \), where we sum over the \(\gamma \in \Sigma \) such that \({{\mathrm{height}}}(\gamma ) = h\). We defined \(\mathfrak {g}_0\) to be the “null root space” \(\mathfrak {m} \oplus \mathfrak {a}\); conveniently, this coincides with the kernel of \({\text {ad}}(H_0)\), that is, with the set of elements of height 0, so the notation \(\mathfrak {g}_0\) is consistent with the notation \(\mathfrak {g}_h\).

Proposition 4.2

The Lie algebra \(\mathfrak {g}\) is graded: more precisely, \(\mathfrak {g} = \sum _{h\in \mathbb {Z}} \mathfrak {g}_h\), and \([\mathfrak {g}_h, \mathfrak {g}_{h'}] \subseteq \mathfrak {g}_{h+h'}\). Next, \(\mathfrak {n}\) is stratified, that is, \([\mathfrak {g}_h, \mathfrak {g}_{1}] = \mathfrak {g}_{h+1}\) for all \(h \in \mathbb {Z}^+\), so \(\mathfrak {g}_1\) generates \(\mathfrak {n}\). Finally, if \(0< h < {{\mathrm{height}}}(\omega )\), then \(\{ X \in \mathfrak {g}_h : {\text {ad}}(X) \vert _{\mathfrak {g}_{1}} = 0\} = \{0\}\).

Proof

The linear operator \({\text {ad}}(H_0)\) on \(\mathfrak {g}\) is diagonalisable, whence \(\mathfrak {g}\) decomposes as a sum of eigenspaces; given that the simple roots correspond to eigenvalue 1 and all positive roots are sums of simple roots (with multiplicities), all eigenvalues are integers. Further, \({\text {ad}}(H_0)\) is a derivation and so \([\mathfrak {g}_h, \mathfrak {g}_{h'}] \subseteq \mathfrak {g}_{h+h'}\).

If \({{\mathrm{height}}}(\gamma ) = h+1\) where \(h>0\), then there exists \(\alpha \in \Delta \) such that \(\gamma - \alpha \) is a root, by [8, Lemma 3.1]. Lemma 3.11 shows that \([\mathfrak {g}_{\alpha }, \mathfrak {g}_{\gamma - \alpha } ] = \mathfrak {g}_\gamma \), and it follows that \(\mathfrak {g}_\gamma \subseteq [\mathfrak {g}_1, \mathfrak {g}_{h}] \). This applies to all \(\gamma \) of height \(h+1\) and so \(\mathfrak {g}_{h+1} \subseteq [\mathfrak {g}_1, \mathfrak {g}_{h}] \). The converse inclusion has already been established.

Finally, suppose that \(X \in \mathfrak {g}_h\) and \({\text {ad}}(X)\vert _{\mathfrak {g}_1} =0\). Write X as \(\sum _{\gamma } X_\gamma \), where \(X_\gamma \in \mathfrak {g}_\gamma \) and \({{\mathrm{height}}}(\gamma ) = h\). Now

$$\begin{aligned} 0 = \left[ [H,X],Y\right] = \left[ [H,Y],X\right] + \left[ H,[X,Y]\right] \qquad \forall Y \in \mathfrak {g}_1, \end{aligned}$$

whence \({\text {ad}}([H,X])\vert _{\mathfrak {g}_1} =0\) for all \(H \in \mathfrak {a}\). The algebra of operators generated by the operators \({\text {ad}}(H)\) for all \(H \in \mathfrak {a}\) is closed under transpose and hence spanned by its minimal projections, which are precisely the projections onto the root spaces \(\mathfrak {g}_\gamma \) as \(\gamma \) varies over \(\Sigma \). We deduce that \({\text {ad}}(X_\gamma )\vert _{\mathfrak {g}_1} =0\) for all \(\gamma \) of height h. By Lemma 3.11, each \(X_\gamma \) is zero. \(\square \)
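The root-combinatorial step in the proof, that every positive root of height \(h+1\) with \(h \ge 1\) drops to a positive root of height h after subtracting some simple root, can be verified directly in the rank-two systems. The following sketch uses an ad hoc coordinate encoding of ours:

```python
# Roots coded as pairs (m, n) meaning m*alpha + n*beta, for simple roots
# alpha and beta; the height of (m, n) is then m + n.

def stratified(positive_roots):
    """Each positive root of height >= 2 is alpha + gamma with alpha simple
    and gamma a positive root of one smaller height (cf. [8, Lemma 3.1])."""
    simples = {g for g in positive_roots if sum(g) == 1}
    return all(
        any((g[0] - a[0], g[1] - a[1]) in positive_roots for a in simples)
        for g in positive_roots if sum(g) >= 2
    )

B2 = {(1, 0), (0, 1), (1, 1), (2, 1)}
BC2 = {(1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (2, 2)}
G2 = {(1, 0), (0, 1), (1, 1), (2, 1), (3, 1), (3, 2)}
assert stratified(B2) and stratified(BC2) and stratified(G2)
```

This is only the root-level shadow of the stratification; the equality \([\mathfrak {g}_h, \mathfrak {g}_1] = \mathfrak {g}_{h+1}\) itself also uses Lemma 3.11.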

We are now going to work with derivation identities.

Definition 4.3

Suppose that D is a derivation of \(\mathfrak {n}\). For \(\gamma , \delta \in \Sigma ^+\), let \((D_{\gamma , \delta })\) be the formula

$$\begin{aligned} D[X,\theta Z] = [DX,\theta Z] + [X,\theta DZ] \end{aligned}$$

for all \(X \in \mathfrak {g}_\gamma \) and all \(Z \in \mathfrak {g}_\delta \), and \((E_{\gamma , \delta })\) be the formula

$$\begin{aligned} D\left[ [X,\theta Y],Z\right] = \left[ [DX,\theta Y],Z\right] + \left[ [X,\theta DY],Z\right] + \left[ [X,\theta Y],DZ\right] \end{aligned}$$

for all \(X, Y \in \mathfrak {g}_\gamma \) and all \(Z \in \mathfrak {g}_\delta \).

Note that if D were a derivation of \(\mathfrak {g}\) such that \(\theta D = D\theta \), then these formulae would follow from the Jacobi identity. At this point, we are not asserting their truth!

Theorem 4.4

Suppose that D is a skew-symmetric height-preserving derivation of \(\mathfrak {n}\), and that \((E_{\gamma , \delta })\), as in Definition 4.3, holds for all \(\gamma ,\delta \in \Delta \). Then the following statements hold.

(i)

    There is a unique well-defined linear map \(\tilde{D}: \mathfrak {g}_0 \rightarrow \mathfrak {g}_0\) such that

    $$\begin{aligned} \tilde{D}\left[ X,\theta Y\right] = \left[ DX, \theta Y\right] + \left[ X, \theta DY\right] \qquad \forall X, Y \in \mathfrak {g}_1. \end{aligned}$$
(ii)

    The range of the linear map \(\tilde{D}\) is contained in \(\mathfrak {m}\).

(iii)

    The linear map \(E : \mathfrak {g}_0 \oplus \mathfrak {n} \rightarrow \mathfrak {g}_0 \oplus \mathfrak {n}\), defined by

    $$\begin{aligned} E\left( W + X\right) = \tilde{D}W + DX \qquad \forall W \in \mathfrak {g}_0\quad \forall X \in \mathfrak {n}, \end{aligned}$$

    is a derivation.

(iv)

    If \(h \ge k \ge 0\), then

$$\begin{aligned} E\left[ U,\theta V\right] = \left[ EU, \theta V\right] + \left[ U, \theta E V\right] \qquad \forall U \in \mathfrak {g}_h \quad \forall V \in \mathfrak {g}_k . \end{aligned}$$

Proof

To prove (i), we first claim that if \(\alpha , \beta \in \Delta \) and \(\alpha \ne \beta \), then

$$\begin{aligned} \left[ DX, \theta Y\right] + \left[ X, \theta DY\right] = 0 \qquad \forall X \in \mathfrak {g}_\alpha \quad \forall Y \in \mathfrak {g}_\beta . \end{aligned}$$
(4.1)

To see this, take W in \(\mathfrak {g}_0\) of the form \([U, \theta V]\), where \(U, V \in \mathfrak {g}_\gamma \), for some \(\gamma \in \Delta \). Since D is a skew-symmetric derivation,

$$\begin{aligned} \begin{aligned} \left\langle [DX, \theta Y] + [X, \theta DY], W \right\rangle&= - \left\langle DX, [W, Y] \right\rangle - \left\langle X, [W, DY] \right\rangle \\&= \left\langle X, D[W, Y] \right\rangle - \left\langle X, [W, DY] \right\rangle \\&= \left\langle X, [[DU,\theta V]+ [U, \theta DV], Y] \right\rangle \\&= - \left\langle [X, \theta Y], [DU,\theta V]+ [U, \theta DV]\right\rangle \\&=0, \end{aligned} \end{aligned}$$

since \([X, \theta Y] = 0\) because \(\alpha - \beta \) is not a root; the third step uses \((E_{\gamma , \beta })\). Since \(\mathfrak {g}_0\) is spanned by elements of the form \([U, \theta V]\), our claim is established.

Now we define \(L: \bigcup _{\alpha \in \Delta } \mathfrak {g}_\alpha \times \bigcup _{\alpha \in \Delta } \mathfrak {g}_\alpha \rightarrow \mathfrak {g}_0\) by \(L(X,Y) = [X, \theta Y]\). Then L extends automatically to a linear map, also denoted L, from \( \mathfrak {g}_1 \otimes \mathfrak {g}_1\) to \(\mathfrak {g}_0\). Take \(X_j, Y_j \in \mathfrak {g}_{1}\), and suppose that \(\sum _j [X_j, \theta Y_j] = 0\) in \(\mathfrak {g}_0\). Write each \(X_j\) as \(\sum _{\alpha } X_{j,\alpha }\) and each \(Y_j\) as \(\sum _{\beta } Y_{j,\beta }\), where \(X_{j,\alpha } \in \mathfrak {g}_\alpha \) and \(Y_{j,\beta } \in \mathfrak {g}_\beta \); here \(\alpha \) and \(\beta \) range over \(\Delta \). Then

$$\begin{aligned} \sum _j \left[ X_j, \theta Y_j\right] = \sum _{j,\alpha } \left[ X_{j,\alpha }, \theta Y_{j,\alpha }\right] \end{aligned}$$

since \([X_{j,\alpha }, \theta Y_{j,\beta }] = 0\) because \(\alpha - \beta \) is not a root if \(\alpha \ne \beta \). If \(\gamma \in \Delta \) and \(W \in \mathfrak {g}_\gamma \), then

$$\begin{aligned} \sum _j \left[ \left[ X_j, \theta Y_j\right] , W\right] = 0 \quad \mathrm{and} \quad \sum _j \left[ \left[ X_j, \theta Y_j\right] , DW\right] = 0 \end{aligned}$$

by hypothesis. Thus by \((E_{\alpha ,\gamma })\),

$$\begin{aligned} \begin{aligned} 0&= D \sum _j \left[ \left[ X_j, \theta Y_j\right] , W\right] = D\sum _{j,\alpha } \left[ \left[ X_{j,\alpha }, \theta Y_{j,\alpha }\right] , W\right] \\&= \sum _{j,\alpha } \left[ \left[ DX_{j,\alpha }, \theta Y_{j,\alpha }\right] , W\right] + \left[ \left[ X_{j,\alpha }, \theta DY_{j,\alpha }\right] , W\right] + \left[ \left[ X_{j,\alpha }, \theta Y_{j,\alpha }\right] , DW\right] \\&= \sum _{j,\alpha } \left[ \left[ DX_{j,\alpha }, \theta Y_{j,\alpha }\right] + \left[ X_{j,\alpha }, \theta DY_{j,\alpha }\right] , W\right] + \sum _{j} \left[ \left[ X_{j}, \theta Y_{j}\right] , DW\right] \\&= \sum _{j,\alpha ,\beta } \left[ \left[ DX_{j,\alpha }, \theta Y_{j,\beta }\right] + \left[ X_{j,\alpha }, \theta DY_{j,\beta }\right] , W\right] , \end{aligned} \end{aligned}$$

and this shows that

$$\begin{aligned} \sum _{j} \left[ \left[ DX_{j}, \theta Y_{j}\right] + \left[ X_{j}, \theta DY_{j}\right] , W\right] = 0. \end{aligned}$$

From Lemma 4.1, we see that

$$\begin{aligned} \sum _{j} \left( \left[ DX_{j}, \theta Y_{j}\right] + \left[ X_{j}, \theta DY_{j}\right] \right) =0. \end{aligned}$$

It follows immediately that \(\tilde{D}\), given by

$$\begin{aligned} \tilde{D}\sum _j \left[ X_j,\theta Y_j\right] = \sum _j \left( \left[ DX_j, \theta Y_j\right] + \left[ X_j, \theta DY_j\right] \right) , \end{aligned}$$

is well defined; clearly \(\tilde{D}\) is also unique.

To prove (ii), note that if \(X, Y \in \mathfrak {g}_\alpha \) where \(\alpha \in \Delta \), and \(H \in \mathfrak {a}\), then

$$\begin{aligned} \begin{aligned} \left\langle [DX, \theta Y] + [X, \theta DY] , H \right\rangle&= \left\langle DX, [H,Y] \right\rangle + \left\langle X, [H, DY] \right\rangle \\&= \left\langle DX, [H,Y] \right\rangle + \left\langle [H,X], DY \right\rangle \\&= \alpha (H) \left( \left\langle DX, Y \right\rangle + \left\langle X,DY \right\rangle \right) \\&= 0 , \end{aligned} \end{aligned}$$

since D is skew-symmetric. This equality now holds for all \(X, Y \in \mathfrak {g}_1\) by linearity and (4.1), and so the range of \(\tilde{D}\) is contained in \(\mathfrak {m}\).

We now extend D and \(\tilde{D}\) to a linear map E on \(\mathfrak {g}_0 + \mathfrak {n}\) by setting \(E(W+X) = \tilde{D}W + DX\) for all \(W \in \mathfrak {g}_0\) and all \(X \in \mathfrak {n}\). Since D is a derivation on \(\mathfrak {n}\), to show that E is a derivation it suffices to show that

$$\begin{aligned} D[W,X] = \left[ \tilde{D} W,X\right] + [W, DX] \qquad \forall W \in \mathfrak {g}_0 \quad \forall X \in \mathfrak {n} \end{aligned}$$
(4.2)

and

$$\begin{aligned} \tilde{D} [W,U] = \left[ \tilde{D} W,U\right] + \left[ W, \tilde{D}U\right] \quad \forall W, U \in \mathfrak {g}_0. \end{aligned}$$
(4.3)

To prove (4.2), observe that

$$\begin{aligned} D[W,X] - \left[ \tilde{D} W,X\right] - [W, DX] = \left( \left[ D, {\text {ad}}(W)\right] - {\text {ad}}\left( \tilde{D}W\right) \right) X, \end{aligned}$$

and \([D, {\text {ad}}(W)] - {\text {ad}}(\tilde{D}W) \) is a derivation. To show that it is 0 on \(\mathfrak {n}\), it suffices to show that it vanishes on \(\mathfrak {g}_\beta \) for all simple roots \(\beta \). By linearity, it suffices to take W of the form \([X, \theta Y]\) where \(X, Y \in \mathfrak {g}_\alpha \) and \(\alpha \in \Delta \); this case follows from \((E_{\alpha ,\beta })\).

To prove (4.3), we may suppose by linearity that \(U = [X, \theta Y] \) where \(X, Y \in \mathfrak {g}_\alpha \) for some \(\alpha \in \Delta \). Now \(\theta \tilde{D} \theta = \tilde{D}\), so

$$\begin{aligned} \begin{aligned} \tilde{D} \left[ W, \left[ X, \theta Y\right] \right]&= \tilde{D}\left[ \left[ W,X\right] , \theta Y\right] + \tilde{D}\left[ X, \left[ W, \theta Y\right] \right] \\&= \tilde{D}\left[ \left[ W,X\right] , \theta Y\right] + \tilde{D}\left[ X, \theta \left[ \theta W,Y\right] \right] \\&= \left[ D\left[ W,X\right] , \theta Y\right] + \left[ \left[ W,X\right] , \theta DY\right] \\&\quad + \left[ DX, \left[ W, \theta Y\right] \right] + \left[ X, \theta D\left[ \theta W,Y\right] \right] \\&= \left[ \left[ \tilde{D}W,X\right] , \theta Y\right] + \left[ \left[ W,DX\right] , \theta Y\right] + \left[ \left[ W,X\right] , \theta DY\right] \\&\quad + \left[ DX, \left[ W, \theta Y\right] \right] + \left[ X, \theta \left[ \tilde{D}\theta W,Y\right] \right] + \left[ X, \theta \left[ \theta W,DY\right] \right] \\&= \left[ \left[ \tilde{D}W,X\right] , \theta Y\right] + \left[ \left[ W,DX\right] , \theta Y\right] + \left[ \left[ W,X\right] , \theta DY\right] \\&\quad + \left[ DX, \left[ W, \theta Y\right] \right] + \left[ X, \left[ \tilde{D} W, \theta Y\right] \right] + \left[ X, \left[ W, \theta DY\right] \right] \\&= \left[ \tilde{D}W,\left[ X, \theta Y\right] \right] + \left[ W,[DX, \theta Y]\right] + \left[ W,[X, \theta DY]\right] \\&= \left[ \tilde{D}W,\left[ X, \theta Y\right] \right] + \left[ W,\tilde{D} \left[ X, \theta Y\right] \right] , \end{aligned} \end{aligned}$$

and (4.3) holds.

Finally, we prove (iv), using induction on h and k. We need to prove the identity \((D_{h,k})\), given by

$$\begin{aligned} E \left[ X, \theta Y \right] = \left[ EX, \theta Y\right] + \left[ X, \theta EY\right] \qquad \forall X \in \mathfrak {g}_{h} \quad \forall Y \in \mathfrak {g}_{k}. \end{aligned}$$

First we suppose that \(k=1\). The identity \((D_{h,1})\) is equivalent to

$$\begin{aligned} \left[ E \left[ X, \theta Y\right] , Z\right] = \left[ \left[ EX, \theta Y\right] , Z\right] + \left[ \left[ X, \theta EY\right] ,Z\right] \end{aligned}$$

for all \(X \in \mathfrak {g}_{h}\), all \(Y\in \mathfrak {g}_{1}\) and all \(Z\in \mathfrak {g}_{1}\), by Proposition 4.2. Write W for \([X, Z]\), which lies in \(\mathfrak {g}_{h+1}\). Since E is a derivation, the Jacobi identity and the definition of E give

$$\begin{aligned} \begin{aligned}{}&\left[ E\left[ X, \theta Y\right] ,Z\right] - \left[ \left[ EX, \theta Y\right] , Z\right] - \left[ \left[ X, \theta EY\right] ,Z\right] \\&\quad = E\left[ \left[ X, \theta Y\right] ,Z\right] - \left[ \left[ X, \theta Y\right] , EZ\right] - \left[ \left[ EX, \theta Y\right] , Z\right] - \left[ \left[ X, \theta EY\right] ,Z\right] \\&\quad = E\left[ \left[ X, Z\right] , \theta Y\right] + E\left[ X, \left[ \theta Y,Z\right] \right] {-} \left[ \left[ X, \theta Y\right] , EZ\right] {-} \left[ \left[ EX, \theta Y\right] , Z\right] {-} \left[ \left[ X, \theta EY\right] ,Z\right] \\&\quad = E\left[ \left[ X, Z\right] , \theta Y\right] + \left[ EX, \left[ \theta Y,Z\right] \right] + \left[ X, \left[ \theta EY,Z\right] \right] + \left[ X, \left[ \theta Y,EZ\right] \right] \\&\quad \qquad - \left[ \left[ X, \theta Y\right] , EZ\right] - \left[ \left[ EX, \theta Y\right] , Z\right] - \left[ \left[ X, \theta EY\right] ,Z\right] \\&\quad = E\left[ \left[ X, Z\right] , \theta Y\right] + \left[ \theta Y, \left[ EX, Z\right] \right] + \left[ \theta EY, \left[ X,Z\right] \right] + \left[ \theta Y, \left[ X, EZ\right] \right] \\&\quad = E\left[ W, \theta Y\right] - \left[ EW,\theta Y\right] - \left[ W, \theta EY\right] . \end{aligned} \end{aligned}$$

We deduce that if \((D_{h+1,1})\) holds, the last line vanishes; hence the first line vanishes, and \((D_{h,1})\) holds. Since \((D_{h,1})\) holds for large positive h (because there is nothing to prove as \(\mathfrak {g}_h = \{0\}\)), \((D_{h,1})\) holds for all positive h.

Now suppose that \((D_{h,1})\) and \((D_{h,k})\) hold where \(1 \le k < h\). Take \(X \in \mathfrak {g}_h\), \(Y_1 \in \mathfrak {g}_{1}\) and \(Y_2 \in \mathfrak {g}_{k}\). Then

$$\begin{aligned} \begin{aligned}&E \left[ X, \theta \left[ Y_1,Y_2\right] \right] - \left[ EX, \theta \left[ Y_1,Y_2\right] \right] - \left[ X, \theta E \left[ Y_1,Y_2\right] \right] \\&\quad = E \left[ \left[ X, \theta Y_1\right] ,\theta Y_2\right] + E \left[ \theta Y_1,\left[ X,\theta Y_2\right] \right] - \left[ \left[ EX, \theta Y_1\right] ,\theta Y_2\right] - \left[ \theta Y_1, \left[ EX,\theta Y_2\right] \right] \\&\qquad - \left[ X, \left[ \theta EY_1,\theta Y_2\right] \right] - \left[ X, \left[ \theta Y_1, \theta EY_2\right] \right] \\&\quad = \left[ E \left[ X, \theta Y_1\right] ,\theta Y_2\right] + \left[ \left[ X, \theta Y_1\right] ,\theta EY_2\right] + \left[ \theta EY_1,\left[ X,\theta Y_2\right] \right] + \left[ \theta Y_1,E \left[ X,\theta Y_2\right] \right] \\&\qquad - \left[ \left[ EX, \theta Y_1\right] ,\theta Y_2\right] - \left[ \theta Y_1, \left[ EX,\theta Y_2\right] \right] - \left[ X, \left[ \theta EY_1,\theta Y_2\right] \right] - \left[ X, \left[ \theta Y_1, \theta EY_2\right] \right] \\&\quad = \left[ \left[ EX, \theta Y_1\right] ,\theta Y_2\right] + \left[ \left[ X, \theta EY_1\right] ,\theta Y_2\right] + \left[ \left[ X, \theta Y_1\right] , \theta EY_2\right] + \left[ \theta EY_1,\left[ X,\theta Y_2\right] \right] \\&\qquad + \left[ \theta Y_1, \left[ EX,\theta Y_2\right] \right] + \left[ \theta Y_1, \left[ X,\theta EY_2\right] \right] - \left[ \left[ EX, \theta Y_1\right] ,\theta Y_2\right] - \left[ \theta Y_1, \left[ EX,\theta Y_2\right] \right] \\&\qquad - \left[ X, \left[ \theta EY_1,\theta Y_2\right] \right] - \left[ X, \left[ \theta Y_1,\theta EY_2\right] \right] \\&\quad = \left[ \left[ X, \theta EY_1\right] ,\theta Y_2\right] + \left[ \left[ X, \theta Y_1\right] ,\theta EY_2\right] + \left[ \theta EY_1,\left[ X,\theta Y_2\right] \right] + \left[ \theta Y_1, \left[ X,\theta EY_2\right] \right] \\&\qquad - \left[ X, \left[ \theta EY_1,\theta Y_2\right] \right] - \left[ X, \left[ \theta Y_1,\theta EY_2\right] \right] \\&\quad = 0 . \end{aligned} \end{aligned}$$

By Proposition 4.2, \((D_{h,k+1})\) also holds. By induction, \((D_{h,k})\) holds whenever \(h \ge k \ge 0\). \(\square \)

Theorem 4.5

Suppose that D is a skew-symmetric height-preserving derivation of \(\mathfrak {n}\). Then the following are equivalent:

  1. (i)

    there exists a height-preserving derivation \(\tilde{D}\) of \(\mathfrak {g}\) whose restriction to \(\mathfrak {n}\) coincides with D;

  2. (ii)

    \(D = {\text {ad}}(W)\) for some \(W \in \mathfrak {m}\);

  3. (iii)

\((E_{\gamma , \delta })\) holds for all \(\gamma ,\delta \in \Sigma ^+\);

  4. (iv)

    \((E_{\gamma , \delta })\) holds for all \(\gamma ,\delta \in \Delta \).

Further, if any of these conditions hold, then \(\tilde{D}\) is root space preserving.

Proof

Suppose that (i) holds. Since all derivations of \(\mathfrak {g}\) are inner, \(\tilde{D} = {\text {ad}}(W)\) for some \(W \in \mathfrak {g}\). Evidently \({\text {ad}}(W)\) preserves height if and only if \(W \in \mathfrak {g}_0\). Thus \(W \in \mathfrak {m} \oplus \mathfrak {a}\). Since D is skew-symmetric, \(W \in \mathfrak {m}\), and (ii) is proved.

If (ii) holds, then the Jacobi identity and the fact that \(\theta W = W\) imply that

$$\begin{aligned}&D\left[ \left[ X,\theta Y\right] ,Z\right] \\&\quad = {\text {ad}}(W) \left[ \left[ X,\theta Y\right] ,Z\right] \\&\quad = \left[ \left[ {\text {ad}}(W)X,\theta Y\right] ,Z\right] + \left[ \left[ X, {\text {ad}}(W)\theta Y\right] ,Z\right] + \left[ \left[ X,\theta Y\right] ,{\text {ad}}(W)Z\right] \\&\quad = \left[ \left[ {\text {ad}}(W)X,\theta Y\right] ,Z\right] + \left[ \left[ X, \theta {\text {ad}}(W)Y\right] ,Z\right] + \left[ \left[ X,\theta Y\right] ,{\text {ad}}(W)Z\right] \\&\quad = \left[ \left[ DX,\theta Y\right] ,Z\right] + \left[ \left[ X,\theta DY\right] ,Z\right] + \left[ \left[ X,\theta Y\right] ,DZ\right] , \end{aligned}$$

and (iii) holds.

It is trivial that (iii) implies (iv).

Suppose that (iv) holds. We are going to construct a derivation \(\tilde{E}\) that extends the derivation E of Theorem 4.4 to the simple Lie algebra \(\mathfrak {g}\) and preserves heights.

When \(X \in \mathfrak {g}_0 \oplus \mathfrak {n}\), we set \(\tilde{E} X = EX\). When \(X \in \mathfrak {g}_0 \oplus \theta \mathfrak {n}\), we define

$$\begin{aligned} \tilde{E} X = \theta E (\theta X). \end{aligned}$$
(4.4)

These definitions agree when \(X \in \mathfrak {g}_0\) by part (ii) of Theorem 4.4. It follows from the definition that

$$\begin{aligned} \theta \tilde{E} \theta = \tilde{E}. \end{aligned}$$
(4.5)

Finally, to show that \(\tilde{E}\) is a derivation, we have to verify that

$$\begin{aligned} \tilde{E}\left[ U, V\right] = \left[ \tilde{E}U, V\right] + \left[ U, \tilde{E} V\right] \qquad \forall U,V \in \mathfrak {g}. \end{aligned}$$

By linearity, it suffices to demonstrate this for \(U \in \mathfrak {g}_h\) and \(V \in \mathfrak {g}_k\), for all possible heights h and k. There are various cases to consider. We label the relevant identity \((D_{h,k})\):

$$\begin{aligned} \tilde{E}\left[ U, V\right] = \left[ \tilde{E}U, V\right] + \left[ U, \tilde{E} V\right] \qquad \forall U \in \mathfrak {g}_h \quad \forall V \in \mathfrak {g}_k. \end{aligned}$$

Case 1: \(h \ge 0\) and \(k \ge 0\). This case is trivial as \(\tilde{E}\) coincides with E on \(\mathfrak {g}_0 \oplus \mathfrak {n}\).

Case 2: \(h \le 0\) and \(k \le 0\). In this case, we take \(X, Y \in \mathfrak {g}_0 \oplus \theta \mathfrak {n}\), so \(\theta X, \theta Y \in \mathfrak {g}_0 \oplus \mathfrak {n}\), and then

$$\begin{aligned} \begin{aligned} \tilde{E} \left[ X, Y\right]&= \theta E \left[ \theta X, \theta Y\right] = \theta \left[ E\theta X, \theta Y\right] + \theta \left[ \theta X, E\theta Y\right] \\&= \left[ \tilde{E} X, Y\right] + \left[ X, \tilde{E}Y\right] , \end{aligned} \end{aligned}$$

and \((D_{h,k})\) holds.

Case 3: \(hk < 0\). By the antisymmetry of the bracket, we may suppose that \(h> 0 > k\); writing \(U = X \in \mathfrak {g}_h\) and \(V = \theta Y\), where \(Y \in \mathfrak {g}_{-k}\), we need to show that

$$\begin{aligned} \tilde{E}\left[ X, \theta Y\right] = \left[ \tilde{E}X, \theta Y\right] + \left[ X, \theta \tilde{E}Y\right] \qquad \forall X \in \mathfrak {g}_{h} \quad \forall Y \in \mathfrak {g}_{-k}. \end{aligned}$$

If \(h + k \ge 0\), this follows from part (iv) of Theorem 4.4 and (4.5); otherwise, we conjugate by \(\theta \), as in Case 2.

We conclude with the observation that, by the equivalence of (i) and (ii), \(\tilde{E} = {\text {ad}}(Z)\) for some \(Z \in \mathfrak {m}\), and hence \(\tilde{E}\) is root space preserving. \(\square \)

5 Derivations of \(\mathfrak {n}\)

We are now able to consider a nilpotent Lie algebra \(\mathfrak {n}\) that arises in the Iwasawa decomposition of a real simple Lie algebra \(\mathfrak {g}\). We write \(\mathcal {D}(\mathfrak {n})\) for the space of root-space-preserving derivations of \(\mathfrak {n}\).

Theorem 5.1

If \(\mathfrak {g}\) is simple and not isomorphic to \(\mathfrak {so}(n, 1)\) or \(\mathfrak {su}(n,1)\), then every D in \(\mathcal {D}(\mathfrak {n})\) is given by

$$\begin{aligned} D = {\text {ad}}(W), \end{aligned}$$

where \(W \in \mathfrak {m}\oplus \mathfrak {a}\).

The main theorem follows from this and Proposition 3.1.

We prove Theorem 5.1 by showing that every derivation is the sum of a symmetric and a skew-symmetric derivation, and treating these separately. The symmetric derivations are handled using the following lemma, which reduces matters to showing that symmetric derivations act by scalars on the root spaces.

Lemma 5.2

If a derivation D of \(\mathfrak {n}\) acts by a real scalar \(\lambda _\alpha \) on each root space \(\mathfrak {g}_\alpha \) where \(\alpha \in \Delta \), then \(D = {\text {ad}}(H)\) for some \(H \in \mathfrak {a}\).

Proof

Since D is a derivation, it is determined by the \(\lambda _\alpha \) with \(\alpha \) simple, as the spaces \(\mathfrak {g}_\alpha \) with \(\alpha \) simple generate \(\mathfrak {n}\); further, the simple roots form a basis of \({{\mathrm{Hom}}}(\mathfrak {a}, \mathbb {R})\), and so there exists \(H \in \mathfrak {a}\) such that \(\alpha (H) = \lambda _\alpha \) for each simple root \(\alpha \). The derivations D and \({\text {ad}}(H)\) agree on these generating spaces, whence \(D = {\text {ad}}(H)\). \(\square \)
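To make the lemma concrete, here is a minimal numerical sketch in the split model \(\mathfrak {g} = \mathfrak {sl}(3,\mathbb {R})\), where \(\mathfrak {n}\) consists of the strictly upper triangular matrices. The model, the scalars lam_a and lam_b, and the trace-free matrix H are illustrative choices of ours, not data from the proof.

```python
import numpy as np

def E(i, j, n=3):
    """Elementary matrix E_ij (1-based indices)."""
    M = np.zeros((n, n))
    M[i - 1, j - 1] = 1.0
    return M

def bracket(A, B):
    return A @ B - B @ A

# Basis of n: simple root spaces g_alpha = R*E12, g_beta = R*E23,
# and g_{alpha+beta} = R*E13, with E13 = [E12, E23].
X_a, X_b, X_ab = E(1, 2), E(2, 3), E(1, 3)

# Prescribe scalars on the simple root spaces; the derivation property
# forces the scalar lam_a + lam_b on g_{alpha+beta}.
lam_a, lam_b = 2.0, -5.0

# Solve for trace-free H = diag(h1, h2, h3) in a with
# alpha(H) = h1 - h2 = lam_a and beta(H) = h2 - h3 = lam_b.
h1 = (2 * lam_a + lam_b) / 3
h2 = (-lam_a + lam_b) / 3
h3 = (-lam_a - 2 * lam_b) / 3
H = np.diag([h1, h2, h3])

# ad(H) reproduces the prescribed scalars on every root space, so D = ad(H).
assert np.allclose(bracket(H, X_a), lam_a * X_a)
assert np.allclose(bracket(H, X_b), lam_b * X_b)
assert np.allclose(bracket(H, X_ab), (lam_a + lam_b) * X_ab)
```

The same computation works for any choice of the two scalars, reflecting that \(\mathcal {D}(\mathfrak {n}) = {\text {ad}}(\mathfrak {a})\) is two-dimensional in this split case.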

Remark 5.3

A similar observation is valid when \(\mathfrak {g}\) is complex and D acts by a complex scalar on each root space, since every derivation of a complex Lie algebra is complex linear. Hence Theorem 5.1 is trivial when \(\mathfrak {g}\) is a split or complex Lie algebra. In fact, in the split case (that is, when all the roots have multiplicity 1) it follows that \(\mathcal {D}(\mathfrak {n}) = {\text {ad}}(\mathfrak {a})\). In particular, Theorem 5.1 holds for the algebras with root system \(D_n\) (where \(n \ge 4\)), \(E_6\), \(E_7\), \(E_8\) or \(G_2\), since these are either split or complex, by the classification.

The skew-symmetric derivations are treated using Theorem 4.5, which shows that a skew-symmetric derivation D lies in \({\text {ad}}(\mathfrak {m})\) when the identity \((E_{\alpha ,\beta })\) holds for all simple roots \(\alpha \) and \(\beta \). For convenience, we recall this identity:

$$\begin{aligned} D\left[ \left[ X,\theta Y\right] ,V\right] = \left[ \left[ DX,\theta Y\right] , V\right] + \left[ \left[ X,\theta DY\right] ,V\right] + \left[ \left[ X,\theta Y\right] ,DV\right] \end{aligned}$$

for all \(X, Y \in \mathfrak {g}_\alpha \) and all \(V \in \mathfrak {g}_\beta \).

Another key ingredient of our proof, which we use in parallel with the previous observation, is a reduction to Lie algebras of rank at most two. Recall that if \(\mathrm {E}\) is a subset of \(\Sigma \), then \(\mathfrak {g}^{\mathrm {E}}\) denotes the subalgebra of \(\mathfrak {g}\) generated by all the spaces \(\mathfrak {g}_\epsilon \), where \(\epsilon \in \mathrm {E}\). We also denote by \(\mathfrak {m}^{\mathrm {E}}\) and \(\mathfrak {n}^{\mathrm {E}}\) the algebras \(\mathfrak {m} \cap \mathfrak {g}^{\mathrm {E}}\) and \(\mathfrak {n} \cap \mathfrak {g}^{\mathrm {E}}\), and by \(\Sigma ^\mathrm {E}\) the root subsystem \(\Sigma \cap {\text {span}}(\mathrm {E})\).

Now we come to the proof proper. We first consider the rank-one case; this is known, and we just state what we need. Next, we consider the real rank-two case, and the third step is to consider the case where the real rank is higher than two.

5.1 The rank-one algebras

The rank-one algebras are well known (see, for instance, Weyl [23]), as are their root-space-preserving derivations. We summarise the results in the following proposition for the convenience of the reader. As the simple algebras for which \(\mathcal {D}(\mathfrak {n}) \ne {\text {ad}}(\mathfrak {m} \oplus \mathfrak {a})\) are of rank one, a case-by-case analysis is appropriate.

Proposition 5.4

(Riehm [20], Saal [21]) Let \(\mathfrak {g}\) be a simple Lie algebra of real rank one. Then \(\mathcal {D}(\mathfrak {n}) = \mathcal {D}^{\mathrm {sym}}(\mathfrak {n}) \oplus \mathcal {D}^{\mathrm {skew}}(\mathfrak {n})\). Moreover,

  1. (i)

    if \(\mathfrak {g} = \mathfrak {so}(1,n+1)\), then \(\mathcal {D}(\mathfrak {n}) = \mathfrak {sl}(n,\mathbb {R}) \oplus \mathbb {R}\);

  2. (ii)

    if \(\mathfrak {g} = \mathfrak {su}(1,n+1)\), then \(\mathcal {D}(\mathfrak {n}) = \mathfrak {sp}(n,\mathbb {R}) \oplus \mathbb {R}\);

  3. (iii)

if \(\mathfrak {g} = \mathfrak {sp}(1,n+1)\), then \(\mathcal {D}(\mathfrak {n}) = \mathfrak {sp}(n) \oplus \mathfrak {sp}(1) \oplus \mathbb {R}\);

  4. (iv)

    if \(\mathfrak {g} = \mathfrak {f}_{(4,-20)}\), then \(\mathcal {D}(\mathfrak {n}) = \mathfrak {so}(7) \oplus \mathbb {R}\).

In all cases, the summand \(\mathbb {R}\) corresponds to \({\text {ad}}(\mathfrak {a})\). In the first two cases, \(\mathcal {D}^{\mathrm {sym}}(\mathfrak {n})\) strictly contains \({\text {ad}}(\mathfrak {a})\); in the last two cases, \(\mathcal {D}^{\mathrm {sym}}(\mathfrak {n})\) coincides with \({\text {ad}}(\mathfrak {a})\). In all cases, \(\mathcal {D}^{\mathrm {skew}}(\mathfrak {n})\) coincides with \({\text {ad}}(\mathfrak {m})\).

Remark 5.5

In the first two cases, \(\mathfrak {n}\) is not rigid enough to prevent the occurrence of derivations that are not in \({\text {ad}}(\mathfrak {m} \oplus \mathfrak {a})\).

Proof

This follows from the work of Riehm [20] and Saal [21]; see also Folland [12] and Pansu [19]. Alternatively, the reader may combine the results about H-type algebras with the description of the rank-one simple Lie algebras in terms of H-type algebras by Cowling et al. [10]. \(\square \)

Corollary 5.6

Suppose that \(\mathfrak {g}\) is a simple Lie algebra of arbitrary rank, and D is a skew-symmetric root-space-preserving derivation of \(\mathfrak {n}\). Then the identity \((E_{\alpha ,\alpha })\) holds for all positive roots \(\alpha \).

Proof

For each positive root \(\alpha \), the subalgebra \(\mathfrak {g}^{\{\alpha \}}\) has real rank one, and the restriction \(D \vert _{\mathfrak {n}^{\{\alpha \}}}\) is a skew-symmetric root-space-preserving derivation; from Proposition 5.4 we deduce that \(D \vert _{\mathfrak {n}^{\{\alpha \}}} \in {\text {ad}}(\mathfrak {m}^{\{\alpha \}})\). Then \((E_{\alpha ,\alpha })\) holds by Theorem 4.5. \(\square \)

5.2 The rank-two algebras

Let \(\mathfrak {g}\) be a simple Lie algebra of rank two and denote by \(\mathfrak {n}\) an Iwasawa subalgebra of \(\mathfrak {g}\). We shall prove that each root-space-preserving derivation of \(\mathfrak {n}\) is the sum of a symmetric and a skew-symmetric derivation, that the symmetric derivation lies in \({\text {ad}}(\mathfrak {a})\), and that the skew-symmetric part satisfies \((E_{\alpha , \beta })\) for all \(\alpha \) and \(\beta \) in \(\Sigma ^+\).

Before we analyse the various cases, we need a general result about derivations which we will use when the root system is of type \(B_2\) or \(BC_2\).

Lemma 5.7

Suppose that \(\alpha \), \(\beta \) and \(\alpha +\beta \) are positive roots while \(\alpha - \beta \) and \(\alpha +2\beta \) are not roots, and that \(D \in \mathcal {D}(\mathfrak {n})\). Then

$$\begin{aligned} D^{\mathsf {T}}\left[ U,X\right] = \left[ D^{\mathsf {T}}U,X\right] + \left[ U, D^{\mathsf {T}}X\right] \quad \forall U \in \mathfrak {g}_\beta \quad \forall X \in \mathfrak {g}_\alpha . \end{aligned}$$
(5.1)

Proof

By Proposition 5.4, the skew-symmetric part of the restriction of D to \(\mathfrak {g}_\beta \) coincides with \({\text {ad}}(Z)\) for some Z in \(\mathfrak {m}^{\{\beta \}}\). Write \(D_0\) for \(D - {\text {ad}}(Z)\). Since \({\text {ad}}(Z)\) is a skew-symmetric derivation, D satisfies (5.1) if and only if \(D_0\) does. Thus, by replacing D by \(D_0\) if necessary, there is no loss of generality in assuming that the restriction of D to \(\mathfrak {g}_\beta \) is symmetric.

We need to show (5.1). The eigenvectors of \(D \vert _{\mathfrak {g}_\beta }\) span \(\mathfrak {g}_\beta \), and so by linearity it will suffice to show (5.1) when U is an eigenvector of D and X is arbitrary. Take U in \(\mathfrak {g}_\beta {\setminus }\{0\}\) and \(\lambda \in \mathbb {R}\) such that \(DU = \lambda U\). By Lemma 3.9, \({\text {ad}}(U): \mathfrak {g}_\alpha \rightarrow \mathfrak {g}_{\alpha +\beta }\) is surjective, and so it will suffice to show that

$$\begin{aligned} \left\langle D^{\mathsf {T}}\left[ U, X\right] , \left[ U, Y \right] \right\rangle = \left\langle \left[ D^{\mathsf {T}}U, X\right] , \left[ U, Y\right] \right\rangle + \left\langle \left[ U, D^{\mathsf {T}}X\right] , \left[ U, Y\right] \right\rangle \end{aligned}$$

for the eigenvector U in \(\mathfrak {g}_\beta \) and arbitrary X and Y in \(\mathfrak {g}_\alpha \). Now, by the hypothesis that D is a root-space-preserving derivation, the choice of U, (3.2), the Jacobi identity, and the fact that \(\alpha - \beta \) is not a root, the left-hand side is equal to

$$\begin{aligned} \left\langle \left[ U, X\right] , D\left[ U, Y \right] \right\rangle&= \left\langle \left[ U, X\right] , \left[ DU, Y \right] \right\rangle + \left\langle \left[ U, X\right] , \left[ U, DY \right] \right\rangle \\&= \left\langle \left[ U, X\right] , \left[ \lambda U, Y \right] \right\rangle - \left\langle X , \left[ \theta U, \left[ U, DY \right] \right] \right\rangle \\&= \left\langle \left[ \lambda U, X\right] , \left[ U, Y \right] \right\rangle - \left\langle X , \left[ \left[ \theta U, U\right] , DY \right] \right\rangle - \left\langle X , \left[ U, \left[ \theta U, DY \right] \right] \right\rangle \\&= \left\langle \left[ \lambda U, X\right] , \left[ U, Y \right] \right\rangle - \left\langle X , \alpha \left( \left[ \theta U, U\right] \right) DY \right\rangle \\&= \left\langle \left[ D^{\mathsf {T}}U, X\right] , \left[ U, Y \right] \right\rangle - \alpha \left( \left[ \theta U, U\right] \right) \left\langle D^{\mathsf {T}}X , Y \right\rangle \\ \end{aligned}$$

and similarly \(\left\langle [U, D^{\mathsf {T}}X] , [U, Y] \right\rangle \) is equal to

$$\begin{aligned} -\left\langle D^{\mathsf {T}}X , \left[ \theta U, \left[ U, Y \right] \right] \right\rangle&= -\left\langle D^{\mathsf {T}}X , \left[ \left[ \theta U,U\right] , Y\right] \right\rangle - \left\langle D^{\mathsf {T}}X , \left[ U, \left[ \theta U, Y \right] \right] \right\rangle \\&= -\left\langle D^{\mathsf {T}}X , \alpha \left( \left[ \theta U,U\right] \right) Y \right\rangle \\&= - \alpha \left( \left[ \theta U, U\right] \right) \left\langle D^{\mathsf {T}}X , Y \right\rangle . \end{aligned}$$

The result now follows. \(\square \)

5.2.1 The case \(A_2\)

Until further notice, we assume that \(\mathfrak {g}\) has root system \(A_2\), the simplest indecomposable root system of rank 2. We label the simple roots \(\alpha \) and \(\beta \), so that the highest root is \(\alpha +\beta \) and \(\Sigma ^+ = \{ \alpha , \beta , \alpha +\beta \}\). With the notation of Lemma 3.8, \(\Sigma _1 = \{\alpha , \beta \}\) and \(\Sigma _0 = \varnothing \). We shall use the result of Ciatti [6, Proposition 4.1] about the structure of \(\mathfrak {n}\), giving a proof for the convenience of the reader. We first recall that \(\mathfrak {n}\) is an H-type algebra, and for \(Z \in \mathfrak {g}_{\alpha +\beta }\), the map \(J_Z\) on \(\mathfrak {g}_\alpha \oplus \mathfrak {g}_\beta \) is determined by the condition that

$$\begin{aligned} \left\langle J_Z X, Y \right\rangle = \left\langle Z, [X,Y] \right\rangle \quad \forall X, Y \in \mathfrak {g}_\alpha \oplus \mathfrak {g}_\beta . \end{aligned}$$

Lemma 5.8

For every nontrivial X in \(\mathfrak {g}_\alpha \),

$$\begin{aligned} \mathfrak {g}_{\beta }=\{J_Z X : Z\in \mathfrak {g}_{\alpha +\beta } \} \end{aligned}$$
(5.2)

and

$$\begin{aligned} \mathfrak {g}_{\alpha }=\{J_{Z'}J_Z X : Z,Z'\in \mathfrak {g}_{\alpha +\beta } \} . \end{aligned}$$
(5.3)

Proof

We may and shall assume that X is a unit vector, and take \(Y \in \mathfrak {g}_\beta \). By (3.12) and the Jacobi identity,

$$\begin{aligned} J_{[Y,X]}X= [[Y,X], \theta X] = [[\theta X,X], Y] = \beta ([\theta X, X]) Y = Y, \end{aligned}$$

which proves that \(Y \in \{J_Z X : Z \in \mathfrak {g}_{\alpha +\beta }\} \).

Now, by (3.13), \(J_Z\) is a linear isomorphism that exchanges \(\mathfrak {g}_{\beta }\) and \(\mathfrak {g}_{\alpha }\) for all nonzero Z in \(\mathfrak {g}_{\alpha +\beta }\), so (5.3) follows from (5.2). \(\square \)
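The maps \(J_Z\) can be checked numerically in the split model \(\mathfrak {sl}(3,\mathbb {R})\), where \(\mathfrak {n}\) is a three-dimensional Heisenberg algebra. We assume, for illustration, the inner product \(\langle A, B\rangle = {\text {tr}}(AB^{\mathsf {T}})\) and \(\theta X = -X^{\mathsf {T}}\), with which the defining condition gives \(J_Z X = [Z, \theta X]\); with other normalisations the signs of individual brackets change, but the spanning statements (5.2) and (5.3) do not.

```python
import numpy as np

def E(i, j, n=3):
    M = np.zeros((n, n))
    M[i - 1, j - 1] = 1.0
    return M

bracket = lambda A, B: A @ B - B @ A
inner = lambda A, B: np.trace(A @ B.T)   # <A, B> = tr(A B^T)
theta = lambda A: -A.T                   # Cartan involution for sl(n, R)

X_a, X_b, Z = E(1, 2), E(2, 3), E(1, 3)  # g_alpha, g_beta, g_{alpha+beta}

# J_Z X = [Z, theta X] satisfies the defining identity
# <J_Z X, Y> = <Z, [X, Y]> on v = g_alpha + g_beta.
J = lambda W, X: bracket(W, theta(X))
for X in (X_a, X_b):
    for Y in (X_a, X_b):
        assert np.isclose(inner(J(Z, X), Y), inner(Z, bracket(X, Y)))

# H-type condition: J_Z^2 = -|Z|^2 on v.
for X in (X_a, X_b):
    assert np.allclose(J(Z, J(Z, X)), -inner(Z, Z) * X)

# (5.2): J_Z X_a is a nonzero multiple of X_b, so {J_Z X_a : Z} = g_beta;
# (5.3): applying J_Z twice lands back in g_alpha.
assert np.isclose(abs(inner(J(Z, X_a), X_b)), 1.0)
assert np.allclose(J(Z, J(Z, X_a)), -X_a)
```

Here each root space is one-dimensional; when the root spaces have higher multiplicity the same identities hold, with Z ranging over all of \(\mathfrak {g}_{\alpha +\beta }\).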

Proposition 5.9

Every root-space-preserving derivation of \(\mathfrak {n}\) is the sum of a symmetric and a skew-symmetric derivation.

Proof

By Lemma 3.8, \(\mathfrak {n}\) is H-type. The result follows from Corollary 2.5. \(\square \)

Proposition 5.10

Every symmetric root-space-preserving derivation D of \(\mathfrak {n}\) lies in \({\text {ad}}( \mathfrak {a})\).

Proof

From Corollary 2.8, D is the sum of a symmetric derivation \(D_0\) that vanishes on \(\mathfrak {g}_{\alpha +\beta }\) and \({\text {ad}}(H)\) for some H in \(\mathfrak {a}\). Since \(D_0\) is symmetric and preserves root spaces, we may take an eigenvector X of \(D_0\) in \(\mathfrak {g}_\alpha \) with corresponding eigenvalue \(\lambda \). By Proposition 2.1, \(D_0\) anticommutes with the maps \(J_Z\), and so by Lemma 5.8 it acts as the scalar \(-\lambda \) on \(\mathfrak {g}_\beta \), and hence as the scalar \(\lambda \) on all of \(\mathfrak {g}_\alpha \). This implies that \(D_0\) lies in \({\text {ad}}(\mathfrak {a})\) by Lemma 5.2. \(\square \)

Proposition 5.11

The basic derivation identity \((E_{\gamma ,\delta })\) holds as \(\gamma \) and \(\delta \) range over the set \(\{\alpha , \beta \}\) of simple roots. Consequently, every derivation D in \(\mathcal {D}^{\mathrm {skew}}(\mathfrak {n})\) is equal to \({\text {ad}}(Z)\) for some Z in \(\mathfrak {m}\).

Proof

Recall the basic derivation identity \((E_{\gamma ,\delta })\), that is, the identity

$$\begin{aligned} D\left[ \left[ X,\theta Y\right] ,Z\right] = \left[ \left[ DX,\theta Y\right] , Z\right] + \left[ \left[ X,\theta DY\right] ,Z\right] + \left[ \left[ X,\theta Y\right] ,DZ\right] \end{aligned}$$

for all \(X, Y \in \mathfrak {g}_\gamma \) and all \(Z \in \mathfrak {g}_\delta \). By Theorem 4.5, it suffices to prove \((E_{\gamma ,\delta })\) as \(\gamma \) and \(\delta \) range over \(\{\alpha , \beta \}\). The identities \((E_{\alpha ,\alpha })\) and \((E_{\beta ,\beta })\) hold by Corollary 5.6.

Now we prove \((E_{\alpha ,\beta })\). Suppose that \(X, Y \in \mathfrak {g}_\alpha \) and \(Z \in \mathfrak {g}_\beta \). Since \(\beta - \alpha \) is not a root,

$$\begin{aligned}&D\left[ \left[ X,\theta Y\right] ,Z\right] - \left[ \left[ DX,\theta Y\right] , Z\right] - \left[ \left[ X,\theta DY\right] ,Z\right] - \left[ \left[ X,\theta Y\right] ,DZ\right] \\&\quad = D\left[ \left[ X,Z\right] , \theta Y\right] - \left[ \left[ DX, Z\right] , \theta Y\right] - \left[ \left[ X,Z\right] , \theta DY\right] - \left[ \left[ X, DZ\right] ,\theta Y\right] \\&\quad = D\left[ \left[ X,Z\right] , \theta Y\right] - \left[ D\left[ X, Z\right] , \theta Y\right] - \left[ \left[ X,Z\right] , \theta DY\right] \\&\quad = D\left[ W, \theta Y\right] - \left[ DW, \theta Y\right] - \left[ W, \theta DY\right] , \end{aligned}$$

where \(W = [X,Z] \in \mathfrak {g}_{\alpha +\beta }\); it will suffice to prove that this is 0 for all \(W \in \mathfrak {g}_{\alpha +\beta }\) and all \(Y \in \mathfrak {g}_\alpha \). By the definition of \(J_W\), for \(W \in \mathfrak {g}_{\alpha +\beta }\), we may rewrite the last expression as

$$\begin{aligned} DJ_W Y - J_{DW} Y - J_W DY, \end{aligned}$$

and since D is a skew-symmetric derivation, this is 0 by (2.4).

We exchange the roles of \(\alpha \) and \(\beta \) to prove the remaining identity. \(\square \)

5.2.2 The case \(B_2\)

Until further notice, we assume that \(\mathfrak {g}\) has root system \(B_2\). We denote by \(\alpha \) and \(\beta \) the simple roots, with \(\beta \) longer than \(\alpha \). Hence the highest root is \(\omega =2\alpha +\beta \), and \(\Sigma ^+ = \{\alpha , \beta , \alpha +\beta , 2\alpha +\beta \}\).

The first proposition is the basic result: it establishes that every element of \(\mathcal {D}(\mathfrak {n})\) is a sum of a symmetric and a skew-symmetric derivation.

Proposition 5.12

If D is in \(\mathcal {D}(\mathfrak {n})\), then its transpose \(D^{\mathsf {T}}\) is also in \(\mathcal {D}(\mathfrak {n})\).

Proof

By Lemma 3.8, the algebra \(\mathfrak {g}_\alpha \oplus \mathfrak {g}_{\alpha +\beta } \oplus \mathfrak {g}_{2\alpha +\beta }\) is H-type. By Corollary 2.5, the restriction of \(D^{\mathsf {T}}\) to this H-type algebra is a derivation. Hence it suffices to show that

$$\begin{aligned} D^{\mathsf {T}}[U,X] = [D^{\mathsf {T}}U, X] + [U, D^{\mathsf {T}}X] \end{aligned}$$

for all X in \(\mathfrak {g}_\alpha \) and all U in \(\mathfrak {g}_\beta \). The proposition now follows from Lemma 5.7. \(\square \)

Now we describe the symmetric derivations.

Proposition 5.13

Every derivation D in \(\mathcal {D}^{\mathrm {sym}}(\mathfrak {n})\) is equal to \({\text {ad}}(H)\) for some H in \(\mathfrak {a}\).

Proof

By Lemma 3.5, \(\mathfrak {g}_\alpha \oplus \mathfrak {g}_{\alpha +\beta } \oplus \mathfrak {g}_{2\alpha +\beta }\) is an H-type algebra. In light of Corollary 2.8, we may assume that D vanishes on \(\mathfrak {g}_{2\alpha +\beta }\), by subtracting \({\text {ad}}(H)\) for a suitable H in \(\mathfrak {a}\).

The derivation D, being symmetric, may be diagonalised with real eigenvalues. We fix eigenvectors U in \(\mathfrak {g}_\beta \) with eigenvalue \(\lambda \) and X in \(\mathfrak {g}_\alpha \) with eigenvalue \(\mu \). Since D is a derivation,

$$\begin{aligned} D\left[ \left[ U,X\right] ,X\right] =\left( \lambda +2\mu \right) \left[ \left[ U,X\right] ,X\right] . \end{aligned}$$

Now \(\alpha \) and \(\beta \) satisfy the hypotheses of Lemma 3.10, and so \([[U,X], X]\) is nonzero. Since D vanishes on \(\mathfrak {g}_{2\alpha +\beta }\),

$$\begin{aligned} \lambda +2\mu = 0. \end{aligned}$$

We vary the eigenvector U, holding X fixed: this shows that every eigenvalue \(\lambda \) of D on \(\mathfrak {g}_\beta \) equals \(-2\mu \), so D acts by a scalar on \(\mathfrak {g}_\beta \). Similarly, D acts by a scalar on \(\mathfrak {g}_\alpha \). By Lemma 5.2, this implies the proposition. \(\square \)
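The eigenvalue bookkeeping above can be replayed in a toy model with one-dimensional root spaces, the nilradical of split \(B_2\); the structure constants below, \([X_\beta ,X_\alpha ] = X_{\alpha +\beta }\) and \([X_{\alpha +\beta },X_\alpha ] = X_{2\alpha +\beta }\), are a normalisation we choose for illustration.

```python
import itertools
import numpy as np

# Basis indices for the split B_2 nilradical:
# 0: X_alpha, 1: X_beta, 2: X_{alpha+beta}, 3: X_{2alpha+beta}.
DIM = 4
C = np.zeros((DIM, DIM, DIM))       # [e_i, e_j] = sum_k C[i,j,k] e_k
C[1, 0, 2], C[0, 1, 2] = 1.0, -1.0  # [X_beta, X_alpha] = X_{alpha+beta}
C[2, 0, 3], C[0, 2, 3] = 1.0, -1.0  # [X_{alpha+beta}, X_alpha] = X_{2alpha+beta}

def bracket(u, v):
    return np.einsum('i,j,ijk->k', u, v, C)

# Sanity check: the Jacobi identity holds for all basis triples.
e = np.eye(DIM)
for i, j, k in itertools.product(range(DIM), repeat=3):
    jac = (bracket(e[i], bracket(e[j], e[k]))
           + bracket(e[j], bracket(e[k], e[i]))
           + bracket(e[k], bracket(e[i], e[j])))
    assert np.allclose(jac, 0)

# A root-space-preserving derivation acting by mu on g_alpha and lam on
# g_beta is forced to act by lam + mu on g_{alpha+beta} and lam + 2*mu
# on g_{2alpha+beta}; vanishing on the top space means lam + 2*mu = 0.
mu, lam = 3.0, -6.0                 # chosen so that lam + 2*mu = 0
D = np.diag([mu, lam, lam + mu, lam + 2 * mu])

for i, j in itertools.product(range(DIM), repeat=2):
    assert np.allclose(D @ bracket(e[i], e[j]),
                       bracket(D @ e[i], e[j]) + bracket(e[i], D @ e[j]))
assert np.allclose(D @ e[3], 0)
```

With root multiplicity one the constraint \(\lambda + 2\mu = 0\) is visible directly in the diagonal of D; the proof above shows that the same relation persists for arbitrary multiplicities.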

Proposition 5.14

The basic derivation identity \((E_{\gamma ,\delta })\) holds as \(\gamma \) and \(\delta \) range over the set \(\{\alpha , \beta \}\) of simple roots. Consequently, every derivation D in \(\mathcal {D}^{\mathrm {skew}}(\mathfrak {n})\) is equal to \({\text {ad}}(Z)\) for some Z in \(\mathfrak {m}\).

Proof

Recall the basic derivation identity \((E_{\gamma ,\delta })\), that is, the identity

$$\begin{aligned} D\left[ \left[ X,\theta Y\right] ,Z\right] = \left[ \left[ DX,\theta Y\right] , Z\right] + \left[ \left[ X,\theta DY\right] ,Z\right] + \left[ \left[ X,\theta Y\right] ,DZ\right] \end{aligned}$$

for all \(X, Y \in \mathfrak {g}_\gamma \) and all \(Z \in \mathfrak {g}_\delta \). Again by Theorem 4.5, we need to prove \((E_{\gamma ,\delta })\) as \(\gamma \) and \(\delta \) range over \(\{\alpha , \beta \}\). The identities \((E_{\alpha ,\alpha })\) and \((E_{\beta ,\beta })\) hold by Corollary 5.6.

Now we prove \((E_{\beta , \alpha })\). Suppose that \(X, Y \in \mathfrak {g}_\beta \) and \(Z \in \mathfrak {g}_\alpha \). Since \(\beta - \alpha \) is not a root,

$$\begin{aligned}&D\left[ \left[ X,\theta Y\right] ,Z\right] - \left[ \left[ DX,\theta Y\right] , Z\right] - \left[ \left[ X,\theta DY\right] ,Z\right] - \left[ \left[ X,\theta Y\right] ,DZ\right] \\&\quad = D\left[ \left[ X,Z\right] , \theta Y\right] - \left[ \left[ DX, Z\right] , \theta Y\right] - \left[ \left[ X,Z\right] , \theta DY\right] - \left[ \left[ X, DZ\right] ,\theta Y\right] \\&\quad = D\left[ \left[ X,Z\right] , \theta Y\right] - \left[ D\left[ X, Z\right] , \theta Y\right] - \left[ \left[ X,Z\right] , \theta DY\right] \\&\quad = D\left[ W, \theta Y\right] - \left[ DW, \theta Y\right] - \left[ W, \theta DY\right] , \end{aligned}$$

where \(W = [X,Z] \in \mathfrak {g}_{\alpha +\beta }\); it will suffice to prove that this is 0 for all \(W \in \mathfrak {g}_{\alpha +\beta }\) and all \(Y \in \mathfrak {g}_\beta \). By Lemma 3.9, \({\text {ad}}(Y)\) maps \(\mathfrak {g}_\alpha \) onto \(\mathfrak {g}_{\alpha +\beta }\), so it will suffice to prove that

$$\begin{aligned} D[[U,Y], \theta Y] - [D[U,Y], \theta Y] - [[U,Y], \theta DY] =0 \end{aligned}$$

for all \(U \in \mathfrak {g}_\alpha \) and all \(Y \in \mathfrak {g}_\beta \). Since \(\alpha - \beta \) is not a root, \([[R,S], \theta T] = [R,[S, \theta T]]\) for all \(R \in \mathfrak {g}_\alpha \) and all \(S,T \in \mathfrak {g}_\beta \), whence

$$\begin{aligned} \begin{aligned}&D\left[ \left[ U,Y\right] , \theta Y\right] - \left[ D\left[ U,Y\right] , \theta Y\right] - \left[ \left[ U,Y\right] , \theta DY\right] \\&\quad =D\left[ \left[ U,Y\right] , \theta Y\right] - \left[ \left[ DU,Y\right] , \theta Y\right] - \left[ \left[ U,DY\right] , \theta Y\right] - \left[ \left[ U,Y\right] , \theta DY\right] \\&\quad =D\left[ U,\left[ Y, \theta Y\right] \right] - \left[ DU,\left[ Y, \theta Y\right] \right] - \left[ U,\left[ DY, \theta Y\right] \right] - \left[ U,\left[ Y, \theta DY\right] \right] \\&\quad = \left\langle \alpha , \beta \right\rangle \Vert Y\Vert ^2 DU - \left\langle \alpha , \beta \right\rangle \Vert Y\Vert ^2 DU - \left[ U,\left[ DY, \theta Y\right] \right] - \left[ U,\left[ Y, \theta DY\right] \right] \\&\quad = \left[ \left[ DY, \theta Y\right] + \left[ Y, \theta DY\right] , U\right] . \end{aligned} \end{aligned}$$
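To spell out the Jacobi-identity step used here: since \(\theta \) maps \(\mathfrak {g}_\beta \) to \(\mathfrak {g}_{-\beta }\) and \(\alpha - \beta \) is not a root, the bracket \([R, \theta T]\) lies in \(\mathfrak {g}_{\alpha - \beta } = \{0\}\) for all \(R \in \mathfrak {g}_\alpha \) and \(T \in \mathfrak {g}_\beta \), so the Jacobi identity reduces to

$$\begin{aligned} \left[ \left[ R,S\right] , \theta T\right] = \left[ \left[ R,\theta T\right] , S\right] + \left[ R,\left[ S, \theta T\right] \right] = \left[ R,\left[ S, \theta T\right] \right] . \end{aligned}$$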

Now if \(X \perp Y\), then \([X, \theta Y] \in \mathfrak {m}\) and hence

$$\begin{aligned} \left[ X, \theta Y\right] + \left[ Y, \theta X\right] = \theta \left[ X, \theta Y\right] + \left[ Y, \theta X\right] = \left[ \theta X, Y\right] + \left[ Y, \theta X\right] = 0. \end{aligned}$$

Since D is skew-symmetric, \(DY \perp Y\), so applying this with X equal to DY finishes the proof of \((E_{\beta , \alpha })\).

It remains to prove \((E_{\alpha ,\beta })\). Take \(X, Y \in \mathfrak {g}_\alpha \) and \(U,Z \in \mathfrak {g}_\beta \). Then

$$\begin{aligned}&\left\langle D\left[ \left[ X, \theta Y\right] ,Z\right] - \left[ \left[ DX, \theta Y\right] ,Z\right] - \left[ \left[ X, \theta DY\right] ,Z\right] - \left[ \left[ X, \theta Y\right] ,DZ\right] , U \right\rangle \\&\quad = -\left\langle \left[ \left[ X, \theta Y\right] ,Z\right] , DU \right\rangle + \left\langle \left[ DX, \theta Y\right] , \left[ U, \theta Z\right] \right\rangle \\&\qquad + \left\langle \left[ X, \theta DY\right] , \left[ U, \theta Z\right] \right\rangle + \left\langle \left[ X, \theta Y\right] , \left[ U, \theta DZ\right] \right\rangle \\&\quad = \left\langle \left[ X, \theta Y\right] , \left[ DU,\theta Z\right] \right\rangle - \left\langle DX, \left[ \left[ U, \theta Z\right] , Y\right] \right\rangle \\&\qquad - \left\langle X, \left[ \left[ U, \theta Z\right] ,DY\right] \right\rangle - \left\langle X, \left[ \left[ U, \theta DZ\right] ,Y\right] \right\rangle \\&\quad =-\left\langle X, \left[ \left[ DU,\theta Z\right] ,Y\right] \right\rangle + \left\langle X, D\left[ \left[ U, \theta Z\right] , Y\right] \right\rangle \\&\qquad - \left\langle X, \left[ \left[ U, \theta Z\right] ,DY\right] \right\rangle - \left\langle X, \left[ \left[ U, \theta DZ\right] ,Y\right] \right\rangle \\&\quad = \left\langle X, D\left[ \left[ U, \theta Z\right] , Y\right] - \left[ \left[ DU,\theta Z\right] ,Y\right] - \left[ \left[ U, \theta Z\right] ,DY\right] - \left[ \left[ U, \theta DZ\right] ,Y\right] \right\rangle . \end{aligned}$$

This shows that \((E_{\alpha ,\beta })\) and \((E_{\beta ,\alpha })\) are equivalent, so we are done.

Note that we have not used the fact that \(2\alpha \) and \(2(\alpha +\beta )\) are not roots, so this argument holds in the \(BC_2\) case too. \(\square \)

This completes our discussion of the algebras with root system \(B_2\). We remind the reader that \(C_2\) is the same as \(B_2\). The algebras with root system \(G_2\) are covered by Remark 5.3. It remains to consider the algebras with root system \(BC_2\).

5.2.3 The case \(BC_2\)

Until further notice, we assume that \(\mathfrak {g}\) has root system \(BC_2\). Denote by \(\alpha \) and \(\beta \) the simple roots, with \(\alpha \) orthogonal to the highest root \(\omega \). Then \(\Sigma ^+ = \{\alpha , 2\alpha , \beta , \alpha +\beta , 2\alpha + \beta , 2\alpha + 2\beta \}\) and \(\omega = 2\alpha + 2\beta \).
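For concreteness, these roots may be realised in a standard coordinate model of \(BC_2\) (used here only for illustration): take \(\alpha = e_2\) and \(\beta = e_1 - e_2\) with \(e_1, e_2\) orthonormal. Then

$$\begin{aligned} \Sigma ^+ = \{ e_2, \, 2e_2, \, e_1 - e_2, \, e_1, \, e_1 + e_2, \, 2e_1 \} = \{\alpha , \, 2\alpha , \, \beta , \, \alpha + \beta , \, 2\alpha + \beta , \, 2\alpha + 2\beta \}, \end{aligned}$$

and the highest root \(\omega = 2\alpha + 2\beta = 2e_1\) is indeed orthogonal to \(\alpha = e_2\).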

Note that \(\{ \pm 2 \alpha , \pm \beta , \pm (2\alpha +\beta ), \pm (2\alpha +2\beta )\}\) is a root subsystem of type \(B_2\); write \(\mathfrak {n}_{\mathrm {sub}}\) for \(\mathfrak {g}_{2\alpha } \oplus \mathfrak {g}_{\beta } \oplus \mathfrak {g}_{2\alpha + \beta } \oplus \mathfrak {g}_{2\alpha + 2\beta }\). The results of the previous subsection apply to the root-space-preserving derivations of the subalgebra \(\mathfrak {n}_{\mathrm {sub}}\), and so give us information about derivations of \(\mathfrak {n}\).

The first step is to establish the analogue of Proposition 5.12.

Proposition 5.15

If D is in \(\mathcal {D}(\mathfrak {n})\), then its transpose \(D^{\mathsf {T}}\) is also in \(\mathcal {D}(\mathfrak {n})\).

Proof

By linearity, it suffices to show that

$$\begin{aligned} D^{\mathsf {T}}\left[ X,Y\right] = \left[ D^{\mathsf {T}}X, Y\right] + \left[ X, D^{\mathsf {T}}Y\right] \quad \forall X \in \mathfrak {g}_\gamma \quad \forall Y \in \mathfrak {g}_\delta , \end{aligned}$$
(5.4)

as \(\gamma \) and \(\delta \) range over \(\Sigma ^+\). As D and hence also \(D^{\mathsf {T}}\) preserve root spaces, this is trivial unless \(\gamma +\delta \) is a root. Moreover, by Corollary 2.5, the restrictions of \(D^{\mathsf {T}}\) to the H-type algebras \(\mathfrak {g}_{\beta } \oplus \mathfrak {g}_{\alpha + \beta } \oplus \mathfrak {g}_{2\alpha + \beta } \oplus \mathfrak {g}_{2\alpha + 2\beta }\) and \(\mathfrak {g}_\alpha \oplus \mathfrak {g}_{2\alpha }\) are derivations, and by Proposition 5.12, the restriction of \(D^{\mathsf {T}}\) to \(\mathfrak {g}_{\beta } \oplus \mathfrak {g}_{2\alpha } \oplus \mathfrak {g}_{2\alpha + \beta } \oplus \mathfrak {g}_{2\alpha + 2\beta }\) is a derivation.

Thus it suffices to prove (5.4) when \((\gamma ,\delta )\) is either \((\alpha ,\beta )\) or \((\alpha ,\alpha + \beta )\). Lemma 5.7 takes care of the case when \(\gamma = \alpha \) and \(\delta =\beta \).

Since \(2(2\alpha +\beta )\) is not a root, Proposition 5.4 implies that there exists Z in \(\mathfrak {m}^{2\alpha +\beta }\) that agrees with the skew-symmetric part of D on \(\mathfrak {g}_{2\alpha +\beta }\). By subtracting \({\text {ad}}(Z)\) from D if necessary, we may suppose that D is symmetric on \(\mathfrak {g}_{2\alpha +\beta }\). Now Corollary 3.7 gives (5.4). \(\square \)

Once again, we consider the symmetric derivations.

Proposition 5.16

If the root system of \(\mathfrak {g}\) is \(BC_2\), then every derivation in \(\mathcal {D}^{\mathrm {sym}}(\mathfrak {n})\) is given by \({\text {ad}}(H)\) for some H in \(\mathfrak {a}\).

Proof

Let \(D \in \mathcal {D}^{\mathrm {sym}}(\mathfrak {n})\). By Proposition 5.13, the restriction of D to \(\mathfrak {n}_{\mathrm {sub}}\) is given by \({\text {ad}}(H)\) for some H in \(\mathfrak {a}\). By subtracting \({\text {ad}}(H)\) if necessary, we may suppose that D vanishes on \(\mathfrak {n}_{\mathrm {sub}}\); it will then suffice to show that D is trivial.

To do this, we pick an eigenvector X of D in \(\mathfrak {g}_{\alpha }\) with eigenvalue \(\lambda \) and \(U \in \mathfrak {g}_{\beta } {\setminus }\{0\}\). Since D is a derivation and \(DU = 0\),

$$\begin{aligned} D\left[ \left[ U,X\right] ,X\right] = \left[ \left[ U,DX\right] ,X\right] + \left[ \left[ U,X\right] ,DX\right] = 2 \lambda \left[ \left[ U,X\right] ,X\right] . \end{aligned}$$

However, \(D[[U,X],X]=0\), since \([[U,X],X]\) lies in \(\mathfrak {g}_{2\alpha +\beta } \subset \mathfrak {n}_{\mathrm {sub}}\). Since \([[U,X],X] \ne 0\) by Lemma 3.10, \(\lambda =0\), and D is trivial on \(\mathfrak {g}_\alpha \). Since D is also trivial on \(\mathfrak {g}_\beta \), it is trivial on \(\mathfrak {g}_{\alpha +\beta }\), and hence trivial on all the root spaces. \(\square \)

We conclude our discussion of the rank-two case with a description of the skew-symmetric derivations.

Proposition 5.17

The basic derivation identity \((E_{\gamma ,\delta })\) holds as \(\gamma \) and \(\delta \) range over the set \(\{\alpha , \beta \}\) of simple roots. Consequently, every derivation D in \(\mathcal {D}^{\mathrm {skew}}(\mathfrak {n})\) is equal to \({\text {ad}}(Z)\) for some Z in \(\mathfrak {m}\).

Proof

This follows from Theorem 4.5 and Proposition 5.14, whose proof, as noted there, also covers the root system \(BC_2\). \(\square \)

5.3 The general case

Now we prove Theorem 5.1. Henceforth, \(\mathfrak {g}\) denotes a real simple Lie algebra of real rank at least three, and \(\mathfrak {n}\) is an Iwasawa nilpotent subalgebra of \(\mathfrak {g}\).

Proposition 5.18

Suppose that D is a derivation of \(\mathfrak {n}\). Then \(D^{\mathsf {T}}\) is also a derivation of \(\mathfrak {n}\).

Proof

By linearity, it suffices to show that

$$\begin{aligned} D^{\mathsf {T}}\left[ X,Y\right] = \left[ D^{\mathsf {T}}X,Y\right] + \left[ X,D^{\mathsf {T}}Y\right] \end{aligned}$$

for all \(X \in \mathfrak {g}_\gamma \) and all \(Y \in \mathfrak {g}_\delta \) where \(\gamma \) and \(\delta \) range over \(\Sigma ^+\). This is obvious if \(\gamma +\delta \) is not a root, while if \(\gamma +\delta \) is a root, then it follows by restricting D to \(\mathfrak {n}^{\{\gamma ,\delta \}}\). \(\square \)

Proposition 5.19

Suppose that D is a symmetric derivation of \(\mathfrak {n}\). Then D lies in \({\text {ad}}(\mathfrak {a})\).

Proof

Again, by restricting to rank-two subalgebras, we may show that D acts as a scalar on each root space. By Lemma 5.2, \(D \in {\text {ad}}(\mathfrak {a})\). \(\square \)

Proposition 5.20

Suppose that D is a skew-symmetric derivation of \(\mathfrak {n}\). Then D lies in \({\text {ad}}(\mathfrak {m})\).

Proof

Let D be a skew-symmetric root-space-preserving derivation of \(\mathfrak {n}\). Again, by restricting to rank-two subalgebras, we may show that D satisfies the basic derivation identity \((E_{\gamma ,\delta })\) whenever \(\gamma \) and \(\delta \) are positive roots. Hence \(D \in {\text {ad}}(\mathfrak {m})\) by Theorem 4.5. \(\square \)