We begin by following the historical arc of quaternion algebras and tracing their impact on the development of mathematics. Our account is selective: for further overview, see Lam [Lam2003] and Lewis [Lew2006a].

1.1 Hamilton’s quaternions

In perhaps the “most famous act of mathematical vandalism”, on October 16, 1843, Sir William Rowan Hamilton (1805–1865, Figure 1.1.2) carved the following equations into the Brougham Bridge (now Broom Bridge) in Dublin:

$$\begin{aligned} i^2 = j^2 = k^2 = ijk = -1. \end{aligned}$$
(1.1.1)

His discovery was a defining moment in the history of algebra (Figure 1.1.3).

Figure 1.1.2: William Rowan Hamilton (public domain; scan by Wellesley College Library)

For at least ten years (on and off), Hamilton had been attempting to model (real) three-dimensional space with a structure like the complex numbers, whose addition and multiplication occur in two-dimensional space. Just as the complex numbers have a “real” and an “imaginary” part, so too did Hamilton hope to find an algebraic system whose elements had a “real” and a two-dimensional “imaginary” part. In early October 1843, his sons Archibald Henry and William Edwin Hamilton, while still very young, would ask their father at breakfast [Ham67, p. xv]: “Well, papa, can you multiply triplets?” To which Hamilton would reply, “with a sad shake of the head, ‘No, I can only add and subtract them’” [Ham67, p. xv]. For a history of the “multiplying triplets” problem (the nonexistence of a division algebra over the reals of dimension 3), see May [May66, p. 290].

Figure 1.1.3: William Rowan Hamilton, a sand sculpture by Daniel Doyle, part of the 2012 Dublin castle exhibition, Irish Science (reproduced with permission)

Then, on the dramatic day in 1843, Hamilton had a flash of insight [Ham67, p. xv–xvi], which he described in a letter to Archibald (written in 1865):

On the 16th day of [October]—which happened to be a Monday, and a Council day of the Royal Irish Academy—I was walking in to attend and preside, and your mother was walking with me, along the Royal Canal, to which she had perhaps driven; and although she talked with me now and then, yet an under-current of thought was going on in my mind, which gave at last a result, whereof it is not too much to say that I felt at once the importance. An electric circuit seemed to close; and a spark flashed forth, the herald (as I foresaw, immediately) of many long years to come of definitely directed thought and work, by myself if spared, and at all events on the part of others, if I should even be allowed to live long enough distinctly to communicate the discovery. Nor could I resist the impulse—unphilosophical as it may have been—to cut with a knife on a stone of Brougham Bridge, as we passed it, the fundamental formula with the symbols, i, j, k; namely,

$$\begin{aligned} i^2 = j^2 = k^2 = ij k =-1 \end{aligned}$$

which contains the Solution of the Problem, but of course, as an inscription, has long since mouldered away.

In this moment, Hamilton realized that he needed a fourth dimension; he later coined the term quaternions for the real space spanned by the elements \(1, i, j, k\), subject to his multiplication laws. He presented his theory of quaternions to the Royal Irish Academy in a paper entitled “On a new Species of Imaginary Quantities connected with a theory of Quaternions” [Ham1843]. Today, we denote this algebra \(\mathbb H :=\mathbb R +\mathbb R i+\mathbb R j+\mathbb R k\) and call \(\mathbb H \) the ring of Hamilton quaternions in his honor (Figure 1.1.4).

This charming story of quaternionic discovery remains in the popular consciousness, and to commemorate Hamilton’s discovery of the quaternions, there is an annual “Hamilton walk” in Dublin [ÓCa2010]. Although his carvings have long since worn away, a plaque on the bridge now commemorates this significant event in mathematical history (Figure 1.1.5).

Figure 1.1.4: A page from Hamilton’s Elements of quaternions [Ham1866] (public domain)

For more on the history of Hamilton’s discovery, see the extensive and detailed accounts of Dickson [Dic19] and Van der Waerden [vdW76]. There are also three main biographies written about the life of William Rowan Hamilton, a man sometimes referred to as “Ireland’s greatest mathematician”: by Graves [Grav1882, Grav1885, Grav1889] in three volumes, Hankins [Hankin80], and O’Donnell [O’Do83]. Numerous other shorter biographies have been written [DM89, Lanc67, ÓCa2000]. (Certain aspects of Hamilton’s private life deserve a more positive portrayal, however: see Van Weerden–Wepster [WW2018].)

Figure 1.1.5: The Broom Bridge plaque (author’s photo)

There are several precursors to Hamilton’s discovery that bear mentioning. First, the quaternion multiplication laws are already implicit in the four-square identity of Leonhard Euler (1707–1783):

$$\begin{aligned} \begin{aligned}&(a_1^2+a_2^2+a_3^2+a_4^2)(b_1^2+b_2^2+b_3^2+b_4^2)= c_1^2 + c_2^2 + c_3^2 + c_4^2 = \\&\qquad (a_1 b_1 - a_2 b_2 - a_3 b_3 - a_4 b_4)^2 + (a_1 b_2 + a_2 b_1 + a_3 b_4 - a_4 b_3)^2 \\&\qquad +(a_1 b_3 - a_2 b_4 + a_3 b_1 + a_4 b_2)^2 + (a_1 b_4 + a_2 b_3 - a_3 b_2 + a_4 b_1)^2. \end{aligned} \end{aligned}$$
(1.1.6)

Indeed, the full multiplication law for quaternions reads precisely

$$\begin{aligned} (a_1 + a_2i + a_3j + a_4k)(b_1 + b_2i + b_3j + b_4k) = c_1 + c_2i+c_3j + c_4k \end{aligned}$$

with \(c_1,c_2,c_3,c_4\) as defined in (1.1.6); the four-square identity corresponds to taking a norm on both sides.
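To make this correspondence concrete, here is a minimal Python sketch that encodes a quaternion as a 4-tuple and multiplies using the expressions \(c_1,\dots ,c_4\) from (1.1.6). The helper names `qmul` and `norm` are our own, chosen only for illustration. The sketch checks Hamilton's relations (1.1.1) and one numerical instance of the four-square identity.

```python
def qmul(a, b):
    """Product of a1 + a2*i + a3*j + a4*k and b1 + b2*i + b3*j + b4*k.

    Quaternions are 4-tuples; the coordinates of the result are the
    expressions c1, c2, c3, c4 appearing in the four-square identity.
    """
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def norm(a):
    """Sum of the squares of the four coordinates."""
    return sum(x * x for x in a)


# Basis elements and Hamilton's relations i^2 = j^2 = k^2 = ijk = -1.
i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
minus_one = (-1, 0, 0, 0)
assert qmul(i, i) == qmul(j, j) == qmul(k, k) == qmul(qmul(i, j), k) == minus_one

# One instance of the four-square identity.
a = (1, 2, 3, 4)
b = (5, -6, 7, 8)
c = qmul(a, b)
print(c)                            # (-36, 0, -18, 60)
print(norm(a) * norm(b), norm(c))   # 5220 5220
```

Working with exact integers keeps the check free of rounding error; the same function accepts floats or fractions unchanged.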

It was perhaps Carl Friedrich Gauss (1777–1855) who first observed this connection. In a note dated around 1819 [Gau00], he interpreted the formula (1.1.6) as a way of composing real quadruples: to the quadruples \((a_1,a_2,a_3,a_4)\) and \((b_1,b_2,b_3,b_4)\) in \(\mathbb R ^4\), he associated the composite tuple \((c_1,c_2,c_3,c_4)\) and noted the noncommutativity of this operation. Gauss elected not to publish these findings (as he chose not to do with many of his discoveries). In letters to De Morgan [Grav1885, p. 330; Grav1889, p. 490], Hamilton attacked the allegation that Gauss had discovered quaternions first.

Finally, Olinde Rodrigues (1795–1851) (of the Rodrigues formula for Legendre polynomials) gave a formula for the angle and axis of a rotation in \(\mathbb R ^3\) obtained from two successive rotations—essentially giving a different parametrization of the quaternions—but had left mathematics for banking long before the publication of his paper [Rod1840]. The story of Rodrigues and the quaternions is given by Altmann [Alt89] and Pujol [Puj2012], and the fuller story of his life is recounted by Altmann–Ortiz [AO2005]. See also the description by Pujol [Puj2014] of Hamilton’s derivation of the relation between rotations and quaternions from 1847, set in historical context.

In any case, the quaternions consumed the rest of Hamilton’s academic life and resulted in the publication of two bulky treatises [Ham1853, Ham1866] (see also the review [Ham1899]). Hamilton’s mathematical writing over these years was at times opaque; nevertheless, many physicists used quaternions extensively, and for a long time in the mid-19th century quaternions were an essential notion in physics.

Other figures contemporaneous with Hamilton were also developing vectorial systems, most notably Hermann Grassmann (1809–1877) [Gras1862]. The modern notion of vectors was developed by Willard Gibbs (1839–1903) and Oliver Heaviside (1850–1925), independently. In 1881 and 1884 (in two halves), Gibbs introduced in a pamphlet Elements of Vector Analysis the now standard vector notation of the cross product and dot product, with the splendid equality

$$\begin{aligned} vw = -v\cdot w + v \times w \end{aligned}$$
(1.1.7)

for \(v,w \in \mathbb R i+\mathbb R j+\mathbb R k \subset \mathbb H \) relating quaternionic multiplication on the left to dot and cross products on the right. (The equality (1.1.7) also appears in Hamilton’s work, but in different notation.) Gibbs did not consider the quaternion product to be a “fundamental notion in vector analysis” [Gib1891, p. 512], and argued for a vector analysis that would apply in arbitrary dimension; on the relationship between these works, Gibbs wrote after learning of the work of Grassmann: “I saw that the methods wh[ich] I was using, while nearly those of Hamilton, were almost exactly those of Grassmann” [Whe62, p. 108]. For more on the history of quaternionic and vector calculus, see Crowe [Cro64] and Simons [Sim2010].
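The equality (1.1.7) is easy to test numerically. The sketch below, with helper names of our own choosing, embeds vectors of \(\mathbb R ^3\) as quaternions with zero real part, multiplies them with the quaternion product, and compares the result with the dot and cross products.

```python
def qmul(a, b):
    """Quaternion product on 4-tuples (real part first, then i, j, k)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def dot(v, w):
    return sum(x * y for x, y in zip(v, w))


def cross(v, w):
    return (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )


def pure_product(v, w):
    """Quaternion product of two vectors viewed as pure quaternions."""
    return qmul((0, *v), (0, *w))


v = (1, 2, 3)
w = (4, -5, 6)
# Real part is -v.w, imaginary part is v x w.
assert pure_product(v, w) == (-dot(v, w), *cross(v, w))
print(pure_product(v, w))  # (-12, 27, 6, -13)
```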

The rivalry between physical notations flared into a war in the latter part of the 19th century between the ‘quaternionists’ and the ‘vectorists’, and for some the preference of one system versus the other became an almost partisan split. On the side of quaternions, James Clerk Maxwell (1831–1879), who derived the equations which describe electromagnetic fields, wrote [Max1869, p. 226]:

The invention of the calculus of quaternions is a step towards the knowledge of quantities related to space which can only be compared, for its importance, with the invention of triple coordinates by Descartes. The ideas of this calculus, as distinguished from its operations and symbols, are fitted to be of the greatest use in all parts of science.

And Peter Tait (1831–1901), Hamilton’s “chief disciple” [Hankin80, p. 316], wrote in 1890 [Tai1890] decrying notation and attacking Willard Gibbs (1839–1903):

It is disappointing to find how little progress has recently been made with the development of Quaternions. One cause, which has been specially active in France, is that workers at the subject have been more intent on modifying the notation, or the mode of presentation of the fundamental principles, than on extending the applications of the Calculus. ...Even Prof. Willard Gibbs must be ranked as one of the retarders of quaternions progress, in virtue of his pamphlet on Vector Analysis, a sort of hermaphrodite monster, compounded of the notation of Hamilton and Grassman.

Game on! On the vectorist side, Lord Kelvin (a.k.a. William Thomson, who formulated the laws of thermodynamics), said in an 1892 letter to R. B. Hayward about his textbook in algebra (quoted in Thompson [Tho10, p. 1070]):

Quaternions came from Hamilton after his really good work had been done; and, though beautifully ingenious, have been an unmixed evil to those who have touched them in any way, including Clerk Maxwell.

(There is also a rompous fictionalized account by Pynchon in his tome Against the Day [Pyn2006].) Ultimately, the superiority and generality of vector notation carried the day, and only certain useful fragments of Hamilton’s quaternionic notation—e.g., the “right-hand rule” \(i \times j = k\) in multivariable calculus—remain in modern usage.

1.2 Algebra after the quaternions

The debut of Hamilton’s quaternions was met with some resistance in the mathematical world: it proposed a system of “numbers” that did not satisfy the usual commutative rule of multiplication. Quaternions predated even the notion of matrices, introduced in 1855 by Arthur Cayley (1821–1895). Hamilton’s bold proposal of a noncommutative multiplication law was the harbinger of a burgeoning array of algebraic structures. In the words of J.J. Sylvester [Syl1883, pp. 271–272]:

In Quaternions (which, as will presently be seen, are but the simplest order of matrices viewed under a particular aspect) the example had been given of Algebra released from the yoke of the commutative principle of multiplication—an emancipation somewhat akin to Lobachevsky’s of Geometry from Euclid’s noted empirical axiom; and later on, the Peirces, father and son (but subsequently to 1858) had prefigured the universalization of Hamilton’s theory, and had emitted an opinion to the effect that probably all systems of algebraical symbols subject to the associative law of multiplication would be eventually found to be identical with linear transformations of schemata susceptible of matriculate representation.

So with the introduction of the quaternions, the floodgates of algebraic possibility had been opened. See Happel [Hap80] for an overview of the early development of algebra following Hamilton’s quaternions, as well as the more general history given by Van der Waerden [vdW85, Chapters 10–11].

The day after his discovery, Hamilton sent a letter [Ham1844] describing the quaternions to his friend John T. Graves (1806–1870). Graves replied on October 26, 1843, with his compliments, but added:

There is still something in the system which gravels me. I have not yet any clear views as to the extent to which we are at liberty arbitrarily to create imaginaries, and to endow them with supernatural properties. ...  If with your alchemy you can make three pounds of gold, why should you stop there?

Following through on this invitation, on December 26, 1843, Graves wrote to Hamilton that he had successfully generalized the quaternions to the “octaves”, now called octonions \(\mathbb O \), an algebra in eight dimensions, with which he was able to prove that the product of two sums of eight perfect squares is another sum of eight perfect squares, a formula generalizing (1.1.6). In fact, Hamilton first invented the term associative in 1844, around the time of his correspondence with Graves. Unfortunately for Graves, the octonions were discovered independently and published in 1845 by Cayley [Cay1845b], who is often credited with their discovery. (Even worse, the eight squares identity was also previously discovered by C. F. Degen.) For a more complete account of this story and the relationships between quaternions and octonions, see the survey article by Baez [Bae2002], the article by Van der Blij [vdB60], and the delightful book by Conway–Smith [CSm2003].

Cayley also studied quaternions themselves [Cay1845a] and was able to reinterpret them as arising from a doubling process, also called the Cayley–Dickson construction, which starting from \(\mathbb R \) produces \(\mathbb C \) then \(\mathbb H \) then \(\mathbb O \), taking the ordered, commutative, associative algebra \(\mathbb R \) and progressively deleting one adjective at a time. So algebras were first studied over the real and complex numbers and were accordingly called hypercomplex numbers in the late 19th and early 20th century. And this theory flourished. Hamilton himself considered the algebra over \(\mathbb C \) defined by his famous equations (1.1.1), calling its elements biquaternions. In 1878, Ferdinand Frobenius (1849–1917) proved that the only finite-dimensional associative real division algebras are \(\mathbb R \), \(\mathbb C \), and \(\mathbb H \) [Fro1878]. This result was also proven independently by C.S. Peirce, the son of Benjamin Peirce (discussed below). Adolf Hurwitz (1859–1919) later showed that the only normed finite-dimensional not-necessarily-associative real division algebras are \(\mathbb R \), \(\mathbb C \), \(\mathbb H \), and \(\mathbb O \). (The same statement is true without the condition that the algebra be normed, but currently the proofs use topology, not algebra! Bott–Milnor [BM58] and Kervaire [Ker58] proved that the \((n-1)\)-dimensional sphere \(\{x \in \mathbb R ^n : \Vert x\Vert ^2 =1\}\) has trivial tangent bundle if and only if there is an \(n\)-dimensional not-necessarily-associative real division algebra if and only if \(n=1,2,4,8\). The solution to the Hopf invariant one problem by Adams also implies this result; an elegant and concise proof using K-theory, Adams operations, and elementary number theory was given by Adams–Atiyah [AA66]. See Hirzebruch [Hir91] or Ranicki [Ran2011] for a more complete account.)
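The doubling process can be sketched in a few lines of Python. Several sign conventions for the doubling formula are in use; as an assumption for this illustration we take \((a,b)(c,d) = (ac - \bar{d}b,\; da + b\bar{c})\) with \(\overline{(a,b)} = (\bar{a},-b)\), and we store an element of the \(2^m\)-dimensional algebra as a flat list of \(2^m\) real coordinates. The function names are ours. The checks at the end illustrate the loss of commutativity in dimension 4 and of associativity in dimension 8, while the norm stays multiplicative through dimension 8.

```python
def conj(x):
    """Conjugate: (a, b) -> (conj(a), -b), with the identity on dimension 1."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-t for t in x[h:]]


def mul(x, y):
    """Doubling product (a, b)(c, d) = (ac - conj(d) b, da + b conj(c))."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b = x[:h], x[h:]
    c, d = y[:h], y[h:]
    first = [s - t for s, t in zip(mul(a, c), mul(conj(d), b))]
    second = [s + t for s, t in zip(mul(d, a), mul(b, conj(c)))]
    return first + second


def norm(x):
    return sum(t * t for t in x)


def unit(dim, index):
    """Basis vector of the dim-dimensional algebra."""
    return [1 if n == index else 0 for n in range(dim)]


# Dimension 4: the product is noncommutative.
e1, e2 = unit(4, 1), unit(4, 2)
assert mul(e1, e2) == [0, 0, 0, 1] and mul(e2, e1) == [0, 0, 0, -1]

# Dimension 8: the product is nonassociative.
f1, f2, f4 = unit(8, 1), unit(8, 2), unit(8, 4)
assert mul(mul(f1, f2), f4) != mul(f1, mul(f2, f4))

# The norm is multiplicative in dimensions 1, 2, 4, 8.
x8 = [1, 2, 3, 4, 5, 6, 7, 8]
y8 = [2, -1, 0, 3, 1, -2, 4, 1]
for dim in (1, 2, 4, 8):
    x, y = x8[:dim], y8[:dim]
    assert norm(mul(x, y)) == norm(x) * norm(y)
```

With this convention the 4-dimensional product agrees with Hamilton's on the coordinates \((a_1,a_2,a_3,a_4)\); other conventions give isomorphic algebras with different sign patterns.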

In another attempt to seek a generalization of the quaternions to higher dimension, William Clifford (1845–1879) developed a way to build algebras from quadratic forms in 1876 [Cli1878]. Clifford constructed what we now call a Clifford algebra \(C(V)\) associated to \(V=\mathbb R ^n\) (with the standard Euclidean norm); it is an algebra of dimension \(2^n\) containing \(V\) with multiplication induced from the relation \(x^2=-\Vert x\Vert ^2\) for all \(x \in V\). We have \(C(\mathbb R ^1)=\mathbb C \) and \(C(\mathbb R ^2)=\mathbb H \), so the Hamilton quaternions arise as a Clifford algebra, but \(C(\mathbb R ^3)\) is not the octonions. The theory of Clifford algebras tightly connects the theory of quadratic forms and the theory of normed division algebras, and its impact extends in many mathematical directions. For more on the history of Clifford algebras, see Diek–Kantowski [DK95].

A further physically motivated generalization was pursued by Alexander Macfarlane (1851–1913): he developed a theory of what he called hyperbolic quaternions [Macf00] (a revised version of an earlier, nonassociative attempt [Macf1891]), with the multiplication laws

$$\begin{aligned} \begin{gathered} i^2=j^2=k^2=1, \\ ij=\sqrt{-1}k=-ji, \quad jk=\sqrt{-1}i=-kj, \quad ki=\sqrt{-1}j=-ik. \end{gathered} \end{aligned}$$
(1.2.1)

Thought of as an algebra over \(\mathbb C =\mathbb R (\sqrt{-1})\), Macfarlane’s hyperbolic quaternions are isomorphic to Hamilton’s biquaternions (and therefore isomorphic to \({{\,\mathrm{M}\,}}_2(\mathbb C )\)). Moreover, the restriction of the norm to the real span of the basis \(1, i, j, k\) in Macfarlane’s algebra is a quadratic form of signature (1, 3): this gives a quaternionic version of space-time, something also known as Minkowski space (but with Macfarlane’s construction predating that of Minkowski). For more on the history and further connections, see Crowe [Cro64].
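One way to see both claims at once is to realize Macfarlane's relations (1.2.1) inside \({{\,\mathrm{M}\,}}_2(\mathbb C )\). The sketch below assumes a particular realization, sending \(i, j, k\) to the three Pauli matrices; this choice and the helper names are ours and are not taken from Macfarlane's papers. Under it the norm becomes the determinant, and on real coordinates \((t,x,y,z)\) it equals \(t^2-x^2-y^2-z^2\), a form of signature (1, 3).

```python
def matmul(A, B):
    """Product of 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][m] * B[m][c] for m in range(2)) for c in range(2))
        for r in range(2)
    )


def scale(s, A):
    return tuple(tuple(s * entry for entry in row) for row in A)


def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


ONE = ((1, 0), (0, 1))
I = ((0, 1), (1, 0))        # assumed image of i
J = ((0, -1j), (1j, 0))     # assumed image of j
K = ((1, 0), (0, -1))       # assumed image of k
ROOT = 1j                   # the scalar sqrt(-1) in the base field C

# The relations (1.2.1).
assert matmul(I, I) == matmul(J, J) == matmul(K, K) == ONE
assert matmul(I, J) == scale(ROOT, K) == scale(-1, matmul(J, I))
assert matmul(J, K) == scale(ROOT, I) == scale(-1, matmul(K, J))
assert matmul(K, I) == scale(ROOT, J) == scale(-1, matmul(I, K))


def element(t, x, y, z):
    """Matrix of t + x*i + y*j + z*k for real t, x, y, z."""
    return tuple(
        tuple(t * ONE[r][c] + x * I[r][c] + y * J[r][c] + z * K[r][c]
              for c in range(2))
        for r in range(2)
    )


def norm(t, x, y, z):
    return det(element(t, x, y, z))


# The norm on the real span of 1, i, j, k is t^2 - x^2 - y^2 - z^2.
assert norm(2, 1, 3, 5) == 2**2 - 1**2 - 3**2 - 5**2
```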

Around this time, other types of algebras over the real numbers were also being investigated, the most significant of which were Lie algebras. In the seminal work of Sophus Lie (1842–1899), group actions on manifolds were understood by looking at the action infinitesimally; one thereby obtains a Lie algebra of vector fields that determines the local group action. The simplest nontrivial example of a Lie algebra is given by the cross product of two vectors, related to quaternion multiplication in (1.1.7): it defines a bilinear, alternating, but nonassociative binary operation on \(\mathbb R ^3\) that satisfies the Jacobi identity emblematized by

$$\begin{aligned} i \times (j \times k) + k \times (i \times j) + j \times (k \times i) = 0. \end{aligned}$$
(1.2.2)

The Lie algebra “linearizes” the group action and is therefore more accessible. Wilhelm Killing (1847–1923) initiated the study of the classification of Lie algebras in a series of papers [Kil1888], and this work was completed by Élie Cartan (1869–1951). We refer to Hawkins [Haw2000] for a description of this rich series of developments.
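The properties of the cross product named above can be verified directly. The following short sketch (helper names ours) checks the instance (1.2.2) of the Jacobi identity, the same cyclic sum for an arbitrary integer triple, the alternating property, and the failure of associativity.

```python
def cross(v, w):
    return (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )


def add(*vectors):
    return tuple(sum(coords) for coords in zip(*vectors))


def jacobi(u, v, w):
    """Cyclic sum u x (v x w) + w x (u x v) + v x (w x u)."""
    return add(cross(u, cross(v, w)), cross(w, cross(u, v)), cross(v, cross(w, u)))


i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
zero = (0, 0, 0)

assert jacobi(i, j, k) == zero                        # the instance (1.2.2)
assert jacobi((1, 2, 3), (4, -5, 6), (7, 8, -9)) == zero
assert cross((1, 2, 3), (1, 2, 3)) == zero            # alternating
# Nonassociative: i x (i x j) = -j, but (i x i) x j = 0.
assert cross(i, cross(i, j)) == (0, -1, 0)
assert cross(cross(i, i), j) == zero
```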

In this way, the study of division algebras gradually evolved, independent of physical interpretations. Benjamin Peirce (1809–1880) in 1870 developed what he called linear associative algebras [Pei1882]; he provided a decomposition of an algebra relative to an idempotent (his terminology). The first definition of an algebra over an arbitrary field seems to have been given by Leonard E. Dickson (1874–1954) [Dic03]: at first he still called the resulting object a system of complex numbers and only later adopted the name linear algebra.

The notion of a simple algebra had been discovered by Cartan, and Theodor Molien (1861–1941) had earlier shown in his terminology that every simple algebra over the complex numbers is a matrix algebra [Mol1893]. But it was Joseph Henry Maclagan Wedderburn (1882–1948) who was the first to find meaning in the structure of simple algebras over an arbitrary field, in many ways leading the way forward. The jewel of his 1908 paper [Wed08] is still foundational in the structure theory of algebras: a simple algebra (finite-dimensional over a field) is isomorphic to a matrix ring over a division ring. Wedderburn also proved that a finite division ring is a field, a result that like his structure theorem has inspired much mathematics. For more on the legacy of Wedderburn, see Artin [Art50].

In the early 1900s, Dickson was the first to consider quaternion algebras over a general field [Dic12, (8), p. 65]. He began by considering more generally those algebras in which every element satisfies a quadratic equation [Dic12], exhibited a diagonalized basis for such an algebra, and considered when such an algebra can be a division algebra. This led him to what he later called a generalized quaternion algebra [Dic14, Dic23], with multiplication laws

$$\begin{aligned} \begin{gathered} i^2=a, \quad j^2=b, \quad k^2=-ab, \\ ij=k=-ji, \quad ik=aj=-ki, \quad kj=bi=-jk \end{gathered} \end{aligned}$$
(1.2.3)

with \(a, b\) nonzero elements in the base field. (To keep track of these, it is helpful to write \(i, j, k\) around a circle clockwise.) Today, we no longer employ the adjective “generalized”: over fields other than \(\mathbb R \), there is no reason to privilege the Hamiltonian quaternions. We can reinterpret this vein of Dickson’s work as showing that every 4-dimensional central simple algebra is a quaternion algebra (a statement that holds even over a field \(F\) with \({{\,\mathrm{char}\,}}F = 2\)). See Fenster [Fen98] for a summary of Dickson’s work in algebra, and Lewis [Lew2006b] for a broad survey of the role of involutions and anti-automorphisms in the classification of algebras.
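The laws (1.2.3) determine the product of any two elements by linearity. As an illustration, the sketch below expands that product in coordinates for given \(a, b\); the function names and the choice \(a=2\), \(b=3\) over the rationals are ours. It also checks that the quadratic form \(x_0^2 - a x_1^2 - b x_2^2 + ab x_3^2\) is multiplicative in one example, and that taking \(a=1\) produces zero divisors, so that algebra is not a division algebra.

```python
def make_qmul(a, b):
    """Multiplication on 4-tuples x0 + x1*i + x2*j + x3*k with i^2 = a, j^2 = b, k = ij."""
    def qmul(x, y):
        x0, x1, x2, x3 = x
        y0, y1, y2, y3 = y
        return (
            x0 * y0 + a * x1 * y1 + b * x2 * y2 - a * b * x3 * y3,
            x0 * y1 + x1 * y0 - b * x2 * y3 + b * x3 * y2,
            x0 * y2 + x2 * y0 + a * x1 * y3 - a * x3 * y1,
            x0 * y3 + x3 * y0 + x1 * y2 - x2 * y1,
        )
    return qmul


def make_norm(a, b):
    """The quadratic form x0^2 - a*x1^2 - b*x2^2 + a*b*x3^2."""
    def norm(x):
        x0, x1, x2, x3 = x
        return x0 * x0 - a * x1 * x1 - b * x2 * x2 + a * b * x3 * x3
    return norm


a, b = 2, 3                      # illustrative choice of nonzero scalars
qmul, norm = make_qmul(a, b), make_norm(a, b)
i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# The laws (1.2.3).
assert qmul(i, i) == (a, 0, 0, 0) and qmul(j, j) == (b, 0, 0, 0)
assert qmul(k, k) == (-a * b, 0, 0, 0)
assert qmul(i, j) == k and qmul(j, i) == (0, 0, 0, -1)
assert qmul(i, k) == (0, 0, a, 0) and qmul(k, i) == (0, 0, -a, 0)
assert qmul(k, j) == (0, b, 0, 0) and qmul(j, k) == (0, -b, 0, 0)

# The norm form is multiplicative.
x, y = (1, 2, 3, 4), (5, -6, 7, 8)
assert norm(qmul(x, y)) == norm(x) * norm(y)

# With a = 1 the elements 1 + i and 1 - i are nonzero but multiply to zero.
split_mul = make_qmul(1, 3)
assert split_mul((1, 1, 0, 0), (1, -1, 0, 0)) == (0, 0, 0, 0)
```

Setting \(a=b=-1\) recovers Hamilton's product, which gives a quick consistency check on the signs.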

1.3 Quadratic forms and arithmetic

Hamilton’s quaternions also fused a link between quadratic forms and arithmetic, phrased in the language of noncommutative algebra. Indeed, part of Dickson’s interest in quaternion algebras stemmed from earlier work of Hurwitz [Hur1898], alluded to above. Hurwitz had asked for generalizations of the composition laws arising from sum of squares laws like that of Euler (1.1.6) for four squares and Cayley for eight squares: for which n does there exist an identity

$$\begin{aligned} (a_1^2+\dots +a_n^2)(b_1^2+\dots +b_n^2)=c_1^2+\dots +c_n^2 \end{aligned}$$

with each \(c_i\) bilinear in the variables a and b? He then proved [Hur1898] that over a field where 2 is invertible, these identities exist only for \(n=1,2,4,8\) variables (so in particular, there is no formula expressing the product of two sums of 16 squares as the sum of 16 squares). As Dickson [Dic19] further explained, this result of Hurwitz is intimately tied to the theory of algebras. For more on compositions of quadratic forms and their history, including theorems of Hurwitz–Radon and Pfister, see Shapiro [Sha90].

Thinking along similar lines, Hurwitz gave a new proof of the four-square theorem of Lagrange, that every positive integer is the sum of four integer squares: he first wrote about this in 1896 on quaternionic number theory (“Über die Zahlentheorie der Quaternionen”) [Hur1896], then published a short book on the subject in 1919 [Hur19]. To this end, Hurwitz considered Hamilton’s equations over the rational numbers and said that a quaternion \(t+xi+yj+zk\) with \(t,x,y,z \in \mathbb Q \) was an integer if \(t,x,y,z\) all belonged to \(\mathbb Z \) or all to \(\frac{1}{2}+\mathbb Z \), conditions for the quaternion to satisfy a quadratic polynomial with integer coefficients. Hurwitz showed that his ring of integer quaternions, today called the Hurwitz order, admits a generalization of the Euclidean algorithm and thereby a factorization theory. He then applied this to count the number of ways of representing an integer as the sum of four squares, a result due to Jacobi. The notion of integral quaternions was also explored in the 1920s by Venkov [Ven22, Ven29] and the 1930s by Albert [Alb34]. Dickson considered further questions of representing positive integers by integral quaternary quadratic forms [Dic19, Dic23, Dic24] in the same vein.
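Hurwitz's integrality condition is simple to experiment with. In the sketch below (function names ours, exact rational arithmetic via the standard `fractions` module), we test membership in the Hurwitz order and confirm on the element \(\omega = (1+i+j+k)/2\) that it satisfies the integer quadratic \(\omega ^2 - \omega + 1 = 0\), with trace \(2t\) and norm \(t^2+x^2+y^2+z^2\) both integers.

```python
from fractions import Fraction


def qmul(a, b):
    """Hamilton's product on 4-tuples (t, x, y, z)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def is_hurwitz(q):
    """True if all coordinates lie in Z, or all lie in 1/2 + Z."""
    coords = [Fraction(c) for c in q]
    all_integers = all(c.denominator == 1 for c in coords)
    all_half_integers = all(c.denominator == 2 for c in coords)
    return all_integers or all_half_integers


def trace(q):
    return 2 * q[0]


def norm(q):
    return sum(c * c for c in q)


half = Fraction(1, 2)
omega = (half, half, half, half)

assert is_hurwitz(omega)
assert trace(omega) == 1 and norm(omega) == 1
# omega^2 = omega - 1, that is, omega^2 - omega + 1 = 0.
assert qmul(omega, omega) == (omega[0] - 1, omega[1], omega[2], omega[3])

# A product of two elements of the order stays in the order.
product = qmul(omega, (1, 2, 3, 4))
assert product == (-4, 2, 1, 3) and is_hurwitz(product)

# Mixed integer and half-integer coordinates are excluded: (1 + i)/2 has norm 1/2.
assert not is_hurwitz((half, half, 0, 0))
assert norm((half, half, 0, 0)) == half
```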

So by the end of the 1920s, quaternion algebras were used to study quadratic forms in a kind of noncommutative algebraic number theory [Lat26, Gri28]. It was known that a (generalized) quaternion algebra (1.2.3) was semisimple in the sense of Wedderburn, and thus it was either a division algebra or a full matrix algebra over the ground field. Indeed, a quaternion algebra is a matrix algebra if and only if a certain ternary quadratic form has a nontrivial zero, and over the rational numbers this problem was already studied by Legendre. Helmut Hasse (1898–1979) reformulated Legendre’s conditions: a quadratic form has a nontrivial zero over the rationals if and only if it has a nontrivial zero over the real numbers and Hensel’s field of p-adic numbers for all odd primes p. This result paved the way for many further advances, and it is now known as the Hasse principle or the local-global principle for quadratic forms. For an overview of this history, see Scharlau [Scha2009, §1].

Further deep results in number theory were soon to follow. Dickson [Dic14] had defined cyclic algebras, reflecting many properties of quaternion algebras, and in 1929 lectures Emmy Noether (1882–1935) considered the even more general crossed product algebras. Not very long after, in a volume dedicated to Hensel’s seventieth birthday, Richard Brauer (1901–1977), Hasse, and Noether proved a fundamental theorem for the structure theory of algebras over number fields [BHN31]: every central division algebra over a number field is a cyclic algebra. This crucial statement had profound implications for class field theory, the classification of abelian extensions of a number field, with a central role played by the Brauer group of a number field, a group encoding its division algebras. For a detailed history and discussion of these lines, see Fenster–Schwermer [FS2007], Roquette [Roq2006], and the history of class field theory summarized by Hasse himself [Hass67].

At the same time, Abraham Adrian Albert (1905–1972), a doctoral student of Dickson, was working on the structure of division algebras and algebras with involution, and he had written a full book on the subject [Alb39] collecting his work in the area, published in 1939. Albert had examined the tensor product of two quaternion algebras, called a biquaternion algebra (not to be confused with Hamilton’s biquaternions), and he characterized when such an algebra was a division algebra in terms of a senary (six variable) quadratic form. Albert’s classification of algebras with involution was motivated by understanding possible endomorphism algebras of abelian varieties, viewed as multiplier rings of Riemann matrices and equipped with the Rosati involution: a consequence of this classification is that quaternion algebras are the only noncommutative endomorphism algebras of simple abelian varieties. He also proved that a central simple algebra admits an involution if and only if the algebra is isomorphic to its opposite algebra (equivalently, it has order at most 2 in the Brauer group). For a biography of Albert and a survey of his work, see Jacobson [Jacn74]. Roquette argues convincingly [Roq2006, §8] that because of Albert’s contributions to its proof (for example, his work with Hasse [AH32]), we should refer to the Albert–Brauer–Hasse–Noether theorem in the previous paragraph.

1.4 Modular forms and geometry

Quaternion algebras also played a formative role in what began as a subfield of complex analysis and ordinary differential equations and then branched into the theory of modular forms—and ultimately became a central area of modern number theory.

Returning to a thread from the previous section, the subject of representing numbers as the sum of four squares saw considerable interest in the 17th and 18th centuries [Dic71, Chapter VIII]. Carl Jacobi (1804–1851) approached the subject from the analytic point of view of theta functions, the basic building blocks for elliptic functions; these were first studied in connection with the problem of the arc length of an ellipse, going back to Abel. Jacobi studied the series

$$\begin{aligned} \theta (\tau ) :=\sum _{n=-\infty }^{\infty } \exp (2\pi i n^2 \tau ) = 1 + 2q + 2q^4 + 2q^9 + \ldots \end{aligned}$$
(1.4.1)

where \(\tau \) is a complex number with positive imaginary part and \(q=\exp (2\pi i\tau )\). Jacobi proved the remarkable identity

$$\begin{aligned} \theta (\tau )^4 = \sum _{a,b,c,d \in \mathbb Z } q^{a^2+b^2+c^2+d^2} = 1+8\sum _{n=1}^{\infty } \sigma ^*(n) q^n, \end{aligned}$$
(1.4.2)

where \(\sigma ^*(n) :=\sum _{4 \not \mid d \mid n} d\) is the sum of divisors of n not divisible by 4. In this way, Jacobi gave an explicit formula for the number of ways of expressing a number as the sum of four squares. For a bit of history and an elementary derivation in the style of Gauss and Jacobi, see Ewell [Ewe82].
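Jacobi's formula can be confirmed by brute force for small \(n\). The following sketch (function names ours) counts ordered integer solutions of \(a^2+b^2+c^2+d^2=n\), with signs and order distinguished as in the coefficient of \(q^n\) in \(\theta (\tau )^4\), and compares the count with \(8\sigma ^*(n)\).

```python
from math import isqrt


def r4(n):
    """Number of (a, b, c, d) in Z^4 with a^2 + b^2 + c^2 + d^2 = n."""
    count = 0
    bound = isqrt(n)
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            for c in range(-bound, bound + 1):
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    continue
                d = isqrt(rest)
                if d * d == rest:
                    # d = 0 contributes one solution, otherwise d and -d.
                    count += 1 if d == 0 else 2
    return count


def sigma_star(n):
    """Sum of the positive divisors of n that are not divisible by 4."""
    return sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


for n in range(1, 31):
    assert r4(n) == 8 * sigma_star(n)

print([r4(n) for n in range(1, 9)])  # [8, 24, 32, 24, 48, 96, 64, 24]
```

In particular \(r_4(n) \ge 8\) for every \(n \ge 1\), which recovers Lagrange's theorem.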

As a Fourier series, the Jacobi theta function \(\theta \) (1.4.1) visibly satisfies \(\theta (\tau +1)=\theta (\tau )\). Moreover, owing to its symmetric description, Jacobi showed using Poisson summation that \(\theta \) also satisfies the transformation formula

$$\begin{aligned} \theta (-1/(4\tau ))= \sqrt{2\tau /i}\, \theta (\tau ). \end{aligned}$$
(1.4.3)

Felix Klein (1849–1925) saw geometry in formulas like (1.4.3). In his Erlangen Program (1872), he recast 19th century geometry in terms of the underlying group of symmetries, unifying Euclidean and non-Euclidean formulations. Turning then to hyperbolic geometry, he studied the modular group \({{\,\mathrm{SL}\,}}_2(\mathbb Z )\) acting by linear fractional transformations on the upper half-plane, and interpreted transformation formulas for elliptic functions: in particular, Klein defined his absolute invariant \(J(\tau )\) [Kle1878], a function invariant under the modular group. This work, carried out together with his student Robert Fricke (1861–1930), led to four volumes [FK1890–2, FK1897, FK12] on elliptic modular functions and automorphic functions, combining brilliant advances in group theory, number theory, geometry, and invariant theory (Figure 1.4.4).

Figure 1.4.4: The (2, 3, 7)-tiling by Fricke and Klein [FK1890-2] (public domain)

At the same time, Henri Poincaré (1854–1912) brought in the theory of linear differential equations—and a different, group-theoretic approach. In correspondence with Fuchs in 1880 on hypergeometric differential equations, he writes about the beginnings of his discovery of a new class of analytic functions [Gray2000, p.177]:

They present the greatest analogy with elliptic functions, and can be represented as the quotient of two infinite series in infinitely many ways. Amongst those series are those which are entire series playing the role of Theta functions. These converge in a certain circle and do not exist outside it, as thus does the Fuchsian function itself. Besides these functions there are others which play the same role as the zeta functions in the theory of elliptic functions, and by means of which I solve linear differential equations of arbitrary orders with rational coefficients whenever there are only two finite singular points and the roots of the three determinantal equations are commensurable.

As he reminisced later in his Science et Méthode [Poi1908, p. 53]:

I then undertook to study some arithmetical questions without any great result appearing and without expecting that this could have the least connection with my previous researches. Disgusted with my lack of success, I went to spend some days at the sea-side and thought of quite different things. One day, walking along the cliff, the idea came to me, always with the same characteristics of brevity, suddenness, and immediate certainty, that the arithmetical transformations of ternary indefinite quadratic forms were identical with those of non-Euclidean geometry.

In other words, like Klein, Poincaré launched a program to study complex analytic functions defined on the unit disc that are invariant with respect to a discrete group of matrix transformations that preserve a rational indefinite ternary quadratic form. Today, such groups are called arithmetic Fuchsian groups, and we study them as unit groups of quaternion algebras. To read more on the history of differential equations in the time of Riemann and Poincaré, see the history by Gray [Gray2000], as well as Gray’s scientific biography of Poincaré [Gray2013].

In the context of these profound analytic discoveries, Erich Hecke (1887–1947) began his study of modular forms. He studied the Dedekind zeta function, a generalization of Riemann’s zeta function to number fields, and established its functional equation using theta functions. In the study of similarly defined analytic functions arising from modular forms, he was led to define the “averaging” operators acting on spaces of modular forms that now bear his name. In this way, he could interpret the Fourier coefficients a(n) of a Hecke eigenform (normalized, weight 2) as eigenvalues of his operators: he proved that they satisfy a relation of the form

$$\begin{aligned} a(m)a(n) = \sum _{d \mid \gcd (m,n)} a(mn/d^2) d \end{aligned}$$
(1.4.5)

and consequently a two-term recursion relation. He thereby showed that the Dirichlet L-series of an eigenform, defined via Mellin transform, has an Euler product, analytic continuation, and functional equation.
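As a small numerical illustration of (1.4.5), the following Python sketch checks the relation and the resulting two-term recursion at prime powers for the divisor sum \(a(n) = \sigma _1(n)\). This choice of sequence is an assumption made for illustration: it is a standard example satisfying the same formal relation and is not one of the eigenforms discussed above. The helper names `sigma1`, `hecke_rhs`, `satisfies_hecke_relation`, and `satisfies_two_term_recursion` are hypothetical.

```python
from math import gcd


def sigma1(n: int) -> int:
    """Sum of the positive divisors of n (illustrative choice of a(n))."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def hecke_rhs(a, m: int, n: int) -> int:
    """Right-hand side of (1.4.5): sum over d | gcd(m, n) of a(mn/d^2) * d."""
    g = gcd(m, n)
    # d divides both m and n, so d^2 divides mn and the division is exact.
    return sum(a(m * n // (d * d)) * d for d in range(1, g + 1) if g % d == 0)


def satisfies_hecke_relation(a, bound: int) -> bool:
    """Check a(m) a(n) == hecke_rhs(a, m, n) for all 1 <= m, n <= bound."""
    return all(
        a(m) * a(n) == hecke_rhs(a, m, n)
        for m in range(1, bound + 1)
        for n in range(1, bound + 1)
    )


def satisfies_two_term_recursion(a, p: int, max_exp: int) -> bool:
    """Check a(p^(r+1)) == a(p) a(p^r) - p a(p^(r-1)) for 1 <= r < max_exp.

    This is the special case m = p, n = p^r of (1.4.5), rearranged.
    """
    return all(
        a(p ** (r + 1)) == a(p) * a(p ** r) - p * a(p ** (r - 1))
        for r in range(1, max_exp)
    )


if __name__ == "__main__":
    print(satisfies_hecke_relation(sigma1, 30))
    print(satisfies_two_term_recursion(sigma1, 3, 6))
```

Taking \(m = p\) and \(n = p^r\) in (1.4.5) leaves only the divisors \(d = 1\) and \(d = p\), which is where the two-term recursion comes from. A finite check of this kind only tests the relation on a range of inputs and proves nothing.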

Hecke went further, and connected the analytic theory of modular forms and his operators to the arithmetic theory of quadratic forms. In 1935–1936, he found that for certain systems of quaternary quadratic forms, the number of representations of integers by the system satisfied the recursion (1.4.5), in analogy with binary quadratic forms. He published a conjecture on this subject in 1940 [Hec40, Satz 53, p. 100]: that the weighted representation numbers satisfy the Hecke recursion, connecting coefficients to operators on theta series, and further that the columns in a composition table always result in linearly independent theta series. He verified the conjecture up to prime level \(q <37\), but was not able to prove this recursion using his methods of complex analysis (see his letter [Bra41, Footnote 1]).

The arithmetic part of these conjectures was investigated by Heinrich Brandt (1886–1954) in the quaternionic context, and so the strands of our narrative are woven more tightly still. Preceding Hecke’s work, and inspired by Gauss composition of binary quadratic forms as the product of classes of ideals in a quadratic field, Brandt had earlier considered a generalization to quaternary quadratic forms and the product of classes of ideals in a quaternion algebra [Bra28]: the product he obtained was only partially defined, and so he coined the term groupoid for such a structure [Bra40]. He then considered the combinatorial problem of counting the ways of factoring an ideal into prime ideals, according to their classes. He recorded these counts in a matrix T(n) for each positive integer n, and, strikingly, he proved (sketched in 1941 [Bra41], dated 1939, and proved completely in 1943 [Bra43]) that the matrices T(n) satisfy Hecke’s recursion (1.4.5). To read more on the life and work of Brandt, see Hoehnke–Knus [HK2004]. Today we call the matrices T(n) Brandt matrices, and for certain purposes, they are still the most convenient way to get ahold of spaces of modular forms.

Martin Eichler (1912–1992) wrote his thesis [Eic36] under the supervision of Brandt on quaternion orders over the integers, in particular studying the orders that now bear his name. Later he continued the grand synthesis of modular forms, quadratic forms, and quaternion algebras, viewing in generality the orthogonal group of a quadratic form as acting via automorphic transformations [Eic53]. In this vein, he formulated his basis problem (arising from the conjecture of Hecke) which sought to understand explicitly the span of quaternionic theta series among classical modular forms, giving a correspondence between systems of Hecke eigenvalues appearing in the quaternionic and classical context. He answered the basis problem in the affirmative for the case of prime level in 1955 [Eic56a] and then for squarefree level [Eic56b, Eic58, Eic73]. For more on Eichler’s basis problem and its history, see Hijikata–Pizer–Shemanske [HPS89a].

Having come to recent history, our account now becomes much more abbreviated: we provide further commentary in situ in remarks in the rest of this text, and we conclude with just a few highlights. In the 1950s and 1960s, there was substantial work done in understanding zeta functions of certain varieties arising from quaternion algebras over totally real number fields. For example, Eichler’s correspondence was generalized to totally real fields by Shimizu [Shz65]. Shimura embarked on a deep and systematic study of arithmetic groups obtained from indefinite quaternion algebras over totally real fields, including both the arithmetic Fuchsian groups of Poincaré, Fricke, and Klein, and the generalization of the modular group to totally real fields studied by Hilbert. In addition to understanding their zeta functions, he also formulated a general theory of complex multiplication in terms of automorphic functions; as a consequence, he found that the corresponding arithmetic quotients can be defined as algebraic varieties with equations defined over a number field, and so today we refer to quaternionic Shimura varieties. For an overview of Shimura’s work, see his lectures at the International Congress of Mathematicians in 1978 [Shi80]. As it turns out, quaternion algebras over number fields also give rise to arithmetic manifolds that are not algebraic varieties, and they are quite important in the areas of spectral theory, low-dimensional geometry, and topology, in particular in Thurston’s geometrization program for hyperbolic 3-manifolds and in classifying knots and links.

Just as the Hecke operators determine the coefficients of classical modular forms and Dirichlet L-series, they may be vastly generalized, replacing modular groups by other algebraic groups, such as the group of units in a central simple algebra or the orthogonal group of a quadratic form. Understanding the theory of automorphic forms in this context is a program that continues today: formulated in the language of automorphic representations, and seen as a nonabelian generalization of class field theory, Langlands initiated this program in a letter to Weil in January 1967. It is indeed fitting that an early success of the Langlands program [Gel84, B+2003] would be on the subject of quaternion algebras: a generalization of the Eichler–Shimizu correspondence to encompass arbitrary quaternion algebras over number fields was achieved in foundational work by Jacquet–Langlands [JL70] in 1970. For more on the modern arithmetic history of modular forms, see Edixhoven–Van der Geer–Moonen [EGM2008]; Alsina–Bayer [AB2004, Appendices B–C] also give references for further applications of quaternion algebras in arithmetic geometry (in particular, of Shimura curves).

1.5 Conclusion

We have seen how quaternion algebras have threaded mathematical history through to the present day, weaving together advances in algebra, quadratic forms, number theory, geometry, and modular forms. And although our history ends here, the story does not!

Quaternion algebras continue to arise in unexpected ways. In the arithmetic setting, quaternion orders arise as endomorphism rings of supersingular elliptic curves and have been used in proposed post-quantum cryptosystems and digital signature schemes (see for example the overview by Galbraith–Vercauteren [GV2018]). In the field of quantum computation, Parzanchevski–Sarnak [PS2018] have proposed Super-Golden-Gates built from certain special quaternion algebras and their arithmetic groups that would give efficient 1-qubit quantum gates. In coding theory, lattices in quaternion algebras (and more generally central simple algebras over number fields) yield space-time codes that achieve high spectral efficiency on wireless channels with two transmit antennas, currently part of certain IEEE standards [BO2013].

Quaternions have also seen a revival in computer graphics, modeling, and animation [HFK94, Sho85]. Indeed, a rotation in \(\mathbb R ^3\) about an axis through the origin can be represented by a \(3 \times 3\) orthogonal matrix with determinant 1, conveniently encoded in Euler angles. However, the matrix representation is redundant, as there are only three degrees of freedom in such a rotation. Moreover, to compose two rotations requires the product of the two corresponding matrices, which requires 27 multiplications and 18 additions in \(\mathbb R \). Quaternions, on the other hand, represent this rotation with a 4-tuple, and multiplication of two quaternions takes only 16 multiplications and 12 additions in \(\mathbb R \) (if done naively). In computer games, quaternion interpolation provides a way to smoothly interpolate between orientations in space—something crucial for fighting Nazi zombies. Quaternions are also vital for attitude control of aircraft and spacecraft [Hans2006]: they avoid the ambiguity that can arise when two rotation axes align, leading to a potentially disastrous loss of control called gimbal lock.
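To make the operation count concrete, here is a minimal Python sketch of the naive quaternion product (16 real multiplications and 12 real additions or subtractions), together with the standard action of a unit quaternion \(q\) on \(\mathbb R ^3\) by \(v \mapsto q v q^{-1}\). The function names `qmul`, `conj`, `rotation_quaternion`, and `rotate` are hypothetical, and a quaternion \(t + xi + yj + zk\) is assumed to be stored as the tuple `(t, x, y, z)`.

```python
import math


def qmul(p, q):
    """Naive Hamilton product: 16 multiplications, 12 additions/subtractions."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # coefficient of i
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # coefficient of j
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # coefficient of k
    )


def conj(q):
    """Quaternion conjugate; equals the inverse when q has norm 1."""
    t, x, y, z = q
    return (t, -x, -y, -z)


def rotation_quaternion(axis, angle):
    """Unit quaternion for rotation by `angle` (radians) about a nonzero axis."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    s = math.sin(angle / 2) / norm
    return (math.cos(angle / 2), ax * s, ay * s, az * s)


def rotate(q, v):
    """Rotate v in R^3 by the unit quaternion q via v -> q v q^{-1}."""
    _, x, y, z = qmul(qmul(q, (0.0, *v)), conj(q))
    return (x, y, z)


if __name__ == "__main__":
    quarter_turn = rotation_quaternion((0.0, 0.0, 1.0), math.pi / 2)
    print(rotate(quarter_turn, (1.0, 0.0, 0.0)))
```

Composing two rotations then amounts to a single call `qmul(q2, q1)`, which represents "first `q1`, then `q2`". In floating point the product of unit quaternions drifts slightly from norm 1, so practical code typically renormalizes from time to time; that step is omitted here.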

In quantum physics, quaternions yield elegant expressions for Lorentz transformations, the basis of the modern theory of relativity [Gir83]. Some physicists are now hoping to find deeper understanding of these principles of quantum physics in terms of quaternions. And so, although much of Hamilton’s quaternionic physics fell out of favor long ago, we have come full circle in our elongated historical arc. The enduring role of quaternion algebras as a catalyst for a vast range of mathematical research promises rewards for many years to come.

1.6 Exercises

  1.

    Hamilton sought a multiplication \(*:\mathbb R ^3 \times \mathbb R ^3 \rightarrow \mathbb R ^3\) that preserves length:

    $$\begin{aligned} \Vert v \Vert ^2 \cdot \Vert w \Vert ^2 = \Vert v * w \Vert ^2 \end{aligned}$$

    for \(v,w \in \mathbb R ^3\). Expanding out in terms of coordinates, such a multiplication would imply that the product of two sums of three squares over \(\mathbb R \) is again a sum of three squares in \(\mathbb R \). (Such a law holds for sums of four squares (1.1.6).) Show that such a formula for three squares is impossible as an identity in the polynomial ring in 6 variables over \(\mathbb Z \). [Hint: Find a natural number that is the product of two sums of three squares which is not itself the sum of three squares.]

  2.

    Hamilton originally sought an associative multiplication law on

    $$\begin{aligned} D :=\mathbb R + \mathbb R i + \mathbb R j \simeq \mathbb R ^3 \end{aligned}$$

    where \(i^2=-1\) and every nonzero element of D has a (two-sided) inverse. Show this cannot happen in two (not really different) ways.

    (a)

      If \(ij=a+bi+cj\) with \(a,b,c \in \mathbb R \), multiply on the left by i and derive a contradiction.

    (b)

      Show that D is a (left) \(\mathbb C \)-vector space, so D has even dimension as an \(\mathbb R \)-vector space, a contradiction.

  3.

    Show that there is no way to give \(\mathbb R ^3\) the structure of a ring (with 1) in which multiplication respects scalar multiplication by \(\mathbb R \), i.e.,

    $$ x \cdot (cy)=c(x \cdot y)=(cx)\cdot y \quad \text { for all }c \in \mathbb R ~\text {and }x,y \in \mathbb R ^3 $$

    and every nonzero element has a (two-sided) inverse, as follows.

    (a)

      Suppose \(B :=\mathbb R ^3\) is equipped with a multiplication law that respects scalar multiplication. Show that left multiplication by \(\alpha \in B\) is \(\mathbb R \)-linear and \(\alpha \) satisfies the characteristic polynomial of this linear map, a polynomial of degree 3.

    (b)

      Now suppose that every nonzero \(\alpha \in B\) has an inverse. By consideration of eigenvalues or the minimal polynomial, derive a contradiction. [Hint: show that the characteristic polynomial has a real root, or that every \(\alpha \in B\) satisfies a (minimal) polynomial of degree 1, and derive a contradiction from either statement.]