1 Introduction

Topology is the study of whether objects can be smoothly transformed into each other. Sometimes these ‘objects’ are extremely abstract mathematical ideas, sometimes they’re not: Can I uncoil this garden hose without removing the end from the bucket? Can I untangle these necklaces without undoing the clasps? Can I wrap this map of the world onto a globe? These are all problems of topology. Related topological questions appear across physics, where wave dispersion surfaces [1], knotted fluid flow lines [2], and electromagnetic fields [3] can all be grouped according to whether or not they can be smoothly transformed into one another.

In physics and engineering there has been a recent burst of activity, applying topology to control waves. Specifically, topology has been applied to design materials, stipulating what happens at their interface without having to know anything about what the interface is like. For instance, take two homogeneous lumps of elastic stuff. We can ensure that vibrational waves can be trapped at the interface formed when we stick them together, irrespective of how messy our joinery is!

The basic idea is this: take a material where wave propagation can be specified in terms of a conserved wave vector \(\varvec{k}\). For topology to be at all powerful, we need to be able to treat the components of \(\varvec{k}\) as the coordinates on a closed surface. In many cases this is possible. We can, for example, wrap the first Brillouin zone onto a torus, whenever the material is periodic.

Imagine that attached to each point on this closed surface is a vector \(|\psi \rangle \): the solution to the wave equation for each particular value of \(\varvec{k}\). Now, taking two different materials we can construct two closed surfaces (e.g. two tori), upon each of which there is a different form of the wave, \(|\psi \rangle \) or \(|\psi '\rangle \). The question is whether it is possible to smoothly deform one of these wave fields into the other, a question topologists have already developed the necessary tools to answer, at least in the negative!

If the two waves cannot be smoothly deformed into one another then something non–smooth and perhaps ‘interesting’ must happen when we try, something that will occur for instance at an interface, in the transition region between the two materials. This ‘interesting’ thing turns out to be the presence of one or more interface states, where the wave is trapped in the transition region. Topology therefore guarantees the presence of interface states between two materials, without the physicist ever having to consider whether the interface is flat, rough, curved, sharp, narrow, or wide.

This is an odd business for most physicists and engineers, who are used to caring about details! A graded index fiber optic cable, for instance, must be made with precision, confining light rays with a particular spatial distribution of refractive index to minimize dispersion. Here we have a completely different kind of theory, where the design process doesn’t even mention the details of the region where the mode is to be confined. This peculiar insensitivity to the form of the interface (often called ‘topological protection’) was perhaps first appreciated by Volkov and co–workers, who were concerned with the physics of electrons around electronic contacts [4, 5]. The significance of topology was appreciated later [6, 7], in connection with earlier work on the quantum Hall effect [8, 9].

Since the discovery of negative refraction [10] and transformation optics [11], and the rapid development of metamaterials [12], it has been widely appreciated that one wave is like any other. Although quantum mechanics has its peculiarities, there is nothing fundamentally different between the Dirac equation, Schrödinger equation, the classical Maxwell equations, or the equations of elasticity. The same topological arguments have therefore been applied to the design of periodic electromagnetic materials supporting unidirectional interface states, dubbed ‘photonic topological insulators’ [13], which has led to the fields of topological photonics [14] and acoustics [15].

The fact that we can use metamaterials to realise a wide range of material parameters means there is actually more to explore in these classical systems (the distribution of atoms in an ordinary crystal is not easy to specify on the scale of an electron wavelength!). Work on topological photonic and acoustic materials is therefore able to investigate effects that could not be observed in condensed matter systems (see the discussion in e.g. [16] and [17]), and even allows the exploration of active topological non–Hermitian materials, where the material can amplify and absorb the wave in a controlled way [18] (which is also extremely difficult to mimic in electronic systems).

In the author’s opinion, several things are opaque in this subject. Firstly, while the Chern classes are by now a familiar tool for classifying the topology of a wave field \(|\psi \rangle \), their origin and connection to other characteristic classes, such as the more familiar Euler class, is never explained in terms palatable to the physicist. Secondly, beyond a “plausible” intuitive explanation for its truth, the connection between integrals of the Chern class and the number of interface states is never proved in a straightforward way. The first part of the tutorial clarifies both of these points. Finally there is the problem of a physical interpretation for these topological calculations. In the final part we connect a non–zero Chern number to the existence of peculiar points of vanishing refractive index, where the wave is forced to circulate in only one sense (e.g. only clockwise).

2 Winding Numbers of Paths

We’ll begin with the topological classification of curves in terms of a ‘winding number’. Imagine unwinding a ball of string, one end of which (A) is attached to a wall. With the string we trace out a path, closing it by returning to A and tying the ends. An example is sketched as the blue curve in Fig. 1a.

Fig. 1

Without any obstacles, any closed path can be deformed into a simple loop. By pulling along the indicated red arrows, the path in (a) can be unwound into the simple path shown in (b). In panel (c) an obstacle is placed in the centre of the path. As the path winds twice around the obstacle, one reaches an impasse as shown in (d)

If there are no obstacles, we can always move the string until it unwinds into a single loop. Pulling in the direction indicated in Fig. 1a, the string can be untwisted into the single loop shown in Fig. 1b. The two configurations are therefore topologically equivalent. This situation changes if the space contains an obstacle (Footnote 1). Suppose there is a tree at the position shown by the black dot in Fig. 1c. Whether we can deform one arrangement of string into another is now determined by the winding number, \(\nu \): the number of times the string encircles the tree. As we know from experience, the winding number cannot be changed by any continuous re–positioning of the string; to change it we must cut the string (or cut down the tree). A difference in the winding number between two configurations of string indicates that they are topologically inequivalent, and \(\nu \) can be used to classify configurations of string that cannot be continuously changed into one another.

To calculate the winding number, suppose we position the tree at the origin of the coordinate system, integrating the change in polar angle \({\text{ d }}\theta \) as we follow the string. After dividing by \(2\pi \), the integral counts the number of times the path encircles the obstacle. Each point on the path thus picks out an angle \(\theta \) on a circle. The winding number counts the number of times this point covers the circle, adding \(+1\) for each anticlockwise circuit and \(-1\) for each clockwise one.

It is simplest to write the change in angle \({\text{ d }}\theta \) using the complex number \(z=x+\textrm{i}y=r\textrm{e}^{\textrm{i}\theta }\). The polar angle is then simply \(\theta =\textrm{Im}[\log (z)]\) and the winding number can be written as a line integral of a vector,

$$\begin{aligned} \nu =\frac{1}{2\pi }\oint _\textrm{Path} {\text{ d }}\theta =\frac{1}{2\pi }\oint _\textrm{Path}\varvec{\nabla }\,\textrm{Im}\left[ \log \left( x+\textrm{i}y\right) \right] \cdot \,{\text{ d }}\varvec{x} \end{aligned}$$
(1)

where \({\text{ d }}\varvec{x}={\text{ d }}x\,\varvec{e}_x+{\text{ d }}y\,\varvec{e}_y\). Having written \(\nu \) as a line integral, there is a second equivalent way to write (1). Using Stokes’ theorem the one dimensional line integral can be written as a two dimensional surface integral over the enclosed region \(S_P\),

$$\begin{aligned} \nu =\frac{1}{2\pi }\int _\mathrm {S_P}\varvec{\nabla }\times \varvec{A}\cdot \,{\text{ d }}\varvec{S} \end{aligned}$$
(2)

where \(S_P\) may be quite a strange origami–like surface, like that enclosed by the blue curve in Fig. 1a. This is an important development that we’ll see again. The winding number (our topological invariant) now appears as a net ‘magnetic flux’ through the surface \(\mathrm S_P\), something which is common to all the topological invariants considered here. The ‘vector potential’ associated with this flux is defined as

$$\begin{aligned} \varvec{A}=\varvec{\nabla }\,\textrm{Im}\left[ \log \left( x+\textrm{i}y\right) \right] . \end{aligned}$$
(3)

But having said all this, it now seems as though we made a mistake. The integral (2) is surely always zero, because the curl of any gradient is zero!

We didn’t make a mistake. We have uncovered an extremely important subtlety that appears again and again in topology. The concerning result \(\varvec{\nabla }\times \varvec{\nabla }f=0\) requires f to be a proper function. Every point (x, y) must be associated with a single number f(x, y). This isn’t true for the ‘function’ \(\theta =\textrm{Im}[\log (x+\textrm{i}y)]\), which can take any value at the origin (Footnote 2). This defect in the vector potential (3) is known as a critical point. The ‘flux’ in equation (2) records the presence of such critical points, and is confined to the obstacle, where both \(\theta \) and \(\varvec{A}\) are undefined. The winding number thus only depends on the number of times the surface \(\textrm{S}_P\) cuts through the critical point \(\varvec{x}=\varvec{0}\). The fact that the curl of (3) is zero at all points except where there is an obstacle is actually essential for the winding number to be insensitive to deformations of the path, and hence for \(\nu \) to be a topological invariant.
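As a numerical illustration of Eq. (1), the following sketch sums the small changes in polar angle, each computed as the imaginary part of the logarithm of the ratio of neighbouring points, around a closed path sampled at discrete points. The function name `winding_number` and the example paths are illustrative choices rather than anything taken from the text, and the path is assumed to be sampled finely enough that each step turns by less than \(\pi \) and never passes through the origin.

```python
import numpy as np

def winding_number(z):
    """Winding number about the origin of a closed path given as complex samples z[k]."""
    z = np.asarray(z, dtype=complex)
    # Im[log(z_next / z)] is the change in polar angle between neighbouring samples,
    # taken on the branch (-pi, pi]; np.roll closes the path back onto its start.
    dtheta = np.angle(np.roll(z, -1) / z)
    return int(round(dtheta.sum() / (2.0 * np.pi)))

t = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)

twice = np.exp(2j * t)                          # two anticlockwise circuits of the origin
wobbly = twice * (1.0 + 0.3 * np.cos(5 * t))    # a smooth deformation that avoids the origin
clockwise = np.exp(-1j * t)                     # one clockwise circuit
missed = 3.0 + np.exp(1j * t)                   # a loop that does not enclose the origin

nu_twice = winding_number(twice)
nu_wobbly = winding_number(wobbly)
nu_clockwise = winding_number(clockwise)
nu_missed = winding_number(missed)
print(nu_twice, nu_wobbly, nu_clockwise, nu_missed)
```

The deformed path `wobbly` returns the same integer as `twice`, which is the numerical counterpart of the statement that the winding number can only change if the path is dragged across the obstacle.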

3 The Euler Characteristic: Winding Numbers of Surfaces

Having given a topological categorization of curves, we move up a dimension to classify the ‘winding’ of closed two dimensional surfaces, like those shown in Fig. 2. Just as we classified the path of our string by mapping it to a point on a circle and counting the total number of revolutions, we associate each point on a surface \(\mathrm S\) to an equivalent point on a sphere, and count the number of times the sphere is covered as we move over the surface.

Fig. 2

Two dimensional surfaces can be classified by counting the number of times the surface normal \(\varvec{n}\) covers a sphere, which equals half of the Euler characteristic, \(\chi \). Panels (b) and (d) show arbitrary surface normal vectors \(\varvec{n}\) and \(\varvec{n}'\) on a sphere, picking out points on the two surfaces in (a) and (c) respectively. Every surface normal on the sphere corresponds to only one point on the egg–like surface in (a), meaning \(\chi =2\). Meanwhile a point on the sphere corresponds to two points on the torus: one ‘inside’, and one ‘outside’. Following \(\varvec{n}\) around the torus we see that the outside and inside regions cover the sphere in opposite directions, one undoing the other such that \(\chi =0\)

We make the connection between a point on an arbitrary surface \(\mathrm S\) and a point on a sphere through the surface normal \(\varvec{n}\). For each coordinate \((x_{1},x_{2})\) on our surface we find those coordinates \((\theta ,\phi )\) on the sphere where the surface normal vector takes the same value,

$$\begin{aligned} \varvec{n}(x_1,x_2)=\sin (\theta )\left[ \cos (\phi )\varvec{e}_{x}+\sin (\phi )\varvec{e}_{y}\right] +\cos (\theta )\varvec{e}_{z}. \end{aligned}$$
(4)

as indicated in Fig. 2. In Sec. 2 we mapped the curve onto a circle and calculated the winding number through integrating the angle swept out around the circle. Here we instead integrate up the solid angle \({\text{ d }}\Omega =\sin (\theta )\,{\text{ d }}\theta \,{\text{ d }}\phi \) swept out on the sphere as we move over the area \({\text{ d }}x_1\,{\text{ d }}x_2\) on our arbitrary surface \(\mathrm S\). A useful expression for \({\text{ d }}\Omega \) can be found through transforming the expression for the solid angle from spherical to surface coordinates,

$$\begin{aligned} {\text{ d }}\Omega&=\varvec{n}\cdot \left( \frac{\partial \varvec{n}}{\partial \theta }\times \frac{\partial \varvec{n}}{\partial \phi }\right) \,{\text{ d }}\theta \,{\text{ d }}\phi =\varvec{n}\cdot \left( \frac{\partial \varvec{n}}{\partial \theta }\times \frac{\partial \varvec{n}}{\partial \phi }\right) \left( \frac{\partial \theta }{\partial x_1}\frac{\partial \phi }{\partial x_2}-\frac{\partial \theta }{\partial x_2}\frac{\partial \phi }{\partial x_1}\right) \,{\text{ d }}x_1\,{\text{ d }}x_2\nonumber \\&=\varvec{n}\cdot \left( \frac{\partial \varvec{n}}{\partial x_1}\times \frac{\partial \varvec{n}}{\partial x_2}\right) \,{\text{ d }}x_1\,{\text{ d }}x_2. \end{aligned}$$
(5)

By analogy with our calculation of the winding number of a curve (1), we count the number of times \(\nu \) the surface normal \(\varvec{n}\) wraps around the sphere, simply integrating the solid angle element (5) over the surface \(\mathrm S\), and dividing the result by \(4\pi \)

$$\begin{aligned} \nu =\frac{\chi }{2}=\frac{1}{4\pi }\int _\textrm{S}\,{\text{ d }}\Omega =\frac{1}{4\pi }\int _\textrm{S}\varvec{n}\cdot \left( \frac{\partial \varvec{n}}{\partial x_1}\times \frac{\partial \varvec{n}}{\partial x_2}\right) \,{\text{ d }}x_1\,{\text{ d }}x_2. \end{aligned}$$
(6)

The winding number is half the Euler characteristic, \(\chi =2\nu \), which is the topological invariant typically used to classify closed surfaces. We should remember that this is simply a way of expressing the winding number. Equation (6) is the essence of the topological classification of surfaces: there is no way to smoothly change one surface into another if their normal vectors cover a sphere a different number of times. Two such incompatible surfaces are shown in Fig. 2a and c.
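Equation (6) can be checked numerically. The sketch below evaluates the integrand \(\varvec{n}\cdot (\partial _1\varvec{n}\times \partial _2\varvec{n})\) by central finite differences and integrates it with a midpoint rule, for an egg–like ellipsoid and for a torus. The parameterizations, the semi-axes, the grid size and the function names are all assumptions made for this illustration; in particular the ellipsoid is described by polar and azimuthal angles and the torus by two cyclic angles, with the outward normal used in both cases.

```python
import numpy as np

def unit(v):
    """Normalize vectors stored along the last axis."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def ellipsoid_normal(theta, phi, a=1.0, b=1.5, c=2.5):
    # Outward normal of x^2/a^2 + y^2/b^2 + z^2/c^2 = 1:
    # proportional to the gradient (x/a^2, y/b^2, z/c^2).
    x = a * np.sin(theta) * np.cos(phi)
    y = b * np.sin(theta) * np.sin(phi)
    z = c * np.cos(theta)
    return unit(np.stack([x / a**2, y / b**2, z / c**2], axis=-1))

def torus_normal(u, v):
    # Outward normal of a torus; u runs around the tube, v around the symmetry axis.
    return np.stack([np.cos(u) * np.cos(v), np.cos(u) * np.sin(v), np.sin(u)], axis=-1)

def wrapping_number(normal, range1, range2, n=400, h=1e-5):
    """Midpoint-rule estimate of (1/4 pi) * integral of n . (d1 n x d2 n) dx1 dx2."""
    d1 = (range1[1] - range1[0]) / n
    d2 = (range2[1] - range2[0]) / n
    x1 = range1[0] + d1 * (np.arange(n) + 0.5)
    x2 = range2[0] + d2 * (np.arange(n) + 0.5)
    X1, X2 = np.meshgrid(x1, x2, indexing="ij")
    # Central finite differences for the two partial derivatives of the normal.
    dn1 = (normal(X1 + h, X2) - normal(X1 - h, X2)) / (2 * h)
    dn2 = (normal(X1, X2 + h) - normal(X1, X2 - h)) / (2 * h)
    integrand = np.sum(normal(X1, X2) * np.cross(dn1, dn2), axis=-1)
    return integrand.sum() * d1 * d2 / (4 * np.pi)

nu_egg = wrapping_number(ellipsoid_normal, (0.0, np.pi), (0.0, 2 * np.pi))
nu_torus = wrapping_number(torus_normal, (0.0, 2 * np.pi), (0.0, 2 * np.pi))
print(f"egg:   nu = {nu_egg:.3f}, chi = {2 * nu_egg:.3f}")
print(f"torus: nu = {nu_torus:.3f}, chi = {2 * nu_torus:.3f}")
```

Within the accuracy of the quadrature the ellipsoid gives a value of \(\nu \) close to one and the torus a value close to zero, whatever semi-axes are chosen for the ellipsoid.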

There is a more interesting way to write the Euler characteristic (6), that replaces the change in the surface normal with the local surface curvature. To make this transition we take some point on the surface, and use it as origin of a Cartesian coordinate system, where the \(z=0\) plane is tangent to the surface, as shown in Fig. 3. Close to this point, the surface shape satisfies

$$\begin{aligned} z-h(x,y)=0 \end{aligned}$$
(7)

where h(x, y) is the height of the surface above the tangent plane. By definition the height and its gradient vanish at the origin. The surface normal is now proportional to the gradient of the above equation for the surface height (7),

$$\begin{aligned} \varvec{n}=N\left( \varvec{e}_z-\varvec{\nabla }h(x,y)\right) . \end{aligned}$$
(8)

where the scalar N ensures normalization, \(\varvec{n}\cdot \varvec{n}=1\) and equals unity at \(x=y=0\), where \(\varvec{\nabla }h=0\). Using the x and y coordinates in the formula for the element of solid angle (5), and substituting the above expression for the surface normal vector (8), we can re–write the element of solid angle at \(x=y=z=0\) in terms of the Hessian (the matrix of second derivatives with respect to x and y) of the surface height,

$$\begin{aligned} \textrm{d}\Omega =\varvec{n}\cdot \frac{\partial \varvec{n}}{\partial x}\times \frac{\partial \varvec{n}}{\partial y}\,\textrm{d}x\,\textrm{d}y= & {} \varvec{e}_{z}\cdot \left( \varvec{\nabla }\partial _{x} h\times \varvec{\nabla }\partial _{y} h\right) \,\textrm{d}x\,\textrm{d}y\nonumber \\= & {} \textrm{det}\left( \begin{array}{ll}\frac{\partial ^{2}h}{\partial x^{2}}&{}\frac{\partial ^{2}h}{\partial x\partial y}\\ \frac{\partial ^{2}h}{\partial x\partial y}&{}\frac{\partial ^{2}h}{\partial y^{2}}\end{array}\right) \,\textrm{d}x\,\textrm{d}y. \end{aligned}$$
(9)

The determinant of the Hessian is positive for surfaces that are locally elliptic paraboloids, and negative for hyperbolic paraboloids. It equals the inverse product of the two principal radii of curvature, \(\textrm{det}[\partial ^2 h/\partial x_i\partial x_j]=(R_1 R_2)^{-1}\), which is known as the surface’s Gaussian curvature K (see Fig. 3). This argument can be carried out at every point on the surface. Summing the results we find the Euler characteristic (6) can also be re–written as an integral of the curvature of the surface

$$\begin{aligned} \chi =\frac{1}{2\pi }\int _\textrm{S}K\,{\text{ d }}A \end{aligned}$$
(10)

where \({\text{ d }}A\) is an infinitesimal element of surface area. Equation (10) is the famous Gauss–Bonnet theorem, and is quite a remarkable expression. The surface curvature integrated over any closed surface always equals a multiple of \(2\pi \). We can continuously deform the surface, changing the distribution of surface curvature, but—so long as we don’t tear a new hole in the surface—every region of increased curvature is unavoidably balanced by regions where it is reduced.
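The link between the Hessian in Eq. (9) and the Gaussian curvature can be illustrated with a few height functions whose curvature is known in advance. The sketch below is only a finite-difference check under assumed inputs: the radii `R1`, `R2`, `R`, the step `eps` and the three example surfaces (a bowl, a saddle and the cap of a sphere resting on the tangent plane) are illustrative choices, not taken from the text.

```python
import numpy as np

def hessian_det(h, eps=1e-3):
    """Determinant of the Hessian of h(x, y) at the origin, by central differences."""
    hxx = (h(eps, 0.0) - 2.0 * h(0.0, 0.0) + h(-eps, 0.0)) / eps**2
    hyy = (h(0.0, eps) - 2.0 * h(0.0, 0.0) + h(0.0, -eps)) / eps**2
    hxy = (h(eps, eps) - h(eps, -eps) - h(-eps, eps) + h(-eps, -eps)) / (4.0 * eps**2)
    return hxx * hyy - hxy**2

R1, R2, R = 2.0, 0.5, 3.0

def bowl(x, y):      # elliptic paraboloid with principal radii R1 and R2
    return x**2 / (2 * R1) + y**2 / (2 * R2)

def saddle(x, y):    # hyperbolic paraboloid: the two curvatures have opposite sign
    return x**2 / (2 * R1) - y**2 / (2 * R2)

def cap(x, y):       # sphere of radius R touching the plane z = 0 at the origin
    return R - np.sqrt(R**2 - x**2 - y**2)

K_bowl = hessian_det(bowl)       # expected 1 / (R1 * R2)
K_saddle = hessian_det(saddle)   # expected -1 / (R1 * R2)
K_cap = hessian_det(cap)         # expected 1 / R**2

# Gauss-Bonnet for the sphere: constant curvature times the area 4 pi R^2, over 2 pi.
chi_sphere = K_cap * 4.0 * np.pi * R**2 / (2.0 * np.pi)
print(K_bowl, K_saddle, K_cap, chi_sphere)
```

Because the curvature of a sphere is the same at every point, the single value `K_cap` is enough to evaluate the integral in Eq. (10), which returns a value close to 2 independently of the radius chosen.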

Fig. 3

At any point on a surface the infinitesimal element of solid angle \({\text{ d }}\Omega \) appearing in (6) can be re–written in terms of the Hessian of the surface height function h(x, y). Rotating the in–plane coordinates to \(x',y'\) to align with the principal axes of curvature (white lines), the Hessian is diagonalized with eigenvalues equal to the inverses of the two radii of curvature (the radii of the two circles that approximate the white curves in the diagram)

Fig. 4

Instead of the surface normal \(\varvec{n}\), we can use tangent vector fields \(\varvec{e}_{1}\), \(\varvec{e}_{2}\) to calculate the Euler characteristic. The Euler characteristic \(\chi \) can then be written as the surface integral of a ‘magnetic flux’ as in (15). The topology of the surface is determined by the number of critical points, where either of the tangent vectors is undefined. When \(\chi \ne 0\), as for the sphere in panels (a) and (b), every choice of tangent vector field will exhibit critical points, shown here as the black dots in the zoomed in regions. Only when \(\chi =0\) is it possible to have a tangent vector that is well defined at all points. An example is shown on the surface of the torus in panel (c)

In parallel with our earlier discussion of one dimensional curves, we can write the Gauss–Bonnet theorem in a third equivalent form, as the integral of an effective magnetic flux passing through the closed surface. This is achieved through introducing a pair of orthonormal tangent vectors on the surface, \(\varvec{e}_{1}\) and \(\varvec{e}_{2}\). The surface normal is everywhere given by the cross product between these tangent vectors

$$\begin{aligned} \varvec{n}=\varvec{e}_{1}\times \varvec{e}_{2}. \end{aligned}$$
(11)

Substituting expression (11) for the surface normal into our expression for the solid angle element (5), we see that it equals the curl of a vector

$$\begin{aligned} {\text{ d }}\Omega =\varvec{n}\cdot \frac{\partial \varvec{n}}{\partial x_1}\times \frac{\partial \varvec{n}}{\partial x_2}{\text{ d }}x_1{\text{ d }}x_2&=\left[ \left( \varvec{n}\cdot \frac{\partial \varvec{e}_{2}}{\partial x_2}\right) \left( \varvec{n}\cdot \frac{\partial \varvec{e}_{1}}{\partial x_1}\right) -\left( \varvec{n}\cdot \frac{\partial \varvec{e}_{1}}{\partial x_2}\right) \left( \varvec{n}\cdot \frac{\partial \varvec{e}_{2}}{\partial x_1}\right) \right] {\text{ d }}x_1{\text{ d }}x_2\nonumber \\&=\left[ \frac{\partial \varvec{e}_{1}}{\partial x_1}\cdot \frac{\partial \varvec{e}_{2}}{\partial x_2}-\frac{\partial \varvec{e}_{1}}{\partial x_2}\cdot \frac{\partial \varvec{e}_{2}}{\partial x_1}\right] {\text{ d }}x_1{\text{ d }}x_2\nonumber \\&=\left( \frac{\partial A_{2}}{\partial x_1}-\frac{\partial A_{1}}{\partial x_2}\right) {\text{ d }}x_1{\text{ d }}x_2. \end{aligned}$$
(12)

where the components of this two component surface ‘vector potential’ \(A_i\) are defined as

$$\begin{aligned} A_{j}=\varvec{e}_{1}\cdot \frac{\partial \varvec{e}_{2}}{\partial x_j}. \end{aligned}$$
(13)

Note that, despite appearances, there is no bias between the two tangent vectors, and the vector potential can be written equivalently as \(A_{i}=-\varvec{e}_{2}\cdot \partial _i\,\varvec{e}_{1}\), due to the orthogonality condition \(\varvec{e}_{1}\cdot \varvec{e}_{2}=0\). Although we have called \(A_{j}\) a ‘vector potential’ due to the analogous quantity in physics, more precisely the quantity \(A_{j}\) is a connection on the surface, a quantity from differential geometry that characterises how the basis vectors change from point to point (see Appendix A for details).

Our three different expressions for \({\text{ d }}\Omega \) show that the Gaussian curvature both expresses a change in the surface normal \(\varvec{n}\), and an effective ‘magnetic field’ (the curl of the connection) due to the change in the surface tangent vectors \(\varvec{e}_{i}\)

$$\begin{aligned} K=\varvec{n}\cdot (\varvec{\nabla }\times \varvec{A})=\varvec{n}\cdot \left( \frac{\partial \varvec{n}}{\partial x_1}\times \frac{\partial \varvec{n}}{\partial x_2}\right) \end{aligned}$$
(14)

Using the above expression for the solid angle in terms of the vector potential (12) we can also write the Gauss–Bonnet theorem as the integral of a ‘magnetic flux’ passing through the surface

$$\begin{aligned} \chi =\frac{1}{2\pi }\int _{S}\varvec{\nabla }\times \varvec{A}\cdot {\text{ d }}\varvec{S}. \end{aligned}$$
(15)

where the vector surface area element is given by \({\text{ d }}\varvec{S}={\text{ d }}x_1\,{\text{ d }}x_2\,\varvec{n}\). Note that the Euler characteristic (15) now takes an identical form to the winding number of a one dimensional curve (2). Just as we saw there, the integral (15) doesn’t seem right. Stokes’ theorem tells us that the integral of a curl over a surface equals a line integral around the surface boundary. But these closed surfaces have no boundary! So surely the integral is zero.

But, after Sec. 2 we are prepared for this puzzle. Stokes’ theorem can only be applied if the vector potential \(\varvec{A}\) is well defined over the whole surface. We have an integral of something that looks like a curl over a closed surface, but it isn’t the curl of anything at some discrete points (critical points) on the surface. These critical points are familiar for the polar and azimuthal unit vectors on a sphere, which are both undefined at the poles: it is not always possible to have tangent vectors \(\varvec{e}_{i}\) that are normalized, orthogonal, and everywhere well defined (Footnote 3). In general, tangent vectors on a closed surface exhibit critical points where they do not have a well defined direction, as illustrated in Fig. 4. If we exclude a small region surrounding each of these critical points from the surface integral in (15), so that \(\varvec{A}\) is well defined everywhere on what remains, we can transform (15) to a sum of line integrals encircling the critical points and take the limit as the area of each of these excluded regions tends to zero. The contribution of these infinitesimal excluded regions is not zero. Each of these points will contribute a multiple of \(2\pi \),

$$\begin{aligned} \oint A_{j}\,{\text{ d }}x_{j}=\oint \varvec{e}_{1}\cdot \frac{\partial \varvec{e}_{2}}{\partial x_j}\,{\text{ d }}x_{j}=\oint \varvec{e}_{1}\cdot {\text{ d }}\varvec{e}_{2}=\oint {\text{ d }}\theta =2\pi \textrm{n}. \end{aligned}$$
(16)

The last three steps follow from the infinitesimal change in the tangent vector, \(\varvec{e}_{1}\cdot {\text{ d }}\varvec{e}_{2}={\text{ d }}\theta \), where \({\text{ d }}\theta \) is the angle by which the vector \(\varvec{e}_{2}\) rotates due to the infinitesimal displacement \({\text{ d }}x_{j}\). The integer n is known as the index of the critical point. Examples of critical points of different index are shown in Fig. 5.
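The index of a critical point can be estimated numerically by following the direction of a planar vector field around a small circle and counting its net number of turns, in the spirit of Eq. (16). The sketch below assumes the critical point is isolated and that the field does not vanish on the circle; the function name and the four example fields are illustrative choices rather than the fields drawn in the figures.

```python
import numpy as np

def critical_point_index(field, centre=(0.0, 0.0), radius=1e-2, n=720):
    """Net number of anticlockwise turns made by field(x, y) around a small circle."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    x = centre[0] + radius * np.cos(t)
    y = centre[1] + radius * np.sin(t)
    vx, vy = field(x, y)
    angle = np.arctan2(vy, vx)
    # Angle increments between neighbouring samples, closing the loop, each
    # wrapped into [-pi, pi) so that the branch jumps of arctan2 drop out.
    dtheta = np.diff(np.append(angle, angle[0]))
    dtheta = (dtheta + np.pi) % (2.0 * np.pi) - np.pi
    return int(round(dtheta.sum() / (2.0 * np.pi)))

def source(x, y):     # vectors pointing radially outwards
    return x, y

def saddle(x, y):     # direction rotates against the sense of travel
    return x, -y

def double(x, y):     # real and imaginary parts of (x + i y)^2
    return x**2 - y**2, 2.0 * x * y

def uniform(x, y):    # no critical point at all
    return np.ones_like(x), np.zeros_like(y)

index_source = critical_point_index(source)
index_saddle = critical_point_index(saddle)
index_double = critical_point_index(double)
index_uniform = critical_point_index(uniform)
print(index_source, index_saddle, index_double, index_uniform)
```

Only the direction of the field enters, so the fields need not be normalized, and shrinking `radius` leaves the integers unchanged as long as no other critical point lies inside the circle.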

Fig. 5

A critical point of a normalized vector field is a point where its direction is undefined. Critical points are classified in terms of their index, which equals the number of times the vector rotates as we move around the critical point, being positive for an anti–clockwise rotation. In (a–c) we show three critical points (black dots), each of a different index

From Eqns. (15) and (16) we conclude that the Euler characteristic of a surface records the critical points of any tangent vector field on the surface. The sum of the indices \(\textrm{n}_{\varvec{x}_{c}}\) of all the critical points \(\varvec{x}_{c}\) of any surface tangent vector field equals the Euler characteristic

$$\begin{aligned} \chi =\sum _{\varvec{x}_{c}}\textrm{n}_{\varvec{x}_{c}}. \end{aligned}$$
(17)

This result is known as the Poincaré–Hopf theorem. As the Euler characteristic is a topological invariant we can thus conclude that however we deform the tangent vector field on a closed surface, the critical points cannot all be eliminated, unless the surface has the topology of a torus, \(\chi =0\).

3.1 Example: The Euler Characteristics of the Torus and the Sphere

Points on the surfaces of both a torus and a sphere can be parameterized in terms of two cyclic coordinates, \(x_1\) and \(x_2\),

$$\begin{aligned} \varvec{r}(x_1,x_2)=a\cos (x_1)\,\varvec{e}_{z}+\left( R+a\sin (x_1)\right) \left( \cos (x_2)\varvec{e}_{x}+\sin (x_2)\varvec{e}_{y}\right) , \end{aligned}$$
(18)

where R is the distance from the origin to the centre of the torus ‘tube’ of radius a. The surface makes a topological transition when the distance of the centre of the tube from the origin equals its radius \(R=a\), at which point the innermost circle of points on the torus becomes a single point at the origin, and the topology changes to that of a sphere. For \(R<a\) the \(x_1\) coordinate has the reduced range \(x_1\in [-\arcsin (R/a),\pi +\arcsin (R/a)]\), which becomes \(x_1\in [0,\pi ]\) when \(R=0\), where (18) describes the surface of a sphere.

Tangent vectors on the surface can be found through differentiating (18) with respect to the two coordinates, which after normalization gives the orthogonal pair of vectors

$$\begin{aligned} \varvec{e}_{1}&=-\sin (x_2)\varvec{e}_{x}+\cos (x_2)\varvec{e}_{y}\nonumber \\ \varvec{e}_{2}&=-\sin (x_1)\varvec{e}_{z}+\cos (x_1)\left( \cos (x_2)\varvec{e}_{x}+\sin (x_2)\varvec{e}_{y}\right) . \end{aligned}$$
(19)

When the surface has the topology of a torus, the tangent vectors (19) are uniquely defined at all points, and therefore the Euler characteristic equals zero

$$\begin{aligned} \chi =\frac{1}{2\pi }\sum _{n}\oint _{C_n}A_{j}\,{\text{ d }}x_{j}=0\qquad \text {(torus)} \end{aligned}$$
(20)

where the sum runs over the critical points (of which there are none in the case of a torus), each encircled by \(C_n\).

When the surface becomes a sphere (\(R=0\)), the two points \(x_1=0\) and \(x_1=\pi \) are critical points of the tangent vectors (19), being isolated points where the tangent vectors take many possible values, depending on how we approach the point. Using expressions (19) we can calculate the ‘vector potential’ from our earlier formula (13),

$$\begin{aligned} A_{1}&=\varvec{e}_{1}\cdot \frac{\partial \varvec{e}_{2}}{\partial x_1}=0\nonumber \\ A_{2}&=\varvec{e}_{1}\cdot \frac{\partial \varvec{e}_{2}}{\partial x_2}=\cos (x_1). \end{aligned}$$
(21)

The Euler characteristic then equals the sum of the line integrals of \(A_{j}\) around the critical points

$$\begin{aligned} \chi =\frac{1}{2\pi }\left[ \int _0^{2\pi }A_{2}(x_1=0)\,{\text{ d }}x_{2}+\int _{2\pi }^{0}A_{2}(x_1=\pi )\,{\text{ d }}x_{2}\right] =2\qquad \text {(sphere)} \end{aligned}$$
(22)

where the line integral around the south pole is taken in the opposite direction, due to the reversal of the surface normal. We have shown the Euler characteristic of a sphere equals 2, as expected from the observation that \(\chi \) is twice the winding number of the surface normal around a sphere.
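Equations (21) and (22) can be reproduced numerically from the tangent vectors (19) alone. The sketch below differentiates \(\varvec{e}_{2}\) by central finite differences to obtain \(A_j\), and then integrates \(A_2\) around two small circles of constant \(x_1\), one near each pole. The offset `delta` from the poles, the step `h`, the test point and the function names are assumptions made for this illustration.

```python
import numpy as np

def e1(x1, x2):
    # First tangent vector of Eq. (19); it does not depend on x1.
    return np.array([-np.sin(x2), np.cos(x2), 0.0])

def e2(x1, x2):
    # Second tangent vector of Eq. (19).
    return np.array([np.cos(x1) * np.cos(x2), np.cos(x1) * np.sin(x2), -np.sin(x1)])

def connection(x1, x2, h=1e-6):
    """(A_1, A_2) with A_j = e1 . d(e2)/dx_j as in Eq. (13), by central differences."""
    d1 = (e2(x1 + h, x2) - e2(x1 - h, x2)) / (2 * h)
    d2 = (e2(x1, x2 + h) - e2(x1, x2 - h)) / (2 * h)
    return e1(x1, x2) @ d1, e1(x1, x2) @ d2

def loop_integral(x1, n=360):
    """Integral of A_2 dx_2 for x_2 running from 0 to 2 pi at fixed x_1."""
    dx2 = 2.0 * np.pi / n
    x2 = dx2 * (np.arange(n) + 0.5)
    return sum(connection(x1, v)[1] for v in x2) * dx2

A1, A2 = connection(0.7, 1.9)           # compare with Eq. (21): 0 and cos(x1)

delta = 1e-3                            # small offset of each circle from its pole
north = loop_integral(delta)
south = -loop_integral(np.pi - delta)   # traversed from 2 pi to 0, as in Eq. (22)
chi = (north + south) / (2.0 * np.pi)
print(A1, A2, chi)
```

As `delta` is reduced the result approaches 2. Because the tangent vectors (19) do not depend on R, the same connection applies on the torus, where there are no critical points to encircle.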

4 The Berry Connection and the Euler Characteristic

So far we’ve illustrated something of the basics of topology, but have not made much connection with physics, which is supposed to be why we’re here! We can begin to see the connection by re–writing the formula for the Euler characteristic (15) in terms of a single complex tangent vector field (a so–called complex line bundle)

$$\begin{aligned} |\psi \rangle =\frac{1}{\sqrt{2}}\left( \varvec{e}_{1}+\textrm{i}\varvec{e}_{2}\right) , \end{aligned}$$
(23)

where the state is normalized such that \(\langle \psi |\psi \rangle =1\) and we have adopted the bra–ket notation for vectors and inner products. In terms of this complex vector, the ‘vector potential’ (13) appearing in the Gauss–Bonnet theorem takes a simpler form

$$\begin{aligned} A_{j}=\varvec{e}_1\cdot \frac{\partial \varvec{e}_2}{\partial x_j}=-\textrm{i}\frac{1}{2}(\varvec{e}_{1}-\textrm{i}\varvec{e}_{2})\cdot \frac{\partial }{\partial x_j}(\varvec{e}_{1}+\textrm{i}{\varvec{e}_{2}})=-\textrm{i}\left\langle \psi |\frac{\partial }{\partial x_j}|\psi \right\rangle \end{aligned}$$
(24)

where we used the normalization conditions \(\varvec{e}_{i}\cdot \varvec{e}_{i}=1\), which imply \(\varvec{e}_{i}\cdot \partial _{j}\varvec{e}_{i}=0\), together with the orthogonality condition \(\varvec{e}_{1}\cdot \varvec{e}_{2}=0\), which implies \(\varvec{e}_{2}\cdot \partial _{j}\varvec{e}_{1}=-\varvec{e}_{1}\cdot \partial _{j}\varvec{e}_{2}\). The final expression on the right of (24) can be recognised at once as the Berry connection [19]. An analogous quantity appears in quantum theory, where it tells us how to transport a quantum mechanical state vector \(|\psi \rangle \) around a space of parameters \(x_i\). For our tangent vector \(|\psi \rangle \), the curl of the corresponding Berry connection is, via (14), simply the Gaussian curvature of the surface.
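The identity (24) can be spot-checked numerically. The sketch below builds the complex vector (23) from the sphere and torus tangent vectors (19), evaluates \(-\textrm{i}\langle \psi |\partial _j|\psi \rangle \) by central finite differences, and compares the result with the values \(A_1=0\) and \(A_2=\cos (x_1)\) that follow from (13). The test point, the step `h` and the function names are assumptions made for this illustration.

```python
import numpy as np

def psi(x1, x2):
    """Complex tangent vector (e1 + i e2) / sqrt(2) built from Eq. (19)."""
    e1 = np.array([-np.sin(x2), np.cos(x2), 0.0])
    e2 = np.array([np.cos(x1) * np.cos(x2), np.cos(x1) * np.sin(x2), -np.sin(x1)])
    return (e1 + 1j * e2) / np.sqrt(2.0)

def berry_connection(x1, x2, h=1e-6):
    """A_j = -i <psi| d/dx_j |psi>, with the derivative taken by central differences."""
    d1 = (psi(x1 + h, x2) - psi(x1 - h, x2)) / (2 * h)
    d2 = (psi(x1, x2 + h) - psi(x1, x2 - h)) / (2 * h)
    bra = psi(x1, x2)
    # np.vdot conjugates its first argument, which gives the inner product <psi|...>.
    return -1j * np.vdot(bra, d1), -1j * np.vdot(bra, d2)

x1, x2 = 0.7, 1.9
A1, A2 = berry_connection(x1, x2)
print(A1, A2, np.cos(x1))
```

Both components come out real to within rounding error, as they must, since \(\langle \psi |\partial _j|\psi \rangle \) is purely imaginary for a normalized vector.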

Using the complex vector \(|\psi \rangle \), we can—again, analogous to quantum theory—understand it as an eigenvector of a Hermitian operator \(\hat{L}=\textrm{i}\,\varvec{n}\,\times \),

$$\begin{aligned} \hat{L}=\textrm{i}\,\varvec{n}\,\times \rightarrow \hat{L}|\psi \rangle =|\psi \rangle . \end{aligned}$$
(25)

To understand the origins of this operator note that \(\varvec{n}\times (\varvec{e}_{1}+\textrm{i}\varvec{e}_{2})=-\textrm{i}(\varvec{e}_{1}+\textrm{i}\varvec{e}_{2})\). Having introduced the operator \(\hat{L}\) we can calculate the Gauss–Bonnet theorem in yet another way! Not only does the Euler characteristic record critical points of the complex vector \(|\psi \rangle \) through the Berry connection (24), but these critical points arise from the properties of the operator \(\hat{L}\), of which \(|\psi \rangle \) is an eigenfunction.
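The eigenvalue relation (25) is easy to confirm with explicit matrices. The sketch below represents \(\varvec{n}\,\times \) as a 3×3 antisymmetric matrix, multiplies it by \(\textrm{i}\), and checks that the result is Hermitian with eigenvalues \(-1\), 0 and \(+1\). The particular unit normal, the helper vector used to construct a tangent pair with \(\varvec{e}_{1}\times \varvec{e}_{2}=\varvec{n}\), and the function name `cross_matrix` are arbitrary choices made for this illustration.

```python
import numpy as np

def cross_matrix(n):
    """Matrix N such that N @ v equals np.cross(n, v)."""
    return np.array([[0.0, -n[2], n[1]],
                     [n[2], 0.0, -n[0]],
                     [-n[1], n[0], 0.0]])

n = np.array([1.0, 2.0, 2.0]) / 3.0     # an arbitrary unit normal
helper = np.array([1.0, 0.0, 0.0])      # any vector not parallel to n

e1 = np.cross(n, helper)
e1 /= np.linalg.norm(e1)
e2 = np.cross(n, e1)                    # with this choice e1 x e2 = n

L = 1j * cross_matrix(n)                # the operator i n x as a matrix
psi = (e1 + 1j * e2) / np.sqrt(2.0)
phi = (e1 - 1j * e2) / np.sqrt(2.0)

is_hermitian = np.allclose(L, L.conj().T)
eigenvalues = np.linalg.eigvalsh(L)     # returned in ascending order
print(is_hermitian, eigenvalues)
print(np.allclose(L @ psi, psi), np.allclose(L @ phi, -phi), np.allclose(L @ n, 0.0))
```

The three eigenvectors found this way are the complex tangent vector, its conjugate partner and the normal itself, so together they span the whole three dimensional space at each point of the surface.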

To see this we calculate the curl of the Berry connection (24)—which can be used in the formula for the Euler characteristic (15)—and use the eigenvalue relation (25) to replace derivatives of the vector with those of the operator \(\hat{L}\),

$$\begin{aligned} \frac{\partial A_{2}}{\partial x_1}-\frac{\partial A_{1}}{\partial x_2}&=-\textrm{i}\left[ \bigg \langle \frac{\partial \psi }{\partial x_1}\bigg |\chi \bigg \rangle \bigg \langle \chi \bigg |\frac{\partial \psi }{\partial x_2}\bigg \rangle -\bigg \langle \frac{\partial \psi }{\partial x_2}\bigg |\chi \bigg \rangle \bigg \langle \chi \bigg |\frac{\partial \psi }{\partial x_1}\bigg \rangle \right] \nonumber \\&=-\textrm{i}\left[ \bigg \langle \psi \bigg |\frac{\partial \hat{L}}{\partial x_1}\bigg |\chi \bigg \rangle \bigg \langle \chi \bigg |\frac{\partial \hat{L}}{\partial x_2}\bigg |\psi \bigg \rangle -\bigg \langle \psi \bigg |\frac{\partial \hat{L}}{\partial x_2}\bigg |\chi \bigg \rangle \bigg \langle \chi \bigg |\frac{\partial \hat{L}}{\partial x_1}\bigg |\psi \bigg \rangle \right] \nonumber \\&=\varvec{n}\cdot \left( \frac{\partial \varvec{n}}{\partial x_1}\times \frac{\partial \varvec{n}}{\partial x_2}\right) =K \end{aligned}$$
(26)

where the final line demonstrates the consistency of our different approaches to the Euler characteristic. To derive (26) we used the completeness relation \(|\psi \rangle \langle \psi |+|\phi \rangle \langle \phi |+|\chi \rangle \langle \chi |=1\), where the remaining eigenvectors of \(\hat{L}\) are \(|\phi \rangle =(\varvec{e}_1-\textrm{i}\varvec{e}_{2})/\sqrt{2}\) (eigenvalue \(-1\)) and \(|\chi \rangle =\varvec{n}\) (eigenvalue 0). We also applied the result \(\partial _i\hat{L}|\chi \rangle +\hat{L}\partial _{i}|\chi \rangle =0\).
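The eigenvalue relation (25) and the completeness relation used to derive (26) can be checked numerically. The following is a minimal sketch, assuming an arbitrary unit normal \(\varvec{n}\) and a right-handed tangent frame built from it; the helper `cross_matrix` and the particular choice of \(\varvec{n}\) are illustrative assumptions, not part of the derivation above.

```python
import numpy as np

def cross_matrix(n):
    """Matrix N such that N @ v equals np.cross(n, v)."""
    return np.array([[0.0, -n[2], n[1]],
                     [n[2], 0.0, -n[0]],
                     [-n[1], n[0], 0.0]])

# Arbitrary unit normal (assumed for illustration) and a right-handed frame (e1, e2, n)
n = np.array([1.0, 2.0, 2.0]) / 3.0
e1 = np.cross(n, [0.0, 0.0, 1.0])
e1 /= np.linalg.norm(e1)
e2 = np.cross(n, e1)

L = 1j * cross_matrix(n)                 # the operator L = i n x
psi = (e1 + 1j * e2) / np.sqrt(2)        # expected eigenvalue +1
phi = (e1 - 1j * e2) / np.sqrt(2)        # expected eigenvalue -1
chi = n.astype(complex)                  # expected eigenvalue 0

is_hermitian = np.allclose(L, L.conj().T)
residuals = [np.linalg.norm(L @ v - lam * v)
             for v, lam in ((psi, 1), (phi, -1), (chi, 0))]
# |psi><psi| + |phi><phi| + |chi><chi| should be the identity
completeness = sum(np.outer(v, v.conj()) for v in (psi, phi, chi))
```

The three residuals measure how well each vector satisfies its eigenvalue equation, and `completeness` should reproduce the \(3\times 3\) identity matrix.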

Equation (26) shows that the curl of the Berry connection is related to the ‘winding’ of the operator \(\hat{L}\). In general this is difficult to picture, but here it is simply another way of telling us the element of solid angle swept out by the surface normal \(\varvec{n}\) on the sphere. We have thus found yet another method for calculating the Euler characteristic! Not only is it the integral of the curl of the Berry connection (the Berry curvature) associated with the complex tangent vector field \(|\psi \rangle \), divided by \(2\pi \), it can also be written in terms of matrix elements of the operator \(\hat{L}\) (25) defined over the surface.

In physics, the integral of the Berry connection \(\oint A_{i}\,{\text{ d }}x_{i}\) is the phase shift a quantum mechanical wave function obtains after being adiabatically moved through a parameter space with coordinates \(x_i\) [19]. Our surface tangent ‘state vector’ \(|\psi \rangle \) undergoes the same phase shift as we encircle a critical point. We can see this by examining how the vector changes as we move a small distance on the surface,

$$\begin{aligned} {\langle \psi |\left( |\psi \rangle +{\text{ d }}|\psi \rangle \right) =1+\frac{1}{2}\left( \varvec{e}_{1}-\textrm{i}\varvec{e}_{2}\right) \cdot \left( -\varvec{e}_{2}+\textrm{i}\varvec{e}_{1}\right) {\text{ d }}\theta =\left( 1+\textrm{i}{\text{ d }}\theta \right) =\textrm{e}^{\textrm{i}{\text{ d }}\theta }}. \end{aligned}$$
(27)

where \({\text{ d }}\theta \) is the change in the angle of the basis vectors \(\varvec{e}_{1,2}\) as they rotate around the surface normal \(\varvec{n}\). Using our earlier equation for the line integral of the vector potential around a critical point (16), we see from the above that the phase accumulated around a critical point is \(\oint A_{j}\,{\text{ d }}x_{j}=2\pi n\), where \(n\) is the critical point index.

4.1 Example: Electromagnetic Polarization

Free space electromagnetic waves are transverse, with the Fourier amplitudes of the fields obeying \(\varvec{k}\cdot \tilde{\varvec{E}}=\varvec{k}\cdot \tilde{\varvec{H}}=0\). For a monochromatic field of frequency \(\omega \), the length of the wavevector is also fixed by the free space dispersion relation \(\varvec{k}^2=k_0^2=\omega ^2/c^2\), which defines the surface of a sphere of radius \(k_0\). Monochromatic radiation is therefore defined by a set of Fourier amplitudes that both live on, and are tangent to, a sphere in \(\varvec{k}\)–space.

The electric field in free space can thus be written as an integral over the surface of this sphere,

$$\begin{aligned} \varvec{E}(\varvec{x})=\int _{0}^{\pi }\sin (\theta _k)\,{\text{ d }}\theta _k\,\int _{0}^{2\pi }{\text{ d }}\phi _{k}\,\varvec{e}(\theta _k,\phi _k)\,\mathcal {E}(\theta _k,\phi _k)\,{\text{ e }}^{{\text{ i }}k_0\varvec{n}(\theta _k,\phi _k)\cdot \varvec{x}} \end{aligned}$$
(28)

where \(\varvec{e}\) represents the direction of the electric field, and \(\mathcal {E}\) the scalar amplitude of the wave propagating in direction \(\varvec{n}=\varvec{k}/k_0\). The polarization vector is chosen to satisfy \(\varvec{e}\cdot \varvec{e}^{\star }=1\) and \(\varvec{n}\cdot \varvec{e}=0\).

Fig. 6

The nodes in the radiation pattern from an electric dipole are a consequence of topology. (a) Snapshot of the oscillating electric field \(\varvec{E}\) from a dipole moment \(\varvec{p}\). Enclosing the dipole in a sphere (the blue curve, for example) there will always be points where the electric field is normal to the sphere, making its direction tangent to the sphere undefined (a critical point). (b) Far away from the dipole, the observation direction picks out a Fourier component of the field (28) and the electric field is everywhere tangent to the sphere, with critical points along the axis of the dipole

If \(\varvec{e}\) is real valued, the polarization is linear for all directions of propagation. In this case we can use it as the tangent vector \(\varvec{e}_{1}\) in (13), with \(\varvec{e}_{2}=\varvec{n}\times \varvec{e}_{1}\). The integral of the curl of the ‘vector potential’ defined in (13), \(\varvec{\nabla }\times \varvec{A}\), over the entire sphere in \(\varvec{k}\) space, divided by \(2\pi \), will therefore equal 2, the Euler characteristic of the sphere. This means that linearly polarized fields must have at least one critical point in the electric field as a function of direction (a single critical point would have index 2), although in general there will be many. The same argument can also be applied to circularly polarized fields, where at every point the direction of the electric field is of the form \(\varvec{e}=(\varvec{e}_{1}\pm \textrm{i}\varvec{e}_{2})/\sqrt{2}\). As described below (27), there must be a phase vortex around each of these critical points, which can be understood as the Berry phase.

As a concrete example, take radiation from a dipole with dipole moment \(\varvec{p}\), where the far field electric field takes the form

$$\begin{aligned} \varvec{E}(\varvec{x})\sim \frac{k_0^2}{4\pi \epsilon _0 r}\left( \varvec{p}-\varvec{e}_{r}(\varvec{e}_{r}\cdot \varvec{p})\right) \textrm{e}^{\textrm{i}(k_0 r-\omega t)}. \end{aligned}$$
(29)

In the far field (\(r\rightarrow \infty \)), the direction of observation selects one direction of propagation in the expansion (28) and we can therefore see that the vector \(\varvec{e}\) on the sphere is

$$\begin{aligned} \varvec{e}=\frac{\varvec{p}-\varvec{e}_{r}(\varvec{e}_{r}\cdot \varvec{p})}{\sqrt{\varvec{p}^2-(\varvec{e}_{r}\cdot \varvec{p})^2}}. \end{aligned}$$
(30)

This vector has two critical points of index \(+1\), each when the radial unit vector points along the axis of the dipole: \(\varvec{e}_{r}(\varvec{e}_{r}\cdot \varvec{p})=\varvec{p}\). We can thus see that the nodes in the radiation pattern from a simple dipole—something we are familiar with from a first course in electromagnetism and shown in Fig. 6—can be understood to be a necessary consequence of the topology of a sphere. Were the dispersion surface to have a different topology (in a hyperbolic material, for example [20]), these properties would change. This line of argument has far reaching consequences for the polarization of scattered light, governing the polarization of sunlight [21] and multipole scattering in metamaterials [22].
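The dipole polarization vector (30) and its two critical points can be explored numerically. The sketch below assumes a real dipole moment along the z axis; the function names `direction`, `polarization` and `winding_on_circle`, and the sampling parameters, are hypothetical choices made only for this illustration. The index of each critical point is estimated by counting how many times \(\varvec{e}\), projected onto the xy plane, turns as we walk once around a small circle enclosing the pole.

```python
import numpy as np

def direction(theta, phi):
    """Radial unit vector e_r for spherical angles (theta, phi)."""
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def polarization(e_r, p):
    """Unit polarization vector of eq. (30) for a real dipole moment p."""
    transverse = p - e_r * np.dot(e_r, p)
    return transverse / np.linalg.norm(transverse)

def winding_on_circle(theta0, p, samples=400):
    """Turns made by e (projected on the xy plane) around the circle theta = theta0."""
    phis = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    e = np.array([polarization(direction(theta0, ph), p) for ph in phis])
    angle = np.arctan2(e[:, 1], e[:, 0])
    # angle increments between neighbouring samples, wrapped to (-pi, pi]
    steps = np.angle(np.exp(1j * (np.roll(angle, -1) - angle)))
    return int(round(steps.sum() / (2.0 * np.pi)))

p = np.array([0.0, 0.0, 1.0])            # dipole along z (illustrative choice)

# Away from the dipole axis, e is a unit vector tangent to the sphere
e_r = direction(0.7, 1.3)
e = polarization(e_r, p)
tangency = abs(np.dot(e, e_r))
norm = np.linalg.norm(e)

# Index of the critical points on the dipole axis (north and south poles)
index_north = winding_on_circle(0.05, p)
index_south = winding_on_circle(np.pi - 0.05, p)
```

Under these assumptions the two indices add up to the Euler characteristic of the sphere, in line with the discussion above.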

5 Characteristic Classes and Physics

We have now developed several methods, all for calculating the same thing: the Euler characteristic (winding number) of a two dimensional surface. This limitation in part occurs because we always considered tangent vectors \(\varvec{e}_{1,2}\); the so–called tangent bundle. As the critical points of the tangent vectors are a direct reflection of the number of times the surface normal wraps around a sphere, we were stuck with the Euler characteristic. But there is nothing stopping us from adapting the same formulae to calculate different topological invariants both for higher dimensional surfaces and for vectors on the surface that are not related to the tangent vectors in any way.

To understand the generalization to higher dimensions we need to introduce some terminology. Although we didn’t name them as such, we have so–far been looking at integrals of characteristic classes. In (15) we integrated the tangent vectors’ Euler class, \(\varvec{\nabla }\times \varvec{A}/2\pi \) where \(A_{i}=\varvec{e}_{1}\cdot \partial _{i}\varvec{e}_2\), which yields a topological invariant from the real vectors \(\varvec{e}_{1,2}\) on the surface. Meanwhile, when we did the same thing for the single complex vector field \(|\psi \rangle =(\varvec{e}_{1}+\textrm{i}\varvec{e}_{2})/\sqrt{2}\) (23) (a so–called line bundle) we integrated the tangent vectors’ first Chern class,

$$\begin{aligned} 1^{\textrm{st}}~ \mathrm {Chern~ class}: \;c_1=\frac{1}{2\pi }\left( \frac{\partial A_2}{\partial x_1}-\frac{\partial A_1}{\partial x_2}\right) ,\qquad \textrm{where }~A_{j}=-\textrm{i}\langle \psi |\partial _{j}|\psi \rangle \end{aligned}$$
(31)

which is the Berry curvature divided by \(2\pi \), and yields a topological invariant from complex vectors on the surface. These are two examples of characteristic classes. For the special case of tangent vectors on a two dimensional surface, integrals of the Euler and first Chern classes give the same result: the Euler characteristic.

The Chern classes are just one type of characteristic class. Each Chern class is a ‘closed but not exact’ expression depending on the Berry connection. This is a generalization of what we’ve already seen in two dimensions, namely \(\varvec{\nabla }\times \varvec{A}\) ‘looks like a curl’ (it is closed), but fails to be the curl of any properly defined vector at the critical points on the surface (it is not exact). These critical points are a direct reflection of the topology of both the vector field, and the shape of the surface. The non–zero winding number in both our 1D (1) and 2D (15) examples is equivalent to summing the indices of these critical points. Each of the Chern classes beyond the first is a ‘closed but not exact’ expression that does exactly the same thing in higher dimensions, each being integrated over ever higher dimensional regions: the first Chern class being associated with two dimensional integrals, the second with four dimensions, the third with six dimensions, and so on.

To illustrate the point, let’s look at the second Chern class. We consider a four dimensional surface, to which we attach a pair of complex vector fields, \(|1\rangle \) and \(|2\rangle \) (as opposed to the single vector \(|\psi \rangle \) used in two dimensions). As explained in Appendix A, when dealing with a set of N complex vector fields, each component of the Berry connection \(A_i\) becomes an \(N\times N\) matrix. The Berry curvature is also replaced by the two index ‘curvature form’ \(\Omega _{ij}\), each component of which is—in this particular case—a \(2\times 2\) matrix

$$\begin{aligned} \Omega _{ij}=\frac{1}{2}\left( \frac{\partial A_{j}}{\partial x_i}-\frac{\partial A_{i}}{\partial x_j}+\textrm{i}[A_i, A_j]\right) . \end{aligned}$$
(32)

The commutator is defined as \([A_i,A_j]=A_i\,A_j-A_j\,A_i\), and represents the difference between \(\Omega _{ij}\) and the Berry curvature encountered in the previous section.

Expressed in terms of the curvature (32), the second Chern class (which is to be integrated over a four dimensional surface) is simply required to be a ‘closed but not exact’ scalar expression that can be integrated over the surface (see Footnote 4). As \(\Omega _{ij}\) has only two of the requisite four spatial indices we must therefore consider the square of the Berry curvature,

$$\begin{aligned} 2^{\textrm{nd}}~ \mathrm {Chern~class:}\;c_2=\frac{1}{8\pi ^2}\textrm{tr}[\Omega ^2]=\frac{1}{8\pi ^2}\epsilon _{ijkl}\textrm{tr}[\Omega _{ij}\Omega _{kl}] \end{aligned}$$
(33)

The ‘square’ of the curvature given above is not defined as an ordinary matrix product, but as an antisymmetrized product: the ‘wedge product’ in the terminology of differential forms. This antisymmetrized product is given in the final step of (33), where \(\epsilon _{ijkl}\) is the completely anti–symmetric unit tensor (repeated indices are summed over), and ‘\(\textrm{tr}\)’ is a trace over the matrix left after the sum over the spatial indices ijkl. The pre–factor of \(1/8\pi ^2\) is analogous to the factor of \(1/2\pi \) in (31), ensuring the result of integrating (33) over the surface is an integer.

Substituting the curvature (32) into the definition of the second Chern class (33) we see that \(c_2\) can be written as a divergence

$$\begin{aligned} c_2&=\frac{1}{32\pi ^2}\epsilon _{ijkl}\,\textrm{tr}\bigg [\frac{\partial A_{j}}{\partial x_i}-\frac{\partial A_{i}}{\partial x_j}+\textrm{i}[A_i, A_j]\bigg ]\bigg [\frac{\partial A_{l}}{\partial x_k}-\frac{\partial A_{k}}{\partial x_l}+\textrm{i}[A_k, A_l]\bigg ]\nonumber \\&=\frac{1}{8\pi ^2}\epsilon _{ijkl}\,\frac{\partial }{\partial x_i}\textrm{tr}\left[ A_{j}\frac{\partial A_{l}}{\partial x_k}+\frac{2\textrm{i}}{3}A_{j}A_k A_l\right] . \end{aligned}$$
(34)

As we hoped, we have something that is ‘closed but not exact’. As with the expressions we have met for the winding number of a curve (2), and the Euler characteristic (15), the integral of \(c_2\) over any closed four dimensional surface appears to be zero. Equation (34) takes the form of a divergence, and the divergence theorem tells us its integral equals an integral over a boundary, which vanishes for any closed surface! Yet again this is not the case: the integral of \(c_2\) can be non–zero due to the critical points of the so–called Chern–Simons form [23]

$$\begin{aligned} S_i=\epsilon _{ijkl}\,\textrm{tr}\left[ A_j\partial _k A_l+\frac{2\textrm{i}}{3}A_j A_k A_l\right] \end{aligned}$$
(35)

where we have introduced the compact notation \(\partial _{i}\equiv \partial /\partial x_{i}\), which we shall use wherever convenient from here on. The Chern–Simons form exhibits critical points where the basis vectors \(|1\rangle \) and \(|2\rangle \) become undefined. We should note that the Chern–Simons form appears in several places in physics, including in the next section, and is, for example, an important object in topological field theory [24].

The second Chern class is another example of a quantity that is ‘closed but not exact’. The integral of the second Chern class depends on the integer number of critical points of the Chern–Simons form and is thus a number that cannot be continuously changed. The pattern evident in the first two Chern classes, (31) and (33), continues into higher dimensions, with e.g. the \(3^\textrm{rd}\) Chern class given in terms of the cube of \(\Omega _{ij}\), which can be written as the divergence of a Chern–Simons form containing higher powers of the matrices \(A_i\) and their derivatives. Note that there are no Chern classes associated with odd dimensional surfaces, simply because they are all zero!

In recent years, Chern classes above the first have been used to design waveguides [25] and acoustic lattices [26], where the dimensions are typically ‘synthetic’, being system parameters such as the resonant frequency.

5.1 Example: the First Chern Class Applied to Distinguish Eigenmodes

As an illustration of the application of characteristic classes in physics, we can calculate the first Chern class for an eigenmode \(|\psi \rangle \) of a system parameterized by coordinates \(x_1\) and \(x_2\) that cover a closed surface. Suppose we have two such eigenmodes, \(|\psi \rangle \) and \(|\phi \rangle \), of a linear operator \(\hat{L}\),

$$\begin{aligned} \hat{L}(x_1,x_2)\,|\psi \rangle&=\lambda \,|\psi \rangle \nonumber \\ \hat{L}(x_1,x_2)\,|\phi \rangle&=\lambda '\,|\phi \rangle . \end{aligned}$$
(36)

By analogy with the discussion in Sec. 4 we can define separate Berry connections for each of these eigenmodes,

$$\begin{aligned} A_{\psi ,i}=-\textrm{i}\langle \psi |\frac{\partial }{\partial x_i}|\psi \rangle ,\qquad A_{\phi ,i}=-\textrm{i}\langle \phi |\frac{\partial }{\partial x_i}|\phi \rangle . \end{aligned}$$
(37)

and ask the question of whether the state \(|\psi \rangle \) can be continuously deformed into \(|\phi \rangle \). If the integral of the first Chern class, known as the first Chern number \(\textrm{Ch}_1\),

$$\begin{aligned} {\textrm{Ch}_{1,\psi }}=\frac{1}{2\pi \textrm{i}}\int \varvec{\nabla }\times \langle \psi |\varvec{\nabla }|\psi \rangle \,{\text{ d }}x_1{\text{ d }}x_2=\frac{1}{2\pi }\int \varvec{\nabla }\times \varvec{A}_{\psi }\,{\text{ d }}x_1{\text{ d }}x_2 \end{aligned}$$
(38)

is a different integer for modes \(|\phi \rangle \) and \(|\psi \rangle \) then the answer to this question is no.

We can gain some useful insights if, as in Sec. 4, we relate the first Chern class to derivatives of the operator \(\hat{L}\), rather than the state \(|\psi \rangle \). To do this we first differentiate the eigenvalue equation (36) to find the overlap between an arbitrary eigenmode \(|n\rangle \) and the derivative of \(|\psi \rangle \), in terms of the derivative of \(\hat{L}\),

$$\begin{aligned} \bigg \langle n\bigg |\frac{\partial \psi }{\partial x_{i}}\bigg \rangle =\frac{\langle n|\frac{\partial \hat{L}}{\partial x_{i}}|\psi \rangle }{\lambda -\lambda _n} \end{aligned}$$
(39)

where we assume that \(|n\rangle \) and \(|\psi \rangle \) are non–degenerate eigenstates. Taking the curl of the first of the two Berry connections (37) and applying the above identity (39) then leads to the following expression for the Berry curvature,

$$\begin{aligned} \frac{\partial A_{2}}{\partial x_1}-\frac{\partial A_{1}}{\partial x_2}&=-\textrm{i}\sum _{|n\rangle \ne |\psi \rangle }\left[ \bigg \langle \frac{\partial \psi }{\partial x_{1}}\bigg |n\bigg \rangle \bigg \langle n\bigg |\frac{\partial \psi }{\partial x_2}\bigg \rangle -\bigg \langle \frac{\partial \psi }{\partial x_{2}}\bigg |n\bigg \rangle \bigg \langle n\bigg |\frac{\partial \psi }{\partial x_1}\bigg \rangle \right] \nonumber \\&=-\textrm{i}\sum _{|n\rangle \ne |\psi \rangle }\left[ \frac{\langle \psi |\frac{\partial \hat{L}}{\partial x_1}|n\rangle \langle n|\frac{\partial \hat{L}}{\partial x_2}|\psi \rangle -\langle \psi |\frac{\partial \hat{L}}{\partial x_2}|n\rangle \langle n|\frac{\partial \hat{L}}{\partial x_1}|\psi \rangle }{(\lambda -\lambda _n)^2}\right] \end{aligned}$$
(40)

where in the first step we used the completeness of states \(\sum _{n}|n\rangle \langle n|=1\), noting that there is zero contribution to the sum from the state \(|\psi \rangle \). Note also that for brevity we have dropped the subscript ‘\(\psi \)’ from the vector potential.

In (26) of Sec. 4 we related the Gaussian curvature to derivatives of a Hermitian operator \(\hat{L}=\textrm{i}\,\varvec{n}\times \). In the same way, here we find the more general concept of the Berry curvature is given in terms of derivatives of an arbitrary linear operator \(\hat{L}\), of which \(|\psi \rangle \) is an eigenstate. We can see from (40) that an eigenvalue degeneracy, \(\lambda =\lambda _n\) between any of the levels \(|n\rangle \) and the state of interest \(|\psi \rangle \), leads to points of singular Berry curvature. Such points are critical points of the Berry connection, arising due to the indeterminacy in the eigenvector \(|\psi \rangle \). However, degeneracies are not the only points of non–zero Berry curvature. Equation (40) shows that the curvature is generally non–zero whenever the derivatives of the linear operator, \(\partial \hat{L}/\partial x_i\), have complex off diagonal matrix elements with the state of interest \(|\psi \rangle \).

An important corollary of (40) is that, if we sum the Berry curvature over all system states \(|\psi \rangle \), we obtain zero

$$\begin{aligned} -\textrm{i}\sum _{|\psi \rangle }\sum _{|n\rangle \ne |\psi \rangle }\left[ \frac{\langle \psi |\frac{\partial \hat{L}}{\partial x_1}|n\rangle \langle n|\frac{\partial \hat{L}}{\partial x_2}|\psi \rangle -\langle \psi |\frac{\partial \hat{L}}{\partial x_2}|n\rangle \langle n|\frac{\partial \hat{L}}{\partial x_1}|\psi \rangle }{(\lambda -\lambda _n)^2}\right] =0 \end{aligned}$$
(41)

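because the summand is antisymmetric under the exchange of \(|\psi \rangle \) and \(|n\rangle \): relabelling the two summation indices turns the first term in the numerator into the second, so the contributions cancel in pairs. This simple observation means that the sum of the first Chern numbers for all the eigenstates of a system is always zero.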
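The cancellation in (41) can be illustrated numerically by evaluating the curvature formula (40) for every eigenstate of a Hermitian operator and adding the results. This is only a sketch under stated assumptions: the operator family \(\hat{L}(x_1,x_2)=H_0+x_1H_1+x_2H_2\), with random Hermitian matrices \(H_{0,1,2}\), is a hypothetical example chosen so that the derivatives \(\partial \hat{L}/\partial x_i\) are known exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(dim):
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (m + m.conj().T) / 2.0

# Hypothetical family L(x1, x2) = H0 + x1*H1 + x2*H2, so that
# dL/dx1 = H1 and dL/dx2 = H2 exactly.
H0, H1, H2 = (random_hermitian(4) for _ in range(3))

def berry_curvatures(x1, x2):
    """Berry curvature of every eigenstate of L(x1, x2), evaluated from eq. (40)."""
    lam, vecs = np.linalg.eigh(H0 + x1 * H1 + x2 * H2)
    d1 = vecs.conj().T @ H1 @ vecs       # matrix elements <m| dL/dx1 |n>
    d2 = vecs.conj().T @ H2 @ vecs       # matrix elements <m| dL/dx2 |n>
    curv = np.zeros(len(lam))
    for s in range(len(lam)):            # state of interest
        for n in range(len(lam)):        # all other states
            if n != s:
                term = d1[s, n] * d2[n, s] - d2[s, n] * d1[n, s]
                curv[s] += (-1j * term / (lam[s] - lam[n]) ** 2).real
    return curv

curvatures = berry_curvatures(0.3, -0.2)
total = curvatures.sum()
```

Each entry of `curvatures` is generally non–zero, while `total` vanishes to rounding error at any point \((x_1,x_2)\) where the spectrum is non–degenerate.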

5.2 Example: Polarization Eigenstates in Anisotropic Materials

Take an electromagnetic wave propagating through an anisotropic, non–magnetic material. We shall use the Chern number to characterize a family of these materials, to tell us how many times all possible polarizations are explored as we run through the material parameters.

Assuming a permeability \(\mu _0\varvec{1}\), a Hermitian (lossless) permittivity,

$$\begin{aligned} \varvec{\epsilon }=\epsilon _0\left( \begin{array}{ccc}\epsilon _{xx}&{}\epsilon _{xy}&{}0\\ \epsilon _{yx}&{}\epsilon _{yy}&{}0\\ 0&{}0&{}\epsilon _{zz}\end{array}\right) =\epsilon _0\left( \begin{array}{cc}\varvec{\epsilon }_{\parallel }&{}\varvec{0}\\ \varvec{0}&{}\epsilon _{zz}\end{array}\right) , \end{aligned}$$
(42)

and propagation along the z–axis, Maxwell’s equations for a wave of fixed frequency \(\omega \) reduce to

$$\begin{aligned} \frac{k}{\omega }\varvec{e}_{z}\times \varvec{E}&=\mu _0\varvec{H}\nonumber \\ \frac{k}{\omega }\varvec{e}_{z}\times \varvec{H}&=-\epsilon _0\varvec{\epsilon }_{\parallel }\cdot \varvec{E} \end{aligned}$$
(43)

where k is the propagation constant of the wave. The problem is now to find the propagation constant and the electric field vector for a given set of material parameters \(\varvec{\epsilon }_{\parallel }\). Eliminating the magnetic field from the two equations (43) we can reduce the problem to an eigenvalue equation

$$\begin{aligned} \varvec{\epsilon }_{\parallel }\cdot \varvec{E}=\textrm{n}^2\varvec{E} \end{aligned}$$
(44)

where \(\textrm{n}=ck/\omega \) is the refractive index. Therefore the eigenvectors and eigenvalues of the in–plane permittivity tensor determine the polarization and refractive index of the wave, respectively. Interestingly, (44) is equivalent to the Schrödinger equation for a spin 1/2 particle in a magnetic field (see e.g. [19]). With this in mind we re–write the in–plane permittivity tensors in terms of the Pauli spin matrices, \(\varvec{\sigma }=\sigma _{x}\varvec{e}_{x}+\sigma _{y}\varvec{e}_{y}+\sigma _{z}\varvec{e}_{z}\), as follows

$$\begin{aligned} \left( \begin{matrix}\epsilon _{xx}&{}\epsilon _{xy}\\ \epsilon _{yx}&{}\epsilon _{yy}\end{matrix}\right) =\left( \begin{matrix}V_0+V\,\cos (\theta )&{}V\,\sin (\theta )\,\textrm{e}^{-\textrm{i}\phi }\\ V\,\sin (\theta )\,\textrm{e}^{\textrm{i}\phi }&{}V_0-V\,\cos (\theta )\end{matrix}\right) =V_0\varvec{1}+V\,\varvec{n}\cdot \varvec{\sigma }. \end{aligned}$$
(45)

where \(\varvec{n}\) is a unit vector, here parameterized by the spherical coordinates, \(\theta \) and \(\phi \). While the angle \(\theta \) determines the orientation of the principal axes of the permittivity, \(\phi \) governs the gyrotropy of the medium [27]. We can thus visualise these forms of lossless in–plane permittivity in terms of two real numbers, \(V_0\) and V, and the coordinates on a sphere, \(\theta \) and \(\phi \). Interestingly, the eigenvalues of (45) (the refractive index squared) are independent of the coordinates \(\theta \) and \(\phi \),

$$\begin{aligned} \textrm{n}^2=V_0\pm V. \end{aligned}$$
(46)

while the eigenvectors—defining the polarization of the electric field—depend only on the spherical coordinates

$$\begin{aligned} |\psi _{+}\rangle =\left( \begin{matrix}\cos (\theta /2)\\ \sin (\theta /2)\textrm{e}^{\textrm{i}\phi }\end{matrix}\right) ,\qquad |\psi _{-}\rangle =\left( \begin{matrix}-\sin (\theta /2)\textrm{e}^{-\textrm{i}\phi }\\ \cos (\theta /2)\end{matrix}\right) . \end{aligned}$$
(47)

These two states can be pictured as points on the Bloch sphere shown in Fig. 7. Keeping the refractive index fixed, the two polarizations can be interchanged by modifying the material parameters (45) according to the substitution \(\theta \rightarrow \pi -\theta \) and \(\phi \rightarrow \phi +\pi \).
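A short numerical check of the eigenvalue problem (44) for the permittivity (45) is given below. It is a sketch with arbitrarily chosen values of \(V_0\), \(V\), \(\theta \) and \(\phi \) (assumptions made purely for illustration), and the function names are hypothetical. It confirms that the eigenvalues are \(V_0\pm V\), that \(|\psi _{+}\rangle \) is the eigenvector belonging to \(V_0+V\), and that the substitution \(\theta \rightarrow \pi -\theta \), \(\phi \rightarrow \phi +\pi \) turns it into the eigenvector belonging to \(V_0-V\).

```python
import numpy as np

def permittivity(V0, V, theta, phi):
    """In-plane relative permittivity of eq. (45)."""
    return np.array([
        [V0 + V * np.cos(theta), V * np.sin(theta) * np.exp(-1j * phi)],
        [V * np.sin(theta) * np.exp(1j * phi), V0 - V * np.cos(theta)],
    ])

def psi_plus(theta, phi):
    """Polarization eigenvector |psi_+> of eq. (47)."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2) * np.exp(1j * phi)])

V0, V, theta, phi = 2.5, 0.4, 0.9, 2.1   # arbitrary illustrative values
eps = permittivity(V0, V, theta, phi)

n_squared = np.linalg.eigvalsh(eps)      # refractive index squared, ascending order

plus = psi_plus(theta, phi)
plus_residual = np.linalg.norm(eps @ plus - (V0 + V) * plus)

# Apply the substitution theta -> pi - theta, phi -> phi + pi to the state only
swapped = psi_plus(np.pi - theta, phi + np.pi)
minus_residual = np.linalg.norm(eps @ swapped - (V0 - V) * swapped)
```

Changing \(\theta \) and \(\phi \) in this sketch leaves `n_squared` unchanged, which is the statement made in (46).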

Fig. 7

(a) An electromagnetic wave propagating through a dielectric (here along z), normal to the plane of anisotropy (here the xy plane) will generally have two different propagation constants \(k=\textrm{n}\omega /c\), each corresponding to a different, generally elliptical, polarization, as given by (46) and (47). (b) We can visualise the polarizations on the Bloch sphere, with \(\theta \) and \(\phi \) labelling the different material parameters in (45). For \(\theta =0,\pi \) the principal axes of the permittivity tensor align with the x and y axes, and the wave is either x or y polarized. Meanwhile, for \(\theta =\pi /2\) the principal axes are diagonal and the wave polarization varies between diagonal and circular, depending on the gyrotropy \(\phi \)

For both states (47), the Berry connection contains only a single component,

$$\begin{aligned} A_{+,2}=-\textrm{i}\langle \psi _{+}|\frac{\partial }{\partial \phi }|\psi _{+}\rangle =\sin ^{2}(\theta /2)=-A_{-,2} \end{aligned}$$
(48)

which has an \(n=1\) critical point at the south pole, \(\theta =\pi \). The result \(A_{+,2}=-A_{-,2}\) is also in agreement with Sec. 5.1, where we found in (41) that the sum of the Berry curvature over all eigenstates should vanish. From the above equation for the Berry connection, the Berry curvature is found to equal

$$\begin{aligned} \frac{\partial A_{+,2}}{\partial \theta }-\frac{\partial A_{+,1}}{\partial \phi }=\sin (\theta )/2. \end{aligned}$$
(49)

The integral of the curvature (49) over all values of \(\theta \) and \(\phi \) equals the first Chern number, here \(\textrm{Ch}_+\), which in this system is the number of times the polarization covers the Bloch sphere

$$\begin{aligned} \textrm{Ch}_{+}=\frac{1}{4\pi }\int _{0}^{2\pi }{\text{ d }}\phi \int _0^{\pi }{\text{ d }}\theta \sin (\theta )=1=-\textrm{Ch}_{-}. \end{aligned}$$
(50)

A Chern number of unity tells us that, for a fixed value of the refractive index squared, \(\textrm{n}^2=V_0\pm V\), the family of in–plane permittivity tensors (45) cover all possible electromagnetic polarizations exactly once. The difference in sign between \(\textrm{Ch}_{+}\) and \(\textrm{Ch}_{-}\) means that the two polarization eigenstates (47) wind in opposite senses around the Bloch sphere as we change the material parameters. This argument fails when we look at isotropic materials where \(V=0\), in which case the ‘gap’ between the two values of refractive index (46) closes, and the permittivity (45) no longer depends on the spherical coordinates. At this point the two polarizations (47) have degenerate values of the refractive index, meaning all polarizations are solutions to (44), whatever the value of \(V_0\).
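The Chern numbers (50) can also be recovered without choosing a phase convention for the eigenvectors, by evaluating the operator form of the Berry curvature (40) for \(\hat{L}=V_0\varvec{1}+V\,\varvec{n}\cdot \varvec{\sigma }\) and integrating over the sphere. The following is a sketch only: the midpoint-rule grid, the values of \(V_0\) and \(V\), and the function names are assumptions made for the purposes of illustration.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def operator_and_derivatives(V0, V, theta, phi):
    """L = V0 + V n.sigma of eq. (45) and its derivatives in theta and phi."""
    st, ct, sp, cp = np.sin(theta), np.cos(theta), np.sin(phi), np.cos(phi)
    L = V0 * np.eye(2) + V * (st * cp * sx + st * sp * sy + ct * sz)
    dL_dtheta = V * (ct * cp * sx + ct * sp * sy - st * sz)
    dL_dphi = V * (-st * sp * sx + st * cp * sy)
    return L, dL_dtheta, dL_dphi

def curvature(V0, V, theta, phi, band):
    """Berry curvature from eq. (40); band=1 is n^2 = V0+V, band=0 is V0-V (V > 0)."""
    L, d1, d2 = operator_and_derivatives(V0, V, theta, phi)
    lam, vecs = np.linalg.eigh(L)        # eigenvalues in ascending order
    s, n = band, 1 - band
    a = vecs[:, s].conj() @ d1 @ vecs[:, n]
    b = vecs[:, n].conj() @ d2 @ vecs[:, s]
    # the second term in the numerator of (40) is the complex conjugate of a*b
    return (-1j * (a * b - np.conj(a * b))).real / (lam[s] - lam[n]) ** 2

def chern_number(V0, V, band, n_theta=200, n_phi=16):
    thetas = (np.arange(n_theta) + 0.5) * np.pi / n_theta       # midpoint rule
    phis = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi
    total = sum(curvature(V0, V, t, p, band) for t in thetas for p in phis)
    return total * (np.pi / n_theta) * (2.0 * np.pi / n_phi) / (2.0 * np.pi)

ch_plus = chern_number(2.5, 0.4, band=1)
ch_minus = chern_number(2.5, 0.4, band=0)
```

Because each eigenvector enters (40) once as a bra and once as a ket, the arbitrary phases returned by the numerical eigensolver drop out of the result.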

Had we parameterized the permittivity differently, as e.g.

$$\begin{aligned} \left( \begin{matrix}\epsilon _{xx}&{}\epsilon _{xy}\\ \epsilon _{yx}&{}\epsilon _{yy}\end{matrix}\right) =\left( \begin{matrix}V_0\cos (\phi )+V\cos (\theta )&{}V\sin (\theta )\\ V\sin (\theta )&{}V_0\cos (\phi )-V\cos (\theta )\end{matrix}\right) \end{aligned}$$
(51)

the refractive index would have been \(\textrm{n}^2=V_0\cos (\phi )\pm V\), with eigenstates independent of \(\phi \), e.g. \(|\psi _{+}\rangle =\left( \cos (\theta /2),\sin (\theta /2)\right) ^\textrm{T}\). For this family of permittivity tensors the Berry connection (48) therefore vanishes, and no area is covered on the Bloch sphere. The Chern numbers thus vanish, \(\textrm{Ch}_+=\textrm{Ch}_-=0\). This tells us that, for anisotropic media, gyrotropy is essential to realise all possible polarization eigenstates.

6 Chern Numbers and Dispersion Relations

We introduced the ‘characteristic classes’ to set up what is probably the most striking application of topology in wave physics: the prediction of interface modes, and in particular the possibility of these interface modes being constrained to propagate in one direction only. In this section we prove the relation between the integral of the first Chern class, and the prediction of interface modes between different materials.

Take some planar material that supports waves. Perhaps electromagnetic waves in a periodic array of pillars, or elastic waves in a plate. Suppose the system is homogeneous, or at least periodic, so the modes \(|n_{\varvec{k}}\rangle \) can be labelled with a wave–vector \(\varvec{k}\). The modes will generally be the solution to some eigenvalue equation

$$\begin{aligned} \hat{L}(\varvec{k})|n_{\varvec{k}}\rangle =\lambda _{n,\varvec{k}}|n_{\varvec{k}}\rangle \end{aligned}$$
(52)

where the integer n labels the different branches of the dispersion relation. At this stage we make no assumption about the meaning of \(\lambda _{n,\varvec{k}}\); it could be a frequency, a wave vector component, or a material parameter.

In the case of homogeneous media, the wave–vector ranges over all pairs of real values, which can be mapped to a sphere via the stereographic projection (so long as care is taken at infinity [28], see Fig. 8a). For periodic media the Bloch vector has components ranging over the first Brillouin zone, the edges of which can be connected to each other through the addition of a reciprocal lattice vector, thus forming the surface of a torus (see Fig. 8b). In both cases we are dealing with the situation described in the previous sections: a single complex vector field \(|n_{\varvec{k}}\rangle \) defined over a closed two dimensional surface, the points of which are specified by the vector \(\varvec{k}\).

Fig. 8

The modes \(|n_{\varvec{k}}\rangle \) of both homogeneous and periodic two dimensional media can be understood as a set of vectors on a closed surface, as discussed in Sec. 5. For homogeneous media the wave–vector \(\varvec{k}\) ranges over all pairs of real values, which—via the stereographic projection—can be mapped onto the surface of a sphere, as shown in (a). For periodic media, the Bloch vector \(\varvec{K}\) is equivalent after the addition of a reciprocal lattice vector. The edges of the first Brillouin zone (differing as they do by a reciprocal lattice vector) can thus be identified, e.g. the two blue lines in panel (b) represent the same physical states. Connecting the edges of the Brillouin zone yields a torus, as shown in panel (b)

In a nutshell

Suppose we have a system where the different branches of the dispersion relation (the \(\lambda _{n,\varvec{k}}\) for different n) are separated by gaps. For instance, \(\lambda _{2,\varvec{k}}\) takes values in some range, separated by a gap from the lower range of values \(\lambda _{1,\varvec{k}}\) below it, and by another gap from the higher range of values \(\lambda _{3,\varvec{k}}\) above it. The idea is that we focus on one of these gaps in the spectrum. We classify all the branches of the dispersion relation below this gap in terms of the integral of the first Chern class (the first Chern number) over all wave–vectors

$$\begin{aligned} \textrm{Ch}_{1}=\frac{1}{2\pi \textrm{i}}\int _{S} \varvec{\nabla }_{\varvec{k}}\times \langle n_{\varvec{k}}|\varvec{\nabla }_{\varvec{k}}|n_{\varvec{k}}\rangle \,{\text{ d }}k_1\,{\text{ d }}k_2=\frac{1}{2\pi }\int _{S}\varvec{\nabla }\times \varvec{A}_{n}\,{\text{ d }}k_1\,{\text{ d }}k_2. \end{aligned}$$
(53)

As shown in Fig. 8, the surface S is a sphere for homogeneous media, and a torus for periodic materials.

Now take two different materials, A and \(A'\), that have a shared gap in their eigenvalue spectrum. If we sum the first Chern numbers \(\textrm{Ch}_1\) for all the modes \(|n_{\varvec{k}}\rangle \) and \(|n'_{\varvec{k}}\rangle \) below the gap of interest and find the answers for the two materials are not the same, there is no way to smoothly change the material from A to \(A^{\prime }\), without closing the gap of interest in the eigenvalue spectrum.

To see this, note that to change the Chern number for any one of the branches of the dispersion relation we must introduce or remove critical points from the Berry connection \(\varvec{A}_{n}\). From our formula for the Berry curvature (40) we know that new critical points arise whenever another branch of the dispersion relation becomes degenerate with branch n. But we know from (41) that the Berry curvature summed over all the branches of the dispersion relation is zero, so when two branches touch, any Chern number gained by one of them is lost by the other. Therefore the sum of the Chern numbers below the gap cannot be changed through the closure of any gap, except the one we are interested in. This argument holds, however we choose to change material A into \(A'\).
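To make this concrete, the sketch below evaluates the first Chern number (53) on a Brillouin zone torus for a hypothetical two-band operator, \(\hat{L}(\varvec{k})=\sin (k_1)\sigma _x+\sin (k_2)\sigma _y+(m+\cos (k_1)+\cos (k_2))\sigma _z\). This model is not taken from the text; it is assumed here only because its gap closes at isolated values of the parameter m. Rather than differentiating the eigenvectors, the sketch uses a swapped-in discrete technique: the curvature on each small plaquette of a \(\varvec{k}\)–space grid is taken as the phase of the product of overlaps between neighbouring eigenvectors around the plaquette, which approximates \(\oint \varvec{A}_{n}\cdot {\text{ d }}\varvec{k}\) and does not depend on the phases chosen by the eigensolver.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def operator(kx, ky, m):
    """Hypothetical two-band operator on the Brillouin-zone torus (not from the text)."""
    return np.sin(kx) * sx + np.sin(ky) * sy + (m + np.cos(kx) + np.cos(ky)) * sz

def overlap(a, b):
    """Inner products <a|b> at every grid point."""
    return np.sum(a.conj() * b, axis=-1)

def chern_numbers(m, n_k=24):
    """First Chern numbers [lower band, upper band] from plaquette phases."""
    ks = 2.0 * np.pi * np.arange(n_k) / n_k
    vecs = np.empty((n_k, n_k, 2, 2), dtype=complex)
    for i, kx in enumerate(ks):
        for j, ky in enumerate(ks):
            vecs[i, j] = np.linalg.eigh(operator(kx, ky, m))[1]
    result = []
    for band in range(2):
        u = vecs[:, :, :, band]
        ux = np.roll(u, -1, axis=0)      # eigenvector at k + dk_1 (periodic)
        uy = np.roll(u, -1, axis=1)      # eigenvector at k + dk_2 (periodic)
        uxy = np.roll(ux, -1, axis=1)    # eigenvector at k + dk_1 + dk_2
        loop = overlap(u, ux) * overlap(ux, uxy) * overlap(uxy, uy) * overlap(uy, u)
        result.append(int(round(np.angle(loop).sum() / (2.0 * np.pi))))
    return result

gapped_nontrivial = chern_numbers(1.0)   # gap open, |m| < 2
gapped_trivial = chern_numbers(3.0)      # gap open, |m| > 2
```

In this assumed model the two parameter values lie on either side of a gap closure at \(m=2\), and the Chern numbers of the two branches always sum to zero, as required by (41).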

Therefore, if we join the two materials together, in the transition region between the two there must be a region of space where the gap in the spectrum closes such that propagation is allowed. This region is where interface modes can be trapped. To add some meat onto the bones of this idea, we now give an argument based around a paper of the author [29], which was in turn adapted to Maxwell’s equations from [30].

6.1 Volovik’s Mode Counting Argument:

Suppose we have a system where the material varies smoothly as a function of x, as shown in Fig. 9. Asymptotically we have material A as x becomes large and positive and material \(A'\) when x is large and negative. Along the y–axis, the system remains translationally invariant so that we can replace y with the Fourier variable k. Due to the lack of translational invariance along x, the wave operator cannot be written in terms of the Fourier variable \(k_x\), but depends on both \(-\textrm{i}\partial _{x}\) and x. The eigenvalue equation is thus changed from (52) to

$$\begin{aligned} \hat{L}(-\textrm{i}\partial _x,x,k)|n_{k}\rangle =\lambda _{n,k}|n_{k}\rangle . \end{aligned}$$
(54)

The Green function for this equation obeys

$$\begin{aligned} \left[ \hat{L}(-\textrm{i}\partial _x,x,k)-\lambda \right] \varvec{G}(x,x',k,\lambda )=\varvec{1}\,\delta (x-x') \end{aligned}$$
(55)

and contains information about all of the eigenmodes. Here we make use of the Green function to characterize how the eigenvalues change with respect to the y component of the wavevector, k. Using the completeness of the eigenmodes of (54), \(\sum _{n}\langle x|n\rangle \langle n|x'\rangle =\varvec{1}\,\delta (x-x')\), we can expand the Green function in the eigenmode basis, yielding the ‘bilinear expansion’ [31]

$$\begin{aligned} \varvec{G}(x,x',k,\lambda )=\sum _{n}\frac{\langle x|n\rangle \langle n|x'\rangle }{\lambda _{n,k}-\lambda }. \end{aligned}$$
(56)

For us \(\lambda \) is a complex number so that the denominator in (56) is typically non–zero.
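The bilinear expansion (56) can be checked in a finite dimensional setting. The following is a minimal sketch, assuming a randomly generated \(4\times 4\) Hermitian matrix as a stand-in for the wave operator \(\hat{L}\); the matrix, the value of \(\lambda \), and the function names are illustrative choices, not part of the theory above.

```python
import numpy as np

def green_direct(L, lam):
    """Invert (L - lam) directly: the matrix analogue of (55)."""
    return np.linalg.inv(L - lam * np.eye(L.shape[0]))

def green_bilinear(L, lam):
    """Sum |n><n| / (lambda_n - lam) over eigenmodes: the analogue of (56)."""
    evals, evecs = np.linalg.eigh(L)  # columns of evecs are orthonormal modes
    G = np.zeros_like(L, dtype=complex)
    for lam_n, mode in zip(evals, evecs.T):
        G += np.outer(mode, mode.conj()) / (lam_n - lam)
    return G

# Hypothetical Hermitian stand-in for the wave operator (an assumption)
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
L_example = (M + M.conj().T) / 2
lam_example = 0.3 + 0.7j  # complex, so no denominator in (56) vanishes

print(np.allclose(green_direct(L_example, lam_example),
                  green_bilinear(L_example, lam_example)))
```

Because the eigenvalues of a Hermitian operator are real, any \(\lambda \) with a non-zero imaginary part keeps every denominator away from zero, which is the point made in the sentence above.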

Fig. 9

The spectral asymmetry \(\nu (k)\) defined in (57) is half the difference between the number of modes with positive and negative eigenvalue, \(\lambda \) for a fixed value of \(k_y=k\). If a single mode crosses \(\lambda =0\) with increasing k (shown here in red) then the difference \(\nu (+K)-\nu (-K)\) equals \(\pm 1\). If the asymptotic materials A and \(A'\) both have a gap in their eigenvalue spectrum around \(\lambda =0\), the mode that crosses \(\lambda =0\) must be an interface mode, confined to the inhomogeneous region between the blue and yellow regions indicated in the figure

As discussed above, we assume that when taken as homogeneous media, the two asymptotic materials, A and \(A'\) have a shared gap in their eigenvalue spectrum that includes \(\lambda =0\). Therefore, if we find a solution to (54) with \(\lambda =0\) we know that it must be confined to the region in space where the material is in the process of changing from A to \(A'\), i.e. it is an interface mode. We count these modes through introducing the spectral asymmetry, which records the difference in the number of modes with positive and negative eigenvalues (see Fig. 9),

$$\begin{aligned} \nu (k)=\frac{1}{2}\sum _{n}\textrm{sign}[\lambda _{n,k}] \end{aligned}$$
(57)

This number is a kind of tripwire, changing whenever a mode of the system crosses \(\lambda =0\). The spectral asymmetry can be calculated in terms of the Green function as follows,

$$\begin{aligned} \nu (k)&=\textrm{Re}\int _{-\infty }^{\infty }\frac{{\text{ d }}\lambda }{2\pi }\,\textrm{Tr}[G(x,x,k,\textrm{i}\lambda )]=\sum _{n}\textrm{Tr}[\langle x|n\rangle \langle n|x\rangle ]\,\int _{-\infty }^{\infty }\frac{{\text{ d }}\lambda }{2\pi }\frac{\lambda _{n,k}}{\lambda _{n,k}^2+\lambda ^2}\nonumber \\&=\frac{1}{2}\sum _{n}\textrm{sign}[\lambda _{n,k}] \end{aligned}$$
(58)

where the capitalized ‘\(\textrm{Tr}\)’ has been used for the sake of brevity. It means both a trace over matrix indices, and an integration over position x.
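As a numerical illustration of (58), the sketch below evaluates the integral of \(\textrm{Re}\,\textrm{Tr}[G(\textrm{i}\lambda )]\) for a hypothetical list of eigenvalues and compares it with the direct count (57). It assumes normalized modes, so that each term \(\textrm{Tr}[\langle x|n\rangle \langle n|x\rangle ]\) equals one, and it maps the infinite \(\lambda \) axis onto a finite interval with the substitution \(\lambda =\tan t\). The eigenvalues, grid size, and function names are our own choices.

```python
import numpy as np

def asymmetry_by_counting(eigenvalues):
    """Spectral asymmetry from the definition (57)."""
    return 0.5 * np.sum(np.sign(eigenvalues))

def asymmetry_from_green(eigenvalues, n_points=4000):
    """Spectral asymmetry from the Green function integral (58).

    Assumes normalized modes, so Tr G(i*lam) = sum_n 1 / (lambda_n - i*lam).
    """
    lam_n = np.asarray(eigenvalues, dtype=float)
    # Midpoint grid in t, with lam = tan(t) covering the whole real axis
    t = (np.arange(n_points) + 0.5) * np.pi / n_points - np.pi / 2
    lam = np.tan(t)
    jacobian = 1.0 / np.cos(t) ** 2  # d(lam)/dt
    trace_G = np.sum(1.0 / (lam_n[:, None] - 1j * lam[None, :]), axis=0)
    integral = np.sum(trace_G.real * jacobian) * (np.pi / n_points)
    return integral / (2 * np.pi)

# Hypothetical spectrum: two negative and three positive eigenvalues
example_spectrum = [-2.0, -0.5, 0.3, 1.0, 4.0]
print(asymmetry_by_counting(example_spectrum))
print(asymmetry_from_green(example_spectrum))
```

Each eigenvalue contributes a Lorentzian of area \(\pi \,\textrm{sign}[\lambda _{n,k}]\), independent of its magnitude, which is why the integral only records the signs.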

We assume the material parameters change on a length scale that is large compared to all other system scales. Equation (58) also shows us we only need the behaviour of the Green function in the neighbourhood of \(x=x'\). The differential equation (55) can therefore be expanded to leading order in the distance \(x-x'\). Making the change of variables from x and \(x'\) to the average position \(X=(x+x')/2\), and separation \(\xi =x-x'\), we expand the linear operator \(\hat{L}\) to leading order in \(\xi \) and \(\partial _X\),

$$\begin{aligned} \hat{L}(-\textrm{i}\partial _x,x,k)&=\hat{L}\left( -\textrm{i}\partial _\xi -\textrm{i}\frac{1}{2}\partial _X,X+\frac{1}{2}\xi ,k\right) \nonumber \\&\sim \hat{L}(-\textrm{i}\partial _{\xi },X,k_y)-\frac{\textrm{i}}{4}\left[ \frac{\partial }{\partial X}\frac{\partial \hat{L}}{\partial k_x}+\frac{\partial \hat{L}}{\partial k_x}\frac{\partial }{\partial X}\right] +\frac{1}{4}\left[ \xi \frac{\partial \hat{L}}{\partial X}+\frac{\partial \hat{L}}{\partial X}\xi \right] . \end{aligned}$$
(59)

where derivatives with respect to \(k_x\) indicate a derivative of the operator with respect to the first argument of \(\hat{L}\). Everywhere in (59) the operator \(\hat{L}\) depends only on derivatives \(\partial _{\xi }\) and average coordinate X. Performing a Fourier transform of (59) over \(\xi \), the wave operator \(\hat{L}\) becomes a simpler object involving only two differential operators

$$\begin{aligned} \int {\text{ d }}\xi \,\textrm{e}^{-\textrm{i}k_x\xi }\,\hat{L}(-\textrm{i}\partial _x,x,k)\sim L(k_x,X,k)+\frac{\textrm{i}}{2}\left[ \frac{\partial L}{\partial X}\frac{\partial }{\partial k_x}-\frac{\partial L}{\partial k_x}\frac{\partial }{\partial X}\right] . \end{aligned}$$
(60)

Staying in \(k_x\) space, we can now solve for the Green function of (60) to the same order, writing it as the sum of a zeroth and first order term \(G\sim G_0+G_1\). The zeroth order Green function is a solution to (55) with the operator \(\hat{L}\) replaced by the first term in (60). This is simply the Green function for an infinite homogeneous medium with the local material properties at the average position X,

$$\begin{aligned} G_0(k_x,X,k_y,\lambda )=\left[ L(k_x,X,k_y)-\lambda \right] ^{-1} \end{aligned}$$
(61)

Meanwhile the first order correction to G is the solution to

$$\begin{aligned} \left[ L(k_x,X,k)-\lambda \right] G_1=-\frac{\textrm{i}}{2}\left[ \frac{\partial L}{\partial X}\frac{\partial }{\partial k_x}-\frac{\partial L}{\partial k_x}\frac{\partial }{\partial X}\right] G_0 \end{aligned}$$
(62)

which can be solved using the zeroth order Green function (61), giving the correction

$$\begin{aligned} G_{1}(k_x,X,k,\lambda )=-\frac{\textrm{i}}{2}G_0\left[ \frac{\partial L}{\partial X}\frac{\partial G_0}{\partial k_x}-\frac{\partial L}{\partial k_x}\frac{\partial G_0}{\partial X}\right] . \end{aligned}$$
(63)

Summing Eqns. (61) and (63) we have an approximate expression for the Green function, valid when there is a slow change of material parameters with respect to position.

With our approximate expression for the Green function we can now compute the spectral asymmetry via (58). To keep the expressions relatively simple, we assume the spectral asymmetry computed from \(G_0\) is zero. This assumption is easily relaxed, and means that we only consider materials on the way from A to \(A'\) that, were they homogeneous media, would have an eigenvalue spectrum symmetric around \(\lambda _{n,k}=0\) (see Footnote 5). The spectral asymmetry is then determined by the first order correction to the Green function. Substituting this correction (63) into (58) yields

$$\begin{aligned} \nu (k)&=\textrm{Im}\int _{-\infty }^{\infty }\frac{{\text{ d }}\lambda }{2\pi }\int _{-\infty }^{\infty }\frac{{\text{ d }}k_x}{2\pi }\int _{-\infty }^{\infty }\frac{{\text{ d }}X}{2}\,\textrm{tr}\,\left[ G_0\frac{\partial G_0^{-1}}{\partial X}\frac{\partial G_0}{\partial k_x}-G_0\frac{\partial G_0^{-1}}{\partial k_x}\frac{\partial G_0}{\partial X}\right] \nonumber \\&=\textrm{Im}\int \frac{{\text{ d }}{}^{3}x}{8\pi ^2}\,\textrm{tr}\left[ G_0\frac{\partial G_0^{-1}}{\partial x_1}G_0\,\frac{\partial G_0^{-1}}{\partial x_2}G_0\frac{\partial G_0^{-1}}{\partial x_0}-G_0\frac{\partial G_0^{-1}}{\partial x_2}G_0\,\frac{\partial G_0^{-1}}{\partial x_1}G_0\frac{\partial G_0^{-1}}{\partial x_0}\right] \nonumber \\ \end{aligned}$$
(64)

where we introduced the four coordinates \((x_0,x_1,x_2,x_3)=(\lambda ,X,k_x,k)\), with the integration carried out over the first three coordinates, \({\text{ d }}{}^{3}x={\text{ d }}x_0{\text{ d }}x_1{\text{ d }}x_2\).

Fig. 10

Illustration of the integrals (64) and (66). The spectral asymmetry is calculated as a three dimensional surface integral (here shown as a shaded 2D plane), at fixed k. The number of modes crossing \(\lambda _{n,k}=0\) equals the difference in the spectral asymmetry \(\nu (+K)-\nu (-K)\) (the two shaded planes). Provided the Green function vanishes for large X, \(k_x\), and \(\lambda \) the integral can be replaced with the closed cubic surface integral shown above. As the only contributions to the integral are the critical points of \(A_i\) on the \(\lambda =0\) surface (occurring whenever the band gap closes, and shown as red dots here), we can deform the closed surface into two parallel surfaces at fixed X, where the material parameters no longer change

Now the topology appears! Equation (64) is actually an integral of the Chern–Simons form (35) in disguise. To see this, we define a connection \(A_i\) with matrix components, as described in Sec. 4 and Appendix A,

$$\begin{aligned} A_i=\textrm{i}G_0\frac{\partial G_0^{-1}}{\partial x_i}=\textrm{i}H_0^{-1}\frac{\partial H_0}{\partial x_i}+\textrm{i}U_0^{\dagger }\frac{\partial U_0}{\partial x_i}. \end{aligned}$$
(65)

where we have written the matrix \(G_0^{-1}\) in polar form, \(G_0^{-1}=H_0 U_0\), where \(H_0\) is Hermitian and \(U_0\) unitary. Note that the connection (65) has zero curvature (167), and that the final term in (65) involving the unitary matrix is that found in (175) of Appendix A, corresponding to a rotation of the basis vectors. It is this final term that winds around the critical points of the connection \(A_i\), analogous to the winding of the Berry phase around critical points of a tangent vector on a surface.

The difference in the spectral asymmetry between the fixed values \(k=+K\) and \(k=-K\) tells us the number of modes N that cross \(\lambda _{n,k}=0\). After taking this difference, the open surface integral in (64) can be replaced with an integral over a closed three dimensional surface (see Fig. 10), and the number of modes N can be written as (see Footnote 6)

$$\begin{aligned} N=\nu (+K)-\nu (-K)&=\frac{\textrm{i}}{24\pi ^2}\textrm{Im}\oint {\text{ d }}^{3}x\,\epsilon _{ijk3}\,\textrm{tr}\left[ A_i\,A_j\,A_k\right] \nonumber \\&=\frac{1}{8\pi ^2}\textrm{Im}\oint {\text{ d }}^{3}x\,\epsilon _{ijk3}\,\textrm{tr}\left[ \frac{\partial A_j}{\partial x_i}A_k+\frac{2\textrm{i}}{3}A_i\,A_j\,A_k\right] . \end{aligned}$$
(66)

Note the sign of N indicates the direction in which the modes cross \(\lambda _{n,k}=0\).

After comparison with Sec. 4 we can see that (66) is the integral of the Chern–Simons form over a closed three dimensional surface, divided by \(8\pi ^2\). As described there, this boundary integral is equal to the four dimensional bulk integral of the second Chern class (33)! However, due to the form of the vector potential (65), the ‘curvature’ \(\Omega _{i j}\) defined in (32) appears to vanish identically everywhere, and once again we have an integral that looks like it should be exactly zero!

It’s the same old story: the integral is recording the discrete critical points of \(A_i\), which occur when \(G_0^{-1}=[L-\textrm{i}\lambda ]\) cannot be inverted (i.e. both the coordinate \(\lambda \) is zero, and one or more pairs of eigenvalues \(\lambda _{n,k}\) are zero). This means that the number of interface modes N in our inhomogeneous system that pass through zero eigenvalue is determined by the number of times the ‘gap’ in the eigenvalue spectrum of \(L(k_x,X,k)\) closes. This is a satisfying result: as we deform material A into material \(A'\) we must close the gap N times in order to have N interface modes crossing \(\lambda _{n,k}=0\).

So long as our integration volume encloses all of the critical points of \(A_i\) we won’t change the predicted number of interface modes, N. Therefore—as illustrated in Fig. 10—we can replace the integral over the constant k surfaces in (66) with an equivalent one over constant X surfaces,

$$\begin{aligned} N=\frac{1}{24\pi ^2}\textrm{Im}\oint {\text{ d }}{}^3 x\epsilon _{i1jk}\,\textrm{tr}\left[ A_i A_j A_k\right] \end{aligned}$$
(67)

We now need only understand the dependence of the operator L on the wave–vector \(\varvec{k}\) in the two asymptotic homogeneous materials A and \(A'\), and no longer need to consider the interface. It is also clear from the above argument that our integral over the coordinate \(\lambda \) on which the Green function depends is somewhat redundant. The critical points of \(A_i\) always occur in the plane of \(\lambda =0\). Substituting the expression (65) for \(A_i\) into (67), we can perform the integral over \(\lambda \) exactly using the following result

$$\begin{aligned} \int _{-\infty }^{\infty }{\text{ d }}\lambda \,\textrm{tr}\left[ \left( L-\textrm{i}\lambda \right) ^{-2}\frac{\partial L}{\partial k_x}\left( L-\textrm{i}\lambda \right) ^{-1}\frac{\partial L}{\partial k}\right] \\ =2\pi \sum _{n<0,m\ne n}\frac{\langle n|\frac{\partial L}{\partial k_x}|m\rangle \langle m|\frac{\partial L}{\partial k} |n\rangle -\langle n|\frac{\partial L}{\partial k}|m\rangle \langle m|\frac{\partial L}{\partial k_x} |n\rangle }{(\lambda _{n,k}-\lambda _{m,k})^2} \end{aligned}$$
(68)

which comes from an expansion of the operator L in terms of its eigenmodes \(L=\sum _{n}\lambda _{n,k}|n\rangle \langle n|\) and an application of Cauchy’s integral formula. After applying (68) to our integral (67), the number of interface modes equals the difference

$$\begin{aligned} N=\bar{\nu }(X_{A'}) - \bar{\nu }(X_A) \end{aligned}$$
(69)

where we have defined

$$\begin{aligned} \bar{\nu }(x)=\frac{1}{2\pi \textrm{i}}\sum _{n<0,m\ne n}\int _{S}{\text{ d }}{}^{2}k\,\frac{\langle n|\frac{\partial L}{\partial k_x}|m\rangle \langle m|\frac{\partial L}{\partial k} |n\rangle -\langle n|\frac{\partial L}{\partial k}|m\rangle \langle m|\frac{\partial L}{\partial k_x} |n\rangle }{(\lambda _{n,k}-\lambda _{m,k})^2} \end{aligned}$$
(70)

and e.g. \(X_{A}\) indicates a position away from the inhomogeneity where the material parameters are those of material A.

Comparison with our expression for the Berry curvature (40), and the first Chern number (38) shows that \(\bar{\nu }(x)\) is equal to the sum of all the Chern numbers for the bands below \(\lambda _{n,k}=0\),

$$\begin{aligned} \bar{\nu }(x)=\sum _{n<0}\textrm{Ch}_{1,n}(x) \end{aligned}$$
(71)

which will be an integer so long as the surface S appearing in (70) can be made closed (as shown in Fig. 8). The number of interface modes N arising from changing the material from A to \(A'\) is thus equal to the difference in Chern numbers between the two media, summed over all the bands below the gap in the eigenvalue spectrum,

$$\begin{aligned} N=\sum _{n<0}\left[ \textrm{Ch}_{1,n}(X_{A'})-\textrm{Ch}_{1,n}(X_{A})\right] . \end{aligned}$$
(72)

We now have the striking result mentioned at the beginning of this section. The number of interface modes trapped between materials A and \(A'\) is determined by the difference in their Chern numbers. These are computed over a closed surface parameterized by the wave–vector, \(\varvec{k}\), and summed over all branches of the dispersion relation below the gap in the spectrum in which the interface modes of interest lie.

So far in the tutorial we have gradually introduced the machinery of topology in order to get to this amazing result, the origin of which is not often discussed in metamaterials textbooks or papers. We now consider a few simple examples.

6.2 Example: Counting One Mode

As our first example we give a simple one dimensional calculation of the spectral asymmetry (58) in terms of the Green function, and relate it to a winding number.

Suppose we have a mode that satisfies the linear dispersion relation \(k=\alpha k_0\). We can see immediately that, for \(\alpha >0\), exactly one mode crosses from negative \(k_0\) to positive \(k_0\) as k is increased from \(-\infty \) to \(+\infty \). One choice of linear equation that gives such a dispersion relation is

$$\begin{aligned} -\textrm{i}\alpha ^{-1}\frac{\partial \phi }{\partial y}=k_0\phi . \end{aligned}$$
(73)

Comparing to the previous section we see that the linear operator is \(\hat{L}=-\textrm{i}\alpha ^{-1}\partial _y\), and the eigenvalue is the wavenumber, \(\lambda _{k}=k_0\).

Fig. 11

Counting one mode: the first order equation (73) has wave–like solutions with a dispersion relation \(k=\alpha k_0=\alpha \omega /c\). Panel (a) shows the dispersion relation for \(\alpha >0\), indicating the spectral asymmetry, which is \(+1/2\) for positive k and \(-1/2\) for negative k. (b) The spectral asymmetry \(\nu (K)\) can be written as the change in the argument (\(\Delta \theta \)) of the Green function (74) as \(\lambda \) varies from \(-\textrm{i}\infty \) to \(+\textrm{i}\infty \), divided by \(2\pi \). The difference \(\nu (K)-\nu (-K)\) thus equals the number of times the argument of the Green function winds around the critical point at \(\alpha ^{-1}k+\textrm{i}\lambda =0\)

We can now calculate the spectral asymmetry as a function of the wave–vector, k. For a fixed value of k the Green function is simple

$$\begin{aligned} \left( \alpha ^{-1}k-\lambda \right) G(k,\lambda )=1\rightarrow G(k,\lambda )=\frac{1}{\alpha ^{-1}k-\lambda }. \end{aligned}$$
(74)

According to (58), the spectral asymmetry equals the real part of the integral of the Green function over purely imaginary values of \(\lambda \). For our simple Green function (74), this is simply the derivative of the phase of G

$$\begin{aligned} \textrm{Re}[G(k,\textrm{i}\lambda )]=\textrm{Im}\frac{{\text{ d }}}{{\text{ d }}\lambda }\log \left( \frac{1}{\alpha ^{-1}k-\textrm{i}\lambda }\right) =\frac{{\text{ d }}}{{\text{ d }}\lambda }\textrm{arg}[G(k,\textrm{i}\lambda )] \end{aligned}$$
(75)

The integral of (75) over \(\lambda \) is thus simply the change in the phase of the Green function between the end points of the integral. The phase angle is defined relative to the critical point in (75) that is at \(k,\lambda =0\) (see Fig. 11), where the Green function diverges, and is exactly where our mode crosses \(\lambda _{k}=0\)!

Integrating from \(\lambda =-\infty \) to \(+\infty \), this change of angle is \(\pm \pi \) depending on the sign of \(\alpha ^{-1}k\). Applying (58) we thus find the spectral asymmetry equals

$$\begin{aligned} \nu (k)=\textrm{Re}\int _{-\infty }^{\infty }\frac{{\text{ d }}\lambda }{2\pi }\,G(\textrm{i}\lambda )=\int _{-\infty }^{\infty }\frac{{\text{ d }}\lambda }{2\pi }\,\frac{{\text{ d }}}{{\text{ d }}\lambda }\textrm{arg}[G(k,\textrm{i}\lambda )]=\frac{1}{2}\textrm{sign}[\alpha ^{-1}k] \end{aligned}$$
(76)

Assuming \(\alpha >0\) and taking the difference between the spectral asymmetry at fixed values \(k=+K\) and \(k=-K\), the number of modes equals the winding number of the argument of the Green function

$$\begin{aligned} N&=\nu (+K)-\nu (-K)\nonumber \\&=\frac{1}{2\pi }\bigg [\textrm{arg}[G(K,\textrm{i}\infty )]-\textrm{arg}[G(K,-\textrm{i}\infty )]-\textrm{arg}[G(-K,\textrm{i}\infty )]+\textrm{arg}[G(-K,-\textrm{i}\infty )]\bigg ]\nonumber \\&=1 \end{aligned}$$
(77)

This is a simple case of the winding number given in Sec. 6: it is the topological invariant counting the critical points of the Green function, which here occur where the mode crosses \(\lambda _k=0\).
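The phase counting in (76) and (77) can be reproduced numerically. The sketch below is an illustration under our own assumptions: the \(\lambda \) axis is sampled on a finite grid through \(\lambda =\tan t\), so the phase change is only approximately \(\pm \pi \), and the function names and default parameters are hypothetical.

```python
import numpy as np

def green(k, lam, alpha=1.0):
    """Green function (74) for the single mode example."""
    return 1.0 / (k / alpha - lam)

def asymmetry_from_phase(k, alpha=1.0, n_points=20001):
    """Approximate (76): change in arg G(k, i*lam) along the lam axis, over 2*pi."""
    # lam = tan(t) with the end points dropped, so lam stays finite
    t = np.linspace(-np.pi / 2, np.pi / 2, n_points)[1:-1]
    lam = np.tan(t)
    # Unwrap removes artificial 2*pi jumps in the sampled phase
    phase = np.unwrap(np.angle(green(k, 1j * lam, alpha)))
    return (phase[-1] - phase[0]) / (2 * np.pi)

def mode_count(K, alpha=1.0):
    """Approximate (77): nu(+K) - nu(-K)."""
    return asymmetry_from_phase(K, alpha) - asymmetry_from_phase(-K, alpha)

print(round(mode_count(1.0, alpha=1.0)))
print(round(mode_count(1.0, alpha=-1.0)))
```

Reversing the sign of \(\alpha \) reverses the direction in which the mode crosses \(\lambda _k=0\), and the sign of the count changes accordingly.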

6.3 Example: A Lattice of Resonators

Fig. 12

In some cases wave equations can be approximated as a discrete system of coupled resonators. A periodic honeycomb lattice of resonators (lattice constant d, lattice vectors \(\varvec{a}_{1}\) and \(\varvec{a}_{2}\)) is drawn in panel (a) where there are two resonators, ‘a’ and ‘b’ per unit cell, separated by a distance \(d/\sqrt{3}\). In the Haldane model each resonator is coupled to both its three nearest neighbours (red lines) and its six next nearest neighbours (green dashed lines). The corresponding reciprocal space is sketched in panel (b), where the grey circles indicate multiples of the reciprocal lattice vectors \(\varvec{b}_{1}\) and \(\varvec{b}_{2}\). The first Brillouin zone is shaded in blue, with the dashed and solid boundaries of the same colour connected by the addition of a reciprocal lattice vector. The six corners of the Brillouin zone where the coupling \(g(\varvec{K})\) vanishes are indicated as blue dots

As a second example, we take an array of coupled resonators, as sketched in Fig. 12a. These ‘resonators’ are a discrete approximation to a continuous system. For instance, the amplitude of a single resonator can represent the pressure in an acoustically resonant hole; the electric polarization of a dielectric particle; the displacement of an elastic rod; or simply the extension of a spring.

As we are dealing with a discrete lattice of resonators, the continuum theory given in Sec. 6 does not obviously apply. Yet the reasoning can be simply adapted. For instance we can expand the continuous wave in an orthonormal basis of N functions, \(|\psi \rangle =a_0|0\rangle +a_1|1\rangle +\dots +a_{N-1}|N-1\rangle \), with the basis functions representing e.g. the modes of the lattice before a perturbation is applied, or the modes of individual resonators in the tight binding approximation [32]. Assuming the basis itself obeys \(\varvec{\nabla }_{\varvec{K}}\times \langle n|\varvec{\nabla }_{\varvec{K}}|m\rangle =0\), the Berry curvature of a given mode \(|\psi \rangle \) is given entirely in terms of the expansion coefficients, \(-\textrm{i}\varvec{\nabla }_{\varvec{K}}\times \langle \psi |\varvec{\nabla }_{\varvec{K}}|\psi \rangle =-\textrm{i}\varvec{\nabla }_{\varvec{K}}\times \sum _{n}a_{n}^{\star }\varvec{\nabla }_{\varvec{K}}a_{n}\). This is the expression we would obtain for the Berry curvature of an N component complex vector \(|\psi \rangle =(a_0,a_1,\dots a_{N-1})^\textrm{T}\) in a discrete system.

In our simplified model we assume a tight binding approximation, where each resonator only has one possible mode (i.e. frequencies are such that higher order modes do not contribute). Due to the periodicity of the lattice, the unit cell contains all the degrees of freedom. Therefore a unit cell containing a single resonator has only one effective degree of freedom and will therefore exhibit a dispersion relation \(\omega (\varvec{K})\) with only one branch (band). As the theory given in Sec. 6 depends on the closure of a gap between two or more bands, we consider a lattice with two resonators (labelled a and b) per unit cell.

Labelling the amplitudes of the two resonators at each point in the lattice as \(a_{n,m}\) and \(b_{n,m}\), the equations of motion can be written in the general form

$$\begin{aligned} \ddot{a}_{n,m}+\omega _a^2\,a_{n,m}&=\sum _{n',m'}\left[ \alpha _{n'-n,m'-m}b_{n',m'}+\beta _{n'-n,m'-m}a_{n',m'}\right] \nonumber \\ \ddot{b}_{n,m}+\omega _b^2\,b_{n,m}&=\sum _{n',m'}\left[ \alpha _{n-n',m-m'}a_{n',m'}+\gamma _{n'-n,m'-m}b_{n',m'}\right] \end{aligned}$$
(78)

where the ‘a’ and ‘b’ resonant frequencies are \(\omega _{a}\) and \(\omega _{b}\) respectively. The amplitude of the cross–coupling between resonators is given by \(\alpha _{n,m}\), and the coupling between like resonators is given by \(\beta _{n,m}\) and \(\gamma _{n,m}\). It is assumed that the self coupling between like resonators vanishes, \(\gamma _{0,0}=\beta _{0,0}=0\), as these terms are equivalent to a modification of the resonant frequencies \(\omega _{a,b}\). From here on we will work at a fixed frequency \(\omega \), where the coupling constants can take complex values.

We now compute the first Chern number for an infinite periodic system of resonators, writing the resonator amplitudes in accordance with Bloch’s theorem, and taking a fixed frequency of oscillation \(\omega \)

$$\begin{aligned} a_{n,m}=e^{\textrm{i}[\varvec{K}\cdot (n\varvec{a}_1+m\varvec{a}_{2})-\omega t]}a\nonumber \\ b_{n,m}=e^{\textrm{i}[\varvec{K}\cdot (n\varvec{a}_1+m\varvec{a}_{2})-\omega t]}b \end{aligned}$$
(79)

where \(\varvec{K}\) is the Bloch vector, and \(\varvec{a}_{1}\) and \(\varvec{a}_{2}\) are the real space lattice vectors (see Fig. 12). Substituting (79) into (78), the infinite set of equations of motion reduces to a set of two coupled linear equations

$$\begin{aligned}{}[\omega _a^2-f(\varvec{K})]\,a-g(\varvec{K})b&=\omega ^2 a\nonumber \\ [\omega _b^2-h(\varvec{K})]\,b-g^{\star }(\varvec{K})a&=\omega ^2 b \end{aligned}$$
(80)

where we have defined the three Bloch–vector dependent coupling functions

$$\begin{aligned} f(\varvec{K})&=\sum _{n',m'\ne 0,0}\beta _{n',m'}\textrm{e}^{\textrm{i}\varvec{K}\cdot (n'\varvec{a}_{1}+m'\varvec{a}_{2})}\nonumber \\ g(\varvec{K})&=\alpha _{0,0}+\sum _{n',m'\ne 0,0}\alpha _{n',m'}\textrm{e}^{\textrm{i}\varvec{K}\cdot (n'\varvec{a}_{1}+m'\varvec{a}_{2})}\nonumber \\ h(\varvec{K})&=\sum _{n',m'\ne 0,0}\gamma _{n',m'}\textrm{e}^{\textrm{i}\varvec{K}\cdot (n'\varvec{a}_{1}+m'\varvec{a}_{2})}. \end{aligned}$$
(81)

Equation (80) is almost in the form required by the theory described in Sec. 6. One wrinkle is that we do not yet satisfy the assumption that the eigenvalues of the operator \(\hat{L}\) are symmetrically distributed around zero (the trace of \(\hat{L}\) does not vanish). To avoid complicating the discussion, we assume the two resonators in the unit cell have the same resonant frequency, \(\omega _a=\omega _b\), and that their like–resonator coupling functions are equal and opposite, \(h(\varvec{K})=-f(\varvec{K})\). Subtracting \(\omega _a^2\) from both sides of (80) and writing \(\lambda =\omega ^{2}-\omega _{a}^{2}\) then gives us

$$\begin{aligned} \left( \begin{array}{cc}-f(\varvec{K})&{} -g(\varvec{K})\\ {-g}^{*}(\varvec{K})&{} f(\varvec{K})\end{array}\right) \left( \begin{array}{c}a\\ b\end{array}\right) =\lambda \left( \begin{array}{c}a\\ b\end{array}\right) \end{aligned}$$
(82)

The operator \(\hat{L}\) is now traceless and can be written in terms of Pauli matrices, as in the example application of the Chern number given in Sec. 5,

$$\begin{aligned} \hat{L}=-f(\varvec{K})\,\sigma _z-g_1(\varvec{K})\,\sigma _x+g_{2}(\varvec{K})\,\sigma _y \end{aligned}$$
(83)

where f is a real function, and \(g=g_1+\textrm{i}g_2\). The Chern number for the operator (83) records the same thing as in the example of Sec. 5, namely the number of times the vector

$$\begin{aligned} \varvec{n}=\frac{-g_1\varvec{e}_{x}+g_2\varvec{e}_{y}-f\varvec{e}_{z}}{\sqrt{f^2+g_1^2+g_2^2}} \end{aligned}$$
(84)

covers the unit sphere as \(\varvec{K}\) is varied over the first Brillouin zone, i.e. the winding of the Hamiltonian around the point \(f=g=0\). Comparison with the example of Sec. 5 shows that the two eigenstates of \(\hat{L}\) are given by (47) with eigenvalues \(\lambda =\omega ^2-\omega _a^2=\pm [f^2+g_1^2+g_2^2]^{1/2}\). As in that example, the eigenstates depend only on the spherical angles \(\theta \) and \(\phi \) of the vector \(\varvec{n}\), which are here identified as \(\cos (\theta )=-f/[f^2+g_1^2+g_2^2]^{1/2}\) and \(\textrm{exp}(\textrm{i}\phi )=(-g_1+\textrm{i}g_2)/[g_1^2+g_2^2]^{1/2}\). The Berry connection is also given in Sec. 5, by (48).
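A quick numerical check of the operator (82) and its Pauli matrix form (83) is sketched below. The values of f and g are arbitrary illustrative numbers, and the function names are our own; the check confirms that the two forms agree and that the eigenvalues are plus and minus the length of the vector \((f,g_1,g_2)\).

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)

def operator_matrix(f, g):
    """The 2x2 operator written out as in (82)."""
    return np.array([[-f, -g], [-np.conj(g), f]], dtype=complex)

def operator_pauli(f, g):
    """The same operator in the Pauli matrix form (83), with g = g1 + i*g2."""
    return -f * SIGMA_Z - g.real * SIGMA_X + g.imag * SIGMA_Y

# Arbitrary illustrative values (assumptions, not taken from the text)
f_example = 0.3
g_example = 0.4 - 1.2j

eigenvalues = np.linalg.eigvalsh(operator_matrix(f_example, g_example))
length = np.sqrt(f_example**2 + abs(g_example)**2)
print(eigenvalues, length)
```

For these values \(f^2+g_1^2+g_2^2=1.69\), so the two eigenvalues are \(\pm 1.3\).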

From the example of Sec. 5.1, we know that the points of vanishing cross coupling between the two resonators in the unit cell, \(g(\varvec{K})=0\), are the critical points of the Berry connection. These correspond to the North or South pole of the unit sphere defined by (84), depending on the sign of f. We therefore see that the Chern number depends in an important way on the coupling function f between like resonators in each unit cell. If, for example, f is always positive then \(\varvec{n}\) will only ever explore the lower half of the unit sphere and the Chern number will therefore always be zero. Between such lattices of resonators there can never be the non–trivial interface states predicted above.

Fig. 13

Coupling functions and Berry curvature for the honeycomb lattice shown in Fig. 12, for parameters \(\beta =0.01\) and \(\alpha =1\). The hexagon indicated in all four plots shows the boundary of the first Brillouin zone. Panel (a) shows a phase plot of the complex coupling \(g(\varvec{K})\) between ‘a’ and ‘b’ resonators (color indicates phase and saturation, magnitude). Panel (b) shows the real valued coupling \(f(\varvec{K})\) between next–nearest–neighbour resonators. Panel (c) shows the Berry curvature, evaluated within the first Brillouin zone. Panel (d) indicates the ‘folding’ rules for mapping the first Brillouin zone onto a torus, with the total Berry curvature around each corner of the zone indicated

To ensure that the Chern number does not vanish, we take an approach that is close to the famous example known as the ‘Haldane model’, developed by F. Duncan Haldane in the late 1980s [8]. We consider the special case of a lattice with hexagonal symmetry and lattice constant d (see Footnote 7). Taking two identical resonators per unit cell, we construct a honeycomb lattice where each ‘a’ resonator is separated by \(d/\sqrt{3}\) from three nearest neighbour ‘b’ resonators, as shown in Fig. 12a. Assuming only nearest neighbour coupling of equal strength between the a and b resonators, \(\alpha _{0,0}=\alpha _{1,0}=\alpha _{0,1}=\alpha \,\textrm{exp}(-\textrm{i}K_y d/\sqrt{3})\), the coupling function \(g(\varvec{K})\) reduces to

$$\begin{aligned} g(\varvec{K})&=\alpha \,\textrm{e}^{-\textrm{i}K_yd/\sqrt{3}}\left[ 1+\textrm{e}^{\textrm{i}\varvec{K}\cdot \varvec{a}_{1}}+\textrm{e}^{\textrm{i}\varvec{K}\cdot \varvec{a}_{2}}\right] \nonumber \\&=\alpha \,\textrm{e}^{-\textrm{i}K_yd/\sqrt{3}}\left[ 1+2\cos (K_x d/2)\textrm{e}^{\frac{\textrm{i}\sqrt{3}d}{2}K_y}\right] \end{aligned}$$
(85)

This quantity vanishes at the 6 corner points on the boundary of the first Brillouin zone (see Fig. 13a): when the \(K_x\) component of the Bloch vector equals \(\pm 2\pi /3d\), or \(\pm 4\pi /3d\), with the \(K_y\) component equal to \(\pm 2\pi /\sqrt{3}d\), or 0 respectively. These 6 points can be grouped into two lots of three, where one group can be obtained from the other via the substitution \(\varvec{K}\rightarrow -\varvec{K}\), equivalent to a complex conjugation of (85). Being equivalent to a complex conjugation, the phase of \(g(\varvec{K})\) winds in an opposite sense around these two sets of points. Therefore, if \(f(\varvec{K})=f(-\varvec{K})\) (as it does if e.g. there is no coupling beyond nearest neighbour, \(f=0\)), the winding number around these critical points will cancel, yielding a Chern number of zero!
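The six zeros of the coupling function (85) can be confirmed directly. The following sketch evaluates \(g(\varvec{K})\) at the corner points listed above; the default values \(\alpha =1\) and \(d=1\), the sample Bloch vector, and the function names are illustrative assumptions.

```python
import numpy as np

def g_coupling(kx, ky, alpha=1.0, d=1.0):
    """Nearest neighbour coupling function, second line of (85)."""
    prefactor = alpha * np.exp(-1j * ky * d / np.sqrt(3))
    bracket = 1 + 2 * np.cos(kx * d / 2) * np.exp(1j * np.sqrt(3) * d * ky / 2)
    return prefactor * bracket

def zone_corners(d=1.0):
    """The six Brillouin zone corners quoted in the text."""
    corners = [(sx * 4 * np.pi / (3 * d), 0.0) for sx in (1, -1)]
    corners += [(sx * 2 * np.pi / (3 * d), sy * 2 * np.pi / (np.sqrt(3) * d))
                for sx in (1, -1) for sy in (1, -1)]
    return corners

for kx, ky in zone_corners():
    print(f"K = ({kx:+.4f}, {ky:+.4f})  |g| = {abs(g_coupling(kx, ky)):.1e}")

# Reversing K is equivalent to complex conjugation of g (for real alpha)
kx0, ky0 = 0.7, -0.2  # arbitrary sample Bloch vector
print(np.isclose(g_coupling(-kx0, -ky0), np.conj(g_coupling(kx0, ky0))))
```

The final line checks the conjugation property used in the argument above, which is what makes the phase of \(g(\varvec{K})\) wind in opposite senses around the two groups of corners.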

To obtain a non–zero Chern number, we introduce a complex next–nearest neighbour coupling that breaks time reversal symmetry. We suppose that each resonator couples to its six equivalents at a distance d away with strength \(\beta _{1,0}=-\beta _{-1,0}=\textrm{i}\beta \), \(\beta _{0,1}=-\beta _{0,-1}=-\textrm{i}\beta \), and \(\beta _{1,-1}=-\beta _{-1,1}=-\textrm{i}\beta \), giving

$$\begin{aligned} f(\varvec{K})&=-\textrm{i}\beta \left[ -\textrm{e}^{\textrm{i}\varvec{K}\cdot \varvec{a}_{1}}+\textrm{e}^{-\textrm{i}\varvec{K}\cdot \varvec{a}_{1}}+\textrm{e}^{\textrm{i}\varvec{K}\cdot \varvec{a}_{2}}-\textrm{e}^{-\textrm{i}\varvec{K}\cdot \varvec{a}_{2}}+\textrm{e}^{\textrm{i}\varvec{K}\cdot (\varvec{a}_{1}-\varvec{a}_{2})}-\textrm{e}^{-\textrm{i}\varvec{K}\cdot (\varvec{a}_{1}-\varvec{a}_{2})}\right] \nonumber \\&=2\beta \left[ \sin (K_x d)-\sin ((K_x+\sqrt{3}K_y)d/2)-\sin ((K_x-\sqrt{3}K_y)d/2)\right] \end{aligned}$$
(86)

which is an odd function of \(\varvec{K}\), meaning—via the argument above—that the total Berry curvature does not vanish.
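As a quick check of (86), the sketch below evaluates both lines of the equation at an arbitrary Bloch vector and confirms that they agree and that \(f(-\varvec{K})=-f(\varvec{K})\). It assumes the same lattice vectors as inferred from (85), \(\varvec{a}_{1}=(d/2,\sqrt{3}d/2)\) and \(\varvec{a}_{2}=(-d/2,\sqrt{3}d/2)\), with illustrative values \(d=\beta =1\). All names are hypothetical.

```python
import cmath
import math

D = 1.0                                  # lattice constant d (illustrative)
A1 = (D / 2, math.sqrt(3) * D / 2)       # assumed, consistent with Eq. (85)
A2 = (-D / 2, math.sqrt(3) * D / 2)      # assumed, consistent with Eq. (85)

def dot(K, a):
    return K[0] * a[0] + K[1] * a[1]

def phase(K, a, sign):
    """exp(sign * i K.a)"""
    return cmath.exp(sign * 1j * dot(K, a))

def f_exponential(K, beta=1.0):
    """First line of Eq. (86), written with complex exponentials."""
    a12 = (A1[0] - A2[0], A1[1] - A2[1])
    bracket = (-phase(K, A1, 1) + phase(K, A1, -1)
               + phase(K, A2, 1) - phase(K, A2, -1)
               + phase(K, a12, 1) - phase(K, a12, -1))
    return -1j * beta * bracket

def f_sines(K, beta=1.0):
    """Second line of Eq. (86), written with sines."""
    Kx, Ky = K
    s3 = math.sqrt(3)
    return 2 * beta * (math.sin(Kx * D)
                       - math.sin((Kx + s3 * Ky) * D / 2)
                       - math.sin((Kx - s3 * Ky) * D / 2))

K = (0.37, -1.12)                        # arbitrary test point
minus_K = (-K[0], -K[1])
forms_agree = abs(f_exponential(K) - f_sines(K)) < 1e-12
is_odd = abs(f_sines(minus_K) + f_sines(K)) < 1e-12
print(forms_agree, is_odd)
```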

Figure 13 shows the Berry curvature in the first Brillouin zone for the coupling functions (85) and (86) (positive \(\alpha \) and \(\beta \)), which now corresponds to a Chern number of \(+1\). From the relationship between interface states and Chern numbers derived above we can thus see that there will always be an interface state trapped between a lattice with non–zero next–nearest neighbour coupling (86), and one with \(f(\varvec{K})=0\), where the Chern number vanishes.

6.4 Example: Electromagnetic Waves in a Gyrotropic Medium

As a final example we consider a continuous system where the eigenvalue \(\lambda \), appearing in Sec. 6, is a material parameter rather than frequency. This example is based on the results in [29].

Take an electromagnetic material where the relative magnetic permeability \(\mu \) is a real scalar, and the permittivity \(\varvec{\epsilon }\) is a tensor of the form

$$\begin{aligned} \varvec{\epsilon }=\left( \begin{array}{cc}\varvec{\epsilon }_{\parallel }&{}\varvec{0}\\ \varvec{0}&{}\epsilon _{\perp }\end{array}\right) \end{aligned}$$
(87)

where \(\varvec{\epsilon }_{\parallel }\) is a \(2\times 2\) matrix representing the anisotropic permittivity in the xy plane of propagation. Taking propagation in the xy plane, and assuming TM polarization with \(\varvec{H}=(h/\eta _0)\varvec{e}_{z}\) and \(\eta _0=\sqrt{\mu _0/\epsilon _0}\), Maxwell’s equations take the form

$$\begin{aligned} -\textrm{i}\varvec{\nabla }\times \varvec{E}&=k_0\mu h\varvec{e}_{z}\nonumber \\ \textrm{i}\varvec{\nabla }h\times \varvec{e}_{z}&=k_0\varvec{\epsilon }_{\parallel }\cdot \varvec{E}. \end{aligned}$$
(88)

To simplify the discussion, consider media where the two diagonal elements of \(\varvec{\epsilon }_{\parallel }\) are equal to each other and also equal to the scalar permeability \(\mu \). This allows us to write \(\varvec{\epsilon }_{\parallel }=\varvec{\epsilon }'_{\parallel }+\varvec{1}\lambda \) and \(\mu =\lambda \), where \(\varvec{\epsilon }_{\parallel }'\) has zeros on the diagonal. When \(\varvec{\epsilon }_{\parallel }'=0\), the material is impedance matched (\(\mu =\epsilon =\lambda \)), and the field behaves as if the distance has been rescaled by a factor of \(\lambda \). This is a simple example of ‘transformation optics’ [11, 33]. When the impedance matching condition is satisfied, there is a propagating wave with some wave number \(|\varvec{k}|\) for every value of \(\lambda \), with positive \(\lambda \) corresponding to positive index media, and negative \(\lambda \) negative index media. There is thus no ‘gap’ in the \(\lambda \) spectrum. Through introducing a gyrotropy [27] parameterized by \(\alpha \)

$$\begin{aligned} \varvec{\epsilon }_{\parallel }'=\left( \begin{matrix}0&{}-\textrm{i}\alpha (x)\\ \textrm{i}\alpha (x)&{}0\end{matrix}\right) \end{aligned}$$
(89)

we break time reversal symmetry and open up a ‘gap’, where a range of \(\lambda \) values correspond to materials where no wave can propagate. Here we assume that \(\alpha \) does not depend on the wave–vector \(\varvec{k}\) of the electromagnetic field as it does in chiral media, thus the material breaks time reversal symmetry. A similar analysis to that given below can be carried out for a chiral medium where \(\alpha \) depends on the out–of–plane propagation constant, which is described in the next section. We now use the mode counting argument given above to count the number of interface modes that cross this spectral gap in an inhomogeneous material.

It is assumed that the gyrotropy, \(\alpha \) varies with position, as discussed in the theory given above. Maxwell’s equations can then be written as an eigenvalue problem equivalent to that given in (54) at the beginning of the theory of interface modes given in Sec. 6

$$\begin{aligned} \left( \begin{array}{ccc}0&{} \textrm{i}\alpha \left( {x}_{1}\right) &{} {\textrm{i}\partial }_{2}\\ -\textrm{i}\alpha \left( {x}_{1}\right) &{} 0&{} {-\textrm{i}\partial }_{1}\\ {\textrm{i}\partial }_{2}&{} -{\textrm{i}\partial }_{1}&{} 0\end{array}\right) \left( \begin{array}{c}{E}_{x}\\ {E}_{y}\\ h\end{array}\right) =\lambda \left( \begin{array}{c}{E}_{x}\\ {E}_{y}\\ h\end{array}\right) \end{aligned}$$
(90)

where we used the dimensionless coordinates \((x_1,x_2)=k_0 (x,y)\). At large |x|, where the material becomes homogeneous, the operator \(\hat{L}\) can be written in Fourier space as

$$\begin{aligned} L=\left( \begin{array}{ccc}0&{} \textrm{i}\alpha &{} -{k}_{2}\\ -\textrm{i}\alpha &{} 0&{} {k}_{1}\\ -{k}_{2}&{} {k}_{1}&{} 0\end{array}\right) \end{aligned}$$
(91)

which has vanishing trace, as assumed in Sec. 6.

Fig. 14

The material parameters \(\lambda =\epsilon =\mu \), calculated as a function of the wavenumber \(k=(k_1^2+k_2^2)^{1/2}\) and fixed gyrotropy \(\alpha =1\). There are three modes with \(\lambda =\pm (k^2+\alpha ^2)^{1/2}\) and \(\lambda =0\). A non–zero value of the gyrotropy opens up a ‘gap’ in the spectrum: a range of \(\lambda \) between positive and negative index materials, where there are no propagating solutions

Finding the eigenvalue of (91) is to ask the question “what is the material parameter \(\lambda \) for a medium with gyrotropy \(\alpha \) and a wave with wave–vector \(\varvec{k}=(k_1,k_2)\)?”. From the eigenvectors and eigenvalues of this operator we can now compute the Chern number of the lower propagation branch shown in Fig. 14. The eigenvectors and eigenvalues are

$$\begin{aligned} \lambda =\pm \sqrt{k^2+\alpha ^2},\qquad |\psi _{\pm }\rangle =\frac{1}{\sqrt{2}k\sqrt{k^2+\alpha ^2}}\left( \begin{matrix}-\lambda k_2-\textrm{i}\alpha k_1\\ \lambda k_1-\textrm{i}\alpha k_2\\ k^2\end{matrix}\right) \end{aligned}$$
(92)

and

$$\begin{aligned} \lambda =0,\qquad |\psi _0\rangle =\frac{1}{\sqrt{k^2+\alpha ^2}}\left( \begin{matrix}k_1\\ k_2\\ \textrm{i}\alpha \end{matrix}\right) \end{aligned}$$
(93)

where \(k^2=k_1^2+k_2^2\)Footnote 8.
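The eigenvalues quoted above can be checked directly against the operator (91). The sketch below is one way to do this without any linear algebra library: it evaluates the characteristic polynomial \(\textrm{det}(L-\lambda \varvec{1})\) at the three claimed eigenvalues, and applies \(L\) to the unnormalised \(\lambda =0\) vector of (93). The values of \(k_1\), \(k_2\) and \(\alpha \) are arbitrary illustrative choices, and the helper names are ours.

```python
import math

def L_matrix(k1, k2, alpha):
    """Fourier-space operator of Eq. (91)."""
    return [[0, 1j * alpha, -k2],
            [-1j * alpha, 0, k1],
            [-k2, k1, 0]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def char_poly(lam, k1, k2, alpha):
    """det(L - lam * 1); this vanishes exactly when lam is an eigenvalue."""
    m = L_matrix(k1, k2, alpha)
    shifted = [[m[i][j] - (lam if i == j else 0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)

k1, k2, alpha = 0.6, -0.8, 1.0           # illustrative values
root = math.sqrt(k1**2 + k2**2 + alpha**2)
residuals = [abs(char_poly(lam, k1, k2, alpha)) for lam in (-root, 0.0, root)]

# The lambda = 0 vector of Eq. (93), left unnormalised.
psi0 = [k1, k2, 1j * alpha]
L = L_matrix(k1, k2, alpha)
L_psi0 = [sum(L[i][j] * psi0[j] for j in range(3)) for i in range(3)]
print(residuals, L_psi0)
```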

We now count the interface modes that arise when the gyrotropy changes sign from negative to positive. The final result of Sec. 6 says that the number of these modes is given by the difference in the Chern numbers for the homogeneous media either side of the interface. To calculate this we use the Berry connection for the state \(|\psi _{-}\rangle \), wrapping the infinite k–space onto the sphere using the stereographic projection (see the schematic in Fig. 8a)

$$\begin{aligned} \mathcal {K}=k_x+\textrm{i}k_y=k\,\textrm{e}^{\textrm{i}\phi }=\frac{\sin (\theta )\,\textrm{e}^{\textrm{i}\phi }}{1-\cos (\theta )}=\cot (\theta /2)\,\textrm{e}^{\textrm{i}\phi }. \end{aligned}$$
(94)

so that on the sphere the state \(|\psi _{-}\rangle \) takes the form

$$\begin{aligned} |\psi _{-}\rangle =\frac{1}{\sqrt{2}\sqrt{\cot ^{2}(\theta /2)+\alpha ^2}}\left( \begin{array}{c}\sqrt{\cot ^2(\theta /2)+\alpha ^2}\sin (\phi )-\textrm{i}\alpha \cos (\phi )\\ -\sqrt{\cot ^2(\theta /2)+\alpha ^2}\cos (\phi )-\textrm{i}\alpha \sin (\phi )\\ \cot (\theta /2)\end{array}\right) \end{aligned}$$
(95)

This state is undefined at both North (\(\theta =0\)) and South (\(\theta =\pi \)) poles. However, at the North pole the vector is purely real, a defect that is associated with zero Berry curvature (31). The Berry connection \(A_j\) computed from (95) has a single relevant component

$$\begin{aligned} A_{\phi }=-\textrm{i}\langle \psi _{-}|\frac{\partial }{\partial \phi }|\psi _{-}\rangle =\frac{\alpha }{\sqrt{\cot ^2(\theta /2)+\alpha ^2}} \end{aligned}$$
(96)

which vanishes at the North pole, and has a critical point at the South pole. Integrating (96) around the critical point at the South pole \(\theta =\pi \) yields a Chern number of \(\pm 1\),

$$\begin{aligned} \textrm{Ch}_1=-\frac{1}{2\pi }\int _{0}^{2\pi }A_{\phi }\,{\text{ d }}\phi =-\textrm{sign}[\alpha ]. \end{aligned}$$
(97)
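The result (97) can be reproduced numerically. The following sketch takes the lower band state exactly as written in (92), with \(k_1=k\cos (\phi )\) and \(k_2=k\sin (\phi )\), computes \(A_{\phi }=-\textrm{i}\langle \psi _{-}|\partial _{\phi }|\psi _{-}\rangle \) with a central finite difference, and integrates it around a small circle about \(k=0\) (the South pole of the sphere). The radius, step size, and number of samples are illustrative assumptions, as are the function names.

```python
import math

def psi_minus(k, phi, alpha):
    """Lower-band state of Eq. (92), with k1 = k cos(phi), k2 = k sin(phi)."""
    k1, k2 = k * math.cos(phi), k * math.sin(phi)
    lam = -math.sqrt(k**2 + alpha**2)
    norm = math.sqrt(2) * k * math.sqrt(k**2 + alpha**2)
    return [(-lam * k2 - 1j * alpha * k1) / norm,
            (lam * k1 - 1j * alpha * k2) / norm,
            complex(k**2 / norm)]

def berry_connection_phi(k, phi, alpha, step=1e-6):
    """A_phi = -i <psi| d/dphi |psi>, via a central finite difference."""
    psi = psi_minus(k, phi, alpha)
    ahead = psi_minus(k, phi + step, alpha)
    behind = psi_minus(k, phi - step, alpha)
    dpsi = [(a - b) / (2 * step) for a, b in zip(ahead, behind)]
    overlap = sum(p.conjugate() * d for p, d in zip(psi, dpsi))
    return (-1j * overlap).real

def chern_number(alpha, k=1e-3, samples=200):
    """-(1 / 2 pi) times the loop integral of A_phi on a circle of radius k."""
    dphi = 2 * math.pi / samples
    total = sum(berry_connection_phi(k, n * dphi, alpha) * dphi
                for n in range(samples))
    return -total / (2 * math.pi)

print(chern_number(+1.0), chern_number(-1.0))
```

Away from \(k=0\) the same routine reproduces the closed form (96), \(A_{\phi }=\alpha /\sqrt{k^2+\alpha ^2}\), which tends to zero as \(k\rightarrow \infty \) (the North pole).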

Thus from (72) the number of interface modes N supported at an interface between media where the gyrotropy changes from positive to negative sign is

$$\begin{aligned} N=(+1)-(-1)=2. \end{aligned}$$
(98)

This pair of interface modes can be understood as those with vanishing tangential electric and magnetic field at the point where the gyrotropy changes sign on the interface, i.e. the interface modes associated with a perfect electric or magnetic conductor placed at this point in the graded material. An examination of this pair of modes has revealed some of the subtleties in the application of the theory of Sec. 6 to continuous media. The reader is encouraged to consult [34] and [35] for more details.

7 One–way Propagation and the Refractive Index

The application of topology to predict interface modes reveals two remarkable things. Firstly that the abstract mathematics of topology has a rather direct and powerful application to the design of materials. Secondly that there exist interface modes that can propagate in only one direction (see, for example Fig. 15). This is unusual behaviour for a wave, to say the least. It means that whatever you put in the way of such an interface mode (a mirror, chocolate, or an elephant), there is simply no possibility for it to reflect. This is not entirely true for the example of Fig. 15, which has a fixed polarization and can thus be reflected by any polarization converting object, but let’s not let that discourage us.

Fig. 15

One–way propagating electromagnetic interface waves, from a source in a medium with graded gyrotropy, as described in the final example of Sec. 6 (simulated using the program given in Appendix B). Panel (a) shows a phase plot of the out of plane magnetic field for the profile of gyrotropy given in (b), with constant diagonal part of the permittivity \(\epsilon =\lambda =2.5+0.001\textrm{i}\). Away from the interface the value of the gyrotropy exceeds the permittivity and we are in the gap indicated in Fig. 14. A close examination of the interface mode shows that it exhibits a beating as it propagates. This is the interference between the two interface modes predicted in (98)

One problem with these topological arguments is that they do not give us an explanation for why there are such interface modes. All we have to go on is an integer that took a long time to calculate. What is it about these particular materials that forces the wave to propagate in only one direction? In this section we more fully explore the final example of Sec. 6, using the concept of the refractive index rather than topology. This is based on the findings of [36], and we’ll find that the refractive index concept gives us a different, but complementary way to understand one–way propagation.

The starting point is the Berry connection (96) for a gyrotropic medium. As we established in the previous section, the Chern number (97), \(\textrm{Ch}_{1}=-\textrm{sign}[\alpha ]\) records the single critical point in the Berry connection, which is at the South pole (\(\theta =\pi \)) of the sphere onto which k–space has been stereographically projected. As shown in Fig. 8, the South pole of the sphere is the origin of k–space. So what is happening at this critical point in the Berry connection?

To answer this, let’s return to Maxwell’s Eqns. (88), setting \(\mu =\lambda \). We use the same set of material parameters as we did in our earlier discussion, \(\varvec{\epsilon }_{\parallel }=\lambda \varvec{1}+\textrm{i}\alpha \varvec{e}_{z}\times \). Substituting this in our earlier form of Maxwell’s equations (88), the gradient of the out of plane magnetic field is governed by

$$\begin{aligned} \varvec{\nabla }h=-\textrm{i}k_0(\lambda \varvec{e}_{z}\times \varvec{E}-\textrm{i}\alpha \varvec{E}) \end{aligned}$$
(99)

As we saw earlier in Sec. 4, the complex vectors \(\varvec{e}_{\pm }=(\varvec{e}_{x}\pm \textrm{i}\varvec{e}_{y})/\sqrt{2}\) are eigenvectors of the cross product \(\varvec{e}_{z}\times \varvec{e}_{\pm }=\mp \textrm{i}\varvec{e}_{\pm }\). Therefore, taking the inner product of (99) with \(\varvec{e}_{+}\) simplifies the equation to

$$\begin{aligned} \varvec{e}_{+}\cdot \varvec{\nabla }h=\frac{1}{\sqrt{2}}\left( \frac{\partial h}{\partial x}+\textrm{i}\frac{\partial h}{\partial y}\right) =k_0\left( \lambda -\alpha \right) \varvec{e}_{+}\cdot \varvec{E} \end{aligned}$$
(100)

and with \(\varvec{e}_{-}\) it simplifies to

$$\begin{aligned} \varvec{e}_{-}\cdot \varvec{\nabla }h=\frac{1}{\sqrt{2}}\left( \frac{\partial h}{\partial x}-\textrm{i}\frac{\partial h}{\partial y}\right) =-k_0\left( \lambda +\alpha \right) \varvec{e}_{-}\cdot \varvec{E} \end{aligned}$$
(101)
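The projections (100) and (101) follow from (99) by a few lines of algebra, which can be checked numerically. The sketch below evaluates the right hand side of (99) for an arbitrary complex in–plane field \(\varvec{E}\), takes the plain (unconjugated) dot product with \(\varvec{e}_{\pm }\), and compares with the right hand sides of (100) and (101). The numerical values of \(\varvec{E}\), \(\lambda \), \(\alpha \) and \(k_0\) are arbitrary assumptions chosen for illustration.

```python
import math

def grad_h(E, lam, alpha, k0=1.0):
    """Right-hand side of Eq. (99): -i k0 (lam e_z x E - i alpha E)."""
    Ex, Ey = E
    ez_cross_E = (-Ey, Ex)               # in-plane part of e_z x E
    return tuple(-1j * k0 * (lam * c - 1j * alpha * e)
                 for c, e in zip(ez_cross_E, E))

def dot(u, v):
    """Plain (unconjugated) dot product, as used in Eqs. (100) and (101)."""
    return u[0] * v[0] + u[1] * v[1]

S = 1 / math.sqrt(2)
E_PLUS = (S, 1j * S)                     # e_+ = (e_x + i e_y) / sqrt(2)
E_MINUS = (S, -1j * S)                   # e_- = (e_x - i e_y) / sqrt(2)

E = (0.4 - 0.9j, 1.3 + 0.2j)             # arbitrary in-plane electric field
lam, alpha, k0 = 0.7, 1.9, 1.0

lhs_plus = dot(E_PLUS, grad_h(E, lam, alpha, k0))
rhs_plus = k0 * (lam - alpha) * dot(E_PLUS, E)        # Eq. (100)
lhs_minus = dot(E_MINUS, grad_h(E, lam, alpha, k0))
rhs_minus = -k0 * (lam + alpha) * dot(E_MINUS, E)     # Eq. (101)
print(abs(lhs_plus - rhs_plus), abs(lhs_minus - rhs_minus))
```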

Equations (100) and (101) are important. The critical point of the Berry connection is at the point \(k=0\), which from the dispersion plot given in Fig. 14, corresponds to the material parameters \(\lambda =\pm \alpha \) (depending on whether we are computing the Chern number of the upper, or lower band of propagation, respectively). From Eqns. (100) and (101) we can see that at these critical points the magnetic field obeys the equation

$$\begin{aligned} \frac{\partial h}{\partial x}\pm \textrm{i}\frac{\partial h}{\partial y}=0\qquad (\lambda =\pm \alpha ) \end{aligned}$$
(102)

These are the Cauchy–Riemann equations from complex analysis (see e.g. [37])! They are fulfilled by analytic functions of either \(\mathcal {Z}=x+\textrm{i}y\), in the case \(\lambda =\alpha \), or \(\mathcal {Z}^{\star }=x-\textrm{i}y\), when \(\lambda =-\alpha \). Therefore the critical points of the Berry connection in a gyrotropic medium (which are, of course, the reason the Chern number is non–zero) correspond to those points where the wave is an analytic function of position. At these points the wave depends solely on either the complex number \(\mathcal {Z}\), or on \(\mathcal {Z}^{\star }\), depending on the band of interest and the sign of the gyrotropy.

The physical significance of the wave becoming an analytic function is clear if we consider a Taylor expansion of the out of plane magnetic field, h around some point \(\mathcal {Z}_0=x_0+\textrm{i}y_0\) in the plane. Assuming h is a function of \(\mathcal {Z}\), and expressing the complex number in terms of polar coordinates \((r,\theta )\) centred at the point of expansion

$$\begin{aligned} h(x+\textrm{i}y)=\sum _{n=0}^{\infty }h_{n}\,(\mathcal {Z}-\mathcal {Z}_0)^n=\sum _{n=0}^{\infty }h_{n}\,r^n\,\textrm{e}^{\textrm{i}n\theta }. \end{aligned}$$
(103)

This expansion in powers of \(\textrm{exp}(\textrm{i}\theta )\) is equivalent to expanding the wave in terms of its component angular momenta. Noting that the terms in the series each evolve in time as \(\textrm{exp}(\textrm{i}(n\theta -\omega t))\), we see that each term rotates with a fixed angular velocity \(\dot{\theta }=\omega /n\). As the field must be everywhere finite (assuming the material is simply connected), n is never negative in the series (103). Therefore a wave that is given as an analytic function of position rotates in only one sense; anti–clockwise in the case of (103), and as shown in Fig. 16b.

To emphasize the point, compare this to the expansion of a generic function of x and y,

$$\begin{aligned} h(x,y)=\sum _{n=0}^{\infty }\sum _{m=0}^{\infty }h_{n,m}x^{n}y^{m}=\sum _{n=0}^{\infty }\sum _{m=0}^{\infty }h_{n,m}r^{n+m}\left( \frac{\textrm{e}^{\textrm{i}\theta }+\textrm{e}^{-\textrm{i}\theta }}{2}\right) ^{n}\left( \frac{\textrm{e}^{\textrm{i}\theta }-\textrm{e}^{-\textrm{i}\theta }}{2\textrm{i}}\right) ^{m} \end{aligned}$$
(104)

which—as well as containing two summation indices rather than one—contains both positive and negative powers of \(\textrm{exp}(\textrm{i}\theta )\), meaning that there are component waves that can rotate in both senses around the origin. This is illustrated in Fig. 16a. The critical point of the Berry connection calculated in (96) therefore corresponds to a set of material parameters where the wave can only circulate one way.
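The one–sided circulation of an analytic function can be seen by counting how many times its phase winds around each zero. The sketch below does this for a polynomial \(\Pi _{n}(\mathcal {Z}-z_{n})\), and for the same polynomial evaluated at \(\mathcal {Z}^{\star }\), by accumulating the phase change around a small circle centred on each zero. The four zeros, the circle radius, and the number of samples are illustrative assumptions, and the function names are hypothetical.

```python
import cmath
import math

# Illustrative, well separated zeros z_n.
ZEROS = [0.3 + 0.1j, -0.8 + 0.5j, 0.2 - 0.9j, -0.4 - 0.3j]

def analytic_field(z):
    """h(Z) = prod_n (Z - z_n), a function of Z = x + i y alone."""
    value = 1 + 0j
    for zn in ZEROS:
        value *= (z - zn)
    return value

def conjugate_field(z):
    """The same polynomial evaluated at Z* = x - i y."""
    return analytic_field(z.conjugate())

def winding(field, centre, radius=0.05, samples=400):
    """Net number of anti-clockwise 2 pi phase turns on a small circle."""
    total = 0.0
    previous = field(centre + radius)
    for n in range(1, samples + 1):
        point = centre + radius * cmath.exp(2j * math.pi * n / samples)
        current = field(point)
        total += cmath.phase(current / previous)   # small phase increment
        previous = current
    return round(total / (2 * math.pi))

analytic_windings = [winding(analytic_field, zn) for zn in ZEROS]
conjugate_windings = [winding(conjugate_field, zn.conjugate()) for zn in ZEROS]
print(analytic_windings, conjugate_windings)
```

Every zero of the function of \(\mathcal {Z}\) carries the same sense of circulation, and every zero of the function of \(\mathcal {Z}^{\star }\) carries the opposite one, whereas a generic function of x and y has no such restriction.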

Fig. 16

Difference between (a) a random superposition of plane waves of wavenumber \(k_0\), and (b) an analytic function of \(\mathcal {Z}=x+\textrm{i}y\). Arrows indicate the direction in which the phase increases around the zeros. In panel (a) we plot the complex function obtained through adding together 8 plane waves of random complex amplitude, propagating at angles \(\{0,\pi /4,\pi /2,3\pi /4,\dots \}\). In panel (b) we plot the function \(\Pi _{n}(z-z_{n})\) for 8 randomly generated complex numbers \(z_n\). While the sum of waves generates zeros around which the phase circulates in either sense, the analytic function always exhibits circulation in an anti–clockwise sense

7.1 Critical Points and the Refractive Index

We have just established that in a gyrotropic medium, the points where \(\lambda =\pm \alpha \) are where the wave behaves as an analytic function of position. As the length of the wave–vector also vanishes at this point, \(k=0\), it is reminiscent of a point of vanishing refractive index. Indeed, the behaviour of the field can be connected to the study of wave propagation in anisotropic materials, and the critical points of the Berry connection can be understood as an unusual kind of point of vanishing refractive index. We shall show that this finding can be used as a shortcut to materials where there are one–way interface states.

To simplify the discussion we consider the propagation of a transverse magnetic field \(\varvec{H}=H\varvec{e}_{z}\) in the xy plane of a homogeneous non–magnetic (\(\mu =1\)) material. Combining the two Maxwell equations given by (88), the electric field can be eliminated, leaving a second order equation for the out of plane magnetic field amplitude h

$$\begin{aligned} \varvec{\nabla }\times \left( \varvec{\epsilon }^{-1}\cdot \varvec{\nabla }h\times \varvec{e}_z\right) =k_0^2 h\varvec{e}_z. \end{aligned}$$
(105)

As the material is homogeneous, we can write the magnetic field in the form \(h=\varvec{e}_{z}\,h_0\,\textrm{exp}(\textrm{i}\,k\,\varvec{n}\cdot \varvec{x})\), where \(\varvec{n}(\theta )=\cos (\theta )\varvec{e}_{x}+\sin (\theta )\varvec{e}_{y}\) and \(k=(k_x^2+k_y^2)^{1/2}\), thus eliminating the derivatives from (105). Assuming a material of the same form as in (87), where the only off–diagonal elements are \(\epsilon _{xy}\) and \(\epsilon _{yx}\), (105) becomes an equation for the \(\theta \) dependent refractive index \(\textrm{n}=k/k_0\)

$$\begin{aligned} (\varvec{n}\times \varvec{e}_z)\cdot \varvec{\epsilon }^{-1}_{\parallel }\cdot (\varvec{n}\times \varvec{e}_z)=\left( \frac{k_0}{k}\right) ^2=\frac{1}{\textrm{n}(\theta )^2} \end{aligned}$$
(106)

This equation defines the refractive index as a function of the propagation angle \(\theta \) in the xy plane. Equation (106) is a special case of the defining equation for the refractive index ellipsoid, used in the optics of three dimensional crystals [38].

It is illustrative to re–write the dispersion relation (106) in terms of the eigenvalues, \(\epsilon _i\) and eigenvectors, \(\varvec{e}_{i}\) of the in–plane permittivity, obeying \(\varvec{\epsilon }_{\parallel }\cdot \varvec{e}_{i}=\epsilon _i\varvec{e}_{i}\). In terms of these quantities, the inverse of the in–plane permittivity is given by,

$$\begin{aligned} \varvec{\epsilon }_{\parallel }^{-1}=\frac{1}{\epsilon _{1}}\varvec{e}_{1}\otimes \varvec{e}_{1}^{\star }+\frac{1}{\epsilon _{2}}\varvec{e}_{2}\otimes \varvec{e}_{2}^{\star }. \end{aligned}$$
(107)

where we’ve assumed a Hermitian (and hence lossless) permittivity tensor. Substituting (107) into the dispersion relation (106), the angle dependent refractive index can then be written as

$$\begin{aligned} \textrm{n}(\theta )=\sqrt{\frac{\epsilon _{1}\epsilon _{2}}{\epsilon _2\,|(\varvec{e}_1\times \varvec{e}_{z})\cdot \varvec{n}|^2+\epsilon _1\,|(\varvec{e}_2\times \varvec{e}_{z})\cdot \varvec{n}|^2}}, \end{aligned}$$
(108)

where we have taken the positive root (although in some important cases the negative root should be taken [10]).

Take a moment to dwell on the dependence of the refractive index (108) on the permittivity tensor. As the permittivity is Hermitian, the eigenvectors are orthonormal \(\varvec{e}_{i}^{\star }\cdot \varvec{e}_{j}=\delta _{ij}\), and form a complete set \(\varvec{1}=\varvec{e}_{1}^{\star }\otimes \varvec{e}_{1}+\varvec{e}_{2}^{\star }\otimes \varvec{e}_{2}\). Therefore, when the two eigenvalues are equal \(\epsilon _1=\epsilon _2=\epsilon \), the denominator on the right of (108) simply equals \(\epsilon \). Such a medium is an isotropic dielectric in the plane of propagation, and the refractive index reduces to the textbook expression, \(\textrm{n}(\theta )=\sqrt{\epsilon }\), shown as the dashed circle in Fig. 17a. As shown in the figure, when \(\epsilon \rightarrow 0\), this dispersion circle closes to a point and the refractive index vanishes.

Fig. 17

Refractive index surfaces in planar media: (a) Isotropic medium with equal eigenvalues. The dispersion surface closes to a point as the eigenvalues are reduced to zero. In panel (b) we have an anisotropic medium with eigenvectors \(\varvec{e}_{1}=\varvec{e}_{x}\) and \(\varvec{e}_{2}=\varvec{e}_{y}\), and positive eigenvalues. The dispersion surface now forms an ellipse, with eccentricity approaching unity as one of the eigenvalues approaches zero. In panel (c) the medium is hyperbolic with the same eigenvectors as (b). The dispersion surface is now open, with a range of angles where propagation is not allowed. Finally, panel (d) shows the dispersion surface for complex eigenvectors \(\varvec{e}_{1}=(2\varvec{e}_{x}+\textrm{i}\varvec{e}_{y})/\sqrt{5}\) and \(\varvec{e}_{2}=(\varvec{e}_{x}-2\textrm{i}\varvec{e}_{y})/\sqrt{5}\), which are orthogonal as required for a Hermitian permittivity. Unlike the case of real eigenvectors, the dispersion surface closes to a point rather than a line as only one of the eigenvalues approaches zero

Meanwhile, when the two eigenvalues differ \(\epsilon _2>\epsilon _1\) and the eigenvectors are real, the refractive index varies between its largest value \(\sqrt{\epsilon _2}\) (propagation along \(\varvec{e}_{1}\)) and its smallest value \(\sqrt{\epsilon _1}\) (propagation along \(\varvec{e}_{2}\)). The angle dependence of the refractive index \(\textrm{n}(\theta )\) now either traces out an ellipse (Fig. 17b), when \(\epsilon _1\) and \(\epsilon _2\) are both positive, or a hyperbola (Fig. 17c), when \(\epsilon _1\) and \(\epsilon _2\) have different signs, constituting a hyperbolic material.Footnote 9 At the transition between elliptical and hyperbolic dispersion, the smallest eigenvalue passes through zero \(\epsilon _1\rightarrow 0\). In this case the eccentricity of the dispersion ellipse tends to unity, and the ellipse is compressed into a line, as shown in Fig. 17b. Again the refractive index vanishes, but now this only occurs for one direction of propagation. For instance taking \(\varvec{e}_{1}=\varvec{e}_{x}\) and \(\varvec{e}_{2}=\varvec{e}_{y}\) we have,

$$\begin{aligned} \textrm{n}(\theta )=\sqrt{\frac{\epsilon _1\epsilon _2}{\epsilon _2\sin ^2(\theta )+\epsilon _1\cos ^2(\theta )}}={\left\{ \begin{array}{ll} \sqrt{\epsilon _2}\ne 0&{}\; (\text {Propagation along } x)\\ \sqrt{\epsilon _1}\rightarrow 0&{}\; (\text {Propagation along } y). \end{array}\right. } \end{aligned}$$
(109)

We can therefore see that setting one of the eigenvalues of the in–plane permittivity to zero makes the refractive index in the direction perpendicular to the corresponding eigenvector vanish. In this way it is possible to have zero refractive index for only one direction of propagation.

The situation becomes more interesting when the eigenvectors \(\varvec{e}_{i}\) are complex and the eigenvalues are positive. Now the propagation vector \(\varvec{n}\), appearing in the denominator of (108), can never be completely parallel (or orthogonal) to either of the complex vectors, \(\varvec{e}_{1}\times \varvec{e}_{z}\) or \(\varvec{e}_{2}\times \varvec{e}_{z}\). As a consequence the denominator can never vanish, even if one of the eigenvalues is zero! Therefore, if we let the smallest eigenvalue \(\epsilon _{1}\) alone tend to zero, the numerator of (108) is zero, making the refractive index \(\textrm{n}(\theta )\) vanish for all directions of propagation \(\theta \)! As shown in Fig. 17d, in this limit the dispersion ellipse closes to a point, like that of an isotropic zero index medium, where the permittivity tensor as a whole vanishes. However, the behaviour is more subtle now.
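The three regimes discussed so far can be compared by evaluating (108) directly. The following sketch is a minimal implementation, assuming orthonormal in–plane eigenvectors supplied as pairs \((e_x,e_y)\). For the complex case we take the circular vectors \(\varvec{e}_{\pm }=(\varvec{e}_{x}\pm \textrm{i}\varvec{e}_{y})/\sqrt{2}\) as an illustrative choice, and the limit \(\epsilon _1\rightarrow 0\) is mimicked by the small value \(10^{-8}\). The function names and numerical values are ours.

```python
import math

def cross_ez_dot_n(e, theta):
    """(e x e_z) . n for in-plane e = (ex, ey), n = (cos theta, sin theta)."""
    ex, ey = e
    return ey * math.cos(theta) - ex * math.sin(theta)

def refractive_index(theta, eps1, eps2, e1, e2):
    """Angle-dependent refractive index of Eq. (108), positive root."""
    denominator = (eps2 * abs(cross_ez_dot_n(e1, theta)) ** 2
                   + eps1 * abs(cross_ez_dot_n(e2, theta)) ** 2)
    return math.sqrt(eps1 * eps2 / denominator)

EX, EY = (1.0, 0.0), (0.0, 1.0)
S = 1 / math.sqrt(2)
E_PLUS, E_MINUS = (S, 1j * S), (S, -1j * S)
angles = [2 * math.pi * n / 12 for n in range(12)]

# Isotropic case: equal eigenvalues give sqrt(eps) at every angle.
isotropic = [refractive_index(t, 4.0, 4.0, EX, EY) for t in angles]

# Real eigenvectors with eps1 -> 0: the index collapses along y only.
tiny = 1e-8
along_x = refractive_index(0.0, tiny, 2.0, EX, EY)
along_y = refractive_index(math.pi / 2, tiny, 2.0, EX, EY)

# Complex eigenvectors with eps1 -> 0: the index collapses at every angle.
circular = [refractive_index(t, tiny, 2.0, E_PLUS, E_MINUS) for t in angles]
print(isotropic[0], along_x, along_y, max(circular))
```

With real eigenvectors the index stays finite along one axis, while with complex eigenvectors the largest value over all sampled angles is already small, in line with the closing of the dispersion curve to a point.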

If we return to the defining equation for the refractive index (106) and multiply through by \(k^2\), we have

$$\begin{aligned} (\varvec{k}\times \varvec{e}_{z})\cdot \left( \frac{1}{\epsilon _1}\varvec{e}_{1}\otimes \varvec{e}_{1}^{\star }+\frac{1}{\epsilon _2}\varvec{e}_{2}\otimes \varvec{e}_{2}^{\star }\right) \cdot (\varvec{k}\times \varvec{e}_{z})=k_0^2. \end{aligned}$$
(110)

Given that the magnitude of the wave number \(k_0\) is fixed, as \(\epsilon _{1}\rightarrow 0\) the first term in the brackets of (110) dominates and we are left with the condition

$$\begin{aligned} \varvec{e}_{1}\cdot (\varvec{k}\times \varvec{e}_{z})\rightarrow 0 \end{aligned}$$
(111)

an equation that could be equivalently written as \((\varvec{e}_{z}\times \varvec{e}_{1})\cdot \varvec{\nabla }h=0\), i.e. the refractive index is zero in the \(\varvec{e}_{z}\times \varvec{e}_{1}^{\star }\) direction. For real \(\varvec{e}_{1}\), this indicates the squashing of the dispersion ellipse into a line, as shown in Fig. 17. When the first eigenvector is a complex vector e.g. \(\varvec{e}_{1}=\varvec{e}_{+}=(\varvec{e}_{x}+\textrm{i}\varvec{e}_{y})/\sqrt{2}\), the refractive index is zero in a complex direction, and our condition reduces to \(\partial _{x}h+\textrm{i}\partial _{y}h=0\), which are the Cauchy–Riemann conditions (102) found earlier.Footnote 10

Therefore, even though the dispersion surface in the limit \(\epsilon _1\rightarrow 0\) shown in Fig. 17d looks like that of an isotropic medium where the refractive index vanishes, the behaviour of the wave is quite different. Rather than having its wavelength uniformly stretched to infinity, as would happen in an isotropic medium, the wave is forced to propagate with only one sense of circulation. Although analytic functions diverge at infinity and are therefore inadmissible in a bulk material, this behaviour is revealed at a boundary with another material (see Sec. 8), where e.g. an interface state \(\textrm{exp}(-\textrm{i}k(x-\textrm{i}y))\) (\(y>0\)) would be an allowed solution, whereas the counter propagating wave \(\textrm{exp}(\textrm{i}k(x+\textrm{i}y))\) would not. This unusual kind of zero index material exhibits one–way propagation where the wave obeys the Cauchy–Riemann conditions. This is what the defect in the Berry connection (96), and the non–zero Chern number (97), are recording.

8 Applications

We now give applications in three different wave physics regimes where we can enforce one–way propagation through simply demanding that the refractive index is zero in a complex direction.

8.1 General Electromagnetic Media

We can use this idea of a ‘vanishing index in a complex direction’ to extend the discussion of Sec. 7 from gyrotropic media, to general electromagnetic materials. In an arbitrary material, in the absence of any sources, and at a fixed frequency \(\omega \), Maxwell’s equations are given by

$$\begin{aligned} \varvec{\nabla }\times \varvec{E}&=\textrm{i}\omega \varvec{B}\nonumber \\ \varvec{\nabla }\times \varvec{H}&=-\textrm{i}\omega \varvec{D}. \end{aligned}$$
(112)

We take a general lossless linear material, where the constitutive relations are given by

$$\begin{aligned} \varvec{D}&=\epsilon _0[\varvec{\epsilon }\cdot \varvec{E}+\varvec{\xi }\cdot \eta _0\varvec{H}]\nonumber \\ \varvec{B}&=\mu _0[\varvec{\mu }\cdot \varvec{H}+\varvec{\xi }^{\dagger }\cdot \eta _0^{-1}\varvec{E}] \end{aligned}$$
(113)

where the \(3\times 3\) tensors \(\varvec{\epsilon }\) and \(\varvec{\mu }\) are Hermitian, and the \(3\times 3\) bi–anisotropy tensor \(\varvec{\xi }\) is arbitrary. The Hermitian property of the permittivity and permeability, and the appearance of \(\varvec{\xi }\) and \(\varvec{\xi }^{\dagger }\) ensure that the material does not absorb wave energy [39].

A compact and useful way to write Maxwell’s equations (112) is in the form of a six–vector \((\varvec{E},\varvec{h})^\textrm{T}\), as follows

$$\begin{aligned} \left( \begin{array}{cc}\varvec{0}&{}\textrm{i}\varvec{\nabla }\times \\ -\textrm{i}\varvec{\nabla }\times &{}\varvec{0}\end{array}\right) \left( \begin{array}{c}\varvec{E}\\ \varvec{h}\end{array}\right) =k_0\left( \begin{array}{cc}\varvec{\epsilon }&{}\varvec{\xi }\\ \varvec{\xi }^{\dagger }&{}\varvec{\mu }\end{array}\right) \left( \begin{array}{c}\varvec{E}\\ \varvec{h}\end{array}\right) \end{aligned}$$
(114)

where, as in the previous sections \(\varvec{h}=\eta _0\varvec{H}\). As discussed in [29, 40,41,42], equation (114) has a great deal in common with the Dirac equation [43], where the operator on the left hand side is analogous to the operator \(\varvec{\alpha }\cdot \hat{\varvec{p}}\), and the right hand side matrix contains terms analogous to the mass, energy, and an external gauge field.

We now restrict propagation to the xy plane. With this assumption, the curl of the fields can be written in terms of derivatives of the in–plane field components e.g. \(\varvec{E}_{\parallel }\), and the out of plane ones e.g. \(E_{z}=E\). For example, \(\varvec{\nabla }\times \varvec{E}=\varvec{\nabla }\times \varvec{E}_{\parallel }+\varvec{\nabla }E\times \varvec{e}_{z}\). With this assumption, the in–plane part of the left hand side of (114) depends only on the gradient of the out of plane field components E and h. To isolate these parts of the field, we take an inner product of (114) with the six–vector

$$\begin{aligned} V=\left( \begin{array}{c}\varvec{v}_{E}\\ 0\\ \varvec{v}_{H}\\ 0\end{array}\right) \end{aligned}$$
(115)

where \(\varvec{v}_{E,H}\) are two arbitrary vectors lying in the xy planeFootnote 11. This inner product yields the single scalar equation for the derivatives of the out–of–plane field

$$\begin{aligned} \textrm{i}\left[ \varvec{v}_{E}\cdot \varvec{\nabla }h\times \varvec{e}_{z}-\varvec{v}_{H}\cdot \varvec{\nabla }E\times \varvec{e}_{z}\right] \\ =k_0\left[ (\varvec{v}_{E}\cdot \varvec{\epsilon }+\varvec{v}_{H}\cdot \varvec{\xi }^{\dagger })\cdot \varvec{E}+\left( \varvec{v}_{H}\cdot \varvec{\mu }+\varvec{v}_{E}\cdot \varvec{\xi }\right) \cdot \varvec{h}\right] . \end{aligned}$$
(116)

This equation can be reduced to a simple gradient of a combination of out of plane field components if the two vectors \(\varvec{v}_{E,H}\) are chosen as parallel. We write \(\varvec{v}_{E,H}=\alpha _{E,H}\varvec{e}\times \varvec{e}_{z}\) where \(\alpha _{E,H}\) are scalar quantities and \(\varvec{e}\) is a unit vector (\(\varvec{e}\cdot \varvec{e}^{\star }=1\)). We then have,

$$\begin{aligned} \textrm{i}\varvec{e}\cdot \varvec{\nabla }\left[ \alpha _{E}h-\alpha _{H}E\right] =k_0(\varvec{e}\times \varvec{e}_{z})\cdot \left[ \left( \alpha _{E}\varvec{\epsilon }+\alpha _{H}\varvec{\xi }^{\dagger }\right) \cdot \varvec{E}+\left( \alpha _{H}\varvec{\mu }+\alpha _{E}\varvec{\xi }\right) \cdot \varvec{h}\right] . \end{aligned}$$
(117)

The left hand side of this equation is a generalization of Eqns. (100) and (101) discussed in Sec. 7. Setting the right hand side of (117) to zero picks out a set of material parameters such that the refractive index is zero in the direction \(\varvec{e}\). In order for this to hold, we must impose two conditions on the material tensors

$$\begin{aligned} (\varvec{e}\times \varvec{e}_{z})\cdot (\alpha _{E}\varvec{\epsilon }+\alpha _{H}\varvec{\xi }^{\dagger })&=0\nonumber \\ (\varvec{e}\times \varvec{e}_{z})\cdot (\alpha _{H}\varvec{\mu }+\alpha _{E}\varvec{\xi })&=0. \end{aligned}$$
(118)

For the particular case of \(\varvec{e}=\varvec{e}_{+}=(\varvec{e}_{x}+\textrm{i}\varvec{e}_{y})/\sqrt{2}\), (118) provides a large family of material parameters where the out of plane field component \(\alpha _{E}h-\alpha _{H}E\) behaves as an analytic function of position, thus circulating in only one sense and exhibiting unidirectional interface states. Note that in the particular case where we take \(\alpha _H=0\) and \(\varvec{\xi }=0\), (118) reduces to the zero–index condition for gyrotropic media \((\varvec{e}_{+}\times \varvec{e}_{z})\cdot \varvec{\epsilon }=0\) identified above in Eqns. (99)–(101).
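To make the reduction to the gyrotropic case concrete, the sketch below contracts \(\varvec{e}_{+}\times \varvec{e}_{z}\) with the in–plane permittivity \(\lambda \varvec{1}+\varvec{\epsilon }_{\parallel }'\), with \(\varvec{\epsilon }_{\parallel }'\) taken from (89), and reports the largest component of the result. Under these assumptions the residual should vanish only when \(\lambda =\alpha \), matching (100). The helper names and the numerical values are illustrative.

```python
import math

def row_times_matrix(v, m):
    """Row vector v (length 2) contracted with a 2x2 matrix m: (v . m)_j."""
    return [v[0] * m[0][j] + v[1] * m[1][j] for j in range(2)]

def gyrotropic_permittivity(lam, alpha):
    """In-plane permittivity lam * 1 + eps', with eps' from Eq. (89)."""
    return [[lam, -1j * alpha],
            [1j * alpha, lam]]

S = 1 / math.sqrt(2)
E_PLUS = (S, 1j * S)                          # e_+ = (e_x + i e_y) / sqrt(2)
E_PLUS_CROSS_EZ = (E_PLUS[1], -E_PLUS[0])     # e x e_z = (e_y, -e_x)

def zero_index_residual(lam, alpha):
    """Largest component of (e_+ x e_z) . eps; zero when the condition holds."""
    contracted = row_times_matrix(E_PLUS_CROSS_EZ,
                                  gyrotropic_permittivity(lam, alpha))
    return max(abs(c) for c in contracted)

print(zero_index_residual(1.5, 1.5), zero_index_residual(-1.5, 1.5))
```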

Fig. 18

Free space exhibits a similar dispersion relation to that of a gyrotropic medium, see e.g. Fig. 14, and in some cases shows ‘zero index in a complex direction’. (a) For a fixed out of plane propagation constant \(k_z\) the frequency is \(k_0=\pm (k^2+k_z^2)^{1/2}\), with a ‘gap’ in the dispersion relation where \(k_0<|k_z|\). The wave becomes an analytic function of position when \(k_0=|k_z|\) (dashed lines). (b) An imagined experiment where an oscillating line current \(j_z=j_0\textrm{exp}(\textrm{i}k_z z)\) excites electromagnetic waves close to a planar dielectric, a distance d away, where one–way propagation is evident when \(k_0<|k_z|\)

Example: The Cauchy–Riemann conditions in free space

There is an interesting special case of conditions (118), where they can be applied to electromagnetic near fields propagating in free space. At first this seems counterintuitive: surely we need a material if the wave is to exhibit something as strange as complex analyticity! But suppose we consider a free space electromagnetic wave propagating out of the plane, at fixed wave–vector \(k_z\). The dispersion relation of such a mode is given by

$$\begin{aligned} k_x^2+k_y^2+k_z^2=k_0^2\rightarrow k_0=\pm \sqrt{k^2+k_z^2} \end{aligned}$$
(119)

which is of the same form as for a gyrotropic medium (92), and for a relativistic particle: we have two bands of propagation, separated by a ‘gap’ \(\Delta k_0=2k_z\) equivalent to a mass, and due to the propagation constant \(k_z\), as shown in Fig. 18. The system also has exactly the same behaviour under time reversal. Just as for the gyrotropy constant \(\alpha \), if we reverse the direction of time the out of plane wave–vector \(k_z\) changes sign, although such a sign change does not affect the dispersion relation (119). As we shall see, in a particular polarization basis, out of plane propagation is completely equivalent to gyrotropy.

As above, consider a system that is translationally invariant along the z axis. Instead of taking the field as uniform in z, we assume propagation with wave vector component \(k_z\). Separating out the in–plane derivatives as \(\varvec{\nabla }=\varvec{\nabla }_{\parallel }+\varvec{e}_{z}\partial _z\), this modifies Maxwell’s equations (112) to

$$\begin{aligned} \varvec{\nabla }_{\parallel }\times \varvec{E}+\textrm{i}k_z\varvec{e}_{z}\times \varvec{E}&=\textrm{i}\omega \varvec{B}\nonumber \\ \varvec{\nabla }_{\parallel }\times \varvec{H}+\textrm{i}k_z\varvec{e}_{z}\times \varvec{H}&=-\textrm{i}\omega \varvec{D} \end{aligned}$$
(120)

Comparing with the constitutive relations (113), we can see that out of plane propagation is equivalent to adding an anti–symmetric contribution to the bi–anisotropy tensor \(\Delta \varvec{\xi }=(k_z/k_0)\,\varvec{e}_{z}\times \). This effective contribution to the bi–anisotropy is what allows us to fulfil the zero index condition (118), even in free space.

Fig. 19

Phase plot of the electric E, magnetic h, and the two ‘circular polarizations’ \(E\pm \textrm{i}h\) for an oscillating line source of wave number \(k_z\) in front of a metal with \(\epsilon =-2\) (see Fig. 18b). As the wavenumber approaches the free space wave number \(k_0/k_z\rightarrow 1\), the system approaches the ‘zero index’ point shown in Fig. 18a, and the field components \(E\pm \textrm{i}h\) behave as analytic functions of \(\mathcal {Z}\) and \(\mathcal {Z}^{\star }\)

Taking free space \(\varvec{\epsilon }=\varvec{\mu }=\varvec{1}_{3}\), and the effective bi–anisotropy \(\varvec{\xi }=(k_z/k_0)\,\varvec{e}_z\times \), the zero index conditions become

$$\begin{aligned} (\varvec{e}\times \varvec{e}_{z})\cdot \left( \alpha _{E}\varvec{\epsilon }+\alpha _{H}\varvec{\xi }^{\dagger }\right) =\left( \begin{matrix}e_y&-e_x&0\end{matrix}\right) \left( \begin{matrix}\alpha _{E}&{}\alpha _{H}\left( \frac{k_z}{k_0}\right) &{}0\\ -\alpha _{H}\left( \frac{k_z}{k_0}\right) &{}\alpha _{E}&{}0\\ 0&{}0&{}\alpha _{E}\end{matrix}\right) =0 \end{aligned}$$
(121)

and

$$\begin{aligned} \left( \varvec{e}\times \varvec{e}_{z}\right) \cdot \left( \alpha _{H}\varvec{\mu }+\alpha _{E}\varvec{\xi }\right) =\left( \begin{matrix}e_{y}&-e_{x}&0\end{matrix}\right) \left( \begin{matrix}\alpha _{H}&{}-\alpha _{E}\left( \frac{k_z}{k_0}\right) &{}0\\ \alpha _{E}\left( \frac{k_z}{k_0}\right) &{}\alpha _{H}&{}0\\ 0&{}0&{}\alpha _{H}\end{matrix}\right) =0. \end{aligned}$$
(122)

In both equations (121) and (122) the matrix in the middle represents an effective permittivity for the combination of fields \(\alpha _{E}h-\alpha _{H}E\), analogous to the gyrotropic permittivity defined below (88). Choosing the combination of fields where \(\alpha _{E}=1\) and \(\alpha _{H}=\textrm{i}\), these conditions become exactly the same as for a gyrotropic medium. For the complex direction \(\varvec{e}=\varvec{e}_{+}\), the zero index conditions (121) and (122) are fulfilled when

$$\begin{aligned} k_z=-k_0. \end{aligned}$$
(123)

For a wave propagating out of the plane with a wave–vector obeying (123), the linear combination of fields \(h-\textrm{i}E=-\textrm{i}(E+\textrm{i}h)\) (see footnote 12) behaves as an analytic function of position in the xy plane, and is thus forced to circulate in only one sense (anti–clockwise). Meanwhile the polarization \(h+\textrm{i}E\) circulates in the opposite sense (clockwise). The sense of rotation is reversed for both polarizations when we take the opposite direction of out of plane propagation, \(k_z=+k_0\).
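The zero index conditions (121) and (122) can be checked numerically. The short Python sketch below is an illustration only: the function names are ours, and it simply assumes free space with the effective bi–anisotropy \((k_z/k_0)\,\varvec{e}_{z}\times \), the choice \(\alpha _{E}=1\), \(\alpha _{H}=\textrm{i}\), and the direction \(\varvec{e}=\varvec{e}_{+}\). It evaluates the row vectors on the left of (121) and (122) for a given ratio \(k_z/k_0\).

```python
import math

def zero_index_residuals(kz_over_k0, alpha_E=1.0, alpha_H=1j):
    """Left-hand sides of (121) and (122) for e = e_+ in free space."""
    s = kz_over_k0
    e_x, e_y = 1 / math.sqrt(2), 1j / math.sqrt(2)   # e_+ = (e_x + i e_y)/sqrt(2)
    row = (e_y, -e_x, 0.0)                            # components of e x e_z
    # alpha_E*eps + alpha_H*xi^dagger with eps = 1 and xi = (kz/k0) e_z x
    m_121 = [[alpha_E,      alpha_H * s, 0.0],
             [-alpha_H * s, alpha_E,     0.0],
             [0.0,          0.0,         alpha_E]]
    # alpha_H*mu + alpha_E*xi with mu = 1
    m_122 = [[alpha_H,      -alpha_E * s, 0.0],
             [alpha_E * s,  alpha_H,      0.0],
             [0.0,          0.0,          alpha_H]]

    def left_multiply(r, m):
        return [sum(r[i] * m[i][j] for i in range(3)) for j in range(3)]

    return left_multiply(row, m_121), left_multiply(row, m_122)

def max_abs(vectors):
    """Largest magnitude among all components of the given vectors."""
    return max(abs(c) for v in vectors for c in v)

print(max_abs(zero_index_residuals(-1.0)))  # vanishes at k_z = -k_0
print(max_abs(zero_index_residuals(+1.0)))  # does not vanish at k_z = +k_0
```

Both conditions are satisfied only at \(k_z=-k_0\) for this choice of \(\alpha _{E,H}\) and \(\varvec{e}\), in agreement with (123).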

Figure 19 shows an example where the electromagnetic field from an oscillating line source (see schematic in Fig. 18b) has been calculated analytically in terms of the Fresnel coefficients of an isotropic half space of permittivity \(\epsilon =-2\) and permeability \(\mu =1\) (see e.g. [44] for details). As the wave–number \(k_z\) of the line current approaches \(k_0\) we can see that a confined mode (a surface plasmon) emerges, which, in terms of the field components \(E+\textrm{i}h\) and \(E-\textrm{i}h\), can only propagate in one direction on the interface.

8.2 Continuous Elastic Media

Having shown the applicability of our zero index condition to general electromagnetic materials, we give an example for another kind of wave. In the theory of elasticity, the equation of motion is a continuous version of the Newtonian equation of motion \(\varvec{F}=m\varvec{a}\) [45],

$$\begin{aligned} \rho \frac{\partial ^{2}\varvec{U}}{\partial t^{2}}=-\rho \omega ^2\varvec{U}=\varvec{\nabla }\cdot \varvec{\sigma }. \end{aligned}$$
(124)

where the local force density is the divergence of the stress tensor \(\varvec{\nabla }\cdot \varvec{\sigma }\equiv \partial _{i}\sigma _{ij}\), the material mass density is \(\rho \), and the local displacement of the material from its equilibrium position is \(\varvec{U}\). We assume an elastic wave of fixed frequency, which gives the middle expression in (124) where we applied \(\partial _t^2\rightarrow -\omega ^2\).

We cannot use (124) without a constitutive relation between the stress and the displacement. More precisely it is the relative displacement of different parts of a body—the strain, \(u_{i j}=(\partial _i U_j+\partial _j U_i)/2\)—rather than an overall displacement that gives rise to stress, and for linear elastic materials the stress and strain are related by the rank four stiffness tensor \(C_{i j k l}\),

$$\begin{aligned} \sigma _{ij}=C_{i j k l}u_{k l}. \end{aligned}$$
(125)

The derivatives of the local displacement field are thus governed by the inverse stiffness tensor, known as the compliance tensor \(C_{i j k l}^{-1}\)

$$\begin{aligned} u_{kl}=\frac{1}{2}\left( \frac{\partial U_{k}}{\partial x_l}+\frac{\partial U_{l}}{\partial x_k}\right) =C^{-1}_{klij}\sigma _{ij} \end{aligned}$$
(126)

which obeys

$$\begin{aligned} C^{-1}_{i j k l}C_{k l p q}=\delta _{i p}\delta _{j q}. \end{aligned}$$
(127)

As in the theory of Sec. 7, we assume the field is independent of the z coordinate (e.g. confinement in an elastic plate, or waveguide), and propagates solely in the xy plane. This means that the z components of the strain tensor simplify to

$$\begin{aligned} u_{13}=\frac{1}{2}\frac{\partial U_3}{\partial x} \qquad u_{23}=\frac{1}{2}\frac{\partial U_3}{\partial y}\qquad u_{33}=\frac{\partial U_3}{\partial z}=0. \end{aligned}$$
(128)

Taking \(k=3\) in the constitutive relation (126), and using the simplified form of the strain tensor (128), the spatial derivatives of the out of plane displacement are given in terms of the stress tensor by,

$$\begin{aligned} \frac{\partial U_{3}}{\partial x_l}=2C^{-1}_{3lij}\sigma _{ij}. \end{aligned}$$
(129)

Contracting both sides of (129) with the unit vector \(\varvec{e}\) (components \(e_{l}\)), we obtain an equation analogous to (117), telling us the derivative of the out of plane displacement in the \(\varvec{e}\) direction

$$\begin{aligned} \varvec{e}\cdot \varvec{\nabla }U_3=2C^{-1}_{3lij}e_l\sigma _{ij}, \end{aligned}$$
(130)

This is what we were after! For elasticity, this allows us to set the derivative of the wave amplitude in a given direction to zero. In order that the elastic refractive index vanish in direction \(\varvec{e}\) (i.e. \(U_{3}\) is stretched to uniformity in this direction), the \(\varvec{e}\) vector must be a zero eigenvector of the compliance tensor

$$\begin{aligned} C^{-1}_{3lij}e_l=0. \end{aligned}$$
(131)

To enforce the Cauchy–Riemann conditions, \(\varvec{e}\) must be one of the complex vectors \(\varvec{e}_{\pm }\), which implies that a stiffness tensor \(C_{i j k l}\) supporting this kind of propagation must also be complex. Yet in lossless systems the stiffness tensor is usually taken as a real symmetric object obeying \(C_{ijkl}=C_{klij}\). However, just as in electromagnetism when dealing with monochromatic waves, the stiffness tensor can take complex values. A general lossless linear elastic medium has a Hermitian, rather than real symmetric stiffness tensor (see e.g. [46]), obeying \(C_{i j k l}=C_{k l i j}^{\star }\). Just as in electromagnetism, such complex valued stiffness tensors arise in systems where time reversal symmetry has been explicitly broken due to e.g. an externally applied magnetic field (for example, magnetostriction effects), or motion of the medium.

Example: An elastic material exhibiting the Cauchy–Riemann conditions

Condition (131) can be used to design an elastic medium where a one–way propagating elastic wave is trapped at its interface. To show this we start from an isotropic elastic material, which has the following form of stiffness tensor in terms of the bulk K and shear G moduli [45]

$$\begin{aligned} C_{i j k l}=\left( K-\frac{2G}{3}\right) \delta _{ij}\delta _{kl}+2G\delta _{ik}\delta _{jl}. \end{aligned}$$
(132)

Take propagation in the xy plane and shear displacement solely in the z direction. The components of the strain are \(u_{3i}=(1/2)\partial _i U_3\), as identified above in (128). For the isotropic medium (132) the first bulk modulus dependent term does not contribute to the stress, which is related to the strain by a diagonal \(2\times 2\) matrix containing the stiffness tensor elements \(C_{3131}=2G\) and \(C_{3232}=2G\),

$$\begin{aligned} \left( \begin{matrix}\sigma _{31}\\ \sigma _{32}\end{matrix}\right) =\left( \begin{matrix}C_{3131}&{}C_{3132}\\ C_{3231}&{}C_{3232}\end{matrix}\right) \left( \begin{matrix}u_{31}\\ u_{32}\end{matrix}\right) =\left( \begin{matrix}G&{}0\\ 0&{}G\end{matrix}\right) \left( \begin{matrix}\partial _1 U_{3}\\ \partial _2 U_{3}\end{matrix}\right) . \end{aligned}$$
(133)

and similarly for the inverse relation containing the compliance tensor

$$\begin{aligned} \left( \begin{array}{c}\partial _1 U_{3}\\ \partial _2 U_{3}\end{array}\right) =2\left( \begin{array}{cc}C_{3131}^{-1}&{}C_{3132}^{-1}\\ C_{3231}^{-1}&{}C_{3232}^{-1}\end{array}\right) \left( \begin{array}{c}\sigma _{31}\\ \sigma _{32}\end{array}\right) =\left( \begin{array}{cc}\frac{1}{G}&{}0\\ 0&{}\frac{1}{G}\end{array}\right) \left( \begin{array}{c}\sigma _{31}\\ \sigma _{32}\end{array}\right) . \end{aligned}$$
(134)

With this form of material parameters it is not possible to fulfil our zero index condition (131), without sending the shear modulus to infinity, which in optics is equivalent to an isotropic zero index medium.

By analogy with the discussion of gyrotropic electromagnetic materials in Sec. 7, we make the off–diagonal components of the compliance tensor non–zero and purely imaginary, \(C_{3132}^{-1}=-\textrm{i}\alpha /2\) and \(C_{3231}^{-1}=+\textrm{i}\alpha /2\), so that

$$\begin{aligned} \left( \begin{matrix}\partial _1 U_3\\ \partial _{2} U_3\end{matrix}\right) =\left( \begin{matrix}\kappa &{}-\textrm{i}\alpha \\ \textrm{i}\alpha &{}\kappa \end{matrix}\right) \left( \begin{matrix}\sigma _{31}\\ \sigma _{32}\end{matrix}\right) \end{aligned}$$
(135)

where we set the ‘diagonal’ of the compliance tensor as \(C_{3131}^{-1}=C_{3232}^{-1}=\kappa /2\). Figure 20 gives a sketch of what the stress–strain relationship is like in such a material, with the direction of the in–plane stress circulating over a single cycle of the wave. Taking the inner product of both sides of (135) with \(\varvec{e}_{-}=\varvec{e}_{+}^{\star }\), i.e. multiplying from the left by the row vector \(\sqrt{2}\,\varvec{e}_{-}^{\dagger }=(1,\textrm{i})\), we find the derivative of the out of plane displacement along \(\varvec{e}_{+}\)

$$\begin{aligned} \frac{\partial U_3}{\partial x}+\textrm{i}\,\frac{\partial U_3}{\partial y}=\left( \kappa -\alpha \right) \left( \sigma _{31}+\textrm{i}\,\sigma _{32}\right) . \end{aligned}$$
(136)

which equals zero when \(\alpha =\kappa \) (analogous to the \(\alpha =\lambda \) point in the dispersion relation of the gyrotropic medium shown in Fig. 14), at which point the wave becomes an analytic function of position. The stiffness tensor \(C_{i j k l}\) corresponding to the choice (135) has components

$$\begin{aligned} \left( \begin{matrix}C_{3131}&{}C_{3132}\\ C_{3231}&{}C_{3232}\end{matrix}\right) =\frac{2}{\kappa ^2-\alpha ^2}\left( \begin{matrix}\kappa &{}\textrm{i}\alpha \\ -\textrm{i}\alpha &{}\kappa \end{matrix}\right) \end{aligned}$$
(137)

showing that the stiffness tensor must have very large components close to the zero index points \(\kappa =\pm \alpha \) (just as it must for an isotropic zero index elastic medium).
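The inversion leading from (135) to (137), and the contraction giving (136), can be reproduced with a few lines of Python. This is a minimal sketch with function names of our own choosing; it assumes only the \(2\times 2\) block of the compliance written in (135) and the relation \(u_{3i}=(1/2)\partial _i U_3\).

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def compliance_block(kappa, alpha):
    """Matrix S in (135): grad U_3 = S . (sigma_31, sigma_32)."""
    return [[kappa, -1j * alpha], [1j * alpha, kappa]]

def stiffness_block(kappa, alpha):
    """Block (C_3131, C_3132; C_3231, C_3232) implied by (135)."""
    # sigma = S^{-1} . grad U_3 and u_3i = (1/2) d_i U_3, so C = 2 S^{-1}
    return [[2 * x for x in row] for row in inverse_2x2(compliance_block(kappa, alpha))]

def left_contract(kappa, alpha):
    """Row vector (1, i) . S, which multiplies the stress in (136)."""
    s = compliance_block(kappa, alpha)
    return [s[0][j] + 1j * s[1][j] for j in range(2)]

print(stiffness_block(2.0, 1.0))   # entries of (137) for kappa = 2, alpha = 1
print(left_contract(2.0, 1.0))     # proportional to (1, i), with factor kappa - alpha
print(left_contract(1.5, 1.5))     # vanishes at the zero index point alpha = kappa
```

The stiffness block cannot be evaluated at \(\alpha =\pm \kappa \) itself, where the determinant \(\kappa ^2-\alpha ^2\) of the compliance block vanishes.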

Fig. 20

Elastic materials with ‘zero index in a complex direction’. In an ordinary infinite isotropic elastic material, a shear wave polarized along \(\varvec{e}_{z}\) and propagating in the x direction gives rise to an off–diagonal stress \(\sigma _{13}\), which means that a small area element pointing along \(\varvec{e}_{z}\) is subject to a force along the x axis. The peculiar zero index materials studied here, and defined by (135), have a stress–strain relation that circulates over time. During a single cycle the shear force will rotate from the x–axis to the y–axis, to the negative x–axis, and so on

Now we consider a planar elastic medium with a stiffness tensor of the form (137), and solve the equation of motion (124), assuming the material is terminated by vacuum. Given that the displacement field has only a single non–zero component \(\varvec{U}=U_3\varvec{e}_{z}\), and only the components \(\sigma _{13}\) and \(\sigma _{23}\) of the stress are non–zero, the equation of motion (124) reduces to

$$\begin{aligned} \varvec{\nabla }\cdot \varvec{\sigma }+\rho \omega ^2\varvec{U}=\frac{\partial \sigma _{13}}{\partial x}+\frac{\partial \sigma _{23}}{\partial y}+\rho \omega ^2 U_3=0 \end{aligned}$$
(138)

For a homogeneous medium, where the parameters \(\kappa \) and \(\alpha \) in (137) are independent of position, the equation of motion becomes the Helmholtz equation

$$\begin{aligned} \varvec{\nabla }^2 U_3+\frac{\rho \omega ^2}{\kappa } \left( \kappa ^2-\alpha ^2\right) U_3=0 \end{aligned}$$
(139)

which is identical to that for a scalar wave in a material with wave number \(|\varvec{k}|=\omega \sqrt{\rho \,(\kappa ^2-\alpha ^2)/\kappa }\). Superficially the wave appears to behave as if in an isotropic zero index medium, and as \(\alpha \rightarrow \pm \kappa \), the dispersion circle closes to a point as shown in Fig. 17a and d. As discussed previously, the one–way propagation is only evident at inhomogeneities, e.g. interfaces.

If the material has an interface with vacuum, with surface normal \(\varvec{e}_{y}\), the stress components \(\varvec{e}_{y}\cdot \varvec{\sigma }\) will equal zero,

$$\begin{aligned} \sigma _{23}=\frac{1}{\kappa ^2-\alpha ^2}\left( -\textrm{i}\alpha \frac{\partial U_3}{\partial x}+\kappa \frac{\partial U_3}{\partial y}\right) =0. \end{aligned}$$
(140)

Assuming that the elastic medium occupies \(y<0\), and vacuum occupies \(y>0\), we can write an interface state as \(U_{3}=\textrm{exp}(\textrm{i}k x+\beta y)\), where \(\beta \) is the real and positive decay constant of the wave into the elastic medium. Demanding that the normal stress vanishes as in (140) relates the decay constant and the propagation constant

$$\begin{aligned} k=-\frac{\kappa }{\alpha } \beta \end{aligned}$$
(141)

implying \(k<0\) when \(\alpha >0\), and \(k>0\) when \(\alpha <0\). As the decay constant \(\beta \) must be positive, the interface state can only satisfy the boundary condition (140) for one direction of propagation. When our zero index condition is satisfied, \(\alpha =\kappa \), the displacement \(U_3\) becomes a function of \(x+\textrm{i}y\), i.e. an analytic function, obeying the Cauchy–Riemann conditions and circulating in only an anti–clockwise sense.

To show that this mode is a solution to the equations of elasticity (124), we finally need to verify that the boundary condition (141) is consistent with the dispersion relation derived from (139),

$$\begin{aligned} \varvec{k}^2=k^2-\beta ^2=\frac{\rho \omega ^2}{\kappa }\left( \kappa ^2-\alpha ^2\right) \rightarrow k^2=\rho \kappa \omega ^2 \end{aligned}$$
(142)

so that when \(\alpha >0\) the mode satisfies

$$\begin{aligned} k=-\sqrt{\rho \kappa }\omega , \end{aligned}$$
(143)

which is independent of \(\alpha \)! Through demanding that the refractive index of the elastic mode vanish in a complex direction we have thus found a one–way interface state with a linear dispersion, independent of the parameter \(\alpha \), exactly as found for gyrotropic electromagnetic materials using a topological argument.
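As a numerical cross-check of (139)–(143), the sketch below builds the interface state \(U_{3}=\textrm{exp}(\textrm{i}k x+\beta y)\) and evaluates the residuals of the Helmholtz equation and of the stress-free boundary condition. It is illustrative only: the function names and parameter values are our own, \(\rho \) and \(\kappa \) are assumed positive, and the sign of k for \(\alpha <0\) is taken from the statement below (141) rather than from (143), which is written for \(\alpha >0\).

```python
import math

def interface_state(rho, kappa, alpha, omega):
    """Propagation constant k and decay constant beta of the interface state."""
    # (143) for alpha > 0; the sign is flipped for alpha < 0 so that beta stays positive
    k = -math.copysign(1.0, alpha) * math.sqrt(rho * kappa) * omega
    beta = -(alpha / kappa) * k            # (141) solved for beta
    return k, beta

def residuals(rho, kappa, alpha, omega):
    """Residuals of (139) and (140) for U_3 = exp(i k x + beta y), divided by U_3."""
    k, beta = interface_state(rho, kappa, alpha, omega)
    # Laplacian of U_3 is (-k^2 + beta^2) U_3
    bulk = -k**2 + beta**2 + rho * omega**2 * (kappa**2 - alpha**2) / kappa
    # -i alpha dU/dx + kappa dU/dy, the bracket in (140)
    boundary = -1j * alpha * (1j * k) + kappa * beta
    return bulk, boundary

for alpha in (0.5, 1.5, -0.5):
    k, beta = interface_state(1.0, 2.0, alpha, 3.0)
    print(alpha, k, beta, residuals(1.0, 2.0, alpha, 3.0))
```

For fixed \(\rho \), \(\kappa \) and \(\omega \) the printed k has the same magnitude for every \(\alpha \), with its sign set by the sign of \(\alpha \), while \(\beta \) remains positive.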

8.3 Periodic Media

Finally, let’s apply this idea to a periodic planar material. As the material is not homogeneous, it is not obvious whether the concept of the refractive index can be applied at all. The closest we can get is to consider the Bloch vector \(\varvec{K}\), which is analogous to the wave–vector \(\varvec{k}\) in a homogeneous medium. In this case, zero index in a given direction occurs when \(\varvec{K}\) vanishes along one or more directions. Alternatively we can expand the dispersion relation around points \(\varvec{K}_{b}\) on the Brillouin zone boundary, \(\varvec{K}=\varvec{K}_b+\delta \varvec{K}\). The change in the mode’s frequency \(\delta \omega \) as a function of the deviation \(\delta \varvec{K}\) from the zone boundary can then also be considered analogous to the dispersion relation in a homogeneous medium, and when one or more components of \(\delta \varvec{K}\) vanish, this is analogous to a point of zero index.

We consider the two dimensional Helmholtz equation, governing the behaviour of a TE polarized electromagnetic wave in a periodic permittivity profile \(\epsilon (\varvec{x})\)

$$\begin{aligned} \left[ \varvec{\nabla }^2+k_0^2\epsilon (\varvec{x})\right] \phi (\varvec{x})=0 \end{aligned}$$
(144)

although the same equation can also describe elastic and acoustic pressure waves. Suppose the profile \(\epsilon (\varvec{x})\) is such that two modes, \(\phi _{1}(\varvec{x})\) and \(\phi _{2}(\varvec{x})\), have degenerate frequencies \(\omega \) at point \(\varvec{K}\) in the Brillouin zone. To (144) we add a perturbation \(\delta \epsilon \) to the permittivity. Then to examine small deviations away from this point in the Brillouin zone we expand the field as a sum of the two modes

$$\begin{aligned} \phi (\varvec{x})=a_{1}(\varvec{x})\phi _{1}(\varvec{x})+a_{2}(\varvec{x})\phi _{2}(\varvec{x}) \end{aligned}$$
(145)

where the expansion coefficients \(a_1\) and \(a_{2}\) vary slowly in position compared to the two solutions \(\phi _1\) and \(\phi _2\). Substituting (145) into (144), and dropping derivatives of \(a_{1,2}\) beyond the first we have,

$$\begin{aligned} 2\varvec{\nabla }\phi _{1}\cdot \varvec{\nabla }a_{1}+2\varvec{\nabla }\phi _{2}\cdot \varvec{\nabla }a_{2}+k_0^2\delta \epsilon [a_1 \phi _{1}+a_2 \phi _2]=-2k_0 \delta k_0\epsilon [a_1 \phi _{1}+a_2 \phi _2] \end{aligned}$$
(146)

For a fixed value of \(\varvec{K}\), the non–degenerate modes of (144) obey \(\int \epsilon \phi _{i}\phi _{j}^{\star } {\text{ d }}^{2}\varvec{x}=\delta _{ij}\), where the integral is taken over a unit cell of the medium. We are free to choose our two degenerate modes \(\phi _{1,2}\) to obey the same condition. Taking the inner product of (146) with \(\phi _{1}^{\star }\) and \(\phi _{2}^{\star }\), and neglecting the variation of the expansion coefficients \(a_{1,2}\) over the unit cell we obtain two equations that can be written as a single vector differential equation

$$\begin{aligned} -\textrm{i}\varvec{\alpha }\cdot \varvec{\nabla }\,|\psi \rangle +m|\psi \rangle =\frac{\delta k_0}{k_0}|\psi \rangle \end{aligned}$$
(147)

where the ‘wavefunction’ is defined as \(|\psi \rangle =(a_{1},a_{2})^\textrm{T}\), and we have introduced three matrices \(\alpha _{j}\) that form the vector of matrices \(\varvec{\alpha }=(\alpha _1,\alpha _2,\alpha _3)\),

$$\begin{aligned} \alpha _{j}=-\frac{\textrm{i}}{k_0^2}\left( \begin{array}{cc}\int \phi _{1}^{\star }\partial _j\phi _{1}\,{\text{ d }}^{2}\varvec{x}&{}\int \phi _{1}^{\star }\partial _{j}\phi _{2}\,{\text{ d }}^{2}\varvec{x} \\ \int \phi _{2}^{\star }\partial _j\phi _{1}\,{\text{ d }}^{2}\varvec{x}&{}\int \phi _{2}^{\star }\partial _{j}\phi _{2}\,{\text{ d }}^{2}\varvec{x} \end{array}\right) \end{aligned}$$
(148)

and the ‘mass’ matrix

$$\begin{aligned} m=-\frac{1}{2}\left( \begin{matrix}\int \phi _{1}^{\star }\,\delta \epsilon \, \phi _{1}\,{\text{ d }}^{2}\varvec{x}&{}\int \phi _{1}^{\star }\,\delta \epsilon \, \phi _{2}\,{\text{ d }}^{2}\varvec{x} \\ \int \phi _{2}^{\star }\,\delta \epsilon \, \phi _{1}\,{\text{ d }}^{2}\varvec{x}&{}\int \phi _{2}^{\star }\,\delta \epsilon \, \phi _{2}\,{\text{ d }}^{2}\varvec{x} \end{matrix}\right) . \end{aligned}$$
(149)

The latter is named as such due to the similarity between (147) and the Dirac equation (see [47] and [48] for a complementary discussion).

The two–fold degeneracy at the point \(\varvec{K}\) is assumed to arise from a symmetry of the lattice (see footnote 13). Assuming a \(2\pi /N\) rotational symmetry, the two degenerate eigenfunctions will either be invariant under the rotation, or will become mixed up by it, in the same way as the components of a two dimensional vector after the application of the rotation matrix \(\varvec{R}\). We take the modes \(\phi _{1}\) and \(\phi _{2}\) to be eigenfunctions of this rotation matrix, which has eigenvalues \(\textrm{exp}(\pm 2\pi \textrm{i}/N)\). With this choice the modes transform under rotation as \(\phi _{1}\rightarrow \textrm{exp}(2\pi \textrm{i}/N)\,\phi _{1}\) and \(\phi _{2}\rightarrow \textrm{exp}(-2\pi \textrm{i}/N)\,\phi _{2}\) (see footnote 14).

We can use this behaviour of the modes under rotation to deduce the form of the matrix elements appearing in Eqns. (148) and (149). For instance, the off–diagonal elements of the ‘mass’ matrix must take the same value if we use the rotated coordinate system \(\varvec{x}'=\varvec{R}^\textrm{T}\cdot \varvec{x}\),

$$\begin{aligned} \int \phi _{1}^{\star }(\varvec{x})\,\delta \epsilon (\varvec{x})\,\phi _{2}(\varvec{x})\,{\text{ d }}^{2}\varvec{x}&=\int \phi _{1}^{\star }(\varvec{R}\cdot \varvec{x}')\,\delta \epsilon (\varvec{R}\cdot \varvec{x}')\,\phi _{2}(\varvec{R}\cdot \varvec{x}')\,{\text{ d }}^{2}\varvec{x}'\nonumber \\&=\textrm{e}^{-4\pi \textrm{i}/N}\int \phi _{1}^{\star }(\varvec{x}')\,\delta \epsilon (\varvec{x}')\,\phi _{2}(\varvec{x}')\,{\text{ d }}^{2}\varvec{x}' \end{aligned}$$
(150)

where \(\varvec{R}\) is the two dimensional rotation matrix, for rotation by \(2\pi /N\), and we assume that \(\delta \epsilon \) takes the same form after this rotation. Equation (150) implies that the integral is zero unless \(\textrm{e}^{-4\pi \textrm{i}/N}=1\), i.e. unless we have an \(N=2\) fold symmetry. We assume \(N>2\), leaving only the possibility of \(N=3,4\), and 6 fold lattice symmetry (see footnote 15). Similarly the diagonal elements of the \(\varvec{\alpha }\) matrix must obey

$$\begin{aligned} -\textrm{i}\int \phi _{1}^{\star }(\varvec{x})\varvec{\nabla }\phi _{1}(\varvec{x})\,{\text{ d }}^{2}\varvec{x}&=-\textrm{i}\int \phi _{1}^{\star }(\varvec{R}\cdot \varvec{x}')\,\varvec{R}^\textrm{T}\cdot \varvec{\nabla }'\phi _{1}(\varvec{R}\cdot \varvec{x}')\,{\text{ d }}^{2}\varvec{x}'\nonumber \\&=-\textrm{i}\varvec{R}^\textrm{T}\cdot \int \phi _{1}^{\star }(\varvec{x}')\,\varvec{\nabla }'\phi _{1}(\varvec{x}')\,{\text{ d }}^{2}\varvec{x}' \end{aligned}$$
(151)

implying that the matrix element written on the left of (151) is an eigenvector of the rotation matrix \(\varvec{R}^\textrm{T}\), with unit eigenvalue. As the eigenvalues of the rotation matrix are \(\textrm{exp}(\pm 2\pi \textrm{i}/N)\), neither of which equals one for \(N>2\), the matrix element itself must be zero! The two conditions (150) and (151) imply that the Dirac–like equation (147) takes the form

$$\begin{aligned} \left( \begin{matrix}0&{}-\varvec{e}\cdot \textrm{i}\varvec{\nabla } \\ -\varvec{e}^{\star }\cdot \textrm{i}\varvec{\nabla }&{}0\end{matrix}\right) \left( \begin{matrix}a_0\\ a_1\end{matrix}\right) +\left( \begin{matrix}m&{}0 \\ 0&{}-m\end{matrix}\right) \left( \begin{matrix}a_0\\ a_1\end{matrix}\right) =\lambda \left( \begin{matrix}a_0\\ a_1\end{matrix}\right) \end{aligned}$$
(152)

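where the two components of \(|\psi \rangle \) are now labelled \((a_0,a_1)\) rather than \((a_1,a_2)\), and the complex vector \(\varvec{e}\) is defined as the integral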

$$\begin{aligned} \varvec{e}=-\frac{\textrm{i}}{k_0^{2}}\int \phi _{1}^{\star }(\varvec{x})\varvec{\nabla }\phi _{2}(\varvec{x})\,{\text{ d }}^{2}\varvec{x}. \end{aligned}$$
(153)

The ‘mass’ m equals the difference in the ‘averaged’ values of the permittivity perturbation

$$\begin{aligned} m=\frac{1}{4}\left[ \int \phi _{2}^{\star }\,\delta \epsilon \,\phi _{2}\,{\text{ d }}^{2}\varvec{x}-\int \phi _{1}^{\star }\,\delta \epsilon \,\phi _{1}\,{\text{ d }}^{2}\varvec{x}\right] \end{aligned}$$
(154)

and the ‘energy’ eigenvalue \(\lambda \) equals the relative frequency shift of the mode plus the sum of the ‘averaged’ permittivity values,

$$\begin{aligned} \lambda =\frac{\delta k_0}{k_0}+\frac{1}{4}\left[ \int \phi _{1}^{\star }\,\delta \epsilon \,\phi _{1}\,{\text{ d }}^{2}\varvec{x}+\int \phi _{2}^{\star }\,\delta \epsilon \,\phi _{2}\,{\text{ d }}^{2}\varvec{x}\right] . \end{aligned}$$
(155)

We have now reduced our Dirac–like equation (147) to (152): the exact form of the two dimensional Dirac equation [43]. In the special case where \(m=\lambda \) (analogous to the point where \(E=mc^2\) for a relativistic particle), (152) implies that

$$\begin{aligned} \varvec{e}\cdot \varvec{\nabla }a_1=0 \end{aligned}$$
(156)

i.e. the ‘refractive index’ is again zero in the complex direction \(\varvec{e}\)! In the case of a periodic medium this means that the dispersion surface in the vicinity of the point \(\varvec{K}\) in the Brillouin zone behaves similarly to the zero index limit of a homogeneous medium, as shown in Fig. 17.

But what is the value of \(\varvec{e}\)? This too can be deduced using the same symmetry arguments as above. The complex vector \(\varvec{e}\) must obey

$$\begin{aligned} \varvec{e}=-\frac{\textrm{i}}{k_0^2}\int \phi _{1}^{\star }(\varvec{x})\varvec{\nabla }\phi _{2}(\varvec{x})\,{\text{ d }}^{2}\varvec{x}&=-\frac{\textrm{i}}{k_0^2}\varvec{R}^{T}\cdot \int \phi _{1}^{\star }(\varvec{R}\cdot \varvec{x}')\varvec{\nabla }'\phi _{2}(\varvec{R}\cdot \varvec{x}')\,{\text{ d }}^{2}\varvec{x}' \\&=-\frac{\textrm{i}}{k_0^2}\textrm{e}^{-\frac{4\pi \textrm{i}}{N}}\varvec{R}^{T}\cdot \int \phi _{1}^{\star }(\varvec{x}')\varvec{\nabla }'\phi _{2}(\varvec{x}')\,{\text{ d }}^{2}\varvec{x}', \end{aligned}$$

and \(\varvec{e}\) must therefore be an eigenvector of the inverse rotation matrix with eigenvalue \(\textrm{exp}(4\pi \textrm{i}/N)\). This is possible for an \(N=3\) fold symmetry, where the vector \(\varvec{e}=v(\varvec{e}_{x}+\textrm{i}\varvec{e}_{y})\) is such an eigenvector (v is a positive real constant). Therefore, for the case of a doubly degenerate point in the Brillouin zone, with three fold symmetry, (156) reduces to the Cauchy–Riemann equations, \(\partial a_{1}/\partial \mathcal {Z}^{\star }=0\). This means that the spatially varying envelope of the wave in the lattice becomes an analytic function of position. On top of the standing wave at e.g. a point \(\varvec{K}\) on the Brillouin zone boundary, we thus have a one–way circulation of the wave leading again to one–way interface states. Again, this reproduces the same result that would be obtained from a topological analysis of the wave in the vicinity of the degeneracy in the Brillouin zone.
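The eigenvalue requirement on \(\varvec{e}\) can be checked directly. The Python sketch below is an illustration rather than part of the derivation: it assumes \(\varvec{R}\) is the standard anti–clockwise rotation by \(2\pi /N\), and the helper names are our own. It tests whether \(\textrm{exp}(4\pi \textrm{i}/N)\) coincides with one of the eigenvalues \(\textrm{exp}(\pm 2\pi \textrm{i}/N)\) of the rotation, and applies \(\varvec{R}^\textrm{T}\) to the circular vectors \((1,\pm \textrm{i})\).

```python
import cmath
import math

def allows_nonzero_e(n, tol=1e-12):
    """True if exp(4 pi i / n) is an eigenvalue of a rotation by 2 pi / n."""
    target = cmath.exp(4j * math.pi / n)
    eigenvalues = [cmath.exp(2j * math.pi / n), cmath.exp(-2j * math.pi / n)]
    return any(abs(target - ev) < tol for ev in eigenvalues)

def rotate_transpose(n, vec):
    """Apply R^T, with R the anti-clockwise rotation by 2 pi / n (assumed convention)."""
    c, s = math.cos(2 * math.pi / n), math.sin(2 * math.pi / n)
    x, y = vec
    return (c * x + s * y, -s * x + c * y)

print([n for n in (3, 4, 6) if allows_nonzero_e(n)])  # [3]
print(rotate_transpose(3, (1, 1j)))    # a multiple of (1, i)
print(rotate_transpose(3, (1, -1j)))   # a multiple of (1, -i)
```

Of the allowed lattice symmetries only \(N=3\) admits a non–zero \(\varvec{e}\), and the eigenvectors are the two circular vectors. Which of them pairs with the eigenvalue \(\textrm{exp}(4\pi \textrm{i}/N)\) depends on the sense of rotation adopted for \(\varvec{R}\) and on the labelling of the two modes.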

Note that unlike the case of homogeneous media, here we only considered a region close to some point of the dispersion relation. Therefore our theory says nothing about the total number of interface states. From the perspective of a topological calculation the analogue is the ‘valley’ Chern number [49] (computed as an integral over a region of the Brillouin zone, rather than the full zone), which we can thus see records the presence of such points of zero index.

Example: The Jackiw–Rebbi state and Cauchy–Riemann conditions

Through introducing a slow spatial variation of the perturbation \(\delta \epsilon \), one–way modes (in the limited sense discussed immediately above) can be confined to propagate within the region of the inhomogeneity. We now show this for the special case of an interface between two materials where the wave behaves as an analytic function of \(\mathcal {Z}\) and \({\mathcal {Z}}^{\star }\) respectively.

For a uniform perturbation to the permittivity \(\delta \epsilon \) and fixed propagation constant \(\varvec{k}\), the Dirac equation (147) reduces to

$$\begin{aligned} \left( \begin{matrix}m&{}-\textrm{i}v\left( \frac{\partial }{\partial x}+\textrm{i}\frac{\partial }{\partial y}\right) \\ -\textrm{i}v\left( \frac{\partial }{\partial x}-\textrm{i}\frac{\partial }{\partial y}\right) &{}-m\end{matrix}\right) \left( \begin{matrix}a_0\\ a_1\end{matrix}\right)&=\left( \begin{matrix}m&{}v\left( k_x+\textrm{i}k_y\right) \\ v\left( k_x-\textrm{i}k_y\right) &{}-m\end{matrix}\right) \left( \begin{matrix}a_0\\ a_1\end{matrix}\right) \nonumber \\&=\lambda \left( \begin{matrix}a_0\\ a_1\end{matrix}\right) \end{aligned}$$
(157)

The eigenvalues of (157) therefore fix the propagation constant \(\varvec{k}\) to obey \(\lambda =\pm (v^2\varvec{k}^2+m^2)^{1/2}\). We have seen this form of dispersion relation many times now! Not only is this the counterpart of the relativistic dispersion relation, \(E=\pm (m^2c^4+\varvec{p}^2c^2)^{1/2}\), the same form governs the elastic (142), and electromagnetic (92) waves discussed above. As in all those cases the dispersion relation is that illustrated in Fig. 14: we have two regions of allowed propagation, where the magnitude of the ‘energy’ is greater than the rest energy, separated by a band gap. At the edge of the band gap the wave becomes an analytic function of position and accordingly exhibits one–way propagation.
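The dispersion relation quoted above follows from the characteristic equation of the \(2\times 2\) matrix in (157). A minimal Python check, with a function name and test values of our own choosing, evaluates the determinant of the matrix minus \(\lambda \) times the identity at \(\lambda =\pm (v^2\varvec{k}^2+m^2)^{1/2}\).

```python
import math

def dirac_residual(m, v, kx, ky, sign=+1):
    """det(H - lambda) for the plane-wave matrix H in (157), at the claimed eigenvalue."""
    lam = sign * math.sqrt(v**2 * (kx**2 + ky**2) + m**2)
    h = [[m, v * (kx + 1j * ky)],
         [v * (kx - 1j * ky), -m]]
    return (h[0][0] - lam) * (h[1][1] - lam) - h[0][1] * h[1][0]

# Both branches of the dispersion relation give a vanishing determinant
print(abs(dirac_residual(0.3, 1.2, 0.7, -0.4, sign=+1)))
print(abs(dirac_residual(0.3, 1.2, 0.7, -0.4, sign=-1)))
```

At \(\varvec{k}=0\) the two branches sit at \(\lambda =\pm m\), which are the edges of the band gap referred to in the text.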

Suppose that \(\delta \epsilon \) is such that it changes sign under an inversion of the (xyz) coordinate system \(\delta \epsilon \rightarrow -\delta \epsilon \). Such an inversion reverses the sense of rotation, and must interchange the two modes \(\phi _1\) and \(\phi _2\) (i.e. it is equivalent to swapping the eigenvectors of the rotation matrix). Thus, for this particular form of perturbation

$$\begin{aligned} \int \phi _{1}^{\star }\delta \epsilon \phi _1\,{\text{ d }}^{2}\varvec{x}=-\int \phi _{2}^{\star }\delta \epsilon \phi _2\,{\text{ d }}^{2}\varvec{x} \end{aligned}$$
(158)

implying that Eqns. (154) and (155) simply reduce to \(\lambda =\delta k_0/k_0\) and \(m=(1/2)\int \phi _2^{\star }\,\delta \epsilon \,\phi _2\,{\text{ d }}^{2}\varvec{x}\). If m changes as a function of x alone, is homogeneous at infinity, and changes smoothly from \(-|\lambda |\) as \(x\rightarrow -\infty \) to \(+|\lambda |\) as \(x\rightarrow +\infty \), then there is a general solution to (157)

$$\begin{aligned} |\psi \rangle =\textrm{e}^{-\frac{1}{v}\int _0^x m(x')\,{\text{ d }}x'+\textrm{i}k_y y}\frac{1}{\sqrt{2}}\left( \begin{matrix}1\\ \textrm{i}\end{matrix}\right) \end{aligned}$$
(159)

which holds only for \(\lambda =-v k_y\), and hence only for a negative phase velocity. Assuming \(|m|=|\lambda |\) as \(|x|\rightarrow \infty \), the mode (159) takes the form

$$\begin{aligned} |\psi \rangle \propto \textrm{e}^{-|k_y|(|x|-\textrm{i}y)} \end{aligned}$$
(160)

i.e. an analytic function of \(\mathcal {Z}\) or \(\mathcal {Z}^{\star }\), depending on which side of the interface we are considering. This mode is a special case of the one–way propagating Jackiw–Rebbi mode of the two dimensional Dirac equation [50]. Again, enforcing one–way propagation of a wave by demanding analyticity (in this case, finding a unidirectional interface state by connecting materials where the wave behaves as a function of \(\mathcal {Z}\) and \(\mathcal {Z}^{\star }\)) is a simple shortcut to findings that are ordinarily connected with topological arguments [51].
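The interface mode (159) can also be checked numerically. The sketch below assumes a hypothetical mass profile \(m(x)=m_0\tanh (x/w)\), which is not specified in the text, and takes the lower-left entry of the operator in (157) to be \(-\textrm{i}v(\partial _x-\textrm{i}\partial _y)\), as required by the plane-wave form on the right-hand side of (157). It applies the operator to (159) with central finite differences and measures how far the result is from \(\lambda |\psi \rangle \) with \(\lambda =-vk_y\). All function and parameter names are illustrative.

```python
import numpy as np


def mass_profile(x, m0=1.0, width=2.0):
    """Assumed smooth profile: -m0 as x -> -inf, +m0 as x -> +inf."""
    return m0 * np.tanh(x / width)


def interface_mode(x, y, ky, v=1.0, m0=1.0, width=2.0):
    """Spinor (a0, a1) of (159) for the tanh profile.

    The integral of m0*tanh(x'/w) from 0 to x is m0*w*log(cosh(x/w)).
    """
    envelope = np.exp(-(m0 * width / v) * np.log(np.cosh(x / width)))
    a0 = envelope * np.exp(1j * ky * y) / np.sqrt(2)
    return a0, 1j * a0


def dirac_residual(x, y, ky, v=1.0, m0=1.0, width=2.0, lam=None, h=1e-5):
    """Largest |D psi - lam psi| over the sample points x (lam defaults to -v*ky)."""
    lam = -v * ky if lam is None else lam

    def mode(xx, yy):
        return np.array(interface_mode(xx, yy, ky, v, m0, width))

    a0, a1 = mode(x, y)
    # Central finite differences for the x and y derivatives of both components.
    d_dx = (mode(x + h, y) - mode(x - h, y)) / (2 * h)
    d_dy = (mode(x, y + h) - mode(x, y - h)) / (2 * h)
    m = mass_profile(x, m0, width)
    top = m * a0 - 1j * v * (d_dx[1] + 1j * d_dy[1])
    bottom = -1j * v * (d_dx[0] - 1j * d_dy[0]) - m * a1
    return float(np.max(np.abs([top - lam * a0, bottom - lam * a1])))


if __name__ == "__main__":
    x = np.linspace(-6.0, 6.0, 25)
    print("residual, lam = -v*ky:", dirac_residual(x, 0.3, ky=0.7))
    print("residual, lam = +v*ky:", dirac_residual(x, 0.3, ky=0.7, lam=0.7))
```

The residual should be at the level of the finite-difference error for \(\lambda =-vk_y\) and of order one for \(\lambda =+vk_y\), reflecting the fact that the bound state exists for only one sign of the phase velocity.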

9 Concluding Remarks

Topology is a deep subject that a physicist can easily get lost in. At the beginning of the tutorial we spent some time building up some small foundations of topology, for the special case of vectors living on closed surfaces. As many physicists do not receive any education in this area of mathematics, we hope that this will be a useful introduction, clarifying the origins of the infamous “Chern number”, in addition to the connection between this invariant and the number of trapped waves at an interface.

Although the topological approach is powerful, in the author’s view it is problematic that the prediction of topological interface states contains no local information about either the wave behaviour or the properties of the interface. This makes it very difficult to understand the origins of these one–way interface states. What is it about the material that leads to this one–way propagation, and could we have predicted these states without using topology?

The Chern number records the number and type of critical points in the Berry connection over e.g. the torus corresponding to the first Brillouin zone. In the second half of the tutorial we showed in some examples that these critical points correspond to points where the refractive index is zero. In terms of crystal optics the refractive index at these points vanishes in a complex direction, e.g. \(\varvec{e}_{x}+\textrm{i}\varvec{e}_{y}\). This is equivalent to the wave satisfying the Cauchy–Riemann conditions and thus circulating in only one direction, which is the origin of the one–way propagation of the interface states found from a topological calculation. Finding these zero index points can thus be used as a shortcut to find one–way propagating interface modes, as shown in our examples in electromagnetic materials, elastic continua, and periodic materials. The reader may find this a useful alternative to standard topological calculations.