
Fiber bundles, Yang–Mills theory, and general relativity


I articulate and discuss a geometrical interpretation of Yang–Mills theory. Analogies and disanalogies between Yang–Mills theory and general relativity are also considered.



  1. 1.

    Healey (2007) is now (already) the locus classicus for philosophy of Yang–Mills theory; the reader should refer there for an extensive discussion of the philosophical literature on the topic and for further references.

  2. 2.

    One exception to this rule is Leeds (1999), who does attempt to articulate a novel fiber bundle interpretation of Yang–Mills theory, though the fiber bundle formalism he begins with is somewhat idiosyncratic with respect to the mathematical physics literature. The view developed here is significantly different. Catren (2008), too, offers an interpretation of Yang–Mills theory that makes extensive use of the fiber bundle formalism, but his goal is to relate this formalism to certain general “principles” that he detects in the foundations of Yang–Mills theory. This approach reflects a substantially different posture towards the foundations of physics from the one adopted here. Perhaps the closest precursor to the present paper is Healey (2001, 2007), who describes and rejects a fiber bundle interpretation in the course of clearing the ground for his alternative holonomy interpretation. But for reasons discussed in detail in Sect. 5 of this paper, Healey rejects what I argue is the most natural interpretational strategy, leading him to a rather different view.

  3. 3.

    I will not discuss the relative merits of fiber bundle and holonomy interpretations of Yang–Mills theories. (For more on that topic, see Rosenstock and Weatherall 2015a, b.) And although I do not know of anyone who has written a philosophical treatment along the lines of what follows, I do not claim that the views presented here are original. Indeed, much of what I say is implicit in classical sources in the mathematical physics literature, especially Palais (1981) and Bleecker (1981). That said, in both sources the themes emphasized here are somewhat obscured by the authors’ treatments of Kaluza-Klein theory, which puts the same formalism to strikingly different physical uses.

  4. 4.

    Of course, how one should understand general relativity is itself a matter of some continuing controversy (see Brown 2005); nonetheless, it seems we are still on firmer ground than in Yang–Mills theory. As will be clear in what follows, I take general relativity to be a theory of spatiotemporal geometry, along the lines of what is described in, say, Wald (1984) or Malament (2012).

  5. 5.

    One possible exception occurs in theories of cosmological inflation, where the “inflaton” field is often taken to be a smooth scalar field satisfying some non-linear wave equation. But since this field is not charged, its dynamics would not depend on a principal connection.

  6. 6.

    I am taking for granted, here, as in the appendix, some familiarity with both the mathematics and the physics of general relativity. For sympathetic treatments of both topics, see Wald (1984) and Malament (2012).

  7. 7.

    All manifolds considered here are assumed to be Hausdorff, paracompact, and smooth; more generally, all maps and fields that are candidates to be smooth will also be assumed to be smooth.

  8. 8.

    The question of what postulating such a structure means physically will be the primary concern of this paper, and will be addressed in subsequent sections. For now, we just take for granted that this is the geometrical setting in which we are working. For mathematical background, see the appendix and references therein. One notational convention is worth mentioning here, however: throughout, I use the “abstract index” notation developed by Penrose and Rindler (1984) and described in detail by Wald (1984) and Malament (2012), suitably modified to distinguish the range of vector spaces that one encounters in the theory of principal bundles. A detailed discussion of these modifications is given in appendix section “Notational conventions”.

  9. 9.

    The invariance of the inner product \(k_{\mathfrak {A}\mathfrak {B}}\) is with respect to the adjoint action of \(\mathfrak {g}\) on itself. In the case of a compact Lie group, there is an essentially unique choice of inner product up to scaling factor. This scaling factor is closely related to the “coupling constant” associated with a Yang–Mills theory, i.e., it provides a measure of the relative “strength of interaction” of different Yang–Mills fields. The inner product is necessary to define, for instance, energy-momentum tensors and Lagrangians for Yang–Mills fields, but it plays no role in the present paper.

  10. 10.

    To be clear about this notation, the curvature is the tensor that takes vectors \(\xi ^{\alpha },\eta ^{\alpha }\) at a point \(x\in P\) to the Lie algebra element \(\xi ^{\alpha }\eta ^{\beta }d_{\alpha }\omega ^{\mathfrak {A}}{}_{\beta } + \frac{1}{2}\left[ \omega ^{\mathfrak {A}}{}_{\alpha }\xi ^{\alpha },\omega ^{\mathfrak {A}}{}_{\beta }\eta ^{\beta }\right] \). See “Curvature”.

  11. 11.

    See appendix sections “Exterior and induced covariant derivatives” and “Hodge star operation on horizontal and equivariant vector valued forms”, respectively, for more on exterior covariant derivatives and this Hodge star. Note that, although the Hodge star notation is a somewhat unnatural fit with the index notation, Eq. (2) makes sense: \(\star \Omega ^{\mathfrak {A}}{}_{\beta \kappa }\) is a vector valued two form, so \(\overset{\omega }{D}{}_{\alpha }\star \Omega ^{\mathfrak {A}}{}_{\beta \kappa }\) is a vector valued three form, and thus \(\star \overset{\omega }{D}{}_{\alpha }\star \Omega ^{\mathfrak {A}}{}_{\beta \kappa }\) is a vector valued one form.

  12. 12.

    For more on the pullback of vector valued forms, see appendix section “Vector valued forms”.

  13. 13.

    Vertical principal bundle automorphisms are defined in appendix section “Vector bundles and principal bundles”.

  14. 14.

    One has to be slightly careful: the right action on P by a fixed element \(g\in G\) gives rise to a vertical principal bundle automorphism only when G is Abelian; more generally, the G action will vary from point to point. See Bleecker (1981, §3.2) for a lucid and complete treatment of the vertical principal bundle automorphisms of a principal bundle P.

  15. 15.

    Once again, for further details on the associated bundle construction, see appendix section “Associated bundles”.

  16. 16.

    Up to a choice of scaling factor.

  17. 17.

    Let me once again echo the dissatisfaction noted above. The complex scalar field described here, though a useful toy example, does not appear to represent any realistic matter.

  18. 18.

    Note that we might have equally well begun with a two dimensional real vector space. This will be important in Sect. 4.

  19. 19.

    Note that in this case, we may define the charge-current density as a field on M because, as noted above, \(J_a\) is independent of the choice of section of EM. In a more general setting, we would define \(J^{\mathfrak {A}}{}_{\alpha }\) as a field on the total space of the principal bundle.

  20. 20.

    See, for instance, Trautman (1965) and Malament (2012, Ch. 4).

  21. 21.

    Another class of physical theories that philosophers have studied recently—see, for instance, Belot (2007), Butterfield (2007), North (2009), Curiel (2013), and Barrett (2014)—that make significant use of the sorts of geometrical methods discussed here are Hamiltonian and Lagrangian mechanics. In that context one works with the cotangent and tangent bundles, respectively, of the manifold of possible configurations of some physical system. As I hope will be clear in what follows, although similar methods are used, there are no strong analogies between this application of geometry in physics and Yang–Mills theory. In particular, nothing in Yang–Mills theory as I have described it here should be understood as a manifold whose points are (global) configurations or instantaneous states of any physical system.

  22. 22.

    Some more details on the construction of the frame bundle are provided in appendix section “Tangent and frame bundles”.

  23. 23.

    Generally in what follows, when we refer to “bases”, we mean “ordered bases”.

  24. 24.

    Again, for details of the construction of the tangent bundle, see appendix section “Tangent and frame bundles”.

  25. 25.

    See also Swanson and Halvorson (2012) for a related discussion.

  26. 26.

    To be clear: GL(V) is just the general linear group \(GL(n,\mathbb {K})\), where n is the (unspecified) dimension of V, and \(\mathbb {K}\) is \(\mathbb {R}\) or \(\mathbb {C}\), depending on whether V is real or complex.

  27. 27.

    In particular, it is the fact that \(\rho \) is a faithful representation of U(1) that allows us to use it to produce a U(1) subbundle of the frame bundle of the corresponding associated bundle. This is not always possible: there exist representations of Lie groups that are not faithful, and indeed, there exist Lie groups that do not admit faithful representations on any vector space. Still, even in the most general case, a representation \(\rho :G\rightarrow GL(V)\) (faithful or not) of a group G on some vector space V gives rise to some subbundle of the frame bundle of the corresponding associated bundle, namely a \(\rho [G]\) subbundle. Moreover, in cases of interest in physics, one does have faithful representations, except where one uses trivial representations of a given Yang–Mills structure group as shorthand for the observation that a certain matter field is impervious to the interactions governed by that Yang–Mills force.

  28. 28.

    This holds generally for subbundles. See Kobayashi and Nomizu (1963, Prop. 6.1).

  29. 29.

    Indeed, one can state the fundamental theorem of (pseudo)-Riemannian geometry in these terms: there exists a unique principal connection on any O(1, 3) principal bundle that induces a torsion-free derivative operator. See Bleecker (1981, §6.2).

  30. 30.

    To be clear, a given structure on V always determines a subbundle of LE; the converse is true for each fiber of the vector bundle, but in some cases a further integrability condition is needed to define the corresponding structure on the entire vector bundle.

  31. 31.

    What is the physical significance of this additional structure? It is hard to say, for the same reasons that I expressed dissatisfaction in the introduction and fn. 17. Ultimately, the structure is a relic of the fact that we are interpreting fields associated with quantum states—where a Hermitian inner product is natural—as classical matter.

  32. 32.

    Of course, one can consider U(n) subbundles of the frame bundles of vector bundles with higher dimensional fibers; in such cases, the U(n) subbundle puts a complex vector space structure and a Hermitian inner product on a 2n dimensional subspace of the fibers. Similar considerations apply in the other cases mentioned.

  33. 33.

    This perspective, which is very congenial to the one offered here, is also reflected in Palais (1981); Geroch (private correspondence) appears to take the same line. I should emphasize, though, that on a purely mathematical level, the facts on the ground support a kind of equanimity regarding “which comes first”, the principal bundle or the vector bundles. My claim here is that, with regard to the physics, the formalism is much more naturally understood if one takes the vector bundles to be primary, since they play a more direct role in representing matter.

  34. 34.

    One might worry that the role I have just ascribed to the principal bundles—of coordinating physically significant data concerning parallel transport and curvature across different kinds of charged matter—is robust enough that it is misleading to call it “auxiliary”, since being “auxiliary” may suggest that the structure is unnecessary or eliminable. In any case, I hope I have been clear enough above about what I take the roles of the various bundles to be—vector bundles represent possible local states of matter; principal bundles coordinate between these vector bundles—that the sense of “auxiliary” I have in mind is clear. It is the sense in which a coach is auxiliary to the players on the field.

  35. 35.

    Note that the reason that the dynamics of these theories are different—or even can be different, given the uniqueness results discussed in Bleecker (1981, §10.2), Palais (1981, pp.80-2), or Feintzeig and Weatherall (2014)—is intimately related to the other disanalogies discussed in the present section, and so in that sense the differences in the dynamics may be relevant to the discussion here. But the mere fact that they are different is not relevant.

  36. 36.

    If one has a vector bundle \(E{^\prime }\rightarrow N\) over N, a smooth map \(f:M\rightarrow N\) may be used to define a pullback bundle, \(f^*(E{^\prime })\rightarrow M\), which is a bundle whose fiber at each point \(p\in M\) is the fiber at f(p) of \(E{^\prime }\). But this construction defines a new bundle; it does not generate a map between a fixed bundle over M and a bundle over N.

  37. 37.

    Compare this with the situation in the Yang–Mills theories we have described, where one generally has a non-degenerate inner product on the fibers of one’s bundle, but where there are generally many connections compatible with this inner product. The reason is that there is no analogue of torsion for a connection on a generic vector bundle, and so there is no way to single out an analogue to the Levi-Civita derivative operator.

  38. 38.

    Here \((\rho (g^{-1}))^A{}_B\) is the tensor corresponding to the action of \(g^{-1}\) on V in the representation \(\rho \).

  39. 39.

    The reason this should be unsurprising is that vertical principal bundle automorphisms will generally not preserve the “choices” described above.

  40. 40.

    Note that this expression makes sense, since both sides of the inequality are evaluated at points of the same fiber, and at every point of that fiber, \(\bar{\vartheta }^A{}_a\) is a map between the same two vector spaces, viz. \(T_{\wp _L(u)}M\) and V. To see that the inequality holds, note that for any \(u\in LM\), there is some \(g\in GL(n,\mathbb {R})\) such that \(\Psi (u)=ug\), where \(g=e\) iff \(\Psi =1_{LM}\). And since \(\bar{\vartheta }^A{}_a\) is equivariant, \((\bar{\vartheta }^A{}_a)_{|\Psi (u)}=(\bar{\vartheta }^A{}_a)_{|ug}= (\rho (g^{-1}))^A{}_B(\bar{\vartheta }^B{}_a)_{|u}\ne (\bar{\vartheta }^A{}_a)_{|u}\).

  41. 41.

    It is interesting to note that without a choice of solder form on LM, the group of principal bundle automorphisms is naturally a supergroup of the group of diffeomorphisms of its base space. So on the Barrett (2014) account, a frame bundle without a solder form has less structure than its base space.

  42. 42.

    There is a tension in Anandan’s view, here. He takes the solder form to be canonical, in the sense that it is a structure that both arises naturally on LM and “breaks” the symmetry; but he also argues that the considerations he presents suggest that the solder form should be a fundamental dynamical variable in theories of gravitation, implying that different configurations of matter would lead to different solder forms. One cannot have it both ways.

  43. 43.

    There is an analogy here to general relativity in a different guise. In that context, one often hears that general relativity exhibits “diffeomorphism invariance”. But it is not the case that, given a relativistic spacetime \((M,g_{ab})\) and a diffeomorphism \(f:M\rightarrow M\), \(f^*(g_{ab})=g_{ab}\). So in this sense—which is the same as the sense in which the solder form “breaks” “gauge symmetry”—the metric “breaks” diffeomorphism invariance. What is the case is that if \((M,g_{ab})\) is a spacetime, then \((M,f^*(g_{ab}))\) is also a spacetime, and it is “equivalent” in the sense that \((M,g_{ab})\) and \((M,f^*(g_{ab}))\) may be used to represent the same physical situations. But this, mutatis mutandis, is precisely what happens with the solder form. See Weatherall (2015) for a discussion of related issues and how they have led to spurious arguments in the relativity literature.

  44. 44.

    Here the \(\bar{}\) over the raised A index indicates that this index is valued in a different vector space than the lowered index. (Neither space is a tangent space to any manifold.)

  45. 45.

    One can extend this argument to any principal bundle with an associated bundle. Given a principal bundle \(G\rightarrow P\xrightarrow {\wp } M\), a vector space V, and a representation \(\rho :G\rightarrow GL(V)\), one can always define an equivariant mixed index tensor \(\delta ^{\bar{A}}{}_B\) on P whose action at each \(x\in P\) is to map vectors in the fiber of \(P\times _{G} V\) at \(\wp (x)\) to the vector in V corresponding to it. This, too, is a kind of solder form: it specifies how P relates to its associated bundle. But it is not invariant under vertical principal bundle automorphisms of P!

  46. 46.

    In the general case, a metric is no help.

  47. 47.

    This case is of interest because it is often cited as an analogue to the Aharonov-Bohm effect, wherein a quantum particle propagating around a solenoid exhibits a distinctive interference pattern even though the electromagnetic field vanishes in the region in which the particle propagates. The reason the cone is analogous is that, with the standard metric and derivative operator, the cone is everywhere flat, but because it is not simply connected, parallel transport is globally path-dependent. Likewise, in the Aharonov-Bohm effect, one has non-trivial phase shifts corresponding to non-trivial “holonomies” of the principal connection on the principal bundle associated with electromagnetism, even though the electromagnetic field (i.e., the curvature of the principal connection) vanishes everywhere. See Healey (2007, Ch.2) for an extensive discussion of the Aharonov-Bohm effect.

  48. 48.

    This is because in such a case, \(\eta ^n\nabla _n(g_{ab}\xi ^a\eta ^b)=0\), meaning that the angle between \(\xi ^a\) and \(\eta ^a\) is constant everywhere on \(\gamma \). This is one sense in which the metric is preserved by \(\nabla \).

  49. 49.

    For similar reasons, it is not clear that this sense of holonomy along open curves is sufficient to support Healey’s conclusion that general relativity is “separable,” in the sense that parallel transport depends only on local properties, in a way that Yang–Mills theory is not. (A similar point is made by Myrvold 2011.) But I will defer further discussion of the relationship between the present views and Healey’s “holonomy interpretation” of Yang–Mills theory to future work. (See, in particular, Rosenstock and Weatherall 2015a, b.)

  50. 50.

    See Geroch (1996) and Wald (1984, Ch. 13) for similar generalizations of the notation. The principal virtue of using this notation, aside from familiarity to a particular community, is that it naturally permits multi-index tensor fields, as discussed in appendix section “Notational conventions” below.

  51. 51.

    In what follows, as in the body of the paper, it should be assumed that all manifolds, maps, and fields described are smooth. Likewise, all manifolds are assumed to be Hausdorff and paracompact.

  52. 52.

    Note that a trivial bundle and a product manifold are not quite the same thing: a fiber bundle only has one privileged projection map, which is onto the base space, whereas a product manifold has two privileged projection maps, one onto each factor.

  53. 53.

    A right action of a group G on a space P is free if for any point \(x\in P\) and \(g\in G\), \(xg=x\) if and only if g is the identity.

  54. 54.

    As noted above, these should be understood in the context of the abstract index notation. The particular conventions introduced here follow Geroch (1996) closely.

  55. 55.

    See “Vector valued forms” for a degenerate instance of this criterion that may clarify how it works in practice.

  56. 56.

    By “fixed”, I mean that all of these maps have the same vector space as their codomain, i.e., the vector space does not vary from point to point of M as it would with a vector bundle over M. Another, more general, way of thinking about vector valued n forms is as mixed index tensors on M, \(\kappa ^A{}_{a_1\ldots a_n}(=\kappa ^A{}_{[a_1\ldots a_n]})\), whose single contravariant index is valued in the fibers of some vector bundle over M, rather than a single vector space V. But one has to be careful. If one adopts this more general perspective, one cannot extend the exterior derivative from ordinary forms to vector valued forms except in the presence of a linear connection on the vector bundle over M. See Palais (1981, pp.10–11, 30–32) for discussion.

  57. 57.

    I am grateful to Dick Palais (personal correspondence) for suggesting this way of thinking about the exterior derivative’s action on vector valued forms to me.

  58. 58.

    Given a manifold M, a point p, and two (tangent) vector fields \(\xi ^a\) and \(\eta ^a\) defined on some neighborhood O containing p, the commutator of \(\xi ^a\) and \(\eta ^a\), written \([\xi ,\eta ]^a\) or \([\xi ^a,\eta ^a]\), is defined by \([\xi ,\eta ]^a=\mathcal {L}_{\xi }\eta ^a\), where \(\mathcal {L}_{\xi }\) is the Lie derivative with respect to \(\xi ^a\), defined in Sect. 5. For more on the Lie derivative and commutator, see Malament (2012, §1.6).

  59. 59.

    The construction is regrettably complicated; see Malament (2012, §1.7) for a detailed discussion. Note, too, that a covariant derivative on a vector bundle is closely related to a connection on that bundle, in the sense described in appendix section “Connections and parallel transport”. Given a covariant derivative on a vector bundle, there is always a unique connection on the bundle that gives rise to the same standard of parallel transport; conversely, given any connection whose associated parallel transport generates linear maps between fibers (a so-called linear connection), there is a unique covariant derivative operator that gives rise to the same standard of parallel transport.

  60. 60.

    In general many vectors at x will have this property; the reason it does not matter which one chooses is that the exterior covariant derivative only acts on the horizontal part of vectors, relative to \(\omega ^{\mathfrak {A}}{}_{\alpha }\), and all vectors at x that project down to a given vector at \(\wp (x)=p\) have the same horizontal part.

  61. 61.

    As with the Riemann curvature tensor, the right hand side of this equation is independent of the values of \(\kappa ^A\) away from p. See Malament (2012, §1.8).

  62. 62.

    To see this, observe that the raised index on \(\Omega ^A{}_{\alpha \beta }\) may be thought of as “vertical valued”, since the vertical space at any point \(v^A\) of a vector bundle is canonically isomorphic to the fiber containing \(v^A\).

  63. 63.

    For more on volume elements, see Malament (2012, §1.11).


  1. Anandan, J. (1993). Remarks concerning the geometries of gravity and gauge fields. In B. L. Hu, M. P. Ryan, & C. V. Vishveshwara (Eds.), Directions in general relativity (pp. 10–20). New York: Cambridge University Press.

  2. Arntzenius, F. (2012). Space, time, and stuff. New York: Oxford University Press.

  3. Baez, J., & Muniain, J. (1994). Gauge fields, knots and gravity. River Edge, NJ: World Scientific.

  4. Barrett, T. (2014). On the structure of classical mechanics. The British Journal for the Philosophy of Science. doi:10.1093/bjps/axu005.

  5. Belot, G. (1998). Understanding electromagnetism. The British Journal for the Philosophy of Science, 49(4), 531–555.

  6. Belot, G. (2003). Symmetry and gauge freedom. Studies in History and Philosophy of Modern Physics, 34(2), 189–225.

  7. Belot, G. (2007). The representation of time and change in mechanics. In J. Butterfield & J. Earman (Eds.), Philosophy of physics (pp. 133–228). Amsterdam: Elsevier.

  8. Bleecker, D. (1981). Gauge theory and variational principles. Reading, MA: Addison-Wesley. (Reprinted by Dover Publications in 2005).

  9. Brown, H. (2005). Physical relativity. New York: Oxford University Press.

  10. Butterfield, J. (2007). On symplectic reduction in classical mechanics. In J. Butterfield & J. Earman (Eds.), Philosophy of physics (pp. 1–132). Amsterdam: Elsevier.

  11. Catren, G. (2008). Geometrical foundation of classical Yang-Mills theory. Studies in History and Philosophy of Modern Physics, 39(3), 511–531.

  12. Curiel, E. (2013). Classical mechanics is Lagrangian; it is not Hamiltonian. The British Journal for the Philosophy of Science, 65(2), 269–321.

  13. Earman, J. (1995). Bangs, crunches, whimpers, and shrieks. New York: Oxford University Press.

  14. Feintzeig, B., & Weatherall, J. O. (2014). The geometry of the ‘gauge argument’, unpublished manuscript.

  15. Friedman, M. (1983). Foundations of space-time theories. Princeton, NJ: Princeton University Press.

  16. Geroch, R. (1996). Partial differential equations of physics. In G. S. Hall & J. R. Pulham (Eds.), General relativity: Proceedings of the forty sixth Scottish Universities summer school in physics (pp. 19–60). Edinburgh: SUSSP Publications.

  17. Healey, R. (2001). On the reality of gauge potentials. Philosophy of Science, 68(4), 432–455.

  18. Healey, R. (2004). Gauge theories and holisms. Studies in History and Philosophy of Modern Physics, 35(4), 643–666.

  19. Healey, R. (2007). Gauging what’s real: The conceptual foundations of contemporary gauge theories. New York: Oxford University Press.

  20. Kobayashi, S., & Nomizu, K. (1963). Foundations of differential geometry (Vol. 1). New York: Interscience Publishers.

  21. Kolář, I., Michor, P. W., & Slovák, J. (1993). Natural operations in differential geometry. New York: Springer.

  22. Lee, J. M. (2009). Manifolds and differential geometry. Providence, RI: American Mathematical Society.

  23. Leeds, S. (1999). Gauges: Aharonov, Bohm, Yang, Healey. Philosophy of Science, 66(4), 606–627.

  24. Lyre, H. (2004). Holism and structuralism in \(U(1)\) gauge theory. Studies in History and Philosophy of Modern Physics, 35(4), 643–670.

  25. Malament, D. (2012). Topics in the foundations of general relativity and Newtonian gravitation theory. Chicago: University of Chicago Press.

  26. Maudlin, T. (2007). The metaphysics within physics (pp. 78–103, Ch. 3). New York: Oxford University Press.

  27. Michor, P. (2009). Topics in differential geometry. Providence, RI: American Mathematical Society.

  28. Myrvold, W. C. (2011). Nonseparability, classical, and quantum. The British Journal for the Philosophy of Science, 62(2), 417–432.

  29. Nakahara, M. (1990). Geometry, topology and physics. Philadelphia: Institute of Physics Publishing.

  30. North, J. (2009). The ‘structure’ of physics: A case study. Journal of Philosophy, 106(2), 57–88.

  31. Palais, R. S. (1981). The geometrization of physics. Institute of Mathematics, National Tsing Hua University, Hsinchu, Taiwan, accessed from http://vmm.math.uci.edu/.

  32. Penrose, R., & Rindler, W. (1984). Spinors and space-time. New York: Cambridge University Press.

  33. Rosenstock, S., & Weatherall, J. O. (2015a). A categorical equivalence between generalized holonomy maps on a connected manifold and principal connections on bundles over that manifold. arXiv:1504.02401 [math-ph].

  34. Rosenstock, S., & Weatherall, J. O. (2015b). On holonomy and fiber bundle interpretations of Yang–Mills theory, unpublished manuscript.

  35. Swanson, N., & Halvorson, H. (2012). On North’s ‘the structure of physics’. http://philsci-archive.pitt.edu/9314/.

  36. Taubes, C. H. (2011). Differential geometry: Bundles, connections, metrics and curvature. New York: Oxford University Press.

  37. Trautman, A. (1965). Foundations and current problems of general relativity. In S. Deser & K. W. Ford (Eds.), Lectures on general relativity (pp. 1–248). Englewood Cliffs, NJ: Prentice-Hall.

  38. Trautman, A. (1980). Fiber bundles, gauge fields, and gravitation. In A. Held (Ed.), General relativity and gravitation (pp. 287–308). New York: Plenum Press.

  39. Wald, R. (1984). General relativity. Chicago: University of Chicago Press.

  40. Weatherall, J. O. (2015). Regarding the ‘Hole Argument’. The British Journal for the Philosophy of Science (Forthcoming). arXiv:1412.0303.

  41. Wu, T. T., & Yang, C. N. (1975). Concept of nonintegrable phase factors and global formulation of gauge fields. Physical Review D, 12(12), 3845–3857.



This material is based upon work supported by the National Science Foundation under Grant No. 1331126. Special thanks are due to participants in my 2012 seminar on Gauge Theories, and especially Ben Feintzeig and Sarita Rosenstock for their many discussions on these topics. I am also particularly indebted to Dick Palais and Bob Geroch for enlightening correspondences on the geometrical foundations of Yang–Mills theory. Helpful conversations and correspondence with Dave Baker, Jeff Barrett, Gordon Belot, Erik Curiel, Katherine Brading, Richard Healey, David Malament, Oliver Pooley, Chris Smeenk, Bob Wald, David Wallace, and Chris Wüthrich have also contributed to the development of the ideas presented here. Erik Curiel, Sam Fletcher, and David Malament read the manuscript carefully and noted several slips (though remaining errors are, of course, my own!). Versions of this paper were presented to the Southern California Philosophy of Physics Group and at the Munich Center for Mathematical Philosophy; I am grateful to the participants there for discussion and comments.

Author information

Correspondence to James Owen Weatherall.

Appendix 1: Primer on the geometry of Yang–Mills theory for philosophers of space and time


In this appendix, I provide background definitions and some further technical details regarding principal bundles and vector bundles. Unlike other presentations of this material, this appendix is targeted at philosophers of physics familiar with the formalism and conventions of the foundations of spacetime physics literature, as in (for instance) Friedman (1983), Wald (1984), Earman (1995), or Malament (2012). That said, this is not a complete pedagogical treatment of these topics. For further details, the books I find clearest are Palais (1981) and Bleecker (1981); Nakahara (1990) and Baez and Muniain (1994) are also noteworthy. For additional background on the geometry, the classic source on the theory of connections on fiber bundles is Kobayashi and Nomizu (1963), which remains the most comprehensive text available, despite certain pre-modern tendencies; additional helpful sources are Kolář et al. (1993), Michor (2009), Lee (2009), and Taubes (2011). One important difference, however, between the presentation here and in the sources just cited is that, as noted in fn. 8, I use the “abstract index” notation developed by Penrose and Rindler (1984) and described in detail by Wald (1984) and Malament (2012), suitably modified to distinguish the range of vector spaces that one encounters in the theory of principal bundles (fn. 50).

Fiber bundles

A (smooth; fn. 51) fiber bundle \(B\xrightarrow {\pi } M\) is a smooth, surjective map \(\pi \) from a manifold B to a manifold M satisfying the following condition: there exists a manifold F such that, given any point \(p\in M\), there exists an open neighborhood \(U\subseteq M\) containing p and a local trivialization of B over U, which is a diffeomorphism \(\zeta : U\times F\rightarrow \pi ^{-1}[U]\) such that \(\pi \circ \zeta :(q,f)\mapsto q\) for all \((q,f)\in U\times F\). The map \(\pi \) is called the projection map; the manifold B is called the total space of the bundle; the manifold M is called the base space; and the manifold F is called the typical fiber. The local trivialization condition guarantees that the collection of points in B mapped by \(\pi \) to a given point \(p\in M\), denoted \(\pi ^{-1}[p]\) and called the fiber at p, is an embedded submanifold of the total space diffeomorphic to the typical fiber. This fact supports the following picture: a fiber bundle may be thought of as an association of copies of F with each point of M in such a way that the result is “locally a product manifold” in much the same way that a manifold is “locally \(\mathbb {R}^n\)”. We will sometimes write \(F\rightarrow B\xrightarrow {\pi } M\) when we want to emphasize the typical fiber of a given fiber bundle; under other circumstances, when no ambiguity can arise, we will use just the total space B to refer to the entire bundle.
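The claim that each fiber is a copy of F can be checked directly from the definition. What follows is only a sketch at the level of sets; the symbol \(\zeta _p\) is shorthand assumed here for illustration and is not notation used elsewhere in the paper.

```latex
% Sketch: \zeta_p is assumed notation for the restriction of \zeta to \{p\}\times F.
\[
\zeta_p : F \rightarrow \pi^{-1}[p], \qquad \zeta_p(f) = \zeta(p,f), \qquad p\in U .
\]
% Well defined: \pi(\zeta(p,f)) = p, so \zeta(p,f) lies in the fiber at p.
% Injective: \zeta is a diffeomorphism, hence injective.
% Surjective: any b \in \pi^{-1}[p] \subseteq \pi^{-1}[U] has the form b = \zeta(q,f), and then
\[
q = \pi(\zeta(q,f)) = \pi(b) = p
\quad\Longrightarrow\quad
b = \zeta(p,f) = \zeta_p(f) .
\]
```

So \(\zeta _p\) is a bijection between F and the fiber at p. The smoothness half of the claim then follows because \(\zeta \) is a diffeomorphism carrying the embedded submanifold \(\{p\}\times F\) of \(U\times F\) onto \(\pi ^{-1}[p]\).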

A fiber bundle morphism \((\Psi ,\psi ):(B\xrightarrow {\pi } M)\rightarrow (B{^\prime }\xrightarrow {\pi {^\prime }} M{^\prime })\) is a pair of smooth maps \(\Psi :B \rightarrow B{^\prime }\) and \(\psi :M\rightarrow M{^\prime }\) such that \(\pi {^\prime }\circ \Psi =\psi \circ \pi \). A fiber bundle morphism is said to be an isomorphism if both maps are diffeomorphisms. A fiber bundle isomorphism whose domain and codomain are the same (i.e., a fiber bundle automorphism) is said to be vertical if the associated map \(\psi :M\rightarrow M\) is the identity, i.e., if the automorphism takes the fiber over each point back to itself. A (local) section of a fiber bundle \(B\xrightarrow {\pi } M\) is a smooth map \(\sigma : U\rightarrow B\) such that \(\pi \circ \sigma = 1_U\), where \(U\subseteq M\) is open and \(1_U\) is the identity on M restricted to U. A local section of a fiber bundle may be thought of as a generalization of the ordinary notion of smooth (scalar, vector, tensor) “field” on a manifold: it is a smoothly-varying assignment of a fiber value to each point \(p\in U\).

Examples of fiber bundles include product manifolds with the projection onto one of the factors as the projection map, such as \(N\rightarrow M\times N\xrightarrow {pr_1} M\). A fiber bundle that can be written this way, i.e., any bundle admitting a global trivialization, is called a trivial bundle.Footnote 52 An example of a non-trivial fiber bundle is the Möbius strip, \((-1,1)\rightarrow M\ddot{o}\xrightarrow {\pi } S^1\), which has the circle as base space and an open subset of the real line as fiber. Here the fibers are “twisted” in such a way that \(M\ddot{o}\) is not isomorphic to the cylinder, \(S^1\times (-1,1)\).

Vector bundles and principal bundles

We will be particularly interested in two special classes of fiber bundles: principal bundles and vector bundles. A vector bundle is a fiber bundle \(V\rightarrow E \xrightarrow {\pi } M\) where the typical fiber V is a vector space and for each \(p\in M\), there exists a neighborhood U of p and a local trivialization \(\zeta :U\times V\rightarrow \pi ^{-1}[U]\) such that for any \(q\in U\), the map \(v\mapsto \zeta (q,v)\) is a vector space isomorphism. A smooth fiber bundle morphism \((\Psi ,\psi ):(E\xrightarrow {\pi } M)\rightarrow (E{^\prime }\xrightarrow {\pi {^\prime }} M{^\prime })\) is a vector bundle morphism if for each \(p\in M\), the restricted map \(\Psi _{|\pi ^{-1}[p]}:\pi ^{-1}[p]\rightarrow \pi {^\prime }^{-1}[\psi (p)]\) is linear; it is a vector bundle isomorphism if it is also a fiber bundle isomorphism.

A principal bundle, meanwhile, is a fiber bundle \(G\rightarrow P\xrightarrow {\wp } M\) where G is a Lie group—i.e., a smooth manifold endowed with a group structure in such a way that the group operations are smooth maps—and there is a smooth, free,Footnote 53 fiber-preserving right action of G on P such that given any point \(p\in M\), there exists a neighborhood U of p and a local trivialization \(\zeta :U\times G\rightarrow \wp ^{-1}[U]\) such that for any \(q\in U\) and any \(g,g'\in G\), \(\zeta (q,g)g{^\prime }=\zeta (q,gg{^\prime })\). The group G is known as the structure group of the bundle. A principal bundle morphism consists of a fiber bundle morphism \((\Psi ,\psi ):(G\rightarrow P\xrightarrow {\wp } M)\rightarrow (G{^\prime }\rightarrow P{^\prime }\xrightarrow {\wp {^\prime }} M{^\prime })\) and a smooth homomorphism \(h:G\rightarrow G{^\prime }\) such that for any \(x\in P\) and \(g\in G\), \(\Psi (xg)=\Psi (x)h(g)\). A principal bundle morphism is a principal bundle isomorphism if \((\Psi ,\psi )\) and h are isomorphisms; it is a vertical principal bundle automorphism if \((\Psi ,\psi )\) is a vertical bundle automorphism and h is the identity map.

Notational conventions

Let \(F\rightarrow B\xrightarrow {\pi }M\) be a fiber bundle. I will adopt the following notational conventions.Footnote 54 I will use lower-case Latin indices \(a,b,c,\ldots \) for vectors and tensors that are tangent to the base space of a bundle, or generically when I am considering manifolds outside the context of a particular bundle. Lower-case Greek indices \(\alpha ,\beta ,\gamma \ldots \) will label vectors and tensors that are tangent to the total space of a bundle. So given a point \(x\in B\), a vector at x would be denoted (for instance) by \(\xi ^{\alpha }\), while a vector at \(\pi (x)\) would be denoted by \(\eta ^a\). Capital Latin indices \(A,B,C,\ldots \) will label vectors and tensors valued in other spaces, including the fibers of vector bundles. In cases where there are several such vector spaces under consideration, further decorations will be used to distinguish membership in the different spaces. Finally, if a given vector space has a Lie algebra structure, we will label vectors in that space using capital Fraktur indices \(\mathfrak {A},\mathfrak {B},\mathfrak {C},\ldots \). In all cases, raised indices will indicate that an object is an element of a salient vector space; lowered indices will indicate that the object is a linear functional on the corresponding vector space.

This notation allows one to consider “mixed index” tensors and tensor fields on a manifold M, which represent multilinear maps between various vector spaces associated with a point of M. For instance, given a vector \(\xi ^{\alpha }\) at a point x in the total space of a fiber bundle, one may think of the pushforward along \(\pi \) at x, \((\pi _x)_*\), as a linear map from vectors at x to vectors at \(\pi (x)\), which might then be written as \((\nabla \pi )^a{}_{\alpha }\). In fact, this notation also subsumes the pullback along \(\pi \), since given a covector \(\kappa _a\) at \(\pi (x)\), \(\kappa _a(\nabla \pi )^a{}_{\alpha }\) is precisely the covector at x whose action on a vector \(\xi ^{\alpha }\) is the action of \(\kappa _a\) on the pushforward of \(\xi ^{\alpha }\), i.e., \((\kappa _a(\nabla \pi )^a{}_{\alpha })\xi ^{\alpha }=\kappa _a((\nabla \pi )^a{}_{\alpha }\xi ^{\alpha })\). In what follows, I will freely adopt both the mixed index and \(*\) notations for the pushforward and pullback, depending on context. Other examples of mixed index tensors include principal connections (discussed in appendix section “Principal connections”), curvature tensors (discussed in appendix section “Curvature”), and solder forms (discussed in Sect. 5). Mixed index tensors on a manifold M are said to be smooth if their contraction with appropriate collections of smooth vectors (and covectors) is a smooth scalar field.Footnote 55

Tangent and frame bundles

Any manifold M is naturally associated with a vector bundle over M, known as the tangent bundle. Let \(T_pM\) be the tangent space at \(p\in M\) and let TM be the set of all of the tangent vectors at all points of M, \(TM= \bigcup _{p\in M} T_pM\). A manifold structure on TM may be induced as follows. First note that any point \(x\in TM\) may be written as \((p,\xi ^a)\), where \(\xi ^a\) is a tangent vector at p. Then, given any chart \((U,\varphi )\) on M, one can associate a point \((p,\xi ^a)\in TM\) with an element of \(\mathbb {R}^{2n}\) (where n is the dimension of M) by \((p,\xi ^a)\mapsto (\varphi ^1(p),\ldots ,\varphi ^n(p),\overset{1}{\xi },\ldots ,\overset{n}{\xi })\). Here \(\varphi ^i(p)\) represents the ith coordinate of p relative to the chosen chart, and \(\overset{i}{\xi }\) is the ith component of \(\xi ^a\) in the basis determined by the chart. Requiring all such maps, for every chart on M, to be smooth and smoothly invertible induces a manifold structure on TM. The map \(\pi _T:TM\rightarrow M\) that takes elements \(x\in TM\) to the point \(p\in M\) to which they correspond (that is, \(\pi _T:(p,\xi ^a)\mapsto p\)) is smooth relative to this manifold structure. Similarly, the maps used to induce the manifold structure on TM are local trivializations relative to which \(\mathbb {R}^n\rightarrow TM\xrightarrow {\pi _T} M\) is a vector bundle over M. Sections of the tangent bundle are (ordinary, tangent) vector fields, i.e., smooth assignments of tangent vectors to points of (an open subset of) a manifold. Similar constructions may be used to define the cotangent bundle, \(T^*M\xrightarrow {\pi _{T^*}}M\), and bundles of rank \((r,s)\) tensors on M.

The manifold M also naturally determines a principal bundle over M, known as the frame bundle. The construction is similar to the tangent bundle. Suppose n is the dimension of M. Then a frame at a point p is an ordered collection of n linearly independent vectors at p. Let LM be the collection of all frames at all points of M. Analogously to the tangent bundle, any point \(x\in LM\) may be written \((p,u)\), where \(u=(\overset{1}{u}{}^a,\ldots ,\overset{n}{u}{}^a)\) is a frame at p. Now given a chart \((U,\varphi )\) on M, we may associate any point \((p,u)\in LM\) with an element of \(\mathbb {R}^{n^2+n}\) by \((p,u)\mapsto (\varphi ^1(p),\ldots ,\varphi ^n(p),\overset{11}{u},\ldots ,\overset{ij}{u},\ldots ,\overset{nn}{u})\), where now \(\overset{ij}{u}\) is the jth component in the basis determined by the chart of \(\overset{i}{u}{}^a\), the ith element of the frame. The image of this map is \(\varphi [U]\times F\), where \(F\subset \mathbb {R}^{n^2}\) is the collection of all \(n^2\)-tuples corresponding to invertible \(n\times n\) matrices. Thus F is an open subset of \(\mathbb {R}^{n^2}\) (because the determinant map \(det:\mathbb {R}^{n^2}\rightarrow \mathbb {R}\) is continuous and therefore the set of matrices with determinant 0, \(det^{-1}[0]\), is closed), diffeomorphic to the Lie group \(GL(n,\mathbb {R})\) of invertible real valued \(n\times n\) matrices. (The Lie algebra associated with \(GL(n,\mathbb {R})\), \(\mathfrak {gl}(n,\mathbb {R})\), is the algebra of all \(n\times n\) matrices, not necessarily invertible.) Requiring all such maps, for all charts on M, to be smooth and smoothly invertible induces a manifold structure on LM.
The map \(\wp _L:LM\rightarrow M\) where \(\wp _L:(p,u)\mapsto p\) is smooth with respect to this structure, and once again the chart-relative maps defined above are local trivializations relative to which \(GL(n,\mathbb {R})\rightarrow LM\xrightarrow {\wp _L} M\) is a principal bundle, where the associated right \(GL(n,\mathbb {R})\) action corresponds to a smooth change of frame at each point. Sections of the frame bundle are (local) frame fields, which are smoothly varying bases of the tangent space assigned to each point of (an open subset of) a manifold.

More generally, given any vector bundle \(V\rightarrow E\rightarrow M\), one may readily construct an associated principal bundle \(GL(V)\rightarrow LE\rightarrow M\), called the frame bundle for E, whose fibers correspond to the frames for the fibers of E.

Associated bundles

The frame bundle construction provides a sense in which a given vector bundle may be used to build a principal bundle. But one can also move in the other direction, from a given principal bundle to a vector bundle. Let \(G\rightarrow P\xrightarrow {\wp } M\) be a principal bundle, and let V be some vector space with a (fixed) representation \(\rho :G\rightarrow GL(V)\) of G. Now for each point \(p\in M\), consider maps \(v^A:\wp ^{-1}[p]\rightarrow V\) that are equivariant in the sense that for any \(x\in \wp ^{-1}[p]\) and any \(g\in G\), \(v^A(xg)=(\rho (g^{-1}))^A{}_B v^B(x)\), and smooth in the sense that, given any fixed linear functional \(u_A\) on V, \(u_Av^A\) is a smooth scalar field on \(\wp ^{-1}[p]\). Let \(P\times _{G} V\) denote the set of all such maps, for all points \(p\in M\). Then there is a unique manifold structure on \(P\times _{G} V\) such that \(V\rightarrow P\times _G V\xrightarrow {\pi _{\rho }} M\), where \(\pi _{\rho }:(v^A:\wp ^{-1}[p]\rightarrow V)\mapsto p\), is a vector bundle over M. This bundle is called an associated vector bundle. Under this construction, if V is an n dimensional vector space, then any vector bundle \(V\rightarrow E\xrightarrow {\pi } M\) is isomorphic to \(LE\times _{GL(V)} V\xrightarrow {\pi _{\rho }}M\), for any faithful representation of GL(V) on V, where \(LE\xrightarrow {\wp _L} M\) is the frame bundle for E.
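The equivariance condition can be illustrated numerically at a single point, under assumptions that are mine rather than part of the formal development: the structure group is \(GL(3,\mathbb {R})\), a frame is stored as a matrix whose columns are the frame vectors in a fixed chart basis, and the representation is the fundamental one. The helper names `act` and `components` are hypothetical. The sketch checks that the components of a fixed tangent vector, regarded as a map from frames to \(\mathbb {R}^3\), transform by \(g^{-1}\) when the frame is moved by the right action of g.

```python
import numpy as np

def act(u, g):
    """Right action of g in GL(n, R) on a frame u (columns are the frame vectors)."""
    return u @ g

def components(u, xi):
    """Components v of the vector xi relative to the frame u, i.e. u @ v == xi."""
    return np.linalg.solve(u, xi)

# A frame at a point, written in a fixed chart basis (determinant 1).
u = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# Two elements of the structure group GL(3, R) (determinants 2 and 3).
g = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
h = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 3.0]])

# A fixed tangent vector at the same point, in the chart basis.
xi = np.array([1.0, 2.0, 3.0])

# Equivariance: v(ug) = rho(g^{-1}) v(u), with rho the fundamental representation.
lhs = components(act(u, g), xi)
rhs = np.linalg.inv(g) @ components(u, xi)
print(np.allclose(lhs, rhs))

# The action is a right action: (ug)h = u(gh).
print(np.allclose(act(act(u, g), h), act(u, g @ h)))
```

In this picture the vector \(\xi ^a\) itself never changes; only its frame-relative description does, which is the sense in which a point of the associated bundle is an equivariant map on a fiber of the principal bundle.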

Vector valued forms

Let M be a manifold and let V be a vector space. A vector valued n form (or a V valued n form) on an open set U is a mixed index tensor field \(\kappa ^A{}_{a_1\ldots a_n}\) that is totally antisymmetric in its covariant (tangent) indices, and where the A index indicates membership in the (fixed) vector space V.Footnote 56 These fields are required to satisfy the following (degenerate) smoothness condition: given any (fixed) linear functional \(\beta _A\) on V, we require that \(\beta _A\kappa ^A{}_{a_1\ldots a_n}\) is a smooth (ordinary) n form on U. Note that from this perspective, ordinary n forms are also vector valued forms, where the vector space in which they are valued is \(\mathbb {R}\); in such cases, one simply drops the index corresponding to membership in \(\mathbb {R}\).

Recall that there is a natural notion of differentiation on (ordinary, \(\mathbb {R}\) valued) n forms on M, given by the exterior derivative \(d_a\), which takes n forms \(\kappa _{a_1\ldots a_n}\) to \((n+1)\) forms \(d_a\kappa _{a_1\ldots a_n}\). The exterior derivative extends to vector valued n forms as follows. Given a (smooth) V valued n form \(\kappa ^A{}_{a_1\ldots a_n}\), we take the exterior derivative of \(\kappa ^A{}_{a_1\ldots a_n}\), written \(d_a\kappa ^A{}_{a_1\ldots a_n}\), to be the unique V valued \((n+1)\) form whose contraction with any (fixed) linear functional \(\beta _A\) on V is given by \(\beta _A(d_a\kappa ^A{}_{a_1\ldots a_n})=d_a(\beta _A\kappa ^A{}_{a_1\ldots a_n})\), where the expression on the right should be interpreted as the (ordinary) exterior derivative of the (ordinary, smooth) n form \(\beta _A\kappa ^A{}_{a_1\ldots a_n}\).Footnote 57 Similarly, given a smooth map \(\varphi :M\rightarrow N\) and a V valued n form \(\alpha ^A{}_{a_1\ldots a_n}\) on N, one can define the pullback of \(\alpha ^A{}_{a_1\ldots a_n}\) along \(\varphi \), \(\varphi ^*(\alpha ^A{}_{a_1\ldots a_n})\), as the unique V valued n form on M such that, given any (fixed) linear functional \(\beta _A\) acting on V, \(\beta _A\varphi ^*(\alpha ^A{}_{a_1\ldots a_n})=\varphi ^*(\beta _A\alpha ^A{}_{a_1\ldots a_n})\).

Finally, consider the special case of vector valued n forms on the total space P of a principal bundle \(G\rightarrow P\xrightarrow {\wp } M\) (or, more generally, forms defined on \(\wp ^{-1}[U]\), for some open \(U\subset M\)). In this context, it is often natural to fix a representation \(\rho \) of G, the structure group of the bundle, on the vector spaces in which forms of interest are valued. One can then consider forms that are equivariant with respect to the G action on P, in the sense that, given a V valued n form \(\kappa ^A{}_{\alpha _1\ldots \alpha _n}\) on P, \(\kappa ^A{}_{\alpha _1\ldots \alpha _n}\) is such that given any element \(g\in G\), any point \(x\in P\), and any vectors \(\overset{1}{\eta }{}^{\alpha },\ldots ,\overset{n}{\eta }{}^{\alpha }\) at x, the following condition holds:

$$\begin{aligned} (\kappa ^{A}{}_{\alpha _1\ldots \alpha _n})_{|xg}(R_g)_*(\overset{1}{\eta }{}^{\alpha _1}\ldots \overset{n}{\eta }{}^{\alpha _n})=((\rho (g^{-1}))^{A}{}_{B}\kappa ^{B}{}_{\alpha _1\ldots \alpha _n})_{|x}\overset{1}{\eta }{}^{\alpha _1}\ldots \overset{n}{\eta }{}^{\alpha _n}. \end{aligned}\tag{6}$$

Here \((\rho (g^{-1}))^{A}{}_{B}\) is the tensor acting on V corresponding to \(g^{-1}\) in the representation \(\rho \), and \(R_g\) is the smooth right action of g on P. Note that this equation makes sense, since both sides are elements of the fixed vector space V. Note, too, that, following the construction of the previous section, we now see that sections \(\kappa ^A:U\rightarrow P\times _G V\) of the associated bundle \(V\rightarrow P\times _G V\xrightarrow {\pi _{\rho }}M\) may be identified with equivariant V valued 0 forms on \(\wp ^{-1}[U]\).

Connections and parallel transport

We now turn to connections. Some preliminary definitions are in order. Fix an arbitrary fiber bundle \(F\rightarrow B\xrightarrow {\pi } M\) and let \(\xi ^{\alpha }\) be a vector at a point \(x\in B\). We will say \(\xi ^{\alpha }\) is vertical if \((\nabla \pi )^a{}_{\alpha }\xi ^{\alpha }=\mathbf {0}\). Since \((\nabla \pi )^a{}_{\alpha }\) is a linear map, its kernel forms a linear subspace of \(T_x B\), written \(V_x\) and called the vertical subspace at x; in general, the dimension of \(V_x\) will be the dimension of F. Indeed, one can think of the vertical vectors at x as “tangent to the fiber” in the precise sense that they are tangents to curves through x that remain in the fiber over \(\pi (x)\). If a vector \(\xi ^{\alpha }\) at a point x is not vertical, then it is horizontal. A subspace \(H_x\) of \(T_xB\) will be called a horizontal subspace if any vector \(\xi ^{\alpha }\) at x may be uniquely written as the sum of one vector in \(V_x\) and one vector in \(H_x\). It immediately follows that, given any horizontal subspace \(H_x\) at x, the restriction \((\nabla \pi )^a{}_{\alpha }:H_x\rightarrow T_{\pi (x)}M\) is a vector space isomorphism, and so the dimension of any horizontal subspace is the same as the dimension of M.

Though the vertical subspace is uniquely determined by the map \(\pi \), there is considerable freedom in the choice of horizontal subspace. A connection on B, roughly speaking, is a smoothly varying choice of horizontal subspace at each point \(x\in B\). This idea of “smoothly varying” may be made precise as follows. Given any point \(x\in B\) and a horizontal subspace \(H_x\), one can always find a tensor \(\omega ^{\alpha |}{}_{\beta }\) that acts on any vector \(\xi ^{\alpha }\) at x by projecting \(\xi ^{\alpha }\) onto its (unique, relative to \(H_x\)) vertical component, \(\omega ^{\alpha |}{}_{\beta }\xi ^{\beta }\). (Here we have used a | following a Greek index to emphasize that the index is vertical.) Conversely, given any tensor \(\omega ^{\alpha |}{}_{\beta }\) at x such that (a) \(\omega ^{\alpha |}{}_{\beta }\omega ^{\beta |}{}_{\kappa }=\omega ^{\alpha |}{}_{\kappa }\) and (b) given any vertical vector \(\xi ^{\alpha |}\) at x, \(\omega ^{\alpha |}{}_{\beta }\xi ^{\beta |}=\xi ^{\alpha |}\), one can always define a horizontal subspace \(H_x\) at x as the kernel of \(\omega ^{\alpha |}{}_{\beta }\). This permits us to adopt the following official definition of a connection: a connection on a fiber bundle \(B\xrightarrow {\pi } M\) is a smooth tensor field \(\omega ^{\alpha |}{}_{\beta }\) on B satisfying conditions (a) and (b) above. Every fiber bundle admits connections.
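Conditions (a) and (b) can be checked concretely in a toy case. The following is a minimal sketch under assumptions of my own choosing: the bundle is the trivial bundle \(\mathbb {R}\rightarrow \mathbb {R}^2\times \mathbb {R}\xrightarrow {pr_1}\mathbb {R}^2\) with coordinates (x, y, z), and the connection at a point is taken to be \(\partial _z\otimes (dz + A_x\,dx + A_y\,dy)\) for arbitrary numbers \(A_x, A_y\). The function names are hypothetical.

```python
import numpy as np

def connection_matrix(a_x, a_y):
    """Matrix of the vertical projection d/dz (x) (dz + a_x dx + a_y dy)
    in the coordinate basis (d/dx, d/dy, d/dz) at a single point."""
    return np.array([[0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0],
                     [a_x, a_y, 1.0]])

def horizontal_basis(a_x, a_y):
    """Columns span the kernel of the connection matrix (the horizontal subspace)."""
    return np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [-a_x, -a_y]])

# Pushforward along the projection (x, y, z) -> (x, y).
d_pi = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0]])

omega = connection_matrix(0.7, -1.3)
horizontal = horizontal_basis(0.7, -1.3)
vertical = np.array([0.0, 0.0, 1.0])      # spans the kernel of d_pi

print(np.allclose(omega @ omega, omega))           # condition (a): idempotent
print(np.allclose(omega @ vertical, vertical))     # condition (b): fixes vertical vectors
print(np.allclose(omega @ horizontal, 0.0))        # horizontal vectors are annihilated
print(np.allclose(d_pi @ horizontal, np.eye(2)))   # pushforward restricts to an isomorphism
```

Different choices of \(A_x\) and \(A_y\) give different horizontal subspaces with the same vertical subspace, which is the freedom described in the text.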

A connection provides a notion of parallel transport of fiber values along curves in the base space M. This works as follows. Consider a smooth curve \(\gamma :I\rightarrow M\). One can always lift such a curve \(\gamma \) into the total space B, by defining a new curve \(\hat{\gamma }:I\rightarrow B\) with the property that \(\pi \circ \hat{\gamma }=\gamma \). This curve \(\hat{\gamma }\) is generally not unique. One gets a unique lift by specifying some additional data: fix a connection \(\omega ^{\alpha |}{}_{\beta }\) and choose some \(t_0\in I\) and some \(x\in \pi ^{-1}[\gamma (t_0)]\), that is, some point in the fiber above \(\gamma (t_0)\). Then there is a unique horizontal lift of \(\gamma \) through x, that is, a unique lift \(\hat{\gamma }:I\rightarrow B\) such that (1) \(\hat{\gamma }(t_0)=x\) and (2) the vector tangent to \(\hat{\gamma }\) at each point in its image is horizontal relative to \(\omega ^{\alpha |}{}_{\beta }\). Then, given any \(t\in I\), we say the parallel transport of x to \(\gamma (t)\) along \(\gamma \) is \(\hat{\gamma }(t)\), which is a point in the fiber above \(\gamma (t)\).
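The horizontal lift can be computed explicitly in a simple case. The sketch below rests on illustrative assumptions, not on anything specific to the main text: the bundle is the trivial bundle \(\mathbb {R}^2\times \mathbb {R}\rightarrow \mathbb {R}^2\) with fiber coordinate z, and the connection is the vertical projection \(\partial _z\otimes (dz + A)\) with \(A = x\,dy\). A lift is then horizontal just in case \(dz/dt = -A(\dot{\gamma })\). The function `horizontal_lift` and the two sample curves are hypothetical names; the integral is approximated by the midpoint rule.

```python
import numpy as np

def horizontal_lift(curve, a_form, z0, steps=20000):
    """Fiber value reached by the horizontal lift of `curve` starting at z0.

    curve(t) returns (x, y) for t in [0, 1]; a_form(x, y) returns (A_x, A_y).
    Horizontality requires dz/dt = -(A_x dx/dt + A_y dy/dt).
    """
    t = np.linspace(0.0, 1.0, steps + 1)
    pts = np.array([curve(s) for s in t])           # points on the base curve
    mid = 0.5 * (pts[1:] + pts[:-1])                # segment midpoints
    delta = pts[1:] - pts[:-1]                      # segment displacements
    a = np.array([a_form(x, y) for x, y in mid])    # A evaluated at midpoints
    return z0 - np.sum(a * delta)

def a_form(x, y):
    return (0.0, x)                                 # A = x dy

def upper(t):
    return (np.cos(np.pi * t), np.sin(np.pi * t))   # (1, 0) to (-1, 0) through (0, 1)

def lower(t):
    return (np.cos(np.pi * t), -np.sin(np.pi * t))  # (1, 0) to (-1, 0) through (0, -1)

z_upper = horizontal_lift(upper, a_form, z0=0.0)
z_lower = horizontal_lift(lower, a_form, z0=0.0)
print(z_upper, z_lower, z_lower - z_upper)
```

Both curves run from (1, 0) to (−1, 0), yet the transported fiber values come out near \(-\pi /2\) and \(+\pi /2\) respectively. The difference, approximately \(\pi \), equals the integral of \(dA = dx\wedge dy\) over the unit disk enclosed by the two curves, so in this example the result of parallel transport depends on the path taken.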

Principal connections

Now suppose one has a principal bundle \(G\rightarrow P\xrightarrow {\wp } M\). Here, one is interested in connections that are compatible with the principal bundle structure in the sense that they are equivariant under the right action of G on P. A bit of work is required to make this precise. First, recall that given any Lie group G, \(T_eG\), the tangent space at the identity element, is endowed with a natural Lie algebra structure induced from the Lie group structure as follows. Take any vector \(\xi ^a\in T_eG\). One can define a (smooth) vector field on G, called a left-invariant vector field, by assigning to each point \(g\in G\) the vector \(({}^g\ell _e)_*(\xi ^a)\), i.e., the pushforward of \(\xi ^a\) along the left action on G determined by g. The Lie bracket of two vectors at e, then, is just the ordinary commutator of the corresponding vector fields induced by the left action.Footnote 58 The Lie algebra associated in this way with a Lie group G is often denoted \(\mathfrak {g}\). The Lie algebra \(\mathfrak {g}\), understood as a vector space, comes with a privileged representation of G, called the adjoint representation, \(ad:G\rightarrow GL(\mathfrak {g})\), defined by \((ad(g))^{\mathfrak {A}}{}_{\mathfrak {B}}=(\nabla \Upsilon ^g_{|e})^{\mathfrak {A}}{}_{\mathfrak {B}}\), where (1) \(\Upsilon ^g:G\rightarrow G\) acts as \(\Upsilon ^g:h\mapsto ghg^{-1}\), and (2) \((\nabla \Upsilon ^g_{|e})^{\mathfrak {A}}{}_{\mathfrak {B}}\) should be understood as the pushforward along \(\Upsilon ^g\) at the identity, which maps \(T_eG\) to itself because \(\Upsilon ^g(e)=geg^{-1}=e\) for every \(g\in G\).
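For matrix groups these definitions take a familiar form: the Lie bracket becomes the matrix commutator, and the adjoint representation becomes conjugation, \(ad(g):\xi \mapsto g\xi g^{-1}\). The sketch below checks this numerically for \(GL(2,\mathbb {R})\). The particular matrices and the helper names `ad` and `bracket` are illustrative assumptions, and the finite-difference step merely approximates the pushforward along \(\Upsilon ^g\) at the identity.

```python
import numpy as np

def ad(g, xi):
    """Adjoint action of a matrix group element g on a Lie algebra element xi."""
    return g @ xi @ np.linalg.inv(g)

def bracket(a, b):
    """Lie bracket on a matrix Lie algebra: the commutator."""
    return a @ b - b @ a

g = np.array([[2.0, 1.0], [0.0, 1.0]])     # determinant 2
h = np.array([[1.0, 0.0], [3.0, 1.0]])     # determinant 1
X = np.array([[0.0, 1.0], [-1.0, 0.0]])
Y = np.array([[1.0, 2.0], [0.0, -1.0]])

# ad is a representation: ad(gh) = ad(g) ad(h).
print(np.allclose(ad(g @ h, X), ad(g, ad(h, X))))

# ad(g) preserves the Lie bracket.
print(np.allclose(ad(g, bracket(X, Y)), bracket(ad(g, X), ad(g, Y))))

# ad(g) as the derivative at the identity of conjugation by g, evaluated along
# the curve t -> I + tX, which passes through the identity with tangent X.
t = 1e-6
identity = np.eye(2)
conjugated = g @ (identity + t * X) @ np.linalg.inv(g)
derivative = (conjugated - identity) / t
print(np.allclose(derivative, ad(g, X)))
```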

The right action of the structure group G on the principal bundle P allows us to define a canonical isomorphism between the vertical space \(V_x\) at each point \(x\in P\) and the Lie algebra \(\mathfrak {g}\) associated with G. Given any vector \(\xi ^{\mathfrak {A}}\in T_eG\), let \(\gamma _{\xi }:I\rightarrow G\) be the (sufficiently unique) integral curve of the left-invariant vector field associated with \(\xi ^{\mathfrak {A}}\). We assume \(\gamma _{\xi }(0)=e\). Then, given any point \(x\in P\), one can define a curve \(\tilde{\gamma }_{\xi }:I\rightarrow P\) through x by setting \(\tilde{\gamma }_{\xi }(t)=x\gamma _{\xi }(t)\). We will take the tangent to this curve at x, \(\overrightarrow{(\tilde{\gamma }_{\xi })}^{\alpha |}\), which is necessarily vertical because the right action of G on P is fiber-preserving, to be the vertical vector at x associated with \(\xi ^{\mathfrak {A}}\). This construction defines a linear bijection that can be represented by a mixed index tensor field on P, \(\mathfrak {g}^{\mathfrak {A}}{}_{\alpha |}\), with inverse \(\mathfrak {g}^{\alpha |}{}_{\mathfrak {A}}\). (Here the | next to a covariant index means that \(\mathfrak {g}^{\mathfrak {A}}{}_{\alpha |}\) is only defined for vertical vectors.) Thus, we can always think of vertical vectors at a point x of a principal bundle as elements of a fixed Lie algebra, independent of x.

The isomorphism just described allows one to think of a connection \(\omega ^{\alpha |}{}_{\beta }\) on a principal bundle as a vector valued one form (sometimes called a Lie algebra valued one form), given by \(\omega ^{\mathfrak {A}}{}_{\beta }=\mathfrak {g}^{\mathfrak {A}}{}_{\alpha |}\omega ^{\alpha |}{}_{\beta }\). Since this is a vector valued form on P, we also fix a representation of G on \(\mathfrak {g}\), the vector space in which the form is valued (recall appendix section “Vector valued forms”); here, as with all Lie algebra valued forms we will consider, we take G to act on \(\mathfrak {g}\) in the adjoint representation. Finally, we may define a principal connection as a connection \(\omega ^{\mathfrak {A}}{}_{\alpha }\) on P that is equivariant in the sense of Eq. (6). This condition guarantees that for any point \(x\in P\) and any \(g\in G\), the horizontal subspace determined by \(\omega ^{\mathfrak {A}}{}_{\beta }\) at xg equals the pushforward of the horizontal space at x along the right action of g, i.e., \(({}^gR)_*[H_x]=H_{xg}\). Every principal bundle admits principal connections.

Covariant derivatives

Consider a vector bundle \(V\rightarrow E\xrightarrow {\pi } M\). A covariant derivative operator \(\nabla \) on E is a map from sections \(\sigma ^A:U\rightarrow E\) of E to mixed index tensor fields \(\sigma ^A\mapsto \nabla _a\sigma ^A\) on U satisfying the following conditions: (1) given two smooth sections \(\sigma ^A,\nu ^A:U\rightarrow E\), \(\nabla _a(\sigma ^A + \nu ^A)=\nabla _a\sigma ^A + \nabla _a\nu ^A\) and (2) given any smooth section \(\sigma ^A:U\rightarrow E\) and any smooth scalar field \(\lambda :M\rightarrow \mathbb {R}\), \(\nabla _a (\lambda \sigma ^A)=\sigma ^A d_a\lambda + \lambda \nabla _a \sigma ^A\), where \(d_a\) is the exterior derivative. The covariant derivative of a section \(\sigma ^A:U\rightarrow E\) has the following interpretation. Given a vector \(\xi ^a\) at a point \(p\in U\), \(\xi ^a\nabla _a\sigma ^A\) is the derivative of \(\sigma ^A\) in the direction of \(\xi ^a\), relative to a standard of fiber-to-fiber constancy given by \(\nabla \). The covariant derivatives one encounters in general relativity are special cases of this more general definition, where the vector bundle in question is the tangent bundle (and, by extension, various bundles of tensors constructed out of tangent vectors). Note that a covariant derivative operator on an arbitrary vector bundle also provides a notion of parallel transport, by a construction directly analogous to that for covariant derivatives on the tangent bundle.Footnote 59
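A simple family of examples may help fix ideas. Suppose, as an assumption made only for illustration, that E is the trivial bundle \(V\rightarrow M\times V\xrightarrow {pr_1} M\), so that sections may be identified with V valued 0 forms \(\sigma ^A\) on M, and let \(C^A{}_{aB}\) be any smooth mixed index tensor field on M (the symbol C is a label chosen here). Then the operator

$$\begin{aligned} \nabla _a\sigma ^A = d_a\sigma ^A + C^A{}_{aB}\sigma ^B \end{aligned}$$

is a covariant derivative operator in the sense just defined. Additivity is immediate, since both terms are linear in \(\sigma ^A\). For the second condition, one computes

$$\begin{aligned} \nabla _a(\lambda \sigma ^A) = \sigma ^A d_a\lambda + \lambda d_a\sigma ^A + \lambda C^A{}_{aB}\sigma ^B = \sigma ^A d_a\lambda + \lambda \nabla _a\sigma ^A. \end{aligned}$$

The choice \(C^A{}_{aB}=\mathbf {0}\) gives the standard of constancy on which a section is constant just in case its components in a fixed basis for V are constant.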

Exterior and induced covariant derivatives

Now consider a principal bundle \(G\rightarrow P\xrightarrow {\wp } M\) endowed with a principal connection \(\omega ^{\mathfrak {A}}{}_{\alpha }\) and a vector space V with a fixed representation \(\rho :G\rightarrow GL(V)\) of the structure group of P on V. In this case, we can define a second notion of differentiation for V valued n forms, called the exterior covariant derivative relative to \(\omega ^{\mathfrak {A}}{}_{\alpha }\). It is denoted by \(\overset{\omega }{D}\). The action of \(\overset{\omega }{D}\) on a V valued n form \(\kappa ^A{}_{\alpha _1\ldots \alpha _n}\) on \(U\subseteq P\) is given by \(\overset{\omega }{D}_{\alpha }\kappa ^A{}_{\alpha _1\ldots \alpha _n}=(d_{\beta }\kappa ^A{}_{\beta _1\ldots \beta _n})\bar{\omega }{}^{\beta }{}_{\alpha }\bar{\omega }{}^{\beta _1}{}_{\alpha _1}\ldots \bar{\omega }{}^{\beta _n}{}_{\alpha _n}\), where d is the ordinary exterior derivative and where \(\bar{\omega }^{\alpha }{}_{\beta }=\delta ^{\alpha }{}_{\beta }-\mathfrak {g}^{\alpha |}{}_{\mathfrak {A}}\omega ^{\mathfrak {A}}{}_{\beta }\) is the horizontal projection relative to \(\omega ^{\mathfrak {A}}{}_{\alpha }\). (Recall appendix section “Principal connections”.)

A V valued n form \(\kappa ^A{}_{\alpha _1\ldots \alpha _n}\) on \(\wp ^{-1}[U]\), for some open \(U\subseteq M\), is said to be horizontal and equivariant if (1) it is horizontal in the sense that given any vertical vector \(\xi ^{\alpha }\) at a point \(x\in \wp ^{-1}[U]\), \(\kappa ^A{}_{\alpha _1\ldots \alpha _i\ldots \alpha _n}\xi ^{\alpha _i}=\mathbf {0}\) for all \(i=1,\ldots , n\) and (2) it is equivariant in the sense of Eq. (6). The important feature of the exterior covariant derivative is that if a V valued n form \(\kappa ^A{}_{\alpha _1\ldots \alpha _n}\) is horizontal and equivariant, so is its exterior covariant derivative, \(\overset{\omega }{D}_{\alpha }\kappa ^A{}_{\alpha _1\ldots \alpha _n}\). In the special case where V is the Lie algebra associated with the principal bundle and \(\kappa ^{\mathfrak {A}}{}_{\alpha _1\ldots \alpha _n}\) is a horizontal and equivariant Lie algebra valued n form, we have the relation \(\overset{\omega }{D}_{\alpha }\kappa ^{\mathfrak {A}}{}_{\alpha _1\ldots \alpha _n}=d_{\alpha }\kappa ^{\mathfrak {A}}{}_{\alpha _1\ldots \alpha _n}+[\omega ^{\mathfrak {A}}{}_{\alpha },\kappa ^{\mathfrak {A}}{}_{\alpha _1\ldots \alpha _n}]\), where the bracket is the Lie bracket.

Finally, recall that in appendix section “Vector valued forms”, we observed that sections \(\kappa ^A:U\rightarrow P\times _G V\) of the associated vector bundle \(V\rightarrow P\times _G V\xrightarrow {\pi _{\rho }} M\) determined by P, V, and \(\rho \) are naturally understood as V valued 0 forms on (subsets of) P. We now see that in fact they are horizontal and equivariant 0 forms. We may thus define an induced covariant derivative operator \(\overset{\omega }{\nabla }\) on \(P\times _G V\) as follows: given any section \(\kappa ^A:U\rightarrow P\times _G V\) of \(P\times _G V\), we take \(\overset{\omega }{\nabla }_a\kappa ^A\) to be the unique mixed index tensor on U with the property that, given any point \(p\in U\) and any vector \(\xi ^a\) at p, \(\xi ^a\overset{\omega }{\nabla }_a\kappa ^A\) is the vector in the fiber of \(P\times _G V\) over p corresponding to the equivariant V valued 0 form \(\xi ^{\alpha }\overset{\omega }{D}_{\alpha }\kappa ^A\) defined on the fiber of P over p, where \(\xi ^{\alpha }\) is any vector field on the fiber with the property that at every point \(x\in \wp ^{-1}[p]\), \((\nabla \wp )^a{}_{\alpha }\xi ^{\alpha }=\xi ^a\).Footnote 60 Note that this fully determines the action of \(\overset{\omega }{\nabla }\) on sections of \(P\times _G V\), and that, with this definition, \(\overset{\omega }{\nabla }\) is both additive and satisfies the Leibniz rule (recall appendix section “Covariant derivatives”). Conversely, given a principal connection on a principal bundle, this construction yields a unique covariant derivative operator on every associated vector bundle, and given a covariant derivative operator on a vector bundle, there is always a unique principal connection on the frame bundle of that vector bundle that induces the covariant derivative in this way.


Curvature

In general, the standards of parallel transport given by a principal connection or a covariant derivative operator are path-dependent. The degree of path-dependence is measured by the curvature of a connection or derivative operator. Given a principal bundle \(G\rightarrow P\xrightarrow {\wp } M\) and a principal connection \(\omega ^{\mathfrak {A}}{}_{\alpha }\) on P, the curvature of \(\omega ^{\mathfrak {A}}{}_{\alpha }\) is a horizontal and equivariant Lie algebra valued two form \(\Omega ^{\mathfrak {A}}{}_{\alpha \beta }\) on P, defined by

$$\begin{aligned} \Omega ^{\mathfrak {A}}{}_{\alpha \beta } = \overset{\omega }{D}_{\alpha }\omega ^{\mathfrak {A}}{}_{\beta }. \end{aligned}\tag{7}$$

Its interpretation is as follows. Given a point \(p\in M\), an infinitesimal closed curve through p may be represented by a pair of vectors, \(\xi ^a\) and \(\eta ^a\), at p, corresponding to the “incoming” and “outgoing” directions of the curve at p. Given an arbitrary point \(x\in \wp ^{-1}[p]\), the (infinitesimal, or limiting) parallel transport of x along this infinitesimal curve in M is encoded by the vertical vector \(\mathfrak {g}^{\kappa |}{}_{\mathfrak {A}}\Omega ^{\mathfrak {A}}{}_{\alpha \beta }\xi ^{\alpha }\eta ^{\beta }\) at x, where \(\xi ^{\alpha }\) and \(\eta ^{\beta }\) are arbitrary vectors at x with the properties that \((\nabla \wp )^a{}_{\alpha }\xi ^{\alpha }=\xi ^a\) and \((\nabla \wp )^a{}_{\alpha }\eta ^{\alpha }=\eta ^a\), respectively. This vertical vector represents the direction and magnitude of displacement of x under the infinitesimal parallel transport.

It is often convenient to express this curvature in a slightly different form, using the so-called “structure equation” (Bleecker 1981, p. 37):

$$\begin{aligned} \Omega ^{\mathfrak {A}}{}_{\alpha \beta } = d_{\alpha }\omega ^{\mathfrak {A}}{}_{\beta } + \frac{1}{2} \left[ \omega ^{\mathfrak {A}}{}_{\alpha },\omega ^{\mathfrak {A}}{}_{\beta }\right] . \end{aligned}\tag{8}$$

Here the bracket \([\cdot ,\cdot ]\) is the Lie bracket on the Lie algebra \(\mathfrak {g}\). It is in this form that the curvature appears in Sect. 2. (Observe that this relation is a special case of the general fact concerning exterior covariant derivatives of Lie algebra valued forms stated in appendix section “Exterior and induced covariant derivatives”.)
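The simplest instance of the structure equation arises when the structure group is abelian, for example \(G=U(1)\). What follows is a short worked case resting only on that assumption; the symbols \(A_a\) and \(F_{ab}\) are labels introduced here for the pullbacks. Since the Lie bracket on an abelian Lie algebra vanishes identically, the second term drops out:

$$\begin{aligned} \Omega ^{\mathfrak {A}}{}_{\alpha \beta } = d_{\alpha }\omega ^{\mathfrak {A}}{}_{\beta }. \end{aligned}$$

Moreover, the adjoint representation of an abelian group is trivial, so equivariance reduces to invariance of \(\Omega ^{\mathfrak {A}}{}_{\alpha \beta }\) under the right action of G. Given a local section \(\sigma :U\rightarrow P\), and writing \(A^{\mathfrak {A}}{}_{a}=\sigma ^*(\omega ^{\mathfrak {A}}{}_{\alpha })\) and \(F^{\mathfrak {A}}{}_{ab}=\sigma ^*(\Omega ^{\mathfrak {A}}{}_{\alpha \beta })\), the fact that pullbacks commute with the exterior derivative yields

$$\begin{aligned} F^{\mathfrak {A}}{}_{ab} = d_{a}A^{\mathfrak {A}}{}_{b}, \end{aligned}$$

which has the same form as the relation between the electromagnetic field and a vector potential.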

Now suppose one has a vector bundle \(V\rightarrow E\xrightarrow {\pi } M\) with a covariant derivative \(\nabla \) on E. In this case, we may define the curvature tensor \(R^A{}_{Bcd}\) as the unique mixed index tensor on M such that, given any section \(\kappa ^A:U\rightarrow E\), the action of \(R^A{}_{Bcd}\) on \(\kappa ^A\) at any point \(p\in U\) is:Footnote 61

$$\begin{aligned} R^A{}_{Bcd}\kappa ^B=-2\nabla _{[c}\nabla _{d]}\kappa ^A. \end{aligned}$$

Note that in the special case where the vector bundle is the tangent bundle, this curvature tensor corresponds exactly to the Riemann tensor. To see the relationship between the curvature tensors defined in Eqs. (7) and (9) more clearly, note that \(R^A{}_{Bcd}\) may be thought of as a mixed index tensor \(\Omega ^A{}_{\alpha \beta }\) on E, whose action at a point \(v^A\) of E on vectors \(\xi ^{\alpha },\eta ^{\alpha }\) at \(v^A\) is given by \((\Omega ^A{}_{\alpha \beta }\xi ^{\alpha }\eta ^{\beta })_{|v^A}=R^A{}_{Bcd}v^B(\nabla \pi )^c{}_{\alpha }\xi ^{\alpha }(\nabla \pi )^d{}_{\beta }\eta ^{\beta }\). In this form, the interpretation given above for \(\Omega ^{\mathfrak {A}}{}_{\alpha \beta }\) carries over essentially unchanged.Footnote 62
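
Returning to the defining relation for \(R^A{}_{Bcd}\), it can help to write the antisymmetrization out explicitly. A minimal sketch, assuming the standard convention \(\nabla _{[c}\nabla _{d]}=\frac{1}{2}(\nabla _c\nabla _d-\nabla _d\nabla _c)\):

$$\begin{aligned} R^A{}_{Bcd}\kappa ^B=-\left( \nabla _c\nabla _d-\nabla _d\nabla _c\right) \kappa ^A. \end{aligned}$$

Two consequences can be read off from this form. First, since the right-hand side changes sign under exchange of c and d for every section \(\kappa ^A\), the curvature tensor is antisymmetric in its last two indices, \(R^A{}_{Bcd}=-R^A{}_{Bdc}\). Second, \(R^A{}_{Bcd}\) vanishes on U just in case the two orders of differentiation agree for every section over U, which is the infinitesimal expression of path-independent parallel transport.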

Hodge star operation on horizontal and equivariant vector valued forms

Now suppose we have a principal bundle \(G\rightarrow P\xrightarrow {\wp } M\), a principal connection \(\omega ^{\mathfrak {A}}{}_{\alpha }\) on P, a metric \(g_{ab}\) on M, and a volume element \(\epsilon _{a_1\ldots a_n}\) on M.Footnote 63 (Here we assume M is n dimensional.) Then, given any vector space V, we may define a Hodge star operation, \(\star \), on horizontal and equivariant V valued forms on P. First, note that the volume form \(\epsilon _{a_1\ldots a_n}\) on M determines an n form \(\hat{\epsilon }_{\alpha _1\ldots \alpha _n}=\wp ^*(\epsilon _{a_1\ldots a_n})=\epsilon _{a_1\ldots a_n}(\nabla \wp )^{a_1}{}_{\alpha _1}\ldots (\nabla \wp )^{a_n}{}_{\alpha _n}\) on P. (Observe that although \(\hat{\epsilon }_{\alpha _1\ldots \alpha _n}\) is not a volume form on P, it is of maximal rank in the sense that any horizontal and equivariant form on P with rank greater than n vanishes.) Next, we define a smooth mixed index tensor field on P, \(\bar{\omega }^{\alpha }{}_{a}\), which is uniquely characterized by the properties that (1) \((\nabla \wp )^a{}_{\alpha }\bar{\omega }^{\alpha }{}_{b}=\delta ^a{}_b\) and (2) \(\omega ^{\mathfrak {A}}{}_{\alpha }\bar{\omega }^{\alpha }{}_a=\mathbf {0}\). At any point \(x\in P\), this tensor maps vectors \(\xi ^a\) at \(\wp (x)\) to the unique horizontal vector \(\xi ^{\alpha }=\bar{\omega }{}^{\alpha }{}_a\xi ^a\) at x satisfying \((\nabla \wp )^a{}_{\alpha }\xi ^{\alpha }=\xi ^a\). Then, for any \(k\le n\), we may define a tensor field \(\hat{\epsilon }_{\alpha _1\ldots \alpha _{n-k}}{}^{\beta _1\ldots \beta _k}=\epsilon _{a_1\ldots a_{n-k}}{}^{b_1\ldots b_k}(\nabla \wp )^{a_1}{}_{\alpha _1}\ldots (\nabla \wp )^{a_{n-k}}{}_{\alpha _{n-k}}\bar{\omega }^{\beta _1}{}_{b_1}\ldots \bar{\omega }^{\beta _k}{}_{b_k}\). (Here the indices on \(\epsilon _{a_1\ldots a_n}\) are raised with \(g^{ab}\).) 
Finally, given any V valued horizontal and equivariant k form \(\kappa ^A{}_{\alpha _1\ldots \alpha _k}\), we may define a horizontal and equivariant V valued \((n-k)\) form, \(\star \kappa ^A{}_{\alpha _1\ldots \alpha _{n-k}}\), by \(\star \kappa ^A{}_{\alpha _1\ldots \alpha _{n-k}}=\hat{\epsilon }_{\alpha _1\ldots \alpha _{n-k}}{}^{\beta _1\ldots \beta _k}\kappa ^A{}_{\beta _1\ldots \beta _k}\).
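
That \(\star \kappa ^A{}_{\alpha _1\ldots \alpha _{n-k}}\) is horizontal can be seen directly from the definition. As a short check, assume \(k<n\) and let \(\xi ^{\alpha }\) be an arbitrary vertical vector at a point of P, so that \((\nabla \wp )^a{}_{\alpha }\xi ^{\alpha }=\mathbf {0}\). Contracting \(\xi ^{\alpha _1}\) into the first index gives

$$\begin{aligned} \xi ^{\alpha _1}\star \kappa ^A{}_{\alpha _1\ldots \alpha _{n-k}} = \epsilon _{a_1\ldots a_{n-k}}{}^{b_1\ldots b_k}\left( (\nabla \wp )^{a_1}{}_{\alpha _1}\xi ^{\alpha _1}\right) (\nabla \wp )^{a_2}{}_{\alpha _2}\ldots (\nabla \wp )^{a_{n-k}}{}_{\alpha _{n-k}}\bar{\omega }^{\beta _1}{}_{b_1}\ldots \bar{\omega }^{\beta _k}{}_{b_k}\kappa ^A{}_{\beta _1\ldots \beta _k} = \mathbf {0}, \end{aligned}$$

and likewise for each of the other free indices. For a concrete instance, take \(n=4\) (an assumption made only for illustration) and let the form be the curvature \(\Omega ^{\mathfrak {A}}{}_{\alpha \beta }\), which is a horizontal and equivariant \(\mathfrak {g}\) valued two form. The definition then yields another two form:

$$\begin{aligned} \star \Omega ^{\mathfrak {A}}{}_{\alpha _1\alpha _2} = \hat{\epsilon }_{\alpha _1\alpha _2}{}^{\beta _1\beta _2}\Omega ^{\mathfrak {A}}{}_{\beta _1\beta _2}. \end{aligned}$$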

Notice that this Hodge star operator has the property that, given any horizontal and equivariant V valued k form \(\kappa ^A{}_{\alpha _1\ldots \alpha _k}\) on P and any section \(\sigma :U\rightarrow P\) of P, \(\sigma ^*(\star \kappa ^A{}_{\alpha _1\ldots \alpha _k})=\star \sigma ^*(\kappa ^A{}_{\alpha _1\ldots \alpha _k})\), where the second \(\star \) (acting on the pulled back form, which is a V valued k form \(\kappa ^A{}_{a_1\ldots a_k}\) on M) is the ordinary Hodge star operation, defined by \(\star \kappa ^A{}_{a_1\ldots a_{n-k}}=\epsilon _{a_1\ldots a_{n-k}}{}^{b_1\ldots b_k}\kappa ^A{}_{b_1\ldots b_k}\). Since acting on a horizontal and equivariant k form on P with the exterior covariant derivative \(\overset{\omega }{D}\) yields a horizontal and equivariant \((k+1)\) form, one may always take the Hodge dual of the exterior covariant derivative of a horizontal and equivariant k form \(\kappa ^A{}_{\alpha _1\ldots \alpha _k}\) to yield a horizontal and equivariant \((n-k-1)\) form, \(\star \overset{\omega }{D}_{\alpha }\kappa ^A{}_{\alpha _1\ldots \alpha _k}\).
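
The compatibility of the Hodge star with sections noted at the start of the preceding paragraph can be sketched in a few lines. The notation \((\nabla \sigma )^{\alpha }{}_a\) for the pushforward along \(\sigma \) is an assumption, introduced here by analogy with \((\nabla \wp )^a{}_{\alpha }\), and all fields on P are understood to be evaluated at points of \(\sigma [U]\). Since \(\wp \circ \sigma \) is the identity on U, we have \((\nabla \wp )^a{}_{\alpha }(\nabla \sigma )^{\alpha }{}_b=\delta ^a{}_b\), and so \((\nabla \sigma )^{\alpha }{}_b-\bar{\omega }^{\alpha }{}_b\) is vertical in its upper index. For a horizontal k form \(\kappa ^A{}_{\beta _1\ldots \beta _k}\) on P it follows that \((\sigma ^*\kappa )^A{}_{b_1\ldots b_k}=\kappa ^A{}_{\beta _1\ldots \beta _k}\bar{\omega }^{\beta _1}{}_{b_1}\ldots \bar{\omega }^{\beta _k}{}_{b_k}\). Then

$$\begin{aligned} (\sigma ^*\star \kappa )^A{}_{a_1\ldots a_{n-k}}&= \hat{\epsilon }_{\alpha _1\ldots \alpha _{n-k}}{}^{\beta _1\ldots \beta _k}\kappa ^A{}_{\beta _1\ldots \beta _k}(\nabla \sigma )^{\alpha _1}{}_{a_1}\ldots (\nabla \sigma )^{\alpha _{n-k}}{}_{a_{n-k}}\\&= \epsilon _{a_1\ldots a_{n-k}}{}^{b_1\ldots b_k}\bar{\omega }^{\beta _1}{}_{b_1}\ldots \bar{\omega }^{\beta _k}{}_{b_k}\kappa ^A{}_{\beta _1\ldots \beta _k}\\&= \epsilon _{a_1\ldots a_{n-k}}{}^{b_1\ldots b_k}(\sigma ^*\kappa )^A{}_{b_1\ldots b_k}, \end{aligned}$$

where the second line uses the definition of \(\hat{\epsilon }_{\alpha _1\ldots \alpha _{n-k}}{}^{\beta _1\ldots \beta _k}\) together with \((\nabla \wp )^a{}_{\alpha }(\nabla \sigma )^{\alpha }{}_b=\delta ^a{}_b\). The final expression is the ordinary Hodge star on M applied to the pulled back form.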

Cite this article

Weatherall, J.O. Fiber bundles, Yang–Mills theory, and general relativity. Synthese 193, 2389–2425 (2016). https://doi.org/10.1007/s11229-015-0849-3


  • Yang–Mills theory
  • General relativity
  • Fiber bundle interpretation
  • Holonomy interpretation