1 Introduction

The historical origins of conventionalism date back to Poincaré’s (1902) philosophy of science (see e.g. Ben-Menahem, 2006, Ch. II; de Paz & DiSalle, 2014; Ivanova, 2015). Conventionalism classifies certain non-trivial elements in theories, dubbed “conventions”, as lacking truth-values. Defying the received dichotomy of analytic/synthetic propositions, conventions express neither definitions (or tautologies) nor factual content. They are supposed to reflect an inherent leeway in our theories. With this claim, conventionalists advance a refinement of empiricism (see also Duerr & Ben-Menahem, 2022): empirical facts and [Footnote 1] definitions, according to conventionalism, don’t fix all the elements in a theory’s portrayal of reality. Conventions constitute the residual non-trivial “degrees of freedom”. (The emphasis is deliberate, see below.)

Conventionalists aver that a plurality of empirically indistinguishable, yet—when read at face value—mutually incompatible theoretical alternative descriptions are available (see e.g. Ben-Menahem, 1990, 2001). Which of these descriptions should one choose? And how to justify preference for one such description over the other? According to conventionalism, we must exercise discretion: the theoretical options for a choice of a convention call for the theoretician’s free decision. Such decisions aren’t dictated, or incontrovertibly compelled, by empirical evidence or logic and definitions. (Conventionalists hasten to add that appeal to further methodological principles, such as appeals to simplicity or other super-empirical virtues, won’t settle the matter either.)

Conventional choices may be (and usually are) rationally motivated by pragmatic considerations. Conventionalists merely insist that per se such decisions have nothing to do with truth. Instead, they form genuinely human contributions: conventions reflect human preferences—matters not of truth, but of convenience.

Geometric [Footnote 2] conventionalism is concerned with physical geometry—that is, the geometric structure into which the spatiotemporal relations amongst material events are embedded, and on which our physical laws draw. (It is to be contrasted with empirical geometry, which denotes the empirically accessible geometry. To be clear: it is physical geometry and not empirical geometry which is the target of geometric conventionalism. [Footnote 3]) Geometric conventionalists identify physical geometry as a conventional element in the preceding sense: they deny that there is a fact of the matter about spacetime geometry per se. Claims about geometry result from conventional stipulations—acts of human decision-making; on their own, [Footnote 4] they don’t correspond to different ways the world might be.

At first blush, geometric conventionalism flies in the face of a received lesson from General Relativity (GR). Torretti (1983, p. 241), for instance, proclaims: “(b)y grounding geometry (and chronometry) […] on a physical field governed by testable natural laws, General Relativity escaped the trilemma of apriorism, empiricism and conventionalism in which the epistemology of geometry was caught at the turn of the century”. GR’s “physicalisation” of geometry (cf. Giovanelli, 2021) thus appears to vitiate conventionalism: according to Torretti’s argument, physical facts (presumably, ascertainable through probing GR’s empirical content) determine the spacetime geometry; absent any residual freedom, questions of the geometry’s conventionality seem to have become moot.

Such a line of thought rests on a false di/tri-lemma (Duerr & Ben-Menahem, 2022). Conventionalists may readily concede that GR’s metric resembles other physical fields—that GR “physicalises” spacetime geometry in this sense (see e.g. Rovelli, as quoted in Brown, 2005, p. 159). Yet, conventionalists can stick to their guns: the clincher for conventionalism—its sine qua non and starting point—is the co-existence of empirically equivalent, but geometrically incompatible rival theories to GR, the choice amongst them, conventionalists maintain, being conventional. This is motivated (but also backed up by independent arguments specific to geometry) by the wish to alleviate the pressure that the coexistence of empirically indistinguishable alternatives exerts on realist ambitions. So long as such alternative theories aren’t ruled out, conventionalism stands unrefuted. Geometric conventionalism thus goes hand in hand with an injunction to explore the unconventional—to diligently study theories “outside the canon”. [Footnote 5]

From the outset, we’d like to pre-empt a widespread misunderstanding about conventions. As emphasised already by Poincaré (see e.g. Ben-Menahem, 2001, p. 484) and elaborated in detail by Reichenbach (1938), conventions aren’t completely arbitrary in two principal regards. First, via their role in laws of nature, and typically their interdependence, conventions are subject to empirical constraints. Empirically, [Footnote 6] conventionalists don’t champion an “anything goes”-attitude towards geometry. Secondly, notwithstanding their conventional status, and hence detachment from truth, some conventions can be rationally preferable over others: conventionalists recognise that one can make rational choices amongst conventional options—based on pragmatic, truth-unrelated considerations.

The present paper will gauge how geometric conventionalism (henceforth, referred to merely as “conventionalism”) fares vis-à-vis modern (i.e. Newtonian/pre-relativistic and relativistic) spacetime theories. [Footnote 7] Albeit pronounced dead in today’s prevailing philosophical literature (with Torretti’s quote being a representative obituary), conventionalism remains a live and attractive philosophical position—a nuanced, “selective anti-realism” in response to a peculiar, worrisome form of empirical underdetermination of geometry (see also Duerr, 2021, Sect. 4). Accordingly, our subsequent agenda of challenging, and in fact gainsaying, the prevalent consensus in the contemporary literature encompasses three main tasks:

  1. To elucidate the key tenets of geometric conventionalism, and conversely highlight the misconceptions about it that underlie the presently dominant anti-conventionalism;

  2. To articulate (old and new) arguments for geometric conventionalism as we, following Poincaré, conceive of it;

  3. To demonstrate, by means of concrete examples, that modern spacetime theories fully and naturally support geometric conventionalism.

We’ll proceed as follows. §II will concentrate on the conventionalist’s core tenets. In §III, we’ll present, and flesh out, the principal arguments for the conventionality of geometry. §IV will rehearse, and respond to, some objections. In §V, we’ll cash out how conventionalism applies to, and can be said to be vindicated in, classical and relativistic physics. Readers primarily interested in the paper’s philosophical parts can skip §V. It serves as a (relatively non-technical) appendix with the pertinent arguments from physics; it provides a systematic overview of the main alternatives to GR that the philosophical literature has largely neglected. In §VI, we’ll summarise our findings, and delineate future lines of inquiry.

2 Conventionalism

This section will scrutinise the two core tenets of conventionalism:

  (C1) the co-existence of empirically equivalent theories that postulate mutually incompatible physical geometries (§II.1);

  (C2) in reaction to this, the classification of geometry’s status as a convention (§II.2).

2.1 The underdetermination thesis

The underdetermination thesis (C1) forms the starting point for conventionalism. It asserts a multiplicity of distinct theories which (at least for our world) can accommodate the same empirical facts. The theories differ over their physical geometries (see e.g. Ben-Menahem, 1990, 2006, Ch. 1). By observational means alone, it’s impossible to discriminate amongst those geometric alternatives. This we’ll refer to as the (empirical) underdetermination of geometry. Empiricists about geometry, such as Helmholtz (see e.g. Carrier, 1994a; DiSalle, 2006), by contrast, postulate a unique (one-to-one) correspondence between candidate geometries and empirical facts; as per (C1), conventionalists postulate a many-to-one mapping.

(C1) purports not only that the theories in question employ distinct geometries: their respective spacetime-geometric structures are supposed to be incompatible. Geometric incompatibility comes in two main forms (Duerr & Ben-Menahem, 2022, Sect. IV). In the first, the respective geometries make mutually inconsistent geometric claims, for instance, about the validity of Euclid’s parallel postulate or the path-independence of parallel-transport. A different form of incompatibility arises when one geometry makes more general geometric claims than the other (but perhaps contains it as, in some sense, a special case): the latter geometry then prohibits more geometric possibilities than the former. This is the case, for example, for the pairs Euclidean and Riemannian geometry, or Riemannian geometry and Riemann-Cartan geometry.
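The first form of incompatibility can be illustrated with a standard textbook result (a sketch added for illustration, not an example drawn from the literature cited above). For a geodesic triangle with interior angles $\alpha$, $\beta$, $\gamma$ and area $A$ on a two-dimensional surface of constant Gaussian curvature $K$, the Gauss-Bonnet theorem yields:

```latex
\alpha + \beta + \gamma = \pi + K A
```

For $K = 0$ (Euclidean geometry) the angle sum is exactly $\pi$; for $K < 0$ (hyperbolic geometry) it falls below $\pi$, and for $K > 0$ it exceeds $\pi$. Read at face value, the claims “every triangle has angle sum $\pi$” and “some triangle has an angle sum below $\pi$” can’t both be true of one and the same space.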

Conventionalism as we conceive of it, is congenial to scientific realism: a friendly amendment to an overall empiricist and realist outlook, it seeks to uphold as much realist commitment as appropriate given the empirical underdetermination of geometry. Against such realist ambitions, geometric incompatibility becomes even more trenchant. Suppose that one starts with a straightforward realist stance towards incompatible geometries (that is, if one presumes them to be “literally true” (van Fraassen, 1980, §1)): metaphysically, it’s then impossible that they are simultaneously true. Conventionalism should, we submit, be seen as a strategy to evade this impasse—without condemning us to insuperable ignorance about geometry. We’ll resume this thread later on (§II.2).

The non-triviality of (C1) is worth dwelling on. Generic underdetermination theses have been widely impugned in the literature (e.g. Hoefer & Rosenberg, 1994; Newton-Smith, 2001; Norton, 2003; Worrall, 2011). Conventionalists are therefore obligated to proffer examples to support their more specific underdetermination thesis. §V will provide them.

Here, let’s home in on some conceptual subtleties behind (C1)’s beguiling simplicity. The thesis hinges on a number of prerequisite, substantive assumptions and commitments that in typical discussions of underdetermination remain tacit (cf. Carrier, 1994b, Ch. VI; Duerr, 2021). Rendering them explicit, we’ll successively comment on:

  (i) The defeasibility of underdetermination

  (ii) The reliance on alternative theories

  (iii) The required distinctness of theories

  (iv) Their empirical equivalence

  (v) The meaning of “physical geometry”.

Ad (i). Underdetermination of geometry is a sine qua non for conventionalism, as studied here. Hence, it’s vital to be clear on ways which might undermine the underdetermination thesis, or at least remove its philosophical sting (and thereby, in turn mar the motivation for introducing conventions as a new category, as per (C2)). One such way is to contest the premise: to deny the existence of genuine examples of empirically equivalent theories. To refute this most effectively, one must point to examples that the critic is obliged to engage with—our strategy pursued in §V.

Another option to challenge (C1) would be to accept it, but add that further considerations can “break” the underdetermination. That is, despite acknowledging a many-to-one mapping of candidate geometries and observational facts, one might insist that (possible) observations alone constitute too meagre a basis for proper theory evaluation. One should invoke additional considerations. Thereby, it’s hoped, a one-to-one correspondence will be achieved. If successful, such a strategy prevents the conventionalist from making philosophical hay with the empirical underdetermination of geometry: empirical underdetermination is acknowledged but dismissed as unimportant. Judicious theory assessment supplements empirical criteria for theory selection with super-empirical ones.

Two main options along those lines spring to mind. Both highlight that conventionalism is a philosophically meaty position: it can founder on further philosophical commitments as well as on scientific developments. One strategy for breaking underdetermination whittles down the number of candidate geometries by appealing to so-called super/extra-empirical theory virtues, such as parsimony or unificatory power (a strategy prominently promoted e.g. by Friedman, 1983, Ch. VII): because one theory (but not the other) exemplifies a salient theory virtue, we should prefer it over the other.

The crux for such a move is to render plausible the epistemic relevance of such virtues: ideally, one should demonstrate that they serve as reliable indicators of truth—that a theory exemplifying such a virtue is more likely to be true than one not exemplifying it (see e.g. Schindler, 2022 for a recent attempt). Conventionalists (as well as others, e.g. van Fraassen, 1980, esp. Ch. 4.4; Worrall, 2000; Ivanova, 2024, or Norton, 2021, Ch. 5–7), by contrast, deem such factors pragmatic: they merely concern matters of usefulness (or, occasionally, taste); fundamentally, they are disconnected from claims to knowledge or truth. Advocating super-empirical theory virtues as epistemically relevant thus counts as a philosophical commitment that can threaten conventionalism.

Empirical underdetermination might also be overcome through future scientific developments (see e.g. Laudan & Leplin, 1991). The idea pivots on differential support furnished by the same empirical data. Consider two theories T1 and T2 with the same empirical content. [Footnote 8] Further, suppose that at some point in the future, a successor or super-theory T* emerges such that:

  (a) T* has greater empirical content than T1 or T2,

  (b) It’s sufficiently well-confirmed in these new regimes, and

  (c) It reduces to, or contains in some other suitable sense, T1—but not to T2.

Under these circumstances, it seems prima facie plausible—with the benefit of hindsight—to prefer T1 over T2 in virtue of T* and its relationship to both theories. The empirical underdetermination of the choice between T1 and T2 is thus traced back to our limited knowledge at the time. If one accepts this reasoning, it shows how future scientific results can subvert conventionalism.

Ad (ii). Why formulate (C1) in terms of theories—rather than alternative descriptions in terms of singular statements (about the empirical facts of our world)? Our reason is that otherwise (C1) would be trivialised (see also Norton, 1994; Acuña, 2014a): by deploying stratagems reminiscent of Russell’s teapot, one could all too cheaply concoct empirically equivalent descriptions that introduce undetectable ad-hoc differences. [Footnote 9] Such alternative descriptions strike us as inherently artificial; to countenance them would seem tantamount to lowering standards typically expected of scientific speculations about the world. For conventionalism to be a philosophically interesting thesis—namely, a nuanced response to a particular kind of underdetermination—we thus demand that the alternative descriptions it presupposes be supplied by theories, the bread and butter of physics; philosophical fuss over artefacts and sophistry is to be shunned.

Ad (iii). Yet another way in which an underdetermination thesis can become trivial (and thereby emasculate conventionalism) concerns theory identity (or “synonymy”, Carrier, 1994b, p. 236): the theories in question may turn out not to be distinct, but merely notational variants of the same theory. In that case, their differences are confined to mathematical representation—not to physical content. The resulting underdetermination is analogous to our freedom to choose different bases for representing a vector in a vector space, or to substitute a word for a synonym—the insipid truism that “different signs can be used to denote the same referent, and the same sign can be used to denote different referents” (Ben-Menahem, 2006, p. 66). Such “underdetermination” is of little interest here; it only spawns what Grünbaum (1973, p. 27) decries as “trivial semantic conventionalism”.

The kind of conventionalism we’re after requires empirically equivalent, non-identical (“theoretically inequivalent”) theories. We thus need criteria for theory individuation: by which principles should one identify two theories as the same, rather than declare them genuinely distinct? To date, no consensus has been reached on the matter (see e.g. Weatherall, 2019a, b). Accordingly, the fate of conventionalism—insofar as, per (C1), it pivots on the coexistence of distinct theories—hinges on one’s criteria for theory individuation. Again, we see that conventionalism as a substantive thesis (as well as convincing examples for it) is intertwined with further philosophical commitments.

For some such commitments, conventionalism can’t even get off the ground. Consider, for instance, Reichenbach’s (1938) position: he regards empirical equivalence as both a necessary and sufficient criterion for theory identity. This immediately short-circuits (C1): genuinely distinct empirically equivalent theories are impossible; ex hypothesi, empirical equivalence precludes theory distinctness (cf. Duerr & Ben-Menahem, 2022, Sect. IV). [Footnote 10]

Pace Reichenbach, we believe that caution counsels permissivity—also with respect to theory identity (cf. Glymour, 1970): conventionalism oughtn’t to be held hostage to gratuitously specific further commitments. In this regard, theory individuation qua interpretative differences seems apposite: we’ll regard theories (even when sharing the same formalism!) as distinct, iff they admit of distinct, coherent interpretations; distinct theories must limn the world in metaphysically distinct, but individually intelligible and coherent ways (see Butterfield, 2018; Coffey, 2014; Møller-Nielsen, 2017; for details and independent arguments for such a view).

Ad (iv). Likewise, the notion of empirical equivalence—by which, as we’ll expound below, we mean a sufficiently large overlap of relevant empirical content—harbours subtleties that further contribute to (C1)’s non-triviality. [Footnote 11]

First, empirical content is arguably not an intrinsic feature of a theory. It depends on one’s background knowledge (see e.g. Bunge, 2000, Parte IV for a detailed analysis). What counts as empirical/observable phenomena is determined by other (theoretical) assumptions in our “web of beliefs” (Quine): they link our theories and observations, and penetrate our interpretations of observational data. Such background assumptions can, and usually do, change over the course of physics; concomitantly, also what counts as empirical/observable can, and usually does, shift (see also e.g. Read & Møller-Nielsen, 2017). Prima facie, nothing precludes that different background assumptions may render two theories at some point empirically equivalent, and at a different point empirically distinct. As before with (i), we here witness another instance of how scientific developments can impinge upon (C1): by altering (improving) our background knowledge, they may force us to revise pronouncements of two theories as empirically equivalent. In light of enhanced knowledge, they may turn out to be empirically distinguishable after all.

Secondly, (C1)’s empirical underdetermination must be sufficiently robust. If future advances, theoretical or experimental, may reasonably be expected to break the empirical underdetermination, it’s not robust enough. Conventionalism would be staked on too volatile a form of uncertainty; it would be epistemically too unstable.

Thereby so-called “transient” underdetermination (see e.g. Stanford, 2017) is ruled out as too tenuous, at least in its generic form: it originates in merely momentary nescience (e.g. current technological limitations tainting measurement devices or a preliminary dearth of data). In principle, it can be remedied by more information, technological improvements, etc. On the other hand, as Pitts (2011, 2016) has stressed, for sufficiently robust underdetermination one needn’t have strictly empirically equivalent theories; weaker relations suffice. The theories might, for instance, be observationally distinguishable only in situations that either happen not to be physically realised in our universe (think of, say, causally exotic spacetimes or speculative types of matter, say, tachyons or massive photons), or that are, as a matter of principle, beyond our observational/experimental control. [Footnote 12]

The notion of empirical equivalence that conventionalists need is observational indistinguishability to the best of our knowledge and present theoretical expectations. Once more, it’s evident how the fate of conventionalism is inextricably tied to developments in science.

Ad (v): Conventionalism as a doctrine about the conventionality of (physical) geometry mandates a clarification of what one means by “physical geometry”. Once more, we’ll see that one’s answer can affect the standing of conventionalism.

In this regard, Reichenbach’s (1928) view about geometry is instructive. Reichenbach denied that talk about geometry is substantive: for him geometry was merely an inessential “cloak” with which we dress physics, a structural analogy between a particular representation of a theory’s physical content, and pure/mathematical geometry (Giovanelli, 2016, 2021). Such a construal of geometry defangs the coexistence of geometrically incompatible, but empirically equivalent theories: they merely differ over (contingent) representational means, but not necessarily over substantial matters (viz. their physical content): in Reichenbach’s metaphor, the same person may wear different cloaks. Consequently, Reichenbach’s view about geometry would diminish the motivation for conventionalism: (C1)’s geometric incompatibilities per se would pose no serious challenge to the realist; the selectively-antirealist manoeuvre, (C2), would become moot (see below in §II.2). [Footnote 13]

For our purposes, “(physical) geometry” will denote the spacetime geometric structure to which our best physical theories are ontologically committed (in the sense of Quine, 1948). “This physics presupposes geometrical structure that it is natural to interpret as primitive and physically instantiated in an entity ontologically independent of matter” (Pooley, 2013, p. 20).

Together with our stance on theory individuation qua interpretation, naïve realism about spacetime would entail a metaphysical incompatibility of theories ontologically committed to incompatible geometries (say, one with vanishing curvature, and one with non-vanishing curvature): they would appear to belong to distinct worlds. This underdetermination of geometry should alarm those with realist ambitions: motley mutually exclusive candidate geometries exist, but evidence falls short of singling out one. Here, the conventionalist’s second core commitment comes into its own.

Before discussing (C2), however, we’d like to underscore that conventionalism renounces any a priori restrictions on countenanced geometries: with respect to the geometric structure of which theories may avail themselves, latter-day conventionalists—in contrast to Poincaré (who limited himself to three-dimensional spaces of constant curvature)—embrace a ‘whatever works’-pluralism. This expressly includes non-standard options departing drastically from Riemannian geometries: geometries with torsion and/or non-metricity, “multi-geometries” (Pitts, 2016, 2017), Finsler geometry, area metrics, etc.

2.2 The conventionality thesis

Conventionalists respond to the empirical underdetermination of geometry by adopting a “selective” (or “modest”, see Farr, 2022) anti-realism (Duerr, 2021, p. 25): to defuse the conflict between empirically equivalent but geometrically incompatible rivalling theories they bar geometric posits from a (naïve) realist attitude. As a result, what those descriptions essentially differ over doesn’t concern (hypothetical) features of the world; their differences are thus drained of factual substance.

For this anti-realist move, conventionalists—following Poincaré—introduce a third category, alongside the received distinction between analytic/synthetic propositions. Poincaré baptised propositions of this novel category “conventions”. They don’t possess truth values: they are neither true by definition nor, lacking factual content, true in virtue of matters of fact. Conventions are contributions to theories from human theorisers—theoretical components sui generis, originating in the exercise of free discretion. Our theorising about the world, according to conventionalism, contains an ineliminable leeway, fixed neither by definitions nor facts; it remains fundamentally underdetermined by definitional/analytic and factual/synthetic truths. From amongst a multitude of options (i.e. alternative conventions) scientists must select one for the theory to be conceptually complete/well-defined. No such choice, though, is distinguished in terms of truth; conventions are unrelated to truth. What guides the choice of a convention, and accordingly the choice of one theory from its empirically equally good rivals, are pragmatic considerations (e.g. simplicity): our preference is based on convenience and other predilections—reflecting human interests and volition.

As paradigmatic exemplars of conventions, Poincaré adduces the axioms of geometry. The laws of optics in their standard form utilise Euclidean notions of distance and straightness (or parallelism). Yet, as Poincaré explicitly demonstrates, we are at liberty to use Lobachevskian/hyperbolic geometry (in which the Euclidean parallel axiom is replaced)—as long as we allow for suitable correction factors; these “universal forces” (as Reichenbach (1928) aptly calls them) systematically distort measurement results, thereby preserving all empirical findings. Standard optics and its revised counterpart, with their respective reliance on Euclidean and non-Euclidean geometry, are empirically on a par. This illustrates (C1): on the one hand, the two descriptions are mutually incompatible (in that, according to one, space has vanishing curvature, whereas according to the other, it has constant negative curvature); on the other hand, experimentally one can’t adjudicate between them. The choice is conventional: with respect to truth, none is privileged over the other—in line with (C2). To Poincaré’s mind, we opt for Euclidean geometry solely on grounds of handiness, namely its (alleged) greater simplicity. [Footnote 14] Since there is no fact of the matter whether space is Euclidean or not, the descriptions’ contradictory judgements regarding its curvature have lost their sting.
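The trade-off between geometry and correction factors can be made tangible with a toy calculation. The following is only a minimal sketch in the spirit of Poincaré’s thought experiments, not a reconstruction of his optical example. It assumes the Poincaré-disk form of the hyperbolic line element, ds = 2 dr / (1 - r^2) along a radius (curvature -1), and, for the rival description, an assumed “universal” shrinking law under which every measuring rod at Euclidean radius r has length (1 - r^2) / 2. All function names are hypothetical.

```python
import math


def hyperbolic_length(r_end, steps=100_000):
    """Description A: hyperbolic geometry, rigid unit rods.

    Integrates the assumed Poincare-disk line element
    ds = 2 dr / (1 - r^2) along a radial path from 0 to r_end
    (midpoint rule).
    """
    dr = r_end / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += 2.0 * dr / (1.0 - r * r)
    return total


def euclidean_rod_count(r_end, steps=100_000):
    """Description B: Euclidean geometry, universally distorted rods.

    The path has Euclidean length r_end, but a rod at radius r is
    assumed to shrink to length (1 - r^2) / 2. The number of rods laid
    end to end is the integral of dr / rod_length(r).
    """
    dr = r_end / steps
    count = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        rod_length = (1.0 - r * r) / 2.0
        count += dr / rod_length
    return count


def circumference_rod_count(r):
    """Description B: rods needed to go round a circle of Euclidean radius r."""
    return 2.0 * math.pi * r / ((1.0 - r * r) / 2.0)


r = 0.5
rho_a = hyperbolic_length(r)    # measured radius, read as hyperbolic distance
rho_b = euclidean_rod_count(r)  # measured radius, read as count of shrunken rods
print(rho_a, rho_b, 2.0 * math.atanh(r))
# Both descriptions also agree on the measured circumference:
print(circumference_rod_count(r), 2.0 * math.pi * math.sinh(rho_a))
```

Under these assumptions the two descriptions predict the same rod counts for the radius and the circumference of a circle, although one ascribes constant negative curvature to space and the other none. They differ only over whether the deviation from Euclidean ratios is booked under geometry or under a universal distortion of the rods.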

Conventionalism is a selectively anti-realist position. [Footnote 15] Overall, our version of it is sympathetic to realism; as per (C2), conventionalists merely exempt spacetime-geometric structures from realist commitment, together with the corrective terms in the physical laws that would arise for alternative geometric choices (in Poincaré’s preceding example: the universal forces). To our minds, conventionalism thus qualifies as broadly compatible with realism [Footnote 16]: it constitutes a realism-friendly, nuanced, and (as we’ll unpack in greater detail in §III) specific response to the challenge of empirical underdetermination of geometry (as per (C1)).

We side with a maximally realist construal of conventionalism along the preceding lines. Yet, conventionalism needn’t be tied to that. Those who prefer a cagier variant may abstain from positive recommendations for realist commitment, and simply remain non-partisan as to which theoretical posits merit realist commitment. Committing to a negative recommendation, such a conventionalist only opposes realism—as per (C2)—about physical geometry. The present paper’s results equally apply to this less demanding form of conventionalism. Our discussion will, however, stick to the maximally realist construal.

Let’s therefore juxtapose, in broad brush strokes, conventionalism and the “three families or camps” of generic realism (Chakravartty, 2017, Sect. 1.3; see also Psillos, 1999 for a systematic review):

  (1) Explanationism

    Advocates of this “selectively-realist” strategy, such as Kitcher (1993, Ch. 4 and 5) or Psillos (1999), recommend realist commitment to “working posits”, i.e. those parts of our best theories “that are in some sense indispensable or otherwise important to explaining their empirical success—for instance, components of theories that are crucial in order to derive successful, novel predictions” (Chakravartty, 2017); contrariwise, “idle posits”, not responsible for the theory’s empirical successes, don’t merit realist commitment.

    For geometry and its underdetermination, limiting realism to a theory’s elements responsible for its empirical success ends in an aporia. First, construe indispensability in the sense that without the geometries G and G′, the empirically equivalent, but geometrically incompatible theories T_G and T′_G′ are deprived of their predictive or explanatory capacities. Then, both G and G′ are indispensable (with respect to T_G and T′_G′, respectively)! Explanationism would, however, remain silent on a choice between T_G and T′_G′; the co-existence of conflicting geometries would continue. Next, construe indispensability slightly differently: can we retain T_G’s predictions, without postulating its physical geometry G? By dint of the empirical equivalence of T_G and T′_G′, we can. According to explanationism, underdetermination would ipso facto seem to entail that neither of the candidate geometries, G or G′, merits realist commitment (a consequence that goes some way towards the conventionalist’s (C2)).

  (2) Structural realism

    Structural realists, such as Worrall (1989) or Ladyman (1998), reserve realist commitment for a theory’s formal or structural content (as expressed in their mathematical equations); its further claims about entities and their nature, by contrast, don’t merit realism. Successful theories, according to structural realists, approximately capture the relations instantiated in the world; they don’t unveil insights into the entities that stand in those relations.

    On the face of it, structural realism cuts no ice against the specific predicament of an empirically underdetermined geometry: ex hypothesi, as per (C1), the geometric alternatives are different—even mutually incompatible; the theories posit incompatible structure—viz. geometry. This blocks structural realism. [Footnote 17]

  (3) Entity (or experimental) realism

    Entity realists, such as Hacking (1983), aver that realism ought to be restricted to “conditions in which one can demonstrate impressive causal knowledge of a putative (unobservable) entity, such as knowledge that facilitates the manipulation of the entity, and its use so as to intervene in other phenomena” (Chakravartty, 2017).

    Entity realism seems to suggest an anti-realist stance towards geometry (once more, grist to the conventionalist’s mill, as per (C2)). It’s not clear that we can attain causal knowledge of spacetime geometry: spacetime geometry is rarely regarded as a causal agent sensu stricto (see e.g. Brown, 2005; Lam, 2005; Nerlich, 1994, Ch. 7, 2013, Ch. 8). Indeed, explanations of relativistic phenomena that essentially involve spacetime-geometric structure tend to be cited as epitomes of non-causal explanations (e.g. Lange, 2017; Saatsi, 2021).

Having clarified the core aspects of conventionalism, our next task is to ponder: what arguments might buttress the case for (C2)? (Patently, (C1) can only be convincingly substantiated by means of examples; they’ll be given in §V).

3 Why be a conventionalist?

This section will expand on four further arguments in favour of conventionalism:

  1. (1)

    Empirical underdetermination of physical geometry (§III.1): In a sufficiently robust sense, geometrically distinct spacetime theories accommodate the same observational facts; yet, they postulate physical geometries which can’t be literally true all at the same time. Without lapsing into instrumentalism, conventionalism offers an alternative to agnosticism about physical geometry.

  2. (2)

    Genuine alternatives (§III.2): For each theoretical alternative, prima facie non-trivial super-empirical reasons exist. They suggest that we should take it seriously, including its geometric structures. Conventionalism offers a stance towards geometry that does so—without requiring (invariably controversially) super-empirical selection principles as guides to truth.

  3. (3)

    Local inter-translatability of geometric alternatives (§III.3): The geometric alternatives stand in remarkably close relationships: their geometries admit mutual translations into one another. This suggests an egalitarian attitude towards those alternative geometries: conventionalism is a natural interpretative option for granting them an ontologically equal/egalitarian status.

  4. (4)

    The transcendental nature of geometry (§III.4): For principled reasons, spacetime geometry is more remote from empirical access than material objects and their behaviour: it has a more constructed character. As such, claims about geometry don’t sit comfortably with more robustly objective propositions, expressing analytic or factual truths. Conventionalism is a stance towards geometry that implements this peculiar standing.

3.1 Empirical underdetermination of physical geometry

We already commented on empirical underdetermination as a precondition for geometric conventionalism (§II.1 (i)). Just as underdetermination in general challenges the realist about scientific theories (see e.g. Chakravartty, 2017, Sect. 3.1 for details; Turnbull, 2017), underdetermination of geometry challenges the realist about geometry: the royal road to knowledge, direct empirical evidence, seems obstructed. Rather than jettisoning empiricism tout court, we conceive of conventionalism as an ultimately conservative response—a more sophisticated complement to empiricism.

What other options might a friend of realism consider? As indicated before, if she’s sanguine about the reliance on super-empirical theory virtues, the underdetermination of physical geometry may not overly disconcert her. However, if she has reservations about such commitments, her realist ambitions come under pressure.

One reaction then would be to curtail them by remaining agnostic about the true geometry: following e.g. van Fraassen (1980), she might simply swallow our insuperable ignorance about it as a brute fact (and perhaps hope that the future will ameliorate this plight, e.g. through some of the strategies indicated in §II.1). Saddling oneself with such a fundamental gap in our knowledge might be deemed rebarbative, though.

Conventionalism eliminates this gap by revising a metaphysical assumption: why presume that claims about geometry admit of truth-values (of which we can be ignorant)? Conventionalists contest this, as per (C2). Of course, to deflect charges of ad-hocness, it’s incumbent on conventionalists to muster independent arguments. To three such arguments, we’ll turn next.

3.2 Genuine alternatives

In invoking empirical underdetermination of geometry, as per (C1), we’ll be fastidious: the empirically equivalent theories of §V will display some—typically distinct—super-empirical theory virtues; each theory receives a “non-vanishing methodological rating” (Carrier, 1994b, p. 239).

We insist on this for two reasons.Footnote 18 First, it reassures us that we are dealing with genuine and genuinely distinct theories. Their super-empirical features commend them as serious theories in their own right. Each depicts reality in a way that has independent intellectual merits. It would seem unfair to shrug off this underdetermination as resulting from artificial pseudo-theories or notational variants of the same theory (cf. Acuña, 2014a; Norton, 2003). The underdetermination associated with such genuine theories itches. This makes the motivation for conventionalism particularly strong.

Secondly, Kuhn (1974, p. 357) has forcefully drawn attention to the role of empirical and super-empirical criteria (“values”) in appraising theories. They give rise to two well-rehearsed problems. One is that “(i)ndividually, they are imprecise: individuals may legitimately differ about their application to concrete cases. In addition, when deployed together, they repeatedly prove to conflict with one another.” Simplicity, for instance, is a notoriously ambivalent notion (see e.g. Bunge, 1963): inter alia, it can refer to mathematical simplicity (which in itself can be spelt out in motley ways), or quantitative or qualitative parsimony.Footnote 19

Suppose that we can render precise what the various empirical and super-empirical theory virtues consist in. Even then, according to Kuhn, a second problem looms: to compare theories, we’d also have to assign those virtues different weights—to rank them. Which balance to strike between, say, mathematical simplicity and unificatory power? Those who shy away from any particular stance towards theory virtues (and instead, say, prefer neutrality) in light of these two problems are thus likely to feel the pull of our examples in §V towards conventionalism as a response.

3.3 Local translatability

A third argument for conventionalism pivots on the close relationship that geometric alternatives exhibit. The alternative theories’ geometries admit of mutual translations. The differences between them are well-controlled; they can, as it were, be hedged (or quarantined).

The geometrically alternative Newtonian and relativistic theories that we’ll consider in §V display an interesting feature: their spacetime-geometric underpinnings admit of mutual (albeit not necessarily unique) transformations into each other, while the remaining physics is left essentially intact (after minimal compensating adjustments at the interface between physics and geometry, that is, wherever the equations draw on spacetime geometry).

This doesn’t come as a surprise in light of the holism about geometry, to which we subscribe (see §III.4). In Einstein’s (1921) evocative formula, only the “total system G + P”, geometry-cum-physics, can be the object of empirical knowledge. The geometrically alternative theories re-shuffle the G components; one can repackage the empirical content. Note, however, that such re-shuffling and re-packaging oughtn’t ipso facto be equated with a trivial subtraction/addition of elements (see below).
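The re-shuffling can be illustrated schematically. The following sketch is ours, not Einstein’s: it assumes two affine connections on the same manifold, with coefficients Γ and Γ′, and relies on the standard fact that their difference is a tensor field, here labelled C. The symbols C and λ (a curve parameter) are illustrative and don’t occur elsewhere in the text.

```latex
% Assumption: \Gamma and \Gamma' are the coefficients of two affine connections
% on the same manifold; their difference C is a tensor field.
\begin{align}
  \Gamma^{a}{}_{bc} &= \Gamma'^{a}{}_{bc} + C^{a}{}_{bc}, \\
  % Description 1: force-free (geodesic) motion with respect to \Gamma.
  \frac{d^{2}x^{a}}{d\lambda^{2}}
    + \Gamma^{a}{}_{bc}\,\frac{dx^{b}}{d\lambda}\frac{dx^{c}}{d\lambda} &= 0, \\
  % Description 2: the same curve, read as forced motion with respect to \Gamma'.
  \frac{d^{2}x^{a}}{d\lambda^{2}}
    + \Gamma'^{a}{}_{bc}\,\frac{dx^{b}}{d\lambda}\frac{dx^{c}}{d\lambda}
    &= -\,C^{a}{}_{bc}\,\frac{dx^{b}}{d\lambda}\frac{dx^{c}}{d\lambda}.
\end{align}
```

Both descriptions single out the same curves. They differ only in whether C is booked under the geometry (G) or, as a force-like correction term, under the physics (P).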

Thanks to this holism, we enjoy liberty in parsing the “total system” into different G and P components—without forfeiting empirical (or in fact epistemic) content. The epistemic opaqueness of geometry per se provides a strong motivation for the conventionality thesis (C2), a thought we’ll pursue further in §III.4. Here, however, we’d like to unravel a different argument.

To this end, it will be instructive to delimit the situation from two aspects of Quine’s philosophy. The first further clarifies conventionalism as we envision it; the second delivers the announced argument. First, recall that Quine (1951) endorses a total/global conventionalism.Footnote 20 It follows from Quine’s radical epistemological holism (which is not confined to geometry, cf. §III.4): only all-encompassing “systems of the world” (Quine, 1975)—that is, the entirety of theories we adopt, including logic and mathematics—possess objective content, empirical content, in particular. Individual (physical) theories (and, a fortiori, their laws and propositions) are conventions, devoid of truth-values. By contrast, the situation of geometry is more localised in two regards. First, the conventional leeway we are interested in remains limited to the alternative theories’ geometries (and the concomitant correction terms that enable the “translation” between the different geometric structures within the physical laws). Furthermore, conventionalism as we envision it needn’t embrace the total freedom in choosing one’s geometry that Quine allows—the view that, in our case with respect to geometry, “any statement can be held true come what may, if we make drastic enough adjustments elsewhere in the system” (Quine, 1951, p. 43). Conventionalism can thrive on a more modest underdetermination thesis: it presupposes only the existence of some geometric alternatives—not the untrammelled infinitude of whatever geometry one decides to stipulate.

A second divergence between our views and Quine’s concerns his stance towards theory identity. Quine ties theory identity to truth-preserving translatability (inter-definability) of a theory’s basic terms: for Quine, two theories count as synonymous, as notational variants of each other, if they are empirically equivalent and admit of such inter-translatability (see e.g. Barrett & Halvorson, 2016 for a critical, and technical, discussion of details of Quine’s view).

Inter-translation as Quine envisages it doesn’t apply to the examples. A cognate does, though (at least for a few of them): the spacetime settings considered in §IV and §V are all “almostFootnote 21 isomorphic” (more precisely: categorically equivalent, see e.g. Weatherall, 2018; Hudetz, 2019). Here, one of the key ideas is that sets of empirically equivalent models of one theory might correspond to sets of empirically equivalent models of another theory, where the sets related by this correspondence needn’t be of the same cardinality. In this case, it’s reasonable to aver that, in spite of this lack of strict isomorphism between the spaces of models themselves, there is still a straightforward sense in which the two theories can be regarded as equivalent. In particular, some of the theories considered in §V are “almost” isomorphic in the sense that their solution spaces stand in the above relations to one another—relations precisely captured by the notion of categorical equivalence.
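For orientation, the standard category-theoretic definition behind this notion can be stated compactly. The sketch below assumes, as is common in the literature cited, that each theory is represented by a category whose objects are its models and whose arrows are the structure-preserving maps between them; the labels C₁, C₂ and F are ours.

```latex
% Assumption: \mathcal{C}_1 and \mathcal{C}_2 are the categories of models of the
% two theories; F is a hypothetical label for the functor relating them.
A functor $F:\mathcal{C}_1\to\mathcal{C}_2$ is an \emph{equivalence of categories} iff
\begin{enumerate}
  \item (full and faithful) for all objects $A,B$ of $\mathcal{C}_1$, the map
        $\mathrm{Hom}_{\mathcal{C}_1}(A,B)\to\mathrm{Hom}_{\mathcal{C}_2}\bigl(F(A),F(B)\bigr)$,
        $f\mapsto F(f)$, is a bijection;
  \item (essentially surjective) every object $D$ of $\mathcal{C}_2$ is isomorphic
        to $F(A)$ for some object $A$ of $\mathcal{C}_1$.
\end{enumerate}
```

Since F need only reach every model up to isomorphism, and needn’t be a bijection on objects, the two collections of models may differ in size. For the application to physical theories, one would presumably require in addition that F preserve empirical content.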

To some, this close structural relationship suggests that one should physically identify those spacetime settings: their differences merely pertain to differences in representation. Thereby, the geometric incompatibilities are defused: the incompatibility is relegated to the surface level of surplus structure; at the (ex hypothesi) fundamental structural level, the spacetime settings agree. Only trivial semantic conventionality ensues (see our discussion in §II.1 (iii)).

As per (C2), conventionalists pursue a different strategy of defusing the incompatibilities. Conventionalists aren’t held hostage to a commitment to categorical equivalence as a filter for severing a theory’s fundamental structure from surplus fluff. Conventionalists can likewise be impressed with the close structural link amongst spacetime settings. But they’d draw a different conclusion—without embroiling themselves in the dispute over particular views about theory identity. To a conventionalist, the close structural link suggests that we treat all of the geometric alternatives on a par—but since their natural interpretations (namely, when read at face value) can’t be true simultaneously, she opts for an anti-realist stance towards all the geometric options—for demoting them to conventions, devoid of truth-values. Thereby, by opting for (C2), she can have her cake—an egalitarianism towards all the (remarkably closely related) geometric options—and eat it too (without choking on their geometric incompatibilities).Footnote 22

3.4 Transcendental nature of spacetime

Our final argument traces back to Poincaré (Bland, 2011; Folina, 2014; Friedman, 2002). It turns on the doubly transcendental nature of spacetime (cf. Duerr & Calosi, 2021, p. 13814). First, spacetime geometry is empirically opaque in a natural sense. Even if one acknowledges the indirectness and theory-ladenness of empirical knowledge, something rings right about saying that material objects are the objects of empirical experience: first and foremost, we observe—and garner empirical evidence from—material bodies and their behaviour, not spacetime structure itself. Material objects are epistemologically privileged over spacetime structure.Footnote 23 For concreteness, think of Gauß’s triangulation measurements (putativelyFootnote 24) to ascertain the Euclidean nature of space: claims about the latter are inferred from the observations, based on the behaviour of light. This distance between claims about spacetime structure and experience removes them, as it were, from the binding forces of empirical reality; spacetime structure thus becomes—to stay in the image—more freely movable. Accordingly, our liberty grows for helping ourselves to whatever spacetime structure we find useful in physical theorising. How exactly do we infer claims about spacetime from the behaviour of material objects? This leads to the second (and indeed Kantian) sense in which spacetime can be said to be transcendental. Spacetime structure functions as a “condition of the possibility” for meaningfully interpretable physical laws—as a constitutive a priori.Footnote 25

That is, physical laws describing the behaviour of material bodies presuppose, explicitly or implicitly, spacetime structure: in order to formulate, and make sense of, physical laws, all standard theories requireFootnote 26 spacetime structure (e.g. notions of distances or volume, or parallel-transport); without it, one can’t write down the theory’s dynamics.Footnote 27 That physical laws conceptually rely on geometry implies an epistemological holism about geometry: empirically, we can’t study spacetime geometry in isolation; experiments ineluctably probe geometry in conjunction with physical laws that, explicitly or implicitly, incorporate it. But now it looks as if we’re stuck in a circle: how to gain empirical knowledge about geometry, if such knowledge inevitably presupposes empirical knowledge of physical laws, and at the same time, these physical laws themselves presuppose spacetime structure?

Conventionalism offers a pragmatic way out. It takes its cue from the foregoing twofold transcendence of spacetime structure: being a prerequisite for physical dynamics, and relatively remote from epistemic access, it seems that spacetime is more something we construct, something we as theorisers impose or prescribe, rather than something objective we discover. (To be sure, this construction isn’t unconstrained or utterly arbitrary. We empirically test laws, in tandem with their prerequisite spacetime structure.) Together with the multiplicity of alternative physical geometries, as per (C1), this motivates (C2) as the conventionalist’s second core tenet: geometric posits are active contributions from human theorisers. Rather than grounded in facts (or analytic truths), they originate in theorisers’ (logically) free choice.

By demoting the geometric structure to conventions, one gets around the above epistemic catch-22: if geometric structure is eo ipso devoid of factual content, and originates in human discretion, one may stipulate it in whatever way one may find most convenient (and, of course, in whatever way eventually works empirically). In this way, it’s possible to exit the circle without trespassing upon truth. Conventionalism thus allows—in fact, emboldens—us to utilise whatever geometry works (i.e. is profitably employed in a physical theory). The ultimate arbiter remains, of course, experience: which geometric structure yields viable geometric alternatives depends on the (empirical) success of the theories that employ it.

4 Objections to conventionalism

Denuding geometric claims of truth-values triggers a few potential concerns. Here, we’ll inspect four of them: doubts about the coherence of propositions without truth-values, potential troubles with conditionals (material ones, as well as counterfactuals), and a slippery slope argument. We’ll sketch what we deem auspicious ways to address these concerns.

4.1 Is the notion of propositions that lack truth-values per se absurd, incoherent, or unintelligible from the get-go?

Such a general concern becomes pressing, given e.g. the traditional analytic/synthetic dichotomy. If one subscribes to the view that all meaningful propositions, endowed with cognitive content, can be classified either as analytic or synthetic truths, no space is left for conventions as propositions without truth-values. Propositions which prima facie can’t be ascribed truth-values and possess no truth-conditions would instead, for instance, be deemed merely expressing emotional attitudes, devoid of cognitive content.Footnote 28

We don’t share those hunches. Prima facie, nothing strikes us as inherently problematic in allowing for (cognitively contentful) propositions without truth-values. In fact, we seem familiar with various examples: normative claims (e.g. in ethics, but alsoFootnote 29 epistemological or methodological norms, such as those proposed as standards of good theories, e.g. the deprecation of ad-hocness) or aesthetic-evaluative claims. Patently, none of these claims plausibly count as analytic or synthetic truths. Yet, we hold them to be perfectly intelligible and coherent in and of themselves—irrespectively of further philosophical views one may espouse regarding such propositions. In particular, their meaningfulness and cognitive contentfulness don’t hinge on such more specific, further commitments (e.g. moral or aesthetic realism or anti-realism).Footnote 30

4.2 Truth-values for conditionals?

We’d like to lay down truth-tables for conditionals involving conventions. The non-ascribability of truth-values in the antecedent or consequent, however, appears to impede this desideratum.

For illustration, consider the following two kinds of material implication.Footnote 31 (The rationale behind this grouping will become transparent shortly.):

  1. (A)

    “If what General Relativity asserts about the connection between matter and geometry is true, geodesics in the presence of, say, spinning neutron stars are twisted.”, “If what Nordström’s theory asserts about the connection between matter and geometry is false, the spacetime’s geodesics aren’t affected by the energy–stress associated with electromagnetic radiation.”, “If two black holes coalesce, their joint surface area doesn’t decrease.”, or “If cosmic inflation occurred, the universe’s present-day geometry is predicted to be approximately homogeneous and flat.”

  2. (B)

    “If General Relativity is true, the paths of light rays are bent in the vicinity of heavy objects.”, and “If light is bent in the vicinity of heavy objects, what Nordström’s theory asserts about the connection between spacetime geometry and matter isn’t true.”Footnote 32

Those conditionals (strictly speaking: concatenated conditionals) look unproblematic—in fact, intuitively, true. It seems desirable that conventionalism preserves this intuition. But how to evaluate the corresponding truth-tables, if consequent or antecedent—as per the conventionalist’s (C2)—don’t admit of truth-values?

For an answer, we must treat the conditionals in class (A) differently from those in (B). The relevant difference consists in the fact that the latter involve empirical claims, whereas the former are merely mathematical-formal claims. Let’s start with the conditionals in (A): their consequents are geometric claims within a given theory (GR and Nordström Gravity, respectively). It’s therefore natural to construe the conditionals in (A) as merely mathematical-formal implications: given GR or Nordström Gravity (in their standard geometric interpretation, i.e. as theories about spacetime geometry), we can deduce from them the respective consequences, via purely formal arguments. In other words, we propose to construe conditionals in (A) as mathematical-logical conditionals—analytically true ones.

By contrast, the conditionals in class (B) contain empirical facts in the consequent and antecedent, respectively. This makes a crucial difference: empirical facts are, as per (C1), not affected by replacing one geometry by another; ipso facto the two geometric options are empirically indistinguishable. Rather than taking them at face value, conventionalists can naturally reconstrue the conditionals in (B) as claims about the empirical adequacy of the theories they mention. In other words, we propose that they be understood as “If GR is empirically adequate, light rays are bent in the presence of massive objects.”, and “If light is bent in the presence of heavy objects, Nordström’s theory of gravity is empirically inadequate.” These paraphrased conditionals can be straightforwardly evaluated as mathematical-formal claims: non-zero light-bending is a result derivable from GR; zero light-bending is one derivable from Nordström’s theory.
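Schematically, the two reconstruals might be rendered as follows. The notation is ours and purely illustrative: ⊢ stands for formal derivability within a theory, EA(T) for “T is empirically adequate”, γ for a geometric claim, e for the empirical claim that light is bent near heavy objects, and N for Nordström’s theory.

```latex
% Illustrative notation, not taken from the source:
% \vdash = formal derivability, \mathrm{EA}(T) = "T is empirically adequate",
% \gamma = a geometric claim, e = "light is bent near heavy objects",
% \mathrm{N} = Nordstr\"om's theory.
\begin{align}
  \text{(A)}\quad & T \vdash \gamma
     && \text{(theory-relative, analytic)} \\
  \text{(B)}\quad & \mathrm{EA}(\mathrm{GR}) \rightarrow e
     && \text{(since $\mathrm{GR} \vdash e$)} \\
                  & e \rightarrow \neg\,\mathrm{EA}(\mathrm{N})
     && \text{(since $\mathrm{N} \vdash \neg e$)}
\end{align}
```

In neither schema does a geometric claim figure, on its own, as a truth-apt antecedent or consequent: in (A) it occurs only relative to a theory, and in (B) it has been replaced by a claim about empirical adequacy.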

One may demur: isn’t this response, via a re-interpretation of conditionals, an ad-hoc dodge? We don’t think so. Our reasons will be unpacked in the subsequent discussion of our fourth objection. At this juncture, we content ourselves with observing that our response here is coherent: while conventionalism entails that truth-tables for the above conditionals aren’t straightforward (i.e. obtainable via standard means), it chimes with the spirit of conventionalism to simply stipulate—a convention!—that truth-values be assigned in the manner indicated above.

The next worry extends the preceding one from conditionals to counterfactuals. In our rejoinder, we’ll resort to the same strategy as before.

4.3 Truth-values for counterfactuals?

How are conventionalists supposed to make sense of counterfactuals, given that for conventionalists, geometrically alternative descriptions don’t necessarily correspond to distinct possible worldsFootnote 33 (or at least not obviously so)?

Similarly to the foregoing case of conditionals, the challenge that certain counterfactuals pose to the conventionalist can be traced back to her refusal to assign truth-values to antecedents or consequents; in no possible world, according to conventionalism, do geometric claims possess truth-values. Note, hence, a salient difference, from a possible-worlds perspective, between the situation regarding geometric claims and the usual modal eccentricities: in some possible world, it’s true (albeit false in ours) that pigs fly around in Saturn’s atmosphere, or that photons have non-zero mass. It would rub against the core tenet of conventionalism, were one to regard geometric possibilities as genuinely distinct metaphysical possibilities.

As in the case of conditionals, it’s a forceful desideratum that a broad array of counterfactuals be amenable to a truth-functional analysis.Footnote 34 Consider, for instance:

  1. (A)

    “Were the universe’s mass density significantly higher, its geometry would be closed.”, “If Nordström’s theory of gravity were true, the spacetime’s geometry would be conformally flat.”, or “Were the curvature to blow up on a Schwarzschild Black Hole’s event horizon, GR would be wrong.”

  2. (B)

    “If gravitational waves didn’t travel along the ‘physical metric’s’ null-cone, TeVeS wouldn’t have been falsified.Footnote 35”, “Were our universe’s spacetime geometry given by a Weyl geometry (that is, were Weyl’s (1918) unified theory true), the spectrum of light from the sun would be blurred (display the so-called second-clock effect).”

Intuitively, it ought to be possible to evaluate these counterfactuals. Again, we expect them to come out as in fact true. Yet, the occurrence of geometric claims in the antecedents or consequents appears to impede this: in no possible world are they, according to conventionalism, true.

A reconstrual strategy similar to that used in the case of conditionals removes the obstacle. The counterfactuals within (A), we suggest, should be understood as conditionals in disguise: they state mathematical-formal derivability of analytic truths within the theories as mathematical-logical structures. It’s then possible to determine the truth-values proposed above. One can indeed verify that the implications hold. The case of the counterfactuals in (B) can be dealt with mutatis mutandis: they should be understood as counterfactuals not about the truth of the geometric claims in the antecedents/consequents, but about the empirical adequacy of the theories with the prerequisite physical geometry. The counterfactual can then be evaluated within a standard possible-world semantics. To be specific, for instance: “Had Weyl’s theory been empirically adequate, the spectrum of light from the sun would be blurred.”

4.4 Slippery slope towards global conventionalism?

Is conventionalism, in its local form, restricted to geometry, unstable—a slippery slope towards instrumentalism? Given the ubiquity of “geometry-ladenness” of physical claims, doesn’t conventionalism effectively collapse onto global conventionalism?

The query behind this worry already implicitly figured in the preceding issues: to what extent can the conventionalist meaningfully predicate truth/falsity of theories (or propositions) that contain conventional elements (see e.g. Sklar, 1974, p. 128)? On conventionalism, doesn’t this reliance on truth-value-free ingredients “contaminate” the whole proposition, rendering it not truth-functionally evaluable? The case of modern gravitational theories makes the question particularly acute: modelled on GR, with its canonical geometric interpretation, gravitational theories appear to make essential reference to spacetime geometry. Together with gravity’s “universal effect” (Carnap, 1966, p. 169), doesn’t conventionalism then entail a full-blown anti-realism about such theories? Presumably, such anti-realism quickly spreads all over physics: with geometric structures entering physical laws in manifold ways, the conventionality would soon “infect” significant chunks of physics!Footnote 36

Conventionalists can stave off this repugnant conclusion, we submit, by heeding the distinction between theory-relative, and absolute questions of truth-functional evaluation.Footnote 37 By the former we mean that given a physical theory (with its implicitly or explicitly presupposed geometric structures) as a mathematical system of axioms, certain geometric claims can be formally derived. Their status is that of (theory-relative) analytic, mathematical truths about geometry.

This theory-relativity of physical geometry, and (truth-functionally evaluable) propositions about it, isn’t ad-hoc. Instead, it reflects the holistic intertwinement of geometry and physical laws: talk of physical (as opposed to mathematical/pure) geometry remains incomplete without specifying the physical theory in which it occurs. Physical laws presuppose geometric structure; in turn only through the laws—through the role the geometry plays in those laws—does the geometric structure acquire its status as physical geometry. Both conceptually/semantically and epistemically, physics and physical geometry are given together.Footnote 38

Doesn’t this physics-cum-geometry holism aggravate the concern that conventionalism, and hence a kind of anti-realism, about geometry bleeds into the rest of physics? We’ll now tackle the question of assigning truth-values to a physical theory (with its physical geometry), or to propositions entailed by that theory—a truth-functional evaluation not in a theory-relative, but absolute manner (and hence capable of disclosing non-analytic truths).

An insulating strategy naturally lends itself to conventionalists: apply truth-functional evaluation only to the theory (or proposition) shorn of its geometric components. Once we have excised the latter, we can assign (or rather: ascertain) truth-values of the geometry-purged remainder in the standard manner: the source of the initial trouble—the conventions, which lack truth-value—has been removed. We must, however, qualify this excision by a natural proviso: the truth-functionally evaluable rump theory should still include the structures of empirical geometry; we mustn’t cut out geometry if it has direct empirical significance.

To unpack this, let’s distinguish between physical and empirical geometry. As already indicated above, a theory’s empirical geometryFootnote 39 denotes those geometric structures that its empirical/observable content instantiates.

Physical and empirical geometry needn’t coincide (see e.g. Brown & Read, 2021, p. 76; Read & Menon, 2021). In fact, the existence of the latter is contingent: prima facie, nothing requires that all theories even possess an empirical geometry. In this regard, the case of GR is especially interesting: its physical metric geometry happens also to be empirical. That is, GR’s metric is endowed with “operational” (Read & Menon) “chronometric significance” (Brown, 2005, Ch. 9): empirical phenomena (e.g. test particles and light-rays) can be coordinated with the metric so as to serve as indicators or measurement instruments for it (“rods and clocks” monitoring it).Footnote 40 Although empirically equivalent geometric alternatives to GR may postulate a different physical geometry, all—qua observational indistinguishability—will agree on this empirical geometry.

How does the proposed rule for truth-functionally evaluating theories or theoretical claims apply to concrete cases? Begin with GR. Its metric constitutes both its physical and empirical geometry. Hence all claims about, or involving, GR’s geometry allow for standard truth-value assignments!

What enables this pleasing result is the coincidenceFootnote 41 of GR’s physical and empirical geometry. Accordingly, we may naturally (and on pain of conceptual pedantry) transition from reference to the former, to reference to the latter. (Nonetheless, sensu stricto, physical and empirical geometry are distinct concepts; claims about the former, being conventional, lack truth-values, whereas the claims about the latter constitute synthetic truths.) Geometric alternatives to GR typically no longer exhibit this coincidence: in them, physical and empirical geometry come apart. As a result, geometric components—that is, physical-geometric ones—must be excised from propositions, prior to subjecting them to truth-functional evaluation. For instance, in order to truth-functionally evaluate claims involving the (undetectable) Minkowski metric in the spin-2 field interpretation of GR (see §V.2.2), one must purge them of reference to this background geometry.

It also deserves to be stressed that the assignability of truth-values (and, conversely, conventionality) of certain claims can depend sensitively on their formulation—and how broadly one is willing to construe them. For instance, “In our world, gravitational waves are ripples of spacetime curvature.” bears no truth-value: whether gravitational waves are conceptualised as torsion waves, as in TPG’s physical geometry (Hohmann et al., 2018, see §V.2.1 below), or as curvature waves, as in GR’s physical geometry, is a matter of convention. Such a proposition only admits of a truth-value when relativised to a particular theory: the initial proposition comes out as (analytically) true as a GR-relative one, but as (analytically) false as a TPG-relative one. One may, of course, understand the proposition more broadly: as a claim about empirical geometry (i.e. the natural structure codifying the physical/observable effects of gravitation), it comes out as a synthetic truth; as such it holds independently of TPG or GR.

We close this survey of concerns with a final comment on the difference in ontological status of empirical and physical geometry, respectively. Physical geometry belongs to a theory’s ontological commitments. In this sense, it invites a realist attitude. (Conventionalists, of course, resist the temptation, for reasons compiled in §III.) Empirical geometry, by contrast, doesn’t: it’s a formal property of certain elements of a theory’s empirical content. These elements merely happen to instantiate that empirical-geometric structure. For realists, it thus seems that, according to a theory, physical geometry exists in a more substantial, and less accidental way than empirical geometry. At the same time, epistemically, empirical geometry is more robust: whereas physical geometry is generically a theoretical posit, we can in principle gain empirical knowledge of empirical geometry. This difference ought not to be conflated with the suggestion that physical geometry is therefore dispensable—that one should forgo any reference to physical geometry, and instead exclusively focus on empirical geometry.Footnote 42 We reject this suggestion as a non-sequitur.

5 Classical and relativistic spacetimes: a smorgasbord for geometric conventionalists

In many respects, Friedman’s (1983) locus classicus encapsulates latter-day anti-conventionalist sentiments. Friedman identifies two main challenges: “[…] namely giving a clear sense to the notion of ‘equivalent descriptions’, and showing that in problematic cases—notably, the choice between alternative physical geometries—we in fact have ‘equivalent descriptions’” (p. 268). §II–§IV proffered our response to Friedman’s first challenge: to unpack (and motivate) the core tenets of conventionalism. We’ll now address Friedman’s second challenge: to deliver examples, demonstrating that blood courses through the veins of the arguments for conventionalism (§III)—that conventionalism is a nuanced response to a real predicament in physics. We’ll commence with classical/pre-relativistic spacetime physics (§V.1); §V.2 will turn to the relativistic domain.

5.1 Conventionalism and classical spacetimes

After reviewing what we take to be the standard Newtonian spacetime setting (§V.1.1), we’ll here discuss alternatives to it (§V.1.2). A second group of relevant theories comprises alternative geometrisations of classical physics in its entirety (§V.1.3).

5.1.1 Newtonian spacetime

As per (C1), conventionalism thrives on the co-existence of alternative theories. Alternatives to what? Following Weatherall and Manchak (who in turn follow Malament, 2012, Ch. 4.1), we take the standard classical spacetime theory to be neo-Newtonian (or “Galilean”) spacetime (see e.g. Sklar, 1974, p. 202; Maudlin, 2012, Ch. 3; Pooley, 2013, Sect. 4.1).

It encompasses three postulates. At bottom lies the topological manifold of events, together with its differentiable structure. On it, both a temporal and a spatial metric are defined. They furnish durations and lengths (and areas and volumes), respectively. Further constraints vouchsafe an absolute, universal simultaneity structure.Footnote 43 Points of neo-Newtonian spacetime perched on different simultaneity surfaces can’t be identified: they lack transtemporal identity. The final ingredient is a standard of straightness (or acceleration): it defines which paths through the neo-Newtonian spacetime are privileged—which paths are supposed to count as inertial/force-free. This standard of straightness is given by a flat connection (i.e. with vanishing associated Riemann tensor): one thereby implements that vectors, carried along inertial paths, don’t change their direction; by the same token, slices of spacetime that form a simultaneity surface are flat.Footnote 44 The laws of Classical Mechanics in its standard formulation employ only the spatiotemporal concepts listed above.

What is the most suitable spacetime setting for classical physics? A number of works (e.g. Bain, 2004; Earman, 1989; Friedman, 1983; Pooley, 2013) belabour this question. We’ll keep the discussion short, referring to the literature for details.

The leitmotif pervading this literature is Earman’s “adequacy condition” for a spacetime (cf. Earman, 1989, Ch. 3): for an adequate spacetime setting, the dynamical and spacetime symmetries are supposed to match. That is, the symmetries in the matter sector should coincide with those of the underlying spacetime. The guiding idea is to strike the right balance between frugality and fecundity. On the one hand, the spacetime geometry should be rich enough to allow us to formulate the physical laws for matter. On the other hand, parsimony mandates that this structure play a role for the dynamics: dynamically idle structure—geometric distinctions to which all matter is insensitive—ought to be pruned.

Yet, flouting Earman’s adequacy condition may be less obnoxious than is often suggested.Footnote 45 As long as the spacetime structure plays at least a metaphysical or interpretative role in the theory, a more liberal attitude seems more appropriate (cf. Møller-Nielsen, 2017; Read & Møller-Nielsen, 2020). We regard the matching of dynamical and spacetime symmetries merely as a desideratum. Other (super-empirical) considerations may override it.Footnote 46 In what follows we’ll always explicate such possible considerations.

Does standard Newtonian theory respect Earman’s desideratum? As has been known arguably even to Newton himself, it doesn’t. Consider so-called dynamical shifts: they uniformly accelerate a system, while adding a counterbalancing force that remains constant on simultaneity surfaces (see e.g. Duerr & Read, 2019 for details). A dynamical shift of a model of standard Newtonian theory (i.e. a set of variables satisfying the theory’s laws) generates another model. All empirical facts remain unaltered. Yet, in virtue of accelerative differences, a dynamical shift effects changes relative to the underlying spacetime: a dynamically shifted world instantiates a different distribution of events over spacetime. Dynamical shifts thus constitute a symmetry that outstrips the spacetime symmetries of standard Newtonian theory.

Two reasons speak against discarding standard Newtonian theory’s spacetime (i.e. neo-Newtonian spacetime) as “inadequate”; they, one may feel, outweigh the blemish of violating Earman’s desideratum. The first is standard Newtonian theory’s flexibility: its structural richness—even if not all of it is harnessed by gravitation and point-particle mechanics—allows for straightforward extensions. These cover continua (with phenomena such as the behaviour of fluids or elasticity) as well as the various phenomena associated with friction.

A second argument against discarding neo-Newtonian spacetime in classical mechanics pivots on so-called massive Newtonian gravity. For this 1-parameter family of theories, one modifies the Newtonian theory of gravity, whilst retaining its neo-Newtonian spacetime: one adds a so-called mass term \(\mu^{2}\) to the gravitational field equation. As a result, the l.h.s. of the Poisson Equation acquires an extra term linear in the gravitational scalar, \(\Delta \varphi \to \Delta \varphi + \mu^{2} \varphi\). This is a natural modification, familiar (to contemporary physicists) from field theory. Such a term ensures that the dynamical symmetries match the symmetries of neo-Newtonian spacetime. The “extra” symmetry, in virtue of which standard Newtonian theory contravenes Earman’s adequacy condition, is broken: dynamical shifts no longer map models of massive Newtonian gravity onto models.
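The symmetry breaking can be checked in a few lines. The following is a sketch under conventions that we assume for illustration only: a dynamical shift with uniform acceleration \(\mathbf{a}(t)\) acts as \(\mathbf{x} \to \mathbf{x}' = \mathbf{x} + \mathbf{b}(t)\) with \(\ddot{\mathbf{b}} = \mathbf{a}\), together with \(\varphi \to \varphi' = \varphi - \mathbf{a}\cdot \mathbf{x}'\), and test particles obey \(\ddot{\mathbf{x}} = -\nabla \varphi\). The equation of motion and the Laplacian are then unaffected, because the added term is linear in the spatial coordinates:

\[
\ddot{\mathbf{x}}' = \ddot{\mathbf{x}} + \mathbf{a} = -\nabla \varphi + \mathbf{a} = -\nabla' \varphi' , \qquad \Delta' \varphi' = \Delta \varphi - \Delta' (\mathbf{a}\cdot \mathbf{x}') = \Delta \varphi .
\]

The mass term, by contrast, picks up a residue:

\[
\Delta' \varphi' + \mu^{2} \varphi' = \Delta \varphi + \mu^{2} \varphi - \mu^{2}\, \mathbf{a}\cdot \mathbf{x}' .
\]

The residue \(\mu^{2}\, \mathbf{a}\cdot \mathbf{x}'\) vanishes everywhere only if \(\mu = 0\) or \(\mathbf{a} = 0\). The check does not depend on the sign with which the mass term enters the field equation.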

For any given set of observational data we can choose \(\mu \) sufficiently small such that massive Newtonian gravity and standard Newtonian theory become indistinguishable (vis-à-vis that data).Footnote 47 This establishes the empirical quasi-equivalence of the two. This is all we need for our present purposes (cf. Pitts, 2011).

But why might one be inclined to prefer massive gravity over standard Newtonian gravity? Two reasons can be gleaned. First, let’s reiterate our last point: no compelling a posteriori/empirical reason is forthcoming for privileging the special value \(\mu =0\) (which corresponds to standard Newtonian theory). Worth bearing in mind here is that measurements ineluctably have non-zero error bars: hence, one can never have evidence-based certainty of this value (Pitts, 2018, p. 15). A vanishing mass term thus isn’t merely numerically special; the epistemological basis for assuming \(\mu =0\) is invariably tenuous. Another reason to disfavour \(\mu =0\) is the primary motivation for Neumann and Seeliger to consider massive Newtonian gravity (avant la lettre) towards the end of the nineteenth century: for a non-vanishing mass term \(\mu \ne 0\) one overcomes certain problems besetting standard Newtonian theory, when applied to cosmology (Norton, 1999).

In broad brush strokes, we’ll next outline a few empirically equivalent, theoretical alternatives to standard Classical Mechanics. All seem mutually incompatible: prima facie, they can’t be literally true at the same time. The theories we’ll consider fall into two groups. Members of the first, (A), differ from standard classical mechanics primarilyFootnote 48 with respect to their spacetime structure. Members of the second group, (B), display farther-reaching—more general structural, and (arguably) metaphysical—differences. With our liberal attitude towards theory individuation (see §III), all considered theories count as distinct. We’ll briefly stress the principal merits of each. In particular, in addition to occasional computational advantages, they have distinctive explanatory resources (see North, 2021, Ch. 7; Sklar, 2013). The theories thus satisfy our methodological criteria for persuasive examples in support of conventionalism (§III).

5.1.2 Alternative spacetime settingsFootnote 49

5.1.2.1 Absolute velocities

Due to its historical influence, Newtonian theory with absolute velocities deserves mentioning. Vis-à-vis standard Newtonian theory, it merely adds further structure to the spacetime—a vector field representing absolute velocity. On the one hand, this velocity field appears dispensable; it plays no role in the formulation of the dynamics. This is nothing but Galilean Relativity: Classical Mechanics being invariant under constant velocity boosts, the absolute velocity in principle evades detectability—“an epistemological embarrassment” (Pooley).
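The boost invariance invoked here can be spelled out briefly. As a sketch, assume (for illustration) a system of particles with positions \(\mathbf{x}_{i}\) and masses \(m_{i}\), whose forces depend only on relative positions. Under a boost with constant velocity \(\mathbf{v}\),

\[
\mathbf{x}_{i} \to \mathbf{x}_{i}' = \mathbf{x}_{i} + \mathbf{v}\,t , \qquad \dot{\mathbf{x}}_{i}' = \dot{\mathbf{x}}_{i} + \mathbf{v} , \qquad \ddot{\mathbf{x}}_{i}' = \ddot{\mathbf{x}}_{i} , \qquad \mathbf{x}_{i}' - \mathbf{x}_{j}' = \mathbf{x}_{i} - \mathbf{x}_{j} .
\]

Hence \(m_{i} \ddot{\mathbf{x}}_{i} = \mathbf{F}_{i}(\mathbf{x}_{j} - \mathbf{x}_{k})\) holds for the boosted motions whenever it holds for the original ones. Accelerations and relative positions carry no trace of \(\mathbf{v}\), so no measurement of them can fix a value of the absolute velocity.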

On the other hand, it might seem a bit rash to dismiss Newtonian theory with absolute velocities out of hand. First, it remains a coherent, and in many regards intuitive, interpretative possibility: the said epistemological embarrassment isn’t tantamount to a metaphysical absurdity.Footnote 50 After the demise of verificationism, unobservability per se may no longer elicit philosophical abhorrence. Secondly, before the development of the inertial frame concept (and the mathematical means to perspicuously implement it via the notion of a connection as a standard of straightness), one may argue that it remained the only then-available spacetime setting: in this sense, Newtonian theory with absolute velocities at least used to be the most rational option—given what at the time could be ontologically articulated (cf. Martens & Read, 2020; Møller-Nielsen, 2017). Lastly, at least two other theories call for absolute velocities. Neither, admittedly, is classical in the usual sense. Yet, the fact that they are empirically successful suggests that at least with the benefit of hindsight, a Newtonian spacetime with absolute velocities might be better off than one might initially think. One such theory is Bohmian mechanics, an alternative to non-relativistic quantum mechanics (see e.g. Passon, 2004 for details). It endows particles with always definite paths, with a distinctively quantum action-at-a-distance force guiding them. The other theoretical context in which historically a role was expected for absolute velocity (relative to the luminiferous ether) was of course electromagnetic theory. Coherence with both theories buttresses absolute velocities.

5.1.2.2 Newton-Cartan-theory/geometrised Newtonian gravityFootnote 51

Newton-Cartan Theory (NCT) geometrises (“inertialises”, Duerr, 2020a, 2020b, p. 93) Newtonian Gravity in the same way as GR (Carrier, 1994b, p. 242; Friedman, 1983, Ch. III.4; Knox, 2014; Malament, 2012, Ch. 4): gravitational effects are conceptualised as a manifestation of a non-flat inertial structure (represented by a connection with non-zero curvature). That is, test-particles in gravitational free-fall are treated as force-free/inertial; unlike in neo-Newtonian spacetime, their paths trace out non-straight geodesics. Its dynamics is furnished by certain, purely geometric conditions on the connection’s curvature on the one hand, and a geometrised version of the gravitational Poisson Equation (in striking similarity with the Einstein Equations) on the other.
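In coordinates, the geometrisation takes the following form. This is a sketch in adapted coordinates \((t, x^{i})\), with notation we assume for illustration: \(\varphi\) is the Newtonian potential, \(\rho\) the mass density and \(t_{a} = \nabla_{a} t\) the temporal one-form. The Newtonian equation of motion for a freely falling test particle, \(\ddot{x}^{i} = -\partial^{i}\varphi\), is read as a geodesic equation whose only non-vanishing connection coefficients are \(\Gamma^{i}{}_{tt} = \partial^{i}\varphi\):

\[
\frac{d^{2}x^{i}}{dt^{2}} + \Gamma^{i}{}_{tt} = 0 .
\]

The Poisson Equation becomes a condition on the Ricci tensor of that connection (cf. Malament, 2012, Ch. 4):

\[
R_{ab} = 4\pi G \rho\, t_{a} t_{b} , \qquad \text{i.e.} \qquad R_{tt} = \partial_{i}\Gamma^{i}{}_{tt} = \Delta \varphi = 4\pi G \rho .
\]

The gravitational potential no longer appears as a force term; it has been absorbed into the inertial structure.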

Vis-à-vis standard Newtonian theory, NCT’s resulting spacetime’s symmetry group is enlarged: the dynamical shifts, plaguing standard Newtonian theory, now are manifest gauge symmetries. The dynamical and spacetime symmetries thereby match: NCT respects Earman’s adequacy condition.

Yet another feat commends NCT: it provides an (eliminative) explanation of the equivalence of inertial and gravitational mass (Duerr, 2020a, 2020b, fn 146; cf. Weatherall, 2011). That is: as in GR, NCT dispenses with gravitational mass; inertial mass fully takes over its role. This constitutes an advantage of NCT’s in terms of parsimony (and explanatory power).

Also with respect to inter-theory relations, NCT scores highly. Thanks to its conceptual similarities with GR, it plays a privileged role in illuminating GR’s Newtonian (i.e. weak-field, static) limit: NCT is arguably the theory to which GR can be most naturally said to reduce (see Fletcher, 2019 for details).

In light of these accomplishments, what more to say? Haven’t we identified, with NCT, the right spacetime setting for classical physics? Upon further reflection, things turn out to be less straightforward. For one, one may recoil from the exclusive focus on gravity (and the equations of motion for test-particles)—as we already queried in Weatherall and Manchak’s (2014) analysis of conventionalism: it’s unclear how NCT coheres with the rest of physics. Standard presentations of NCT remain silent on how NCT’s variables couple to matter other than test-particles. Which standard of inertia/acceleration is matter adverting to—NCT’s or standard Newtonian theory’s flat one?

Perhaps for certain historical periods, such a restriction was forgivable. But to reduce classical physics to Newtonian gravity and test-particles seems a bit of a stretch. With the discovery of electromagnetic forces, at the least, awkward questions arise (even within the non-relativistic regime). A classical, charged particle, for instance, radiates, when accelerated. But in gravitational free-fall—which according to NCT counts as unaccelerated/inertial—does it radiate or not? Absent further information of how NCT relates to physics beyond (uncharged) test-particles, it’s hard to fathom an answer. Hailing NCT as the obviously most appropriate spacetime setting for classical physics certainly seems too quick.

5.1.2.3 Barbour-Bertotti theory

Of all spacetime settings contemplated so far, Barbour-Bertotti Theory (Barbour & Bertotti, 1982) requires the sparsest structure: as primitive spatiotemporal posits, it only needs (instantaneous) relative, spatial Euclidean distances.Footnote 52 It can dispense with a fundamental notion of time.

This goes hand in hand with changes in dynamics (Pooley, 2013, p. 39). The latter is given by an extremal principle, the so-called Best Matching Technique. It picks out sequences of spatial configurations, based on minimal differences in relative particle configurations (i.e. differences in instantaneous relative particle configurations modulo rotations). Thereby a natural order between configurations is induced, an order grounded in intrinsic differences of configurations. In this sense, time emerges—as envisioned by Leibniz (cf. Barbour, 1982)—as an order parameter of a “sequence of co-existences”.Footnote 53 The symmetry group of these dynamics is significantly larger than that of standard Newtonian theory. In fact Earman’s adequacy condition is thereby restored: Barbour-Bertotti Theory’s dynamical symmetries and those of its spacetime match.
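The idea of an intrinsic difference between two configurations can be illustrated with a toy computation. The sketch below is not Barbour and Bertotti's variational principle, which extremises an action over whole sequences of configurations. It only compares a single pair of configurations, and it rests on simplifying assumptions of ours: two spatial dimensions, equal particle masses, and a squared-distance measure minimised over rigid rotations and translations. All function names are hypothetical.

```python
import math

def centre(points):
    """Shift a 2D configuration so that its centroid sits at the origin."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [(x - cx, y - cy) for x, y in points]

def best_matched_difference(config_a, config_b):
    """Minimise sum_i |a_i - (R b_i + t)|^2 over rotations R and shifts t.

    Returns (residual, angle): the minimal squared difference and the
    rotation angle (radians) that brings config_b onto config_a.
    Equal particle masses are assumed (illustrative simplification).
    """
    if len(config_a) != len(config_b) or not config_a:
        raise ValueError("need two non-empty configurations of equal size")
    a, b = centre(config_a), centre(config_b)   # optimal shift: align centroids
    c = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    s = sum(ay * bx - ax * by for (ax, ay), (bx, by) in zip(a, b))
    angle = math.atan2(s, c)                    # optimal rotation in 2D
    norm_a = sum(x * x + y * y for x, y in a)
    norm_b = sum(x * x + y * y for x, y in b)
    residual = norm_a + norm_b - 2.0 * math.hypot(c, s)
    return max(residual, 0.0), angle            # clip rounding noise below zero

def rotate_and_shift(points, angle, shift):
    """Apply a rigid motion; used only to build example configurations."""
    ca, sa = math.cos(angle), math.sin(angle)
    return [(ca * x - sa * y + shift[0], sa * x + ca * y + shift[1])
            for x, y in points]

triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
moved = rotate_and_shift(triangle, 0.7, (3.0, -2.0))   # same shape, displaced
stretched = [(0.0, 0.0), (2.5, 0.0), (0.0, 1.0)]       # genuinely different shape

same_shape, angle = best_matched_difference(triangle, moved)
changed_shape, _ = best_matched_difference(triangle, stretched)
```

A rigidly displaced copy of a configuration has (up to rounding) zero best-matched difference from the original, whereas a change in the relative distances leaves a positive residual. Only the latter kind of difference counts as intrinsic in the sense of the text.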

Another result of the Best Matching dynamics is that, in one sense, Barbour-Bertotti Theory and standard Newtonian theory also differ empirically: there are possible observations that allow one to discriminate between them. In fact, the pertinent data would falsify Barbour-Bertotti theory. Barbour-Bertotti Theory and standard Newtonian theory are only empirically equivalent for a universe with zero net angular momentum (more on this below). But, to our best knowledge, our universe seems to have vanishing angular momentum (with our experimental techniques not allowing any interventions on this fact!). Hence, Barbour-Bertotti and standard Newtonian theory are in another, relevant sense empirically indistinguishable—at least for us (i.e. our world).

Besides parsimony with respect to its geometric underpinnings and compliance with Earman’s adequacy condition, Barbour-Bertotti Theory scores highly on two other virtues. One is that it maximises explanatory power (see Pooley & Brown, 2002 for details). For definite predictions, Barbour-Bertotti Theory requires fewer inputs (initial data) than standard Newtonian theory: only relative distances between N particles, and their instantaneous rates of change suffice for the particles’ evolution to be well-defined; in standard Newtonian theory, this is too little (see Barbour, 2003, Ch. 5 for an intuitive illustration). Not only do Barbour-Bertotti Theory’s explanatory capacities need fewer resources (thereby, in a natural sense, boosting the theory’s explanatory power). In fact, as Pooley and Brown (2002) have stressed, vis-à-vis standard Newtonian theory, it makes a genuine prediction: that the angular momentum of the universe (equivalently, its global rotation) must vanish—a prediction borne out by the empirical evidence. By contrast, in standard Newtonian theory, “[…] that the angular momentum of the universe is zero is contingent and, in fact, rather extraordinary (given the range of possible values it might have had according to [standard Newtonian theory])” (p. 8, their emphasis).

Another achievement is indebted to other philosophical—broadly MachianFootnote 54—commitments. Hence, it won’t strike everybody as inherently advantageous (cf. Pooley, 2013, Sect. 5; for a more positive evaluation of Machianism see Thébault, 2021). It concerns relationalism (cf. Barbour, 1982): Barbour-Bertotti Theory counts as a fully relationalist theory (Huggett et al., 2023; Pooley, 2013, Sect. 6.2; Sklar, 2013, Ch. 20). That is, neither spacetime nor space is a basic, fundamental entity in its own right, on a par with matter: time, we saw already, emerges from genuinely non-temporal facts (viz. intrinsic relative differences in instantaneous particle configurations); rather than an independent entity, space, on the other hand, is an abstraction in that it’s an interpolation (and hence ontologically derivative) of relative configurations instantiated by actual material entities. Such relationalism is certainly attractive on the grounds of ontological parsimony.

In short: Barbour-Bertotti Theory needs only relative instantaneous particle configurations (and their rates of change), and no geometric structure richer than that; in particular, it gets by without a primitive time metric. Geometrically, the contrast with standard Newtonian theory is patent.

Let’s now segue into our second group of geometric alternatives to standard Newtonian theory. Its members geometrise classical physics in different ways; their physical geometries absorb the latter in different manners.

5.1.3 Geometrisations of classical physics

5.1.3.1 Hamiltonian mechanics

As its state space (i.e. the space each point of which represents a possible state of a classical-mechanical system) Hamiltonian Mechanics posits the so-called co-tangent bundle \({T}^{*}Q\). It consists of the configuration space \(Q\), with cotangent space (the dual to tangent space) \({T}_{p}^{*}Q\), attached to each point \(p\in Q\).Footnote 55 For a classical system with \(n\) particles, the configuration space \(Q\subseteq {R}^{n}\) denotes the space of the particles’ generalised positions (i.e. a set of parameters that describe the system’s total configuration). The co-tangent bundle \({T}^{*}Q\) forms a \(2n\)-dimensional manifold. This manifold is now equipped with further structure. Hamiltonian Mechanics geometrises Classical Mechanics through a so-called symplectic structure \(\omega \). It defines a volume—but not a metric/line-element on \({T}^{*}Q\).Footnote 56 In contrast to (non-flat) Riemannian manifolds, all resulting symplectic manifolds, \(\langle {T}^{*}Q,\omega \rangle \), locally look the same; they are isomorphic. Finally, one introduces the theory’s fundamental object, the Hamiltonian \(H\)—a real-valued, differentiable scalar field on \({T}^{*}Q\). These structures now fully yield the system’s dynamics: the latter is furnished by (“Hamilton’s canonical”) \(2n\) first-order equations. (They leave invariant exactly the manifold’s symplectic structure.)
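For concreteness, these structures can be written in local canonical coordinates. The notation is assumed here for illustration: \(q^{i}\) are generalised positions on \(Q\) and \(p_{i}\) the conjugate momenta, with \(i = 1, \dots, n\). In such coordinates the symplectic structure and Hamilton’s canonical equations read

\[
\omega = \sum_{i=1}^{n} dq^{i} \wedge dp_{i} , \qquad \dot{q}^{i} = \frac{\partial H}{\partial p_{i}} , \qquad \dot{p}_{i} = -\frac{\partial H}{\partial q^{i}} .
\]

The volume that \(\omega\) defines is, up to normalisation and sign conventions, the \(n\)-fold wedge product \(\omega^{n}\). That \(\omega\) can always be brought into the above form locally is what underwrites the claim that all symplectic manifolds of a given dimension locally look the same.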

Already at the level of this cursory gloss, it becomes evident that Hamiltonian Mechanics proves to be an auspicious example for geometric conventionalism (cf. North, 2009, 2021, 2022 for a deeper analysis). First, the geometric structure of Hamiltonian Mechanics—a symplectic manifold of the co-tangent bundle—clearly differs from that of standard Newtonian theory. In fact, the geometries are structurally inequivalent (non-isomorphic, cf. Curiel, 2014). This establishes geometric incompatibility between standard Newtonian theory and Hamiltonian Mechanics, when read at face value.

Notwithstanding these geometric differences, Hamiltonian and Newtonian Mechanics (under suitable circumstances that we’ll grant here) are inter-translatable: one can recover/derive the dynamical equations of each from the other. This is rehearsed in any advanced textbook on Analytic Mechanics (e.g. Landau & Lifshitz, 1976). For our purposes, we may thus presume the empirical equivalence of Hamiltonian and Newtonian Mechanics as unproblematic.
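The simplest instance of this inter-translatability is a single particle of mass \(m\) moving in one dimension in a potential \(V(q)\) (an illustrative special case, chosen by us). With the Hamiltonian

\[
H(q,p) = \frac{p^{2}}{2m} + V(q) ,
\]

Hamilton’s equations give

\[
\dot{q} = \frac{\partial H}{\partial p} = \frac{p}{m} , \qquad \dot{p} = -\frac{\partial H}{\partial q} = -V'(q) .
\]

Eliminating \(p\) yields Newton’s Second Law, \(m\ddot{q} = -V'(q)\). Conversely, starting from Newton’s equation and defining \(p := m\dot{q}\) recovers the pair of first-order equations. The “suitable circumstances” mentioned above concern the general case, where this elimination need not be possible.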

Hearing of inter-translatability, the conventionalist is reminded of Poincaré’s (and Reichenbach’s) inter-translatability argument for the conventionality of geometry; she is likely to prick up her ears. But mention of inter-translatability is also reminiscent of Quine (1975). Hence, worries may arise concerning the theory identity (synonymy, in our parlance of §3.1) of Hamiltonian and Newtonian Mechanics: are they merely representational variants of the same theory? Despite nigh-universal claims to that effect in the physics literature, the profound differences—mathematical, interpretative and metaphysical—suggest otherwise (cf. North, 2009, 2021, 2022). For instance, while in Newtonian Mechanics forces are genuine entities, responsible for pushing particles around, in Hamiltonian Mechanics they seem to be absent. Concomitantly, the two also differ with respect to their resources for explaining, and hence for understanding, the pertinent phenomena. To our minds, caution thus counsels a verdict in favour of theory individuation: Newtonian and Hamiltonian Mechanics count as distinct theories.

Further support for this is accrued by the super-empirical merits of Hamiltonian Mechanics; they underscore its status as a theory in its own right. Its practical advantages are attested to by most textbooks covering the subject. For instance, the (\(2n\)) first-order differential equations of the Hamiltonian formalism tend to be much more tractable than the (\(n\)) second-order ones of standard Newtonian theory. Particularly impressive applications concern the study of chaotic systems, and statistical mechanics. The most compelling super-empirical virtue of Hamiltonian Mechanics, however, arguably resides in its relevance for quantum mechanics: Hamiltonian Mechanics provides the basis for canonical quantisation.
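One practical payoff of the first-order form can be illustrated numerically. The sketch below is our own illustration and is not drawn from the works cited: it integrates Hamilton's equations for a harmonic oscillator, \(H = p^{2}/2m + kq^{2}/2\), with the symplectic Euler scheme, a standard integrator chosen here for simplicity. The function names and parameter values are hypothetical.

```python
def symplectic_euler_step(q, p, dt, mass=1.0, stiffness=1.0):
    """One step of Hamilton's equations for H = p^2/(2m) + k q^2/2."""
    p_new = p - dt * stiffness * q      # dp/dt = -dH/dq, evaluated at old q
    q_new = q + dt * p_new / mass       # dq/dt = +dH/dp, evaluated at new p
    return q_new, p_new

def energy(q, p, mass=1.0, stiffness=1.0):
    """Value of the Hamiltonian at the phase-space point (q, p)."""
    return p * p / (2.0 * mass) + 0.5 * stiffness * q * q

def evolve(q, p, dt, steps):
    """Iterate the step map with default mass and stiffness."""
    for _ in range(steps):
        q, p = symplectic_euler_step(q, p, dt)
    return q, p

def step_jacobian_determinant(dt, mass=1.0, stiffness=1.0):
    """Determinant of the (linear) step map (q, p) -> (q_new, p_new).

    Its matrix is [[1 - dt^2 k/m, dt/m], [-dt k, 1]].
    """
    return (1.0 - dt * dt * stiffness / mass) - (dt / mass) * (-dt * stiffness)

initial_energy = energy(1.0, 0.0)
final_q, final_p = evolve(1.0, 0.0, dt=0.01, steps=10000)
energy_drift = abs(energy(final_q, final_p) - initial_energy)
area_factor = step_jacobian_determinant(0.01)
```

Each step has unit Jacobian determinant, so it preserves phase-space area, the discrete counterpart of leaving the symplectic structure invariant. As a result the energy error stays bounded over long runs instead of accumulating.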

In sum: Hamiltonian Mechanics qualifies as a bona fide theoretical rival to Newtonian Mechanics; geometrically and metaphysically incompatible with, yet empirically equivalent to the latter, it provides a genuine alternative description that satisfies our desiderata for a convincing pro-conventionalist example.

5.1.3.2 Lagrangian mechanics

Lagrangian Mechanics is formulated on the tangent bundle. The latter is the manifold generated by attaching to each point in configuration space (as the theory’s state space, comprised of so-called generalised positions) its tangent space (so-called generalised velocities). In this arena, one first installs a Riemannian metric. It defines the system’s kinetic energy (Lanczos, 1986, Ch. 1.5, 5.7)—the energy of a system of particles in the absence of forces (in Newtonian parlance). “This metric gives distances between nearby points, the lengths of curves, and the geodesics, but it differs from the metric of Newtonian mechanics in that the configuration space needn’t be intrinsically flat” (North, 2021, p. 115). Next, one adds a scalar—the so-called Lagrangian. It’s constructed from the system’s kinetic energy, and the potentials, associated with the particles’ interactions. The Lagrangian encodes the theory’s dynamics. In contrast to the situation of Hamiltonian Mechanics’ cotangent bundle, using the tangent bundle allows us to formulate, in a natural manner, second order differential equations for the Lagrangian: the solutions of these so-called Euler–Lagrange equations yield the system’s dynamical evolution; they determine how it moves through configuration space.
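In coordinates, and with notation assumed for illustration (\(g_{ij}\) the kinetic-energy metric on configuration space, \(V\) the potential), the construction reads

\[
L(q,\dot{q}) = \tfrac{1}{2}\, g_{ij}(q)\, \dot{q}^{i} \dot{q}^{j} - V(q) , \qquad \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^{i}} - \frac{\partial L}{\partial q^{i}} = 0 .
\]

Written out, the Euler–Lagrange equations become

\[
\ddot{q}^{i} + \Gamma^{i}{}_{jk}\, \dot{q}^{j} \dot{q}^{k} = - g^{ij}\, \partial_{j} V ,
\]

where \(\Gamma^{i}{}_{jk}\) are the Levi–Civita coefficients of \(g_{ij}\). For \(V = 0\), the system thus moves along geodesics of the kinetic-energy metric; the potential acts as a force term deflecting it from them.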

In several regards, the differences between standard Newtonian theory and Lagrangian Mechanics parallel those commented on for Hamiltonian Mechanics. We therefore confine ourselves to highlighting some differences between Hamiltonian and Lagrangian Mechanics (cf. North, 2009, 2021, 2022). Their geometric settings manifestly differ. Their arenas are the cotangent bundle and the tangent bundle, respectively. Their salient geometric structures are those of a symplectic geometry and that of a (non-flat) Riemannian manifold, respectively. Also apart from geometric structures, Hamiltonian and Lagrangian Mechanics differ mathematically. For instance, the dynamics in the latter is given by second-order equations, whereas that of the former by first-order equations. More generally, in terms of their structures, Hamiltonian and Lagrangian Mechanics aren’t isomorphic (Curiel, 2014). Metaphysically, too, differences are plausible. The generalised velocities in Lagrangian Mechanics, for instance, are time derivatives of the generalised positions. This suggests, as North (2021, 2022) has observed, that the latter are the fundamental variables. This makes Lagrangian Mechanics the natural starting point for classical field theory, dealing with infinitely many degrees of freedom. In that context, it unfolds its full potential.

By contrast, Hamiltonian Mechanics treats the generalised positions and generalised momenta on a par as variables. Accordingly, it may seem natural to treat both as equally fundamental—as is indeed implemented in quantum mechanics. As before in our discussion of Hamiltonian Mechanics, despite the inter-translatability between Lagrangian and Hamiltonian Mechanics (under suitable conditions, not questioned here), the foregoing differences suggest that the two aren’t notational variants of each other; rather, they are empirically equivalent, distinct theories.

In short: Hamiltonian and Lagrangian Mechanics display profound geometric differences: the former exhibits the structure of a symplectic manifold, whereas the latter exhibits that of a Riemannian manifold with a non-flat metric.Footnote 57

These differences can be further amplified (with yet another geometric twist!), thanks to an astounding result by Trümper (1983): for “all of the forces that one normally encounters in classical mechanics, including the Lorentz force” (p. 216) or friction, Lagrangian Mechanics admits of a fully-fledged geometric interpretation. It applies to any n-particle system (possibly, subject to holonomic and/or explicitly time-dependent constraints). We can completely express the physics “more geometrico”, via a non-Riemannian manifold. The latter consists of two basic structures. One is a (degenerate) temporal metric. It induces a foliation in terms of simultaneity planes. The other structure is a (symmetric) dynamical connection. It’s compatible with the temporal (and the spatial) metric. This connection absorbs all forces such that the n-particle system’s time evolution corresponds to that connection’s (affinely parameterised) geodesics/auto-parallels!Footnote 58 In full analogy with Newton-Cartan theory, forces need no longer be posited as extraneous causes, pushing matter around, as in the Newtonian picture. The evolution of the system is characterised via geodesic motion through configuration space. In this sense, Trümper has achieved a full geometrisation of Lagrangian Mechanics. (The connection does depend on the system’s parameters, i.e. its constituent particles’ masses or charges. Hence, Trümper’s geometrisation generically doesn’t exhibit the universality familiar from geometrising gravity in Newton-Cartan Theory or GR.Footnote 59)

In short: Trümper’s geometrised Lagrangian Mechanics expresses Lagrangian Mechanics purely geometrically—as a theory about configuration space’s tangent bundle, equipped with the structure of a non-Riemannian manifold with degenerate temporal metric and a compatible, non-Levi–Civita connection.

With Lagrangian Mechanics, we have thereby found another serious distinct rival theory (in fact: two!) to standard Newtonian theory that employs a significantly different physical geometry.

5.2 The conventionalist case for relativistic spacetimes

We’ll now turn to the conventionalist case for general-relativistic spacetimes. What are the geometric alternatives to GRFootnote 60 that (C1) requires, and whose co-existence motivates the conventionalist stratagem (C2)? We group them—without pretension to completeness—into three loosely defined categories. The first comprises, together with GR itself, a triplet of alternative geometric settings for GR (§V.2.1)—a “Geometric Trinity of Gravity”. It reveals an inherent indeterminacy of GR’s geometry within the general framework of metric-affine geometries. Members of the second category (§V.2.2) differ from GR by minor, mathematically equivalent modifications of its formalism, together with an attendant alternative interpretation. We’ll dub them “re-interpretations of GR”. Finally, we consider geometrically alternative theories that achieve approximate empirical equivalence (§V.2.3).

5.2.1 The geometric trinity of gravity

Our first showcase for conventionalism about relativistic spacetimes centres on what has been dubbed “the geometrical trinity of gravity” (Jiménez et al., 2019; for a review, see also Capozziello et al., 2022). The empirical content of general-relativistic gravity can be cast in three alternative ways which are empirically equivalent, yet mutually incompatible with respect to their underlying geometry: GR with its Riemannian geometry, Teleparallel Gravity (§1) with its non-Riemannian geometry in which curvature vanishes but torsion doesn’t, and Symmetric Teleparallel Gravity (§2), with its likewise non-Riemannian geometry, one however in which both curvature and torsion vanish, but instead non-metricity is non-trivial.

5.2.1.1 Teleparallel gravity—metric-affine geometrisation of gravitation

The basic geometric objects of Teleparallel Gravity (TPG) are a metric (which coincides with GR’s), together with a connection (different from GR’s, viz. with vanishing curvature and non-metricity, but non-trivial torsion—to be explained shortly). In TPG, one conceives of gravity as manifestations of this geometry.

TPG’s mathematical framework is that of metric-affine geometry (see Jiménez et al., 2019 for details): in TPG’s variationalFootnote 61 formulation, one treats the connection and metric, with which the geometry is endowed, as conceptually distinct; one varies them independently. (Contrast this with GR’s Riemannian geometry, within which the connection is a priori assumed to be uniquely determined by the metric.) As a priori constraintsFootnote 62 (via suitable Lagrange multipliers), one demands that the connection’s curvature and non-metricity be zero.

From this, one proceeds in two steps. The first is concerned with the purely gravitational (vacuum) sector; in the second we incorporate—that is, couple gravity to—matter. As TPG’s purely gravitational (vacuum/matter-free) action, one stipulates as the Lagrangian density the most general, even-parity second-order quadratic form, built from the torsion (Aldrovandi & Pereira, 2013, Ch. 8 and 9; Hayashi & Shirafuji, 1979, 1982) and constructed in exact analogy with the (free) action of electromagnetism (or Yang-Mills theories, more generally). The resulting action depends on three parameters; they can be chosen such that TPG’s purely gravitational Lagrangian differs from the Einstein-Hilbert one only by a surface term, ensuring dynamical equivalence with GR. That is, variation of this action with respect to the connection and the metricFootnote 63 yields vacuum field equations fully equivalent to the Einstein Equations (in vacuo).

The TPG field equations’ solutions consist of pairs of a metric and a connection. The former coincides with GR’s (in vacuo, as well as in the presence of matter, as we’ll see shortly); the latter, the so-called Weitzenböck connection \(\dot{\Gamma }\), differs from the metric’s Levi–Civita connection. Like the Levi–Civita connection, it’s metric-compatible: parallel-transporting two vectors via \(\dot{\Gamma }\) along some curve preserves both the angles between them, and their lengths. Unlike that of GR’s Levi–Civita connection, however, the Weitzenböck connection’s curvature vanishes: parallel-transporting a vector via \(\dot{\Gamma }\) along a closed loop returns it to its original orientation. By the same token, a congruence of geodesics with respect to \(\dot{\Gamma }\) doesn’t exhibit geodesic deviation.

By contrast to a (Riemannian geometry’s) Levi–Civita connection, the Weitzenböck connection’s torsion—the connection’s anti-symmetric part, quantifying how frames twist, when parallel-transported via \(\dot{\Gamma }\)—doesn’t vanish. Accordingly, a parallelogram formed by two vectors parallel-transported along each other (via \(\dot{\Gamma }\)) doesn’t close; the gap is correlated with the torsion.

Now to the final step in TPG’s setup: how to include matter as sources for gravity? Here, TPG’s empirical equivalence is built into the theory ab initio (modulo potential subtleties related to boundary terms, see Wolf & Read, 2023). In virtue of a mathematical identity, the Levi–Civita connection of a metric can be decomposed into the Weitzenböck connection, plus a correction term that consists of a combination of the Weitzenböck connection’s torsion and the metric (without need for any derivatives of the latter). As a corollary, one obtains an identity between the Levi–Civita connection’s Riemann curvature and the Weitzenböck connection’s, together with correction factors which only depend on the Weitzenböck connection’s torsion and the metric (see e.g. Aldrovandi & Pereira, 2013, p. 91 for details; Mosna & Pereira, 2004).
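The decomposition can be made explicit under stated conventions. The following is a sketch in our own notation (an assumption, since index and sign conventions differ between authors): covariant derivatives act as \(\nabla_{b}V^{a}=\partial_{b}V^{a}+\Gamma^{a}{}_{bc}V^{c}\), the torsion is \(T^{a}{}_{bc}:=\Gamma^{a}{}_{bc}-\Gamma^{a}{}_{cb}\), indices are lowered with the metric, and \(\dot{K}\) is our label for the correction term. For any metric-compatible connection, such as the Weitzenböck connection, one then finds

$$ \dot{\Gamma}^{a}{}_{bc} = \left\{\begin{array}{c}a\\ bc\end{array}\right\}_{g} + \dot{K}^{a}{}_{bc}, \qquad \dot{K}_{abc} := \frac{1}{2}\left(T_{abc}+T_{bac}+T_{cab}\right) $$

As the text states, \(\dot{K}\) is built algebraically from the torsion and the metric, with no derivatives of the latter. One can check that \(\dot{K}_{abc}-\dot{K}_{acb}=T_{abc}\), so the correction term carries all of the torsion, and that \(\dot{K}_{abc}=-\dot{K}_{cba}\), as metric compatibility requires.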

This decomposition allows a “translation” of any general-relativistic term (action or equation of motion).Footnote 64 The general-relativistic matter action is obtained from the “minimal coupling prescription” (or “minimal substitution rule”, see e.g. Wald, 1984, p.70). Without loss of empirical content, we can “teleparallelise” it by merely couching it in terms of TPG’s geometric quantities: wherever the general-relativistic coupling prescription involves the Levi–Civita connection, one replaces the latter by the Weitzenböck connection, together with the correction factors alluded to above.Footnote 65 To arrive at TPG’s full action, we finally need to add this teleparallelised matter action to the purely gravitational part (exactly like in GR). Variation yields the full field equations (which can be explicitly shown to be equivalent to the Einstein Equations, see e.g. Aldrovandi & Pereira, 2013, Ch. 9 and Appendix C; Krššák et al., 2019, sect. III for a more convenient, equivalent formulation in terms of tetrads/frame fields).

By construction, teleparallelised equations are empirically indistinguishable from general-relativistic ones: matter “feels”—and consequently we as observers empirically probe—the same gravitational influence as in GR. The salient difference between GR and TPG concerns how each conceptualises gravity. In our terminology of §IV, GR and TPG agree on their empirical geometry but differ over their physical geometry. GR, on its standard “geometrical” interpretation (canonised by Misner et al., 1973; see also Duerr, 2020a, 2020b, esp. Ch. 1 and 7), represents gravity by a non-flat Riemannian geometry. Rather than attributed to a force, gravitational effects become manifestations of this geometry’s deviation from Minkowskian spacetime structure, both with respect to chronogeometry (represented by the metric structure) as well as with respect to inertial phenomena (represented by the metric’s Levi–Civita connection). Test particles in gravitational free-fall, in particular, are conceived of as inertial (force-free); their trajectories trace out the curvilinear geodesics of this geometry.

By contrast, TPG conceptualises gravity in terms of a metric-affine geometry. It shares with GR the same (non-special-relativistic) chronogeometry, associated with the metric. Both theories allow for the same gravity-induced chronogeometric effects, such as gravitational redshift, light bending or echo delay. Like general-relativistic gravity, teleparallel gravity also elicits non-chronogeometric effects. Phenomena such as tidal effects or the precession of gyroscopes in the presence of gravity are qualitatively the same as in GR. They only receive a different—but nonetheless geometric—interpretation.Footnote 66 What in GR is interpreted as inertial structure—the privileged path-structure (parallel-transport), furnished by the Levi–Civita connection (and generated by the metric)—one construes in TPG as a non-fundamental compound quantity, produced by a combination of the metric structure and the parallel-displacement given by TPG’s Weitzenböck connection. TPG does away with inertial structure proper as a fundamental concept.

TPG still remains a geometric theory of gravity; it geometrises gravity. It does so, however, in a way doubly different from GR (see Lehmkuhl, 2009, Ch. 9 for a taxonomy)—with differences both at the level of geometry, as well as at a more general ontological one. First, TPG’s and GR’s geometries are incompatible. GR employs a Riemannian geometry (with a torsionless, but non-flat Levi–Civita connection). In TPG, by contradistinction, gravity is represented by a metric (endowed with chronogeometric significance), together with a (flat, but torsionful) Weitzenböck connection. The latter’s torsion defines a natural measure for the strength of gravity.

Secondly, GR (partially) reduces gravitational effects to inertial ones.Footnote 67 By contrast, TPG doesn’t have a meaningfully defined inertial structure (except in the limit of Special Relativity). At the same time, it geometrises gravity. That is, TPG dispenses with inertial structure as a substantive fundamental concept: what in GR one interprets as inertial motion (or in Newtonian Gravity as motion under the influence of a gravitational force, responsible for deviations from Newtonian theory’s inertial motion) is, within TPG, construed as motion exhibiting the metric-affine geometry, representing TPG’s gravity. In other words, unlike GR, TPG doesn’t display “strength-1 geometrisation” of gravity (Lehmkuhl, 2009), i.e. a reduction to metric structure; instead, TPG displays “strength-2 geometrisation” (ibid.), i.e. a re-conceptualisation of gravity as manifestations of a non-Riemannian geometry.

Despite TPG’s slightly more involved, and arguably less elegant, conceptual structure compared with GR’s, TPG can lay claim to potential (but not necessarily compelling) advantages (see e.g. Aldrovandi & Pereira, 2013, Ch. 18; Pereira, 2014, Sect. 4). In line with our demands on rivalling theories (§III), TPG thus counts as a genuine (albeit not necessarily superior) alternative to GR. We mention threeFootnote 68 reasons in light of which one ought to take TPG seriously:

  (1) Prima facie, TPG exhibits strong conceptual similarity with gauge theories (see e.g. Pereira & Obukhov, 2019; cf. however Wallace, 2015). While Knox (2014, p. 271) downplays this similarity (as far as it goesFootnote 69) as merely “a methodological unification, not an ontological one”, one may (following e.g. Morrison, 2000, 2013) still regard this form of coherence in terms of conceptual frameworks or principles as a non-trivial theory virtue (even if perhaps more a pragmatic, rather than truth-related one).

  (2) As Lakatos once observed (1978, p. 69, fn1), empirically equivalent theories can differ in terms of their heuristic power: they needn’t exhibit the same fertility and potential to suggest novel applications and natural extensions (which, if empirically borne out, might advance gravitational research). Indeed, natural extensions of TPG are currently explored as promising candidates for Dark Energy and inflationary cosmology (see e.g. Cai et al., 2016).Footnote 70

  (3) For spacetimes with a boundary, GR’s variational formulation requires an extra term, the so-called Gibbons-Hawking-York surface term, that one must insert by hand, in an ad-hoc manner, into the action (see e.g. Padmanabhan, 2010, Ch. 6.3). It also plays a role in the context of black hole entropy; it determines the latter. In contrast to GR, such a term arises naturally in TPG; one doesn’t need to add it to the TPG action by hand (Oshita & Wu, 2017).

5.2.1.2 Gravity as non-metricity

In Symmetric Teleparallelism (or, as we’ll call it, Non-Metricity Gravity, NMG), GR’s (or TPG’s) gravitational degrees of freedomFootnote 71 are expressed via a metric and its non-metricity with respect to a class of connections whose curvature and torsion vanish (Nester & Yo, 1998). Non-metricity (like curvature and torsion, a property of a connection) measures how much the length of vectors changes upon parallel-transport via this connection. Formally, it’s defined as the covariant derivative of the metric with respect to this connection.

Let’s sketch NMG’s construction in steps analogous to those of TPG. As in TPG, NMG’s formal framework is that of metric-affine geometry. In the case of NMG, however, we enforce as a priori constraints (via suitable Lagrange multipliers) that the connection’s curvature and torsion be zero. For NMG’s purely gravitational action, consider the most general even-parity second-order quadratic form for the non-metricity. This form has five parameters, to be determined shortly. As before, we can now use an identity for the connection: we decompose it into a Levi–Civita part and a further part, the so-called distorsion, consisting solely of the metric and the connection’s non-metricity (with no derivatives of either involved). By dint of this identity, it can be shown that, for a particular choice of coefficients, the action just introduced coincides with the Einstein-Hilbert action, up to a surface term. The latter won’t contribute to the field equations. To include matter, we employ the same procedure as in the case of TPG: drawing again on the said identity for decomposing a connection, we re-cast the occurrence of the Levi–Civita connection in the general-relativistic coupling prescription for the matter action in terms of NMG’s connection and a suitable correction factor, made up only of the metric and the connection’s distorsion. Varying the total action (the purely gravitational part plus the “non-metricity geometrised” general-relativistic matter action) with respect to the metric and the connection, we obtain NMG’s field equations (Jiménez et al., 2019, Sect. IV; Capozziello et al., 2022, Sect. 4). They can, again, explicitly be shown to be equivalent to GR’s field equations.
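The ingredients of this construction can be written out under stated conventions. What follows is a sketch in our own notation (an assumption; conventions vary between authors): \(\nabla\) is the covariant derivative of NMG’s connection, acting as \(\nabla_{b}V^{a}=\partial_{b}V^{a}+\Gamma^{a}{}_{bc}V^{c}\), indices are moved with the metric, and \(c_{1},\dots ,c_{5}\) stand for the five free parameters. The non-metricity and its two traces are

$$ Q_{abc} := \nabla_{a}g_{bc}, \qquad Q_{a} := Q_{ab}{}^{b}, \qquad \tilde{Q}_{a} := Q^{b}{}_{ba} $$

and the general parity-even quadratic form reads

$$ \mathbb{Q} := c_{1}\,Q_{abc}Q^{abc} + c_{2}\,Q_{abc}Q^{bac} + c_{3}\,Q_{a}Q^{a} + c_{4}\,\tilde{Q}_{a}\tilde{Q}^{a} + c_{5}\,Q_{a}\tilde{Q}^{a} $$

For a torsion-free connection, the decomposition invoked in the text then takes the form

$$ \Gamma_{abc} = \left\{a,bc\right\}_{g} + L_{abc}, \qquad L_{abc} := \frac{1}{2}\left(Q_{abc} - Q_{bca} - Q_{cba}\right) $$

where \(\Gamma_{abc}:=g_{ad}\Gamma^{d}{}_{bc}\), \(\left\{a,bc\right\}_{g}\) is the Christoffel symbol with its first index lowered, and \(L\) is our label for the part that the text calls the distorsion. As stated there, \(L\) involves the metric and the non-metricity only algebraically.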

The theory has an enhanced gauge symmetry; NMG’s solutions are given by pairs, consisting of the metric (which, as in TPG, coincides with GR’s), and an equivalence class of connections. The latter can be parametrised in an extraordinarily elegant way (see e.g. Jiménez et al., 2019, p. 9). NMG shares with TPG the benefits with respect to the similarity with gauge theories, with respect to a variational principle (no need for a Gibbons-Hawking-York surface correction term), and with respect to new, auspicious avenues for modifications (Jiménez et al., 2019). Jiménez et al. (2019, Sect. 4C) adduce two further “theoretical advantages of [NMG] compared to GR and to [TPG], respectively. They are: (1) The invariant [Lagrangian purely gravitational action, introduced above] can be bootstrapped; and (2) the minimal coupling of spinors is available.” Claim (1) expresses the idea that one can deduce the full non-linear theory from a linear approximation through systematic self-coupling in a way that respects the gauge symmetries. Claim (2) means that “in [NMG] it is not necessary to consider any adjustments to the minimal prescription to consistently incorporate Dirac matter” (op.cit., p. 18); the somewhat unusual coupling of matter and gravity, characteristic of TPG, isn’t needed. This is closely related to a remarkable property distinctive of NMG, worth mentioning in its own right: there exists a choice of spacetime coordinates for which the connection can be globally set to zero! As a result, for this choice of coordinates, NMG “is arguably the simplest amongst the three equivalent representations [of general-relativistic gravity, i.e. GR, TPG and NMG]” (Jiménez et al., 2019, p. 8).

5.2.2 Re-interpretations of GR

5.2.2.1 “Universal force”-interpretations

The label of this first category takes its cue from Reichenbach (1928), following the proposal (argued to be also exegetically adequate) of e.g. Duerr and Ben-Menahem (2022). The guiding thought parallels the standard, pre-relativistic invocation of forces to explain deviation from inertial motion (cf. Carnap, 1966, p.169); unlike in GR’s standard/geometric interpretation (see e.g. Lehmkuhl, 2014, esp. Sect. 3), however, the inertial structure is now not identified with the metric’s Levi–Civita connection.

Assume that such inertial motion is given via auto-parallels with respect to a further connection \(\overline{\nabla }\), with the coefficients \({\overline{\Gamma } }_{bc}^{a}\). The connection is no longer the metric’s Levi–Civita connection; we thereby depart from Riemannian geometry (in favour of a more general, so-called metric-affine one).

Two choices for such an alternative connection, furnishing the theory’s inertial structure, naturally suggest themselves (for reasons to be sketched below):

  • The Minkowski metric’s Levi–Civita connection: \({\overline{\Gamma } }_{bc}^{a}={\left\{\begin{array}{c}a\\ bc\end{array}\right\}}_{\eta }.\)

  • The Levi–Civita connection of the (putative) cosmological FLRW-background.

We’ll confine our attention to the first option. Test-particles in gravitational free-fall don’t follow the trajectories of flat/Minkowski spacetime; this much is an empirical fact. The “force theory”-interpretation accounts for it in terms of the action of the “universal force”Footnote 72

$$ K_{bc}^{a} := \left\{ \begin{array}{c} a \\ bc \end{array} \right\}_{g} - \overline{\Gamma }_{bc}^{a} \equiv \frac{1}{2}g^{ad} \left( \overline{\nabla }_{b} g_{dc} + \overline{\nabla }_{c} g_{db} - \overline{\nabla }_{d} g_{bc} \right) $$

The effective (i.e. empirically traced out) trajectories of test-particles in gravitational free-fall, phenomenologically described as geodesics of the metric, arise from the compound effect of the (ex hypothesi) inertial Minkowski trajectories and this universal force: \({\left\{\begin{array}{c}a\\ bc\end{array}\right\}}_{g}={\overline{\Gamma } }_{bc}^{a}+{K}_{bc}^{a}\). Thus, recasting general-relativistic equations (or the actionFootnote 73) in terms of this decomposition yields the corresponding “universal force”-theoretical counterparts.Footnote 74 Gravitational phenomena originate in the influence of the “universal force” \({K}_{bc}^{a}\). The epithet is chosen to reflect the universality of its effects (cf. Carnap, 1966, p. 169): neither can \({K}_{bc}^{a}\) be screened off by, for instance, intervening insulating walls, nor does it depend on the properties of the substances on which it acts.
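The recasting can be illustrated for free fall. This is a sketch in our own notation (an assumption): \(\tau\) is an affine parameter along the metric’s geodesic, and \({\overline{\Gamma } }_{bc}^{a}\) are the coefficients of \(\overline{\nabla }\). Splitting the Christoffel symbols in the geodesic equation of \({g}_{ab}\) into \({\overline{\Gamma } }_{bc}^{a}+{K}_{bc}^{a}\) and moving the second part to the right-hand side gives

$$ \frac{d^{2}x^{a}}{d\tau^{2}} + \overline{\Gamma }_{bc}^{a}\frac{dx^{b}}{d\tau}\frac{dx^{c}}{d\tau} = -K_{bc}^{a}\frac{dx^{b}}{d\tau}\frac{dx^{c}}{d\tau} $$

The left-hand side is the acceleration relative to the stipulated inertial structure; in inertial coordinates of the Minkowski metric, where \({\overline{\Gamma } }_{bc}^{a}=0\), it reduces to \(d^{2}x^{a}/d\tau^{2}\). The right-hand side plays the role of a force term. It contains no parameter characterising the test body, which is one way of reading the “universality” of \({K}_{bc}^{a}\).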

The “universal force”-interpretation of GR demotes the status of the general-relativistic metric \({g}_{ab}\) as a physical field; it ceases to be ontologically on a par with, say, electromagnetic fields. Instead, it’s best viewed as a gravitational potential, akin to its Newtonian precursor: a merely auxiliary quantity that generates the physical one, the universal force (cf. Reichenbach, 1928, §37).

Why take these mathematically trivial re-formulations seriously? Why consider them genuine alternative interpretations of GR? One reason concerns what Einstein (cited in Lehmkuhl, 2014, p. 323) calls “the continuity of thought” (which he prized as “not worthless” (ibid.)). The “universal force”-interpretation retains Special Relativity’s inertial structure. Likewise, ontologically, the force-theory formulation doesn’t entail as drastic a revision of the status of gravity as GR: it remains a force, rather than a manifestation of spacetime geometry. In short, the force-theory formulation scores high on conservativeness, or conceptual continuity with earlier theories.Footnote 75

Secondly, coherence with respect to other theories might not merely be a pertinent diachronic virtue of the force interpretation, but also (and perhaps more persuasively so) a synchronic one: the smaller the differences, at the level of concepts and broader principles, between our best theory of gravity and the rest of fundamental physics (standard quantum field theory, in particular), the more our body of physical knowledge may be said to be unified. In short, the force-theory formulation scores high on harmony/unity with other parts of physics, or “external coherence” (see also Carrier, 1994b, p. 249).

5.2.2.2 “Universal field”-interpretations

Next, let’s inspect what one may call “(universal) field interpretations of GR”.Footnote 76 They denote views that regard GR’s standard metric as the gravitational field on a background spacetime.

We’ll consider two particularly attractive options (see, however, e.g. Pitts, 2022 for more). The first consists in construing the gravitational field, \({h}_{ab}\), as the (not necessarily small!) perturbation of the Minkowski metric, i.e. as the difference between GR’s standard metric, “only an effective notion” (Lehmkuhl, 2008, p. 101), and the “fundamental” or “actual” (ibid.) one, the special-relativistic Minkowski metricFootnote 77:

$$ h_{ab} : = g_{ab} - \eta_{ab} $$

The gravitational field \({h}_{ab}\) is conceptually distinct from the special-relativistic background spacetime; the former is a field on (or “over”) the latter, in complete analogy with an electromagnetic or a Klein-Gordon field. One variant of this field interpretation, known as “the spin-2 view on GR”, has been systematically developed in great detail (see e.g. Baker et al., 2023; Deser & Henneaux, 2007; Feynman et al., 1995).

The field interpretation is compatible with conceding that \({h}_{ab}\) has some special features as a field (cf. Brown, 2005, Ch. 9): most notably, it couples minimally to the background metric (it’s just added to the latter); and the resulting effective metric \({\eta }_{ab}+{h}_{ab}\) couples directly and universally to all non-gravitational matter fields. Hence the above term “‘universal field’-interpretation”. To our minds, this doesn’t impinge on the field interpretation’s viability.

The field interpretation has a number of advantages. First, as has been stressed by e.g. Pitts and Schieve (2001) (see also Petrov & Pitts, 2019), “it permits […] a formal derivation of general relativity from a linear theory in flat spacetime” (op.cit., p. 1319). Such a derivation isn’t merely interesting in its own right (cf., for instance, Weatherall’s (2017) “puzzleball conjecture”). It redounds to the theory’s “robustness” (see e.g. Soler et al., 2014), clearly an epistemological benefit, as well as to our understanding of it.

But the field interpretation may boast of two advantages of square appeal to theoretical physicists. First, as mentioned and elaborated by Rosen (1940, 1963), it “enables one to formulate a gravitational stress-energy tensor, not merely a pseudotensor, so gravitational energy–momentum is localized in a coordinate-independent way” (Pitts & Schieve, 2001, p. 1319).Footnote 78 Thereby, one can evade the problems that notoriously vex gravitational energy in GR (see e.g. Duerr, 2020a, 2020b; de Haro, 2021a, 2021b). Secondly, introducing the background metric seems congenial to canonical quantum gravity approaches: it allows us to easily make sense of space-like separation and hence commutation relations, which otherwise are difficult to comprehend (see e.g. Pitts & Schieve, 2004).Footnote 79

Finally, philosophical virtues also commend the field interpretation. The two main ones carry over from our discussion of the “universal force”-interpretation: continuity with predecessor theories (where the similarity with Newtonian gravity as a field theory is patent), and external coherence. The latter deserves to be underscored (as do e.g. Weinberg (1972) and Feynman et al. (1995); cf. Ben-Menahem, 2006, p. 93): the spin-2 view (see above) has pronounced structural similarities with quantum field theories.

Our second candidate amongst GR’s field interpretations is unimodular gravity. It’s a “slightly bi-metric” alternative (Pitts & Schieve, 2001). It’s only slightly bi-metric (i.e. employs two metrics) in that it doesn’t require all the structure of the flat background metric: only the volume-related information (contained in the metric’s determinant) induced by it is needed (see e.g. Stachel, 2011, Sect. 4 for details on the break-up of a metric into volume-related information and light-cone structure).Footnote 80 The background metric again is assumed to be the Minkowski metric \({\eta }_{ab}\). It (or, more precisely, its “volume part”, its determinant \(\left|\eta \right|\)) serves as an ab initio constraint on the metric in GR’s (standard) Einstein-Hilbert action. That is, one requires the metric in the latter to have the form: \({\overline{g} }_{ab}:=\sqrt[4]{\frac{\left|\eta \right|}{\left|g\right|}}{g}_{ab}.\)

\({{\overline{g}}}_{ab}\) and \({g}_{ab}\) share the same light-cone/conformal structure: they agree on which vectors count as time-like/space-like/light-like. But they differ over the length of vectors. Hence, the requirement of geometric incompatibility is satisfied.
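Both claims can be checked in one line each. This is a sketch that assumes four spacetime dimensions, so that rescaling a metric by a factor \(\Omega \) rescales its determinant by \({\Omega }^{4}\), and reads \(\left|\cdot \right|\) as the absolute value of the determinant; \({v}^{a}\) is an arbitrary vector, introduced here for illustration.

$$ \left|\overline{g}\right| = \left(\sqrt[4]{\frac{\left|\eta \right|}{\left|g\right|}}\right)^{4}\left|g\right| = \left|\eta \right|, \qquad \overline{g}_{ab}v^{a}v^{b} = \sqrt[4]{\frac{\left|\eta \right|}{\left|g\right|}}\; g_{ab}v^{a}v^{b} $$

The first identity shows that the constrained metric inherits exactly the background’s volume information. The second shows that the two metrics assign norms of the same sign to every vector (the conformal factor is positive), but in general of different magnitude.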

Imposing this constraint on the Einstein-Hilbert action, and varying the metric and the matter fields in the standard way, leads to the new field equations (e.g. Finkelstein et al., 2001). They are the so-called Trace-Free Field Equations. The appellation is apt: they correspond to the l.h.s. and the r.h.s. of what one obtains upon rendering the standard Einstein Equations (with a metric subject to the above constraint) trace-free.

This simple procedure yields a “remarkable result: the [Trace-Free Einstein Equations, together with the covariant divergence law for the energy–stress tensorFootnote 81] are functionally equivalent to the [Einstein Field Equations] with the cosmological constant as an arbitrary integration constant […]” (Ellis et al., 2011, p. 5). Ellis et al. explicitly show the empirical indistinguishability from GR: “the Trace-Free Einstein Equations are indeed viable for cosmological and astrophysical applications”.
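The step from the trace-free equations to a cosmological constant can be sketched as follows. The notation is our own assumption: \(\kappa \) denotes the gravitational coupling constant, \({T}_{ab}\) the energy–stress tensor with trace \(T\), and \({g}_{ab}\) the (constrained) metric with Levi–Civita derivative \(\nabla \); sign conventions are those in which the Einstein Equations read \({G}_{ab}+\Lambda {g}_{ab}=\kappa {T}_{ab}\). The trace-free equations are

$$ R_{ab} - \frac{1}{4}R\,g_{ab} = \kappa \left(T_{ab} - \frac{1}{4}T\,g_{ab}\right) $$

Taking the divergence, and using the contracted Bianchi identity \({\nabla }^{a}{R}_{ab}=\frac{1}{2}{\nabla }_{b}R\) together with \({\nabla }^{a}{T}_{ab}=0\), yields

$$ \nabla_{b}\left(R + \kappa T\right) = 0 \quad \Longrightarrow \quad R + \kappa T = 4\Lambda $$

with \(\Lambda \) a constant of integration. Substituting \(\kappa T=4\Lambda -R\) back into the trace-free equations gives

$$ R_{ab} - \frac{1}{2}R\,g_{ab} + \Lambda g_{ab} = \kappa T_{ab} $$

The value of \(\Lambda \) is thus fixed by initial or boundary data rather than by a term in the action.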

That the cosmological constant becomes a constant of integration in the Trace-Free Einstein Equations makes them of extraordinary physical interest. Commonly (though not uncontroversially, cf. e.g. Koberinski et al., 2022), one identifies the cosmological constant as the quantum-field theoretical vacuum’s contribution to the universe’s energy content, entailing a colossal discrepancy between the theoretical expectation and the observationally favoured value, a mismatch of 50–120 orders of magnitude. The Trace-Free Einstein Equations block this interpretation of the cosmological constant as of quantum field theoretical origin (Ellis et al., 2011, p. 9): thereby, at least one of the so-called fine-tuning problems of the standard model of cosmology, according to advocates of the Trace-Free Einstein Equations, is solved (or at least alleviated). This isn’t the place to evaluate such a claim (see however Earman, 2022 for a critical analysis, which suggests that one should primarily see the theory’s main attraction as a propitious “starting point in the search for a quantum theory of gravity” (p. 35)). For us, this shall suffice as a prima facie motivation to take the Trace-Free Einstein Equations seriously, as a geometric rival to standard GR.

5.2.3 Approximate rivals to GR

We’ll conclude our survey of conventionalist alternatives to the standard spacetime theories by examples of empirically quasi-equivalent rivals. They are observationally indistinguishable from GR in a sufficiently robust, even if not strict, sense (recall our pertinent remarks in §III.1). We’ll discuss successively Shape Dynamics (V.2.3.A) and classical Kaluza-Klein Theory (V.2.3.B).

5.2.3.1 Shape dynamics

Shape Dynamics can be understood as the cousin of Barbour-Bertotti Theory for relativistic field theory (§V.1.2.C). For its construction, consider the subspace of solutions of general relativity that are foliable into spacelike 3-geometries (i.e., the subspace of solutions of GR which are ‘globally hyperbolic’). To obtain Shape Dynamics, one applies the Best Matching Technique to said 3-geometries: in particular, one minimises intrinsic differences in the three-geometries with respect to (a) diffeomorphisms, and (b) spacetime-dependent scale transformations.Footnote 82 One thereby arrives at a theory with a sparse set of spatiotemporal commitments: said 3-geometries (as in the case of Barbour-Bertotti theory, there is no commitment to primitive absolute time) which involve conformal structure only—i.e., facts about absolute angles, but no facts about absolute distances. (See (Mercati, 2018) for a masterly recent introduction, and (Hoefer et al., 2023; Pooley, 2013) for philosophical discussions.)

Unlike the case of Newtonian particle mechanics versus Barbour-Bertotti theory, the solution space of shape dynamics isn’t a proper subset of the solution space of general relativity. While in Shape Dynamics the overall angular momentum of the universe must indeed vanish, one can in fact “glue” certain solutions of the theory, such that there is no specific solution of general relativity to which the new solution of shape dynamics corresponds (Mercati, 2018, Ch. 13). Thus, the solution spaces of the two theories should be understood as taking the form of a Venn diagram: overlapping, but neither space wholly subsuming the other.

In the intersection of these solution spaces, issues of geometric conventionalism naturally arise: for a world represented by a model of general relativity in this intersection (say), the empirical goings-on in that world would equally well be modelled by the corresponding solution of Shape Dynamics. So, here as before, underdetermination looms: is the geometry of the world, one may ask, “really” Lorentzian, or is it “really” as described by Shape Dynamics? Indeed, in broad brush strokes, one can view the move between GR and Shape Dynamics as trading the relativity of simultaneity for the relativity of scale (invariance under conformal transformations applied to 3-spaces). Naturally understood, these are different spatiotemporal commitments; in this regard, the case recalls again Poincaré’s point regarding the empirical equivalence of theories set in Euclidean versus hyperbolic geometries. It is at this point that geometric conventionalists will again aver their tertium quid: in such worlds, the choice between the geometries of these two theories is a conventional matter only; assertions regarding the geometry of one theory versus another are to be denuded of truth-values. And of course, since the geometries of our two theories are very different indeed, essentially all geometrical structure in this case will come out as being conventional, on this account!

On what grounds might one prefer shape dynamics over GR? Here, the answer is similar to that in the case of Barbour-Bertotti theory. First, Shape Dynamics is more predictive than GR: it makes an explicit assertion about the overall rotation of the universe which GR does not. (This is true even if its solution space isn’t a proper subset of that of GR.) And secondly, the theory at least purports to be “relationalist”, in the sense that much of the (supposedly) unobservable spatiotemporal structure of GR (e.g., the four-dimensional interval between events) is expunged.Footnote 83 Whether one is convinced by these two arguments is, of course, up for grabs. Our point here is simply that these are the kinds of factors which might weigh in favour of the application of Shape Dynamics over that of GR, for the geometric conventionalist—in line with our standards (§III.2).

5.2.3.2 Kaluza-Klein theory

Kaluza-Klein theory (KKT), developed in the 1920s, “minimally extends” (Overduin & Wesson, 1998, 2019 also for a comprehensive review) the geometric framework of Einsteinian GR from four to five dimensions. Through such a higher-dimensional geometrisation, KKT can be said to unify electromagnetism and gravity (with certain qualifications, on which we’ll comment briefly below).

KKT’s starting point is a five-dimensional Lorentzian metric; vis-à-vis GR, spacetime is endowed with one extra dimension of space. One now imposes that the metric not change in one direction (i.e., that the metric derivative vanish in that direction, meaning that direction is an isometry). With this so-called cylinder condition, the metric splits intoFootnote 84:

$${\widehat{g}}_{AB}= \left(\begin{array}{cc}{g}_{ab}+\Phi \left(x\right){A}_{a}\left(x\right){A}_{b}(x)&\Phi \left(x\right){A}_{a}(x)\\\Phi \left(x\right){A}_{b}(x)&\Phi (x)\end{array}\right)$$

This is plugged into the 5d-Einstein-Hilbert action. Setting \(\Phi =1\), and subsequently varying the action, one obtains the dynamical equations coupling \({g}_{ab}\) and \({A}^{a}\): the 5d vacuum Einstein equations (i.e. \({\widehat{G}}_{AB}:={\widehat{R}}_{AB}-\frac{1}{2}\widehat{R}{\widehat{g}}_{AB}=0,\) where the hatted quantities denote the 5d-counterparts of the standard/general-relativistic ones in 4d, built from the metric \({\widehat{g}}_{AB}\)). They reduce to those of the source-free Einstein-Maxwell theory: this variant of KKT fully reproduces the Einstein Equations in electro-vacuum together with the free (source-free) Maxwell Equations. This is the so-called “Kaluza-Klein Miracle”.
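The structure of the split metric is perhaps easiest to see in line-element form. Writing \({x}^{5}\) for the extra coordinate, the matrix above is equivalent to

$$ d\widehat{s}^{2} = g_{ab}\,dx^{a}dx^{b} + \Phi \left(dx^{5} + A_{a}\,dx^{a}\right)^{2} $$

as one verifies by expanding the square. For \(\Phi =1\) and under the cylinder condition, the 5d Ricci scalar then reduces to the 4d Ricci scalar plus a Maxwell term. In one common normalisation (an assumption on our part; prefactors and the relative sign depend on units and signature conventions) this reads

$$ \widehat{R} = R - \frac{1}{4}F_{ab}F^{ab}, \qquad F_{ab} := \partial_{a}A_{b} - \partial_{b}A_{a} $$

so that the 5d Einstein-Hilbert Lagrangian takes the form of the 4d Einstein-Maxwell one.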

Before turning to the inclusion of matter (sources for the electromagnetic field in particular), we mention an alternate variant of KKT. Rather than postulating the cylinder condition as a brute fact constraint (which is, of course, a logically possible move), authors generally seek to derive this result, by “compactifying” spacetime in the fifth dimension; that is, one assumes the latter to be “curled up” to a very thin straw, so that the higher-dimensional space forms a cylinder with \(0\le {x}^{5}\le 2\pi {R}_{KK}\), where \({R}_{KK}\) denotes the radius of the compactified dimension.

One can then expand the metric in Fourier modes, as

$${\widehat{g}}_{AB}\left(x,{x}^{5}\right)=\sum_{n=-\infty }^{\infty }{\widehat{g}}_{AB}^{(n)}\left(x\right){e}^{in{x}^{5}/{R}_{KK}}$$

where \(x\) denotes the first four spacetime dimensions. For small \({R}_{KK}\), the higher-order Fourier components are effectively negligible; the five-dimensional metric then becomes independent of the \({x}^{5}\) dimension. We can always choose the radius of the compactified dimension sufficiently small so as to recover the content of the cylinder condition to any desired degree of approximation; the Kaluza-Klein Miracle can thus be (effectively) repeated.
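Why the higher modes drop out for small \({R}_{KK}\) can be illustrated with a toy example. The following is our own sketch, not the full metric analysis: it uses a hypothetical massless scalar field \(\psi \) on a flat background with a spacelike compact fifth dimension, a mostly-plus signature, and units with \(\hbar =c=1\). Expanding \(\psi (x,{x}^{5})=\sum_{n}{\psi }_{n}(x){e}^{in{x}^{5}/{R}_{KK}}\) and inserting this into the 5d wave equation \(\left({\Box }_{4}+{\partial }_{5}^{2}\right)\psi =0\) gives, mode by mode,

$$ \left(\Box_{4} - \frac{n^{2}}{R_{KK}^{2}}\right)\psi_{n}(x) = 0 $$

Each mode thus behaves like a 4d Klein-Gordon field of mass \(\left|n\right|/{R}_{KK}\). Only the \(n=0\) mode stays massless; for small \({R}_{KK}\), the modes with \(n\ne 0\) become arbitrarily heavy and are not excited at accessible energies, which is the sense in which the \({x}^{5}\)-dependence becomes negligible.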

At this point, two natural questions arise regarding KKT’s scope. First, how to incorporate sources for electromagnetic fields (i.e. charged matter), as well as matter sources for the gravitational field in KKT? To the best of our knowledge,Footnote 85 this must be done in a non-geometric manner: either one adds their standard general-relativistic (i.e. 4-dimensional) action to the 5-dimensional Einstein-Hilbert action, or one adds the corresponding source terms (i.e. 4-dimensional energy–stress tensors) just at the level of the field equations.Footnote 86 A second question pertains to (massive and charged) test bodies: what of their trajectories in KKT? Roughly speaking, one expects such bodies to follow geodesics of the Levi–Civita connection of the 5d Lorentzian metric, and for this to amount to non-geodesic motion (quantified by a Lorentz force) in the 4d picture. This, broadly speaking, is correct (for discussion of some technical subtleties here, see Gomes & Gryb, 2021, §5.3).
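The rough expectation about test bodies can be sketched explicitly. We assume \(\Phi =1\) and the cylinder condition; the notation is ours: dots denote derivatives with respect to an affine parameter along the 5d geodesic, and \(k\) is our label for the conserved quantity. The 5d geodesics follow from the Lagrangian

$$ L = \frac{1}{2}\left[g_{ab}\dot{x}^{a}\dot{x}^{b} + \left(\dot{x}^{5} + A_{a}\dot{x}^{a}\right)^{2}\right], \qquad k := \frac{\partial L}{\partial \dot{x}^{5}} = \dot{x}^{5} + A_{a}\dot{x}^{a} = \text{const.} $$

where \(k\) is conserved because nothing depends on \({x}^{5}\). The Euler-Lagrange equations for the four remaining coordinates then become

$$ \ddot{x}^{a} + \left\{\begin{array}{c}a\\ bc\end{array}\right\}_{g}\dot{x}^{b}\dot{x}^{c} = k\,F^{a}{}_{b}\,\dot{x}^{b}, \qquad F_{ab} := \partial_{a}A_{b} - \partial_{b}A_{a} $$

This has the form of the 4d Lorentz-force law, with the conserved momentum \(k\) along the fifth dimension playing the role of a charge-to-mass ratio (up to unit conventions). The sketch ignores the subtleties flagged in the text, for instance the relation between the 5d affine parameter and 4d proper time.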

Having arrived at the two above-mentioned approaches to KKT which achieve the Kaluza-Klein Miracle (i.e., one which takes the cylinder condition as brute, and the other which seeks to account for the cylinder condition via compactification), one may wonder: how seriously to take KKT? Surely, KKT isn’t a pseudo-theory we can discard out of hand. What seems its salient theoretical virtue militates against such short shrift: KKT arguably offers—at least for the source-free electrovacuum—a non-trivial form of unification of the electromagnetic and the gravitational field (see Muntean, 2008; Lehmkuhl, 2009, Ch. 6 and 7; Karaca, 2012 for nuanced analyses). KKT also offers further unificatory resources which might justify its pursuit: for example, in works such as (Korunur, 2021), the possibility of understanding dark energy in terms of the Kaluza-Klein scalar \(\Phi \) (or generalisations thereof) is explored; moreover, in (Sunahara et al., 1990), the possibility that the Kaluza-Klein scalar drives inflation is considered.

But if prima facie we ought to reckon with KKT as a theoretical description of the world, the realist faces a threat of underdetermination: is the world truly described by four-dimensional physics (for concreteness, say, the standard general-relativistic Einstein-Maxwell Equations) or rather by the five-dimensional KKT picture (where, in particular, source-free electromagnetic fields are manifestations of a five-dimensional geometry—given by solutions of the five-dimensional vacuum Einstein equations)? As before, geometric conventionalism offers a way out of this impasse, by refusing to assign truth-values to those propositions associated with the above two formulations which refer to geometry.

KKT’s kind of geometrisation is worth remarking on. Vis-à-vis GR’s 4-dimensional Riemannian geometry, KKT’s geometric arena is enlarged by an extra dimension. The resulting 5d-geometry subsumes both the gravitational d.o.f. and electromagnetic fields. Contrast this with GR’s reduction of matter d.o.f.s to spacetime geometry: the constraints on the structural properties of a relativistic spacetime are sufficiently relaxed so that the spacetime can absorb the gravitational d.o.f.; the latter becomes a manifestation of the former. With its extra dimension, KKT, by contrast, requires the introduction of additional, qualitatively novel, geometric structure; only in virtue of this structural augmentation can the dimensionally extended spacetime accommodate both the gravitational d.o.f.s, as well as those of the electromagnetic field.

KKT’s geometrisation also differs from that which we witnessed in the Geometric Trinity (§V.2.1). The latter’s three incarnations of gravity originate in an indeterminacy of the salient geometric structures: within the framework of Cartan geometry, the same physical d.o.f. can be cast as either non-vanishing curvature, torsion or non-metricity (with the respective other two set to zero). Consequently, the inherent leeway permits gravity to be re-conceptualised in three different ways—corresponding to GR, TPG and NMG. By contrast, KKT’s geometrisation presupposes a substantial departure from the general-relativistic spacetime’s geometric setting. These extra geometric structures have to be circumspectly massaged to accommodate electromagnetic fields: rather than ensconcing them in a naturally open habitat, one must squeeze them into the larger geometry. In this regard, KKT’s geometrisation is akin to that discussed in the context of Lagrangian mechanics (§V.1.3.B).

Finally, KKT illustrates an important point for conventionalism as a locally anti-realist strategy (recall §III.3): although KKT geometrises (non-gravitational) forms of matter and thereby, on conventionalism, deprives claims about electromagnetic fields of truth-values, conventionalism about KKT’s geometry still wouldn’t inflate into general anti-realism. Since KKT’s geometrisation is limited in scope (viz. restricted to gravity and electromagnetic fields—without the sources of each), geometric conventionality remains quarantined: geometric conventionalists would classify only KKT’s geometric claims about electromagnetic fields and/or gravity as conventional—not claims about sources or non-electromagnetic forms of (non-gravitational) matter. (Recall also the strategies, open to conventionalists, for dealing with geometric claims, rehearsed in §IV. They carry over, mutatis mutandis, to KKT.)

6 Conclusion

As its central tenet, geometric conventionalism classifies a given theory’s propositions pertaining to spacetime geometry as lacking truth-values. We argued that this constitutes an attractive version of selective realism (or, more precisely, selective anti-realism). Conventionalism is thus capable of overcoming issues of underdetermination when faced with the prospect of multiple distinct but empirically equivalent theories. Plausible cases of such underdetermination, we showed, in fact abound in both non-relativistic and relativistic contexts.

To be sure, conventionalism isn’t without prima facie snags. We highlighted two key challenges. The first is the misgiving that its locality is unstable—that conventionalism is a slippery slope towards global conventionalism (i.e. general anti-realism/instrumentalism). The second warns against alleged unpalatable implications of conventionalism: one may be apprehensive that it’s too impoverished (or too anti-realist) a philosophy for standard moves in scientific practice, such as the evaluation of counterfactuals. For both challenges we adumbrated encouraging responses.

A number of open questions stand out. With conventionalism occupying one corner in the “epistemology of spacetime” (Dewar et al., 2022), one concerns its relation to topics in that vicinity (ibid., Sect. 3–6), such as neo-Kantian/transcendentalist approaches to spacetime (e.g. Friedman’s (2001) or Stump’s (2015)), or “constructive axiomatics” (an approach going back to none other than Reichenbach (1924)).

Another rewarding follow-up would be to investigate the appeal that conventionalism about classical spacetime geometry might accrue from the perspective of so-called emergent spacetime theories, quantum gravity approaches in particular (e.g. Wüthrich, 2018; Wüthrich et al., 2024). In those approaches, spacetime-geometric structure, as discussed here, emerges from more fundamental, but non-spatiotemporal, d.o.f. Should the conventionalist welcome such theoretical developments as grist to her mill (cf. Martens, 2019)?

A third strand of questions concerns the scope of geometric conventionalism (whilst preserving its local, or selectively anti-realist, character). Can conventionalism also be extended to topology (see e.g. Sexl, 1970, 1983; Glymour, 1973; Sklar, 1974, Ch. IVD, or Meyer, 2021 for promising indications)? Of special interest here are so-called dualities in high-energy physics, such as the AdS/CFT duality (see e.g. De Haro et al., 2016; De Haro, 2021a, 2021b).

Future research will have to tackle these exciting questions. For now, we hope to have restored to geometric conventionalism its place at the top table of positions on the nature of spacetime geometry, a place which—both for historical and systematic reasons—it rightly deserves.