In fundamental theories of physics, we may distinguish between a generic framework and specific laws depending on the subclass of systems considered. For instance, in Newtonian mechanics we have a framework defined by Newton’s three laws of motion and specific laws of force for gravitation, (corpuscular) optics, etc.; in quantum mechanics, we have a framework defined by the Hilbert space of states and the Schrödinger equation on the one hand and specific choices of the Hamiltonian operator on the other hand. Different frameworks may have very different mathematical structures and thus define different sorts of objects, both mathematically and physically. It is therefore tempting to place the framework of a theory at a more fundamental epistemological level than the rest of the theory. For instance, the framework could be an expression of basic constraints for the possibility of experience.

This transcendental view is the one Immanuel Kant propounded for the Newtonian framework in the late eighteenth century. In his system, Newton’s laws result from the application of rigidly and a priori determined categories of our understanding. This view is of course incompatible with the advent of relativistic mechanics and quantum mechanics, which require new frameworks. Historically, there were many attempts to accommodate the Kantian doctrine to the relativity of frameworks.Footnote 1 Two classes of attempts will be considered in this essay. In the first class, the rationalist ambition to derive the possible frameworks by a priori means is given up; their constitutive principles are identified by analysis of the accepted theories of physics; rational justification is limited to the historical replacement of a given framework by another. The accent here is on the way theoretical frameworks constitute the objects and processes of the theory. In the second kind of approach, the accent is on the comprehensibility of the investigated domain: an attempt is made to identify natural conditions of comprehensibility from which the relevant framework derives. Compared to Kant’s transcendental idealism, the implied rationalism is moderate since the conditions of comprehensibility are tentative, local, and refutable. The purpose of the present essay is to compare these two varieties of neo-Kantianism and to argue that the second variety solves some of the difficulties of the first.Footnote 2

In Sect. 1, the constitutive principles of the first variety of neo-Kantianism are illustrated through three examples: Ernst Cassirer’s “forms of knowledge,” the young Hans Reichenbach’s “principles of coordination,” and Michael Friedman’s “relativized a priori.” The first two examples were responses to the challenge of Albert Einstein’s theory of relativity around 1920. Friedman’s much more recent proposal builds on this early neo-Kantianism, on Rudolf Carnap’s linguistic frameworks, and on Thomas Kuhn’s paradigms. It is nowadays the most persuasive and most influential counter-reaction to post-Quinean epistemological holism. Cassirer’s approach is singular in its Marburgian denial of any counterpart to Kant’s sensibility. His forms of knowledge are preconditions of measurement conceived within a purely relational and progressively unified structure, with no pre-conceptual given in perception. In contrast, for Reichenbach and for Friedman there are no constitutive principles without coordination between the theory and the empirically given. While Cassirer’s neglect of coordination implies a certain vagueness of his forms of knowledge, Reichenbach’s and Friedman’s first notions of coordination are so rudimentary that they easily degenerate into mere conventions. This is the reason why Friedman, in more recent writings, endows the target of the coordination with a complex adaptable structure including experimental, technological, and socio-cultural dimensions.Footnote 3

This recent evolution of Friedman’s views suggests that the identification of constitutive frameworks should be subordinated to a sufficiently realistic account of the concrete application of physical theories. Section 2 is a sketch of the account that I proposed a few years ago after inspecting many historical cases. In this conception, physical theories have an evolving substructure that mediates between their symbolic universe and concrete experiments. This substructure involves interpretive schemes acting as blueprints of conceivable experiments, as well as modular connections with other theories. This modular structure is shown to be essential to the application, comparison, construction, and communication of theories. It also enables us to construct descriptive schemes for experiments in a given domain of physics without yet knowing the relevant theory.Footnote 4

This last point is what permits a sharp formulation of the comprehensibility conditions discussed in Sect. 3. The “comprehensibility of nature” (Begreiflichkeit der Natur) is the expression Hermann Helmholtz used in the nineteenth century to characterize natural but tentative regularity requirements for nature. This could mean, in his earliest works, a Laplacian and Kantian reduction of physical phenomena to central forces acting in pairs of material points, or it could more broadly mean lawfulness (causality) and the measurability of basic quantities.Footnote 5 In the same vein, I define the comprehensibility conditions as specific implementations of ideals of causality, measurability, and correspondence (with earlier theories) in a given domain of experience. From Greek statics to general relativity and quantum mechanics, there were many attempts to prove that assumptions of this kind completely define the theoretical framework that fits the given domain of experience. Two strikingly successful examples are given in this section: Helmholtz’s derivation of the locally Euclidean character of physical space, and Lagrange’s derivation of the general principles of statics. For the history and criticism of similar derivations of other important theories of physics, I refer the reader to my Physics and necessity.Footnote 6

Lastly, I compare constitutive principles and comprehensibility conditions. Whereas the former are obtained by inspection of a theory in its usual presentation, the latter are obtained by implementing regulative ideas of causality, measurability, and correspondence. Intertheoretical relations and epistemological considerations play an important role for supporters of both views. But they occur at different moments. In the first view, they serve to rationalize the transition between successive theories. In the second view, they provide the modular structure and the regulative principles that orient our search for comprehensibility conditions. Most strikingly, comprehensibility principles have a naturalness and a theory-generating power that elude constitutive principles. The conjunction of these two qualities may seem paradoxical: how could easily accepted principles have such far-reaching structural implications? The answer lies in the fact that comprehensibility principles are not applied in an epistemological vacuum. They presuppose a general conception of physical theory in which comprehensibility can take a mathematically and empirically precise form.

1 Constitutive principles from Kant to Friedman

1.1 The Kantian heritage

In Kant’s Critique of pure reason, any empirical knowledge requires the mental processing of sensorial data through two separate faculties working in tandem. The first mental faculty, called sensibility, allows us to receive sensorial information, prior to any conceptual synthesis. The resulting imprints on the mind, called intuitions, all have two forms independent of their sensorial content: time as the pure form of internal intuition, and space as the pure form of external intuition. The second faculty, called the understanding, allows us to synthesize our experience in a systematic, unified manner through concepts, rules, and laws. Some concepts of the understanding exist prior to any experience, and they are necessary preconditions of any empirical knowledge. In what Kant calls a transcendental deduction, the existence of these pure concepts or categories derives from the unity of our pre-perceptual thinking (synthetic unity of apperception). There are four categories reflecting the fundamental forms of judgment (quantity, quality, relation, and modality), and each category has three subcategories.Footnote 7

The most basic act of empirical knowledge according to Kant is the application of concepts to intuitions. Since the understanding and sensibility are of a very different nature by definition, Kant bridges these two faculties through schematism. Namely, he relies on temporal imagination to associate a mediating counterpart or schema to each pure concept. For instance, number (of occurrences in time) is the schema of quantity, and order (in time) is the schema of quality. The application of pure concepts to pure intuitions is a priori, since pure concepts, pure intuitions, and the mediating schemata are all a priori. The results of this application are synthetic, that is, they condition possible experiences. They include Euclidean geometry and ordinary arithmetic, which thereby acquire their synthetic a priori status.

The categories, the intuitions of space and time, and the schemata determine the general form of any scientific knowledge of the world. The content remains largely open. In Kant’s words, transcendental criticism is only a method. In order to provide a foundation for the natural sciences, the transcendental apparatus must be supplemented with a “metaphysics” of empirical objects or with some empirical induction. In his Metaphysical foundations of natural science, Kant follows the metaphysical route in defining matter as a continuous distribution of centers of force. Applying the three subcategories of relation (substance, causality, and community) to this representation, he obtains three laws of Newtonian mechanics (conservation of mass, rectilinear inertia, equality of action and reaction). Newton’s mechanics thus acquires rational necessity, and the only empirical input is the force law (of gravitation for instance).Footnote 8

In the Critique of judgment, Kant describes the empirico-inductive route in which we gradually fill in the contents of the transcendental form of knowledge. In this view, we do not know in advance how the categories, especially those of substance and cause, should be implemented in the empirical world. We can only hope that the laws of a mature science will employ them in a simple, unified manner. In addition to the constitutive principles of the understanding, which determine the transcendental form of knowledge, we need the regulative principles of a parsimonious reason, which guide us in giving empirical content to the form.Footnote 9

There are many obscurities in Kant’s architectonic. The strictly non-conceptual and the non-sensorial characters of pure intuition seem hard to reconcile; the deduction of the table of categories is opaque; the time-based schematism seems arbitrary; the necessity of Euclidean geometry is asserted without proof; and the Newtonian concept of matter is not sufficiently justified. The suspicion runs high that Kant arbitrarily rigidified contingent presuppositions of the Aristotelian logic and the Newtonian science that prevailed in his time.Footnote 10

In the nineteenth century, it became clear that Euclidean geometry was not the only conceivable geometry and that mathematics should not be tied to the intuitions of space and time. These blows to Kant’s doctrine were not fatal. In his memoirs on the foundations of geometry around 1870, Helmholtz argued that Kant’s form of external intuition could be preserved if it was restricted to the general idea of space as a continuous, homogeneous manifold. He derived the locally Euclidean structure and the constant curvature of this manifold from the measurability of space by freely mobile rigid bodies, which may loosely be regarded as a rule of the understanding. In this view, experience serves only in the determination of the value of the curvature. Helmholtz thus gave a new content to Kant’s intuitions and categories. He also deeply altered their nature by giving intuition a physiological basis, and by downgrading the categories to tentative assumptions of comprehensibility.Footnote 11
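In modern notation (a reconstruction rather than Helmholtz’s own formulation), the result is that free mobility forces the line element of physical space into Riemann’s constant-curvature form,

$$ ds^{2} \;=\; \frac{dx_{1}^{2} + dx_{2}^{2} + dx_{3}^{2}}{\bigl(1 + \tfrac{K}{4}\,(x_{1}^{2} + x_{2}^{2} + x_{3}^{2})\bigr)^{2}}, \qquad K = \text{const.}, $$

so that experience is left only with the determination of the single number K, the case K = 0 corresponding to Euclidean geometry.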

Poincaré adapted Kant in a different manner: he gave up the idea of a passive sensibility and he made the general notion of space depend on the concept of Lie group, which he regarded as a synthetic a priori “form of the understanding.” He judged the choice of the group to be conventional because geometrical laws are never tested independently of dynamical laws (ruling the deformation of bodies or the propagation of light) and the latter laws can always be adjusted to fit a conventionally given geometry. A regulative principle of simplicity induced Poincaré to maintain Euclidean space and Galilean spacetime in the face of his and Einstein’s relativity theory.Footnote 12

In Marburg, the neo-Kantian philosopher Hermann Cohen condemned Kant’s sensibility as an unwitting remnant of a “psychologism” in which objectivity is traced to a fixed, non-conceptual given. What is left of Kant’s theory of cognition after evacuating the theory of sensibility (the transcendental aesthetic) is the theory of the understanding, which Kant calls transcendental logic. This is why Cohen calls his mature philosophy a Logic of pure knowledge (1902). Like Kant, he characterizes this logic through a table of twelve basic judgments. But he gives up the strict connection between kinds of judgment and categories, as well as the justification of the categories through their applicability to the manifold of intuition. In his view, the basic judgments are fixed guidelines for deriving categories (including time, space, substance, and cause) that could evolve together with the sciences of nature. His table of judgments relies on Platonic principles of unity and identity, and on Leibnizian principles of continuity and infinity.Footnote 13

A few characteristics of Cohen’s variety of neo-Kantianism were influential both within and without the Marburg school. For Cohen, the transcendental project is defined as the identification of the a priori conditions of knowledge, and not as a speculative metaphysics of the relation between subject and object (as had been the case in the perverted Kantianism of Naturphilosophie). Knowledge is thereby defined as scientific knowledge in mathematical form, and not as elementary sensuous cognition. The pure, a priori character of the transcendental apparatus excludes any fixed given to our thought, even Kant’s pure intuitions; thought is purely generative. The categories, general laws, and principles of scientific knowledge may evolve in the course of time, although they are subjected to invariable regulative demands. To be sure, the specific way in which Cohen formulated these demands was judged obscure by most of his contemporaries. Their general necessity nonetheless became a commonplace of neo-Kantianism.

1.2 Cassirer’s Substance and function (1910)

Cohen’s most outstanding disciple, Ernst Cassirer, judged the earlier transcendental projects to be misled by an antiquated, Aristotelian conception of logic based on the subject-predicate relation and naturally implying an abstractionist view of concept formation. In this view, concepts are reached inductively by detecting similarities in a collection of individuals. Like Cohen, Cassirer excluded any pre-conceptual given in experience and required a purely intellectual construction of concepts and objects. He found the means of this construction in the new relational logic of Bertrand Russell, in which structures or abstract systems of relations are the sole source of mathematical concepts. In Cassirer’s view, any cognition, from the most rudimentary to the most advanced, relies on a synthesis of our experience through relational structures. The basic expression of this synthesis is a function, through which the value of a given variable can be generated from the value of another variable. It should not be a substance, which would presuppose a hidden fixed being:Footnote 14

Instead of imagining behind the world of perceptions a new Dasein built solely out of the materials of sensation, [knowledge] contents itself with throwing universal intellectual schemata in which the relations and connections of perceptions can be completely represented.

At the most elementary level of perception, we become aware of stable relations between signs and inductively assume their generality. The signs, according to Helmholtz’s theory of perception, are not direct reflections of the properties of external objects. They are elementary data of perception whose correlations prompt us to assume the existence of stable, external objects. In truth, Cassirer concludes, the assumed objects are intellectual constructs, based on the properties of systems of relations.Footnote 15

In more advanced, scientific knowledge, Cassirer goes on, measurement plays an essential role. Contrary to the naive empiricist view, measurement is not a passive collection of empirical data. As Pierre Duhem emphasized, it presupposes concepts that define the conditions and purpose of measurement. In order to create the intellectual preconditions of quantitative, lawful theory, we must depart further and further from the sensory given. Whenever we identify constant structural elements in a theory, we tend to forget the intellectual presuppositions of measurement and theory-making, and we reify these constants just as we do with the objects of ordinary perception. This was the case, in Cassirer’s time, for the ether and for space and time. Independently of relativity theory (which he did not discuss in Substance and function), Cassirer reduced the ether to a “mere unification and concentration of objectively valid, measurable relations” and Newton’s absolute space and time to “pure functions.”Footnote 16

In the history of mathematics and physics, Cassirer saw a gradual substitution of functional forms for naive substantialist descriptions. Typically, the mind’s demand for permanence induces us to reify the stable components of the relational structures through which we organize our experience. At a given stage of science, we are at a resting-point [Haltpunkt] in which this stability remains unchallenged. In the next stage, we reach a higher systematic unity in which the earlier stable components become interrelated and variable:

Thus we stand before a ceaseless progress, in which the fixed fundamental form of being and process that we believed we had gained, seems to escape us. All scientific thought is dominated by the demand for unchanging elements, while on the other hand the empirically given constantly thwarts this urge. We grasp permanent being only to lose it again.

We are thus reminded of the merely functional character of our forms of knowledge.Footnote 17

To sum up, at each stage of the history of science, our theories rely on functional forms (Funktionsformen) that constitute the object of knowledge in an essentially relational manner. These forms are truly hypothetical, although we may mistake them for necessary, strictly a priori conditions of knowledge. The true a priori, in Cassirer’s terminology, should be a necessary premise of any cognition, and therefore cannot be revised in future science. It includes a principle of projected unity, which prompts us to depart more and more from immediate experience in order to reach higher formal unity and generality in our theories. It also entails the possibility of comparing successive forms of knowledge and of judging the superiority of the later form when the scope of our empirical investigation is increased. The successive stages of theory are rationally related inasmuch as the earlier stage contains the questions answered by the later stage and remains an approximation of it. These regulative principles restrict the choice of the functional forms, so that they are not mere conventions.Footnote 18

Despite the recurrent redefinition of the object of scientific inquiry through new functional forms, and despite the lack of a Kantian sensibility that would yield a fixed given in cognitive processes, the scientific enterprise remains objective because of its ability to self-correct, to reach higher unity and generality under a “common forum of judgment.” Although the functional form and the concomitant definition of the object of knowledge are constantly corrected, they converge toward a final, never reached object because the act of correction follows the rules of reason.Footnote 19

In addition to the regulative demand of progressive unity, Cassirer identifies a few “logical invariants” or categories that could claim to be truly a priori: time, space, number, magnitude, permanence, change, causality, interaction. At the same time, he recognizes that these categories are purely functional and therefore might someday be dissolved into a higher relational unity. In his own words:Footnote 20

The goal of critical analysis would be reached, if we succeeded in isolating … the ultimate common elements of all possible forms of scientific experience; i.e., if we succeeded in conceptually defining the elements that persist in the advance from theory to theory because they are the conditions of any theory. At no given stage of knowledge can this goal be perfectly achieved; nevertheless it remains as a demand, and prescribes a fixed direction to the continuous unfolding and evolution of the systems of experience.

1.3 Cassirer on relativity theory

Cassirer did not discuss special relativity in Substance and function. In 1910, this theory already had a strong hold on German theoretical physics, and it was an obvious challenge to the Newtonian and Kantian views about space and time.Footnote 21 Yet it rarely attracted the attention of philosophers before the advent of general relativity and its spectacular postwar confirmation. Special relativity mixes up the two forms of intuition that Kant had separated, and it downgrades Newtonian mechanics to an approximation of a deeper theory in which the Newtonian law of acceleration no longer holds. General relativity undermines the distinction between inertial and gravitational forces and makes the metric properties of spacetime depend on the distribution of matter. The very idea of space and time as a fixed stage for phenomena needs to be given up.

When, around 1920, Cassirer studied Einstein’s theory of relativity, he welcomed it as a confirmation of the general picture of the nature and evolution of science he had given ten years earlier in Substance and function. Newtonian mechanics, special relativity, and general relativity could indeed be seen as three stages in the gradual process of unifying “functionalization” there described. In Cassirer’s analysis, the first, Newtonian stage relies on a synthesis in which spatial and temporal relations are the same in any reference system. In the second, special-relativistic stage, this double constancy is replaced with a higher synthesis based on the constancy of the velocity of light and the constancy of the Lorentz-group structure. In Substance and function, Cassirer had removed a fundamental obstacle to the transition from the first to the second stage by regarding space and time as systems of relations instead of quasi-substantial beings; and he had anticipated the crucial role of an analysis of the conditions of measurement in redefining space and time. In the third, general-relativistic stage, the equivalence of all coordinate systems deprives space and time of any fixed structural invariant and their metric structure becomes correlated with their material content. A limited synthetic unity is replaced by a deeper one in which any objectification of space–time as a necessary, fixed background is made impossible. Metric and energetic properties merge in a unified dynamical scheme.Footnote 22

In retrospect, the three stages share a proto-notion of space–time as a non-metric manifold of events, which Cassirer still calls pure intuition (although he has rejected Kant’s purely passive definition of intuition) and defines conceptually as “coordination under the viewpoint of coexistence and proximity, or under the viewpoint of succession” (p. 85). Each stage is structured by specific forms of knowledge. Cassirer names and characterizes these forms in a loose manner. Besides “form of knowledge” (pp. 57, 119: Erkenntnisform), he uses the expressions “form of thought” (p. 88: Denkform), “ordering form” (p. 58: Ordnungsform), “logical system of coordinates to which we refer the phenomena” (p. 24), “point of leverage” (pp. 25, 40: Angelpunkt), “fixed intellectual pole” (p. 35: ruhender gedanklicher Pol), “rule of the understanding” (p. 82: Regel des Verstands), “norm of investigation” (p. 82: Norm der Forschung), and “prescription for the formation of physical concepts” (p. 97: Vorschrift für unsere physikalische Begriffsbildung). These constitutive forms are of a diverse nature: they include the absolute character of metric space and time (in the Newtonian stage), the constancy of the velocity of light and the principle of relativity (in special relativity), the equivalence principle and the equivalence of all coordinate systems (in general relativity).Footnote 23

Just as described in Substance and function, the motor of the evolution from one form of thought to the next is the regulative principle of convergence toward higher systematic unity, in a mutual adaptation of the forms of thought and experience:Footnote 24

Physics, as an empirical science, is equally bound to the “material” content, which sense perception offers it, and to these form-principles in which are expressed the universal conditions of the “possibility of experience.” It has to “invent” or to derive deductively the one as little as the other, i.e., neither the whole of empirical contents nor the whole of characteristic scientific forms of thought, but its task consists in progressively relating the realm of “forms” to the data of empirical observation and, conversely, the latter to the former. In this way, the sensuous manifold increasingly loses its “contingent” anthropomorphic character and assumes the imprint of thought, the imprint of systematic unity of form. Indeed “form,” just because it represents the active and shaping, the genuinely creative element, must not be conceived as rigid, but as living and moving.

The great merit of Cassirer’s variety of transcendental idealism is its ability to accommodate even the deepest conceptual changes known in the history of science. This was later confirmed by the rationalization of the quantum revolution he offered in his Determinism and indeterminism.Footnote 25 This flexibility results from his reducing the absolute a priori conditions of knowledge to regulative principles of synthetic unity, lawfulness, and progressive convergence. The drawback is the vagueness of the characterization of the constitutive principles or “forms” that operate in the successive stages of a given science. Cassirer never meant his epistemology to be normative; he did not pretend to offer working rules for theorizing scientists. On the contrary, he believed the evolving forms of scientific thought and the direction in which they converge could only be identified by critical analysis of our best science in its historical development.

For Cassirer, the relative, hypothetical, and partially empirical character of the forms of thought reduces any a priori derivation of them to an illusion. If that is the case, how can we decide, at a given stage of physics, which principles of the leading theories play a constitutive role? It would be trivial and unhelpful to assume that the theory as a whole constitutes its object. In order to remain in the spirit of the transcendental method, we need to distinguish form from content, if only in a temporary manner. Although Cassirer is frustratingly vague on this point, he gives us some clues about where to find the constitutive principles: in the analysis of the preconditions of measurement (which he regards as an essential part of the critique of knowledge), in unifying symmetries, in invariants and constants of nature, and in constructive thought experiments. Especially important is the idea that when numeric determinations turn out to be relative to the observer, an intrinsic object can be constructed through the group of transformations relating the various determinations.Footnote 26

1.4 Reichenbach’s principles of coordination

Before having seen Cassirer’s treatise on relativity, his former student Hans Reichenbach published his own interpretation of the relativistic revolutions in a neo-Kantian framework. In Kant’s a priori, Reichenbach argued, one should distinguish two aspects: apodictic certainty, and constitutive power. The first has to go, but the second remains indispensable. The mind imposes some order and some resilience not to be found in raw sensory data.Footnote 27

The central concept of Reichenbach’s new theory of knowledge is coordination (Zuordnung). Although he borrows the word from Moritz Schlick’s empiricist theory of a one-to-one (eindeutige) set-theoretical correspondence between theoretical concepts and physical reality,Footnote 28 he believes that the epistemological concept of coordination essentially differs from the set-theoretical concept because the coordinated elements of physical reality (Wirklichkeit) are not defined before the coordination. Somewhat paradoxically, coordination defines the object of experience even though experience is the sole source of the relevant order. The coordinated elements are defined by the coordination itself. In order to be successful, the coordination has to be one to one, surely not in the set-theoretical sense of the word (which presupposes the target elements to be predefined), but in the following empirically testable sense: the value of any measurable quantity must be the same whatever data are used for its determination.Footnote 29

At this stage of his reasoning, Reichenbach introduces the principles of coordination (Zuordnungsprinzipe) as the principles that make the coordination one to one. These principles constitute the objects of the theory as they define the mathematical form of physical quantities and the kinds of structures they can form. They do not by themselves determine the theory; in addition, Reichenbach admits laws of combination (Verknüpfungsaxiome or Verknüpfungsgesetze) that relate different physical quantities in an empirically testable manner (once the principles of coordination are given). For instance, Euclidean geometry and the vector character of forces are principles of coordination in classical mechanics, and a specific law of force is a law of combination. In general relativity, the differential manifold and the rules of tensor calculus on this manifold are principles of coordination, and Einstein’s equations relating (derivatives of) the metric tensor to the energy–momentum tensor are laws of combination.Footnote 30
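To make the distinction concrete in modern notation (an illustration of mine, not Reichenbach’s own formulas), the Einstein field equations

$$ R_{\mu\nu} \;-\; \tfrac{1}{2}\,R\,g_{\mu\nu} \;=\; \frac{8\pi G}{c^{4}}\,T_{\mu\nu} $$

are a law of combination: they relate the metric tensor $g_{\mu\nu}$ (through its curvature) to the energy–momentum tensor $T_{\mu\nu}$ and can in principle be tested. But the statement that there are a metric tensor and an energy–momentum tensor on a four-dimensional differentiable manifold, and that physical quantities transform tensorially, belongs to the principles of coordination that must already be in place before the equation has any empirical content.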

In the latter theory, the coordination depends on an arbitrary choice of coordinates and does not require a fixed given metric. On the one hand, this arbitrariness shows the necessity of a subjective form in the physical description. On the other hand, it shows that there are equivalent coordination frameworks. These are equally one-to-one and they are related by differentiable coordinate transformations. The invariants of the theory under these transformations define the objective content of reality (den objektiven Gehalt der Wirklichkeit) according to Reichenbach. The similarity with Cassirer’s notion of intrinsic object is here obvious.Footnote 31

As history teaches us, the principles of coordination do not share the apodictic certainty of Kant’s a priori. Radically new theories such as Einstein’s two theories of relativity require new principles of coordination. Future theories may require still different principles of coordination, as Reichenbach infers from Weyl’s contemporary proposal of a variable gauge for length measurement. Owing to the constitutive value of these principles, any such change implies a new mode of constituting the object of knowledge. In each such change Reichenbach sees a closer and closer approximation to reality. He believes in a “procedure of continual extension” (Verfahren der stetigen Erweiterung) enabling us to move inductively from one mode of coordination to the next.Footnote 32

To sum up, Reichenbach retains the Kantian idea of constitutive principles that define the object of knowledge. He departs from Kant by allowing these principles to vary in the history of physics. In a more apodictic vein, he stipulates the one-to-one character of the coordination provided by the constitutive principles, although he admits that even this meta-principle might have to be given up in a future science. In proper Marburg fashion, he does not distinguish sensibility from understanding. He reduces both of them to a logic of coordination. He does not admit predefined elements of reality as the target of the coordination. Yet the very idea of coordination seems to betray the nostalgia for a fixed given in perception. He indeed maintains a notion of space and time coordinates as a subjective form of description from which the true objects are extracted by constructing invariants. At the end of his book, he proposes to replace the Kantian deduction of categories with the art of extracting invariants from the subjective form of description.Footnote 33

There are several obscurities in Reichenbach’s notion of coordination: it is not clear how the coordination between the mathematical formalism and empirical reality is effectively done; it is not clear how the principles of coordination should be chosen and how they permit one-to-one coordination; and it is not clear how the one-to-one character of the coordination can be tested without knowing what the measured quantities and the measurements should be in the imagined tests. Then how are we supposed to distinguish the constitutive principles from mere conventions? Reichenbach believes that in a given empirical context, the choice of the coordination principles is constrained by the empirical data if other a priori “self-evident” principles are assumed. For instance, he argues that coordination by absolute time is excluded if we accept the relativity principle, the principle of contiguous action, and a principle of “normal induction.” As Reichenbach would himself later realize, this argument is invalid, if only because Poincaré’s version of special relativity satisfies all these principles without giving up absolute time. In his correspondence with Reichenbach, Schlick similarly argued that nothing distinguished the alleged constitutive principles from conventions à la Poincaré. Shaken by this criticism, Reichenbach soon gave up his neo-Kantian claims and became an advocate of a “relativistic conventionalism.”Footnote 34

We may now compare the ways in which Cassirer and the young Reichenbach accommodated the relativistic revolution. They both saw three stages in the evolution of our concepts of space and time, corresponding to Newtonian mechanics, special relativity, and general relativity; they characterized each stage by its constitutive principles; and they had ways of comparing the successive principles. With some extrapolation and modernization of their identification of these principles, we might say that the Galilean group, the Lorentz group, and the group of diffeomorphisms respectively constitute the object of Newtonian mechanics, special relativity, and general relativity. For Reichenbach, these groups play a double role: they warrant the one-to-one character of the coordination between the mathematical apparatus of the theory and physical reality, and they interconnect equivalent coordinations. For Cassirer, they express systematic unity in the manifold of events, and they do so all the better the less metric background is presupposed.
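To make the comparison concrete (a standard textbook illustration, not one drawn from Cassirer or Reichenbach), the first two groups can be displayed for a boost with velocity v along the x axis:

$$ \text{Galilean:}\quad x' = x - v\,t,\quad t' = t; \qquad \text{Lorentz:}\quad x' = \gamma\,(x - v\,t),\quad t' = \gamma\Bigl(t - \frac{v\,x}{c^{2}}\Bigr),\quad \gamma = \frac{1}{\sqrt{1 - v^{2}/c^{2}}}. $$

What counts as an objective, frame-independent quantity differs accordingly: the time interval and the spatial distance separately in the first case, the spacetime interval $c^{2}\Delta t^{2} - \Delta x^{2}$ in the second, and only diffeomorphism-invariant quantities in general relativity.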

This difference in the interpretation of the constitutive groups results from a basic difference in the definition of constitutive principles. For Reichenbach, they are “principles of coordination” warranting the one-to-one character of the coordination between theory and physical reality. For Cassirer, they are “forms of knowledge” providing uniform, structuring presuppositions of measurement. For Reichenbach, objectivity remains tied to a presupposed empirical reality, whereas for Cassirer it results from a higher regulative principle of systematic unity. This is why Reichenbach’s neo-Kantianism could easily evolve into a conventionalist empiricism, whereas Cassirer’s strict regulative a priori prevented his forms of thought from degenerating into mere conventions.

Cassirer and the young Reichenbach both situated themselves in the Kantian tradition. Yet they presented their relation to Kant in a different manner. Cassirer saw himself as capturing the essence of Kant’s transcendental project, properly interpreted as a regulative ideal in the search for constitutive principles. In contrast, Reichenbach believed that Kant’s a priori depended on a notion of self-evidence (Evidenz) that led to inconsistencies in the face of the newer physics. In a letter thanking Reichenbach for his text, Cassirer wrote:

Our viewpoints are related – but, as far as I can see for the moment, they do not coincide in regard to the determination of the concept of a priori and in regard to the interpretation of the Kantian doctrine; in my opinion, you interpret this doctrine in a too psychological manner, and you consequently exaggerate the contrast with your “analysis of science.” When understood in a strict “transcendental” manner, Kant is much closer to this conception than you suggest.

What Reichenbach called the method of science analysis (wissenschaftsanalytische Methode) was an inquiry into the implicit presuppositions of modern science. As suggested by Cassirer, this agreed with the transcendental method in Marburg style and contradicted Kant only if the more contingent and psychological aspects of this doctrine (intuition and the self-evidence of some categories) were taken seriously.Footnote 35

1.5 Friedman’s “relativized a priori”

In recent years, Michael Friedman proposed his own neo-Kantian interpretation of constitutive change in physical theory under the label “relativized a priori.” The most detailed account of his views is found in his Dynamics of reason published in 2001. His main sources of inspiration are Reichenbach’s early principles of coordination, Cassirer’s idea of a converging sequence of forms of knowledge, Rudolf Carnap’s linguistic frameworks, and Thomas Kuhn’s paradigms. His main target is the holism of Willard Quine’s “webs of beliefs,” which exclude any distinction between the formal and empirical components of a theory. According to Friedman, Quine’s proofs of the impossibility of this distinction (or of the related analytic/synthetic distinction) only apply when it is expressed in purely logical terms (as is the case for Carnap). They do not apply to the distinction between “constitutive principles” and “properly empirical laws” which Friedman takes to be a central feature of any advanced physical theory (echoing Reichenbach’s distinction between principles of coordination and laws of combination). Like Reichenbach’s principles of coordination, Friedman’s constitutive principles define the object of scientific knowledge without having the apodictic certainty of Kant’s a priori. They may undergo radical changes in revolutionary circumstances. Friedman defines his constitutive principles as basic preconditions for the mathematical formulation and the empirical application of a theory.Footnote 36

Like Reichenbach and Cassirer, Friedman sees some rationality in the transition from one set of constitutive principles to the next and he assumes asymptotic convergence toward stable principles. For Cassirer, increased systematic unity by de-reification was the rational motor of change. For Reichenbach, intertheoretical approximation and a higher principle of normal induction controlled such transitions. These are internal mechanisms of change in which physicists act as their own philosophers. Although Friedman also dwells on intertheoretical relations of approximation and reinterpretation, he does not believe that a purely internal process of theory change suffices to overcome the Kuhnian incommensurability barrier between successive paradigms. In his eyes, philosophical meta-frameworks play an essential role in allowing for a continuous, natural transition toward new constitutive frameworks.Footnote 37

Friedman’s choice of constitutive principles in the three standard examples of Newtonian physics, special relativity, and general relativity somewhat differs from Reichenbach’s and Cassirer’s. For Newtonian physics, the constitutive principles are Euclidean geometry and Newton’s laws of motion, in conformity with Kant’s Metaphysical foundations of natural science. Newton’s law of gravitation has no empirical meaning without the laws of motion. For special relativity, Friedman’s constitutive principles are the light postulate, the relativity principle, and the mathematics needed to develop the consequences of these principles. For general relativity, the relevant principles are the Riemannian manifold structure, the light postulate (used locally), and the equivalence principle (understood as the statement that free-falling particles follow geodesics of the Riemannian manifold)Footnote 38; Einstein’s relation between the Riemann curvature tensor and the energy–momentum tensor is regarded as a “properly empirical law,” whose content cannot be expressed and tested without the constitutive principles.Footnote 39
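In standard notation (again an illustration, not Friedman’s own), the division can be displayed explicitly: the geodetic principle, counted among the constitutive principles, states that a free-falling particle obeys

$$ \frac{d^{2}x^{\mu}}{d\tau^{2}} \;+\; \Gamma^{\mu}_{\alpha\beta}\,\frac{dx^{\alpha}}{d\tau}\,\frac{dx^{\beta}}{d\tau} \;=\; 0, $$

while the field equations relating curvature to the energy–momentum tensor (displayed above in Sect. 1.4) are the “properly empirical law” whose test presupposes the former.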

One might object that some of Friedman’s constitutive principles, notwithstanding their being preconditions of properly empirical laws, are themselves empirical laws. Friedman anticipates this objection through a Poincarean remark: some constitutive principles do have antecedents in merely empirical laws, but they have been “elevated” to a higher status in which they become conventions for the construction of the new theory.Footnote 40 Let us see how this would work in the case of Newtonian physics. Newton’s laws of motion do have empirical content. For instance, the law of inertia implies the testable existence of a reference system in which all free particles travel in straight lines and travel proportional distances in equal times (granted that global synchronization is possible); and the law of acceleration can be tested by comparing the observed motion in an inertial frame with the static measure of the force.Footnote 41 Friedman would reply that the derived “coordinating principles” differ from empirical laws by an element of decision or convention that makes them rigid preconditions for the description of any mechanical behavior and for the expression of further empirical laws such as the law of gravitation. At least this is Friedman’s convincing reply to the similar difficulty for the light postulate and for the relativity principle in the case of special relativity, and for the equivalence principle in the case of general relativity.
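In elementary terms (a textbook illustration rather than Friedman’s own), the asymmetry is clear: the law of gravitation yields observable predictions only when the force is inserted into the law of acceleration,

$$ m\,\ddot{\mathbf r} \;=\; \mathbf F, \qquad \mathbf F \;=\; -\,\frac{G\,m\,M}{r^{2}}\,\hat{\mathbf r}, $$

so that any test of the inverse-square law presupposes the second law and an inertial frame in which accelerations are defined, whereas the second law can be applied and tested with forces of entirely different origin.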

Another difficulty concerns the rationality of changes in the systems of constitutive principles. As Friedman has these changes depend on the intellectual context of the time (especially the philosophical debates), he introduces an element of historical contingency that seems incompatible with a rational view of scientific progress. Friedman deflects this charge in three different ways: by insisting on the rational demand that the earlier system should in some sense be an approximation of the later one, by displaying the inner logic of each intellectual context, and by showing a natural evolution of each of these contexts from Kant’s rigid definition of the a priori toward more adaptable neo-Kantian notions.Footnote 42

Following Reichenbach, in his Dynamics of reason Friedman regards the mathematical apparatus of physical theories as purely formal and therefore requires coordinating principles that mediate between abstract mathematics and physical phenomena. As noted by Ryckman, a first difficulty with this view is that it relies on an unnecessarily formal conception of mathematics.Footnote 43 In contrast, the neo-Kantian and Husserlian conceptions of mathematics imply structures that prepare the physical application of mathematical constructs (especially the group structure for Poincaré and Weyl). Another difficulty is the unrealistic simplicity of the modes of coordination imagined by Reichenbach and Friedman. For instance, much background knowledge is needed before the light postulate and the geodetic principleFootnote 44 truly inform the application of general relativity.

Around 2010, some of Friedman’s readers suggested ways out of these troubles with coordination. Scott Tanona recommended a richer definition of the target of the coordination as a pre-structured phenomenal frame, just as Niels Bohr had the interpretation of quantum formalism depend on classical accounts of observed phenomena. Other critics of coordination instead regarded the idea of a phenomenal target as an undesirable remnant of the Kantian duality between understanding and sensibility. In place of it, Ryckman recommended a Husserlian-Weylian “sense bestowal” based on the inner evidence of consciousness; Thomas Uebel argued for a strictly analytic version of the relativized a priori in Carnapian spirit; Massimo Ferrari and Jonathan Everett advocated a return to Cassirer’s variety of neo-Kantianism. In this last view, the object of scientific inquiry is constituted by evolving forms of lawfulness, under the control of fixed regulative principles.Footnote 45

Friedman then agreed that his original notion of coordination was “too thin” but he distanced himself from attempts to do without something like Kant’s intuition.Footnote 46 In his opinion, there cannot be any genuine constitutive principles without empirical intuition, because the very distinction between regulative and constitutive principles depends on sensibility: In order to constitute objects of experience, the categories of the understanding must be applied to sensible intuitions; the ideas of reason are merely regulative because they do not operate on our sensibility. For the new Friedman, constitutive principles must be able to coordinate the mathematical structure of a physical theory with generalized “physical frames of reference” defined as “ostensively introduced and empirically given systems of coordinates (spatial and temporal) within which empirical phenomena are to be observed, described, and measured.” The frames define the concrete conditions of observation in a manner structured by earlier available theories:Footnote 47

We thus have (relativized) a priori mathematical structure at both the observational and the theoretical levels, and the two are coordinated with one another by a complex developmental interaction in which each informs the other.

Although Friedman’s characterization of the “frames” of observation remains sketchy, he clearly means them to contain much more than the physicists’ notion of reference frame. They encompass the approximate applicability of observational structures, all the technical knowledge accumulated during the earlier application of these theories, as well as broader cultural elements. Friedman’s relativized a priori thus acquires much historical thickness:Footnote 48

Our problem, therefore, is not to characterize a purely abstract mapping between an uninterpreted formalism and sensory perceptions, but to understand the concrete historical process by which mathematical structures, physical theories of space, time, and motion, and mechanical constitutive principles organically evolve together so as to issue, successively, in increasingly sophisticated mathematical representations of experience.

By redefining Kant’s sensibility in a pragmatic, historicized manner, Friedman significantly alters the meaning and function of his earlier constitutive principles. Yet some crucial components of his view have not changed. The constitutive principles still serve the purpose of coordinating a mathematical structure to an observational structure; they still define the object of the theory; and they still remain distinct from the properly empirical laws that determine the behavior of the object. Although Friedman is not quite explicit on this last point, it is trivially needed in order to avoid the degenerate view in which the constitutive principles would define the entire theory (save for the observational structure). What has changed is the kind of coordination provided by the constitutive principles, now involving experimental technologies and lower-level theoretical presuppositions.

Quite explicitly, Friedman introduces this new idea of coordination as an amplification of Kantian sensibility and thus moves further away from the Marburg school. Yet, ironically, there is a sense in which he has come closer to Cassirer’s structuralism: the theory-ladenness of his frames of reference tends to reduce them to systems of relations. The fixed perceptual basis, if there is any, seems to be at the end of a chain of interlocked models of measurement. Perhaps the only important difference left between Friedman and Cassirer is in the way they conceive the change of constitutive principles. For Friedman, a very rich meta-framework including experimental physics, technology, philosophy, theology, and other cultural elements determines this evolution in rational continuity with Kant’s original project. For Cassirer, the evolution of constitutive principles is more narrowly driven by a regulative principle of systematic unity and by empirical challenging of the invariants assumed in earlier theories.

1.6 Criticism

Granted that constitutive principles are necessary to define the object of scientific inquiries, the problem is to give a precise and effective characterization of these principles. Kant’s original notion was too tied to Newtonian physics to survive the later evolution of physics. The more flexible and more evolvable notions of Reichenbach, Cassirer, and Friedman seem more apt to capture the constitutive component of modern theories. On the one hand, Reichenbach and Friedman define the constitutive principles as the means through which we connect a mathematical structure with physical phenomena. On the other hand, Cassirer defines the constitutive (or regulative) principles as the means through which we satisfy the basic demand of systematic unity in diverse contexts of measurement. The first view has the advantage of addressing the notoriously inscrutable difference between a mathematical theory and a physical theory, and the defect of relying on an ill-defined concept of coordination (as long as Friedman’s notion of frames is not made more precise). The second has the advantage of being based on a universally acceptable demand of unifying synthesis and measurability, and the defect of leaving measurement and physical interpretation in the dark.

Let us look more closely at the first, coordination-based view of constitutive principles. For Reichenbach, these principles are the warrants of the one-to-one character of coordination. This notion is problematic because it cannot make sense without some preconceived idea of what is being coordinated in the physical world, an idea that seems to lead to naive realism or to psychologism. Friedman avoids Reichenbach’s requirement of one-to-one coordination and instead propounds a characterization of constitutive principles as the component of the theory that is needed to express and test the properly empirical laws of the theory. This leaves us with a number of questions: How selective is this characterization? Where do the constitutive principles come from? Do they have some sort of necessity or are they merely convenient conventions? How do they connect the mathematical formalism to the world of experience?

The first question is about the legitimacy of Friedman’s distinction between constitutive principles and properly empirical laws. According to Friedman, some constitutive principles such as the light postulate or the equivalence principle do have empirical content (they are empirical generalizations), but unlike ordinary empirical laws they are regarded as preconditions for the expression of any other empirical law. The problem with this view is that it seems to rely on a subjective decision. For instance, in Newtonian physics, why could we not regard both the acceleration law and the law of gravitation as properly empirical laws in a constitutive framework defined by Euclidean geometry and Newton’s first and third laws? Pace Kant, this was the view of Daniel Bernoulli and of later textbook writers who asserted the empirical character of the second law. In order to regard this law as merely constitutive, we would have to demonstrate that it is in some sense more necessary than other empirical laws such as the law of gravitation. Such a demonstration is not to be found in Kant’s writings.Footnote 49

The opposite difficulty occurs in the context of general relativity. Here Friedman regards the geodetic principle (according to which free-falling particles follow geodesics of the spacetime manifold) as constitutive and Einstein’s relation between the curvature tensor and the energy–momentum tensor of matter as properly empirical. Why could we not also regard this relation as constitutive, and confine the properly empirical to the expression of the energy–momentum tensor? This seems to be the more natural choice for physicists accustomed to regarding the principle of least action as constitutive, because this principle together with general covariance and plausible simplicity assumptions leads to both the geodetic principle and the Einstein field equations. In this case it is tempting to admit more constitutive principles than Friedman does, whereas in the case of Newtonian mechanics it may be tempting to admit more properly empirical laws.
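The physicists’ standpoint mentioned here can be spelled out with the Einstein–Hilbert action (a standard modern formulation, not one found in the neo-Kantian texts under discussion):

$$ S \;=\; \frac{c^{4}}{16\pi G}\int R\,\sqrt{-g}\;d^{4}x \;+\; S_{\text{matter}}[g_{\mu\nu}, \psi], $$

where variation with respect to the metric $g_{\mu\nu}$ yields the Einstein field equations, and variation of a test particle’s matter action $-mc\int ds$ with respect to its worldline yields the geodetic principle. From this standpoint the two statements stand or fall together, which makes Friedman’s line between the constitutive and the properly empirical look like a matter of presentation.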

Owing to these ambiguities, Friedman’s notion of constitutive principles seems dangerously close to the Quinean notion of subjectively and contingently “entrenched principles,” which is precisely what Friedman wanted to avoid. It remains true, however, that some empirical laws, for instance the law of universal gravitation, cannot be regarded as preconditions for the formulation of other empirical laws, while others, such as Newton’s laws of motion or Einstein’s field equations, are necessary preconditions for the application of other empirical laws. What is questionable is the decision to call some laws of the latter kind constitutive and others not. This does not affect Friedman’s general idea of a stratification of our epistemic resources; it only shows the difficulty of sharply characterizing this stratification for a given theory.

The second question is about the origin of the constitutive principles. Some of the principles, for instance Euclidean geometry in Newtonian physics or the pseudo-Riemannian manifold structure in general relativity, are mathematical preconditions for the formulation of the theory. They may already be known to mathematicians, or they may require new inventions or adaptations. The rest of the constitutive principles, Friedman tells us, are the “coordinating principles” obtained by elevating empirical laws to a higher constitutive status. Not every empirical law is a candidate for this elevation, only laws that have sufficient generality and appear to play a critical role in a philosophical meta-framework. In a situation of crisis, this elevation allows us to constitute the object of a new theory. Listen to Goethe: “The highest art in intellectual life and in worldly life consists in turning the problem into a postulate that enables us to get through.”Footnote 50

This sounds fine except that there may be other ways to get through the crisis. For instance, it became clear to Poincaré, in 1900, that the velocity of light as measured by moving observers would be the same as in the ether frame and that, as a consequence, the time measured by moving observers who synchronize their clocks by optical means would differ from the time measured in the ether frame. In the spring of 1905, this insight led Poincaré to a version of the theory of relativity that was empirically equivalent to Einstein’s slightly later theory, and yet Poincaré refused to regard the light postulate (constancy of the velocity of light in the ether frame) and the relativity principle as constitutive with respect to the definition of space and time. In general, the underdetermination of theories by experimental data leads to several equivalent options for solving the same difficulties, and these options have different constitutive principles. The choice between these options is largely contingent. In physics narrowly considered, they are conventions the suitability of which is a matter of convenience. At least Poincaré thought so.
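In formulas (a standard reconstruction of Poincaré’s reasoning, not his own notation), observers moving with velocity v through the ether who synchronize distant clocks with light signals obtain, to first order in v/c, the “local time”

$$ t' \;\approx\; t \;-\; \frac{v\,x}{c^{2}}, $$

which differs from the ether-frame time t even though the same optical experiments are being described; the exact transformations of the Lorentz group follow once length contraction and time dilation are taken into account.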

Friedman is of course aware of this underdetermination and he removes it by appeal to philosophical meta-frameworks in which the new conventions become necessary and thus acquire a constitutive status.Footnote 51 Although his accounts of the genesis of Einstein’s coordinating principles work well as rational reconstructions, their historical pertinence remains debatable. Firstly, the availability of the needed meta-framework at the right moment seems contingent. Secondly, the true historical motivations of the central actor may have differed from those suggested by Friedman. For instance, a Hertzian dislike of redundancies in theoretical representation may have weighed more than Poincaré’s philosophy of geometry when Einstein elevated the constancy of the velocity of light to a principle constitutive of space–time relations. These remarks answer my third question about the necessity of the constitutive principles.Footnote 52

The fourth and last question is about the manner in which the constitutive principles connect the mathematical apparatus of a theory with the world of experience. The relevant principles are Friedman’s “coordinating principles.” Are these principles sufficient to determine the applications of a theory? The textbook definitions of major physical theories suggest as much, because these definitions typically involve a “mathematical formalism” and a few “rules of interpretation” which play a role somewhat similar to Friedman’s coordinating principles. Also, the particular theories Friedman discusses, Newtonian mechanics and relativity theory, seem to provide for their own interpretive resources, unlike other theories such as electromagnetic theory, statistical mechanics, fluid mechanics, or quantum mechanics, whose interpretation requires modular connections with other theories.

Yet, as Friedman himself emphasizes in his more recent writings, simple coordinating principles do not suffice to define the application of theories. It is not by contemplating the textbook definition of a theory that physicists learn how to apply it; it is by applying the theory to a series of exemplars given in any good textbook. That is not only for pedagogical reasons. Knowledge is needed that is not contained in the bare rules of interpretation, even in the simple cases of classical mechanics and relativity theory. Consider general relativity. Any effect involving concrete clocks (for instance the gravitational redshift of spectral lines or the gravitational slowing down of atomic clocks) requires Einstein’s equivalence principle according to which the laws of physics in a free-falling, non-rotating local reference frame are locally the same as in the absence of gravity. This principle is not contained in the geodetic principle that Friedman takes to be the basic coordinating principle (in addition to the light postulate).Footnote 53 Friedman would perhaps have no objection to accepting Einstein’s stronger principle as one of the constitutive principles. This might indeed square well with his recent notion of physical frame of reference. Note, however, that this principle differs from his other coordinating principles by implicitly involving phenomena (and theories) that do not belong to the official domain of the theory under consideration, for instance electromagnetism.

Admittedly, there are many concrete consequences of general relativity that do not principally involve extra-theoretical time gauges (clocks whose functioning cannot be described by gravitation theory only). But even in such cases more is needed than just the light postulate and the equivalence principle to interpret relevant experiments. As Friedman himself remarks in recent writings, optical instruments and goniometric techniques are used in any observation of the astronomical consequences of general relativity. Their implying electromagnetic theory is not much of a problem, because additional constitutive principles could be introduced for the electromagnetic sector of the theory. What is more problematic, for anyone interested in the actual application of theories, is the fact that the global theory, as long as it is defined only by its constitutive principles and its general laws, does not provide the means to conceive the experimental setups through which it is applied. For this purpose we rely on previous theoretical and practical knowledge that remains regionally valid.

One might retort that this knowledge is implicitly contained in the new theory because the older theories are regional approximations of the new theory in some operationally meaningful sense. This sort of reductionism is a will-o’-the-wisp because we would generally lack an incentive to consider the regional approximations if we did not have previous knowledge of the relevant regions of experience. Even if we chanced to consider these approximations for purely formal reasons, we would thus access only the formal apparatus of the earlier theories, not the associated laboratory practice. Moreover, the reductions would require the entire theory, not only its constitutive principles, so that the allegedly non-constitutive, empirical component of the theory indirectly plays a coordinating role in defining local conditions of measurement.

To sum up, Friedman’s version of the constitutive a priori has several defects. Firstly, it does not provide a sufficiently sharp criterion for distinguishing the constitutive principles from other assumptions of a theory. Secondly, the coordinating principles are contingent on the availability of a proper philosophical meta-framework. Thirdly, the constitutive principles do not by themselves determine the concrete application of a physical theory. Although Friedman’s recent idea of “physical frames of reference” addresses this last difficulty, it is still too vague to address the application problem in a realistic manner.

An additional remark concerns the scope of the relativized a priori. For the sake of historical continuity with Kant’s transcendental project, Cassirer, Reichenbach, and Friedman adapted their theories of knowledge to the historical evolution of the concepts of space and time from Newton to Einstein and they focused on three theories of physics: Newtonian mechanics, special relativity, and general relativity. There is no doubt, however, that any mature neo-Kantian epistemology should encompass all the theories of modern physics. As was earlier mentioned, the pet physical theories of neo-Kantian philosophers are peculiar in their direct bearing on the most primitive concepts of experience (space and time). Other theories, for instance electrodynamics or thermodynamics, are constructed on the basis of higher-level concepts. There is no guarantee that epistemological notions developed in the limited context of space–time theories would be relevant to physical theory in general.

These critical remarks contain some hints for how to improve on Friedman’s constitutive principles. Firstly, we should investigate the way in which physical theories of all kinds are applied in concrete experimental setups before we decide on their constitutive apparatus. In other words, we need a realistic and nonetheless generic account of coordination. Secondly, we need to distinguish coordination from constitution, which is only a constraint on the form of coordination. Thirdly, we need a way of justifying the constitutive apparatus of a theory without relying on the contingent availability of philosophical meta-frameworks. Here Cassirer’s emphasis on measurement, on its preconditions, and on the regulative demands of systematic unity and lawfulness seems most promising.

2 The coordinating substructures of physical theories

The purpose of this section is to identify the substructures that allow the coordination of physical theories with the phenomenal world and also to investigate the relevant inter-theoretical relations.

2.1 Generic definition of physical theories

Not much can be said on the nature, role, and constitution of physical theories without a sufficiently precise definition. I arrived at the following definition after inspecting the most important physical theories and the ways in which they are concretely applied.Footnote 54

A physical theory is defined by four components:

(a) a symbolic universe in which systems, states, transformations, and evolutions are defined by means of various magnitudes based on powers of \( \mathbb{R} \) (or \( \mathbb{C} \)) and on derived functional spaces and algebras.

(b) theoretical laws that restrict the behavior of systems in the symbolic universe.

(c) interpretive schemes that relate the symbolic universe to idealized experiments.

(d) methods of approximation that enable us to derive the consequences that the theoretical laws have on the interpretive schemes.

To illustrate this definition, take the simple example of the classical mechanics of a finite number of mass points. The symbolic universe comprises systems defined by a number N of mass points, states defined by the spatial configuration of the particles (N vectors in a three-dimensional Euclidean space), evolutions that give this configuration as a twice differentiable function of a real time parameter, the list of masses of the particles (N positive real constants), and an unspecified force function that determines the forces acting on the particles for a given configuration and at a given time. The basic law is Newton’s law relating force, mass, and acceleration. Interpretive schemes vary. A first possibility is that the scheme consists in a given system (choice of N) and the description of ideal procedures for measuring spatial configuration, time, forces, and masses. Then an idealized experiment may consist in the verification of the motion predicted by the theory for given initial positions and velocities and for a properly selected frame of reference. Or the scheme may involve mass, position, and time measurements only, allowing idealized experiments in which the forces are determined as a function of the configuration. Or else, the scheme may involve mass, position, and time measurements and the choice of a specific force function, allowing idealized experiments in which the motion predicted by the theory is verified for given initial conditions.
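In the notation of this example, the basic law can be written explicitly; the particle label \( \alpha \) and the functional form given to the unspecified force function are my own shorthand:

$$ m_{\alpha }\, \ddot{\mathbf{r}}_{\alpha } (t) = {\mathbf{F}}_{\alpha } \bigl( {\mathbf{r}}_{1} (t), \ldots ,{\mathbf{r}}_{N} (t), t \bigr), \qquad \alpha = 1, \ldots , N . $$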

This first example suggests a more precise definition of an interpretive scheme as the choice of a given system in the symbolic universe together with a list of characteristic quantities that satisfy the following three properties:

(1) They are selected among or derived from the (symbolic) quantities that are attached to this system.

(2) At least for some of them, ideal measuring procedures are known.

(3) The laws of the symbolic universe imply relations among them.

The characteristic quantities of a given interpretive scheme are divided into measured quantities and more theoretical quantities. Only the former quantities are measured in experiments based on the scheme. The theoretical quantities are either the unknowns that the experiments aim to determine, or quantities whose value is taken from empirical laws established by preliminary experiments.
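For readers who prefer a schematic rendering, the four components and the notion of an interpretive scheme can be sketched as a toy data structure; this is merely an illustrative paraphrase of the definitions above, with field names of my own choosing, not an additional part of the definition itself.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class InterpretiveScheme:
    """A system of the symbolic universe together with its characteristic quantities."""
    system: str                        # the chosen system, e.g. "N mass points"
    measured_quantities: List[str]     # quantities for which ideal measuring procedures are known
    theoretical_quantities: List[str]  # unknowns, or values taken from prior empirical laws

@dataclass
class PhysicalTheory:
    symbolic_universe: str                          # (a) systems, states, evolutions, magnitudes
    theoretical_laws: List[str]                     # (b) restrictions on behavior in the universe
    interpretive_schemes: List[InterpretiveScheme]  # (c) links to idealized experiments
    approximation_methods: List[str]                # (d) how the laws yield consequences for the schemes

# The mass-point example of the text, rendered in this toy structure:
point_mechanics = PhysicalTheory(
    symbolic_universe="N mass points: configurations, masses, an unspecified force function",
    theoretical_laws=["Newton's law relating force, mass, and acceleration"],
    interpretive_schemes=[
        InterpretiveScheme(
            system="a given number N of mass points in a selected frame of reference",
            measured_quantities=["positions", "times", "masses"],
            theoretical_quantities=["forces (the unknowns the experiments aim to determine)"],
        )
    ],
    approximation_methods=["numerical or perturbative integration of the equations of motion"],
)
```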

Before further comments, let us consider the more difficult example of quantum mechanics. There the symbolic universe involves an infinite-dimensional Hilbert space, operators representing physical quantities, and a few real-number parameters such as time, mass, charge, and external fields. The two basic laws are Schrödinger’s equation and the law giving the statistical distribution of a given quantity for a given state. Interpretive schemes involve the various quantities attached to the particles and fields and the parameters. The laws of the symbolic universe imply statistical correlations between the quantities for given values of the parameters. The complexity of the symbolic universe and of the interpretive schemes varies with the type of system considered (single particle in external fields, several interacting particles, quantum fields).
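In standard notation (mine, not the text’s), the two laws read as follows: Schrödinger’s equation for the state vector, and the statistical law giving the probability of obtaining the value \( a \) when the quantity represented by the operator \( A \) is measured on a normalized state, \( \Pi_{a} \) being the projector onto the corresponding eigenspace:

$$ i\hbar \frac{\mathrm{d}}{\mathrm{d}t} |\psi (t)\rangle = H |\psi (t)\rangle , \qquad P(a) = \langle \psi | \Pi_{a} |\psi \rangle . $$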

In this example, it is obvious that the interpretive schemes do not spontaneously derive from the symbolic universe because there is no direct correspondence between the symbolic state vectors and the measured quantities. In the classical case, one may be tempted to believe the contrary because one can easily imagine an approximate concrete counterpart of the symbolic universe. This would be a mistake, because the symbolic quantities never have a direct concrete counterpart. Their concrete implementation requires ideal measurement procedures that are not completely definable within the symbolic universe of the theory. Most mechanical experiments appeal to position and time measurements, and these require, besides the laws of the symbolic universe, a notion of inertial frame and the means to concretely realize the measurements.

In general, the set of interpretive schemes associated with a theory varies in the course of time. Some schemes are there from the beginning of the theory, as they are associated with its invention. Others come at later stages of the evolution of the theory when it is applied in a more precise or a more extensive manner. In this process, some purely symbolic quantities may be promoted to the schematic level. For instance, late nineteenth-century studies of gas discharge and cathode rays provided experimental access to the invisible motions assumed in the electron theories of Hendrik Lorentz, Joseph Larmor, and Emil Wiechert.

A last remark on the present definition of theories is that all the structures it employs are defined mathematically. In this respect, it agrees with the so-called semantic view, in which physical theories are set-theoretical constructs serving as models of a putative linguistic formulation. The main difference is that it contains evolving substructures, the interpretive schemes, that enable us to conceive blueprints of concrete experiments. In a vague way, we may understand this power of the schemes as a consequence of their being generated all along the history of applications of the theory. But we are still in the dark regarding the precise way in which physicists articulate the relation between symbols, schemes, and experiments. This is where the notion of modules is indispensable.

2.2 Modules

Any advanced theory contains or is constitutionally related to other theories with different domains of application. The latter theories are said to be modules of the former. Modules occur in the symbolic universe, in the interpretive schemes, and in limits of these schemes. Since by definition they are themselves theories, they also contain modules, submodules, and so forth until the most elementary modules are reached. There are (at least) five sorts of modules. In reductionist theories such as the mechanical ether theories of the nineteenth century, there is a reducing module diverted from its original domain in order to build the symbolic universe of another domain. In many theories, the symbolic universe also appeals to defining modules that define some of the basic quantities. For instance, mechanics is a defining module of thermodynamics because it serves to define the basic concepts of pressure and energy. There are schematic modules that occur at the level of interpretive schemes and serve to describe the relevant measurements. These may belong to the symbolic universe, as is the case for pressure in the schemes of a thermodynamic gas system; or they may require additional modules as is the case for position and momentum in the schemes of one-particle quantum mechanics. There are specializing modules that are exact substitutes of a theory for subclasses of schemes under certain conditions. For instance, electrostatics is a specializing module of electrodynamics. Lastly, there are approximating modules that can be obtained by taking the limit of the theory for a given subclass of schemes. For instance, geometrical optics is an approximating module of wave optics. These categories are not mutually exclusive: for example, a schematic module can also be a defining module or an approximating module.Footnote 55
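Purely as a mnemonic, the five sorts of modules and the examples just given can be tabulated in a small sketch of my own; nothing in the taxonomy depends on this rendering.

```python
from enum import Enum

class ModuleKind(Enum):
    """The five sorts of modules distinguished in the text."""
    REDUCING = "diverted from its original domain to build another symbolic universe"
    DEFINING = "defines some of the basic quantities of the embedding theory"
    SCHEMATIC = "occurs in the interpretive schemes to describe the relevant measurements"
    SPECIALIZING = "exact substitute of the theory for a subclass of schemes"
    APPROXIMATING = "obtained as a limit of the theory for a subclass of schemes"

# Examples taken from the text; the kinds are not mutually exclusive.
examples = {
    ModuleKind.REDUCING: ("mechanics", "nineteenth-century ether theories"),
    ModuleKind.DEFINING: ("mechanics", "thermodynamics"),
    ModuleKind.SCHEMATIC: ("pressure", "schemes of a thermodynamic gas system"),
    ModuleKind.SPECIALIZING: ("electrostatics", "electrodynamics"),
    ModuleKind.APPROXIMATING: ("geometrical optics", "wave optics"),
}
```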

Thus we see that there are diverse ways in which the full exposition of a given theory calls for other theories. The choice of the word “module” is intended to convey metaphorically this diversity as well as the fact that the same theory can be a module of a number of different theories. For instance, classical mechanics is a module of electrodynamics, thermodynamics, quantum mechanics, general relativity, etc.; and it can be so in different ways. At any given time, any non-trivial theory has a modular structure, namely: it includes a number of modules of the above-defined kinds.

The modular structure of a theory is neither unique nor invariable. It depends on a number of factors: (1) the conception we have of this theory, (2) the type of experience that is conceivable at a given period of time, (3) the degree of elaboration of the theory, etc. As an example of the first factor, for some nineteenth-century physicists mechanics was a reducing module of electrodynamics; for phenomenologists it was only a defining module; for believers in the electromagnetic worldview, it was a schematic module. As an example of the second factor, approximating modules for the description of stochastic processes appeared in statistical mechanics only after the development of relevant experiments. As an example of the third factor, the boundary-layer approximating module of hydrodynamics appeared only at a late stage of its evolution, even though it concerned an old domain of experience.

This ambiguity and variability of modular structure may explain why philosophers of physics have paid little or no attention to it. This structure seems to elude any formal, rigorous epistemology. It seems too fleeting and too vague to embody the epistemic virtues that philosophers wish to find in physical theories. Against these impressions, it will now be argued that modular structure is essential to the application of theories, to their comparison, to their construction, and to their communication. These four aspects of theorizing activity will thus appear to be intimately related to each other. Modular structure has some sort of necessity: without it physical theories would remain paper theories.

The symbolic universe of a theory never applies directly to a concrete situation. The application is mediated through interpretive schemes that describe ideal devices and quantitative properties of these devices. In order to build a concrete counterpart of a scheme, we must know the correspondence between ideal device and real device, as well as concrete operations that yield the measured quantities. In any advanced theory, this correspondence obtains in a piecewise manner, through the modules involved in the schemes. The most superficial observer of a modern test of a theory cannot fail to notice the contrast between the simplicity of the theoretical statement to be tested and the complexity of the experimental setting. What enables physicists to make sense of this complexity is, for the most part, the modular structure of schemes.

The modular insertion of a previously known theory enables us to exploit the competence we have already acquired in applying this theory. This application may involve sub-modules and their schemes, and so forth until the concrete operations become so basic that their description can be expressed in ordinary language. Take the relatively simple case of mechanics. The schemes involve a geometric module, which one already knows how to realize by means of surveying with rigid rods (for example). This knowledge is essential in building the apparatus and realizing the relevant measurements. Other useful modules are those of kinematics and statics.

This is not to say that modules are all we need for realizing schemes. Non-theoretical knowledge is also needed on the part of the experimenter, and external theories may be involved in the functioning of the measuring apparatus. These complicating circumstances do not make modules less useful. On the contrary, they make us appreciate two additional virtues of modules. Firstly, the non-theoretical knowledge implied in the application of a given theory can be exploited in the application of any other theory that contains this theory as a module. Secondly, when two theories share the same module, the applications of one theory may benefit from the other theory in the measurement of modular quantities. For instance, electronics can be used in building galvanometers, relativistic mechanics in building oscilloscopes, and optics in measuring distances.

The modular structure of theories also affects the discussion of their refutability. The freedom in defining the schemes and the tacit knowledge involved in their concrete realization seem to leave plenty of room for protecting theories from refutation. In reality, the modular structure severely limits the protective strategies because it restricts the form of the schemes and because it tends to confine tacit knowledge in the application of well-understood modules. As long as experimental error and reasoning lapses can be avoided, the accommodation of adverse experimental results is made difficult. Surely there still is some sort of Lakatosian protective strategy: as long as no better alternative theory is available, physicists prefer to modify the symbolic universe or the non-modular components of the schemes. But the modules themselves usually remain untouched. Duhemian holism, or unrestricted “open-endedness,” does not occur in the actual practice of physics. The modular structure of theories gives them much more rigidity in their adaptation to the empirical world than some historians would have it.

The comparison of two theories obviously requires a non-vanishing intersection of their domains of application. In my terminology, this means that the two theories should share a subset of interpretive schemes. More exactly, the characteristic quantities for a subset of schemes in one theory should be the same for a subset of schemes of the other theory. This can only be the case if the characteristic quantities are defined through modules that belong to both theories. Once this condition is met, the predictions of the two theories are said to agree if and only if the laws of the two theories imply the same relations between the schematic quantities in the compared subsets of schemes. The physicists’ practice of comparison always involves schematic quantities defined by shared modules. Radical incommensurability is a philosophical fiction.

In rare cases, the two compared theories do not share basic defining modules such as Euclidean geometry or mechanics. This happens for instance when the predictions of classical and relativistic electron dynamics are compared, or when the predictions of Newton’s theory of gravitation are compared with those of general relativity. It would seem that in such cases the shared interpretive schemes could only involve pre-spatial and pre-mechanical observations about the coincidence of two small material objects or the emission and reception of light flashes. This very limited conception of interpretive schemes may in principle allow the comparison of the two theories, for it yields an idealized coordination between theory and simple concrete procedures. In practice, however, physicists never work on a tabula rasa devoid of Euclidean theory, Newtonian mechanics, and other pre-relativistic theories. Comparative schemes involve approximate, local use of these older theories in a complex manner that would deserve systematic study. At any rate, the astronomical tests of general relativity all involve earth-based or satellite-based instruments whose internal design requires earlier accepted geometry and optics, even though the tested spacetime relations are essentially non-Euclidean and non-Minkowskian.

The comparison between two theories may lead to the approximate inclusion of one theory into the other, also called reduction. In this case, the schemes of the reduced theory must correspond to a subset of those of the more general theory. The sharing of schematic modules is trivial, since by our definition of modules, the reduced theory is itself a module of the general theory. What is less trivial is the necessity of defining the schemes of the reduced theory. In a common misconception, the reduction of a theory to another is regarded as a mere limiting process involving a characteristic parameter of the more general theory (for instance c in relativistic mechanics, h in quantum mechanics) and some correspondence between the theoretical quantities of the two theories. In reality, one must introduce the schemes that define the domain of the reduced theory. Limits performed in the symbolic universe alone are ambiguous and lack definite empirical applicability.
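A standard illustration of such a formal limit (my example, not the author’s): the relativistic momentum of a particle goes over into the Newtonian expression when the velocity is small compared with \( c \),

$$ {\mathbf{p}} = \frac{m{\mathbf{v}}}{\sqrt{1 - v^{2} /c^{2}}} \;\longrightarrow\; m{\mathbf{v}} \qquad (v \ll c), $$

but this limit, taken in the symbolic universe alone, does not tell us through which schemes of velocity and momentum measurement the reduced theory is to be applied.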

From history we learn that theory construction is a very complex process, depending on diverse resources both internal and external to the investigated domain. This complexity of what Reichenbach called the “context of discovery” has often discouraged philosophers from finding any rationality in it. Yet a closer analysis of the practice of modern theoretical physics shows that the construction of theories is highly constrained and that at some stages it may proceed almost automatically, as if the plan were known in advance. Well-known constraints in theory construction are experimental laws and general principles such as the conservation of energy or the principle of least action. Less appreciated is the fact that the construction of a new theory always relies on earlier theories in specifiable ways. In other words, some anticipation of the modular structure of a theory efficiently guides its construction.Footnote 56

Most generally, theory construction depends on defining modules whose validity is assumed from the start. For example, the construction of Newtonian mechanics presupposed the module of Euclidean geometry; and the construction of electrodynamics presupposed the module of mechanics (at least in the definition of forces). Such defining modules occur both in the symbolic universe and in the interpretive schemes. They sustain our theoretical imagination in a concrete manner, in direct connection with measurement possibilities.

In a less universal and less concrete mode, theory construction may rely on reducing modules, as was for instance the case in Maxwell’s first derivation of his electromagnetic field equations. The analogy between magnetic phenomena and the rotational motion of a substance inspired William Thomson’s and Maxwell’s idea that the electromagnetic medium could be a connected system with internal rotations to be identified with the magnetic field. The consistent development of this idea led to Maxwell’s equations. Although Maxwell suppressed the mechanical model in the final version of his theory, he retained a broader principle of Lagrangian structure. This is only one example of a historical process in which a reducing module evolves into a general principle of a more abstract nature. Our theories are full of such vestiges of past modular reductions.Footnote 57

In the development of his mechanical model of the ether, Maxwell was also guided by his desire to integrate electromagnetic, electrostatic, and optical modules in the same theory. In this case modular structure played a double role: in founding a reductionist strategy, and in bringing together different partial theories as modules of a new theory. To sum up, reduction and unification are modes of theory construction that explicitly depend on modular structure. There are two kinds of reduction of a theory to another: one in which the second theory is a reducing module of the first, and another in which the first theory becomes an approximating module of the second. The unification of two or more theories is a process through which these theories end up being approximating or specializing modules of the same theory.

Theory construction also depends on the important modular constraint that the new theory should contain earlier successful theories as approximations. In my terminology, the earlier theories should be approximating modules of the newer theory. This constraint is usually called a correspondence principle, in reference to Bohr’s endeavor to construct quantum theory in a way that ensures the asymptotic validity of classical electrodynamics. In combination with some assumed symmetries or some general postulates, this principle may completely define the sought-after theory. This happened in the case of relativistic dynamics and in the case of quantum mechanics. In Bohr’s conception of the latter theory, the classical module is important not only in the construction of the symbolic universe but also in the definition of the interpretive schemes. Indeed for Bohr any measurement ultimately relies on classical modules.Footnote 58

The pursuit of modularity does not always bring progress. In some cases, theories that had long been used as defining or reducing modules must be thrown away or relegated to the humbler modular role of approximation. For instance, most nineteenth-century physicists regarded mechanical reduction as a legitimate and accessible aim for all physics. They were blinded by the success of early reductions of this kind. Toward the end of the century, the pragmatist or positivist convictions of a few leading physicists confined mechanics to the more modest function of a defining module. The downgrading of classical mechanics went on in the early twentieth century, when it appeared to be an approximating module of a more fundamental relativistic or quantum mechanics. Even the defining modules of Euclidean geometry and Galilean kinematics came under attack. They were ultimately replaced by and became approximating modules of the pseudo-Riemannian geometry of general relativity theory.

The lesson to be drawn from this evolution is that the modular structure of a theory should never be regarded as definitive. The most we can say is that any theory that has been successful in a given domain of physics is likely to remain, after adequate purification or reformulation, a module of future, more general theories. But its modular function may evolve in time. As we saw, classical mechanics once played the role of a defining or a reducing module. It remains a defining module in useful macroscopic theories. But it is only an approximating module for the most fundamental theories such as general relativity or quantum field theory. Theories all have modular structure. But the modular structures of successive theories bear partial resemblance only. Any stiffening of our modular habits could become an obstacle to further progress.

The modular structure of theories is essential to successful communication between practitioners belonging to different social groups. Physicists who belong to different local subcultures may adhere to different theories of the same domain. As Poincaré and Ludwig Boltzmann forcefully argued, this cultural diversity is usually beneficial to science, because it favors the exploration of a greater variety of symbolic universes and thus increases the chances of finding the one that best fits the widest domain of experience. This can happen only if communication is possible between the different subcultures. Maximal communication, in which the physicists of one subculture perfectly understand the theories of the other, almost never occurs. It is not even to be wished, because it would interfere with the creativity of each group. More commonly, the two groups communicate through interpretive or descriptive schemes that involve shared modules only. The shared modules are in part given, or they are constructed by a few “bilingual” individuals who labor for the easy communicability of science.Footnote 59

An instructive example is that of electrodynamics in the nineteenth century. British physicists favored a field-based approach; German physicists favored direct action at a distance. Yet these two communities were able to benefit from each other’s results and to compare the predictions of their theories. In part, this was possible because of their inheriting mechanical, electrostatic, magnetostatic, and electrodynamic modules from the same French sources (Coulomb, Poisson, Ampère). For the rest, William Thomson played a crucial role in designing modular concepts that could be used equally well by physicists and engineers in any country. For example, he defined the electric potential through the mechanical concept of energy, independently of any deeper interpretation in the competing symbolic universes. As a consequence or as a motivation, electrometers and other electrical apparatus could be traded between the two cultures, because the modules necessary to their use were made available on both sides.Footnote 60

As was just mentioned in the Thomson case, modules are also essential in the communication between physicists and engineers. Engineers almost never have a detailed knowledge of the deeper theories through which physicists would understand some aspects of their practice. Yet they are constantly benefiting from these theories because they master the modules that are sufficient for their own purposes. More generally, division of work necessitates modular communication between groups who have unequal access to deeper theory. In some domains of modern physics such as particle physics, there are separate subcultures of theorists, experimenters, and instrument makers. As Peter Galison has argued, the necessary communication between these various groups leads to the formation of “trading zones,” that is, virtual places of exchange in which the various protagonists can benefit from each other’s competences without ever acquiring all of them. Theoretical modules play a crucial role in this sort of trade. In some cases, physicists forge the modules just for this purpose. This does not mean, however, that modules are merely an arbitrary product of a social consensus formed in the trading zone. The structure they reflect is an inherent structure of the embedding theories, and it becomes part of our understanding of these theories.Footnote 61

Modularity is also important in the communication and understanding of theories within the same subculture of physicists. Physics courses and textbooks are divided into chapters that often correspond to approximating or specializing modules of the theory to be taught. For instance, a textbook of electrodynamics typically has chapters on electrostatics, electrokinetics, quasi-stationary electrodynamics, and electromagnetic radiation. Within each chapter, exemplars are given of interpretive schemes for which the consequences of the laws of the relevant modules can be fully worked out. Eventually, approximation methods are taught for dealing with systems that somewhat depart from these exemplars. The importance of exemplars in learning physics has already been emphasized by several authors including Thomas Kuhn, Ronald Giere, and Nancy Cartwright. What must be emphasized is that exemplars usually concern approximating or specializing modules of a theory rather than the whole theory, and that their treatment almost always involves defining and schematic modules that the students have already learned in other contexts.

Modularity may also be illustrative. Its purpose then is to feed intuition. Many British physicists of the nineteenth century believed that a theory could not be properly understood without illustrating some of its parts by other well-understood theories. They relied on illustrating modules, namely: reducing mechanical modules that worked for limited classes of interpretive schemes of the global theory. For instance, in “On Faraday’s Lines of Force,” Maxwell illustrated the electrostatic, magnetostatic, and electrokinetic modules of electrodynamics by means of the mechanics of resisted flow in a porous medium. He thus injected some life into the dry symbols of potential theory. His British contemporaries similarly liked to flesh out the equations of their theories by attaching them to partial reducing modules. This sort of fictitious concreteness is part of any understanding of theories. Besides its pedagogical virtue, it eases the mental associations through which a theory can evolve and fuse with other theories. Concrete illustrations are not there to replace theory; they are there to enliven theory.Footnote 62

The theoretical substructures introduced in this section bear a partial resemblance to neo-Kantian notions. Most evidently, the intertheoretical notion of approximation occurs in the aforementioned writings of Cassirer, Reichenbach, and Friedman. The concept of approximating module makes it more precise by specifying the common ground (shared interpretive schemes) that allows the comparison between two successive theories. The defining and schematic modules remind us of Reichenbach’s remark that the measuring devices for testing a new theory may rely on an older confirmed theory and of Cassirer’s remark that “the new form of theory contains the answer of questions proposed within the older form.” The defining modules overlap with some of Friedman’s constitutive principles (for instance Euclidean geometry in classical mechanics).Footnote 63

The interpretive schemes serve the same coordinating purpose as Tanona’s and Friedman’s “frames of reference.” However, they are much more general than Tanona’s notion (which is confined to the testing of kinematic relations in special relativity and to the Bohrian classicality of observing devices). Even in Friedman’s broader formulation, the “physical frames of reference” do not capture the evolving diversity of experimental tests conceivable within a theory. In some sense, Friedman’s notion is richer since it includes the technological and socio-cultural aspects of experimentation, whereas the interpretive schemes and the modular structure remain purely ideal, mathematically expressible constructs. However, these constructs in part respond to the same socio-technical challenges: interpretive schemes allow us to capitalize on the technologies developed for the testing of earlier or independent theories; and the modular structure of theory is in part an adaptation to the socio-institutional and cognitive needs of scientific communication.

The efficient application, comparison, construction, and communication of theories is a basic regulative demand. In a neo-Kantian perspective, modular structure may be regarded as a major regulative principle of empirical knowledge. As I have argued elsewhere, it offers a new perspective for answering basic questions of the philosophy of physics, including the nature of laws, the relation between theory and model, the unity or disunity of physical theory, and reduction and emergence.Footnote 64 It contradicts static, homogeneous, and isolating views of physical theory. As we will now see, it permits a new approach to the transcendental question of the a priori conditions of experience in physics.

3 The comprehensibility of nature

Constitutive principles, be they Cassirer’s forms of knowledge, Reichenbach’s principles of coordination, or Friedman’s relativized a priori, are meant to precisely define the structure within which the empirical laws of a given theory are formulated. They are highly specific and cannot be identified without prior knowledge of the theory whose object they are meant to constitute. By focusing on such principles, one gives up Kant’s ambition to derive the necessity of a theory (Newtonian mechanics in his time) from general, a priori identifiable conditions for the comprehensibility of nature. For Kant, these conditions corresponded to fixed, necessary properties of our cognitive apparatus. A more modest approach would be to define tentative conditions of comprehensibility that seem natural and even obvious at a given time of history but need to be relaxed at a later time, for instance forms of measurability and forms of causality. Could such conditions share the structuring power of neo-Kantian constitutive principles?

In my definition of physical theories, a symbolic universe and its laws are first given. This universe and attached modules enable us to conceive interpretive schemes which involve ideal measurements and serve as blueprints of experimental setups. The laws of the symbolic universe imply relations between the characteristic quantities of each interpretive scheme, and these relations can be tested experimentally. This conception of theories allows us to formulate the following question: in a given domain of physics, can we identify relevant interpretive schemes by idealizing concrete conditions of comprehensibility of this domain and then infer the symbolic universe and its laws from these conditions? In other words, can we move upward from the conditions of possibility of experience in a given domain to the theory (or theoretical framework)Footnote 65 of this domain? At first glance, this sounds like a foolish ambition, for history teaches us that the results of experiments—not just our ability to conceive them—are needed to design new theories. Yet the question is not so farfetched because in some cases, such as Einstein’s discovery of relativity theory, a priori conditions of measurability did play a significant role in the genesis of the theory. Moreover, there are cases in which already known theories have been a posteriori shown to derive from simple conditions of comprehensibility. We will first consider the paradigmatic case of physical geometry.Footnote 66

3.1 Physical geometry

One of the simplest physical theories, if not the simplest, is Euclidean geometry understood as the geometry that correctly predicts the properties of concrete figures at a reasonable human scale. Let us first see how this theory fits our general definition of theories. The systems of the symbolic universe are the subsets of a three-dimensional real affine space in which a distance is given in the mathematical sense (a positive real symmetric function of two points that vanishes if and only if the two points are equal and that satisfies the triangular inequality). The basic law of the universe may be taken to be the existence of a positive non-degenerate bilinear form from which the distance derives. In more concrete terms, there exist systems of coordinates \( (x,y,z) \) in the affine space such that the distance of a point from the origin is simply given by the Pythagorean formula \( \sqrt {x^{2} + y^{2} + z^{2} } \). We may now define interpretive schemes in which the systems (subsets of points) are figures made of straight lines (defined by the linear structure of the vector space) and circles, with traditional synthetic geometry in view. The characteristic quantities of these schemes are angles and lengths (one could add surfaces and volumes). The ideal measurement of a length is given by the transport of an invariant unit and its subunits thanks to the positive isometries (translations and rotations) of the Euclidean space, and the ideal measurement of an angle is given by the measure of the length of the associated arc of a unit circle. The Pythagorean law implies relations between the angles and the sides of a given figure. For instance, in the triangle ABC the length of BC is a well-defined function of the lengths AB and AC and the angle (AB, AC). A typical geometric experiment would be to measure the three segments and the angle and to verify the theoretical relation between them.
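The triangular relation alluded to is the familiar law of cosines; writing \( \widehat{A} \) for the angle (AB, AC), it reads

$$ {\text{BC}}^{2} = {\text{AB}}^{2} + {\text{AC}}^{2} - 2\,{\text{AB}} \cdot {\text{AC}}\,\cos \widehat{A} , $$

of which Pythagoras’ theorem is the special case of a right angle at A.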

Geometry sounds artificial in this form, and the empirical verification of triangular relations seems to be in need of some explanation. An axiomatic, neo-Euclidean reformulation of the theory would not help much, because we would not know the rationale behind the axioms from which Pythagoras’s theorem and the triangular relations would now derive. An alternative route, opened by Helmholtz in the late 1860s, consists in inquiring about the meaning of space measurement before any relevant mathematical structure is given. Following Helmholtz, let us assume that geometry is about measuring distances by means of some gauge. For instance, we may count the minimum number of steps needed to go from one point to another; or, better, we may do the same with a rigid rod. The success and non-ambiguity of this procedure entail the following assumptions for the class of rigid rods:

(1) For any two rods, if an extremity of the first rod is kept in contact with an extremity of the second, the other extremity of the first rod can be brought in contact with at most one point of the second (no plasticity or elasticity).

(2) If coincidence can be obtained in one place and at one time between a pair of points of one rod and a pair of points of the other, this coincidence will be possible at any other place and time, no matter how variously and differently the two rods have traveled before meeting again (free mobility and stability).

This definition does not entail any prior concept of distance. It permits a direct empirical test of rigidity and free mobility. Of course, there are infinitely many classes of rigid rods according to this definition. Rods obtained by subjecting the rods of a given class to a dilation that depends only on their location will form another class of rigid rods, whatever the dilation law may be. For instance, in a thermostatic universe with heterogeneous temperature, iron rods and copper rods define two distinct classes of approximately rigid rods.Footnote 67

Once congruence has been defined by means of a class of rigid rods, the distance between two points (that is, two small objects) can be measured by means of chains of unit rods. At the precision of the unit, the distance is given by the minimal number of links of a chain joining the two points. This distance measurement can be refined by using smaller and smaller unit rods. In common practice, a sequence of subunits is used such that the lengths of two consecutive subunits differ by a factor ten (for instance). The outcome of the measurement is a decimal number whose last digit corresponds to the last subunit whose extremities can be distinguished.
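In symbols (my notation): if the chain requires \( n_{0} \) whole units and \( n_{k} \) subunits of the k-th order, the recorded distance after \( p \) successive refinements is

$$ d \approx \left( n_{0} + \frac{n_{1}}{10} + \cdots + \frac{n_{p}}{10^{p}} \right) u , $$

where \( u \) is the length of the unit rod and the last retained digit corresponds to the smallest subunit whose extremities can still be distinguished.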

So far, we have considered concrete objects and operations that can be realized in an approximate manner only. We may now leave the empirical world and take our flight to a mathematical set-theoretical world in which the properties (1) and (2) of rigid rods hold exactly, and the sequence of subunits can be pursued indefinitely. The usual sets of natural, rational, and real numbers can thus be engendered in harmony with the geometer’s needs. Whether or not geometry truly motivated the historical introduction of these mathematical constructs, it is important to recognize that any theory of measurement requires these constructs or similar non-standard ones.

We now know how to measure distances with arbitrary precision. Suppose there exist three points A, B, and C whose mutual distances are found to be invariable. We know by experience that except for singular cases, the location of any fourth point within a sufficiently small domain is determined by its distances from these three points. In other words, the location is determined by three coordinates. Moreover, the distance of a variable point M from a fixed point O varies linearly under small increments of its coordinates, except when M is originally at O. In the latter case, the variation cannot be linear since this would allow the distance between M and O to vanish without their coordinates being equal. This variation must nonetheless be a homogeneous function of first degree of the coordinate increments, because for a sufficiently small unit of length, a (reasonable) unit change implies a multiplication of all measured distances by the same constant.
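In symbols (my notation): when M starts at O and is displaced by small coordinate increments \( \delta x_{1} , \delta x_{2} , \delta x_{3} \), the measured distance takes the form

$$ {\text{OM}} = f(\delta x_{1} , \delta x_{2} , \delta x_{3} ), \qquad f(\lambda\,\delta x_{1} , \lambda\,\delta x_{2} , \lambda\,\delta x_{3} ) = \lambda\, f(\delta x_{1} , \delta x_{2} , \delta x_{3} ) \quad (\lambda > 0), $$

which expresses the first-degree homogeneity invoked in the text; away from O the variation is simply linear in the increments.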

These conclusions are only valid to a certain approximation, given by the precision of the distance measurement. Again we may jump to the ideal, mathematical level in which coordinates are sharply defined as real numbers. At this level, the distance OM should be a differentiable function of OA, OB, and OC whenever M differs from O. This implies the differentiability of any change of coordinates resulting from a different choice of the reference points A, B, and C. The resulting mathematical concept is a differentiable manifold in Bernhard Riemann’s original sense, endowed with a metric that is not necessarily of the locally Euclidean form.Footnote 68
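In modern notation (again mine), the quadratic, locally Euclidean case of such a metric is the familiar line element

$$ {\text{d}}s^{2} = \sum\limits_{i,j = 1}^{3} g_{ij} (x)\,{\text{d}}x^{i} \,{\text{d}}x^{j} ; $$

Riemann’s original conception also admits metrics in which \( {\text{d}}s \) is a more general homogeneous function of first degree of the coordinate differentials, and it is the considerations that follow which single out the quadratic case.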

In order to further restrict the form of the metric, we need some additional condition. By experience we know that the position of a point M with respect to three rigidly connected reference points A, B, and C is completely determined by its distances from these three points. This implies that the distance between two points M’ and M” is a function of their distances from the reference points. This function of course depends on the choice of the reference points. By experience we know that it only depends on the mutual distances of these reference points, as long as none of the involved distances is exceedingly large. This fact can be regarded as a precise expression of the homogeneity of space over moderate distances, since it means that the same relations between all measured distances can be used in surveys performed in a not too large domain.

This local homogeneity implies the existence of rigid bodies in the following sense: the distances between any number of points remain the same when their distances to three points A, B, C and the distances between these three points are kept constant. In other words, there exist transformations that preserve the mutual distances of any number of points. The rigid bodies defined in this manner enjoy sextuple free mobility, since the choice of three new reference points A’, B’, C’ such that AB = A’B’, AC = A’C’, BC = B’C’ has six degrees of freedom (nine coordinates minus three constraints).

The assumption of freely mobile rigid bodies allows us to define an angle as a rigid connection of two straight segments of arbitrary length with a common extremity. The addition or difference of two angles is defined by making these two angles share one of their sides, and taking the maximal angle made by the two remaining sides. The right angle is the angle that makes a flat angle (a single straight line) when added to itself.

Call d the length of the hypotenuse of a right triangle, x the length of one of its shorter sides, and α the (positive or negative) angle between this side and the hypotenuse. The intersection between two lines being unique (at least locally), and there being only one line perpendicular to a given line through a given point, the length d is a function of α and x only. This function vanishes for \( x = 0 \), and it must be differentiable with respect to x because the space manifold is differentiable. Therefore, d is a linear function of x for small triangles. This means that the ratio between one of the shorter sides of a small right triangle and its hypotenuse is completely determined by the angle that they make.
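In symbols (my notation): for sufficiently small triangles,

$$ d = x\,g(\alpha ), $$

so that the ratio \( x/d = 1/g(\alpha ) \) depends on the angle \( \alpha \) alone; this is the property used in the proof that follows.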

Owing to the free mobility of rigid bodies, this ratio cannot be altered by rigid displacement of the triangle. Now take a look at Fig. 1. The former theorem implies AH/OA = OA/AB as well as BH/OB = OB/AB, whence \( {\text{OA}}^{2} = {\text{AH}} \cdot {\text{AB}} \) and \( {\text{OB}}^{2} = {\text{BH}} \cdot {\text{AB}} \). Since AB = AH + BH, summing these two relations gives

$$ {\text{AB}}^{2} = {\text{OA}}^{2} + {\text{OB}}^{2} , $$

which is Pythagoras’ theorem. As is well known, Euclid’s axioms support the same proof. The reason is that his “common notions” contain the assumption of freely mobile rigid bodies, and his “postulates” turn the local (small-scale) validity of the theorem into a global one.

Fig. 1 Figure for a proof of Pythagoras’ theorem

In a two-dimensional space, the validity of Pythagoras’ theorem for small triangles would imply the validity of Euclidean geometry at a sufficiently small scale and the Riemannian character of space at large scale. Other considerations are needed to extend this result to three dimensions. They are found in Helmholtz’s original memoir. For our present concern, the two-dimensional case is sufficient to illustrate the power of a simple consideration of measurability.Footnote 69

To sum up, the existence of a class of (small) freely mobile rigid rods leads to a consistent surveying technique. Although the choice of this class is largely conventional, it must meet certain empirical conditions such as the criteria (1) and (2). Together with the observed smoothness and local homogeneity of the space manifold, the existence of freely mobile rigid rods leads to the existence of freely mobile, approximately rigid bodies, from which the Riemannian character of physical geometry follows. The further determination of this geometry depends on the class of rigid bodies that has been conventionally adopted. For a given convention, systematic surveys determine the metric of the Riemannian manifold.

This approach involves a gradual sharpening of elementary spatial notions. Ultimately, it leads to a highly idealized theory in which the thickness of material points, errors in the appreciation of congruence, the imperfect rigidity of rods, and the rationality of the numbers yielded by actual measurements are ignored in order to permit exact relations and deductions. The outcome can be considered as a purely mathematical theory, as an abstract set-theoretical construct whose properties no longer depend on experience. However, the empirical motivation of this idealized geometry explains its success as a physical theory. The most important assumption, the existence of freely mobile rigid bodies, depends on a kind of invariance assessed in our experience of the world: congruence among a certain type of objects is constrained and reproducible at sufficiently small scale. In concrete form, this assumption is the basis of our physical concept of space. In idealized form, it is the basis of the Riemannian concept of space. Although mathematicians are free to imagine more general concepts of space, the locally Euclidean property is essential to the measurability of physical space.

3.2 Statics

Statics is the theory that gives the conditions of equilibrium of simple mechanical systems called connected systems, made of levers, pulleys, threads, etc. in permanent rolling or sliding contact and subjected to a given set of forces. The symbolic universe of this theory involves rigid bodies, inextensible threads, and incompressible fluids whose possible configurations are defined by means of a Euclidean geometric module and constrained by the contact conditions. It also involves a vector space of forces. The fundamental law of this universe is the principle of virtual work:

If \( {\mathbf{F}}_{\alpha } \) denotes the force acting on the material point \( \alpha \) of the system, the system is in equilibrium if and only if \( \sum {{\mathbf{F}}_{\alpha } \cdot\updelta{\mathbf{r}}_{\alpha } = 0} \) for any possible displacement \( \updelta{\mathbf{r}}_{\alpha } \) of the material points (called virtual displacement).

The forces involved in this statement do not include the internal contact forces used in elementary expositions of statics. They may include internal forces of elastic, gravitational, or electric origin. An interpretive scheme of the theory is a choice of a system together with characteristic quantities that are geometric configuration parameters and applied forces. The ideal measurement of the geometric parameters is dictated by the geometric defining module of the theory, and the ideal measurement of forces is given by the equilibrium condition of a simple mechanical system used as a comparator of forces. For example, we may use the pulley-thread device of Fig. 2. The force to be measured is applied to one end of the thread and it is balanced by a force that is produced by multiples of a unit force and its subunits applied at the other end of the thread (concretely the latter forces may correspond to calibration weights on a pan suspended from the thread). The direction of the measured force is given by the direction of the thread that it pulls, and its intensity is given by the number of balancing units and subunits as a trivial consequence of the principle of virtual work. A possible experiment under this scheme involves the measurement of the forces acting on the system and the verification of the condition of equilibrium.

Fig. 2 The pulley-thread comparator of forces
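A minimal illustration of such a verification (my example): for a rigid lever with arms of lengths \( a \) and \( b \), pivoted at O, with forces of intensities \( F_{1} \) and \( F_{2} \) applied perpendicularly at its two ends so as to turn it in opposite senses, a virtual rotation \( \updelta \theta \) displaces the points of application by \( a\,\updelta \theta \) along the first force and by \( b\,\updelta \theta \) against the second, so that the principle of virtual work gives

$$ F_{1}\, a\,\updelta \theta - F_{2}\, b\,\updelta \theta = 0 \quad \text{for every } \updelta \theta , \qquad \text{hence} \quad F_{1}\, a = F_{2}\, b , $$

the familiar law of the lever.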

Just as we did in the case of geometry, we may now reverse the logic and try to infer the symbolic universe and its laws by idealizing the concrete measurement conditions. In this approach, the balancing of forces through a pulley-thread system concretely defines their direction and their intensity, without any prior mathematical concept of force. This concrete definition leads to the real vector space of forces through the idealization of indefinitely precise measurement, just as concrete length determination leads to real-number lengths. The consideration of concrete, well-built connected systems leads to the following ideal conditions:

(1) For any infinitesimal change of configuration compatible with the kinematic constraints (virtual displacement) in which the position of the material point \( \alpha \) changes by \( \updelta{\mathbf{r}}_{\alpha } \), the opposite change \( -\updelta{\mathbf{r}}_{\alpha } \) is also possible.

(2) Arbitrarily small forces acting in the direction of mutually compatible displacements suffice to break equilibrium.

The first condition implies the permanence of contacts between rigid bodies. The second excludes solid friction (caused by roughness, for example). Building on reasoning given by Joseph Louis Lagrange in 1798, we will now prove that the principle of virtual velocities derives from the impossibility of perpetual motion combined with these two conditions and with the pulley-thread definition of force.

The basic idea is to synthesize the forces \( {\mathbf{F}}_{\alpha } \) through a set of tackles, a single rope running through them, and a weight. The simple tackle of Fig. 3 yields the force 2F under the tension F of the rope. Indeed if the force acting on the axis of the pulley differed from 2F, a perpetual motion could be generated by connecting this axis and the two ends of the rope to the same rigid frame. Similarly, the triple tackle of Fig. 4 yields the traction 4F, and so forth. As the intensities \( F_{\alpha } \) can all be regarded as even multiples \( 2N_{\alpha } F \) of the same small intensity F (with a precision increasing with the smallness of F), they can be generated by a properly arranged set of tackles through which the same rope runs (see Fig. 5). The tension F of the rope is produced by a weight W.Footnote 70
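The arithmetic behind this synthesis can be made explicit in a short sketch of my own (hypothetical target intensities): each intensity is replaced by the nearest even multiple \( 2N_{\alpha } F \) of the rope tension F, and the approximation error shrinks as F is chosen smaller.

```python
# A sketch of the arithmetic behind the synthesis (hypothetical target intensities):
# each intensity F_alpha is approximated by the nearest even multiple 2*N_alpha*F
# of the small rope tension F, and the error shrinks as F is made smaller.

def tackle_factors(intensities, F):
    """For each target intensity, return the tackle factor N_alpha and the
    intensity 2*N_alpha*F actually produced by that tackle under tension F."""
    result = []
    for Fa in intensities:
        N = round(Fa / (2.0 * F))      # choose N_alpha so that 2*N_alpha*F ~ F_alpha
        result.append((N, 2 * N * F))
    return result

targets = [3.7, 5.2, 11.0]             # intensities to be synthesized (hypothetical)
for F in (1.0, 0.1, 0.01):
    produced = [p for _, p in tackle_factors(targets, F)]
    print(F, produced)                 # the produced intensities approach the targets as F -> 0
```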

Fig. 3. Simple tackle

Fig. 4. Triple tackle

Fig. 5. Lagrange’s contraption for a proof of the principle of virtual velocities in the case of a lever. The forces \( {\mathbf{F}}_{1} ,\,\,{\mathbf{F}}_{2} ,\,\,{\mathbf{F}}_{3} \), acting on the lever AOB, are produced by three rigidly held tackles through which a single rope runs from the anchor K to the suspended weight W

The virtual displacement \( \updelta{\mathbf{r}}_{\alpha } \) of the material point α on which the force \( {\mathbf{F}}_{\alpha } \) is acting induces a shift \( 2N_{\alpha } ({\mathbf{F}}_{\alpha } /F_{\alpha } ) \cdot\updelta{\mathbf{r}}_{\alpha } \) of the rope, as an obvious consequence of the makeup of the tackles. The resulting shift of the end of the rope is \( F^{ - 1} \sum {{\mathbf{F}}_{\alpha } \cdot\updelta{\mathbf{r}}_{\alpha } } \). If there exists a virtual displacement such that \( \sum {{\mathbf{F}}_{\alpha } \cdot\updelta{\mathbf{r}}_{\alpha } \ne 0} \), this displacement or the opposite displacement (warranted by the property (1) of connected systems) is such that \( \sum {{\mathbf{F}}_{\alpha } \cdot\updelta{\mathbf{r}}_{\alpha } > 0} \). The weight W therefore pulls the rope in the direction of a possible displacement. According to the property (2) of connected systems, the rope must move no matter how small this weight is. Therefore, the system is not in equilibrium. By contraposition, the virtual work \( \sum {{\mathbf{F}}_{\alpha } \cdot\updelta{\mathbf{r}}_{\alpha } } \) must vanish for the system to be in equilibrium.
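The bookkeeping of this paragraph can be checked numerically. The following sketch (hypothetical numbers of my own) compares the total shift of the rope end computed tackle by tackle with the closed expression \( F^{ - 1} \sum {{\mathbf{F}}_{\alpha } \cdot\updelta{\mathbf{r}}_{\alpha } } \); the two agree whenever the intensities are exactly of the form \( 2N_{\alpha } F \).

```python
# A numerical check of the rope-shift bookkeeping (hypothetical numbers): the
# shift of the free end computed tackle by tackle, sum over alpha of
# 2*N_alpha*(F_alpha/|F_alpha|).delta_r_alpha, coincides with
# (1/F) * sum over alpha of F_alpha . delta_r_alpha
# whenever the intensities are exactly even multiples 2*N_alpha*F of the tension.
import numpy as np

F = 0.5                                          # rope tension (one small unit)
forces = np.array([[3.0, 0.0], [0.0, -2.0]])     # forces F_alpha, multiples of 2F by construction
dr = np.array([[0.01, 0.02], [-0.03, 0.01]])     # a virtual displacement delta_r_alpha

intensities = np.linalg.norm(forces, axis=1)
N = intensities / (2 * F)                        # tackle factors N_alpha (exact integers here)
unit_vectors = forces / intensities[:, None]     # F_alpha divided by its intensity

shift_from_tackles = np.sum(2 * N * np.sum(unit_vectors * dr, axis=1))
shift_from_formula = np.sum(forces * dr) / F

print(shift_from_tackles, shift_from_formula)    # the two numbers coincide (both ~0.02 here)
# If this shift can be made positive for some admissible delta_r, the suspended
# weight descends and the system is not in equilibrium.
```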

Reciprocally, the vanishing of \( \sum {{\mathbf{F}}_{\alpha } \cdot\updelta{\mathbf{r}}_{\alpha } } \) for any virtual displacement implies equilibrium. We will prove this by reductio ad absurdum. Suppose that the system is not in equilibrium under the forces \( {\mathbf{F}}_{\alpha } \). Then equilibrium can be restored by applying additional forces \( {\mathbf{X}}_{\alpha } \) directed against the initial displacements \( {\text{d}}{\mathbf{r}}_{\alpha } \) of the material points α. Otherwise, the same weight W and the same resulting displacements \( {\text{d}}{\mathbf{r}}_{\alpha } \) could be used to lift arbitrarily heavy weights through a pulley-rope mechanism, and perpetual motion would be possible. On the one hand, the counterbalancing forces \( {\mathbf{X}}_{\alpha } \) satisfy the inequality \( \sum {{\mathbf{X}}_{\alpha } \cdot {\text{d}}{\mathbf{r}}}_{\alpha } < 0 \). On the other hand, the restored equilibrium requires \( \sum {({\mathbf{F}}_{\alpha } + {\mathbf{X}}_{\alpha } ) \cdot {\text{d}}{\mathbf{r}}}_{\alpha } = 0 \). Together, these two relations imply \( \sum {{\mathbf{F}}_{\alpha } \cdot {\text{d}}{\mathbf{r}}_{\alpha } } > 0 \), which contradicts the assumed vanishing of the virtual work, since the actual displacements \( {\text{d}}{\mathbf{r}}_{\alpha } \) count among the virtual displacements.

Thus we see that a very powerful principle of statics simply results from idealized definitions of forces and mechanical systems and from the impossibility of perpetual motion. The definitions serve to delimit the domain of the theory, and they involve considerations of measurability directly in the case of forces and indirectly in the case of spatial relations. Measurability evidently is a condition for the quantitative comprehension of the world. The impossibility of perpetual motion also has to do with the comprehensibility of the world, because it implies a weak kind of causality: the impossibility of spontaneous changes (such as the rise of a weight) without any compensatory change in the environment.

3.3 Kinds of comprehensibility

The arguments given above are rational inasmuch as they carry the conviction that the only conceivable geometry at a moderate scale is Euclidean geometry, and the only theory of mechanical equilibrium is the one based on the principle of virtual work. They are not purely rational because the premises of the reasoning are empirically fallible. For instance, the measurement of space through the congruence of rigid bodies or the measurement of force through a pulley-thread mechanism may fail when the scale of measurement is too small or too large. In fact, we know from general relativity that purely spatial measurement only makes sense at sufficiently small scale (there cannot be any extended rigid body or frame), and we know from quantum theory that the laws of classical mechanics fail at sufficiently small scale. So all we can prove is that simple conditions of comprehensibility that seem natural in a given domain of experience may determine the theory appropriate to this domain.

With the same caveat, we may derive several important theories (theoretical frameworks) from comprehensibility conditions. Newtonian mechanics derives from the measurability of space and forces, Galilean relativity, a causality principle that relates any alteration of motion to the action of forces, and the secular principle that requires motion at the macroscopic scale to be independent of microscopic fluctuations of the applied forces. Thermodynamics derives from the impossibility of perpetual motion and from the uniqueness of thermodynamic equilibrium. Relativistic mechanics derives from the measurability of space and time, the relativity principle, and the correspondence with Newtonian mechanics for moderate velocities. The pseudo-Riemannian spacetime of general relativity derives from the measurability of space and time in an optically controlled manner. Alternatively, the broader Weylian structure of spacetime can be derived from geodesy based on light rays and free-falling particles. Our main classical field theories, including electromagnetic theory and general relativity, derive from the Faradayan principle that the field action can only depend on properties that can be tested by point-like particles. The necessity of the Hilbert-space structure of quantum mechanics derives from natural assumptions about the statistical correlations of measurements performed on an individual system. In simple cases, the associated dynamics derives from the principle of correspondence.Footnote 71

Altogether, we see that the deductions involve three kinds of comprehensibility. Firstly, we may require the measurability of some basic quantities. A broad consequence of this requirement is the relevance of mathematical analysis in the formulation of physical theories. More detailed consequences depend on the type of quantity and on the way in which the measurement process is idealized. Space measurement by rigid bodies leads to the locally Euclidean character of space; time measurement by inertial motion, together with space measurement, to the locally Minkowskian structure of spacetime; space and time measurement by light signals and free-falling particles leads to a Weyl spacetime. Field measurement by point-like particles leads to the accepted classical field theories. Considerations of measurability often go hand in hand with a requirement of objectivity: measurements performed by different observers or with different conventions should be interrelated in a consistent manner. The principle of relativity expresses this sort of objectivity.

The second kind of comprehensibility rests on varieties of causality. The broadest variety is the stability of statistical correlations between measurements performed on the same system. This is the variety admitted for quantum systems; combined with natural assumptions on the type of correlations, it leads to the Hilbert-space structure of states. In the classical case, we require a stricter kind of causality according to which the same cause creates exactly the same effect in similar circumstances. In addition, we may require secular average effects at a given scale to be unaffected when the causes fluctuate at a finer scale: this is the principle of secularity, which can be used together with the previous principle for deriving the mechanical law of acceleration. We may also assume the weaker kind of causality implied in the principle of the impossibility of perpetual motion. This principle not only contributes to a proof of the necessity of classical mechanics, but it also helps justify the energy principle and the first principle of thermodynamics without appealing to mechanical reduction. The uniqueness of thermodynamic equilibrium may also be seen as a kind of causality since it requires the uniqueness of the macrostate of a system under given macroscopic circumstances. The second principle of thermodynamics, expressed as the impossibility of spontaneous heat flow from a cold body to a hot body, derives from the uniqueness of equilibrium if we regard the state of equal temperatures as an equilibrium state.

The third kind of comprehensibility rests on the applicability of correspondence principles. Features of a theory T are thereby derived from its agreement with a theory T’ known to be (approximately) true in a restricted domain of experience. This agreement implies the existence of sub-theories that are approximations of the theory T, as well as the identity or the equivalence of one of these sub-theories with the theory T’. In combination with other comprehensibility arguments, a correspondence argument can be sufficient to construct a new theory. The strength of the construction depends on the quality of the other arguments and on the necessity of the restricted theory. The latter may be established empirically, or it may itself derive from comprehensibility arguments. Both circumstances are met in the case of non-relativistic mechanics as a correspondence-basis for relativistic mechanics, and in the case of classical mechanics as a correspondence-basis for quantum mechanics.

As was announced at the beginning of this section, the formulation and the deployment of comprehensibility arguments require three features of the definition of theories given in the preceding section: the interpretive schemes, the schematic modules, and the approximating modules. Idealized measurability conditions are proto-theoretical schemes that imply definitional modules. Correspondence arguments rely on the notion of approximating module. Although causality principles may be formulated without much theoretical substructure, their effective deployment relies on measurement and correspondence principles that presuppose a modular structure.

3.4 Coordination and constitution

My definition of physical theory involves a symbolic universe, its laws, and interpretive schemes. The laws serve to restrict the possible states and evolutions of the symbolic universe. By themselves they do not have any concrete value and they could very well be included in the definition of the symbolic universe. Their extraction from this definition is a matter of convenience. The interpretive schemes, not any law or principle, are responsible for the coordination between formalism and concrete experiments. Unlike coordinating principles, the interpretive schemes are not an invariable component of the theory: they follow the history of its applications. The symbolic universe and its laws control their form without dictating it. Their definition and their deployment require a modular structure which itself evolves in the history of applications of the theory. The interpretive schemes are blueprints for possible experiments in which the correlation between various modularly defined quantities is tested or exploited.

Being variable and partly contingent, the interpretive schemes cannot replace Friedman’s coordinating principles in defining a constitutive a priori component of a theory. However, the coordination by means of schemes and modules enables comprehensibility arguments from which, in a given domain of experience, the entire structure of the relevant theory may derive. The comprehensibility conditions offer a workable definition of a relativized a priori. They proceed from broad and natural assumptions on the expected or desired regularity of the world, and they vary when the domain of inquiry is changed or extended. These assumptions include kinds of causality, measurability, and correspondence. They are not components of the theory in its received form (as Friedman’s constitutive principles would be), and they do not originate in the challenging character of former empirical laws (as some of Friedman’s coordinating principles do). They are the constitutive form that the regulative ideas of causality, measurability, and correspondence take in a given experimental context. Both the regulation by reason and the emphasis on measurability remind us of Cassirer’s pronouncements, although he lacked the modular means to discuss the implications of measurability in a given domain before knowing the theory of this domain. As for the correspondence idea, it is important in every neo-Kantian rationalization of the transition from one system of constitutive principles to the next.

Regarding this last point, there is an interesting contrast between comprehensibility arguments and constitutive principles. For the most part, Cassirer, Reichenbach, and Friedman first obtain their constitutive principles by inspection of a standard sample of theories (Newtonian physics, special relativity, and general relativity); afterwards they introduce intertheoretical considerations in order to rationalize changes of the constitutive principles. Friedman compares these changes to Kuhn’s revolutions: they involve global, holistic changes of our way of conceptualizing the world and they challenge our ability to rationally interpret the evolution of science. In the approach defended in this essay, the intertheoretical relations of modular structure come first, as a most basic requirement of the applicability and communicability of theoretical knowledge; afterwards this structure is used to express comprehensibility conditions that show the necessity of some of our best theories. The relevant picture of scientific change differs essentially from Kuhn’s idealized historiography: it involves much substructure within Kuhn’s alleged theoretical wholes, and much modular continuity during Kuhn’s alleged revolutions, despite discontinuity in basic theoretical concepts and principles.Footnote 72

Thanks to the modular substructure, comprehensibility conditions can be expressed precisely and mathematically in given domains of physics. In favorable cases, they uniquely determine the theory (theoretical framework) of this domain. Their structuring power is stunning, already in the simple examples given in this essay, and even more so in the case of advanced physical theories such as general relativity and quantum mechanics. How could vague regulative ideas of causality, measurability, and correspondence generate so much? The modular structure enables us to define idealized systems in a given domain, to formalize ideal conditions of measurability, and to express conditions of correspondence with earlier theories. These symbolic elements and constraints, combined with forms of causality, turn out to be sufficient to determine the symbolic universe of a theory in favorable cases. In sum, the regulative ideas receive sharply constitutive counterparts when expressed in a given context of measurement.

Comprehensibility conditions differ from neo-Kantian constitutive principles by their naturalness and their elementary character. Real numbers, manifolds, tensor structure, or Hilbert spaces are not pre-assumed; they are generated by the comprehensibility conditions. These conditions are natural in the sense that any physicist living in present times would readily admit them to be valid in the domain of the theory they generate. Admittedly, this kind of naturalness is remote from the self-evidence of Kant’s constitutive a priori, and it would be difficult to give it a precise philosophical definition. In the best cases, it is nonetheless evident that the comprehensibility conditions are simpler and more easily accepted than the textbook definition of the theory they imply. For instance, the existence of freely mobile rigid bodies (at a reasonable scale) has a kind of empirical immediacy that the axioms of Euclidean geometry do not have. For Newtonian mechanics, the measurability of space, time, and forces, strict causality, and the principle of secularity are easier to admit than Newton’s laws of motion.

Comprehensibility conditions thus differ from constitutive principles extracted from the usual formulation of theories. The two notions are nonetheless related, because the comprehensibility conditions are easily seen to imply the constitutive principles. Also, comprehensibility conditions share a fundamental feature of Reichenbach’s and Friedman’s constitutive principles: they generate framework-theories in which additional empirical laws need to be inserted. For instance, in Newtonian mechanics the comprehensibility conditions leave the choice of the laws of force entirely free; in general relativity, the expression of the energy–momentum tensor of matter and radiation remains free. That said, the distinction between comprehensibility conditions and additional empirical laws seems sharper than Friedman’s distinction between constitutive principles and properly empirical laws because the comprehensibility conditions are by definition the unique, distinctive characteristic of a given domain of physics.

The claim that some of our most successful theories can be derived from natural conditions of comprehensibility would be enormous if it implied our ability to discover physical theories without consulting experiments. In reality, the comprehensibility conditions, no matter how natural they may seem at a given time of history and in a given domain of experience, were hard conquests of empirically motivated inquiries; and the conquest is never definitive: the conditions tend to be relaxed when the domain of experience is enlarged to include extreme scales.Footnote 73 For instance, the basic demand of quantitative description through measurable quantities was not generally admitted in physics until the second half of the eighteenth century; and the classical expression of causality by strict correlation had to be downgraded to a merely statistical causality in the quantum domain. All we can say is that some of our best theories, no matter how empirically and culturally conditioned their genesis was, can retrospectively be seen as resulting from simple requirements for the comprehensibility of the world. These requirements are refutable since their consequences are so. Simple though they are, their validity remains domain-dependent.

In any conception of the relativized a priori, we want to understand the transition between two successive versions of the constitutive apparatus. Cassirer appeals to an increasing de-reification of physics under the regulative banner of systematic unity, Reichenbach to an inductive selection between possible sets of constitutive principles, and Friedman to a philosophical meta-framework guiding the transition toward new constitutive principles. In the view adopted in this essay, the modular structure of physical theory brings much rational continuity to the evolution of physics. But it does not tell us how to change the comprehensibility conditions. Here is a hint: once these conditions are known for a received theory, try to relax them so as to obtain a more general and more unified theory.

Yet it would be naive to think that physicists invent new theories just by brooding over the comprehensibility of the world. It is equally naive to have Cassirer’s de-reification or Friedman’s philosophical meta-frameworks by themselves govern the transition to a deeper theory. As Friedman emphasizes in his more recent writings, these philosophical considerations are only one component of a complex historical process that has conceptual, experimental, social, institutional, technological, and cultural dimensions. Similarly, comprehensibility considerations are only one element of the complex heuristics that historically produced fundamentally new theories. But this element was important, sometimes even crucial. For instance, the impossibility of perpetual motion was repeatedly used to derive laws of statics in the history of mechanics; Newton implicitly used the principle of secularity (decoupling of scales) in his derivation of the laws of mechanics; most evidently, an analysis of the preconditions of space and time measurement informed Poincaré’s and Einstein’s theories of relativity.

A Kantian or neo-Kantian analysis of physical theories may serve three different purposes: (1) identify the constitutive apparatus of prominent theories, (2) determine this apparatus in a given domain by rational, a priori means, (3) explain the transition between successive theories. Kant did (1) and (2) for Newtonian physics and could not worry about (3) since no alternative physics was yet in view. Cassirer, Reichenbach, and Friedman did (1) and (3) in different manners, but gave up on (2) because the motors of change they identified for (3) (regulative principles for Cassirer, self-evidence and normal induction for Reichenbach, a philosophical meta-framework for Friedman) were not by themselves sufficient to generate the theory of a given domain. Presumably, the refutation of Kant’s answer to (2) made them suspicious of any attempt to justify a given theory by a priori means. Comprehensibility principles allow (1) and (2), and also contribute to (3) when combined with an ideal of increased systematic unity. Considered by themselves, they are quite similar to Cassirer’s regulative principles. Their much higher constructive power results from their being combined with a precise coordinating structure made of modules and interpretive schemes. The deductions they permit are moderately rational, for they rely on a refutable, domain-dependent naturalness. Yet they successfully convey the inevitability of some of our best theories.