1 Introduction

This paper is a reading of the ‘history’ of different manifestations of equivalence in mathematical texts. In modern-day mathematics, the notion of equivalence is captured by the two inseparable notions of equivalence relation and equivalence class. However, when it comes to history, the connection, and even the coexistence, of the two notions is not as clear as a mathematician might like to see. The 40-year-long debate between two friends, the mathematician Christopher Zeeman and the historian David Fowler, beautifully portrays the existing historical uncertainties. The subject of the debate was Euclid’s definition of proportionality. Both Zeeman and Fowler agree that Euclid’s definition is reminiscent of an equivalence relation. However, Fowler strongly disagrees with Zeeman’s claim (2008, p. 16) that “Euclid must have been thinking of a ratio as something like an equivalence class.” In a way, the Fowler–Zeeman dispute is rooted in a methodological distinction between what Zeeman expresses as “the traditional opposing roles of historian and mathematician” (Zeeman 2008, p. 16):

The historian thinks extrinsically in terms of the written evidence and adheres strictly to that data, whereas the mathematician thinks intrinsically in terms of the mathematics itself, which he freely rewrites in his own notation in order to better understand it and to speculate on what might have been passing through the mind of the ancient mathematician, without bothering to check the rest of the data.

The current paper bothers to check the rest of the data, not only in the Elements, but also in a span of 2000 years beyond the Elements. The paper offers a picture of the various conceptions of equivalence in the history of mathematics. As such, it might also help the reader to see Euclid’s experience of equivalence in a new light.

The paper has two main parts.

In the first part, the texts under study are on the verge of decontextualizing the mathematical notion of equivalence. Roughly speaking, this period covers about a hundred years from the second half of the nineteenth century to the first half of the twentieth century. The purpose of this part is to find out the main participants in the standardization of the notion, as we know it today. This part is mostly indebted to the clues left by Fowler (1998; http://mathforum.org/kb/message.jspa?messageID=1174830) on the Historia Matematica Forum (regrettably, this forum is now closed; hereafter any reference to Fowler with no date is from this forum).

In the second part, the texts under study manifest certain contextualized equivalences. Roughly speaking, this period starts from Euclid and continues until the first half of the nineteenth century. For this period, I adopted a variation approach in which the texts under study are compared with each other in search of critical differences.

2 Decontextualized equivalence

On Aug 1, 1998, Moshé Machover posted a seemingly innocent question on the Historia Matematica Forum with the title “Equivalences classes as objects”:

Who was the first to use the reductionist ploy of using equivalence classes—themselves—as the required objects \(x^{*} \) (where \(x^{*}\) is the equivalence class of x, given an equivalence relation R).

On Aug 22, 1998, Fowler submitted a lengthy answer with a rather daunting starting paragraph:

Not simple! In fact, I know of nobody who has attempted a serious answer. I have written an (enormous!) draft of an (unfinishable!) article on equivalence classes, about which I had an (equally enormous!) correspondence with others who know a great deal more than I do; I’ve tried to get them to write up something, or to correct and complete my thing, without success so far.

To the best of my knowledge, nobody has yet attempted a direct answer. There are some attempts embedded within the history of relevant mathematics, say, the concept of quotient group (Nicholson 1993), or the foundation of mathematics, and in particular set theory (Ferreirós 2008), or the philosophy of mathematics (Dummett 1991; Rodriguez-Consuegra 1991). However, none has addressed the history of the mathematical construction of equivalence exclusively. The current paper fills this gap, starting from the end, naming.

2.1 Equivalence relation; the name

I call a relation which is reflexive, symmetrical, and transitive an isoid relation.

Jourdain (1912) presented his article On isoid relations and theories of irrational number to the Fifth International Congress of Mathematicians, starting with the sentence above. At the time, the process of “definition by abstraction” (Russell 1903, pp. 219–220) was quite well established, but the term “equivalence” was mainly attached to the context of cardinal numbers.

Consider the case of cardinal numbers. Where \(u, v, \ldots \) are classes which have the isoid relation of what Cantor called “equivalence,” and what Dedekind and Russell called “similarity,” to each other. (Jourdain 1912, p. 492)

Jourdain was one of the first to suggest a decontextualized term for what we now know as an “equivalence relation”. However, he was not successful at popularizing the term. Even Russell, one of his closest correspondents, refers to such relations with no name and only qualifies them as an “important kind of relation”, noting that “similarity is one of this kind of relations” (Russell 1919, p. 16).

By 1919, neither the combination “equivalence relation” nor “equivalence class” was in use. The necessity of naming was first felt with the relation, not with the classes that are formed by that relation. For example, Hasse (1926), who might be credited with freeing the term “Äquivalenzrelationen” from the context of the relation, did not find it necessary to name the corresponding classes. Under the section Äquivalenzrelationen und Klasseneinteilungen of the first chapter of the first edition of Höhere Algebra, Hasse (1926) wrote:

We call such a decomposition a partition of M, and the subsets thereby determined its classes. (Hasse 1954, p. 22; Higher Algebra is the English translation of the third edition of Höhere Algebra published in 1933)

Higher Algebra has every feature of a modern account except attention to naming the subsets of the partition. The book was reviewed in the Bulletin of the American Mathematical Society (Moore 1927; Ore 1928). Yet, Hasse’s terminology needed a few more years to be established and used without reference to any alternative terminology. In the third edition of Mengenlehre (Hausdorff 1914; the third edition was published in 1937 and then as Set Theory in English in 1957), the term equivalence is still attached only to the context of cardinality. Also, Birkhoff (1935, p. 446) felt obliged to footnote that “What Hasse (and I) call an “equivalence relation”, Carnap calls an “equality relation”.” Interestingly, Birkhoff defined an equivalence relation to be any reflexive and “circular” relation (i.e. \(a\,x\,b, b\,x\,c\) implies \(c\,x\,a\)), claiming that this definition “amounts in effect to the more conventional one of Hasse” (Birkhoff 1935, p. 446). He also differs from Hasse in his attention to the naming of the classes of a partition.
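Birkhoff’s claim that a reflexive and “circular” relation amounts to the conventional definition can be checked by brute force on a small set. The following is only an illustrative sketch, not anything found in Birkhoff (1935): the helper names and the three-element domain are assumptions made for the example.

```python
from itertools import product

def all_relations(domain):
    """Yield every binary relation on `domain` as a set of ordered pairs."""
    pairs = list(product(domain, repeat=2))
    for mask in range(1 << len(pairs)):
        yield {p for i, p in enumerate(pairs) if (mask >> i) & 1}

def is_reflexive(rel, domain):
    return all((a, a) in rel for a in domain)

def is_symmetric(rel):
    return all((b, a) in rel for (a, b) in rel)

def is_transitive(rel):
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)

def is_circular(rel):
    # Birkhoff's condition: a x b and b x c imply c x a.
    return all((d, a) in rel for (a, b) in rel for (c, d) in rel if b == c)

domain = (0, 1, 2)  # hypothetical small domain, chosen only for the check

# Relations that satisfy Birkhoff's definition (reflexive and circular).
birkhoff = [r for r in all_relations(domain)
            if is_reflexive(r, domain) and is_circular(r)]

# Relations that satisfy the conventional definition.
conventional = [r for r in all_relations(domain)
                if is_reflexive(r, domain) and is_symmetric(r) and is_transitive(r)]

# Both lists are produced in the same enumeration order, so they can be compared directly.
definitions_agree = birkhoff == conventional
```

On this domain the two definitions pick out the same relations. A finite check of this kind is of course no substitute for the short general argument (reflexivity and circularity give symmetry, and symmetry with circularity gives transitivity).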

2.2 Equivalence class; the name

There is a \(\left( {1,1} \right) \) correspondence between equivalence relations x on C and partitions of the objects of C into non-overlapping “x-categories”‡, under which \(a\,x\,b\) if and only if a and b are in the same x-category. (Birkhoff 1935, p. 446)

The symbol ‡ guides the reader to the footnote, where it can be read that an “x-category” is an “Abstraction class” according to Carnap (Der Logische Aufbau der Welt, Berlin, 1928, p. 102). It is now 1935, and the classes of equivalent objects are called “x-categories” or “abstraction classes”, not yet “equivalence classes”. Even in 1942, in a paper with the title Theory of Equivalence Relations, Oystein Ore called them “the blocks of the partition P” (Ore 1942, p. 574). Birkhoff (1940, 1948) and Birkhoff and Mac Lane (1941, p. 165) used “R-classes” for the non-overlapping classes determined by the equivalence relation R.

Van der Waerden’s Moderne Algebra (1930, p. 4) explicitly uses Äquivalenzrelation:

so nennt man die Relation \(a\sim b\) eine Äquivalenzrelation.

Yet, even the English translation of the book (van der Waerden 1949) where he had a “welcome opportunity for several minor changes” (p. iv) does not embrace the combination “equivalence class”. Interestingly it seems that “equivalence class” was enjoying a path of its own for a long time.

Von Neumann (1926, 1929), Hopf (1930), and Seifert and Threlfall (1934) used the term “Äquivalenzklasse”; von Neumann (1936) used “equivalence-class” (with a hyphen), and Solomon Lefschetz (1938, 1942) “equivalence class” (without a hyphen). The terminology denoted the classes constructed by certain contextualized equivalences, none of which was referred to as an “equivalence relation”. Moreover, none of these authors felt obliged to define “equivalence classes” in general terms and outside the contexts in which they were working. To use Fowler’s words, these works were “within a mathematical, as opposed to foundational contexts”. A foundational treatment is found in the first chapter of Tukey’s Convergence and Uniformity in Topology (Tukey 1940, p. 4), where the combinations “equivalence relation” and “equivalence class” appear in the same place.

It is well known that an equivalence relation (that is, one which generates a reflexive, transitive and symmetric ordered system) divide the set on which it is defined into mutually exclusive equivalence classes. If we denote the equivalence relation by \(\sim \) and the equivalence class containing a by \( \left[ a \right] \), then \(b \in \left[ a \right] \) if and only if \(b \sim a\).
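Tukey’s statement can be made concrete with a small sketch. The function names and the example relation (same parity on the integers 0 to 5) are assumptions introduced only for illustration; they do not come from Tukey (1940).

```python
def equivalence_class(a, domain, related):
    """Return [a] = {b in domain : b ~ a} as a frozenset."""
    return frozenset(b for b in domain if related(b, a))

def partition(domain, related):
    """Collect the distinct classes [a]; assumes `related` is an equivalence relation."""
    return {equivalence_class(a, domain, related) for a in domain}

def same_parity(b, a):
    # Hypothetical example relation: b ~ a when b - a is even.
    return (b - a) % 2 == 0

domain = range(6)
classes = partition(domain, same_parity)

# The classes are mutually exclusive: any two distinct classes share no element.
mutually_exclusive = all(E == F or not (E & F) for E in classes for F in classes)
```

Here `b in equivalence_class(a, domain, same_parity)` holds exactly when `same_parity(b, a)` does, which is the content of the last sentence of the quotation.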

The class notation [ ] is also found in Lefschetz (1938, p. 292). Note that Lefschetz was Tukey’s Ph.D. advisor. Within their academic family, they had a common language freed from the context. Tukey (1940) and Lefschetz (1942) precede Steenrod’s The Topology of Fibre Bundles (1951), which was mentioned by Fowler as the first book to use the combination “equivalence class”. Incidentally, Steenrod was also within the same family: his Ph.D. advisor was Lefschetz. Interestingly, if you find a text in this period of history, say Set Theory and Metric Spaces (Spanier 1955), where the term “equivalence class” and the class notation [ ] have been used in a clear and modern manner, you might suspect that the author belongs to Lefschetz’s academic family. In fact, in my example, Spanier is one of the academic grandsons of Lefschetz, having Steenrod as his academic father! Outside Lefschetz’s family, for such fundamental concepts, in addition to the ones mentioned so far, there remains a large body of literature that needs to be addressed (e.g. the Göttingen tradition, the literature in algebra in general, and in particular, cosets and quotient groups, etc.). However, it seems that, by and large, the combination “equivalence class” was lagging behind “equivalence relation”, as is beautifully portrayed in the three editions of Garrett Birkhoff’s Lattice Theory (1940, 1948, 1967), as we shall now see.

2.3 A portrait of the evolution of the terminology

In the first edition of Lattice Theory (Birkhoff 1940), there is a reference to van der Waerden (1930) for using the mathematical construction of equivalence. However, Birkhoff preferred not to use the terminology:

Theorem 1.2: The algorithm of identifying x and y when (and only when) \(x \,\rho \,y\) and \(y \,\rho \, x\), yields a partially ordered system from any quasi-ordered system Q.

Proof: Let \(x\, \sim \, y\) mean that \(x\, \rho \, y\) and \(y\, \rho \, x\). Then (van der Waerden 1930, vol. 1, p. 11), \(x\, \sim \, y\) means that x and y belong to the same subdivision under some partition of Q (Birkhoff 1940, p. 7)

In the second edition, the terminology “equivalence relation” is admitted, hence the rewording of Theorem 1.2, which is Theorem 3 in the new edition. Immediately before the theorem, Birkhoff introduced the terminology, without assuming that the expected readers are familiar with it:

Each such quasi-ordering is associated in a natural way with an equivalence relation (i.e., a reflexive, symmetric, and transitive relation) and a partial ordering. (Birkhoff 1948, p. 4)

Then he reworded the original theorem and for the construction used in the proof cites Birkhoff and MacLane (1941), not van der Waerden (the English edition of van der Waerden’s book was published in 1949, a year after the publication of the second edition of Birkhoff’s Lattice Theory).

THEOREM 3. Let \(\rho \) be any quasi-ordering of a set. The relation \(x \sim y\), meaning that \(x\, \rho \,y\) and \(y\, \rho \, x\), is an equivalence relation. If “equivalent” elements are identified, \(\rho \) becomes a partial ordering.

Proof. ...This shows that \(\sim \) is an equivalence relation, or (Birkhoff-MacLane, Ch. VI, Thm. 27) that there is a partition of X into non-overlapping subclasses, such that \(x\sim y\) if and only if x and y are in the same subclass...It follows immediately from this that ...\(x \rho y\) and \(y \rho x\) imply \(x \sim y\) (i.e., \(x = y\) in the system formed by the subclasses). (Birkhoff 1948, p. 4)

The third edition has no reference for the terms used or the relevant construction. Theorem 1.2 (1940, p. 7) evolved into the following lemma in the third edition of Lattice Theory:

We will now show how to construct a poset from any given quasi-ordering.

LEMMA 1. In any quasi-ordered set \(Q = \left( {S, \prec } \right) \), define \(x \sim y\) when \(x \prec y\) and \(y \prec x\). Then:

  (i) \(\sim \) is an equivalence relation on S;

  (ii) if E and F are two equivalence classes for \(\sim \), then \(x \prec y\) either for no \(x \in E\), \(y \in F\) or for all \(x \in E\), \(y \in F\);

  (iii) the quotient-set \(S/\sim \) is a poset if \(E \leqq F\) is defined to mean that \(x \prec y\) for some (hence all) \(x \in E\), \(y \in F\). (Birkhoff, 1967, p. 21)
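The construction in the lemma can be followed on a concrete quasi-ordering. The sketch below is an illustration under assumptions of my own choosing, not an example from Lattice Theory: the set S consists of the divisors of 6 together with their negatives, and \(x \prec y\) is taken to mean “x divides y”, which is reflexive and transitive but not antisymmetric (2 and −2 divide each other).

```python
from itertools import product

S = [-6, -3, -2, -1, 1, 2, 3, 6]  # hypothetical example set

def prec(x, y):
    """Quasi-ordering on S: x divides y."""
    return y % x == 0

def equiv(x, y):
    """x ~ y when x precedes y and y precedes x."""
    return prec(x, y) and prec(y, x)

# (i) The classes of ~ ; here each class is a pair {n, -n}.
classes = {frozenset(y for y in S if equiv(x, y)) for x in S}

# (ii) For any two classes E and F, "x prec y" has a single truth value on E x F.
constant_on_classes = all(
    len({prec(x, y) for x in E for y in F}) == 1
    for E, F in product(classes, repeat=2)
)

def class_leq(E, F):
    """(iii) E <= F when some (hence, by (ii), every) x in E precedes some y in F."""
    return prec(next(iter(E)), next(iter(F)))

# The induced relation on the quotient set is antisymmetric, hence a partial ordering.
antisymmetric = all(
    E == F
    for E, F in product(classes, repeat=2)
    if class_leq(E, F) and class_leq(F, E)
)
```

Identifying n with −n turns divisibility on S into the familiar partial ordering of the divisors 1, 2, 3, 6.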

The third edition is a typical example of what we observe in the texts written by mathematicians for mathematicians after 1950. Before 1950, we find many different uses of terminology: no mention at all (e.g. Russell 1903), alternative terms (e.g. Jourdain 1912; Carnap 1928), only “equivalence relation” with no mention of the term “equivalence class” (e.g. Weyl 1949) or an alternative (e.g. Birkhoff 1935). Generally, the combination “equivalence class” became common much later than “equivalence relation”, while closer to 1950 we can observe clear uses of both combinations alongside each other (e.g. outside Lefschetz’s family, Neal McCoy 1948).

Overall, the first half of the twentieth century witnessed a rapid spread of the mathematical construction of equivalence into every branch of mathematics, resulting in the decontextualizing of the construction and the unification of the terminology.

3 Contextualized equivalence

Some early examples of equivalence-class-style arguments do occur in contexts where some specific well-defined concrete set underlies the subject—Gauss’ number theory in his Disquisitiones Arithmeticae is an outstanding example—but the general technique of appealing to equivalence classes appeared only around the time of Dedekind, when set theory began to be introduced as a basis for mathematics in general, and it took some time to become established. (Fowler 2003, p. 371)

The originator of “the general technique of appealing to equivalence classes” is much disputed. Fowler’s candidate is Dedekind. Dummett’s choice (1991) is Frege. Weyl (1949, p. 11) traces the idea back to Leibniz, claiming that it “was consciously formulated in all generality by Pasch” in 1882 (Pasch 1882), and “still more clearly by Frege (1884, Sections 63–68).” Rodriguez-Consuegra (1991, pp. 155–156) credits Russell, who gave his definition of cardinal numbers independently of Frege. Russell himself has contributed to this view in his Introduction to mathematical philosophy (Russell 1919, p. 11). Yet, in 1936, when von Neumann (1936, p. 96) wanted to remind the readers of an analogous procedure to the one that he used to define the notion of a numerical dimensionality, he mentioned not Russell but Cantor:

Our primary objective is the definition of a numerical dimensionality for \(a\in L\). Following a procedure used by F. J. Murray and the author we define first the notion of equidimensionality for two \(a, b\in L\). This is analogous to G. Cantor’s classical procedure of defining equality of power (i.e., equivalence) for sets before defining the powers (i.e., alephs) themselves. But while G. Cantor was led by this procedure to a new kind of quantities, the alephs, our axioms will lead us back to the well-known system of real numbers.

Even before Cantor’s 1895 work, the procedure was so much in the air that Hermann Schubert (1894) in his Monism in Arithmetic, written “for beginners”, freely (i.e. with no citation), though implicitly, employed it to define negative numbers. According to Schubert, the work was a reproduction of his “System of Arithmetic” (written a decade earlier) that “was the first to work out the idea referred to, fully and logically and in a form comprehensible for beginners.” In fact, Schubert’s approach resembles the earlier work of Dedekind in 1854, modified in 1872, where he defined integers as pairs of natural numbers in a way that “is exactly the one that is still being employed today: except that in a modern exposition one would deal with equivalence classes of pairs.” (Sieg and Schlimm 2005, pp. 136–137; more on this in the next sections). As Nicholson (1993, p. 76) says: “It may be that several of these mathematicians came across the idea of equivalence independently.” As such, we might never be able to single out one person as the originator. However, we might be able to figure out the role of some of the “main” characters by examining the conceptual requirements of equivalence-class-style arguments. The problem is that for a modern reader, acquainted with set theory, the conceptual distance between an equivalence relation and its corresponding equivalence classes is a one-line theorem linking the two via ‘the’ three defining properties of the equivalence relation: reflexivity, symmetry, and transitivity. But most historical appearances of equivalence originally occurred in non-set-theoretic contexts in which our normative understanding of equivalence as a relation could hardly distinguish between historical variations in experiences of equivalence in different contexts. To capture and describe such variations, we have adopted the variation method.

3.1 Method

The variation method is particularly aimed at identifying “the variation in how the phenomenon in question might be experienced by people with certain background characteristics...in different situations” (Marton and Booth 1997, p. 128). In practice, the most difficult task of the variation method is to bracket our own understanding of the phenomenon of interest (here, equivalence) and suspend our normative judgement, and instead, “to look at it with others’ eyes” (Marton and Booth 1997, p. 129). In concrete terms, instead of measuring the historical texts against the modern account of equivalence, we compare the texts under study with each other in search of critical differences between the ways that equivalence has been tackled. As such, the outcome of our study would not be the story of a particular mathematician or a group of mathematicians. The outcome would be a variation, “captured in qualitatively distinct categories, of ways of experiencing the phenomenon in question, regardless of whether the differences are differences between individuals or within individuals” (Marton and Booth 1997, p. 124). As a starter for this way of thinking, we can see the two familiar notions of equivalence relation and equivalence class just as two categories of understanding the notion of equivalence. Historically, there are three other categories as well: matching conception, single-group conception and multiple-group conception. As we shall see, the term “group” is used in its vernacular sense as in “grouping certain elements with each other.” Any other use will be clear from the context.

3.2 Matching conception

Matching is the pre-set theoretic counterpart of the equivalence relation: the focus of both is on the pairs of elements and they both underlie any experience of equivalent elements (more than two). However, they differ critically.

An equivalence relation is firstly a relation. The relation is defined so that for any two elements (of the underlying set) it is known whether the first is related to the second. Then, the defining properties show that the initial order was redundant and allow us to say, “two objects are equivalent to each other.” That ending is the starting point for a matching conception that begins with a pair of two things that are equivalent to each other.

Euclid and Hilbert clearly exemplify the distinction between the conception of equivalence as a matching experience and equivalence as a relation.

In Euclid’s geometry, equivalence is maintained by the given definitions: “Parallel straight lines are...”, “Those magnitudes are said to be commensurable which ...” and so on (the references from Euclid are all from Heath 1956). In Hilbert’s geometry, it is only after establishing the symmetry of segment congruence (if \(AB \equiv A'B'\), then \(A'B' \equiv AB\)) that Hilbert allowed himself to say:

Due to the symmetry of segment congruence one may use the expression “Two segments are congruent to each other.” (Hilbert 1902, pp. 10–11)

Not only the symmetry property as an if-then statement, but also reflexivity as we know it, is redundant in Euclid’s geometry. Euclid never expresses a phrase like, ‘a is equivalent to (read it “is equal to”, “is parallel to”, or “is commensurable with”) itself’. Hilbert has to consider the reflexive property, though he does not name it (Asghari 2009):

Since congruence or equality is introduced in geometry only through these axioms, it is by no means obvious that every segment is congruent to itself. (Hilbert 1902, p. 10)

The only similarity between Euclid’s treatment of equivalence and Hilbert’s arises when an equivalence of more than two objects is involved. Both used a property that is sometimes referred to as the Euclidean property:

If two objects are equivalent to a third, then they are also mutually equivalent.

Concrete examples are:

  1. Euclid:

    Things which are equal to the same thing are also equal to one another.

    Straight lines parallel to the same straight line are also parallel to one another.

  2. Hilbert:

    If two segments are congruent to a third one they are congruent to each other.
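The behaviour of the Euclidean property, as opposed to transitivity, can be illustrated on a small finite domain. The sketch below rests on assumptions introduced for the example: the property is read as “a ~ c and b ~ c imply a ~ b”, the domain is {1, 2, 3}, and the order relation is the usual ≤.

```python
from itertools import product

def all_relations(domain):
    """Yield every binary relation on `domain` as a set of ordered pairs."""
    pairs = list(product(domain, repeat=2))
    for mask in range(1 << len(pairs)):
        yield {p for i, p in enumerate(pairs) if (mask >> i) & 1}

def is_reflexive(rel, domain):
    return all((a, a) in rel for a in domain)

def is_symmetric(rel):
    return all((b, a) in rel for (a, b) in rel)

def is_transitive(rel):
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)

def is_euclidean(rel):
    # If a and b are both related to the same third object c, then a is related to b.
    return all((a, b) in rel for (a, c) in rel for (b, d) in rel if c == d)

domain = (1, 2, 3)  # hypothetical small domain

# The order relation <= is transitive but not Euclidean:
# 1 <= 3 and 2 <= 3 hold, yet 2 <= 1 fails.
leq = {(a, b) for a, b in product(domain, repeat=2) if a <= b}
order_is_transitive = is_transitive(leq)
order_is_euclidean = is_euclidean(leq)

# Every reflexive Euclidean relation on the domain is symmetric and transitive.
reflexive_euclidean = [r for r in all_relations(domain)
                       if is_reflexive(r, domain) and is_euclidean(r)]
all_are_equivalences = all(is_symmetric(r) and is_transitive(r)
                           for r in reflexive_euclidean)
```

In this finite setting the Euclidean property, unlike transitivity, fails for the order relation, while together with reflexivity it yields exactly the equivalence relations.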

It is of prime importance to note that the Euclidean property differs from the transitive property. The former is a characteristic of equivalence, separating it from the order (relation). It is also equally important to note that we cannot distinguish between the Euclidean property and the transitive property in Euclid since whenever an equivalence is concerned it is an equivalence between two things with no order. Thus, strictly speaking, Euclid did not have the standard properties of the equivalence relation (reflexivity, symmetry, and transitivity) and yet managed two of the most important applications of equivalence relations:

  (1) Deducing and expressing the equivalence of two objects based on their equivalence with a third object.

This is when we are only concerned with pairs of equivalent objects and not with the groups or the partition determined by the equivalence. A modern example is provided by Hasse (1954, p. 94):

Definition 30. Two systems of linear equations are said to be equivalent if they have the same totality of solutions.

This is naturally an equivalence relation...However; we have no need of the partition thereby determined.

  (2) Treating “one” individual object as “any”.

This is at the heart of Euclid’s Elements and an indication of single-group conception.

3.3 Single-group conception

When we prove a statement about one specific circle drawn on a piece of paper, it is understood as a statement about any circle, all of them. Similarly, when we prove a statement about parallel lines, it does not matter in which direction they have been drawn; one pair of parallel lines stands for any pair of parallel lines, all of them. The same holds true for all the other figures/objects in Euclid’s Elements. However, sometimes Euclid cared to justify the shift from “one” to “any”. Interestingly, these occasions are where some kind of equivalence is directly addressed by a contextualized Euclidean property and there is a focus on a single object as a representative of a group of equivalent objects, hence, single-group conception. In such situations, there is a contextualized in-out pair of propositions, deciding what is in the group and what is out. Perhaps the most famous of such propositions in Euclid are the two related to the Fowler–Zeeman debate about Euclid’s ratios.

In-Proposition, Proposition 11, Book V: Ratios which are the same with the same ratio are also the same with each other.

Out-Proposition, Proposition 13, Book V: If a first magnitude have to a second the same ratio as a third to a fourth, and the third have to the fourth a greater ratio than a fifth has to a sixth, the first will also have to the second a greater ratio than the fifth to the sixth.

Proposition 11 justifies Definition 6 where a single group of proportional ratios is formed inside which any two ratios are the same.

Definition 6, Book V: Let magnitudes which have the same ratio be called proportional.

Proposition 13 stresses that none outside a group of proportional ratios is the same with any inside the group. Although Euclid never uses a ratio alone, he needs Propositions 11 and 13 to justify Definition 7 of Book V, where the definition of a greater ratio is introduced.

Definition 7, Book V: When, of the equimultiples, the multiple of the first magnitude exceeds the multiple of the second, but the multiple of the third does not exceed the multiple of the fourth, then the first is said to have a greater ratio to the second than the third has to the fourth.

Following Propositions 11 and 13, the disproportionality defined in Definition 7 is justified since any ratio in a group of proportional ratios is greater or lesser than any ratio in another group of proportional ratios. Generally, the pair of in-out propositions guarantees that one ratio stands for any one of the group of proportional ratios. In Book X, another similarly worded pair guarantees the unambiguity of definition and naming of the group of rational lines commensurable with an “assigned straight line”, and thirteen disjoint groups of irrational lines (for a well-informed description of these groups see Fowler 1992 or van der Waerden 1954). The following are the in-out propositions used in Book X.

In-Proposition, Proposition 12, Book X: Magnitudes commensurable with the same magnitude are commensurable with one another also.

Out-Proposition, Proposition 13, Book X: If two magnitudes be commensurable, and the one of them be incommensurable with any magnitude, the remaining one will also be incommensurable with the same.

It is of prime importance to note that in all these cases, “Euclid always considers individuals, never sets” (Fowler 2003, p. 370). However, in each case, he was formally able to use an individual as any individual, and the Euclidean property is quite suitable for such a purpose. Euclid’s Elements does not use the full potential of equivalence, which relies on a multiple-group conception.

3.4 Multiple-group conception plus unity

Russell’s definition of numbers provides a complete picture of a multiple-group conception plus unity, showing where Euclid’s use of equivalence ceased.

Euclid started with the definition of proportionality (in Book V) and commensurability (in Book X).

Russell started with the definition of similarity:

Two classes are said to be “similar” when there is a one-to-one relation which correlates the terms of the one class each with one term of the other class. (Russell 1919, pp. 15–16)

Euclid confirmed Euclidean properties of proportionality and commensurability.

Russell confirmed that the similarity has the properties of that “important kind of relation” (that is reflexive, symmetrical, and transitive).

Euclid set up the relevant contextualized in-out pair of propositions.

Russell relied on the following in-out pair:

In-Proposition: Classes similar to the same class are similar to one another also.

Out-Proposition: If two classes be similar, and the one of them be not similar to any class, the remaining one will not also be similar to the same.

There is a remarkable resemblance between Euclid’s ingredients and Russell’s, and even more in Book X in which naming brings Euclid close to a multiple-group conception.

Euclid chose a straight line as the “assigned” one, fixed it and named it “rational” at the outset.

Russell did not “assign” anything at the outset.

Euclid defined “rational” lines, using the in-out propositions to make sure that the name “rational” can be unambiguously applied to any individual straight line that is commensurable with the “assigned” straight line. The relevant out-proposition guarantees that every other individual line outside the group of rational lines can be unambiguously called “irrational”. Then, using the same approach, constructing, fixing and naming, he continued to construct thirteen different irrationals (medial, binomial, major, and so forth), proving that the same name can be applied to any individual irrational commensurable with one of the named irrationals.

Russell let the relation itself (i.e. similarity) divide the classes into disjoint sets, initially giving any class a tentative focal status to define its number.

The number of a class is the class of all those classes that are similar to it. (Russell 1919, p. 18)

The in-out propositions guarantee that any individual class similar to the tentative focal class has the same number, and then allow Russell to eliminate the centrality of that focal class.

A number will be a set of classes such as that any two are similar to each other, and none outside the set is similar to any inside the set. (Russell 1919, pp. 18–19; emphasis added to show the role of the in-out propositions)

Euclid left a large infinite collection of individual irrational lines unattended and unnamed, not because of his approach (the same approach works for Gauss; see the next section), but because of the domain of discourse.

Russell succeeded in defining any natural number, not because of his approach, but because of the domain of discourse.

If the main difference between Euclid and Russell is not in their approaches, what is it?

Euclid always considers individuals.

Russell considers the set of the equivalent individuals.

To see the difference, let us name 5 as Euclid might have done it, and define 5 as Russell did it.

Euclid: Let any individual class of objects that is similar to the fingers on my right hand be called 5.

Russell: Let the set of similar classes to which the class of the fingers on my right hand belongs be 5.

Euclid’s naming approach unavoidably leads the mind of an informed observer to equivalence classes. Even Fowler who argued against the presence of equivalence classes in the Elements could not avoid the language commonly used for equivalence classes when summarising some of the propositions of Book X.

(72/73 & 111/112, strengthened). The thirteen classes of alogoi lines, (medial, binomial, apotome, first bimedial, first apotome of a medial, ...) are all disjoint; and any line that is commensurable or commensurable-in-square with a line in a given class is also in that class. (Fowler 1992, p. 251)

There is one critical difference between Euclid and Russell:

Euclid’s equivalent individuals remain individuals; Russell’s equivalent individuals form a whole, a unit.

Russell addressed “the unity of a class” by distinguishing “the many from the whole which they form” (Russell 1903, p. 70) or by considering “the in the plural” (Russell 1919, p. 181; Hausdorff 1914, p. 11). When defining numbers, Russell used a “bundle” to capture the unity.

We can suppose all couples in one bundle, all trios in another, and so on. In this way we obtain various bundles of collections, each bundle consisting of all the collections that have a certain number of terms. Each bundle is a class whose members are collections, i.e. classes; thus each is a class of classes. The bundle consisting of all couples, for example, is a class of classes: each couple is a class with two members, and the whole bundle of couples is a class with an infinite number of members, each of which is a class of two members. (Russell 1919, p. 14)

To sum up, a partition (as we know it) is a multiple-group conception plus unity. Without unity, the best that an informed observer can get is a completely divided domain of discourse. Gauss provides an example for us.

3.5 Multiple-group conception minus unity

If a number a divides the difference of the numbers b and c, b and c are said to be congruent relative to a; if not, b and c are noncongruent. The number a is called the modulus. If the numbers b and c are congruent, each of them is called a residue of the other. If they are noncongruent they are called nonresidues. (Gauss 1801, p. 1)

Gauss started with the definition of congruence, giving an equal status to two matching numbers (as Euclid did whenever he dealt with equivalence). Unlike Euclid, he directly addressed reflexivity (though of course without using the term):

Since every number divides zero, it follows that we can regard any number as congruent to itself relative to any modulus. (Gauss 1801, p. 1)

The Euclidean property of congruence is expressed as one of “properties of congruence that are immediately obvious”:

If many numbers are congruent to the same number relative to the same modulus, they are congruent to one another (relative to the same modulus). (Gauss 1801, p. 2; italics in the original)

Euclid needed to construct each irrational number before using it as a naming tool for the lines commensurable to it. He succeeded in constructing 13 naming tools, leaving a large infinity of irrationals unnamed. The names come naturally in the domain of integers: the least residues.

Each number therefore will have a residue in the series \(0, 1, 2, \ldots , m-1\) and in the series \(0, -1, -2, \ldots , -(m-1)\). We will call these the least residues. (Gauss 1801, p. 2)

Take one of the least residues, say a, and an arbitrary number, say A. The following in-out propositions are used to check whether A belongs to the group determined by a or not:

In-Proposition: If \((a-A)/m\) is an integer then \(a\equiv A\).

Out-Proposition: If it [\((a-A)/m\)] is a fraction then \(a\not \equiv A\). (Gauss 1801, p. 2)

Gauss was obviously aware that the equivalence so defined divides all of the integers into disjoint groups of congruent numbers.

Congruent numbers have the same least residues; noncongruent numbers have different least residues. (Gauss 1801, p. 3; italics in the original)

His approach, and even the language used, bears a remarkable resemblance to Russell’s approach. However, his outcome is more in line with Euclid: multiple-group conception minus unity. He continued working with the individual integer numbers, never distinguishing “the many from the whole which they form”. Unlike Euclid, he did not need to create new names since each number has a natural tag in the list of the least residues. In a way, the least residues play the role of the names. Any two numbers with the same tag can be used interchangeably without affecting the result of a calculation involving addition and multiplication. For example, let \(X=x^{3}-8x+6\) and \(m=5\). Then for \(x=0, 5\) the least positive residue of X is 1; for \(x=1, 6\) it is 4. Generally, “The values of X produce these least positive residues: \(1, 4, 3, 4, 3, 1, 4,\hbox { etc}.\) where the first five numbers 1, 4, 3, 4, 3 are repeated infinitely often” (Gauss 1801, p. 3).
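The arithmetic in this example can be checked with a short Python sketch. The helper names `least_residue` and `X` are introduced here only for illustration; the polynomial and the modulus are those quoted from Gauss above.

```python
def least_residue(n, m):
    """Least non-negative residue of n relative to the modulus m (m > 0)."""
    # For a positive modulus, Python's % always returns a value in 0..m-1,
    # even when n is negative.
    return n % m


def X(x):
    """The polynomial from Gauss's example: x^3 - 8x + 6."""
    return x**3 - 8 * x + 6


m = 5
residues = [least_residue(X(x), m) for x in range(10)]
print(residues)  # [1, 4, 3, 4, 3, 1, 4, 3, 4, 3]

# Numbers with the same tag are interchangeable: x and x + m give the same residue.
interchangeable = all(
    least_residue(X(x), m) == least_residue(X(x + m), m) for x in range(-20, 20)
)
print(interchangeable)  # True
```

The computation never forms a class of congruent numbers as an object; it only tags individual integers with their least residues, which is in keeping with the Gaussian treatment described here.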

Gauss continued to use the old entities endowed with a new structure. Russell structured the old entities to define new units. However, it seems that Russell was not the first.

4 The originator

Adopting a variation approach, we have not been primarily concerned with the individuals for their own sake. However, now that we have different historical appearances of equivalence, we might use the variations to examine the role of some individual who might be considered as the originator of the mathematical notion of equivalence by which we define a new entity. We mainly focus on Dedekind, who is Fowler’s favourite nominee for introducing the idea of equivalence classes. Then we briefly compare the role of some of the other favourite nominees with Dedekind.

4.1 Dedekind

In 1857, in the study of “congruence with respect to a double modulus”, Dedekind (1857; Fricke et al. 1930, pp. 46–47) treated certain congruence classes explicitly as objects/units. He first considered an equivalence with respect to a prime modulus p, on polynomials with integer coefficients. Accordingly, we have infinitely many function-classes (Funktionenklassen) of “the whole system of infinitely many functions congruent to one another according to the modulus p.” Now the second congruence was defined on the infinitely many disjoint equivalence classes of the first congruence (hence, the name double modulus):

Two function-classes or their representative A and B are called congruent with respect to the function-class with representative M ...

Interestingly, Dedekind freely moved from the congruence to the classes of congruent elements without any further explanation about the property (or properties) of the congruence which makes the “creation” of the classes possible. In fact, he played with the idea for a long time: in 1857, no mention of any property at all; in 1863, in Dirichlet’s lectures on number theory, he addressed the break of matching in the notation introduced by Gauss, \(a\equiv b\ (\hbox {mod}\ k)\), in which there is a left and a right of the \(\equiv \) sign.

Since the two numbers a and b play the same role in the congruence relation [the word relation has been added by Stillwell in the translation], one may obviously exchange the numbers to the left and right of the \(\equiv \) sign. (Dirichlet 1863, p. 22)

Then reflexivity and transitivity appear as the theorems that “are clear from the concept of congruence.” In the same publication, he moved beyond just dividing the integer numbers into the classes of congruent numbers; he came close to the unity of each class.

All numbers belonging to the same class have many properties in common, so that they behave almost as a single number relative to the modulus k. (Dirichlet 1863; italics mine)

In 1871, in Supplement X, in order to classify all existing numbers into classes (mod a) he mentioned only the Euclidean property of congruence.

Since two integers congruent with the same integer are congruent with one another, one can classify all existing numbers into classes (mod a) by taking two congruent numbers into the same class, two incongruent ones into two different classes. (1871; my translation)

In 1888, he explicitly wrote, “Every system is similar to itself”, and then stated and used the Euclidean property of similarity to classify sets.

We can classify sets by putting in one class those sets that are similar to each other. So if Q, R, S, ...are similar to a particular set R, then they will all be in the same class, and we may call R a representative of that class. According to [33], that class is not changed by choosing a different representative from the same class.

In 1877 (pp. 64–65), he explicitly mentioned only transitivity as the property that “leads to the notion of a class of numbers relative to a module a.” In the same publication, he referred only to the Euclidean property to partition the ideals of a field into classes (Dedekind 1877, p. 146).

If two ideals \(a^{\prime }, a^{\prime \prime }\) are equivalent to a third, a, then \(a^{\prime }\) and \(a^{\prime \prime }\) are equivalent to each other... It follows that the ideals can be partitioned into classes. [Notice the use of the word equivalent]

Eventually, following some unfinished and unpublished previous attempts at defining negative numbers, it is in an unpublished manuscript dated 1890 that Dedekind specified the three defining properties (of an equivalence relation) that we use today to “create” the integer numbers (negatives, zero, and positives).

Consider all the pairs of natural numbers \(\alpha _1 , \alpha _2\) (which must be clearly distinguished from the pair of numbers \( \alpha _2 , \alpha _1 \)). To be brief, let us denote a pair of numbers \(\alpha _1 , \alpha _2\) by a single letter \(\alpha \). Let \(\alpha \) be the pair of numbers \(\alpha _1 , \alpha _2\) and \(\beta \) the pair of numbers \(\beta _1 , \beta _2\). Let the congruence \(\alpha \equiv \beta \) mean that \( \alpha _1 +\beta _2 =\beta _1 +\alpha _2\). Then it is obvious that the congruence of two pairs of numbers \(\alpha , \beta \) is symmetrical, reciprocal [eine symmetrische, gegenseitige], i.e. if \(\alpha \equiv \beta \) it follows that \(\beta \equiv \alpha \); moreover, we always have \(\alpha \equiv \alpha \); finally, from \(\alpha \equiv \beta \) and \(\beta \equiv \gamma \) it always follows that \(\alpha \equiv \gamma \). (Dedekind 1890)

If we just replace “the pairs of natural numbers \(\alpha _1 , \alpha _2\)” with “the ordered pairs \((\alpha _1 , \alpha _2 )\)” and use the language of sets, the above excerpt would be hardly distinguishable from what we can find in a modern textbook (e.g. Stewart and Tall 2015). So would the rest of the article in which Dedekind completes the “act of creation” in which he explicitly treats each distinct class of congruent pairs as a unit. It is denoted by \(\underline{n}\) if one of the representatives is the pair \(m, n+m\), hence positive; by \({\overline{n}}\) if one of the representatives is the pair \(n+m, m\), hence negative; and by 0 (Null Klasse) if “the class \(\left( \alpha \right) \) contains a pair \(\alpha \) whose numbers \(\alpha _1 , \alpha _2 \) are identical to one another.”
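Dedekind’s congruence on pairs and the naming of the resulting classes can be sketched in Python as follows. This is a minimal illustration on a small finite sample of pairs, not a reconstruction of the manuscript: the function names `congruent` and `class_name`, the labels "underline", "overline" and "null", and the sample range are assumptions made here, with the labels following the notation as described in the paragraph above.

```python
from itertools import product


def congruent(alpha, beta):
    """Congruence on pairs: (a1, a2) ≡ (b1, b2) iff a1 + b2 == b1 + a2."""
    a1, a2 = alpha
    b1, b2 = beta
    return a1 + b2 == b1 + a2


def class_name(alpha):
    """Label the class of a pair, following the notation described in the text.

    ("underline", n) if the pair has the form (m, n + m),
    ("overline", n)  if the pair has the form (n + m, m),
    ("null", 0)      if both numbers of the pair are identical.
    """
    a1, a2 = alpha
    if a1 == a2:
        return ("null", 0)
    if a2 > a1:
        return ("underline", a2 - a1)
    return ("overline", a1 - a2)


# A small sample of pairs of natural numbers (hypothetical range 1..5).
pairs = list(product(range(1, 6), repeat=2))

# The three properties stated in the 1890 excerpt, checked on the sample.
reflexive = all(congruent(a, a) for a in pairs)
symmetric = all(
    congruent(b, a) for a, b in product(pairs, repeat=2) if congruent(a, b)
)
transitive = all(
    congruent(a, c)
    for a, b, c in product(pairs, repeat=3)
    if congruent(a, b) and congruent(b, c)
)

# Two pairs receive the same label exactly when they are congruent.
labels_match = all(
    (class_name(a) == class_name(b)) == congruent(a, b)
    for a, b in product(pairs, repeat=2)
)
```

The last check is the point of the sketch: the label depends only on the class, not on the representative chosen, which is what allows each class to be treated as a unit.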

It seems that “It’s already in Dedekind” (Es steht schon bei Dedekind, as Emmy Noether used to say about Dedekind’s works in ideal theory). Intriguingly, rarely has a publication close to the time of Dedekind cited him as someone who contributed to the idea of equivalence classes, let alone as the one who introduced it. The next section gathers the reasons that can be given within the scope of the paper for such an oversight.

4.2 Why not Dedekind

Let us examine the role of some of the favourite nominees for the title “The Originator”.

Frege. Most of the time it does not seem that Dedekind was concerned with the whole procedure of forming equivalence classes as something worthy of particular attention on its own; he was simply applying it in the context at hand. Frege, on the other hand, was very much concerned with the meaning (not necessarily the formulation) of the procedure in all generality, being aware that it is something out of the ordinary “to use the concept of identity, taken as already known, as a means for arriving at that which is to be regarded as being identical” (Frege 1884, § 63). Of the two, Frege was more willing to take the newly defined objects to be the equivalence classes themselves (Ziegler 2013; https://mathoverflow.net/a/135375/29316). Dedekind was reluctant to take a set itself as the created object. For example, in a letter to Weber, dated 24 January 1888, he wrote (Scheel 2014, p. 277; my translation):

I would advise under the number (number, cardinal number) not to understand the class (the system) of all finite systems similar to one another, but rather something new (corresponding to the class), what the mind creates. (Underlined in the original.)

And a few lines down, regarding his definition of real numbers, he wrote:

You say that the irrational number is nothing at all other than the cut itself, while I prefer to create something new (of the cut), which corresponds to the cut, and of which I say that it produces the cut. We have the right to associate ourselves with such a creative power.

In both cases, Dedekind just expressed his preference and encouraged Weber to go the way that he wishes. Weber’s way was somehow the same as Frege’s and the way that we treat equivalence classes today; hence the name of Frege as a pioneer. However, it is not easy to find in Frege’s examples the common properties that would lead to the formation of equivalence classes. The clearest indication of those properties is in the context of parallel lines, where Frege mentioned the essential role of the Euclidean property:

If it were false that “straight lines parallel to the same straight line are parallel to one another”, then we could not transform \(a\parallel b\) into an identity.

Of the two (Dedekind and Frege), Dedekind (at least in his later work) was more attentive to the properties of equivalence leading to equivalence classes. Years later, Russell polished Frege’s treatment, paying full attention to the properties of equivalence (see below).

Cantor. As Stillwell (1996, p. 44; Dedekind 1877) puts it, “fighting 2000 years of tradition” and against “the horror of infinity”, Dedekind, as early as 1857, dealt with infinite equivalence (congruence) classes as mathematical objects. This was at a time when most mathematicians were not willing to consider a theory based on infinite sets (Stillwell, ibid.). In the context of congruence classes, those mathematicians could happily resort to the representatives (individuals) in the fashion of Gauss. However, when it came to cardinal numbers, they could not avoid Cantor’s approach, and hence years later, you might find “G. Cantor’s classical procedure of defining ...a new kind of quantities” (Von Neumann 1936, p. 96) as the origin of making the equivalence classes and working with them as objects.

Russell (nominated by Halmos 1982, and many others). Russell’s definition of numbers had the philosophical standing of Frege, the precision of Dedekind’s unpublished manuscript in 1890 (see above), and Cantor’s set theory (1895, 1897, 1915 in English) in a relatively matured state of its development. In a way, Russell’s approach to the relation of equivalence and the relevant classes of equivalent individuals had nearly all the features that eventually became established except the names “equivalence relation” and “equivalence class”. The names bring us back to where we started our long journey in this paper, which means that it is time to conclude the paper.

5 Conclusions

The rather simple set-theoretic treatment of equivalence might mislead us to read our own understanding into the history of equivalence. The main thesis of this paper is that the history of equivalence relation and equivalence class should be studied within the history of equivalence, not as the history of equivalence. There is a distinction between equivalence as an experience (of matching) and equivalence as a relation, and between a group (of equivalent individuals) and a set—“formed by the grouping together of single objects into a whole” (Hausdorff 1914, p. 11). As an experience of matching, the equivalence of two objects is established from the outset (e.g. Euclid starts by saying that “two segments are congruent to each other”); as a relation, the equivalence of two objects is derived by showing that the initial imposed order is irrelevant (e.g. Hilbert finishes by saying that “two segments are congruent to each other”). In matching, transitivity and the Euclidean property are indistinguishable (e.g. Euclid, Gauss, Cantor, and many others); in relating, which is initially based on ordered pairs, transitivity needs to be accompanied by some other properties to give the Euclidean property (e.g. almost every author after Russell’s classification of relations). In both experiences of equivalence (matching and relating), intuitively at least two objects are involved. Thus reflexivity might be overlooked or bypassed (e.g. Euclid), disallowed by the context (e.g. parallel lines in Euclid and Hilbert), enforced by the context (e.g. congruent numbers in Gauss), derived (e.g. congruent segments in Hilbert), or chosen (e.g. Peano’s definition by abstraction).

In fact, all of the three defining properties of equivalence relations have been chosen to recreate different aspects of our experience of equivalence in which:

(1) We work with two unordered equivalent objects, replacing one with the other. (All)

(2) We work with equivalent objects, giving them the same name or using one of them as a representative of all. (Euclid)

(3) We restructure the universe of our discourse into disjoint groups of equivalent objects. (Gauss)

Up to this point, we have worked with the individuals (of the universe of our discourse); using Frege’s famous example, it is like working with parallel lines without any attention to the notion of direction. Then, “around the time of Dedekind”, we gradually learned to work with the groups of equivalent individuals, regarding each group as an object of its own. Dedekind himself contributed to this approach in several different contexts. Then, some philosophical considerations (notably those of Frege, Peano, Russell, and Dedekind himself) helped us to realize that forming equivalence classes is a new way to define mathematical objects, old or new. This changed equivalence from being an organisational tool (used to organize the objects at hand) to being a creative tool (used to create new objects). This was not just recreating different aspects of our everyday experience of equivalence; rather, it was creating something out of the original equivalence. Thus, let us put it here, further away from the other three aspects mentioned above, to highlight its fundamental difference.

(4) We learned to appreciate equivalence classes on their own, differentiated from their individual representatives.

Cantor’s definition of cardinal numbers and Russell’s definition of numbers contributed to the popularity of the approach. Meanwhile, the development of set theory gave us a powerful tool to formalize and express it. To quote Quine (1940, p. 121):

This is the end; no abstract objects other than classes are needed: no relations, functions, numbers, etc., except insofar as these are constructed simply as classes.

That end has made the study of the relevant history a very complicated endeavour. Set theory, as powerful as it is, has nearly hidden all the distinctions made in this paper. As such, Euclid’s definition of ratio is hardly distinguishable from a modern definition of ratio as an equivalence class; after all, both start with this fundamental observation that “Ratios which are the same with the same ratio are also the same with each other.” No wonder that Fowler could not convince Zeeman for about 40 years. The only tool that Fowler had was the normative account of equivalence relations and equivalence classes. The approach used in this study allowed us to see the variation in historical examples that seemed to be the same through the window of the standard framework. However, our approach shares something critical with Fowler’s: both are less attentive to philosophical discussions historically interwoven with the subject; Fowler even less so.

Philosophers of math say they can find it in Frege; I couldn’t, at least not in a form recognisable to me! (Perhaps you may have noticed that mathematicians are in general ignorant and lukewarm about Frege, so I won’t say any more!)

Yet, perhaps the main contribution belongs not to a person, a mathematical theory, or a domain of study, but to the acceptance of what Timothy Gowers (2002, p. 18) calls “the abstract method in mathematics”, which “can be encapsulated in the following slogan: a mathematical object is what it does.” The following comment on the definition of cardinal numbers, written by Hausdorff about a century ago, shows the influence of this attitude on our understanding of equivalence.

This formal explanation says what the cardinal numbers are supposed to do, not what they are. More precise definitions have been attempted but they are unsatisfactory and unnecessary. Relations between cardinal numbers are merely a more convenient way of expressing relations between sets; we must leave the determination of the “essence” of the cardinal number to philosophy. (Hausdorff 1914, pp. 28–29)

How and why this attitude started and spread is another story. Whatever that story is, it seems that equivalence is one of its main characters. Equivalence has had many different faces and, for a long time, no generally accepted name and at the same time many different locally used names. As Jeremy Gray has eloquently summed it up: “It is striking, though, as you show, that when subsequent mathematicians felt the need for these concepts they found themselves inventing new words, which strongly suggests that they were unaware that the concepts were already out there. But history does not favour the discoverers of modest concepts.” (private communication). The four aspects of the created equivalence can be used to study the works overlooked in the current paper (e.g. Kronecker or Gauss’ theory of binary quadratic forms). However, it seems that the acceptance, polishing, and mathematicising of equivalence has been mainly an unplanned collaborative work, fertilizing its own underlying ground of “the abstract method in mathematics”.