Abstract

This chapter starts with the counting or natural numbers, formalising them using structured types, thus allowing the definition of the standard numeric operators such as addition and multiplication. Using the natural numbers, the notions of mathematical and strong induction are formalised and illustrated through examples. Other classes of numbers, such as the integers, rational numbers and real numbers are also discussed. The notion of cardinality is presented, showing how one can reason about the size of finite, but also infinite sets.

Keywords

Natural Number, Prime Number, Rational Number, Proof Obligation, Quotient Part

We have already reached Chap. 9 in a book about mathematics without having yet discussed numbers. Although we have already used numbers in previous chapters to encode notions such as multisets and sequences, unlike the formal treatment we gave the other concepts we have explored, we have brushed aside parts of proofs which use properties of numbers simply by saying that ‘this follows from basic laws of arithmetic’. It is now time to show how we can encode the notions of numbers, addition and other operators in a formal manner and prove these basic laws of arithmetic once and for all.

9.1 Natural Numbers

The notion of numbers and quantities, which we use on a daily basis, is incredibly general. The concept of a number, say eight, is an abstraction of collections of objects at some point in time (there are eight apples in the basket), but also across time (I have been to Paris eight times). Identifying and naming the abstract notion of eight, encompassing these uses, allows us to show properties across all concrete uses of the notion. For example, if we show that eight can be split into two equal parts (four and four), we can apply this to apples in a basket, times I have been to Paris and the price of a cortado in Buenos Aires.

Mathematicians have refined this general notion of quantities, identifying different types of numbers useful for different applications: the basic counting numbers (useful to count sheep), extended with negative numbers (useful to keep track of a bank account balance), fractions (useful to split cakes and divide an inheritance), etc. The most basic of these notions, and the underlying notion behind numbers, is that of the counting, or so-called natural, numbers. They are denoted by the symbol ℕ, and contain all theoretically possible answers you can give to the question ‘how many angels fit on the head of a pin?’ or ‘how many sheep do you keep in your kitchen?’¹ This allows for answers ranging over finite whole numbers. Note that zero is a natural number, since I have no sheep in my kitchen. Some textbooks take the natural numbers to start from one, which can be convenient in some contexts, but is not so natural if we want to use these numbers to count.

9.1.1 Defining the Counting Numbers

Based on this informal explanation of what the natural numbers are, we can conceptually construct them by starting with zero, corresponding to an empty kitchen. We then let sheep into the kitchen, one at a time, each time declaring the resulting number of sheep in the kitchen to be a natural number. This works, since when given a natural number we can take its successor (increment it) to produce a new number. This gives us a way of encoding the natural numbers using structured types, with a constructor Zero and a constructor Successor which takes a natural number as its parameter:
ℕ ::= Zero | Successor(ℕ)
Note that the axioms for structured types which we saw in the previous chapter give us structural equality over natural numbers, which corresponds to our everyday notion of equality of numbers. For instance, zero is not equal to the successor of any natural number, and if two numbers are equal, so are their successors. Similarly, the axiom guaranteeing minimality of the type guarantees that, for instance, negative numbers and fractions are not included in the type.
The basis for this axiomatisation of the natural numbers was originally formulated by Giuseppe Peano in the 19th century, and the axioms are usually referred to as Peano’s axioms. Here we have started with a more general formalisation for structured types and taken a concrete instance to obtain the natural numbers. The inductive principle for this structured type is nothing but the familiar rule of numeric induction, with which you are probably already familiar: for a property π over the natural numbers, from π(Zero) and ∀k:ℕ⋅π(k) ⇒ π(Successor(k)), we can conclude that ∀n:ℕ⋅π(n). If we prove a property for the value of 0 and we prove that when it holds for k it also holds for k+1 (the successor of k), then the property must hold for all the natural numbers. We will see this principle applied to various operators in the rest of this chapter.
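
As an informal illustration, the structured type of natural numbers can be mimicked in a programming language. The following is a minimal Python sketch and not part of the formal development. The class names follow the constructors Zero and Successor, while the field name pred and the helpers from_int and to_int are assumptions introduced here for illustration. Frozen dataclasses compare structurally, so two values are equal exactly when they are built from the same constructors.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Zero:
    """The natural number zero (an empty kitchen)."""


@dataclass(frozen=True)
class Successor:
    """The successor of the natural number stored in `pred`."""
    pred: "Nat"


Nat = Union[Zero, Successor]


def from_int(i: int) -> Nat:
    """Build the numeral with i Successor constructors around Zero."""
    n: Nat = Zero()
    for _ in range(i):
        n = Successor(n)
    return n


def to_int(n: Nat) -> int:
    """Count the Successor constructors wrapped around Zero."""
    count = 0
    while isinstance(n, Successor):
        n, count = n.pred, count + 1
    return count
```

For instance, from_int(2) builds Successor(Successor(Zero())), and Zero() is never equal to a value built using Successor, mirroring the structural equality axioms.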

9.1.2 Defining the Arithmetic Operators

As we have seen in the previous chapter, the notion of structural equality allows us to define functions using constructor patterns.

9.1.2.1 Comparing Natural Numbers

We will use the notion of structural equality for numeric equality, which we write as n=m. We will also write n≠m to denote that n is not equal to m, which is defined as ¬(n=m). In addition to equality and non-equality, we will define the inequality relations less-than (<), greater-than (>), less-than-or-equal-to (≤) and greater-than-or-equal-to (≥). We start by recursively defining less-than-or-equal-to as a function returning a Boolean value:
Zero ≤ m = true
Successor(n) ≤ Zero = false
Successor(n) ≤ Successor(m) = n ≤ m
Based on this definition, we can prove properties of ≤. We start by proving that it is transitive.

Theorem 9.1

The relation ≤ is transitive: ∀l,m,n:ℕ⋅l≤m ∧ m≤n ⇒ l≤n.

Proof

The proof follows by induction on variable l.
Base case:

For l=Zero. We have to prove that ∀m,n:ℕ⋅Zero≤m ∧ m≤n ⇒ Zero≤n. This clearly holds, since Zero≤n for any value of n.

Inductive case:
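For the inductive case, we assume that the result holds for l=k: ∀m,n:ℕ⋅k≤m ∧ m≤n ⇒ k≤n. On the basis of this, we have to prove that it holds for Successor(k). We thus have to prove that: ∀m,n:ℕ⋅Successor(k)≤m ∧ m≤n ⇒ Successor(k)≤n. Assume that Successor(k)≤m and m≤n. By the definition of ≤, Successor(k)≤Zero is false, so m must be of the form Successor(m′) with k≤m′. Similarly, since Successor(m′)≤n, n must be of the form Successor(n′) with m′≤n′. By the inductive hypothesis, k≤n′, and hence Successor(k)≤Successor(n′). It thus follows that: Successor(k)≤m ∧ m≤n ⇒ Successor(k)≤n, which allows us to complete the proof of the inductive case.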

The result thus follows by the inductive principle. ■

We will show another property of ≤, in which we use the structural equality axioms.

Proposition 9.2

Less-than-or-equal-to is reflexive, in that it relates pairs of structurally equivalent numbers: ∀n,m:ℕ⋅n=m ⇒ n≤m.

Proof

The proof uses induction on variable n.
Base case:

For n=Zero, we have to prove that ∀m:ℕ⋅Zero=m ⇒ Zero≤m. However, by line 1 of the definition of ≤, it follows that Zero≤m for any value of m, which is sufficient to prove the base case.

Inductive case:
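We will assume that the property holds for n=k: ∀m:ℕ⋅k=m ⇒ k≤m. We now prove that it also holds for Successor(k): ∀m:ℕ⋅Successor(k)=m ⇒ Successor(k)≤m. Assume that Successor(k)=m. We then have to show that Successor(k)≤Successor(k) which, by the definition of ≤ on two successors, reduces to k≤k. This follows from the inductive hypothesis, taking m to be k. This completes the proof for the inductive case.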
The result follows using the principle of induction. ■
The other inequality relations can be similarly defined using constructor matching. However, another approach is to define them in terms of ≤, for instance by taking n≥m to mean m≤n, n<m to mean n≤m ∧ n≠m, and n>m to mean m<n. This allows us to reduce all proofs about these operators to use the definition of ≤.
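
The recursive definition of ≤ by constructor patterns, and the reduction of the other relations to it, can be sketched in Python. This is only an illustration under stated assumptions: the classes Zero and Successor are a hypothetical encoding of the structured type, the function names leq, geq, lt and gt are introduced here, and the way the last three are derived from leq is one possible choice rather than necessarily the one intended in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zero:
    pass


@dataclass(frozen=True)
class Successor:
    pred: object  # the number of which this is the successor


def leq(n, m) -> bool:
    """n ≤ m, by matching on the constructors of n and m."""
    if isinstance(n, Zero):
        return True                  # Zero ≤ m
    if isinstance(m, Zero):
        return False                 # Successor(n') ≤ Zero
    return leq(n.pred, m.pred)       # Successor(n') ≤ Successor(m')


def geq(n, m) -> bool:
    return leq(m, n)                 # n ≥ m means m ≤ n


def lt(n, m) -> bool:
    return leq(n, m) and n != m      # n < m means n ≤ m and n ≠ m


def gt(n, m) -> bool:
    return lt(m, n)                  # n > m means m < n


one = Successor(Zero())
two = Successor(one)
```

With these definitions, leq(one, two) and leq(two, two) hold while leq(two, one) does not, in line with the reflexivity result proved above.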

Exercises:

  1. 9.1

    Prove that ∀n,m:ℕ⋅n≤m ∧ m≤n ⇒ n=m.

     
  2. 9.2

    Prove that > is transitive.

     
  3. 9.3

    Prove that ≤ is a total order: ∀n,m:ℕ⋅n≤m ∨ m≤n.

     
  4. 9.4

    Define ≤ as a relation over the natural numbers, using the function defined in this section.

     
  5. 9.5

    Define > using a pattern-matching approach.

     

9.1.2.2 Addition

Addition conceptually corresponds to the combination of two collections of things together: given two bags of apples, their sum is the result of transferring them all into a single bag. One way of performing this operation is to repeatedly take one apple from the first bag and put it into the second until the first bag empties completely. The second bag is the result of the addition. This is the spirit of the formal definition of addition we use:
Zero + m = m
Successor(n) + m = n + Successor(m)
For instance, to calculate the result of adding Successor(Zero) (which we usually call 1) and Successor(Successor(Zero)) (which we usually call 2) the definition gives us:
Successor(Zero) + Successor(Successor(Zero))
= Zero + Successor(Successor(Successor(Zero)))   (line 2)
= Successor(Successor(Successor(Zero)))   (line 1)
This gives the result Successor(Successor(Successor(Zero))), which we usually call 3. Congratulations, you have just followed a proof that 1+2=3.

Note that the first operand of addition becomes smaller with each recursive call, thus guaranteeing that the definition is a sound one. Based on this definition, we can prove various properties of addition, such as commutativity and associativity. There are two ways of viewing these proofs. The first is that they are verification that addition satisfies these properties, while the second is that if we take the commutativity and associativity of addition as given, then the proofs are confirmation that the definition of addition makes sense.
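
The definition of addition, which moves one successor at a time from the first operand to the second, can be tried out with a small Python sketch. The classes Zero and Successor are a hypothetical encoding of the structured type, and the helper nat, which builds a numeral from a Python integer, is an assumption added purely for convenience.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zero:
    pass


@dataclass(frozen=True)
class Successor:
    pred: object  # the number of which this is the successor


def add(n, m):
    """Addition by pattern matching on the first operand."""
    if isinstance(n, Zero):
        return m                          # Zero + m = m
    return add(n.pred, Successor(m))      # Successor(n') + m = n' + Successor(m)


def nat(i: int):
    """Hypothetical helper: the numeral with i successors."""
    return Zero() if i == 0 else Successor(nat(i - 1))


# The calculation of 1+2: one successor moves across, then the first line applies.
three = add(nat(1), nat(2))
```

Each recursive call strips one Successor from the first operand, which is why the recursion terminates. Checking add(nat(a), nat(b)) == add(nat(b), nat(a)) for a few small values is a useful sanity test, although it is of course no substitute for a proof of commutativity.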

Sometimes, a different definition of addition is used, replacing the second line by one equating Successor(n)+m with Successor(n+m). The intuition is that, rather than moving apples from the first to the second bag and then adding the two bags together, we put aside one apple from the first bag, add the bags, and then add the apple which we put aside to the result. We will start by proving that this equality follows from our definition of addition.

Lemma 9.3

∀n:ℕ⋅∀m:ℕ⋅Successor(n)+m=Successor(n+m).

Proof

The proof will use induction on variable n.
Base case:
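We start by proving the property for n=Zero. We have to prove that ∀m:ℕ⋅Successor(Zero)+m=Successor(Zero+m). By line 2 of the definition of addition, Successor(Zero)+m=Zero+Successor(m), which is equal to Successor(m) by line 1. Since line 1 also gives Zero+m=m, this is equal to Successor(Zero+m).
Inductive case:
We now assume that the property holds for a particular value of n, with n=k: ∀m:ℕ⋅Successor(k)+m=Successor(k+m). We now prove that the property also holds for Successor(k): ∀m:ℕ⋅Successor(Successor(k))+m=Successor(Successor(k)+m). The proof of this statement follows:
Successor(Successor(k))+m
= Successor(k)+Successor(m)   (line 2 of the definition of addition)
= Successor(k+Successor(m))   (inductive hypothesis)
= Successor(Successor(k)+m)   (line 2 of the definition of addition)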
This completes the proof for the inductive case, thus allowing us to conclude the general result using induction. ■

This result allows us to prove various other properties of addition. We will show that addition is commutative, confirming what you have been told in school since you were very young.

Theorem 9.4

Addition is commutative: ∀n:ℕ⋅∀m:ℕ⋅n+m=m+n.

Proof

The proof uses induction on variable n.
Base case:
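For n=Zero, we have to prove that ∀m:ℕ⋅Zero+m=m+Zero. By line 1 of the definition of addition, Zero+m=m, which is in turn equal to m+Zero. This last step of the proof uses a result we have not proved, which says that zero is the right identity of addition. The proof of this conjecture will be left as an exercise to the reader.
Inductive case:
We assume that the result holds for some particular value k of n: ∀m:ℕ⋅k+m=m+k. Based on this, we prove that it also holds for n=Successor(k):
Successor(k)+m
= Successor(k+m)   (Lemma 9.3)
= Successor(m+k)   (inductive hypothesis)
= Successor(m)+k   (Lemma 9.3)
= m+Successor(k)   (line 2 of the definition of addition)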
This concludes the inductive case, which allows us to conclude the desired result using the principle of induction. ■

We have shown how addition can be defined on the type of natural numbers, and how we can prove general properties of addition. In the rest of the chapter we will continue with this approach of defining new operators on the naturals and proving properties of the operators.

Exercises:

  1. 9.6

    Prove that zero is the right identity of addition: ∀m:ℕ⋅m+Zero=m.

     
  2. 9.7

    Prove that addition is associative: ∀l,m,n:ℕ⋅l+(m+n)=(l+m)+n. Hint: use induction on variable l.

     
  3. 9.8

    Prove that addition preserves >: if n>m then l+n>l+m.

     
  4. 9.9
    In the first lemma about addition (Lemma 9.3), we proved that, from our definition, it follows that Successor(n)+m=Successor(n+m). Now consider the opposite situation. Let us take the following to be our definition of addition: Zero+m=m and Successor(n)+m=Successor(n+m). Based on this definition, prove that Successor(n)+m=n+Successor(m). In this manner, we will have shown that the two definitions are, in fact, equivalent.
     

9.1.2.3 Subtraction

It is worthwhile to start by noting that subtraction on the natural numbers is a partial function, in that n−m is not defined when m is larger than n. There are two intuitive ways of defining subtraction:
Property-based definition:

One can look at subtraction as the operation which acts as the opposite of addition, and which should thus satisfy: (i) adding a number and then subtracting it should result in the original value: ∀n,m:ℕ⋅(n+m)−m=n; and (ii) subtracting a number from one which is not smaller and then adding it should result in the original value: ∀n,m:ℕ⋅n≥m ⇒ (n−m)+m=n.

Algorithm-based definition:
Subtraction can be seen as the operation which removes an item in the first bag for each item in the second, thus corresponding to the following pattern-based definition:
n − Zero = n
Successor(n) − Successor(m) = n − m
Note that, unlike the property-based approach, the equations give us an algorithm to calculate the subtraction of two numbers.
Which definition to adopt is largely a matter of style, because either one can be proved as a consequence of the other, thus making them equivalent formulations. In computer science texts, one tends to find more algorithmic (sometimes also called constructive) definitions, which tell us how to compute the result, and from which properties of the operators are then proved. Mathematical texts, on the other hand, tend to prefer to define operators in terms of their properties, and then prove that a constructive approach gives a solution of the specification. Here, we will adopt the constructive approach, and prove the properties as consequences of the definitions.
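
The algorithm-based view of subtraction can be sketched in Python as follows. As before, the classes Zero and Successor are a hypothetical encoding of the structured type and nat is a convenience helper assumed for the example. Since subtraction is a partial function, this sketch chooses to raise an exception where no equation of the definition applies. That choice is an assumption of the sketch, not something prescribed by the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zero:
    pass


@dataclass(frozen=True)
class Successor:
    pred: object  # the number of which this is the successor


def sub(n, m):
    """Subtraction by pattern matching on the second operand."""
    if isinstance(m, Zero):
        return n                      # n − Zero = n
    if isinstance(n, Zero):
        # No equation covers Zero − Successor(m'): the result is undefined.
        raise ValueError("n − m is undefined when m is larger than n")
    return sub(n.pred, m.pred)        # Successor(n') − Successor(m') = n' − m'


def nat(i: int):
    """Hypothetical helper: the numeral with i successors."""
    return Zero() if i == 0 else Successor(nat(i - 1))
```

For example, sub(nat(5), nat(2)) evaluates to nat(3), while sub(nat(1), nat(2)) raises ValueError, reflecting that 1−2 has no value in the natural numbers.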

Theorem 9.5

Addition nullifies the effect of subtraction: ∀n,m:ℕ⋅n≥m ⇒ (n−m)+m=n.

Proof

The proof will follow by induction. Note that subtraction uses pattern-matching on the second operand, and thus induction on variable m will be applied.
Base case:

For m=Zero. We have to prove that (n−Zero)+Zero=n. The result follows by applying line 1 of the definition of subtraction, followed by the fact that zero is the right identity of addition (Exercise 9.6).

Inductive case:
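We start by assuming that the property holds for m=k: ∀n:ℕ⋅n≥k ⇒ (n−k)+k=n, based on which we will prove that the property also holds for m=Successor(k): ∀n:ℕ⋅n≥Successor(k) ⇒ (n−Successor(k))+Successor(k)=n. We thus assume that n≥Successor(k), and prove the equality. Since n≥Successor(k), n cannot be Zero, and must thus be of the form Successor(n′) with n′≥k:
(Successor(n′)−Successor(k))+Successor(k)
= (n′−k)+Successor(k)   (line 2 of the definition of subtraction)
= Successor(n′−k)+k   (line 2 of the definition of addition)
= Successor((n′−k)+k)   (Lemma 9.3)
= Successor(n′)   (inductive hypothesis, since n′≥k)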
This completes the proof by numeric induction. ■

The proof of the other property is left as an exercise. If we were to complete a proof of equivalence, we would have to prove that (i) the property-based approach defines a function, with at most one solution for n−m; and (ii) whenever the property-based approach provides a solution, the algorithmic approach always terminates, and gives an answer. At this stage we will not dwell on this further, but we will be discussing these issues later on in Chap. 10.

Exercises:

  1. 9.10

    Prove that ∀n:ℕ⋅n−n=Zero.

     
  2. 9.11

    Prove that subtraction nullifies addition: ∀n,m:ℕ⋅(n+m)−m=n.

     
  3. 9.12

    Prove that, if n>m, then n−m>Zero.

     
  4. 9.13

    Prove that subtraction (sometimes) commutes over addition: ∀l,m,n:ℕ⋅l≥m ⇒ (l+n)−m=(l−m)+n.

     
  5. 9.14
    Another approach to defining subtraction is to start by defining the predecessor of a number, and then defining subtraction as taking the predecessor as many times as the second operand of subtraction, as shown below: Prove that the two operators give the same result:
     

9.1.2.4 Multiplication

Multiplication corresponds to repeated addition, and n×m can be seen as shorthand for the addition of n copies of m. The formal definition follows from this intuitive meaning of multiplication and is based on two observations: (i) zero copies of any number m result in zero; and (ii) Successor(n) copies of a number m is the same as n copies of m, plus an additional m.
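
The two observations above translate directly into a recursive function. The following Python sketch is only an illustration: the classes Zero and Successor are a hypothetical encoding of the structured type, nat is a helper assumed for convenience, and writing the second case as (n×m)+m rather than m+(n×m) is a choice made here, since the text describes it only in words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zero:
    pass


@dataclass(frozen=True)
class Successor:
    pred: object  # the number of which this is the successor


def add(n, m):
    """Addition by pattern matching on the first operand."""
    if isinstance(n, Zero):
        return m
    return add(n.pred, Successor(m))


def mul(n, m):
    """Multiplication as repeated addition, matching on the first operand."""
    if isinstance(n, Zero):
        return Zero()                 # zero copies of m result in zero
    return add(mul(n.pred, m), m)     # n' copies of m, plus an additional m


def nat(i: int):
    """Hypothetical helper: the numeral with i successors."""
    return Zero() if i == 0 else Successor(nat(i - 1))
```

For example, mul(nat(3), nat(4)) unfolds into three additions of nat(4) starting from Zero(), giving nat(12).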

Lemma 9.6

Multiplication can also be done by adding the first operand for each successor in the second: (i) n×Zero=Zero; and (ii) n×Successor(m)=n+(n×m).

Proof

We will prove the second law, and leave (i) as an exercise for the reader. The second proposition will be proved by induction on variable n.
Base case:
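We have to prove that Zero×Successor(m)=Zero+(Zero×m). By the definition of multiplication, the left-hand side is equal to Zero, while the right-hand side is equal to Zero+Zero, which is Zero by line 1 of the definition of addition.
Inductive case:
We assume that the result is true for n=k: ∀m:ℕ⋅k×Successor(m)=k+(k×m). Based on this we prove that it also holds for n=Successor(k): ∀m:ℕ⋅Successor(k)×Successor(m)=Successor(k)+(Successor(k)×m).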
Using the inductive principle, the result thus follows. ■

Based on Lemma 9.6, we can now prove commutativity of multiplication.

Theorem 9.7

Multiplication is commutative: ∀n,m:ℕ⋅n×m=m×n.

Proof

The proof is by induction on variable n.
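Base case:
For n=Zero, we have to prove that ∀m:ℕ⋅Zero×m=m×Zero. By the definition of multiplication, Zero×m=Zero, and by Lemma 9.6(i), m×Zero=Zero.
Inductive case:
We assume that the property holds for n=k: ∀m:ℕ⋅k×m=m×k. We proceed to prove that it thus also holds for n=Successor(k). By the definition of multiplication, Successor(k)×m is k×m with an additional m added which, by the inductive hypothesis, is m×k with an additional m added. By Lemma 9.6(ii) and the commutativity of addition, this is equal to m×Successor(k).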
The result thus holds by induction. ■

Exercises:

  1. 9.15

    Prove law (i) of Lemma 9.6: ∀n:ℕ⋅n×Zero=Zero.

     
  2. 9.16
    Prove that multiplication is associative: ∀l,m,n:ℕ⋅l×(m×n)=(l×m)×n.
     
  3. 9.17
    Prove that multiplication on the left distributes over addition: ∀l,m,n:ℕ⋅l×(m+n)=(l×m)+(l×n).
     
  4. 9.18

    Prove that if n>Zero, then n×m≥m.

     
  5. 9.19
    Prove that multiplication on the right distributes over subtraction: ∀l,m,n:ℕ⋅l≥m ⇒ (l−m)×n=(l×n)−(m×n).
     
  6. 9.20
    In this exercise, we will explore the definition of natural number exponentiation. We will write n^m to denote n to the power of m: n×n×⋯×n (with m copies of n). In the same way that addition is repeated incrementation, and multiplication is repeated addition, one can define powers of natural numbers as repeated multiplication.
    1. (i)

      Keeping in mind that a number to the power of zero is equal to Successor(Zero), recursively define n^m using pattern-matching on m.

       
    2. (ii)

      Prove that ∀l,m,n:ℕ⋅(l^m)^n=l^(m×n).

       
    3. (iii)

      Prove that ∀l,m,n:ℕ⋅l^m×l^n=l^(m+n).

       
     

9.1.3 Strong Induction

Informally, we find numeric induction to be a reasonable principle by noting that the property holds for Zero (the first proof obligation for induction), which implies that it also holds for Successor(Zero) (using the second proof obligation on the previous statement), which in turn implies that it also holds for Successor(Successor(Zero)), etc., until we reach the desired number. Moving up the number line uses the inductive case: if it holds for value k, then it must also hold for Successor(k).

Now consider another proof principle, with two proof obligations: (i) that the property holds for Zero (just like normal induction), and (ii) that if the property holds for all numbers up to and including k, then it must also hold for Successor(k). Informally, this principle also seems to be a sound one, but note that the inductive hypothesis is not simply that the property holds for k, but that it holds for all values up to and including k. In some cases, this stronger assumption enables us to prove properties which would otherwise not be directly provable using normal induction.

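Formally, the principle of strong induction is the following: for a property π over the natural numbers, from π(Zero) and ∀k:ℕ⋅(∀k′:ℕ⋅k′≤k ⇒ π(k′)) ⇒ π(Successor(k)), we can conclude that ∀n:ℕ⋅π(n). Here is an informal example of the use of strong induction.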

Example

Every number n can be written as a sum of distinct powers of 2. For example, 19 can be written as the sum of three powers 2^0+2^1+2^4, and 24 can be written as the sum of two powers 2^3+2^4. This statement can be proved by strong induction on n.
Base case:

For n=Zero. One can express zero as an empty sum of powers of two.

Inductive case:

We will assume that the result holds for all values up to and including k. We now need to prove that it also holds for Successor(k).

Consider the value of α such that 2^α ≤ Successor(k) < 2^(α+1). Since Successor(k)−2^α < Successor(k), we can use the inductive hypothesis to split Successor(k)−2^α into a sum of distinct powers of two. Adding 2^α to the sum, we get a new sum of powers of two which adds up to Successor(k). Are the powers still distinct? It suffices to show that 2^α could not have appeared in the powers of two adding up to Successor(k)−2^α.

Since 2^(α+1) = 2×2^α = 2^α+2^α, and Successor(k) < 2^(α+1), we can conclude that Successor(k) < 2^α+2^α. This confirms that Successor(k)−2^α < 2^α, and therefore the powers of two adding up to Successor(k)−2^α cannot include the power α.

This completes the proof by strong induction. Note that this result justifies how we encode numbers in binary on a computer. ⋄
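
The structure of this proof can be read as a recursive procedure: find the largest power of two which fits, and recurse on what is left. The following Python sketch follows that reading. It works on ordinary Python integers rather than on the structured type, and the function name powers_of_two is an assumption introduced for the illustration.

```python
def powers_of_two(n: int) -> list:
    """Return exponents of distinct powers of two which add up to n."""
    if n == 0:
        return []                      # base case: the empty sum
    alpha = 0
    while 2 ** (alpha + 1) <= n:
        alpha += 1                     # ends with 2^alpha ≤ n < 2^(alpha+1)
    # n − 2^alpha is smaller than n, mirroring the use of the strong
    # inductive hypothesis, and is also smaller than 2^alpha, so alpha
    # cannot reappear among the exponents returned by the recursive call.
    return powers_of_two(n - 2 ** alpha) + [alpha]
```

For instance, powers_of_two(19) returns [0, 1, 4] and powers_of_two(24) returns [3, 4], matching the two decompositions given in the example.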

What is particularly interesting is that the proof rule for strong induction can be proved using normal induction. An informal argument of the proof follows.

Theorem 9.8

Strong induction is a sound principle.

Proof

To prove that strong induction is a sound principle, we assume the antecedents of the rule, and prove that the conclusion is true. Let us assume the following two predicates:
  1. (i)

    π(Zero)

     
  2. (ii)

    ∀k:ℕ⋅(∀k′:ℕ⋅k′≤k ⇒ π(k′)) ⇒ π(Successor(k))

     
Based on this we would like to show that: ∀n:ℕ⋅π(n), which would suffice to show that strong induction is a sound principle.
Consider the property π′, defined as follows: π′(n) holds exactly when ∀n′:ℕ⋅n′≤n ⇒ π(n′). Note that π′(n) implies π(n). Therefore, if we manage to prove property ∀n:ℕ⋅π′(n), then we can conclude the desired result that ∀n:ℕ⋅π(n).
We will prove property π′(n) for all values of n using normal induction.
Base case:
For n=Zero. We need to prove π′(Zero): ∀n′:ℕ⋅n′≤Zero ⇒ π(n′). The only value of n′ satisfying n′≤Zero is Zero itself. But by (i), we know that π(Zero) holds, hence so does π′(Zero).
Inductive case:
We assume that π′(k) holds: ∀n′:ℕ⋅n′≤k ⇒ π(n′), based on which we would like to prove that π′(Successor(k)) also holds: ∀n′:ℕ⋅n′≤Successor(k) ⇒ π(n′). If n′≤Successor(k), then either n′≤k, in which case π(n′) follows from the assumption π′(k), or n′=Successor(k), in which case π(n′) follows from (ii), whose antecedent is exactly π′(k). Hence the inductive case also holds.
By using normal induction, we have thus proved that ∀n:ℕ⋅π′(n), which is sufficient to prove the conclusion of strong induction. ■

We will be seeing other applications of strong induction in the rest of the chapter.

9.1.4 Division and Prime Numbers

The notion of prime numbers, numbers greater than 1 which are exactly divisible only by 1 and themselves, has a long history in mathematics. Prime numbers correspond closely to the notion of atomic numbers—numbers which cannot be constructed as a product of smaller numbers. Although studied since the Ancient Greeks, they have only found practical applications very recently. Nowadays, prime numbers play a crucial role in encryption algorithms, and securely buying from an online store would not be possible without them.

9.1.4.1 Multiples, Divisors and Remainders

Since prime numbers are based on division, we start by formally defining this operator. It is tempting to view division as a single operator which acts as the inverse of multiplication. For example, the value of 12÷3 should be the value of x satisfying x×3=12, making it equal to 4. However, this approach leaves 7÷3 undefined, since there is no natural number which satisfies x×3=7. Without substantial changes, this approach would thus limit us to talk about exact division. Instead, we split the notion of division into two parts: the quotient, or how many times a number fits into another, and the remainder. For instance, 7÷3 would give two results: (i) the quotient part equal to 2, which says that 3 fits at most twice in 7; and (ii) the remainder part equal to 1, which says that removing two threes from the value of 7 leaves 1 unused. We will write these as 7 ÷_q 3 and 7 ÷_r 3, respectively.

Let us start by looking at the remainder part of division.

Definition 9.9

The remainder of a natural number division can be algorithmically defined in the following manner. To calculate the remainder of n when divided by m, we subtract m from n repeatedly, until n<m, in which case we just take n:
n ÷_r m = n   if n<m
n ÷_r m = (n−m) ÷_r m   otherwise
Two things to note are that (i) the definition assumes that m≠Zero, otherwise it would not be well defined, since the parameters of the recursive call are not smaller; and (ii) for the second case to be taken, it must be the case that n≥m, which means that the subtraction is well-defined. ■

One important observation of this operator is that the remainder should always be smaller than the divisor.

Theorem 9.10

The remainder of a division operation is always smaller than the divisor: ∀n,m:ℕ⋅n ÷_r m < m.

Proof

This proof uses strong induction on variable n. Keep in mind that m>Zero for the division to be meaningful.
Base case:
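For n=Zero. We need to prove that ∀m:ℕ⋅Zero ÷_r m < m. Since m>Zero, the first line of the definition of remainder applies, giving Zero ÷_r m = Zero, which is smaller than m.
Inductive case:
Since we are using strong induction, we assume that ∀m:ℕ⋅k′ ÷_r m < m for all k′≤k. Now let us consider the property for n=Successor(k). We consider two cases: (a) Successor(k)<m; and (b) Successor(k)≥m.
Case (a):
Successor(k)<m: By the first line of the definition of remainder, Successor(k) ÷_r m = Successor(k), which is smaller than m.
Case (b):
Successor(k)≥m: By the second line of the definition of remainder, Successor(k) ÷_r m = (Successor(k)−m) ÷_r m. Since m>Zero, it follows that Successor(k)−m ≤ k, and the inductive hypothesis thus gives (Successor(k)−m) ÷_r m < m.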
Since the result follows in both cases, which cover all possibilities, we can conclude that the inductive case holds.
This completes the proof using strong induction. ■

It is worth noting that some of the lines of the above proof were not fully rigorous, since they used results which we have not proved elsewhere. However, we do have the tools to prove these results. We will now start taking larger steps in our proofs, knowing however, that we would be able to prove the additional results if necessary.

We can now also define the quotient part of the division.

Definition 9.11

The quotient of n divided by m is defined in the following manner: if n<m, the result is Zero, otherwise calculate the quotient of n−m divided by m, adding one to the result.
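
The definitions of remainder and quotient by repeated subtraction can be sketched in Python as follows. For brevity, the sketch works on ordinary Python integers rather than on the structured type, and the function names remainder and quotient are assumptions introduced for the illustration. Rejecting a zero divisor with an exception is a choice of the sketch, reflecting the requirement that the divisor be non-zero.

```python
def remainder(n: int, m: int) -> int:
    """n ÷_r m, computed by repeatedly subtracting m from n."""
    if m == 0:
        raise ValueError("the divisor must not be zero")
    if n < m:
        return n                       # nothing more can be subtracted
    return remainder(n - m, m)


def quotient(n: int, m: int) -> int:
    """n ÷_q m, counting how many times m can be subtracted from n."""
    if m == 0:
        raise ValueError("the divisor must not be zero")
    if n < m:
        return 0
    return quotient(n - m, m) + 1
```

For example, quotient(7, 3) is 2 and remainder(7, 3) is 1, and recombining them as 2×3+1 gives back 7.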

These operators allow us to reason about natural number division in a complete manner. The following fundamental theorem shows that, given two numbers, there is only one reasonable choice of quotient and remainder.

Theorem 9.12

For any natural numbers n and m, there exist unique natural numbers q and r, with r<m, such that n=q×m+r.

Proof

This proof is split into two parts: we first need to prove that there do exist natural numbers which satisfy the equality, and secondly, we prove that these natural numbers q and r are unique. For the first part, we note that q=n ÷_q m and r=n ÷_r m are solutions to the equation. We leave it as an exercise to prove that: (i) n ÷_r m < m, and (ii) n=(n ÷_q m)×m+(n ÷_r m).

The proof of uniqueness is a proof by contradiction. Assuming a statement, we reach a contradiction, from which we can conclude that the assumed statement was false.

Let us assume that we have two different solutions q₁, r₁ and q₂, r₂, with q₁≤q₂. We can reason as follows: since both are solutions, q₁×m+r₁ = n = q₂×m+r₂, and since q₁≤q₂, we can subtract q₁×m from both sides to obtain r₁=(q₂−q₁)×m+r₂. We now split q₁≤q₂ into two cases, q₁=q₂ and q₁<q₂, and consider them separately:
Case (a):

If q₁=q₂, from r₁=(q₂−q₁)×m+r₂ we can conclude that r₁=r₂. Therefore, in this case, q₁=q₂ and r₁=r₂. The two solutions are not distinct at all, which contradicts the assumption that the solutions are different.

Case (b):
If q₁<q₂ (or equivalently, q₂>q₁), we can reason as follows: q₂−q₁≥1, and hence r₁=(q₂−q₁)×m+r₂≥m+r₂≥m. This contradicts the original statement that r₁<m.
In both cases, we have discovered a contradiction, implying that our assumption that there exist distinct solutions for q and r is wrong. Therefore, there must be no more than one solution. ■

The opposite direction is also the case. Combining a number using a quotient and remainder will return the original quotient and remainder upon division:

Corollary 9.13

For any natural numbers q, d and r (r<d), ((q×d)+r) ÷_r d = r and ((q×d)+r) ÷_q d = q.

Proof

Define α=q×d+r. But by the first part of the previous theorem, we also know that α=(α ÷_q d)×d+(α ÷_r d). Since the solutions for q and r are unique, we can conclude that ((q×d)+r) ÷_r d = r and ((q×d)+r) ÷_q d = q. ■

Division is considerably more complex than the other operators we have encountered so far. However, the results we proved in this section are crucial for reasoning about prime numbers in the next section.

Exercises:

  1. 9.21

    Prove that ∀n:ℕ⋅n ÷_q 1 = n.

     
  2. 9.22

    Prove that ∀n:ℕ⋅n ÷_r 1 = Zero.

     
  3. 9.23

    Prove that ∀n,m:ℕ⋅n ÷_r m < m.

     
  4. 9.24

    Prove that ∀n,m:ℕ⋅n=(n ÷_q m)×m+(n ÷_r m).

     
  5. 9.25

    Prove that, if l ÷_r m = Zero and m ÷_r n = Zero, then l ÷_r n = Zero.

     

9.1.4.2 Prime Numbers

In studying objects and phenomena, an approach which frequently works is that of decomposing the object into its constituent parts. Starting by understanding the constituent parts, then how they interact and combine together, gives insight into how and why an object behaves as it does. We see this approach everywhere, from chemists studying compounds by looking at the elements which make them up, physicists studying how atoms work by looking at their constituent parts, linguists who study language by splitting sentences into phrases, to system developers, who when faced with a huge program seek to understand it by looking at the constituent parts. One of the fascinating things about this approach is that it can be iterated—we know that matter is made up of molecules, which are made up of atoms, which are made up of a combination of protons, neutrons and electrons, and so on. When to stop breaking down the objects into parts depends on the level of abstraction one would like to reason at, and whether further decomposition gives any further insight. For instance, when trying to understand a program written in a high-level language, it is rarely useful to study the machine code produced by the compiler, or the electrical signals going through your computer when the program is being executed.

The Ancient Greeks were interested in studying how numbers can be decomposed into smaller ones. Using the two main operators of addition and multiplication, they categorised numbers according to how they can be split into parts. For instance, the notion of square numbers corresponds to those which can be decomposed via multiplication into two equal parts, or according to addition as the sum of n copies of a number n—thus allowing a square placement of the items. Similarly, triangular numbers are the ones which can be decomposed as the sum of all the numbers from 1 to some number n—allowing a triangular placement.

For example, 10 is a triangular number, because it is equal to the sum 1+2+3+4.
Splitting numbers down in terms of multiplication, that is, splitting a number into two divisors, neither of which is 1, turned out to be particularly interesting. Although a number can be split in different ways (for example, 12 can be split as 2×6 or 3×4), if one reiterates the process until no further breaking down is possible, it turns out that there is only one way of splitting a number. For example, if 12 is split as 2×6, we note that 2 cannot be split any further, but 6 can be split into 2×3, neither of which can be split any further, thus splitting 12 as 2×2×3. Another way of splitting 12 is into 3×4. Once again, 3 cannot be split any further, but 4 can be split into 2×2, thus breaking down 12 into 3×2×2. Other than the order of the parts, we note that the two decompositions are equivalent. It turns out that this works for any number, which provides a tool to analyse and reason about numbers. The numbers which cannot be split any further give us a notion of indivisible or atomic numbers.

To be able to define and reason about prime numbers, we start by formally defining the notion of exact divisors, or factors of a number.

Definition 9.14

We say that a number n exactly divides another number m, written \(n \stackrel{\div}{\rightarrow} m\), if dividing m by n leaves no remainder: m ÷_r n = Zero.

The numbers which exactly divide a non-zero number n are called the divisors, or factors, of n: divisors(n) = {d:ℕ₁ | n ÷_r d = Zero}. The proper divisors of a number n∈ℕ₁ are the divisors of n excluding 1 and n: divisors⁺(n) = divisors(n) ∖ {1, n}.

Although the definition of divisors of a number leaves open the possibility that they may be larger than the number itself, we can prove that this is impossible:

Proposition 9.15

Divisors of a number are not larger than the number itself: ∀n,d:ℕ⋅d∈divisors(n) ⇒ d≤n. Furthermore, proper divisors are strictly smaller than the number itself.

Proof

The proof is by contradiction. Take n∈ℕ₁ and d∈divisors(n), and let us assume that d>n. By the first line of the definition of remainder, we know that n ÷_r d = n (since d>n). But since d is a divisor of n, n ÷_r d = 0, which means that n=0, which is impossible. Hence the assumption is wrong: ¬(d>n), or d≤n.

Since, by definition, a proper divisor d of n cannot be equal to n, it follows that d<n. ■

One important corollary of this proposition is that we can write a program to generate all factors of a number n, by checking the finite set of numbers from 1 to n.
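
Such a program might look like the following Python sketch. It works on ordinary Python integers, uses Python's % operator in place of the remainder operator ÷_r, and the function names divisors and proper_divisors are chosen here to echo the notation of Definition 9.14. The search is limited to the numbers from 1 to n, which Proposition 9.15 shows to be sufficient.

```python
def divisors(n: int) -> set:
    """All numbers which exactly divide the non-zero number n."""
    if n < 1:
        raise ValueError("divisors are only defined for non-zero numbers")
    # By Proposition 9.15, no divisor of n can be larger than n.
    return {d for d in range(1, n + 1) if n % d == 0}


def proper_divisors(n: int) -> set:
    """The divisors of n, excluding 1 and n itself."""
    return divisors(n) - {1, n}
```

For example, divisors(12) is {1, 2, 3, 4, 6, 12} and proper_divisors(12) is {2, 3, 4, 6}, while proper_divisors(7) is empty.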

We are now ready to define the notion of prime numbers.

Definition 9.16

A number p≥2 is said to be prime if it has no proper divisors: \(\mathit{prime}(p) \stackrel{\mathrm{df}}{=} \mathit{divisors}^{+}(p) = \emptyset\). ■
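
This definition can be turned directly into a test for primality, and from there into a simple generator of primes. The Python sketch below is one possible reading, working on ordinary Python integers with % standing in for the remainder operator. The names proper_divisors, is_prime and primes_up_to are assumptions made for the illustration, and no claim is made that this is the program intended in the text.

```python
def proper_divisors(n: int) -> set:
    """The divisors of n other than 1 and n itself."""
    return {d for d in range(2, n) if n % d == 0}


def is_prime(p: int) -> bool:
    """A number p ≥ 2 is prime if its set of proper divisors is empty."""
    return p >= 2 and not proper_divisors(p)


def primes_up_to(M: int) -> list:
    """All primes from 2 up to and including M, in increasing order."""
    return [p for p in range(2, M + 1) if is_prime(p)]
```

Each candidate is tested against every smaller number, so the work grows quickly with M. The sketch favours closeness to the definition over efficiency.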

Since, by the previous proposition, the exact divisors of a number are not larger than the number itself, we can write an algorithm that finds all prime numbers up to a maximum M, by checking every number from 2 to M for proper divisors. Such a program is far from being an efficient generator of prime numbers. However, we can prove properties allowing us to make the program more efficient.² Consider the following theorem, which says that it is sufficient to check whether a number has prime factors:

Theorem 9.17

A number (n≥2) is prime if and only if it has no proper prime divisors.

Proof Outline

Clearly, if a number is prime it has no proper divisors, and therefore has no proper prime divisors.

On the other hand, we need to prove that a number which has no proper prime divisors has to be prime. We will give an informal sketch of how this can be proved using strong induction.

For the base case (in this case, we take n=2), the proof is straightforward. The number 2 has no proper prime divisors, and is prime.

For the inductive case, we assume that all numbers up to and including k satisfy the property. Does Successor(k) also satisfy the property? We need to show that, if Successor(k) has no proper prime divisors, then it is prime.

Now, either Successor(k) is prime or it is not. In the first case, if Successor(k) is prime, then the result follows trivially. If it is not a prime, then it must have proper divisors. Let d be one of them. If d is prime, then the proof is done (since the condition of the implication is false). If d is not prime, and since d<Successor(k) (since d is a proper divisor of Successor(k)), by the inductive hypothesis it follows that d must have proper prime divisors—let us call one of them d′. Since d′ is a proper divisor of d, and d is a proper divisor of Successor(k), it follows from Exercise 9.25 that d′ is a divisor of Successor(k). We have thus found a proper prime divisor for Successor(k), which concludes the inductive case. □
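Theorem 9.17 suggests a cheaper primality test: a candidate need only be checked against the primes smaller than itself. The following Python sketch, under a function name of our own choosing, builds the list of primes incrementally and tests each new candidate only against the primes found so far.

```python
def primes_by_known_primes(maximum):
    """Primes up to maximum, checking candidates only against earlier primes.

    Every prime in the list is smaller than the current candidate, so a
    prime in the list that divides the candidate is a proper prime divisor.
    """
    primes = []
    for candidate in range(2, maximum + 1):
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
    return primes
```

This is trial division against previously found primes, rather than an array-based sieve which crosses out multiples.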

How does this theorem allow us to produce prime numbers more efficiently? If we start building a list of the prime numbers as we did before, we need only check for divisors amongst the prime numbers we have already generated. This algorithm was already known to the Ancient Greeks, and is known as the Sieve of Eratosthenes.

The divisors of a number n cannot be greater than n itself, implying it can only have a finite number of divisors. Can the same be said of prime numbers? Is there a largest prime number, implying that there exists a finite list of prime numbers through which we can construct all the natural numbers using multiplication? The following is a proof that no such finite list exists, a result which was already known to the Ancient Greeks.

Theorem 9.18

There is an infinite number of primes.

Proof Outline

The proof is a proof by contradiction. Let us assume that there is a finite number of prime numbers, which we will call \(p_{1}\) to \(p_{n}\). Now consider the number \(\alpha = p_{1} \times p_{2} \times \cdots \times p_{n} + 1\). Is \(p_{1}\) a divisor of α? Using Corollary 9.13 we know that \(\alpha \div_{r} p_{1} = 1\), implying that \(p_{1}\) is not a divisor of α. The same can be said of all the primes. The number α thus has no prime divisors, which, by Theorem 9.17, implies that it is itself prime. But α is larger than all the primes, and was therefore not included in the original set of primes. This contradicts the original assumption that there is a finite set of prime numbers. □
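The construction used in the proof can be tried out on a concrete list of primes. The Python sketch below, with names of our own choosing, computes α for the first six primes and the remainder it leaves on division by each of them. Outside the hypothetical setting of the proof, where the list is assumed to contain all primes, α need not itself be prime: it is only guaranteed to have no divisor within the list.

```python
from math import prod


def euclid_number(primes):
    """Product of the given primes, plus one."""
    return prod(primes) + 1


first_six = [2, 3, 5, 7, 11, 13]
alpha = euclid_number(first_six)

# Dividing alpha by any prime in the list leaves remainder 1.
remainders = [alpha % p for p in first_six]

# alpha is not prime here, but its prime factors lie outside the list.
factors_outside_list = (59, 509)
```

Here α is 30031, which equals 59×509, and neither factor appears among the six primes we started with.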

We started off by motivating prime numbers as providing us with a unique way of decomposing a number into constituent parts. The Fundamental Theorem of Arithmetic gives us a proof of this result. Before we prove it, we will require the following result:

Lemma 9.19

Every number n≥2 can be written as a product of primes.

Proof Outline

The proof is by strong induction on n, starting from 2. The result clearly holds for the base case with n=2.

For the inductive case, let us assume that any number up to (and including) k can be written as a product of primes. We have to prove that Successor(k) can also be written in this form. Now Successor(k) is either (i) prime, or (ii) has at least one proper divisor, and can thus be written as the product of two smaller values Successor(k)=α×β. In case (i), we can write Successor(k) in the desired format (simply as Successor(k), since it is prime). In case (ii), we know, by the inductive hypothesis, that both α and β can be written as products of primes, say \(\alpha = p_{1} \times \cdots \times p_{i}\) and \(\beta = q_{1} \times \cdots \times q_{j}\). Since Successor(k)=α×β, we can combine the two products together to obtain a prime factorisation of Successor(k): \(p_{1} \times \cdots \times p_{i} \times q_{1} \times \cdots \times q_{j}\). The result thus follows by strong induction. □
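The structure of this proof corresponds closely to a recursive program. The following Python sketch, with function names of our own choosing, returns the number itself when it has no proper divisor (case (i)), and otherwise splits it into two smaller factors and combines their factorisations (case (ii)). It is meant to mirror the argument, not to be efficient.

```python
def smallest_proper_divisor(n):
    """Smallest divisor of n strictly between 1 and n, or None if n is prime."""
    for d in range(2, n):
        if n % d == 0:
            return d
    return None


def prime_factors(n):
    """Ordered list of primes whose product is n, for n >= 2."""
    if n < 2:
        raise ValueError("only numbers from 2 upwards have a prime factorisation")
    alpha = smallest_proper_divisor(n)
    if alpha is None:
        return [n]                      # case (i): n is prime
    beta = n // alpha                   # case (ii): n = alpha * beta
    return sorted(prime_factors(alpha) + prime_factors(beta))
```

For example, `prime_factors(600)` gives `[2, 2, 2, 3, 5, 5]`.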

With this result in hand we are ready to prove that numbers have a unique prime factorisation.

Theorem 9.20

The Fundamental Theorem of Arithmetic (also referred to as the Unique Prime-Decomposition Theorem): Given a number α≥2, there is exactly one way of expressing α as an ordered product of prime numbers.

Proof Outline

Lemma 9.19 tells us that any number can always be expressed as a product of primes, which can be ordered. What we have to show is thus that such an ordered product of primes is unique.

The proof is by strong induction on α. If α is a prime number, then the theorem trivially holds, and therefore it is trivially true for the base case with α=2.

For the inductive case, we will assume that the theorem holds for all values up to and including k. We have to prove that Successor(k) also permits only one ordered product of primes. Consider Successor(k) expressed as two ordered products of primes: Successor(k) = \(p_{1} \times p_{2} \times \cdots \times p_{n}\) and Successor(k) = \(q_{1} \times q_{2} \times \cdots \times q_{m}\). Since they are ordered, we assume that, for all i, \(p_{i} \leq p_{i+1}\) and \(q_{i} \leq q_{i+1}\). Without loss of generality, we will assume that \(p_{n} \geq q_{m}\). We now consider two different cases: (i) \(p_{n} > q_{m}\), and (ii) \(p_{n} = q_{m}\).

Case (i): \(p_{n} > q_{m}\). Note that \(p_{n}\) is a divisor of Successor(k) and thus also of \(q_{1} \times q_{2} \times \cdots \times q_{m}\). Furthermore, since \(p_{n}\) is a prime, we note that \(p_{n}\) must be a divisor of some \(q_{i}\). However \(p_{n} > q_{m}\), and for every i, \(q_{m} \geq q_{i}\). It thus follows that \(p_{n}\) is larger than each \(q_{i}\) and cannot be a divisor of any of them. This case is thus not possible, leaving us with case (ii).

Case (ii): If \(p_{n} = q_{m}\), the remaining products must also be equal: \(p_{1} \times \cdots \times p_{n-1} = q_{1} \times \cdots \times q_{m-1}\). Let us call this value k′. Since k′<Successor(k) (because \(p_{n}\) is a proper factor of Successor(k)), we can apply the inductive hypothesis to k′: any two ordered prime decompositions of k′ must be equivalent, meaning that n=m and \(p_{i} = q_{i}\) for each 1≤i≤n−1.

Therefore, by strong induction, any number permits exactly one ordered product decomposition of prime numbers. □

9.1.4.3 Prime Numbers for Fun and Profit

Consider the following puzzle: you are given a finite sequence of non-zero natural numbers, which you want to encode as a single number, in such a way that you can decode it back to the original sequence of numbers. For instance, consider the sequence \(\langle132,\;927,\;3,\; 3\rangle\). We cannot simply join the numbers together into a single number 13292733, since we would not know whether this originally came from, for instance, the sequence \(\langle13,\;29,\;2733\rangle\) or the original one. Padding the numbers with zeros to obtain 132927003003 does not work either, since we do not have a limit on the size of the numbers in the sequence (hence the number of digits)—and the number shown can also be decoded to \(\langle132927,\; 3003\rangle\). Think about this puzzle, and try to find a solution before reading any further.

There are various solutions to this puzzle. One of the most notable solutions is due to the mathematician Kurt Gödel, and is known as Gödel numbering. Given a sequence of length n: \(\langle v_{1},\;v_{2},\ldots v_{n}\rangle\), we take the first n prime numbers: 2, 3, 5, …, \(p_{n}\) and calculate the number \(2^{v_{1}}\times 3^{v_{2}}\times\cdots\times p_{n}^{v_{n}}\). For example, the sequence \(\langle132,\;927,\;3,\;3\rangle\) would be encoded as \(2^{132}\times 3^{927}\times 5^{3}\times 7^{3}\). To decode, we simply decompose the number into its constituent prime factors and reconstruct the sequence. For example, given a sequence encoded as 600, we factorise into \(600=2^{3}\times 3^{1}\times 5^{2}\), which means that the sequence was \(\langle3,\;1,\;2\rangle\). This encoding works thanks to two results we have proved in this section. Can you identify which ones?

The more obvious result we use is the unique decomposition theorem, which guarantees that factorising will always give us back the original sequence. The other crucial theorem is that there is an infinite number of primes, guaranteeing that we have sufficient prime numbers no matter what length the original sequence has. Do not be misled into thinking that this encoding is a simple way of remembering the whole telephone directory by transforming the sequence of numbers into a single number and memorising that single number. The resulting number would have so many digits, that you would have been better off remembering the original numbers in the first place. Still, this gives us an important mathematical tool, which we will use to reason about computability.
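The encoding and decoding steps described above can be sketched in a few lines of Python. The function names are ours, and the prime generator is a simple trial-division one chosen for brevity. Python's unbounded integers make it possible to try the scheme on the examples from the text, although the encoded numbers grow very quickly.

```python
from itertools import count


def primes():
    """Endless stream of primes: 2, 3, 5, 7, ..."""
    found = []
    for candidate in count(2):
        if all(candidate % p != 0 for p in found):
            found.append(candidate)
            yield candidate


def encode(sequence):
    """Encode non-zero naturals v1..vn as 2**v1 * 3**v2 * ... * pn**vn."""
    if any(v < 1 for v in sequence):
        raise ValueError("the sequence must contain non-zero naturals only")
    code = 1
    for p, v in zip(primes(), sequence):
        code *= p ** v
    return code


def decode(code):
    """Recover the sequence by reading off the exponent of each prime in turn."""
    if code < 1:
        raise ValueError("not a valid encoding")
    sequence = []
    for p in primes():
        if code == 1:
            return sequence
        exponent = 0
        while code % p == 0:
            code //= p
            exponent += 1
        if exponent == 0:
            raise ValueError("not a valid encoding: a prime was skipped")
        sequence.append(exponent)
```

For example, `encode([3, 1, 2])` gives 600, and `decode(600)` gives back `[3, 1, 2]`.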

Prime numbers have been studied for over 2,000 years, and yet, there are various seemingly simple questions to which we have no answer. For instance, consider the prime pairs 3 and 5, 11 and 13, 29 and 31, all of which are exactly two apart. We know that there is an infinite number of primes, but can one find an infinite number of prime pairs exactly two apart? Despite the apparent simplicity of the question, no one knows its answer.

How much computing power is required to decompose a number into its factors is another open question, even if it is widely believed that factorising huge numbers is a computationally intensive task. This property is used in cryptography, for instance by the RSA algorithm,3 in which two huge prime numbers (huge as in having hundreds of digits) are multiplied together to generate an even larger number which is given out to be used to encrypt messages. Messages encrypted in this manner can, however, only be decrypted efficiently if the two original prime numbers are known. This approach can be used, for instance, so that an online store gives me the product to encrypt my credit card number, which no one but the store can decrypt, even if a malicious party were to have listened to the number which was passed on to me by the store. Such algorithms are used to encrypt sensitive data over the Internet, thus enabling secure transactions involving private and financial information.
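To give a flavour of how the product of two primes can be used in this way, the following Python sketch shows a toy version of textbook RSA. Several details here go beyond what is described above and should be read as assumptions: the public exponent `e`, the private exponent `d` computed as the inverse of `e` modulo (p−1)×(q−1), and the use of modular exponentiation for both encryption and decryption. The primes are tiny and no padding is used, so the sketch is purely illustrative and in no way secure. The three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
# Two (tiny) secret primes; real keys use primes with hundreds of digits.
p, q = 61, 53

n = p * q                    # the product, which is made public
phi = (p - 1) * (q - 1)      # computable only if p and q are known

e = 17                       # public exponent (assumed; must share no factor with phi)
d = pow(e, -1, phi)          # private exponent: inverse of e modulo phi


def encrypt(message):
    """Anyone who knows n and e can encrypt a number smaller than n."""
    return pow(message, e, n)


def decrypt(ciphertext):
    """Decrypting needs d, which is derived from the secret primes."""
    return pow(ciphertext, d, n)
```

A message such as 65 is recovered unchanged by `decrypt(encrypt(65))`.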

9.2 Beyond the Natural

We have looked at how the counting numbers can be defined. However, frequently we require more complex numbers, for instance to handle negative values, or to talk about fractions. In this section we will briefly look at ways in which these types can be defined mathematically to enable reasoning about them. It is important to keep in mind that the best approach to reason about these sets of numbers is usually to axiomatise them directly. However, here we will look at ways of formulating them in terms of the concepts we already have.

9.2.1 Integers

The next step after the natural numbers is to look at the integers: the set of whole numbers, including negative ones, usually denoted by ℤ. One way of encoding the integers is as a structured type using the natural numbers. Two constructors are used to encode positive values including zero (Pos) and negative ones (Neg). Needless to say, there are various other ways in which integers could have been encoded, for instance adding another constructor for zero and excluding it from the positive numbers (by changing the type of the parameter of Pos to ℕ₁). Another way could have been to encode them as a pair Boolean×ℕ, with the first denoting the sign (positive or negative) and the second denoting the value. Note that with this encoding, zero would have had two representations (true,Zero) and (false,Zero), which would have meant that we would have had to do additional work to reason about, for instance, equality of integers.

Addition of integers can now be calculated in terms of addition and subtraction of natural numbers. Note that, due to the six possible cases (four possibilities for all the combinations of positive and negative values, two of which split further into two), proofs have to handle all the cases separately. For instance, proving that integer addition is commutative would involve looking at all six possibilities.
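The six cases can be made concrete with a small Python sketch. The class and function names are ours, and one detail is an assumption made for illustration: `Neg(n)` is taken to stand for −n with n≥1, so that zero has the single representation `Pos(0)`. Python integers stand in for the natural numbers, and subtraction is only ever applied when the result is a natural number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pos:
    value: int   # a natural number, zero included


@dataclass(frozen=True)
class Neg:
    value: int   # a non-zero natural number; Neg(n) stands for -n (assumed)


def add(n, m):
    """Integer addition defined case by case over the two constructors."""
    if isinstance(n, Pos) and isinstance(m, Pos):
        return Pos(n.value + m.value)
    if isinstance(n, Neg) and isinstance(m, Neg):
        return Neg(n.value + m.value)
    if isinstance(n, Pos):               # n positive, m negative
        if n.value >= m.value:
            return Pos(n.value - m.value)
        return Neg(m.value - n.value)
    # n negative, m positive
    if m.value >= n.value:
        return Pos(m.value - n.value)
    return Neg(n.value - m.value)
```

For instance, `add(Pos(3), Neg(5))` and `add(Neg(5), Pos(3))` both give `Neg(2)`, in line with commutativity.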

Proposition 9.21

Addition of integers is commutative: \(\forall n,m:\mathbb {Z}\cdot n+_{\mbox {\scriptsize $\mathbb {Z}$}}m = m +_{\mbox {\scriptsize $\mathbb {Z}$}}n\).

Proof

We consider all possible cases of the two parameters.
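  • When \(n= \mbox {\textup {\textsf {Pos}}}\;n'\) and \(m= \mbox {\textup {\textsf {Pos}}}\;m'\), both sums are Pos applied to a sum of natural numbers, so the result follows from the commutativity of natural number addition.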
  • When \(n= \mbox {\textup {\textsf {Neg}}}\;n'\) and \(m= \mbox {\textup {\textsf {Neg}}}\;m'\) the proof is very similar to the previous case when both n and m are positive.

  • When \(n= \mbox {\textup {\textsf {Pos}}}\;n'\) and \(m= \mbox {\textup {\textsf {Neg}}}\;m'\).

    This case splits further into: when n′≥m′ and when n′<m′.
  • When \(n= \mbox {\textup {\textsf {Neg}}}\;n'\) and \(m= \mbox {\textup {\textsf {Pos}}}\;m'\), the proof is very similar to the previous case.

Other operators can be similarly defined by considering all the cases. With the number of different cases one has to consider, proofs tend to become long and tedious using this way of encoding integers.

Exercises:

  9.26 Define subtraction over integers, and prove that \(\forall n:\mathbb {Z}\cdot n -_{\mbox {\scriptsize $\mathbb {Z}$}}n = \mbox {\textup {\textsf {Zero}}}\).

  9.27 Define the comparison operators over integers: \(\geq _{\mbox {\scriptsize $\mathbb {Z}$}}\), \(\leq _{\mbox {\scriptsize $\mathbb {Z}$}}\), \(>_{\mbox {\scriptsize $\mathbb {Z}$}}\) and \(<_{\mbox {\scriptsize $\mathbb {Z}$}}\).

  9.28 Define multiplication over integers.

9.2.2 The Rational Numbers

A limitation with the integers is that they enable us to reason only about whole values. For instance, fairly splitting three cakes between two persons cannot be done if we can only use whole numbers of cakes. Fractional numbers are crucial to allow us to reason about such situations. If we limit ourselves to positive fractions, the set of fractional, or rational, numbers ℚ can be encoded as a pair of natural numbers ℕ×ℕ₁, with (p,q) corresponding to our notion of the fraction \(\frac{p}{q}\). Note that q is not allowed to be zero.

One problem with this encoding is that a single fraction can be expressed in different ways. For instance the notion of a half can be encoded as (1,2) or as (2,4) or in a multitude of other ways. To be able to reason about equality of rational numbers we need a notion of the lowest form of a fraction.

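A fraction (p,q) is said to be in its lowest form if p and q have no common divisor other than 1: \(\mbox {\textit {divisors}}(p) \cap \mbox {\textit {divisors}}(q) = \{1\}\). (Excluding only common proper divisors would not be enough, since 2 is not a proper divisor of itself and (2,4) would then count as being in its lowest form.) The lowest form of a fraction (p,q) is defined to be (p′,q′) such that, for some value of α, p=α×p′ and q=α×q′, and (p′,q′) is in its lowest form. It can be proved that any fraction has a unique lowest form. Two fractions can now be defined to be equal if they have the same lowest form.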

As in the case of integers, we can define operators by giving how to compute the result. For instance, multiplication of two rational numbers \((p,q) \times _{\mbox {\scriptsize $\mathbb {Q}$}}(p',q')\) is defined to be (p×p′,q×q′).
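A small Python sketch may help to see how lowest forms, equality and multiplication fit together. The function names are ours, fractions are represented as pairs of Python integers, and the value α is computed using the greatest common divisor function `math.gcd`, which is a choice made here for convenience and not something established in the text.

```python
from math import gcd


def lowest_form(fraction):
    """Divide numerator and denominator by their largest common factor."""
    p, q = fraction
    if p < 0 or q < 1:
        raise ValueError("expected a pair (p, q) with p >= 0 and q >= 1")
    alpha = gcd(p, q)
    return (p // alpha, q // alpha)


def rational_equal(a, b):
    """Two fractions are equal if they have the same lowest form."""
    return lowest_form(a) == lowest_form(b)


def multiply(a, b):
    """(p, q) times (p', q') is (p * p', q * q')."""
    return (a[0] * b[0], a[1] * b[1])
```

For example, `multiply((1, 2), (2, 3))` gives `(2, 6)`, which is equal to `(1, 3)` according to `rational_equal`.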

Exercises:

  9.29 Define addition over the rational numbers.

  9.30 Define the comparison operators over the rational numbers.

  9.31 Prove that multiplication of rational numbers is commutative and associative.

  9.32 Modify the encoding of rational numbers to enable reasoning about negative fractions.

9.2.3 The Real Numbers

Using the rational numbers to reason about non-exact values is very convenient, since it reduces all reasoning to the use of two whole numbers. The Ancient Greeks, however, discovered that not all numbers can be expressed as fractions. Consider the diagonal of a square with sides of length 1, which by Pythagoras’ theorem is of length \(\sqrt{2}\).

What fraction corresponds to the length of the diagonal? For instance, the fraction \(\frac{5}{4}\) is too small to be equal to \(\sqrt{2}\), since \(\frac{5}{4}\times\frac{5}{4}= \frac{25}{16}\) which is smaller than 2. On the other hand, \(\frac{10}{7}\) is too large, since \(\frac{10}{7}\times\frac{10}{7}=\frac{100}{49}\), which is larger than 2. No matter how hard you try to identify a fraction which exactly corresponds to \(\sqrt{2}\), you are doomed to fail.

Theorem 9.22

The square root of 2 is not a rational number: \(\sqrt{2} \notin \mathbb {Q}\).

Proof Outline

The proof is by contradiction. We assume that \(\sqrt{2}\) can be written as a fraction in its lowest form: \(\sqrt{2}=\frac{p}{q}\). Squaring both sides of the equation, we get that \(2\times q^{2}=p^{2}\). Since \(p^{2}\) is even, we can conclude that p must also be even. Let us write p as 2×x.

Rewriting the equation we get \(2\times q^{2}=(2\times x)^{2}\), from which we can conclude that \(q^{2}=2\times x^{2}\). Therefore, \(q^{2}\) is even, which means that so is q.

Since both p and q are even, the fraction \(\frac{p}{q}\) was not in its lowest terms, which contradicts our original statement. We can thus conclude that our only assumption, that \(\sqrt{2}\) can be written as a fraction, must be false. □

This theorem confirms that there are numbers which are not rational. Various other numbers have been shown not to be rational, such as π, the ratio of the circumference of a circle to its diameter, as well as the mathematical constant e.

The set of numbers, including ones which are not rational, is called the set of real numbers, and is usually denoted by ℝ. They correspond to numbers with an infinite decimal expansion, such as 3.14159…, 0.121212… and 0.5000… . If the sequence of digits eventually starts repeating (an individual digit or a block of digits), then the number can be represented as a fraction, and is thus rational. The rest of the numbers, such as \(\sqrt{2}\) and π, are called the irrational numbers. The real numbers are made up of the rational and irrational numbers.

One way of encoding real numbers is as an infinite sequence of digits, giving the decimal expansion of the number, with an additional symbol to denote the position of the decimal point. Unfortunately, such representations yield complex encodings of real number operators such as addition, and it is usually the case that real numbers are directly axiomatised.

Exercises:

  9.33 Prove that, for any natural number n, if \(n^{2}\) is even, then so is n. Hint: Proving the contrapositive of the result is easier.

  9.34 Modify the proof to show that \(\sqrt{3}\) is also irrational. Which part of the proof would not work if we try to use the same approach to prove that \(\sqrt{4}\) is irrational?

9.3 Cardinality

We started off this chapter arguing that the concept of natural numbers stems directly from the everyday activity of counting objects. We have formalised natural numbers and shown how operators can be defined on these numbers. We have even looked into how one could go about reasoning about more complex numeric types. However, despite the fact that the natural numbers seem to correspond to the numbers we use to count in, we have not shown how we can use them to count the items in, for instance, a set of objects. In this section we will explore how a counting scheme can be defined, allowing us, for instance, to compare the size of sets.

9.3.1 Counting with Finite Sets

In real-life we usually count finite collections of objects.4 How many items are there in the set: \(\{ \mbox {\textup {\textsf {Red}}},\;\mbox {\textup {\textsf {Lighten Blue}}},\; \mbox {\textup {\textsf {Yellow}}}\}\)? Does the set \(\{7,\;3\frac {1}{2}\}\) contain more or fewer items? Let us start by comparing a set with another to see whether it contains more, less, or an equal number of items. Based on this, we can then formally define what we mean by the size of a set.

Let us start by examining a simple puzzle, or rather three variants of a puzzle. A boy has two boxes of toys—a green box and a red box. He decides to tie strings from toys in the green box to toys which lie in the red box—all strings can be assumed to be tied at both ends.
Scenario 1:

If, upon examination, we find that (i) each item in the green box has at most one string tied to it, and (ii) all toys in the red box are tied to at least one string, not knowing anything else about the toys, which box do you think contains most toys?

Scenario 2:

If, instead we discover that (i) each item in the green box has at least one string tied to it, and (ii) no toy in the red box is tied to more than one string, would you change your mind?

Scenario 3:

Finally, if each item in both boxes is tied to exactly one string, which box would you now choose?

Let us look at the scenarios separately. In the first case, we would reason that, if there were more toys in the red box, then some toys would not have been tied with a string, since the strings (at most one from each toy in the green box) would have run out. This means that the green box has at least as many toys as the red one, and that is the one to choose.

In the second scenario, we realise that each toy in the red box was tied with one or no string. Since every toy in the green box has a string tied to it, there are at least as many toys in the red box as there are in the green one, possibly more. The red box is the one to choose.

There are different ways of reasoning about the last scenario. The most straightforward is that the observation that each item in the red box is tied to exactly one string satisfies both the first and second scenarios, because (i) all the toys in the red box are tied to at least one string, and (ii) no toy in the red box is tied to more than one string. In the first scenario we concluded that the green box had at least as many toys as there were in the red one, while in the second we concluded the opposite. The only way to satisfy both constraints is to have the same number of toys in both boxes.

Fine, we seem to be able to tie toys to compare the number of items in different boxes. But how can we formalise this to be able to use the mathematics we have developed so far to reason about counting? String theory is not the solution.

One way of looking at a situation with toys from the green box connected to ones in the red box using strings is as a relation. If we see the boxes as two sets, and the strings as items related by the relation, the scenarios can be formalised.

The first scenario is when each toy in the green box has at most one string tied to it, meaning that the strings correspond to a function. Furthermore, all the items in the destination set are tied to at least one string—the relation covers all the codomain, a surjective relation. If we thus find a surjective function from a set to another, the former has at least as many different elements as the latter.

The second scenario is when each toy in the green box has at least one string tied to it, corresponding to a total relation, because each item in the source set has at least one outgoing arrow. The toys in the red box are tied to at most one string: no item in the codomain has two or more incoming arrows, corresponding to injectivity. Therefore, if we find a total injective relation from a set to another, we know that the latter has at least as many elements as the former.

The final setting is when the relation turns out to be total, functional, injective and surjective, allowing us to say that two sets are of the same size if there is a total bijective function between the two.
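The properties used in the three scenarios can be checked mechanically for small examples. In the Python sketch below, a relation is represented as a set of pairs (green toy, red toy), one pair per string. The function names and the toy names are ours, and the relation is assumed to mention only items from the two boxes.

```python
def is_functional(relation):
    """No source item has more than one outgoing string."""
    sources = [a for a, _ in relation]
    return len(sources) == len(set(sources))


def is_injective(relation):
    """No target item has more than one incoming string."""
    targets = [b for _, b in relation]
    return len(targets) == len(set(targets))


def is_total(relation, source):
    """Every source item has at least one outgoing string."""
    return set(source) <= {a for a, _ in relation}


def is_surjective(relation, target):
    """Every target item has at least one incoming string."""
    return set(target) <= {b for _, b in relation}


green = {"car", "ball", "doll"}
red = {"kite", "drum"}
strings = {("car", "kite"), ("ball", "drum")}

# Scenario 1: a surjective function, so green has at least as many toys.
scenario_1 = is_functional(strings) and is_surjective(strings, red)
# Scenario 2 fails here, since the doll has no string tied to it.
scenario_2 = is_total(strings, green) and is_injective(strings)
```

In this example the first scenario applies and the second does not, which agrees with the green box holding more toys than the red one.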

We will use these notions to define comparison operators between the sizes, or cardinality, of sets.

Definition 9.23

Given two sets A and B, we say that A has cardinality not less than B, written \(A \geq _{\mbox {\scriptsize $\#$}}B\), if there exists a surjective function from A to B. We say that A has cardinality not more than B, written \(A \leq _{\mbox {\scriptsize $\#$}}B\), if \(B \geq _{\mbox {\scriptsize $\#$}}A\). Finally, we say that they are of the same cardinality, written \(A =_{\mbox {\scriptsize $\#$}}B\), if both \(A \geq _{\mbox {\scriptsize $\#$}}B\) and \(A \leq _{\mbox {\scriptsize $\#$}}B\). ■

Although we defined \(\leq _{\mbox {\scriptsize $\#$}}\) and \(=_{\mbox {\scriptsize $\#$}}\) in terms of \(\geq _{\mbox {\scriptsize $\#$}}\), they could have been defined directly.

Proposition 9.24

The cardinality comparison relations can be defined in a different way: (i) \(A \leq _{\mbox {\scriptsize $\#$}}B\) holds if and only if there exists a total injective relation from A to B; (ii) \(A =_{\mbox {\scriptsize $\#$}}B\) if and only if there is a total bijective function between A and B.

Proof

The proof of (i) is split into two parts. If there exists a total injective relation r between A and B, we know from the results about relations that r −1 is a surjective function from B to A. Therefore, \(B \geq _{\mbox {\scriptsize $\#$}}A\) and hence \(A \leq _{\mbox {\scriptsize $\#$}}B\). On the other hand, if \(A \leq _{\mbox {\scriptsize $\#$}}B\), then it follows that \(B \geq _{\mbox {\scriptsize $\#$}}A\), implying the existence of a surjective function f from B to A. However, based on the proofs about relations, we know that f −1 is a total injective relation.

The proof of (ii) follows immediately from (i). ■

This notion of comparing the cardinality of sets satisfies a number of expected laws.

Proposition 9.25

Cardinality comparison satisfies a number of basic laws: (i) every set is of the same cardinality as itself \(A =_{\mbox {\scriptsize $\#$}}A\); (ii) both \(\leq _{\mbox {\scriptsize $\#$}}\) and \(\geq _{\mbox {\scriptsize $\#$}}\) are transitive relations; (iii) cardinality equality (\(=_{\mbox {\scriptsize $\#$}}\)) is an equivalence relation.

Proof Outline

To prove (i) we simply note that the identity function over A is a total bijective function, allowing us to conclude that \(A=_{\mbox {\scriptsize $\#$}}A\).

For property (ii), suppose that \(A \geq _{\mbox {\scriptsize $\#$}}B\) and \(B \geq _{\mbox {\scriptsize $\#$}}C\). We then know that a surjective function f exists from A to B, and another such function g exists from B to C. But \(f \mathbin {\raise 0.6ex\hbox {\oalign {\hfil $\scriptscriptstyle \mathrm {o}$\hfil \cr \hfil $\scriptscriptstyle \mathrm {9}$\hfil }}}g\) is a surjective function from A to C (since the composition of two surjective functions is itself a surjective function), and thus \(A \geq _{\mbox {\scriptsize $\#$}}C\). Similarly, \(\leq _{\mbox {\scriptsize $\#$}}\) can be proved to be transitive.

Property (iii) follows from the definition of \(=_{\mbox {\scriptsize $\#$}}\) and results (i) and (ii). □

Example

We expect that adding items to a set will never reduce the size of the set. At worst, if we add items already in the set, the size of the set will remain constant. We can confirm this observation by proving that \(A \cup B \geq _{\mbox {\scriptsize $\#$}}A\).

To show this we need to show that there exists a surjective function from A∪B to A. Consider the partial function f from A∪B to A defined as {(x,x)∣x∈A}. Note that (i) f is functional, since if (x,y)∈f and (x,z)∈f, it follows that x=y and x=z, and hence y=z; and (ii) f is surjective over A, since for any x∈A, f(x)=x. It thus follows that \(A \cup B \geq _{\mbox {\scriptsize $\#$}}A\). ⋄

We thus have a means of comparing sets together, in terms of size. This still does not provide a way of counting the elements in a set. What does it mean to say that set \(C=\{ \mbox {\textup {\textsf {Red}}},\; \mbox {\textup {\textsf {Lighten Blue}}},\; \mbox {\textup {\textsf {Yellow}}}\}\) has three elements? Since we can now compare sets, we can now equate this statement with saying that C has the same cardinality as the set \(\{1,\;2,\;3\}\). In other words, there is a way of enumerating all items in C, starting from 1 till 3. This notion can be generalised for finite sets. We will be using the notation \(\mathbb {F}X\) to denote all finite subsets of X—similar to ℙX, but excluding infinite subsets.

Definition 9.26

Given a finite set \(A:\mathbb {F}X\), we say that the cardinality of A is n, written #A=n, if \(A=_{\mbox {\scriptsize $\#$}}1...n\). Note that the set 1...n is defined to be {i:ℕ∣1≤i≤n}. ■

Before we start using this definition, it is essential that we check that it makes sense. Is the size of a finite set uniquely defined? Is there a set for which we can prove that #A=n and #A=m for different values of n and m? To prove that this is not the case, we require the following preliminary result:

Lemma 9.27

There exists a surjective function f∈1...n→1...m if and only if n≥m.

Proof Outline

For the backward direction of the proof, we note that, if n≥m, then λi∈1...m⋅i can be shown to be a surjective function from 1...n to 1...m.

In the forward direction, the proof follows by induction on n.

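For the base case, we take n=0. The only function with the empty set 1...0 as its domain is the empty function, which is surjective only if 1...m is also empty, that is, if m=0. Hence n≥m.

For the inductive case, we assume that, if there exists a surjective function f∈1...k→1...m, then k≥m, and prove that it also holds for k+1: if there exists a surjective function f∈1...k+1→1...m, then k+1≥m.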

To prove the implication, we assume that there exists a surjective function f′∈1...k+1→1...m. We are now required to prove that k+1≥m. Note that, when m=0, the inequality trivially holds, so we will consider the case with m>0.

We can visualise f′ as a collection of arrows from the set 1...k+1 to the set 1...m. We identify two particular values α and β such that f′(α)=m and f′(k+1)=β. Note that α is guaranteed to exist by the surjectivity of f′. For the special case when f′(k+1) is not defined, we note that this would mean that f′ is a surjective function from 1...k to the set 1...m, implying that k≥m by the inductive hypothesis, and thus k+1≥m. Let us thus turn our attention back to the case when f′(k+1) is defined.

Now consider the function f, obtained from f′ by removing k+1 from the domain and redirecting the outgoing arrow of α to β.5

Note that f is (i) functional (since f′ is functional), and (ii) surjective over 1...m−1 (since f′ is surjective). Therefore, by the inductive hypothesis, we can conclude that k≥m−1 and hence k+1≥m.

Using induction, we can thus conclude that the lemma holds. □

We can now prove that set cardinality is well defined.

Theorem 9.28

Given a finite set A, if #A=n and #A=m, then n=m.

Proof Outline

Since #A=n and #A=m, we know that there exist total bijective functions \(f_{n}\) between A and 1...n, and \(f_{m}\) between A and 1...m. Using these, we can define the relation \(f = f_{n}^{-1}\mathbin {\raise 0.6ex\hbox {\oalign {\hfil $\scriptscriptstyle \mathrm {o}$\hfil \cr \hfil $\scriptscriptstyle \mathrm {9}$\hfil }}}f_{m}\) between 1...n and 1...m. Since total bijective functions are closed under relational inverse and relational composition, we can conclude that f is a total bijective function between 1...n and 1...m. Similarly, \(f^{-1}\) is a total bijective function between 1...m and 1...n.

Since f is a surjective function from 1...n to 1...m, we can conclude by Lemma 9.27 that n≥m. Similarly, since \(f^{-1}\) is a surjective function from 1...m to 1...n, using the same lemma, we can conclude that m≥n. Hence n=m. □

We can also show that comparing cardinalities is the same as comparing the size of sets.

Theorem 9.29

If #A=n and #B=m then (i) \(A \geq _{\mbox {\scriptsize $\#$}}B\) if and only if n≥m, (ii) \(A \leq _{\mbox {\scriptsize $\#$}}B\) if and only if n≤m, and (iii) \(A=_{\mbox {\scriptsize $\#$}}B\) if and only if n=m.

Proof Outline

We start by proving (i). Let \(f_{A}\)∈A→1...n be a total bijective function showing that #A=n, and similarly let \(f_{B}\) be the equivalent function for B. We prove the two directions of the bi-implication separately:
Forward implication:

If \(A \geq _{\mbox {\scriptsize $\#$}}B\), then by definition of \(\geq _{\mbox {\scriptsize $\#$}}\), there exists a surjective function f∈A→B. Using the results from the chapters about relations, we know that (i) the inverse of a total bijective function is itself a total bijective function, and (ii) the composition of surjective functions is a surjective function. Therefore, \(f_{A}^{-1} \mathbin {\raise 0.6ex\hbox {\oalign {\hfil $\scriptscriptstyle \mathrm {o}$\hfil \cr \hfil $\scriptscriptstyle \mathrm {9}$\hfil }}}f \mathbin {\raise 0.6ex\hbox {\oalign {\hfil $\scriptscriptstyle \mathrm {o}$\hfil \cr \hfil $\scriptscriptstyle \mathrm {9}$\hfil }}}f_{B}\) is a surjective function from 1...n to 1...m. Hence, by Lemma 9.27, we can conclude that n≥m.

Backward implication:

If nm, we can conclude from Lemma 9.27, that there exists a surjective function f∈1...n→1...m. Using the same results about the composition of functions, we know that \(f_{A} \mathbin {\raise 0.6ex\hbox {\oalign {\hfil $\scriptscriptstyle \mathrm {o}$\hfil \cr \hfil $\scriptscriptstyle \mathrm {9}$\hfil }}}f \mathbin {\raise 0.6ex\hbox {\oalign {\hfil $\scriptscriptstyle \mathrm {o}$\hfil \cr \hfil $\scriptscriptstyle \mathrm {9}$\hfil }}}f_{B}^{-1}\) is a surjective function from A to B, and thus \(A \geq _{\mbox {\scriptsize $\#$}}B\).

Hence \(A \geq _{\mbox {\scriptsize $\#$}}B\) is equivalent to nm.
The proof of (ii) follows directly from the proof of (i): The proof of (iii) similarly follows directly. □

Proposition 9.30

For any finite set A, and x∉A, the size of {x}∪A is one more than the cardinality of A: #({x}∪A)=1+#A.

Proof Outline

Since A is a finite set, #A=n for some value of n. This means that there exists a total bijective function f from A to 1...n. Now, let us define f′∈{x}∪A→1...n+1 as follows: \(f'(x) \mathbin {\stackrel {\mathrm {df}}{=}}n+1\), and \(f'(a) \mathbin {\stackrel {\mathrm {df}}{=}}f(a)\) for every a∈A. The function f′ can be shown to be a total bijection, thus allowing us to conclude that #({x}∪A)=1+n and therefore #({x}∪A)=1+#A. □

Since we now have a relationship between finite sets and their cardinality as a natural number, we can derive a rule of induction for finite sets, which will correspond to induction on the size of the set. The base case is when the cardinality is zero, equivalent to proving the property for the empty set. For the inductive case, we assume the property is true for a set A and prove that it still holds when we add another element to the set: if P(∅) holds and, for every finite set A and element x, P(A) implies P({x}∪A), then P(A) holds for every finite set A. This rule of induction can be proved to be correct using natural number induction.

Example

Now that we can measure the size of a set, and not just compare the cardinality between sets, we can refine the previous result which showed that adding elements to a set cannot decrease its size. What is the cardinality of the union A∪B of two finite sets A and B? If we add the cardinality of A and that of B, we would end up with an over-approximation, since elements in both A and B would be counted twice. To obtain the exact value, we would then need to subtract the number of such common elements so that the overall effect is that of counting them once. We can prove that #(A∪B)=#A+#B−#(A∩B).

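The proof is by finite set induction on A.
Base case:
For the base case, when A=∅, we have to prove that #(∅∪B)=#∅+#B−#(∅∩B). Since ∅∪B=B, ∅∩B=∅ and #∅=0, both sides are equal to #B.
Inductive case:
For the inductive case, we will assume that the property holds for a finite set K: #(K∪B)=#K+#B−#(K∩B). We now have to prove it also holds for {k}∪K. We consider three distinct cases: (i) when k∈K, (ii) when k∉K and k∈B, and (iii) when k∉K and k∉B. It is not difficult to confirm that the three cases cover all possibilities.
Case (i)
k∈K. In this case {k}∪K=K, and the result follows directly from the inductive hypothesis.
Case (ii)
k∉K, k∈B. Here ({k}∪K)∪B=K∪B and ({k}∪K)∩B={k}∪(K∩B). By Proposition 9.30, #({k}∪K)=1+#K and #({k}∪(K∩B))=1+#(K∩B), so the two additional ones cancel on the right-hand side and the result follows from the inductive hypothesis.
Case (iii)
k∉K, k∉B. Here ({k}∪K)∪B={k}∪(K∪B) with k∉K∪B, and ({k}∪K)∩B=K∩B. By Proposition 9.30, both #(({k}∪K)∪B) and #({k}∪K) are one more than the corresponding values in the inductive hypothesis, so the equality is preserved.
This completes the inductive case of the proof.
The property thus holds by finite set induction on A. ⋄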
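
The identity can also be checked mechanically on small examples. The following Python sketch is only an illustration and not part of the formal development: it assumes Python's built-in sets as a stand-in for finite sets, with `len` playing the role of #, and the helper names `union_size_identity_holds` and `all_subsets` are ours.

```python
from itertools import combinations

def union_size_identity_holds(a: set, b: set) -> bool:
    """Check #(A ∪ B) = #A + #B − #(A ∩ B) for two finite sets."""
    return len(a | b) == len(a) + len(b) - len(a & b)

def all_subsets(universe):
    """Yield every subset of a small finite universe."""
    items = list(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

# Exhaustively check every pair of subsets of a four-element universe.
universe = {1, 2, 3, 4}
checked_all = all(
    union_size_identity_holds(a, b)
    for a in all_subsets(universe)
    for b in all_subsets(universe)
)
```

Such a check covers only finitely many instances, of course; it is the induction proof above which establishes the property for all finite sets.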

The notion of counting gives rise to an important principle frequently used in mathematics and computer science: the pigeon-hole principle. The informal idea behind the pigeon-hole principle is that, if one has n pigeons which are all placed in m pigeon-holes, where n>m, then there must be one pigeon-hole with at least two pigeons inside it. The principle can be used, for instance, to show that amongst 13 persons, at least two must share the month of their birthday: the persons correspond to the pigeons, the months to the pigeon-holes. Formally, we express the notion of placing a set of pigeons A in a set of pigeon-holes B (with #A>#B since we have more pigeons than pigeon-holes) as a total function f from A to B. Having at least two different pigeons \(p_{1}\) and \(p_{2}\) (with \(p_{1} \neq p_{2}\)) share a pigeon-hole thus corresponds to \(f(p_{1})=f(p_{2})\), implying that f is not injective.

Theorem 9.31

The pigeon-hole principle: Given two sets A and B such that #A>#B, no total function f∈A→B can be injective.

Proof Outline

We use a proof-by-contradiction approach. Let us assume that there exists a total injective function f∈A→B. Using the results about relations, we know that \(f^{-1}\) is (i) surjective (since f is total), and (ii) functional (since f is injective). From this, we can conclude that there exists a surjective function from B to A, implying that \(B \geq _{\mbox {\scriptsize $\#$}}A\). From Theorem 9.29 we can conclude that #B≥#A which contradicts the fact that #A>#B. Hence, no total injective function from A to B exists. □
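
For small sets, the statement of the theorem can be confirmed by brute force. The sketch below is an illustration under our own assumptions: total functions are represented as Python dictionaries, and the names `total_functions`, `is_injective`, `pigeons` and `holes` are hypothetical.

```python
from itertools import product

def total_functions(domain, codomain):
    """Yield every total function from domain to codomain as a dict."""
    domain = list(domain)
    for images in product(list(codomain), repeat=len(domain)):
        yield dict(zip(domain, images))

def is_injective(f: dict) -> bool:
    """A function is injective if no two arguments share an image."""
    return len(set(f.values())) == len(f)

pigeons = ["p1", "p2", "p3", "p4"]
holes = ["h1", "h2", "h3"]

# With more pigeons than holes, none of the total functions is injective.
any_injective = any(is_injective(f) for f in total_functions(pigeons, holes))
```

Swapping the roles of the two sets (three pigeons, four holes) does yield injective functions, which shows that the condition #A>#B is what matters.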

Example

You are given 10 different whole numbers between 1 and 100. What are the chances of being able to pick two disjoint sets of numbers (of the given 10) such that both sets have the same sum?

For instance, given the numbers \(\{19,\;23,\;24,\;39,\;42,\;50,\;71,\; 75,\;81,\;97\}\) we can choose the sets \(\{39,\;97\}\) and \(\{23,\;42,\; 71\}\) which are disjoint and both add up to 136.

It turns out that two such sets exist no matter which 10 numbers you start off with. Here is an informal proof.

Given a set N of 10 numbers ranging over 1 to 100, there are \(2^{10}\) or 1024 different subsets we can choose. Each one of these has a sum which at the very least is 0 (for the empty set), and at the very most 91+92+⋯+100=955 (when we choose the numbers to be all the ones from 91 to 100). Consider the total function s∈ℙN→0...955, which calculates the sum of the set of numbers chosen from N. By the pigeon-hole principle, since #ℙN>#(0...955), it follows that s is not injective, and thus there are two subsets of N, let us call them \(N_{1}\) and \(N_{2}\), with the same sum (\(s(N_{1})=s(N_{2})\)). By removing any common elements from both \(N_{1}\) and \(N_{2}\) we can thus obtain two disjoint sets with the same sum.

This proof is interesting not only because it shows that we can choose sets with the same sum, but also because it is not a constructive proof—it does not show how we can compute the subsets. We know that two such subsets exist, but we still have no clue how to find them. ⋄
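
Although the pigeon-hole argument itself gives no method for finding the two subsets, with only 1024 subsets to inspect a direct search is feasible. The following Python sketch is one possible search, not a method taken from the proof; the function name `equal_sum_disjoint_subsets` is hypothetical, and the input is assumed to consist of distinct positive whole numbers.

```python
from itertools import combinations

def equal_sum_disjoint_subsets(numbers):
    """Return two disjoint, non-empty subsets with equal sums, or None.

    Assumes `numbers` contains distinct positive whole numbers.
    """
    seen = {}  # sum -> first subset found with that sum
    items = sorted(numbers)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            subset = frozenset(combo)
            total = sum(subset)
            if total in seen:
                # Two different subsets share a sum: drop the common elements.
                other = seen[total]
                common = subset & other
                return set(other - common), set(subset - common)
            seen[total] = subset
    return None

given = {19, 23, 24, 39, 42, 50, 71, 75, 81, 97}
first, second = equal_sum_disjoint_subsets(given)
```

The pair returned need not be the one quoted in the example above, since many such pairs may exist; the search simply stops at the first collision of sums.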

Here is another informal example showing an application of the pigeon-hole principle to graphs.

Example

Consider a graph \(G = (V,\;L,\;E)\) with a finite set of vertices V. Any path in G of length #V+1 or longer must contain a loop. This can be shown using the pigeon-hole principle. Consider a path of length #V+1 or longer, and the function f mapping each position in such a path to the vertex it goes through. Since the size of the set of vertices is #V, and hence less than the number of positions in the path, f is not injective, and thus, two different positions must map to the same vertex. This means that the path loops through that repeated node.

This result has various important implications. For instance, we can show that, if there is no path in G starting from vertex v and finishing at vertex v′ with length less than #V, then v′ cannot be reached from v in any number of steps. How? If the shortest path from v to v′ is longer than #V, then it must contain a loop, meaning that it can be shortened (by removing the loop). Therefore, the shortest path can never be longer than #V. This means that, to check whether a vertex is reachable from another, it suffices to analyse paths of length up to #V in the graph. ⋄
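
The bound on path length translates directly into a terminating reachability check. The sketch below is an illustration under simplifying assumptions of ours: edge labels are ignored, the graph is given as a dictionary `successors` mapping every vertex to the set of vertices it has an edge to, and the function name `reachable` is hypothetical.

```python
def reachable(successors: dict, source, target) -> bool:
    """Decide reachability by exploring paths of at most #V steps."""
    frontier = {source}   # vertices first reached in the latest step
    visited = {source}    # vertices reached so far (zero steps reaches source)
    for _ in range(len(successors)):  # #V steps suffice
        if target in visited:
            return True
        # Take one more step from the frontier, keeping only new vertices.
        frontier = {w for v in frontier for w in successors.get(v, ())} - visited
        visited |= frontier
    return target in visited

graph = {"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"a"}}
```

In this small graph, `reachable(graph, "a", "c")` holds while `reachable(graph, "a", "d")` does not, and the loop is guaranteed to stop after #V rounds in either case.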

We have thus managed to define the notion of the number of elements in a set and use it to draw conclusions about sets based on their size. What is particularly interesting about this approach is that the size of a finite set is defined in terms of set cardinality comparison, which in turn is defined in terms of the existence of particular forms of relations between two sets. As we shall see in the coming sections, this approach allows us not only to count items in a finite set, but also to compare infinite sets.

Exercises:

  9.35 Show that the empty set is not larger than any other set: \(A \geq _{\mbox {\scriptsize $\#$}}\emptyset\).

  9.36 The cardinality of the empty set is zero. Prove it.

  9.37 Prove that, if A⊆B, then \(A\leq _{\mbox {\scriptsize $\#$}}B\).

  9.38 Give a proof sketch to show that, if the cardinality of a set A is the same as that of B, then \(A=_{\mbox {\scriptsize $\#$}}B\).

  9.39 Sketch a proof to show that, if #A≥#B, then \(A \geq _{\mbox {\scriptsize $\#$}}B\).

  9.40 In the proof of Lemma 9.27, we left out the analysis for when f′(k+1)=m. Show that a function f can also be constructed in such a case, thus completing the proof.

  9.41 Given a drawer with 10 red and 10 blue socks, how many socks must I blindly take out of the drawer to guarantee that I have a matching pair? Show how the pigeon-hole principle can be used to justify your answer.

  9.42 The influence factor of a person p in a group on a social networking site is defined to be the number of members of the group to which p is connected. So John would have an influence factor of 7 in a group where he is connected to 7 of the members. Assuming that the connected-to relation is symmetric (p is connected to p′ if and only if p′ is connected to p) and not reflexive (no person is connected to him- or herself), use the pigeon-hole principle to show that, in any group with two or more members, there must be at least two persons with the same influence factor. Hint: In a group with n members, there cannot be both a person with an influence factor of 0 and another with an influence factor of n−1.

9.3.2 Extending Cardinality to Infinite Sets

Although the cardinality of a set only works with finite-sized sets, the underlying tool of set-size comparisons using classes of relations can also be applied to infinite sets. We will be informally visiting some of the results, originally proposed by Georg Cantor and David Hilbert in the late 19th and early 20th century. Although it may seem simpler to limit our concept of sets to finite collections, we cannot do without infinite ones. For instance, to be able to count the number of items in a finite set, we need the concept of the natural numbers—which is an infinite collection. Similarly, without the notion of infinite sets, we cannot formulate prime numbers, which are crucial in number theory.

Although the set of natural numbers is an infinite set, it is important to realise that all the numbers in the set are finite ones. One can use induction to prove this: zero is a finite number, and for any number k, if k is finite, then so is its successor. The cardinality of a set can thus only be applied to finite sets, and the cardinality of the natural numbers #ℕ or of the prime numbers #Primes are ill-defined concepts, since there is no natural number n such that the set 1...n can be put in a one-to-one correspondence with the set in question. The impossibility of giving the size of a set as a natural number is what makes it infinite.

Definition 9.32

A set A is said to be infinite if there is no natural number n such that #A=n. ■

For instance, when we proved that the prime numbers are infinite, we assumed that such a value n exists, and then showed that there is at least one additional prime number which we did not include. Hence, no such value can exist.

Despite the fact that we cannot talk about the size of an infinite set, we can still use the previous notions of set cardinality comparison (\(\geq _{\mbox {\scriptsize $\#$}}\), \(\leq _{\mbox {\scriptsize $\#$}}\) and \(=_{\mbox {\scriptsize $\#$}}\)) for infinite sets. On one hand, common sense may suggest that all infinite sets, being infinite, are of the same size. Similarly, common sense tells us that, since the set of prime numbers is a proper subset of the natural numbers, the set of prime numbers must be smaller than the set of natural numbers. As we will see, both these common-sense arguments turn out to be wrong.

The arguments will be illustrated using Hilbert’s Hotel, which contains an infinite number of rooms. This imaginary hotel is named after David Hilbert, a German mathematician who was responsible for many of the concepts presented here.

Example

Tourism has been steadily increasing for a number of years, and Hilbert’s Hotel was built precisely to ensure that, no matter how many tourists visit the city, they will always find a place to stay. The hotel’s brochure boasts an infinite number of rooms on a single floor, numbered sequentially starting from room 0.

On the busiest day of the year, Hilbert’s Hotel was completely full. A few minutes after the receptionist handed over the keys to the last free room, a regular guest arrived at the hotel asking for a room. The receptionist explained that the hotel was already full and that he would have to find a room elsewhere. The new arrival made such a fuss that the manager was called over. A former mathematician, the hotel manager proposed a solution to give the visitor an empty room without sending out any of the current guests or having any of them share a room (and without building another room in the hotel). How did he manage to do so?

The solution: The manager made an announcement on the hotel PA system: “Can all guests kindly move their belongings to the next room down the corridor, thus moving to a room whose number is one higher than the one they currently occupy?” The person in room 0 would move to room 1, whose current occupant would be moving to room 2, etc. Although the guests were not so happy to have to move rooms, they all found an empty room (since the person in that room was also moving), and at the end of the process, room 0 was not occupied. The visitor was handed the key to this empty room.

Wasn’t the hotel already full? How could he add another guest into a full hotel?  ⋄

Although seemingly counter-intuitive at first, the moral of the story is that one can always add an object to an infinite set without changing its cardinality. This approach, of moving all persons from room n to room n+1, can be used to show that the natural numbers and the natural numbers less zero are of the same cardinality.

Proposition 9.33

\(\mathbb {N}\) and \(\mathbb {N}_{1}\) have the same cardinality: \(\mathbb {N}=_{\mbox {\scriptsize $\#$}}\mathbb {N}_{1}\).

Proof Outline

To prove that the two sets are of the same cardinality, we need to find a total bijective function from the first to the second. The function we desire is the one which simply increments the given value by one: \(f(n) \mathbin {\stackrel {\mathrm {df}}{=}}n+1\). It is not difficult to prove that \(f \in \mathbb {N} \rightarrow \mathbb {N}_{1}\) is: (i) total on ℕ, (ii) a function, (iii) surjective on \(\mathbb {N}_{1}\), and (iv) injective. □

In fact, using this process, the manager can host any finite number of new guests arriving at the hotel. If k new guests arrive, the current guests are asked to move k rooms down the corridor, from room n to room n+k. This leaves k rooms empty for the new guests.
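
The manager's announcement for k new guests can be written down as a pair of functions. This is a small Python sketch of ours (the names `shift`, `move` and `move_back` are hypothetical); it can only test a finite prefix of the rooms, so it illustrates the bijection rather than proving it.

```python
def shift(k: int):
    """Return the room-reassignment function n -> n + k and its inverse."""
    def move(n: int) -> int:
        return n + k

    def move_back(room: int) -> int:
        if room < k:
            raise ValueError("rooms 0..k-1 are left empty for the new guests")
        return room - k

    return move, move_back

move, move_back = shift(3)
new_rooms = [move(n) for n in range(10)]  # new rooms of the first ten guests
```

With k=1, `move` is the function f of Proposition 9.33, and `move_back` is its inverse on the rooms numbered 1 or higher.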

Example

The following morning, with Hilbert’s Hotel still full, saw an infinite bus full of tourists stopping at its front door. Since the hotel was the only infinite one in town, the guide explained that Hilbert’s Hotel was the only place where they could be hosted. The receptionist realised that there was no way he could ask the current guests to move sufficiently down the hallway to make space for an infinite number of new guests. The manager was called in once again, and came up with an even smarter solution than the previous time to ensure that everyone was hosted in the hotel. What was his solution?

The solution: This time round, the manager announces that each current guest is to move to the room whose number is double the one he or she is currently in: the person in room n moves to room 2×n, thus leaving the odd rooms empty. The tourists on the bus were then told to move into the hotel, such that the person sitting at seat n takes room 2×n+1 (assuming that the seats on the bus are also numbered starting from 0). Not only did everyone get a room, but the regular guest who had just arrived the previous night, and was in room 0, did not have to change rooms.

Weirder still than what went on the previous night, a full hotel managed to find space for an infinite number of new guests. Can it get any weirder than this? (Yes, it can.) ⋄

Even more counter-intuitive than the previous example, this example shows that adding two infinite sets together does not change the cardinality. For instance, if we take the even numbers (an infinite set) and add the odd numbers (another infinite set) we end up with the natural numbers. Does the set of natural numbers contain more elements than the even numbers? Not only is the set of even numbers a proper subset of the natural numbers, but it has infinitely fewer elements (it lacks all the odd numbers). We can show that, in fact, their cardinalities are the same.

Proposition 9.34

The natural numbers ℕ and the set of even numbers Even are of the same cardinality: \(\mathbb {N}=_{\mbox {\scriptsize $\#$}}\mbox {\textit {Even}}\).

Proof Outline

To prove that the two sets have the same cardinality, we will give a total bijective function f∈ℕ→Even. The function which doubles a number \(f(n) \mathbin {\stackrel {\mathrm {df}}{=}}2 \times n\) does the trick. It can be proved that (i) f is total on ℕ; (ii) it is a function; (iii) it is surjective on Even; and (iv) it is an injective function. □
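
The doubling function, together with the assignment of odd rooms to the tourists on the bus, can be sketched in Python as follows. The function names `guest_room`, `bus_room` and `occupant` are ours, and the check over a finite range of rooms is only an illustration of the argument.

```python
def guest_room(n: int) -> int:
    """New room for the guest currently in room n (the function f above)."""
    return 2 * n

def bus_room(seat: int) -> int:
    """Room for the tourist sitting at bus seat `seat`."""
    return 2 * seat + 1

def occupant(room: int):
    """Inverse view: who ends up in a given room after the move."""
    if room % 2 == 0:
        return ("guest", room // 2)
    return ("bus", room // 2)

# Every room in a finite prefix of the corridor has exactly one occupant.
assignments = [occupant(room) for room in range(10)]
```

Since `occupant` is defined for every room number, no room is left empty and no two persons are sent to the same room.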

This approach can be used to show that the cardinality of ℕ is also the same as that of the square numbers (by mapping n to \(n^{2}\)) and the prime numbers (by mapping n to \(p_{n}\), the nth prime number), despite the fact that in both these sets we progressively drop more elements from the natural numbers as we progress. It can also be used to show that there are as many natural numbers as integers.

Proposition 9.35

The natural numbers ℕ and the integers ℤ are of the same cardinality: \(\mathbb {N}=_{\mbox {\scriptsize $\#$}}\mathbb {Z}\).

Proof Outline

The total bijective function which does the trick is slightly more complicated than before, but corresponds very closely to what the manager did. Consider the function f∈ℤ→ℕ defined as follows: \(f(n) \mathbin {\stackrel {\mathrm {df}}{=}}2 \times n\) if n≥0, and \(f(n) \mathbin {\stackrel {\mathrm {df}}{=}}2 \times (-n) - 1\) if n<0. Note that the non-negative numbers are mapped from n to 2×n (just like the persons already in the hotel), while the negative numbers are mapped from −n to 2×n−1 (just like the persons on the bus, except that in this case no one is sitting on seat 0).

It can be proved that (i) f is total on ℤ; (ii) it is a function; (iii) it is surjective on ℕ; and (iv) it is an injective function. Hence the natural numbers and the integers are of the same cardinality. □
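
The function between the integers and the natural numbers, and its inverse, can be sketched in Python as follows. The names `to_natural` and `to_integer` are ours, and the round-trip checks over a finite range merely illustrate the claimed properties.

```python
def to_natural(z: int) -> int:
    """Map n to 2*n for n >= 0, and -n to 2*n - 1 for n > 0."""
    return 2 * z if z >= 0 else 2 * (-z) - 1

def to_integer(n: int) -> int:
    """Inverse map: even numbers come from non-negatives, odd from negatives."""
    return n // 2 if n % 2 == 0 else -((n + 1) // 2)

# The integers 0, -1, 1, -2, 2 are sent to consecutive natural numbers.
first_values = [to_natural(z) for z in (0, -1, 1, -2, 2)]
```

Having an inverse defined on every natural number is what makes the function both surjective and injective.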

We will look at one final example of an unexpected result about infinite sets.

Example

Hilbert’s Hotel became so popular that the owners decided to build further stories above the (infinite) ground-level corridor. Having made so much money from that last busload of tourists, they decided to build not one, not two, but an infinite number of stories. Rooms were now numbered as a pair e.g., L17-R41, indicating room 41 on level 17. The first room on the ground floor was L0-R0. The demand for rooms in the hotel did not diminish, and the hotel was completely full when a safety inspector turned up. The lack of sufficient fire escapes from the new floors led to an immediate request to evacuate all the rooms on the new floors due to safety concerns. The manager, as usual, found a solution, and all the guests ended up hosted in rooms on the ground floor. How did he manage to do so?

The solution: Consider the layout of the hotel as shown in the figure below:
[Figure: layout of the hotel]

The manager assigned each room to a number by going around in a spiral-like movement as shown below:
[Figure: spiral-like numbering of the rooms]

This can be written as a function from the level and room number to a single number so that every guest knows which room to move to. Note that, in this way, everyone still has a room, although they will all be housed on the ground floor. ⋄
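
One way of writing such a function explicitly is sketched below in Python. Note that this is not the manager's spiral: we have swapped in the Cantor pairing function, which numbers the rooms diagonal by diagonal instead, because it has a simple closed form. The names `pair` and `unpair` are ours.

```python
from math import isqrt

def pair(level: int, room: int) -> int:
    """Cantor pairing: number the rooms diagonal by diagonal."""
    diagonal = level + room
    return diagonal * (diagonal + 1) // 2 + room

def unpair(number: int) -> tuple[int, int]:
    """Inverse of `pair`: recover (level, room) from a ground-floor room number."""
    diagonal = (isqrt(8 * number + 1) - 1) // 2
    room = number - diagonal * (diagonal + 1) // 2
    return diagonal - room, room

# The guest in L17-R41 is told which ground-floor room to move to.
new_room = pair(17, 41)
```

Any other numbering which visits every room exactly once, such as the spiral, would serve the manager equally well.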

Using the manager’s intuition, we can show that there are as many rational numbers as natural numbers.

Theorem 9.36

The cardinality of the rational numbers is the same as that of the naturals: \(\mathbb {Q}=_{\mbox {\scriptsize $\#$}}\mathbb {N}\).

Proof Outline

Every rational number \(\frac{p}{q}\) can be seen as a room in Hilbert’s Hotel Lp-Rq. Using the manager’s redistribution function, we have a way of mapping the rationals to the natural numbers such that the mapping is a total bijective function. Hence \(\mathbb {Q}=_{\mbox {\scriptsize $\#$}}\mathbb {N}\). □

This result runs very much counter to what one would expect. Intuitively, there seem to be so many more rational numbers than natural ones. Mathematically, however, it turns out that this is not the case. With the definition of equality of cardinality of infinite sets as equivalent to the existence of a one-to-one relationship between the sets, the two sets turn out to be of equivalent cardinality.

The cardinalities of the natural numbers, the even numbers, the integers and the rationals are thus all the same. Sets which have cardinality equal to that of the natural numbers are called denumerable sets or countably infinite sets. Given that all these sets of numbers have been shown to be denumerable, the question which naturally arises is whether there are sets whose cardinality is larger than that of the natural numbers.

9.3.2.1 Beyond Denumerability

Are there infinite sets which are not denumerable? We will show that the real numbers, which, as we have already seen, contain more items than the rationals, are not denumerable. There are more real numbers than natural ones.

Theorem 9.37

The real numbers are not denumerable.

Proof Outline

To prove this, we show that the real numbers between 0 and 1 are not denumerable, and then apply Exercise 9.37 to conclude the result. The proof is by contradiction—we assume that the real numbers between 0 and 1 are denumerable, and then proceed to show that this leads to a contradiction.

Let us assume that the real numbers in the interval (0,1) are denumerable. This means that there is a total bijective function f between the natural numbers and this set. In other words, we can enumerate all the reals in (0,1), listing them by applying f to the natural numbers in order:

first number: \(0.a_{1} a_{2} a_{3} \ldots\)
second number: \(0.b_{1} b_{2} b_{3} \ldots\)
third number: \(0.c_{1} c_{2} c_{3} \ldots\)
and so on.

Now, let us define an operator \(\bar{d}\), which takes a digit d and changes it into 1 or 2, such that d is different from \(\bar{d}\): for instance, \(\bar{d}\) is 2 when d=1, and 1 otherwise.

Now consider the real number \(n = 0.\bar{a}_{1} \bar{b}_{2} \bar{c}_{3} \ldots\). Is this number listed in the table above? The number does not match the first line since \(a_{1} \neq\bar{a}_{1}\), neither does it match the second number since \(b_{2} \neq\bar{b}_{2}\), etc. In general, if the ith number is \(0.x_{1} x_{2} x_{3} \ldots\), it does not match n since \(x_{i}\neq\bar{x}_{i}\). Since n is not listed in the table, it means that f is not surjective, and thus not a total bijective function. This contradiction means that our original assumption that the numbers between 0 and 1 are denumerable is false. It thus follows that the real numbers are not denumerable. □
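
The construction of the number n can be illustrated on a finite table of digits. The Python sketch below is ours: the digits in `table` are hypothetical sample rows (the digits following "0." in each listed number), and `bar` implements one possible choice of the digit-changing operator.

```python
def bar(digit: str) -> str:
    """Change a digit into '1' or '2' so that it differs from the original."""
    return "2" if digit == "1" else "1"

def diagonal_number(rows: list[str]) -> str:
    """Build digits which differ from the i-th row in its i-th digit."""
    return "".join(bar(row[i]) for i, row in enumerate(rows))

table = ["1415", "7182", "4142", "5772"]  # hypothetical sample rows
n_digits = diagonal_number(table)

# The constructed digit string disagrees with every row on the diagonal.
differs_from_all = all(n_digits[i] != row[i] for i, row in enumerate(table))
```

However many rows are added to the table, the same construction yields a digit sequence which disagrees with each of them, which is exactly why no enumeration can be complete.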

The way we have constructed a counterexample in this proof uses a technique called diagonalisation. We take the table of values and change the ones lying on the diagonal so as to construct a row which is not in the table. The approach is used in other contradiction proofs and, as we shall see in the next chapter, allows us to prove that there is a limit on the power of computers.

We have thus shown that there are sets which are not denumerable. Apart from the real numbers, one can show, for instance, that the power set of the natural numbers is also not denumerable. Are there sets with cardinality even larger than that of the real numbers? One can show that the power set of real numbers has cardinality even higher than that of the reals. Furthermore, taking the power set of an infinite set always results in a new set with an even higher cardinality, giving us a glimpse of an infinite chain of infinite set cardinalities. Cantor’s great achievement was to show that the notion of infinite can be categorised into different cardinalities allowing us to reason more effectively about infinite sets.
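
The claim about power sets is usually proved using a diagonal set: given any function f from a set S to its power set, the set of all elements x which do not belong to f(x) cannot be in the range of f. The text does not spell this argument out, so the following Python sketch should be read as our own finite illustration of it; the names `powerset` and `missed_subset` are hypothetical.

```python
from itertools import combinations, product

def powerset(s):
    """List all subsets of a small finite set as frozensets."""
    items = sorted(s)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]

def missed_subset(f: dict) -> frozenset:
    """Diagonal set {x | x not in f(x)}, which f never maps to."""
    return frozenset(x for x in f if x not in f[x])

s = {0, 1, 2}
subsets = powerset(s)

# Try every function from s to its power set: each one misses its diagonal set.
never_surjective = all(
    missed_subset(dict(zip(sorted(s), images))) not in images
    for images in product(subsets, repeat=len(s))
)
```

For a finite set this only confirms what counting already tells us, but the same diagonal set exists for infinite sets too, where counting is no longer available.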

9.4 Summary

In this chapter we have looked at how we can formalise and reason about numbers. The natural numbers turned out to be a simple instance of structured types, and the range and complexity of the operators and concepts defined are evidence of how powerful structured types can be. Although we have briefly looked at integers, the rationals and the real numbers, to reason formally about these classes of numbers, one would ideally formalise them directly, rather than encode them using the natural numbers.

What is usually taken to be a basic notion in mathematics—numbers and counting—turns out to involve interesting twists, especially when it comes to reasoning about infinite sets. What may appear to be merely an intellectual exercise in comparing the sizes of infinite sets will, however, turn out to have important implications in computer science. We will be seeing this in the next chapter.

Footnotes

  1. Theoretical, because we do not limit the answer depending on the size of your kitchen or the number of angels or sheep that exist in the universe.

  2. As the complexity of the proofs in the rest of this chapter increases, we will not be giving full rigorous proofs, since these would be extremely long and tedious to follow. Instead, we will be giving proof sketches, which indicate the outline of the structure of a rigorous proof if we were to write one.

  3. RSA stands for Rivest, Shamir and Adleman, the names of the scientists who published this algorithm.

  4. Even in myths and legends in which someone is assigned a seemingly never-ending task, it usually consists of counting a finite, even if very big, collection (counting the grains of sand on a beach, or the number of drops of water in an ocean), as opposed to counting infinite collections of objects, such as counting all the numbers one can think of.

  5. The special case when f′(k+1)=m has to be treated differently, and is left as an exercise.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Gordon J. Pace, Department of Computer Science, Faculty of Information and Communication Technology, University of Malta, Msida, Malta
