To your average pianist, G-sharp and A-flat are the same: two names for the single black key between the white G and A keys. For other musicians, violinists for example, they are different notes. They would claim that the piano represents a compromise, since it is often tuned to the equal-tempered scale, while other musical scales, based on simple harmonic ratios, often appear more harmonious to the ear. In these scales, G-sharp and A-flat are no longer the same. The difference between them is called the Pythagorean comma, and the name hints at the long history of music theory. This article explains and illustrates this dilemma geometrically.

Music is first and foremost an auditory experience, but some of its principles can also be communicated with mathematics. Although a note can assume any value on a continuous frequency scale, humans have preferred only a few selected tone combinations. The oldest theory for this preference is that of notes related by integer ratios, and it is striking how simple ratios give rise to pleasant perceptual effects. This theory is often credited to Pythagoras (fl. sixth century BCE), who constructed a scale using an octave (ratio 2/1), a perfect fourth (4/3), and a perfect fifth (3/2) [2] and [1, Chap. 5], but most likely it dates back to the Sumerians [9]. Hermann von Helmholtz (1821–1894) suggested a physical theory in 1877 that asserted that dissonance is caused by a lack of harmonic overlap that gives rise to auditory roughness in the form of unpleasant amplitude fluctuations [5]. Such an interference pattern is called a beat.

Like the Pythagorean scale, the just, or pure, scale also has ancient roots. It is based on even simpler integer ratios consistent with the natural occurrence of overtones for vibrating strings and in wind instruments. The presence of simple ratios means that it should be possible to construct the ratios of the notes of both of these scales with straightedge and compass. Such constructions have only been partly presented before, but here we will introduce a new and simpler geometric method, whereby special emphasis is placed on visualizing some oddities of these scales where they differ from the modern equal-tempered scale.

The equal-tempered scale has become popular over the last few centuries. Its twelve notes are evenly distributed on a logarithmic frequency scale, so that the note n semitones above the fundamental has the ratio

$$ r = 2^{n/12}, \quad n=0, \ldots ,12, \tag{1} $$

to the fundamental frequency. The ratio r can be viewed as a normalized frequency. In this paper we deal only with two adjacent octaves, with \(r=1\) corresponding to a low C, and \(r=2\) to a C one octave above, which we shall denote by \({{\text C}}_{2}\).

Since our perception of musical difference is based on ratios, Alexander J. Ellis, the translator of Helmholtz’s book [5], introduced the cent as a logarithmic measure for the distance between notes [4]. Two notes separated by the ratio r are said to be separated by

$$ 1200 \log _{2}(r) \;\text {cents} . \tag{2} $$

The seven notes of the C major scale, which correspond to the white keys on a piano keyboard, are given as ratios and cents in the first columns of Table 1 in three different scales. Throughout this article, ratio and difference will both be used to compare two notes. It will be understood that difference implicitly involves a logarithmic measure such as cents, as defined in (2).
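As a minimal sketch of the two formulas above, the following Python snippet computes the equal-tempered ratio and its size in cents for each note of C major. The function names and the dictionary of semitone offsets are illustrative choices, not part of the original article.

```python
import math

def cents(ratio: float) -> float:
    """Size in cents of an interval with the given frequency ratio."""
    return 1200 * math.log2(ratio)

def equal_tempered_ratio(n: int) -> float:
    """Ratio to the fundamental of the note n semitones above it."""
    return 2 ** (n / 12)

# Semitone offsets of the C major scale; C2 is the C one octave above.
C_MAJOR_SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11, "C2": 12}

for name, n in C_MAJOR_SEMITONES.items():
    r = equal_tempered_ratio(n)
    print(f"{name:>2}  n={n:2d}  ratio={r:.4f}  cents={cents(r):7.1f}")
```

By construction every equal-tempered note lands on a multiple of 100 cents, whereas a pure fifth of 3/2 comes out at roughly 702 cents.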

Table 1 Ratios between notes in the equal temperament, Pythagorean, and just scales for the notes of C major. Here n is the number of semitones (half-steps) above the fundamental C

The sharp designation \(\sharp \), e.g., \({{{\text {C}}}^{\sharp }}\), means one semitone up from C, and the flat \(\flat \), e.g., \({{{\text {D}}}^{\flat }}\), denotes one semitone down from D. In the equal-tempered scale, one semitone is by definition 100 cents. A property of this scale is that notes with different names can be enharmonic equivalents; that is, two notes with different names, such as \({{{\text {C}}}^{\sharp }}\) and \({{{\text {D}}}^{\flat }}\), will sound the same.

This is the case for the five black keys of the piano: \({{{\text {C}}}^{\sharp }}\) and \({{{\text {D}}}^{\flat }}\) (100 cents), \({{{\text {D}}}^{\sharp }}\) and \({{{\text {E}}}^{\flat }}\) (300 cents), \({{{\text {F}}}^{\sharp }}\) and \({{{\text {G}}}^{\flat }}\) (600 cents), \({{{\text {G}}}^{\sharp }}\) and \({{{\text {A}}}^{\flat }}\) (800 cents), \({{{\text {A}}}^{\sharp }}\) and \({{{\text {B}}}^{\flat }}\) (1000 cents). Further, there is an equivalence between the four pairs C and \({{\hbox {B}}^{\sharp} }\) (1200 cents), E and \({{\hbox {F}}^{\flat }}\) (400 cents), F and \({{\hbox {E}}^{\sharp} }\) (500 cents), B and \({{\hbox {C}}^{\flat }}\) (1100 cents).

Thus all the intervals are exactly the same size, and this makes it simpler to transpose music to a different key. The price to pay for this simplicity is that none of the intervals, except for the octave, have simple integer ratios, as is evident from the ratio column of Table 1 for equal temperament.

The Pythagorean scale, based on integer ratios, is used in some classical vocal music and for fretless string instruments, such as the violin. It is based on letting the 3:2 interval (perfect fifth) or the 4:3 interval (perfect fourth) be central [1]. Since these two ratios are inverses, except for a factor of 2, which represents an octave, the two approaches are equivalent. We will use a mix of musical and mathematical terminology in this paper. As an example, raising or lowering a pitch by a fourth is musical terminology for multiplication by 4/3 or 3/4 respectively in mathematics.

One way to construct the Pythagorean scale is by starting with \({\text {C}}=1\) and multiplying by 4/3 (a perfect fourth) to obtain \({\text {F}}=4/3\). Next, C is increased by an octave, i.e., multiplied by 2, to yield \({{\text C}}_{2}=2\). Continue by going down a fourth, i.e., multiplying by 3/4, thereby obtaining \(2\cdot 3/4 = 3/2 ={\text {G}}\). The next note is found by going down by another fourth to \(3/2 \cdot 3/4 = 9/8 = {\text {D}}\), then down again to get \(9/8 \cdot 3/4 = 27/32\). Since this ratio is less than 1, it is increased by an octave to its equivalent note \(2 \cdot 27/32= 27/16\), which is A. Then E is found as \(3/4 \cdot 27/16=81/64\). Going down one more fourth from E gives \(3/4 \cdot 81/64 = 243/256\), which is again less than 1, so a doubling is required, and we get \({\text {B}} = 243/128\). We have thereby obtained all the notes of the scale, and the result is the succession of notes shown in the center column of Table 1.
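The procedure just described can be sketched in a few lines of Python using exact fractions. The helper names `fold_into_octave` and `pythagorean_diatonic` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def fold_into_octave(r: Fraction) -> Fraction:
    """Double or halve r until it lies in the interval [1, 2)."""
    while r < 1:
        r *= 2
    while r >= 2:
        r /= 2
    return r

def pythagorean_diatonic() -> dict:
    """Build the C major notes by stepping in perfect fourths."""
    fourth = Fraction(4, 3)
    notes = {"C": Fraction(1), "F": fourth}  # F lies one fourth above C
    r = Fraction(2)                          # start from C2, one octave above C
    for name in ["G", "D", "A", "E", "B"]:
        # Go down a fourth, then double if the result fell below C.
        r = fold_into_octave(r / fourth)
        notes[name] = r
    return notes

notes = pythagorean_diatonic()
for name in "CDEFGAB":
    print(name, notes[name])
```

Working with `Fraction` rather than floating-point numbers keeps every ratio exact, so the results can be compared directly with the fractions given in the text.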

Unlike the equal-tempered scale, the Pythagorean scale exhibits ambiguities in the sharp and flat notes mentioned above. They will not be the same. The difference between, for example, \({{{\text {C}}}^{\sharp }}\) and \({{{\text {D}}}^{\flat }}\) is called the Pythagorean comma (PC). It can be found as the difference between a major second of 9:8 from, e.g., C to D and two minor seconds. In the equal-tempered scale, two minor seconds will always equal a major second, but in the Pythagorean scale, such is not the case. A minor second, such as from B to \({{\text C}}_{2}\), is the ratio \(256{:}243 = 2^{8}/3^{5}\). The ratio between a major second and two minor seconds, which in the equal-tempered scale is exactly unity, is in the Pythagorean scale,

$$ {\text {PC}} = \left( 3^{2}/2^{3} \right) / \left( 2^{8}/3^{5} \right) ^{2}= 3^{12}/2^{19} \approx 1.01364, \quad \text {i.e., about } 23.5 \; \text {cents}. \tag{3} $$

The Pythagorean comma will appear as the ratio between each pair \({{{\text {C}}}^{\sharp }}\) and \({{{\text {D}}}^{\flat }}\), \({{\hbox {D}}^{\sharp }}\) and \({{\hbox {E}}^{\flat }}\), and so on, that is, all nine pairs that were enharmonically equivalent in the equal-tempered scale. It is also common to interpret the Pythagorean comma as the difference or ratio between twelve perfect fifths, \((3/2)^{12}\), and seven octaves, \(2^{7}\). In the equal-tempered scale, this difference is distributed about equally over all the twelve tempered semitones, which means that the notes in the equal-tempered scale are slightly out of tune compared to the more naturally harmonious integer-based scales.
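The two interpretations of the Pythagorean comma can be checked against each other with exact arithmetic. The snippet below is only a sketch, and the variable names are illustrative.

```python
from fractions import Fraction
import math

major_second = Fraction(9, 8)      # e.g. C to D
minor_second = Fraction(256, 243)  # e.g. B to C2

# First reading: a major second divided by two minor seconds.
pc_from_seconds = major_second / minor_second ** 2

# Second reading: twelve perfect fifths divided by seven octaves.
pc_from_fifths = Fraction(3, 2) ** 12 / Fraction(2) ** 7

print(pc_from_seconds, pc_from_fifths)
print(1200 * math.log2(float(pc_from_seconds)))  # size of the comma in cents
```

Both expressions reduce to the same fraction, \(3^{12}/2^{19}\).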

A third scale uses just, or pure, intonation, which is often used in folk music and some non-Western music, especially if it is based on a pentatonic scale, such as the five notes C, D, E, G, and A. The principle is to use the smallest integer ratios possible, as shown in the right column of Table 1.

The just scale differs from the Pythagorean C major scale in just three places, A, E, and B, which are replaced by notes \({\hbox {A}}_{{\text {J}}}\), \({\hbox {E}}_{{\text {J}}}\), and \({\hbox {B}}_{{\text {J}}}\), where the subscript J denotes just intonation. A reappreciation of just intonation in Renaissance practice was given in [3], and the online version of that paper also gives auditory examples.

Just intonation also exhibits differences between sharp and flat notes. Of special interest is the small difference between the Pythagorean E and the \({\hbox {E}}_{{\text {J}}}\) of just intonation. It is called the syntonic comma (SC), and its value is

$$ {\text {SC}} = (81/64)/(5/4) = 81/80 = 1.0125, \quad \text {i.e., about } 21.5 \; \text {cents}. \tag{4} $$

Another interesting difference is that between the Pythagorean comma and the syntonic comma. It is called the schisma and has a value of

$$ {\text {Schisma}} = 5 \cdot 3^{8}/2^{15} \approx 1.00113, \quad \text {i.e., about } 1.95 \; \text {cents}. \tag{5} $$
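A short check, under the definitions given above, that dividing the Pythagorean comma by the syntonic comma yields the stated schisma. The helper `cents` is an illustrative name, not something defined in the article.

```python
from fractions import Fraction
import math

def cents(ratio: Fraction) -> float:
    """Size in cents of an interval with the given frequency ratio."""
    return 1200 * math.log2(float(ratio))

pythagorean_comma = Fraction(3 ** 12, 2 ** 19)
syntonic_comma = Fraction(81, 64) / Fraction(5, 4)  # Pythagorean E over just E
schisma = pythagorean_comma / syntonic_comma

for name, ratio in [("PC", pythagorean_comma), ("SC", syntonic_comma), ("schisma", schisma)]:
    print(f"{name:8s} {str(ratio):>16s} {cents(ratio):7.3f} cents")
```

Because cents are logarithmic, the schisma in cents is simply the difference between the two commas in cents.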

Geometric Constructions

We offer here several figures that display the positions of the seven notes of the Pythagorean diatonic scale as well as five sharp and five flat notes. As a result, the Pythagorean comma is displayed geometrically in multiple places as the difference between the flat and sharp notes. The scale of just intonation can be constructed similarly. This construction will visualize the syntonic comma at the three notes that differ in the two scales. The tiny difference between the syntonic and Pythagorean commas, or schisma, can also be visualized.

Figure 1

A \(30^{\circ}\)–\(60^{\circ}\)–\(90^{\circ}\) right triangle with a linear frequency axis from the origin O to \({{\text C}}_{2}\), whose frequency is twice that of \({\text {C}}=1\). The construction starts at C and then doubles that note to \({{\text C}}_{2}=2\). Then one follows the black arrow to the point \({{\hbox {C}}^{\prime }_{2}}\), from which the red normal to the frequency axis meets it at the point \({\text {G}}=3/2\). The black perpendicular line from C to \({\text {F}}'\) is then drawn, followed by the blue line, which meets the frequency axis at \({\text {F}} = 4/3\).

We begin with the \(30^{\circ}\)–\(60^{\circ}\)–\(90^{\circ}\) right triangle of Figure 1. The horizontal base of this triangle, the hypotenuse, is the frequency axis, with 0 frequency at the origin O at the left, the note C normalized to frequency 1 in the middle, and \({{\text C}}_{2}=2\) on the right.

The construction for finding the notes begins by setting the position of \({\text {C}}=1\). This note is doubled by following the black arrow to \({{\text C}}_{2}\) before continuing to the next perfect fourth downward (division by 4/3) via \({{\hbox {C}}^{\prime }_{2}}\) and the red line to \({{\text {G}}}=3/2\). The next note is found by again starting at C and following the perpendicular to \({{\text {F}}}'\) and projecting it down the blue line to \({{\text {F}}}=4/3\). This operation represents going upward by a fourth, i.e., multiplication by 4/3. The values for F and G can be confirmed by elementary geometry and trigonometry. It will be seen in the figures to come that the continuation of the operation corresponding to the red lines will result in all the diatonic notes of Table 1 except F, as well as all the sharp notes. Furthermore, the operation corresponding to the blue line will give F and all the flat notes.
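The trigonometric confirmation can be sketched numerically. The code below rests on an assumed reading of Figure 1: the \(30^{\circ}\) angle sits at the origin O, the long leg rises from O at that angle, the red path runs from a point on the axis perpendicularly to the long leg and then straight down to the axis, and the blue path runs straight up to the long leg and then back to the axis along the perpendicular to the leg. The function names are hypothetical.

```python
import math

# Assumption: the long leg leaves the origin O at 30 degrees above the axis.
LEG_ANGLE = math.radians(30)

def down_fourth(x: float) -> float:
    """Red path: perpendicular from (x, 0) to the long leg, then a normal to the axis."""
    # The foot of the perpendicular lies at distance x*cos(30 deg) from O
    # along the leg, so its projection on the axis is x*cos(30 deg)**2.
    return x * math.cos(LEG_ANGLE) ** 2

def up_fourth(x: float) -> float:
    """Blue path: straight up from (x, 0) to the long leg, then back along the leg's perpendicular."""
    height = x * math.tan(LEG_ANGLE)           # height of the leg above x
    return x + height * math.tan(LEG_ANGLE)    # horizontal run back to the axis

print(down_fourth(2.0))  # from C2 toward G
print(up_fourth(1.0))    # from C toward F
```

Since \(\cos ^{2} 30^{\circ} = 3/4\) and \(1+\tan ^{2} 30^{\circ} = 4/3\), the two paths correspond to multiplication by 3/4 and 4/3, up to floating-point rounding.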

We then demonstrate a simple rule for the geometric construction of the seven notes of the Pythagorean scale shown in Table 1. The scheme can be continued to yield all sharp and flat semitones and the three unique notes of just intonation. The key observation on which the rest of the article builds has already been demonstrated in Figure 1. It is that an interval of a perfect fourth either up or down is simple to construct geometrically in a \(30^{\circ}\)–\(60^{\circ}\)–\(90^{\circ}\) right triangle. This step is also essential for deriving the Pythagorean scale starting with the note C.

In addition, a correction by an octave may be required. This may take the form of an increase by an octave, i.e., a doubling along the frequency axis, as in moving from C to \({{\text C}}_{2}\). The doubling is shown in Figure 1 in that the triangle O–C–\({\text {F}}'\) is mirrored in the triangle C–\({\text {F}}'\)–\({{\text C}}_{2}\). Likewise, a decrease by an octave consists in dividing the frequency by two. These operations will now be applied to the Pythagorean and just scales.

Figure 2

The remaining Pythagorean whole steps can be constructed by going down successive perfect fourths, i.e., multiplication by 3/4, from \({{\text C}}_{2}\), and doubling the frequency whenever needed to stay between C and \({{\text C}}_{2}\), via \({{\text A}}_{0}\) and \({{\text B}}_{0}\), as shown by the red lines. The sequence of operations is G–D–A–E–B.

The Pythagorean Scale

The rest of the diatonic notes of the Pythagorean scale are now constructed in this order: G, D, A, E, B. The first step, from C to G, was shown by arrows in Figure 1. Continuing from G, the result of the additional steps is shown in Figure 2 in the red lines. Two notes outside of the main interval are found as intermediate results, \({{\text A}}_{0}\) and \({{\text B}}_{0}\), and doubled to fit in the principal interval.

The sharp semitones may also be found by going downward by perfect fourths, i.e., multiplication by 3/4, starting from B. This is shown by the red dashed lines in Figure 3. The first note found will then be \({{\hbox {F}}^{\sharp }}\). Then \({{{\text {C}}}^{\sharp }}\) is found, as shown in Figure 3, followed by \({{\hbox {G}}^{\sharp }}\). Likewise, the flat semitones are found by proceeding upward by perfect fourths, i.e., multiplication by 4/3, from F, giving \({{\hbox {B}}^{\flat }}\), as shown by the dashed blue lines. Half the value, \({{\hbox {B}}_{0}^{\flat }}\), is then found and projected upward to give \({\hbox {E}^{\flat }}\), and then \({{\hbox {A}}^{\flat }}\). Figure 3 also illustrates the first appearance of the Pythagorean comma as the difference between \({{\hbox {G}}^{\sharp }}\) and \({{\hbox {A}}^{\flat }}\), as hinted at in the introduction.

Figure 3

The first Pythagorean comma appears as the difference between \({{\hbox {G}}^{\sharp }}\) and \({{\hbox {A}}^{\flat }}\) and is denoted by arrows and PC. It is found when the three sharp notes \({{\hbox {F}}^{\sharp }}\), \({{{\text {C}}}^{\sharp }}\), \({{\hbox {G}}^{\sharp }}\) (dashed red lines) and the three flat notes \({{\hbox {B}}^{\flat }}\), \({{\hbox {E}}^{\flat }}\), \({{\hbox {A}}^{\flat }}\) (dashed blue lines) are added to the Pythagorean whole steps (solid lines).

Continuing the process of generating all sharp and flat notes results in Figure 4. The mathematical details can be found in the appendix at the end of this paper. Since the Pythagorean comma is the difference between the enharmonic sharp and flat notes, it appears in all whole-step intervals, i.e., as the ratios of \({{{\text {C}}}^{\sharp }}\) and \({{{\text {D}}}^{\flat }}\), \({{\hbox {D}}^{\sharp }}\) and \({{\hbox {E}}^{\flat }}\), \({{\hbox {F}}^{\sharp }}\) and \({{\hbox {G}}^{\flat }}\), \({{\hbox {G}}^{\sharp }}\) and \({{\hbox {A}}^{\flat }}\) (as already mentioned), and \({{{\text {A}}}^{\sharp }}\) and \({{\hbox {B}}^{\flat }}\). As noted, in an equal temperament tuning, each of these pairs would have had the exact same pitch. The Pythagorean comma is also found based on the half-step intervals as the ratios of \({{\hbox {B}}_{0}^{\sharp }}\) and C, E and \({{\hbox {F}}^{\flat }}\), \({{\hbox {E}}^{\sharp} }\) and F, and B and \({{\hbox {C}}_{2}^{\flat }}\).
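As a sketch of this continuation, the chains of sharps and flats can be generated with exact fractions and the nine pairs compared. Every note is folded into the octave from C to \({{\text C}}_{2}\), so the notes written with octave subscripts in the text appear here without them. The helper names and the ASCII note spellings (`#` for sharp, `b` for flat) are illustrative only.

```python
from fractions import Fraction

def fold(r: Fraction) -> Fraction:
    """Double or halve r until it lies in the interval [1, 2)."""
    while r < 1:
        r *= 2
    while r >= 2:
        r /= 2
    return r

def chain(start: Fraction, step: Fraction, names: list) -> dict:
    """Repeatedly multiply by step, folding each result into one octave."""
    notes, r = {}, start
    for name in names:
        r = fold(r * step)
        notes[name] = r
    return notes

notes = {"C": Fraction(1), "E": Fraction(81, 64), "F": Fraction(4, 3), "B": Fraction(243, 128)}
# Sharps: successive fourths down from B. Flats: successive fourths up from F.
notes.update(chain(notes["B"], Fraction(3, 4), ["F#", "C#", "G#", "D#", "A#", "E#", "B#"]))
notes.update(chain(notes["F"], Fraction(4, 3), ["Bb", "Eb", "Ab", "Db", "Gb", "Cb", "Fb"]))

PAIRS = [("C#", "Db"), ("D#", "Eb"), ("F#", "Gb"), ("G#", "Ab"), ("A#", "Bb"),
         ("B#", "C"), ("E", "Fb"), ("E#", "F"), ("B", "Cb")]
commas = {f"{hi}/{lo}": notes[hi] / notes[lo] for hi, lo in PAIRS}
for pair, ratio in commas.items():
    print(pair, ratio)
```

In each pair the sharp-side note lies above the flat-side note by the same ratio.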

Figure 4

Nine Pythagorean commas (PC) found from the Pythagorean whole notes (solid lines) and semitones. Dashed red lines represent sharp notes, while dashed blue lines represent flat notes. The Pythagorean comma, shown by a pair of arrows, appears inside every semitone as the relative distance between the sharp and the flat semitones and between the notes C, E, F, and B and the nearest dashed line.

Just Intonation

The just intonation scale is based on small whole-number ratios. As noted, it differs from the Pythagorean scale in the three notes \({\hbox {A}}_{{\text {J}}}=5/3\), \({\hbox {E}}_{{\text {J}}}=5/4\), and \({\hbox {B}}_{{\text {J}}}=15/8\). The note \({\hbox {A}}_{{\text {J}}}\) is the average of \({{\text C}}_{2}\) and F, and therefore the geometric construction consists in dividing the baseline between two notes into two equal parts, for instance by going from F to \({{\text C}}_{2}\) via K in an equilateral triangle and projecting down to get \({\hbox {A}}_{{\text {J}}}\). This is shown in green in Figure 5. Then \({\hbox {E}}_{{\text {J}}}\) is found as usual by decreasing from \({\hbox {A}}_{{\text {J}}}\) by a perfect fourth, i.e., multiplication by 3/4; \({{\hbox {B}}}_{\text {0J}}\) is found by a second decrease by a perfect fourth; and finally, \({\hbox {B}}_{{\text {J}}}\) is found by doubling. Figure 5 shows the diatonic Pythagorean scale with the three unique notes of the just scale overlaid.

Figure 5

The just scale, the syntonic comma, and the three unique notes of just intonation (green lines), i.e., \({\hbox {E}}_{{\text {J}}}\), \({\hbox {A}}_{{\text {J}}}\), \({\hbox {B}}_{{\text {J}}}\), along with the notes of the Pythagorean scale. The difference is the syntonic comma, as indicated by SC.

The difference between the Pythagorean E and that of just intonation is the syntonic comma. The same difference is found between the A's and the B's in the two scales. The syntonic comma between \({\hbox {A}}_{{\text {J}}}\) and A is marked in Figure 5 with SC.
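The construction of the three just notes, and the claim that each differs from its Pythagorean counterpart by the same comma, can be verified with a brief sketch. Variable names such as `A_J` and `B_0J` mirror the notation of the text; the rest are illustrative.

```python
from fractions import Fraction

FOURTH_DOWN = Fraction(3, 4)
F, C2 = Fraction(4, 3), Fraction(2)

A_J = (F + C2) / 2        # midpoint of F and C2 on the linear frequency axis
E_J = A_J * FOURTH_DOWN   # a perfect fourth below A_J
B_0J = E_J * FOURTH_DOWN  # another fourth down, landing below C
B_J = 2 * B_0J            # doubled back into the principal octave

pythagorean = {"A": Fraction(27, 16), "E": Fraction(81, 64), "B": Fraction(243, 128)}
just = {"A": A_J, "E": E_J, "B": B_J}

for name in "AEB":
    print(name, just[name], pythagorean[name] / just[name])
```

Note that the midpoint is an arithmetic mean of frequencies, which is why it can be found by bisecting a segment of the linear frequency axis.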

The diatonic Pythagorean and just intonation scales may now be combined into a single figure. However, the amount of detail is so large that the focus will now be on the interval from \({\hbox {A}}_{{\text {J}}}\) to B, as shown in Figure 6, making it possible to see the two commas more clearly. Here the difference between the Pythagorean and the syntonic commas, the schisma, is apparent. The minor half step, or limma, and the major half step, the apotome, are also indicated.

Figure 6

Pythagorean and syntonic commas with schisma, with all the details in the interval from \({\hbox {A}}_{{\text {J}}}\) to B, showing the Pythagorean comma, the syntonic comma, and the schisma as the difference between them. The minor and major half steps, the limma and the apotome, are also indicated.
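The quantities in this interval can be tabulated with a small sketch. It assumes that the notes involved are \({\hbox {A}}_{{\text {J}}}\), A, \({{\hbox {B}}^{\flat }}\), \({{{\text {A}}}^{\sharp }}\), and B, with \({{\hbox {B}}^{\flat }}=16/9\) and \({{{\text {A}}}^{\sharp }}=3^{10}/2^{15}\) obtained from the chains of fourths described earlier; the variable names are illustrative.

```python
from fractions import Fraction
import math

def cents(ratio: Fraction) -> float:
    """Size in cents of an interval with the given frequency ratio."""
    return 1200 * math.log2(float(ratio))

A_J = Fraction(5, 3)
A = Fraction(27, 16)
B_FLAT = Fraction(16, 9)               # one fourth above F
A_SHARP = Fraction(3 ** 10, 2 ** 15)   # five fourths below B, folded into the octave
B = Fraction(243, 128)

intervals = {
    "limma (A to B-flat)": B_FLAT / A,
    "apotome (A to A-sharp)": A_SHARP / A,
    "Pythagorean comma (B-flat to A-sharp)": A_SHARP / B_FLAT,
    "syntonic comma (A_J to A)": A / A_J,
}
intervals["schisma"] = (intervals["Pythagorean comma (B-flat to A-sharp)"]
                        / intervals["syntonic comma (A_J to A)"])

for name, ratio in intervals.items():
    print(f"{name:40s} {str(ratio):>16s} {cents(ratio):8.3f} cents")
```

The limma and the apotome together make up the Pythagorean whole step of 9/8 from A to B, and the apotome exceeds the limma by exactly the Pythagorean comma.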

The increase and decrease by a perfect fourth may be continued to yield \({{\hbox {B}}^{\flat \flat }}\), \({{\hbox {E}}^{\flat \flat }}\), \({{\hbox {A}}^{\flat \flat }}\), and so on, as well as new instances of the Pythagorean comma, but this is not illustrated in the figures in order not to make them overly complex (see the appendix). The methods used here may also be used to construct most if not all the different interval ratios promulgated by prominent Renaissance musical theorists as given in [3, Table 1].

Final Remarks

Before we end, let us just remark that many have sought to explore the connection between geometry and musical notes. Daniel Muzzulini reproduces a 1637 triangle-based figure from Descartes for constructing geometric progressions [10]. Based on this figure, it appears to be possible to construct some of the ratios corresponding to the white keys of the piano, the diatonic scale. Jahoda also has several geometric figures from which it is possible to obtain some of the ratios [6, Fig. 51 and tables pp. 87–91]. In [7], this was made more complete, and Figure 19 therein shows how all the ratios of the seven notes of the diatonic scale may be constructed. Ernest McClain demonstrated visualization of musical proportions by a succession of paper-folding operations [8]. A major scale of seven notes was found by a succession of eight folding maneuvers. Flat notes are found by an additional five operations in order to display a single occurrence of the Pythagorean comma. Even an instance of the syntonic comma is found. McClain rightly claims that the paper folding “approaches the elegance of geometry,” although it is somewhat impractical to fold a strip of paper accurately thirteen or more times. This paper complements these earlier works by geometrically finding all the instances of the Pythagorean commas of (3), the syntonic comma of (4), and the schisma of (5).

The well-known links between music and mathematics are represented in these geometric constructions. Since geometry is naturally comprehensible to a much wider circle than to those who appreciate mathematical beauty in other forms, these constructions may make the link between music and mathematics more accessible to a larger audience and may also be clarifying for students. In addition, the figures may further the understanding of the compromises inherent in different ways of distributing the commas and even of how these compromises relate to the characteristics and moods conveyed by different tunings and keys [1, Sect. 5.13].