1 Introduction

Explaining the auditory perceptions of consonance and dissonance (C and D) in music (Footnote 1) is a fascinating issue that has challenged many generations of scientists over the centuries. Even nowadays, it is a subject open to scientific discussion, since the actual functioning of our hearing system is not fully understood. The aim of the present paper is to extend our previous work [1] from dyads (intervals) to triads (triadic chords), contributing to the scientific discussion about C and D from the point of view of physics and its methodologies (Footnote 2).

A brief historical review is useful to understand the state of the art. With the beginning of modern science in the 17th century, the problem of justifying the perceptions of C and D on a physical basis started to be formulated in a quantitative way and with an increasing amount of mathematical sophistication. The coincidence theory proposed by Galilei [4], likely building on arguments by Benedetti and previous scholars [5], is in fact based on the periodicity of the dyad's waveform [1]. These ideas triggered contributions and debates from both musicians and scientists deep into the 18th century [5], including, e.g. Euler [6] and Riccati [7]. In relation to the discoveries about higher harmonics pioneered by Mersenne et al., arguments about the role of the fundamental bass in music were formulated by Rameau [8] and Tartini [9]; Estève [10] related a dyad's consonance to the largest presence of common harmonics, as did Pizzati [11], who further discussed the issue in relation to the fundamental bass [12]. The main criticism against these periodicity and harmonicity theories was the fact that the associated C and D indicators are discontinuous functions of the frequency ratios characterizing dyads and chords. The first pioneering attempts to obtain an experimentally based continuous C and D function were carried out by Foderà [13] in the first half of the 19th century. A different approach to the field appeared in the second half of the 19th century, when Helmholtz [14] suggested that C and D are related to the absence of the roughness sensation due to beats; since the associated C and D indicator is naturally continuous, this approach dominated until late in the 20th century and was further refined by Plomp and Levelt [15] and Hutchinson and Knopoff [16, 17]. More recently, the drawbacks of the roughness approach [18] stimulated a re-evaluation of the periodicity and harmonicity approaches, as done, e.g. by Tenney [19] and Gill and Purves [20].

Summarizing, the scientific literature has developed two main categories of explanations for C and D, which we denote here for short by “roughness” and “compactness”, the latter including the two sub-categories of periodicity and harmonicity. For an extensive review of recent models, see for instance Ref. [21]. As already mentioned, the compactness and roughness approaches have traditionally been considered to be alternative and somewhat competing. However, none of their representative models has emerged as a fully satisfactory explanation of the perceptual data about C and D.

Focusing on dyads, in Ref. [1], we: (1) considered various historical and recent indicators, establishing whether or not they have some physical foundation related to the periodicity or harmonicity approaches; (2) showed that periodicity and harmonicity indicators are essentially equivalent; (3) proposed a method to extend compactness models to the continuum; and (4) carried out a complete analysis to assess whether a single refined model among the two categories of compactness and roughness, or rather a combination of them, provides a satisfactory explanation of the perceptual data. We found that combined models, with similar weight attributed to compactness and roughness, are highly successful when compared with the perceptual observations, and perform significantly better than the constituent models: This nontrivial result demonstrates that compactness and roughness are both fundamental ingredients of an effective explanation of C and D [1].

The goal of the present work is to extend compactness models from dyads to triads. We first find a generalization to triads for each compactness model studied in Ref. [1]. This requires a mathematical study of the fundamental bass, that is the virtual frequency with period equal to that of the waveform obtained by superposing the triad's constituent waveforms. This virtual frequency is nevertheless “real” and extremely important from the psychoacoustic point of view. The predictions of the compactness models then have to be compared with the perceptual data. In particular, we exploit the results of a psychoacoustic test performed by Bowling et al. [22], where just intonation was used to generate triadic sounds within an octave. In order to numerically estimate the performance of each compactness model, we carry out a Chi-square analysis. Since the C and D indicators associated to compactness models are intrinsically discontinuous, we develop a procedure to extend them to the continuum in the case of triads. In particular, following the approach of Ref. [1], we exploit a Gaussian smoothing with standard deviation equal to the discrimination limen of the hearing system. This also allows us to estimate C and D for other temperaments, for instance the equal temperament.

The paper is organized as follows. In Sect. 2, we comment on the methodology adopted by Bowling et al. [22] to carry out their test and discuss their results. Section 3 is devoted to the mathematical study of the fundamental bass. In Sect. 4, we consider compactness models, of the periodicity and harmonicity type, showing that they are essentially equivalent. In Sect. 5, we compare compactness model predictions with perceptual data by means of a Chi-square analysis. In Sect. 6, the compactness indicators are extended to the continuum, and comments on the equal temperament are provided. Our comments and conclusions are discussed in Sect. 7. Appendix A is devoted to Euler's indicator [6].

2 Previous tests

There are not many psychoacoustic tests available for triads. An interesting one has been performed by Rasmussen et al.  [18] in 2014, with the goal of comparing the C and D rankings of dyads and triads. Overall, the latter were not found to be more dissonant than dyads. The authors thus suggested that the hypothesis according to which consonance decreases with the amount of interaction between present harmonics might not hold for chords. In other words, the roughness approach cannot account for the observed results: This supports the relevance of the compactness approach. Unfortunately, Ref.  [18] considers only a small set of triads, so it is not suitable for the sake of our analysis.

A relevant test for the sake of our discussion has been carried out by Bowling et al.  [22] in 2018. They tested all 12 dyads, 66 triads, and 220 tetrads that can be formed using the intervals specified by the chromatic scale over one octave, with just intonation ratios. Individual tones were created using a synthesized piano. Participants were instructed about the test, defining consonance as the musical pleasantness or attractiveness of a sound, and the opposite for dissonance. On each trial of the experiment, the participants heard a single chord and provided a rating of consonance/dissonance using a four-point scale: “quite consonant”; “moderately consonant”; “moderately dissonant”; and “quite dissonant”.

The results of Ref. [22] for the 66 triads, normalized to a scale from 0 to 1 in order of increasing consonance, are reproduced in Table 1. The first column displays the number assigned to each chord, according to an organization criterion linked to the pitch class concept. The pitch class of a frequency \(f_i\) takes values in the interval [0, 12) and is given by \(p_{c_i}=[9+12 \log _2 (f_i/440\, \textrm{Hz})] \textrm{mod} \,12\). Introducing the three frequencies \(f_1< f_2 < f_3\), a triad can be characterized by a pitch class set, \((p_{c_1},p_{c_2},p_{c_3})\), as shown in the second column (Footnote 3) of Table 1. In this way, for instance, all C's and G's have pitch class 0 and 7, respectively. For an easier understanding of the type of chord, we display in the third column the notes associated to the pitch class set.
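The pitch class formula above can be illustrated with a minimal Python sketch. The function name `pitch_class` and the choice of equal-tempered test frequencies are assumptions made here for illustration only; they are not part of Ref. [22].

```python
import math

def pitch_class(freq_hz: float, ref_hz: float = 440.0) -> float:
    """Pitch class in [0, 12): 9 + 12*log2(f / 440 Hz), taken modulo 12."""
    return (9.0 + 12.0 * math.log2(freq_hz / ref_hz)) % 12.0

# Illustrative equal-tempered frequencies (assumed): G4 lies 2 semitones below A4.
g4 = 440.0 * 2.0 ** (-2 / 12)
print(pitch_class(g4))      # close to 7
print(pitch_class(880.0))   # A5, one octave above the reference: pitch class 9
```

For just intonation tones the formula returns values that are close to, but in general not exactly, integers; for instance, a just fifth above C gives about 7.02, which is then associated with pitch class 7.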

Table 1 Triad ratings according to Bowling et al.  [22], rescaled to the range [0, 1]

In the fourth column of Table 1, we include, when possible, the name of the triad according to the common denomination used in music theory. From the music theory point of view, the notation should describe the constituent intervals. Indeed, triads are denoted in reference to the type of interval formed by the dyad \(f_1\) and \(f_2\), and the one formed by the dyad \(f_1\) and \(f_3\), as shown in the fifth column. For instance, if such intervals are a M3 and a P5, the chord is called a major chord in the rest position, and it is denoted by maj (r); if such intervals are instead a m3 and a P5, the chord is called a minor chord in the rest position, and it is denoted by min (r). Diminished and augmented chords at rest are characterized, respectively, by m3 and TT and by M3 and m6. In addition, chords can be in one among three possible positions: rest (r), first inversion (1), and second inversion (2). A relevant category of triads is the so-called power chords, where \(f_3\) is fixed to be the octave of \(f_1\), that is \(f_3/f_1=2\); the intermediate frequency \(f_2\) thus fully characterizes power chords. We denote them by “pc X”, where X stands for the interval formed by \(f_2\) and \(f_1\), for instance X=P5, P4, M3.

From the physical point of view, it is useful to characterize a triad by the pair of frequency ratios \((f_2/f_1,f_3/f_1)\), written in simple form (that is without common factors between numerator and denominator), as shown in the sixth column of Table 1; the third ratio \(f_3/f_2\) can thus be easily determined. The mean value and the standard deviation \(\sigma\) of the score assigned to each triad [22] (upon rescaling the results to the range [0, 1]) are reported in the seventh and eighth columns. For an easy visual comparison, these normalized results are reproduced in Fig. 1.

Fig. 1 Bowling et al. [22] results, rescaled to the C and D range [0, 1]

The two best performing triads are pc P5 and maj (r), with a mean consonance value larger than 0.9. This is not surprising: Only the consonant P5 interval is involved, together with the M3 for the latter. The third best performing triad is pc P4; the fourth best performing are maj (1) and maj (2), with the same score. Then, in decreasing order of consonance, we find: pc M3, min (r), pc m6, sus 4, pc M6, min (1), sus 2, min (2), triad n.30, etc. As for the most dissonant triads, in order of increasing consonance, we find: triad n.1 (or cluster chord); triad n.9; and triads n.10, 22, 57, 65, and 66, all with the same score.

The results of the test by Bowling et al. [22] are globally reasonable, but there are some critical points. It is generally agreed by musicians that maj (2) is more consonant than maj (r); the opposite result obtained by the participants in the test might be due to a cultural bias, maj (r) being very popular in Western music. The same cultural bias might apply to min (r), which also gets a high score from the test; musicians would for instance typically consider sus 4 and pc m6 to be more consonant than min (r).

3 The fundamental bass

For dyads, compactness models and their generalization to the continuum were studied in detail in Ref. [1], stressing the relevance of the fundamental frequency \(f_0\) of the sound wave obtained by summing the two sound waves having (fundamental) frequencies \(f_1\) and \(f_2\). This frequency is important for both periodicity and harmonicity models. It is referred to in many ways, including missing fundamental or virtual pitch.

It is indeed a virtual frequency, in the sense that it is not actually present in the spectrum of the dyad; nevertheless, it plays an important role in psychoacoustics, as the hearing system is able to reconstruct it (see, e.g. [24] and references therein). It is by now accepted that the brain processes the information present in a set of tones to calculate the associated fundamental frequency \(f_0\); the precise way in which it does so is still a matter of debate, but the processing seems to be based on an autocorrelation involving the timing of neural impulses in the auditory nerve [25]. In music theory, the relevance of this fundamental frequency \(f_0\) is also largely recognized: It corresponds to the frequency of the fundamental bass which, as is well known, plays a crucial role in the characterization of harmonic structures [8, 9]. In the following, we will adopt the music theory denomination.

Since our aim is to generalize compactness models to triads, which are composed of three tones with frequencies \(f_1< f_2 < f_3\), we first calculate the fundamental bass of such a triad. By definition, this corresponds to the fundamental frequency of the sound wave obtained by summing the three sound waves having (fundamental) frequencies \(f_1\), \(f_2\), and \(f_3\). We fix \(f_1\) to some frequency and define \(f_2/f_1= M_{12}/N_{12}\) and \(f_3/f_1= M_{13}/N_{13}\), where \(M_{12},N_{12}\) and \(M_{13},N_{13}\) are integers. For \(ij=12,13\) (but not for \(ij=23\)), we introduce the integers

$$\begin{aligned} m_{ij}= \frac{M_{ij}}{ \textrm{GCD}[N_{ij},M_{ij}]} ,\,\,\, n_{ij}= \frac{N_{ij}}{ \textrm{GCD}[N_{ij},M_{ij}]},\,\,\, \mathrm{for} \,\, ij=12,13.\end{aligned}$$
(1)

With this notation, \(f_2/f_1=m_{12}/n_{12}\) and \(f_3/f_1=m_{13}/n_{13}\) are written in the simplest form, namely with numerator and denominator having no prime factors in common.

As for the ratio \(f_3/f_2\), since it can be written as the product of \(f_3/f_1\) and \(f_1/f_2\), one would obtain \(f_3/f_2= m_{13} n_{12}/(n_{13} m_{12})\), which in general is not in its simplest form. Nevertheless, it can be written in its simplest form by introducing the simple ratio \(m_{23}/n_{23}=f_3/f_2\), where

$$\begin{aligned} m_{23}= \frac{ m_{13} \,n_{12}}{ \textrm{GCD}[ n_{13} \, m_{12}, m_{13} \, n_{12}]}{,} \,\,\, n_{23}= \frac{n_{13} \,m_{12}}{\textrm{GCD}[ n_{13} \, m_{12}, m_{13} \, n_{12}]}{.}\end{aligned}$$
(2)

The fundamental bass of the dyad with frequency ratio \(f_j/f_i\), with \(ij=12, 13\) and 23, has frequency given by:

$$\begin{aligned} f^{(12)}_0= \frac{1}{n_{12}} \,f_1= \frac{1}{m_{12}}\, f_2, \,\,\,\, f^{(13)}_0= \frac{1}{n_{13}} \, f_1=\frac{1}{m_{13}} \, f_{3}, \,\,\,\, f^{(23)}_0= \frac{1}{n_{23}} \, f_2=\frac{1}{m_{23}} \, f_3. \end{aligned}$$
(3)

The fundamental bass of these three fundamental basses can be calculated as follows. Defining

$$\begin{aligned} n = \frac{n_{12} \,n_{13}}{\textrm{GCD}[n_{12},n_{13}]}, \,\,\,m = n \frac{m_{12}}{n_{12}}=\frac{m_{12} \,n_{13}}{\textrm{GCD}[n_{12},n_{13}]} ,\,\,\, \ell = n \frac{m_{13}}{n_{13}}=\frac{n_{12} \,m_{13}}{\textrm{GCD}[n_{12},n_{13}]}, \end{aligned}$$
(4)

the fundamental bass of the three (fundamental) frequencies constituting the triad is

$$\begin{aligned} f_0= \frac{1}{n} \, f_1 = \frac{1}{m} \, f_2 = \frac{1}{\ell } \, f_3. \end{aligned}$$
(5)

As already stressed, it is a virtual sound, not present in the spectrum of the triad, but it is of remarkable importance for the hearing system and for music theory. So, let us focus on the harmonic spectrum of a tone with (fundamental) frequency \(f_0\) and denote it by \(\{n_0 f_0\}\) with \(n_0=1,2,3,...\). Clearly, the constituent (fundamental) tones of the triad discussed previously correspond to the harmonic numbers \(n_0=(n, m,\ell )\) of the harmonic spectrum of \(f_0\). This offers another useful notation to denote a chord; in the last column of Table 1, we show the values of \((n,m,\ell )\) for each of the 66 triads that can be built within the octave, using a just intonation scale.
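As a minimal sketch of Eqs. (1), (4), and (5), the harmonic numbers \((n,m,\ell )\) and the fundamental bass can be computed with exact rational arithmetic. The function names `harmonic_numbers` and `fundamental_bass` are hypothetical, and the just intonation ratios used in the examples (5/4 for M3, 3/2 for P5, 16/15 for m2, 9/8 for M2) are assumed here.

```python
from fractions import Fraction
from math import gcd

def harmonic_numbers(r12, r13):
    """Return (n, m, l) of Eq. (4) for a triad with f2/f1 = r12 and f3/f1 = r13."""
    r12, r13 = Fraction(r12), Fraction(r13)
    # Fraction keeps ratios in lowest terms, i.e. m_ij / n_ij of Eq. (1).
    m12, n12 = r12.numerator, r12.denominator
    m13, n13 = r13.numerator, r13.denominator
    g = gcd(n12, n13)
    n = n12 * n13 // g   # least common multiple of n12 and n13
    m = m12 * n13 // g
    l = n12 * m13 // g
    return n, m, l

def fundamental_bass(f1_hz, r12, r13):
    """Frequency f0 = f1 / n of Eq. (5)."""
    n, _, _ = harmonic_numbers(r12, r13)
    return f1_hz / n

print(harmonic_numbers(Fraction(5, 4), Fraction(3, 2)))    # maj (r): (4, 5, 6)
print(harmonic_numbers(Fraction(16, 15), Fraction(9, 8)))  # cluster chord: (120, 128, 135)
```

With \(f_1\) fixed to C\(_4=261.63\) Hz, `fundamental_bass(261.63, Fraction(3, 2), 2)` returns half of \(f_1\) for pc P5, that is one octave below \(f_1\).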

The height of the fundamental bass has itself been interpreted as a C and D indicator [11, 12]: The smaller \((n,m,\ell )\) are, the more the three frequencies belong to the low harmonics of the fundamental bass, and the greater the consonance.

In particular, the knowledge of n for each of the 66 triads of Table 1 allows for an easy calculation of \(f_0\), according to Eq. (5). In Fig. 2, we show a graphical representation of the magnitude of \(f_0\), taking \(f_1\) fixed, say for instance to C\(_4\); a larger ellipse is associated to a higher value of \(f_0\).

Fig. 2 Visual representation of the height of the fundamental bass \(f_0\), for the 66 just intonation triads within one octave, taking \(f_1\) fixed (say to C\(_4\)). A larger ellipse and colder colors are associated to triads with increasingly higher values of \(f_0\)

From Table 1 and Fig. 2, we can see that, for triads within the octave, the highest value of the fundamental bass occurs for pc P5, and it is one octave below \(f_1\); indeed, in this case \(n=2\), so that \(f_0=f_1/2\)=C\(_3\). The second highest value of the fundamental bass occurs when \(n=3\) and corresponds to \(f_0=f_1/3=\)F\(_2\): This is the case for pc P4, pc M6, and maj (2). The third highest fundamental bass is lower than \(f_1\) by two octaves: \(n=4\) for both maj (r) and pc M3, so that \(f_0=f_1/4=\)C\(_2\). This is in agreement with the common experience of musicians that maj (2) is more consonant than maj (r), although the test by Bowling et al. [22] showed the opposite. The fourth highest fundamental bass is \(f_0=f_1/5\), which is obtained by dim (r), maj (1), and dim (1). The other triads have a lower \(f_0\). For instance, sus 4 has \(n=6\), while min (r) has \(n=10\). The cluster chord n.1 and the triad n.10 are the chords with the lowest fundamental bass, having \(n=120\).

This shows that the height of the fundamental bass is indeed an efficient C and D indicator. It has, however, some weak points: dim (r) has higher fundamental bass than min (r), but the common experience by musicians is that dim (r) is more dissonant than min (r)—in agreement with Bowling et al.  [22] test. It is reasonable to expect that C and D indicators including the effect of roughness should be better suited to explain this feature.

It is worth considering which triads have the fundamental bass as high as possible, that is \(f_0=f_1\); this requires \(n=1\), while m and \(\ell\) are not constrained. We call them super harmonic chords, as \(f_2\) and \(f_3\) belong to the harmonic series of \(f_1\). Necessarily, \(f_3\) is in the second octave, so they are compound chords, which have not been tested by Bowling et al. [22]. Table 2 displays three such super harmonic chords, those with \((n, m,\ell )\) equal to (1, 2, 3), (1, 2, 4), and (1, 3, 4).

Notice also that, in general, the ratio \(f_3/f_2\), even in its simple form, might involve larger integers in the numerator and denominator with respect to those characterizing \(f_2/f_1\) and \(f_3/f_1\) in the just intonation scale. This is for instance the case for the triad n.1, for which \(f_3/f_2= (3^3 \times 5)/2^7\). In general, this happens for triads having a low fundamental bass. On the contrary, triads with a high fundamental bass typically display small integers in the numerator and denominator of \(f_3/f_2\). This is for instance the case for maj (2), which has \(f_3/f_2= 5/4\).
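As a worked illustration of Eq. (2), consider maj (2), assuming the just intonation ratios \((f_2/f_1,f_3/f_1)=(4/3,5/3)\), so that \(m_{12}=4\), \(n_{12}=3\), \(m_{13}=5\), and \(n_{13}=3\). The unreduced ratio \(m_{13}\, n_{12}/(n_{13}\, m_{12})=15/12\) has a common factor of 3, which Eq. (2) removes:

$$\begin{aligned} \textrm{GCD}[ n_{13} \, m_{12}, m_{13} \, n_{12}] = \textrm{GCD}[12,15]=3, \,\,\, m_{23}= \frac{15}{3}=5, \,\,\, n_{23}= \frac{12}{3}=4, \,\,\, \frac{f_3}{f_2}=\frac{5}{4}. \end{aligned}$$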

Table 2 Examples of triads beyond the octave having \(f_0=f_1\)

4 Compactness models

There are two categories of compactness models, those based on periodicity and those based on harmonicity. In Ref. [1], we showed explicitly for dyads that they are associated to indicators that are numerically very similar. Indeed, the criterion of having the period of the dyadic sound wave as short as possible and the criterion of having the highest possible number of common harmonics in the constituent dyad tones are practically equivalent. We expect that the same holds in the case of triads.

4.1 Periodicity indicators

We start by analysing periodicity models for triads. According to the periodicity approach, the shorter the period of the triadic sound wave with respect to the period of its component tones, the more consonant the triad. So, the more consonant chords should be those with the highest value of \(f_0\) with respect to the chord's frequencies.

Comparing the fundamental bass \(f_0\) with the lowest/middle/highest sound of the chord gives the following three periodicity consonance indicators

$$\begin{aligned} I^P_1 = \frac{f_0}{f_1} =\frac{1}{n} ,\,\,\, I^P_2 = \frac{f_0}{f_2}=\frac{1}{m} ,\,\,\,I^P_3 = \frac{f_0}{f_3}=\frac{1}{\ell }. \end{aligned}$$
(6)

These indicators span a smaller range than the interval [0, 1] for the 66 triads of Table 1; in the following, we will discuss how to normalize them. Notice that \(I^P_1\) corresponds to using just the fundamental bass as an indicator, since it compares \(f_0\) with the lowest tone of the triad \(f_1\), taken to be fixed; the dependence on the values of \(f_2\) and \(f_3\) is then fully encoded in the value of \(f_0\). The indicators \(I^P_{2}\) and \(I^P_3\) have instead a stronger dependence on the fact that the middle and highest tone of the triad should be close to \(f_0\).

The indicator \(I^P_3\) is particularly interesting. It compares the period of the fundamental bass with the period of the highest frequency of the triad: The smaller \(\ell\) is, the more compact the triad is from the periodicity point of view. This indicator is somewhat related to an extension to triads of Galilei's arguments [4] about the coincidence theory for dyads (Footnote 4).

Comparing the fundamental bass frequency \(f_0\) with some mean value of the triad frequencies might be an interesting option to find more sophisticated periodicity indicators. We define the arithmetic, geometric, and harmonic means

$$\begin{aligned} f_A = \frac{f_1+f_2+f_3}{3}, \,\,\,\,\, f_G = ( f_1 f_2 f_3 )^{1/3}, \,\,\,\, f_H = \frac{3}{\frac{1}{f_1}+\frac{1}{f_2}+\frac{1}{f_3}}, \end{aligned}$$
(7)

so that

$$\begin{aligned} I^P_A = \frac{f_0}{f_A} =\frac{3}{n+m+\ell }, \,\,\,\,\, I^P_G = \frac{f_0}{f_G}=\frac{1}{ (n \,m \,\ell )^{1/3}}, \,\,\,\,\, I^P_H = \frac{f_0}{f_H} = \frac{m\, \ell + n \,\ell + n\, m}{3\, n\,m\, \ell }. \end{aligned}$$
(8)

These indicators are expected to give predictions in between \(I^P_1\) and \(I^P_3\), and they correspond to the following combinations of the three indicators above,

$$\begin{aligned} I^P_A = \frac{3}{ \frac{1}{I^P_1} + \frac{1}{I^P_2}+\frac{1}{I^P_3}}, \,\,\,I^P_G= (I^P_1 I^P_2 I^P_3)^{1/3}, \,\,\,I^P_H= \frac{1}{3} (I^P_1+ I^P_2+ I^P_3). \end{aligned}$$
(9)

The indicator \(I^P_H\) can be seen as an extension to triads of the dyadic consonance indicator suggested, on the basis of very different arguments, in Ref.  [26].
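The periodicity indicators of Eqs. (6) and (8) depend only on \((n,m,\ell )\), so they can be sketched in a few lines. The function name `periodicity_indicators` and the dictionary keys are illustrative choices made here, not notation from the original analysis; the geometric-mean indicator is returned as a float because it is irrational in general.

```python
from fractions import Fraction

def periodicity_indicators(n, m, l):
    """Unnormalized periodicity indicators of Eqs. (6) and (8) for harmonic numbers (n, m, l)."""
    return {
        "1": Fraction(1, n),                                     # f0 / f1
        "2": Fraction(1, m),                                     # f0 / f2
        "3": Fraction(1, l),                                     # f0 / f3
        "A": Fraction(3, n + m + l),                             # f0 / arithmetic mean
        "G": float(n * m * l) ** (-1.0 / 3.0),                   # f0 / geometric mean
        "H": Fraction(m * l + n * l + n * m, 3 * n * m * l),     # f0 / harmonic mean
    }

# Example: maj (r), with (n, m, l) = (4, 5, 6)
ind = periodicity_indicators(4, 5, 6)
print(ind["A"], ind["H"])  # 1/5 37/180
```

In this example the three mean-based indicators fall between \(I^P_3=1/6\) and \(I^P_1=1/4\), and \(I^P_H\) coincides with the arithmetic mean of \(I^P_{1,2,3}\), as stated in Eq. (9).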

More in general, any non-decreasing monotonic function of the three indicators \(I^P_i\) (\(i=1,2,3\)) can be used to define a consonance indicator somewhat related to periodicity. For instance, even the function \(\ln I^P_3 = - \ln \ell\) can be viewed as a consonance indicator related to periodicity.

Equivalently, any non-decreasing monotonic function of the three numbers \(n,m,\ell\) can be used to define a dissonance indicator somewhat related to periodicity. For instance, the product nm was suggested by G.B. Benedetti as an estimator for dyad rankings [5]. Of course, not all such guessed indicators turn out to explain perceptual results in a satisfactory way. This is for instance the case for the indicator proposed by L. Euler [6] and discussed in App. A for historical completeness.

4.2 Harmonicity indicators

According to the harmonicity approach, the more harmonics the tones of a chord have in common, the more consonance is achieved. We showed in Ref. [1] that harmonicity models for dyads are essentially equivalent to periodicity ones. In order to generalize to triads the harmonicity models defined for dyads, we first introduce a few definitions.

The first (triple) coincidence of the harmonic sounds of a triad happens at

$$\begin{aligned} n_1^{c_1}\, f_1 =n_2^{c_1}\, f_2 = n_3^{c_1} \, f_{3}, \end{aligned}$$
(10)

where \(n_i^{c_1}\), \(i=1,2,3\), are integer numbers, with \(n_3^{c_1} \ge 1\) (the equality is included in order to include possible coincidences with the fundamental of \(f_3\)). Hence, we have that the ratios can be expressed as

$$\begin{aligned} \frac{f_2}{f_1} = \frac{m_{12}}{n_{12} } = \frac{n_1^{c_1}}{n_2^{c_1}},\,\,\,\frac{f_3}{f_1} = \frac{m_{13}}{n_{13} } = \frac{n_1^{c_1}}{n_3^{c_1}},\,\,\, \frac{f_3}{f_2} = \frac{m_{23}}{n_{23} } = \frac{n_2^{c_1}}{n_3^{c_1}}, \end{aligned}$$
(11)

where the ratios of the type \(m_{ij}/n_{ij}\) are in simple form, while those of the form \(n_i^{c_1}/n_j^{c_1}\) are in general not in simple form.

We define the integers \(\alpha _{12}\), \(\alpha _{13}\) and \(\alpha _{23}\) to be precisely those integers responsible for the non-simple form of \(n_i^{c_1}/n_j^{c_1}\), that is

$$\begin{aligned} n_1^{c_1} = \alpha _{12} m_{12} = \alpha _{13} m_{13},\,\,\, n_2^{c_1} = \alpha _{12} n_{12} = \alpha _{23} m_{23},\,\,\, n_3^{c_1} = \alpha _{13} n_{13} = \alpha _{23} n_{23}. \end{aligned}$$
(12)

Hence, it turns out that

$$\begin{aligned} \alpha _{12} =\frac{m_{13}}{\textrm{GCD}[m_{12},m_{13}]}, \,\,\,\alpha _{13} = \frac{m_{12}}{\textrm{GCD}[m_{12},m_{13}]}, \,\,\,\alpha _{23} = \alpha _{12} \frac{n_{12}}{m_{23}}, \end{aligned}$$
(13)

so that the integers \(n_i^{c_1}\) are given by

$$\begin{aligned} n_1^{c_1} = \frac{m_{13} m_{12} }{\textrm{GCD}[m_{12},m_{13}]} ,\,\,\, n_2^{c_1} = \frac{m_{13} n_{12} }{\textrm{GCD}[m_{12},m_{13}]},\,\,\,n_3^{c_1} = \frac{m_{12} n_{13} }{\textrm{GCD}[m_{12},m_{13}]}. \end{aligned}$$
(14)

Clearly, the smaller are the integers \(n_i^{c_1}\), the larger is the number of coincidences in the lower harmonics of the triad spectrum (which are typically those with the highest amplitudes), and the larger is consonance. So, any non-decreasing function of the ratios \(1/n_i^{c_1}\) can be used as a consonance indicator somewhat related to harmonicity, which turns out to be numerically similar to the periodicity indicators discussed previously.

A valuable argument to define a harmonicity indicator is to count the overlaps among the harmonic spectrum of the fundamental bass and each component of the triad. As already mentioned, the first coincidence is found at the frequency \(n_1^{c_1}\, f_1\); this corresponds to the harmonic number

$$\begin{aligned} n_0^{c_1}=n\, n_1^{c_1}=n\, \frac{m_{13} m_{12}}{ \textrm{GCD}[m_{12},m_{13}]} \end{aligned}$$
(15)

in the harmonic series expansion of \(f_0\). The common harmonics of \(f_0\) and \(f_i\), with \(i=1,2,3\), up to the first coincidence, are given by \(n_i^{c_1}\). Hence, a simple harmonicity indicator corresponds to the ratio between the number of harmonics that \(f_0\) shares with any harmonic of \(f_i\), divided by the number of all harmonics of \(f_0\), up to the first coincidence. We thus obtain

$$\begin{aligned} I^H_{S}=\frac{ n_1^{c_1}+n_2^{c_1}+n_3^{c_1} }{ n_0^{c_1}} =\frac{ m_{13} m_{12} + m_{13} n_{12} + m_{12} n_{13} }{n\, m_{13} m_{12} }, \end{aligned}$$
(16)

where twofold common harmonics are counted twice, and the first coincidence is counted three times.

If one wants to count the twofold and threefold coincidences just once, as suggested by Gill and Purves in Ref. [20], one can proceed as follows (Footnote 5). Up to the first coincidence, the harmonics in common between \(f_i\) and \(f_j\) are given by \(\alpha _{ij}\). Hence, double counting of harmonics is avoided by considering the ratio

$$\begin{aligned} I^H_{GP}=\frac{ (n_1^{c_1}+n_2^{c_1}+n_3^{c_1}) - ( \alpha _{12}+\alpha _{13} + \alpha _{23}) + 1 }{ n_0^{c_1}}. \end{aligned}$$
(17)

Notice that for super harmonic chords \(I^H_{GP}=1\), while this is not the case for \(I^H_{S}\), which is equal to 11/6, 7/4, and 19/12 for triads with \((f_2/f_1,f_3/f_1)= (2/1, 3/1), (2/1,4/1)\) and (3/1, 4/1), respectively. We now turn to discuss a proper normalization of the compactness indicators.
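The values quoted above for the super harmonic chords can be checked with a minimal sketch of Eqs. (12)–(17). The function name `harmonicity_indicators` is hypothetical; the code simply transcribes the definitions of \(\alpha _{ij}\), \(n_i^{c_1}\), and \(n_0^{c_1}\) using exact rational arithmetic.

```python
from fractions import Fraction
from math import gcd

def harmonicity_indicators(r12, r13):
    """Return (I_S, I_GP) of Eqs. (16) and (17) for f2/f1 = r12 and f3/f1 = r13."""
    r12, r13 = Fraction(r12), Fraction(r13)
    m12, n12 = r12.numerator, r12.denominator
    m13, n13 = r13.numerator, r13.denominator
    m23 = (r13 / r12).numerator              # f3/f2 in simple form, Eq. (2)
    n = n12 * n13 // gcd(n12, n13)           # Eq. (4)
    g = gcd(m12, m13)
    a12, a13 = m13 // g, m12 // g            # Eq. (13)
    a23 = a12 * n12 // m23                   # Eq. (13), always an integer
    n1, n2, n3 = a12 * m12, a12 * n12, a13 * n13   # Eq. (12)
    n0 = n * n1                              # Eq. (15)
    i_s = Fraction(n1 + n2 + n3, n0)
    i_gp = Fraction(n1 + n2 + n3 - (a12 + a13 + a23) + 1, n0)
    return i_s, i_gp

for ratios in [(2, 3), (2, 4), (3, 4)]:
    print(ratios, harmonicity_indicators(*ratios))
```

For the three super harmonic chords this reproduces \(I^H_S=11/6\), 7/4, and 19/12, with \(I^H_{GP}=1\) in all three cases.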

5 Comparison with perceptual data

In order to compare effectively Bowling et al.  [22] results for triads within one octave with the predictions of the periodicity and harmonicity indicators previously introduced, we have to normalize the latter so that they all predict maximal consonance (that is 1) for pc P5. So, we introduce normalized consonance indicators as

$$\begin{aligned} {\tilde{I}}^C_{X} = \frac{I^C_X}{I^C_X |_{\textrm{pc P5}} }, \,\,\, X=1,2,3,A,G,H,S,GP, \end{aligned}$$
(18)

where \(C=P,H\) stands for compactness, while P and H stand for the sub-categories of periodicity and harmonicity.
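The normalization of Eq. (18) amounts to dividing each indicator by its value for pc P5. A minimal sketch, using \(I^P_1=1/n\) with the values of n quoted in Sect. 3 for a few triads; the function name `normalize` and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction

def normalize(raw, reference="pc P5"):
    """Divide every indicator value by the value of the reference triad, as in Eq. (18)."""
    ref = raw[reference]
    return {name: value / ref for name, value in raw.items()}

# I^P_1 = 1/n, with n = 2, 3, 4, 10 for the triads below (values from Sect. 3)
raw_ip1 = {
    "pc P5": Fraction(1, 2),
    "maj (2)": Fraction(1, 3),
    "maj (r)": Fraction(1, 4),
    "min (r)": Fraction(1, 10),
}
print(normalize(raw_ip1))
```

By construction pc P5 is mapped to 1, while for instance min (r) is mapped to 1/5 by this indicator.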

Fig. 3 Bowling et al. [22] results (red) compared with the predictions of \({\tilde{I}}^P_{1,2,3}\), \({\tilde{I}}^P_{A,G,H}\), and \({\tilde{I}}^H_{S,GP}\), in the top, middle, and bottom panels, respectively

In Fig. 3, the results of Bowling et al. [22] (red) are compared with the predictions of all the periodicity and harmonicity indicators previously introduced. In general, the predictions of compactness models agree with perceptual data within one standard deviation. The error bars are, however, quite large. Notice also that periodicity and harmonicity models have quite similar predictions (the same happens for dyads [1]): This shows that the compactness of the waveform period and the compactness of the harmonic spectrum are practically equivalent requirements. Inside each sub-category, one can recognize that there are small differences between the models. For instance, the Galilei-inspired indicator \({\tilde{I}}^P_3\) is slightly larger than \({\tilde{I}}^P_1\); the indicators \({\tilde{I}}^P_{A,G,H}\) instead display negligible differences among them; the harmonicity indicator \(\tilde{I}^H_{GP}\) is slightly larger than \({\tilde{I}}^P_3\).

From Fig. 3, we can see that the most consonant triads according to the compactness models do get a high score in the perceptual test. However, a high test score is also associated to some triads that should not be so consonant according to the models: This is the case in particular for maj (r), maj (1), and for all minor chords, for which the test scores are much above the model expectations. As mentioned, this might be due to cultural familiarity, whose effect is possibly enhanced by the choice of the piano timbre in the test by Bowling et al. [22]. It would be interesting to perform a perceptual test with tones generated with a “neutral” timbre, as done in our previous work [1], to check whether such difference persists; it would also be interesting to include triads beyond one octave. Notice also that not only \({\tilde{I}}^P_1\), but all compactness models predict dim (r) to be more consonant than min (r); this unsatisfactory prediction should disappear by including the effect of roughness.

Using the results at our disposal now, we perform a reduced Chi-square analysis to compare periodicity and harmonicity models in a more quantitative way, for triads within one octave:

$$\begin{aligned} {{\tilde{\chi }}}^2 = \frac{1}{66} \sum _{n=1,..., 66} \frac{ (\tilde{I}^C_X(n) - m(n))^2}{\sigma (n)^2}, \end{aligned}$$
(19)

where m(n) and \(\sigma (n)\) are the means and standard deviations of Bowling et al. [22], displayed in Table 1. The results of the calculations are summarized in Table 3: For all models, a reduced Chi-square around unity is achieved, which means that compactness models reproduce experimental data in a satisfactory way. The differences in the reduced Chi-square among the compactness models are not very significant: This is also related to the fact that the standard deviations of the perceptual test of Ref. [22] are quite large. We can see that the best periodicity model is \({\tilde{I}}^P_3\), the Galilei-inspired model. The smallest reduced Chi-square among all compactness models is, however, obtained by \({\tilde{I}}^H_{GP}\), as was the case also for dyads [1]: This allows us to conclude that the extension from dyads to triads provides consistent results.
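The reduced Chi-square of Eq. (19) can be sketched as follows. The function name `reduced_chi_square` is hypothetical, and the three-entry input lists are made-up placeholder numbers used only to show the call; they are not the data of Table 1.

```python
def reduced_chi_square(predicted, mean, sigma):
    """Average of squared, sigma-weighted residuals, as in Eq. (19)."""
    if not (len(predicted) == len(mean) == len(sigma)) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(((p - m) / s) ** 2 for p, m, s in zip(predicted, mean, sigma))
    return total / len(predicted)

# Placeholder values (hypothetical), one entry per triad
model = [1.0, 0.50, 0.20]   # normalized indicator predictions
means = [0.9, 0.70, 0.25]   # perceptual mean ratings
sigmas = [0.1, 0.20, 0.10]  # perceptual standard deviations
print(reduced_chi_square(model, means, sigmas))
```

In the analysis of the text, the three lists would each have 66 entries, so that the sum is divided by 66 as in Eq. (19).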

Table 3 Reduced Chi-square for indicated periodicity and harmonicity models, calculated by using the 66 triads within one octave according to Bowling et al.  [22]

6 Extension to continuum

The extension to continuum is quite a delicate task and has been addressed in many different ways, from adopting arbitrary mathematical simplifications  [26, 27] to exploiting signal processing techniques  [28]. In [1], we proposed an analytical procedure for the extension to the continuum, based on the auditory property of the hearing system called the discrimination limen (DL)  [29]. Here, we extend this procedure to triads, by considering the plane with horizontal axis x and vertical axis y, to be identified, respectively, with \(f_2/f_1\) and \(f_3/f_1\).

Suppose that \(M_{12,13}\) and \(N_{12,13}\) take all integer values from 1 up to 30, for instance. To study triads inside the octave, we select (avoiding double counting) the k different points of the type \((m_{12}/n_{12},m_{13}/n_{13})\), such that \(1< m_{12}/n_{12}<2\) and \(m_{12}/n_{12}< m_{13}/n_{13}<2\), and we denote them by \((x_i,y_i)\), with \(i=1,.., k\). The associated normalized consonance indicator, \({\tilde{I}}^C_X(x_i,y_i)\), is thus defined in the plane (xy) only for the k discrete values \((x_i,y_i)\).
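The selection of the k points can be sketched in Python as follows. This is a minimal illustration under the assumption that each ratio is built from a numerator and a denominator between 1 and a maximum integer (30 in the text); the function names `octave_ratios` and `triad_points` are hypothetical, and any further selection criteria used in the actual calculation are not reproduced here.

```python
from fractions import Fraction


def octave_ratios(max_int=30):
    """Distinct ratios m/n with 1 < m/n < 2 and 1 <= m, n <= max_int."""
    ratios = set()
    for n in range(1, max_int + 1):
        # m > n gives m/n > 1; m < 2n gives m/n < 2; m <= max_int
        for m in range(n + 1, min(2 * n, max_int + 1)):
            ratios.add(Fraction(m, n))  # Fraction reduces, e.g. 6/4 -> 3/2
    return sorted(ratios)


def triad_points(max_int=30):
    """Points (x_i, y_i) with 1 < x_i < y_i < 2, each counted once."""
    ratios = octave_ratios(max_int)
    return [(x, y) for i, x in enumerate(ratios) for y in ratios[i + 1:]]


points = triad_points(30)
k = len(points)
print("number of discrete points k =", k)
```

Using exact rational arithmetic removes duplicates such as 6/4 and 3/2 automatically, which is one simple way to avoid double counting.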

Our aim is to extend the indicator to any value of the (xy) domain such that \(1<x<2\) and \(x<y<2\), thus rendering it more “physical”. We recall that, as derived by Zwicker et al.  [29], the DL turns out to be about 1/30 of the critical bandwidth (CB)  [30], which is frequency dependent. In particular, the ear has a DL of about 3 Hz at the frequency of middle C, that is \(f_{DL}(\textrm{C}_4)=3\) Hz, where C\(_4=261.63\) Hz; the DL increases up to 6 Hz two octaves above  [29].

We propose to simulate the effect of the DL by smoothing the consonance peaks with a Gaussian Footnote 6 characterized by a standard deviation which reflects the magnitude of the DL at the frequencies characteristic of the triads to be considered, \(\sigma \equiv f_{DL}({\bar{f}})/f_1\), where \({\bar{f}}\) is some mean value between \(f_2\) and \(f_3\). Indeed, a variation of \(f_{DL}({\bar{f}})\) in the value of \(f_2\) and \(f_3\) is associated with a variation of \(\sigma\) along the x and y axes, respectively Footnote 7:

$$\begin{aligned} \frac{ f_2 \pm f_{DL}({\bar{f}}) }{f_1} = x \pm \sigma, \,\,\, \frac{ f_3 \pm f_{DL}({\bar{f}}) }{f_1} = y \pm \sigma.\end{aligned}$$
(20)

Fixing for definiteness \(f_1=\)C\(_4\), we have that \(\sigma\) spans the range \(f_{DL}({\bar{f}})/f_1 \approx (3-6)/261.63= (0.0115-0.0229)\) when \({\bar{f}}\) goes from C\(_4\) up to C\(_6\). In the following, we take for definiteness \(\sigma = 0.02\), which corresponds to a DL of about 5.2 Hz. The distance between the point \(P=(x,y)\) and the point \(P_i=(x_i,y_i)\) (with \(i=1,...,k\)) is given by \(d(P,P_i)=( (x-x_i)^2 +(y-y_i)^2)^{1/2}\). We can define the continuum consonance indicator \(C^C_X(x,y)\) as the maximum consonance value resulting from smoothing with a Gaussian the consonance peaks placed at positions \((x_i,y_i)\), that is

$$\begin{aligned} C^C_X(x,y)= \textrm{Max}_i \, {\tilde{I}}^C_X(x_i,y_i) \, e^{- \frac{(x-x_i)^2+(y-y_i)^2}{2 \,\sigma ^2}}, \,\,\, \mathrm{where\,} \,i \,\mathrm{\, is\, such \,that} \,\,\, d(P,P_i) < 2\, \sigma. \end{aligned}$$
(21)

The constraint on the distance between P and \(P_i\) allows one to speed up the numerical calculation: One could indeed include all k points \(P_i\) in the calculation of the maximum, but this is not necessary, as only points within a circle with a radius of a couple of standard deviations contribute significantly. The result is a continuous surface function, with smoothed peaks such that, within (beyond) about 3 (6) Hz from a peak, the consonance function does not (does) change significantly.Footnote 8
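Equation (21) can be sketched in Python as follows. This is a minimal illustration, not the code used for the figures: the function name `continuum_indicator`, the representation of the peaks as triples \((x_i, y_i, {\tilde{I}}_i)\), and the choice of returning zero when no peak lies within \(2\sigma\) are assumptions made here, and the two peak values are placeholders rather than results of any model.

```python
import math


def continuum_indicator(x, y, peaks, sigma=0.02, cutoff=2.0):
    """Maximum over Gaussian-smoothed peaks, in the sense of Eq. (21).

    peaks: iterable of (x_i, y_i, value_i), with value_i the normalized
           discrete indicator at the point (x_i, y_i)
    Only peaks with d(P, P_i) < cutoff * sigma are considered.
    Returns 0.0 if no peak satisfies the constraint (an assumption).
    """
    best = 0.0
    for xi, yi, value in peaks:
        d2 = (x - xi) ** 2 + (y - yi) ** 2
        if d2 < (cutoff * sigma) ** 2:
            best = max(best, value * math.exp(-d2 / (2.0 * sigma ** 2)))
    return best


# Placeholder peaks for illustration only (values are NOT model results).
peaks = [(1.25, 1.5, 1.0), (1.2, 1.5, 0.6)]

on_peak = continuum_indicator(1.25, 1.5, peaks)       # exactly on a peak
one_sigma = continuum_indicator(1.27, 1.5, peaks)     # one sigma away in x
far_away = continuum_indicator(1.6, 1.9, peaks)       # no peak within 2 sigma
print(on_peak, one_sigma, far_away)
```

Taking the maximum rather than the sum of the Gaussian contributions follows Eq. (21): each point inherits the value of the dominant nearby peak, attenuated by its distance from it.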

Fig. 4 Top left, top right, and bottom: continuum consonance indicators \(C^P_1\), \(C^P_3\), and \(C^H_{GP}\)

Figure 4 shows the extension to continuum, taking \(\sigma =0.02\), for the compactness models \(C^P_1\), \(C^P_3\), and \(C^H_{GP}\). It turns out that \(C^P_1\) is symmetric with respect to the diagonal axis from the top left to the bottom right of the figure, while the other two display more consonance in the region below that axis. Indeed, notice that with \(C^P_3\) and \(C^H_{GP}\), the major third and many other relevant triads get a higher score than with \(C^P_1\).

This can be better seen by zooming in on the region around major and minor triads, as done in Fig. 5. We can clearly see the high consonance peaks associated with the major triads. Notice, however, that dim (r) also has a quite high consonance peak, even at the same level as maj (r) for \(C^H_{GP}\). On the contrary, notice the low consonance value of min (r) and the fact that there is no peak associated with it: min (r) actually lies in the tail of the peak of a nearby, more consonant triad, which is associated with frequency ratios (7/6, 3/2), that is, a septimal minor third with a fifth. It is reasonable to expect that these not fully satisfactory features would disappear by including the effect of roughness.

These plots are also useful to comment on the major and minor triads of the equal temperament, whose location is shown in Fig. 5 by using dark dots. As is well known, equal temperament triads have a lower consonance than just intonation ones. Nevertheless, being not too far from the peak of the just maj (r), the equally tempered maj (r) gets a quite high consonance value. On the contrary, it would be unreasonable to find the tempered maj (r) down in the dissonance valley around the just maj (r). Similar considerations apply to the other major and minor triads. Notice that min (r) in the equal temperament gets closer to the peak of the triad with the septimal minor third.

As these considerations depend on the value of \(\sigma\) adopted in the procedure of the extension to continuum, it would be interesting to further explore the effect of varying \(\sigma\) on the various models. We leave this for future work.
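The position of the tempered triads relative to the just peaks can be quantified by expressing their distance in the (xy) plane in units of \(\sigma\). The following Python lines are a minimal sketch, assuming the standard just ratios (5/4, 3/2) for maj (r) and (6/5, 3/2) for min (r), and the equal temperament values \(2^{4/12}\), \(2^{3/12}\), and \(2^{7/12}\) for the major third, minor third, and fifth; the function names are hypothetical.

```python
import math

SIGMA = 0.02  # value adopted in the text


def equal_tempered(semitones_low, semitones_high):
    """Point (x, y) of a triad in 12-tone equal temperament."""
    return 2 ** (semitones_low / 12), 2 ** (semitones_high / 12)


def offset_in_sigma(point, peak, sigma=SIGMA):
    """Euclidean distance d(P, P_i) between two points, in units of sigma."""
    return math.hypot(point[0] - peak[0], point[1] - peak[1]) / sigma


just_major = (5 / 4, 3 / 2)       # assumed just maj (r)
just_minor = (6 / 5, 3 / 2)       # assumed just min (r)
septimal_minor = (7 / 6, 3 / 2)   # triad with the septimal minor third

tempered_major = equal_tempered(4, 7)
tempered_minor = equal_tempered(3, 7)

maj_offset = offset_in_sigma(tempered_major, just_major)
min_offset = offset_in_sigma(tempered_minor, just_minor)
just_to_septimal = offset_in_sigma(just_minor, septimal_minor)
tempered_to_septimal = offset_in_sigma(tempered_minor, septimal_minor)

print(f"tempered maj (r) to just maj (r): {maj_offset:.2f} sigma")
print(f"tempered min (r) to just min (r): {min_offset:.2f} sigma")
print(f"just min (r) to septimal peak:    {just_to_septimal:.2f} sigma")
print(f"tempered min (r) to septimal peak: {tempered_to_septimal:.2f} sigma")
```

With \(\sigma = 0.02\), the tempered maj (r) lies at roughly half a standard deviation from the just maj (r), while the tempered min (r) is nearer to the septimal peak than the just min (r) is; a different choice of \(\sigma\) would rescale all these offsets.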

Fig. 5 Top left, top right, and bottom: zoom on the region around major and minor triads for the continuum consonance indicators \(C^P_1\), \(C^P_3\), and \(C^H_{GP}\)

7 Overview and conclusions

In this work, we studied in detail the viability of compactness models in explaining C and D perceptual data for triads. Compactness models can be divided into the two sub-categories of periodicity and harmonicity, according to whether one chooses to relate consonance to a compact period of the waveform signal or rather to a compact structure of harmonics. In both cases, the fundamental bass turns out to play an outstanding role in identifying consonance indicators.

Compactness models for dyads were studied extensively in Ref.  [1], where it was shown that the two compactness requirements mentioned above (i.e. periodicity and harmonicity) are practically equivalent. Here, we showed that this turns out to be the case also for triads, see Fig. 3.

We found that compactness models are quite satisfactory in reproducing perceptual data. In particular, we calculated the reduced Chi-square of our compactness indicators by using the perceptual data of Bowling et al.  [22], obtained by testing all 66 triads that can be formed using the chromatic just scale within one octave. We found that, for all the compactness indicators proposed, the reduced Chi-square is of order one, see Table 3. The best performing model turned out to be the Gill and Purves harmonicity indicator  [20], followed by the periodicity indicator inspired by Galilei’s arguments  [4].

One of the most serious criticisms formulated against this category of C and D models is the fact that they are naturally associated with indicators which are discontinuous  [5]. This represents a problem, especially when assessing the predictions of these models for the various temperaments proposed in the history of music. In Ref.  [1], a procedure to obtain continuum indicators for dyads was suggested, based on smoothing the just intonation consonance peaks by means of a Gaussian with a standard deviation \(\sigma\) taken to be equal to the discrimination limen. Here, we extended this procedure to compactness models for triads, obtaining continuum consonance indicators describing a surface in the plane \(f_2/f_1\) and \(f_3/f_1\), see Fig. 4. This allows for an effective comparison of the just scale predictions with those of various temperaments. In particular, we discussed here the case of the equal temperament, see Fig. 5.

Our general result is that compactness indicators provide a quite effective, but not fully satisfactory, description of C and D for triads. It would be interesting to consider also roughness models for triads and explore the effect of combining them with compactness models, as done for dyads in  [1]. The extension to tetrads would also be worth exploring. This is postponed to future work, together with a more refined study of various temperaments and the effect of varying the parameter \(\sigma\). From the point of view of the perceptual data, it would be worth carrying out a test with sounds not having the piano timbre used in [22], and considering triads and tetrads beyond one octave. The above findings and considerations show that justifying on a physical basis the auditory perception of C and D for triads is a subject open to further research.