Introduction

Taxonomy was a vexing question for Charles Darwin, especially with regard to barnacles. He recognised that it was easy to classify such fauna into one or other species when only a few specimens were available, but as sample sizes increased the boundaries between species tended to break down (Thackeray 1999). For the classification of finches which he collected on the Galápagos archipelago, Darwin (1837) turned to John Gould (1837). As an ornithologist, the latter was an alpha taxonomist relying on the Linnaean assumption that specimens can be pigeon-holed into one or other distinct species. Darwin (1845) stated that the finches were “related to each other in the structure of their beaks, short tails, form of body and plumage. There are thirteen species, which Mr. Gould has divided into four subgroups”. However, he went on to say “the most curious fact is the perfect gradation in the size of the beaks in the different species of Geospiza”. Such observations would have raised the question as to whether there were clear boundaries, and whether particular specimens represented “varieties” as opposed to distinct species. Indeed, hybridisation is known to occur in Galápagos finches (e.g. Enbody et al. 2023; Grant and Grant 1996, 2008, 2014; Lack 1947; Lamichhaney et al. 2018), as in a great many other taxa (Thackeray and Schrein 2017), such that a definition of a species becomes an important issue.

We adopt here what we have called “sigma taxonomy”, where “sigma” is the Greek letter S (Σ) for spectrum, associated with a probabilistic definition of a species applicable in cases where there are not necessarily clear boundaries (Thackeray 2018; Thackeray and Schrein 2017). It is defined as “The classification of taxa in terms of probabilities of conspecificity, without assuming distinct boundaries between species” (Thackeray 2018), as opposed to alpha taxonomy, which does make that assumption (Mayr et al. 1953).

The statistical method behind sigma taxonomy has been described by Thackeray and Odes (2013) and by Thackeray and Dykes (2016). The fundamental issue is “what is the probability that two specimens do or do not belong to the same species?”. As an example based on illustrations (Fig. 1), the method is demonstrated here initially with regard to two finches, Geospiza magnirostris and Geospiza fortis, classified by Gould (1837) and to this day generally accepted as distinct species. They were illustrated in Darwin’s (1845) “Journal of researches into the natural history and geology of the countries visited during the voyage of H.M.S. Beagle round the world, under the Command of Capt. Fitz Roy”. G. magnirostris is larger than G. fortis. Both are widespread granivores. With its relatively large beak the former feeds on large seeds, whereas the latter consumes flowers as well as seeds. Morphological variability of beaks is under genetic control (Abzhanov et al. 2004).

Fig. 1. Geospiza magnirostris Gould, 1837 (left) and G. fortis Gould, 1837 (right), as published by Darwin (1845)

Method, with an example of two “types” of finches

Linear measurements are obtained from anatomical elements, as in the case of museum specimens studied by Thackeray et al. (1997). Measurements are subjected to pairwise comparisons, using least squares linear regression to quantify the degree of scatter around a regression line of the form y = mx + c, where m is the slope and c is the intercept.

In a study of measurements of pairs of specimens of the same (extant) species of many vertebrates, Thackeray et al. (1997) reported central tendency in the log-transformed standard error of the m-coefficient, known as “log sem”, which measures the degree of scatter around the regression line and reflects variability in shape. Central tendency of log sem has also been found using larger samples, including birds, associated with a mean log sem value of −1.61 (Thackeray 2007). The mean log sem value of −1.61 ± 0.1 has been recognized as a typical degree of intraspecific variation in extant species, as described by Thackeray and Dykes (2016). The remarkable consistency of this statistic is reflected by the following sets of data for extant and extinct vertebrate taxa (conspecific pairs):

  • −1.61 (Crania of mammals, birds, reptiles etc.) (Thackeray 2007)

  • −1.61 (Crania: female-female comparisons of Pan paniscus) (Gordon and Wood 2013)

  • −1.62 (Crania: male-male comparisons of Pan paniscus) (Gordon and Wood 2013)

  • −1.61 (Crania: female-male comparisons of Pan paniscus) (Gordon and Wood 2013)

  • −1.62 (Crania: female-female comparisons of Pan troglodytes) (Gordon and Wood 2013)

  • −1.60 (Crania: male-male comparisons of Pan troglodytes) (Gordon and Wood 2013)

  • −1.60 (Crania: female-male comparisons of Pan troglodytes) (Gordon and Wood 2013)

  • −1.61 (Crania: H. sapiens, P. troglodytes, P. paniscus, Gorilla gorilla) (Thackeray and Dykes 2016)

  • −1.62 (Molars: H. sapiens, P. troglodytes, P. paniscus, Gorilla gorilla) (Dykes 2014)

  • −1.61 (Molars: A. africanus, A. afarensis, H. habilis, H. erectus, P. robustus, P. boisei) (Dykes 2014)

When Thackeray and Dykes (2016) confirmed a mean log sem value of −1.61 from a study restricted to extant hominoids, using data published by Gordon and Wood (2013), the standard deviation was 0.1 (n = 8,072 regressions). The same mean log sem value of −1.61 had been reported by Thackeray (2007) for a greater diversity of fauna (n > 70 species).
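The log sem statistic described above can be illustrated with a minimal Python sketch. It assumes ordinary least squares, the usual standard error of the slope with n − 2 degrees of freedom, and a base-10 logarithm (the base is an assumption here). The function name `log_sem` and the sample numbers are hypothetical and are not taken from the cited studies.

```python
import math

def log_sem(x, y):
    """Fit y = m*x + c by least squares.

    Returns (m, c, log10 of the standard error of m).
    The base-10 logarithm is an assumption of this sketch.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx                      # slope
    c = mean_y - m * mean_x            # intercept
    # Residual sum of squares around the fitted line
    sse = sum((yi - (m * xi + c)) ** 2 for xi, yi in zip(x, y))
    se_m = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    if se_m == 0:
        raise ValueError("perfect fit: log sem is undefined")
    return m, c, math.log10(se_m)

# Hypothetical measurements for two specimens (not real data)
specimen_a = [0.0, 1.0, 2.0, 3.0]
specimen_b = [0.0, 1.0, 1.0, 2.0]
m, c, ls = log_sem(specimen_a, specimen_b)
print(f"y = {m:.3f}x + {c:.3f} (log sem = {ls:.3f})")
```

Swapping the two arguments gives the second regression of the pair, so each pairwise comparison yields two log sem values.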

A mean log sem value of T = −1.61 with a standard deviation of 0.1 has been proposed as a probabilistic definition of a species, applicable to anatomical measurements of a diversity of fauna (Thackeray and Dykes 2016). As an example of the method, it is used here as a frame of reference reflecting a typical degree of anatomical variation in extant vertebrate species, for purposes of comparison of iconic sketches of two “types” of finches (Fig. 1).

In this case, one reference point is the centre of the eye (point O), while a second landmark is the tip of the beak (E). OE serves as a reference line for other lines radiating from O at intervals of 10 degrees. Thus line OD is 10 degrees below the reference line OE. Likewise, lines OF, OG and OH are separated at 10 degree intervals relative to each other, clockwise above reference line OE. Twenty measurements are obtained, clockwise from OD through to OW. This radial method has previously been used by Braun et al. (2004).
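The layout of the 20 radial lines can be sketched as follows. This is an illustration only: it assumes uniform 10 degree steps from OD (−10 degrees) to OW (+180 degrees), a y-up coordinate system, and a beak tip lying to the left of the eye so that clockwise rotation moves above OE. The function names and coordinates are hypothetical.

```python
import math

def radial_offsets(first_label="D", count=20, start_deg=-10, step_deg=10):
    """Map each line label (D, E, ..., W) to its angular offset from OE in degrees."""
    return {chr(ord(first_label) + i): start_deg + i * step_deg for i in range(count)}

def ray_direction(o, e, offset_deg):
    """Unit vector obtained by turning the direction O->E clockwise by offset_deg.

    Assumes y increases upward; in image coordinates (y down) the sense flips.
    """
    dx, dy = e[0] - o[0], e[1] - o[1]
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length
    t = math.radians(offset_deg)
    # A clockwise rotation by t is a standard (counter-clockwise) rotation by -t.
    return (ux * math.cos(t) + uy * math.sin(t),
            -ux * math.sin(t) + uy * math.cos(t))

offsets = radial_offsets()
eye, beak_tip = (0.0, 0.0), (-1.0, 0.0)  # hypothetical coordinates for O and E
directions = {label: ray_direction(eye, beak_tip, deg) for label, deg in offsets.items()}

for label in ("D", "E", "F", "W"):
    print(f"O{label}: {offsets[label]:+d} degrees from OE")
```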

Two sets of measurements are obtained, one for G. magnirostris and another for G. fortis. Two regression lines of the form y = mx + c can be obtained. In the case where measurements of G. magnirostris are on the x-axis and those of G. fortis are on the y-axis, we have the following result:

$$ y = 0.683x + 7.400 \quad (\log \mathrm{sem} = -1.498) $$

In the case where measurements of G. magnirostris are on the y-axis and those of G. fortis are on the x-axis, we have this result:

$$ y = 1.399x - 7.414 \quad (\log \mathrm{sem} = -1.187) $$

The mean log sem for the two regression equations is −1.342. This lies above the upper 95% confidence limit of the mean log sem value of T = −1.61 (± 0.1) for reference specimens representing the typical degree of variation in extant vertebrate species (Thackeray and Dykes 2016).
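As a worked check of this comparison, assume that the upper 95% limit is taken as 1.96 standard deviations above T (the multiplier is not stated in the text and is an assumption here). Using the rounded log sem values of the two regressions:

$$ \frac{(-1.498) + (-1.187)}{2} \approx -1.34, \qquad T + 1.96 \times 0.1 = -1.61 + 0.196 = -1.414 $$

Since −1.34 is greater than −1.414, the mean log sem lies above the upper limit. Taking the limit as two standard deviations (−1.41) leads to the same conclusion.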

One must go further to assess the probability of conspecificity. As discussed by Thackeray and Dykes (2016), the difference between the two log sem values in comparisons of conspecific pairs is designated “delta log sem” and is typically equal to 0.03 (the mean delta log sem obtained from more than 8,000 regressions for conspecific pairs). If the difference exceeds this typical value in the case of any pairwise comparison, and if the log sem is outside the upper 95% confidence limit of −1.61 (± 0.1), this would indicate that the two specimens being compared have a high probability of being different at a species level. In our example here, the difference between the two log sem values (−1.498 and −1.187) is 0.31. This greatly exceeds the value of 0.03, and together with the mean log sem value of −1.342 we may infer that there is a high probability that our specimens attributed to G. magnirostris and G. fortis represent differences at the level of species in terms of probabilistic sigma taxonomy (Thackeray 2018; Thackeray and Schrein 2017).
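The two-part criterion described above can be written as a short Python sketch. The function name `compare_pair` is hypothetical, and the multiplier `z = 1.96` used for the 95% limit is an assumption, since the text does not state how the limit is computed.

```python
def compare_pair(log_sem_xy, log_sem_yx, T=-1.61, sd=0.1, typical_delta=0.03, z=1.96):
    """Summarise one pairwise comparison from its two log sem values.

    Returns (mean log sem, delta log sem, likely_different), where
    likely_different is True only if BOTH conditions hold:
      - the mean log sem lies above the upper limit T + z*sd (z is assumed), and
      - delta log sem exceeds the typical conspecific value.
    """
    mean_log_sem = (log_sem_xy + log_sem_yx) / 2
    delta_log_sem = abs(log_sem_xy - log_sem_yx)
    upper_limit = T + z * sd
    likely_different = mean_log_sem > upper_limit and delta_log_sem > typical_delta
    return mean_log_sem, delta_log_sem, likely_different

# Log sem values of the two regressions for the illustrated finches
mean_ls, delta_ls, different = compare_pair(-1.498, -1.187)
print(f"mean log sem = {mean_ls:.3f}, delta log sem = {delta_ls:.3f}, "
      f"likely different species: {different}")
```

The function returns a yes/no flag for simplicity; it does not compute a numerical probability of conspecificity.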

To apply the same technique to actual material we use measurements of specimens attributed to six species of Geospiza, namely G. magnirostris, G. fortis, G. fuliginosa, G. difficilis, G. scandens and G. conirostris. For purposes of demonstration of method, we use lengths of wing, tail, culmen, gonys, depth of bill at base, width of mandible at base, tarsus and middle toe with claw, from a large database curated by the California Academy of Sciences (Swarth 1931; Lack 1945). We selected 240 specimens (20 males and 20 females of each species) from 10 islands of the Galápagos archipelago (Isabela, Santiago, Santa Cruz, Genovesa, Darwin, Wolf, Marchena, Pinta, Fernandina and Pinzon). We recognize that the sample sizes are small, but this study of specimens from only 10 islands serves primarily as a demonstration of method, applied for the first time to Darwin’s finches.

We used a computer program for analyzing large data sets, focusing on log sem calculations (Dykes and Dykes 2015).
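To give a sense of the scale of such a batch analysis, the sketch below enumerates regression jobs for the sample described above. It is not the program of Dykes and Dykes (2015). It assumes, purely for illustration, that specimens are compared within each species and sex and that each pair is regressed in both directions; the text does not state how comparisons were grouped, and the specimen identifiers are hypothetical. Under these assumptions the count is 4,560, which matches the total number of regression analyses reported in the Results.

```python
from itertools import permutations

def regression_jobs(groups):
    """Yield (group, x_id, y_id) for every ordered pair of specimens within a group.

    Ordered pairs are used because each pair is regressed in both directions.
    """
    for group, ids in groups.items():
        for x_id, y_id in permutations(ids, 2):
            yield group, x_id, y_id

species = ["magnirostris", "fortis", "fuliginosa", "difficilis", "scandens", "conirostris"]
# Hypothetical identifiers: 20 males and 20 females per species (240 specimens)
groups = {
    (sp, sex): [f"{sp}-{sex}-{i}" for i in range(1, 21)]
    for sp in species
    for sex in ("M", "F")
}

jobs = list(regression_jobs(groups))
print(len(jobs))  # 12 groups x 20 x 19 ordered pairs = 4560
```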

Results

Log sem and delta log sem values for pairwise comparisons are given in Table 1, based on specimens attributed to each of the six species, as identified in the catalogue of the California Academy of Sciences. In each case, the mean log sem values are circa −1.90, and the corresponding delta log sem values are circa 0.03 (based on a total of 4,560 regression analyses).

Table 1 Results of pairwise comparisons of measurements of six species of Geospiza

There is a high probability of conspecificity in these comparisons because the mean log sem values of circa −1.90 lie well below the upper 95% confidence limit associated with the probabilistic definition of a species (T = −1.61 ± 0.1, n > 8,000 regressions for conspecific vertebrate taxa), and also because the mean delta log sem values are circa 0.03, corresponding to the value published by Thackeray and Dykes (2016) for conspecific taxa.

The McKay-Zink (MZ) single species hypothesis

McKay and Zink (2015) suggested that there is only a single species (G. magnirostris) on the Galápagos archipelago, associated with a high degree of introgression in the context of inter-island travel. This “MZ” hypothesis has since been refuted by Zink and Vázquez-Miranda (2019). Expectations of the MZ hypothesis would have been the following: first, that for pairwise comparisons with specimens generally understood to be representative of G. magnirostris, mean log sem values would fall within the upper 95% confidence limit for what is defined as a species in terms of sigma taxonomy (T = −1.61 ± 0.1), and secondly that delta log sem values would be ≤ 0.03 (Thackeray and Dykes 2016). As indicated below, these conditions do not apply in three instances.

The conditions are not met in (1) the comparison of samples attributed to G. magnirostris and G. fuliginosa (mean log sem = −1.263, delta log sem = 0.270; n = 50 regressions); (2) the comparison of samples attributed to G. magnirostris and G. difficilis (mean log sem = −1.305, delta log sem = 0.135; n = 50 regressions); or (3) the comparison of samples attributed to G. magnirostris and G. scandens (mean log sem = −1.358, delta log sem = 0.122; n = 50 regressions). In terms of sigma taxonomy there is a high probability that the samples being compared do not represent a single species of G. magnirostris.

These results serve to confirm the conclusion reached by Zink and Vázquez-Miranda (2019), refuting the single species hypothesis.

It can be noted that in the case of a comparison between G. scandens and G. fortis, a mean log sem value of −1.560 and a mean delta log sem value of only 0.023 are obtained (n = 50 regressions). These results point to the probability that G. fortis and G. scandens are not different at a species level, using T = −1.61 ± 0.1 as a probabilistic definition of a species. Indeed, a recent paper by Enbody et al. (2023) indicates that the two taxa can hybridise, a fact which is not surprising in the context of log sem statistics. Furthermore, when UPGMA was used to generate a phenetic tree for six species of Geospiza, based on log sem values (Thackeray 2022), G. fortis and G. scandens grouped together. Thackeray’s phenetic tree corresponds closely to a phylogeny obtained by Burns et al. (2014) and Reaney et al. (2020) based on genetic data.
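The four interspecific comparisons reported in this section can be set side by side against the two thresholds. The sketch below uses only the mean log sem and delta log sem values given in the text; the multiplier 1.96 for the 95% limit and the names used in the code are assumptions.

```python
T, SD, TYPICAL_DELTA = -1.61, 0.1, 0.03
Z = 1.96                      # assumed multiplier for the 95% limit
UPPER_LIMIT = T + Z * SD      # about -1.414

# (mean log sem, delta log sem) as reported in the text
reported = {
    ("G. magnirostris", "G. fuliginosa"): (-1.263, 0.270),
    ("G. magnirostris", "G. difficilis"): (-1.305, 0.135),
    ("G. magnirostris", "G. scandens"): (-1.358, 0.122),
    ("G. scandens", "G. fortis"): (-1.560, 0.023),
}

def exceeds_both(mean_log_sem, delta_log_sem):
    """True if the mean log sem is above the upper limit AND delta exceeds 0.03."""
    return mean_log_sem > UPPER_LIMIT and delta_log_sem > TYPICAL_DELTA

flags = {pair: exceeds_both(*values) for pair, values in reported.items()}
for (a, b), flag in flags.items():
    verdict = "likely different species" if flag else "consistent with conspecificity"
    print(f"{a} vs {b}: {verdict}")
```

Under these assumptions the three comparisons involving G. magnirostris exceed both thresholds, whereas the comparison of G. scandens with G. fortis exceeds neither.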

Conclusions

Our approach serves as one way of addressing Darwin’s appeal in 1859 (in the last chapter of The Origin of Species) for the “amount” of variation in a species to be assessed. This is indeed necessary in the context of the question as to how many species of Geospiza exist on the Galápagos archipelago. In confirmation of the conclusion reached by Zink and Vázquez-Miranda (2019), it has been possible to refute a single species hypothesis using a quantitative approach that assesses the probability that any two specimens are or are not conspecific, based on a definition of a species for morphometric analyses (Thackeray and Dykes 2016). At a species level, specimens attributed to G. magnirostris have a high probability of being different from those attributed to G. fuliginosa, G. difficilis and G. scandens. By contrast, our results point to the probability that G. fortis and G. scandens are not different at a species level, using T = −1.61 ± 0.1 as a probabilistic definition of a species.

In a demonstration of method, our analysis indicates the potential of the log sem approach, combined with delta log sem values in the context of sigma taxonomy and a probabilistic definition of a species (Thackeray and Dykes 2016).