Introduction

Human interaction with the world, in science and in daily life, inevitably depends on the implicit or explicit classification of entities into kinds. This holds in particular for interaction with the living world, as it confronts us with a nearly endless diversity of organisms, impossible to work with in an unstructured manner. From time immemorial, human societies have produced implicit classifications of the living beings they interact with, often referred to as ‘folk taxonomies’, grouping organisms in accordance with specific needs (e.g., Begossi et al. 2008). In science, taxonomists see it as their task to produce a more formal catalogue of life, which they do by consistently delimiting, describing and classifying kinds of organisms in a hierarchical system of species, genera and other taxa. While one of their aims is simply to ‘map the biosphere’ (Wheeler et al. 2012), taxonomists usually aspire for their work to be useful and authoritative for the other biological sciences, and even beyond, complementing or even replacing folk taxonomies.

In that sense, the work of taxonomists is usually presented as a stable, context-independent all-purpose classification, offering the best possible kinds to be had and to be used in virtually any application. This is in line with what users of kinds appear to prefer. We tend to seek an unequivocal answer to the question of what kind of thing a given entity is, which indeed requires a unique classification of the domain in which the entity occurs. In practice, however, the classification of life is often subject to confusion or disagreement, even in formal, scientific taxonomy. Taxonomists often produce multiple, cross-cutting classifications of the same living entities, typically because there is disagreement or uncertainty about the exact criteria that would make it possible to establish a single, best classification. For example, for the notable European orchid genus Ophrys, species classifications vary from recognizing 10 to more than 350 species (Cuypers et al. 2022). Similarly, research has found that the four authoritative global checklists of bird species disagree in up to 25% of cases about exactly which groups of birds should be recognized as species (McClure et al. 2020; Neate-Clegg et al. 2021). Such situations, which occur across the tree of life, give rise to heated debates among taxonomists, and raise problems for various stakeholders who often look to science for answers and authority, for example in the economic and conservation communities.

Despite our tendency to prefer a single all-context classification, several philosophers have long argued that this is untenable, and have advocated versions of classificatory pluralism, stressing that multiple classifications should be allowed to coexist, and that which classification is best depends on context (e.g., Dupré 1993; Kitcher 1984). However, this raises several questions about how such pluralism is supposed to work in practice, and how an orderly and coordinated form of pluralism can be organized. If the best classification is context-dependent, one obvious question is how to find out what the best classification is for a given context. Another question concerns how, if multiple classifications are used across contexts, one could keep an overview of all these different contexts and classificatory practices, and ensure that work going beyond individual contexts remains possible. A third question concerns what role scientific taxonomists are to play if taxonomic pluralism is adopted. Relatedly, one may ask what role is left for the category of ‘species’, which is typically held to represent a special kind of kind, and as such represents taxonomy’s claim to context-independence.

In this article, we aim to explore these questions and provide a tentative answer by means of a case study, namely that of the classification of oaks (Fagaceae: Quercus), focusing in particular on classificatory approaches to the European pedunculate oak (Quercus robur L.) and sessile oak (Q. petraea (Matt.) Liebl.). These oaks, ubiquitous in European forests, are subject to various classificatory challenges. In formal taxonomy, Q. robur and Q. petraea are fairly well established as separate species, but that has not always been the case. Moreover, that status faces several challenges, such as the tendency of the two groups to hybridize, or the fact that it is often not possible to assign individual trees unequivocally to either species. Meanwhile, oaks and oak wood have long played important roles in numerous cultural and economic practices, attracting considerable scientific and policy attention, so that their classification is effectively surrounded by many, often divergent interests.

In daily life, among naturalists or in economic activities, the distinction between the two groups is sometimes made and sometimes ignored, seemingly depending on the presence or absence of relevant differences, but often without much reflection. In the field, Q. robur and Q. petraea are typically distinguished by the morphology of their leaves and acorns. Q. robur, pedunculate oak, is described as having leaves without a petiole but acorns with a peduncle, while Q. petraea, sessile oak, is described as having leaves with petioles (of about two cm) but acorns without peduncles (Eaton et al. 2016). However, as already mentioned, both groups show considerable morphological variation and intermediate forms abound, often thought to be the result of hybridization. This frustrates many users and casts doubt on the taxonomy of these oaks, at least in theory.

This confusion also affects the interaction between broader practice and science. For example, policymaking, which relies heavily on clear classifications, often refers to the authority of science. However, that can raise problems if science cannot provide sufficient certainty. Consider for example the regulation of trade in seeds and timber. European law makes a distinction between Q. robur and Q. petraea, requiring that batches of traded seeds contain no more than 1% of other species, in particular other oak species with similar-looking seeds (Muir et al. 2000; Blanc-Jolivet and Liesebach 2015). This is in part to ensure maximal success in forestry, where the distinction between the two putative oak species is considered to be relevant on ecological grounds (see below). However, when their seeds are difficult or impossible to distinguish, such regulation is difficult to comply with, or to enforce. Similarly, for trade in timber, clarity on the species identity of traded wood is also a legal requirement (see Blanc-Jolivet and Liesebach 2015). However, while the directive on trade in reproductive material (EU Council directive 1999/105/EC) contains a species list and thus some explicit taxonomic information (recognizing both Q. robur and Q. petraea as species), the timber regulation (EU Regulation 995/2010) does not, and is thus, strictly speaking, ambiguous on which taxonomy is followed. It probably assumes scientific consensus, but if that consensus cannot be guaranteed, substantial confusion could in principle follow.

In what follows, we show how one recent account of kinds and classifications, the so-called Grounded Functionality Account of (natural) kinds (GFA hereafter; see for example Ereshefsky and Reydon 2023), sheds light on classificatory conundrums such as that of Q. robur and Q. petraea, and provides directions for determining which classification is best for a given context. The GFA as such does not fully solve the conundrum and does not provide ready-made classifications for practical contexts: pluralism implies that classifications cannot be decreed in a top-down manner and that it is ultimately up to the users of classifications to find out what suits them best. Rather, the GFA provides a tool that allows us to understand the roots of the conundrum, and provides guidelines for users regarding how to see the problem and regarding the right questions to ask. In this sense, our paper shows how the GFA both performs typical tasks of a philosophical theory, and provides clues for dealing with practical problems. Subsequently, we briefly discuss how, in cases such as that of oak classification, pluralism can be coordinated, and what role taxonomic experts might play in the context of taxonomic pluralism.

Classificatory programs and the Grounded Functionality Account of kinds

The fundamental tenet of classificatory pluralism is that classifications can only be understood and assessed in connection with the specific aims and objectives for which they are produced. In the case of folk or practical taxonomies, that is not very controversial, but in the case of formal, scientific taxonomy it is, given its aspiration to construct context-independent all-purpose classifications. However, even if the sole aim of a classification is to map the biosphere, i.e., to accurately represent the diversity of life, it can be argued that the best way of doing so depends on the exact representational aims one has in mind and on which components of diversity one wants to prioritize, such as reproductive isolation, morphological or molecular differentiation, or ecology. Advocates of classificatory pluralism tend to stress that classificatory disagreements can usually be understood by keeping in mind that classifications are conceived from different perspectives, with different aims in mind. Different aims may well favor different classifications, which leads to conflict if it is a separate aim to set one classification as the context-independent gold standard.

To formalize this idea, Ereshefsky (2001) introduced the notion of ‘classificatory programs’, arguing that classifications should be understood as the product of investigative or practical programs consisting of certain ‘sorting principles’, i.e., operational rules that determine how entities are classified, which are in their turn inspired by ‘motivating principles’, i.e., underlying aims for which a classification is needed. These motivating principles can vary greatly, aligning with epistemic or non-epistemic aims (see also Ereshefsky and Reydon 2015; Reydon and Ereshefsky 2022). This framework applies equally to scientific classificatory programs, and to practice-oriented programs outside the sciences.

A well-known example of a classificatory program within science is linked to the long-standing aim to delimit species that are reproductively isolated, which is formalized in the so-called Biological Species Concept (BSC; Reydon and Ereshefsky 2022). The criterion of reproductive isolation counts as a sorting principle, although testing reproductive isolation is not always easy, and requires further operationalization in practice. The motivating principle behind it is to delimit units of evolution, in particular units that are liable to undergo future evolution as a whole. Yet, although many biologists want to produce one unitary classification that serves everyone, biology contains many classificatory programs besides the one inspiring the BSC. Arguably, all the traditional species concepts represent divergent motivating principles. The Ecological Species Concept (ESC), for example, aims to delimit ecologically distinct species, and the various versions of the Phylogenetic Species Concept (PSC) aim to identify historically distinct species. This is indeed one of the reasons why biologists regularly produce conflicting classifications (e.g., Cuypers et al. 2022).

This plurality of scientific classifications is complemented with a variety of classificatory programs that are external to science, such as those related to the policy cases described above. In a similar way, those working with living organisms or organismal ‘products’ in trade and the production of goods might not be interested in evolutionary or ecological units, but in units that reflect aspects relevant to their work. A shipbuilder or carpenter is unlikely to be bothered by evolutionary or phylogenetic considerations, but rather by the properties and the quality of the wood in their hands. While these non-scientific classificatory programs might be seen as irrelevant by biologists, they should be acknowledged in philosophical analysis, because of the interaction between non-scientific and scientific classificatory programs.

Ereshefsky and Reydon (2023) argue that philosophy can contribute to understanding classifications and classificatory practices by focusing on the assessment of actual classificatory practices in the sciences and outside science, rather than focusing on abstract and a priori accounts of kinds and classifications. To do this, one must examine classificatory practices and elucidate their motivations, and then assess whether the classifications used do or do not serve the motivating principles for which they are constructed. To structure such assessments, the authors (Ereshefsky and Reydon 2023; Reydon and Ereshefsky 2022) offer a framework called the ‘Grounded Functionality Account’ (GFA). The GFA starts from the straightforward assumption that for any classificatory program to be successful, the sorting principles and the classifications produced through them must contribute to the aims (motivating principles) for which they are produced (functionality), and that that functionality must be linked to an aspect of what the world is like, so that the success of a classificatory program is not a mere matter of chance, but results from successfully identifying and capturing a relevant aspect of the world (i.e., from being grounded).

In this way, classifications can be linked to particular aims or motivations while retaining the requirement to delimit units that can be considered ‘natural’ to some degree. The GFA thus imposes two conditions for classifications to be deemed acceptable: a functionality condition and a grounding condition. The GFA requires a classificatory program to actively specify how the classification is grounded in the world, i.e., what aspects of the world the classification connects to and why a classification that connects to these aspects of the world is able to do the work that it is intended to do. Even though the GFA is intended to account for the kinds that are actually used in the various sciences and in other contexts of practice, and not as an account of all possible legitimate kinds, this aspect of the GFA gives it sufficient normative force to impose strong limits on classificatory pluralism (Reydon and Ereshefsky 2022: 10). Not just any aim that is successfully achieved by a classification in some context of research or practice is acceptable as indicating that the classification is well-grounded. Only those aims are acceptable for which an account is actually available that explains why the classification that is used succeeds better in achieving the aim in focus than other possible classifications; the onus is on the classificatory program itself to provide such an account. Importantly, this applies to practical as well as scientific classificatory programs.

This requirement embodies the view (which seems uncontroversial to us) that any nonarbitrary classification must represent something real about the entities that are being classified and, moreover, that any nonarbitrary classification is able to do the epistemic or practical work it does precisely because it represents something real about the entities that are being classified. For the most widely used species concepts, for example, it can be argued that their biological meaning consists in what aspects of the world out there species represent: the different species concepts highlight different factors that bind organisms together into the entities on which taxonomists focus (sets of organisms, populations, lineages and clades), and this is what makes species concepts useful as foundations for the construction of taxonomies (Reydon and Kunz 2019). Species thus are not groups that simply exist in nature independently of human classificatory activities, but groups formed by us with the explicit aim to represent theoretically relevant causal factors in nature (Reydon and Kunz 2019: 632; Reydon and Ereshefsky 2022: 5). The fact that there are many such factors (such as reproductive connections between organisms, interbreeding between populations, similarity in adaptive responses to similar environments, common descent and others) underlies the need for a multiplicity of species concepts—biologists require different concepts for contexts of research or use that focus on different factors in nature.

In this sense, a classificatory program can be understood as a theory about how a classification best serves a particular aim. As a theory, it can be tested empirically, namely by assessing whether the classification built on the basis of the program’s sorting principles indeed succeeds in achieving the aim set in its motivating principles. Such a test involves the specification of criteria for success, i.e., a clear view, on the part of the users of the classification, on when their aims are achieved in the best way. What is tested is the connection that the classificatory program makes between the motivating and sorting principles: the classificatory program should not only specify the aims of a classification and the basis for constructing a classification that meets these aims, but it must also explain why a classification constructed following its sorting principles (in comparison to other sorting principles) will best achieve the aims set by the program (Reydon and Ereshefsky 2022: 10–11). To do that, the classificatory program should specify how the aspects of the world the classification connects to (i.e., the way in which it is grounded) cause the classification to achieve the aims it was devised for. This grounding in the world, after all, is what distinguishes classifications that are merely accidentally successful and for which we do not know why they work, from classifications that are well-founded and the success of which is not a mystery. The GFA thus provides a perspective for the philosophical understanding of the success of classifications used in various contexts of research and practice.
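The structure of such a test can be made concrete in a small sketch. The Python below is purely illustrative and is not taken from the GFA literature: the class name ClassificatoryProgram, its fields, the threshold values and the six toy trees are all assumptions invented here. It also simplifies the account in one important respect, since it checks a single sorting principle against a success criterion and does not compare it with rival sorting principles, as the GFA ultimately requires.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Optional


@dataclass
class ClassificatoryProgram:
    """Hypothetical container for one classificatory program (illustration only)."""
    motivating_principle: str                      # the aim the classification serves
    sorting_principle: Callable[[dict], Hashable]  # operational rule: entity -> kind
    success_criterion: Callable[[Dict[Hashable, List[dict]]], bool]  # when is the aim met?
    grounding_account: Optional[str] = None        # stated reason why the kinds work

    def classify(self, entities: List[dict]) -> Dict[Hashable, List[dict]]:
        # Apply the sorting principle to every entity and collect the kinds.
        kinds: Dict[Hashable, List[dict]] = {}
        for entity in entities:
            kinds.setdefault(self.sorting_principle(entity), []).append(entity)
        return kinds

    def assess(self, entities: List[dict]) -> Dict[str, bool]:
        kinds = self.classify(entities)
        return {
            # Functionality: do the resulting kinds meet the program's own criterion?
            "functional": self.success_criterion(kinds),
            # Grounding: has the program supplied any account at all? (a crude proxy)
            "grounded": bool(self.grounding_account),
        }


def survival_rate(trees: List[dict]) -> float:
    return sum(t["survived_drought"] for t in trees) / len(trees)


def kinds_differ_in_drought_survival(kinds: Dict[Hashable, List[dict]]) -> bool:
    # Assumed criterion: at least two kinds whose survival rates differ by 0.5 or more.
    rates = [survival_rate(trees) for trees in kinds.values()]
    return len(rates) > 1 and max(rates) - min(rates) >= 0.5


# Invented toy data, not measurements.
toy_trees = [
    {"acorn_stalk_mm": 40, "survived_drought": False},
    {"acorn_stalk_mm": 35, "survived_drought": False},
    {"acorn_stalk_mm": 30, "survived_drought": True},
    {"acorn_stalk_mm": 2, "survived_drought": True},
    {"acorn_stalk_mm": 0, "survived_drought": True},
    {"acorn_stalk_mm": 3, "survived_drought": True},
]

forestry = ClassificatoryProgram(
    motivating_principle="choose trees likely to survive on dry sites",
    sorting_principle=lambda t: "pedunculate" if t["acorn_stalk_mm"] >= 10 else "sessile",
    success_criterion=kinds_differ_in_drought_survival,
    grounding_account="hypothetical: the two kinds differ in drought physiology",
)

result = forestry.assess(toy_trees)
print(result)
```

The point of the design is only that functionality and grounding are recorded separately: a program whose kinds happen to meet the success criterion but which supplies no grounding account would come out as functional yet ungrounded, which is the accidental success the GFA is meant to exclude.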

Obviously, applying the criteria discussed above in practice requires some empirical work. Under the logic of classificatory programs and the GFA, any philosophical analysis and any critical assessment of classificatory programs and practices must be preceded by empirical investigation of the relevant classificatory practice. Such an investigation must focus on the scientific as well as non-scientific aims and motivations that underpin the use of classifications, and it must characterize how particular classificatory practices are constructed to meet these aims. In a next step, it can then be assessed whether these particular practices do serve the aims for which they are produced, and whether they meet the normative requirements provided by the GFA.

In the next section, we apply these principles to the case of Q. robur and Q. petraea, identifying some of the main classificatory programs at play, and how they interact. Without aiming to be exhaustive, we try to identify the most important scientific and non-scientific classificatory aims at play, and to offer an assessment of how classificatory practices do or do not meet these aims. In practice, a distinction can often be made between specific classificatory programs, aimed at a particular, material goal, such as delimiting reproductively isolated units or relevant kinds of timber for shipbuilding, and more transversal interests or ‘classificatory virtues’ that play a role in many classificatory programs, such as clarity, stability, identifiability and others. While this might slightly complicate the picture of classificatory programs, the logic of the GFA applies in a similar way to particular and universal classificatory interests. Part of the exercise will be to assess how these interact.

Classificatory programs and oaks

The taxonomic confusion about the classification of Q. robur and Q. petraea has a long history. For example, in his Species plantarum (1753), Linnaeus recognized the taxon Quercus robur, which under nomenclatural rules has counted as the correct name ever since, but appears to make no mention of anything recognizable as Q. petraea. This is in contrast to his older Flora Suecica (1745), where he did seem to recognize what we know as Q. petraea, as a variety under Q. robur, referred to as Quercus latifolia mas, quae brevi pediculo (‘with short peduncle’, see Gardiner 1975; Schwarz 1935). In any case, Linnaeus appears to have treated the whole as one species. The first to have coined the epithet petraea is assumed to be the German botanist Heinrich Gottfried von Mattuschka, who described the taxon in his Flora Silesiaca (1777), yet also considered it a variety (‘Spielart’) of Q. robur. The first to formally publish the name Q. petraea as denoting a taxon at species level is assumed to have been Franz Kaspar Lieblein, another German botanist, who did so in his Flora Fuldensis (1784). However, he copied von Mattuschka’s description verbatim, and did not say anything about why he elevated the group to species status. For this reason, Gardiner (1975) argues that Lieblein is in fact ambiguous on the exact taxonomic rank, and was seemingly not much concerned with the issue.

Similar confusion seems to have occurred in the rest of Europe at the time. For example, the English botanist Richard Salisbury recognized both taxa as separate species, referring to sessile oak as Quercus sessiliflora, a name now superseded by the older petraea of von Mattuschka and Lieblein; for this he referred to the work of another botanist, Thomas Martyn. Martyn, in his Flora Rustica (1792), treated Q. robur (var.) sessilis as a variety under a broad species Q. robur. Interestingly, Martyn acknowledged the complexity of oak classification, and referred to the French botanist Fougeroux de Bondaroy, who had settled the matter by appealing to what was common practice among woodworkers. Fougeroux (1781) effectively argued that botanists had too little data to settle the question, but that woodworkers did have the knowledge to distinguish between the two groups, and in fact did distinguish between them.

Later on, oak taxonomy was among the cases that motivated De Candolle, initially bored by the prospect of having to study an enormous amount of herbarium material, to eventually write a study on the species concept starting from oaks (De Candolle 1862). Even Darwin used the case of oaks in his chapter on ‘variation under nature’ in the Origin of Species, discussing them in the first edition under the sub-heading of ‘doubtful species’ as an example of taxonomic disagreement, and discussing De Candolle’s work in the sixth edition. After having remarked that when it comes to ranking a form as a species or a variety ‘the opinion of naturalists having sound judgment and wide experience seems the only guide to follow’ (Darwin 1859: 47), he went on to highlight that more investigation often yields more controversy. As he wrote: ‘Look at the common oak, how closely it has been studied; yet a German author makes more than a dozen species out of forms, which are very generally considered as varieties; and in this country the highest botanical authorities and practical men can be quoted to show that the sessile and pedunculated oaks are either good and distinct species or mere varieties’ (Darwin 1859: 50).

To hybridize, or not to hybridize

Despite the historical confusion on whether one or two species should be recognized, the split between Q. robur and Q. petraea is currently rarely questioned in formal taxonomy. However, as already touched upon, many scholars agree that this split is not self-evident, mainly because both groups are reputed to have a strong tendency to hybridize (see for example Rushton 1993). Hybrids of Q. robur and Q. petraea are sometimes even recognized as a separate taxon, Q. × rosacea Bechstein. In any case, any occurrence of hybridization puts the groups, in theory, in violation of the Biological Species Concept, which aims to delimit species that are reproductively isolated. This problem is not unique to European oaks and is also observed elsewhere, for example in North America (Burger 1975): there appears to be substantial gene flow between many traditionally distinguished groups of oaks.

Interestingly, some authors have argued for that reason that the BSC ‘does not work’ for oaks: it fails to produce what they consider to be reasonable species (Burger 1975; Cannon and Petit 2020; Van Valen 1976). From the perspective of the GFA, that is an overly generalizing statement, because whether or not a species concept ‘works’ is fundamentally context-dependent. Following the GFA, a species concept or classificatory program can fail if for some reason it cannot produce a classification that meets its objectives, or if it cannot ground that classification in the world, but not if it fails to produce a classification that is upheld for other reasons.

In light of the GFA, it is important to make a distinction between two debates that are of a different nature, but that are too often intertwined in actual discussions. On the one hand, there can be a debate on motivating principles and their relevance: should we want to delimit reproductively isolated groups of oaks, or not, and why? On the other hand, there can be a debate on how exactly to identify such reproductively isolated groups, i.e., on what the best sorting principles are and, for example, on the extent of hybridization and therefore the degree of gene flow between two putative groups. The former debate ultimately is one of aims and priorities, the latter is of a much more practical and empirical nature, concerned with how motivating principles are operationalized, and with the interpretation of the output of operational sorting procedures.

This is nicely illustrated by the case of Q. robur and Q. petraea. Some authors point to conceptual and methodological difficulties with respect to measuring hybridization (e.g., Aas 1993). Given the inherent morphological variability of oaks, it is not always clear whether morphological intermediates are true hybrid descendants of pure parents. Also, hybridization often leads to a gradient, because of the backcrossing of hybrids with parental trees, so that it is difficult to say from which point an individual is hybrid rather than pure. These difficulties have led to a debate on the extent and importance of hybridization, with some authors arguing that the occurrence of hybridization has been exaggerated, and actually is not really an important problem (e.g., Becker and Lévy 1990; Curtu et al. 2007). Similarly, controlled hybridization experiments have been conducted and have shown at least the possibility of hybridization (e.g., Steinhof 1993), but again there can be debate on whether and how one can infer natural gene flow from experiments.

Such questions are real and relevant, and they exemplify how the GFA works out in practice and how classifications serving a specific objective should be empirically grounded. Uncontroversially, if we want to distinguish reproductively isolated units of organisms, we face the empirical question of what reproductively isolated units there are in the world. As our example shows, that is not an easy question. At the same time, that empirical question is the only one relevant to arriving at a classification for the stated objective. Should the evidence tilt towards there being substantial hybridization between Q. robur and Q. petraea, and thus towards lumping under the BSC, they should in this context indeed be lumped, regardless of what tradition says, or of what happens in the application of other species concepts. Such lumping was proposed for example by Kleinschmit et al. (1995), but the situation remains unclear (see for example Coyne and Orr 2004: 43–45).

Ecology and ecology are two

Another element that has played an important role in the debate on oak classification is ecology, partly because oaks have been central in the development of the Ecological Species Concept (ESC) by Van Valen (1976). Van Valen was among those convinced that the BSC, the gold standard among species concepts at that time, did not ‘work’ for oaks, mostly because it fails to distinguish between groups that do have ecological differences, betraying divergent adaptation to different ecological niches. To him, this was taxonomically more relevant than reproductive isolation. Van Valen’s criticism of the BSC was partly theoretical, in the sense that he believed that reproductive isolation is of minor importance for evolution: what counts is divergence and adaptation. If units are in that sense ecologically divergent, under the ESC they should be seen as separate species.

How the conflict between the ESC and the BSC should be modelled under the GFA is a matter of philosophical subtleties, and partly a matter of perspective. Both concepts claim the ultimate aim of identifying units of evolution, more or less in line with de Queiroz (2007), who argued that most species concepts in fact want the same thing, namely to delimit the units of evolution, but do so with different methods. Taken in that way, in the framework of the GFA, the ESC and the BSC have similar motivating principles, but divergent sorting principles, and the debate is on which sorting principles suit the motivating principles best. However, at a more fine-grained resolution, there are differences in aims. The BSC, for example, takes a more future-oriented view of evolutionary units (what are the units of future evolution?), while the ESC adopts a more product-oriented view (what are the products of evolution?). Focusing on these differences in nuance helps to understand why different operational sorting principles are used, and why, after long debates, no consensus has been reached.

Whether putative taxa like Q. robur and Q. petraea are relevant units under the ESC is then again an empirical question, depending on actual ecological differences. These have been the focus of a substantial research tradition trying to observe habitat differences between both groups, which has established that Q. robur is better adapted to wetter, often more alkaline habitats, while Q. petraea grows better in drier, more acidic habitats (Eaton et al. 2016). These ecological differences occur regardless of the occurrence of gene flow. The fact that Q. robur and Q. petraea constitute at the same time distinct adaptive units and a whole that is subject to gene flow has led some to qualify them as a ‘syngameon’, a term which denotes precisely a group of otherwise distinct taxa that do interbreed and are interlinked in their evolutionary trajectory (Cannon and Petit 2020). Both levels of such a syngameon are of evolutionary relevance. The notion of a syngameon by definition points towards classificatory pluralism: both its constituent parts and the syngameon as a whole are of classificatory relevance, depending on which aspect or level one is interested in.
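How one and the same set of trees can be legitimately partitioned in two ways can be illustrated with a minimal sketch. Everything in the Python below is an assumption made for illustration: the four trees, the field names habitat and gene_pool, and the two one-line sorting rules are invented stand-ins for an ecology-based and a gene-flow-based sorting principle, not operational versions of the ESC or the BSC.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List


def partition(entities: List[dict], sorting_principle: Callable[[dict], Hashable]) -> Dict[Hashable, List[str]]:
    """Group entity ids by whatever label the sorting principle assigns."""
    groups = defaultdict(list)
    for entity in entities:
        groups[sorting_principle(entity)].append(entity["id"])
    return dict(groups)


# Invented toy trees: two habitat types, but a single interbreeding gene pool.
toy_oaks = [
    {"id": "tree1", "habitat": "wetter", "gene_pool": "shared"},
    {"id": "tree2", "habitat": "wetter", "gene_pool": "shared"},
    {"id": "tree3", "habitat": "drier", "gene_pool": "shared"},
    {"id": "tree4", "habitat": "drier", "gene_pool": "shared"},
]

# Sorting by gene flow yields the syngameon as a whole ...
by_gene_flow = partition(toy_oaks, lambda t: t["gene_pool"])
# ... while sorting by habitat yields its ecologically distinct parts.
by_ecology = partition(toy_oaks, lambda t: t["habitat"])

print(by_gene_flow)
print(by_ecology)
```

Neither partition is wrong relative to its own sorting rule; which one is appropriate depends on the aim for which the trees are being classified.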

With regard to ecology, the following observation seems important: much of the research exploring ecological differences in terms of habitat preference is inspired not so much by taxonomic questions and the ESC as by ecological interests in their own right, or by interests in forestry, environmental issues and so on. Information on the habitat requirements of kinds of trees is of use to foresters, so that they know which tree is best to plant in which area, depending on relevant environmental conditions. This is again illustrated by our example. Substantial research on possible ecological differences between Q. robur and Q. petraea was initiated after the great European drought of 1976, which had led to significant dieback in oak forests. It was found that this dieback was much more pronounced among oaks identified as Q. robur than among oaks identified as Q. petraea, and subsequent investigation concluded that the latter indeed fares much better in drought conditions than the former, which is important information in the light of global change (Becker and Lévy 1983, 1990; Lévy et al. 1992).

Arguably, this story reveals a different classificatory program, one that is not so much interested in formal taxonomy or in evolution, but rather in delimiting relevant units for forestry, and maybe also for climate change mitigation. The difference between these classificatory programs lies in their motivating principles (they classify for different purposes), even though these motivating principles might support the same sorting principles, and therefore the same resulting classification. It is because research finds ecological differences that are relevant to forestry, such as differences in drought tolerance, that a classification is validated and becomes entrenched. The story also shows that it is reasonable, under the GFA, to demand empirical confirmation of a classification within the specific context for which it was devised, and thus that the functionality and grounding conditions should be taken seriously. In this case, the classification is tuned to the demand of foresters to be able to decide which trees to plant in which environments, and the validation of the classification lies in its actual success in satisfying this demand (functionality) and in the availability of an explanation of why the classification functions successfully in this regard (grounding).

That said, habitat preferences and drought tolerance are but one aspect of ecology. Another ecological feature of organisms as large as oaks is that they themselves serve as habitat for a variety of smaller organisms, for example arthropods. Research into this matter is much more recent, and for that reason one could argue that the relevant classificatory program has probably not yet fully crystallized. For example, in their analysis of the arthropod fauna sustained by oaks in Norway, Thunes et al. (2021) explicitly do not make a distinction between Q. robur and Q. petraea, because of the inherent taxonomic difficulties discussed above, and seemingly because they do not see the distinction as relevant for arthropod fauna (although this lack of relevance could be a research question in itself). However, further research might yet find differences between Q. robur and Q. petraea as habitats, in which case they should be treated separately in this context. On the one hand, this shows that whether these oaks hybridize, or whether they differ in aspects such as drought tolerance, does not matter within the context of oaks-as-habitats. On the other hand, it illustrates that the development of a classificatory program can be a long process, and that the justified classification can change when new information becomes available.

Thus, even if they share a focus on ecology, classificatory programs that have different aims can operate independently, and can justify different classifications. Which classification is justified is again an evolving empirical question that can only be answered relative to these aims. The GFA provides a strict perspective here. For example, challenges to the taxonomic status quo under the BSC are rejected not only with ecological arguments, but also by invoking virtues such as stability. Burger (1975: 49) acknowledges that the BSC speaks against the standing taxonomies in Quercus, but argues that it is legitimate to continue with morphological taxa, among other reasons because the current taxonomic approach constitutes an ‘already functional hierarchy in a time-tested system of information retrieval.’ Under the GFA, such arguments cannot, in principle, be assessed separately from a specific motivating principle. Within a particular classificatory program, stability can be required, but it cannot be required independently of any such program. Accordingly, Burger’s argument cannot hold independently of a specification of classificatory motives (e.g., the need for a stable grouping) and of the explicit connection of these motives to the aims that the classification should achieve within a particular context of practice (e.g., stable groups are needed by foresters when making decisions about planting trees in specific locations, because classificatory instability would entail that such decisions would be made on a different basis at different times, thus hampering the long-term management of a specific location).

Oakonomics: the case of winemaking

As we have argued, the lens of the GFA can be used for both scientific and non-scientific classificatory programs: the principles and requirements are the same. Arguably, each application of oaks can have its own classificatory program. For example, applications of oak wood in the economy range from construction to the making of furniture, the smoking of meats, and cooperage for the production of beverages, in particular wine. In some cases, there might be relevant distinctions between Q. robur and Q. petraea, in other cases there might not be, and sometimes still more fine-grained classifications can be relevant, depending on the exact properties of oak wood that an economic sector interacts with.

Let us look at the last example, the use of oak wood in cooperage for barrels in which wine and other beverages are matured. It has been shown that the wood used can strongly influence the taste of the eventual product: wooden containers are so-called ‘active vessels’ that chemically interact with the liquids stored in them (del Alamo-Sanza and Nevares 2018). Indeed, beverages such as wine are renowned for their complex and varying taste, which is brought about by a complex array of chemical compounds that results from a large number of factors, including the grape variety used and the environment in which the grapes are cultivated, but also various factors in the processes of winemaking and ageing, such as the barrels used (Lund and Bohlmann 2006). All these factors can give rise to important classificatory programs, i.e., to a search for relevant units of discrimination. The classification of grape varieties, for instance, is the subject of its own discipline, named ampelography (see for example Chitwood 2020), and the same can be said for the wood used in wine-related cooperage: wanting to distinguish units of wood that give a different taste to wine is a relevant motivating principle for classification.

What these units are is once more an empirical question. That is confirmed by the existence of an actual research program that tries to test the effect of different kinds of oak wood, for example wood coming from different putative species such as Q. robur and Q. petraea, but also wood coming from different forests, on the chemical composition and organoleptic qualities of wine. For example, Gougeon et al. (2009) have found that both the putative species (Q. robur and Q. petraea) and the region of origin of the wood leave a distinct chemical signature in wine, even after 10 years of ageing, suggesting that there might be reasons to treat both groups separately here. If it were to be confirmed that this has a perceptible effect on taste, classificatory practices might follow from this, recognizing cooperage wood in terms of species, or of geographic origin. Currently, the exact species identity or geographic origin of wood used for ageing is never mentioned on wine bottles, but under the logic of the GFA, pending further empirical corroboration of possibly relevant differences, that could change. Given the underlying motivation, chemical research should also be complemented with actual tasting, to see whether differences can actually be sensed. But if relevant distinctions are confirmed, they might become entrenched in classificatory practice, showing the empirical dynamics behind classificatory programs. The case of oaks and winemaking illustrates how classifications in practical contexts follow very much the same pattern as in scientific contexts. It also reveals again that, as in the case of oaks-as-habitats, the development of classificatory programs extends through time.

Oaks in law and policy

Given their importance, as was touched upon in the introduction, oaks are also subject to policy and regulation, and here too, classifications come into play. Many living entities present policy challenges, and here classificatory clarity is important, as the scope of a policy or law must be unequivocal. In the United States of America, there have been several lawsuits over the exact content of taxonomic units, often in the context of conservation law, showing that classificatory vagueness can have important legal consequences (Wheeler 2014). In many other contexts, it might not be a problem that there are intermediate forms that cannot be unequivocally attributed to either group (winemakers interested in distinguishing both groups can simply avoid the wood of intermediate trees), but for policy ends the occurrence of intermediate or ambiguous forms does lead to problems.

Oaks are subject to several EU policies, such as the directive regulating the trade in seeds, and the regulation governing the trade in timber. Arguably, each of these policies comes with its own classificatory program. The motivating principle here can be complex. In the case of the trade in seeds, the protection of the purity of lots of seeds can be seen as partly inspired by the need to ensure that the right tree is planted in the right place (Muir et al. 2000). The related classificatory program is then motivated by the need to distinguish relevant units, for example with regard to habitat preferences, in line with what was said about ecology-related classificatory programs. However, in this policy context, other, more practical requirements also apply, for example that it must be possible to attribute every individual unequivocally to one unit.

In the directive on trade in forest reproductive material, the distinction between Q. robur and Q. petraea is explicitly made. This, as discussed, makes sense from an ecological viewpoint. However, whether that classification is fully functional remains to be seen. As Dupouey and Le Bouler (1989) argue, the fact that one can distinguish adult trees belonging to both groups in the field does not necessarily mean that one can distinguish them in lots of seeds on the market. The authors have analyzed whether acorn morphology can be indicative of belonging to either group, and have found that with a multivariate measure, mainly focusing on the minimal radius and the distance of the maximal radius to the apex, both groups can be distinguished correctly in around 85% of the cases. Whether that is enough certainty to allow smooth enforcement of a policy is a matter for discussion. However, it shows how practical considerations can complicate classificatory programs, and that theoretically ideal classifications are not necessarily functional in practice.

The importance of such practice-oriented reflection is also illustrated by the case of the timber regulation. Here, as was said, the law itself does not specify a classification, so that, strictly speaking, it is ambiguous whether Q. robur and Q. petraea should be seen as separate species, or whether they can be seen as one species. As scientific sources can be found that argue in both directions, it seems at least unwise for policymakers to rely on a perceived scientific consensus, as this can lead to legal conundrums. For the timber regulation as well, enforcement is an important consideration. If the distinction is made, for example because both putative species differ in relevant traits of timber, law enforcers should be able to distinguish between them in practice. Hence, Blanc-Jolivet and Liesebach (2015) tried to find the best methods for DNA fingerprinting, so that the species (and the geographic origin, for that matter) of oak material can be assessed. The success of the classificatory program related to legal regulations concerning oaks depends on whether such methods are effectively found, and thus on whether a functional classification can be produced. Hence, policymakers should be aware of the plurality of possible classificatory choices, and actively aim to adopt classifications that do justice to the specific aims of their policies. Of course, as will be detailed further on, they need not do this on their own: taxonomists and other experts can play a vital role in providing context-specific classificatory expertise.

Morphology, genetics, and the GFA

The fact that Q. robur and Q. petraea are often difficult to distinguish has led to a proliferation of studies investigating morphological and molecular differences. Research on the morphology of Quercus robur and Quercus petraea has been strongly influenced by the principles of numerical taxonomy. The tradition of numerical taxonomy was inspired by the desire to increase the objectivity of taxonomic decisions by working with a large number of characters and multivariate statistics, all while avoiding any evolutionary or phylogenetic speculation (e.g., Sokal 1963). It was thought that by using many characters, the need for the interpretation of characters could be reduced. That this practice was enthusiastically adopted in oak systematics probably follows from the fact that many morphological characters taken on their own generate difficulties, because of considerable within-group variability or the lack of clear breaks. It seems a logical step to test whether these difficulties are filtered out in multivariate analysis. In a next step, such an analysis can also be used to find the best single characters for differentiation, if there are any.

For example, Kremer et al. (2002) have investigated various leaf morphological traits, such as the length of the lamina and of the petiole, the width of the lobes and the sinuses, the number of intercalary veins and the degree of pubescence. Using statistical techniques that bring these variables together in one synthetic variable, they show that the morphology of the oaks under consideration has, for this variable, a bimodal distribution. This suggests the existence of two morphological groups, albeit with a number of intermediate forms, so that there is no absolute break between them. The authors then tested which of the individual traits correlated best with the synthetic variable, making these the most significant traits for identification. In this case, the best correlations were found for petiole length, intercalary venation and pubescence, which according to the authors is consistent with traditional practices in the identification of Q. robur and Q. petraea. Similar results were obtained, for example, by Aas (1993), Dupouey and Badeau (1993) and Kelleher et al. (2004), using different sets of variables. For example, Dupouey and Badeau (1993) use no fewer than 34 characters, including characters related to acorns. This illustrates the variety of characters that can be used, either on their own or in combination. As mentioned, Dupouey and Le Bouler (1989) focus purely on fruit characters, and still other phenotypic traits are seen as relevant, for instance regarding the anatomy of the wood (Feuillat et al. 1997).

All this reveals some interesting issues with regard to classificatory programs and the GFA. Although it claims objectivity, even a numerical approach faces many choices, for example with regard to which characters are included and how any output is interpreted, such as whether the occurrence of intermediate forms is a problem or not. But the GFA shows that such choices can only be made with reference to classificatory aims: the attempt of numerical taxonomy to simply process large amounts of data on traits without reference to any theoretical or practical context was misguided. As was argued, a classificatory program related to the trade in seeds should be informed by both the traits of the eventual trees and the traits of the seeds. One related to timber trade or to winemaking should be informed by properties of the wood, but not of the seeds or the leaves. A naturalist, by contrast, will mostly try to distinguish trees by their bark, leaves, flowers and fruits. Thus, while all these data on various traits are valuable in themselves, they cannot inform classificatory decisions apart from a specific context. Similarly, whether the occurrence of intermediate or unidentifiable individuals is a problem also depends on the context, as was illustrated above.

The same goes for molecular traits. A variety of genetic characters have been explored, using single markers, multiple markers, or whole-genome data, and various methods have been used to assess molecular differences between and within putative populations of Q. robur and Q. petraea (e.g., Gömöry et al. 2001; Kremer and Petit 1993; Saintagne et al. 2004). Again, which characters are best for allocating local populations to species, which methods are best for characterizing these populations, and how these characters are best interpreted depends on the specific classificatory aims, on scientific possibilities and on practical requirements. Genomic approaches might have advantages from a scientific perspective, but if practical and cheap molecular tools are required, it can be better to build a classification on a single marker that is easy to assess.

Classificatory pluralism in practice and the role of taxonomists

The GFA and the case study on oaks presented above illustrate that classificatory questions can only be answered with explicit reference to the contexts in which given classifications are used, and that different classifications can be the most suitable in different contexts. As such, applying the GFA vindicates classificatory pluralism, and reveals what considerations are necessary, within the context of pluralism, to come to a suitable classification in a given context, for a given objective. A consequence of this is that it becomes undesirable to try to impose one single all-purpose classification, however well intended that may be. However, that obviously raises several questions and challenges, for example concerning the role that is left for scientific taxonomists in a pluralist classificatory world, and concerning the notion of ‘species’ as a special kind of kind, with context-independent aspirations. It also raises the challenge of how we can ensure that communication across contexts remains possible if different actors use different classifications. In the extreme case, everyone could use their own tailor-made classification, but that would lead to confusion as well, in particular given the increasing importance of data integration and interdisciplinarity.

Limited space does not allow us to discuss these issues in full detail, but we do want to offer some observations. First of all, while the GFA implies that multiple useful maps of the biosphere can be drawn, depending on priorities, this does not mean that the aim to map the biosphere is no longer valuable. Rather, it means that that aim can be achieved in various ways. The discovery of new diversity, the charting of unknown territories in the biosphere, remains a crucial task for taxonomy. However, taxonomists should be transparent about what the units of new diversity they describe represent. In other words, they should make clear to which map of diversity (for example that of ecologically distinct units, or that of reproductively isolated units) their descriptions and classifications of species belong. Related to this, we do not necessarily plead for abolishing the notion of species, but we do plead for everyone to be transparent about what they mean when calling something a species (see also Conix et al. 2023).

Apart from this, we do believe that taxonomy needs to evolve in part from aiming to produce one all-purpose map of diversity, to aiming to produce and coordinate the multiple sensible maps of the biosphere that can be drawn. In a way, the GFA leaves ample space, or even generates a substantial workload, for taxonomic or classificatory science. As the GFA argues, and as our examples illustrate, classificatory programs come with important and often difficult empirical, scientific questions about what a functional and grounded classification is for a given context. Foresters need empirical information on the ecological preferences of trees to be able to relate trees to possible habitats. Similarly, winemakers need empirical information on relevant chemical differences between kinds of wood used in cooperage. And policymakers regulating trade need information on the identifiability of relevant units, on the reliability and cost-effectiveness of genetic markers, and so on. In practice, one cannot expect all actors in the field to do all the classificatory work themselves. Rather, they should be able to rely on scientific expertise for this, and the unique classificatory expertise of taxonomists can be of great value here.

As such, we believe taxonomists have a role to play in becoming ‘managers of pluralism’, for example in keeping track of classificatory programs. This requires attention to detail. While the issue for Q. robur and Q. petraea is usually represented as a matter of either splitting or lumping, in reality it is of course more complex. Similar-looking classifications, such as those that separate Q. robur and Q. petraea, may be built on very different sorting principles, so that they have different properties, which should be mapped as well. A split based on the morphology of the acorns might look similar to one based on the morphology of the leaves or on ecology, but differ in terms of the occurrence of intermediate forms, the ease of identification, and so on. Moreover, using different sorting principles often yields groups that do not match each other perfectly in extension, even when all classifications involve a dichotomy between Q. robur and Q. petraea: borderline cases that would be counted as closer to Q. robur in one classification may be seen as closer to Q. petraea in a different classification, or as fully intermediate between the two.

As such, by keeping an overview, taxonomists can also preserve the possibility to navigate and communicate between contexts. For larger classifications, they can also do this by providing taxonomic alignments. Taxonomic alignments are information tools in which all relations between the entities recognized within classifications are made explicit, so that all classifications can be mapped onto each other. In a way, they are maps of maps of biodiversity, allowing users to navigate between classifications produced from different contexts (Cuypers et al. 2022; Franz and Peet 2009; Franz et al. 2016). Through them, data, research or policies built on one classification can, as it were, be ‘translated’ into another classification, thus preserving the possibility of communication and exchange.
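To make the idea of an alignment more concrete, the following is a minimal illustrative sketch, not the method of any of the works cited above. It assumes that each classification can be represented as a mapping from taxon names to sets of individual trees, and that the relation between two taxa can be summarized by comparing those sets. The tree identifiers (`t1` to `t6`), the three example classifications, the relation labels, and the function names `relation` and `align` are all hypothetical and chosen only for illustration.

```python
from itertools import product


def relation(a: set, b: set) -> str:
    """Describe how the extension of taxon a relates to that of taxon b."""
    if a == b:
        return "congruent"
    if a < b:          # a is a proper subset of b
        return "included in"
    if a > b:          # a is a proper superset of b
        return "includes"
    if a & b:          # some shared members, neither contains the other
        return "overlaps"
    return "disjoint"


def align(c1: dict, c2: dict) -> dict:
    """Relate every taxon of classification c1 to every taxon of c2."""
    return {
        (name1, name2): relation(members1, members2)
        for (name1, members1), (name2, members2) in product(c1.items(), c2.items())
    }


# Hypothetical data: six trees sorted under three different sorting principles.
# Tree t3 is a borderline case that the two splits assign differently.
split_by_leaves = {"Q. robur": {"t1", "t2", "t3"}, "Q. petraea": {"t4", "t5", "t6"}}
split_by_acorns = {"Q. robur": {"t1", "t2"}, "Q. petraea": {"t3", "t4", "t5", "t6"}}
lumped = {"Q. robur s.l.": {"t1", "t2", "t3", "t4", "t5", "t6"}}

leaves_vs_acorns = align(split_by_leaves, split_by_acorns)
leaves_vs_lumped = align(split_by_leaves, lumped)

for pair, rel in leaves_vs_acorns.items():
    print(pair, "->", rel)
```

In this toy example, the two splits use the same names but differ in extension because of the borderline tree, which is the kind of mismatch an alignment is meant to make explicit. Real alignments relate whole hierarchies and typically rely on expert judgment rather than on complete specimen lists.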

Conclusion

The classification of entities is an inevitable step in our interaction with the world, but it is by no means a self-evident operation, and it often comes with challenges, confusion, or disagreements. The case of oaks makes this abundantly clear: there is confusion about oak taxonomy within the sciences, there are questions and difficulties concerning classifications of oaks in the economy and in policymaking, and, moreover, there appears to be confusion about how scientific and non-scientific classifications should relate to each other. The Grounded Functionality Account of kinds offers a practical framework that on the one hand helps to elucidate such classificatory conundrums, and on the other hand provides a workable normative framework to assess and improve classificatory practices. By highlighting the multiplicity of interests and motivations that usually underlie classificatory practices, it inevitably leads to classificatory pluralism, but a conditioned one, based on scientific input. This leaves an important role for science and formal taxonomy, in providing classificatory expertise to users of classifications, and in managing classificatory pluralism. Perhaps paradoxically, the GFA shows, to scientists and non-scientific stakeholders alike, that it is only by acknowledging the plurality of useful classifications that we can keep seeing the oaks for the wood.