1 Introduction

It all began with the problems of induction. Within analytic philosophy, induction has been seen as a problem concerning inferences that have been analysed as relations between sentences. However, it soon became apparent that the logical approach resulted in paradoxes. The most well-known are Goodman’s (1954) ‘new riddle of induction’ and Hempel’s (1965) ‘paradox of confirmation’. If we use logical relations alone to determine which inductions are valid, the fact that all concepts are treated on a par induces symmetries which are not preserved by our intuitions concerning which inductive inferences are permissible: ‘Raven’ in Hempel’s paradox is treated on a par with ‘non-raven’, ‘green’ in Goodman’s with ‘grue’, etc. Intuitively, however, ‘non-raven’ and ‘grue’ are not natural concepts. What is needed is a non-logical way of distinguishing the natural concepts that may be used in inductive inferences from those that may not.

There are several suggestions for such a distinction in the literature. One idea is that some predicates denote “natural kinds” (or “natural properties”) while others don’t, and it is only the former that may be used in inductive reasoning. Natural kinds are normally interpreted realistically, following the Aristotelian tradition, and are therefore assumed to represent something that exists in reality independently of human cognition. However, when it comes to inductive inferences, it is not sufficient that the properties exist out there somewhere; we must also be able to grasp the natural kinds with our minds. In other words, what is needed to understand induction, as performed by humans, is a conceptualistic or cognitive analysis of natural concepts.

Goodman (1954) tries to solve the problem by distinguishing between ‘entrenched’ concepts such as ‘blue’ and ‘green’, which have been used successfully in many inferences, and concepts such as ‘grue’ and ‘bleen’, which have not. Goodman’s distinction, however, provides little more than a name for the problem. He does not give any constructive suggestion for how to determine whether a concept is entrenched or not.

A development within psychology that is relevant for a description of natural concepts is prototype theory which was put forward by Rosch (1975, 1978). The main idea of this theory is that within a category of objects, like those instantiating a property, certain members are judged to be more representative of the category than others. For example, oranges are judged to be more representative of the category fruit than are kumquats, durians and cloudberries; and desk chairs are more typical instances of the category chair than rocking chairs, deck-chairs, and beanbag chairs. The most representative members of a category are called prototypical members.

Now if the traditional philosophical characterization of a concept in terms of necessary and sufficient conditions is adopted, it is very difficult to explain such prototype effects. Either an object is a member of the class assigned to a concept or it is not, and all members of the class have equal status as category members. Rosch’s research, however, has been aimed at showing that there are asymmetries among category members. Since the traditional Aristotelian definition of a concept does not predict such asymmetries, something else must be going on.

The distinction between prototypical and non-prototypical examples of a concept can be seen as a ‘horizontal’ dimension of prototype theory. The theory also has a ‘vertical’ dimension pertaining to the level of specificity of the class of concepts under consideration. Rosch (1975, 1978) distinguishes between superordinate, basic, and subordinate levels. For example, ‘animal’ is superordinate to the basic level ‘dog’ and ‘terrier’ is subordinate.

In this article I take a cognitive approach to natural concepts. Hence, I will not discuss realistic theories pertaining to “natural kinds” and their ilk. My aim is to introduce criteria that are evaluated with respect to how they support the cognitive economy of humans when they use concepts in reasoning and communication. In the following section, I first briefly present the theory of conceptual spaces as a tool for expressing the criteria. Then I introduce the central idea that natural concepts correspond to convex regions of a conceptual space. I argue that this criterion, denoted Convexity, has far-reaching consequences for the properties of natural concepts. I also present some other criteria that further delimit the class of natural concepts. Finally, I show that Convexity and the other criteria also make it possible to explain how people can mean the same thing when they communicate using concepts.

2 Conceptual Spaces

Conceptual spaces have been developed as a research program in semantics studying the structure of concepts and their interrelations using geometrical methods (Gärdenfors 2000, 2014). The approach builds on two central ideas about the composition and structure of concepts and properties: (i) they are composed of clusters of quality dimensions, many of which are generated by sensory inputs such as colour, size and temperature; (ii) they have a geometric or topological structure that results from integrating the specific structures of the dimensions.

Quality dimensions can be integral or separable. They are integral when one cannot assign an object a value in one dimension without also assigning it a value in another dimension (Maddox, 1992). For instance, it is not possible to attribute a value to the pitch of a tone without attributing one to its loudness. When quality dimensions are not integral, they are called separable.Footnote 1

The notion of a domain is defined as a set of integral dimensions that are separable from all other dimensions. For instance, human colour properties are composed of three fundamental parameters of colour perception: hue, intensity and brightness (Gärdenfors 2000). Any colour perception is mapped onto specific values on these dimensions. More generally, different colours can be described as regions of possible values of these three parameters (see Fig. 1).

Fig. 1 Colour space consisting of the dimensions hue, intensity and brightness

The central notion of a ‘conceptual space’ is defined as a collection of one or more domains with a distance function (metric) that represents similarity relations between objects. The distance function can vary; the most common one is the Euclidean, but Manhattan and polar metrics may also be appropriate in different contexts (Shepard, 1964; Johannesson, 2002; Gärdenfors 2014).

Objects are represented as points in a conceptual space, that is, as a vector of coordinates for the dimensions of the space. An observation, for example seeing a black raven, is represented as a partial vector for an object, specifying some of the properties of the object.

Similarity between concepts and objects is defined as a monotonically decreasing function of their distance within the space (Shepard, 1987). This makes my notion of similarity different from that of Tversky (1977), which is based on comparing the number of properties that two objects have in common with the properties where they differ.

3 Convexity

The general description of conceptual spaces will now be used to introduce a distinction between properties and concepts. I use the notion of a property to denote information related to a single domain. The following constraints were proposed in Gärdenfors (1990, 2000), where the geometric characteristics of the quality dimensions are used to introduce a spatial structure to properties:

Convexity constraint for properties: A natural property is a convex region in some domain.

That a region R is convex means that for any two points x and y in R, all points between x and y are also in R. The motivation for the thesis is that if some objects located at x and y, in relation to some domain, are both examples of a property, then any object that is located between x and y with respect to the same domain will also be an example of the property. Although not all domains in a conceptual space may have a metric, I assume that the notion of betweenness is defined for all domains.Footnote 2 This will make it possible to apply the thesis about properties generally.
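The betweenness condition can be made concrete in code. The following sketch (my own illustration, not drawn from the article's sources) tests the convexity of a one-dimensional region given as a membership predicate, by sampling points between two members:

```python
def is_plausibly_convex(member, x, y, steps=100):
    """Check the betweenness condition for a 1-D region given by a
    membership predicate: every sampled point between two members x
    and y should itself be a member."""
    if not (member(x) and member(y)):
        raise ValueError("x and y must both be members of the region")
    return all(member(x + (y - x) * i / steps) for i in range(steps + 1))

# An interval satisfies the condition ...
interval = lambda v: 2.0 <= v <= 5.0
print(is_plausibly_convex(interval, 2.5, 4.5))   # True

# ... but a 'grue'-like union of two separated intervals does not.
gappy = lambda v: 0.0 <= v <= 1.0 or 4.0 <= v <= 5.0
print(is_plausibly_convex(gappy, 0.5, 4.5))      # False
```

Sampling of course only approximates the universally quantified condition, but it illustrates why gerrymandered regions fail the test.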

Properties, as characterized by the thesis, form a special case of concepts. This distinction is defined by saying that a property is based on a single domain, while a concept is based on one or more domains (Gärdenfors 2000). The distinction has been obliterated in both linguistic and philosophical accounts. For example, properties, concepts, and relational concepts are all represented by predicates in first-order logic and in λ-calculus. In Gärdenfors (2000), the following criterion is put forward:

Convexity constraint for concepts: A natural concept is a set of convex regions in a number of domains, together with an assignment of salience weights to the domains and information about how the regions in different domains are correlated.

The constraint for properties is thus a special case of the one for concepts where only one domain is involved. I next show that the Convexity constraint generates concepts that have features that connect to naturalness.

3.1 Connections to Prototype Theory

There are interesting comparisons to make between analysing properties and concepts as convex regions and the prototype theory developed by Rosch and her collaborators (see e.g. Rosch, 1975, 1978; Mervis & Rosch, 1981). When properties are defined as convex regions in a domain, prototype effects are to be expected. Given a convex region, one can describe positions in that region as being more or less central.

Conversely, if prototype theory is adopted, then the representation of properties and concepts as convex regions is to be expected. Assume that some metric quality dimensions of a conceptual space are given—for example, the dimensions of the colour domain—and that the goal is to decompose it into a number of regions, in this case, colour properties. If one starts from a set of prototypes p1, …, pn, these should be the central points in the concepts they represent. The information about prototypes can then be used to generate convex regions by stipulating that any point p within the space belongs to the same concept as the closest prototype pi. (Note that talking about a “closest” point requires that the space has a metric.) This rule will generate a Voronoi tessellation. Figure 2 depicts two different Voronoi tessellations. The tessellations are two-dimensional, but Voronoi tessellations can be extended to any number of dimensions.

Fig. 2 Two Voronoi tessellations of a two-dimensional space. The points represent the prototypes and the lines show the borders between the categories
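The stipulation that a point belongs to the same concept as its closest prototype can be written down directly. The sketch below is a minimal illustration (the prototype names and coordinates are hypothetical), using a Euclidean metric; the implicit cell borders are exactly those of a Voronoi tessellation:

```python
import math

def categorize(point, prototypes):
    """Assign a point to the concept whose prototype is nearest
    (Euclidean metric). Ties go to the first prototype listed."""
    return min(prototypes,
               key=lambda name: math.dist(point, prototypes[name]))

# Hypothetical prototypes in a two-dimensional domain.
prototypes = {"red": (1.0, 1.0), "yellow": (4.0, 1.0), "green": (2.5, 4.0)}

print(categorize((1.2, 0.8), prototypes))  # 'red'
print(categorize((3.8, 1.5), prototypes))  # 'yellow'
```

Note that only the three prototype locations are stored; the category of every other point is computed from the metric, which is the memory-saving property discussed below.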

A central semantic hypothesis is that the most typical meaning of a word or linguistic expression is the prototype at the centre of the convex region assigned to the word. However, note that concepts such as hot and giant do not have prototypes. They correspond to open-ended regions of a dimension, where no point can be identified as the most typical.

It is easy to prove that a Voronoi tessellation always results in a decomposition of the space into convex regions (Okabe et al. 2000). The tessellation provides a geometric answer to how a similarity measure together with a set of prototypes can determine a set of concepts. The partitioning results in a discretization of the space. Having a space partitioned into a finite number of regions means that a finite number of words can be used to refer to the regions. Psychological metrics are, however, imprecise and often context dependent. As a consequence, the borderlines will not be exactly determined (see Sect. 3.4).

The tessellation mechanism provides important clues to the cognitive economy of concept learning. If the categorization of each point in a space had to be memorized, this would put absurd demands on human memory. However, if the partitioning of a space into categories is based on a Voronoi tessellation, only the positions of the prototypes need to be remembered. Once you recall the positions of the prototypes, the rest of a categorization can be computed by using the metric of the space. In this way, the tessellation mechanism relieves memory.

3.2 Convexity Provides a Model of Generalization

Stimulus generalization is the ability to behave in a new situation in a way that has been learned in other, similar situations (Shepard, 1987). The problem is how to know which aspects of the learning situations should be generalized. On the conceptual level of representation, the stimulus is assumed to be categorized along a particular dimension or domain. The applicability of a generalization can then be seen as a function (for example, a Gaussian function) of the distance from a prototype stimulus, where the distances are determined with the aid of an underlying conceptual space.

Figure 3 depicts three levels of stimulus generalization. Such levels can be explained in terms of distances between prototypes in a Voronoi tessellation. If the distances are large, then generalization will be high, while if the distances are small, the border to the next category will be close and, consequently, generalization will be low. The curves in Fig. 3 indicate that the closer a stimulus is to the prototype, the stronger the response will be. The upshot is that as soon as prototypes and a distance function are available in a conceptual space, the generalization mechanisms fall out very naturally.

Fig. 3 Three forms of stimulus generalization. Left: No discrimination along a stimulus dimension yields high generalization. Middle: Low discrimination yields some generalization. Right: High discrimination yields low generalization
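The relation between prototype distances and generalization can be sketched numerically. The following toy model (the particular Gaussian form and the choice of width are my own illustrative assumptions) sets the width of the gradient by the distance to the nearest rival prototype:

```python
import math

def response_strength(stimulus, prototype, width):
    """Generalization gradient: the response falls off with the
    distance between stimulus and prototype (here a Gaussian decay,
    one of the functions mentioned in the text)."""
    d = abs(stimulus - prototype)
    return math.exp(-d * d / (2 * width * width))

# Width chosen as half the distance to the nearest rival prototype:
# widely spaced prototypes yield high generalization, closely spaced
# ones yield low generalization (the border is nearby).
def width_from_neighbours(prototype, others):
    return min(abs(prototype - o) for o in others) / 2

wide = width_from_neighbours(0.0, [10.0])    # rival far away -> width 5.0
narrow = width_from_neighbours(0.0, [2.0])   # rival close -> width 1.0

print(response_strength(1.5, 0.0, wide))    # high, ~0.956
print(response_strength(1.5, 0.0, narrow))  # low, ~0.325
```

The same stimulus thus evokes a strong response when the prototypes are far apart and a weak one when they are close, matching the three panels of Fig. 3.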

3.3 Convexity Makes Learning more Efficient

We are not born with our concepts; they must be learned. I must therefore account for how the relevant regions of a conceptual space are created from the experience of the agent. To be useful, the concepts must not only be applicable to known cases but should generalize to new situations as well.

Learning concepts can be more or less successful. When a particular perception is sorted under a concept, this may be a mis-categorization. For example, you may be out in the forest thinking that you are picking chanterelles, but when you come home and fry the mushrooms, you discover that they are tasteless. You have picked false chanterelles (Hygrophoropsis aurantiaca), which are quite similar to real chanterelles (Cantharellus cibarius). Later, you learn that false chanterelles are more orange and have different gills than the real ones, and the next time you are in the forest, your two concepts help you separate the two species.

Learning a concept often proceeds by generalizing from a limited number of exemplars of the concept (e.g., Nosofsky, 1986, 1988, Langley, 1996). Adopting the idea that concepts have prototypes, we can assume that a typical instance of the concept is extracted from these exemplars. If the exemplars are described as points in a conceptual space, a simple rule that can be employed for calculating the prototype from a class of exemplars is that the position of the point representing the prototype is defined to be the mean of the positions for all the exemplars (Langley, 1996, p. 99). The prototypes defined in this way can then be used to generate a Voronoi tessellation. Applying this rule means that a prototype is not assumed to be given a priori in any way but is completely determined by the experience of the subject. Figure 4 shows how a set of nine exemplars (represented as differently filled circles), grouped into three categories, generates three prototypical points by calculating means (represented as black Xs) in the space. These prototypes then determine a Voronoi tessellation of the space.

Fig. 4 Voronoi tessellation generated by three classes of exemplars. The Xs represent the inferred prototypes
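The mean-based rule is simple to state in code. Here is a minimal sketch, with hypothetical exemplar coordinates loosely mirroring the three categories of Fig. 4:

```python
def prototype_of(exemplars):
    """Langley-style rule: the prototype is the coordinate-wise mean
    of the exemplars of a category."""
    n = len(exemplars)
    dims = len(exemplars[0])
    return tuple(sum(e[d] for e in exemplars) / n for d in range(dims))

# Three hypothetical categories with three exemplars each.
categories = {
    "A": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    "B": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)],
    "C": [(0.0, 5.0), (1.0, 5.0), (0.0, 6.0)],
}
prototypes = {name: prototype_of(exs) for name, exs in categories.items()}
print(prototypes["A"])  # (0.333..., 0.333...)
```

The three computed means can then serve as the generator points of a Voronoi tessellation, as described in Sect. 3.1.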

The mechanism illustrated here shows how, once the structures of the domains are established, the application of concepts can be generalized on the basis of only a few examples of each concept. The additional information that is required for the generalization is extracted from the geometric structure of the underlying conceptual space. In this way, conceptual spaces add information to what is given by experience.

Furthermore, the concepts generated by such a categorization mechanism are dynamic in the sense that when the agent observes a new item in a category, the prototype for that category will, in general, change somewhat, since the mean of the class of examples will normally change. Figure 5 shows how the categorization in Fig. 4 is changed after learning about one new exemplar, marked by an arrow, belonging to one of the categories. This addition shifts the prototype of that category, which is defined as the mean of the exemplars, and consequently the Voronoi tessellation is changed. The old tessellation is marked by hatched lines, and the old prototype is marked by a grey X. Categories bordering on the changed prototype will then also change their meaning, but categories that have no common border will remain the same. Therefore, meaning change that is caused by learning will be local.

Fig. 5 The Voronoi tessellation in Fig. 4 after learning a new exemplar
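The dynamic character of the mean-based rule can be made explicit with the standard online formula for a running mean (the numbers below are illustrative): observing a new exemplar shifts only the prototype of its own category.

```python
def update_prototype(prototype, count, new_exemplar):
    """Online update of a mean-based prototype after one new
    exemplar: new_mean = old_mean + (x - old_mean) / (count + 1)."""
    n = count + 1
    return tuple(p + (x - p) / n for p, x in zip(prototype, new_exemplar)), n

proto, n = (1.0, 1.0), 3                    # a category of 3 exemplars
proto, n = update_prototype(proto, n, (3.0, 1.0))
print(proto)  # (1.5, 1.0): the prototype shifts toward the new exemplar
```

Since only one prototype moves, only the Voronoi cells bordering it are redrawn, which is the locality of meaning change noted above.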

A more sophisticated model of concept learning, which is grounded in how the hippocampus functions, is presented by Mok and Love (2023). Assuming an underlying space, they show how a neural network that receives input about the locations of exemplars from different categories can generate the locations of appropriate prototypes for the categories. Like the model above, the process is dynamic so that the locations of the prototypes will change, depending on the exemplars that are presented.

3.4 Convexity Explains Vagueness

Natural language is replete with vague terms. Vagueness is, however, not a bug but a design feature of natural language. I will argue that there are good reasons related to cognitive economy why language contains vague terms.

First of all, note that there is no conflict between vagueness and the requirement that concepts be represented by convex regions. What Convexity requires in relation to vague concepts is that if two object locations x1 and x2 both satisfy a certain membership criterion, then all objects between x1 and x2 also satisfy the criterion (Gärdenfors 2000).

Fig. 6 Multiple prototypes generate vague borders in Voronoi tessellations

To a large extent, the vagueness of concepts is a result of the fact that we learn concepts by examples and counterexamples. The model of concept formation based on Voronoi tessellations that was presented in the previous sub-section provides good clues to the mechanisms of vagueness. A first clue is that if prototypes are learned from the examples that have been encountered, the location of the prototype will be expected to move over time. Consequently, the cognitive representation of how a prototype of a concept is located in a conceptual space may not be very precise (Douven et al. 2013). Figure 6 contains two closely located prototypes (marked by dots) for each of four concepts. They generate four different dividing lines between each pair of concepts. More generally, a probability distribution of the location of a prototype would, by the same mechanism, generate a distribution of dividing lines. For any point in the space, such a distribution can then be used to determine the degree of membership in a category. In this way, vagueness phenomena arise naturally.
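This mechanism can be simulated directly: sample prototype locations from a distribution, categorize by the nearest sampled prototype each time, and read off degrees of membership as classification frequencies. The sketch below (a toy one-dimensional setup of my own) does this for two hypothetical temperature concepts:

```python
import random

def membership_degree(point, proto_means, spread, trials=10_000, seed=0):
    """Degree of membership of a 1-D point in each concept, obtained
    by jittering the prototype locations (Gaussian noise with the
    given spread) and classifying by the nearest sampled prototype."""
    rng = random.Random(seed)
    counts = {name: 0 for name in proto_means}
    for _ in range(trials):
        sampled = {name: rng.gauss(m, spread) for name, m in proto_means.items()}
        winner = min(sampled, key=lambda name: abs(point - sampled[name]))
        counts[winner] += 1
    return {name: c / trials for name, c in counts.items()}

# A point near the border is a borderline case; one near a prototype is clear.
degrees_border = membership_degree(5.0, {"warm": 3.0, "hot": 7.0}, spread=1.0)
degrees_clear = membership_degree(3.2, {"warm": 3.0, "hot": 7.0}, spread=1.0)
print(degrees_border)  # roughly half 'warm', half 'hot'
print(degrees_clear)   # heavily favours 'warm'
```

The graded membership values that emerge for border points are exactly the vagueness phenomena described in the text.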

Another clue to vagueness is that the relative weights of the dimensions in a domain are not precisely determined. For one thing, the weights often vary with the context. For example, depending on the relative weights of the two dimensions depicted in Fig. 7, the slope of the line dividing the space between the two prototypes (marked by dots) will vary. Again, a probability distribution over the weights will generate a distribution of dividing lines.

Fig. 7 Changes in dimensional weights generate vague borders in Voronoi tessellations

The upshot is that cognitive limitations concerning the locations of prototypes and the relative weights of dimensions explain why concepts in general are vague.

4 Further Criteria for Naturalness

The previous section presented arguments for why Convexity is a valuable criterion for identifying natural concepts. In this section, I present some further criteria for natural concepts. The following are proposed and discussed by Douven and Gärdenfors (2020):

Parsimony: The conceptual structure should not overload the system’s memory.

This criterion is obviously connected to the cognitive economy of concept representations. A Voronoi tessellation satisfies Parsimony in that an individual only has to remember the locations of the prototypes to be able to construct the tessellation, from which she can retrieve the concept under which any given item in the space falls.

Informativeness: The concepts should be informative, meaning that they should jointly offer good and roughly equal coverage of the domain of classification cases.

To some extent this criterion clashes with Parsimony, since the more concepts there are covering a conceptual space, the more informative they are. So, the problem is to strike the right balance between Parsimony and Informativeness. In relation to the vertical dimension of prototype theory, it has been noticed that the step from a superordinate level to the basic level typically involves a large increase in Informativeness, while a step from the basic level to a subordinate level typically does not markedly increase the information content of the concepts. This can be seen as an argument for selecting the basic level as the most efficient level of concept representation (see also Sect. 5). This is supported by the fact that children tend to learn words for concepts on the basic level earlier than words for other levels.

Representation: The conceptual structure should be such that it allows the system to choose for each concept a prototype that is a good representative of all items falling under the concept.

Contrast: The conceptual structure should be such that prototypes of different concepts can be so chosen that they are easy to tell apart.

If one considers the two Voronoi tessellations in Fig. 2, it is clear that the left one satisfies Representation and Contrast better than the right one. A partitioning of a space in which the prototypes are more or less equidistant is optimal according to these criteria.

Learnability: The conceptual structure should be learnable, ideally from a small number of examples.

I have already shown how the convex regions generated by Voronoi tessellations will help speed up the learning of concepts.

Well-formedness: The concepts should be “well-formed” in that the items falling under any one of them are maximally similar to each other and maximally dissimilar to the items falling under the other concepts represented in the same space.

This criterion can be thought of as flowing directly from Parsimony and Informativeness (Regier et al., 2007, 2015) and as being motivated by the same considerations of constrained optimization that underlie design criteria generally: One will be less prone to misclassify two items falling under the same concept as falling under different concepts if these items are always very similar to each other, and one will also be less prone to misclassify two items falling under different concepts as falling under the same concept if these items are always very dissimilar.

Regier et al. (2007) present a formalization of Well-formedness. Let the variables x and y range over possible objects representable in a conceptual space S, and let P be a categorization of all possible such objects, meaning that P(x) assigns x to one of a number of mutually exclusive and jointly exhaustive regions of S. Furthermore, let sim be a similarity measure defined on S. Then Regier et al. define the “within-similarity” of P as

$$S(P) = \sum_{x,y:P(x) = P(y)} sim(x,y),$$

where 0 ≤ sim(x, y) ≤ 1. They further define the “across-dissimilarity” of P as

$$D(P) = \sum_{x,y:P(x) \ne P(y)} \left(1 - sim(x,y)\right).$$

The well-formedness W(P) of a categorization P is then defined as the sum of S(P) and D(P). Although Regier et al. (2007) propose these definitions in the context of colour categorization, the definitions apply generally to similarity spaces, thus providing a quantitative version of Well-formedness. On this version, concepts should be such that their combination—the category system of the space in which they live—maximizes W.Footnote 3
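Given these definitions, W(P) can be computed directly for a toy example. The sketch below (my own illustration, summing over unordered pairs for brevity and using a hypothetical exponential-decay similarity on a one-dimensional space) shows that a cut that groups similar points scores higher than one that splits them:

```python
import math
from itertools import combinations

def well_formedness(points, category, sim):
    """W(P) = S(P) + D(P): summed within-category similarity plus
    summed across-category dissimilarity (after Regier et al. 2007)."""
    S = sum(sim(x, y) for x, y in combinations(points, 2)
            if category(x) == category(y))
    D = sum(1 - sim(x, y) for x, y in combinations(points, 2)
            if category(x) != category(y))
    return S + D

sim = lambda x, y: math.exp(-abs(x - y))    # similarity bounded in (0, 1]
points = [0.0, 0.5, 4.0, 4.5]

# A well-formed cut groups the two nearby pairs ...
good = lambda x: x < 2.0
# ... a poorly formed cut splits one of them.
bad = lambda x: x < 0.2

print(well_formedness(points, good, sim) > well_formedness(points, bad, sim))  # True
```

Maximizing W over possible categorizations of the space thus favours exactly the compact, well-separated categories the criterion describes.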

5 Coherence

Rosch et al. (1976) famously showed that the members of basic-level categories tend to have many features in common: many more than the features shared by the members of their superordinate categories and almost as many as the features shared by the members of their subordinate categories. They concluded that basic-level categories maximize within-category similarity while minimizing inter-category similarity. This differentiates them from the rest of the categories in the taxonomic hierarchy. This feature led Rosch and her collaborators to claim that basic-level categories are highly coherent, and that this is why they are preferred in induction and categorization.

Cognitive psychologists often assume that our concepts have different degrees of coherence and that this affects their role in inductive inference and categorization. For this reason, coherence is relevant as a criterion for natural concepts. However, what psychologists mean by coherence is often unclear, as they usually do not define this notion.Footnote 4 In brief, a measure of coherence depends on having a measure of the covariational structure of a category, something that is far from simple.

Osta-Vélez and Gärdenfors (2023) propose an explication of the notion of conceptual coherence based on the theory of conceptual spaces. In coherent categories one finds clusters of properties that ‘hang together’, that is, they covary. If a concept is represented in a conceptual space, its instances will form a set of points in the space. This set can be more or less organized. However, for any such data set, one can identify its principal components (Jolliffe, 2002). The first component is defined as the line in the space such that, if the data set is projected onto that line (it can be seen as a new dimension), as much as possible of the variance of the data is retained. In other words, the principal component is the dimension that yields the best ‘explanation’ of the data.Footnote 5 The main hypothesis in Osta-Vélez and Gärdenfors (2023) is that the first principal component (or the first few components) of a concept can be used to characterize its coherence. As a measure of coherence, they propose the proportion of the variance that is explained by the first principal component.
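A two-dimensional toy version of this measure can be computed by hand from the covariance matrix. In the sketch below (my own illustration of the proposal, not the authors' implementation), a concept whose two dimensions strongly covary gets a coherence value near 1, while one with uncorrelated dimensions gets a value near 0.5:

```python
import math

def coherence_2d(points):
    """Proportion of total variance explained by the first principal
    component of a 2-D point cloud (toy version of the coherence
    measure proposed by Osta-Velez and Gardenfors 2023)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Largest eigenvalue of the covariance matrix [[sxx, sxy], [sxy, syy]],
    # divided by the total variance (the trace).
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    lam1 = tr / 2 + math.sqrt(tr * tr / 4 - det)
    return lam1 / tr

# Strongly covarying dimensions (points near a line): high coherence.
birdlike = [(i, 2 * i + 0.1 * (-1) ** i) for i in range(8)]
# Uncorrelated dimensions: coherence near 0.5.
chairlike = [(0, 0), (0, 2), (2, 0), (2, 2)]

print(coherence_2d(birdlike))   # close to 1
print(coherence_2d(chairlike))  # 0.5
```

The hypothetical 'birdlike' and 'chairlike' point clouds mimic the contrast drawn below between natural categories with covarying domains and artifact categories without them.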

For a concept with several salient domains, the coherence value will in general be higher than for a concept with few salient domains. This explains why concepts for natural categories such as ‘bird’ will have high coherence values, since they have several salient domains that covary, while concepts for artifacts such as ‘chair’ or ‘clock’ will have low coherence values. For artifacts, physical properties are often non-salient; it is mainly properties related to function that matter.

This measure of coherence explains why basic level concepts are more coherent than superordinate concepts. The reason is that the basic level brings in many more correlations between properties than are found on the superordinate level. Therefore, the first principal component of a basic level concept will be stronger than that of superordinates. This argument also explains the phenomenon of the “preferred level of induction” (Sloman & Lagnado, 2005, p. 106), that is, the tendency to favor basic-level categories (such as chair or dog) over abstract ones (like furniture or mammal) during categorization and inference. It should be noted that the argument again brings out the importance of cognitive economy in concept formation.

Another method that is commonly used to reduce the dimensionality of a data set is multidimensional scaling (Kruskal, 1964). Both methods seek to retain as much as possible of the original distances between points. There are, however, significant differences between multidimensional scaling and principal component analysis. Principal component analysis builds on a correlation matrix, while multidimensional scaling typically starts with inter-item distances (or correlations). Multidimensional scaling is based on distances between points, while principal component analysis uses angles among vectors. Multidimensional scaling often results in a lower-dimensional solution than principal component analysis. A drawback of multidimensional scaling, however, is that it cannot handle large data sets in an efficient way.

A third method to reduce dimensionality is Kohonen’s (1988) self-organizing maps. Such a map is derived from an artificial neural network consisting of an input vector that is connected to an output array of neurons. In most applications, this array is one- or two-dimensional, but in principle it could be of any dimensionality. The essential property of the network is that the connections between the neurons in the array and the learning function are organized in such a way that similarities that occur among different input vectors are, in general, preserved in the mapping. The mapping from the input vector to the array thereby preserves most topological relations. Since dimensionality is reduced, this entails that regions of the high-dimensional space are mapped onto points in the low-dimensional space. This mapping can be seen as a form of generalization. The low-dimensional “feature map” that results as an output of the process can be identified with a conceptual space. The mapping is generated by the network itself via its learning mechanism.
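To give a flavour of the mechanism, here is a bare-bones one-dimensional sketch of Kohonen-style learning (a toy of my own with made-up parameters, omitting the refinements of real self-organizing maps): each input moves the winning unit and its array neighbours toward it, with a learning rate and neighbourhood radius that shrink over time.

```python
import random

def train_som(data, n_units, epochs=200, seed=0):
    """Minimal 1-D self-organizing map over 1-D data. Each input
    pulls the best-matching unit and its array neighbours toward it;
    the learning rate and neighbourhood radius decay over epochs."""
    rng = random.Random(seed)
    weights = [rng.uniform(min(data), max(data)) for _ in range(n_units)]
    for t in range(epochs):
        lr = 0.5 * (1 - t / epochs)                      # decaying rate
        radius = max(1, int(n_units / 2 * (1 - t / epochs)))
        for x in data:
            best = min(range(n_units), key=lambda i: abs(weights[i] - x))
            for i in range(n_units):
                if abs(i - best) <= radius:              # neighbourhood update
                    weights[i] += lr * (x - weights[i])
    return weights

# Inputs drawn from two clusters; the units' weights settle within the
# range of the inputs, compressing the input space onto the small array.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
weights = train_som(data, n_units=4)
print(sorted(weights))
```

Real self-organizing maps use smoother neighbourhood kernels and randomized input order, but the dimension-reducing character of the mapping is already visible in this sketch.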

6 Natural Concepts in Communication

So far, the design principles have been discussed and motivated by the use of concepts for solving problems of categorization. In this section, I turn to natural concepts as tools for communication. This involves taking a social, rather than an individual, perspective on what characterizes natural concepts.

A fundamental criterion for a language is that different users mean, more or less, the same thing when they use a word. When people learn to speak a language, a “meeting of minds” must be achieved so that the concepts of a speaker are aligned, via the words of the language, with the concepts of other speakers (Jäger, 2007; Warglien and Gärdenfors 2013).

An example of how such a correlation can be realized is presented by Jäger and van Rooij (2007), who use computer simulations to show how semantic fixed points (in the form of Nash equilibria) can represent a meeting of minds. They refer to the domain they choose—a circular disk—as “the colour space,” but there is nothing in the process that depends on relations to colours. The problem they examine is how a common meaning for “colour” terms can develop in a communication game. In their example, there are only two players: s (sender) and r (receiver). Jäger and van Rooij assume that the two players have a common space C for “colour.” There is a fixed and finite set of n messages (“words”) that the sender can convey to the receiver.

The communication game unfolds as follows: “Nature” randomly chooses some point in the colour space. The sender s knows the choice of Nature, but the receiver r does not. Then s is allowed to send one of the messages to r. In response, the receiver r selects a point in the colour space. The rewards given to s and r in the game depend on the distance between the point chosen by Nature and the point selected by r. The players maximize their rewards if they maximize the similarity (minimize the distance) between Nature’s choice and r’s choice of point. The sender can choose a partitioning S of C into n subsets, assigning to each subset a unique message. For each “colour word” sent, there is a prototypical point in the corresponding region that is r’s best response. There are thus n prototype points, corresponding to the typical meanings assigned by r to each of the n possible messages from s.

Following the standard definition in game theory, a Nash equilibrium of the game is a pair (S, R), where S is the sender’s partitioning of C into n subsets and R is the receiver’s n-tuple of prototype points in C, such that each is a best response to the other. The central result of Jäger and van Rooij (2007) is that if the colour space is convex and compact and the similarity function is continuous, then a Nash equilibrium exists, and it corresponds to a Voronoi tessellation of the colour space that is common to s and r. Their solution is guaranteed to satisfy both Convexity and Representation, and it also satisfies Parsimony (though this is built in), Informativeness, and Contrast. In addition, their implementation can be seen as satisfying Well-formedness: because the prototypes minimize the average distance to the other points in their cells, they are evenly distributed in the space, and consequently items falling under one prototype are maximally dissimilar to items falling under another.
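The connection between mutual best responses and Voronoi tessellations can be illustrated computationally. If the receiver’s loss is squared distance (a simplifying assumption of this sketch), the receiver’s best response to each cell is its centroid, and alternating the two players’ best responses amounts to Lloyd’s algorithm, whose stable configurations are centroidal Voronoi tessellations of the kind the equilibrium result describes:

```python
import math
import random

random.seed(1)

# Sample points approximating a uniform distribution over the unit disk.
points = []
while len(points) < 2000:
    x, y = random.uniform(-1, 1), random.uniform(-1, 1)
    if x * x + y * y <= 1:
        points.append((x, y))

def best_responses(prototypes, points, steps=50):
    """Alternate the players' best responses. Sender: partition the points
    by nearest prototype (a Voronoi tessellation). Receiver: move each
    prototype to the centroid of its cell (optimal under squared distance).
    This dynamic is Lloyd's algorithm; its stable configurations are
    centroidal Voronoi tessellations."""
    for _ in range(steps):
        cells = [[] for _ in prototypes]
        for p in points:
            nearest = min(range(len(prototypes)),
                          key=lambda k: math.dist(p, prototypes[k]))
            cells[nearest].append(p)
        prototypes = [
            (sum(px for px, _ in cell) / len(cell),
             sum(py for _, py in cell) / len(cell)) if cell else prototypes[i]
            for i, cell in enumerate(cells)
        ]
    return prototypes

protos = best_responses([(0.1, 0.1), (0.2, -0.1), (-0.1, 0.2)], points)
```

Because the disk is convex, each centroid stays inside it, so the resulting prototypes are always well-defined points of the space.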

In a theoretical analysis, Warglien and Gärdenfors (2013) have generalized Jäger and van Rooij’s result by showing how some topological and geometric properties of mental representations make meetings of minds possible. They assume Convexity, but they assume neither that the spaces of the communicating individuals are identical, nor that they partition the spaces in the same way. Their analysis builds on the following additional constraint:

Continuity Language should preserve the nearness relations among points in conceptual spaces.

In conceptual spaces, “near to” means “similar to.” Thus, this constraint means that the more two entities (objects, properties, actions, etc.) are judged to be similar by the members of the linguistic community, the likelier it is that all members of the community use the same word to categorize the entities. Warglien and Gärdenfors (2013) present the following result (which follows from Brouwer’s fixed-point theorem):

Semantic Fixed-Point Theorem

Every semantic function (mapping speakers’ spaces onto each other) that is a continuous mapping of a convex compact set into itself has at least one fixed point.

In other words, a fixed point determines the prototype for each word in both the space of s and the space of r (the spaces need not be identical), and via a Voronoi tessellation it also determines the partitioning of the spaces. Consistency of word meanings is thereby ensured. The theorem provides yet another reason why Convexity is an appropriate condition for natural concepts.
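Brouwer’s theorem only guarantees that a fixed point exists; it does not say how to find one. For the special case of a contraction mapping, however, simple iteration converges to it (this is Banach’s theorem, which assumes more than continuity). A minimal numerical illustration on the unit disk, with a hypothetical map chosen purely for the example:

```python
def f(p):
    """A continuous map of the closed unit disk into itself. It is also a
    contraction (factor 0.5), so iteration converges to its fixed point;
    Brouwer's theorem guarantees existence but not this method."""
    x, y = p
    return (0.5 * x + 0.25, 0.5 * y - 0.1)

# Iterate from an arbitrary starting point.
p = (0.0, 0.0)
for _ in range(100):
    p = f(p)

# The fixed point solves x = 0.5x + 0.25 and y = 0.5y - 0.1,
# i.e. (x, y) = (0.5, -0.2), which lies inside the disk.
```

In the semantic setting, such a fixed point is a word meaning that is stable under the mapping between the speakers’ spaces.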

Neither the results of the simulations by Jäger and van Rooij (2007) nor the theoretical analysis by Warglien and Gärdenfors (2013) guarantee that there is a unique best partitioning of any given similarity space. This means that there exist many “languages” that contain the same information about a conceptual domain.

Another result involving the role of convexity in language coordination is presented by Gierasimczuk et al. (2023). Their focus is on quantity expressions. Like Jäger and van Rooij (2007), they use a simulation of a guessing game, using different “words” to express quantities. The result of the game simulations is that the emerging quantity meanings satisfy not only Convexity but also Monotonicity. The latter condition divides into upward monotonicity, meaning that if a certain quantity falls under the meaning of a word, then all larger quantities also belong to the meaning; and downward monotonicity, meaning that if a certain quantity falls under the meaning of a word, then all smaller quantities also belong to the meaning.
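Over a one-dimensional quantity scale these conditions are easy to state operationally. The following sketch, which models word meanings as sets of integers (an assumption made for illustration only), checks Convexity and the two monotonicity conditions:

```python
def is_convex(meaning):
    """On a one-dimensional quantity scale, a convex meaning is an interval:
    every quantity between two members is itself a member."""
    s = sorted(meaning)
    return all(b - a == 1 for a, b in zip(s, s[1:]))

def is_upward_monotone(meaning, scale):
    """If a quantity falls under the word, so does every larger quantity."""
    return all(q in meaning for q in scale if meaning and q >= min(meaning))

def is_downward_monotone(meaning, scale):
    """If a quantity falls under the word, so does every smaller quantity."""
    return all(q in meaning for q in scale if meaning and q <= max(meaning))

scale = range(1, 11)
at_least_seven = {7, 8, 9, 10}      # e.g. "many" on a ten-point scale
between_three_and_five = {3, 4, 5}  # e.g. "a few"

assert is_convex(at_least_seven) and is_upward_monotone(at_least_seven, scale)
assert is_convex(between_three_and_five)
assert not is_upward_monotone(between_three_and_five, scale)
```

Note that monotone meanings are automatically convex, but not conversely: “a few” above is convex yet neither upward nor downward monotone.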

Further support for the role that design criteria play in structuring similarity spaces comes from a number of experimental results concerning language transmission. In a large-scale laboratory experiment, Xu et al. (2013) showed subjects examples of how colours of Munsell chips were named, and the subjects then classified other colours on the basis of the examples. These subjects’ responses were used to generate examples for the subjects of the next “generation” of learners. This process continued for thirteen generations. The results reveal that colour classifications converge quickly toward colour systems similar to those found across human languages.

Xu and colleagues showed that the final partitionings have the same “variation of information” (in the sense of Meilă 2007) as languages from the World Color Survey with the same number of colour terms. The resulting partitionings also fulfil several of the design criteria that have been presented here. For instance, while Xu and colleagues did not consider Convexity, the colour partitionings of the five learning chains they present in their Fig. 2 (Xu et al., 2013, p. 4) show clear signs of convexity already after four generations of learning. In a related experiment involving ten generations of learners, Carstensen et al. (2015) obtained similar results concerning spatial relations based on the Topological Relations Picture Series (Bowerman & Pederson, 1992). Carstensen et al. also showed that the partitionings become increasingly informative over the generations, where informativeness was measured as in Regier et al. (2007). And re-analysing Xu et al.’s results, Carstensen et al. (2015) found the same increasing informativeness there, indicating that those results satisfy Well-formedness.
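Variation of information, the measure used in this comparison, can be computed directly from two partitionings of the same set of items. A minimal Python implementation of Meilă’s definition, VI = H(A) + H(B) - 2I(A; B), where the partitionings are given as parallel lists of labels:

```python
import math
from collections import Counter

def entropy(counts, n):
    """Shannon entropy (in nats) of a partitioning given its cell sizes."""
    return -sum(c / n * math.log(c / n) for c in counts.values())

def variation_of_information(labels_a, labels_b):
    """Variation of information (Meilă 2007) between two partitionings of
    the same items: VI = H(A) + H(B) - 2*I(A; B). It is a metric on the
    space of partitionings."""
    n = len(labels_a)
    pa, pb = Counter(labels_a), Counter(labels_b)
    pab = Counter(zip(labels_a, labels_b))
    mutual_info = sum(
        c / n * math.log((c / n) / ((pa[a] / n) * (pb[b] / n)))
        for (a, b), c in pab.items()
    )
    return entropy(pa, n) + entropy(pb, n) - 2 * mutual_info

# Identical partitionings are at distance 0; "crossed" ones are far apart.
vi_same = variation_of_information([0, 0, 1, 1], [0, 0, 1, 1])
vi_crossed = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Because VI is a metric, saying that the experimentally evolved colour systems have the same variation of information as attested languages is a precise claim about their distance in partition space.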

The conclusion to be drawn from the simulations, theoretical analysis, and experimental evidence that have been presented here is that the criteria for natural concepts are also instrumental in facilitating efficient communication of concepts.

7 Conclusion

This article has presented a cognitive approach to the problem of how the naturalness of concepts can be determined. My arguments are based on the assumption that naturalness is connected to cognitive economy and efficiency of communication. I have used the theory of conceptual spaces as a framework for proposing a number of criteria. I have argued that requiring that concepts be represented as convex regions in such spaces yields several desirable features of naturalness. I have also presented some further criteria from Douven and Gärdenfors (2020). A new criterion is that of coherence, which merits further discussion in relation to naturalness.

The criteria have been proposed as tools for analysing existing systems of concepts. However, they can also be used constructively in the design of artificial systems. For example, imagine that some robots are dropped on a distant planet. How could they create common concepts so that they can communicate and work together on the planet? Constructing a system that achieves this is an example of a task for a conceptual engineer. Conceptual engineering is also required in interactions between humans and artificial systems. In Gärdenfors (to appear), I have argued that conceptual spaces form an appropriate framework for computational implementations that satisfy the criteria discussed in this article. An interesting example of such work is Tětková et al. (2023), who show that Convexity is a useful criterion in artificial neural networks involving deep learning.