Introduction

The empirical study of concepts has a long history, beginning with the introspective analyses of Moore (1910) and Fisher (1916), and soon followed by the analytical methodology of Hull (1920). Hull’s paradigm, in which variables were introduced in a learning phase followed by a transfer test, has since become the paradigm of choice. For the purposes of the present study, the interesting aspect of Hull’s paradigm, and of virtually all studies since, is that the learning set is typically recycled until some learning criterion is met, such as errorless performance or the completion of a predetermined number of learning blocks.

This paradigm has provided important insights into the formation and representation of concepts, including the identification of learning variables that shape concepts (e.g., Homa, 1984; Wills & Pothos, 2012) as well as the development of formal, quantitative models of classification and category learning (e.g., Busemeyer & Diederich, 2009; Hintzman, 1986; Nosofsky, 1988; Minda and Smith, 2001). Nonetheless, when learning many real-world categories, for example birds or faces or cars, only a tiny portion of the instances are ever exactly repeated. Even individual exemplars vary subtly across time (e.g., faces appear different depending on view angle or expression) or different instances of a category have individuating characteristics (e.g., every red Toyota Prius differs in license plate, cleanliness, and decoration). The exact repetition of instances is more likely to fall within the domain of formalized training or instruction. The category learning paradigm, in which exact instances are repeated throughout the learning phase, gives little insight into what kind of performance to expect when learning non-repeating stimulus sets.

A handful of studies have explored category learning when instances do not repeat in the training phase. Medin, Dewey, and Murphy (1983) compared the acquisition of two face categories where the faces were either repeated on each trial block or not. Only the terminal levels of learning were reported, and subjects in the non-repeat condition found the task quite difficult (only 28% of the subjects reached errorless criterion). Learning was probably slowed because the dimensions relevant for classification – hair and shirt color, hair length, and whether a smile was present or not – were unrelated to natural grouping based on facial properties, so that classification was determined solely by values on facially irrelevant dimensions. Ashby and his colleagues (e.g., Ashby & Gott, 1988; Ashby & Maddox, 1990, 1992; Casale, Roeder, & Ashby, 2012) have explored categories in which patterns are randomly sampled from a bivariate normal distribution, thereby generating a virtually limitless number of exemplars in learning. However, their task differs from the experiments reported here in a number of ways: (1) subjects typically learned two categories; (2) stimuli varied along two readily-identifiable dimensions; and (3) learning and transfer were not contrasted for categories composed of repeated patterns versus categories whose patterns never repeat.Footnote 1

A different approach was taken by Knowlton and Squire (1993) and Reed, Squire, Patalano, Smith, and Jonides (1999), who presented 40 patterns one time each from a single category to normal subjects and amnesic patients. Their major focus was whether results based on subsequent classification and recognition could be explained by single- or multiple-memory systems. Disputes over the interpretation of the transfer results were subsequently raised by Nosofsky and Zaki (1998) and Zaki and Nosofsky (2001). However, the utility of the single category paradigm was questioned by Palmeri and Flanery (1999), who demonstrated that subsequent classification on the transfer test with this paradigm could be explained by factors unrelated to learning. More recently, Homa, Hout, Milikan and Milikan (2011) found that minimal category knowledge was acquired following the observational phase of a single category. Because degree of category knowledge is enhanced by training on more, rather than fewer, categories (Homa & Chambliss, 1975), presumably because the additional categories in the learning set provide the subject with information about distinctive category cues, it is doubtful that subjects learn much about a category from exposure to the members of a single category.

In the present study, we directly contrasted the learning of multiple categories where the learning patterns were either repeated or not on each trial block. For subjects in the non-repeating condition, the learning patterns were replaced on each trial block with novel patterns of the same level of distortion from the same prototypes. Subsequent transfer included either recognition or classification tests. Our initial expectation was that categories could be learned when training patterns never repeated, but that this procedure would likely slow the rate and possibly the terminal level of learning, at least compared to the repetition condition.

Indeed, exemplar-based models of classification (e.g., Nosofsky, 1988; Nosofsky & Johansen, 2000; Shin & Nosofsky, 1992) clearly predict a faster rate of learning when patterns repeat versus a condition where the patterns never repeat (or repeat less frequently). This prediction occurs because classification, either in learning or transfer, is based on the summed evidence favoring a category, and similarity of a pattern to itself is greater than the similarity of a pattern to any other pattern of that category. This self-similarity ensures that the summed similarity to a category always favors more rapid learning when patterns repeat in learning.

To illustrate this, assume that the subject assigns a pattern into a particular learning category based on its similarity to the members of that category, relative to members of the contrasting categories. As is typically done, assume that similarity is a monotonic function of its distance to other category instances in psychological space, where the psychological space is derived via multidimensional scaling (e.g., Homa, Dunbar, & Nohre, 1991; Nosofsky & Zaki, 1998; Shin & Nosofsky, 1992). Let the similarity between patterns i and j be defined as:

$$ s\left(i,j\right)=\exp \left(-c{d}_{ij}\right) $$
(1)

where d_ij is the distance between items i and j. The parameter c functions as a scaling (sensitivity) parameter that reflects how well the category members are discriminated from each other. To formalize the learning algorithm used in the present study, first consider how learning was predicted by Nosofsky and Zaki (1998). The probability that pattern i is classified into category A rather than category B is given by:

$$ P\left(A|i\right)=\frac{{\left[\sum s\left(i,a\right)+\beta \right]}^{\gamma }}{{\left[\sum s\left(i,a\right)+\beta \right]}^{\gamma }+{\left[\sum s\left(i,b\right)+\beta \right]}^{\gamma }} $$
(2)

The summed similarities of instance i to patterns in category A and B are represented by ∑s(i,a) and ∑s(i,b), respectively. The parameter β is background noise, and γ is a response-scaling parameter (e.g., the subject evaluates the evidence for each category and assigns the stimulus to the category based on probability matching when γ =1 and to the category more deterministically when γ exceeds 1). Because summed similarities cumulate across learning blocks, and because the background noise is constant, learning improves across trials. Logically, c should be greater when learning patterns repeat during learning, since repetition across learning blocks should make these patterns more discriminable relative to patterns that never repeat.
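Equations 1 and 2 can be sketched in a few lines of code. This is a minimal illustration rather than the fitting procedure used in the study: the function names are hypothetical, the value of β is an arbitrary assumption, and the distances 0.72 and 1.60 are borrowed from the MDS values used later in this section.

```python
import math

def similarity(distance, c):
    """Equation 1: similarity decays exponentially with psychological distance."""
    return math.exp(-c * distance)

def prob_category_a(dists_to_a, dists_to_b, c, beta, gamma):
    """Equation 2: probability of assigning pattern i to category A rather than B."""
    evidence_a = (sum(similarity(d, c) for d in dists_to_a) + beta) ** gamma
    evidence_b = (sum(similarity(d, c) for d in dists_to_b) + beta) ** gamma
    return evidence_a / (evidence_a + evidence_b)

# Illustrative case: five stored A exemplars at distance 0.72, five B exemplars at 1.60.
# beta = 0.5 is an assumed value chosen only for the example.
p_matching = prob_category_a([0.72] * 5, [1.60] * 5, c=1.0, beta=0.5, gamma=1.0)
p_deterministic = prob_category_a([0.72] * 5, [1.60] * 5, c=1.0, beta=0.5, gamma=2.0)
```

With the same summed similarities, raising γ above 1 pushes the response probability further toward the favored category, which is the sense in which responding becomes more deterministic.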

This learning algorithm was modified in the present study to reflect the predictions of an exemplar model involving multiple categories (e.g., Homa, Powell, & Ferguson, 2014; Nosofsky & Johansen, 2000), where learning instances either repeated (REP) on each block or not (NREP). For the REP condition, each category was represented by five different patterns, and the subject was required to learn three categories (A, B, C). On each trial block, these 15 patterns were randomly presented, with subjects receiving either 20 (Experiment 1) or 15 (Experiments 2 and 3) trial blocks. When a pattern from set A is presented on a learning trial, this item has maximal similarity to itself, a within-category similarity to the remaining four patterns of that category, and a between-category similarity to the five patterns in each of the other two categories. A simplifying assumption that does not alter the predicted learning rate for the REP and NREP conditions is that the subject performs randomly on the initial learning block. Beginning with the second trial block and thereafter, classification is based on stored exemplar knowledge. Therefore, when a pattern is presented for learning on trial block B, all learning patterns have been encountered (B − 1) previous times. To generate specific learning functions, we substituted multidimensional distances (Kruskal, 1964; Shepard, 1962) from a study that approximated the categories and constraints used in the present study (Homa, Proulx, & Blair, 2008). The particular distances are less critical for the purposes of this illustration than the constraint that similarity decreases with distance in the multidimensional space. The MDS distance between two medium patterns from the same category was 0.72, and the distance between patterns belonging to different categories was 1.60. Equation 2 can be rewritten, beginning with trial block B = 2, as:

$$ P\left(A|i,B\right)=\frac{{\left[\left(B-1\right)\left({e}^{-0c}+4{e}^{-.72c}\right)+\beta \right]}^{\gamma }}{{\left[\left(B-1\right)\left({e}^{-0c}+4{e}^{-.72c}\right)+\beta \right]}^{\gamma }+{\left[\left(B-1\right)10{e}^{-1.6c}+\beta \right]}^{\gamma }}=\frac{{\left[\left(B-1\right)\left(1+4{e}^{-.72c}\right)+\beta \right]}^{\gamma }}{{\left[\left(B-1\right)\left(1+4{e}^{-.72c}\right)+\beta \right]}^{\gamma }+{\left[\left(B-1\right)10{e}^{-1.6c}+\beta \right]}^{\gamma }} $$
(3)

For the NREP condition, the number of different patterns supplants pattern repetition across learning blocks. The corresponding equation for the NREP condition is, therefore:

$$ P\left(A|i,B\right)=\frac{{\left[\left(B-1\right)5{e}^{-.72c}+\beta \right]}^{\gamma }}{{\left[\left(B-1\right)5{e}^{-.72c}+\beta \right]}^{\gamma }+{\left[\left(B-1\right)10{e}^{-1.6c}+\beta \right]}^{\gamma }} $$
(4)

The only difference between equations (3) and (4) is that exemplar similarity to itself occurs in the REP equation and not in the NREP equation, and the frequency of a pattern in the REP condition is swapped with an equivalent number of different patterns in the NREP condition, although the number of category exposures in the REP and NREP conditions is the same.
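A short worked comparison makes this difference explicit. As a sketch, let N_REP and N_NREP denote the bracketed within-category terms of Equations 3 and 4 (shorthand introduced here for illustration only, not notation from the model itself). With the same c and β in both conditions:

$$ {N}_{REP}-{N}_{NREP}=\left(B-1\right)\left(1+4{e}^{-.72c}\right)-\left(B-1\right)5{e}^{-.72c}=\left(B-1\right)\left(1-{e}^{-.72c}\right)>0\quad \mathrm{for}\ c>0,\ B\ge 2 $$

The between-category term, (B − 1)10e^(−1.6c) + β, is identical in the two equations, and a ratio of the form x^γ/(x^γ + k) increases with x for any positive γ and fixed k > 0. The larger REP term therefore yields a higher predicted P(A|i, B) on every block whenever the two conditions share the same parameter values.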

Figure 1 shows predictions for these two equations, based on variations of c and γ. Learning improves across trial blocks in each case, with the REP condition always exceeding the performance of the NREP version with comparable parameter values. The predicted difference between the REP and NREP conditions is probably underestimated, because c should be greater in the REP condition: pattern discriminability should be higher whenever patterns are repeated in learning.Footnote 2 Since learning rate increases with increasing values of c (e.g., the learning rate in the REP condition with c = 2, γ = 1 is greater than the learning rate with c = 1, γ = 1), differences between learning rates in the REP and NREP conditions should be greater as well.

Fig. 1

Predicted learning rates by exemplar model for repeating (R) and non-repeating (NR) conditions, based on variations of c and γ
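The kind of prediction plotted in Fig. 1 can be approximated with a brief sketch of Equations 3 and 4. The function names are hypothetical, and the value of β is an assumption, since the text does not report the value used for the figure. The distances 0.72 and 1.60 are the MDS values given above.

```python
import math

WITHIN, BETWEEN = 0.72, 1.60  # MDS distances reported in the text

def predicted_accuracy(block, c, gamma, beta, repeating):
    """Predicted P(A|i, B) on trial block `block` (>= 2), Equation 3 (REP) or 4 (NREP)."""
    n = block - 1  # number of earlier blocks contributing stored exemplars
    if repeating:
        # Self-similarity (distance 0) plus four other members of the same category.
        own = n * (1.0 + 4.0 * math.exp(-c * WITHIN))
    else:
        # Five different same-category patterns stored from each earlier block.
        own = n * 5.0 * math.exp(-c * WITHIN)
    other = n * 10.0 * math.exp(-c * BETWEEN)  # ten patterns from the two other categories
    a = (own + beta) ** gamma
    b = (other + beta) ** gamma
    return a / (a + b)

def learning_curve(n_blocks, c, gamma, beta, repeating):
    """Predicted accuracy for blocks 2..n_blocks (block 1 is assumed to be at chance)."""
    return [predicted_accuracy(b, c, gamma, beta, repeating)
            for b in range(2, n_blocks + 1)]

BETA = 1.0  # assumed background-noise value, for illustration only
rep_curve = learning_curve(20, c=1.0, gamma=1.0, beta=BETA, repeating=True)
nrep_curve = learning_curve(20, c=1.0, gamma=1.0, beta=BETA, repeating=False)
```

Under these assumed values both curves rise across blocks, and the REP curve lies above the NREP curve at every block, in line with the ordering described for Fig. 1.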

An alternative view is that category learning and subsequent classification of novel instances on a transfer test are based on similarity to a category prototype (Homa, Proulx, & Blair, 2008; Smith & Minda, 1998, 2002). The non-repeating condition has a far larger category size compared to the repeating condition, and increases in category size are known to enhance subsequent generalization (e.g., Homa, Goldman, Cornell, & Cross, 1973; Homa & Vosburgh, 1976). However, the benefits of category size have typically been demonstrated for subsequent transfer following learning, rather than learning itself. As a consequence, the learning rate predicted for the REP and NREP conditions, based on a prototype influence, is unclear. This is further complicated by evidence that a prototype influence emerges only later in learning and only following exposure to categories of large size (Homa, Dunbar, & Nohre, 1991). Since increasing category size produces substantial benefits on later transfer, subsequent transfer might be as good, if not better, in the non-repeating condition.

Recognition performance following learning in the REP and NREP conditions was also of interest. Because subsequent recognition of the training patterns should be poor in the non-repeating condition, classification of novel category members on a transfer test might be poorer in that condition as well.

By having subjects learn multiple categories, the interpretative problems attendant with single-category learning (Knowlton & Squire, 1993; Nosofsky & Zaki, 1998; Palmeri & Flanery, 1999) were avoided. Furthermore, this procedure allows us to objectively track learning across trial blocks, something impossible in the single-category paradigm. Recognition and classification were assessed following a common learning procedure. In the experiments by Knowlton and Squire (1993), Reed et al. (1999), Nosofsky and Zaki (1998), and Zaki and Nosofsky (2001), different learning tasks preceded classification and recognition transfer, and thus the data outcome – an apparent dissociation between recognition and classification – is compromised. What was demonstrated was that it is possible to find a task that produces equivalent classification but different recognition, not that classification transfer occurred in the absence of memorial traces (recognition).

Finally, the recognition transfer test for the non-repeating condition contained training patterns that had occurred on each of the previous training blocks. This has the advantage of determining the level of memory for training patterns that had been presented early, midway, or late in learning. A reasonable expectation is that whatever recognition memory exists for these patterns should be modulated by their placement in time during original learning.

We were especially interested in three performance issues and one major theoretical issue: (1) Can subjects readily learn categories with instances that never repeat in the training phase? (2) Do subjects exhibit non-zero recognition of these training patterns? (3) Does training that likely degrades memory for particular training instances also degrade subsequent transfer to novel instances from these categories? And (4) If the exemplar-based model of classification (Nosofsky, 1988) fails to predict the rate of learning when patterns repeat or not on each trial block, can an alternative model be shown to capture the results?

All experiments in the present study used multiple learning blocks, usually with three categories (although Experiment 2 also included a two-category paradigm). In Experiment 1, the transfer test involved classification of novel instances belonging to the learning categories. In Experiments 2 and 3, the transfer test involved a recognition test containing either old, new, and foil patterns (Experiment 2) or old, new, and prototype patterns (Experiment 3). Foil patterns were stimuli generated from prototypes that were different from those used in learning. The modification of the recognition test in Experiment 3 placed a severe constraint on exemplar knowledge, since all patterns were members of the learned categories but only a portion of these were old. The memorial integrity of training patterns as revealed in recognition is relevant to the evaluation of current models of categorization. Also, signal detection measures (d’ and β) can be calculated for the REP and NREP conditions in Experiments 2 and 3.

Experiment 1

Method

Subjects

The subjects were 58 undergraduates from an Introductory Psychology course at Arizona State University, 26 subjects in the Repetition condition (REP) and 32 in the Non-repetition condition (NREP).Footnote 3 Subjects were randomly assigned to the REP or NREP condition.

Materials and apparatus

The subject sat in a sound-dampened chamber about 20 in. from a 17-in. computer monitor. The patterns used in learning and transfer were the distorted forms used previously (e.g., Homa, 1978). In brief, a form category is created by first generating a random nine-dot configuration within a 50 × 50 grid and then connecting the dots with lines. This pattern is arbitrarily designated as the category prototype; different members of this category are then generated by statistically moving each of the dots of the prototype. In the present study, six different prototypes (A–F) were generated, with about half the subjects receiving one set (A, B, C) and half receiving the other set (D, E, F).

The amount of dot displacement determines the distortion level of a pattern. Low-, medium-, and high-level pattern distortions have vertices that are displaced, on average, about 1.20, 2.80, and 4.60 units, respectively, from each corresponding dot of the prototype. Patterns belonging to different prototype categories have their dots displaced, on average, by 10–15 units (Homa, 1978). All learning patterns were medium-level distortions; transfer patterns included low-, medium-, and high-level distortions, as well as the category prototype for each category. All patterns were randomly generated from a statistical algorithm, with the only restriction being that the generated pattern fell within a pre-specified distortion range. During the learning phase, patterns were randomly sampled for each trial block and subject to the restriction that the patterns in the REP condition also appear in the NREP condition.
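The construction of a prototype and its distortions can be sketched as follows. This is only an illustration under stated assumptions: the exponential step size, the uniformly random direction, and the ±0.25 tolerance are hypothetical stand-ins, not the statistical distortion rules actually used (see Homa, 1978), and dots are not clipped to the grid. Only the grid size, the number of dots, and the target mean displacements come from the text.

```python
import math
import random

GRID = 50    # patterns are defined within a 50 x 50 grid
N_DOTS = 9   # nine dots per pattern
MEAN_SHIFT = {"low": 1.20, "medium": 2.80, "high": 4.60}  # mean displacement per dot

def make_prototype(rng):
    """Random nine-dot configuration that serves as a category prototype."""
    return [(rng.uniform(0, GRID), rng.uniform(0, GRID)) for _ in range(N_DOTS)]

def distort(prototype, level, rng):
    """Move each dot in a random direction; exponential step size is an assumption."""
    pattern = []
    for x, y in prototype:
        step = rng.expovariate(1.0 / MEAN_SHIFT[level])  # mean step = MEAN_SHIFT[level]
        angle = rng.uniform(0, 2 * math.pi)
        pattern.append((x + step * math.cos(angle), y + step * math.sin(angle)))
    return pattern

def mean_displacement(prototype, pattern):
    """Average distance between corresponding dots of the prototype and a pattern."""
    return sum(math.dist(p, q) for p, q in zip(prototype, pattern)) / len(prototype)

def generate_member(prototype, level, rng, tolerance=0.25):
    """Resample until the pattern falls within an assumed distortion range."""
    while True:
        candidate = distort(prototype, level, rng)
        if abs(mean_displacement(prototype, candidate) - MEAN_SHIFT[level]) <= tolerance:
            return candidate

rng = random.Random(0)
prototype = make_prototype(rng)
medium_pattern = generate_member(prototype, "medium", rng)
```

The rejection loop mirrors the one restriction mentioned above, that a generated pattern must fall within a pre-specified distortion range, although the width of that range here is a guess.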

Each pattern appeared in white against a black background. The subject responded by depressing the A, B, and C keys on a standard keyboard.

Procedure

All subjects received 20 trial blocks in the learning phase. In the repeating condition (REP), each trial block contained 15 patterns, five in each of three categories. The order of the patterns within a trial block was randomized. The same patterns were presented in each of the 20 trial blocks. For subjects in the non-repeating condition (NREP), the training patterns were always different.

The procedure on each learning block was the same – a pattern was presented in the center of the screen (about 7.5 cm along the horizontal and vertical dimensions) and remained visible until the subject responded with a key press indicating their category judgment. Immediately following the subject’s response, the correct category name appeared for 1 s below the pattern, followed by the next pattern. The presentation of the learning patterns was seamless across trial blocks, with no temporal break between blocks. For subjects in the REP condition, each learning pattern appeared 20 times; for subjects in the NREP condition, each pattern appeared once. All subjects saw 300 patterns.
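The structure of the two training schedules can be sketched in code. The function name and the integer pattern identifiers are hypothetical. The sketch only reproduces the counts described above (20 blocks of 15 trials, five patterns per category) and the restriction that the REP patterns also occur in the NREP set.

```python
import random

CATEGORIES = ("A", "B", "C")
PER_CATEGORY = 5   # patterns per category on each block
N_BLOCKS = 20      # Experiment 1

def build_schedule(repeating, rng):
    """Return a list of blocks; each block is a shuffled list of (category, pattern_id)."""
    blocks = []
    for block in range(N_BLOCKS):
        # REP reuses pattern ids 0-4 on every block; NREP uses five new ids per block.
        offset = 0 if repeating else block * PER_CATEGORY
        trials = [(cat, offset + k) for cat in CATEGORIES for k in range(PER_CATEGORY)]
        rng.shuffle(trials)  # order within a block is randomized
        blocks.append(trials)
    return blocks

rng = random.Random(1)
rep_schedule = build_schedule(True, rng)
nrep_schedule = build_schedule(False, rng)

rep_trials = [t for block in rep_schedule for t in block]
nrep_trials = [t for block in nrep_schedule for t in block]
```

Both schedules contain 300 trials, but the REP schedule draws on 15 distinct patterns while the NREP schedule contains 300 distinct patterns.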

Transfer

Following learning, a 5-min, self-paced distracter task was used (rating CVCs for their pronounceability on a 7-point scale). The transfer task immediately followed the distracter task. On the transfer test, each subject classified, without feedback, 48 patterns, 16 from each of the three categories. The 16 patterns were composed of the prototype and five instances each that were low-, medium-, and high-distortions of the category prototype.

Results

Learning

The mean proportion of correct classifications across the 20 trial blocks for the REP and NREP conditions is shown in Fig. 2. The effect of trial blocks was significant, F(19, 1,064) = 108.19, MSe = .0093, η2 = .659, p < .001, but the effect of conditions (REP vs. NREP) was not, F(1, 56) = 0.06, MSe = .146, η2 = .001, p =.809. The block × condition interaction was significant, F(19, 1,064) = 2.15, MSe = .0093, η2 = .037, p = .003, and reflected the slight crossover, with NREP exceeding REP performance for the initial 6 blocks, and the REP generally exceeding the NREP for the final 12.

Fig. 2

Learning performance across the 20 trial blocks for the REP and NREP in Experiment 1 (standard errors at each block for each condition averaged between .01 and .03)

Transfer

Proportion correct performance on the transfer test is shown in Fig. 3 as a function of training condition.Footnote 4 The main effect of item type (low, medium, high, prototype) was significant, F(3, 168) = 31.58, MSe = .0052, η2 = .361, p < .001, but the overall difference of condition was not, F(1, 56) = 1.63, η2 = .028, MSe = .0012, p = .207. The item × condition interaction was significant, F(3, 168) = 2.86, MSe = .0052, η2 = .049, p = .038. The interaction reflected the greater decline in accuracy with increasing distortion for the REP condition compared to the NREP condition. Subsequent tests revealed that the classification rate on medium distortions was significantly higher in the NREP versus REP condition (t(56) = 2.02, p = .049, two-tailed); the differences on the remaining items failed to reach significance (p > .15 in each case).

Fig. 3

Mean proportion correct performance (with standard error bars) on the transfer test

Discussion

Two major results emerged from Experiment 1. First, the rate of learning was not affected by having novel instances occur within each trial block. In fact, performance in the REP condition exceeded that for the NREP condition only late in learning. This latter outcome is hardly surprising, since subjects in the REP condition should eventually memorize the small set of training patterns, an outcome not possible in the NREP condition. Second, transfer performance was at least as good, if not slightly better, in the NREP condition. Taken together, having new patterns appear in each trial block did not affect the rate of learning while slightly enhancing subsequent transfer.

Experiment 2

In Experiment 2 recognition rather than classification was assessed following REP and NREP learning. The number of learning blocks was reduced from 20 to 15 because learning appeared to asymptote after 15 learning blocks. Following learning, all subjects received a mixture of old (training), new, and foil patterns, where the foil patterns were novel patterns from different prototypes. We anticipated that subjects receiving REP training would clearly discriminate training patterns from new and foil patterns, but were less certain whether subjects receiving NREP training would show any ability to discriminate old from new patterns. Also, subjects learned either two or three categories prior to transfer, primarily to assess the generality of the acquisition results of Experiment 1.

Method

Subjects

The subjects were 127 Arizona State University undergraduates, randomly assigned to the four between-subject conditions of number of categories learned × mode of presentation (REP, NREP). Of the 58 subjects in the two-category condition (2C), 33 were randomly assigned to the repeating condition (REP) and 25 to the non-repeating (NREP) condition. Of the 69 subjects in the three-category condition (3C), 34 were in the REP condition and 35 in the NREP condition. None of the subjects had served in Experiment 1.

Procedure

All subjects received 15 trial blocks in the learning phase. For the two-category condition, each trial block contained ten patterns, five in each of two categories; for the three-category condition, each trial block contained 15 patterns, five in each of three categories. Otherwise, the learning procedure was identical to that of Experiment 1.

Transfer

On the transfer test, the subject briefly inspected the pattern and indicated whether it occurred on the learning set, typing “O” for old or “N” for new on the keyboard. In the 2C REP condition, a total of 29 patterns were presented, ten old (five from each category), ten new (five per category), and nine foils. All new patterns were medium-level distortions (as were the old); the foil patterns were medium-level distortions generated from three prototypes not used in learning (three patterns from each prototype). In the 3C REP condition, a total of 39 patterns were presented, 15 old (five per category), 15 new (five per category), and nine foils (again, three from each of three different prototype categories). In the 2C NREP condition, a total of 69 different patterns were used, 30 old (15 from each of two categories), 30 new (15 from each of two categories), plus nine foils (as before). In the 3C NREP condition, there were again 69 patterns, 30 old (ten from each of three categories), 30 new (ten per category), and nine foils. Although the number of transfer patterns differed in the various conditions, each condition had similar proportions of old and new patterns.
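The composition of the four recognition tests can be tallied in a few lines. The counts are taken from the description above; the variable and function names are hypothetical.

```python
# Counts of old, new, and foil patterns on each Experiment 2 recognition test.
TRANSFER_SETS = {
    "2C REP":  {"old": 10, "new": 10, "foil": 9},
    "3C REP":  {"old": 15, "new": 15, "foil": 9},
    "2C NREP": {"old": 30, "new": 30, "foil": 9},
    "3C NREP": {"old": 30, "new": 30, "foil": 9},
}

def summarize(counts):
    """Return (total patterns, proportion old, proportion new) for one test."""
    total = sum(counts.values())
    return total, counts["old"] / total, counts["new"] / total

summary = {name: summarize(counts) for name, counts in TRANSFER_SETS.items()}
for name, (total, p_old, p_new) in summary.items():
    print(f"{name}: {total} patterns, old = {p_old:.2f}, new = {p_new:.2f}")
```

Old and new patterns are equally frequent within every test, and the fixed set of nine foils makes up a smaller share of the longer NREP tests.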

In the NREP conditions there were two old patterns on the transfer test from each training block, randomly selected from the available training patterns. The rationale for this manipulation is that memory strength for patterns that appeared only once in learning might be manifested primarily in the later blocks.

Results

Learning

Learning performance in the REP and NREP conditions is shown in Fig. 4, with each condition, as before, showing substantial learning. The main effect of Blocks was significant, F(14, 938) = 88.72, MSe = .014, η2 = .570, p < .001. Neither the main effect of Learning condition (REP vs. NREP), F(1, 67) = 2.38, MSe = .185, η2 = .034, p = .13, nor the Block × Condition interaction, F < 1, was significant.

Fig. 4

Mean proportion correct classification across trial blocks for the REP and NREP conditions as a function of number of training categories, Experiment 2 (standard errors at each block and condition averaged between .01 and .03)

Transfer – Recognition

Figure 5 shows the likelihood that old, new, and foil patterns were called old on the transfer test. In the REP conditions, there was a clear discrimination between old and new patterns (Old = .896, New = .618), t(66) =12.77, and old and foil patterns (Foils = .154), t(66) = 27.18, both ps < .001, an outcome unchanged by the number of training categories. For the NREP condition, no discrimination was found between old and new patterns (Old = .771, New = .780), t(59) = 0.72, p = .474, with old exceeding foils (Foils = .188), t(59) = 19.89, p < .001. The only effect of number of training categories for the NREP condition was a higher rate of false alarming to the foil patterns in the three-category versus the two-category condition (.244 vs. .133), t(58) = 2.14, p = .037.

Fig. 5

Mean proportion of old responses (with standard error bars) to old, new, and foil patterns on the transfer test, Experiment 2

The initial analysis included the variables of condition (REP, NREP), number of training categories (two, three), and item type (old, new, foil). The main effect of item type was significant, F(2, 246) =776.51, η2 = .863, as was the Item × Condition interaction, F(2, 246) = 31.22, MSe = .0197, η2 = .202, both ps < .001. Neither the main effect of Condition, F(1, 123) = 1.15, p = .286, nor Number of Training categories, F(1, 123) = 2.93, p = .089 was significant. The Number × Items interaction was also significant, F(2, 246) = 4.35, MSe = .0187, η2 = .034, p = .014.

Separate analyses for number of training categories did not alter the results. With either two or three training categories, the main effect of Items and the Item × Condition interaction were each significant, p < .01 in each case; the main effect of Condition was not significant.

Recognition of old patterns in the NREP condition was also analyzed across blocks. Figure 6 shows the proportion of “old” responses to training patterns when they occurred in trial blocks 1–15. Analyses revealed no obvious trend across training blocks, with linear fits across trial blocks resulting in a non-significant, slightly negative slope for both the two- and the three-category learning conditions. In particular, calling an old pattern “old” did not increase with increasing block number.Footnote 5
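The logic of the trend analysis can be sketched with an ordinary least-squares slope. The function name is hypothetical and both data series below are made up for illustration; they are not the values plotted in Fig. 6.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

blocks = list(range(1, 16))  # training blocks 1-15

# Made-up proportions of "old" responses, for illustration only.
no_trend = [0.78] * 15                        # memory independent of block
recency = [0.60 + 0.02 * b for b in blocks]   # stronger memory for later blocks

no_trend_slope = least_squares_slope(blocks, no_trend)
recency_slope = least_squares_slope(blocks, recency)
```

If patterns from later blocks were better remembered, the fitted slope would be reliably positive, as in the second series; a slope near zero, as in the first, corresponds to recognition that does not depend on when a pattern was studied.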

Fig. 6

The proportion of old responses (with standard error bars) on the transfer test to training patterns that occurred in trial blocks 1–15, Experiment 2

Signal detection analysis

Each subject had their hit and false-alarm rate for old, new, and foil patterns converted to the signal detection measures of d’ and a criterion. Hit and false-alarm rates of 1.00 and .000 were converted to .98 and .02 for computational purposes. For REP, the d’ for the old versus foil patterns was 2.85 and 2.45 for subjects who learned two and three categories, respectively. For new items, these values were 1.81 and 1.31. The corresponding d’ values for the NREP condition for old items were 2.32 and 1.66 for two and three categories, respectively. For the new items, these values were 2.17 and 1.83. The d’ for old versus new patterns was, for REP subjects, 1.04 and 1.14 for the two- and three-category conditions, respectively. For the NREP condition, the d’ for old versus new was 0.15 and -0.17, respectively.Footnote 6
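The conversion from hit and false-alarm rates to d’ can be sketched as below. The function names are hypothetical, and the criterion formula shown (the negative mean of the two z scores) is one common choice and an assumption here, since the text does not state which criterion measure was computed. The replacement of rates of 1.00 and .000 by .98 and .02 follows the text.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal cumulative distribution

def adjust(rate):
    """Replace perfect rates, as described in the text, so z scores stay finite."""
    if rate >= 1.0:
        return 0.98
    if rate <= 0.0:
        return 0.02
    return rate

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity: z(hit rate) minus z(false-alarm rate)."""
    return _z(adjust(hit_rate)) - _z(adjust(false_alarm_rate))

def criterion(hit_rate, false_alarm_rate):
    """Response criterion c (assumed measure): -(z(H) + z(FA)) / 2."""
    return -0.5 * (_z(adjust(hit_rate)) + _z(adjust(false_alarm_rate)))

# Hypothetical subject: 90% hits to old patterns, 60% false alarms to new patterns.
example_d = d_prime(0.90, 0.60)
example_c = criterion(0.90, 0.60)
```

Because d’ was computed for each subject and then averaged, applying these functions to the group mean rates reported above would not necessarily reproduce the reported mean d’ values.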

An analysis revealed that the main effect of pattern type on the transfer test was highly significant, F(1, 123) = 116.79, MSe = .156, η2 = .487, p < .001, as was the number of training categories, F(1, 123) = 10.40, MSe = 1.352, η2 = .078, p = .002. The significant type × condition interaction was caused by the substantial differences in d’ for the old and new patterns in the REP condition and the absence of a discrimination in the NREP condition, F(1, 123) = 121.08, MSe = .156, η2 = .496, p < .001. The only other significant source was the Type × Condition × Number of Categories, F(1, 123) = 4.29, η2 = .034, p = .04, MSe = .156. The REP × Number of Categories was not significant, F = 0.04, p > .20.

Figure 7 shows a signal detection representation at transfer for the Old, New, and Foil patterns for the three-category condition, separately for the REP (top panel) and NREP (bottom panel) training conditions. A notable outcome is the clear separation between old, new, and foil patterns in the REP condition, and the minimal separation between old and new patterns in the NREP condition. Placement of the criterion for calling an item old was similar for both conditions. A similar outcome (not shown) was obtained for the two-category training condition.

Fig. 7

A signal detection representation for the transfer items, three-category condition, Experiment 2

Discussion

Five major results were found in Experiment 2: (1) As was the case in Experiment 1, learning was unaffected by having different patterns represent each category on every block; (2) Subjects in the REP condition readily discriminated old, new, and foil patterns. In contrast, subjects in the NREP condition discriminated old and new patterns from foil patterns but could not discriminate old from new patterns; (3) Having subjects learn two or three categories altered none of the major results, and thus, the conclusions here are likely robust across number of training categories; (4) An analysis of recognition in the NREP condition across training blocks revealed no apparent trend, i.e., training patterns were called old at a rate independent of where in learning these patterns occurred; and (5) A signal detection analysis suggested that subjects established a similar criterion for calling patterns old, with the old, new, and foil patterns clearly separated in the REP condition. However, no such discrimination between old and new patterns was revealed in the NREP conditions, although these patterns were clearly separated from the foils. The implication of the results in Experiment 2 was that subjects had little difficulty in learning the categories in the NREP condition, and performed at a rate equal to the REP subjects, even though no discernible memory for the training patterns was evident.

An alternative interpretation for the recognition results is that subjects in the NREP condition had some memory for the training instances but that they established a criterion to discriminate category patterns from non-category patterns, not old from new. To address this possibility, a final experiment was done in which no foils from other categories were used. Thus, all transfer patterns were patterns from the learning categories. If subjects can discriminate old from new patterns in the NREP condition, it should be apparent in Experiment 3. If, in contrast, subjects had no discernible memory for the training patterns, then no discrimination between old and new patterns should be evident. The category prototypes for the learning categories were also included in the recognition set.

Experiment 3

Experiment 3 was identical to Experiment 2 with two exceptions: (1) The transfer set contained old, new, and prototype patterns, and no foils from categories outside the learning categories were used; and (2) All subjects learned three categories.

Method

Subjects

The subjects were 55 undergraduates at Arizona State University, 29 in the REP condition, and 25 in the NREP condition. None of the subjects had served in Experiments 1 or 2.

Procedure

The procedure was identical to Experiment 2. Following learning, the transfer test contained only patterns from the categories represented in the learning phase. For the REP condition, the transfer test contained 33 patterns: 15 old (five from each category), 15 new (five from each category), and the three category prototypes. In the NREP condition, there were 63 different patterns: 30 old (10 × 3 categories), 30 new (10 × 3 categories), and the three category prototypes. As was the case in Experiment 2, the transfer test for the NREP subjects contained two old patterns from each of the 15 trial blocks.

Results

Learning

Figure 8 shows the learning performance across the trial blocks for the REP and NREP conditions. The substantial learning across blocks was significant, F(14, 742) = 55.00, MSe = .014, p < .001, but neither the main effect of condition, F < 1, nor the Condition × Blocks interaction, F < 1, was significant, both ps > .20.

Fig. 8. The mean proportion correct classification across trial blocks for the REP and NREP conditions, Experiment 3 (standard errors at each block and condition averaged between .01 and .03)

Transfer

Figure 9 shows the likelihood that old, new, and prototype patterns were called old on the transfer test. An analysis revealed that the main effect of condition was not significant, F(1, 53) = 1.67, MSe = .056, η2 = .031, p = .201. However, the main effect of pattern type was significant, F(2, 106) = 45.60, MSe = .016, η2 = .462, as was the Pattern Type × Condition interaction, F(2, 106) = 18.15, MSe = .016, η2 = .255, both ps < .001. As was the case in Experiment 2, subjects discriminated between old and new patterns in the REP condition but failed to discriminate between these patterns in the NREP condition. The higher rate of calling the prototype old in the NREP versus the REP condition (.923, .863) was not significant, t(53) = 1.32, p = .187.

Fig. 9. Mean proportion of old responses (with standard error bars) to old, new, and prototype patterns on the transfer test, Experiment 3

The likelihood that the old patterns in the NREP were called old as a function of where they appeared in acquisition is shown in Fig. 10. Once again, performance appeared to randomly fluctuate across training blocks. A linear fit across blocks revealed no evidence that the training patterns were more likely to be called old if they appeared later in training.

Fig. 10. The proportion of old responses (with standard error bars) on the transfer test to training patterns that occurred in trial blocks 1–15, Experiment 3

The signal detection analysis of the transfer results revealed that, once again, subjects discriminated old from new in the REP condition (d’ = 1.09) but not in the NREP condition (d’ = -.04). The d’ value for the prototype-new discrimination was somewhat higher in the NREP than in the REP condition (d’ = 1.18 vs. 0.81).
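The d’ values above can be related to hit and false-alarm rates with a short sketch. It assumes the standard equal-variance Gaussian signal detection model, which the text does not explicitly state was the one used; the function names and the example rates are hypothetical and are not data from the experiments.

```python
from statistics import NormalDist

# Inverse of the standard normal CDF (the z-transform of a proportion).
_z = NormalDist().inv_cdf


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity under the equal-variance Gaussian model: z(H) - z(FA).

    Both rates must lie strictly between 0 and 1.
    """
    return _z(hit_rate) - _z(false_alarm_rate)


def criterion(hit_rate: float, false_alarm_rate: float) -> float:
    """Response criterion c = -(z(H) + z(FA)) / 2 under the same model."""
    return -0.5 * (_z(hit_rate) + _z(false_alarm_rate))


# Hypothetical rates, for illustration only.
print(d_prime(0.80, 0.45))  # old called "old" more often than new: positive d'
print(d_prime(0.70, 0.70))  # identical rates for old and new: d' of zero
```

When old and new patterns are called old at the same rate, as in the NREP condition, d’ is zero regardless of where the criterion sits.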

General discussion

The main focus of the present study was to assess whether categories could be readily learned when novel patterns appeared on each trial block, rather than repeating them, as is typically done in categorization research. The results were clear-cut. In each experiment, generally when three categories were learned but also when two were used, the degree of learning was substantial and no worse than when the category patterns were repeated on each trial block. Interestingly, exemplar-based models of classification (Medin & Schaffer, 1978; Nosofsky, 1988) predict a faster rate of learning in the repetition condition, an outcome forced by the greater similarity to memorial traces when patterns are repeated in learning. Since this outcome was not obtained, alternative learning mechanisms must be considered.

As revealed in Experiment 1, subsequent generalization to novel patterns was not impacted by learning from non-repeating stimuli. Classification of novel patterns occurred at a rate equal to, or slightly better than, the classification rates obtained following repetition training. Notably, rapid learning and excellent classification occurred even though subjects exhibited no significant memory for the individual training patterns. Specifically, in the non-repeating condition, recognition of the old patterns was no better than chance, as indicated by the equivalent hit and false-alarm rates of old and new patterns. In contrast, substantial recognition of the old patterns occurred in the traditional repeating condition. Subsequent signal detection analysis revealed that subjects establish a comparable criterion for oldness judgments following repetition and non-repetition learning, with d’ indistinguishable from zero for the non-repeat condition. Converging support arose from analysis of the hit rates for the patterns that occurred on each training block. One might have anticipated some memory for patterns, especially those that occurred late in the learning phase. Nonetheless, the recognition rate for all patterns was flat across blocks and at a level that did not differ from the false-alarm rate for new patterns. Finally, the subjects in the non-repeat condition, who failed to recognize any of the old patterns, nonetheless had a false-alarm rate for the category prototypes that was higher than for any other pattern and higher than the false-alarm rates for the prototypes in the repeat condition.

Taken together, these results pose substantial problems for exemplar-based models of classification. Proponents of exemplar theory might maintain that minimal memory of the training instances still existed following acquisition, which was responsible for the classification and recognition performance. Indeed, this is the basis of the argument by Nosofsky and Zaki (1998), who reassessed and re-interpreted the performance of amnesic patients in the study by Knowlton and Squire (1993). The amnesic subjects in their study had a lower recognition rate of the training patterns compared to the normal controls, but recognition accuracy by amnesics was about 65% (where chance was 50%). This reduced, but not absent, memory was sufficient to generate classification and recognition results similar to those obtained by Knowlton and Squire. Nosofsky and Zaki also addressed the performance of one subject, E. P., whose recognition performance was at chance yet whose classification was near normal, concluding that: “The baseline version of the exemplar model presented in this article is unable to fit this pattern of data.” (p. 254). In effect, the subjects in the present study are like the patient E. P., who had normal transfer for novel patterns even though his recognition memory for the training patterns was non-existent.

The argument might be made that our subjects extracted pattern fragments in the learning phase, rather than intact patterns, sufficient to promote rapid learning and later generalization even though recognition for individual patterns might be virtually zero. This interpretation seems unlikely because the storage of pattern fragments might result in reduced memory for intact patterns, but not zero. That is, the existence of these pattern fragments would be matched with the entire pattern at the time of recognition, something not possible with the new patterns. As we noted previously (Homa et al., 1991), novel pattern fragments are unlikely to exactly repeat in another pattern when ill-defined, infinitely variable patterns are used, and thus, stored fragments could be exactly matched only with the training pattern.

What we believe is responsible for the learning, classification, and recognition performance in the non-repeating conditions is the abstraction of a category prototype early in learning, which continues to evolve as more and more patterns are encountered in training. Further, this abstraction process is more likely to occur when the patterns never repeat, since an initial strategy of attempting to memorize individual patterns is doomed to failure as more and more unique patterns are encountered. This fosters an abstraction process early on, with abstracted prototypes formed and continually modified as different learning patterns are presented. There is, in principle, no necessity to store particular learning patterns in order to learn the category. The subsequent classification is easy to explain – research has shown for many years that increasing category size aids later classification, an outcome that has been repeatedly obtained (e.g., Homa et al., 1973; Homa et al., 1991; Homa, Proulx & Blair, 2008). Subjects in the non-repeating condition are exposed to precisely that – a category of substantial size, relative to the category size of the repeating conditions. The recognition results would similarly follow if, indeed, the subject stores the abstracted prototype and little else.

In the following section, we introduce a model of categorization for the non-repeating condition that assumes that continual learning increasingly alters the similarity relationship to an abstracted prototype. The current preliminary model can capture the learning, transfer classification, and transfer recognition obtained in the present study.

Model of categorization

The proposed model uses a common set of parameters to capture three critical sets of results: (1) Learning is as rapid when patterns never repeat as when they do; (2) Following learning in the non-repeating condition (NREP), subjects demonstrate little or no memory for the training patterns; and (3) Classification of novel patterns is no worse and perhaps slightly superior following training on categories whose patterns never repeat. The model has one major modification compared to previous models: we assume that similarity among the transfer patterns is modified by, and dependent upon, the level of learning. That is, across training, categories begin to emerge, represented by increasingly distinctive clusters in the multidimensional space. Otherwise, the proposed model is similar to other models that represent similarity by the empirically determined multidimensional distances separating the patterns (e.g., Homa et al., 2008; Nosofsky & Johansen, 2000; Minda & Smith, 2001).

The current model makes one other critical assumption: when a category is represented by a small number of patterns that are repeatedly presented in learning, the subject stores these exemplars and uses these stored traces for both recognition and classification. However, when the patterns representing a category are sufficiently numerous (especially when never repeated, as in the present study), the subject abstracts the central tendency, which then forms the basis for all subsequent decisions. There is support for the view that small-sized categories, e.g., three to five patterns per category, are represented as singular traces whereas larger-sized categories, e.g., ten or more, are more likely to rely upon summary representations, such as an abstracted prototype (e.g., Homa, Sterling, & Trepel, 1981; Homa et al., 2008). There is also support for the assertion that similarity relationships are increasingly modified by degree of prior learning (Homa, Rhodes, & Chambliss, 1979). In Homa et al. (1979), the multidimensional space of three categories was increasingly modified via different levels of learning. In particular, additional learning increasingly reduced within-category distances (and, therefore, dissimilarity) as reflected by the subsequent multidimensional configuration, especially for categories represented by numerous instances.

Our initial model adopted the simplifying assumption that categories are largely unstructured at the outset of learning and that the distances among the patterns common to a category are increasingly reduced with additional learning blocks. Specifically, in the version presented here, we assumed that the multidimensional distance between any two medium-level patterns from the same category converged toward a terminal value of 0.72. This distance is the mean value obtained from the multidimensional scaling of patterns following extensive learning (e.g., Homa et al., 2008). On the initial block, we assumed that this distance was 1.60 (the mean distance between any two patterns prior to learning). The convergence of the distance between any two medium-level distortions common to a category to its terminal learning value obeyed the formula:

$$ d(B)=\frac{d_b-{d}_{mm}}{B^{\lambda }}+{d}_{mm} $$

Here B is the block number, db is the MDS distance between any two patterns from different categories, dmm is the MDS distance between any two patterns from the same category, and λ is a decay parameter that regulates how quickly similarity is modified by learning.

Since db = 1.60 and dmm = .72, the equation reduces to:

$$ d(B)=\frac{1.60-0.72}{B^{\lambda }}+0.72 $$

Therefore, on the first block (B = 1) the distance equals its starting value of 1.60, and after an extremely large number of learning blocks the MDS distance between any two medium-level distortions approaches 0.72. The decay parameter (λ) allowed pattern distance to change at different rates across learning blocks for the REP and NREP conditions.
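The convergence of within-category distance can be traced numerically with a minimal sketch. The default distances 1.60 and 0.72 are taken from the text; the function name and the two λ values are hypothetical choices for illustration and are not the fitted values reported in Table 3.

```python
def within_category_distance(block: int, lam: float,
                             d_b: float = 1.60, d_mm: float = 0.72) -> float:
    """Distance between two same-category medium distortions on a given block.

    Implements d(B) = (d_b - d_mm) / B**lam + d_mm, with blocks numbered from 1.
    """
    if block < 1:
        raise ValueError("block numbers start at 1")
    return (d_b - d_mm) / block ** lam + d_mm


# Hypothetical decay values; a larger lam shrinks the distance faster.
for lam in (0.5, 1.0):
    trajectory = [round(within_category_distance(b, lam), 3) for b in (1, 2, 5, 15)]
    print(f"lambda = {lam}: {trajectory}")
```

For any positive λ the distance starts at 1.60 on block 1 and decreases monotonically toward 0.72, so λ controls only the speed of the change.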

In our model, performance in the REP conditions is exemplar based, whereas performance in the NREP condition is prototype based. In the following equations, parameters in upper case are fixed by the design of the experiment (e.g., C, the number of categories, is fixed, in contrast to c, the free sensitivity parameter). The fixed parameters for all conditions are defined in Table 1, along with their values for this study. Empirically determined parameters, the MDS distance measures, are presented in Table 2, along with their definitions and values in this study. Lower-case parameters are free parameters that represent hypothetical quantities (e.g., β is processing noise); they were adjusted to minimize the root mean square error (RMSE) between the model’s predictions and the mean data across subjects, using the fminsearch function in Matlab. The free parameters, along with their definitions and best-fit values, are given in Table 3. Least-squares best fits between observed and predicted values were determined simultaneously for all conditions and three data sets: learning, transfer-classification, and transfer-recognition. In addition, best fits were determined by assigning equal weights across these three data sets. The adequacy of the data fits is addressed later.
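One way to read the equal weighting of the three data sets is sketched below: compute an RMSE for each data set and average the three values, so that a data set with more points does not dominate the fit. This is an assumption about the objective, since the text does not give its exact form, and the function names and numbers are hypothetical. A derivative-free simplex search (the method behind fminsearch) would then minimize this quantity over the free parameters.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def equal_weight_loss(datasets):
    """Average the per-data-set RMSEs, giving each data set the same weight.

    `datasets` maps a label to an (observed, predicted) pair.
    """
    return sum(rmse(obs, pred) for obs, pred in datasets.values()) / len(datasets)


# Hypothetical observed and predicted proportions, for illustration only.
example = {
    "learning": ([0.40, 0.60, 0.75], [0.42, 0.58, 0.77]),
    "transfer-classification": ([0.90, 0.70], [0.88, 0.73]),
    "transfer-recognition": ([0.80, 0.50], [0.78, 0.52]),
}
print(equal_weight_loss(example))
```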

Table 1 Fixed parameter values for all conditions, with values for present Experiment
Table 2 Fixed multidimensional distance values for all conditions
Table 3 Parameter values and root mean square error (RMSE) for the mixed and exemplar model

Similarity measures are defined by MDS distances and scaling parameter c during training by the equation (REP and NREP, respectively):

$$ {S}_{mm}=\exp \left[-{c}_r\left({D}_{mm}+\frac{D_b}{B_i}\right)\right] $$
(5a)
$$ {S}_{pm}=\exp \left[-{c}_n\left({D}_{pm}+\frac{D_b}{B_i}\right)\right] $$
(5b)

Here, Smm is the similarity between two medium-level distortions in the REP condition; Spm is the similarity between a medium-level distortion and prototype pattern in the NREP condition. The added term Db / Bi reflects the assumption that similarities increase with more training, i.e., block number (Bi).
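Equations 5a and 5b share one functional form, which the following sketch makes explicit. The value D_b = 1.60 and the within-category distance 0.72 come from the text; the sensitivity value is a hypothetical placeholder, not a fitted estimate, and the function name is ours.

```python
from math import exp


def training_similarity(c: float, d_within: float, d_between: float, block: int) -> float:
    """Similarity on a training block: exp(-c * (D + D_b / B_i)).

    For Equation 5a pass c_r and D_mm; for Equation 5b pass c_n and D_pm.
    """
    if block < 1:
        raise ValueError("block numbers start at 1")
    return exp(-c * (d_within + d_between / block))


c_r = 1.0  # hypothetical sensitivity value
for block in (1, 5, 15):
    print(block, round(training_similarity(c_r, 0.72, 1.60, block), 3))
```

Because the added term D_b / B_i shrinks as the block number grows, similarity rises across training and is bounded above by exp(-c · D), the value used at transfer.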

The similarity measures for transfer are defined by the equation (REP and NREP, respectively):

$$ {S}_{my}=\exp \left(-{c}_r{D}_{my}\right) $$
(5c)
$$ {S}_{py}=\exp \left(-{c}_n{D}_{py}\right) $$
(5d)

The y subscript denotes whether we are calculating similarity scores for low, medium, or high-distortion exemplars.

Training equations

Performance is determined by Luce’s choice axiom (Luce, 1959). For REP training, the probability of correctly categorizing a stimulus in training block Bi is given by:

$$ {P}_{correct}=\frac{{\left[{\beta}_r+\left({B}_i-1\right)\left(1+\left(N-1\right){S}_{mm}\right)\right]}^{\gamma }}{{\left[{\beta}_r+\left({B}_i-1\right)\left(1+\left(N-1\right){S}_{mm}\right)\right]}^{\gamma }+\left(C-1\right){\left[{\beta}_r+\left({B}_i-1\right)X{S}_{mb}\right]}^{\gamma }} $$
(6)

This equation mirrors equations (1–4), with the major change that similarity among the training patterns is continuously modified as learning progresses.

The equation for training in the NREP condition is similar to that for the REP condition with two key differences. First, similarity values reflect the similarity of the exemplars to the prototype rather than to each other; second, when calculating similarity, the sensitivity parameter for REP (cr) and NREP (cn) (equation 5) were allowed to take on different values. The number of categories (C) remained the same between REP and NREP. For NREP training, the probability of correct categorization in training block Bi is

$$ {P}_{correct}=\frac{{\left[{\beta}_n+{S}_{pm}\right]}^{\gamma }}{{\left[{\beta}_n+{S}_{pm}\right]}^{\gamma }+\left(C-1\right){\left[{\beta}_n+{S}_{pb}\right]}^{\gamma }} $$
(7)
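Equation 7 can be written as a small function, shown here as a sketch rather than as the fitting code used for the reported results. We read S_pb as the similarity between a prototype and a pattern from a different category, which is an interpretation by analogy with S_mb; all parameter values in the example are hypothetical.

```python
def nrep_training_p_correct(s_pm: float, s_pb: float, beta_n: float,
                            gamma: float, n_categories: int) -> float:
    """Probability correct in NREP training (Equation 7).

    s_pm: similarity of a medium distortion to its own category prototype.
    s_pb: similarity to the prototype of a contrasting category (assumed reading).
    """
    target = (beta_n + s_pm) ** gamma
    contrast = (beta_n + s_pb) ** gamma
    return target / (target + (n_categories - 1) * contrast)


# Hypothetical values for a three-category task.
print(nrep_training_p_correct(s_pm=0.45, s_pb=0.10, beta_n=0.05, gamma=2.0, n_categories=3))
```

When the two similarities are equal, the expression reduces to 1/C, that is, chance performance; accuracy rises above chance only as own-category similarity comes to exceed contrast-category similarity.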

Figure 11 shows the learning results obtained in Experiment 1 (top panel) and the model’s predictions (bottom panel), shown separately for the REP and NREP conditions. The critical result – that NREP fares no worse in learning than REP, except for the final learning blocks when subjects likely memorized the fewer patterns in the REP condition – is nicely captured by our model.

Fig. 11. Observed and predicted performance in learning for the REP and NREP conditions

Transfer – classification

Transfer patterns were composed of new patterns at one of three levels of distortion from the prototype and the category prototype itself. For the REP condition, pattern similarities are computed based on similarity to stored patterns in memory; for the NREP condition, pattern similarities are based on similarity to the category prototype.

For a new medium-distortion exemplar in REP, the probability is:

$$ {P}_{correct}=\frac{{\left[{\beta}_r+\left({B}_T-1\right)N{S}_{mm}\right]}^{\gamma }}{{\left[{\beta}_r+\left({B}_T-1\right)N{S}_{mm}\right]}^{\gamma }+\left(C-1\right){\left[{\beta}_r+\left({B}_T-1\right)N{S}_{mb}\right]}^{\gamma }} $$
(8)

Predicted classification of the low, high, and prototype stimuli requires only that the appropriate similarity be substituted into equation (8). For example, low-distortion performance can be calculated as:

$$ {P}_{correct}=\frac{{\left[{\beta}_r+\left({B}_T-1\right)N{S}_{ml}\right]}^{\gamma }}{{\left[{\beta}_r+\left({B}_T-1\right)N{S}_{ml}\right]}^{\gamma }+\left(C-1\right){\left[{\beta}_r+\left({B}_T-1\right)N{S}_{mb}\right]}^{\gamma }} $$
(9)
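The substitution described for Equations 8 and 9 amounts to calling one function with a different within-category similarity. The sketch below assumes that N is the number of stored patterns per category and B_T the number of training blocks, as their use in the equations suggests (Table 1 gives the actual definitions); all numerical values are hypothetical.

```python
def rep_transfer_p_correct(s_within: float, s_between: float, beta_r: float,
                           gamma: float, n_categories: int,
                           n_per_category: int, n_blocks: int) -> float:
    """Probability of correctly classifying a new pattern after REP training.

    s_within is S_mm for a medium distortion (Equation 8), S_ml for a low
    distortion (Equation 9), and so on; s_between is S_mb.
    """
    weight = (n_blocks - 1) * n_per_category  # (B_T - 1) * N
    target = (beta_r + weight * s_within) ** gamma
    contrast = (beta_r + weight * s_between) ** gamma
    return target / (target + (n_categories - 1) * contrast)


# Hypothetical similarities: low distortions lie closer to stored patterns.
shared = dict(s_between=0.10, beta_r=0.5, gamma=1.5,
              n_categories=3, n_per_category=5, n_blocks=15)
print("medium:", rep_transfer_p_correct(s_within=0.40, **shared))
print("low:   ", rep_transfer_p_correct(s_within=0.55, **shared))
```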

The NREP classification calculations, based on prototype models, are similar in structure. In the classification trial in which the prototype is presented, we must remember that the experimenter’s prototype is actually compared to the subject’s internal prototype representation, which for our purposes is assumed to be an exact match (Spp = 1).

$$ {P}_{correct}=\frac{{\left[{\beta}_n+{S}_{pp}\right]}^{\gamma }}{{\left[{\beta}_n+{S}_{pp}\right]}^{\gamma }+\left(C-1\right){\left[{\beta}_n+{S}_{pb}\right]}^{\gamma }} $$
(10)

All the other NREP classifications can be calculated by substituting Spp with Spy, where y is the appropriate distortion level desired.

Figure 12 shows the obtained and predicted values for classification (Experiment 1) of the category prototype, low, medium, and high-level distortions. Overall, the model captures the gradient across distortion level, including the slight but significant advantage of the NREP condition for the medium level distortions.

Fig. 12. Observed and predicted transfer performance on the classification of the different transfer patterns (prototype, low, medium, high), shown separately for the REP and NREP conditions

Transfer – recognition

In recognition, we have different but conceptually similar equations for REP and NREP conditions for old, new, and prototype recognition test items. The difference between recognition judgments and categorization judgments is reflected in the use of a recognition threshold, θr for REP and θn for NREP.

To denote the change for REP, Srmm is used (similarity between two medium patterns in the REP condition) instead of Smm; NREP similarity is denoted as Snmm. The probability of calling “old” a previously-seen exemplar in the REP condition is modeled as:

$$ {P}_{rep\mid old}=\frac{{\left[{\beta}_r+\left({B}_T-1\right)\left(1+\left(N-1\right){S}_{rmm}+\left(C-1\right)N{S}_{rmb}\right)\right]}^{\gamma }}{{\left[{\beta}_r+\left({B}_T-1\right)\left(1+\left(N-1\right){S}_{rmm}+\left(C-1\right)N{S}_{rmb}\right)\right]}^{\gamma }+{\theta}_r} $$
(11a)

The probability for a subject to call a new exemplar “old” is modeled as:

$$ {P}_{rep\mid new}=1-{P}_{rep\mid old} $$
(11b)

NREP probability is similarly structured, except for the fact that we are using prototype-based models:

$$ {P}_{nrep\mid old}=\frac{{\left[{\beta}_n+{S}_{npm}+\left(C-1\right){S}_{npb}\right]}^{\gamma }}{{\left[{\beta}_n+{S}_{npm}+\left(C-1\right){S}_{npb}\right]}^{\gamma }+{\theta}_n} $$
(12a)

The probability for a subject to call a new exemplar “old” is modeled as:

$$ {P}_{nrep\mid new}=1-{P}_{nrep\mid old} $$
(12b)
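The role of the recognition threshold in Equation 12a can be seen in a brief sketch. The function name and all parameter values are hypothetical, and the code illustrates only the form of the equation, not the fitted model.

```python
def nrep_p_old(s_npm: float, s_npb: float, beta_n: float,
               gamma: float, theta_n: float, n_categories: int) -> float:
    """Probability of responding "old" in the NREP condition (Equation 12a).

    Familiarity is the summed similarity to all category prototypes plus
    noise, raised to gamma and compared against the threshold theta_n.
    """
    familiarity = (beta_n + s_npm + (n_categories - 1) * s_npb) ** gamma
    return familiarity / (familiarity + theta_n)


# Hypothetical values: raising the threshold lowers the rate of "old" responses.
for theta_n in (0.01, 0.5, 2.0):
    print(theta_n, round(nrep_p_old(0.45, 0.10, 0.05, 2.0, theta_n, 3), 3))
```

When the threshold equals the familiarity term the predicted rate is exactly .5, and as the threshold approaches zero the rate approaches 1.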

The obtained and predicted performance on the recognition test is shown in Fig. 13. Critically, the model captures the result that subjects in the REP condition could discriminate the training from the new patterns whereas subjects in the NREP could not.

Fig. 13. Observed and predicted likelihood of calling a pattern “old” on the transfer test, as a function of pattern type (old, new), shown separately for the REP and NREP conditions

Fit to an exemplar model of categorization

We also fit an exemplar-based model to the same set of results – learning, transfer-classification, and transfer-recognition for repeated versus non-repeated pattern learning – using the same parameters as used for our mixed model. The only change was that pattern similarity was based on the relationship of patterns to stored exemplars rather than the category prototype, regardless of whether learning involved the repetition of patterns across training blocks or the presentation of novel patterns within each block.

Overall, the mixed model outperformed the exemplar model (RMSE = .028 for the mixed model and .048 for the exemplar model). The advantage of the best-fitting mixed model over the exemplar model arose primarily in the transfer data (e.g., subsequent classification [Experiment 1] and recognition [Experiment 3]). Figure 14 shows these disparities, with the mixed model in the upper panel and the exemplar model in the lower panel. Although the exemplar model provides a reasonable fit to the data, including fit to the category prototype, critical contrasts generally favored the mixed model, especially for the NREP condition. In particular, the exemplar model over-predicted classification of the high distortions and under-predicted the false recognition rate of the prototype. The former outcome is consistent with previous concerns that the exemplar model generally failed to capture the magnitude of the gradient across decreasing similarity to the category prototype (e.g., Homa et al., 1981; Homa, Powell, & Ferguson, 2014; Smith & Minda, 2002). Two additional comments regarding the parameters are worth noting: (1) the best-fitting threshold parameter is large for the mixed model for REP but close to zero for NREP. This implies that recognition performance is above chance for REP but is essentially at chance for NREP, as in the data; the exemplar model does not have this property; and (2) the noise parameter is zero for NREP but substantially above zero for REP. This suggests that the repetition of stimuli across learning blocks in the REP condition generates increasing noise whereas the virtual absence of memory for stimuli in the NREP condition does not. We should reiterate that the mixed model is an exemplar model for the repeat condition and a prototype model for the non-repeat condition. Not surprisingly, the mixed model and the exemplar model predict with similar accuracy for the repeat condition.Footnote 7

Fig. 14. Transfer predictions of Mixed (upper panel) and Exemplar (bottom panel) models for classification (left panels) and recognition (right panels)

Conclusion

Our preliminary model does an adequate job of capturing the major results of our study. It predicts that learning is as rapid when patterns never repeat as when they do, that little or no memory for the training patterns exists when patterns are never repeated in learning, and that novel patterns, including those of increasing distortion level, are subsequently classified accurately.

Being able to capture results does not prove the legitimacy of our model (Wills & Pothos, 2012), but it demonstrates that a prototype model can capture critical results following non-repeating training better than the exemplar model explored in the present study. Some assumptions were adopted for simplicity’s sake, such as the manner in which similarity changed across blocks and the assumption that learning on the first block would be at chance. We cannot claim that other models not yet considered would fare less well than our current one, although the finding that learning can occur rapidly even when subjects cannot discriminate old from new patterns at better than chance accuracy poses a severe challenge for any exemplar or related connectionist model (e.g., ALCOVE, Kruschke, 1992). An interesting exception could be Minerva II (Hintzman, 1986), which stores particulars but can generate an echo mimicking the prototype. Given the severe memorial degradation of exemplars in the non-repeating condition, an issue is whether sufficient exemplar knowledge resides in memory to even generate a functional echo. A full examination of how other models may or may not capture our results is beyond the scope of this paper. Regardless, the results in the present study are novel to the field and interesting in their own right since no classificatory cost was obtained when learning patterns never repeated, a characteristic likely to be common in natural learning situations but absent in current paradigms.