Introduction

The taxonomic work underlying species discoveries lays the foundations for all subsequent biodiversity-based research. Knowing how many species exist provides valuable information about progress in the discovery of life on Earth [1]. Moreover, species richness (the number of different species in an area) is one of the key metrics for estimating species diversity, which is the basis for many comparative ecological, biogeographic and conservation studies.

One useful method to estimate total species richness, or the number of species that will be discovered by a particular time, is to examine species discovery rates [2]. Using this approach, Costello et al. [3] predicted global species richness to be around 1.8 to 2.0 million species. Appeltans et al. [4], using a statistical model of past rates of species description, field observations of undescribed species, and over one hundred experts’ assessments, estimated the number of marine species to be 0.5 ± 0.2 million. Their model could not be used for polychaetes alone, but their expert-opinion estimate was that some 6320 polychaete species remained to be described, and they speculated that “a total of 25,000 to 30,000 species would not be surprising.”

Polychaetes are segmented worms belonging to the phylum Annelida. They are predominantly marine, with some species in fresh and terrestrial groundwaters [5, 6]. Their naming began before the formal start of taxonomy (arbitrarily deemed to be 1758, matching the 10th edition of Linnaeus’s Systema Naturae). The historic developments of polychaete taxonomy, and the seminal works on each family, were reviewed by Fauchald and Rouse [7] and Rouse and Pleijel [8]. All the thousands of Polychaeta names created up to the 1960’s were subsequently captured by Hartman [9, 10, 11]. Those early names, along with all the names published since, were digitized by Kristian Fauchald, and the data were made publicly available in 2007 as part of the World Register of Marine Species (WoRMS) online database.

Despite the availability of those well-documented data, a comprehensive review of the discovery rate of polychaetes has never been done. Here, we review progress in the discovery of polychaete species and estimate the number of species that will be discovered by the end of the twenty-first century.

Methods

The present study was based on data from the World Polychaeta Database, which is part of WoRMS [12], downloaded on 10 October 2016. It included all the taxa traditionally referred to as polychaetes (assigned class rank in WoRMS) within the Annelida, plus the more recently added Pogonophora and Vestimentifera (now the siboglinids) and the echiurans, but not aeolosomatids and myzostomids (both of which are Annelida incertae sedis), and not the clitellates and sipunculans. Sipunculans appear to be basal annelids [13], not ‘polychaetes’ per se; and the clitellates, while now molecularly aligned within Sedentaria polychaetes, have been outside the scope of WoRMS to date because they are largely terrestrial, so complete data on clitellates were unavailable. WoRMS and this study do not include polychaete species that are informally named, even when these names are connected to vouchered and registered museum specimens, because they are not accepted species names.

During initial analyses, we noticed issues that merited correction. These were corrected prior to data analysis to maximize the accuracy of the data used (Additional file 1: Table S1). We included only Recent species-rank names whose WoRMS-status was ‘accepted’. Consequently, 499 species of fossils, 204 names not checked by a taxonomist, 424 currently accepted subspecies, and some taxonomically uncertain name categories were excluded (Additional file 1: Table S2). We included only family names currently valid in WoRMS (notably subfamilies in Terebellidae are elsewhere treated at family level by some workers). Although many other family names have been proposed, they are now synonyms or subfamilies, either for nomenclatural reasons such as priority, or more fundamental reasons such as re-classification following insights from molecular analyses (e.g., Pisionidae as part of Sigalionidae [14]).

The discovery rate of polychaetes was studied in the following ways. First, the cumulative number of species described from 1758 to 2016—for both errant and sedentary species—was plotted to see if the curve reached an asymptote. The non-homogeneous renewal process model of Wilson and Costello [15] was then run to forecast the number of species that would be described by the years 2050 and 2100 with 95% confidence limits. The equation used was:

$$ S(t) = \frac{N}{1 + \exp\left( -N\beta\left( t - \alpha \right) \right)} $$

where S(t) is the number of polychaete species discovered by year t; N is the total number of polychaete species to be discovered; α is the year of the maximum rate of discovery; and β is the overall rate of discovery (a larger β implying a faster rate). In addition, the annual number of species described was plotted to see the general trend of species discovery.
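The shape of the logistic curve described above can be illustrated with a minimal sketch. The parameter values below are hypothetical, chosen only for illustration, and are not the fitted values from this study; the sketch evaluates the curve only and does not reproduce the fitting procedure or the confidence limits of the model in [15].

```python
import math


def cumulative_species(year, n_total, alpha, beta):
    """Logistic curve: expected number of species described by `year`."""
    return n_total / (1.0 + math.exp(-n_total * beta * (year - alpha)))


# Hypothetical parameter values (assumptions, not estimates from the paper).
N_TOTAL = 20000   # assumed total number of species to be discovered
ALPHA = 2000      # assumed year of the maximum rate of discovery
BETA = 1.0e-6     # assumed overall rate of discovery

for year in (1800, 1900, 2000, 2100):
    print(year, round(cumulative_species(year, N_TOTAL, ALPHA, BETA)))
```

At the year α the curve sits at exactly half of N, and it approaches N as the year increases, which is why an asymptote in the cumulative plot would indicate that discovery is nearing completion.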

As an indicator of taxonomic effort through time, the number of first authors describing polychaete species was plotted yearly. Here, we only considered unique surnames of first authors so that the presence of additional authors did not inflate the apparent effort. Thus, for example, the well-known nineteenth century pairing of Audouin and Milne Edwards was counted once. In cases where different authors have the same surname, we attempted to find the original source of descriptions to distinguish them based on their given names (e.g., F. Müller, O. F. Müller, M. Müller and M. C. Müller). When this was not possible, we counted a surname for 50 years from its first occurrence, with an assumption that any effects of different authors with the same surname within a year were negligible and/or random over time.
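The counting rule described above (one unique first-author surname per year, regardless of co-authors) might be sketched as follows. The records, the separator handling and the function names are illustrative assumptions, not the actual WoRMS export format or the code used in this study.

```python
from collections import defaultdict

# Hypothetical records: (year of description, authority string).
records = [
    (1834, "Audouin & Milne Edwards"),
    (1834, "Audouin & Milne Edwards"),
    (1865, "Malmgren"),
    (1865, "Kinberg"),
    (1865, "Malmgren"),
]


def first_author(authority):
    """Return the first author's surname (assumes '&' or ',' separates authors)."""
    return authority.replace("&", ",").split(",")[0].strip()


def first_authors_per_year(rows):
    """Count unique first-author surnames for each year."""
    authors_by_year = defaultdict(set)
    for year, authority in rows:
        authors_by_year[year].add(first_author(authority))
    return {year: len(names) for year, names in sorted(authors_by_year.items())}


counts = first_authors_per_year(records)
print(counts)  # {1834: 1, 1865: 2}
```

Using a set per year means that a pairing such as Audouin and Milne Edwards contributes a single first author, however many species the pair described in that year.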

Next, the average number of polychaete species described per first author per year was plotted. A least squares piecewise regression analysis was additionally performed to identify the point from which the number of species described per author began to decrease. Bouchet et al. [16] suggested that the decline in the number of species described per author in Mollusca may be caused by many citizen-scientist authors describing just one or two species, and thereby does not necessarily indicate a difficulty in finding new species. To ascertain whether any increase in first authors of new species descriptions was due to an increasing proportion of authors who only described one species, we looked at their contribution to species descriptions over decades (we used decades to minimize the occurrence of zero values). In addition, we used Pearson’s skewness coefficient to compare the relative number of species described by all first authors over the decades. A change in skewness could indicate a changing proportion of highly productive authors.
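A least squares piecewise regression can be sketched as an exhaustive breakpoint search: every candidate split of the series is tried, a separate line is fitted on each side, and the split with the lowest total squared error is kept. This is one simple variant (two independent segments, synthetic data); the exact procedure used in this study may differ.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sum of squared errors)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def best_breakpoint(xs, ys, min_points=3):
    """Return (breakpoint x, left slope, right slope) minimising the total SSE."""
    best = None
    for i in range(min_points, len(xs) - min_points + 1):
        left = linear_fit(xs[:i], ys[:i])
        right = linear_fit(xs[i:], ys[i:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            best = (total, xs[i], left[0], right[0])
    return best[1], best[2], best[3]


# Synthetic series: flat until the 1840s, then declining from 1850 onwards.
years = [1800, 1810, 1820, 1830, 1840, 1850, 1860, 1870, 1880, 1890, 1900]
species_per_author = [10, 10, 10, 10, 10, 8, 7, 6, 5, 4, 3]

breakpoint_year, slope_before, slope_after = best_breakpoint(years, species_per_author)
print(breakpoint_year, slope_before, slope_after)
```

In this synthetic example the first segment has no trend and the second has a negative slope, which is the pattern the analysis is designed to detect.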

Finally, we counted the number of non-polychaete species described by the top 25 most prolific first authors using the ‘advanced search’ feature in WoRMS to see if polychaete taxonomists are now more specialized than they were in the nineteenth century (note that some of these authors may have named non-marine species that are not in WoRMS). We counted surnames from different authors as previously explained, and only counted non-polychaete names that were accepted species.

Results

Species richness

WoRMS is constantly being updated and corrected. At the time of the data download in 2016, as many as 11,456 polychaete species (1417 genera, 85 families) had been described (Table 1); these are the valid names remaining from the 21,104 names actually created, a total that also includes all of the unaccepted names (Additional file 1: Table S2). Of these species, 6033 belong to the subclass Errantia, whereas 5085 and 158 belong to the subclasses Sedentaria and Echiura, respectively. Additionally, 180 species were from families currently outside of, or as yet unassigned to, subclass groupings in WoRMS, and referred to as Polychaeta incertae sedis.
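As a consistency check on the figures quoted above, the subclass counts sum to the stated species total:

$$ 6033 + 5085 + 158 + 180 = 11{,}456 $$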

Table 1 The list of valid polychaete families and their author(s), species and genera (ranked by species number per family), as well as years of first and last species descriptions and cumulative percentages of species described per half century

We found that six polychaete families were the most species-rich. These were, in order, Syllidae (993 species), Polynoidae (876 species), Nereididae (687 species), Spionidae (612 species), Terebellidae (607 species) and Serpulidae (576 species) (Table 1). About 38% of known polychaete species belonged to these families. By contrast, four polychaete families, i.e., Ichthyotomidae, Ikedidae, Laetmonectidae and Pontodoridae, were monotypic (having only one species) (Table 1). These four family names are hierarchy place-holders for morphologically distinctive species with as yet no obvious affinities to other families. Rouse and Pleijel [8] regarded such monotypic family group names as being redundant as they represented an ‘empty taxon’.
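The share of about 38% follows directly from the six family totals and the overall species count given above:

$$ \frac{993 + 876 + 687 + 612 + 607 + 576}{11{,}456} = \frac{4351}{11{,}456} \approx 0.38 $$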

Species discovery and authors

We identified three phases of polychaete species discovery. The initial phase, where few polychaete species had been described by few taxonomists, occurred from 1758 to the middle of the nineteenth century (Fig. 2). During this period, the cumulative number of species described increased slowly (Fig. 1), and nearly 500 species, or about 4% of the known species, had been described.

Fig. 1 Cumulative (black line) and median predicted (red line with 95% confidence limit) numbers of species described to 2100 using the model of [15]

The second phase of the discovery lasted from about the 1850’s to the middle of the twentieth century, and was marked by many species being described, mostly by some very productive authors. For example, McIntosh [17] recorded 308 full species from the Challenger (1872–1876) expedition, of which 220 (71%) were new. The 1860’s stand out as an unusually productive and dynamic time for polychaete taxonomy (Fig. 2c) due to major monographs or series by Claparède, Ehlers, Grube, Kinberg, Malmgren, Quatrefages, and Schmarda. Despite a period of low activity in the late nineteenth century (Fig. 2a) and a dip in active authors during the Second World War (Fig. 2b), almost 5000 species, or about 43% of the known species, had been named by the end of this phase.

Fig. 2 a The number of species described per year. b The number of first authors per year. c The number of species described per first author per year. The black lines are six-year moving averages. The red linear regression line represents the best fit based on the piecewise regression analysis (Additional file 1: Figure S2), showing that there is a statistically significant trend of decreasing number of species per author since around the mid-nineteenth century (prior to that, there is not enough evidence to identify a trend)

The third phase of the discovery started after the Second World War. At this point, the annual number of species described rose markedly and reached a peak in the 1960’s (Fig. 2a). It then plateaued until 1990, declined to around the turn of the century, and increased again from 2010 (Fig. 2a). Over this period, approximately 6000 species, or about 52% of the known species, were described, by more authors than in any earlier period (Fig. 2b). The trend in the cumulative number of polychaete species described was similar for both errant and sedentary polychaetes (Additional file 1: Figure S1).

Based on earlier species discoveries, we forecast that a median of 2600 (± 300) and 5200 (± 600) more polychaete species (95% confidence limits) will be discovered by the years 2050 and 2100, respectively (Fig. 1). The cumulative numbers of polychaete species described are thus estimated to be about 14,100 and 16,700 species by the years 2050 and 2100, respectively (Fig. 1).
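The cumulative estimates are the 2016 total plus the median forecasts, rounded to the nearest hundred:

$$ 11{,}456 + 2600 = 14{,}056 \approx 14{,}100, \qquad 11{,}456 + 5200 = 16{,}656 \approx 16{,}700 $$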

From 1758 to 2016, 835 taxonomists were first authors of the descriptions of the 11,456 valid polychaete species. Among them, Hartmann-Schröder, Hartman, and Grube were the three most prolific authors, together describing about 1400 species, or about 12% of the known species (Table 2). Overall, the 25 most prolific authors described over 5200 species, or 45% of the known species (Table 2). One-third (278) of authors have described 90% of the known species.

Table 2 The top 25 most prolific first authors along with their numbers of polychaete species described, first and last discoveries, cumulative proportion of the number of polychaete species described, publication lifetime and the number of non-polychaete species described

The number of first authors describing polychaete species per year increased slowly from 1758 to the mid-nineteenth century (Fig. 2b). It then increased moderately and dropped most noticeably during the Second World War (Fig. 2b). Afterwards, many more authors described species, and the past two decades were the period with the most first authors ever (Fig. 2b). In contrast, the number of polychaete species described per first author per year has decreased since around the middle of the nineteenth century (Fig. 2c, Additional file 1: Figure S2).

The majority of the 25 most prolific authors had polychaete publication lifetimes of around 30–60 years. There is no indication that these are decreasing (Additional file 1: Figure S3), and three of these prolific authors are still active (Table 2). Among the 25 most prolific authors, 14 individuals also described non-polychaete species (Table 2), which were mostly published between the 1840’s and 1960’s (Additional file 1: Figure S4), indicating that past polychaete taxonomists were more generalist than recent ones. There was no clear trend in the proportion of non-career first-author polychaete taxonomists over the past centuries (Fig. 3a). This indicates that the increase in the number of authors was not due to more incidental authors. Rather, it suggests that there has been increased taxonomic effort since the 1950’s, as already shown in Fig. 2b. Moreover, the positive skewness values show that over all decades most authors described few species (Fig. 3b).
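The two per-decade indicators discussed above can be sketched as follows. The text does not state which of Pearson's skewness coefficients was used, so the second (median-based) form is assumed here, and the per-author counts are hypothetical.

```python
import statistics


def single_species_share(species_per_author):
    """Fraction of first authors who described exactly one species."""
    singles = sum(1 for n in species_per_author if n == 1)
    return singles / len(species_per_author)


def pearson_skewness(values):
    """Pearson's second skewness coefficient: 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)
    return 3 * (mean - median) / sd


# Hypothetical decade: number of species described by each first author.
decade = [1, 1, 1, 2, 2, 3, 5, 40]

share = single_species_share(decade)
skew = pearson_skewness(decade)
print(share)      # 0.375
print(skew > 0)   # True: a few prolific authors pull the mean above the median
```

A positive value arises whenever a small number of highly productive authors raise the mean above the median, so a change in skewness across decades would point to a changing proportion of such authors.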

Fig. 3 a The proportion of authors describing one species only per decade (i.e., non-career taxonomists). b The Pearson’s Skewness coefficient per decade. The red dashed lines are the linear trends

Discussion

We found that there were 11,456 validly named polychaete species at the time of the data download in 2016 (Table 1). This number is rather lower than that used by Appeltans et al. [4], i.e., 12,632 species, and coincidentally close to that used by Costello et al. [3], viz., 11,548 species. The decrease, despite well over one hundred new taxa added every year, is due to recognition of synonyms as a consequence of data revisions.

Our model, based on current rates of species descriptions, predicted that about 5200 more polychaete species will be discovered by 2100 (Fig. 1); this number is around one-third of the total predicted number of species by then (16,700 species). In other words, as shown in Fig. 1, approximately two-thirds of the total predicted number of polychaete species by 2100 have already been described, a proportion regarded as typical for progress in marine and other taxa by some analysts [3, 18].

The high current rate of polychaete species discovery is being supported by an increasing number of people describing the animals since the 1960’s (Fig. 2b). A similar trend of an all-time peak in authors in recent decades was also observed for various taxa such as fossil mammals [19], amphibians, birds, cone snails, flowering plants, mammals and spiders [20, 21], fish [22, 23], Brazilian flowering plants and land vertebrates [24], parasites [25], and amphipod crustaceans [26]. The increase in the number of authors was also the case for all taxa on Earth [3]. Our findings on the increase in first author numbers for polychaete taxonomy were thus inconsistent with the common belief that the science of taxonomy is in crisis [27], and that the number of people specializing in taxonomy is in decline [28, 29]. Recent analyses confirm earlier indications that the increase in taxonomic authors has been particularly high in South America and Asia [1, 30, 31].

In contrast to the increasing number of first authors, the number of polychaete species described per first author in a year has declined since around the mid-nineteenth century and shows a continued decline since the 1960’s, with noticeably reduced variation in the data since the 1990’s (Fig. 2c). This is different from the accepted phenomenon of author-inflation per article. As to the latter, in taxonomy one likely reason for there being more authors per species description is that higher-quality descriptions (especially those including molecular data) require a wider range of expertise [32]; changing authorship practices may also contribute.

The gradual decrease in the number of species described per first author per year may be a sign of an increasing difficulty in finding new polychaete species as the more widespread and conspicuous taxa have been discovered (the remaining species may require more careful taxonomic review and scrutiny to distinguish). Yet, the greater number of first authors, new sampling methods (e.g., scuba, ROVs), more advanced technology (e.g., better microscopes, digital drawing and photography tools, molecular methods), the rapid increase in the number of scientific journals publishing taxonomic works and easier access to publications since the era of the Internet [1, 33] should at least balance the greater effort needed to describe species more comprehensively in recent decades [34]. If new species are indeed becoming harder to find, this pool of taxonomic workers will, at some point, no longer maintain the description rate, and a reduction in the number of species described per year will occur. This phenomenon has already occurred for various taxa such as some insects [35], scleractinian corals [36], fossil mammals [19], marine fishes [22], amphibians, birds, flowering plants, mammals, spiders [20, 21], algae [37], flowering plants [38], beetles [39], parasites [25] and amphipod crustaceans [26].

If the drop in the number of polychaete species described per first author was due to a bigger proportion of non-career taxonomists nowadays, then the increase in the number of authors since the 1960’s would not reflect increased taxonomic effort. In this study, we do not know for how long recent authors (i.e., people who described species since the 1950’s) will continue to publish species descriptions. Therefore, whether the working lifetime of recent authors will be similar to that of previous decades remains to be seen. Nonetheless, our analysis found no trend in the proportion of non-career polychaete taxonomists over time (Fig. 3a), which is consistent with the observation on the proportion of non-career taxonomists through time globally [30]. Thus, the considerable increase in the number of first authors since the 1960’s (Fig. 2b) appears to reflect greater scientific effort, and the drop in the number of species described per first author (Fig. 2c), despite the greater effort, indicates difficulty in finding new species. However, reasons for the increased difficulty in finding new polychaete species may be more complex than having discovered most of them. Perhaps an equally likely reason is that small-sized and cryptic species are being under-sampled by commonly employed survey sampling methods and the greater focus on more obvious collectable invertebrate species [40]. Certainly Annelida, whose members show a size variation of four orders of magnitude (including meiofaunal sizes) and an apparent abundance of cryptic species, provide a bigger challenge than many other phyla in estimating species diversity.

Conclusions

This study analysed the rate of description of polychaete worms over about 2.5 centuries, and found that 11,456 species (1417 genera, 85 families) have been formally described by 835 first authors. Re-evaluations have distilled this total down from the 21,104 names actually created over time, nearly double the number of valid species. Proportionally, the 11,456 valid species are about two-thirds of the total predicted number of polychaete species of the world by the end of the twenty-first century (16,700 species). The trend in polychaete species discovery thus seems to be following the overall global estimate of Costello et al. [3] in that about two-thirds of the total predicted number of global species by then have already been discovered.

The decline in the number of polychaete species described per author since around mid-nineteenth century, despite greater taxonomic effort and more favourable conditions for science pertaining to both species sampling and descriptions, suggests that it is now getting harder to find new polychaete species (using present sampling methods) as the more widespread and conspicuous ones may have been discovered. Despite this, approximately 5200 more polychaete species are predicted to be discovered by the end of this century. Given that the most prolific specialist taxonomic authors may describe about 100 species in their lifetime, and that the remaining species are likely to be more difficult to find and/or discriminate, this suggests that the world needs a further 50 full-time polychaete taxonomists to complete the work.
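The workforce figure appears to follow from dividing the predicted number of undescribed species by the approximate lifetime output of a prolific specialist; this reading of the arithmetic is an assumption based on the numbers quoted above:

$$ \frac{5200 \text{ species}}{100 \text{ species per taxonomist}} = 52 \approx 50 \text{ taxonomists} $$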