1 Introduction and review of previous works

Funk (2006) pointed out how floras (manuals of plants in geographical areas) are essential in a scientific continuum of scholarly publications required for biodiversity studies. This continuum includes also checklists, databases, studies of conservation biology, as well as monographs of genera and families. The present authors would like to add that field studies in and a detailed understanding of both the characteristics and the fauna and flora of a country or a region are also absolutely necessary for a proper understanding of its biodiversity. The basic questions for biodiversity studies must be: How many species? How are they distributed and how is the turnover between areas? Are species unique (endemic), narrowly distributed or widespread? Can the species-rich areas or areas rich in endemics be linked to the size of the areas and/or to environmental factors, in other words, can a phytogeographical pattern be established? However, in agreement with the tenets by Funk, only tentative biodiversity studies could be made prior to the completion of an academic flora for Ethiopia and Eritrea.

Ethiopia and Eritrea are topographically highly diverse countries, consisting of lowlands and elevated plateaux from which rise individual mountains and tablelands. A map by Mutke et al. (2001: Fig. 3) shows that the two countries are among the most topodiverse in tropical Africa, with an index of topodiversity of ca. 0.6, the highest on the continent. The highlands of Ethiopia and Eritrea are divided by the great Rift Valley into a north-western ‘Ethiopian Plateau’ and a south-eastern ‘Somalian Plateau’, and these two plateaux are again cut by deep river valleys on both sides of the Rift. The Ethiopian Highlands are surrounded on all sides by lowlands below 1000 m a.s.l., to the west by the Nile Valley in the Sudan and South Sudan and to the east by the Eritrean lowlands along the Red Sea, the Afar triangle and its southwards continuation in the Rift Valley. Mostly, the highlands and the lowlands meet along more than 1000 m high escarpments, as with the Nile Valley, the Afar triangle and the Rift Valley, but towards the south-east a gentle slope connects the highlands of the ‘Somalian Plateau’ with the lowlands in southern Somalia, north-eastern Kenya and the shores of the Indian Ocean. Thus, the complex three-dimensional topographical diversity of Ethiopia and Eritrea ranges vertically from ca. 125 m below sea level in the Afar triangle to ca. 4533 m above sea level (a.s.l.) in the Semien Mountains on the ‘Ethiopian Plateau’ and to 4377 m a.s.l. in the Bale Mountains on the ‘Somalian Plateau.’ Together, Ethiopia and Eritrea cover 1,221,900 sqkm, ranging horizontally from 3.38° to 17.52° N latitude and from 32.99° to 48.00° E longitude (decimal degrees). The vegetation of the two countries as a unit, the FEE area, varies from desert to rain forest and Afroalpine vegetation.

The biodiversity patterns of Ethiopia and Eritrea may seem unusual. Normally, species richness increases with higher rainfall (see for example Korell et al. 2021) and moist forests are the most diverse habitats. Our studies (Friis et al. 2010) show that the highest species richness, richness in endemics and the highest number of single-flora region endemics in the FEE area are found in the deciduous woodlands and bushlands in southern and south-eastern Ethiopia. This may be related to geological diversity (Friis et al. 2010), as is known from the Cape Floristic Region of South Africa (Goldblatt and Manning 2002). Moreover, diversity is normally assumed to decline with increasing altitude (Rahbek 1995). But Mutke et al. (2001: Fig. 1) demonstrated how a relatively high diversity of plants is correlated with the high topographical diversity of Ethiopia and Eritrea and that this diversity extends further to the north than anywhere else in Africa (see also Mutke and Barthlott 2005). How diversity is influenced by elevation has long attracted the interest of researchers. Traditionally, it was accepted as a general pattern that species richness declined with increasing elevation and that this mirrored a gradient with declining diversity at higher latitudes (because of reduced temperature and consequent decrease in productivity). However, when reviewing this, Rahbek (1995) pointed out that empirical studies often showed a ‘plateau’, ‘hump’ or ‘bulge’ on curves comparing species richness with elevation, and this should not be regarded as unexpected due to the fact that “although temperature declines with elevation, another life-support factor, [namely] stable water supply [from clouds meeting mountains], increases (at least to a certain elevation)”. Since Rahbek’s review, a growing amount of empirical evidence has revealed that biodiversity in the tropics may peak at intermediate elevations, as found in a preliminary study for Ethiopia and Eritrea by Friis et al. (2001). 
In Nepal, a country as topographically diverse as Ethiopia and Eritrea, Vetaas and Grytnes (2002) found that the maximum species richness was found between 1500 and 2500 m a.s.l., as in Ethiopia and Eritrea, while the zone with maximum species endemicity was found considerably higher, between 3800 and 4200 m a.s.l. Based on data from palms in New Guinea, Bachman et al. (2004) took into account the varying areas at different elevations and, when the altitudinal diversity was calculated on normalised bands, they were able to demonstrate a ‘mid-elevation diversity bulge’ also there. Yet a full explanation for the widespread occurrence of this ‘bulge’ has still not been offered. Scholars after Rahbek (1995) have offered a range of explanations other than Rahbek’s. Bertuzzo et al. (2016) suggested that in mountainous landscapes the mid-elevational bands of vegetation form complex habitat patches with different areal extent and connectivity, features that according to Bertuzzo et al. are well known to be drivers of high biodiversity. When the complex geometry of mountainous landscapes is taken into account, Bertuzzo et al. generalise that ecological dynamics may produce a ‘mid-elevation bulge’ of species richness. Diversity may have increased in zones where historical up- and downward movements of elevational bands of the vegetation have met. Unfortunately, the altitudinal distribution of the species in the flora of Ethiopia and Eritrea is still not sufficiently well known to carry out studies beyond empirical documentation of a ‘mid-elevation diversity bulge’, which we will present at the end of the paper.

1.1 Early studies of Ethiopian and Eritrean biodiversity

The first attempt to write a flora for Ethiopia, and indeed one of the first attempts to write a flora for a tropical African country, was the work of the French botanist Achille Richard (1794–1852), Professor of botany at the Faculty of Medicine, Paris. He based his Tentamen Florae Abyssinicae (Richard 1847–1851) on the collections at the Muséum national d’Histoire naturelle in Paris, received from the German botanist G. W. Schimper, who lived permanently in Ethiopia from 1836, and from a French expedition to Ethiopia in 1838–1843 (Demissew 2014). Richard’s work covered parts of Eritrea and the northern part of Ethiopia and included ca. 1750 species (Friis 2014), most of which were named and described for the first time. Ethiopian and Eritrean plants were also included in the Flora of Tropical Africa (Oliver et al. 1867–1937) and in Hochgebirgsflora des Tropischen Afrika (Engler 1892), but we have not counted the species recorded from Ethiopia and Eritrea in these works. After the establishment of an Italian colony in Eritrea, Pirotta (1903–1909) edited a flora of the colony, but it was never completed.

Nearly 100 years after Richard’s flora, Georg Cufodontis (1896–1974), who had taken part in an Italian scientific expedition to Ethiopia in 1937, began a checklist of the flora of the Horn of Africa (Cufodontis 1953–1972). The completed work included 6323 species, but its geographical remit covers the present-day countries Eritrea, Ethiopia, Djibouti and Somalia, and the changing borders during the production period of the checklist make it nearly impossible to establish exactly which species were recorded from which country. Friis (2014) suggested an approximate count of 4850 species from Ethiopia and Eritrea. Almost concurrently, also in 1953, an Italian botanist, Rodolfo E. G. Pichi Sermolli (1912–2005), who had taken part in an Italian scientific expedition to Ethiopia in 1936, initiated a series of monographic treatments of the plants of the Horn of Africa (Adumbratio Florae Aethiopicae), which appeared in the journal Webbia. The series was discontinued, unfinished, in 1978 after publication of 32 monographs (Demissew 2014).

1.2 The Flora of Ethiopia and Eritrea (FEE) and its Floristic Regions (FRs); how estimated numbers of species and endemics have varied

In the 1970s, the international Association pour l’Etude Taxonomique de la Flore d’Afrique Tropicale (AETFAT) drafted plans for a flora to cover Ethiopia (then including Eritrea). The ‘Ethiopian Flora Project’ was initiated in 1980 as a bilateral undertaking of the Ethiopian and Swedish governments, and the flora was written by an international team of 92 scientists from 17 countries. The first volume was published in 1989 as Flora of Ethiopia and the final two volumes out of ten were published in 2009. All the volumes except the first one were referred to as Flora of Ethiopia and Eritrea. They were, in chronological order and with full titles in the references: Hedberg and Edwards (1989, 1995), Edwards et al. (1995, 1997, 2000), Hedberg et al. (2003, 2004, 2006, 2009a, b). The planning, organizing, financing and executing of this project have been described by Hedberg (1983, 1986, 1990, 1996, 2001, 2009, 2011, 2014), Hedberg and Gebre-Egziabher (1989) and Demissew (2006, 2011, 2014). The importance of the Flora Project as a ‘springboard’ to help carry out other projects was analysed by Demissew et al. (2011). In the volumes of the Flora of Ethiopia and the Flora of Ethiopia and Eritrea (in the following, all volumes are referred to as FEE) the distribution of each species, subspecies and variety of vascular plants is recorded both by altitude and across 16 pre-defined areas, which are not products of analyses. These areas are in the following referred to as Floristic Regions (FRs); see the map next to the title page of each FEE volume and Fig. 1a in this paper. These two sets of data, on FRs and on altitude, have been extracted, compiled and utilised in the following analyses.

Fig. 1

Diversity and endemism of all indigenous species of vascular plants in Ethiopia and Eritrea, recorded on the Floristic Regions (FRs) used to record distribution in the Flora of Ethiopia and Eritrea (FEE) and based on updated data from the FEE and newly described species, as described in the text. The total number of indigenous taxa is 5219 and the number of endemic taxa is 627. a Floristic Regions (FRs) and their acronyms. b Number of taxa in FRs. c Number of endemic taxa in FRs. d Number of taxa endemic to the specific FRs on which the numbers are indicated

When four volumes of FEE (vol. 2(2), 3, 6 and 7) had been published, Friis et al. (2001) produced a spreadsheet recording presence and absence in the FRs of the published indigenous species, their distribution on altitude, distribution outside Ethiopia and Eritrea and on a simplified classification of habitats. The number of indigenous species in the four volumes was 2959. Species richness was highest in the FR of Sidamo (SD), followed by (in decreasing order, names of the regions written in full and the acronyms for the FRs in parentheses): Harerge (HA), Bale (BA), Shewa uplands (SU), Gondar (GD), Tigray uplands (TU), western Eritrea (EW), Gojam (GJ), Kefa (KF), Arsi (AR), Gamo Gofa (GG), Ilubabor (IL), Welega (WG), eastern Eritrea (EE), Welo uplands (WU), Afar (AF). In the following the FRs are referred to by these acronyms. The number of endemic species, near-endemics and single-region endemics in the FRs were found largely to follow similar patterns. With regard to altitudinal distribution, the analysis found a ‘mid-elevation diversity bulge’ with the highest numbers of taxa and the highest number of endemics and near-endemics between ca. 1500 and 2200 m a.s.l.
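The altitudinal count behind a ‘mid-elevation diversity bulge’ can be sketched as follows. This is a minimal illustration and not the procedure of Friis et al. (2001): the function name `richness_by_band`, the 350 m band width, the upper limit of 4550 m and the three species ranges are all hypothetical and chosen only to show the principle of counting every species whose altitudinal range overlaps a band.

```python
def richness_by_band(ranges, band_width=350, top=4550):
    """Count species whose altitudinal range overlaps each band.

    ranges: dict mapping species name -> (min_m, max_m) a.s.l.
    Returns a list of (band_low, band_high, n_species) tuples.
    """
    counts = []
    for low in range(0, top, band_width):
        high = low + band_width
        # A species is scored for a band if its range overlaps [low, high).
        n = sum(1 for lo, hi in ranges.values() if lo < high and hi >= low)
        counts.append((low, high, n))
    return counts


# Hypothetical altitudinal ranges, for illustration only.
example = {
    "species A": (500, 1600),
    "species B": (1500, 2200),
    "species C": (1800, 3000),
}

bands = richness_by_band(example)
# First band with the highest count (ties are resolved towards lower altitude).
peak = max(bands, key=lambda band: band[2])
```

With real data the bands would also need to be related to the land area available at each altitude, as Bachman et al. (2004) did with normalised bands; that step is not shown here.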

A new count by Friis et al. (2005) was based on a spreadsheet with the indigenous species of the six volumes of the FEE then published (vol. 2(1), 2(2), 3, 4(2), 6, 7) and species in manuscripts ready for publication of the families Campanulaceae, Lobeliaceae, Sapotaceae, Solanaceae and Lamiaceae. The count also included the three out of a total of four volumes of the Flora of Somalia then published (Thulin 1993, 1995, 1999). A summary of the findings in Friis et al. (2005) with the data for the FEE area was published as a general chapter in the FEE (Friis 2009). Friis et al. (2005) counted the number of indigenous species in Ethiopia at 4247 and in Eritrea at 1619; the number of endemics in Ethiopia was counted at 709 and in Eritrea at 107. These figures included the approximate 60 species that had been included in FEE as potentially new, but yet undescribed. The ranking from highest species-richness to lowest was only slightly changed in relation to the figures from 2001: SD, SU, HA, EW, TU, GD, KF, BA, GG, GJ, IL, WG, AR, EE, WU, AF. The ranking of the number of endemics in the FRs was approximately the same. In the ranking according to number of single-region endemics (endemic species restricted to a single FR), Ethiopia and Eritrea were overtaken by two FRs in the north-eastern part of Somalia [in the Flora of Somalia termed N2 and N3], but for Ethiopia and Eritrea the ranking of FRs according to declining numbers of single-region endemics was as follows: HA, SU, SD, GD, BA, TU, GJ, AR, KF, EW, WG, WU, GG, IL, EE, AF. No detailed study of correlation between altitude and species richness or richness of endemics was made in Friis et al. (2005), but that work was the first to estimate expected species richness based on the area of the FRs (Friis et al. 2005: Fig. 4) and the calculation of a regression line for the correlation between FR area and richness. 
The most ‘overrepresented’ FRs (highest above the regression line) were SD and SU with an ‘overrepresentation’ of observed species richness in relation to expected richness of more than 500 species each. Otherwise, the most striking observation by Friis et al. (2005) was the above-mentioned notable increase in number and proportion of endemics towards the very dry easternmost FRs in Somalia on the point of the Horn of Africa (Friis et al. 2005: Figs. 9, 10, 11). Similar but lower endemism was found in the easternmost part of Ethiopia. This phenomenon was ascribed to high geological diversity towards the tip of the Horn (isolated occurrences of limestone, gypsum, etc.) and to the geographical isolation of the tip of the Horn, the ‘halbinsel effect’ (peninsula effect; see further in Friis et al. 2005).
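The idea of ‘overrepresentation’ relative to a regression line can be illustrated with a short sketch. It assumes an ordinary least-squares line of species richness on region area; the helper names `fit_line` and `residuals` and the four regions with their areas and counts are hypothetical and are not the FEE values or the exact model of Friis et al. (2005).

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(areas, richness):
    """Observed minus expected richness for each region (positive = overrepresented)."""
    keys = list(areas)
    slope, intercept = fit_line([areas[k] for k in keys], [richness[k] for k in keys])
    return {k: richness[k] - (slope * areas[k] + intercept) for k in keys}


# Hypothetical areas (km2) and species counts; NOT the FEE values.
areas = {"R1": 50_000, "R2": 100_000, "R3": 150_000, "R4": 200_000}
richness = {"R1": 900, "R2": 1700, "R3": 1300, "R4": 1900}

over = residuals(areas, richness)
most_overrepresented = max(over, key=over.get)
```

A region lying far above the line, such as the hypothetical `R2` here, holds more species than its area alone would predict.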

After the completed publication of the FEE, Kelbessa and Demissew (2014) counted from the FEE area a total of 6027 taxa (species and subspecies, including cultivated and introduced taxa). Of these, 1882 were common to both Ethiopia and Eritrea, 3875 were recorded only from Ethiopia and 270 only from Eritrea. Thus, there were a total of 5757 species and subspecies in Ethiopia and about 2152 species and subspecies in Eritrea. Kelbessa and Demissew (2014) recorded about 647 species and subspecies that were endemic to the FEE area. Of these, 544 taxa were confined to Ethiopia, 14 to Eritrea and 89 taxa were common to both countries. In addition but counted separately, there were taxa (potential species and subspecies) that were included in the FEE as undescribed. Friis et al. (2005) had included these undescribed taxa in their counts. Kelbessa and Demissew did not analyse altitudinal diversity or altitudinal distribution of endemics. Some, but far from all, of the species indicated in the FEE as potential new taxa are among the 56 new species published since the completion of the Flora (see Table 1, “Species discovered and described after the flora project”).

Table 1 Fifty-six new taxa described from Ethiopia and Eritrea later than relevant accounts and appendices in the FEE

Sosef et al. (2017) made a study of the biodiversity of Africa on a continental scale, using the database of georeferenced data RAINBIO. In their Table 1, they presented a count of indigenous species (species richness) at 4481 for Ethiopia (data for Eritrea were not recorded) and a total number of indigenous species estimated with Chao1 at 5627 (confidence interval 5581–5672). Without giving absolute figures, they estimated a relatively high level of endemism, 19.9%, while Kelbessa and Demissew (2014) gave the absolute number of FEE area endemics as 647. That number implies a relative figure, based on the total number of taxa in the FEE, of (647/6027 =) 10.7%. If an approximate figure for indigenous species is used, the percentage increases to ca. 12%. See more in “Estimates of diversity and endemism using specimen data from RAINBIO and GHI calculated by Hawthorne and Marshall (2019)” about possible reasons for Sosef et al.’s high figure.
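The tallies quoted from Kelbessa and Demissew (2014) and the derived percentage of endemism can be checked with a few lines of arithmetic. The variable names are ours; the numbers are those given in the text above.

```python
# Taxa (species and subspecies) quoted from Kelbessa and Demissew (2014).
shared, ethiopia_only, eritrea_only = 1882, 3875, 270

total_taxa = shared + ethiopia_only + eritrea_only
ethiopia_total = shared + ethiopia_only
eritrea_total = shared + eritrea_only

# Endemics: confined to Ethiopia, confined to Eritrea, common to both countries.
endemic_taxa = 544 + 14 + 89
percent_endemic = 100 * endemic_taxa / total_taxa

print(total_taxa, ethiopia_total, eritrea_total)  # 6027 5757 2152
print(endemic_taxa, round(percent_endemic, 1))    # 647 10.7
```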

In a review of the methodological background for their ‘bioquality’ value, reflecting global rarity, Marshall et al. (2016) and Hawthorne and Marshall (2019) categorised more than 40,000 African plant species by global rarity, using a ‘star’ rating with four categories (highest to lowest): Black, Gold, Blue and Green stars. Based on the weighted ‘star’ rating, they calculated community scores on one-degree squares for > 3.1 million botanical records across the African continent, focussing on the high scores from the eastern coast of tropical Africa. They calculated the scores for their units (using both FRs and one-degree squares) with the formula 100 × (N_BK × 27 + N_GD × 9 + N_BU × 3)/(N_BK + N_GD + N_BU + N_GN), where N_BK, N_GD, N_BU and N_GN stand for the number of species with Black, Gold, Blue and Green stars, respectively. The star rating of all African plant species is presented in Marshall et al. (2016: Supplementary Information, Table S1). The highest scores for one-degree squares (> 450) were found to form a zone from the Horn of Africa through southern-central Africa to Angola. The highest score, 1657, was found outside eastern Africa, but squares at the tip of the Horn of Africa reached values as high as 1581. A rather well-defined belt of scores above 450 extended from the tip of the Horn through Somalia into southern Ethiopia, where the FRs of HA, BA and eastern SD and areas in central SD represented scores between 263 and 448, while the one-degree squares in central, western and northern Ethiopian highlands only rarely reached scores above 300. They found scores below 450 to be frequent in the highlands of SU and GD, both FRs with some of the highest peaks in Ethiopia, and contrary to what should be thought about ‘islands in the sky’ endemics, these figures are lower than the figures for the south-eastern lowland endemics (see discussion in “Estimates of diversity and endemism using FEE data updated with later described taxa and the FEE Floristic Regions (FRs)”). 
The one-degree squares in the western part of Ethiopia represent lower scores, between 96 and 263. These findings will in the following be discussed in connection with another recent study and our own studies.
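The score formula quoted above can be written out as a small function. This is a sketch under the assumption that each N is the number of species carrying the star in question and that Green stars have weight zero, as their absence from the numerator implies; the names `STAR_WEIGHTS` and `ghi` and the example counts are hypothetical.

```python
# Weights taken from the formula quoted in the text; Green contributes only
# to the denominator, which is equivalent to a weight of zero.
STAR_WEIGHTS = {"black": 27, "gold": 9, "blue": 3, "green": 0}


def ghi(counts):
    """Return 100 * (weighted sum of star counts) / (total count) for one unit."""
    total = sum(counts.get(star, 0) for star in STAR_WEIGHTS)
    if total == 0:
        raise ValueError("no star-rated species in this unit")
    weighted = sum(weight * counts.get(star, 0) for star, weight in STAR_WEIGHTS.items())
    return 100 * weighted / total


# Hypothetical unit (a one-degree square or an FR) with 100 rated species.
example = {"black": 2, "gold": 5, "blue": 13, "green": 80}
score = ghi(example)  # 100 * (54 + 45 + 39) / 100 = 138.0
```

Under this formula the theoretical maximum is 2700 (a unit containing only Black star species), which gives a sense of scale for the scores between 96 and 1657 discussed above.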

1.3 Application of FEE data to propose a phytogeographical division of Ethiopia and Eritrea

Wang et al. (2020) used FEE data for highly intricate analyses of patterns of diversity and endemism of vascular plants in Ethiopia and Eritrea and for a hierarchical phytogeographical classification based on their findings. They describe in their “Materials and methods of present studies” how they extracted data from FEE: “[Based on the information in FEE] we compiled a total of 511 822 vascular plant distribution records of 5781 species … We chose 0.5° × 0.5° grid cells … to accommodate the narrowest distance in the administrative area [this must mean the FRs of the FEE] of the recorded species. We recorded the administrative distribution and elevation range of each species from the flora [FEE], matched the corresponding grid cells of the administrative area where the plant is distributed, and then combined the altitude range of the plant distribution to further match the grid cells corresponding to the elevation range extracted by ArcGIS. The grid cells that matched successfully were recorded as having the plant.” We identify several points of uncertainty in their data collection: (1) Did Wang et al. score the “elevation range of each species from the FEE” using the values inside the brackets in the FEE treatments of species distribution (if bracketed values are present) or only values outside the brackets? The values inside brackets represent outlying altitudinal records. We disregard outlying values for this study because they are exceptions to the otherwise observed altitudinal range. (2) It is not stated which digital elevation model Wang et al. used to “match the grid cells corresponding to the elevation range extracted by ArcGIS”. (3) What do Wang et al. mean by ‘county boundary’ when they describe the quarter-degree squares scored? They write: “We considered that so long as the grid cell [one of the quarter-degree squares] touched the national borders, we undertook species statistics and calculations on it. When the grid cell appeared at the county [sic!] 
boundary, if a cell covered more than 50% of a county, we classified this cell as that county.” Does ‘county’ mean an area of a FR, which Wang et al. considered an ‘administrative area’ [for example on p. 35] or is ‘county’ a spelling mistake for ‘country’? The FRs in FEE are not administrative units and Wang et al. do not state explicitly what ‘counties’ are. (4) Finally, Wang et al. write that they included data from the Global Biodiversity Information Facility (GBIF; https://www.gbif.org/) without saying how they combined the quarter-degree square data from FEE with the data from GBIF.

We think that Wang et al. make a mistake when they score the altitudinal range indicated for a species across all the FRs listed for that species. From our experience in the field and during our herbarium studies, it is evident that widespread species (occurring in several FRs) will occur at different altitudes in different parts of the FEE area, for example along a south–north or a west–east gradient. The FEE data do not break down the altitudinal range of species by FR but give a single range across the entire FEE area. It might be argued that many species occur only in a relatively low number of FRs and that the variation in altitude will therefore be relatively low, but for species distributed in three or more FRs (more than 20% of the species; see analysis in “Estimates of diversity and endemism using FEE data updated with later described taxa and the FEE Floristic Regions (FRs)”) the overestimation of species distribution may be considerable. An example of a species with both a wide latitudinal–longitudinal and altitudinal range is Ranunculus multifidus, which occurs in 13 FRs and at altitudes between 1300 and 3800 m a.s.l. with very variable ecological conditions in the different parts of the distribution area. By scoring the quarter-degree squares of the entire altitudinal range in all FRs where the species has been recorded, one overestimates the presence of the species, recording it in too many quarter-degree squares. And of course a species may not occur in all the quarter-degree squares in a FR which have a spot with the right altitude. The authors of Friis et al. (2005) thought of using a method similar to that of Wang et al. for their analysis, but rejected it because of the risk of overestimation. Wang et al.’s method introduces a high and unquantified level of uncertainty into the results of the clustering analysis of the quarter-degree squares, and hence into the hierarchical phytogeographical divisions defined from it. 
The altitudinal range of each FR is indicated in Table 2.
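The risk of overestimation can be made concrete with a sketch of the matching rule as we understand it from the description by Wang et al. It is not their code: the function `match_cells`, the cell records and the species record are hypothetical, and the only assumption is that a species is scored for every cell in its listed regions whose elevation span overlaps the single range given for the whole flora area.

```python
def match_cells(species, cells):
    """Return ids of cells in the species' regions whose elevation span
    overlaps the species' single, flora-wide altitudinal range."""
    lo, hi = species["alt_range"]
    return [
        cell["id"]
        for cell in cells
        if cell["region"] in species["regions"]
        and cell["min_alt"] <= hi
        and cell["max_alt"] >= lo
    ]


# Hypothetical cells: region code and elevation span (m a.s.l.).
cells = [
    {"id": "c1", "region": "X", "min_alt": 1200, "max_alt": 1900},
    {"id": "c2", "region": "X", "min_alt": 2500, "max_alt": 3900},
    {"id": "c3", "region": "Y", "min_alt": 1400, "max_alt": 2100},
    {"id": "c4", "region": "Z", "min_alt": 1500, "max_alt": 2000},
]

# Hypothetical species: the range is pooled over regions X and Y. Suppose it
# actually grows at 1300-1800 m in X and at 3000-3800 m in Y.
species = {"regions": {"X", "Y"}, "alt_range": (1300, 3800)}

matched = match_cells(species, cells)  # ['c1', 'c2', 'c3']
```

In this invented case the species is scored for `c2` and `c3`, although under the supposed local ranges it would occur in neither; only `c1` is a true presence. The pooled range cannot distinguish these situations.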

Table 2 Richness of indigenous species, endemics and single-region endemics counted on the FEE data (‘Data based on FEE and updates’), or extracted from RAINBIO with GIS-software (‘Data based on RAINBIO’), or from GHI (‘Genetic Heat Index’) indicated in Hawthorne and Marshall (2019: Fig. 1)

Wang et al. state that they have “a total of 511 822 vascular plant distribution records of 5781 species” in 485 grid cells. This corresponds to an average of 88.5 distribution records per species and an average of 1055 distribution records per quarter-degree square. Both figures seem high to us; we have no direct means of comparison, but can compare Wang et al.’s figures with those we found from RAINBIO (discussed further in “Estimates of diversity and endemism using specimen data from RAINBIO and GHI calculated by Hawthorne and Marshall (2019)”), which have an average of 5.7 records per species and an average of 54.2 records per quarter-degree square. The latter, in particular, is only ca. 1/20 of what Wang et al. found by their method. Wang et al. did not publish a figure with a map showing the sum of records on each quarter-degree square, which would have allowed scholars with a personal knowledge of the nature in Ethiopia and Eritrea to form an opinion on the plausibility of the numbers, which, as we here suspect, overstate the distribution of the species. The number of distribution records found by Wang et al. sounds impressive, but based on our experiences from producing and editing the FEE, this is a misleading level of accuracy not reflecting the situation on the ground.
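The averages compared above follow directly from the quoted totals. The short check below uses only figures stated in the text; the variable names are ours.

```python
# Totals quoted from Wang et al. (2020).
records, n_species, n_cells = 511_822, 5781, 485

per_species = records / n_species  # average records per species
per_cell = records / n_cells       # average records per quarter-degree square

# Average per quarter-degree square that we found from RAINBIO (quoted in the text).
rainbio_per_cell = 54.2
fold_difference = per_cell / rainbio_per_cell

print(round(per_species, 1), round(per_cell), round(fold_difference, 1))
```

The fold difference of a little under 20 corresponds to the ‘ca. 1/20’ mentioned above.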

Another point that we find unexplained is that Wang et al. worked with fewer species than the total in the FEE. They must have made a selection, most likely a selection for indigenous taxa, but the criteria for this selection in the FEE data are not stated.

Some of the phytogeographic results of Wang et al. agree with our earlier and present findings, but others do not. It will not be possible here to discuss all points of disagreement; we will focus on the relation between their Fig. 2 and their Fig. 3 and the general results presented in their Fig. 3. The boundaries between the 13 floristic subdistricts in their Figs. 3 and 4 are said to be based on geographical uninterrupted[ness] (p. 38, first paragraph), the phylogenetic beta-diversity (presented in their Fig. 2b, see text below Fig. 2) and endemic species (p. 38, first paragraph). How much the different criteria contribute to the delimitation of the boundaries is not explained. In their Fig. 2b, based on phylogenetic beta-diversity, the light brown colour (corresponding to the colour of subdistrict 12 in their Fig. 3) forms eight patches of quarter-degree squares across the FEE area, sometimes many squares, sometimes only one. According to our field observations these eight patches are very different with regard to vegetation and flora: (1) many cells (ca. 13) along the western border of Ethiopia from Eritrea to the Blue Nile: the vegetation of these cells agrees with the approximate northern part of the Combretum–Terminalia woodland in Friis et al. (2010); (2) one cell away from the western Ethiopian border and just south of the Blue Nile: the vegetation of this cell is also part of the Combretum–Terminalia woodland in Friis et al. (2010); (3) a line of cells along and below the eastern escarpment of Eritrea (border between our EW and EE or at lower altitudes): this may be an extension of the Transitional semi-evergreen bushland which occurs along the eastern escarpment of Ethiopia (Breugel et al. 2016), but it also covers dry lowland vegetation; (4) two cells on the Gambela escarpment: the vegetation of these cells is part of the southern Combretum–Terminalia woodland in Friis et al. 
(2010); (5) three cells at Lake Chew Bahir in our GG: the vegetation of these cells is either Acacia–Commiphora woodland or bushland or Semi-desert scrub, as defined in Friis et al. (2010); (6) five or six cells between our SD and BA, near the border with Kenya: the vegetation of these cells is Acacia–Commiphora woodland or bushland, as defined in Friis et al. (2010); (7) one isolated cell somewhere near the border between our BA and HA and on the border with Somalia: the vegetation of this cell is Acacia–Commiphora woodland or bushland, as defined in Friis et al. (2010); and (8) many cells that cover the whole of the Ogaden in HA: the vegetation of these cells is Acacia–Commiphora woodland or bushland or Semi-desert scrub, again as defined in Friis et al. (2010). Thus these cells are found both in western Ethiopia (1, 2 and 4) and in eastern Eritrea and Ethiopia (3, 5–8). The occurrence of the light brown squares in western Ethiopia is particularly surprising, as these areas are part of the Sudanian Floristic Region of White (1983), while the light brown squares in eastern Ethiopia, including the largest area in the Ogaden, represent White’s Somalia-Masai Floristic Region.

Fig. 2

Clustering of the FRs used in the FEE and based on the occurrence of 5219 indigenous taxa (a) and on the 627 endemic taxa (b) in Ethiopia and Eritrea. The source of the data is described in the text. The acronyms of the FRs are shown in Fig. 1a

Fig. 3

Linear regressions of number of taxa, number of endemic taxa and number of single-region endemics in FRs (Y-axes) in relation to the areas of FRs (X-axis). The acronyms of the FRs are shown in Fig. 1a. a Regression for all species. b Regression for all endemics. c Regression for single region endemics

Fig. 4

Diversity of all vascular plants in Ethiopia and Eritrea according to data in the RAINBIO database (Dauby et al. 2016; Sosef et al. 2017). a Diversity on one-degree squares. b Diversity on quarter-degree squares. c Endemics in Ethiopia and Eritrea on one-degree squares. d Endemics in Ethiopia and Eritrea on quarter-degree squares. The small black dots indicate the georeferenced records in the database. e Diversity on the FRs used in Fig. 1a. f Number of endemics in Ethiopia and Eritrea on the FRs used in Fig. 1a. Description of the methods and further data in the text. Note the errors encountered with recording Eritrean–Ethiopian endemics from RAINBIO; this is discussed further in the text

Some of Wang et al.’s phytogeographic units seem highly heterogeneous to us. This is particularly the case with their floristic subdistrict no. 8, a unit of lowest rank, which should therefore be floristically homogeneous. Subdistrict no. 8 includes the deciduous western woodlands in the lowlands around Metema at ca. 700 m a.s.l., extending nearly as far south as the Abay (Blue Nile) River at ca. 500 m a.s.l., but also Afromontane and Afroalpine vegetation in the Tigray and Gojam Highlands with altitudes up to ca. 4100 m a.s.l. and the Semien Mountains in the Gondar Highland with the highest peak of Ethiopia at 4533 m a.s.l. Thus this subdistrict includes an altitudinal range of ca. 4000 m, and much of this range can be found within one quarter-degree square. Near the equator, the sides of a quarter-degree square are ca. 55.6 km long and the diagonal of the square is ca. 78.7 km. Such distances in Ethiopia and Eritrea include great variation in altitude and include lowland and highland flora within the same quarter-degree square. From the top of Ras Dejen, the highest mountain in Ethiopia, to the nearest point at ca. 1000 m a.s.l., ca. 3500 m below in the Tacazze Valley, there is only a distance of ca. 45 km. The two localities can easily fit inside one quarter-degree square, which blurs the altitudinal differences between groups of species whose altitudinal ranges often do not exceed 350–1450 m (see “Estimates of altitudinal distribution of diversity and endemism using FEE data updated with later described taxa”). As seen in “Estimates of altitudinal distribution of diversity and endemism using FEE data updated with later described taxa”, most species in FEE do not span very large altitudinal ranges, rarely 1000 m or more, and completely different local floras may therefore be included in the same quarter-degree square.
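The dimensions of a quarter-degree square quoted above can be approximated with a short calculation. The value of ca. 111.2 km for one degree near the equator is our assumption (a rounded spherical-Earth figure), so the last digit of the results may differ slightly from the values in the text; the function name `cell_dimensions` is hypothetical.

```python
import math

# Approximate length of one degree near the equator, in km (assumed value).
KM_PER_DEGREE = 111.2


def cell_dimensions(cell_degrees=0.5, km_per_degree=KM_PER_DEGREE):
    """Return (side, diagonal) in km of a square grid cell near the equator."""
    side = cell_degrees * km_per_degree
    diagonal = side * math.sqrt(2)
    return side, diagonal


side_km, diagonal_km = cell_dimensions()

# Ras Dejen to the nearest point in the Tacazze Valley: ca. 45 km apart with
# ca. 3500 m difference in altitude. Two such points can lie in one cell
# (possible, not guaranteed, since it depends on where the cell borders fall).
can_share_a_cell = 45 <= diagonal_km
```

Because longitude degrees shorten away from the equator, cells at the latitude of Ethiopia and Eritrea are slightly narrower east to west than this sketch suggests, which does not change the argument.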

Wang et al.’s analysis of patterns of species endemism, mapped in their Fig. 5, partly agrees with our experience and findings, particularly the central part of the general picture in their Fig. 5A: a zone with the maximum number of endemic species stretching from the Semien Mountains in GD through SU to the highest parts of SD and BA. However, we think that this analysis understates the importance of endemism in south-eastern and southern Ethiopia, more specifically in the lowlands of SD, BA and HA. There Wang et al. indicate 1–34 endemic species per quarter-degree square, figures as low as in the Afar depression. In our Fig. 1c, by contrast, the mainly lowland FRs of HA and SD have 147–200 endemics, figures not as high as, but still closer to, the 209–227 endemics in the two most endemic-rich highland FRs, GD and SU, than to the figure for AF.

Fig. 5

Diversity and endemism of all indigenous species of vascular plants in Ethiopia and Eritrea on altitudinal zones (350 m intervals), based on updated data from the FEE and newly described species, as described in the text. a All indigenous species. b All endemic species

In their Fig. 5B Wang et al. show a few quarter-degree squares in south-eastern Ethiopia with slightly raised weighted endemism (1.4–4.0), higher than the 0.006–0.586 in the surrounding squares, but they do not comment further on this. The ‘weighted endemism’ in Fig. 5B is calculated by an equation involving the inverse of the species range measured as “the number of floristic units in which species occur”. We are not sure what Wang et al. here mean by “the number of floristic units”; is it the number of quarter-degree squares in which the species occur? Or do they mean FRs, as is the meaning of ‘floristic units’ in their Table 2? ‘Weighted endemism’ could thus be a term comparable to ‘restricted range endemism’, a concept important in the establishment of conservation areas (for East Africa, see Beentje et al. 2006). In our studies for this paper, we have not been able to find sources on which to base a study of restricted range endemism to compare with Wang et al.’s Fig. 5B, but in “Estimates of diversity and endemism using FEE data updated with later described taxa and the FEE Floristic Regions (FRs)” we will analyse the cruder measure of single-region endemics, based directly on distributional information in FEE. According to our findings, already pointed out in Friis et al. (2001), south-eastern Ethiopia is rich in single-region endemics (and hence in endemic species with restricted range). However, the flora of south-eastern Ethiopia is still poorly studied: most of the recently described species in our Table 1 come from the southern and south-eastern parts of Ethiopia, and the Ogaden, the south-eastern part of Ethiopia, is ‘still a terra incognita’ in the words of Thulin (2011).
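To make the ambiguity about “floristic units” concrete, the following is a minimal sketch of one common form of weighted endemism, in which each species contributes the inverse of its range size to every unit in which it occurs. It is not Wang et al.’s equation, which is not reproduced here; the function name, the data layout and the toy occurrences are hypothetical, and the units could equally be quarter-degree squares or FRs.

```python
from collections import defaultdict


def weighted_endemism(units_by_species):
    """Sum, for every unit, the inverse range size of the species present.

    units_by_species: dict mapping a species name to the set of units
    (grid squares or FRs; which of the two is an open question in the text)
    in which that species is recorded.
    """
    scores = defaultdict(float)
    for species, units in units_by_species.items():
        if not units:
            continue  # a species without records contributes nothing
        weight = 1.0 / len(units)  # narrow-ranged species weigh more
        for unit in units:
            scores[unit] += weight
    return dict(scores)


# Invented toy data: three species recorded from hypothetical units U1..U4.
toy = {
    "species_a": {"U1"},
    "species_b": {"U1", "U2"},
    "species_c": {"U1", "U2", "U3", "U4"},
}
we = weighted_endemism(toy)
for unit in sorted(we):
    print(unit, round(we[unit], 2))
```

The same species list gives very different scores depending on whether the range is counted in squares or in FRs, which is why the definition of the unit matters for interpreting such maps.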

To sum up, Wang et al. (2020) have produced an ambitious and interesting study, but we find that they build on data that cannot be collected with accurate methods, particularly the distributional records per quarter-degree square based on a combination of records from FRs and altitudinal ranges from the whole of Ethiopia and Eritrea, which, in our view, over-estimates the distribution of species. We also have difficulty following some of their methods. Their results underestimate the importance of the ‘bioquality’ of southern and south-eastern Ethiopia found by Marshall et al. (2016) and Hawthorne and Marshall (2019), documented by Friis et al. (2001, 2005) and confirmed in our analyses in this paper. Therefore, we find it justified in the following to update our previous studies, using the simple methods initiated in Friis et al. (2001), including altitudinal diversity, taking into consideration the complete FEE data set and the species published after the completion of the FEE, as well as a comparison of the FEE data with data from RAINBIO. We are aware of the shortcomings of using FRs as the areas on which to record species distribution and richness, the main difficulty being their topographical diversity. We take this diversity into account when we interpret the results of analyses with FRs, and we are aware that we implicitly use background knowledge about the variation in FR topography when we draw our conclusions.

2 Materials and methods of present studies

In the following we update our previous results from Friis et al. (2001, 2005). We present a new count of indigenous species in the completed FEE on Floristic Regions (FRs, shown in Fig. 1a), supplementing the FEE count with data on new taxa that have been described after the completion of the relevant families in the FEE. We found 5219 indigenous species, of which 627 are endemic to Ethiopia and Eritrea, recorded from the FEE and information on later described taxa from IPNI (https://www.ipni.org/). The data for 5194 of the 5219 species, that is the records for which we could find all necessary information, is compiled into a csv file available on request from the authors of this paper (sebseb.demissew@gmail.com, ibf@snm.ku.dk, odile.weber@mnhn.lu). Recently described species were searched for in IPNI on ‘Ethiopia’ and ‘Eritrea’ in the field ‘Distribution of types’ and in the field ‘Published after year’. Unless information to the contrary has been stated in the protologues, the new taxa described from Ethiopia and Eritrea are considered endemic to those two countries and distributed as recorded in the protologue. We entered this data in a spreadsheet, indicating the FRs from which the taxa have been recorded, as well as the altitudinal range on 350 m intervals, counting only FEE information listed outside the parentheses (values inside parentheses indicate outlying records). If no information about altitudinal range was given in the FEE or for later described taxa in the protologue, we have used or looked up (on Google Earth Pro 7.3.4.8248 (https://www.google.com/earth/)) the altitude indicated for the type collection as a point altitude.
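The recording of altitudinal ranges in 350 m intervals can be sketched as follows. This is only an illustration under stated assumptions: the notation of the altitude string (outlying values in parentheses, as in “(1200–)1500–2400(–2800) m”), the placement of zone boundaries at multiples of 350 m starting from 0 m, and both function names are hypothetical and not taken from the spreadsheet used for this paper.

```python
import re

ZONE_WIDTH = 350  # metres, the interval width stated in the text


def parse_altitude_range(text):
    """Return (low, high) in metres, ignoring values given in parentheses.

    Assumed notation: outlying records stand inside parentheses and are
    dropped; a single remaining value is treated as a point altitude.
    """
    core = re.sub(r"\([^)]*\)", "", text)  # drop the outlying records
    values = [int(v) for v in re.findall(r"\d+", core)]
    if not values:
        raise ValueError("no altitude found outside parentheses")
    return min(values), max(values)


def altitude_zones(low, high, width=ZONE_WIDTH):
    """List the (lower, upper) bounds of every zone touched by [low, high].

    Assumption: zones start at multiples of `width`; an altitude lying
    exactly on a boundary is assigned to the zone above it.
    """
    first = (low // width) * width
    last = (high // width) * width
    return [(start, start + width) for start in range(first, last + width, width)]


low, high = parse_altitude_range("(1200–)1500–2400(–2800) m")
zones = altitude_zones(low, high)
point_zones = altitude_zones(*parse_altitude_range("1000 m"))
print(zones)
print(point_zones)
```

A point altitude, such as one looked up for a type collection, falls into exactly one zone, whereas a range is counted in every zone it touches.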

Species richness, the number of FEE area endemics and the number of single-region endemics (species recorded only from a single FR) were extracted from our spreadsheet for each FR and each altitudinal zone. In order to compare the species composition of the FRs on the one hand, and separately of the altitudinal zones of 350 m each on the other, we carried out cluster analyses using the software PC-ORD 7.08 (https://www.wildblueberrymedia.net/pcord). FRs and altitudinal zones were used as the rows and the species as columns. We used Jaccard’s coefficient as the similarity measure and Flexible Beta (beta = −0.8) as the group linking method. A number of other similarity coefficients and group linking methods were tried, with less satisfactory results than the ones we selected for this paper. The correlations between species richness, number of endemics and number of single-region endemics on the one hand and the areas of the FRs on the other were studied with the software R 4.02 (https://www.r-project.org/).
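The clustering step can be illustrated with a minimal sketch. It is not the PC-ORD implementation: it assumes that the Jaccard coefficient is converted to a distance as 1 minus similarity and that Flexible Beta follows the standard Lance–Williams update with alpha = (1 − beta)/2 for both merged groups. The unit names and species sets are invented.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 minus the Jaccard similarity of two species sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def flexible_beta_clustering(species_by_unit, beta=-0.8):
    """Agglomerative clustering with the flexible-beta linkage.

    Returns the merges in order as (group_i, group_j, distance).
    Unit names must not contain '+', which is used to label merged groups.
    """
    alpha = (1.0 - beta) / 2.0
    groups = set(species_by_unit)
    dist = {
        frozenset((a, b)): jaccard_distance(species_by_unit[a], species_by_unit[b])
        for a, b in combinations(species_by_unit, 2)
    }
    merges = []
    while len(groups) > 1:
        pair = min(dist, key=dist.get)  # closest pair of groups
        i, j = sorted(pair)
        d_ij = dist.pop(pair)
        merged = i + "+" + j
        merges.append((i, j, d_ij))
        for k in groups - {i, j}:
            d_ki = dist.pop(frozenset((k, i)))
            d_kj = dist.pop(frozenset((k, j)))
            # Lance-Williams update for flexible beta
            dist[frozenset((k, merged))] = alpha * d_ki + alpha * d_kj + beta * d_ij
        groups -= {i, j}
        groups.add(merged)
    return merges


# Invented presence data: rows are units, entries are species identifiers.
toy = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},
    "C": {7, 8, 9},
    "D": {7, 8, 10},
}
merges = flexible_beta_clustering(toy)
for step in merges:
    print(step)
```

With a strongly negative beta the distances to a newly merged group are stretched, which tends to produce compact, well separated groups in the dendrogram.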

We carried out the same analyses on the RAINBIO specimen data for Ethiopia and Eritrea. RAINBIO is a database of African plants that includes 26,309 georeferenced specimens from Ethiopia and Eritrea, representing 4599 taxa (https://gdauby.github.io/rainbio/download_page.html). Ethiopian and Eritrean endemics were identified as species that occur only in Ethiopia and Eritrea in that database. This method is highly inaccurate because the RAINBIO database does not contain information about records outside Africa. Species that in Africa occur only in the dry parts of the Horn of Africa, and are therefore likely to appear in RAINBIO as endemic to the Horn, often also occur in Arabia, south-west Asia and the Mediterranean basin.
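The extraction of apparent endemics, and the way it goes wrong, can be shown with a short sketch. The record layout (pairs of species and country), the function name and all records below are hypothetical; they do not reproduce the RAINBIO data structure.

```python
TARGET = frozenset({"Ethiopia", "Eritrea"})


def apparent_endemics(records, target=TARGET):
    """Species whose records in the database all come from the target countries."""
    countries_by_species = {}
    for species, country in records:
        countries_by_species.setdefault(species, set()).add(country)
    return sorted(
        species
        for species, countries in countries_by_species.items()
        if countries <= target
    )


# Invented records. The database is assumed to cover Africa only, so the
# occurrence of species_y in Arabia is simply absent from it.
africa_only_records = [
    ("species_x", "Ethiopia"),
    ("species_x", "Eritrea"),
    ("species_y", "Ethiopia"),
    ("species_z", "Ethiopia"),
    ("species_z", "Kenya"),
]
flagged = apparent_endemics(africa_only_records)

# Adding the extra-African record removes the pseudo-endemic.
global_records = africa_only_records + [("species_y", "Yemen")]
corrected = apparent_endemics(global_records)
print(flagged, corrected)
```

In this toy case species_y is flagged as endemic only because its Arabian record lies outside the geographic scope of the database.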

We used a range of software to produce the illustrations: Fig. 1 was produced with ArcGIS 9.3 (https://www.arcgis.com/index.html), Figs. 2 and 6 with PC-ORD 7.08, Fig. 3 with R 4.02, Fig. 4 with DIVA-GIS 7.5.0 (https://www.diva-gis.org/) and ArcGIS 9.3, and Figs. 5 and 7 with Microsoft Excel 2019.

Fig. 6

Clustering of the altitudinal zones (350 m intervals) in Fig. 5, based on their similarity in the species that occur in them. For the method, see the text.

Fig. 7

Diversity and endemism of selected families of indigenous species of vascular plants in Ethiopia and Eritrea on altitudinal zones (350 m intervals), based on updated data from the Flora and newly described species, as described in the text. a All indigenous species of Asteraceae. b All indigenous species of pteridophytes. c All indigenous species of Apiaceae. d All indigenous species of Poaceae. e All indigenous species of Burseraceae

In parallel with the data extraction from FEE and IPNI, we searched POWO (Govaerts et al. 2021; http://www.plantsoftheworldonline.org/; see “Species discovered and described after the flora project” for additional information) and RAINBIO (https://gdauby.github.io/rainbio/index.html) for supplementary taxa, that is, species said in POWO to be indigenous to the FEE area and for which there are specimen data in RAINBIO. We found three species that are almost certainly additional to the ones in the FEE and IPNI data, and nine species that may be additional; the uncertainty over the latter was due to uncertain identification. However, these species were not added to the FEE dataset. The analyses in this work are based on two separate datasets: data extracted from FEE and IPNI on the one hand and data from RAINBIO on the other.

2.1 Cultivated and introduced taxa vs. indigenous taxa

The taxon richness of Ethiopia and Eritrea, 6027 species and subspecies, which Kelbessa and Demissew (2014) reported, represented a count of all species of Angiosperms, Gymnosperms and Pteridophytes treated in the FEE, including taxa that are indigenous, introduced, weedy and cultivated, even in gardens and as potted plants. When Friis et al. (2001, 2005) did their partial counts they attempted to include only indigenous species, but no methods for making this distinction were described and no estimate of the total number of species in the entire FEE was made. It was only with the data in Sosef et al. (2017), based on the database RAINBIO (Dauby et al. 2016), that a stated distinction between introduced and indigenous species on specified criteria was made for the entire Ethiopian and Eritrean flora, done in fact as part of a data set for the entire African continent. It is not clear to what extent RAINBIO has included naturalised plants, but specimens from planted and/or cultivated individuals and introduced species were said to be discarded (Dauby et al. 2016). However, the following 19 taxa occur according to RAINBIO only in Ethiopia and Eritrea, with a single record (and could therefore be taken for single-region endemics if only African material is considered). According to POWO these 19 taxa are considered to be introduced, which we can confirm from our personal observations, with the exception of Carthamus lanatus (here considered an indigenous weed): Apium graveolens, Carduus pycnocephalus, Carthamus lanatus, Centaurea melitensis, Cirsium vulgare, Citrus paradisi, Cynara cardunculus, Datura ferox, Eruca vesicaria, Filago vulgaris, Hordeum vulgare, Melilotus suaveolens, Papaver somniferum, Pisum sativum, Senecio macroglossus, Silybum eburneum, Vaccaria hispanica, Vicia faba, Xerochrysum bracteatum. The total figure of 5781 species, which Wang et al. (2020) mention as included in their studies, would seem to indicate that at least some introduced species were included, but they gave no selection criteria. In this study, we exclude all species that in the FEE are stated to be cultivated or introduced from a natural range outside Ethiopia and Eritrea, while weedy plants of apparent local origin have been included. Species described on material from Ethiopia and Eritrea after the publication of FEE are considered indigenous, unless stated otherwise in the protologue.

As mentioned above we have searched the database POWO (http://www.plantsoftheworldonline.org/) which attempts to present an overview of the current knowledge of taxonomy and distribution of all vascular plants in the world. Our search resulted in finding 5320 species recorded from Ethiopia and Eritrea and said to be indigenous (‘native’), of which 661 are endemic to the FEE area. Another 437 taxa occurring in the FEE area are said to be introduced (see also above in “Cultivated and introduced taxa vs. indigenous taxa”). There are taxonomic, nomenclatural and geographic discrepancies between the list of species used for our analysis based on the FEE and later described species on one hand and the list extracted from POWO on the other: some names used in the FEE are synonyms in POWO, species considered indigenous in the FEE are in POWO said to be introduced, or to occur in Ethiopia or Eritrea at infraspecific rank, but not at specific rank and vice-versa, and species documented in the flora are said to not occur in Ethiopia and Eritrea in POWO. It is beyond the scope of this paper to reconcile both lists but it is worth mentioning that such discrepancies exist.

2.2 Species discovered and described after the flora project

More than 500 new taxa were described during the Ethiopian Flora Project. An attempt was made to keep the FEE up to date, and new taxa described from Ethiopia and Eritrea following the publication of the relevant family, as well as new species records for the flora, were accounted for in various appendices (Edwards et al. 1995: pp. 381–381, Edwards et al. 2000: pp. 451–462, Hedberg et al. 2009a: pp. 203–271). In a symposium volume published after the completion of FEE, additional new records were listed and treated in the same format as in the Flora (Friis et al. 2011). Some of the taxa included in FEE as potentially undescribed new species have now been described; others are completely new, discovered after the completion of the FEE. A significant number of these 56 species (and one subspecies) are described from the south-eastern lowlands of Ethiopia, including a tree (a new species of Acacia, A. fumosa Thulin), a tall self-supporting shrub (Commicarpus macrothamnus Friis & O. Weber), and a number of new species of Barleria from highlands and lowlands. As seen from Table 1, surprisingly many are succulents (Asclepiadaceae, Anacampseros, Euphorbia, Aloe). All this indicates that the work on the flora of the FEE area is not yet complete.

3 Estimates of diversity and endemism using FEE data updated with later described taxa and the FEE Floristic Regions (FRs)

Previous estimates of diversity and endemism by Friis et al. (2001, 2005), Kelbessa and Demissew (2014), Sosef et al. (2017), Wang et al. (2020), Marshall et al. (2016) and Hawthorne and Marshall (2019) are presented in the Introduction. In this part of our paper we will continue studying species richness, number of endemics, number of single-region endemics by updating the results of Friis et al. (2001, 2005) and discuss the results in context with the later papers. In agreement with our latest count the ranking of the FRs according to species richness is (high to low; see Table 2; Fig. 1b): SD, SU, HA, BA, KF, GD, TU, EW, GG, IL, GJ, WG, AR, EE, WU, AF. The main change in relation to the previous count from 2005 is that EW has moved down in the ranking from no. 4 to no. 8. The ranking of the FRs according to number of endemics is according to our latest count (high to low; Table 2; Fig. 1c): SU, GD, SD, BA, HA, TU, GJ, AR, KF, EW, WG, GG, WU, IL, EE, AF, and the ranking according to number of single-region endemics is (high to low; Table 2; Fig. 1d): SD, HA, BA, GD, SU, TU, EW, KF, WG, AR, GG, EE, GJ, WU, IL (no single-region endemic in AF).

Thus, the rankings have been reasonably stable during the increasing completeness of the data. Although there are slight differences in the rankings from Friis et al. (2005) to the data for the completed FEE, the high ranking of SD, SU, HA and GD is unchanged. It is notable that the FRs with the highest rainfall, KF, IL, WG (Friis et al. 2010: Fig. 6; Liljequist 1986), are not as outstanding with regard to diversity and endemism as the mountains in the central and northern highlands or the southern and south-eastern lowlands (it should be noted that in both SD and HA there are small areas with highlands). The concentration of species richness in central, southern and south-eastern Ethiopia (SD, SU, HA, BA) is notable in Fig. 1b. The importance of endemism in the central and northern highlands (SU, GD) and in southern and south-eastern Ethiopia (HA, SD, BA) is seen in Fig. 1c, and the complete dominance of southern and south-eastern Ethiopia with regard to single-region endemics (SD, HA, BA) is seen in Fig. 1d.

As the borders of the FRs are defined mainly by deep river valleys and escarpments, and because distributions of individual species show that plants often cross these boundaries, it is relevant to study in how many FRs the species occur (Table 3) and the similarity between the FRs, represented by the number of shared species and shared endemics (Table 4). From Table 3 the notable observation can be made that as many as 25.74% of the species are recorded from only one FR, 16.42% from only two FRs, 11.59% from three FRs, etc., and therefore more than half of the species (53.75%) occur in three FRs or fewer. This would imply a rapid turnover of species, but may also indicate that many species are strongly underrepresented in the FEE records. The pairwise recording of shared species in Table 4 shows a particularly high number of shared species between many of the FRs with large areas of highland, for example the following 16 pairs of FRs with more than 800 shared species (in decreasing sequence): GD–SU (1095), EW–SU (1080), EW–TU (1070), TU–SU (1067), SU–KF (1049), TU–GD (1034), BA–HA (956), TU–SD (907), SU–GG (904), GJ–SU (899), SU–AR (891), GD–SD (890), EW–HA (855), EW–GD (852), GD–GJ (850), TU–HA (831). It should be noted that seven of these pairs involve the FR of SU, in the central part of Ethiopia, with well-collected areas around Addis Ababa University.
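The two tabulations discussed here, the number of FRs occupied per species and the number of species shared by each pair of FRs, can be derived from a species-by-FR list as in the following sketch. The function names and the toy data are hypothetical; only the three percentages in the last lines are taken from the text.

```python
from collections import Counter
from itertools import combinations


def occupancy_percentages(frs_by_species):
    """Percentage of species recorded from exactly 1, 2, 3, ... FRs."""
    counts = Counter(len(frs) for frs in frs_by_species.values())
    total = len(frs_by_species)
    return {n: 100.0 * c / total for n, c in sorted(counts.items())}


def shared_species(frs_by_species):
    """Number of species shared by every pair of FRs."""
    shared = Counter()
    for frs in frs_by_species.values():
        for pair in combinations(sorted(frs), 2):
            shared[pair] += 1
    return shared


# Invented toy data with hypothetical region codes R1..R3.
toy = {
    "species_a": {"R1"},
    "species_b": {"R1", "R2"},
    "species_c": {"R1", "R2"},
    "species_d": {"R1", "R2", "R3"},
}
occupancy = occupancy_percentages(toy)
pairs = shared_species(toy)
print(occupancy)
print(pairs.most_common())

# Cumulative share of species in one to three FRs, using the percentages
# reported in the text for Table 3.
cumulative = round(25.74 + 16.42 + 11.59, 2)
print(cumulative)
```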

Table 3 Number of FRs occupied by number of species, varying from one FR to 16 FRs
Table 4 Number of indigenous and endemic species shared between FRs (the endemic species in brackets)

Similarly, in Table 4 the number of shared endemic species is also high between many of the FRs with large areas of highland, for example these 29 pairs with more than 40 shared endemics (in decreasing sequence): GD–SU (127), TU–GD (101), GD–GJ (94), SU–BA (87), TU–SU (86), GJ–SU (86), SU–AR (81), SU–SD (75), SD–BA (73), GD–BA (72), SU–KF (70), AR–BA (68), GD–AR (60), SU–HA (59), EW–TU (59), TU–GJ (58), KF–SD (57), BA–HA (57), GD–SD (50), EW–SU (50), WU–SU (47), SU–WG (47), SD–HA (46), GD–KF (46), GJ–BA (44), GD–HA (44), AR–SD (43), GD–WG (41), AR–HA (41). As mentioned above, the high mountains of Ethiopia have been called ‘islands in the sky’ (Assefa et al. 2007), but it is not least the entire set of Ethiopian ‘sky islands’ that is isolated as an Ethiopian ‘sky archipelago’, while the individual islands of the archipelago are not always isolated from each other. It has long been known that the majority of the endemic bird species of the Ethiopian and Eritrean highlands are widespread, many in fact distributed over most of the highlands (Fjeldså and Klerk 2001), but this situation is not so well documented for the endemic highland plants. As should be expected, pairs of FRs in the highlands generally share a high number of widespread endemic taxa. We may mention that Wang et al. (2020) in their Fig. 4 point out the strong floristic link between the northern and southern high mountains in Ethiopia, which is also clear from our Table 4.

We have also illustrated the turnover between FRs with cluster analysis based on the total number of indigenous taxa (5219) and the 627 endemic taxa. We made the two separate clusters to see if the FRs would cluster differently for all species and for the endemics; the results are seen in Fig. 2a (all-species cluster) and b (endemics cluster). EW and TU cluster together both for all species and for endemics, joining with another group of the adjacent FRs, GD and GJ, in the all-species clustering, while GD and SU form a group more loosely associated with GJ in the clustering of the endemics. In the all-species cluster, WU, a fairly dry area on the eastern escarpment of the Ethiopian Highlands, forms a cluster with AR, located on the north-western slopes of the ‘Somalian Plateau’ (shared with northern Somalia), while it joins with HA in the cluster of the endemics. This seems slightly difficult to explain, because the largest number of the endemic species in HA are in the lowlands sloping towards southern Somalia, but WU and HA must indeed share a relatively small number of ‘Ethiopian Highland’ endemics. In the all-species clustering, there is a group of FRs which are mutually close to each other and formed by SU, SD, BA and HA, and joined by GG, located to the west of SD.

The FRs of GG, SD, BA, and HA contain vegetation that is rich in species endemic to southern and south-eastern Ethiopia or near-endemics that just transgress the border between south-eastern Ethiopia and southern Somalia (such species would also have restricted range; we tried to analyse such near-endemics in Friis et al. (2005), but have not followed that up here). In the clustering of the endemics, AR and BA join WU and HA, rather than GG and SD. The latter pair surprisingly form a cluster with the moist western FRs (WG, KF, IL), which we find difficult to explain. As seen from Fig. 1d, SD has the highest number of single-region endemics, which means that it might not cluster so easily with other southern FRs. The species-poor FRs (EE, dry lowlands or desert, and the deserts and subdeserts in AF) form a group both in the all-species clustering and in the clustering of the endemics. The relatively moist regions WG, KF and IL form well defined but slightly different groups in the all-species clustering and in the endemics clustering.

Although some of the far apart FRs with mountainous landscape share species and endemics (Table 4), it seems reasonable to explain the general structure of both the all-species cluster and the endemics cluster with the relative geographical position of the FRs (near each other), and differences or similarities in their topography and climate. Thus we see at least three clear patterns, an axis of endemic-rich FRs with prominence of highlands from GD in the north to BA in the south, and these have related endemic floras, as has been suggested by Wang et al. (2020: Figs. 4, 5A). We see in the all-species clustering a group of floristically related FRs in the humid south-west, but not very rich in endemics, and finally we see in both clusters a varying combination of southern FRs, both rich in species and in endemics, and certainly the richest with regard to single-region endemics.

3.1 Correlations between taxa numbers and areas of Floristic Regions (FRs)

In Friis et al. (2005) we compared the number of taxa and the number of endemics in FRs with their areas, using linear regressions and regression models. Then we ranked the FRs (high–low) based on the difference between observed numbers of taxa and endemics and the predicted numbers of these two categories. For the number of species, the ranking of the FRs according to observed species richness in relation to expected species richness was (highest positive values to lowest negative values; Friis et al. 2005: Table 2; Fig. 4): SU, SD, TU, KF, EW, GG, GD, AR, IL, GJ, BA, WG, WU, EE, HA, AF. In this sequence the observed number of taxa was lower than the predicted number from the FR of AR and onward. For the number of Horn of Africa endemics (Friis et al. 2005: Table 2; Fig. 4) the corresponding ranking of the FRs was: SU, GD, AR, TU, SD, GJ, KF, BA, WU, GG, EW, WG, IL, HA, EE, AF. The FRs GJ and KF contained the same observed number of endemics and this marked the point in the sequence where the observed number of endemic taxa became lower than the predicted number.

We have here repeated the analysis from 2005, using the data presented in Table 2, modelling the species–area relationship for the total number of species, for the number of endemics, and for the number of single-region endemics (Fig. 3). The results show that only a small part of the variance in the number of species and endemics is explained by the area of the FR (R² = 0.183 and 0.116 respectively), and, moreover, that the species–area relationship is mostly not significant (p > 0.05). We did find a significant relationship between the number of single-region endemics and the area of the FR (R² = 0.608, p < 0.001), which is not surprising, as one would expect larger regions to have more single-region endemics.
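The logic of this analysis, fitting a regression of counts on area, reporting R² and ranking regions by the difference between observed and predicted values, can be sketched as below. The analysis for this paper was done in R and the exact model form is not restated here; the sketch assumes an ordinary least-squares line on untransformed values, omits the significance test, and uses invented areas, counts and region names.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = intercept + slope * x, with R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


def rank_by_residual(names, x, y):
    """Order regions from most above to most below the fitted line."""
    slope, intercept, _ = linear_fit(x, y)
    residuals = {
        name: yi - (intercept + slope * xi) for name, xi, yi in zip(names, x, y)
    }
    return sorted(residuals, key=residuals.get, reverse=True)


# Invented data: four hypothetical regions with areas and species counts.
names = ["A", "B", "C", "D"]
areas = [1.0, 2.0, 3.0, 4.0]
counts = [2.0, 4.0, 5.0, 9.0]

slope, intercept, r_squared = linear_fit(areas, counts)
ranking = rank_by_residual(names, areas, counts)
print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
print(ranking)
```

Regions at the head of the ranking hold more species than their area alone would predict; with a low R², as reported for total species and endemics, such a ranking should be read with caution.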

From Fig. 3 we can see that, for the number of species, the ranking of the FRs with more species than expected would seem to be almost the same as in our earlier results based on six published volumes of FEE: SD now ranks highest, followed by SU, and KF now ranks higher than TU. These FRs fall outside (above) the confidence interval, but based on the data we cannot conclude that the FRs within the confidence interval deviate significantly from the predicted value. The FRs that had fewer species than expected in 2005 included the very dry FRs of EE and AF, as well as WU, and in the present result they also have notably fewer taxa than expected based on their areas and fall below the confidence interval. The FRs of SD, SU, GD, and TU have the highest number of endemic species in comparison to their area and fall above the confidence interval. The FRs IL, WU and EE rank lowest and fall below the confidence interval, with a clearly lower ratio of endemics to area than the rest. With regard to single-region endemics, we did not estimate the expected richness in 2005, but here again SD, GD, and TU have notably higher numbers than predicted by the species–area regression line and are above the confidence interval. We can therefore suggest that at least SD is more species rich and richer in single-region endemics than should be expected from its area, although only for the single-region endemics is the relationship significant.

4 Estimates of diversity and endemism using specimen data from RAINBIO and GHI calculated by Hawthorne and Marshall (2019)

RAINBIO is a very large database with ca. 614,000 georeferenced records from Africa, representing 22,577 taxa (https://gdauby.github.io/rainbio/download_page.html; Dauby et al. 2016). According to Sosef et al. (2017), the RAINBIO dataset is a compilation of publicly available data sets from international herbaria, of which the NATURALIS database (https://bioportal.naturalis.nl/) forms the largest component. The data of RAINBIO are supposed to represent a significant increase in data quality compared to the relatively raw data available through GBIF (https://www.gbif.org/). RAINBIO, which is intended to record only indigenous taxa, includes 26,309 georeferenced records from Ethiopia and Eritrea, representing 4599 species. Table 2 (to the right of the FEE columns) and Fig. 4 summarise the findings based on the RAINBIO specimen data. The individual georeferenced records are shown on the maps as small black dots. The most striking difference between the data set from FEE and the one from RAINBIO is that the number of species in the former set is almost consistently higher than in the latter (except for the Afar (AF) region). In the RAINBIO dataset several FRs lack records for more than 1000 species compared to what is expected from the FEE.

The map of the Dutch collections from Ethiopia and Eritrea in the Wageningen Herbarium by Maesen and Wieringa (2011) explains some of the differences between data from RAINBIO and from the FEE: the Dutch botanists who collected for the Wageningen herbarium were based at the Alemaya (now Haramaya) University in the uplands of HA and they often collected along the roads between Addis Ababa and Jimma, between Jimma, Bonga and Mizan Teferi in KF and along the road between Addis Ababa and Nekemt in WG. The northern FRs of EW, TU and GD are represented by relatively few collections in the Wageningen herbarium (now kept at NATURALIS and therefore in RAINBIO); in this part of the FEE area there was civil unrest during the time of the Dutch collecting activity. In the very species-rich SD, the Dutch collecting activity was concentrated around the town of Negele. Another reason why the FRs in Eritrea and northern Ethiopia are poorly represented in the RAINBIO data set is that many collections from these areas were gathered during the early phases of botanical exploration in Ethiopia; they are incorporated in the FEE data, but not in RAINBIO. The changing collecting activity over the years can clearly be seen on an animated gif-file in Sosef et al. (2017: Additional file 7).

The early collections from Ethiopia and Eritrea (mainly kept at FT, K and P) are neither sufficiently georeferenced, nor have they been as frequently databased as the more modern collections. The lowlands of HA and BA are also poorly represented in the RAINBIO dataset; the collections from these areas often date from the time when there was an effective anti-locust service for East Africa organised from Kenya (J. B. Gillett, Royal Botanic Gardens, Kew, pers. comm.), but only a few collections have been made during the last 50 years. The RAINBIO dataset, particularly as represented in the maps on quarter-degree squares in Fig. 4, and chiefly in Fig. 4b and d, also shows how close to the main roads many of the collections have been made; the areas along the roads in southern and south-western Ethiopia would seem to have a much higher species richness than areas further away from the roads. When used for biodiversity studies, the RAINBIO dataset, although very detailed with regard to information about the individual specimens, should be consulted in conjunction with a data set derived from FEE (for example the csv file used for this paper). To improve the point data, specimen data from the many old collections at P, K, FT and ETH need to be captured in existing institutional databases and georeferenced post facto to make them accessible. Preferably, personal field work and general knowledge of Ethiopian and Eritrean nature should also be drawn upon.

We have tried to extract information about the endemic species in Ethiopia and Eritrea, considering as endemics those species that do not occur in other African countries in the RAINBIO database. Such records have been mapped on one- and quarter-degree squares in Fig. 4c, d, and f, and their numbers per FR have been listed in Table 2. Again, there are notable differences between the number of endemics per FR based on FEE and on RAINBIO. Higher numbers of endemics per FR in RAINBIO than in FEE are (in order of decreasing difference) seen in: HA (137), SD (72), AF (64), SU (55), TU (27), WG (3), KF (19), EE (16), GG (11). The high number of endemics from AF in RAINBIO may be caused by the error in our method (see above). The surprisingly high percentage of endemism indicated for Ethiopia (19.9%) by Sosef et al. (2017: Table 1) may be due to this erroneous recording of mainly Asian species which in Africa only occur in Ethiopia and/or Eritrea (or as cultivated species or weeds; see below). Such ‘pseudo-endemics’ would be particularly well represented in the dry FRs of EE and AF, but probably also in HA. We noticed the case of Allium ascalonicum L., the shallot, which RAINBIO recorded from HA, but nowhere else in Africa (it is a cultivated plant, probably domesticated in south-western Asia, and the name is now reduced to a cultivar of Allium cepa; it should almost certainly not have been included in RAINBIO). Other examples are the 19 taxa mentioned in “Cultivated and introduced taxa vs. indigenous taxa”, which are recorded in RAINBIO as African species with one record but which, according to POWO and to us, are introduced weeds or cultivated plants. These pitfalls demonstrate that data from even critically reviewed databases like RAINBIO should be used with great care, and the data in Table 2 and the illustrations in Fig. 4c, d, and f are included here only as a warning against using the method for more than methodological discussion.
We tried to use the same method for assumed single-region endemics, and also here we found higher numbers in RAINBIO from the dry FRs than in the FEE data. We estimate that the extraction of this data is even less reliable than for the endemics, and the data have, therefore, not been included in this paper.
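
The extraction rule described above, and the way it produces ‘pseudo-endemics’, can be illustrated with a minimal sketch. The records, the species names "Species A" and "Species B", and the function names are hypothetical and serve only as illustration; they are not the RAINBIO data structure, and the screening list stands in for an external check of global distribution (for example against POWO).

```python
# Hypothetical (species, country) records standing in for rows of an
# occurrence database restricted to Africa. None of these rows is real data.
AFRICAN_RECORDS = [
    ("Species A", "Ethiopia"),
    ("Species A", "Eritrea"),
    ("Species B", "Ethiopia"),
    ("Species B", "Kenya"),
    ("Allium ascalonicum", "Ethiopia"),  # cultivated; origin outside Africa
]

FEE_COUNTRIES = {"Ethiopia", "Eritrea"}


def apparent_endemics(records, home=FEE_COUNTRIES):
    """Species whose African records all fall inside the home countries."""
    countries = {}
    for species, country in records:
        countries.setdefault(species, set()).add(country)
    return sorted(s for s, seen in countries.items() if seen <= home)


def screened_endemics(records, known_elsewhere, home=FEE_COUNTRIES):
    """Apparent endemics minus taxa known to occur or originate elsewhere."""
    return [s for s in apparent_endemics(records, home)
            if s not in known_elsewhere]


print(apparent_endemics(AFRICAN_RECORDS))
# ['Allium ascalonicum', 'Species A']
print(screened_endemics(AFRICAN_RECORDS, {"Allium ascalonicum"}))
# ['Species A']
```

The first call flags the shallot as an endemic because the rule only looks at African records; the second call shows that the result depends entirely on an independent list of taxa with extra-African distributions.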

Ranking the FRs by declining scores of Hawthorne and Marshall’s ‘bioquality’ (GHI scores from Hawthorne and Marshall 2019: Fig. 1; here listed in Table 2) gives: BA (441), SD (436), HA (422), GD (333), TU (307), SU and AR (287), WU (268), GJ (245), GG (230), AF (205), KF (204), EW (202), WG (198), EE (169), IL (131). The lists of species recorded from each FR were extracted from FEE without updating the data set with newer records (Hawthorne and Marshall 2019: p. 321), and the species lists should, therefore, differ from ours only in lacking the records of the species in our Table 1. For each FR the score was calculated with the formula GHI = 100*(NBK*27 + NGD*9 + NBU*3)/(NBK + NGD + NBU + NGN). For the FRs with the high scores, the findings of Hawthorne and Marshall (2019) compare reasonably well with our findings with regard to single-region endemics (Table 2): SD (61), HA (56), BA (40), GD (28), SU (15), TU and EW (13), KF (7), WG (6), AR (4), GJ and EE (3), WU (2), IL (1), AF (0). However, we do not understand Hawthorne and Marshall’s surprisingly high scores for the dry and not particularly species-rich WU, EE and AF, nor how AF can score higher than EW, not to mention KF and WG. The fraction in Hawthorne and Marshall’s formula for calculating their ‘bioquality’ will produce high scores if the number of common and widespread species is low in relation to NBK, NGD and NBU, but it would be surprising if this were enough to produce the result seen here. Above we found problems with the use of distributional records in RAINBIO for species occurring outside Africa, but according to Marshall et al. (2016) the star rating should have taken note of the global distribution of the rated species.
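
The behaviour of the fraction can be checked with a small worked example. The sketch below implements the formula exactly as quoted above; the species counts are invented for illustration and are not taken from Table 2. NGN is read here as the number of common and widespread species, as the discussion above implies, and NBK, NGD and NBU as the counts in the three weighted rarity classes (an assumption about the notation).

```python
def ghi(nbk, ngd, nbu, ngn):
    """GHI = 100*(NBK*27 + NGD*9 + NBU*3)/(NBK + NGD + NBU + NGN)."""
    total = nbk + ngd + nbu + ngn
    if total == 0:
        raise ValueError("no rated species in this region")
    return 100 * (nbk * 27 + ngd * 9 + nbu * 3) / total


# Two invented regions with identical numbers of weighted (rare) species,
# differing only in the number of unweighted, widespread species.
species_rich = ghi(nbk=2, ngd=5, nbu=10, ngn=83)   # 100 species in total
species_poor = ghi(nbk=2, ngd=5, nbu=10, ngn=33)   # 50 species in total

print(species_rich)  # 129.0
print(species_poor)  # 258.0
```

With the same rare species, halving the total species list doubles the score, so a species-poor dry region with few widespread species recorded can score as high as, or higher than, a species-rich one.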

5 Estimates of altitudinal distribution of diversity and endemism using FEE data updated with later described taxa

As mentioned in the Introduction, special attention should be given to the study of the altitudinal distribution of diversity and endemism in countries that are as topodiverse as Ethiopia and Eritrea. Friis et al. (2001) already found that the vertical distribution of the flora of Ethiopia and Eritrea had a ‘mid-elevation diversity bulge’ at altitudes between ca. 1400 and 2400 m a.s.l. The study is repeated here, using the data available in all volumes of FEE with the addition of species published later. An interval of 350 m is used for the altitudinal zones. Figure 5 shows two column charts of species richness and endemism against altitude: Fig. 5a shows the species richness per altitudinal zone and Fig. 5b the number of endemics in the same zones. The general patterns of this new study are similar to those in Friis et al. (2001): a fairly steep rise in species richness from sea level to a maximum at 1400–1750 m a.s.l. and then a slow decline until ca. 3150 m a.s.l., above which the elevation histogram shows a ‘tail’ in which there is only a slight decline in species richness up to the highest peaks of the Semien and the Bale mountains. The numbers of endemic species per altitudinal zone are low at the lowest altitudes, but start climbing at around 1000 m a.s.l., reach a maximum at 1700–2150 m a.s.l. and then drop off with a much less prominent ‘tail’ than in the histogram for species richness.
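
How species richness per 350 m zone can be derived from altitudinal ranges is sketched below. Several points are assumptions made for the illustration and not statements about the procedure used for Fig. 5: that the lowest zone starts at 350 m below sea level (which gives 14 zones reaching 4550 m), that a species is counted in every zone between its lowest and highest recorded altitude, and that an altitude lying exactly on a zone limit is assigned to the upper zone. The species ranges are invented.

```python
ZONE_WIDTH = 350    # m, as used in the text
ZONE_FLOOR = -350   # m; assumed lower limit of the lowest zone
N_ZONES = 14


def zone_bounds(i):
    """Lower and upper limit (m) of zone number i, counted from 0."""
    lower = ZONE_FLOOR + i * ZONE_WIDTH
    return lower, lower + ZONE_WIDTH


def zones_occupied(min_alt, max_alt):
    """Indices of all zones overlapped by the range min_alt..max_alt."""
    return [i for i in range(N_ZONES)
            if min_alt < zone_bounds(i)[1] and max_alt >= zone_bounds(i)[0]]


def richness_per_zone(ranges):
    """Number of species per zone, given {species: (min_alt, max_alt)}."""
    counts = [0] * N_ZONES
    for min_alt, max_alt in ranges.values():
        for i in zones_occupied(min_alt, max_alt):
            counts[i] += 1
    return counts


# Invented altitudinal ranges (m a.s.l.) for four hypothetical species.
example = {
    "sp1": (500, 1600),
    "sp2": (1500, 1700),
    "sp3": (1450, 2500),
    "sp4": (3000, 4300),
}
counts = richness_per_zone(example)
richest = counts.index(max(counts))
print(zone_bounds(richest), counts[richest])  # (1400, 1750) 3
```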

It is widely accepted that there are discontinuities in the floristic altitudinal gradients on tropical mountains, marking a zonation of the vegetation and the flora. Apart from attempts by early travellers visiting African mountains, the definition of separate altitudinal zones began with the recognition of the Afroalpine zone above 3800–4000 m a.s.l., studied in detail in a revision of the Afroalpine flora (Hedberg 1957), and later the recognition of an Afromontane zone, particularly in forest vegetation between ca. 1800 and 3800 m a.s.l., described in a monograph of African vegetation (White 1983). How does a distinction between a lowland flora and an Afromontane zone at ca. 1800 m a.s.l. agree with a ‘mid-elevation diversity bulge’ at altitudes between ca. 1400 and 2400 m a.s.l. and with the rather long ‘tail’ above 3500 m a.s.l. of altitudinal zones for which constantly low diversity is scored? The beginning of this ‘tail’ of low diversity at ca. 3500 m a.s.l. agrees rather well with the lower limit of the Ericaceous belt, a relatively homogeneous vegetation below the Afroalpine vegetation proper (Friis et al. 2010). The ‘bulge’ just below 2000 m is more difficult to reconcile with the lower limit of the Afromontane zone at ca. 1800 m a.s.l. In a study of the forest vegetation of the Horn of Africa, Friis (1992: Fig. 6) studied the altitudinal zonation of the flora of forest trees with clustering analyses. He found a ‘bulge’ around 1800 m a.s.l., a marked discontinuity at ca. 1500 m a.s.l. and another between 3000 and 3350 m. The woody forest flora below 1500 m a.s.l. agreed best with the lowland forest flora, the woody forest flora between 1500 m and 3000–3350 m a.s.l. agreed best with the Afromontane flora of White, and above ca. 3350 m a.s.l. the woody flora agreed with the zone below Hedberg’s Afroalpine vegetation.

In order to see how the ‘mid-elevation diversity bulge’ is floristically related to the altitudinal zonation, we made a cluster analysis of the entire range of the 14 zones of 350 m each, calculating the floristic similarities between the zones and clustering them according to their similarity (Fig. 6). We found that the floristic contents of the zones fell into two main clusters with a notable discontinuity around 2450 m a.s.l., which is higher than both White’s lower limit of the Afromontane region and the discontinuity in the woody forest flora found by Friis (1992), where a slightly narrower interval (305 m) was used in the cluster analysis. (White’s limit of 1800 m was based on direct observation of data without cluster analysis.) It has not been possible to give an explanation for this difference, and it is not likely that there is only one explanation that will hold for all plant families and for all vegetation types. Fig. 6 confirms the distinction between a lowland and a highland flora in Ethiopia and Eritrea, but the interpretation of this is complicated by the floristic differences between eastern and western Ethiopia (as already pointed out in Friis et al. 2010: Fig. 36), which are clearly seen in the family Burseraceae (Fig. 7e). As can be seen in Fig. 7, some families or higher-ranking taxonomic groups have a distribution of species richness on altitude that agrees with the aggregated graph in Fig. 5, for example the Asteraceae (Fig. 7a). Moisture-loving plants, such as most ferns, would be expected to agree with the idea of Rahbek (1995), having a maximum diversity where the clouds meet the mountain slopes, and indeed the graph of the altitudinal species richness of ferns has a relatively narrow ‘bulge’ with a maximum of species richness at 1750–2100 m a.s.l. (Fig. 7b). The Apiaceae (Umbelliferae) is a mainly temperate family, and the species that occur in Ethiopia are few in the lower altitudinal zones, but are more numerous above 1750–2100 m a.s.l. (Fig. 7c).
The tropical grasses (Poaceae) are known to have two different photosynthetic pathways (Phillips 2014); the temperate grass species that also occur in the high Afromontane and Afroalpine vegetation have mainly a C3 pathway (a few have a C4 pathway), but the lowland species have a C4 pathway. The chart of altitudinal diversity of Ethiopian grasses (Fig. 7d) has a normal ‘bulge’ at 1400–1750 m a.s.l. and a ‘tail’ from ca. 3150 m and upwards; the former ‘bulge’, slightly lower than the general ‘bulge’ for all species, could represent a majority of C4 species at these altitudes, while the ‘tail’ almost certainly must represent the much fewer temperate C3 grasses in the high Afromontane and Afroalpine zones. The family Burseraceae is, according to FEE, almost entirely restricted to the eastern slopes of the Ethiopian highlands (Fig. 7e), and the highest number of species occurs at 700–1050 m a.s.l. It is, therefore, clear that the general curve for altitudinal species richness is an aggregation of different patterns that together build up to a mid-elevation bulge.
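
The principle of clustering altitudinal zones by floristic similarity can be illustrated with a minimal sketch. The similarity index and the clustering algorithm behind Fig. 6 are not specified in this section, so the Sørensen index and average-linkage (UPGMA) clustering used below are assumptions chosen for illustration, and the four zones with their species (single letters) are invented.

```python
from itertools import combinations


def sorensen(a, b):
    """Sørensen similarity between two sets of species."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def average_linkage(zone_species):
    """Agglomerative clustering of zones by mean pairwise similarity.

    Returns the merge history as (cluster_1, cluster_2, similarity) tuples.
    """
    names = list(zone_species)
    sim = {frozenset(pair): sorensen(zone_species[pair[0]], zone_species[pair[1]])
           for pair in combinations(names, 2)}

    def mean_sim(c1, c2):
        return sum(sim[frozenset((p, q))] for p in c1 for q in c2) / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Join the two clusters that are most similar on average.
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: mean_sim(*pair))
        merges.append((c1, c2, mean_sim(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Invented species lists; "g" is a widespread species present in every zone.
zones = {
    "700-1050": {"a", "b", "c", "g"},
    "1050-1400": {"b", "c", "d", "g"},
    "2450-2800": {"x", "y", "z", "g"},
    "2800-3150": {"y", "z", "w", "g"},
}
merges = average_linkage(zones)
low, high, similarity = merges[-1]   # the last merge joins the two main clusters
print(low, high, similarity)
```

The last entry of the merge history gives the two main clusters and the similarity at which they join; the altitude separating them corresponds to the kind of discontinuity discussed above.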

The raw altitudinal data, as extracted from FEE and the protologues of subsequently described species, tend to indicate rather narrow altitudinal ranges for many species. In Table 5 we have summarised the number of altitudinal zones from which the species in FEE have been recorded. The highest number of species is recorded from only one altitudinal zone, but it is also very common that species are recorded from two to four zones (thus having altitudinal ranges of 700–1400 m). These ranges can be ranked in a sequence of a category spanning three zones (955 species), then a category spanning two zones (930 species) and then a category spanning four zones (896 species), and then with steadily declining numbers up to one species spanning 12 zones. However, very few species span more than nine zones (having altitudinal ranges of 3500–4200 m). It thus seems that many altitudinal zones largely have their own unique sets of species.
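
A tally of the kind summarised in Table 5 can be sketched as follows. The zone limits (lowest zone starting at 350 m below sea level, altitudes on a limit assigned to the upper zone) are assumptions made for the illustration, and the four species ranges are invented; the sketch does not reproduce the numbers in Table 5.

```python
from collections import Counter

ZONE_WIDTH = 350    # m, as used in the text
ZONE_FLOOR = -350   # m; assumed lower limit of the lowest zone


def zone_index(altitude):
    """Index (from 0) of the 350 m zone containing the altitude."""
    return (altitude - ZONE_FLOOR) // ZONE_WIDTH


def zones_spanned(min_alt, max_alt):
    """Number of zones from the lowest to the highest record of a species."""
    return zone_index(max_alt) - zone_index(min_alt) + 1


def span_tally(ranges):
    """Map number of zones spanned -> number of species with that span."""
    return Counter(zones_spanned(lo, hi) for lo, hi in ranges.values())


# Invented altitudinal ranges (m a.s.l.) for four hypothetical species.
example_ranges = {
    "sp1": (500, 1600),
    "sp2": (1500, 1700),
    "sp3": (1450, 2500),
    "sp4": (3000, 4300),
}
for n_zones, n_species in sorted(span_tally(example_ranges).items()):
    print(n_zones, n_species)
# 1 1
# 4 2
# 5 1
```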

Table 5 Number of altitudinal zones of 350 m (from below sea level to the highest mountain peaks) occupied by number of species

6 What is needed for more definite conclusions about Ethiopian and Eritrean plant diversity?

There are still a number of challenges on the way to an image of Ethiopian and Eritrean biodiversity and endemism that does not change with changes in analytical methods and sources of datasets. In the following we will outline some of these challenges. In spite of the difficulties we have encountered with the RAINBIO data, we do not doubt that a sufficient number of point data from all parts of the FEE area and from a wide range of altitudes will ultimately be the best basis to work from, as is also demonstrated by Marshall et al. (2016) and Hawthorne and Marshall (2019). But although the collections of a number of classical nineteenth and early twentieth century expeditions have now been studied and many of their collecting localities partly identified, many collections from the nineteenth and early twentieth century remain poorly georeferenced. These early collections were mainly made in Eritrea and northern Ethiopia, with fewer from early geographical explorations or big-game hunting in the south and south-east (Friis 2019). As can be presumed from a comparison of the FEE data with the data in RAINBIO, the old material was taken into account for distributional recording on FRs in FEE, but it is mostly not included in the new international databases. The four major institutions with old or poorly georeferenced material from Ethiopia and Eritrea are the herbarium at Kew, UK; the Centro Studi Erbario Tropicale in Firenze, Italy; the herbarium of the Muséum National d’Histoire Naturelle in Paris, France; and the National Herbarium of Ethiopia in Addis Ababa. Of these, only the Muséum National d’Histoire Naturelle seems to have scanned most or all of its Ethiopian and Eritrean material and made it available on the internet, but the images are not yet sufficiently databased. The collections at NATURALIS are of high quality and well databased and scanned, but contain only relatively new material and are geographically rather focussed.

Another obstacle to an even coverage of collections from Ethiopia and Eritrea is the often inaccessible terrain. We have seen Ethiopian farmers walk along small paths very far over the mountains from their homes to their fields, but it is unlikely that botanical collecting activity will move very far from the roads. The maps showing georeferenced collections from RAINBIO in Fig. 4 are almost maps of the major Ethiopian roads. More collecting activity away from the roads is desirable, and specimens should be deposited in responsible and well-curated herbaria, to which bona fide scientists should have access. Wang et al. (2020) made a phytogeographical classification of the biodiversity of Ethiopia and Eritrea with sub-provinces and districts in which altitudinal differences of up to 3500 m were included. This lack of concern among researchers for the highly topodiverse terrain of Ethiopia and Eritrea highlights the need for more precise altitudinal data and for more attention to such data. Even with intensified collecting activity we will not get a complete coverage of collections, neither as specimens nor as digital records, but better three-dimensional models based on altitude and climate will be a step towards spanning the gaps, and adequate data and software should be made available for this.

Ways to improve our knowledge of the biodiversity and endemism of Ethiopia and Eritrea would be as follows: (1) Improved collecting: widen the collecting activity geographically and secure subsequent wide access to the collections, both physically in Ethiopia and electronically via specimen images and databases online. (2) Improved access to and interpretation of existing collections: more scanning of older collections with digital images on the internet, and better means of reliable georeferencing, such as articles or monographs on the schedules and localities of the collectors who worked in the nineteenth and twentieth centuries up to the days of the GPS; the old collections were often made far from the modern roads and quote place names no longer in use. (3) Improved modelling of species distribution: in recent years many statistical methods have been used to spatially interpolate species distributions, an approach known as species distribution modelling (SDM) or habitat suitability modelling. It will be desirable to produce more high-resolution models of the distributions of many more Ethiopian and Eritrean species (ideally all species); these models should be based on detailed digital elevation models (DEMs) and high-quality meteorological information (e.g. BIOCLIM; https://www.worldclim.org/data/bioclim.html). (4) Integration of the improved distribution models: development of a procedure that can combine as many high-resolution modelled distributions as possible into biodiversity maps. As mentioned under (3), models of species’ distributions are frequently used to infer the importance of range-defining variables, but a higher level of synthesis, ‘models that can synthesize models’, is needed.
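
One simple way of combining modelled distributions into a richness map, as called for under (4), is to convert each species’ suitability grid to presence or absence with a threshold and to sum the resulting grids cell by cell. The sketch below illustrates only this stacking step and is not a proposal for the higher-level synthesis asked for above; the function name, the threshold of 0.5 and the two small grids are hypothetical, and the suitability values are invented rather than output from any real model.

```python
def stack_richness(suitability_maps, threshold=0.5):
    """Sum thresholded suitability grids into a species-richness grid.

    suitability_maps: {species: grid}, each grid a list of rows of values
    between 0 and 1, all on the same raster. The threshold is an assumption.
    """
    grids = list(suitability_maps.values())
    if not grids:
        return []
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    richness = [[0] * n_cols for _ in range(n_rows)]
    for grid in grids:
        if len(grid) != n_rows or any(len(row) != n_cols for row in grid):
            raise ValueError("all grids must share the same raster dimensions")
        for r, row in enumerate(grid):
            for c, value in enumerate(row):
                if value >= threshold:   # modelled as present in this cell
                    richness[r][c] += 1
    return richness


# Invented 2 x 2 suitability grids for two hypothetical species.
models = {
    "sp1": [[0.9, 0.2], [0.6, 0.1]],
    "sp2": [[0.7, 0.8], [0.3, 0.1]],
}
print(stack_richness(models))  # [[2, 1], [1, 0]]
```

The result depends strongly on the threshold and on the quality of each underlying model, which is one reason why a more careful synthesis than simple stacking is desirable.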

Before this is achieved, we will continue to meet challenges and contradictions as pointed out in this paper. Studies based on larger areas, such as FRs, do not always agree with studies based on point data or data in one- or quarter-degree squares, and the results from such methods have to be evaluated against each other and against experience from extensive field work.