Introduction

Over 2000 species of insects are regularly consumed worldwide by more than two billion people in rural and urban settings (Jongema 2017; Raheem et al. 2019). Entomophagy (i.e., the practice of eating insects) is part of the culture and traditions of many communities, and is often believed to yield health and medicinal benefits (Egan 2013). Insects can be an important element in improving food security and in addressing the Zero Hunger aspiration of the United Nations’ Sustainable Development Goals (United Nations 2017). Significant progress towards meeting this goal has been achieved, but more sustainable methods of meeting healthy dietary requirements are needed, including a decrease in the global consumption of livestock meat products (Rust et al. 2020) and an increase in the use of animals with high feed conversion efficiency (Makkar 2018). Edible insects satisfy some of these criteria (Alexander et al. 2017) and can compare favourably to conventional meat products (Nowakowski et al. 2021). Moreover, entomofarming (i.e., mass-rearing of edible insects under controlled conditions) can be less detrimental to the environment than conventional livestock production as it emits fewer greenhouse gases and requires less land and feed per kilogram of protein and edible weight, respectively (Baiano 2020). The Food and Agriculture Organisation of the United Nations has recommended the promotion of edible insects (van Huis et al. 2013; Riggi et al. 2015). Much effort is now being directed to accessing, archiving and promoting the information held by indigenous knowledge holders about which particular insects are edible, as well as how to find, capture, prepare and preserve them (Lesnik 2017; Selaledi et al. 2021). This age-old information must, however, be linked with the verifiable scientific names of the insects for it to be useful in a global context (Turpin and Si 2017).

The exact number of edible insect species utilized in Africa is contested. Studies have reported between 246 and 524 edible species on the continent, with Central Africa hosting the most edible insect biodiversity (365 species), followed by Southern Africa (164 species), Madagascar (101 species), Eastern Africa (100 species), Western Africa (91 species) and Northern Africa (18 species) (Fisher and Hugel 2022; van Huis 2003, 2020; Jongema 2017; Kelemu et al. 2015). Many African communities rely on edible insects for nutrients; for example, the Gbaya people in the Democratic Republic of Congo draw 15% of their protein intake from edible insects (Raheem et al. 2019). In some cases, agricultural pests can be consumed as a biological control method, e.g. Schistocerca gregaria (the desert locust) and Zonocerus variegatus (the painted grasshopper or variegated grasshopper) (van Huis 2020; Kekeunou et al. 2006). Orthoptera (crickets, grasshoppers and locusts) are the second most consumed insect group in sub-Saharan Africa after Lepidoptera (Jongema 2017), with grasshoppers and locusts being the most consumed insects in the Central African Republic, Uganda and Madagascar (Fisher and Hugel 2022; van Huis 2020). The call for increased pesticide use in reaction to the recent upsurge in locust swarms in Africa (Xu et al. 2021), despite the negative effects on ecosystems, highlights the importance of finding novel ways to reduce the size of such swarms. Utilizing Orthoptera more widely as a source of animal protein for human consumption could reduce locust populations in a more environmentally sustainable manner (Kimathi et al. 2020).

In Southern Africa, 164 species of edible insects have been recorded, most of which are harvested from the wild (van Huis 2013; Kelemu et al. 2015; Stull et al. 2018). Communities interviewed in South Africa, Zambia and Zimbabwe gave tradition, taste, nutritional value and low cost associated with wild-harvesting as reasons for consuming edible insects (Egan 2013; Manditsera et al. 2018; Stull et al. 2018). Furthermore, the trade of edible insects plays a relevant role in generating income in communities in Botswana, South Africa, Zambia and Zimbabwe (Mutungi et al. 2019).

In South Africa, entomophagy is primarily practiced in Mpumalanga, KwaZulu-Natal, North West, Gauteng and Limpopo provinces, with the last being the most important region (Hlongwane et al. 2021). A study conducted in Limpopo in 2021 reported that 85% of residents within five municipalities consume grasshoppers, making them the third most popular edible insects after mopane worms (the local name of the caterpillars of Emperor moths such as Gonimbrasia belina) and termites (Hlongwane et al. 2021). Within these and other communities, edible grasshoppers are recognized with local species identifiers and given vernacular names, i.e. ethnospecies. These names are specific to the language of the community and often differ from one place to the next, making it challenging to assign a particular ethnospecies to a scientific species.

Some of the edible grasshopper species harvested from the wild in South Africa could be suitable candidates for mass-rearing (Egan 2013). Being an already well-known food resource, commercial exploitation of grasshoppers could assist in addressing food insecurity, elevating traditional food cultures and promoting economic development in the country (Payne et al. 2016). However, edible grasshoppers in South Africa are largely uncharacterised, making it challenging to link nutritional properties and biological characteristics with species names; this, in turn, hinders progress of potential commercialisation (Stull and Patz 2020). Ambiguity around species names also has a detrimental effect on ecological investigations into the potential impact of wild harvesting (Turpin and Si 2017).

Morphological identification of insects, including grasshoppers, can be challenging due to phenotypic plasticity, genetic variation within populations, camouflage or mimicry, sexual dimorphism and morphological changes throughout the life cycle of species (Friedheim 2016), often leading to subjectivity and disagreement amongst taxonomists. Genetic information, such as DNA barcodes, may assist in closing the species identification gap using sequences of specimens expertly identified as reference for querying unknown sequences. Studies that use DNA barcoding for documenting the diversity of wild edible insects are still rare and represent only a small part of the wide range of taxonomic groups consumed worldwide: e.g. Vespa species utilized in China (Wang et al. 2022), leaf-cutting ants in Colombia (Kooij et al. 2018), caterpillars, beetles and grasshoppers of Northern Angola (Lautenschläger et al. 2017), Macrotermes termites (Egan et al. 2021), Emperor moth caterpillars in eastern and southern Africa (Kusia et al. 2021; Nethavhani et al. 2022), and grasshoppers in Mexico (Pedraza-Lara et al. 2015), and East Africa (Leonard et al. 2020).

Our study explored the potential utility of information on ethnospecies identified by local communities, classic taxonomy, phylogenetics, and delineation of genetic groups for documenting edible grasshoppers in South Africa.

Materials and methods

Specimen collection and study sites

Community members who regularly harvest and consume edible grasshoppers assisted with specimen collection at seven sites in the areas of Bolubedu South (Moleketla and Lufule 2), Giyani (Ka-Homu), Kurisa Moya (Houboschdorp), Hoedspruit and Sekhukhune (Leolo Mountain) in the Limpopo province of South Africa between September 2020 and February 2021 (Fig. 1a; Table S1).

Fig. 1

a Approximate location of sampling areas of edible grasshoppers in Limpopo province, South Africa: 1 – Lufule 2, 2 – Ka-Homu, 3 – Moleketla, 4 – Kurisa Moya, 5 – Mankweng, 6 – Leolo Mountain, and 7 – Hoedspruit. b Old-planted lands and communal open fields in Moleketla, collection of grasshoppers with leafy branch, and food preparation

The grasshopper collection areas are located in rural and semi-rural parts of the Savannah Biome of Limpopo and are inhabited or regularly frequented for activities such as farming (Fig. 1b). The villages of Lufule 2 and Moleketla (Bolubedu South) are set in the subtropical Mopani District. Grasshoppers were found outside household yards, in the commonage consisting of old pastures and abandoned plantations. The vegetation type is Tzaneen Sour Bushveld consisting of tall, deciduous, open canopied shrubs. The mean annual rainfall in this area is 660 mm, with a mean annual temperature of 19 °C, a high of 36 °C in January and a low of 4 °C in June (Rutherford et al. 2006). Kurisa Moya is set on a private farm close to historical Houboschdorp on the Transvaal Drakensberg Escarpment. The area receives over 700 mm of rain per annum, and mist is common due to the orographic effect of the escarpment. The mean annual temperature is 17 °C and the insects were collected from small patches of undisturbed Woodbush Granite Grassland and Northern Mistbelt Forest, as well as from land transformed by pine and gum plantations.

Leolo Mountain is located in the Sekhukhune District, along the north-eastern Drakensberg Escarpment, and is one of the few areas undisturbed by mining, where the natural resources still available are heavily utilised. This escarpment area is an extension of the norite mountain chain of the Dwars River Mountains in the south and consists of grasslands and scattered tree pockets of Leolo Summit Sourveld. Annual precipitation (mean 660 mm) is considerably higher than that of the Sekhukhune plains below and mean annual temperatures are lower at 15 °C (Rutherford et al. 2006). The Ka-Homu village complex, located outside Giyani and below the Manombi Mountain, is vegetated with Lowveld Rugged Mopaneveld. This summer rainfall area has very dry winters and mean annual precipitation of 400 to 600 mm. Frost occurs in low-lying areas, but hot summers result in a high mean annual temperature of 22 °C (Rutherford et al. 2006).

Community members collected specimens using a traditional method that involves disturbing the underbrush with an approximately one-metre-long leafy branch to agitate the grasshoppers, forcing them into the air where they are visible (Fig. 1b). Once the grasshoppers land, they are stunned using this branch and collected by hand, and hind legs are typically removed to prevent the insects escaping. The latter step was avoided to preserve the integrity of the specimens as much as possible for morphological analyses. Upon collection from the field, specimens were euthanized by freezing and individually stored in 100% ethanol until downstream analyses. The specimens will be housed at the Entomological Collection of the Iziko Museum Cape Town (curator Dr Simon van Noort) and are presently available upon request to Dr van Asch (bva@sun.ac.za), Genetics Department of Stellenbosch University.

DNA extraction, PCR amplification and sequencing

We excised approximately 2 mm³ of tissue from the upper leg muscle of one hind femur of each specimen using sterile instruments. We preferentially extracted DNA from the larger muscles of the hind leg but in cases where the specimen was missing both hind legs, enough material was obtained from a foreleg or a midleg. Tissue samples were dried at 37 °C for one hour and homogenised prior to DNA extraction using a standard phenol-chloroform method (Sambrook et al. 1989). DNA concentration (ng/µl) and quality (absorbance ratios at 260/280 and 260/230) were determined using a NanoDrop Spectrophotometer (ThermoFisher Scientific), and samples were diluted to approximately 200 ng/µl, where applicable. Final DNA concentrations used for PCR amplifications were within the range of 20–200 ng/µl. We amplified the standard barcoding region of the mitochondrial COI gene using various combinations of new primers designed to amplify a minimum of 702 bp overlapping the standard barcoding region delimited by the universal primer pair HCO2198 and LCO1490 (Folmer et al. 1994). The new primers were designed based on the consensus sequence obtained from taxon-specific alignments of complete COI sequences (Table S2). We searched manually for multiple potential primer annealing regions in each COI sequence alignment and selected the best primer pairs based on lowest possible number of polymorphic positions, GC content > 40%, ΔTa between forward and reverse primers ≤ 2 °C, and absence of primer dimers and hairpins. New primer pairs were initially tested at 50 °C for the presence of at least one main band of ~ 700 bp relative to a 100-bp DNA ladder. Subsequently, Ta was increased until a single band was clearly visible, to avoid Sanger sequencing of non-specific PCR products. In cases where multiple bands were not eliminated by the use of different primer combinations and higher Ta, the barcoding of the specimen was deemed unsuccessful.
We performed PCR amplifications in a total volume of 5 µl comprising 2.5 µl of QIAGEN Multiplex PCR Kit (QIAGEN), 0.5 µl of each primer (10 mM), 1 µl of Milli-Q water and 0.5 µl of template DNA, as follows: 95 °C for 15 min; 35 cycles of 95 °C for 30 s, Ta (52–64 °C) for 90 s, 72 °C for 90 s; and 72 °C for 10 min. We performed sequencing uni- or bidirectionally using the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems), and products were run in an ABI 3730xl DNA Analyser (Applied Biosystems) at the Central Analytical Facilities of Stellenbosch University, South Africa. We manually inspected and curated all sequences before translation using the invertebrate mitochondrial genetic code in Geneious Prime 2021.2.2 (https://www.geneious.com) to screen for premature stop codons and frameshift mutations indicative of spurious amplification of non-target DNA.
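The two numeric primer selection criteria described above (GC content > 40% and a forward/reverse temperature difference ≤ 2 °C) can be illustrated with a minimal Python sketch. The primer sequences are hypothetical, and the Wallace rule is used only as a simple stand-in temperature estimate; the text does not state how temperatures were calculated, and the dimer, hairpin and polymorphism checks are not covered here.

```python
def gc_content(primer: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> float:
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C).

    Assumption: a stand-in for whichever temperature estimate was
    actually used when comparing forward and reverse primers.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def passes_screen(forward: str, reverse: str,
                  min_gc: float = 40.0, max_delta: float = 2.0) -> bool:
    """True if both primers exceed min_gc and their temperatures differ by <= max_delta."""
    gc_ok = gc_content(forward) > min_gc and gc_content(reverse) > min_gc
    delta_ok = abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_delta
    return gc_ok and delta_ok


# Hypothetical primer pair, invented for illustration only.
fwd = "ACGTTGCAGGATCCATTGAC"
rev = "TGCAACGTGGCATTCAGTCA"
print(gc_content(fwd), wallace_tm(fwd), passes_screen(fwd, rev))
```

A pair that fails either criterion would simply be dropped before bench testing at 50 °C.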

Ethnospecies, taxonomic and DNA-based identification

In South Africa, vernacular names of edible insects are highly correlated with the locality in which they are collected and often vary from one village to the next. Vernacular names may differ in tone or spelling, or even be completely distinct. Because the authors are not fluent in the local languages of Sepedi and Xitsonga, younger members of the communities who were fluent in English were asked to assist with grasshopper collection. However, younger people tend to spend little time on traditional activities owing to more modernized lifestyles and have not had the opportunity to learn enough about the edible insects to be fully knowledgeable of names, uses, habitat and preparation methods. Older residents, who are still well versed in the natural history of their area because of the time they spend collecting wild foods and herding cattle, were therefore engaged to confirm that the correct ethnospecies names were used. At the end of the collecting season, the researcher visited each village and met with all community members involved with the collection, as well as those who did not collect grasshoppers but were well experienced in utilising the insects, in order to determine the locally correct ethnospecies name for each insect. The younger, school-going members of the community assisted with spelling in cases where older members were not literate. The community members participating in the grasshopper collections were mostly Sepedi speakers, the largest ethnic group in the region where sampling was conducted.

Co-author S. Hugel performed morphological identification of specimens based on high-resolution photographs of diagnostic characters. First, the keys of Dirsh (1965) were used to provide a tentative generic identification. All African species from these and related genera were then checked to identify specimens using available generic revisions. Finally, specimens were compared to those from reference collections in the Museum National d’Histoire Naturelle (Paris) and the Musée Zoologique de la Ville et de l’Université de Strasbourg. When the state of the samples did not make it possible to identify the species with certainty (absence of hind legs, only females available, etc.), a tentative identification was proposed based on the distribution of possible species. If only one possible species was known from the area, the tentative identification was marked as probable. If several species were possible, but one was more frequent according to our expertise, the identification was indicated as possible.

All DNA barcode sequences generated in this study were queried against the National Center for Biotechnology Information (NCBI) database (www.ncbi.nlm.nih.gov) using BLASTn, and against the Barcode of Life Data (BOLD) Systems database (https://www.boldsystems.org/) using the Identification tool (Species Level Barcode), both accessed on November 5, 2021. The top match for each sequence and its percentage of sequence similarity (BOLD) and percentage of sequence identity (BLASTn) was recorded. For both BOLD and BLASTn searches, the threshold of > 95% was considered a reliable match. A large proportion of the sequences did not yield a match > 95%; in those cases, when at least two of the three identifications (BOLD, BLASTn or alpha taxonomy) were consistent, phylogenetic clustering patterns were used for inferring which taxonomic group was potentially correct (see “Determination of genetic groups likely to represent distinct species” section). In cases where taxonomic identifications (morphology, BOLD and/or NCBI) resulted in inconsistent subfamily identification, the most likely subfamily was inferred based on phylogenetic clustering patterns (GR10 and OR95 in G1, OR76 in G2, OR24 in G3, OR102 and OR105 in G6, MBO01 and MBO04 in G10, MAS01 in G15, MBO03, MBO05 and MBO06 in G24), or labelled as undetermined (all specimens in G19 to G20 and G27 to G28) (Fig. 4). We adopted a conservative approach for appending scientific taxa to DNA sequences for the purpose of deposit on GenBank which aimed at the highest taxonomic level that could be attained considering alpha taxonomy and phylogenetics. For example, G27 and G28 were deposited as Orthoptera, and sequences in G7, G8 and G9 were all deposited as Morphacris fasciata. In other cases, alpha taxonomy was unambiguous and took priority over database searches: for example, sequences in G35 were identified by alpha taxonomy with confidence as Zonocerus elegans elegans although database searches resulted in Z. elegans; therefore, these sequences were deposited on GenBank as Z. elegans elegans.
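The first two steps of this decision procedure (the > 95% reliability threshold and the check for agreement between at least two of the three identifications) can be sketched in Python as follows. This is an illustration under assumptions: the function names are hypothetical, and the subsequent step of inspecting phylogenetic clustering patterns is a manual judgement that the sketch does not attempt to reproduce.

```python
from collections import Counter
from typing import Optional

MATCH_THRESHOLD = 95.0  # percent; a match must exceed this to count as reliable


def is_reliable_match(percent: float) -> bool:
    """Apply the > 95% rule to a BOLD similarity or BLASTn identity value."""
    return percent > MATCH_THRESHOLD


def agreed_taxon(morphology: Optional[str],
                 bold: Optional[str],
                 blastn: Optional[str]) -> Optional[str]:
    """Return the taxon named by at least two of the three sources, else None.

    None as an input means that source gave no identification.
    """
    calls = [c for c in (morphology, bold, blastn) if c is not None]
    if not calls:
        return None
    taxon, votes = Counter(calls).most_common(1)[0]
    return taxon if votes >= 2 else None


# Hypothetical example: two sources agree on the subfamily, one differs.
print(is_reliable_match(93.4))
print(agreed_taxon("Oedipodinae", "Oedipodinae", "Acridinae"))
```

A taxon returned by `agreed_taxon` would only be a candidate; in the workflow described above it is then confronted with the clustering pattern on the tree.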

Determination of genetic groups likely to represent distinct species

Multiple sequence alignments were performed using the MAFFT algorithm (Katoh and Standley 2013) in Geneious Prime. Phylogenetic clusters were recovered on a maximum-likelihood (ML) tree built in IQ-TREE (Nguyen et al. 2015) under the best fit model based on the Bayesian Information Criterion, using the Ultrafast Bootstrap Approximation for estimating nodal support (Hoang et al. 2018). The final trees were drawn using TreeGraph2 (Stöver and Müller 2010) and FigTree v1.4.4 (http://tree.bio.ed.ac.uk/). The new DNA barcodes (n = 116) were used to construct a phylogenetic tree for visualization of genetic clusters and calculation of intra-cluster divergence, disregarding taxonomic and ethnospecies information. Genetic groups likely to represent distinct species were identified by calculating maximum pairwise distances (max p-distance, %) for all clusters recovered in the ML tree, using MEGA X (Kumar et al. 2016) under the Kimura 2-parameter (K2P) model (Kimura 1980). Statistical support for p-distances was based on 1000 bootstrap replicates. Genetic groups were determined using intragroup maximum p-distance of 3% as the upper threshold.
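To make the grouping criterion concrete, the following Python sketch computes Kimura 2-parameter distances between aligned sequences and tests whether the maximum pairwise distance within a cluster stays below 3%. It is not the MEGA implementation: pairwise deletion of gaps and ambiguous bases is an assumption, the bootstrap step is omitted, and the toy sequences are invented.

```python
import math
from itertools import combinations
from typing import List, Optional

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura (1980) distance: d = -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)).

    P and Q are the proportions of transitions and transversions.
    Sites with gaps or ambiguous bases are skipped (assumed pairwise deletion).
    Raises ValueError for unaligned input or saturated pairs (log undefined).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def max_pairwise_percent(sequences: List[str]) -> Optional[float]:
    """Largest pairwise distance in percent, or None for fewer than two sequences."""
    if len(sequences) < 2:
        return None
    return 100.0 * max(k2p_distance(a, b) for a, b in combinations(sequences, 2))


def is_single_genetic_group(sequences: List[str], threshold: float = 3.0) -> bool:
    """True if the cluster's maximum pairwise distance is below the threshold (%)."""
    distance = max_pairwise_percent(sequences)
    return distance is not None and distance < threshold


# Toy aligned fragments (invented): one transition among ten sites.
cluster = ["AAAAAAAAAA", "GAAAAAAAAA"]
print(max_pairwise_percent(cluster), is_single_genetic_group(cluster))
```

Clusters represented by a single sequence return `None` for the maximum distance, mirroring the fact that intragroup divergence cannot be calculated from one specimen.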

Results

Ethnospecies

A total of 176 grasshopper specimens were classified into 35 ethnospecies in Sepedi, Tsonga, Lobedu and Venda (Fig. 2). The majority of ethnospecies (71%; 25/35) were identified by Sepedi speakers, followed by Tsonga (20%; 7/35), while Venda (6%; 2/35) and Lobedu (3%; 1/35) speakers contributed minor proportions of the total number of ethnospecies. Ethnospecies names could not be provided for 10 specimens (5.7%; 10/176) identified by the collectors as edible.

Fig. 2

Number of ethnospecies and specimens of edible grasshoppers collected by community members in Limpopo province, South Africa, as recorded in each local language

Species identification by alpha taxonomy

The majority of specimens were in the family Acrididae (95%; 167/176), represented by 10 subfamilies of which the most frequent was Oedipodinae (n = 56; Fig. 3). Most specimens were identified to subfamily level (97%), except for six specimens in the ethnospecies Mbothoko, Masongwana, and Makihla. A total of 28 genera in Acrididae were identified (or 29 genera when one specimen identified as Catantops or Vitticatantops is included). Only a small proportion of specimens were in Pyrgomorphidae (5%; 8/176), all of which were in the genus Zonocerus (Pyrgomorphinae). Images of representative specimens are shown in Figure S1.

Fig. 3

Results of taxonomic identification of edible grasshoppers collected by community members in Limpopo, South Africa. Number of individuals identified in a families, b subfamilies, and c genera

Although a large proportion of specimens was identified to genus level (94%; 159/170), only a small proportion was identified to species (31%, 53/170; Table 1; Table S3): 12 species in Acrididae (or 13 species when Heteracris sp. group herbacea, likely H. drakensbergensis, is included), and one species in Pyrgomorphidae.

Table 1 Number of taxonomic groups and number of specimens of edible grasshoppers identified to subfamily, genus and species level based on alpha taxonomy

DNA-based species identification

Initial PCR amplifications of the COI barcoding region using the universal primers HCO2198/LCO1490 (Folmer et al. 1994) were largely unsuccessful as no amplification was obtained or non-specific bands affected the majority of samples. Thus, several new primers were designed, tested and optimized in a range of available specimens. After several rounds of trial-and-error, only 66% of the specimens (116/176) yielded single-band amplicons and high-quality sequences. Some sequences had instances of ambiguous base calls; however, no indels or premature stop codons indicative of non-coding regions, such as nuclear mitochondrial sequences (NUMTs), were detected upon translation. Overall, DNA barcodes were generated for 10 subfamilies and 25 genera within two families identified based on alpha taxonomy, leaving Tropidopolinae as the only subfamily, and Cardeniopsis, Ornithacris, Paracinema and Tristia as the only morphologically identified genera, for which no DNA barcodes were obtained (Table S1). DNA barcodes were generated for at least one specimen in 26 of the 34 ethnospecies recorded but none were obtained for eight ethnospecies.

BOLD and NCBI sequence queries

The new DNA barcode sequences (n = 116) were queried by BLASTn on the NCBI database and on the BOLD Identification Engine (Table S4; Table S5). BOLD queries returned matches with high sequence similarity (> 95%) for 59% (68/116) of the sequences (Table S4). Of these, 83.8% (57/68) of the sequences were identified to genus level and resulted in 15 genera. Additionally, 51.5% (35/68) were identified to species level and resulted in 11 species. NCBI queries returned matches with high sequence identity (> 95%) for 37% (43/116) of the sequences (Table S5). Of these, 100% of the sequences (43/43) were identified to genus level and resulted in 11 genera. Additionally, 58.1% of the sequences (25/43) were identified to species level, resulting in seven species.

Overall, BOLD yielded more matches with high sequence similarity (> 95%) than NCBI. Although the NCBI queries yielded a greater proportion of high sequence similarity matches to genus and species level, BOLD queries resulted in an overall higher number of matches with high similarity to genus and species level, as well as a higher number of genera and species. Therefore, BOLD seems to be a more comprehensive database in the case of these particular Orthoptera. Furthermore, we detected some taxonomic inconsistencies between BOLD and NCBI: for example, Parapleurus alliaceus was classified in Oedipodinae on BOLD but in Acridinae on NCBI, and Coryphistes ruricola was classified in Acridinae on BOLD but in Catantopinae on NCBI. Additionally, there were no records on the NCBI Taxonomy Database for three genera (Plegmapterus, Pycnodictya, and Sudanacris) and six species (Acrotylus trifasciatus, Catantops momboensis, Cryptocatantops haemorrhoidalis, Locusta pardalina, Rhachitopis nigripes and Sphodromerus undulatus) present in our dataset.
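The distinction drawn above, between proportions among high-similarity matches and absolute numbers across all barcodes, can be checked with a short Python sketch using only the counts reported in this section. The dictionary layout and function names are illustrative choices, not part of the original analysis.

```python
TOTAL_BARCODES = 116

# Counts reported in the text for matches above the 95% threshold.
HITS = {
    "BOLD": {"high_similarity": 68, "to_genus": 57, "to_species": 35},
    "NCBI": {"high_similarity": 43, "to_genus": 43, "to_species": 25},
}


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def summarise(counts: dict, total: int = TOTAL_BARCODES) -> dict:
    """Express genus/species hits relative to matches and to all barcodes."""
    matched = counts["high_similarity"]
    return {
        "matched_pct": percent(matched, total),
        "genus_pct_of_matches": percent(counts["to_genus"], matched),
        "species_pct_of_matches": percent(counts["to_species"], matched),
        "genus_pct_of_all": percent(counts["to_genus"], total),
        "species_pct_of_all": percent(counts["to_species"], total),
    }


for database, counts in HITS.items():
    print(database, summarise(counts))
```

Relative to its own matches, NCBI resolves a larger share to genus and species level, whereas relative to all 116 barcodes BOLD resolves more sequences at both levels.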

Phylogenetics and genetic groups

A total of 36 clusters with intra-cluster max p-distance < 3% were considered genetic groups potentially representing species (G1 to G36; Fig. 4). As expected from a phylogeny based on short sequences from a single mitochondrial gene, the tree had low nodal support for the deeper nodes but the correct split between Acrididae and Pyrgomorphidae was recovered. The most frequently represented subfamily was Oedipodinae, with 46/116 sequences (39.7%) divided into two main clusters – one cluster formed by 11 genetic groups (G1 to G11), and another comprised of G33 and G34. Eyprepocnemidinae (G21, and G22 to G23) and Acridinae (G31 and G32) appeared polyphyletic. Catantopinae (G12 to G17), Calliptaminae (G24), Cyrtacanthacridinae (G25 and G26) and Spathosterminae (G30) were monophyletic. Hemiacridinae (G18) and Euryphyminae (G29) were represented by single specimens.

Fig. 4

Maximum-likelihood tree of edible grasshoppers collected in South Africa based on COI barcoding sequences (n = 116). G1 to G36 represent genetic groups determined for groups of sequences with max p-distance < 3%

Overlapping of specimen identification methods

This study aimed to document the diversity of edible grasshoppers in South Africa using four sources of information: (1) identification of ethnospecies by members of local communities, (2) morphological identification based on alpha taxonomy, (3) genetic identification by querying the new DNA barcodes against public sequence databases, and (4) genetic identification by phylogenetic clustering and estimates of genetic divergence. A summary of the results obtained from the different sources of information is displayed in Fig. 5.

Fig. 5

Phylogenetic tree of DNA barcodes of 25 ethnospecies of edible grasshoppers collected by community members in Limpopo, South Africa (n = 116). Branches were collapsed according to 36 phylogenetic groups likely to represent genetic species. 1 Identification by alpha taxonomy; 2 Identification by BOLD/BLASTn. Ethnospecies that correspond to single genetic groups are highlighted in bold. Underlined taxa - consensual identification between alpha taxonomy and BOLD/BLASTn

Of the 116 specimens for which DNA barcodes were generated, 100 specimens were identified to subfamily level based on alpha taxonomy and sequence queries on BOLD and GenBank (Table S1). Of these, 14% (14/100) were inconsistently identified to subfamily. For example, MOK04, MOK05 and OR107 (G27) were morphologically placed in two different subfamilies within Acrididae, but BLASTn and BOLD yielded inconclusive results. Moreover, G27 did not fall within the phylogenetic clusters of Catantopinae or Acridinae, the subfamilies determined by alpha taxonomy, rendering the specimens therein undetermined. At genus level, only 37.1% (43/116) of the barcoded specimens were consistently identified by alpha taxonomy, BLASTn and BOLD: Acanthacris, Acrotylus, Aiolopus, Catantops, Coryphosima, Cyrtacanthacris, Gastrimargus, Heteracris, Morphacris, Spathosternum, and Metaxymecus/Tylotropidius (synonymous). At species level, only 8.6% of the barcoded specimens (10/116) were consistently identified by alpha taxonomy, BLASTn and BOLD: Acanthacris ruficornis (n = 2) and M. fasciata (n = 8). Acanthacris ruficornis specimens were genetically very similar (max p-distance = 0.5%), while M. fasciata (Mamaroping) suggested cryptic diversity (max p-distance = 3.8%).

Of the 25 ethnospecies for which DNA barcodes were generated, six were represented by a single specimen, which prevented calculation of intragroup max p-distance. Of the remaining 19, 10 ethnospecies comprised genetically similar specimens with intragroup max p-distance < 3% (Table S6). The other nine ethnospecies had intragroup max p-distances ranging from 3.78% (Mamaroping) to 19.91% (Makihla), indicating non-conspecificity.

Genetic groups and ethnospecies overlapped exclusively in 11 cases: Makwitla (G4; Gastrimargus sp.), Nwa-Rhuda Gerere (G6; Pycnodictya sp.), Mmamogkadi (G13; Catantops sp.), Masongwana (G15; Acrididae), Mamafikeng (G16; Oxycatantops spissus uranius), Mmametome (G17; Phaeocatantops sulphureus), Mmatsipi (G18; Leptacris monteroi monteroi), Lekgowa (G20; Pnorisa sp.), Makahlodi (G21; Heteracris sp.), Mamtotobodi (G32; Acrida sp.), and Tlatlawele (G35; Zonocerus elegans elegans) (Fig. 5). Three ethnospecies did not correspond exclusively to a single genetic group but fell in closely related phylogenetic groups: Mamaroping (G7-9; M. fasciata), Tanswelele (G22-23; Metaxymecus sp.), and Malefiswane (G12 and G14; likely Catantops sp.). Makihla included G1, G3 and G5, which fall in one of the clusters of the polyphyletic Oedipodinae, and G34, which falls in the other cluster of Oedipodinae (Fig. 4). Some genetic groups included several ethnospecies: G1 included Kendobola and Makihla (Sepedi) identified in different villages, and Nwa-Mroveni, Nwa-Nchichi and Nwa-Rhuda (Tsonga) identified in the same village; G25 included Tatakgope (Sepedi), and Nwa-Nchocho and Nwa-Nthaga (Tsonga) identified in the same village; G27 included Mokhure (Sepedi) and Mamotswaitswai (Lobedu).
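The notion of exclusive overlap used above (an ethnospecies whose specimens fall in a single genetic group that contains no other ethnospecies) can be expressed as a small Python sketch. The mapping encodes only a simplified subset of the assignments named in the text and is meant as an illustration, not as the full dataset of Table S1.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Simplified subset of ethnospecies-to-genetic-group assignments from the text.
ETHNOSPECIES_GROUPS: Dict[str, Set[str]] = {
    "Makwitla": {"G4"},
    "Mmatsipi": {"G18"},
    "Mamaroping": {"G7", "G8", "G9"},
    "Tatakgope": {"G25"},
    "Nwa-Nchocho": {"G25"},
    "Mokhure": {"G27"},
    "Mamotswaitswai": {"G27"},
}


def exclusive_overlaps(mapping: Dict[str, Set[str]]) -> List[str]:
    """Ethnospecies in exactly one group that holds no other ethnospecies."""
    members = defaultdict(set)  # genetic group -> ethnospecies found in it
    for name, groups in mapping.items():
        for group in groups:
            members[group].add(name)
    return sorted(
        name
        for name, groups in mapping.items()
        if len(groups) == 1 and members[next(iter(groups))] == {name}
    )


print(exclusive_overlaps(ETHNOSPECIES_GROUPS))
```

In this subset only Makwitla and Mmatsipi qualify: Mamaroping spans three groups, while G25 and G27 are each shared by more than one ethnospecies.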

Concordance between genetic groups and taxonomic species was difficult to assess as few specimens were identified to species level. However, in some cases the same species was placed in different but closely related genetic groups: for example, M. fasciata appeared in G7, G8 and G9 (max p-distance G7 + G8 + G9 = 3.8%), and Z. elegans in G35 and G36 (max p-distance G35 + G36 = 3.3%). At the genus level, only seven genetic groups corresponded to a single genus (15/116 specimens; 12.9%). The concordance between ethnospecies and taxonomic species showed total overlap in nine cases (representing 48/175; 27% of the total number of morphologically identified specimens): Letungwana (Ornithacris pictula), Makwitla (Gastrimargus sp.), Mamafikeng (Oxycatantops spissus uranius), Mamaroping (Morphacris fasciata), Mmametome (Phaeocatantops sulphureus), Mmatsipi (Leptacris monteroi monteroi), Mokhure (Abisares sp.), Tanswelele (Metaxymecus sp.), and Tlatlawelele (Zonocerus elegans).

Discussion

The worldwide list of edible insects compiled by Jongema (2017) has been used as a comprehensive source of information in recent years (Mariod 2020; Baiano 2020; Lautenschläger et al. 2017; Manditsera et al. 2018) but our study contributes records of taxa that are not listed therein: Abisares sp., Amblyphymus sp., Catantops momboensis, Duronia chloronota, Heteropternis sp., Leptacris monteroi monteroi, Ornithacris pictula, Phaeocatantops sulphureus, Pnorisa sp., Pycnodictya sp., Rhaphotittha sp., Spathosternum sp. and Spathosternum nigrotaeniatum. More recently, a list of edible Orthoptera consumed in sub-Saharan Africa was compiled by van Huis (2022). This list reported some edible taxa (Amblyphymus sp., Duronia chloronota, Ornithacris pictula, Pycnodictya sp. and Spathosternum sp.) that do not feature in Jongema (2017) but the sub-Saharan African section is also not comprehensive as Acrotylus sp. and Cardeniopsis sp. were omitted, despite previously being reported in sub-Saharan countries (Jongema 2017). An older study on edible grasshoppers performed in Limpopo (referred to as Northern Province, its former name) reported the utilization of Pnorisa by local communities (van der Waal 1999), but this genus does not feature in either Jongema (2017) or van Huis (2022). Overall, the utilization of Abisares sp., Catantops momboensis, Heteropternis sp., Leptacris monteroi monteroi, Phaeocatantops sulphureus, Rhaphotittha sp., and Spathosternum nigrotaeniatum as edible Orthoptera is, to the best of our knowledge, reported here for the first time.

Orthoptera in southern Africa include 986 species in 366 genera (Cigliano et al. 2023), and Acrididae, the largest and most consumed family (Jongema 2017), comprises 300 species in 114 genera (Cigliano et al. 2023). These numbers are likely underestimated due to cryptic diversity, incomplete taxonomic coverage and outdated reports: for example, a recent study in the family Lentulidae discovered 32 new taxonomic species in 11 genera in South Africa (Otte 2020). The potentially high number of total Acrididae species in our study region, which is currently not clear, along with the low number of ethnospecies compared to that reported by van der Waal (1999), suggests that our coverage of edible grasshopper species in Limpopo is not complete. Therefore, there is ample room for future work based on a much larger number of specimens, both to capture representatives of less abundant species and to obtain male specimens for taxonomic examination. The high frequency of females and juveniles in our sample compared to males was the main cause of the relatively low proportion of specimens identified to the species level. This aspect, however, should not influence the members of the community involved in the collections, so as not to bias the sampling towards capturing specimens for their perceived value for the study instead of their traditional utilization as food.

Despite the limitations and challenges of our study (detailed in “Challenges in overlapping the different sources of identification” to “Challenges in NCBI and BOLD queries” sections), the DNA-based analyses evidenced the utilization of 36 genetic species; not all of these could be taxonomically identified to species level, but the vast majority were identified to genus level. Most reports of edible insect species based on alpha taxonomy also include Orthoptera identified to genus level (e.g. Hlongwane et al. 2020; Raheem et al. 2019; Riggi et al. 2015), but few explicitly used our sampling strategy, i.e. the traditional method employed by the communities when collecting grasshoppers for their own consumption. A study performed on edible insects of Northern Angola recorded vernacular names with the assistance of the communities and also reported difficulty in achieving species level identification for all specimens; as in our study, the authors found that relevant taxonomic keys and/or reference DNA sequences were not available (Lautenschläger et al. 2017). Neither alpha taxonomy, ethnospecies nor DNA analyses are without drawbacks per se, but their use as complementary tools and sources of information can contribute to advancing and expediting documentation. To the best of our knowledge, our work is the first study on edible insects that associates DNA sequence data with a specific set of specimens available for future reference, along with their vernacular names. Given that morphologically cryptic diversity is a major contributor to insect biodiversity (Li and Wiens 2022) and that indigenous knowledge systems have been eroding due to urbanized and westernized lifestyles (Aswani et al. 2018), our DNA-based data associated with a collection of specimens available for future reference represents important groundwork for accurate documentation of edible grasshoppers in South Africa.

Finally, Orthoptera are mostly collected in the rainy season (October to January), when our sampling was performed, in line with the traditional uses in Limpopo. Acrididae and Zonocerus (Pyrgomorphidae) are consumed by large proportions of community members (Hlongwane et al. 2021). The palatability aspect should also be investigated, as it may help to clarify whether some species are preferred and selectively harvested. Our personal experience in consuming grasshoppers with community members indicates that Acrididae are similarly appreciated and consumed. Zonocerus elegans seems to be the only Pyrgomorphidae consumed in Limpopo, as reported in a previous study (Hlongwane et al. 2021), although it secretes a noxious substance as a defence mechanism against predators. The consumption patterns of the close relative Zonocerus variegatus in eastern and equatorial Africa are better understood, and depend largely on availability, local cultures and restrictions which influence the subjective aspect of palatability (Kekeunou et al. 2020; Kekeunou and Tamesse 2016).

Challenges in overlapping the different sources of identification

While we are confident that the classification of the specimens according to genetic groups is accurate due to the high quality of the sequences and the robustness of the phylogenetic reconstruction and low intragroup divergence, the level of confidence of results obtained by more subjective morphology-based methods is difficult to assess. Indeed, morphology-based experts often come to different conclusions when studying the same material due to personal idiosyncrasies, with the existence of immature life-stages and sexual dimorphism adding further complexity to the task of distinguishing scientific species (Packer et al. 2009). Identification of ethnospecies is also a morphology-based method likely to be impacted by the subjectivity inherent to human observation of complex characters.

Overall, the overlap between genetic groups, ethnospecies and taxonomic species was imperfect, as several genetic groups included specimens identified in different ethnospecies and scientific taxa (e.g. G1, G10 and G25). In some cases, taxonomic identification was not concordant within a given genetic group (e.g. G27) or with sequence matches on BOLD and/or GenBank (e.g. G1), and the two databases also disagreed in some cases (e.g. G14). However, it is remarkable that almost half of the ethnospecies for which DNA barcodes were generated (11/24) overlapped with an exclusive genetic group, and others, such as Nwa-Rhuda, fell in closely related genetic groups or belonged in the same subfamily (e.g. Tatakgope in Cyrtacanthacridinae). Some cases of inconsistency among ethnospecies could be explained by the fact that ethnospecies names differ based on the language of the community and geographic area, and even households within communities that share the same language may use different names for the same grasshoppers (Z. Nethavhani, pers. comm.). This is the case of Nwa-Nchocho and Nwa-Nthaga (G25), and Nwa-Mroveni and Nwa-Nchichi (G1), evidencing that it is challenging to obtain ethnospecies names consistently, especially when available specimens are at different stages of development and/or exhibit sexual dimorphism. Moreover, the spread of Westernized lifestyles often causes loss of traditional knowledge on edible insects across generations (Egan 2013), and it is possible that in some cases community members who assisted with specimen collection were not fully proficient in traditional nomenclature. However, indigenous knowledge on edible insects in South Africa seems to be persisting to the present day. Between 1994 and 1996, van der Waal (1999) recorded 155 names of edible grasshoppers given by local community members in the Venda-speaking region of Limpopo, of which a large proportion seemed to be very localized or rarely collected.
Van der Waal also found that the same local name was used for identifying different taxa, and gracefully noted that “the many names that were recorded only once or twice may represent a playful approach by the catchers, keeping in mind that most are children”. Eleven of the names recorded by van der Waal are in common with those in our study – Mutotombodzi, Nyamurovheni, Gerere, Nzie-Iuvhele and Sianama (Venda) and Letungwa, Malefiriswane/Mmalifiswane, Malekgwarana, Maletswaitswai, Tatagope (Sepedi/Northern Sotho). Some names are identical (e.g., Letungwana) but others have spelling variations (e.g., Tatakgope/Tatagope, and Mamotswaitswai/Maletswaitswai). Van der Waal also noted that rapid urbanization, degradation of natural habitat and the increasing perception that grasshoppers were “children’s food” could lead to loss of traditional knowledge. However, the fact that 10/35 (29%) of the ethnospecies found in our study were already recorded more than 20 years ago reiterates the importance of present-day cultural resources for documenting biodiversity.

Challenges in DNA barcoding

Compared to our previous experience in other insect taxonomic groups (e.g. Powell et al. 2019; Langley et al. 2020; Ajene et al. 2020; Egan et al. 2021; Hlaka et al. 2021), DNA barcoding of Orthoptera was particularly challenging. PCR amplifications and high-quality sequences were only consistently produced when newly designed primers with higher specificity relative to universal primers were used. Even so, several of the genetic groups still showed some ambiguous positions in the electropherograms due to overlap of two different nucleotides. For example, the sequences obtained using the new PCR primers for Z. elegans, which were designed based on the complete COI sequence from the mitogenome of a representative of the species, still showed a few ambiguous positions; however, no premature stop codons were detected in any sequence in the whole dataset. High incidence of heteroplasmy and/or NUMTs (non-functional regions of mtDNA that have integrated into the nuclear genome) seems to be characteristic of Orthoptera (Song et al. 2008), presumably due to their remarkably large genome (Song 2010). The new primers improved PCR amplification: the success rate increased, spurious amplifications decreased, and sequence quality was higher. However, our alignments of complete COI sequences of Orthoptera for designing these new primers evidenced a high level of polymorphism in the primer annealing regions, even though separate alignments were performed for different subfamilies. This variability likely explains the high PCR failure rate in our diverse sample set, which will probably only be solved by employing a wider range of additional primers with higher specificity. Increasing availability of complete mitogenomes of Orthoptera will likely offer insights into sequence variation in the primer annealing regions of COI that may help to improve DNA barcoding in this taxonomic group.
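The two sequence checks mentioned above (ambiguous positions and premature stop codons) can be illustrated with a short Python sketch. It assumes that the sequence is already trimmed to a known reading frame and that the invertebrate mitochondrial genetic code (NCBI translation table 5) applies, in which TAA and TAG are the only stop codons. The function names and the toy sequence are hypothetical, and this is not the pipeline used in the study:

```python
# Stop codons under NCBI translation table 5 (invertebrate mitochondrial);
# TGA encodes Trp and AGA/AGG encode Ser in this code.
STOP_CODONS = {"TAA", "TAG"}

# IUPAC codes for positions where more than one nucleotide was called
AMBIGUITY_CODES = set("RYSWKMBDHVN")


def ambiguous_positions(seq):
    """Return 1-based positions of IUPAC ambiguity codes in seq."""
    return [i + 1 for i, base in enumerate(seq.upper())
            if base in AMBIGUITY_CODES]


def premature_stops(seq, frame=0):
    """Return 1-based start positions of in-frame stop codons.

    Any stop inside an internal COI barcode fragment is premature,
    because the fragment does not include the end of the gene.
    """
    seq = seq.upper()
    stops = []
    for start in range(frame, len(seq) - 2, 3):
        if seq[start:start + 3] in STOP_CODONS:
            stops.append(start + 1)
    return stops


# Hypothetical toy sequence: ATG GCA YGA TGA TAA GGA
toy = "ATGGCAYGATGATAAGGA"
print(ambiguous_positions(toy))
print(premature_stops(toy))
```

An in-frame stop codon in a barcode fragment is one common indication of a possible NUMT, although its absence, as in the dataset described here, does not by itself rule one out.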

Challenges in NCBI and BOLD queries

Although NCBI yielded more matches at genus level than BOLD, sequence identity was generally lower, suggesting that these additional matches did not necessarily allow for correct taxonomic identification. In a study where the two databases were compared, NCBI was shown to be more accurate for insects; however, this result was not found to be statistically significant, and the study was not conducted specifically in Orthoptera (Meiklejohn et al. 2019). Despite these limitations, both the previous study and ours evidenced a low percentage of genus and species level identifications (< 70% in Meiklejohn et al. 2019) when performing sequence queries on BOLD and NCBI. Moreover, our low rate of identification success by sequence queries was likely impacted by the poor representation of African Orthoptera in both databases. Misidentification of specimens, taxonomic confusion and clerical mistakes negatively affect DNA barcoding as an endeavour to record and document biodiversity, as the scientific name appended to the deposited sequences forms the foundation of databases. Thus, to minimize potential problems, public databases should require metadata such as photographs and descriptions of morphological characteristics of the uploaded specimen, along with the location where it was found (Friedheim 2016). While BOLD allows for recording metadata, GenBank is purely a sequence repository (Meiklejohn et al. 2019). Additionally, Friedheim (2016) proposed that detailed information about ambiguous base calls that were edited based on interpretation of electropherograms should be disclosed.

Reflexion on research that incorporates traditional knowledge

Collecting data from a community with no cognisance of the spirituality embedded in their collective wisdom or of the value of knowledge generated through ancestral experience is to work in a vacuum with only half the story (Naamwintome and Millar 2015). This compromises the veracity of the data, as well as the effectiveness with which it can be used towards alleviating both local and global challenges, because the context in which the knowledge is generated is often not understood (Knopf 2015). A conundrum exists in working honestly with communities outside the researcher’s own society and/or income bracket: rules should be put in place to protect indigenous knowledge-holders from exploitation, but the same rules can also imply and prompt separation of the researcher from the community, making it difficult to engage authentically with knowledge-holders and leading to the danger of a top-down approach (Knopf 2015). Indeed, even when researchers are intimate members of a community, if they operate from a scientific perspective there is the risk of a pedestal being erected upon which scientific methodologies are placed at the expense of knowledge generated experientially in a local context (Knopf 2015). Honest, unplanned human interactions do not operate from the viewpoint of rules, but arise spontaneously from the platform of being human together (Keane et al. 2016). Nonetheless, principles and methodologies are helpful in ensuring that data are not gathered at the expense of ethics and that information is shared both ways throughout the life of the scientific project and beyond.

The African philosophy of Ubuntu, which most succinctly translates as “I am because you are”, was the underlying ethos followed when conducting field work for this study. This value system, which aspires to compassion, reciprocity, dignity, humanity and mutuality (Tutu 1999), is the southern African expression of living a moral life (van der Walt and Oosthuizen 2021). Gatherings and interactions with community members were held in the spirit of Ubuntu, which led to the sharing of knowledge around insects and the environment, rather than a one-sided collection of data from the community to the researcher. Instead of interviews, conversations were held in which edible insect names were exchanged for possible scientific names, which were then updated where possible after input from taxonomists. Conversations frequently included meals prepared with edible insects, which prompted more knowledge sharing. Because friendships were formed, the obligation to return information did not rest merely upon formal ethics approval requirements but upon a genuine wish to return and continue the relationship. Such gatherings also minimised the risk of a colonialist outlook from the researcher, and principles for indigenous data governance advocated by CARE (Carroll et al. 2020) were thus implemented where applicable. Shared initiatives included teaching traditional recipes as well as experimenting with creating new flavours and products from edible insects, which are not reported here. In this way, knowledge was fed back to the community in a manner in which it could be integrated into a communal discovery.

Conclusion

We present the first study of edible grasshoppers in South Africa that reports ethno-identification, alpha taxonomy and DNA sequence information. We found 36 genetic groups that likely represent distinct species of edible grasshoppers. These groups overlap to some extent with morphology-based identifications (alpha taxonomy and ethnospecies), although a number of discordances were evident. Nonetheless, this work represents a step forward in documenting the biodiversity of edible grasshoppers in a fast-changing world. Despite the difficulties and limitations inherent to DNA barcoding and morphology-based methods (classic taxonomy and ethnospecies) for the identification of Orthoptera, the use of different sources of information helps to unravel the fantastic abundance of edible insect cultural and biological diversity in Africa and worldwide.