Introduction

Ethnobiological investigations around the world have focused on identifying the criteria by which different populations select plants, especially those used for medicinal purposes. Among the factors that can influence plant selection, taxonomic and phylogenetic aspects are addressed in a large number of studies. These studies are based on the theory of non-random selection, which states that plants can be overused or underused depending on factors that determine whether or not they are selected. One of the pioneering studies in this regard [1] investigated whether the use of medicinal plants by Native Americans was effective or only placebo medicine. Using a regression analysis, the author concluded that some taxonomic groups were used more than would be expected if plants were selected at random. Years later, seeking to understand the motivations for selectivity, Moerman [2] reported that the presence of biologically active properties, as well as factors related to the knowledge about plants acquired over the years and passed from generation to generation, contributed to the selection of some plants.

More recently, ethnobiological studies using different approaches and statistical tools have confirmed the theory that plants are not selected at random and that taxonomic biases determine why some species are preferred over others [3,4,5,6,7,8]. Other approaches, based on phylogenetic tools, also confirm this theory. These studies consider that closely related species share characteristics that justify their use. For this reason, some groups stand out, indicating that their species are selected precisely because they have favorable characteristics, which leads to the rejection of the possibility of randomness [9,10,11,12,13].

Some of the plant selection criteria that can culminate in taxonomic biases have been identified. In the case of medicinal plants, they are associated with availability, historical and cultural preferences, and the presence of alkaloids, terpenoids, and biologically active volatile compounds [6]. However, regarding wild food plants, little effort has been made to test the theory of non-random selection, especially in Brazil, where there are no reports of investigations with this scope.

Among the multiple tools used to test the theory of non-random plant selection, two approaches are frequently applied in different socio-ecological contexts to show which taxonomic groups are the most used. The first is the Bayesian model, which assumes uncertainty only in the number of species of the investigated flora, that is, the number of useful plants [14]. The other is the Imprecise Dirichlet Model (IDM), which assumes that both the number of species of the investigated flora and the number of species of the overall flora of the investigated environment are uncertain [15]. The earliest studies on this topic used residual analysis of simple linear regressions to identify overused and underused families [1], but this proposal was questioned because of the statistical inconsistency of the method [16]. A binomial analysis was then proposed [16], but it was also contested [14].

In this review, we aim to contribute to the theoretical basis of non-random plant selection by local populations, specifically in the context of food plants of the Brazilian flora. We use the IDM and the Bayesian model to identify over- and underused families and, from them, patterns in the knowledge and use of wild food plants in Brazil. The starting question was: Are there botanical families over- or underused for food purposes by local populations in Brazil? Our hypothesis is that some botanical families are overused and others underused.

Methodology

Bibliographic search

We searched for scientific documents with an ethnobotanical approach that presented a list of food plants occurring in Brazil with at least one species. To this end, four databases were consulted: Web of Science, SciELO, Scopus and PubMed. Search queries were run using pre-established keywords, namely: (1) "Unconventional Food Plants" AND Brazil; (2) "Wild Food Plants" AND Brazil; (3) "Wild Edible Plants" AND Brazil; (4) "Useful Plants" AND Ethnobotany AND Brazil; (5) "Plantas Comestíveis" AND Brasil; (6) "Plantas Alimentícias Não Convencionais" AND Brasil; (7) "Plantas Alimentícias Silvestres" AND Brasil; (8) "Plantas Úteis" AND Etnobotânica AND Brasil. Search results refer to the knowledge and/or use of food plants. Searches were performed on the title, abstract and keywords of the articles.

Inclusion/exclusion criteria

Only studies published in Portuguese or English were included in the review. Works with more general approaches (useful plants) were selected for later extraction of data regarding food plants. Review articles were excluded, but their references were used to locate further articles with primary data. When more than one study was conducted in the same community or used the same database, only the one with the most complete and detailed information was included. Studies that used systematic instruments for data collection, such as interviews, were included. We excluded studies that did not provide information about the data collection method and those that did not mention the scientific names of the species.

Screening

Duplicates, that is, articles found more than once in different databases, were excluded, so that each document was entered only once in the database. Subsequently, the abstract of each article was read, and reviews and articles without an ethnobotanical approach were removed (reviews were used for another purpose, as mentioned in the inclusion/exclusion criteria section). Then, a second screening was performed, in which the articles selected in the first screening were read in full. Those that did not present a list of species and those that did not identify the species were excluded.
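The duplicate-removal step can be sketched as follows. This is only an illustration: the record fields ("title", "doi", "source") and the matching rule (DOI when available, otherwise a normalized title) are assumptions, not the procedure reported by the authors.

```python
import re

def normalize_title(title):
    """Lowercase a title and collapse punctuation and spaces so that
    trivially different spellings of the same title compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def remove_duplicates(records):
    """Keep the first record seen for each key.
    The key is the DOI when present, otherwise the normalized title
    (hypothetical rule; a record with a DOI will not match a copy without one)."""
    seen = set()
    unique = []
    for record in records:
        doi = (record.get("doi") or "").strip().lower()
        key = doi if doi else normalize_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# Hypothetical search results exported from two databases.
records = [
    {"title": "Wild food plants of community X", "doi": "10.0000/example.1", "source": "Scopus"},
    {"title": "Wild Food Plants of Community X.", "doi": "10.0000/EXAMPLE.1", "source": "PubMed"},
    {"title": "Useful plants of community Y", "doi": "", "source": "SciELO"},
]
unique_records = remove_duplicates(records)
print(len(unique_records))
```

In practice, a manual check of the remaining records would still be needed, since automatic matching can miss duplicates whose metadata differ between databases.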

Study selection method based on risk of bias

After application of inclusion/exclusion criteria and screening steps, the articles were classified as presenting low, moderate, and high risk of bias according to criteria for ethnobotanical studies of medicinal plants based on sample quality [17].

Articles presenting moderate and low risk underwent another classification that informed a possible increase in the level of risk based on the following information: complete or incomplete identification of plant material; presentation of a complete or partial list of species; presence of restrictions in the studied habit or taxonomic groups, for example, studies conducted only with herbs or forest species or studies with only one family [18].

Finally, articles classified as presenting moderate and low risk were included in the analysis and the others were removed.

Treatment of data

Data on food species and place where the study was carried out were extracted from each article according to the following information: bibliographic reference, biome, region, state, scientific name, family, popular name, part used, and form of use.

Information on all species occurring in Brazil was further extracted using the flora package in R [19]. The information included: scientific name, family, life form, habitat, type of vegetation, and establishment (origin) according to the listing of Flora do Brasil [20]. The correct spelling and accepted names of the species were also checked using this database. When a species was not mentioned in the listing of Flora do Brasil, the World Flora Online database was consulted [21].

Only the list of accepted native Angiosperm species was extracted from the listings of Flora do Brasil [20] and World Flora Online [21]. Naturalized, exotic, cultivated species, and those without the source information were excluded.
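The filtering rule described above can be illustrated with a short sketch. The field names ("taxon_status", "group", "establishment") and the example records are hypothetical and do not reproduce the structure of the Flora do Brasil or World Flora Online listings.

```python
def is_accepted_native_angiosperm(record):
    """Return True only for accepted, native Angiosperm species.
    Records with a missing establishment (origin) are rejected,
    as are naturalized, exotic, and cultivated species."""
    return (
        record.get("taxon_status") == "accepted"
        and record.get("group") == "Angiosperms"
        and record.get("establishment") == "native"
    )

# Hypothetical records; names and values are placeholders.
flora = [
    {"name": "Species A", "taxon_status": "accepted", "group": "Angiosperms", "establishment": "native"},
    {"name": "Species B", "taxon_status": "accepted", "group": "Angiosperms", "establishment": "cultivated"},
    {"name": "Species C", "taxon_status": "synonym", "group": "Angiosperms", "establishment": "native"},
    {"name": "Species D", "taxon_status": "accepted", "group": "Angiosperms"},  # no origin information
]
native_flora = [r for r in flora if is_accepted_native_angiosperm(r)]
print([r["name"] for r in native_flora])
```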

Data analysis

Two distinct approaches were used to identify overused and underused families: the Bayesian model based on Weckerle et al. [14] and the IDM based on Weckerle et al. [15]. While the Bayesian model assumes uncertainty only in the number of native food species, the IDM assumes that both the number of native food species and the number of overall native species are uncertain. The Excel Inv.BETA function was used to calculate the range of the most probable values of θ (proportion of native food species for the overall flora) and θj (proportion of native food species for family j).
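As an illustration of how such intervals can be obtained, the sketch below computes inverse beta quantiles in pure Python instead of Excel. Several points are assumptions, since the text does not give the parameterization: a uniform prior for the Bayesian model, so that the posterior is Beta(x + 1, n − x + 1); IDM bounds taken from Beta(x, n − x + s) and Beta(x + s, n − x); a 95% level; and the value of the hyperparameter s. Only the Poaceae counts (2 food species out of 1297 native species) come from this review.

```python
import math

def _beta_cf(a, b, x, max_iter=2000, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for aa in (even, odd):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def beta_cdf(x, a, b):
    """Regularized incomplete beta function I_x(a, b), for a > 0 and b > 0."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def beta_inv(p, a, b):
    """Quantile of Beta(a, b) found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if beta_cdf(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def bayesian_interval(x, n, level=0.95):
    """Credible interval under an assumed uniform prior: Beta(x + 1, n - x + 1)."""
    tail = (1.0 - level) / 2.0
    return beta_inv(tail, x + 1, n - x + 1), beta_inv(1.0 - tail, x + 1, n - x + 1)

def idm_interval(x, n, s, level=0.95):
    """Assumed IDM bounds: lower from Beta(x, n - x + s), upper from Beta(x + s, n - x)."""
    tail = (1.0 - level) / 2.0
    lower = 0.0 if x == 0 else beta_inv(tail, x, n - x + s)
    upper = 1.0 if x == n else beta_inv(1.0 - tail, x + s, n - x)
    return lower, upper

# Poaceae: 2 wild food species among 1297 native species (counts from this review).
# s = 4.0 is an illustrative value, not taken from the source.
print(bayesian_interval(2, 1297))
print(idm_interval(2, 1297, s=4.0))
```

Under these assumptions, the IDM interval always contains the Bayesian one when s is at least 1, which is one way to see why the IDM flags fewer families.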

Families for which the lower limit of θj was greater than the upper limit of θ were considered overused. Families for which the upper limit of θj was lower than the lower limit of θ were considered underused. When the limits of θj and θ overlapped, the family was considered neither over- nor underused.
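The decision rule above can be written as a small function. The interval values in the example are hypothetical and serve only to exercise the three possible outcomes.

```python
def classify_family(family_interval, overall_interval):
    """Compare the interval of θj (family) with the interval of θ (overall flora).
    Each interval is a (lower, upper) pair."""
    family_lower, family_upper = family_interval
    overall_lower, overall_upper = overall_interval
    if family_lower > overall_upper:
        return "overused"
    if family_upper < overall_lower:
        return "underused"
    return "neither"

# Hypothetical limits, not results of this review.
overall = (0.010, 0.012)
print(classify_family((0.050, 0.120), overall))
print(classify_family((0.000, 0.005), overall))
print(classify_family((0.008, 0.030), overall))
```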

Results

Eighty articles met the inclusion criteria. However, 45 of them were considered to present a high risk of bias, 17 a moderate risk, and 18 a low risk, according to the categorization of risks of bias in ethnobotanical studies in Brazil [17, 18]. Table 1 lists the 35 articles that composed this review.

Table 1 Listing and general aspects of studies with an ethnobotanical approach addressing wild food plants carried out in Brazil

The overused and underused families are listed in Table 2. The Bayesian approach indicated 14 overused and 3 underused families. The IDM was more conservative, indicating a total of 13 overused families and only 1 underused family.

Table 2 Overused and underused families of wild food plants from the Brazilian flora

All overused and underused families found with the IDM approach were also found with the Bayesian approach, which was less conservative: Anacardiaceae, Annonaceae, Arecaceae, Cactaceae, Capparaceae, Caryocaraceae, Myrtaceae, Passifloraceae, Rhamnaceae, Rosaceae, Sapotaceae, Talinaceae, and Typhaceae were overused, and Orchidaceae was underused. In addition to these families, the Bayesian model identified one more overused family (Basellaceae) and two more underused families (Eriocaulaceae and Poaceae).
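The relation between the two sets of results can be checked with a few set operations. The family names and counts below are those reported in this section; the variable names are illustrative.

```python
# Families flagged by the IDM, as listed in this section.
idm_overused = {
    "Anacardiaceae", "Annonaceae", "Arecaceae", "Cactaceae", "Capparaceae",
    "Caryocaraceae", "Myrtaceae", "Passifloraceae", "Rhamnaceae", "Rosaceae",
    "Sapotaceae", "Talinaceae", "Typhaceae",
}
idm_underused = {"Orchidaceae"}

# Families flagged by the Bayesian model: the IDM families plus the additional ones.
bayes_overused = idm_overused | {"Basellaceae"}
bayes_underused = idm_underused | {"Eriocaulaceae", "Poaceae"}

# Counts should match those reported for Table 2.
print(len(idm_overused), len(idm_underused))
print(len(bayes_overused), len(bayes_underused))

# Families flagged only by the less conservative Bayesian model.
print(sorted(bayes_overused - idm_overused))
print(sorted(bayes_underused - idm_underused))
```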

Discussion

The results of this review provide further evidence supporting the theory of non-random selection of plants, in this case, of wild food plants in Brazil. Similar findings have been reported for medicinal plants in different socioecological contexts, including Brazil [57], India [4], Papua New Guinea [3], Italy [58], Ecuador [59], Africa [6], Europe [60], Nepal [7], and South Africa [8].

The results of this study were consistent with those observed in the literature for medicinal plants. For example, in a study of medicinal plants conducted in Brazil with a methodology similar to the one employed here (using Bayesian and IDM approaches), the families Anacardiaceae, Capparaceae, Caryocaraceae, Rhamnaceae, and Rosaceae were identified as overused, while Eriocaulaceae, Orchidaceae, and Poaceae were considered underused [57]. In Italy, a study using linear regression, the binomial method, and the Bayesian approach showed that Rosaceae was overused while Poaceae and Orchidaceae were underused [58], similar to the findings of our study. In Papua New Guinea, using the Bayesian approach, Anacardiaceae and Arecaceae were considered overused, and Poaceae and Orchidaceae underused [3]. In India, also with the Bayesian approach, Anacardiaceae and Cactaceae were found to be overused, and with a binomial analysis Poaceae was shown to be underused [4]. Finally, in a review of useful plants from Chile, specifically those of the edible category, the families Myrtaceae, Cactaceae and Anacardiaceae were considered overused by the IDM and Bayesian approaches [5], similar to the results of the present review.

It is worth noting that attractive factors differ between food and medicinal plants, especially from a physicochemical point of view. When similar results are found in the two categories, this does not necessarily mean that the same selection criteria apply for both. The fact that some families are concomitantly overused or underused in both categories may indicate that physicochemical properties are not the only aspect that leads a taxonomic group to be chosen or not. For example, Orchidaceae usually occurs at a low frequency in the environment and most of its plants grow as epiphytes; these characteristics could hinder experimentation in this group of plants and their consequent incorporation into medicinal and food systems.

Since the physicochemical requirements for the selection of medicinal and food plants differ, other shared factors are likely responsible for several families being overused for both purposes. The fact that some families are concomitantly under- or overexplored for food and medicinal purposes, as found in this review, may be related to ease of access, because phytosociological studies carried out in Brazil indicate that many of their species are widely dominant in Brazilian ecosystems. For example, Anacardiaceae was among the richest families in studies of native species carried out in the Atlantic Forest [61] and in an anthropized area of Caatinga [62]. Arecaceae was one of the families with the highest number of species in a study conducted in the Amazon [63]. Myrtaceae and Anacardiaceae were very well represented in terms of number of species in Cerrado [64]. The good representation of species of these families in the environment likely makes them easy for people to find, leading to more contact and greater chances of identifying their uses, and ultimately causing these families to stand out as sources of both medicinal and food plants.

Besides ease of access, it is possible that these plants have other attractive characteristics. For example, various studies carried out in Brazil have identified the fruit as the most used organ of food species [27, 43, 44, 65]. The absence of such attractive characteristics may explain why some underused families, such as Orchidaceae, Eriocaulaceae, and Poaceae in this review, have few or no species mentioned as wild food plants. In the case of Poaceae, although the family has representatives of great economic importance worldwide, which could theoretically encourage the use of other species of the family, this was not observed in the present review: only two of the 1297 Poaceae species of the native flora of Brazil were mentioned as wild food plants.

The results found in the literature indicate that families that have fleshy fruits, such as Arecaceae, Myrtaceae, and Passifloraceae, tend to be better known and used. Fruits of Myrtaceae are known to have a large number and concentration of phenolic compounds with important antioxidant properties, which are beneficial to human health [66]. Some fruits of the family Arecaceae have high nutritional value and are rich in bioactive compounds [67]. Passifloraceae fruits are rich in magnesium and zinc, in addition to containing phenolic compounds, triterpenes, steroids, and flavonoids [68]. These characteristics are key for the determination of their uses, because their presence can contribute to people selecting the plants for consumption.

Conclusions

The selection of wild food plants occurring in Brazil, known and used by different populations, presents a marked taxonomic bias. The identification of overused and underused families contributes to the discovery of families with potential for popularization. In addition, this work is important for the conservation of wild plants and for the promotion of food and nutritional security. Therefore, efforts are needed to identify the species that could be incorporated into the diet of populations, considering the characteristics that make some plants more used than others. Furthermore, which parts are most used, their nutritional value, the forms of consumption, the promising species among the wild food species of Brazil, and strategies for managing their use are fields yet to be explored.

In view of their wide geographical distribution, families such as Anacardiaceae, Myrtaceae, Arecaceae, and Passifloraceae can be strategic for food prospecting aimed at popularization.