Introduction

Functional plant products have developed into the most dynamic sector of the food industry in the Western world. This trend is driven by growing health awareness in a wealthy but ageing society, as well as by changing food habits, such as growing interest in vegetarian and vegan lifestyles. In this context, Amaranth is among the crops with the most impressive increase in economic impact, because it is hyped as a superfood and a gluten-free alternative to conventional cereals [1]. For Europe alone, the annual market volume for Amaranth already exceeded 2000 million € in 2020 and is projected to reach 5000 million € by 2028 [2]. Current applications of Amaranth in the food industry include its use as grains (cooked and popped) in breakfast cereals, muesli bars, confectionery, and chocolate specialties. Amaranth flour is blended with wheat flour to enhance the nutritional value of various products such as breads, noodles, cookies, and cakes [3]. Furthermore, Amaranth oil is used directly as gourmet oil in salads, dip sauces and smoothies, as well as an ingredient in ω3-enriched food products [4, 5].

Amaranth was an important staple food in the Aztec culture, equaling maize and beans with respect to economic impact, and it was even worshipped as a deity [6]. In consequence, cultivation of this crop was prohibited under Spanish rule and could be pursued only in a clandestine manner. Independently from Mexico, Amaranth was domesticated in Peru. Here, however, Amaranth was mainly used for subsistence mountain agriculture and ritual purposes, and never acquired the same importance as its relative Quinoa, which, along with potato, became the staple food of the Andean cultures [7]. Only in the 1960s was Peruvian Amaranth, known under the Qichwa name kiwicha, rediscovered, re-established, and popularised through the intense efforts of Luis Sumar Kalinowski (1993).

The ancestral species for these two domestication events differed: the Mexican Amaranth (the two species Amaranthus cruentus and Amaranthus hypochondriacus) derives from the wild A. hybridus, while the Peruvian kiwicha (A. caudatus) relates to the wild A. quitensis [8]. Since the Mexican Amaranth flowers earlier and, therefore, is more easily cultivated outside of its centre of origin, it has been the focus of both breeding efforts and commercial use. In contrast, kiwicha has been mostly neglected and, even in Peru, is only cultivated in a limited area around Cusco and the Urubamba region. This neglect has reached a stage where review articles bluntly introduce Amaranth as a “Mexican seed” (e.g., Rojas-Rivas et al. [9]), completely ignoring that Peruvian Amaranth even exists.

The neglect of kiwicha is not justified, since it harbours considerable nutritional value. For instance, compared to its Mexican relative A. cruentus, the Peruvian A. caudatus is richer in iron and calcium [10]. More importantly, the antioxidant capacity of A. caudatus (around 400 mmol Trolox equivalents kg−1 for DPPH; [11]) is around two orders of magnitude higher than that of its Mexican relatives (around 4 mmol Trolox equivalents kg−1 for DPPH; [12]). Moreover, kiwicha is rich in essential amino acids and, in certain fractions, surpasses other species of Amaranth. For instance, the prolamine fraction of A. caudatus contains around 50% more lysine, an amino acid often underrepresented in plant proteins [13].

Unfolding the potential of kiwicha as a functional food requires strategies to authenticate it against the far more common Mexican species of Amaranthus, which are by now also cultivated in Peru, i.e., in the domestication centre of kiwicha. Moreover, unlike its Mexican relatives, kiwicha has remained almost untouched by breeding efforts. Amaranths are morphologically diverse due to their high genetic diversity, but also due to environmental variability [14, 15]. The high genetic variation can be attributed to variations in chromosome number as well as to hybridisation supported by outcrossing between Amaranthus species [16]. Many grain, vegetable, and weedy Amaranths are allotetraploids with chromosome numbers of either 2n = 32 or 2n = 34 [17]. To enable future breeding strategies, and to safeguard its biodiversity, it is important to characterise the agro-morphological traits of kiwicha in comparison to other Amaranth species [18]. In Amaranthus, morphology depends strongly not only on species but also on environmental conditions. The interaction between production techniques and genotypes leads to significant differences between, but also within, species [19, 20]. Furthermore, since cross-species hybridisation can occur even in the wild, intermediate morphological forms exist that combine morphological characters from the respective progenitor species [21,22,23]. Therefore, identification of species based exclusively on morphological characteristics is difficult for Amaranthus.

To overcome the limitations of merely morphological characterisation, nuclear and chloroplast sequence information has been employed [14, 24]. In fact, it is possible to differentiate grain Amaranth species from other species based on single nucleotide polymorphisms (SNPs) derived from genotyping by sequencing [25]. However, this approach is laborious and expensive and not feasible for practical application. DNA barcoding provides a robust and reliable alternative, which in many plant taxa allows for discrimination down to the species level [26]. Depending on the scope of the study, different markers allow either surveying an entire taxonomic group or “zooming in” to differentiate between species of a genus.

For Amaranth, the plastidic barcodes rbcL and matK and the nuclear barcode internal transcribed spacer 1 have been used to clarify the phylogenetic position of A. tricolor against other species, including A. caudatus [27]. These markers were not able to resolve the different species. It should be noted, however, that the taxonomic identity of the accessions was not verified, which might have affected the discriminative power of the barcodes used. Upon integration of entire plastid genomes, it became possible to discern individual species, due to the additional informative sites interspersed over the genome. As a result, this study supported the model of two independent domestication events, bridged by a strong overlap with A. hybridus.

For practical applications, such as verification of commercial food samples or, even more relevant, authentication that seed material for cultivation is indeed kiwicha, sequence-free approaches are warranted. While sequencing of entire plastid genomes has been a powerful tool for phylogenetic studies [27], it would not be a feasible strategy here, because it is time-consuming and still too costly. As an alternative, the variable trnH-psbA intergenic region [28] could be used to develop a diagnostic assay to validate the identity of a particular species. The reliability of this strategy depends on a sufficient number of reference accessions whose taxonomic identity has been determined beforehand. Once sequence data from such reference plants of validated identity have been generated, it is possible to identify informative SNPs that delineate the target species from other taxa. These SNPs can then be used to develop sequence-free fingerprinting assays that allow surveying other individuals of interest. Diagnostic primers that are located in the center of the amplicon and target a specific SNP will generate a diagnostic band in addition to the full-length barcode in the species of interest, but not in the surrogate species [29, 30]. This strategy, termed ARMS (for Amplified Refractory Mutation System), was successfully applied for discriminating closely related Lamiaceae species to impede adulteration in commercial products [29], to authenticate the correct species of Goji berries for congruence with the Novel Food regulations of the European Union [30], or to detect adulteration of Bamboo Tea by Chinese Carnations caused by wrong translation of a Chinese vernacular name into English [31]. For Holy Basil or Tulsi, hyped as a superfood in many industrialised countries, this strategy could even be developed to a stage where different chemotypes of this plant could be discerned by a single PCR [32].

As a contribution to safeguarding the authenticity of kiwicha against other, commercially prevailing Amaranth types, we assessed the agro-morphological diversity and phylogenetic relationships within a germplasm collection comprising 84 accessions of Amaranthus representing all species of commercial interest, along with the relevant varieties that are currently grown commercially. Based on this information, we developed a species-specific ARMS strategy that allows for the discrimination of kiwicha (Amaranthus caudatus) from any other species of the genus Amaranthus.

Materials and methods

Plant materials

A total of 84 accessions of Amaranthus species were used in this study (Table 1). Of these accessions, 15 were classified as A. caudatus, 14 as A. cruentus, 9 as A. hypochondriacus, 4 as A. hybridus, 2 as A. spinosus, 1 as A. tricolor, 1 as A. powellii, and 38 as accessions of unknown taxonomical status. The sampling comprised all species that are used commercially, as well as landraces and varieties that had been bred for agricultural use. Therefore, this germplasm represents a significant proportion of the genetic diversity that is currently in human use. All accessions were raised to flowering and their taxonomic identities were determined following digital documentation of floral traits and plant habitus according to the taxonomic keys of the Flora of China (http://www.efloras.org/florataxon.aspx?flora_id=2&taxon_id=101257) and the Flora of North America (http://www.efloras.org/florataxon.aspx?flora_id=1&taxon_id=101257) [33]. To validate the applicability of the assay developed in the course of the study, thirteen commercially used Amaranth seed samples were included as well. Twelve of those samples were collected from farmers in different regions of Peru in the 2021 season; one sample, whose seeds originated from India, was purchased from a supermarket in Germany (Table S1).

Table 1 List of 84 accessions of Amaranth included in this study

Greenhouse planting and field transplanting

The morphological characteristics of all accessions were monitored in a field experiment under temperate environmental conditions. Prior to field planting, seeds of the 84 accessions were pre-grown in 100-well trays in a greenhouse at the Botanical Garden, Karlsruhe Institute of Technology (KIT), from 15 April 2020 onwards. Surface-sterilised seeds of the Amaranth accessions were sown equidistantly at a density of six seeds per well and, after germination, thinned in the greenhouse to one plant per well. A total of 50 wells were planted per accession, resulting in 50 seedlings. The wells were filled with a 1:1:1 mixture of peat moss, perlite, and soil, and the seedlings were raised in the greenhouse at 25 ± 3 °C with a 12-h photoperiod. Three weeks after germination, seedlings were transplanted to a field plot at the Botanical Garden of Karlsruhe Institute of Technology (Karlsruhe, Germany). To achieve a good plant stand, the distance between the individual seedlings of a row was set to 40 cm. The length of each plot was 2.5 m, and the distance between rows was 0.65 m. The seedlings were transplanted in three randomised experimental blocks, with each block containing three rows for each accession. The two outer rows served as buffers, while all measured parameters were taken from the plants in the middle row. Based on soil analysis, the recommended amounts of fertiliser were supplied as 100 g/m2 organic fertilizer (Hauert Hornoska® Special, Germany) and 90:60:40 kg/ha of NPK. A field capacity of 70–80% was maintained by irrigation throughout the entire experiment until seed maturity. Karlsruhe is located in the Upper Rhine Valley in Southwest Germany (latitude: 49°0′24.8004″ N, longitude: 8°24′13.1508″ E), with an average elevation of 119 m above sea level (based on the World Geodetic System 1984 datum). The climate is temperate oceanic, with average temperatures ranging from −1 °C in winter to 26 °C in summer. Temperature and rainfall were monitored during the experiment. The monthly average values are presented in Table S2.

Phenotyping and morphological evaluation

For morphological observations, three plants located in the center of the second (middle) row of each plot were sampled randomly to record the time needed until flowering (d), plant height (cm), length of inflorescence (cm), grain yield (g/plant), 1000-seed weight (mg), and cross area of seeds (mm2). The length of inflorescence was determined as the distance from the lowest flower to the tip of the highest flower in each plant. Seed cross-area was measured for three randomly selected seeds from each of the three plant samples in each replicate by recording images of each individual seed in a defined orientation using a Keyence VHX-950F digital microscope (Keyence, Neu-Isenburg, Germany), and measuring the cross section from the digital image using the area tool of ImageJ (https://imagej.nih.gov/ij/). Qualitative traits such as flower and seed color were assessed at flowering and at maturity stage by digital imaging (Keyence VHX-950F digital microscope). Seeds were harvested manually using pruning shears when all seeds had reached full maturity. The harvested panicles were dried in a hoop house for two weeks and then threshed manually. The threshed seeds were winnowed and evaluated for grain yield, 1000-seed weight, and seed color.

DNA extraction by CTAB method

Genomic DNA of the 84 Amaranthus accessions was extracted using an adjusted protocol based on cetyl trimethyl ammonium bromide (CTAB) [34] as follows: 100 mg of frozen leaf tissue was ground using a TissueLyzer (Qiagen, Hilden, Germany). In the context of method validation, seed material was also used in some cases. The resulting powder was incubated with 900 µl boiled extraction buffer (1.5% w/v CTAB) containing 10 µl/ml β-mercaptoethanol for one hour at 65 °C. The samples were mixed with 630 µl of chloroform/isoamylalcohol (24:1), shaken horizontally for 15 min and subsequently centrifuged for ten minutes (17,000 g). The upper aqueous phase (which contains the DNA) was transferred into a fresh 2 ml reaction tube and the DNA precipitated with 2/3 v/v of ice-cold isopropanol. The DNA was sedimented by centrifugation (10 min, 17,000 g, 4 °C), the sediment washed with 1 ml 70% EtOH, the EtOH removed by drying in a vacuum centrifuge for 15 min, and the DNA precipitate finally dissolved in 50 µl nuclease-free H2O (containing 5 µg RNAse A). The concentration and purity of the eluted DNA were determined spectrophotometrically (NanoDrop ND-100, peqlab).

PCR/Gel electrophoresis

As genetic barcodes, the intergenic spacer region separating the conserved psbA and trnH plastidic genes and the nuclear internal transcribed spacer (ITS, including ITS1, 5.8S and ITS2) were amplified by PCR using a reaction volume of 30 µl containing 20.4 µl nuclease-free water (Lonza, Biozym), 3 µl tenfold Thermopol Buffer (500 mM KCl, 100 mM Tris–Cl and 15 mM MgCl2), 3 µl bovine serum albumin (10 mg/ml), 0.6 µl dNTPs (200 µM, New England Biolabs), 0.6 µl each of forward and reverse primer (200 nM, see primer list, Table 2), and 0.3 µl of Taq polymerase (5 Units, New England Biolabs). 1.5 µl of extracted DNA (50 ng/µl) was used in each PCR reaction to amplify the marker regions. Thermal cycler conditions for the amplification of the psbA-trnH igs region and the ITS region are presented in Table 2.
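The pipetting scheme above can be checked and scaled with simple arithmetic. The following Python sketch is illustrative only: it assumes that 0.6 µl is added for each of the two primers (a reading under which the listed volumes sum to the stated 30 µl), and the component labels, the function name `master_mix`, and the 10% pipetting overage are hypothetical choices that are not part of the described protocol.

```python
# Per-reaction volumes (µl) as listed in the text; template DNA is added separately.
PER_REACTION_UL = {
    "nuclease-free water": 20.4,
    "10x Thermopol buffer": 3.0,
    "BSA (10 mg/ml)": 3.0,
    "dNTPs": 0.6,
    "forward primer": 0.6,   # assumption: 0.6 µl for each primer
    "reverse primer": 0.6,
    "Taq polymerase": 0.3,
}
TEMPLATE_DNA_UL = 1.5
REACTION_VOLUME_UL = 30.0


def master_mix(n_reactions, overage=0.10):
    """Return the volume (µl) of each component for n reactions.

    `overage` is a hypothetical safety margin for pipetting losses.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


# Sanity check: components plus template add up to the stated reaction volume.
total = sum(PER_REACTION_UL.values()) + TEMPLATE_DNA_UL
assert abs(total - REACTION_VOLUME_UL) < 1e-9

# Example: master mix for one reaction per accession (84 reactions).
mix_84 = master_mix(84)
```

Each reaction would then receive 28.5 µl of this mix plus 1.5 µl of template DNA.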

Table 2 Primers used to amplify DNA markers and the amplification protocol

The amplicon was evaluated by gel electrophoresis using NEEO ultra-quality agarose (Carl Roth, Karlsruhe, Germany). DNA was visualised using Midori Green (NIPPON Genetics EUROPE, Germany) and blue light excitation. The fragment sizes of the amplicons were determined by a 100 bp size standard (New England Biolabs). Prior to being sent out for sequencing, the amplified DNA was purified using the protocol of the MSB Spin PCRapace Kit (Stratec). Sequencing itself was outsourced to Macrogen Europe (Netherlands) and Eurofins (Germany). The quality of the obtained sequences was examined using the software FinchTV Version 1.4.01. Sequences were uploaded to the NCBI database, see Table S3.

Phylogeny

Sequences of the respective barcoding markers (psbA-trnH igs and ITS), including all 84 Amaranthus accessions of this study, were aligned using the Muscle algorithm of MEGA7 (Version 7.0.14, https://www.megasoftware.net/), and topology was inferred using the Neighbour-Joining algorithm. The resulting trees were visualised using the integrated Tree Explorer. For the sake of clarity, however, sequences of taxonomically undefined accessions and of A. cruentus that were not informative were omitted from the multiple sequence alignment, such that the resulting phylogenetic tree of psbA-trnH igs comprised 56 accessions of Amaranthus with informative sequences. The full trees including all 84 accessions are given in the appendix. Sequences of the closely related genus Celosia were used as an outgroup in the phylogenetic tree.
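Neighbour-Joining operates on a matrix of pairwise distances between the aligned sequences. As a minimal illustration of that input, the Python sketch below computes the simplest such measure, the proportion of differing sites (p-distance), on a toy alignment. The substitution model and gap treatment actually used in MEGA7 are not stated in the text, so the pairwise-deletion rule, the function name `p_distance`, and the invented sequences are assumptions.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites, skipping positions with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # pairwise deletion of gapped sites (assumption)
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy alignment; the sequences are invented for illustration only.
alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTTC",
    "taxon_C": "ACG-ACGATC",
}

# Pairwise distance for every pair of taxa, keyed by the sorted name pair.
distances = {
    (x, y): p_distance(alignment[x], alignment[y])
    for x, y in combinations(sorted(alignment), 2)
}
```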

Diagnostic fingerprints based on the amplified refractory mutation system (ARMS)

Based on a single nucleotide polymorphism at site 103 in the trimmed multiple sequence alignment of the intergenic spacer fragment of psbA-trnH, a species-specific diagnostic primer was designed to trace A. caudatus as described in [30], introducing an additional artificial nucleotide substitution at the third position from the 3’ end of the primer to suppress unspecific binding in other Amaranthus species. PCR was carried out as described above, except that the specific primer for A. caudatus was added to the reaction volume. The result of this multiplex PCR was evaluated by agarose gel electrophoresis. To validate the specificity of this species-specific diagnostic ARMS primer, the DNA of ten additional species was added to the dataset: A. crispus (Lesp. & Thévenau) A.Terracc, A. graecizans L., A. retroflexus L., A. viridis L. (two accessions, one of which was received as A. acutilobus Uline & W.L. Bray, ID9531), A. blitoides S.Watson, A. deflexus L., A. muricatus (Gillies ex Moq.) Hieron., A. blitum subsp. oleraceus (L.) Costea, A. albus L., and A. dubius Mart. ex Thell (Table S4).
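The design principle of such a diagnostic primer can be sketched in a few lines of Python: the 3’ terminal base is placed on the diagnostic SNP, and the third base from the 3’ end is deliberately altered. This is a simplified illustration, not the primer used in the study. The forward orientation, the primer length of 20 nt, the substitution table, the function names, and the toy sequences are all hypothetical.

```python
# Arbitrary substitution table for the deliberate destabilising mismatch (assumption).
MISMATCH = {"A": "C", "C": "A", "G": "T", "T": "G"}


def arms_forward_primer(template, snp_index, length=20):
    """Build a forward primer whose 3' terminal base sits on the SNP.

    template  : sequence of the target species (5'->3'), carrying the target allele
    snp_index : 0-based position of the diagnostic SNP in `template`
    The base at the third position from the 3' end is deliberately altered.
    """
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    primer = list(template[start:snp_index + 1].upper())
    primer[-3] = MISMATCH[primer[-3]]
    return "".join(primer)


def three_prime_matches(primer, template, snp_index):
    """True if the primer's 3' terminal base equals the allele in `template`."""
    return primer[-1] == template[snp_index].upper()


# Invented toy sequences differing only at index 24 (G in the target, A otherwise).
target = "ATGCCGTTAGCTAGGCTTACGATCGGATCCTA"
non_target = "ATGCCGTTAGCTAGGCTTACGATCAGATCCTA"

primer = arms_forward_primer(target, snp_index=24)
binds_target = three_prime_matches(primer, target, 24)          # 3' end pairs
binds_non_target = three_prime_matches(primer, non_target, 24)  # 3' end mismatched
```

In the target species only the deliberate mismatch remains, so extension can still proceed; in other species the additional 3’ terminal mismatch is expected to prevent amplification of the diagnostic band.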

Statistical analysis

Phenotypic data were subjected to individual ANOVA for different characters to assess the variability among the accessions and standard error of the treatment means using PROC ANOVA in the SAS software package (SAS Institute, Inc.). The Least Significant Difference (LSD) test was used to carry out post-hoc comparisons of differences among means, applying a significance threshold of P < 0.05 (PROC MEANS). To estimate the degree of linear association between the traits studied, simple correlation coefficients (r) were computed using the PROC CORR routine of SAS. Morphological data were standardised using the Z Score by OriginPro (OriginLab Corporation, Northampton, MA, USA). The Ward cluster method and a Euclidean distance type were used to cluster the data into a phenotypic cladogram for the 84 accessions of Amaranth (OriginPro). A principal component analysis (PCA) was applied to plot the relationship between distance matrix elements with respect to their first two principal components in OriginPro.
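For readers without access to SAS or OriginPro, the standardisation, correlation, and distance steps described above can be sketched in plain Python. This is a minimal illustration and not the routines used in the study: the trait values are invented, the use of the sample standard deviation (n − 1) for the Z score is an assumption, and the function names are hypothetical.

```python
import math


def z_scores(values):
    """Standardise values to mean 0 and sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    """Simple (Pearson) correlation coefficient between two equally long lists."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def euclidean(u, v):
    """Euclidean distance between two trait vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Invented toy values for four hypothetical accessions.
days_to_flowering = [70.0, 80.0, 90.0, 100.0]
plant_height_cm = [100.0, 150.0, 190.0, 280.0]

z_days = z_scores(days_to_flowering)
z_height = z_scores(plant_height_cm)
r = pearson_r(days_to_flowering, plant_height_cm)

# Distance between the first two accessions in standardised trait space,
# i.e. the kind of value a Ward clustering would start from.
d_01 = euclidean((z_days[0], z_height[0]), (z_days[1], z_height[1]))
```

Standardising first ensures that traits measured on different scales (days, cm, g) contribute comparably to the Euclidean distance.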

Results and discussion

Information on genetic diversity and clustering among and within Amaranth species is important to effectively utilize plant genetic resources to improve the crop through breeding programmes [35]. To assess genetic variation in Amaranth, morphological traits [35,36,37,38,39,40] have been monitored along with molecular markers [27, 37, 41, 42]. However, the morphology of Amaranthus plants is strongly dependent on environmental conditions, which causes significant phenotypic variation between and within species [19, 20]. Furthermore, even wild species of Amaranth tend to hybridise, such that morphological characters from different species can coexist in an individual accession [21,22,23]. Therefore, taxonomic identification of Amaranthus based solely on morphological characteristics is difficult [27, 41]. For the current study, we therefore employed both morphological characters and DNA barcoding markers to verify taxonomic identities, which, in the next step, allowed us to infer phylogenetic relationships. As a final step, we designed a specific ARMS strategy to discriminate kiwicha (A. caudatus) from other Amaranth species.

Kiwicha and Mexican Amaranth differ in their phenotypic parameters

Commercial use of Amaranth is dominated by species from Mexico, such as A. cruentus, A. hypochondriacus, A. hybridus, or hybrids between these species, leading to the question, to what extent kiwicha is phenotypically different. Therefore, a total of 6 quantitative traits were monitored, including time to flowering (days), plant height at maturity, inflorescence length, grain yield per plant, 1000-seed weight, and cross area of seed. The means, least significant differences, and coefficients of variation from 84 accessions are given in Table S5. Highly significant differences (P < 0.0001) were detected among the screened accessions for all assessed quantitative traits (Table S6).

The time to flowering, in particular, displayed high variability between and within species (70.3–100.0 d). The accessions A. hybridus cv. Ural (ID 8054) and A. hypochondriacus cv. Mittlerer Typ OR dunkel (ID 8095) were the earliest, requiring only 10 weeks until 50% of the plants had initiated flowering (Fig. 1, Table S5). Also, the two accessions of the wild A. spinosus (ID 3809 and ID 8290), known under the vernacular name Spiny Amaranth, were observed to flower early. In contrast, all accessions of A. caudatus were late in flowering under these temperate environmental conditions (Fig. 1; Table S5).

Fig. 1

Mean values of relevant agro-morphological traits recorded from 84 Amaranth accessions evaluated under temperate environmental conditions in South-West Germany during the season of 2020. A Days to flowering (d), B Plant height (cm), C Grain yield (g/plant). Data represent the mean and standard errors of three biological replicates. Accessions belonging to the species A. caudatus are given as black bars. Additional agro-morphological traits are presented in supplementary materials (Fig. S1 and Table S5)

In general, the Amaranthaceae are considered as day-length indeterminate [43]. However, A. caudatus displays distinct photoperiodism [44]. Once it has reached its sensitive period at around 30 d after germination, as few as two short days (9 h light) can induce floral primordia, while in long days (18 h light), flowering required 60 days to initiate. It should be noted that these experiments had been conducted under the conditions of an Illinois summer with night temperatures above 25 °C, such that development was more rapid than in the current experiment, where temperatures were much lower (Supplementary Table S2). Our observations are well in line with this. For grain production under temperate environmental conditions, such as in southern Germany, early flowering would be more advantageous to minimize exposure to low temperature during the sensitive flowering and post-anthesis grain filling periods. The pronounced photoperiodism may be the main reason why A. caudatus is underrepresented in global commercial production.

Significant variations were also observed in plant height, which ranged from 94 to 286 cm. Here, the 15 accessions of A. caudatus were much taller than the other species (Fig. 1; Table S5). Likewise, grain yield per plant showed a pronounced variation with mean values from 0.6 to 90.5 g and an overall mean of 30.1 g (Fig. 1; Table S5). Here, most accessions of A. caudatus were in the lower range (12 of the 15 accessions were at < 14 g/plant) compared to the A. hypochondriacus and A. hybridus accessions in our study. The highest values were found in the accession A. hypochondriacus cv. Mittlerer Typ (ID 8084) with 90.5 g/plant, followed by the accession A. cruentus cv. MT 3 (ID 8041) with 79.3 g/plant. In general, the accessions of the grain Amaranth species showed only moderate grain yield in our study compared to other studies that were conducted in East Austria [45,46,47], which is probably due to the more temperate conditions at our study site. The night temperature dropped to +5.9 and +3.9 °C in the months of September and October, respectively, which may have affected the rates of photosynthesis, plant growth, and seed setting (Table S5). According to previous studies, the investigated species of Amaranth show a similar response pattern to temperature: net photosynthesis (CO2 fixation) increases almost linearly between 10 and 30 °C and decreases when the temperature is lower [48,49,50,51,52]. This pattern of response to temperature is typical for most C4 species such as Amaranth and explains why A. caudatus showed lower grain yields in our hands (Fig. 1). In contrast to A. caudatus, the accessions of the weedy species A. spinosus (ID 3809 and ID 8290), A. powellii (ID 8938), and A. hybridus (ID 8053, ID 8052, and ID 8054) were shorter and flowered early, but differed from the Mexican grain Amaranths by low grain yield and 1000-seed weight (Fig. 1, Table S5). Spiny Amaranth (A. spinosus) is a native of the Neotropics, while A. hybridus is a native riverbank pioneer with a wide geographic distribution ranging from Eastern North America and parts of Mexico, over Central America to Northern South America [6]. Meanwhile, A. spinosus and A. hybridus have invaded almost all continents and have turned into noxious weeds in many crops including rice and cotton [53]. The early flowering has to be considered as a trait supporting a weedy lifestyle, enabling invasion and completion of the life cycle earlier than the invaded crop species.

A substantial variability was also seen for the other traits, such as inflorescence length (48–152 cm), 1000-seed weight (0.16–1.06 g), and cross area of seed (0.58–1.70 mm2) (Figure S1; Table S5). Likewise, whereas the accessions of A. caudatus clustered for total plant length, the other species were widely distributed across the entire population with respect to inflorescence length (Figure S1). The lowest values for 1000-seed weight, as well as for cross area of seed, were noted in the two accessions of the wild A. spinosus (ID 3809 and ID 8290), while the A. hypochondriacus accession Anderer Typ (ID 8096) recorded the highest values for these two traits. The range of 1000-seed weight in our study was wider than the values reported previously for the more continental conditions of Austria [45, 54, 55]. For instance, compared to the 0.62–0.93 g found in A. cruentus and A. hypochondriacus [45], the range in our study was from 0.56 to 1.06 g in these two species (Table S5). Seeds of A. caudatus were lighter, with 1000-seed weights between 0.48 and 0.83 g. In contrast, more than 25 accessions belonging to A. cruentus, A. hypochondriacus, and Amaranthus species of unknown taxonomical status had 1000-seed weights above 0.75 g, which was higher than previous reports and indicates that the cultivation conditions were optimal [47, 55]. Since 1000-seed weight is one of the crucial breeding targets for the improvement of grain yield [56], we will return to the relationship between these two traits later in more detail.

Plant height is positively and specifically correlated with flowering time

To predict the consequences of selection for one trait on the performance of others, the examination of genetic relationships between different traits is of great importance. Therefore, the correlations between the six quantitative traits (days to flowering, plant height at maturity, inflorescence length, grain yield per plant, 1000-seed weight, and cross area of seed) were investigated (Fig. 2, Table S7). Grain yield was significantly and positively correlated with inflorescence length (r = 0.28, P < 0.001), 1000-seed weight (r = 0.27, P < 0.05), and cross area of the seed (r = 0.24, P < 0.05), which is consistent with results on grain Amaranths in Africa [36, 46, 57]. In principle, a correlation between two traits can derive from developmental interaction or from genetic linkage. The architecture of the Amaranth inflorescence is complex and specific for different species, which is partially due to human selection during domestication [58]. It is self-evident that more extended inflorescences will produce higher grain yields. However, grain yield was negatively correlated with the time required until flowering (Fig. 2A), which is pronounced in A. caudatus as a late-flowering species. Here, flowering did not occur until autumn, such that these accessions experienced cold stress during flowering and post-anthesis grain filling. Days to flowering were significantly and positively correlated with plant height (r = 0.61, P < 0.0001), which is to be expected from a longer vegetative period (Fig. 2D). However, there were no significant relationships between inflorescence length on the one hand, and 1000-seed weight and cross area of the seed on the other (Table S7), indicating that resource allocation to the inflorescence is not limiting for seed size. Instead, a strong positive association was found between 1000-seed weight and cross area of the seed (r = 0.82, P < 0.0001), as expected, because larger seeds should be heavier (Table S7).
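How a significance level follows from a correlation coefficient and a sample size can be illustrated with the Fisher z-transformation. The Python sketch below is only a normal approximation and not the test performed by PROC CORR, which relies on the t distribution. The function name `approx_p_value` is hypothetical, and the sample size of 84 (one value per accession) is an assumption, since the text does not state how many data points entered each correlation.

```python
import math
from statistics import NormalDist


def approx_p_value(r, n):
    """Approximate two-sided p-value for H0: rho = 0 via the Fisher z-transformation.

    Normal approximation; an exact test would use the t distribution.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be larger than 3")
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# r is taken from the text; n = 84 is an assumption (see above).
p_height_vs_flowering = approx_p_value(0.61, 84)
```

Under these assumptions, the correlation between plant height and days to flowering falls well below the 0.0001 threshold.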

Fig. 2

Relationship between different agro-morphological traits obtained from 84 Amaranth accessions evaluated under temperate environmental conditions during the season of 2020. Relationship of grain yield with days to flowering (A), plant height (B), and inflorescence length (C). D Relationship of plant height with days to flowering. ns non-significant; **, *** and **** Significant at 0.01, 0.001, and 0.0001 probability levels, respectively

Seed color does not qualify as taxonomic trait to distinguish grain Amaranth species

Seed color is considered as an important domestication trait in grain Amaranths, and A. caudatus is often proposed to be less domesticated based on the occurrence of dark coloration in many accessions [8]. In fact, many breeding programs try to develop high-yielding Amaranth varieties with brown or light seed color [59]. We, therefore, studied seed and flower color of our population to determine the genetic variability within and between the species of grain Amaranths (Table 1 and Figure S2). Both traits were documented in a standardized manner as representatively shown in Fig. 3. We could discern six different colors, namely white, cream, gold, pink, brown and black. Twelve accessions of A. caudatus showed light-colored seeds (Figure S2_7, 12, 14), nine of which had even white seeds, while the others displayed golden (ID 7469), pink (ID 9408), or cream-colored (ID 8102) surfaces. The other three accessions of A. caudatus have brown (ID 9442) and black seeds (ID 3807, ID 2496) (see Figure S2_6, 12, 13). The seeds of all A. hypochondriacus accessions were cream-colored or golden, throughout. For A. cruentus, twelve accessions had cream-colored seeds, but in two accessions (ID 7470 and ID 8049) seeds were brown. For A. hybridus, three accessions had black and one cream-colored seeds (ID 8939). The seeds of the two wild species A. spinosus and A. tricolor were black (see Figure S2_17, 24, 30). Since most of the domesticated grain Amaranths (A. caudatus, A. cruentus, and A. hypochondriacus) show light seed colors, this trait was likely selected positively during domestication (Figure S2). This is further supported by the fact that all tested accessions from wild Amaranth species exhibited seeds of dark color, albeit this conclusion is based on a relatively low number of wild accessions in this study. 
However, our conclusion is consistent with [8], which reported that all 24 putative wild Amaranth accessions (including A. hybridus) had dark brown seeds, while the seeds of 89 domesticated A. caudatus individuals tested in that study were white. Seed coloration is linked with the accumulation of proanthocyanidins in the testa. The antimicrobial activity of these compounds is likely to improve the survival of seeds after they have been shed. In fact, a study on weedy red rice, a feral form of rice, showed that accumulation of proanthocyanidin and seed shattering were key factors for de-domestication and were tightly coupled [60]. In congruence with previous reports [8], light seed color is predominant, but not exclusive, in domesticated A. caudatus and A. cruentus (Table 1 and Figure S2). The incomplete fixation of this domestication trait indicates either weak stringency of selection, genetic constraints, or ongoing gene flow. While pigmentation of the testa seems to be essential for a wild or feral lifestyle, where seeds are shed, it certainly does not pose a genetic constraint under the conditions of domestication. In fact, loss of pigmentation is a common hallmark of domestication in several crops, for instance in rice [61]. Domestication releases the selective pressure upon this trait because seed shedding and dormancy have been eliminated by agriculture. Thus, unpigmented forms can (but do not need to) arise. The occurrence of a (small) number of colored accessions in domesticated Amaranths might mean that this trait was not actively selected during domestication (contrasting with the situation in rice). However, gene flow from crop wild relatives must be considered a more relevant factor, since the outcrossing rate in Amaranth is relatively high, between 5 and 30%, and the domesticated species remained sympatric with their crop wild relatives, especially A. hybridus [8, 62, 63].

Fig. 3

Variability in coloration for seeds (A–F), inflorescence, and individual florets (G–J). Representative examples for seeds with A white (ID 9444), B cream (ID 8047), C gold (ID 8083), D pink (ID 9408), E brown (ID 9442), and F black (ID 8078) color. Representative examples for panicle and individual flower with the colors G white (ID 8087), H ocher (ID 8090), I pink (ID 8600), and J purple (ID 8045). All images were recorded with a magnification of 150× using Keyence VHX-950F digital microscope (Keyence, Neu-Isenburg, Germany). Scale bar for seeds is 200 µm, and for inflorescences and flowers 1 mm

The differences in seed color are reflected by corresponding differences in flower color, as systematically documented by digital images of both the entire inflorescence and the individual flower. Here, four different colors were observed, namely white, ocher, pink, and purple. Most A. caudatus accessions showed white or pink flowers, but the accessions ID 7469, ID 3807, and ID 8102 produced purple flowers (see Figure S2_12, 13, 14). For A. hypochondriacus, color ranged from white through ocher and pink to purple. For A. cruentus, the flowers were white, pink, or purple. Three accessions of A. hybridus had purple flowers, but one had white flowers (Figure S2_30–33). Overall, of the 84 accessions in our study, 74 produced purple or white flowers (Table 1 and Figure S2). Coloration is widely used to assign a given accession to one of the grain Amaranth species [22]. Our data show that seed and flower color vary within species and overlap between species, congruent with previous reports [35, 64]. Thus, color does not qualify as a reliable trait for taxonomic identification. In particular, it is not possible to conclude from a darker coloration of the seeds that the respective accession is A. caudatus.

Kiwicha is phenotypically more defined than the Mexican grain Amaranths

As we had observed a high phenotypic diversity between cultivated grain Amaranth species (see first and second sections), we wondered whether this may result from strong gene flow between cultivated Amaranths and their relatives [65]. Therefore, the accessions were grouped based on their phenotypic parameters to test whether grain Amaranths cluster according to species and apart from their wild relatives (Fig. 4). After normalisation of the values based on their Z-score, a cladogram was inferred for the 84 accessions from the Euclidean distances using the Ward cluster method (Fig. 4A). This phenotypic tree reveals two main clusters. Cluster I comprises 54 accessions and is composed of three subclades (defined as A, B, and C), while cluster II consists of 30 accessions and is divided into two subclades (defined as D and E). All 15 accessions of A. caudatus fall into cluster I, where they form an exclusive subclade A. Cluster I further contains 4 accessions of A. cruentus, 4 of A. hypochondriacus, 2 of A. spinosus, and 1 each of A. hybridus and A. tricolor. In addition, 24 accessions of unknown taxonomic identity fall into cluster I. In contrast, none of the A. caudatus accessions was located in cluster II, which harboured most (10) of the fourteen accessions of A. cruentus, 4 accessions of A. hypochondriacus, 2 of A. hybridus, and 14 accessions of unknown taxonomic identity. The two accessions of the wild A. spinosus (3809 and 8290) form a separate branch basal to subclade C. The other species are distributed over the entire dendrogram without any detectable species-specific pattern. For example, 10 accessions of A. cruentus were located in cluster II, but 4 accessions in cluster I. As a cosmopolitan genus with a large number of species and high phenotypic plasticity [66], the genus Amaranthus already harbours considerable variability. This is further accentuated by frequent outcrossing (5–30%) and hybridisation events [67]. The variability within A. cruentus and A. hypochondriacus was mainly due to the traits flowering time, plant height, and grain yield (Fig. 1), and these two species were closer to each other than to A. caudatus (Fig. 4A). Not a single accession of A. caudatus was grouped in cluster II. Based on their predominantly upright architecture, the accessions of unknown identity more likely belong to A. cruentus or A. hypochondriacus [59], but the plasticity in plant height, days to flowering, and flower color was considerable and showed that these traits are not suited as reliable taxonomic markers. Nevertheless, most of these unknown accessions had an acceptable grain yield, a semi-dwarf height, and a short vegetative growth phase, which makes them interesting for breeding. However, other traits, such as dark seed color, indicate that they had not been subject to prolonged breeding.
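The clustering procedure described above (Z-score normalisation, Euclidean distance, Ward's minimum variance criterion) can be sketched in a few lines. The following is only an illustrative sketch, not the software used in this study: the function names, the use of the population standard deviation, and the four toy accessions with two traits (days to flowering, plant height in cm) are assumptions made for the example.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardise each trait (column) to mean 0 and standard deviation 1.
    Assumes every trait varies (a constant column would divide by zero)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def ward_clusters(points, k):
    """Agglomerate until k clusters remain. At each step the two clusters are
    merged whose fusion adds least to the within-cluster sum of squares:
    cost = n_a * n_b / (n_a + n_b) * squared Euclidean centroid distance."""
    clusters = [([i], list(p)) for i, p in enumerate(points)]  # (members, centroid)
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        n = len(ma) + len(mb)
        centroid = [(len(ma) * x + len(mb) * y) / n for x, y in zip(ca, cb)]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((ma + mb, centroid))
    return sorted(sorted(members) for members, _ in clusters)

# Hypothetical toy data: rows are accessions, columns are two traits.
toy_traits = [[60, 120], [62, 125], [95, 210], [98, 205]]
toy_groups = ward_clusters(zscore_columns(toy_traits), k=2)
```

Standardising first prevents traits measured on large scales (such as plant height) from dominating the Euclidean distances; cutting the tree at two clusters corresponds to the distinction between cluster I and cluster II.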

Fig. 4

Cladistic representation and Principal Component Analysis (PCA) for 84 accessions of Amaranthus based on 6 agro-morphological traits. A Phenotypic tree based on Euclidean distance inferred by Ward's Minimum Variance. B PCA plot showing scores for PC1 and PC2, C Eigenvectors, eigenvalue, total variation, and cumulative variance derived from the PCA. The experiment was conducted under temperate environmental conditions in South-West Germany during season 2020. The accessions are highlighted according to species: green A. caudatus, blue A. cruentus, red A. hybridus, violet A. hypochondriacus, black other species or accessions of unknown identity

The Principal Component Analysis (PCA) showed that the first three PCs, with eigenvalues above 1.00, accounted for 85.3 percent of the total variation among the tested accessions (Fig. 4B, C). Together, the first and second principal components described 62.3 percent of the variation. The traits with the highest contribution to PC1 were 1000-seed weight and cross area of seed (with a relative weight of 54%), followed by grain yield (relative weight 28%). Several traits, such as plant height, days to flowering, and inflorescence length, were negatively correlated with PC1. In contrast, days to flowering (with a relative weight of 57%), 1000-seed weight, and cross area of seed (each with a relative weight of 24%) had the highest impact on PC2. Conversely, inflorescence length and grain yield exhibited moderately or even strongly negative loadings on PC2. All the traits exhibited generally high and positive communalities above 0.30 on PC3. Overall, the PCA separated A. caudatus clearly from the other species, which were mutually interspersed. Thus, kiwicha is phenotypically well defined, while Mexican grain Amaranths are more variable and mutually overlapping. After being degraded to minor crops in their native homelands due to crop substitution and suppression by Spanish colonialism, the grain Amaranths have made a comeback in some parts of Mexico and South America [68]. Later, they spread to Africa [69] and Asia [53, 70]. Their wide geographical distribution, their phenotypic plasticity, the presence of intermediate forms, and the degradation of biodiversity in their two centres of domestication [42] have stimulated the emergence of many synonyms [66, 71]. Resolving this by phenotype does not seem to be a feasible strategy. Therefore, DNA-based strategies provide a useful alternative to evaluate germplasm for conservation, to characterise core collections, and to support breeding by marker-assisted selection [72].
From the perspective of consumer safety, genetic authentication can help to safeguard the authenticity and, thus, the quality of commercial products, and to protect consumers from substituted or adulterated products.
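The way the share of variation per principal component follows from the eigenvalues can be illustrated with a short sketch. It assumes a PCA on standardised traits, where the eigenvalues sum to the number of traits, and it uses the common criterion of retaining components with eigenvalues above 1. The eigenvalues below are hypothetical and are not the values from Fig. 4C.

```python
def explained_variance(eigenvalues):
    """Return per-component share, cumulative share, and the number of
    components with an eigenvalue above 1 (retention criterion)."""
    total = sum(eigenvalues)
    shares = [ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for share in shares:
        running += share
        cumulative.append(running)
    retained = sum(1 for ev in eigenvalues if ev > 1.0)
    return shares, cumulative, retained

# Hypothetical eigenvalues for six standardised traits (they sum to 6).
hypothetical_eigenvalues = [2.4, 1.5, 1.2, 0.5, 0.3, 0.1]
shares, cumulative, retained = explained_variance(hypothetical_eigenvalues)
```

In this toy case, three components pass the threshold, and their cumulative share is read from `cumulative[retained - 1]`, in the same way as the 85.3 percent reported for the first three PCs.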

The genetic barcode internal transcribed spacer (ITS) is not informative in Amaranthus

Since the phenotypic clustering had demonstrated a clear delineation of A. caudatus from the remaining Amaranth accessions, we explored the possibility of finding genetic barcodes that would reflect this difference. The nuclear Internal Transcribed Spacer (ITS) is widely used as a genetic barcode and has a complete length of 551–558 base pairs (including the ITS1, 5.8S, and ITS2 regions) in the genus Amaranthus. Unfortunately, this marker was not informative, because the sequences were absolutely identical between A. caudatus and A. hypochondriacus, and between A. cruentus and A. hybridus, respectively. The discriminative power of the ITS region is, therefore, sufficient neither to separate these species nor to develop a sequence-based fingerprinting strategy to authenticate A. caudatus. This lack of variation is reflected in the topology of the phylogenetic tree (Figure S3), which fails to resolve the grain Amaranth species, congruent with previous results [73] that likewise reported low divergence of ITS in Amaranthus. As a nuclear marker, ITS is inherited through both parents and is, therefore, subject to meiotic recombination, such that under conditions of gene flow, as apparently mediated by A. hybridus as wild carrier [8], potential differences between species will be equalized.

The plastidic barcode psbA-trnH igs allows to delineate Amaranthus caudatus

Since the widely used nuclear barcode, ITS, was not informative, we tested the plastidic psbA-trnH igs marker to find sequence polymorphisms diagnostic for A. caudatus. This barcode often shows a high variability that in some cases even allows resolution of taxonomic levels below the species [32]. In eudicotyledons, the length of the psbA-trnH igs barcoding region ranges from 152 to 851 base pairs, with an average length of 357 base pairs [74].

In Amaranthus, the psbA-trnH igs marker has a length between 230 (in A. powellii, ID 8938) and 242 (in A. blitum subsp. oleraceus, ID 9526) base pairs and is, thus, shorter than average, which reduces the possible number of parsimony-informative sites. Nevertheless, a small number of deletions, insertions, or substitutions could be detected among the different Amaranth accessions (Suppl. Table S3). The separation of A. caudatus from all other Amaranthus species was reflected by the psbA-trnH igs barcoding marker. However, this marker region does not display sufficient discriminative power to delineate other Amaranthus species. All 15 tested A. caudatus accessions were completely congruent with respect to the psbA-trnH igs and well separated from all other Amaranthus species. A single diagnostic SNP in the multiple sequence alignment of psbA-trnH igs is characteristic for A. caudatus. To place the grain Amaranth species into a larger context, we inferred a phylogenetic tree based on the psbA-trnH igs marker by sampling additional species of the genus (Fig. 5, Figure S4). The two Celosia species C. argentea and C. cristata were selected as outgroup. Among the tested accessions from the genus Amaranthus itself, the Mediterranean species A. blitum subsp. oleraceus was located at the base of the other taxa. The Peruvian A. caudatus accessions form a well-defined clade supported by a high bootstrap value. The closest sister clade comprises A. viridis, A. crispus, and A. dubius. A second accession of A. viridis, along with the North American species A. retroflexus and A. powellii, is located basal to these two clades.
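The search for a diagnostic SNP in a multiple sequence alignment can be sketched as a column-wise scan: a site qualifies if all sequences of the target species share one base that does not occur in any other sequence at that position. The sketch below is a minimal illustration under the assumption of equal-length, already aligned sequences; the function name, the sequence labels, and the short toy sequences are hypothetical and do not correspond to the psbA-trnH igs data.

```python
def diagnostic_sites(alignment, target):
    """Return (1-based column, target base, bases of other taxa) for every
    column where the target sequences share one base absent from all others."""
    target_seqs = [seq for name, seq in alignment.items() if name in target]
    other_seqs = [seq for name, seq in alignment.items() if name not in target]
    sites = []
    for col in range(len(target_seqs[0])):
        target_bases = {seq[col] for seq in target_seqs}
        other_bases = {seq[col] for seq in other_seqs}
        if (len(target_bases) == 1
                and "-" not in target_bases          # ignore alignment gaps
                and target_bases.isdisjoint(other_bases)):
            base = next(iter(target_bases))
            sites.append((col + 1, base, "".join(sorted(other_bases))))
    return sites

# Hypothetical toy alignment (not real sequence data).
toy_alignment = {
    "caudatus_1":        "ATTGCAGTA",
    "caudatus_2":        "ATTGCAGTA",
    "cruentus_1":        "ATTTCAGTA",
    "hypochondriacus_1": "ATTTCA-TA",
}
toy_sites = diagnostic_sites(toy_alignment, target={"caudatus_1", "caudatus_2"})
```

In the toy alignment, only column 4 qualifies (G in the target group, T elsewhere); column 7 does not, because the target base also occurs in a non-target sequence.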

Fig. 5

Phylogenetic tree of the genus Amaranthus based on the psbA-trnH igs. The A. caudatus cluster is labelled in green, the sister clades and the accessions basal to the A. caudatus clade in dark green. Two other main clusters (containing A. hypochondriacus, A. hybridus, A. cruentus and A. spinosus) are labelled in red and dark red. Outliers within the genus Amaranthus (e.g., A. graecizans) are displayed in grey. The closely related genus Celosia (black) was chosen as the outgroup. Bootstrap values are noted next to the branches

For the Mexican Amaranths, two main clusters (labelled in red and dark red, see Fig. 5, Figure S4) emerge that are not clearly assigned to specific species, but are shared between A. hypochondriacus, A. hypochondriacus x hybridus, A. hybridus, A. cruentus, and A. spinosus, along with some accessions that are not assigned to a particular species, but derive from breeding programmes and, therefore, from the same species. The red cluster comprises the two accessions of A. spinosus and some representatives of A. hypochondriacus. The dark red cluster comprises all A. hybridus, A. cruentus, and A. hypochondriacus x hybridus accessions as well as representatives of A. hypochondriacus. The separation of these two clades can be traced back to one insertion (site 32), one double deletion (site 120), and one substitution event (site 137) in the psbA-trnH igs alignment. The eleven accessions of A. hypochondriacus are distributed over both clades (Fig. 5, Figure S4).

In summary, the psbA-trnH igs marker clearly delineates kiwicha from the other grain Amaranths, while the Mexican grain Amaranth species are mutually interspersed and do not display taxon gaps. The psbA-trnH igs is a plastidic marker and, thus, does not record hybridisation events, since chloroplasts are inherited maternally. This may be the reason why it can delineate A. caudatus from its Mexican ancestors despite the presumed gene flow through A. hybridus [8].

A robust phylogeny for the genus Amaranthus would require additional DNA barcodes, a combination of those, or the integration of trait-related genes. However, for our purpose, to authenticate kiwicha by means of DNA-based assays, the consistent presence of a kiwicha-specific SNP in the psbA-trnH igs marker was sufficient and stimulated us to develop a sequencing-independent fingerprinting assay to authenticate kiwicha.

ARMS-based duplex-PCR clearly identifies A. caudatus

While the SNP identified in the psbA-trnH igs marker can be used to discriminate kiwicha from closely related species such as A. cruentus or A. hypochondriacus, a sequencing-free assay would be preferable for the practical authentication of commercial samples. Thus, we designed a duplex PCR to detect this informative SNP at site 103 in the alignment (Fig. 6A, B). The objective of the assay is to obtain, in addition to the full-length psbA-trnH igs amplicon, a diagnostic side band that is present or absent depending on this SNP. Since a single SNP is usually not sufficient to produce an all-or-none output in amplification, a destabilising mutation was added three positions from the 3' end of the oligonucleotide primer (the so-called Amplification Refractory Mutation System, ARMS). This strategy was successful, because the band seen for the full-length amplicon upon gel electrophoresis was now accompanied by a second diagnostic band in all A. caudatus accessions, while this second band was lacking in all other accessions (Fig. 6C). The upper band, corresponding to the complete psbA-trnH igs fragment, serves as a positive control to monitor the overall success of the PCR, whereas the smaller fragment indicates the presence of the G103 that delineates A. caudatus from all other tested Amaranthus species. The relatively small size of the entire psbA-trnH igs barcoding region in Amaranth is favourable in this regard, because it should make it possible to trace even strongly fragmented DNA in dried or processed samples. Thus, this tool can be easily applied to test samples of unknown or uncertain identity for A. caudatus, which is helpful to verify, for instance, the identity of commercial samples declared as kiwicha.
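The ARMS principle described above can be illustrated with a short sketch: the primer ends with its 3' base on the SNP and carries one deliberate mismatch at the third base counted from the 3' end. On the target allele, only the deliberate mismatch remains, whereas on the alternative allele two mismatches accumulate near the 3' end, which suppresses extension. The sketch assumes a forward-oriented primer, an arbitrary base-swap rule for the destabilising mismatch, and short invented template sequences; none of these are the primers or sequences used in this study.

```python
# Arbitrary illustrative choice for the deliberate mismatch (an assumption).
SWAP = {"A": "C", "C": "A", "G": "T", "T": "G"}

def arms_primer(template, snp_pos, length=20):
    """Build a forward ARMS primer whose 3' terminal base sits on the SNP
    (1-based snp_pos) of the target-allele template and whose third base
    from the 3' end is deliberately mismatched."""
    start = snp_pos - length
    if start < 0:
        raise ValueError("primer would extend beyond the template")
    primer = list(template[start:snp_pos])
    primer[-3] = SWAP[primer[-3]]
    return "".join(primer)

def mismatches_3prime(primer, template, snp_pos, window=3):
    """Count mismatches between primer and template in the last bases
    at the 3' end of the primer binding site."""
    site = template[snp_pos - len(primer):snp_pos]
    return sum(p != s for p, s in zip(primer[-window:], site[-window:]))

# Hypothetical templates differing only at position 12 (G versus T).
toy_target = "TTACGATCCAAGCTAA"   # carries G at the SNP
toy_other = "TTACGATCCAATCTAA"    # carries T at the SNP
toy_primer = arms_primer(toy_target, snp_pos=12, length=10)
on_target = mismatches_3prime(toy_primer, toy_target, snp_pos=12)
on_other = mismatches_3prime(toy_primer, toy_other, snp_pos=12)
```

For the toy sequences, the primer shows one mismatch on the target allele and two on the alternative allele, which mirrors the presence or absence of the diagnostic band. In practice, the type of destabilising mismatch and the primer orientation have to be optimised empirically.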

Fig. 6

A Schematic representation of the psbA-trnH igs region with location and orientation of the oligonucleotide primers. The T to G exchange specific for Amaranthus caudatus is highlighted. B Partial display of the multiple sequence alignment, demonstrating that Amaranthus species other than A. caudatus show a T instead of the G found in A. caudatus. C Diagnostic duplex PCR based on the diagnostic ARMS primer shows a double band for A. caudatus, while all other Amaranth species display only the full-length psbA-trnH igs amplicon

The specific ARMS assay allows for rapid identification of kiwicha

The germplasm used in the current study harboured several accessions of unknown taxonomic position beyond the genus annotation (see Table 1). To validate our specific assay for the authentication of kiwicha, we tested the remaining accessions of unknown identity that, by their phenotype (Fig. 4B), were placed outside of A. caudatus. In fact, none of these accessions displayed the diagnostic double band, which excluded that they are A. caudatus (Fig. 6C), consistent with the phylogenetic position of these accessions outside of the A. caudatus clade (Fig. 5, Suppl. Figure S4). Furthermore, we validated the assay on 13 commercial seed samples of Amaranth (Table S1). Twelve of them (G-1 to G-12) originated from seed material grown by local farmers in different regions of Peru and collected in the season 2021; one sample (G-13) was purchased in a supermarket in Germany and had been imported from India. Two of the Peruvian samples, G-4 and G-8, as well as sample G-13 from the German supermarket, did not display the diagnostic double band, which excluded that they were A. caudatus (Figure S5A). When we sequenced the psbA-trnH igs region of these samples and checked their phylogenetic position, we found them clearly located outside of the A. caudatus clade (Figure S5B, C), confirming the validity of the conclusion based on the duplex PCR. Thus, our ARMS assay allows for rapid, sequence-free validation of a given sample as kiwicha. The commercial sample from Germany was merely declared as Amaranth without reference to the species and, thus, did not claim to contain A. caudatus. Here, the presence of A. caudatus was not to be expected anyway. More problematic is the detection of other species in the Peruvian samples (G-4 and G-8), because those were grown as kiwicha and will later be commercialised as such.
This result also means that the authenticity of Peruvian Amaranth is at stake, most likely because some of the newly bred varieties deriving from Mexican Amaranth have been distributed among Peruvian farmers.

Conclusions and outlook

The Peruvian Amaranth A. caudatus (kiwicha) is underrepresented in breeding and in global trading, although it is endowed with interesting nutritive traits with great potential as a functional food. Using a collection of 84 Amaranth accessions, we searched for traits that delineate kiwicha from other grain Amaranths. This germplasm comprised a large variation of agro-morphological parameters, scored under the environmental conditions of Southwest Germany. The statistical evaluation showed that the Amaranthus species could not be delineated by Principal Component Analysis of these phenotypical parameters due to broad and overlapping intraspecific distributions of these parameters. Unfortunately, this drawback was also not overcome using the nuclear barcoding marker Internal Transcribed Spacer. In contrast, the plastidic marker psbA-trnH igs, while not resolving the phylogenetic relationship between the Mexican species of Amaranthus, provided a species-specific SNP separating A. caudatus from all other grain Amaranths in the collection. Based on this SNP, we developed a duplex PCR that verifies the identity of a given sample as kiwicha (A. caudatus) by virtue of a diagnostic side band amplified together with the full-length amplicon. We validated this diagnostic assay against other grain Amaranths of known identity, but also against more distant species, such as the Indian A. tricolor or the European A. graecizans. Additionally, grain Amaranths of presumably Mexican origin, or hybrids thereof, could be shown to lack the diagnostic band, thus excluding them as being kiwicha. This assay will help to verify the authenticity of commercial samples and seed material declared as kiwicha and, thus, to safeguard the identity of this endangered crop against substitution by Amaranth of Mexican origin. It is not meant to resolve the complex evolution and domestication history of the grain Amaranths.
This will require genome-based approaches such as Genome-Wide Association Studies (GWAS) or genome sequencing to get insight into the relative importance of selection for the taxonomic relationships within the genus, and into the role of hybridization events for gene flow. To unfold the potential of kiwicha as a functional food, we are currently analysing the molecular basis of value-giving compounds, such as poly-unsaturated fatty acids or lysine, to develop trait-related markers that can be used for marker-assisted breeding, but also for quality control of commercial samples. The application of our authentication assay to samples from a commercial context reveals that the genetic authenticity of kiwicha is endangered by the introduction of foreign seed material deriving from Mexican Amaranth, which in the long term will erode the nutritional specificities of kiwicha. Here, a system of seed quality testing by public authorities is urgently needed, as is already in place in many countries.