Mitochondrial relationships between various chamomile accessions

Matricaria chamomilla L. (GRIN; The Plant List 2013) is an important medicinal plant and one of the most frequently consumed tea plants. In order to assess mitochondrial genome variation of different cultivated chamomile accessions, 36 mitochondrial SNP markers were used in a HRM (high resolution melting) approach. In thirteen accessions of chamomile (n = 155), twenty mitochondrial haplotypes (genetic distances 0.028–0.693) were identified. Three of the accessions (‘Camoflora’, ‘Mat19’ and ‘Manzana’) were monomorphic. The highest genotypic variability was found for the Croatian accession ‘PG029’ with nine mitochondrial haplotypes (mitotypes) and the Argentinian ‘Argenmilla’ with seven mitotypes. However, most of the mitotypes detected in these accessions were infrequent in our sample set, thus disclosing an unusual high amount of substitutions within the mitochondrial genome of these accessions. The mitotypes with the highest frequency in the examined dataset were MT1 (n = 27), MT9 (n = 23) and MT17 (n = 20). All of the frequent mitochondrial lines are distributed not only over several accessions but also over several geographical origins. The origins often build a triplet with on average two to three concurrent lines. The most distantly related accessions were ‘Mat19’ and ‘Camoflora’ (0.539), while ‘PNOS’ and ‘Margaritar’ (0.007) showed the lowest genetic distance.


Introduction
Matricaria chamomilla L. (common name German chamomile, Asteraceae) (Germplasm Resources Information Network (GRIN) 2019; The Plant List 2013) is a mainly outcrossing and originally diploid (2n = 18) species indigenous to Europe and West Asia with natural populations occurring in many temperate regions worldwide (Franke and Schilcher 2007). More than 120 constituents have been identified in the chamomile flowers, out of which the majority is represented by the terpenoids α-bisabolol and its oxides (≤ 78%) and azulenes (1-15%) (Gupta et al. 2010). The high consumption and increasing market demand necessitated directed breeding concepts to improve crop yields, essential oil contents, pick height and disease resistance (Das 2014;Albrecht et al. 2016). Chamomile breeding started approximately 70 years ago and included diploid as well as tetraploid varieties artificially induced by colchicine treatment (Franke and Schilcher 2007). The Czech and Slovak Republic, Poland, Hungary and Germany were the nations in which the first breeding activities were located (Das 2014;Seidler-Lozykowska 2016). However, information about the natural populations serving as origins is scarce. The use of molecular approaches could help to resolve the genetic relationship between different chamomile varieties. Several molecular approaches to maintain breeding processes were conducted in the past (Wagner et al. 2005;Solouki et al. 2008;Pirkhezri et al. 2010;Okon et al. 2013;Ahmadi et al. 2014). SNP (single nucleotide polymorphism) analyses (Otto et al. 2017) are still rare, though, and genetic diversity studies mostly relied on nuclear polymorphisms. The exploitation of the mitochondrial (mt) diversity of M. chamomilla L. is just at the beginning. Uniparentally inherited markers of the chloroplast or mitochondrial genome proved adequate to enable investigations of migration routes of populations or the exploration of geographical backgrounds and common ancestors that date back several years (Tomaru et al. 1998;Gugerli et al. 2001;Arroyo-García et al. 2006). Information about mt genome diversity bears Communicated by: Izabela Pawłowicz another advantage. Crossings of plants with highly variable or distantly related cytoplasm might enhance the possibility to find CMS (cytoplasmic male sterility) in the progenies (Spassova et al. 1993). CMS results from a conflict between nuclear and cytoplasmic genomes (Spassova et al. 1993;Budar et al. 2003). This sterility trait is widely used in breeding concepts of crop plants to assure controlled cross-pollination and prevent selffertilization (Balk and Leaver 2001;Liu et al. 2013Liu et al. , 2014. For chamomile cultivation, where self-fertilization and seedshedding is a serious problem, the establishment of a male sterile mother line would be of enormous advantage. Increased male sterility in the progenies of inter-cultivar crosses (Faehnrich et al. 2013) is already documented for chamomile but a CMS phenotype was not yet exhibited.
High-throughput technologies like GBS (genotyping by sequencing) and next-generation sequencing (NGS) were used recently to appraise genetic diversity of different chamomile accessions by exploiting a multitude of SNP data of the nuclear and mt genome (Otto et al. 2017;Ruzicka and Novak 2020). The mt sequence information of 33 chamomile individuals published by Ruzicka and Novak (2020) now enables the analysis of point mutations in a broader sample set using a high-throughput technology like HRM (high resolution melting). HRM is a very sensitive 'post-PCR' method, which relies on the different melting behaviours of double-stranded DNA fragments due to their varying sequence and GC content, and already has a fix position in SNP detection, e.g. for plant breeding and genotyping (Borna et al. 2017;Sorkheh et al. 2017;Kim and Kim 2019).
This study should proceed examining the mt diversity of different chamomile accessions for (I) reconstructing their breeding history and (II) estimate diversity between/among accessions and therefore test the applicability of the SNP markers in terms of distant inter-cultivar crossings.

Plant material
Thirteen accessions of Matricaria chamomilla L. were cultivated in the greenhouse at the University of Veterinary Medicine, Vienna (48°15′ N, 16°25′ E, 161 m a.s.l.), in 2017. Five, ten or 15 individuals per accession (Table 1) were harvested in June 2017 and used for SNP screening. Main ploidy levels as declared by Das (2014), Otto et al. (2017) and Faehnrich et al. (2019) were assumed for the accessions and were not further verified in the present study.

DNA extraction
Approximately 1 cm 2 of dried leaf material was ground with glass beads in a swing mill (Mixer Mill MM301, Retsch GmbH, Germany), and total genomic DNA was isolated using a modified CTAB (cetyltrimethylammonium bromide) extraction protocol (Schmiderer et al. 2013). Quantity and quality of the DNA were measured on a spectrophotometer (NanoDrop™ 2000, Thermo Fisher Scientific Inc., USA) and via gel electrophoresis on a 1.4% agarose gel stained with peqgreen (VWR, Austria). The DNA was dissolved in TE (Tris-EDTA, pH 8) buffer and stored at − 20°C until further usage.

SNP detection and primer design
NGS sequence information of 33 chamomile samples from eleven accessions was elaborated in a previous study (NCBI, National Center for Biotechnology Information, https://www. ncbi.nlm.nih.gov/, Submission ID: SUB5046906/BioProject ID: PRJNA515664). A LASTZ alignment of the resulting individual mt consensus sequences and Diplostephium hartwegii (KX063855) as reference sample generated with Geneious 9.1.5 (http://www.geneious.com, Kearse et al. 2012) was used for SNP detection and primer design. SNPs were detected using the SNP caller as implemented in Geneious under default parameters.

HRM analysis
HRM data were generated with the Rotor-Gene™ 6000 and the Rotor-Gene Q Series Software 2.1.0 (Qiagen, Germany).
The final reaction volume of 10 μl contained 1× HRM mastermix (HOT FIREPol® EvaGreen® HRM Mix, Solis BioDyne, Estonia), 150 nM of the forward and reverse primer each, and 2 ng genomic DNA. All samples and 'no template controls' (NTC) were analysed in duplicates.
The HRM analysis with pre-amplification was performed with an initial phase of 14 min at 95°C, 45 cycles of 95°C/ annealing temperature depending on the primer pair (Online Resource 1)/72°C for 10 s/ 20 s/ 20 s. After PCR, a hold of 95°C/1 min was implemented for complete denaturing of all nucleotide strands. HRM was done with a ramp of~10°C around the melting point of the synthesized DNA fragment (Online Resource 1) and in increments of 0.1°C and 1 s hold before the temperature increase.

Evaluation of the HRM analysis
The experimental work was performed according to the MIQE (Minimum Information for Publication of Quantitative Real-Time PCR-Experiments) guidelines (Bustin et al. 2009) wherever applicable. qPCR results were checked for average fluorescence level and average C q value of the samples. Products with C q values over 30 and under 17 and those with an end point fluorescence level of less than 80% of the average were excluded from the analysis and repeated in a later run. For the HRM analysis, the normalised and temperature shifted melting curves were used for the evaluation of the samples. Previously sequenced samples served as reference samples for the different curve types in the HRM analysis. These references were used to automatically assign the unknown samples to predefined genotypes by the Rotorgene software and were added to each run to check for interrun comparability. For the automated classification of the samples by the software, a confidence level of 70% was set. However, most of the samples were anyway correlated with a confidence of above 90%. Outliers were omitted and repeated. For each run, a clear clustering of the curve types in the fluorescence versus temperature plot was obligatory.
For the primer set Mr_mt130788 that detected a triallele (A/C/T), artificial heteroduplexes had to be produced to separate the melting curves of alleles A and T. Therefore, a spike DNA with the known haplotype A was added to the mastermix, resulting in identical HRM curves and melt curves with only one peak for samples with the haplotype A and heteroduplexes and two peaks in the melt analysis for haplotype T samples.
The mean standard deviation of T m per run and ΔT m were calculated for each primer combination (Table 2).

Statistical analysis
The number of multilocus genotypes (MLG) standardized by sample numbers, the Shannon-Wiener Index (H) (Shannon 2001), Stoddard and Taylor's index (G) of MLG diversity (Stoddart and Taylor 1988) and Nei's genetic distances (Nei 1978) were used for diversity analysis of the accessions. eMLG (expected MLG, an approximation of the number of genotypes that would be expected at the largest, shared sample size based on rarefaction) allows for more appropriate comparisons of the populations in case of different sample sizes. Simpson's index (λ) (Simpson 1949) corrected by N/(N-1), evenness, E.5 (Grünwald et al. 2003), index of association (I A ) (Brown et al. 1980;Smith et al. 1993) and Nei's expected heterozygosity (H Exp ) (Nei 1978) were computed for both population genetics and locus summary statistics.
MLG and Nei's genetic distance were used for construction of a minimum spanning network. All statistical analyses were calculated using R 3.3.0 (R Core Team 2018) with poppr 2.2.0 (Kamvar et al. 2014;Kamvar et al. 2015) package.

Development of a core SNP marker set
Thirteen accessions of chamomile (155 individual samples) were analysed with 36 SNP markers in an HRM analysis in order to obtain a better insight in the composition of varieties. Thirty-one markers of this core marker set were selected out of the 102 SNPs detected previously in the mt genome sequences of eleven chamomile accessions (Ruzicka and Novak 2020). Five markers (SNPs 90-94) were developed in addition based on a LASTZ alignment with Diplostephium hartwegii. One criterion for the marker selection was that only a few SNPs with rare mutations were used. All markers were checked for discriminatory power, accuracy and stability. Table 1 Source, origins, number of individuals and declared ploidy levels (according to Das 2014, Otto et al. 2017 and  Mutations were verified by including previously sequenced samples in the analyses. This quality control step allowed for (i) validation of the genotyping technique, (ii) verification of the accuracy of the SNPs used for genotyping.
With the analysed dataset, we could therefore find fifteen of the originally detected mitotypes by Ruzicka and Novak (2020); five new mitotypes were detected so that in total twenty mitotypes were exhibited in this work (Online Resource 2). Mitotypes were consecutively numbered, and numbers that were found in the previous study but could not be verified here were excluded.

Primer evaluation
In accordance with the low amount of substitutions (eleven) originally found in the coding regions of the chamomile mt genome (Ruzicka and Novak 2020), the design of only three markers (Mr_mt46728, Mr_mt154969, Mr_mt262199) for substitutions situated in coding regions was successful. In total, we screened eight transitions (6 A/G, 2 C/T) and 30 transversions (16 G/T, 12 A/C, 1 A/T, 1 G/C) with the given core marker set. One of the 36 selected SNP markers was triallelic (Mr_mt130788). With one primer combination (Mr_mt154969) two substitutions were simultaneously analysed.
ΔT m between the two melting curves per locus as a score for the discriminatory power of the primers is given in Table 2. With all primer combinations ΔT m values between 0.32 and 0.82 and standard deviations of 0.01 and0.05, and therefore, a distinct clustering of the melting curves could be achieved. The best resolution (ΔT m 0.82) was calculated for the locus Mr_mt158868, which detects a G/T transversion. Mr_mt026887 (G/T) and Mr_mt187230 (A/C) showed the lowest ΔT m values (0.32).

Descriptive diversity statistics
As expected, the triallelic marker Mr_mt130788 revealed highest values of H exp and evenness and the highest Simpson's index and therefore provided highest information content. In general, Simpson's index ranged from 0.013 (Mr_mt154969) to 0.612 (Mr_mt130788) (mean = 0.268), the expected heterozygosity H exp ranged from 0.013 to 0.616 (mean = 0.269) and evenness ranged from 0.327 to 0.985 (mean = 0.676) ( Table 2). The HRM marker Mr_mt154969 was the least informative marker of this study. For estimating the linkage disequilibrium between the markers, the index of association (I A ) was calculated across all markers. The average I A was 4.79 with a p value of 0.001 and ṝ d of 0.15 (Fig. 1). Six groups of markers comprising 5, 4, 2, 2, 3 and 5 individual markers, respectively, showed a strong linkage disequilibrium within, although these primers were not in close proximity in the mt genome. Thus, although the executed primers were chosen to be widely distributed in the alignment, 15 of the selected markers were correlated and therefore provided redundant genotyping information.

Distribution of the mitotypes
All of the mitotypes were distributed over several countries despite of the ten mitotypes represented by single individuals of one accession only (MT4, MT5, MT11, MT12, MT14, MT20, MT21, MT22, MT23, MT24). The mitotypes with the highest frequency in chamomile were MT1, MT9 and MT17 (Fig. 3,  Fig. 4). Only one mitotype each was detected in the Austrian and North Korean accession (Fig. 4). Croatia and Argentina retrieve the highest variability caused by the two heterogeneous accessions 'PG029' and 'Argenmilla'. Anyhow, those two were the accessions with the most infrequent mitotypes. The small group of Hungarian samples was split into two mitotypes, which were not present in any other country. The other countries often formed groups of three with two to three identical frequently occurring lines on average. Bulgaria held two mitotypes; one of them also occurred in Croatia, the other one was identical to the North Korean accession. The two main mitotypes of Poland and Romania were MT17 and MT18. Both were in smaller proportion also present in the Slovak Republic. Although the Slovak Republic comprised six mt lineages, MT16 was the mitotype with the highest prevalence in this country. This mitotype could also be found in Romania, so that Poland, Romania and the Slovak Republic shared three to four mitotypes. The main mitotypes of Germany were MT1 and MT7 equally existing in Poland. MT1, in lower frequency also present in two further countries, was the mitotype with the highest frequency in the Argentinian accession. The 'Austrian' MT3 was distributed in the Slovak Republic, Croatia and Germany, thus building another triad. In general, it seems that frequent mitotypes of chamomile are distributed in most cases over two to three countries.

Discussion
Comparison to a previous mt chamomile study Twenty mitotypes could be specified with a core marker set of 36 HRM markers in the mt genome of chamomile. In a Table 4 Distance calculations between accessions based on Nei's genetic distance (Nei 1978) Mat16 previous study nineteen mitotypes were identified in the mt genome of 33 individuals by NGS (Ruzicka and Novak 2020). Four of these previously sequenced mitotypes could not be verified via HRM. Five mitotypes (MT20-24) were additionally found in the larger sample set, so that at least twenty-four mitotypes coexist in M. chamomilla. The use of an extended sample set corroborated the mt heterogeneity of the accessions 'PG029' and 'Argenmilla' whereas 'Camoflora', the only monomorphic accession of the first investigation (Ruzicka and Novak 2020), remained monomorphic. Other than previously assumed, identical mitotypes within di-and tetraploid accessions existed, Mitotype 3, e.g. with highest proportion of 'Manzana', was also present in 'Bona', 'PG029' and 'Mat90'. Furthermore, the accession 'Mat19' appeared monomorphic in this study, because no suitable primer could be designed for the underlying mutation while two mitotypes were found in this accession in the previous study (Ruzicka and Novak 2020). In both studies, three main clusters could be found with more or less identical groups of mitotypes within these clusters.

Mitochondrial diversity examined in this study
As we examined substitutions only, the likelihood that an even higher amount of total mt variation in M. chamomilla is present is high. Plant mt DNA normally shows frequent recombination but a poor substitution rate (Gualberto et al. 2014). Therefore, this high number of maternally inherited point mutations in the mt DNA we found in closely related chamomile accessions is astonishing. An equal proportion of point mutations in the mt genome of other Asteraceae below the interspecific level has not yet been documented (Ruzicka and Novak 2020). The hitherto examined genetic similarity of cultivated chamomile based on nuclear polymorphisms was high (Wagner et al. 2005;Okon et al. 2013;Otto et al. 2017). It is not unusual that the nuclear similarity was lower the more wild populations or geographically distinct populations were included in the analysis. Therefore, Wagner et al. (2005) and Okon et al. (2013) detected higher genetic similarities than Solouki et al. (2008), who compared European and Iranian accessions. Nevertheless, Otto et al. (2017) determined higher genetic similarity especially of tetraploid accessions but without clear geographical correlation. The mt variability found in this study is comparable to nuclear variability of chamomile from the same region, namely Central-Europe (Wagner et al. 2005;Okon et al. 2013). In fact, high mt variability in this study was not only detected between but also within the accessions. Consecutive selection and isolation of the breeding material will most likely be responsible for the increasing homogeneity of nuclear germplasm of cultivated chamomile. Although uniparentally inherited genome evolves more slowly than nuclear genome it is obvious that despite of diminished diversity of the nuclear genome, the mt DNA of cultivated chamomile reflects a higher variability.

Possible relations between mitotype distribution and development of chamomile varieties
Ten of the examined accessions are polymorphic with at least two mitotypes. In two accessions even seven ('Argenmilla'), respectively nine ('PG029') mitotypes were identified though most of them with low frequencies.
Although the results might be biased by different sample volumes in the countries (e.g. two accessions in Germany, Poland and Slovak Republic versus one accession in the other countries, only five individuals from Hungary), it is obvious that Croatia and Argentina retrieve the highest variability caused by these two accessions.
Several hypotheses are possible to explain the high number of point mutations found within accessions of chamomile: (I) spontaneous mutations, which would explain the low diversity found between some of the mitotypes; (II) use of several plant lines to produce cultivars or traded landraces; (III) exchange of plant material between breeders to establish new chamomile lines in other countries; (IV) cultivation of imported chamomile plants in historical times; (V) migration of chamomile populations into several countries and establishment of those chamomile populations with identical mother lines, which were independently used for producing new cultivars; (VI) contamination of seed material of cultivated chamomile with seeds of wild growing chamomile.
The fact that most of the accessions are divided into two or three main mitotypes with high frequencies corroborates the hypotheses of several promising lineages that were used for producing the cultivars and landraces (thinkable for e.g. 'Soroksári' and 'Mat16') or the extensive exchange of plant material between breeders (e.g. Romanian and Polish accessions). It is known that the Romanian 'Margaritar' originates from the Polish 'Zloty Lan' and chamomile plants collected in the wild in Poland (Das 2014). This could explain the close relationship between 'Margaritar' and 'PNOS' we found in our study. The low geographical distance between Polish and German breeding centres in the past could also result in identical mitotypes in different cultivars due to identical cytoplasmic lineages of the underlying natural populations. Nevertheless, according to Das (2014), 'Bona' and 'Goral' should be progenies of breeding experiments using the same plant material, but a close genetic relationship of those accessions cannot be confirmed by our data. If the North Korean accession 'Mat19' would have been the result of a chamomile population arisen independently in North Korea, we would have expected it to be more distantly related to the other accessions. Here again, the use of breeding material from other countries (e.g. Bulgaria or Croatia) or even the import and establishment of chamomile cultivation in ancient times is supposable.
The high number of monotypic mitotypes in the accessions 'PG029' and 'Argenmilla' could better be explained by contamination of the seed material or a higher tendency of spontaneous point mutations in these accessions than by extensive plant exchange alone. Nevertheless, 'Argenmilla' and 'PG029' also were more distinct to many of the other chamomile accessions in the GBS study of Otto et al. (2017).

Applicability for detecting a CMS phenotype
An increased mt mutation rate was detected in an infertile line of onion, where it co-occurs with CMS (Kim et al. 2016;Kim and Kim 2019). Crossings of highly variable chamomile lines or of distantly related accessions might enhance the possibility to find CMS in the progenies. The high amount of substitutions within the accessions 'Argenmilla' (2×) and 'PG029' (2×) are promising to induce negative interactions of the mt and nuclear genome often resulting in the occurrence of a CMS phenotype. Although distance calculations identify 'Mat19' (2×) and 'Camoflora' (2×) or 'Soroksári' (2×) as the most distantly related and therefore the most adequate candidates for inter-cultivar crossings, the high diversity of the examined accessions will necessitate the genetic analysis of individuals, especially when highly polymorphic accessions as 'Argenmilla' and 'PG029' should be included.

Conclusions
Our results confirm the high extent of variations found in a previous study in the mt genome of chamomile and indicate that distant inter-cultivar crossings with the aim to exhibit a CMS phenotype in the progenies could be successful. However, it is obvious that an accession-based analysis might not be sufficient but that an individual-based screening will have to be executed prior to crossing trials. HRM analysis proved to be a valuable tool for the screening of point mutations in a high amount of samples and could therefore serve as adequate facility for this purpose.
Several hypotheses are possible to explain the occurrence of multiple mitotypes per accession in chamomile. The additional analysis of natural populations would be necessary to prove or exclude some of the hypotheses presented here. However, based on this dataset, a multi-causal source seems to be most likely.