Genetic diversity of provitamin-A cassava (Manihot esculenta Crantz) in Sierra Leone

Understanding the genetic diversity among accessions and germplasm is an important requirement for crop development as it allows for the selection of diverse parental combinations for enhancing genetic gain in varietal selection, advancement and release. The study aimed to characterize 183 provitamin A cassava (Manihot esculenta Crantz) accessions and five Sierra Leonean varieties using morphological traits, total carotenoid content and SNP markers to develop a collection for conservation and further use in the cassava breeding program. Both morphological parameters and 5634 SNP markers were used to assess the diversity among the provitamin-A cassava accessions and varieties. Significant differences were observed among the accessions for most of the traits measured. The first five PCs together accounted for 70.44% of the total phenotypic variation based on yield and yield components among the 183 provitamin-A cassava accessions and five Sierra Leonean varieties. The present study showed that provitamin-A cassava accessions in Sierra Leone have moderate to high diversity based on morphological and molecular assessment studies. The similarity index among the 187 and 185 cassava accessions grouped them into 6 and 9 distinct clusters based on morphological and molecular analyses, respectively. A significant positive, but low correlation (r = 0.104; p < 0.034), was observed between the two dendrograms. The results obtained will serve as a guide and basis of germplasm management and improvement for total carotenoid content, yield and African cassava mosaic disease resistance in Sierra Leone.


Introduction
Genetic diversity provides species with the ability to adapt to changing environments. Several studies have reported the use of morphological descriptors to determine the genetic diversity among cassava genotypes (Rimoldi et al. 2010; Asare et al. 2011; Thompson 2013). Recent advances in molecular biology techniques have led to the development of important tools for studying genetic diversity in several plant species. The accuracy of accession characterization may therefore be enhanced by using molecular markers together with morphological traits.
Previous studies of plant genetic diversity used DNA molecular markers for beta-carotene improvement in cassava (Ferreira et al. 2008; Rimoldi et al. 2010). The markers used included amplified fragment length polymorphisms (Benesi et al. 2010), simple sequence repeats (Alves et al. 2011; Parkes 2009; Oliveria et al. 2012; Costa et al. 2013) and single nucleotide polymorphisms (Kizito et al. 2005; Tangphatsornruang et al. 2008; Ferguson et al. 2011; Thompson 2013; Rabbi et al. 2015). With recent advances in high-throughput genotyping technologies, single nucleotide polymorphism (SNP) markers are increasingly becoming the markers of choice for plant genetic studies and breeding.
SNPs are the most common type of genetic variation among species and involve a change in just a single nucleotide. Expressed Sequence Tags (ESTs) have been exploited to detect SNPs in maize (Zea mays L.) (Ching et al. 2002) and soybean (Glycine max L. Merr.) (Zhu et al. 2003). Lopez et al. (2005) and Rabbi et al. (2014, 2015) have also reported SNP detection from ESTs in cassava. Cassava, being an outbreeding and highly heterogeneous crop, possesses an extreme level of phenotypic plasticity and therefore lacks a unified classification system for cultivars (Kawano 1978). Consequently, characterization of agronomic traits becomes a challenge. To conduct a successful genetic diversity study on cassava germplasm in Sierra Leone, there is a need to unravel the genetic potential existing within Sierra Leone's cassava breeding program, which consists of fourteen released varieties and provitamin-A cassava accessions introduced from the International Institute of Tropical Agriculture, Nigeria. It is therefore necessary to assess and understand the genetic diversity among the provitamin-A cassava accessions and to identify gaps to be filled within the breeding program in Sierra Leone.
The objectives of the study, therefore, were to characterize, quantify and exploit the diversity of 183 provitamin-A cassava accessions and five Sierra Leonean varieties using morphological traits, SNP markers and total carotene content and to develop a collection for conservation and future use in the breeding programmes.

Germplasm sources and experimental design
The plant materials used in the study consisted of 183 provitamin-A cassava accessions known for their varying levels of provitamin-A properties, obtained from the International Institute of Tropical Agriculture (IITA, Ibadan, Nigeria) and established at the Taiama experimental site in Sierra Leone in 2014 (Table 1), and five Sierra Leonean cassava varieties. The trial was established and evaluated during the cropping season of 2015-2016 at the Njala Agricultural Research Institute (NARC), Foya crop site, Njala, representing the transitional rain forest agro-climatic zone (Van Vuure et al. 1972; Odell et al. 1974). The trial was laid out in an alpha lattice design with two replications, and each replication had four blocks with 47 entries per block. The blocks were separated by 1 m and 2 m alleys between and within blocks to reduce intra- and inter-block plant competition, respectively. Each entry was grown on a 10 m row ridge at a spacing of 1 m × 1 m between and within ridges, respectively. Cassava cuttings of 20-25 cm length were obtained from healthy stems and planted horizontally.

Morphological traits
Agro-morphological data was collected at 1, 3, 6 and 9 months after planting (MAP) on the parameters listed below using the IITA cassava descriptor (Fukuda et al. 2010) (Table 2).
Harvesting was done at 12 MAP (August-September). The following parameters were recorded at harvest: number of marketable roots (count), number of non-marketable roots (count), total number of storage roots (count), root weight per tuber (kg), inner skin color, outer skin color, ease of peel, root shape, marketable weight (kg) and non-marketable weight (kg). Dry matter content, expressed as a percentage, was determined by selecting three representative storage roots. Slices of the fresh roots were randomly selected and weighed to obtain a 100 g fresh mass sample per genotype, before being dried for 48 h in an oven at 80°C. The dried samples were then reweighed to obtain the dry mass. Disease occurrence and intensity were mostly measured in the 1st, 3rd, 6th and 9th month after planting.
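The dry matter determination described above reduces to a simple ratio. The following is a minimal sketch in Python, assuming that dry matter content is the oven-dried mass expressed as a percentage of the fresh sample mass; the function name and the example dry mass are hypothetical and are not values from the study.

```python
def dry_matter_content(fresh_mass_g: float, dry_mass_g: float) -> float:
    """Return dry matter content (%) from fresh and oven-dried sample masses."""
    if fresh_mass_g <= 0:
        raise ValueError("fresh mass must be positive")
    if dry_mass_g < 0 or dry_mass_g > fresh_mass_g:
        raise ValueError("dry mass must lie between 0 and the fresh mass")
    return 100.0 * dry_mass_g / fresh_mass_g


# Hypothetical example: a 100 g fresh sample that weighs 35 g after drying.
example_dmc = dry_matter_content(100.0, 35.0)
print(f"Dry matter content: {example_dmc:.1f}%")
```

With a fixed 100 g fresh sample, as used here, the dry mass in grams is numerically equal to the dry matter percentage.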

Molecular characterization
The Dellaporta method of DNA extraction (Dellaporta et al. 1983) was carried out at the International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria. For genotyping-by-sequencing (GBS) library preparation, the ApeKI restriction enzyme (recognition site: G|CWGC), which produces less variable distributions of read depth and therefore a larger number of scorable SNPs in cassava, was used. Two 96-plex GBS libraries were constructed as described by Elshire et al. (2011) and sequenced at the Institute of Genomic Diversity at Cornell University using the Illumina HiSeq 2500. Raw read sequences were processed through cassava GBS production pipelines developed using TASSEL 5.0V2. The GBS-derived SNPs were further filtered using the TASSEL software (Bradbury et al. 2007) to retain only polymorphic SNPs. The 5634 SNPs generated, initially filtered for minor allele frequency (MAF < 0.05), were processed under the Next Generation Cassava project. The resulting SNP dataset was used for the diversity analysis among the 188 cassava accessions already phenotyped and analyzed. Results from the phenotypic and genotypic analyses were compared to check the correspondence between the two.
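The filtering step can be illustrated with a short sketch. It is not the TASSEL pipeline used in the study. It assumes that genotypes are coded as the number of copies of the alternate allele (0, 1 or 2, with None for missing calls) and that the MAF criterion removes SNPs whose minor allele frequency falls below 0.05; the function names and the toy data are hypothetical.

```python
from typing import Dict, List, Optional

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Minor allele frequency of one biallelic SNP, ignoring missing calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(genotypes: Dict[str, List[Genotype]],
                min_maf: float = 0.05) -> Dict[str, List[Genotype]]:
    """Keep SNPs whose minor allele frequency is at least min_maf.

    Monomorphic SNPs have a minor allele frequency of 0 and are dropped too.
    """
    return {snp: calls for snp, calls in genotypes.items()
            if minor_allele_frequency(calls) >= min_maf}


# Toy data (hypothetical SNP names and calls).
toy_genotypes = {
    "snp_a": [0, 1, 2, 1, None, 0],   # alternate allele frequency 4/10
    "snp_b": [0, 0, 0, 0, 0, 0],      # monomorphic
    "snp_c": [0] * 19 + [1],          # alternate allele frequency 1/40
}
kept_snps = filter_snps(toy_genotypes)
print(sorted(kept_snps))
```

Only the first toy SNP passes: the second is monomorphic and the third has a minor allele frequency of 0.025.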

Data analysis
Agro-morphological data sets from this study were analyzed with several statistical packages. Descriptive statistics were computed using XLSTAT (2010), MINITAB 15 and STATA 13. Principal Component Analysis (PCA) was performed using the PRINCOMP procedure in SAS 9.3 to examine the structure of the correlations between the variables. Cluster analyses, based on the agro-morphological and SNP marker data sets, were performed to group observations using Ward's minimum variance method in SAS 9.4. A dendrogram was plotted from the computed similarity values for the agro-morphological traits and for the SNP markers to show the relationships among the accessions. The provitamin-A studied accessions and varieties were grouped based on their varying levels of total carotenoid content. Basic diversity indices for the 183 provitamin-A studied accessions and varieties were calculated using PowerMarker (Liu and Muse 2005) and GenAlEx version 6.41 (Peakall and Smouse 2006). PowerMarker was used to generate the following statistics: number of alleles per locus, major allele frequency, observed heterozygosity (Ho), expected heterozygosity (He) and polymorphic information content (PIC) (Botstein et al. 1980). PIC values were calculated with the equation $\mathrm{PIC} = 1 - \sum_i P_i^2$, where $\sum_i P_i^2$ is the sum of each squared ith haplotype frequency.
A Mantel matrix test (Mantel 1967) was carried out on the distance matrices to compare the extent of agreement between the dendrograms derived from morphological and molecular data. The pairwise genetic distance (identity-by-state, IBS) matrix was calculated among all individuals using PLINK (Purcell et al. 2007). A Ward's minimum variance hierarchical cluster dendrogram was built from the IBS matrix using the Analyses of Phylogenetics and Evolution (ape) package in R.
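The per-locus statistics listed above can be sketched as follows. This is an illustration under stated assumptions, not the PowerMarker implementation: genotypes are taken to be diploid pairs of allele labels, expected heterozygosity is computed as one minus the sum of squared allele frequencies, and PIC is shown both in that simplified form and in the full form of Botstein et al. (1980), which subtracts an additional term. All function names and the toy locus are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple

DiploidGenotype = Tuple[str, str]  # assumed coding: a pair of allele labels


def allele_frequencies(genotypes: List[DiploidGenotype]) -> Dict[str, float]:
    """Relative frequency of each allele across all diploid genotypes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def locus_summary(genotypes: List[DiploidGenotype]) -> Dict[str, float]:
    """Allele number, major allele frequency, Ho, He and PIC for one locus."""
    freqs = list(allele_frequencies(genotypes).values())
    sum_squares = sum(p * p for p in freqs)
    # Extra term of the Botstein et al. (1980) PIC: sum over allele pairs i < j.
    cross_term = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return {
        "n_alleles": float(len(freqs)),
        "major_allele_frequency": max(freqs),
        "Ho": heterozygotes / len(genotypes),
        "He": 1.0 - sum_squares,
        "PIC_simplified": 1.0 - sum_squares,
        "PIC_botstein": 1.0 - sum_squares - cross_term,
    }


# Toy locus (hypothetical): four individuals scored at one biallelic SNP.
toy_locus = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "A")]
example_summary = locus_summary(toy_locus)
for name, value in example_summary.items():
    print(f"{name}: {value:.4f}")
```

Note that the simplified PIC coincides with expected heterozygosity, whereas the Botstein form is always somewhat lower; for a biallelic marker it cannot exceed 0.375.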
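The logic of the Mantel test can be sketched in a few lines. The version below is a generic permutation form, not the software used in the study: it assumes two symmetric distance matrices over the same individuals, uses the Pearson correlation between their upper triangles as the statistic, and estimates a one-sided p-value by jointly permuting the rows and columns of one matrix. The function names, the number of permutations and the toy matrices are hypothetical.

```python
import random
from math import sqrt
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def upper_triangle(matrix: Matrix, order: Sequence[int]) -> List[float]:
    """Upper-triangle entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(a: Matrix, b: Matrix, permutations: int = 999,
                seed: int = 0) -> Tuple[float, float]:
    """Return (r, one-sided p-value) for two symmetric distance matrices."""
    identity = list(range(len(a)))
    x = upper_triangle(a, identity)
    r_observed = pearson(x, upper_triangle(b, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute individuals, i.e. rows and columns together
        if pearson(x, upper_triangle(b, order)) >= r_observed:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return r_observed, (hits + 1) / (permutations + 1)


# Toy matrices (hypothetical): the second is simply twice the first.
morph_dist = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
snp_dist = [[2 * value for value in row] for row in morph_dist]
r_value, p_value = mantel_test(morph_dist, snp_dist)
print(f"r = {r_value:.3f}, p = {p_value:.3f}")
```

Permuting whole individuals rather than individual matrix cells preserves the dependence among distances that share an individual, which is why an ordinary correlation test is not used for distance matrices.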

Results and discussion
Summary statistics of morpho-agronomic traits of 183 provitamin-A studied accessions and varieties
Table 3 shows summary statistics of some morpho-agronomic traits of the 183 provitamin-A studied accessions and varieties. Sprouting was only recorded in the first month after planting (MAP) and ranged from 65 to 100% among the 183 provitamin-A studied accessions and varieties, with an average of 9.56 cuttings sprouted in the first month. Severity scores for African Cassava Mosaic Disease, Cassava Bacterial Blight and Cassava Green Mite ranged from 0 to 9 in the studied collection, consisting of the 183 provitamin-A cassava accessions and the five varieties. Percent incidence for African Cassava Mosaic Disease, Cassava Bacterial Blight and Cassava Green Mite also ranged from 0 to 9. Most of the morphological characters, both quantitative and qualitative, were recorded in the 3rd, 6th, 9th and 12th MAP. Color of apical lobe ranged from 3 to 9 with a mean of 6.8 ± 1.61 at 3 MAP, whereas the same trait ranged from 0 to 9 with a mean of 6.71 ± 1.74 at 9 MAP. Plant height ranged from 65.5 to 284.5 cm at 6 MAP, with a mean of 155.69 ± 26.12 cm. Leaf area ranged from 10.24 to 73.93 cm² at 6 MAP, whereas leaf retention ranged from 1.75 to 4.5 at the same time. All yield-related traits were recorded at 12 MAP. Yield per hectare ranged from 0.2 to 42.5 t/ha, while dry matter content ranged from 4.0 to 44.5% (Table 3). These parameters, which were good indicators of growth, showed considerable variation for the morpho-
[...] observed heterozygosity (0.38) was moderately higher than the expected heterozygosity (0.19). This substantiates the difference in the relatedness of most of the provitamin-A studied accessions, which were developed from half-sib families with known female parents pollinated by different pollen sources.
However, the major allele frequency (MAF) of all the markers used was generally below 0.95, indicating that they were all polymorphic. PIC values ranged from 0.11 in TR 1233 to 0.18 in TR 1199 and TR 1525, with a mean PIC of 0.14. The higher the PIC value, the more informative the marker. Since morphological traits are influenced by the environment, molecular markers, which are not influenced or controlled by the environment, are preferable in genetic diversity studies (Kaemmer et al. 1992; Gepts 1993; Njoku 2012; Thompson 2013). The study carried out by Kawuki et al. (2009) was the first published report in which SNPs were used for genetic diversity studies in cassava. They characterized and identified some SNP markers and assessed their utility in cassava genetic diversity analysis. The present study seems to be the first reported case in Sierra Leone in which SNP markers were used in a diversity study of provitamin-A cassava accessions. Of the 5634 SNP markers used, 95% were polymorphic. The informativeness of a genetic marker is measured by the polymorphic information content (PIC). The mean PIC value observed in this study (0.14) is relatively lower than previously reported. Indeed, Kawuki et al. (2009) [...]
The bold column in tables signifies the traits that contributed higher negative or positive loadings to the percent variance explained. mrot, marketable roots; unmrot, non-marketable roots; tsr, total number of storage roots; mwet, marketable weight; nmwet, non-marketable weight; twet, total weight; yld, yield; wsrot, storage root weight; dmc, dry matter content; rz, root size; rs, root shape; ocol, outer color; epeel, ease of peel
[...] A studied accessions and five varieties. The tool has a practical application in the selection of parent lines for breeding purposes and varietal development.
The cumulative variance of 70.44% explained by the first five axes with eigenvalues > 1.0 indicates that the identified traits within these axes exerted great influence on the phenotype of these accessions and could effectively be used for selection among them. This agrees with the findings of Afuape and Nwachukwu (2005) and Afuape et al. (2010), who reported a cumulative variance of 70.09% for the first three axes in the dry evaluation of nine sweetpotato genotypes and identified weight of total roots, weight of biomass and dry matter as the important traits that distinguished the elite materials studied.
Cluster groupings of the studied accessions and varieties based on morpho-agronomic traits using Ward's minimum variance and SNP markers
Agro-morphological traits diversity analysis: The dendrogram constructed from the agro-morphological trait data divided the provitamin-A studied accessions and five varieties into six major clusters (A to F) at a genetic distance of 0.30, and each had sub-clusters apart from Cluster A (Table 6). Cluster A consisted of only two cassava accessions, with no sub-clusters. Cluster B had two sub-clusters. Cluster D recorded the highest number of accessions (57 in total), followed by Clusters E and F, which grouped 53 and 34 accessions, respectively. In general, most of the accessions in this study were grouped according to their morpho-agronomic traits and geographical location. For example, the accessions in major Cluster E scored similar values for most of the morpho-agronomic traits studied. Three out of the five varieties developed in Sierra Leone were grouped into Cluster F, while Clusters B and D contained only provitamin-A studied accessions introduced to Sierra Leone in the form of seeds from IITA, Nigeria. These had a discrete pattern of clustering and were grouped more or less according to their state, geographical distribution or country.
SNP markers diversity analysis: The 181 provitamin-A cassava accessions and 4 Sierra Leonean varieties were grouped into nine clusters based on the 5634 SNP markers (Fig. 1). Clusters A, B, C, D and E had 21, 7, 11, 8 and 16 accessions, respectively, while Clusters F, G, H and I consisted of 10, 47, 50 and 17 accessions, respectively (Table 7). Clusters A, B, C, E, G, H and I had 3, 1, 2, 4, 9, 10 and 1 accessions with varying levels of total carotenoid content. Cluster I consisted of only one provitamin-A studied accession.
Correlation analysis between clusters from agro-morphological traits and SNP markers: A comparison of the two dendrograms based on the Mantel matrix test showed a significant positive, but weak, correlation between the morphological and molecular data sets (r = 0.104, p < 0.034). In a similar study, Raghu et al. (2007) mentioned that 24 out of 28 morphological traits contributed to the total variation observed. Here, our clustering study showed six and nine distinct clusters based on morphological and molecular analyses, respectively, indicating a large variability in the collection. In a similar study, Carvalho and Schaal (2001) identified 22 distinct clusters using 94 cassava accessions in Brazil, whereas Raghu et al. (2007) identified six distinct groups using 58 accessions. Our study is, therefore, in agreement with these studies. Although the morphological and SNP data grouped the accessions into six and nine distinct clusters, respectively, some similarities were observed. Accessions TR 0747 and TR 0365, which were selected as provitamin-A studied accessions, were found to be closely similar using both morphological and genetic markers. This could explain why the morphological and molecular analyses placed some of the same accessions together in the two sets of clusters. There are no reports so far on the genetic diversity of provitamin-A cassava accessions using morphological traits, molecular markers and total carotenoid content. This remains the first study to combine morphological characterization, genetic diversity characterization and total carotenoid content levels of the provitamin-A cassava accessions in Sierra Leone.
The study reveals a moderate degree of diversity among the provitamin-A cassava accessions and varieties which can be further used for crop improvement. This may provide an opportunity to enhance and boost the breeding strategy.

Conclusion
The present study showed that provitamin-A cassava accessions in Sierra Leone have moderate to high diversity based on total carotenoid content, morphological and molecular assessment (Table 8).
The inter-relationships of morpho-agronomic factors in determining cassava fresh root yield in provitamin-A cassava accessions require additional research to fully understand how to improve total carotenoid content and yield in provitamin-A cassava germplasm. Even though agro-morphological traits are generally employed to estimate genetic diversity in crop plants, such a method has its own limitations, as the traits are heavily influenced by environmental conditions, with climate being the main factor influencing the growth and development of the species (Cadena Iniguez and Arevalo Galarza 2011). This also confirms the importance of molecular techniques and markers for carrying out successful research and improvement studies on provitamin-A cassava germplasm. The present study has revealed that, during provitamin-A cassava variety development, high dry matter content (a quality trait) is a priority trait that should be considered at both the primary and advanced (yield evaluation) stages, together with good root qualities, to facilitate adoption after varietal release.
Fig. 1 Dendrogram of 182 provitamin-A studied accessions and Sierra Leonean varieties based on SNP markers