Genetic worth of multiple sets of cowpea breeding lines destined for advanced yield testing

The objective of this study was to determine the genetic potential of eight sets of cowpea lines for grain yield (GY), hundred seed weight (HSDWT) and days to 50% flowering (DT50FL). A total of 614 F6 genotypes constituting the sets, grouped by maturity, were evaluated across two locations in Northern Nigeria in an alpha lattice design with two replications each. Data were recorded on GY, HSDWT and DT50FL. Variance components, genotypic coefficient of variation (GCV), and genetic advance (GA) were used to decode the magnitude of genetic variance within and among sets. Genetic usefulness (Up), which depends on the mean and variance and has historically been used to score the genetic merit of bi-parental populations, was applied to groups of breeding lines with mixed parentage. Principal component analysis (PCA) was used to depict the contribution of traits to the observed variation. GY and DT50FL explained the variance within and between sets, respectively. Genotypes were significantly different, although genotype-by-location and set-by-location interaction effects were also prominent. Genetic variance (δ²G) and GCV were high for GY in Prelim2 (δ²G = 45,897; GCV = 19.58%), HSDWT in Prelim11 (δ²G = 7.137; GCV = 17.07%) and DT50FL in Prelim5 (δ²G = 4.54; GCV = 4.4%). Heritability varied among sets for GY (H = 0.21 to 0.57), HSDWT (H = 0.76 to 0.93) and DT50FL (H = 0.20 to 0.81). GA and percentage GA (GAPM) were high for GY in Prelim2 (GAPM = 24.59%; GA = 269.05 kg/ha), HSDWT in Prelim11 (GAPM = 28.54%; GA = 4.47 g), and DT50FL in Prelim10 (GAPM = 6.49%; GA = 3.01 days). These sets also registered high values of genetic usefulness, suggesting potential application in non-full-sib populations. These approaches can be used during preliminary performance tests to reinforce decisions when extracting promising lines and choosing among defined groups of lines.
Supplementary Information: The online version contains supplementary material available at 10.1007/s10681-020-02763-y


Introduction
Cowpea [Vigna unguiculata (L.) Walp.] is a key legume in the semi-arid regions of Sub-Saharan Africa (SSA) because of its significant contribution to food and nutritional security in the region. The crop provides a cheap source of quality protein and minerals to both rural and urban communities in Africa (Dube and Fanadzo 2013). The grains and leaves are both good sources of protein, ranging from 21 to 33% and from 27 to 43%, respectively (Ahenkora et al. 1998; Boukar et al. 2011; Ddamulira et al. 2015). The predominance of cowpea in the dry zones of Africa is attributable to its inherent drought tolerance and its capacity to grow in marginal soils where other crops fail (Ehlers and Hall 1997; Ewansiha and Singh 2006; Agbicodo et al. 2009; Hall et al. 2010; Fatokun et al. 2012). In the dry savannas of West Africa, cowpea is regarded as a dual-purpose crop providing both human food and animal fodder (Singh et al. 2003; Kamara et al. 2012). Cowpea is also attractive for its ability to fix nitrogen in the soil, which makes it a key component of traditional intercropping systems (Kyei-Boahen et al. 2017). A recent report also revealed cowpea's medicinal properties, particularly anti-cancer, anti-hyperlipidemic, anti-inflammatory and anti-hypertensive properties (Jayathilake et al. 2018). These unique properties make cowpea a focus crop with the potential to address both the climate and malnutrition challenges in SSA.
Cowpea is largely produced and consumed in West and Central Africa, with Nigeria leading production at 2.14 million metric tonnes annually (FAOSTAT 2017; Boukar et al. 2018). However, farmers in West Africa have not been able to exploit the crop's yield potential: the average grain yield is about 492 kg/ha, compared to a possible yield of between 2,000 and 3,000 kg/ha demonstrated on experimental stations (Carsky et al. 2001; Agbicodo et al. 2009; Ahmad et al. 2010; Boukar et al. 2013, 2018). The production and consumption of cowpea are challenged by numerous biotic and abiotic factors, including insects, diseases, parasitic weeds, and extreme and intermittent water and heat stresses (Agbicodo et al. 2009; Boukar et al. 2013, 2018; Togola et al. 2017).
Concerted efforts are being made to boost cowpea productivity, including the deployment of modern quantitative genetics and genomic tools (Ehlers et al. 2012; Boukar et al. 2016, 2018). These are expected to accelerate the rate of genetic gain, allowing farmers to benefit from the full genetic potential of the crop. Additionally, the need to meet consumers' demand has revolutionized breeding, which now requires breeding for clearly defined product targets and profiles (Ragot et al. 2018). Grain yield, fodder potential and maturity duration are, among other traits, key components of each product target. Consequently, breeders may have to create and manage in parallel multiple populations of genetic materials in the breeding program to suit specific product targets. Breeding lines emerging from several crosses may be partitioned by maturity group or other traits. Where multiple breeding sets are created, it is important to understand the genetic potential of each set of materials or populations, in terms of genetic variability and expected genetic advance for key product traits like grain yield, to warrant continued investment in advanced testing across the target environments (Allier et al. 2019). The approach to define the usefulness or the genetic worth of a set of genetic materials or a cross has been described (Bernado 2010; Allier et al. 2019), and the concept has been largely applied in maize breeding to identify the best populations for extraction of superior inbred lines (Tabanao and Bernardo 2005). In this approach, the genetic usefulness (U) of a population for a given quantitative trait is determined by its mean ($\mu$) and expected genetic gain ($iH\sigma_p$) as follows: $U = \mu + iH\sigma_p$, where $i$ is the selection intensity, which depends on the selection pressure, $\sigma_p$ is the phenotypic standard deviation, and $H$ is the broad sense heritability (Tabanao and Bernardo 2005; Bernado 2010).
For instance, the mean and genetic variance components of grain yield and other traits were deployed to dissect the usefulness of nine (6 synthetic and 3 F2) maize populations (Fountain and Hallauer 1996). In cowpea and soybean, Meenatchi et al. (2019) and Johnson (1955) exploited the genetic variability parameters phenotypic coefficient of variation (PCV), genotypic coefficient of variation (GCV), broad sense heritability (H) and genetic advance as a percentage of the mean (GAPM) for grain yield and component traits to understand the extent of genetic variability in F2 populations, although the usefulness criterion was not used. Two early generation populations of cowpea were examined based on genetic variance, heritability and genetic advance expressed as a percentage of the mean to gauge the degree of genetic variability for grain yield and fodder traits (Kumar et al. 2017; Dinakar et al. 2018). However, when dealing with multiple populations, combining the means and the genetic advance eases decisions in choosing the best sets of materials to advance in the breeding program (Schnell and Utz 1975; Tabanao and Bernardo 2005; Bernado 2010). These genetic parameters are key in predicting the genetic worth of different sets of breeding populations, and therefore in reinforcing decisions to focus resources for advanced testing on lines from populations with high genetic value. The objective of the present study was to decode the genetic potential of eight sets of cowpea breeding materials evaluated in preliminary yield trials, to ensure effective extraction of the best lines for further testing in advanced yield trials and/or for recycling as parents in the hybridization nursery. The study exemplifies an effective use of quantitative genetic concepts to make selection decisions in a breeding program.

Site description
Field experiments were conducted during the 2019 cropping season at two locations: the IITA experimental farm in Minjibir, Kano State, Nigeria, and the National Animal Production Research Institute (NAPRI), Shika, Kaduna State, Nigeria (Table 1). Minjibir (12°08.997′ N, 8°39.733′ E) is in the Sudan savanna agroecology. The area has one wet season, which commences in May/June and ends in October, with a mean annual rainfall of about 674 mm and an annual temperature range of 26-32 °C. Shika (11°15′ N, 7°32′ E) is in the Northern Guinea Savanna agroecology, in the sub-humid zone of Nigeria. The zone has a unimodal wet season, which begins in April/May and finishes by mid-October, with an average annual rainfall of 1050 mm. Maximum temperature in Shika during the cropping season varied between 27 and 35 °C. Fertilizer was applied in both fields at a rate of 100 kg of NPK (15-15-15) per ha.

Plant genetic materials
Sets of lines belonging to eight cowpea populations intended for preliminary yield tests (PYT), derived from multiple crosses in the breeding program and targeting different product profiles, were used in this study. The crossing structure, pedigrees and agronomic features of parental lines are presented in Supplemental File 1. The multiple sets of test lines were created based on maturity duration, to suit different agro-ecologies in the cowpea growing corridors of Northern Nigeria. Consequently, the sets were categorized as extra early and early maturity, targeting short duration production in the Sahelian and Sudan Savanna zones of West Africa, and medium and late maturity, meant for the medium and late duration product profiles suitable for the Guinea Savanna zone of West Africa. These maturity groups, together with the Striga resistance status of the lines, gave rise to the eight sets used in the present study. Similarly, the sets were created by making several bi-parental crosses using specific elite parents per maturity group; that is, two sets for short duration group: Prelim7 and 10, two sets of medium duration group: Prelim2 and 5, and three sets of late duration group: Prelim2, 3 and 8. The crosses generated F1s that were self-pollinated, and between 200 and 300 F2-derived lines per set were advanced by single seed descent (SSD) until the F5 generation. At this stage, lines were planted in a Striga-infested observation plot and susceptible lines within each set were dropped. The resultant sets of F6 genotypes belonging to the different maturity groups were then used in the present study (Supplemental File 1).
Table 1 (residue; only the last row and the footnotes survive): Prelim11 | 90 | 9 | 9 | 10 | Alpha lattice | 2 | Minjibir: Aug 2nd; Shika: Aug 20th. Footnotes: a number of replications; b mean annual temperature range measured in degrees Celsius (°C); c mean annual rainfall measured in millimeters (mm); Minjibir and Shika are the names of the locations or sites.
Also included in the study was an extra early duration set of F6 lines, referred to as Prelim11, that came from the inter-mating of eight parents. The sets had variable population sizes ranging from 60 to 90 and totaling 614 genotypes (Table 1). Additionally, the crosses producing the eight sets of genetic materials involved parental lines capturing key traits of focus in the breeding program: high grain yield potential, large seed size, varying maturity (extra-early, early, medium and late), Striga resistance, bacterial blight resistance and aphid resistance. The populations were developed over time by the cowpea breeding program at the International Institute of Tropical Agriculture (IITA), Kano Station, Nigeria.

Experimental layout
At both the Minjibir and Shika experimental sites, the eight populations were laid out as separate experiments in one large experimental field per location. Materials were planted on ridges spaced 0.75 m apart, with 0.2 m between hills within a row. All experiments consisted of four rows per plot, each 4 m long, arranged in an alpha lattice design with two replications per experiment. The number of incomplete blocks within a replication varied with the number of lines within each of the eight populations (Table 1). The experiments at both locations were planted on varying dates between June and August 2019, depending on the suitable cropping period of the location (Table 1).

Data collection
Plant stand was determined two weeks after seedling emergence and at harvest. Days to 50% flowering (DT50FL) was recorded when 50% of the plants in the middle two rows of a plot had flowered, and the number of days was computed with reference to the planting date. At maturity, the middle two rows in a plot were harvested, threshed and weighed to obtain grain yield (GY) in grams per plot. The grain yield per plot was then converted to kilograms per hectare (kg/ha), considering the spacing and the plot length. Seed samples were taken from each plot and used to generate the hundred seed weight (HSDWT) data, measured in grams.
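
The plot-to-hectare conversion can be illustrated with a short sketch. It assumes that the harvested area is the two middle rows at the 0.75 m ridge spacing and 4 m row length given in the experimental layout (6 m² per plot); the function and parameter names are hypothetical and are not taken from the authors' scripts in Supplemental File 2.

```python
def grain_yield_kg_per_ha(grams_per_plot, rows_harvested=2,
                          row_spacing_m=0.75, row_length_m=4.0):
    """Convert plot grain weight (g) to kg/ha.

    Assumption: harvested area = rows x spacing x length; the defaults
    follow the layout described in the text (2 x 0.75 m x 4 m = 6 m^2).
    """
    harvested_area_m2 = rows_harvested * row_spacing_m * row_length_m
    kg_per_plot = grams_per_plot / 1000.0
    # 1 ha = 10,000 m^2
    return kg_per_plot * 10000.0 / harvested_area_m2


# Hypothetical plot weight of 600 g harvested from 6 m^2
example_yield = grain_yield_kg_per_ha(600.0)
```

Under these assumptions, a plot weight of 600 g corresponds to about 1000 kg/ha.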

Traits distribution
The R statistical software, version 3.5.2 (R Core Team 2018), was used to visualize and summarize the distribution of traits within and between populations using box plots and histograms. The means from the two locations were used to generate the box plots for the sets, while the histograms were generated using individual plot data from the two locations. The scripts used are provided in Supplemental File 2.

Mean squares
Analyses of variance (ANOVA) were performed in two steps: first with the merged data of all sets across the two locations to assess differences between sets, and second for each population independently to assess variances within the sets. The following models were implemented in R using the agricolae and lme4 packages (Bates et al. 2015; Mendiburu 2020) to obtain mean squares (MS), coefficients of variation (CV) and standard errors of means for the traits:

(a) Between-set model

$$P_{ijk} = \mu + \mathrm{set}_i + l_j + (\mathrm{set} \times l)_{ij} + \mathrm{set}(g)_{ik} + (\mathrm{set}(g) \times l)_{ijk} + \text{pooled error}$$

where $P_{ijk}$ is the observed value of the kth genotype of the ith set in the jth location, $\mu$ is the general mean, and $\mathrm{set}_i$, $l_j$, $(\mathrm{set} \times l)_{ij}$, $\mathrm{set}(g)_{ik}$ and $(\mathrm{set}(g) \times l)_{ijk}$ represent the effects of the set, the location, the interaction between set and location, the genotypes nested within sets, and the interaction between genotypes within sets and location, respectively. The between-set ANOVA was performed on a cell mean basis and later converted to a plot basis by multiplying the MS by a common factor $n = p / \sum (1/r_i)$. The pooled error inserted in the ANOVA was estimated from the experimental error mean squares (EMS) of the individual trials as $\sum (r_i \times EMS_i) / \sum r_i$ (Cochran and Cox 1957). In both expressions, $r_i$ is the number of replications in each trial and $p$ is the number of trials. The approximate degrees of freedom for the pooled error term were obtained following the Welch-Satterthwaite equation:

$$v = \frac{\left(\sum EMS_i\right)^2}{\sum \left(EMS_i^2 / v_i\right)}$$

where $v_i$ is the error degrees of freedom of the individual trials and $EMS_i$ is the error mean square of the individual trials (Satterthwaite 1946). When conducting F-tests, the denominator term for Set was Set*Loc, while $(\mathrm{set}(g) \times l)$ was used as the denominator term for Loc, Set*Loc and Set(Geno). The pooled error MS was used as the denominator of the F-test for the Set(Geno)*Loc term.
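
The pooling steps described for the between-set ANOVA can be sketched as follows. This is a minimal illustration rather than the authors' R code: the function names and input values are hypothetical, and the degrees-of-freedom function uses the unweighted Welch-Satterthwaite form, which is an assumption that fits trials with equal replication.

```python
def common_factor(reps):
    """Harmonic-mean factor n = p / sum(1 / r_i) used to move cell-mean
    mean squares onto a plot basis (p = number of trials)."""
    return len(reps) / sum(1.0 / r for r in reps)


def pooled_error(reps, ems):
    """Replication-weighted pooled error: sum(r_i * EMS_i) / sum(r_i)."""
    return sum(r * e for r, e in zip(reps, ems)) / sum(reps)


def satterthwaite_df(ems, dfs):
    """Approximate degrees of freedom for a pooled error term:
    (sum EMS_i)^2 / sum(EMS_i^2 / v_i). Unweighted form (assumption)."""
    return sum(ems) ** 2 / sum(e ** 2 / v for e, v in zip(ems, dfs))


# Hypothetical values for two trials with two replications each
reps = [2, 2]
ems = [10.0, 20.0]
dfs = [20, 20]

n_factor = common_factor(reps)            # 2.0 when all trials have r = 2
pooled_ems = pooled_error(reps, ems)      # 15.0
approx_df = satterthwaite_df(ems, dfs)    # 900 / 25 = 36.0
```

With equal replication across trials, the common factor reduces to the number of replications and the pooled error to the simple average of the trial error mean squares.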
(b) Within-set model

$$P_{ijkh} = \mu + g_i + l_j + l(r)_{jk} + l(r(b))_{jkh} + (g \times l)_{ij} + e_{ijkh}$$

where $P_{ijkh}$ is the observed value of the ith genotype in the jth location, $\mu$ is the general mean, $g_i$, $l_j$, $l(r)_{jk}$, $l(r(b))_{jkh}$ and $(g \times l)_{ij}$ represent the effects of the genotype, the location, the replication nested within location, the block nested within replication and location, and the interaction between genotype and location, respectively, and $e_{ijkh}$ is the residual effect. The denominators of the F-tests for Loc, Loc(Rep) and Loc(Rep(Block)) were $l(r)$, $l(r(b))$ and EMS, respectively, while the lattice effective error (LEE) was used as the denominator for Geno and Geno*Loc. The LEE was obtained from the standard error of the mean (SEM) estimates of the Geno*Loc term as $LEE = n \times SEM_{g \times l}^2$, where $n$ is the number of values used to estimate the Geno*Loc means, which is equal to the number of replications in this case. The R scripts used for these analyses are provided in Supplemental File 2.

Variance components
To obtain variance components within each set, a linear mixed model was fitted with the lmer function of the lme4 package in R (Bates et al. 2015). Variance components for the major sources of variation were estimated as:

$$\sigma^2_G = \frac{MS_G - MS_{G \times L}}{r\,l}, \qquad \sigma^2_{G \times L} = \frac{MS_{G \times L} - MS_e}{r}, \qquad \sigma^2_e = MS_e$$

where $MS_G$, $MS_{G \times L}$ and $MS_e$ are the respective mean squares for genotypes, genotype × location interaction and the error, while $r$ is the number of replications and $l$ is the number of locations.
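
The relation between mean squares and variance components can be sketched as follows. The sketch assumes the usual expected mean squares for a balanced genotype × location trial; it is not the REML fit produced by lmer, and the function name and input values are hypothetical. Truncating negative estimates at zero is a design choice of this sketch.

```python
def variance_components(ms_g, ms_gl, ms_e, reps, locs):
    """Method-of-moments variance components from mean squares.

    Assumes a balanced genotype x location trial:
      var_e  = MS_e
      var_gl = (MS_GxL - MS_e) / r
      var_g  = (MS_G - MS_GxL) / (r * l)
    Negative estimates are truncated at zero (choice of this sketch).
    """
    var_e = ms_e
    var_gl = max((ms_gl - ms_e) / reps, 0.0)
    var_g = max((ms_g - ms_gl) / (reps * locs), 0.0)
    return {"G": var_g, "GxL": var_gl, "e": var_e}


# Hypothetical mean squares; r = 2 replications, l = 2 locations as in the study
components = variance_components(ms_g=100.0, ms_gl=40.0, ms_e=20.0,
                                 reps=2, locs=2)
```

For these illustrative inputs the estimates are 15 for genotypes, 10 for genotype × location and 20 for the error.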

Genotypic and phenotypic variability
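The extent of dispersion, or the degree of variability within each breeding set, was estimated using the formula proposed by Johnson (1955) as:

$$GCV = \frac{\sqrt{\sigma^2_G}}{\mu} \times 100, \qquad PCV = \frac{\sqrt{\sigma^2_P}}{\mu} \times 100$$

where $\sigma^2_G$ and $\sigma^2_P$ are the genotypic and phenotypic variances and $\mu$ is the grand mean. Broad sense heritability ($H^2$) was computed from the variance components, expressed on an entry mean basis, as:

$$H^2 = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{G \times L}/l + \sigma^2_e/(r\,l)}$$

where $\sigma^2_G$, $\sigma^2_{G \times L}$ and $\sigma^2_e$ are the variance components for genotype, genotype × location interaction and the error, respectively, while $r$ and $l$ are the numbers of replications and locations, respectively.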
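
A short sketch of these variability parameters follows. The function names are hypothetical. The GCV example uses the Prelim2 grain yield genetic variance reported in the abstract (45,897); the set mean is not quoted in the text, so it is back-calculated here from the reported GA and GAPM, which is an assumption of this sketch. The heritability example uses purely illustrative variance components.

```python
def coefficient_of_variation(variance, grand_mean):
    """GCV or PCV (%): square root of the variance over the mean, times 100."""
    return 100.0 * variance ** 0.5 / grand_mean


def entry_mean_heritability(var_g, var_gl, var_e, reps, locs):
    """Broad sense heritability on an entry mean basis."""
    phenotypic = var_g + var_gl / locs + var_e / (reps * locs)
    return var_g / phenotypic


# Prelim2 grain yield: mean back-calculated as GA / (GAPM / 100)
prelim2_mean = 269.05 / (24.59 / 100.0)                     # about 1094 kg/ha
prelim2_gcv = coefficient_of_variation(45897.0, prelim2_mean)

# Illustrative components only (not from the paper)
h2_example = entry_mean_heritability(var_g=15.0, var_gl=10.0, var_e=20.0,
                                     reps=2, locs=2)
```

The back-calculated mean gives a GCV close to the 19.58% reported for Prelim2, and the illustrative components give a heritability of 0.6.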

Genetic advance and usefulness
Expected genetic advance ($G_A$) and genetic advance expressed as a percentage of the mean ($G_{APM}$) for each trait were computed according to Allard (1960) as:

$$G_A = k_i \times H^2 \times \sigma_P, \qquad G_{APM} = \frac{G_A}{\mu} \times 100$$

where $k_i$ is the standardized selection differential (assuming 10% selection intensity for prediction of genetic advance), $H^2$ is the broad sense heritability, $\sigma_P$ is the phenotypic standard deviation, and $\mu$ is the grand mean. The genetic worth or usefulness (U) of each population was then estimated based on the mean and the genetic advance according to Schnell and Utz (1975), Tabanao and Bernardo (2005) and Bernado (2010) as:

$$U = \mu + G_A$$

where $U$ is the genetic usefulness of a population, $\mu$ is the mean of the population and $G_A$ is the expected genetic advance.
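
The calculation of genetic advance and usefulness can be sketched as follows. The sketch assumes truncation selection on a normally distributed trait, in which case the standardized selection differential for the best 10% is about 1.755; the function names and the example inputs are hypothetical.

```python
from statistics import NormalDist


def selection_differential(proportion_selected):
    """Standardized selection differential k for truncation selection
    on a normally distributed trait: k = pdf(z) / p, z = upper p-quantile."""
    z = NormalDist().inv_cdf(1.0 - proportion_selected)
    return NormalDist().pdf(z) / proportion_selected


def genetic_advance(h2, phenotypic_variance, proportion_selected=0.10):
    """Expected genetic advance: k * H^2 * phenotypic standard deviation."""
    k = selection_differential(proportion_selected)
    return k * h2 * phenotypic_variance ** 0.5


def genetic_advance_percent(ga, grand_mean):
    """Genetic advance as a percentage of the grand mean."""
    return 100.0 * ga / grand_mean


def usefulness(grand_mean, ga):
    """Usefulness of a population: mean plus expected genetic advance."""
    return grand_mean + ga


# Illustrative inputs only (not from the paper)
k_10 = selection_differential(0.10)
ga = genetic_advance(h2=0.5, phenotypic_variance=10000.0)
gapm = genetic_advance_percent(ga, grand_mean=1000.0)
u = usefulness(1000.0, ga)
```

Because U adds the expected gain to the mean, a set with a modest mean can still rank well if its heritability and phenotypic spread are large.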

Principal component analysis (PCA)
Three traits, namely GY, HSDWT and DT50FL, were used to conduct PCA on the sets of breeding lines in R using the vqv/ggbiplot package developed by Vincent (2011). PCA scores of the three variables were generated and used to determine the contribution of each variable to the total variation within and among the sets. PCA plots were then generated to visualize the scatter pattern of sets, and of genotypes within sets, along the X and Y axes.
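
How trait loadings on the first principal component arise can be illustrated with a small standard-library sketch. It is not the ggbiplot workflow used in the study: it assumes PCA on the correlation matrix of standardized traits and uses power iteration to approximate the leading component. The function names and the data are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def correlation_matrix(columns):
    """Correlation matrix of a list of equally long trait columns."""
    z = [standardize(col) for col in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z]
            for zi in z]


def first_principal_component(matrix, iterations=200):
    """Leading eigenvalue and unit-length loadings by power iteration."""
    vector = [1.0 / (k + 1) for k in range(len(matrix))]
    for _ in range(iterations):
        product = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
        norm = sum(p * p for p in product) ** 0.5
        vector = [p / norm for p in product]
    eigenvalue = sum(v * sum(m * x for m, x in zip(row, vector))
                     for v, row in zip(vector, matrix))
    return eigenvalue, vector


# Hypothetical set means for GY (kg/ha), HSDWT (g) and DT50FL (days)
gy = [1250.0, 1100.0, 1050.0, 1300.0, 1150.0, 1180.0, 1220.0, 900.0]
hsdwt = [15.5, 16.2, 15.3, 15.6, 16.5, 16.8, 14.8, 15.7]
dt50fl = [49.0, 50.0, 51.0, 49.0, 46.0, 52.0, 46.0, 45.0]

corr = correlation_matrix([gy, hsdwt, dt50fl])
pc1_eigenvalue, pc1_loadings = first_principal_component(corr)
pc1_share = pc1_eigenvalue / len(corr)  # proportion of total variance on PC1
```

The loadings returned here play the same role as the PC1 values quoted for each trait in the results: traits with larger absolute loadings contribute more to the variation along that axis.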

Traits distribution
The frequency distributions of lines in each population according to traits are presented in Fig. 1 and Supplemental Fig. 1. The box plots revealed different levels of dispersion within each breeding set with Prelim5 being the most variable set with high median GY, followed by Prelim10 and Prelim1, while Prelim11 had the least dispersion and the lowest median GY (Fig. 1a). The median seed weights (HSDWT) were within close ranges for most of the breeding sets, although Prelim8 stood out with the highest values while Prelim11 had the lowest (Fig. 1b). Prelim11 was earlier than other sets with median DT50FL of about 45 days while Prelim8 took more than 50 days on average to flower (Fig. 1c). The depictions from the histograms showed variations for grain yield, 100 seed weight and days to 50% flowering within the eight sets of breeding materials thus portraying continuous distributions typical of quantitative traits (Supplemental Fig. 1). The eight breeding sets responded uniquely to the environments based on their performances for GY, HSDWT and DT50FL, with each breeding set showing differential performances (high or low) between the two locations as depicted in the individual location boxplots presented in Supplemental Fig. 1.

Classification of breeding sets
Results of PCA conducted among and within breeding sets are presented in Fig. 2 and Supplemental Fig. 2. In general, PCA showed diversity both within and between breeding sets based on GY, HSDWT and DT50FL, with PC1 and PC2 between sets accounting for 91.2% of the total variation in the data. PCA showed the three traits (GY, HSDWT and DT50FL) to be distinct enough to provide good discrimination among and within the breeding sets. For variation among sets, PC1 was strongly associated with HSDWT (PC1 = 0.65) and DT50FL (PC1 = 0.73); therefore, Prelim sets with high positive scores for PC1 were promising for these two traits. PC2 was correlated with GY (PC2 = -0.90); hence, sets with high negative scores for PC2 were good for GY. When the data were grouped sequentially by each trait, clusters of breeding sets with potential for GY, HSDWT and DT50FL became apparent. Prelim1, 5 and 10 clustered in a group with GY above 1,200 kg/ha, while Prelim11 was alone in the low yielding category (< 1,000 kg/ha), the rest being intermediate (Fig. 2a). For HSDWT, Prelim7, 2 and 8 were categorized as having seed weights above 15.9 g, other sets being between 15.0 and 15.9 g, while Prelim10 had a mean HSDWT of less than 15 g (Fig. 2b). PCA for DT50FL revealed two groups, with Prelim7, 10 and 11 falling in the early flowering category with less than 48 days, while the rest of the sets were categorized as flowering later than 48 days after planting (Fig. 2c).

Fig. 1 Phenotypic distributions: Box plots showing the dispersion quartiles within each of the eight sets of advanced breeding materials. a Grain yield (GY). b 100 seed weight (HSDWT). c Days to 50% flowering (DT50FL), generated using means from two locations. Histograms reflecting the distributions within each breeding set and boxplots for individual location dispersions are presented in Supplemental Fig. 1
When we performed PCA within each of the eight sets, it was clear that potentially high yielding genotypes (> 1500 kg/ha) that overlap with high seed weight and earliness could be extracted from these populations (Fig. 2d and Supplemental Fig. 2). Except for Prelim7 and 10, GY was strongly correlated with PC1 and accounted for most of the variation among genotypes within sets; therefore, genotypes with high positive scores on the PC1 axis were good for GY. For Prelim7 and 10, HSDWT and DT50FL contributed most to the variation explained by PC1. The PCA results further revealed that, despite the genotypes being clustered as the best performers for a particular trait, there was still enough diversity among genotypes within each cluster (Fig. 2d and Supplemental Fig. 2). A summary chart of the proportion of best genotypes that could be extracted from each set for GY, HSDWT and DT50FL is presented in Fig. 3. It was evident that no genotype with GY above 1500 kg/ha could be obtained from Prelim11, while Prelim5 had the highest number of high yielding genotypes (Fig. 3a). For HSDWT, Prelim7 had a higher number of genotypes with seed weight above 20 g than the other breeding sets (Fig. 3b). Genotypes with DT50FL of less than 45 days were frequent in Prelim11, while all the genotypes in Prelim3 flowered later than 45 days (Fig. 3c). When the three traits were considered together, more genotypes combining GY > 1,200 kg/ha, HSDWT > 15 g and DT50FL < 45 days could be extracted from Prelim5 than from the other sets (Fig. 3d).

Fig. 2 (caption fragment) In plots a, b and c, DT50FL accounted for most of the variation explained by PC1 while GY explained a greater proportion of the variance on the PC2 axis. In plot d, GY and HSDWT accounted for most of the variance on the PC1 axis while DT50FL was associated with variation on the PC2 axis
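
The combined-trait screen behind Fig. 3d can be sketched as a simple filter. The thresholds are those quoted in the text (GY > 1,200 kg/ha, HSDWT > 15 g, DT50FL < 45 days); the genotype records, their names and the function name are hypothetical and serve only to show the logic.

```python
def select_genotypes(records, min_gy=1200.0, min_hsdwt=15.0, max_dt50fl=45.0):
    """Return names of genotypes that meet all three trait thresholds."""
    return [r["genotype"] for r in records
            if r["GY"] > min_gy
            and r["HSDWT"] > min_hsdwt
            and r["DT50FL"] < max_dt50fl]


# Hypothetical genotype means across locations
records = [
    {"genotype": "line_A", "GY": 1350.0, "HSDWT": 16.1, "DT50FL": 44.0},
    {"genotype": "line_B", "GY": 1500.0, "HSDWT": 14.2, "DT50FL": 43.0},
    {"genotype": "line_C", "GY": 1100.0, "HSDWT": 17.0, "DT50FL": 42.0},
    {"genotype": "line_D", "GY": 1250.0, "HSDWT": 15.4, "DT50FL": 47.0},
]

selected = select_genotypes(records)
proportion_selected = len(selected) / len(records)
```

Only line_A passes all three thresholds in this illustration, so the proportion extracted from this hypothetical set is 0.25.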

Variance between and within breeding sets
Analysis of variance showed that the eight breeding sets were not statistically different for GY, HSDWT and DT50FL, as indicated by non-significant mean square values for sets (Table 2). The numerical performance of the breeding sets in each of the two locations is presented in Supplemental Table 1. When we considered genotypes nested within sets, the genotypic differences were highly significant (P < 0.001) for all three traits (Table 2). In addition, the effect of location was highly significant for all three traits (P < 0.001), and consequently the overall responses of the breeding sets were strongly influenced by environment, as portrayed by significant Set-by-Location interaction effects for all three traits (Table 2). The effects of genotypes nested within breeding sets were also statistically significant (P < 0.001) for all three traits, signaling the apparent difference among genotypes within the sets (Table 2). However, the interaction between the nested effect of genotype within set and location was highly significant, suggesting the presence of genotype-by-environment interaction. When mean squares for variation within breeding sets were compared, significant genotype effects were observed for GY, HSDWT and DT50FL, with a probability value of at least P ≤ 0.05 for all eight breeding sets (Table 3 and Supplemental Table 2). Location effects were significant, with at least P ≤ 0.05, for most of the traits, with exceptions in some breeding sets that showed no statistical significance (Supplemental Table 2). The interactions between genotypes and location were also highly significant in all eight breeding sets for all three traits (Table 3 and Supplemental Table 2).
For 100 seed weight, all the breeding sets showed generally lower magnitudes of variance attributed to Genotype-by-Location interaction (δ²G*L) relative to the respective genotypic variances (δ²G), with Prelim11 (δ²G = 7.137 vs δ²G*L = 0.39) having the highest genetic variance component (Table 4). […] the lowest variability for seed weight, with GCV and PCV of 10.38% and 11.87%, respectively (Table 4). The partitioning of variance within sets for DT50FL revealed variable magnitudes of variance attributed to genetic components, with Prelim1 (δ²G = 4.52 vs δ²G*L = 3.79) and Prelim5 (δ²G = 4.54 vs δ²G*L = 6.56) having high values of genetic variance components relative to the Genotype-by-Location interaction terms (Table 4). Breeding sets that showed high genotypic and phenotypic variability for days to 50% flowering included Prelim1 (GCV = 4.43%; PCV = 5.75%), Prelim5 (GCV = 4.39%; PCV = 6.27%), Prelim3 (GCV = 4.20%; PCV = 5.04%) and Prelim10 (GCV = 4.08%; PCV = 4.53%). The breeding set with the lowest genotypic variability for DT50FL was Prelim7, which had a GCV of 2.16% (Table 4). For all the traits and breeding sets, the differences between the two parameters (PCV and GCV) were minimal, yet judging from the standard deviations of the genetic variance components, these differences are significant.

Table footnotes: Geno Genotype; Loc Location; SEM Standard error of the mean; CV Coefficient of variation; DF Degrees of freedom; MS Mean square; GY Grain yield; HSDWT Hundred seed weight; DT50FL Days to 50% flowering. ANOVA was conducted using genotype means obtained from individual location analysis. The symbols *, ** and *** represent the probability at 0.05, 0.01 and 0.001, respectively. # Refers to the source of variation whose degrees of freedom were used as denominator for the F-test
Table footnotes: HSDWT Hundred seed weight; DT50F Days to 50% flowering; the symbols *, ** and *** represent the probability at 0.05, 0.01 and 0.001, respectively

Genetic advance and usefulness of breeding sets
Results for expected genetic advance within the eight breeding sets, computed based on broad sense heritability and assuming 10% selection intensity, are presented in Table 5. Heritability for grain yield computed on an entry mean basis ranged from 0.21 for Prelim3 to 0.57 for Prelim2. Overall, the expected genetic advance (GA) for GY was more dependent on heritability than on genetic variance. When GA for grain yield was expressed as a percentage of population means (GAPM), Prelim2 emerged with the highest percentage of expected genetic advance (GAPM = 24.59%; GA = 269.05 kg/ha). This was followed by Prelim7 (GAPM = 21.84%; GA = 239.91 kg/ha) and Prelim5 (GAPM = 19.77%; GA = 251.27 kg/ha). Other intermediate breeding sets were Prelim8 (GAPM = 17.91%; GA = 211.12 kg/ha) and Prelim1, with GAPM and GA of 14.21% and 175.83 kg/ha, respectively, while Prelim3 had the lowest GAPM (Table 5).

Table footnotes: GCV Genetic coefficient of variation; PCV Phenotypic coefficient of variation; GY Grain yield; HSDWT Hundred seed weight; DT50FL Days to 50% flowering; the labels Prelim1, Prelim2, Prelim3, Prelim5, Prelim7, Prelim8, Prelim10 and Prelim11 are the breeding sets or populations
Consequently, when the usefulness criterion was used to compare breeding sets based on grain yield, most of the sets that had a high percentage of genetic advance also recorded high usefulness (Up) values (Table 5). For instance, Prelim5 had the highest Up of 1522.33 kg/ha, followed by Prelim1 (Up = 1413.54 kg/ha), while Prelim11 (Up = 874.13 kg/ha) and Prelim3 (Up = 1110.93 kg/ha) were the least useful sets for grain yield (Table 5). For HSDWT, the heritability values were relatively high across all breeding sets, ranging from 0.76 for Prelim8 to 0.93 for Prelim2 (Table 5). Breeding sets with high heritability values also showed relatively high predicted genetic advance, the best sets being Prelim11 (GAPM = 28.54%; GA = 4.47 g) and Prelim2 (GAPM = 24.94%; GA = 4.00 g), while Prelim8 (GAPM = 15.97%; GA = 2.68 g) registered the lowest percentage of expected genetic advance (Table 5). The usefulness criterion revealed Prelim11 and Prelim2 as the most useful breeding sets for HSDWT, with Up values of 20.12 g and 20.04 g, respectively, while Prelim10 had the lowest usefulness value even though it had a moderate percentage of expected genetic advance. For DT50FL, the heritability values varied between breeding sets, ranging from low (0.20 for Prelim8) to intermediate (0.49 for Prelim5) and high (0.81 for Prelim10). Consequently, the breeding sets that showed high predicted genetic advance were Prelim10 (GAPM = 6.49%; GA = 3.01 days), Prelim1 (GAPM = 6.01%; GA = 2.88 days) and Prelim5 (GAPM = 5.41%; GA = 2.63 days). Prelim2 and Prelim11 had intermediate proportions of expected genetic advance, while Prelim8 and 3 had the lowest predicted genetic advance for DT50FL (Table 5).
Because earlier flowering is the desired direction for DT50FL, the usefulness criterion for this trait was obtained by deducting the expected genetic advance from the mean DT50FL. This revealed Prelim10 (Up = 43.44 days), Prelim11 (Up = 43.59 days) and Prelim1 (Up = 45.08 days) as the sets with high genetic potential for early flowering (Table 5).
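
The direction-dependent use of the usefulness criterion can be checked against the figures quoted in the text with a short sketch. The set means are not quoted here, so they are back-calculated from the reported GA and GAPM, which is an assumption of this sketch; small rounding differences from the Table 5 values are therefore expected. The function names are hypothetical.

```python
def mean_from_ga(ga, gapm_percent):
    """Back-calculate the set mean from GA and GAPM (GAPM = 100 * GA / mean)."""
    return ga / (gapm_percent / 100.0)


def usefulness(set_mean, ga, lower_is_better=False):
    """Usefulness: mean + GA, or mean - GA when a lower value is desired."""
    return set_mean - ga if lower_is_better else set_mean + ga


# Prelim11, hundred seed weight: GA = 4.47 g, GAPM = 28.54%
hsdwt_mean = mean_from_ga(4.47, 28.54)
hsdwt_up = usefulness(hsdwt_mean, 4.47)

# Prelim10, days to 50% flowering: GA = 3.01 days, GAPM = 6.49%
dt_mean = mean_from_ga(3.01, 6.49)
dt_up = usefulness(dt_mean, 3.01, lower_is_better=True)
```

Under these assumptions the sketch returns values close to the reported Up of 20.12 g for Prelim11 and 43.44 days for Prelim10.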

Discussion
Decisions in plant breeding are becoming increasingly complex given dynamic consumer demands and preferences and the current issues of climate change. As the human population continues to grow, breeders are constantly under pressure to release improved varieties with high yields and other preferred traits. Consequently, a typical active breeding program often handles multiple populations intended for varied purposes or product targets (Witcombe and Virk 2001). This introduces complex deliberations and challenges relating to the handling of large numbers of genetic materials, resource allocation and selection decisions at every breeding stage (Witcombe and Virk 2001; Sun et al. 2011). Sun et al. (2011) therefore noted that careful choice of genotypes at each step in a breeding program is key in determining the ultimate success of the subsequent selection stages for genetic advancement. The present study elucidated the genetic worth of eight sets of cowpea breeding materials evaluated in preliminary yield trials across two locations in Northern Nigeria, deploying the concepts of genetic variance, heritability, genetic advance and the usefulness criterion to aid selection decisions for the advancement of materials. We began by examining the distributions of the three traits, GY, HSDWT and DT50FL, within each of the eight breeding sets. The trait variation approximated continuous distributions within the sets, suggestive of quantitative inheritance. Sinnott (1937) argued that when phenotypic variation is presumably environmental and/or conditioned by multiple genes with minor effects, the distribution is essentially symmetrical. In cowpea, grain yield, seed weight and flowering time are complex traits that exhibit quantitative variation in nature (Lopes et al. 2003; Ishiyaku et al. 2005; Boukar et al. 2016).
The present study depicted different levels of total dispersion within the breeding sets, with some sets such as Prelim5, Prelim8 and Prelim11 showing slight shifts towards higher GY, higher HSDWT and lower DT50FL, respectively. The observed dispersion suggested that genetic factors govern the traits tested and that recovering promising lines from these sets is highly probable.
When the breeding sets were analyzed using PCA, it became apparent that the eight breeding sets were distinct from each other, although some of them overlapped for the three traits. Grouping the breeding sets by their means with respect to the traits allowed the PCA to highlight the potential sets for GY, HSDWT and DT50FL (Fig. 3). When PCA was examined within each breeding set, the structure reflected diversity among genotypes: some genotypes were highly associated with GY, reflecting their yield potential, while others were more correlated with HSDWT and DT50FL, implying that those genotypes performed well for the traits in question (Supplementary Fig. 2). PCA was able to identify the top-performing genotypes within each breeding set for the three traits, with clear categorization of those having GY above 1500 kg/ha, HSDWT above 20 g and DT50FL less than 45 days.
A summary of the proportion of high-performing genotypes that could be extracted from each breeding set was derived from the PCA and presented in Fig. 3. This chart portrayed Prelim5, Prelim7 and Prelim11 as sets with high frequencies of genotypes with GY above 1500 kg/ha, HSDWT above 20 g and DT50FL less than 45 days, while Prelim5 had the highest number of genotypes with a good combination of desired values for the three traits. PCA is a powerful data reduction tool that has been used in conventional cowpea breeding for morphological characterization and for defining key determinants of grain yield (Oladejo et al. 2016). Vural and Karasu (2007) applied PCA to multiple yield component traits to determine which factors explained most of the total variance in the data, and found that seed weight and pod size contributed most of the variation. In the present study, the trait distributions and PCA provided an overall picture of total variability and structure in the data among and within the breeding sets. Differences among sets were mostly explained by DT50FL, as indicated by a higher PC1 score for this trait than for the others. This observation is consistent with the fact that the sets were created based on maturity, and it is therefore expected that the groups would be distinct in terms of DT50FL. On the other hand, variation among genotypes within sets was mostly explained by GY and HSDWT, as reflected by high PC1 scores for these traits. Given the information on the contributions of the traits to variation on the PC1 and PC2 axes, it was possible to identify promising sets and genotypes within sets for higher GY. The fact that variability among genotypes within each set was mostly explained by GY and HSDWT implies that selection within the sets for these two traits would be more beneficial than for DT50FL.
However, total dispersion combines genetic and environmental variation and therefore does not wholly reflect the magnitude of genetic variance, even though phenotypic variability was generally only slightly greater than genetic variability for these traits. An accurate assessment requires partitioning the total variance into its different components (Bernardo 2010).
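The way a PCA attributes variation to individual traits, as discussed above, can be sketched with the standard library alone. This is an illustration under stated assumptions, not the analysis pipeline used in the study: the eight rows of trait means are made-up numbers, the traits are standardized so that the PCA operates on the correlation matrix, and the first component is obtained by simple power iteration. The helper names `standardize`, `correlation_matrix` and `first_component` are hypothetical.

```python
import math
import statistics


def standardize(columns):
    """Center each trait column and scale it to unit sample standard deviation."""
    scaled = []
    for col in columns:
        mean = statistics.mean(col)
        sd = statistics.stdev(col)
        scaled.append([(value - mean) / sd for value in col])
    return scaled


def correlation_matrix(zcols):
    """Correlation matrix of already standardized trait columns."""
    n = len(zcols[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in zcols]
            for ci in zcols]


def first_component(matrix, iterations=500):
    """Leading eigenvector (PC1 loadings) and eigenvalue by power iteration."""
    p = len(matrix)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(p))
                     for i in range(p))
    return vec, eigenvalue


# Hypothetical set means for eight sets (NOT data from the study).
traits = ["GY", "HSDWT", "DT50FL"]
gy = [1100, 1250, 1150, 1300, 1500, 1200, 1400, 950]        # kg/ha
hsdwt = [16.0, 17.5, 15.8, 18.2, 19.0, 16.5, 18.8, 15.7]    # g
dt50fl = [45.1, 47.0, 48.5, 46.2, 50.3, 49.0, 47.8, 43.6]   # days

corr = correlation_matrix(standardize([gy, hsdwt, dt50fl]))
loadings, eigenvalue = first_component(corr)
explained = eigenvalue / len(traits)  # trace of a correlation matrix equals p

for name, loading in zip(traits, loadings):
    print(f"PC1 loading for {name}: {loading:.3f}")
print(f"Share of total variance on PC1: {explained:.1%}")
```

The trait with the largest absolute loading on a component contributes most to the variation captured by that component, which is the sense in which DT50FL separated the sets and GY and HSDWT separated genotypes within sets.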
To unravel the variability between and within the sets, we conducted a two-step classical ANOVA, first between the sets and then within individual breeding sets. Sets did not differ significantly in mean for any of the three traits, although some sets had numerically higher mean values than others. However, the effect of genotypes nested within location was significant, an indication that the sets are likely different but that the set effect could have been masked by the environment. Indeed, the analysis revealed significant set-by-location interaction and significant interaction of genotypes nested in sets with location. This outcome suggested that meaningful selection among sets and among genotypes within sets would require testing the materials in multiple locations to account for the confounding effect of the environment. It is also important to understand the amount of variation within a population, in addition to its mean, in order to make a more informed selection decision (Tabanao and Bernardo 2005; Bernardo 2010). The present study tested genotypic variation in the eight sets and found significant genotype effects within each set for all the traits except GY in Prelim11. This suggested that there was enough genetic variability within the sets to warrant selection and recovery of good-performing lines. However, the significant genotype-by-location interaction observed for traits in most of the sets indicated that the relative performance of genotypes varied between the locations, calling for caution when combining means from the two locations to make selections within the sets (Mohammadi et al. 2015). Genotypic variation for grain yield, seed weight and flowering time in cowpea is known to be influenced by the environment (Adewale et al. 2010; Odeseye et al. 2018). This complicates the selection of superior genotypes, thereby reducing genetic progress (Allard and Bradshaw 1964; Mohammadi et al. 2015).
In the present case, decisions would be based on data from two locations. Considering that further testing in more locations is expected, selection based on means with a relatively relaxed selection intensity is suggested, to avoid eliminating potentially stable genotypes for GY at this stage.
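What relaxing the selection intensity means in numerical terms can be sketched as follows. The example assumes truncation selection on a normally distributed trait, for which the standardized selection differential equals the normal density at the truncation point divided by the selected fraction. The selected fractions shown are illustrative and are not values used in the study; the function name `selection_intensity` is hypothetical.

```python
from statistics import NormalDist


def selection_intensity(selected_fraction):
    """Standardized selection differential for truncation selection.

    Assumes a normally distributed trait and retention of the top
    `selected_fraction` of lines.
    """
    if not 0.0 < selected_fraction < 1.0:
        raise ValueError("selected_fraction must lie strictly between 0 and 1")
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - selected_fraction)
    return std_normal.pdf(threshold) / selected_fraction


# Illustrative fractions only: retaining more lines lowers the intensity.
for fraction in (0.05, 0.10, 0.20, 0.30):
    print(f"top {fraction:.0%} retained -> k = {selection_intensity(fraction):.2f}")
```

Retaining a larger fraction lowers the expected gain per cycle but keeps more lines available for evaluation in further locations.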
To further decode the genetic potential of the eight breeding sets, the total variance within each set was partitioned into variances attributed to genotype, location and their interaction (Table 4). This allowed further dissection of the breeding sets in terms of genotypic coefficient of variation, heritability, genetic advance and overall genetic usefulness. Breeding sets with a high relative magnitude of genetic variance had moderate to high heritability and also showed relatively high expected genetic advance and genetic usefulness. This observation suggested that sets with high values of genetic variance, genotypic coefficient of variation, heritability, expected genetic advance and genetic usefulness for the traits in question would respond well to future selection, and that superior lines for these traits can be extracted from them. This finding was consistent with past studies in cowpea that used similar genetic parameters to evaluate population response to selection (Damarany 1994; Omoigui et al. 2006; Manggoel 2012; Nwosu et al. 2013). The minimal differences observed between GCV and PCV for all the traits studied implied that the traits are mostly governed by genetic factors, with little role of the environment in their phenotypic expression (Manggoel 2012). Therefore, selection for these traits based on phenotypic value may be effective. Manggoel (2012) noted that heritability estimates coupled with genetic advance are useful in predicting the result of selecting the best individuals from a population. The moderate to high broad-sense heritability values observed in the present study suggested that selection within each Prelim set for GY, HSDWT and DT50FL would be beneficial, given the moderate magnitude of environmental influence. The results of the usefulness criterion were consistent with those from variance components and genetic advance.
This suggested that the concept of genetic usefulness may be used to evaluate the genetic merit of specifically defined groups of breeding materials that are not necessarily derived from a two-parent cross. Usefulness criteria have historically been applied to bi-parental populations with full-sib progeny to predict population performance in early generations (Tabanao and Bernardo 2005; Bernardo 2010; Allier et al. 2019). The advantage of genetic usefulness is that it captures the overall value of a population in terms of its mean performance and total variance (Tabanao and Bernardo 2005; Bernardo 2010; Allier et al. 2019). With homozygous lines and the opportunity for replicated testing at later generations, as is the case in the present study, the prediction accuracy of genetic usefulness is improved. The information may still be helpful at the early performance testing phase, especially when there is a need to prioritize among several groups of breeding materials. Indeed, our study has demonstrated that some sets, such as Prelim11 (Up = 874 kg/ha) and Prelim3 (Up = 1110 kg/ha), have relatively low GY scores and would be dropped at this stage, with their lines returned to the crossing nursery for yield improvement.
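The genetic parameters discussed above can be related to one another in a short sketch. It is a hedged illustration, not the authors' analysis script. It assumes entry-mean heritability across locations and replications, the widely used expression GA = k × H × σP, and k = 2.06 (approximately 5% selected), none of which is confirmed in this section. The genetic variance of 45,897 is the value reported for GY in Prelim2; the genotype-by-location and error variances are hypothetical, and the mean of 1094.2 kg/ha is back-calculated from the reported GCV rather than quoted. The function name `genetic_parameters` is hypothetical.

```python
import math


def genetic_parameters(var_g, var_gl, var_e, mean, n_loc=2, n_rep=2, k=2.06):
    """Derive GCV, PCV, heritability and genetic advance from variance components.

    Assumptions: entry-mean phenotypic variance over n_loc locations and
    n_rep replications; GA = k * H * sqrt(var_p); k = 2.06 by default.
    """
    var_p = var_g + var_gl / n_loc + var_e / (n_loc * n_rep)
    heritability = var_g / var_p
    ga = k * heritability * math.sqrt(var_p)
    return {
        "var_p": var_p,
        "heritability": heritability,
        "gcv": 100.0 * math.sqrt(var_g) / mean,   # percent
        "pcv": 100.0 * math.sqrt(var_p) / mean,   # percent
        "ga": ga,                                 # trait units
        "gapm": 100.0 * ga / mean,                # percent of mean
    }


# var_g is the reported Prelim2 GY value; var_gl and var_e are HYPOTHETICAL,
# and the mean is back-calculated from the reported GCV of 19.58%.
params = genetic_parameters(var_g=45897.0, var_gl=60000.0, var_e=180000.0,
                            mean=1094.2)
for name, value in params.items():
    print(f"{name}: {value:.2f}")
```

Because GCV and PCV share the same denominator, a small gap between them corresponds to a heritability close to one, which is the reasoning behind reading a narrow GCV-PCV difference as limited environmental influence.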
The present study elucidated the structure and properties of eight sets of cowpea breeding materials destined for further testing, revealing the uniqueness of each set, the magnitude of expected gain from selection within each set and the genetic usefulness of each set. The variance component analysis allowed estimation of genetic and phenotypic coefficients of variation, heritability and expected genetic advance. These parameters exposed the genetic potential of the eight sets of cowpea breeding lines for GY, HSDWT and DT50FL, revealing sets with high genetic variance from which superior lines could be extracted and recommended for advanced testing. Estimates of genetic usefulness were generally consistent with results from variance components, providing an additional layer of information on the genetic merit of the sets. The current study highlights a novel application of usefulness criteria to non-biparental populations defined by maturity group. However, comparisons of performance among populations may be limited by the nature of the traits used for grouping, as in the present case, where maturity may be correlated with other traits used for assessing performance. Principal component analysis depicted the relative contributions of the three traits to the variability between and within sets, revealing that more benefit would be obtained by selecting among genotypes within sets based on GY and HSDWT than on DT50FL. These approaches generated relevant information required for making advancement decisions in a conventional breeding program.