Utilizing genetic diversity to select tomato lines tolerant of tomato yellow leaf curl virus based on genotypic coefficient of variation, heritability, genotypic correlation, and multivariate analyses

Tomato yellow leaf curl virus (TYLCV) is one of the most destructive pathogens of tomato crops. The development of TYLCV-tolerant tomato lines (TLs) requires a thorough understanding of their genetic variability and of the relationships among lines and among traits of interest. Twelve F9 TLs were evaluated for phenotypic TYLCV tolerance, vegetative growth, yield, and fruit quality during the 2018 and 2019 fall seasons to identify elite breeding lines. TLs were selected by a bulk selection method from segregating generations of the commercial F1 hybrids TH99802 (TLs 1–6) and TH99806 (TLs 7–12). TLs exhibited either mild or no symptoms. The TYLCV titer varied between 0.8 × 10⁵ and 3.9 × 10⁵ in symptomless TL plants compared to 56.7 × 10⁵ in severely symptomatic plants of the susceptible cultivar 'Castlerock'. Across both seasons, TL-3, TL-5, and TL-8 exhibited the highest total and marketable plant yields, TYLCV tolerance, and acceptable fruit quality. Most traits had high estimates of genetic variance, genotypic coefficient of variation, and broad-sense heritability. Our results indicated that there was sufficient genetic variability for selection of the best lines. Principal component analysis and hierarchical cluster analysis indicated that the TLs were highly diverse for the evaluated traits and could be divided into three clusters. Cluster 1, which included TL-1, TL-3, and TL-8, performed better for TYLCV tolerance and economically important traits. Clusters 1 and 2 showed the greatest degree of dissimilarity. Therefore, crossing parents from Cluster 1 with parents from Cluster 2 is predicted to maximize recombination for improved genotypes.


Introduction
Tomato (Solanum lycopersicum L.) is an important vegetable crop. Egypt is the fifth largest producer worldwide, with 6,245,787 tons produced in 2021 from 357,259.4 feddans, averaging 17.5 tons fed⁻¹ (http://faostat.fao.org). Tomato yellow leaf curl disease (TYLCD) is one of the world's most devastating diseases for tomato producers (Lefeuvre et al. 2010; Moriones and Navas-Castillo 2000). TYLCD symptoms include yellowing, curling, and cupping of leaves, severe stunting, and abortion of flowers and fruits, which can result in yield reductions of up to 100% (Abhary et al. 2007). Tomato yellow leaf curl virus (TYLCV) is the causative agent of TYLCD and belongs to the genus Begomovirus in the family Geminiviridae. TYLCV is a monopartite DNA virus with a circular genome that contains six genes, with two genes on the viral strand (V1 and V2) and four genes on the complementary sense strand (C1 to C4) (Abhary et al. 2007; Gronenborn 2007). TYLCV is exclusively transmitted in a persistent-circulative manner by the sweet potato whitefly, Bemisia tabaci (Genn. 1889) (Homoptera; Aleyrodidae) (Gronenborn 2007). B. tabaci is an invasive pest found in over 175 countries worldwide (Ramos et al. 2018). The virus has spread as a result of the tremendous global proliferation of the whitefly. The number of tomato-producing regions reporting TYLCD outbreaks has increased to as many as 70 nations since the disease was discovered in the Jordan Valley in the 1930s (Mabvakure et al. 2016).
TYLCD control is a laborious, costly, and difficult task. The primary TYLCD control strategy is the use of insecticides to eradicate the B. tabaci viral vector (Lapidot et al. 2014). However, insecticide treatment can be expensive and environmentally risky. Furthermore, because B. tabaci develops resistance to insecticides, they can become ineffective (Palumbo et al. 2001). The simplest, safest, most practical, and most ecologically friendly approach to controlling TYLCD, minimizing yield losses, and reducing viral transmission is to use TYLCD-resistant cultivars. As a result, TYLCD resistance has become one of the most important goals in tomato breeding.
The continuous development of new tomato lines/cultivars is important to improve TYLCD resistance/tolerance and productivity, and to overcome virulence genes developed by the virus. The selection and development of elite TYLCD-resistant/tolerant lines require a thorough understanding of the genetic variability and relationships among lines and between traits of interest (Lynch and Walsh 1998; Bernardo 2010). Plant breeders use morphological traits to estimate genetic variability because they are simple to score, rapid to analyze, and inexpensive to assess.
The phenotypic (PCV) and genotypic (GCV) coefficients of variation and heritability are essential biometric tools used to assess the genetic variability among genotypes (Singh and Chaudhary 1985; Kumar et al. 2013). Heritability governs both the potential for improvement through selection and the effect of the environment on the expression of traits (Robinson et al. 1949). According to Burton and De Vane (1953), combining GCV with broad-sense heritability (h²b) estimates produces a trustworthy measure of the projected level of potential improvement via selection.
Trait correlation determines how well genetic variability may be exploited by selection. Depending on the correlation, selection for one trait may either increase or decrease the expression of another (Bernardo 2010). Phenotypic correlation is the result of the interplay between genetic and environmental factors. Genetic correlation is the component used to direct breeding, since it is the only heritable part of the association between traits (Lynch and Walsh 1998).
Several statistical techniques can be used to estimate genetic variability and the levels of similarity/dissimilarity among genotypes (Mohammadi and Prasanna 2003). These analyses are extremely helpful for planning crosses, allocating lines to specific heterotic groupings, and precise identification with regard to plant variety preservation (Hallauer et al. 2010). Since data from morphological characterization are collected from a sizable dataset containing both qualitative and quantitative traits, multivariate analyses are highly suited to the classification and ranking of genotypes. Furthermore, genetic variability has been estimated using multivariate analyses (Mohammadi and Prasanna 2003). Principal component analysis (PCA) and cluster analysis are the best multivariate approaches for morphological characterization of genotypes (Mohammadi and Prasanna 2003; Reich et al. 2008).
This study was conducted to inform the development of TYLCV-tolerant lines. Twelve F9 tomato lines (TLs) were selected from a tomato breeding program and cultivated under TYLCV-infected conditions to evaluate their phenotypic TYLCD tolerance, vegetative growth, and fruit quality. Genetic variability, broad-sense heritability, and genotypic correlations were estimated, and the TLs were categorized and ranked using multivariate analyses in order to select the elite lines.

Tomato lines
Tomato cultivar 'TH99802' (Yassamen F1, Syngenta) is characterized by a determinate growth habit; vigorous vegetative growth; medium-early ripening; fruit firmness; spherical fruits with an average weight of 160-180 g; high tolerance to verticillium (V), fusarium (Fol 0-1), and Stemphylium (S/Ss); and moderate tolerance to TYLCD. The tomato cultivar 'TH99806' (Nirouz F1, Syngenta) is characterized by a determinate growth habit; vigorous vegetative growth; early ripening; heat tolerance; spherical fruits with an average weight of 130-150 g; tolerance to fruit cracking; high tolerance to tomato mosaic virus (ToMV 0-2), verticillium (V), and fusarium (Fol 1-2); and moderate tolerance to TYLCD. Based on these hybrids, a bulk selection program was organized for TYLCD resistance and high productivity at the Agricultural Experimental Station (AES), Faculty of Agriculture, Cairo University, Giza, Egypt (30°01′07′′N; 31°12′28′′E). Symptomless plants with vigorous vegetative growth, average fruit weight > 80 g, and high fruit set were selected from segregating generations during the fall seasons. Twelve F9 lines were selected and evaluated for TYLCD tolerance, vegetative growth, yield, and fruit quality traits during the 2018 and 2019 fall seasons. The evaluation was conducted at AES under TYLCD-infected field conditions, where the climate favors the widespread proliferation of viruliferous whiteflies (Fig. 1).

Planting and experimental design
Seeds of TLs were sown on 1 July 2018 and 2019 in seedling trays (209 cells) filled with a mixture of peat moss and vermiculite (1:1 by volume) enriched with macro- and microelements under greenhouse conditions. The greenhouse was covered with a black saran fabric with narrow holes to prevent the entry and exit of insects. Five-week-old seedlings were field-transplanted in a randomized complete block design (RCBD) (Singh and Chaudhary 1985) with three replicates. Each experimental unit (EU) consisted of two rows/line. Each row was 1 m wide and 3 m long. Plants were set 50 cm apart and subjected to common agricultural practices without applying insecticides.

TYLCV inoculation
Whiteflies flourish in Egypt from April through November, with a peak from August to October (Abd-Rabou and Evans 2020). Therefore, viral inoculation depended on natural infestation with viruliferous whiteflies in both the nursery and the field. To encourage TYLCV infection, highly symptomatic plants of the susceptible cultivar 'Castlerock', which carried an abundance of whiteflies, were grown in the seedling greenhouse. A row of 'Castlerock' plants was cultivated between EUs as a source of TYLCV infection and a guide to TYLCV symptom severity.

TYLCD tolerance
Phenotypic TYLCD tolerance was evaluated based on the severity of TYLCD symptoms in TLs 3 months after transplanting (3MAT) in both seasons. Symptom severity was assessed for individual plants of each line using a 1-5 scale as described by Mahmoud (2015): 1, no symptoms appear on the plant; 2, slight symptoms on plant top; 3, moderate symptoms; 4, severe symptoms on the entire plant; and 5, severe symptoms and plant stunting. The individual plant ratings were summed for each line and divided by the number of evaluated plants to calculate the TYLCD-mean score (TYLCD-MS).
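The mean score described above is a plain average of per-plant ratings. A minimal Python sketch follows; the function name and the ratings are hypothetical and are not data from this study.

```python
def tylcd_mean_score(ratings):
    """Average the per-plant symptom ratings (1-5 scale) of one line."""
    if not ratings:
        raise ValueError("at least one rated plant is required")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    # Sum of individual ratings divided by the number of evaluated plants
    return sum(ratings) / len(ratings)


# Hypothetical ratings for ten plants of one line (illustration only)
example_ratings = [1, 1, 2, 1, 1, 2, 1, 1, 1, 1]
example_score = tylcd_mean_score(example_ratings)
```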
TYLCV presence was confirmed by polymerase chain reaction (PCR) using the universal Begomovirus primers AVcore/ACcore (Brown et al. 2001) and the specific primers TYLC2C3F/TYLC2C3R (Table 1) for the species TYLCV, TYLCV-Mld, TYLCVMalv, TYLCSV-ES[2], and TYLCSV, which are found throughout the Mediterranean region and Middle East (Anfoka et al. 2008). Total genomic DNA was isolated using the CTAB method (Murray and Thompson 1980) from the young leaves of both symptomless TLs and severely symptomatic 'Castlerock' plants at 3MAT. Young leaves were sampled from five plants of each line. PCR cycle parameters are described in Table 1. All PCRs were performed in a programmable thermocycler (Mastercycler ep gradient 5, Eppendorf, Hamburg, Germany). The PCR products were resolved in 1.5% agarose gel in 1× Tris-acetate-EDTA buffer. DNA bands were visualized with ethidium bromide staining (0.5 μg mL⁻¹) and photographed under UV light using a gel documentation system (Bio-Rad® Gel Doc-2000). A 1-kb DNA ladder was used as the molecular weight size marker.
Fig. 1 Average monthly maximum and minimum temperatures and relative humidity during the period from July to December in the 2018 and 2019 fall seasons (https://power.larc.nasa.gov/data-access-viewer/)

Vegetative, yield, and fruit quality traits
Vegetative traits, i.e., plant length (PL), number of plant leaves (NPL), and the area of the fifth leaf from the apex (LA), were measured on five randomly selected plants for each EU at 3MAT, excluding plants from row edges. The LA was measured using the leaf weighing method, as described by Pandey and Singh (2011). Yield [early (EY): the first three harvests; total (TY): all collected fruits; and marketable (MY): all normal collected fruits] and fruit quality traits [average fruit weight (AFW); fruit firmness (FF); fruit shape index (FSI); total soluble solids content (TSS); and titratable acidity (TA)] were measured. Samples of 20 fully red-ripe fruits from each EU were harvested at the peak harvesting time, weighed to estimate AFW, and washed with distilled water to analyze fruit traits. FSI was calculated as the ratio between the polar and equatorial diameters of the fruit according to Yeager (1937), where FSI is > 1.2 in oval fruits, 0.95-1.2 in round fruits, and < 0.95 in oblate fruits. FF was determined using a food pressure tester (Force Gauge Model M4-200, Series 4; Mark-10 Corp., Copiague, NY, USA). Then, fruit extract was obtained by blending and filtering the flesh. TSS was determined using a hand refractometer. TA was determined by titration with 0.1 N NaOH solution using phenolphthalein as an indicator (AOAC 1990). The taste index (TI) and maturity (M) were also estimated as indications of tomato flavor and quality to assess consumer acceptance and distinguish between lines. The taste index [TI = (°Brix/(20 × TA)) + TA] and the maturity (M = °Brix/TA) were calculated using the equations described by Navez et al. (1999).
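The derived fruit indices above follow directly from the stated formulae. The Python sketch below computes FSI with its shape class, TI, and M; the function names and the input values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def fruit_shape(polar_diameter, equatorial_diameter):
    """Return (FSI, shape class) using the thresholds quoted from Yeager (1937)."""
    fsi = polar_diameter / equatorial_diameter
    if fsi > 1.2:
        shape = "oval"
    elif fsi >= 0.95:
        shape = "round"
    else:
        shape = "oblate"
    return fsi, shape


def taste_index(brix, ta):
    """TI = (Brix / (20 x TA)) + TA."""
    return brix / (20.0 * ta) + ta


def maturity_index(brix, ta):
    """M = Brix / TA."""
    return brix / ta


# Hypothetical fruit: 5.0 Brix, TA of 0.5, 50 mm polar and 55 mm equatorial diameter
example_ti = taste_index(5.0, 0.5)
example_m = maturity_index(5.0, 0.5)
example_fsi, example_shape = fruit_shape(50.0, 55.0)
```

With these inputs, TI is 5.0/(20 × 0.5) + 0.5 = 1.0, M is 10.0, and the fruit is classed as oblate because its FSI (about 0.91) falls below 0.95.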

Biometrical analyses
The genetic diversity of the evaluated TLs was estimated, and the lines were classified based on phenotypic traits, using the following statistical methods (Mohammadi and Prasanna 2003). An independent analysis of variance (ANOVA) was performed for each trait, with estimates of PCV, GCV, the phenotypic correlation coefficient (r_ph), the genotypic correlation coefficient (r_g), and h²b. Multivariate analyses were performed for all estimated traits using PCA and hierarchical cluster analysis (HCA).

ANOVA and ANCOVA
The collected phenotypic data were checked for normality using the Shapiro-Wilk test (Shapiro and Wilk 1965), and data for TYLCD-MS, EY, TY, AFW, FSI, TA, TI, and M were arcsine square root transformed (Wickens and Keppek 2004). ANOVA of the RCBD was performed for each season according to Wickens and Keppek (2004). When Bartlett's homogeneity test was nonsignificant, a combined ANOVA over the two seasons was also performed (Bartlett 1937). Significant differences between the combined means were determined using Duncan's multiple range test at a 5% probability level (Duncan 1955). A combined analysis of covariance (ANCOVA) was also performed for the RCBD over the two seasons to estimate genotypic and phenotypic correlations between traits (Wickens and Keppek 2004). The ANOVA, ANCOVA, and mean comparisons were conducted using MSTATc v.2.1 (Michigan State University, Michigan, USA; Freed et al. 1989).
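The homogeneity check that gates the combined ANOVA can be illustrated with a short Python sketch of Bartlett's test for two groups, such as two seasons. This is not the MSTATc procedure used in the study: the function name and data are hypothetical, and the test is applied here to raw samples as a simplifying assumption, whereas a combined analysis would typically compare the per-season error variances.

```python
import math
from statistics import variance


def bartlett_two_groups(sample_a, sample_b):
    """Bartlett's statistic and p-value for equality of two variances."""
    groups = [sample_a, sample_b]
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [variance(g) for g in groups]  # sample variances (n - 1)
    if any(v <= 0 for v in variances):
        raise ValueError("each group needs a positive variance")
    df_total = sum(dfs)
    pooled = sum(df * v for df, v in zip(dfs, variances)) / df_total
    numerator = df_total * math.log(pooled) - sum(
        df * math.log(v) for df, v in zip(dfs, variances)
    )
    correction = 1.0 + (sum(1.0 / df for df in dfs) - 1.0 / df_total) / (
        3.0 * (k - 1)
    )
    statistic = max(numerator / correction, 0.0)  # guard against rounding below 0
    # Two groups give 1 degree of freedom; the chi-square upper tail
    # for 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical plot values for two seasons (illustration only)
season_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
season_2 = [2.0, 3.0, 4.0, 5.0, 6.0]
stat_equal, p_equal = bartlett_two_groups(season_1, season_2)
```

A nonsignificant result (p above 0.05), as in this example where both groups have the same variance, is the condition under which pooling the two seasons is considered justified.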

Estimation of genotypic and phenotypic correlation coefficients
Covariances were computed in a similar manner as shown in Table 2 (Singh and Chaudhary 1985). These covariance components were substituted in the following formulae to compute r_g and r_ph (Johnson et al. 1955):

$$r_{g} = \frac{\mathrm{COV}_{g(x_1 x_2)}}{\sqrt{\delta^2_{g(x_1)}\,\delta^2_{g(x_2)}}}$$

where $\mathrm{COV}_{g(x_1 x_2)}$ is the genotypic covariance between a given pair of traits ($x_1$ and $x_2$), and $\delta^2_{g(x_1)}$ and $\delta^2_{g(x_2)}$ are the genotypic variances of $x_1$ and $x_2$, respectively.

$$r_{ph} = \frac{\mathrm{COV}_{ph(x_1 x_2)}}{\sqrt{\delta^2_{ph(x_1)}\,\delta^2_{ph(x_2)}}}$$

where $\mathrm{COV}_{ph(x_1 x_2)}$ is the phenotypic covariance between the two traits ($x_1$ and $x_2$), and $\delta^2_{ph(x_1)}$ and $\delta^2_{ph(x_2)}$ are the phenotypic variances of each trait. The significance of r_g and r_ph was tested according to Yassin (1973).

Table 1 (excerpt) Primer TYLC2C3R: TTA AAA GCT TAT GGA TTC ACG CAC AGG GGA AC
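The two correlation formulae share one form: a covariance component divided by the square root of the product of the matching variance components. The Python sketch below applies that form. The helper that derives a genotypic component from mean squares (or mean products) assumes the common expectation (MS_g − MS_gy)/(r × y) for a combined analysis over y seasons with r replicates; the expected mean squares in Table 2 are not reproduced here, so that helper, all names, and all numbers are assumptions for illustration.

```python
import math


def genotypic_component(ms_genotype, ms_interaction, replicates, years):
    """Assumed estimator: (MS_g - MS_gy) / (r * y); also usable for mean products."""
    return (ms_genotype - ms_interaction) / (replicates * years)


def correlation_from_components(covariance, variance_x1, variance_x2):
    """r = COV(x1x2) / sqrt(var(x1) * var(x2)), for genotypic or phenotypic parts."""
    if variance_x1 <= 0 or variance_x2 <= 0:
        raise ValueError("variance components must be positive")
    return covariance / math.sqrt(variance_x1 * variance_x2)


# Hypothetical mean squares and mean products (3 replicates, 2 seasons)
var_g_x1 = genotypic_component(60.0, 6.0, replicates=3, years=2)   # 9.0
var_g_x2 = genotypic_component(102.0, 6.0, replicates=3, years=2)  # 16.0
cov_g = genotypic_component(40.0, 4.0, replicates=3, years=2)      # 6.0
r_g_example = correlation_from_components(cov_g, var_g_x1, var_g_x2)
```

Here r_g is 6/√(9 × 16) = 0.5. The same function gives r_ph when phenotypic components are supplied.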

Multivariate analysis
The pooled data for all traits were standardized using Z-scores to avoid the effect of scale differences before the multivariate analysis. PCA with varimax rotation was applied (Sharma 1996). The latent root criterion (eigenvalue > 1) and parallel analysis were used to determine the number of statistically significant components (Johnson and Wichern 1988). A biplot of the first two PCs, which together explained a substantial percentage of the total variance, was created to assist in identifying the relationships among PCs and traits, PCs and lines, lines and their traits, and among the different traits (Yan and Rajcan 2002; Yan and Kang 2003). The correlation coefficient between any two traits can be estimated using the cosine of the angle between their vectors (Yan and Kang 2003). Two traits are positively correlated if the angle between their vectors is < 90°, negatively correlated if the angle is > 90°, and independent if the angle is exactly 90° (Yan and Rajcan 2002). HCA was performed to construct a dendrogram using the squared Euclidean distance and Ward's joining method. All multivariate statistical analyses were performed using IBM SPSS software version 26.0.0 (SPSS Inc., Chicago, IL, USA; O'Connor 2000) and XLSTAT software version 2019 (Addinsoft, Paris, France).

Table 2 shows the combined ANOVA results of an RCBD for the 2018 and 2019 fall seasons for the fourteen estimated traits of the twelve evaluated TLs. There were no significant differences between years (MS_y) for the estimated traits, except LA and AFW. These results demonstrated the environmental similarity between the two seasons and its influence on the evaluated traits (Falconer 1952). Meteorological information on minimum and maximum temperatures and air relative humidity (Fig. 1) supports this result. There were significant differences among genotypes [MS_g; P < 0.001 for all traits, except TI (P < 0.01) and M (P < 0.05)]. Genotype also accounted for the largest share (> 50%) of the total pooled variance for all traits, except TA (22.2%). These findings showed that the TLs had broad genetic variance, which may be used to improve TYLCD tolerance and yield-related traits (Schouten et al. 2019; Hassan et al. 2022). Mean squares due to the G × Y interaction (MS_gy) were nonsignificant for most of the estimated traits (9 out of 14 traits). Significant MS_gy values were seen for PL, TY, TA (P < 0.05), TSS (P < 0.01), and LA (P < 0.001). These findings indicated the genotypic stability of the TLs across both seasons (Singh 2001). Hence, emphasis was placed on determining the specific effects of genotypes and comparing their means for the estimated traits (Wickens and Keppek 2004).

Phenotypic performance
The phenotypic responses of the evaluated TLs under TYLCV infection for the TYLCD-mean score, vegetative, yield, and fruit quality traits are shown in Table 3.

TYLCD tolerance
TLs showed mild to moderate TYLCD symptoms at 3MAT (Table 3). TYLCD-MS ranged between 1.15 for TL-1 and 1.82 for TL-12 across both seasons. The evaluated TLs could be divided into three groups according to significant differences in their symptom severity (TYLCD-MS). The first group had the lowest TYLCD-MS (1.15-1.34) and was represented by TL-1 to TL-3 and TL-5 to TL-8; the second group was only represented by TL-10 (1.52); and the third group had the highest TYLCD-MS (1.69-1.82) and was represented by TL-9, TL-11, and TL-12 (Table 4).
The presence of TYLCV DNA was determined by a PCR assay in symptomless plants of the TLs, as compared with the susceptible cultivar 'Castlerock' (Fig. 2). Electrophoresis analysis revealed a single 550-bp fragment in all the DNA samples examined with the AVcore/ACcore universal primer pair (Fig. 2a). A 500-bp PCR product was produced in all DNA samples using the specific primer pair TYLC2C3F/TYLC2C3R (Fig. 2b). These findings indicate TYLCD tolerance in the TLs, whose plants harbor viral DNA (Fig. 2) but exhibit either mild or no symptoms (Table 3).

Vegetative traits
The results for vegetative growth traits (PL, NPL, and LA) of the TLs at 3MAT are shown in Table 3. PL ranged from 0.63 m for TL-4 to 1.49 m for TL-3. TL-2 and TL-3 were the tallest plants (1.49 and 1.43 m, respectively) across both seasons, followed by TL-12, TL-11, and TL-8 (1.36, 1.29, and 1.27 m, respectively) with no significant differences between them. The shortest plants were [...] 0.70 and 0.73 m, respectively, with no significant differences among them (Table 3).
According to the phenotypic evaluation results, lines TL-3, TL-5, and TL-8 had the highest TY, MY, and TYLCD tolerance, as well as acceptable fruit quality and AFW > 80 g, except TL-5, which had an AFW of approximately 71 g.

Table 3 notes
z Phenotype. TYLCD-MS: TYLCD-mean score; PL: plant length; NPL: number of plant leaves; LA: leaf area; EY: early plant yield; TY: total plant yield; MY: marketable plant yield; AFW: average fruit weight; FF: fruit firmness; FSI: fruit shape index; TSS: fruit TSS content; TA: fruit titratable acidity; TI: fruit taste index; and M: fruit maturity index
y Mean value ± standard error (season = 2 and replicate = 3). Mean values followed by a letter in common were not significantly different according to Duncan's multiple range test (p < 0.05)
x TL: The selected F9 lines from commercial tomato F1 hybrids TH99802 (TL-1 to TL-6) and TH99806 (TL-7 to TL-12)
w Data were transformed by the arcsine equation for statistical analysis
v TYLCD-mean scores: 1, symptomless; 2, slight; 3, moderate; 4, severe; and 5, very severe symptoms
All estimated traits had δ²ph > δ²g and PCV > GCV (Table 3), indicating additional environmental influences on trait expression (Singh 2001). However, δ²g represented a large proportion of δ²ph (Table 3), and the difference between the PCV and the corresponding GCV was negligible for all traits (PCV:GCV ratios ranged from 1.00 to 1.17), except TSS and TI (PCV:GCV ratios were 1.69 and 2.33, respectively; Table 3). Most traits are, therefore, fairly stable and highly heritable (Singh 2001). This was confirmed by the h²b estimates, which were very high (70.8-99.1%) for most traits, with the exception of TSS and TI (35.7 and 17.6%, respectively) (Johnson et al. 1955). Heritability indicates the relative degree to which a character is transmitted from parent to offspring. In improved tomato lines and cultivars, estimates of the h²b of TYLCD resistance/tolerance were high (Abdel-Ati et al. 2005; Mazyed et al. 2007; Hassan et al. 2022). The magnitude of such estimates also suggests the extent to which improvement is possible through selection. Estimating PCV and GCV together with h²b provides an accurate indicator of the heritable component of variance (Bello et al. 2012). Traits with high heritability can be improved by various phenotypic selection techniques (Singh 2001). The moderate heritability of TSS and TI suggests non-additive gene action in their control; therefore, more complex breeding methods may be needed to improve these two traits (Singh 2001). Phenotypic selection based on TYLCD tolerance (TYLCD-MS), vegetative growth (PL, NPL, and LA), plant yield (EY and MY), and fruit quality traits (FF, FSI, TA, and M) can therefore be trusted to select the best lines among the evaluated TLs.
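The variability parameters discussed above can be computed from the variance components with the standard definitions: GCV and PCV are the square roots of the genotypic and phenotypic variances expressed as a percentage of the trait mean, and h²b is the genotypic share of the phenotypic variance. The Python sketch below uses hypothetical names and inputs; it assumes these textbook definitions rather than reproducing the exact computation behind Table 3.

```python
import math


def variability_parameters(var_g, var_ph, grand_mean):
    """Return (GCV %, PCV %, broad-sense heritability %)."""
    if var_g < 0 or var_ph <= 0 or grand_mean <= 0:
        raise ValueError("expected var_g >= 0, var_ph > 0 and a positive mean")
    gcv = math.sqrt(var_g) / grand_mean * 100.0
    pcv = math.sqrt(var_ph) / grand_mean * 100.0
    h2b = var_g / var_ph * 100.0
    return gcv, pcv, h2b


# Hypothetical components for one trait (illustration only)
gcv_example, pcv_example, h2b_example = variability_parameters(
    var_g=9.0, var_ph=16.0, grand_mean=50.0
)
# PCV:GCV equals 1 / sqrt(h2b as a fraction), so the ratio alone implies h2b
ratio_example = pcv_example / gcv_example
```

In this example GCV is 6%, PCV is 8%, and h²b is 56.25%, giving a PCV:GCV ratio of about 1.33. Under the same definitions, a ratio of 1.69 implies h²b of roughly 35%, which is in line with the estimate reported for TSS.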

Phenotypic and genotypic correlation coefficients
Phenotypic (r_ph) and genotypic (r_g) correlation coefficients for all study traits are presented in Table 4. Successful exploitation of genetic diversity through selection is determined by the correlation of traits. Selection for a particular trait may either increase or decrease the expression of another trait, depending on how closely they are correlated (Bernardo 2010). For most of the estimated traits, r_g values were higher than r_ph values, indicating a moderately strong inherent relationship between traits (Bernardo 2010). Traits that showed significant r_g with each other generally also showed significant r_ph. Several significant phenotypic and genotypic correlations were detected. Here, we focus on correlations between TYLCD tolerance (TYLCD-MS), yield, and its components. TYLCD-MS showed positive genotypic and phenotypic correlations with FSI, but negative genotypic associations with TI and M. Positive r_g and r_ph values were found between EY and each of NPL, TY, and MY, but negative r_g and r_ph values were found between EY and FSI. Between EY and M, only the genotypic correlation was positive. TY showed positive r_g and r_ph values with each of NPL, EY, and MY, and a positive r_g value only with TA. MY showed significant and positive r_g and r_ph values with each of NPL, EY, and TY, and positive and negative correlations with TA and FSI, respectively. AFW had positive r_g and r_ph values with each of PL and NPL, as well as a positive r_g value with LA (Table 4).
In accordance with our findings, Zengin et al. (2020) reported a negative correlation between fruit weight and length and TYLCD resistance provided by the Ty-3a gene in F 3 families. Under TYLCD natural infection, direct selection of lines with the largest NPL will maximize the plant yield. When PL, NPL, and LA increase, the AFW (a component of plant yield) also increases (Table 4).

Multivariate analysis
PCA has been widely used to evaluate tomato genotypes, identify plant traits that have contributed most to the observed variance among genotypes, and select parental lines for breeding purposes (Merk et al. 2012; Chávez-Servia et al. 2018; Figás et al. 2018; Tembe et al. 2018; Jin et al. 2019; Tripodi et al. 2021). PCA reduced the dimensions of the 14 traits to four PCs (Table 5), which represented 88.34% of the total variance based on Kaiser's criterion (eigenvalue > 1; Fig. 3) (Johnson and Wichern 1988). According to Brejda et al. (2000), components that explain at least 10% of the variance are considered the best representation of system attributes. All estimated traits had a significant effect on the first four PCs. PC1 accounted for 42.93% of the total variation (Fig. 3) and was positively influenced by NPL, PL, TI, TSS, EY, MY, and TY and negatively influenced by TYLCD-MS (Table 5). PC2 was positively correlated with FF, FSI, and LA, accounting for 19.13% of the total variation. PC3 accounted for 16.75% of the total variation and was positively correlated with TA and negatively correlated with M. PC4 was positively correlated with AFW, accounting for 9.54% of the total variation. These findings indicate that traits representing TYLCD tolerance, vegetative growth, yield, and fruit quality can be used to create groups (Reich et al. 2008). These findings are consistent with those of Shteinberg et al. (2021), who found that PCA revealed a strong separation between tomato genotypes based on TYLCD tolerance. Other studies have also reported that some vegetative growth, yield, and fruit quality traits significantly differed among evaluated tomato genotypes, according to PCA findings (Agong 2001; Glogovac et al. 2012; Merk et al. 2012; Chernet et al. 2014; Iqbal et al. 2014; Figás et al. 2018; Tembe et al. 2018; Jin et al. 2019; Sehgal et al. 2021; Islam et al. 2022).
Future evaluations may be based on fewer traits with little information loss, which could reduce the labor, time, and money required to discriminate and define different genotypes (Reich et al. 2008).
The first two PCs accounted for 11 of the 14 estimated traits and explained a high percentage (62.06%) of the variance (Table 5). Therefore, a biplot of the first two PCs (Fig. 4) was created to assist in identifying the relationships among PCs and each of the traits and lines (Yan and Rajcan 2002; Yan and Kang 2003). TYLCD-MS, vegetative growth, yield, and fruit quality traits are represented by vectors. Vectors that run parallel in the same direction reveal a strong positive correlation between traits, vectors at roughly right angles show only a slight correlation, and vectors at or near 180° to each other demonstrate a strong negative correlation. The PCA biplot divided the estimated traits into three groups. The first group included PL, LA, AFW, FF, TSS, TI, and M, which were positively correlated with the first two PCs. The second group included NPL, EY, TY, and MY, which were positively correlated with PC1 and negatively correlated with PC2. The third group included TYLCD-MS, FSI, and TA, which were negatively correlated with PC1 and positively correlated with PC2. TYLCD-MS was positively correlated with FSI and TA and negatively correlated with EY, TY, MY, and NPL. The biplot correlations corroborate those obtained from the r_g and r_ph estimates (Table 4). TA, M, and AFW contributed less to the total genetic diversity (short arrows) (Fig. 4). TLs were distributed across all quadrants of the PC1 and PC2 biplot (Fig. 4), indicating a high level of genetic diversity within and among groups (Reich et al. 2008). Lines with higher values of a given trait were plotted closer to the vector line and further in the direction of that particular vector, often on the vertices of the convex hull. Thus, TL-3, TL-5, and TL-8 were considered superior for most evaluated traits. TL-1, TL-2, TL-11, and TL-12 were closest to the biplot origin, indicating that these lines have the least variability for the estimated traits (Fig. 4).
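The angle rule used to read the biplot rests on a simple identity: for mean-centered trait vectors, the cosine of the angle between them equals their Pearson correlation. In a two-component biplot the angle is only an approximation of that correlation, and its quality depends on how much variance the plotted components retain. The Python sketch below demonstrates the identity in the full trait space; the function names and line means are hypothetical.

```python
import math


def centered(values):
    """Subtract the mean so the vector starts at the origin."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def angle_between_traits(trait_a, trait_b):
    """Angle in degrees between two mean-centered trait vectors."""
    u, v = centered(trait_a), centered(trait_b)
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("a trait with no variation has no direction")
    cosine = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    cosine = max(-1.0, min(1.0, cosine))  # clamp rounding error
    return math.degrees(math.acos(cosine))


def describe(angle):
    """Apply the < 90 / > 90 degree reading rule."""
    if angle < 90.0:
        return "positive"
    if angle > 90.0:
        return "negative"
    return "independent"


# Hypothetical line means: symptom score versus total yield (illustration only)
symptom_score = [1.2, 1.3, 1.5, 1.7, 1.8]
total_yield = [3.0, 2.8, 2.5, 2.1, 2.0]
example_angle = angle_between_traits(symptom_score, total_yield)
example_relation = describe(example_angle)
```

In this hypothetical case yield falls as the symptom score rises, so the angle exceeds 90° and the relation is read as negative.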
Tomato lines were divided into three clusters using HCA based on the TYLCD-MS, vegetative growth, yield, and fruit quality traits, as illustrated in Fig. 5 and Table 6. This result suggests significant genetic diversity within and among clusters (Table 6). HCA is particularly effective at characterizing tomato genotypes with the greatest degree of similarity/dissimilarity based on morphological traits (Iqbal et al. 2014; Chernet et al. 2014; Bhattarai et al. 2016; Chávez-Servia et al. 2018; Hussain et al. 2018; Tembe et al. 2018; Grozeva et al. 2021; Ene et al. 2022). Cluster 1 comprised TL-1, TL-3, and TL-8 (Fig. 5), which had low TYLCD-MS (high tolerance), FF, FSI, and M; moderate LA; and high PL, NPL, EY, TY, MY, AFW, TSS, and TA (Table 6). Six lines made up Cluster 2, which was separated into two groups. The first group included TL-2, TL-6, TL-7, and TL-9, while the second group comprised TL-4 and TL-5 (Fig. 5). This cluster had the lowest PL, NPL, LA, EY, TY, MY, AFW, and FF values; moderate TYLCD-MS (moderate tolerance) and FSI values; and the highest TI and M (Table 6). Cluster 3 comprised TL-10, TL-11, and TL-12 (Fig. 5) and had low TA and TI values; moderate PL, NPL, EY, TY, MY, AFW, FF, TSS, and M; and high TYLCD-MS (low tolerance) and FSI (Table 6). Variance within the clusters ranged from 15.36 (Cluster 3) to 87.53 (Cluster 2) (Table 6). The maximum distance from the centroid was largest in Cluster 2 (10.61) and smallest in Cluster 3 (1.53) (Table 6). The distance between cluster centroids ranged from 18.56 to 42.99 (Table 7).
The lowest distance was found between Cluster 2 and Cluster 3 (18.56), and the highest was found between Cluster 1 and Cluster 2 (42.99) (Table 7). Cluster 1 performed better in terms of TYLCD tolerance, vegetative growth, yield, and fruit quality relative to Clusters 2 and 3 and to the population means (Table 6). These results imply that this particular cluster would be more responsive to selection than the others, assuming that TYLCD tolerance and fruit yield are the target traits. Selected lines from different clusters, notably Clusters 1 and 2, should be crossed to create custom cultivars/hybrids with beneficial TYLCD tolerance and yield traits.

Table 6 Mean and different statistics of the three-cluster analysis for twelve advanced tomato lines based on fourteen phenotypic traits
z TYLCD-MS: TYLCD-mean score; PL: plant length; NPL: number of plant leaves; LA: leaf area; EY: early plant yield; TY: total plant yield; MY: marketable plant yield; AFW: average fruit weight; FF: fruit firmness; FSI: fruit shape index; TSS: fruit TSS content; TA: fruit titratable acidity; TI: fruit taste index; and M: fruit maturity index
y TYLCD-mean scores: 1, symptomless; 2, slight; 3, moderate; 4, severe; and 5, very severe symptoms
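The clustering step described above (squared Euclidean distance with Ward's joining method) can be sketched in a few lines of Python. This is an illustrative implementation, not the SPSS/XLSTAT routine used in the study: at each step it merges the two clusters whose union gives the smallest increase in within-cluster sum of squares, which for clusters of sizes n_a and n_b equals n_a·n_b/(n_a + n_b) times the squared Euclidean distance between their centroids. The function names and the toy data are hypothetical, and the input is assumed to be already standardized.

```python
def squared_euclidean(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return [sum(column) / len(points) for column in zip(*points)]


def ward_clusters(points, n_clusters):
    """Agglomerate points with Ward's criterion until n_clusters remain."""
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and the number of points")
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a = [points[k] for k in clusters[i]]
                b = [points[k] for k in clusters[j]]
                # Increase in within-cluster sum of squares if a and b merge
                cost = (
                    len(a) * len(b) / (len(a) + len(b))
                    * squared_euclidean(centroid(a), centroid(b))
                )
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical standardized scores for six lines on two traits (illustration only)
toy_lines = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0], [20.0, 0.0], [21.0, 0.0]]
toy_clusters = ward_clusters(toy_lines, 3)
```

For the toy data the three tight pairs are recovered as separate clusters. Distances reported by statistical packages may be scaled differently, so values from a sketch like this are not directly comparable with Tables 6 and 7.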

Conclusion
Tomato lines in Clusters 1 and 2 show promise for TYLCD tolerance and economically important traits. TLs in these clusters contain useful breeding material, which could be used as parental genotypes or pre-breeding material for the development of future varieties for TYLCD tolerance, vigorous vegetative growth, productivity, unique fruit shape and size, and flavor desirable for different markets. TYLCD tolerance and productivity can be improved by crossing lines from Clusters 1 and 2.

Author contributions
Both authors participated in the study's conception and design, material preparation, data collection and analysis, and the writing of the first draft of the manuscript.

Funding
Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB). No funding.

Data availability
Data generated or analyzed during this study are provided in the manuscript.

Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.