1 Introduction

Tomato (Solanum lycopersicum L.) is an important vegetable crop. Egypt is the fifth largest producer worldwide, with 6,245,787 tons produced in 2021 from 357,259.4 feddans, averaging 17.5 tons fed−1 (http://faostat.fao.org). Tomato yellow leaf curl disease (TYLCD) is one of the world’s most devastating diseases for tomato producers (Lefeuvre et al. 2010; Moriones and Navas-Castillo 2000). TYLCD symptoms include yellowing, curling, and cupping of leaves, severe stunting, and abortion of flowers and fruits, which can result in yield reductions of up to 100% (Abhary et al. 2007). Tomato yellow leaf curl virus (TYLCV) is the causative agent of TYLCD and belongs to the genus Begomovirus in the family Geminiviridae. TYLCV is a monopartite DNA virus with a circular genome that contains six genes, with two genes on the viral strand (V1 and V2) and four genes on the complementary sense strand (C1 to C4) (Abhary et al. 2007; Gronenborn 2007). TYLCV is exclusively transmitted in a persistent-circulative manner by the sweet potato whitefly, Bemisia tabaci (Genn. 1889) (Homoptera; Aleyrodidae) (Gronenborn 2007). B. tabaci is an invasive pest found in over 175 countries worldwide (Ramos et al. 2018). The virus has spread as a result of the tremendous global proliferation of the whitefly. The number of tomato-producing countries reporting TYLCD outbreaks has increased to as many as 70 since the disease was discovered in the Jordan Valley in the 1930s (Mabvakure et al. 2016).

TYLCD control is a laborious, costly, and difficult task. The primary TYLCD control strategy is the use of insecticides to eradicate the B. tabaci viral vector (Lapidot et al. 2014). However, insecticide treatment can be expensive and environmentally risky. Furthermore, because B. tabaci develops resistance to insecticides, they can become ineffective (Palumbo et al. 2001). The simplest, safest, most practical, and most ecologically friendly approach to controlling TYLCD, minimizing yield losses, and reducing viral transmission is to use TYLCD-resistant cultivars. As a result, TYLCD resistance has become one of the most important goals in tomato breeding.

Initially, the cultivated tomato lacked TYLCD resistance (Hassan et al. 2009). Various accessions of wild tomato species were found to be TYLCD resistant (Ji et al. 2007; Hassan et al. 2009; Yan et al. 2018). TYLCD resistance has been successfully introgressed into the cultivated tomato from resistant wild relatives including S. chilense (Dunal) Reiche, S. habrochaites S. Knapp & D.M Spooner, S. pennellii Correll, S. peruvianum L., and S. pimpinellifolium L. (Vidavski 2007; Singh et al. 2019). As a result, several TYLCD-tolerant cultivars/F1 hybrids have been released for commercial cultivation by global seed companies (Vidavski 2007; Vidavski et al. 2008; Dhaliwal et al. 2020).

The continuous development of new tomato lines/cultivars is important to improve TYLCD resistance/tolerance and productivity and to overcome virulence genes developed by the virus. The selection and development of elite TYLCD-resistant/tolerant lines require a thorough understanding of the genetic variability and relationships among lines and between traits of interest (Lynch and Walsh 1998; Bernardo 2010). Plant breeders use morphological traits to estimate genetic variability because they are simple to score, rapid to analyze, and inexpensive to assess.

The phenotypic (PCV) and genotypic (GCV) coefficients of variation and heritability are essential biometric tools used to assess the genetic variability among the genotypes (Singh and Chaudhary 1985; Kumar et al. 2013). Heritability governs both the potential for improvement through selection and the effect of the environment on the expression of traits (Robinson et al. 1949). According to Burton and De Vane (1953), combining GCV with h2b estimates produces a trustworthy measure of the projected level of potential improvement via selection.

Trait correlation determines how well genetic variability may be exploited by selection. Depending on the correlation, selection for one trait may either increase or decrease the expression of another (Bernardo 2010). Phenotypic correlation is the result of the interplay between genetic and environmental factors. Only the genetic correlation is used to direct breeding, because it is the only heritable component of the association between traits (Lynch and Walsh 1998).

Several statistical techniques can be used to estimate genetic variability and the levels of similarity/dissimilarity among genotypes (Mohammadi and Prasanna 2003). These analyses are extremely helpful for planning crosses, allocating lines to specific heterotic groups, and precise identification with regard to plant variety preservation (Hallauer et al. 2010). Since data from morphological characterization are collected from a sizable dataset containing both qualitative and quantitative traits, multivariate analyses are highly suited to the classification and ranking of genotypes. Furthermore, genetic variability has been estimated using multivariate analyses (Mohammadi and Prasanna 2003). Principal component analysis (PCA) and cluster analysis are the best multivariate approaches for morphological characterization of genotypes (Mohammadi and Prasanna 2003; Reich et al. 2008).

This study was conducted to inform the development of TYLCV-tolerant lines. Twelve F9 tomato lines (TLs) were selected from a tomato breeding program and cultivated under TYLCV-infected conditions to evaluate their phenotypic TYLCD tolerance, vegetative growth, and fruit quality. Genetic variability, broad-sense heritability, and genotypic correlations were estimated, and the TLs were categorized and ranked using multivariate analyses in order to select the elite lines.

2 Materials and methods

2.1 Tomato lines

Tomato cultivar ‘TH99802’ (Yassamen F1, Syngenta) is characterized by a determinate growth habit; vigorous vegetative growth; medium-early ripening; fruit firmness; spherical fruits with an average weight of 160–180 g; high tolerance to verticillium (V), fusarium (Fol 0–1), and Stemphylium (S/Ss); and moderate tolerance to TYLCD. The tomato cultivar ‘TH99806’ (Nirouz F1, Syngenta) is characterized by a determinate growth habit; vigorous vegetative growth; early ripening; heat tolerance; spherical fruits with an average weight of 130–150 g; tolerance to fruit cracking; high tolerance to tobacco mosaic virus (ToMV0-2), verticillium (V), and fusarium (Fol1-2); and moderate tolerance to TYLCD. Based on these hybrids, a bulk selection program for TYLCD resistance and high productivity was organized at the Agricultural Experimental Station (AES), Faculty of Agriculture, Cairo University, Giza, Egypt (30°01′07′′N; 31°12′28′′E). Symptomless plants with vigorous vegetative growth, average fruit weight > 80 g, and high fruit set were selected from segregating generations during the fall seasons. Twelve F9 lines were selected and evaluated for TYLCD tolerance, vegetative growth, yield, and fruit quality traits during the 2018 and 2019 fall seasons. The evaluation was conducted at AES under field TYLCD infection conditions, where the climate favors the proliferation of viruliferous whiteflies (Fig. 1).

Fig. 1

Average monthly maximum and minimum temperatures and relative humidity during the period from July to December in the 2018 and 2019 fall seasons (https://power.larc.nasa.gov/data-access-viewer/)

2.2 Planting and experimental design

Seeds of TLs were sown on 1 July 2018 and 2019 in seedling trays (209 cells) filled with a mixture of peat moss and vermiculite (1:1 by volume) enriched with macro- and microelements under greenhouse conditions. The greenhouse was covered with a black saran fabric with narrow holes to prevent the entry and exit of insects. Five-week-old seedlings were field-transplanted in a randomized complete block design (RCBD) (Singh and Chaudhary 1985) with three replicates. Each experimental unit (EU) consisted of two rows/line. Each row was 1 m wide and 3 m long. Plants were set 50 cm apart and subjected to common agricultural practices without applying insecticides.

2.3 TYLCV inoculation

Whiteflies flourish in Egypt from April through November, with a peak from August to October (Abd-Rabou and Evans 2020). Therefore, viral inoculation depended on natural infestation with viruliferous whiteflies in both the nursery and field. To encourage TYLCV infection, highly symptomatic plants of the susceptible cultivar ‘Castlerock’, which carried an abundance of whiteflies, were grown in the seedling greenhouse. A row of ‘Castlerock’ plants was cultivated between EUs as a source of TYLCV infection and a guide to TYLCV symptom severity.

2.4 TYLCD tolerance

Phenotypic TYLCD tolerance was evaluated based on the severity of TYLCD symptoms in TLs 3 months after transplanting (3MAT) in both seasons. Symptom severity was assessed for individual plants of each line using a 1–5 scale as described by Mahmoud (2015): 1, no symptoms appear on the plant; 2, slight symptoms on plant top; 3, moderate symptoms; 4, severe symptoms on the entire plant; and 5, severe symptoms and plant stunting. The individual plant ratings were summed for each line and divided by the number of evaluated plants to calculate the TYLCD-mean score (TYLCD-MS).
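The TYLCD-MS calculation described above can be sketched as follows. This is a minimal illustration only: the function name and the example ratings are hypothetical and are not data from the study.

```python
# Severity scale from the text (Mahmoud 2015):
# 1 = no symptoms ... 5 = severe symptoms and plant stunting.
VALID_SCORES = {1, 2, 3, 4, 5}


def tylcd_mean_score(plant_scores):
    """Return TYLCD-MS: sum of individual plant ratings / number of rated plants."""
    if not plant_scores:
        raise ValueError("at least one plant rating is required")
    if any(score not in VALID_SCORES for score in plant_scores):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(plant_scores) / len(plant_scores)


# Hypothetical ratings for ten plants of one line (illustrative only).
example_scores = [1, 1, 2, 1, 1, 2, 1, 1, 1, 1]
print(round(tylcd_mean_score(example_scores), 2))
```

A line whose plants are mostly symptomless therefore obtains a score close to 1, while a fully stunted line approaches 5.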

TYLCV presence was confirmed by polymerase chain reaction (PCR) using universal Begomovirus primers AVcore/ACcore (Brown et al. 2001) and the specific primers TYLC2C3F/TYLC2C3R (Table 1) for species TYLCV, TYLCV-Mld, TYLCVMalv, TYLCSV-ES[2], and TYLCSV, which are found throughout the Mediterranean region and Middle East (Anfoka et al. 2008). Total genomic DNA was isolated using the CTAB method (Murray and Thompson 1980) from the young leaves of both symptomless TLs and severely symptomatic ‘Castlerock’ plants at 3MAT. Young leaves were sampled from five plants of each line. PCR cycle parameters are described in Table 1. All PCRs were performed in a programmable thermocycler (Mastercycler ep gradient 5, Eppendorf, Hamburg, Germany). The PCR products were resolved in 1.5% agarose gel in 1 × Tris–acetate-EDTA buffer. DNA bands were visualized with ethidium bromide staining (0.5 μg mL−1) and photographed under UV light using a gel documentation system (Bio-Rad® Gel Doc-2000). A 1-kb DNA ladder was used as the molecular weight size marker.

Table 1 Primers used to detect TYLCV-DNA

Vegetative, yield, and fruit quality traits

Vegetative traits, i.e., plant length (PL), number of plant leaves (NPL), and the area of the fifth leaf from the apex (LA), were measured on five randomly selected plants for each EU at 3MAT, excluding plants from row edges. The LA was measured using the leaf weighing method, as described by Pandey and Singh (2011). Yield [early (EY): the first three harvests; total (TY): all collected fruits; and marketable (MY): all normal collected fruits] and fruit quality traits [average fruit weight (AFW); fruit firmness (FF); fruit shape index (FSI); total soluble solids content (TSS); and titratable acidity (TA)] were measured. Samples of 20 fully red-ripe fruits from each EU were harvested at the peak harvesting time, weighed to estimate AFW, and washed with distilled water to analyze fruit traits. FSI was calculated as the ratio between the polar and equatorial diameters of fruit according to Yeager (1937), where FSI is > 1.2 in oval fruits, 0.95–1.2 in round fruits, and < 0.95 in oblate fruits. FF was determined using a food pressure tester (Force Gauge Model M4-200-Series 4; Mark-10 Corp., Copiague, NY, USA). Then, fruit extract was obtained by blending and filtering the flesh. TSS was determined using a hand refractometer. TA was determined by titration with 0.1 N NaOH solution using phenolphthalein as an indicator (AOAC 1990). The taste index (TI) and maturity (M) were also estimated as indications of tomato flavor and quality to assess consumer acceptance and distinguish between lines. The taste index [TI = (°Brix/(20 × TA)) + TA] and the maturity (M = °Brix/TA) were calculated using equations described by Navez et al. (1999).
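The three derived fruit indices follow directly from the definitions above and can be sketched as below. The function names and the example measurements are hypothetical; only the formulas and the FSI class limits come from the text.

```python
def fruit_shape(polar_diameter, equatorial_diameter):
    """Return (FSI, shape class) using the limits quoted from Yeager (1937)."""
    fsi = polar_diameter / equatorial_diameter
    if fsi > 1.2:
        shape = "oval"
    elif fsi >= 0.95:
        shape = "round"
    else:
        shape = "oblate"
    return fsi, shape


def taste_index(brix, ta):
    """TI = (Brix / (20 * TA)) + TA, as given in the text (Navez et al. 1999)."""
    return brix / (20 * ta) + ta


def maturity(brix, ta):
    """M = Brix / TA, as given in the text (Navez et al. 1999)."""
    return brix / ta


# Hypothetical fruit: 5.0 degrees Brix, TA of 0.4, diameters in mm (illustrative only).
print(fruit_shape(65.0, 50.0))
print(round(taste_index(5.0, 0.4), 3), round(maturity(5.0, 0.4), 2))
```

Because TA appears in both the denominator and the additive term of TI, a fruit with low acidity can still reach a high TI if its TSS is high, which is why TI and M are reported alongside TSS and TA rather than in place of them.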

2.5 Biometrical analyses

The genetic diversity of the evaluated TLs was estimated, and the lines were classified based on phenotypic traits, using the following statistical methods (Mohammadi and Prasanna 2003). An independent analysis of variance (ANOVA) was performed for each trait, and PCV, GCV, the phenotypic correlation coefficient (rph), the genotypic correlation coefficient (rg), and h2b were estimated. Multivariate analyses were performed for all estimated traits using PCA and hierarchical cluster analysis (HCA).

ANOVA and ANCOVA

The collected phenotypic data were checked for normality using the Shapiro–Wilk test (Shapiro and Wilk 1965), and data for TYLCD-MS, EY, TY, AFW, FSI, TA, TI, and M were arcsine square-root transformed (Wickens and Keppek 2004). ANOVA of the RCBD was performed for each season according to Wickens and Keppek (2004). When Bartlett’s homogeneity test was nonsignificant, a combined ANOVA over the two seasons was also performed (Bartlett 1937). Significant differences between the combined means were determined using Duncan’s multiple range test at a 5% probability level (Duncan 1955). A combined analysis of covariance (ANCOVA) was also performed for the RCBD over the two seasons to estimate genotypic and phenotypic correlations between traits (Wickens and Keppek 2004). The ANOVA, ANCOVA, and mean comparisons were conducted using MSTATc v.2.1 (Michigan State University, Michigan, USA; Freed et al. 1989).

Estimation of phenotypic and genotypic variability and heritability

The components of variance attributable to differences among TLs were estimated from the mean squares (Supplementary Table 1; Singh and Chaudhary 1985). The genotypic (δ2g), phenotypic (δ2ph), genotype × year (δ2gy), and pooled error (δ2e) variance components were calculated according to Tessema et al. (2022) (Supplementary Table 1). The GCV and PCV were estimated according to Burton (1952) as follows: GCV = \((\sqrt{{\delta }_{\mathrm{g}}^{2}}/\overline{x })\times 100\) and PCV = \((\sqrt{{\delta }_{\mathrm{ph}}^{2}}/\overline{x })\times 100\), where \(\overline{x }\) = grand mean of the trait. The GCV and PCV are classified as low (< 10%), moderate (10–20%), and high (> 20%) as suggested by Johnson et al. (1955). Broad sense heritability (h2b) for a trait was estimated according to Johnson et al. (1955), as follows: h2b = (δ2g/δ2ph) × 100. The h2b is classified as low (< 30%), moderate (30–60%), and high (> 60%).
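The chain from mean squares to GCV, PCV, and h2b can be sketched as below. Supplementary Table 1 is not reproduced here, so the variance-component estimators are an assumption: they are the expected-mean-square solutions commonly used for an RCBD combined over years (r replicates, y years), and may differ in detail from those of Tessema et al. (2022). The mean squares and grand mean in the example are hypothetical.

```python
from math import sqrt


def variance_components(ms_g, ms_gy, ms_e, n_reps=3, n_years=2):
    """Assumed estimators for a combined RCBD (not confirmed by the source)."""
    var_e = ms_e                                            # pooled error
    var_gy = max((ms_gy - ms_e) / n_reps, 0.0)              # genotype x year
    var_g = max((ms_g - ms_gy) / (n_reps * n_years), 0.0)   # genotypic
    var_ph = var_g + var_gy / n_years + var_e / (n_reps * n_years)
    return var_g, var_gy, var_e, var_ph


def coefficient_of_variation(variance, grand_mean):
    """GCV or PCV (%) = sqrt(variance) / grand mean * 100 (Burton 1952)."""
    return sqrt(variance) / grand_mean * 100


def heritability(var_g, var_ph):
    """Broad-sense heritability (%) = var_g / var_ph * 100."""
    return var_g / var_ph * 100


def classify(value, moderate_from, high_above):
    """Three-level class; limits 10/20 for GCV and PCV, 30/60 for h2b."""
    if value < moderate_from:
        return "low"
    if value <= high_above:
        return "moderate"
    return "high"


# Hypothetical mean squares and grand mean for one trait (illustrative only).
var_g, var_gy, var_e, var_ph = variance_components(0.30, 0.03, 0.02)
gcv = coefficient_of_variation(var_g, 2.0)
pcv = coefficient_of_variation(var_ph, 2.0)
h2b = heritability(var_g, var_ph)
print(round(gcv, 2), classify(gcv, 10, 20))
print(round(pcv, 2), classify(pcv, 10, 20))
print(round(h2b, 1), classify(h2b, 30, 60))
```

Since both coefficients share the same grand mean, h2b equals (GCV/PCV)² × 100, so a PCV:GCV ratio close to 1 implies high heritability.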

Estimation of genotypic and phenotypic correlation coefficients

Covariances were computed in a similar manner as shown in Table 2 (Singh and Chaudhary 1985). These covariance components were substituted in the following formulae to compute rg and rph (Johnson et al. 1955): rg = \({\mathrm{COV}}_{g(x1x2)}/\sqrt{{\delta }_{g(x1)}^{2}{\delta }_{g(x2)}^{2}}\), where \({\mathrm{COV}}_{g(x1x2)}\) is the genotypic covariance between a given pair of traits (x1 and x2) and \({\delta }_{g(x1)}^{2}\) and \({\delta }_{g(x2)}^{2}\) are the genotypic variances of x1 and x2, respectively. rph = \({\mathrm{COV}}_{\mathrm{ph}(x1x2)}/\sqrt{{\delta }_{\mathrm{ph}(x1)}^{2}{\delta }_{\mathrm{ph}(x2)}^{2}}\), where \({\mathrm{COV}}_{\mathrm{ph}(x1x2)}\) is the phenotypic covariance between two traits (x1 and x2) and \({\delta }_{\mathrm{ph}(x1)}^{2}\) and \({\delta }_{\mathrm{ph}(x2)}^{2}\) are the phenotypic variances of x1 and x2, respectively. The significance of rg and rph was tested according to Yassin (1973).
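As a small worked illustration of the two formulae, the sketch below computes a correlation coefficient from a covariance component and the two matching variance components. The same function serves for rg (genotypic components) and rph (phenotypic components). The function name and the numerical values are hypothetical.

```python
from math import sqrt


def correlation_from_components(cov_x1x2, var_x1, var_x2):
    """r = COV(x1, x2) / sqrt(var(x1) * var(x2))."""
    if var_x1 <= 0 or var_x2 <= 0:
        # A zero or negative variance estimate leaves the coefficient undefined.
        raise ValueError("variance components must be positive")
    return cov_x1x2 / sqrt(var_x1 * var_x2)


# Hypothetical genotypic components for two traits (illustrative only).
r_g = correlation_from_components(cov_x1x2=0.03, var_x1=0.045, var_x2=0.08)
print(round(r_g, 2))
```

Because the components are estimated from mean squares and mean cross-products rather than computed from raw observations, an estimated rg can fall slightly outside the range −1 to 1 when a genotypic variance is small.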

Table 2 The combined analysis of variance (ANOVA) for the estimated fourteen traits of twelve tomato lines over the 2018 and 2019 fall seasons under field TYLCD infection conditions

Multivariate analysis

The pooled data for all traits were standardized using Z-scores to avoid the effect of scale differences before the multivariate analysis. PCA with varimax rotation was applied (Sharma 1996). The latent root criterion (eigenvalue > 1) and parallel analysis were used to determine the number of statistically significant components (Johnson and Wichern 1988). A biplot of the top two PCs, which explained a significant percentage of the total variance, was created to assist in identifying the relationships among PCs and traits, PCs and lines, lines and their traits, and among the different traits (Yan and Rajcan 2002; Yan and Kang 2003). The correlation coefficient between any two traits can be approximated by the cosine of the angle between their vectors (Yan and Kang 2003). The traits are positively correlated if the angle between their vectors is < 90°, negatively correlated if the angle is > 90°, and independent if the angle is exactly 90° (Yan and Rajcan 2002). HCA was performed to construct a dendrogram using the squared Euclidean distance and Ward’s joining method. All multivariate statistical analyses were performed using IBM SPSS software version 26.0.0 (SPSS Inc., Chicago, IL, USA; O’Connor 2000) and XLSTAT software version 2019 (Addinsoft, Paris, France).
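The standardization and latent root steps can be sketched with NumPy as below. This is not the SPSS/XLSTAT workflow used in the study: it omits the varimax rotation, the parallel analysis, and the HCA, and it runs on synthetic random numbers standing in for a 12-line × 14-trait table of means. All names are hypothetical.

```python
import numpy as np


def pca_on_zscores(data):
    """Unrotated PCA of Z-scored traits; data has shape (n_lines, n_traits)."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)  # Z-scores
    corr = (z.T @ z) / (z.shape[0] - 1)            # trait correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)        # returned in ascending order
    order = np.argsort(eigvals)[::-1]              # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum() * 100      # % of total variance per PC
    n_keep = int((eigvals > 1.0).sum())            # latent root criterion
    # Loadings: trait-PC correlations (clip guards against tiny negative values).
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    scores = z @ eigvecs                           # coordinates of each line
    return eigvals, explained, n_keep, loadings, scores


# Synthetic stand-in data (NOT the study's measurements).
rng = np.random.default_rng(0)
synthetic = rng.normal(size=(12, 14))
eigvals, explained, n_keep, loadings, scores = pca_on_zscores(synthetic)
print("PCs with eigenvalue > 1:", n_keep)
print("Cumulative % variance:", np.round(np.cumsum(explained)[:n_keep], 2))
```

With standardized traits the eigenvalues sum to the number of traits, so an eigenvalue above 1 marks a component that explains more variance than any single original trait, which is the rationale for the latent root criterion.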

3 Results and discussion

3.1 ANOVA

Table 2 shows the combined ANOVA results of the RCBD for the 2018 and 2019 fall seasons for the estimated fourteen traits of the twelve evaluated TLs. There were no significant differences between years (MSy) for the estimated traits, except LA and AFW. These results demonstrated the environmental similarities between the two seasons and their influence on the evaluated traits (Falconer 1952). Meteorological information on minimum and maximum temperatures and air relative humidity (Fig. 1) supports this result. There were significant differences among genotypes [MSg; P < 0.001 for all traits, except TI (P < 0.01) and M (P < 0.05)]. Also, genotype accounted for the largest share (> 50%) of the total pooled variance for all traits, except TA (22.2%). These findings showed that TLs had broad genetic variance, which may improve TYLCD tolerance and yield-related traits (Schouten et al. 2019; Hassan et al. 2022). Mean squares due to G × Y interaction (MSgy) were nonsignificant for most of the estimated traits (9 out of 14 traits). Significant MSgy values were seen for PL, TY, TA (P < 0.05), TSS (P < 0.01), and LA (P < 0.001). These findings indicated the genotypic stability of TLs during both seasons (Singh 2001). Hence, emphasis was placed on determining the specific effects of genotypes and comparing their means for the estimated traits (Wickens and Keppek 2004).

3.2 Phenotypic performance

The phenotypic responses of the evaluated TLs under TYLCV infection for the TYLCD-mean score, vegetative, yield, and fruit quality traits are shown in Table 3.

Table 3 Phenotypic means of the twelve tomato lines evaluated across the 2018 and 2019 fall seasons under field TYLCV infection conditions

TYLCD tolerance

TLs showed mild to moderate TYLCD symptoms at 3MAT (Table 3). TYLCD-MS ranged between 1.15 for TL-1 and 1.82 for TL-12 across both seasons. The evaluated TLs could be divided into three groups according to significant differences in their symptom severity (TYLCD-MS). The first group had the lowest TYLCD-MS (1.15–1.34) and was represented by TL-1 to TL-3 and TL-5 to TL-8; the second group was only represented by TL-10 (1.52); and the third group had the highest TYLCD-MS (1.69–1.82) and was represented by TL-9, TL-11, and TL-12 (Table 3).

Table 4 Genotypic (rg) and phenotypic (rph) correlations between 14 traits of 12 advanced tomato lines cultivated under field TYLCV infection conditions over the 2018 and 2019 fall seasons

The presence of TYLCV-DNA was determined by a PCR assay in symptomless plants of TLs, as compared with the susceptible cultivar Castlerock (Fig. 2). Electrophoresis analysis revealed a single 550-bp fragment in all the DNA samples examined with the AVcore/ACcore universal primer pair (Fig. 2a). A 500-bp PCR product was produced in all DNA samples using the specific primer pair TYLC2C3F/TYLC2C3R (Fig. 2b). These findings indicate TYLCD tolerance in TLs, where plants harbor viral DNA (Fig. 2) but exhibit either mild or no symptoms (Table 3).

Fig. 2

Detection of TYLCV viral DNA in symptomless plants of tomato inbred lines grown in a naturally infected field compared to TYLCV-infected plants of the susceptible cultivar Castlerock using primers AVcore/ACcore (a) and TYLC2C3F/TYLC2C3R (b) in a PCR assay. Lanes M: 1 kbp DNA marker; 1–12: samples of twelve tomato lines; and 13: sample of susceptible cultivar Castlerock

Vegetative traits

The results for vegetative growth traits (PL, NPL, and LA) for TLs at 3MAT are shown in Table 3. PL ranged from 0.63 m for TL-4 to 1.49 m for TL-3. TL-2 and TL-3 were the tallest plants (1.49 and 1.43 m, respectively) across both seasons, followed by TL-12, TL-11, and TL-8 (1.36, 1.29, and 1.27 m, respectively) with no significant differences among them. The shortest plants were TL-4, TL-5, and TL-9 (0.63, 0.70, and 0.73 m, respectively) with no significant differences among them (Table 3).

The NPL of TLs at 3MAT ranged from approximately 37 in TL-9 to 54 in TL-8 across both seasons (Table 3). TL-8 had the highest NPL (54.3) across both seasons, followed by TL-3 (50.36), TL-2 (47.95), TL-12 (47.02), TL-11 (45.87), and TL-6 (45.66). NPL was lowest in TL-9 and TL-4 (36.91 and 39.21, respectively), with no significant differences between them (Table 3). The LA (cm2) of TLs at 3MAT ranged from 34.17 in TL-5 to 46.27 in TL-12 across both seasons (Table 3). TL-12 and TL-11 produced the largest LAs across both seasons (46.27 and 45.76, respectively). The smallest LAs were generated by TL-7, TL-5, and TL-4 (33.59, 34.17, and 35.01, respectively) across both seasons with no significant differences among them (Table 3).

Plant yield

EY (kg plant−1) for the TLs was between 0.36 in TL-9 and 0.93 in TL-8 across both seasons (Table 3). TL-8 had the greatest EY (0.93) across both seasons, followed by TL-3, TL-5, and TL-6 (0.79, 0.79, and 0.74, respectively) with no significant differences between them. TL-9 and TL-10 had the lowest EY (0.36 and 0.41, respectively) with no significant differences between them.

Across both seasons, TY (kg plant−1) values ranged from 1.94 in TL-7 to 2.67 in TL-5 (Table 3). The most promising lines for TY were TL-5 (2.67) and TL-3 (2.67). TL-1, TL-7, TL-9, and TL-10 produced the lowest TY (1.97, 1.94, 1.98, and 2.03, respectively) with no significant differences among them. MY (kg plant−1) ranged between 1.65 in TL-9 and 2.47 in TL-5 across both seasons (Table 3). TL-5, TL-3, and TL-8 yielded the greatest MY (2.47, 2.39, and 2.28 kg, respectively) with no significant differences among them. TL-9, TL-10, TL-1, and TL-7 yielded the lowest MY (1.65, 1.79, 1.84, and 1.81 kg, respectively) with no significant differences among them (Table 3).

Fruit quality traits

The AFW (g) of the TLs ranged from 70.82 in TL-5 to 126.02 in TL-3 across both seasons (Table 3). TL-3 and TL-8 yielded the highest AFW (126.02 and 124.95, respectively), followed by TL-1 (116.27), TL-12, and TL-11 (99.32 and 98.21, respectively). TL-5 and TL-4 had the lowest AFW (70.82 and 72.60, respectively; Table 3). Across both seasons, FF (kg cm−2) ranged from 0.34 in TL-5 to 0.69 in TL-2 (Table 3). TL-2 had the highest FF, followed by TL-12 (0.60), TL-1 (0.58), TL-11 (0.53), TL-3 (0.52), and TL-7 (0.52). TL-5 (0.34), TL-3 (0.39), TL-9 (0.39), and TL-10 (0.41) had the lowest FF with no significant differences among them. FSI varied among TLs (Table 3). TL-2 and TL-9 to TL-12 had oval fruits (FSI ranged from 1.31 to 1.53); TL-4 to TL-7 had round fruits (FSI ranged from 0.97 to 1.11); and TL-1, TL-3, and TL-8 had oblate fruits (FSI ranged from 0.68 to 0.80) (Table 3).

Across both seasons, fruit TSS content (°Brix) ranged from 3.87 in TL-9 to 5.57 in TL-2 (Table 3). The highest TSS was seen in TL-2 (5.57), followed by TL-12 (5.17), TL-1 (4.91), TL-8 (4.89), TL-4 (4.79), TL-3 (4.76), and TL-11 (4.62) (Table 3). TA values (mg citric acid 100 g−1 FW) ranged from 0.36 in TL-8 to 0.47 in TL-12 across both seasons (Table 3). TL-2 to TL-7 and TL-9 to TL-12 showed the greatest TA. TL-8 and TL-1 had the lowest TA. TI and M were determined using the TSS and TA values. These indices are typically a more accurate indicator of a fruit’s flavor than just its TSS or acidity. TI across both seasons ranged from 0.89 in TL-9 to 1.14 in TL-2 (Table 3). TL-2 fruits had the highest TI (1.14), followed by the other lines (0.95–1.06), except for TL-9 (0.89), which was ranked third. The M values ranged from 9.11 in TL-9 to 14.35 in TL-2 (Table 3). The highest M values were observed in TL-2 (14.35), TL-8 (14.18), TL-1 (12.91), and TL-4 (11.89) with no significant differences among them. TL-9 and TL-3 had the lowest M values (9.11 and 9.81, respectively) with no significant differences between them (Table 3). Navez et al. (1999) stated that tomato fruits are deemed tasty with TI values > 0.7 and maturity > 10. Accordingly, the fruits of the evaluated TLs are considered to be good for fresh consumption.

According to the phenotypic evaluation results, lines TL-3, TL-5, and TL-8 had the highest TY, MY, and TYLCD tolerance, as well as acceptable fruit quality and AFW > 80 g, except TL-5, which had an AFW of approximately 71 g.

3.3 Heritability and phenotypic and genotypic variation

Tomato breeding aimed at selecting desired genotypes is linked with GCV and heritability estimates and other genetic parameters for important traits (Bernardo 2010; Dhaliwal et al. 2020). The results for variability components (δ2g, δ2gy, δ2e, δ2ph, PCV, and GCV) and h2b for fourteen phenotypic traits of the evaluated TLs are presented in Table 2. These parameters are crucial for an efficient tomato breeding program for TYLCD tolerance (Falconer and Mackay 1996). The δ2g accounted for the greater proportion of δ2ph for the majority of traits and was 3.0–135.5 times that of δ2gy. PCV% values varied from 0.2 (AFW) to 31.8 (EY), whereas GCV% values ranged from 0.2 (AFW) to 31.6 (EY) (Table 2). EY, FF, and PL had high PCV% (31.8, 20.4, and 20.1, respectively). Medium PCV% were observed for FSI (19.0), M (19.0), MY (12.3), TYLCD-MS (11.8), TSS (11.3), LA (11.2), and NPL (10.9). Low PCV% were observed for TA (5.6), TY (4.6), TI (0.7), and AFW (0.2; Table 2). Only EY had a high GCV%, while TSS, TA, TY, TI, and AFW had low GCV% (6.7, 4.8, 3.9, 0.3, and 0.2, respectively). The other traits had medium GCV%, ranging from 10.2 (LA) to 19.9 (FF). High (> 20%) or moderate (10–20%) PCV and GCV suggest a high level of variability in EY, TYLCD-MS, PL, NPL, LA, MY, FSI, and M traits. High variability shows the potential for effective selection for trait improvement (Singh 2001). Low PCV and GCV values (< 10%) for TY, AFW, TA, and TI indicate that lower genetic variability exists for these traits. A PCV only slightly higher than the GCV indicates a minor environmental effect. Low GCV and moderate PCV values for TSS suggest that the environment had the greatest influence on this trait.

Table 5 Eigenvalues and component loading values of the first four principal components (PCs) for the fourteen traits of tomato lines cultivated under field TYLCV infection conditions over the 2018 and 2019 fall seasons

All estimated traits had δ2ph > δ2g and PCV > GCV (Table 2), indicating additional environmental influences on trait expression (Singh 2001). However, δ2g represented the larger proportion of δ2ph (Table 2), and the difference between the PCV and the relevant GCV was negligible for all traits (PCV:GCV ratios ranged from 1.00 to 1.17), except TSS and TI (PCV:GCV ratios were 1.69 and 2.33, respectively; Table 2). These traits are, therefore, fairly stable and highly heritable (Singh 2001). This was confirmed by h2b estimates, where h2b was very high (70.8–99.1%) for most traits, with the exception of TSS and TI (35.7 and 17.6%, respectively) (Johnson et al. 1955). Heritability is used to indicate the relative degree to which a character is transmitted from parent to offspring. In improved tomato lines and cultivars, estimates of the h2b of TYLCD resistance/tolerance were high (Abdel-Ati et al. 2005; Mazyed et al. 2007; Hassan et al. 2022). The magnitude of such estimates also suggests the extent to which improvement is possible through selection. PCV and GCV estimation with h2b provides an accurate indicator of the heritable component of variance (Bello et al. 2012). Traits with high heritability can be improved by various phenotypic selection techniques (Singh 2001). Lower heritability suggests non-additive gene action in the control of a trait; therefore, complex breeding methods may be recommended to improve TSS and TI (Singh 2001). Thus, phenotypic selection based on TYLCD tolerance (TYLCD-MS), vegetative growth (PL, NPL, and LA), plant yield (EY and MY), and fruit quality traits (FF, FSI, TA, and M) can be trusted to select the best lines among the evaluated TLs.

3.4 Phenotypic and genotypic correlation coefficients

Phenotypic (rph) and genotypic (rg) correlation coefficients for all study traits are presented in Table 4. Successful exploitation of genetic diversity through selection is determined by the correlation of traits. Selection for a particular trait may either increase or decrease the expression of another trait, depending on how closely they are correlated (Bernardo 2010). For most of the estimated traits, rg values were higher than rph values, indicating a moderately strong inherent relationship between traits (Bernardo 2010). Traits that showed significant rg with each other generally also showed significant rph.

Several significant phenotypic and genotypic correlations were detected. Here, we focus on correlations between TYLCD tolerance (TYLCD-MS), yield, and its components. TYLCD-MS showed positive genotypic and phenotypic correlations with FSI, but negative genotypic association with TI and M. Positive rg and rph values were found for EY and each of NPL, TY, and MY, but negative rg and rph values were found between EY and FSI. Only EY and M showed positive genotypic correlation. TY showed positive rg and rph values with each of NPL, EY, and MY, and positive rg value only with TA. MY showed significant and positive rg and rph values with each of NPL, EY, and TY, and positive and negative correlations with TA and FSI, respectively. AFW had positive rg and rph values with each of PL and NPL, as well as positive rg value with LA (Table 4).

In accordance with our findings, Zengin et al. (2020) reported a negative correlation between fruit weight and length and TYLCD resistance provided by the Ty-3a gene in F3 families. Under TYLCD natural infection, direct selection of lines with the largest NPL will maximize the plant yield. When PL, NPL, and LA increase, the AFW (a component of plant yield) also increases (Table 4).

3.5 Multivariate analysis

PCA has been widely used to evaluate tomato genotypes, identify the plant traits that contribute most to the observed variance among genotypes, and select parental lines for breeding purposes (Merk et al. 2012; Chávez-Servia et al. 2018; Figás et al. 2018; Tembe et al. 2018; Jin et al. 2019; Tripodi et al. 2021). PCA reduced the 14 traits to four PCs (Table 5), which were retained according to Kaiser’s criterion (eigenvalue > 1; Fig. 3) (Johnson and Wichern 1988) and together represented 88.34% of the total variance. According to Brejda et al. (2000), PCs with the largest eigenvalues are considered to best represent system attributes, as they explain at least 10% of the variance. All estimated traits had a significant effect on the first four PCs. PC1 accounted for 42.93% of the total variation (Fig. 3) and was positively influenced by NPL, PL, TI, TSS, EY, MY, and TY and negatively influenced by TYLCD-MS (Table 5). PC2 accounted for 19.13% of the total variation and was positively correlated with FF, FSI, and LA. PC3 accounted for 16.75% of the total variation and was positively correlated with TA and negatively correlated with M. PC4 accounted for 9.54% of the total variation and was positively correlated with AFW. These findings indicate that traits representing TYLCD tolerance, vegetative growth, yield, and fruit quality can be used to group the lines (Reich et al. 2008). They are consistent with those of Shteinberg et al. (2021), who found that PCA clearly separated tomato genotypes based on TYLCD tolerance. Other studies have also reported, based on PCA, that some vegetative growth, yield, and fruit quality traits differed significantly among the evaluated tomato genotypes (Agong 2001; Glogovac et al. 2012; Merk et al. 2012; Chernet et al. 2014; Iqbal et al. 2014; Figás et al. 2018; Tembe et al. 2018; Jin et al. 2019; Sehgal et al. 2021; Islam et al. 2022). Future evaluations may therefore be based on fewer traits with little loss of information, which could reduce the labor, time, and money required to discriminate and define different genotypes (Reich et al. 2008).
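The retention rule described above can be illustrated with a short sketch. It assumes the PCA was run on the correlation matrix of standardized traits, which is the setting in which Kaiser’s criterion is normally applied; the data, the function name pca_kaiser, and the random seed are hypothetical and are not taken from this study. Under that assumption, the eigenvalues of 14 standardized traits sum to 14, so a PC explaining 9.54% of the variance corresponds to an eigenvalue of about 0.0954 × 14 ≈ 1.34, which is above the threshold of 1.

```python
import numpy as np

def pca_kaiser(X):
    """PCA on the correlation matrix; flag components with eigenvalue > 1."""
    X = np.asarray(X, dtype=float)
    # Standardize each trait (column) to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)            # trait-by-trait correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = 100.0 * eigvals / eigvals.sum() # percent of total variance per PC
    keep = eigvals > 1.0                        # Kaiser's criterion
    scores = Z @ eigvecs                        # coordinates of each line on each PC
    return eigvals, eigvecs, explained, keep, scores

# Hypothetical data only: 12 lines x 14 traits driven by three latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(12, 3))
X = latent @ rng.normal(size=(3, 14)) + 0.3 * rng.normal(size=(12, 14))

eigvals, eigvecs, explained, keep, scores = pca_kaiser(X)
print("PCs retained:", int(keep.sum()))
print("Cumulative variance of retained PCs: %.2f%%" % explained[keep].sum())
```

Plotting eigvals against the component index gives a scree plot of the kind shown in Fig. 3.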

Fig. 3 Scree plot for fourteen principal components (PCs) for fourteen phenotypic traits of twelve tomato lines grown under field TYLCV infection conditions over the 2018 and 2019 fall seasons

The first two PCs accounted for 11 of the 14 estimated traits and explained a high percentage (62.06%) of the variance (Table 5). Therefore, a biplot of the first two PCs (Fig. 4) was created to help identify the relationships among the PCs, traits, and lines (Yan and Rajcan 2002; Yan and Kang 2003). TYLCD-MS, vegetative growth, yield, and fruit quality traits are represented by vectors. Vectors pointing in nearly the same direction indicate a strong positive correlation between traits, vectors at roughly right angles indicate a weak correlation, and vectors pointing in nearly opposite directions (close to 180°) indicate a strong negative correlation. The PCA biplot divided the estimated traits into three groups. The first group included PL, LA, AFW, FF, TSS, TI, and M, which were positively correlated with the first two PCs. The second group included NPL, EY, TY, and MY, which were positively correlated with PC1 and negatively correlated with PC2. The third group included TYLCD-MS, FSI, and TA, which were negatively correlated with PC1 and positively correlated with PC2. TYLCD-MS was positively correlated with FSI and TA and negatively correlated with EY, TY, MY, and NPL. The biplot correlations corroborate those obtained from the rg and rph estimates (Table 4). TA, M, and AFW contributed less to the total genetic diversity, as shown by their short arrows (Fig. 4). The TLs were distributed across all four quadrants of the biplot of PC1 and PC2 (Fig. 4), indicating a high level of genetic diversity within and among groups (Reich et al. 2008). Lines with higher values of a given trait were plotted closer to the vector line and further in the direction of that particular vector, often on the vertices of the convex hull. Thus, TL-3, TL-5, and TL-8 were considered superior for most evaluated traits. TL-1, TL-2, TL-11, and TL-12 were closest to the biplot origin, indicating that these lines have the least variability for the estimated traits (Fig. 4).
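The angle rule used to read the biplot can be checked numerically. The sketch below is illustrative only: it builds four hypothetical traits (two that rise together, one that moves in the opposite direction, and one unrelated trait), computes their loading vectors on the first two PCs of the correlation matrix, and measures the angles between them. The function names and the simulated data are assumptions and do not reproduce Fig. 4.

```python
import numpy as np

def biplot_loadings(X, n_pc=2):
    """Trait loading vectors (arrows) on the first n_pc principal components."""
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:n_pc]       # indices of the largest eigenvalues
    # Scaling eigenvectors by sqrt(eigenvalue) gives trait-PC correlations.
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def vector_angle_deg(u, v):
    """Angle in degrees between two trait vectors in the biplot plane."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Hypothetical traits: a and b share a common factor, c opposes it, d is unrelated.
rng = np.random.default_rng(1)
t = rng.normal(size=200)
traits = np.column_stack([
    t + 0.1 * rng.normal(size=200),    # trait a
    t + 0.1 * rng.normal(size=200),    # trait b, positively correlated with a
    -t + 0.1 * rng.normal(size=200),   # trait c, negatively correlated with a
    rng.normal(size=200),              # trait d, independent of a
])

L = biplot_loadings(traits)
angle_ab = vector_angle_deg(L[0], L[1])   # small angle: strong positive correlation
angle_ac = vector_angle_deg(L[0], L[2])   # near 180 degrees: strong negative correlation
angle_ad = vector_angle_deg(L[0], L[3])   # near 90 degrees: weak correlation
print(angle_ab, angle_ac, angle_ad)
```

The cosine of the angle approximates the correlation between two traits only to the extent that the plotted PCs capture most of the variance, which is why the biplot reading is compared with the rg and rph estimates.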

Fig. 4 Biplot of the first two principal components (PC1 and PC2) with fourteen phenotypic traits and twelve tomato lines (TLs) grown under field TYLCV infection conditions over the 2018 and 2019 fall seasons. Phenotypic traits were TYLCV-mean scores (TYLCV), plant length (PL), plant leaves number (LN), leaf area (LA), yield (early: EY; total: TY; and marketable: MY), average fruit weight (AFW), fruit firmness (FF), fruit shape index (FSI), fruit taste index (TI), and fruit maturity index (M)

Tomato lines were divided into three clusters using HCA based on the TYLCD-MS, vegetative growth, yield, and fruit quality traits, as illustrated in Fig. 5 and Table 6. This result suggests significant genetic diversity within and among clusters (Table 6). HCA is particularly effective at characterizing tomato genotypes with the greatest degree of similarity/dissimilarity based on morphological traits (Iqbal et al. 2014; Chernet et al. 2014; Bhattarai et al. 2016; Chávez-Servia et al. 2018; Hussain et al. 2018; Tembe et al. 2018; Grozeva et al. 2021; Ene et al. 2022). Cluster 1 comprised TL-1, TL-3, and TL-8 (Fig. 5), which had low TYLCD-MS (high tolerance), FF, FSI, and M; moderate LA; and high PL, NPL, EY, TY, MY, AFW, TSS, and TA (Table 6). Six lines made up Cluster 2, which was separated into two groups. The first group included TL-2, TL-6, TL-7, and TL-9, while the second group comprised TL-4 and TL-5 (Fig. 5). This cluster had the lowest PL, NPL, LA, EY, TY, MY, AFW, and FF values; moderate TYLCD-MS (moderate tolerance) and FSI values; and the highest TI and M (Table 6). Cluster 3 comprised TL-10, TL-11, and TL-12 (Fig. 5) and had low TA and TI values; moderate PL, NPL, EY, TY, MY, AFW, FF, TSS, and M; and high TYLCD-MS (low tolerance) and FSI (Table 6). Variance within the clusters ranged from 15.36 (Cluster 3) to 87.53 (Cluster 2) (Table 6). The distance from the cluster centroid was greatest in Cluster 2 (10.61) and smallest in Cluster 3 (1.53) (Table 6). The distance between cluster centroids ranged from 18.56 to 42.99 (Table 7). The shortest distance was between Cluster 2 and Cluster 3 (18.56), and the longest was between Cluster 1 and Cluster 2 (42.99) (Table 7). Cluster 1 performed better than Clusters 2 and 3, and better than the population means, in terms of TYLCD tolerance, vegetative growth, yield, and fruit quality (Table 6). These results imply that Cluster 1 would be more responsive to selection than the others, assuming that TYLCD tolerance and fruit yield are the target traits. Selected lines from different clusters, notably Clusters 1 and 2, should be crossed to create custom cultivars/hybrids with beneficial TYLCD tolerance and yield traits.
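The clustering and centroid-distance steps can be sketched as follows. This is a naive illustration of Ward’s minimum-variance criterion followed by Euclidean distances between cluster centroids. The software, trait scaling, and distance settings used in the study are not stated in this section, so the function names, the simulated data, and the choice of Euclidean distance are all assumptions.

```python
import numpy as np

def ward_clusters(X, n_clusters):
    """Naive agglomerative clustering using Ward's minimum-variance criterion."""
    X = np.asarray(X, dtype=float)
    clusters = [[i] for i in range(len(X))]        # start with one line per cluster
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = X[clusters[i]], X[clusters[j]]
                diff = a.mean(axis=0) - b.mean(axis=0)
                # Increase in within-cluster sum of squares if i and j are merged.
                cost = (len(a) * len(b) / (len(a) + len(b))) * (diff @ diff)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]    # merge the cheapest pair
        del clusters[j]
    return clusters

def centroid_distances(X, clusters):
    """Euclidean distances between cluster centroids."""
    X = np.asarray(X, dtype=float)
    cents = np.array([X[c].mean(axis=0) for c in clusters])
    return np.linalg.norm(cents[:, None, :] - cents[None, :, :], axis=2)

# Hypothetical data only: 12 lines x 3 traits forming three well-separated groups.
rng = np.random.default_rng(2)
centers = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0], [25.0, 0.0, 25.0]])
X = np.repeat(centers, 4, axis=0) + 0.5 * rng.normal(size=(12, 3))

clusters = ward_clusters(X, n_clusters=3)
D = centroid_distances(X, clusters)
i, j = np.unravel_index(np.argmax(D), D.shape)     # most divergent pair of clusters
print("Clusters:", clusters)
print("Most distant clusters:", i, j, "distance %.2f" % D[i, j])
```

Under this reading, the pair of clusters with the largest centroid distance is the natural source of divergent parents, which matches the recommendation to cross lines from Clusters 1 and 2.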

Fig. 5 Dendrogram from hierarchical cluster analysis (Ward method) classifying twelve tomato lines (TLs) based on fourteen phenotypic traits

Table 6 Means and other statistics of the three clusters obtained by cluster analysis of twelve advanced tomato lines based on fourteen phenotypic traits
Table 7 Distances between cluster centroids in twelve tomato genotypes

4 Conclusion

Tomato lines in Clusters 1 and 2 show promise for TYLCD tolerance and economically important traits. TLs in these clusters contain useful breeding material, which could be used as parental genotypes or pre-breeding material for the development of future varieties for TYLCD tolerance, vigorous vegetative growth, productivity, unique fruit shape and size, and flavor desirable for different markets. TYLCD tolerance and productivity can be improved by crossing lines from Clusters 1 and 2.