Prediction of strawberry fruit yield based on cultivar-specific growth models in the tunnel-type greenhouse

The fruit yield of five Korean strawberry cultivars grown in a tunnel-type greenhouse was predicted from their growth characteristics. The number of leaves, plant height, petiole length, leaf length and width, crown diameter, and the ratio of red to far-red light (RFR) of the five cultivars were measured during the cultivation period. The number of leaves exhibited a similar trend in all cultivars during this period; the plant height and petiole length of ‘Maehyang’ were the longest, leaf length exhibited similar trends in all cultivars except ‘Jukhyang’, the leaf width of ‘Arihyang’ was the widest, and the crown diameter of ‘Keumsil’ was the thickest. Leaf length, crown diameter, and RFR were associated with fruit yield in the multiple linear regression. When a single model was used to predict the yield of all five cultivars, the correlation between predicted and actual yield was r = 0.53. When cultivar-specific models were built for the prediction, the correlation increased to r = 0.77. The results indicate that the fruit yield of strawberry cultivars can be predicted better with cultivar-specific information, so it may be necessary to model each cultivar individually rather than all cultivars together.


Introduction
Plants are sensitive to their surroundings and are continuously exposed to changing environmental conditions, so their physical features are affected by environmental factors. For instance, leaf size can vary with precipitation and air temperature (Wright et al. 2017). Changes in physiological traits can also induce changes in fruit yield. Plant leaves produce glucose through photosynthesis, and the glucose is converted to disaccharides and polysaccharides that are used for flowering and fruiting. The transport of sugar is described by source and sink strength. Giaquinta (1978), moreover, reported that the level of sugar and the activity of sucrose-metabolizing enzymes were modified by source and sink strength. Generally, leaves act as the source, and flowers and fruits act as the sink. Thus, it can be inferred that the growth of fruit vegetables may predict fruit yield. Previous studies have demonstrated that the fruit yield of strawberries was correlated with plant size (Guttridge and Anderson 1981; Olsen et al. 1985), the number of leaves per crown (Mason and Rath 1980), and the number of leaves per plant (Lacey 1973). Faby (1997) reported that transplant crown diameter and the total fruit yield of strawberry are positively correlated. Furthermore, strawberries with a larger crown produced higher fruit yield during the early season, on average, than strawberries with a smaller crown (Albregts 1968; Durner et al. 2002). Ahn et al. (2021) reported that larger leaf size and more leaves in five Korean strawberry cultivars ('Arihyang', 'Jukhyang', 'Keumsil', 'Maehyang', and 'Seolhyang') might result in higher fruit yield, and that source and sink strength might describe the relationships between the growth factors (leaf size, the number of leaves, and crown diameter) and the yield factors (time to first flowering, fruit weight, and the number of fruits). Ahn et al. (2021) also added the ratio of red to far-red light (RFR) to the growth factors, because red light promotes biomass and starch accumulation (Li et al. 2012) and far-red light promotes early flowering (Brown and Klein 1971; Cerdan and Chory 2003; Johnson et al. 1994; Wollenberg et al. 2008), so the RFR may help predict the growth and fruit production of strawberries. Generally, canopy shading reduces the RFR (Smith 1982), and the number of leaves and leaf size affect canopy shading.
Prediction of fruit yield is essential for harvest and market planning (Wulfsohn et al. 2012). The relationships between growth and fruit yield of strawberry may allow us to predict fruit yield from plant height, the number of leaves, leaf length and width, and crown diameter (Sim et al. 2020). Moreover, many studies have predicted fruit yield using regression models, correlation/regression analysis, multiple regression models, and machine-learning-based predictive models (Døving and Måge 2001; Obsie et al. 2020; Sim et al. 2020; Zadravec et al. 2013). Predictive modeling, however, is particularly difficult because thinning (removal of old and infected leaves) in commercial cultivation may add random noise to the data by reducing the number of leaves and the crown diameter, and because each cultivar may contribute to fruit yield through its own mechanisms. Because many studies have shown that thinning improves growth and fruit yield during the fruit production period, predictive models should be designed with thinning in mind (Cocco et al. 2020; Hicklenton and Reekie 2002). This study evaluates the growth factors of managed strawberries that are related to fruit yield and predicts fruit yield from those growth factors, taking thinning into account and comparing different predictive models.

Cultural environment and growth parameter collection
Strawberries were grown in a double-tunnel greenhouse covered with a 0.1-mm polyethylene film. The greenhouse was built in the east-west direction. An air heater (8 kW, electric; temperature < 8 °C) and a lagging cover (made of aluminum, cotton, and felt sheets) were used from 18:00 to 08:00 for heat retention and heating in winter. For ventilation, the side window was opened during the photoperiod, and the airflow fan (diameter 600 mm) was set to operate for 25 min and rest for 5 min. The soil in the greenhouse was a well-drained silt loam mulched with black polyethylene film. Base fertilizer was applied according to local recommendations at 2000 kg of commercial organic matter per 1000 m². Irrigation was performed at regular intervals using a nutrient solution (N-P-K-Ca-Mg-S = 16-4-8-4-me·L⁻¹). The collected environmental data are shown in Fig. 1. The number of leaves, plant height (cm), petiole length (cm), leaf length (cm), leaf width (cm), and crown diameter (mm) were measured and recorded every two weeks starting from the seventh day after transplanting. The leaf area, leaf area index, fresh weight, and dry weight were recorded on the transplanting day and on the last day of cultivation. Flowering dates were recorded for the five strawberry cultivars; flowering began on November 18, 2020. The strawberry fruits were harvested from December 10, 2020 to March 26, 2021. For fruit yield, three plants from each of five experimental plots (15 plants in total) were randomly selected for data collection. The horticultural characteristics were investigated following a previous report (Sim et al. 2020).

Ratio of red and far-red light (RFR)
The red and far-red light reaching the crown was measured using a spectrometer (LI-180, Li-Cor, Lincoln, NE, USA). The wavelength ranges of red and far-red light are 620-700 nm and 700-750 nm, respectively. The RFR was measured at the crown position in a total of 15 plants (5 replications of 3 plants).
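As a rough illustration of how an RFR value could be derived from a spectrometer scan, the sketch below integrates a spectrum over the two bands given above (620-700 nm for red, 700-750 nm for far-red) and takes their ratio. The function names, the trapezoidal integration, and the sample spectra are assumptions made for illustration only; the instrument may report band totals directly, and the exact calculation used in this study is not described.

```python
def band_integral(wavelengths_nm, flux, lo, hi):
    """Trapezoidal integral of flux over the samples that fall within [lo, hi] nm."""
    points = [(w, f) for w, f in zip(wavelengths_nm, flux) if lo <= w <= hi]
    total = 0.0
    for (w0, f0), (w1, f1) in zip(points, points[1:]):
        total += 0.5 * (f0 + f1) * (w1 - w0)
    return total


def rfr_ratio(wavelengths_nm, flux):
    """Ratio of red (620-700 nm) to far-red (700-750 nm) integrated flux."""
    red = band_integral(wavelengths_nm, flux, 620, 700)
    far_red = band_integral(wavelengths_nm, flux, 700, 750)
    if far_red == 0:
        raise ValueError("no far-red flux in the 700-750 nm band")
    return red / far_red


# Hypothetical spectra sampled every 10 nm (not measured data).
wavelengths = list(range(600, 761, 10))
open_sky = [1.0] * len(wavelengths)  # flat spectrum
# Under a leaf canopy, red light is absorbed more strongly than far-red.
shaded = [0.3 if w < 700 else 1.0 for w in wavelengths]

ratio_open = rfr_ratio(wavelengths, open_sky)
ratio_shaded = rfr_ratio(wavelengths, shaded)
print(ratio_open, ratio_shaded)
```

In this toy example the shaded spectrum gives the lower ratio, which is the direction expected when canopy shading reduces the RFR.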

Data analysis
We analyzed 15 plants per cultivar. To describe the total yield per plant (grams), we used a simple linear regression on the first day of flowering, ANOVA with the five cultivars, and multiple linear regression with the growth variables (the number of leaves, plant height, petiole length, leaf length, leaf width, and crown diameter) as well as the RFR. For the growth variables, the closest time point before the first flowering was chosen. The growth variables measured at earlier time points were then also compared. To choose the best predictive model, cross-validation was used to evaluate prediction performance, and the model with the smallest mean squared error was selected. Two approaches were considered during predictive modeling. In the first, the cultivar was used as a predictor together with the same set of other variables for all cultivars (i.e., a single model for the five cultivars). In the second, the data were subsetted by cultivar, and a different set of variables was considered for each cultivar (i.e., five cultivar-specific models). All statistical analyses were conducted in R version 4.0.2 (R Core Team 2020).
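The two modeling approaches can be contrasted with a small, self-contained sketch. It is written in Python rather than R, uses leave-one-out cross-validation (the type of cross-validation is not stated in the text, so this is an assumption), and runs on synthetic data in which each cultivar has its own slope. All names, coefficients, and data below are hypothetical and do not reproduce the analysis in this study.

```python
import random


def fit_ols(X, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def loocv_predictions(X, y):
    """Predict each plant from a model fitted without that plant."""
    preds = []
    for i in range(len(y)):
        beta = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(beta[0] + sum(b * v for b, v in zip(beta[1:], X[i])))
    return preds


def mse(obs, pred):
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def pearson_r(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


# Synthetic data: 5 cultivars x 15 plants, two growth variables per plant,
# with a cultivar-specific slope on the first variable (hypothetical values).
rng = random.Random(0)
data = {}
for k, slope in enumerate([-10.0, -5.0, 0.0, 5.0, 10.0]):
    X = [[rng.gauss(10, 2), rng.gauss(15, 2)] for _ in range(15)]
    y = [300 + slope * x[0] - 3 * x[1] + rng.gauss(0, 1) for x in X]
    data[k] = (X, y)

# Approach 1: single model with cultivar dummy variables and shared slopes.
X_all, y_all = [], []
for k, (X, y) in data.items():
    for x, yi in zip(X, y):
        X_all.append([1.0 if k == j else 0.0 for j in range(1, 5)] + x)
        y_all.append(yi)
pred_single = loocv_predictions(X_all, y_all)

# Approach 2: one model per cultivar, fitted on that cultivar's plants only.
pred_specific = []
for k, (X, y) in data.items():
    pred_specific += loocv_predictions(X, y)

mse_single, mse_specific = mse(y_all, pred_single), mse(y_all, pred_specific)
r_single, r_specific = pearson_r(y_all, pred_single), pearson_r(y_all, pred_specific)
print(mse_single, mse_specific, r_single, r_specific)
```

Because the synthetic slopes differ strongly between cultivars, the cultivar-specific fits have the lower cross-validated error here. With real data the gain depends on how much the relationships actually differ among cultivars and on the number of plants available for each fit.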

Growth characteristics of selected strawberry cultivars
All five strawberry cultivars were cultivated with continuous removal of old and infected leaves. The number of leaves exhibited a similar trend in all five cultivars during the growth period and increased continuously from 60 d after transplanting (Fig. 2). 'Maehyang' and 'Jukhyang' were defoliated in the late growth period because of an increase in the number of infected leaves, and their leaf numbers decreased slightly as a result. For plant height, 'Maehyang' was the tallest from the early to the late growth stage, followed by 'Keumsil', 'Arihyang', 'Jukhyang', and 'Seolhyang'. For petiole length, as with plant height, 'Maehyang' was the longest, followed by 'Keumsil'; the other three cultivars were relatively similar. The leaf length was similar in all five cultivars, except that 'Jukhyang' was substantially shorter at the early stage of growth. The leaf width of 'Arihyang' (average 9.4 cm) was the widest among the five cultivars, and that of 'Maehyang' (average 7.4 cm) was the narrowest. The leaf widths of 'Jukhyang', 'Keumsil', and 'Seolhyang' were similar. For crown diameter, 'Keumsil' was the thickest (average 18.0 mm), and the other four cultivars were relatively similar. 'Keumsil', with its thick crown, had the highest leaf area index, fresh weight, and dry weight among the five cultivars (data not shown). The average RFR under the leaves was lowest in 'Arihyang' and 'Jukhyang', which had the widest leaves, and highest in 'Seolhyang' and 'Maehyang', which had the narrowest leaves (Table 1 and Fig. 2). Leaf width appeared to contribute more to canopy shading of the crown than leaf length did, and the RFR showed no significant relationship with leaf length. Therefore, the distribution of RFR among cultivars might be due to leaf width. Moreover, there was no significant relationship between the flowering date and the increase in fruit yield.

Flowering characteristics of selected strawberry cultivars
'Arihyang' flowered earlier than the other cultivars, at 53 days after transplanting (DAT) on average, and 'Jukhyang' and 'Seolhyang' first flowered about two weeks after 'Arihyang' on average (Fig. 3). The late flowering of 'Jukhyang' is thought to be due to delayed transplanting. Some plants of 'Seolhyang' flowered as early as 'Arihyang', but the average was DAT 66. The reason might be that the investigated plants did not differentiate properly during the seedling period because of insufficient exposure to low air temperature. 'Maehyang' and 'Keumsil' exhibited similar flowering, with an average of DAT 62. In the figure, the thick solid line in the middle of each box indicates the sample median for each cultivar, and the dotted line indicates the sample mean. For each box plot, an outlier is defined as a data point below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR, where IQR = Q3 − Q1 is the interquartile range (the length of each box), and it is marked as a dot.
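The outlier rule used in the box plots can be written out in a few lines. The sketch below applies the Q1 − 1.5 × IQR and Q3 + 1.5 × IQR fences to a made-up list of days to first flowering. The data are hypothetical, and the quartile convention (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption, since plotting software can compute quartiles slightly differently.

```python
import statistics


def boxplot_outliers(values):
    """Return the values lying outside the 1.5 x IQR fences of a box plot."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical days after transplanting (DAT) to first flowering for one cultivar.
dat_first_flower = [60, 61, 62, 62, 63, 64, 65, 90]
outliers = boxplot_outliers(dat_first_flower)
print(outliers)
```

Here the quartiles are 61.75 and 64.25, so the fences fall at 58 and 68 and only the plant that flowered at DAT 90 would be drawn as a dot.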

Fruit yield characteristics of the five strawberry cultivars
'Keumsil' produced the highest number of fruits (457 on average), followed by 'Seolhyang', 'Maehyang', 'Arihyang', and 'Jukhyang' (385, 368, 300, and 188 fruits on average, respectively) (Fig. 4). The remarkably low number of fruits in 'Jukhyang' is attributed to its planting date, which was the latest. The number of fruits in the four cultivars other than 'Jukhyang' increased sharply after February 1, 2021 (DAT 140). As with the number of fruits, 'Keumsil' had the highest cumulative fruit weight (7.2 kg on average). In 'Arihyang', the number of fruits was low, but the cumulative fruit weight tended to be high because of its large fruit. As with the cumulative number of fruits, the cumulative fruit weight of the cultivars also increased sharply after February 1 (DAT 140). 'Arihyang' (average 27.1 g), which is characterized by large fruit, exhibited a much higher weight per fruit than the other cultivars, followed by 'Keumsil' (average 19.7 g). The remaining three cultivars showed relatively similar weights per fruit. In all cultivars, the weight per fruit decreased gradually after the first harvest.

Prediction of fruit yield using the characteristics of selected strawberry cultivars
All descriptive regression analyses are summarized in Table 2. The time of first flowering was not associated with the total yield (p = 0.683), but some cultivars tended to have a higher yield than others, on average (p = 0.002). According to Tukey's method, 'Keumsil' had a higher yield than 'Jukhyang' by 120.3 g on average, with a 95% CI of (9.0, 231.6 g). The other pairwise comparisons were not statistically significant at α = 0.05 because of the high variability shown in Fig. 5. In the multiple linear regression with the growth variables, a longer leaf was associated with higher yield (p < 0.001), a larger crown diameter was associated with lower yield (p = 0.005), and a higher RFR was associated with lower yield (p = 0.010), on average. When the growth variables measured at an earlier time point (rather than the one closest to the time of first flowering) were considered, the associations disappeared or weakened. For predictive modeling, the cultivar-specific approach outperformed the single-model approach. In the single-model approach, leaf length, crown diameter, and RFR were selected as useful predictors. Still, the predicted and observed outcomes had a low correlation of r = 0.53. With the cultivar-specific approach, the overall correlation between predicted and observed outcomes increased to r = 0.77. Using the prediction results from the cultivar-specific approach, the actual and predicted fruit yields are plotted in Fig. 6. The predictions for 'Arihyang', 'Keumsil', and 'Maehyang' (r = 0.73, 0.82, and 0.78, respectively) were more accurate than those for 'Jukhyang' and 'Seolhyang' (r = 0.32 and 0.54, respectively). The best predictive model combined the first flowering time, petiole length, leaf width, crown diameter, and RFR for 'Arihyang'; leaf length and crown diameter for 'Keumsil'; and the first flowering time, plant height, and petiole length for 'Maehyang'.

Discussion
We determined the number of leaves, plant height, petiole length, leaf length and width, crown diameter, days to the first flowering, the cumulative count and weight of fruit, the weight per fruit, and the RFR of 'Arihyang', 'Jukhyang', 'Keumsil', 'Maehyang', and 'Seolhyang' from September 2020 to March 2021 (Figs. 2 and 4). Ahn et al. (2021) reported the same traits of the five Korean strawberry cultivars from September 2019 to March 2020. Our results demonstrated that trends such as the number of leaves of 'Jukhyang' and 'Maehyang', the leaf length of 'Jukhyang', the leaf width of 'Maehyang', the crown diameter of 'Arihyang' and 'Keumsil', the cumulative count and cumulative weight of fruit of 'Jukhyang', and the weight per fruit of 'Arihyang' were similar to those of Ahn et al. (2021). This demonstrates that the growth factors can characterize the cultivars. Moreover, Ahn et al. (2021) determined the cultivar-specific relationships between the variability of environmental factors (daily air temperature, soil water content, and solar radiation) and the fresh weight harvested in the following week (adjusting for weeks since transplanting), but that article only described the characteristics and trends of the cultivars during the cultivation season. Sim et al. (2020) reported that the crown diameter of strawberry had high positive correlation coefficients with the yield, but our results indicated that crown diameter was negatively correlated with the yield. Thinning may add random noise to the crown diameter. Therefore, crown diameter might be an unreliable factor for predicting the yield.
Several relationships between growth and fruit yield have been reported in the literature. Kašičkina (1959) graded strawberry seedlings within families according to vigor and demonstrated that the most vigorous seedlings gave the heaviest yield, while Topčijski (1964) did not find a correlation between leaf size and fruit yield or between leaf width and fruit yield. Our results showed that the correlations are cultivar-specific (Table 2). The leaf width was a useful predictor in the predictive models for 'Arihyang' and 'Keumsil', but not for 'Maehyang'. In addition to leaf size and width, the correlations between other leaf factors and fruit yield have been studied. Pickett (1917) reported simple phenotypic correlations from 900 strawberry seedlings derived from crosses of 17 parents. The correlation coefficients between the number of leaves and fruit number, between the average area of leaflets and the average weight per fruit, and between the total leaf area and total yield were r = 0.48 (p < 0.001), 0.29 (p < 0.01), and 0.75 (p < 0.001), respectively. In our data, leaf length was associated with fruit yield (p < 0.001), but the number of leaves was not (p < 0.213). This indicates the need to determine various leaf factors to predict the fruit yield of strawberries. Moreover, the RFR was a suitable parameter for modeling the total fruit yield. It is well known that far-red light promotes flowering (Collins and Barker 1964) and fruit yield (Zahedi and Sarikhani 2016), but RFR has rarely been used to predict fruit yield in strawberry studies. We demonstrated that the leaf width of strawberry had the greatest effect on the RFR at the crown (Table 1 and Fig. 2). The leaf width and length are cultivar characteristics. Therefore, it is necessary to evaluate the relationships among leaf width, RFR, the number of flowers, flowering date, and fruit yield in future studies.
Based on the relationships between growth and strawberry fruit yield, prediction models have been used to predict yield. Still, prediction appears challenging, particularly with the thinning procedure. As shown in Fig. 2, the growth variables tend to change over time. In this study, the growth variables observed close to the time of the first flowering were used (Fig. 3), and the predictive models underperformed when growth variables observed earlier were used. Additionally, among the five cultivars in this study, the predictive models for 'Jukhyang' and 'Seolhyang' underperformed (r² = 0.10 and 0.29, respectively) relative to the other cultivars (Table 2). Sim et al. (2020) also attempted to predict the fruit yield of 'Seolhyang' strawberry from growth data using multiple regression analysis, and the coefficient of determination was r² = 0.36, which was similar to the result of the single-model approach (r = 0.53, r² = 0.28). However, the cultivar-specific approach improved the correlation between the predicted and observed outcomes (r = 0.77, r² = 0.59) as well as the correlations between the growth factors and fruit yield. Hondelmann (1965) and Kaljaskina (1966) reported correlations between fruit yield and the number of crowns at the time of fruiting (r = 0.56), between plant vigor and the number of trusses (r = 0.42), and between flower size and berry size (r = 0.7). Adding further characteristics to the data might help improve the prediction of fruit yield. For example, photosynthetic and root characteristics can be collected and used. Choi and Jeong (2020) determined the photosynthesis parameters of leaves and the root activity of 'Arihyang' and 'Keumsil'. Therefore, future studies should try to predict fruit yield using more characteristics.
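The paired r and r² values quoted in this paragraph follow directly from squaring the correlation coefficients reported in the Results. A short check, using only values stated in the text (the dictionary keys are labels chosen here for readability):

```python
# Correlations between predicted and observed yield, as reported in the text.
reported_r = {
    "single model": 0.53,
    "cultivar-specific (overall)": 0.77,
    "'Jukhyang'": 0.32,
    "'Seolhyang'": 0.54,
}

# Squaring r gives the share of yield variance captured by the predictions.
r_squared = {name: round(r ** 2, 2) for name, r in reported_r.items()}

for name, value in r_squared.items():
    print(f"{name}: r^2 = {value:.2f}")
```

On this scale, moving from the single model to the cultivar-specific models roughly doubles the share of yield variance captured by the predictions (0.28 to 0.59).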

Conclusions
The fruit yield of the strawberry cultivars was predicted using their growth data. Leaf length, crown diameter, and RFR were selected as predictors because these factors were strongly associated with the fruit yield. Moreover, the cultivar-specific approach based on the growth factors improved the correlation between the predicted and observed outcomes. The results indicate that the fruit yield of the tested strawberry cultivars might be predicted using growth data collected during the cultivation period. Fruit yield prediction may improve further if more cultivars and growth data are obtained.