Introduction

Apple (Malus domestica Borkh.) is a widely cultivated crop in temperate regions of the world and is economically significant [1]. Apples are very popular and are consumed throughout the whole year. Including apples in the diet is important for human health because the fruit contains, among other components, vitamins, dietary fiber, monosaccharides, minerals, and phenolic compounds. Eating apples may help prevent many diseases, for example, type-2 diabetes, ischemic heart disease, asthma, and lung cancer. Consumers pay attention to the quality and appearance of apples [2]. Nowadays, there are several thousand apple cultivars [3]. Surface texture, color, and size, which can be determined using image analysis, as well as taste and chemical composition, can affect the quality of apples. Some features may be similar for selected cultivars, but each cultivar is characterized by specific properties, which may determine prices and consumer preferences [4]. In addition to the cultivar, the quality of apples may depend on the cultivation conditions, climate, and geographical origin [5]. The differentiation of fruit properties, including texture, color, and shape, may depend on the degree of ripeness [6], which is an important issue when carrying out cultivar discrimination using these characteristics. The results of cultivar discrimination performed for fruit and vegetables may be used in practice, for example, in processing and packing plants for post-harvest applications [7]. Additionally, cultivar detection may be important in supermarkets for the automatic assignment of prices. For a set of different fruits and vegetables, vision systems may enable the determination of the species of an individual fruit or vegetable, as well as its cultivar. Such systems may reduce the time required to determine the price compared with manual entry by a cashier [6]. Different apple cultivars can be mixed during harvesting or marketing.
Mixing of cultivars (unintentional or fraudulent) can be caused by the cultivation of several apple cultivars in one orchard, the presence of many small orchards, and the sale of different apple cultivars at the same time, and it can generate serious problems for the industry. Manual identification of cultivars by humans, especially when the fruit have a similar appearance, may be subjective, while more complex chemical or physical analytical methods may be time-consuming and expensive. Fruit of different geographical origins can also be mixed. Therefore, effective techniques for determining the cultivar or geographical origin of fruit are desirable [3, 8, 9]. Different techniques and methods are used to examine the authenticity of food products, for example, spectroscopy, mass spectrometry, chromatography, nuclear magnetic resonance, and molecular, immunological, isotopic, and bioinformatic methods [10]. Machine vision systems and image processing techniques can also be applied to classify fruit [11, 12]. The application of machine vision based on color image processing for fruit sorting can provide high accuracy, can detect even slight changes, and is easier, more objective, less time-consuming, and less arduous than human inspection [13]. Computer vision can use image texture, defined as a function of the spatial variation in the values of the image pixels. Texture analysis provides numerical data computed from the image of the object and can be applied to evaluate the quality and safety of food and agricultural products, for example, in inspection and grading. Texture parameters can even capture changes that are difficult to perceive visually [14].
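To illustrate how such texture parameters are derived from pixel values, the sketch below computes a gray-level co-occurrence matrix (GLCM) and a few classic statistics from it using NumPy. It is a minimal, simplified example: the function name and parameter choices (8 gray levels, a single horizontal offset) are our own and do not reproduce the exact feature set of any particular texture-analysis software.

```python
import numpy as np

def glcm_features(channel, levels=8, offset=(0, 1)):
    """Compute a gray-level co-occurrence matrix (GLCM) for one color
    channel and derive a few classic texture statistics from it.
    `channel` is a 2-D uint8 array; intensities are quantized to
    `levels` bins and pixel pairs are counted at the row/column offset."""
    q = (channel.astype(np.float64) * levels / 256).astype(int)
    q = np.clip(q, 0, levels - 1)
    dr, dc = offset
    rows, cols = q.shape
    # Paired views: a[i] and b[i] are the two pixels of each co-occurring pair.
    a = q[max(0, -dr):rows - max(0, dr), max(0, -dc):cols - max(0, dc)]
    b = q[max(0, dr):rows - max(0, -dr), max(0, dc):cols - max(0, -dc)]
    glcm = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(glcm, (a.ravel(), b.ravel()), 1)  # unbuffered pair counting
    glcm /= glcm.sum()                          # joint probabilities
    i, j = np.indices(glcm.shape)
    return {
        "contrast": float(((i - j) ** 2 * glcm).sum()),
        "energy": float((glcm ** 2).sum()),
        "homogeneity": float((glcm / (1.0 + np.abs(i - j))).sum()),
        "entropy": float(-(glcm[glcm > 0] * np.log2(glcm[glcm > 0])).sum()),
    }
```

A uniform patch yields zero contrast and maximal energy, while a high-frequency pattern (e.g., a checkerboard) yields high contrast, which is the kind of numerical differentiation the texture features exploit.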

The study was aimed at evaluating the usefulness of textures calculated from images of the skin, longitudinal section, and cross-section of apples for the discrimination of different apple cultivars. The research was intended to reveal whether discrimination based on the outer surface of apples (skin) or the flesh (longitudinal and cross-sections) provides the highest level of correctness.

Materials and methods

Materials

The apples of three cultivars, ‘Szampion’, ‘Idared’ and ‘Gala’, were purchased from a local supermarket in Poland, to which they had been delivered after several months of cold storage. The apples of the individual cultivars differed in color and size. Examples of whole apples of ‘Szampion’, ‘Idared’ and ‘Gala’ are presented in Fig. 1.

Fig. 1

Examples of whole apples of different cultivars: a, d—‘Szampion’, b, e—‘Idared’, c, f—‘Gala’

The apples were cleaned and air-dried at room temperature. The sections were prepared by cutting an apple in a single motion with a sharp knife. The image analysis experiments were performed using 50 apples for the analysis of the skin, 50 apples for the longitudinal section, and 50 apples for the cross-section for each cultivar.

Image analysis

An Epson Perfection flatbed scanner was used for acquiring the images. The samples were scanned at a resolution of 1200 dpi and saved in TIFF format. Whole apples (with skin) were scanned from different sides of the lateral surface; for each apple, two scans of the skin were performed. In the case of longitudinal sections, the whole area of the section was scanned; for each apple, two scans of the longitudinal section were obtained, one for each half formed after cutting the apple. Similarly, for cross-sections, the whole section of an apple was scanned, and two scans of the cross-section were acquired for each apple. Thus, the image analysis was performed in 100 replications for the apple skin, 100 replications for the longitudinal section, and 100 replications for the cross-section for each cultivar. The obtained images were processed using MaZda software (Łódź University of Technology, Institute of Electronics, Poland) [15]. In the case of images of the outer surface of whole apples with skin, the regions of interest (ROIs) were selected as the areas of the apple adjacent to the scanner glass, where the image resolution was highest. For cross-sections and longitudinal sections of apples, the ROIs included the whole section area without the core and skin. The images were converted to individual color channels. The texture features based on the run-length matrix, co-occurrence matrix, gradient map, histogram, Haar wavelet transform, and autoregressive model were calculated from the color channels R, G, B, L, a, b, U, V, H, S, I, X, Y, and Z [15].
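The conversion of an RGB scan into several of the channel planes named above can be sketched as follows. This is an illustrative NumPy example only: the YUV matrix is the standard ITU-R BT.601 approximation and the XYZ matrix the sRGB/D65 one, and the Lab and HSI conversions are omitted for brevity; the exact formulas used by MaZda may differ.

```python
import numpy as np

# Standard linear-transform approximations (assumptions, not MaZda's exact
# conversions): ITU-R BT.601 for Y'UV and sRGB/D65 primaries for XYZ.
RGB2YUV = np.array([[ 0.299,  0.587,  0.114],
                    [-0.147, -0.289,  0.436],
                    [ 0.615, -0.515, -0.100]])
RGB2XYZ = np.array([[0.4124, 0.3576, 0.1805],
                    [0.2126, 0.7152, 0.0722],
                    [0.0193, 0.1192, 0.9505]])

def split_channels(rgb):
    """Map an H x W x 3 RGB image (values 0-255) to a dictionary of
    channel planes: R, G, B plus Y', U, V and X, Y, Z. The YUV luma is
    keyed "Y'" to distinguish it from the XYZ Y component."""
    rgb01 = rgb.astype(np.float64) / 255.0
    yuv = rgb01 @ RGB2YUV.T
    xyz = rgb01 @ RGB2XYZ.T
    return {"R": rgb01[..., 0], "G": rgb01[..., 1], "B": rgb01[..., 2],
            "Y'": yuv[..., 0], "U": yuv[..., 1], "V": yuv[..., 2],
            "X": xyz[..., 0], "Y": xyz[..., 1], "Z": xyz[..., 2]}
```

Each returned plane is a single-channel image from which texture features can then be computed independently, matching the per-channel analysis described above.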

Statistical analysis

To discriminate between the textures of apples ‘Szampion’, ‘Idared’ and ‘Gala’, discriminant analysis was performed using WEKA 3.9 software (Machine Learning Group, University of Waikato) [16]. The analyses were performed separately for the textures from images of the skin, cross-sections, and longitudinal sections of apples. The total accuracy and the accuracy of discrimination of the individual ‘Szampion’, ‘Idared’ and ‘Gala’ apple cultivars were determined. In the first stage of the analyses, the selection of texture parameters was carried out using the Best First search with the CFS (Correlation-based Feature Selection) subset evaluator [17]. This method of attribute selection provided the textures characterized by the highest accuracies. A test mode of tenfold cross-validation was used for the classification [17]. Discriminant analyses were performed for the individual color channels (R, G, B, L, a, b, U, V, H, S, I, X, Y, Z) using classifiers from the groups Bayes (Bayes Net, Naive Bayes), Functions (Logistic, Multilayer Perceptron), Meta (Multi Class Classifier, Filtered Classifier), Rules (JRip—Java Repeated Incremental Pruning, PART), and Decision Trees (J48, LMT—Logistic Model Tree) [17].
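The feature-selection step can be illustrated with a simplified stand-in for CFS. The sketch below ranks each texture feature by the absolute correlation between the feature and the integer-coded class label and keeps the top-scoring ones; true CFS additionally penalizes redundancy among the selected features and searches subsets with Best First, which this minimal version omits.

```python
import numpy as np

def rank_features(X, y, keep=5):
    """Simplified, assumption-laden stand-in for WEKA's CFS/Best First
    step: score each feature (column of X) by |correlation| with the
    integer class labels y and keep the `keep` best. CFS's redundancy
    penalty and subset search are deliberately omitted."""
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # standardize
    ys = (y - y.mean()) / (y.std() + 1e-12)
    scores = np.abs(Xs.T @ ys) / len(y)  # |Pearson correlation| per feature
    order = np.argsort(scores)[::-1][:keep]
    return order, scores[order]
```

A feature that tracks the class label perfectly gets a score near 1, a constant or uncorrelated feature near 0, so the discriminative texture parameters rise to the top of the ranking.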

The Bayes Net classifier learned Bayesian networks. For estimating the conditional probability tables of a network after learning its structure, the Simple Estimator with alpha equal to 0.5, which estimates probabilities directly from the data, was used. The batch size (the preferred number of instances to process at a time) was 100. The K2 search algorithm, using hill climbing restricted by an ordering of the variables with the maximum number of parents equal to 1 and the Bayes score type for judging the quality of a network structure, was applied. Naive Bayes, a probabilistic classifier, was used with a batch size of 100. The Logistic classifier applied a logistic regression model with a batch size of 100, a maximum number of iterations of −1 (until convergence), and a ridge value in the log-likelihood of 1.0E−8. The Multilayer Perceptron, based on a neural network with backpropagation, was used with a batch size of 100, hidden layers of ‘a’ = (attributes + classes)/2, a learning rate of 0.3, a momentum of 0.2, and a training time (number of epochs to train through) of 500. The Multi Class Classifier, which uses a two-class classifier for multiclass datasets, was applied with a batch size of 100 and a random width factor of 2.0. The Filtered Classifier, which runs a classifier on filtered data, was used with a batch size of 100 and the Discretize filter. JRip, the RIPPER (Repeated Incremental Pruning to Produce Error Reduction) algorithm for effective and fast rule induction, was applied with a batch size of 100, 3 folds, a minimum total weight of the instances in a rule of 2.0, and 2 optimization runs. PART, which builds partial decision trees using J4.8 to obtain the rules, was used with a batch size of 100, a minimum number of instances per rule of 2, a confidence factor for pruning of 0.25, and 3 folds.
J48, which generates decision trees using the C4.5 algorithm, was applied with a batch size of 100, 3 folds, a minimum number of instances per leaf of 2, and a confidence factor for pruning of 0.25. LMT, which uses logistic model trees, was applied with a batch size of 100, a minimum number of instances of 15, and a number of boosting iterations of −1 (the number was determined by cross-validation) [16, 17]. In addition to their simple implementation and configuration, usually applied at the default settings, all classifiers had the advantage of a short computation time. The highest classification accuracy was the criterion for the evaluation of the analyses. The results of cultivar discrimination for the selected channels with the highest level of correctness are included in this paper.
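The tenfold cross-validation protocol shared by all of these classifiers can be sketched in a few lines. To keep the example self-contained, the sketch uses a minimal nearest-centroid classifier, which is not one of the WEKA classifiers listed above; only the evaluation procedure (split into 10 folds, train on 9, score on the held-out fold, pool the accuracy) mirrors the test mode described in the text.

```python
import numpy as np

def cross_val_accuracy(X, y, folds=10, seed=0):
    """Tenfold cross-validation with a minimal nearest-centroid
    classifier (a placeholder, not a WEKA algorithm). Accuracy is the
    fraction of held-out samples whose nearest class centroid, computed
    on the training folds, matches their true label."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    correct = 0
    for part in np.array_split(idx, folds):       # held-out fold
        train = np.setdiff1d(idx, part)           # remaining 9 folds
        classes = np.unique(y[train])
        centroids = np.stack([X[train][y[train] == c].mean(axis=0)
                              for c in classes])
        # Euclidean distance of each test sample to each class centroid.
        d = np.linalg.norm(X[part][:, None, :] - centroids[None], axis=2)
        correct += int((classes[d.argmin(axis=1)] == y[part]).sum())
    return correct / len(y)
```

Because every sample is held out exactly once, the pooled score corresponds to the total accuracies reported in Tables 1-9, and per-class tallies within the folds would yield the per-cultivar accuracies.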

Results and discussion

The first stage of the study included the analysis of images of the outer skin of the apples. The images of selected areas of the skin of whole apples ‘Szampion’, ‘Idared’ and ‘Gala’ in different color channels are presented in Fig. 2. Differentiation of the skin images between cultivars can be seen in selected color channels, which may affect the results of discrimination.

Fig. 2

Images of the selected areas of skin of apples ‘Szampion’, ‘Idared’ and ‘Gala’ from different color channels

In the case of cultivar discrimination performed using textures from the images of the apple skin, the highest accuracies were obtained for channels R, a, and X. For textures from channel R (Table 1), the total accuracy reached 93% for the Bayes Net classifier, with apples of ‘Idared’ and ‘Gala’ classified with 100% correctness and ‘Szampion’ with 80% correctness. Additionally, apples ‘Gala’ were classified with 100% correctness in the case of the JRip, PART, and J48 classifiers; for the other classifiers, the accuracy for ‘Gala’ was 90%. These accuracy results are very high and satisfactory. Apples ‘Idared’ were correctly classified in 80–100% of cases, and an accuracy in the range of 50–90% was observed for apples ‘Szampion’. The lowest total accuracy, 80%, was determined in the case of the JRip classifier.
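The relation between the per-cultivar and total accuracies quoted here follows directly from the confusion matrix of a classifier. The sketch below computes both; the example counts are hypothetical, chosen only to be consistent with the Bayes Net figures for channel R (80%, 100%, 100% per cultivar and a total of about 93% over 100 replications per cultivar), since the actual misclassification split is not given in the text.

```python
import numpy as np

def per_class_accuracy(confusion):
    """Per-cultivar and total accuracy from a confusion matrix whose
    rows are true classes and columns are predicted classes."""
    confusion = np.asarray(confusion, dtype=float)
    per_class = np.diag(confusion) / confusion.sum(axis=1)  # row-wise recall
    total = np.trace(confusion) / confusion.sum()           # overall accuracy
    return per_class, total

# Hypothetical counts (rows: true 'Szampion', 'Idared', 'Gala'); only the
# diagonal and the total match the reported Bayes Net / channel R figures.
counts = [[80, 12, 8], [0, 100, 0], [0, 0, 100]]
```

With these counts, `per_class_accuracy(counts)` returns per-cultivar accuracies of 0.80, 1.00, and 1.00 and a total accuracy of 280/300, i.e., about 93%.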

Table 1 The accuracy of discrimination between different apple cultivars based on textures selected from channel R of images of skin of apples

In the case of the textures of the apple skin images selected from channel a (Table 2), the apples of different cultivars were classified with a total accuracy of up to 93% (100% for ‘Gala’, 90% for ‘Szampion’, and 90% for ‘Idared’) for the PART and J48 classifiers. As in the analyses carried out for the textures selected from channel R, for channel a the apples ‘Gala’ were classified most correctly (90–100%) across the individual classifiers. Apples ‘Szampion’ were classified with accuracies of 80–90% and ‘Idared’ with 70–90%. The lowest total accuracy, 80%, was obtained for the Logistic and Multi Class Classifier.

Table 2 The accuracy of discrimination between different apple cultivars based on textures selected from channel a of images of skin of apples

For the textures selected from channel X of the apple skin images (Table 3), the highest total accuracy of discrimination of apples ‘Szampion’, ‘Idared’ and ‘Gala’ was 90% for the Bayes Net, JRip, PART, and J48 classifiers, with the apples ‘Gala’ correctly classified in 100% of cases, ‘Idared’ in 90%, and ‘Szampion’ in 80%. Across the individual classifiers, the accuracy for ‘Gala’ ranged from 80 to 100%, for ‘Idared’ from 70 to 100%, and for ‘Szampion’ from 60 to 80%. The lowest total accuracy, 73%, was obtained in the case of the Logistic and Multi Class Classifier.

Table 3 The accuracy of discrimination between different apple cultivars based on textures selected from channel X of images of skin of apples

In the next stages of the study, the discriminant analyses were performed for the apple sections. For both the longitudinal section and the cross-section, of all the analyzed channels, the accuracies were highest for the textures selected from channels G, b, and U. The images of the longitudinal sections of apples ‘Szampion’, ‘Idared’ and ‘Gala’ in different color channels are presented in Fig. 3. In the case of discrimination based on the textures selected from channel G of the longitudinal section images (Table 4), the accuracy reached 100% for the Multilayer Perceptron and Multi Class Classifier; in these cases, all the examined apples ‘Szampion’, ‘Idared’ and ‘Gala’ were correctly classified. The obtained result is very satisfactory. Similarly, a high accuracy of 97% was observed for three classifiers (Bayes Net, Logistic, LMT) and an accuracy of 93% for one classifier (Naive Bayes).

Fig. 3

Images of the longitudinal sections of apples ‘Szampion’, ‘Idared’ and ‘Gala’ from different color channels

Table 4 The accuracy of discrimination between different apple cultivars based on textures selected from channel G of images of the longitudinal section

For the discriminant analyses performed based on the textures selected from channel b of the longitudinal section images (Table 5), a total accuracy of 100% was determined for the Logistic classifier. In the case of the other classifiers, the total accuracies were also very high: 97% for Naive Bayes, Multilayer Perceptron, Multi Class Classifier, and JRip; 93% for Bayes Net and LMT; and 90% for Filtered Classifier, PART, and J48. In all cases, the apples ‘Szampion’ were classified with 100% correctness. The accuracy for ‘Idared’ ranged from 90 to 100%, and for ‘Gala’ it was in the range of 70–100%.

Table 5 The accuracy of discrimination between different apple cultivars based on textures selected from channel b of images of a longitudinal section

Discriminant analysis of apples ‘Szampion’, ‘Idared’ and ‘Gala’ based on the textures from channel U of the longitudinal section images (Table 6) provided a total accuracy of up to 100% in the case of the Logistic classifier. Similarly, a high total accuracy of 97% was observed for Bayes Net, Multilayer Perceptron, Multi Class Classifier, PART, and J48, and an accuracy of 93% was determined for the Naive Bayes, Filtered Classifier, and LMT classifiers. Considering the classification accuracy of the individual apple cultivars, 100% was obtained for ‘Szampion’ in the case of all classifiers, 80–100% for ‘Idared’, and 80–100% for ‘Gala’.

Table 6 The accuracy of discrimination between different apple cultivars based on textures selected from channel U of images of a longitudinal section

The following stages of the analysis included the evaluation of the discrimination accuracies of different apple cultivars using selected textures from images of the cross-section. Examples of images of the cross-sections of apples ‘Szampion’, ‘Idared’ and ‘Gala’ converted to different color channels are presented in Fig. 4. In the case of the cross-section, the accuracies were slightly lower than for the longitudinal section analyses. Discrimination of different apple cultivars based on textures selected from channel G of the cross-section images (Table 7) revealed a total accuracy of up to 93% for the Naive Bayes classifier, including 100% for ‘Idared’ and ‘Gala’ and 80% for ‘Szampion’. A slightly lower total accuracy of 90% was determined for the Multilayer Perceptron and LMT classifiers. Evaluation of the classification accuracy of the individual apple cultivars revealed that 100% was obtained for ‘Szampion’, ‘Idared’ and ‘Gala’ with selected classifiers.

Fig. 4

Images of the cross-sections of apples ‘Szampion’, ‘Idared’ and ‘Gala’ from different color channels

Table 7 The accuracy of discrimination between different apple cultivars based on textures selected from channel G of images of cross-section

In the case of the analysis performed for textures from channel b of the cross-section images (Table 8), the apples ‘Szampion’, ‘Idared’ and ‘Gala’ were classified with a total accuracy of up to 97% (LMT classifier), with 100% observed for ‘Szampion’, 100% for ‘Idared’, and 90% for ‘Gala’. A high correctness of 93% was also determined for Naive Bayes and J48, and 90% was obtained for Bayes Net, Logistic, Multilayer Perceptron, Filtered Classifier, JRip, and PART. Among the individual apple cultivars, only ‘Szampion’ and ‘Idared’ were classified with 100% correctness for selected classifiers; the highest correctness for ‘Gala’ was 90%.

Table 8 The accuracy of discrimination between different apple cultivars based on textures selected from channel b of images of cross-section

The highest total accuracy of discrimination of apples ‘Szampion’, ‘Idared’ and ‘Gala’ based on textures from channel U of the cross-section images (Table 9) reached 97% (100% for ‘Szampion’, 100% for ‘Idared’, and 90% for ‘Gala’) in the case of the LMT classifier. Similarly, a high total accuracy of 93% was determined for Bayes Net, Logistic, and Multilayer Perceptron, and 90% for Naive Bayes. For selected individual classifiers, only the apples ‘Szampion’ and ‘Idared’ were fully correctly classified (100%), and the apples ‘Gala’ were classified with an accuracy of up to 90%.

Table 9 The accuracy of discrimination between different apple cultivars based on textures selected from channel U of images of cross-section

Based on the obtained results, it was found that the selected texture features of the skin, longitudinal section, and cross-section of apples calculated using image analysis can be very useful for the discrimination of different apple cultivars. The results can be applied in practice for the detection of cultivar falsification of apples. High accuracies of cultivar discrimination have also been reported in the available literature for apples and other fruit, as well as vegetables, where digital image processing was applied for cultivar identification. It may also constitute an alternative to other techniques and methods, some of which may be more expensive or time-consuming. According to Ronald and Evans [4], classification based on size and color features from images of different apple cultivars performed using Naive Bayes revealed accuracies of 92% for Honey Crisp, 91% for Pink Lady, and 90% for Golden Delicious in the case of the validation data set, as well as in the case of the test data set. Sofu et al. [11] classified apples ‘Golden Delicious’, ‘Starking Delicious’ and ‘Granny Smith’ using color, weight, size, and stain parameters. In the case of the color features of the experimental apples, the sorting accuracy rates were equal to 100.00% for ‘Starking Delicious’, 95.08% for ‘Golden Delicious’, and 93.44% for ‘Granny Smith’. In the case of the weight estimation of these apple cultivars, the RMSE (Root Mean Square Error) ranged from 0.80 for ‘Golden Delicious’ to 9.09 for ‘Granny Smith’. The sorting rate of stains was in the range of 65–80%. Using four features (color, size, stain, weight), the average sorting accuracy ranged from 79.00% for a speed of 0.2 m s−1 to 89.00% for a speed of 0.05 m s−1 [11]. The cultivar classification accuracy for ‘Gala’, ‘Granny Smith’, ‘Red Delicious’ and ‘Golden Delicious’ obtained using textures calculated from on-tree images reached 84% [18].
Rudnik and Michalski [19] obtained a total classification accuracy equal to 99.78% for a testing data set of images of six apple cultivars using deep convolutional neural networks. Naranjo-Torres et al. [20] reported high discrimination accuracies for different cultivars of apples and pears. The fruit of the Apple Red 1 class was identified with a correctness equal to 100%, and accuracies of 95% and 93% were observed in the case of the Apple Golden 1 and Apple Pink Lady classes, respectively. In the case of pears, accuracies of 100%, 99% and 88% were determined for the Pear Williams, Pear Monster, and Pear Red classes, respectively. Sabzi et al. [7] discriminated orange cultivars with a correctness of up to 96.70% using neural networks combined with metaheuristic algorithms. Dubey and Jalal [21] identified different categories of fruit and vegetables, including Fuji Apple, Granny Smith Apple, Agata Potato, Asterix Potato, Cashew, Diamond Peach, Honeydew Melon, Kiwi, Nectarine, Onion, Orange, Plum, Spanish Pear, Taiti Lime, and Watermelon. Using texture parameters based on the improved sum and difference histogram calculated from color images, an average accuracy of 98.90% for the HSV color space was observed, with Granny Smith Apple, Agata Potato, Asterix Potato, Cashew, Diamond Peach, Onion, Taiti Lime, and Watermelon correctly classified in 100% of cases. Chithra and Henila [12] reported that the differentiation of apples (‘Fuji’, ‘Granny Smith’, ‘Red Delicious’) from bananas (yellow and green cultivars) based on features computed using Hue from HSI images can reach a correctness of up to 100%. In the case of the discrimination of different cultivars of mango fruit based on geometric and texture features using the Naive Bayes algorithm, accuracies of 100% for HinThar, 95% for PanSwae and YinKwal, and 90% for MaSawYin and SeinTaLone were obtained for the testing data set [22].
In the case of the identification of six nectarine cultivars based on features from images of the central part of the skin, a total accuracy of up to 100% was determined for feature histogram vectors calculated based on the Rg (red and gray) and YR (luminance and normalized red) intensity color layers. It was a very satisfactory result compared to the cultivar classification accuracy of 87% obtained by an expert human operator [8]. In the case of the application of hyperspectral imaging in the VNIR (visible and near-infrared) and SWIR (short-wavelength infrared) spectral regions coupled with the Sequential Minimal Optimization (SMO) classifier, an accuracy of 93.3% was observed for the discrimination of apples ‘Champion’, ‘Idared’, ‘Gloster’, ‘Golden Delicious’ and ‘Topaz’ [23]. In addition to imaging techniques, there are literature data regarding the application of other techniques and methods for cultivar discrimination. Moriya et al. [24] and Ban et al. [25] used simple sequence repeat (SSR) markers for the identification of different apple cultivars. Marrazzo et al. [26] indicated the usefulness of an electronic nose chemical sensor for the discrimination of different apple cultivars. Shang et al. [27] classified three apple cultivars using dielectric properties with accuracies of approximately 100%. Cappellin et al. [28] reported that PTR-ToF–MS (Proton Transfer Reaction Time-of-Flight Mass Spectrometry) combined with data mining methods enabled the discrimination of apple cultivars (‘Gala’, ‘Golden Delicious’, ‘Fuji’). Vis/NIR spectroscopy combined with a wavelet transform–artificial neural network model disclosed 100% classification accuracy for apples ‘Fuji’, ‘Copefrut Royal Gala’ and ‘Red Delicious’ [29]. Also, in-line Vis/NIR diffuse reflectance spectroscopy allowed high total accuracies to be obtained: 98% for red apple cultivars (‘Fuji’, ‘Red Delicious’, ‘Royal Gala’) and 85% for yellow apple cultivars (‘Golden Delicious’, ‘Golden Rosé’) [3].
The application of near-infrared spectroscopy coupled with clustering algorithms for the discrimination of apples ‘Fuji’, ‘Gala’, ‘Huaniu’ and ‘Huangjiao’ provided accuracies of 97% for fuzzy discriminant c-means clustering, 67% for fuzzy c-means clustering, 57.5% for Gustafson–Kessel clustering, and 48.5% for possibilistic c-means clustering [5]. The classification of apples ‘Fuji’, ‘Red Star’ and ‘Gala’, each cultivar from two different cultivation regions, using near-infrared spectroscopy combined with the successive projections algorithm (SPA) and an extreme learning machine (ELM) provided a high correctness of 96.67% for the prediction set [9]. The application of visible-range reflectance spectroscopy for the discrimination of apple cultivars revealed an accuracy of up to 94% [30].

Conclusions

The research confirmed the usefulness of image processing based on selected texture features of the skin, longitudinal section, and cross-section of apples for the discrimination of different apple cultivars. It also showed that a flatbed scanner provides very satisfactory results and may be an alternative to a camera or other devices. The total accuracies reached 100% for discrimination performed with textures from images of the longitudinal section of apples; results of 100% were observed for three color channels, G, b, and U. A slightly lower correctness of up to 97% was determined for images of the cross-section based on selected textures from channels b and U. In the case of discrimination performed for the apple skin, the highest total accuracy was the lowest of the three image types, reaching 93% for textures from channels R and a. The results revealed that texture features calculated using image analysis may allow for the cultivar identification of apples in an objective, inexpensive, and fast way with a very high probability. This can be very important in practice for detecting the falsification of apple cultivars.