
1 Introduction

Apple is a vital commercial crop worldwide. About 90 countries and regions grow apples, and China is a major country for apple planting and sales. In 2015, China's apple yield reached 42.6134 million tons, accounting for up to 55 percent of global apple output [1,2,3]. Fruit operations in apple orchards include fruit thinning and bagging in the young fruit stage and harvesting in the maturation period. These operations account for 40 percent of the workload of the entire apple production cycle, and they still rely on manual labor. However, the labor shortage and rising labor costs in rural areas brought about by rapid urbanization make the current human-based production pattern unable to meet the ever-increasing scale of apple production [4]. Therefore, introducing robots to replace human labor in fruit operations is an inevitable trend in the digitization and mechanization of modern orchard management [5, 6]. As the eye of the orchard robot, the visual system plays a crucial role in apple detection, and the accuracy and speed of fruit detection directly affect the efficiency of fruit operations. Research on detection approaches for young green apples is therefore of great practical significance.

In the past few years, many scholars have studied fruit detection in natural environments [7,8,9,10]. Exploiting the fact that apple fruit has a stronger red component than background such as branches and leaves, Bulanon et al. [11] separated fruit from background using an optimal threshold segmentation algorithm, and the detection success rate for Fuji apples reached 88%. Huang et al. [12] developed a method based on 2D fuzzy entropy for mature apple detection by combining a two-dimensional histogram and a genetic optimization algorithm in the L*a*b* color space, and this method performed well for red apple detection. To solve the problem of varying illumination in apple detection, Kim et al. [13] developed an improved statistical transformation method that reduced the impact of illumination by extracting structural information, and successfully detected red apples in orchards by combining the L*a*b* color space with the AdaBoost algorithm, with a success rate of 80.68%. Qian et al. [14] proposed a detection method for mature apples based on a mixed color model of R/B and V, and the success rate reached 84.9%. The studies above focus primarily on mature apples, which have sufficient color contrast with the background. In this case, good detection performance can be achieved using only the color feature of the objects. For green fruit detection, Rakun et al. [15] and Linker et al. [16] developed methods for green apple detection and count estimation using color and texture features. With the help of a ring flash lamp, Zhang et al. [17] developed a hybrid classifier based on color features and a support vector machine for green fruit detection at night, and the average correct detection rate was 89.3%. Si et al. [18] proposed an algorithm integrating the gap statistic with K-means for green apple detection, and the success rate reached 81%.
Most of the green fruit detection studies above were carried out in the fruit expansion stage. In this stage, fruits are still green, but they are large, and thick foliage may cut off solar glare, which evens out the illumination around the fruit and lowers the effect of the sun's diffuse light. For apples in the young fruit stage, however, fruits, stems and leaves have higher color similarity and the shaded area is small, so the images are easily influenced by uneven illumination from sunlight, and good detection performance cannot be obtained using a single feature. To address this issue, the goal of this study was to develop a computer vision approach combining an adaptive chromatic aberration algorithm and a HOG-SVM model to detect young apples in natural environments using conventional color images. The proposed method combines computing a chromatic aberration map, segmenting the region of interest (ROI) with an iterative threshold segmentation algorithm, extracting potential apple objects using the circular Hough transformation (CHT) algorithm, and distinguishing and removing false fruits using the histogram of oriented gradients (HOG) feature and a grid-search-optimized support vector machine (GS-SVM) classifier.

2 Materials and Methods

2.1 Image Acquisition

For developing and testing the proposed algorithm, images of apples were acquired in an orchard under natural conditions using a digital camera (D90, Nikon) with a resolution of 4288 × 2848 pixels. Images were taken randomly in June 2016 in the apple orchard of the Institute of Pomology of CAAS, Xingcheng, China. Hanfu apples were selected as the test variety. The purpose of this research was to develop a detection method for young green apples using color images, so the images were resized to 858 × 570 pixels using bicubic interpolation for computational convenience.

2.2 ROI Extraction

ROI extraction is an important step in young apple detection, and color is the most intuitive feature for object separation. The common color spaces in computer vision mainly include three-primary-color spaces such as the RGB model; luminance-chrominance spaces such as the L*a*b* model; perceptual spaces such as the HSV/HSI models; and independent-axis spaces such as the Ohta model [19]. Background such as green leaves and stems has a color similar to young apples, so finding a color characteristic that can effectively separate fruits from the rest was a critical issue. A line section approach was used to analyze the distinction of young green apples in color images, and the results show that, for apple pixels, the G component was stronger than the B component in the RGB color space, as shown in Fig. 1.

Fig. 1. Results of line section analysis

Although G-B chromatic aberration segmentation can distinguish fruits from the background in some cases, the color of fruit images may be affected by diffuse light. To further decrease this influence, the color feature can be adapted to varying illumination by changing the coefficients m and n in the chromatic aberration formula mG-nB. Up to the scale factor m, the formula mG-nB is equivalent to G-(n/m) × B, and when n and m take the average gray values of the image's B and G components, respectively, the adaptive green and blue chromatic aberration (AGBCA) formula reduces to Eq. (1):

$$ AGBCA = G - \frac{{B_{avg} }}{{G_{avg} }} \times B $$
(1)

In which G and B are the G and B components of the RGB image, respectively, and Gavg and Bavg are the mean values of the image's G and B components, respectively.
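As an illustration of Eq. (1), the following is a minimal sketch in plain Python (the study itself used Matlab). The function name `adaptive_gb_map` and the nested-list image representation are assumptions made for this example only.

```python
def adaptive_gb_map(green, blue):
    """Per-pixel G - (B_avg / G_avg) * B for two equally sized 2D lists."""
    n_pixels = sum(len(row) for row in green)
    g_avg = sum(sum(row) for row in green) / n_pixels
    b_avg = sum(sum(row) for row in blue) / n_pixels
    ratio = b_avg / g_avg  # assumes the G channel is not entirely zero
    return [[g - ratio * b for g, b in zip(g_row, b_row)]
            for g_row, b_row in zip(green, blue)]


# Tiny 2 x 2 example with hypothetical channel values.
green = [[200, 100], [100, 0]]
blue = [[100, 50], [50, 0]]
agbca = adaptive_gb_map(green, blue)
```

Here the channel means are 100 (G) and 50 (B), so every pixel becomes G - 0.5 × B.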

After the adaptive green and blue chromatic aberration map had been computed, the ROI was separated from the background using the iterative threshold segmentation (ITS) algorithm. The main idea of the ITS algorithm is to choose an initial threshold and then update it according to a specific strategy until a predefined criterion is met [20]. The detailed steps are as follows:

  • Step 1. Take the average intensity of the image as the initial threshold T0, obtained by Eq. (2):

  • $$ T_{0} = \frac{1}{{P_{1} }}\sum\limits_{m} {\sum\limits_{n} {Q(m,n)} } $$
    (2)
  • In which P1 is the total number of image pixels, and Q(m, n) is the intensity of the pixel at (m, n).

  • Step 2. The image is divided into two regions, D1 and D2, using the current threshold (initially T0), and the average intensities of D1 and D2, \( \mu_{1} \) and \( \mu_{2} \), are defined by Eqs. (3) and (4):

  • $$ \mu_{1} = \frac{{\sum\limits_{{Q(m,n) < T_{c} }} {Q (m,n )P_{ 1} (m,n )} }}{{\sum\limits_{{Q(m,n) < T_{c} }} {P_{ 1} (m,n )} }} $$
    (3)
  • $$ \mu_{2} = \frac{{\sum\limits_{{Q(m,n) > T_{c} }} {Q (m,n )P_{ 1} (m,n )} }}{{\sum\limits_{{Q(m,n) > T_{c} }} {P_{ 1} (m,n )} }} $$
    (4)
  • In which Tc is the current threshold and P1(m, n) is the intensity value of each pixel in the image.

  • Step 3. Substitute \( \mu_{1} \) and \( \mu_{2} \), calculated from Eq. (3) and Eq. (4) respectively, into Eq. (5) to obtain a new threshold Tc + 1.

  • $$ T_{c + 1} = \frac{{\mu_{1} + \mu_{2} }}{2} $$
    (5)
  • Step 4. Repeat Step 2 and Step 3 until the difference between Tc + 1 and Tc is lower than a predefined value.

  • Step 5. Use the value obtained from the iterative computation as the threshold for binarizing the image, separating potential apple areas from the background.
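The five steps above can be sketched as follows. This is a simplified illustration rather than the study's implementation: it works on a flat list of intensities, uses plain (unweighted) region means, assigns pixels equal to the threshold to the upper region, and the stopping tolerance `eps` and the cap `max_iter` are assumed values.

```python
def iterative_threshold(values, eps=0.5, max_iter=100):
    """Iterative threshold selection on a flat list of pixel intensities."""
    t = sum(values) / len(values)           # Step 1: mean intensity as T0
    for _ in range(max_iter):
        low = [v for v in values if v < t]   # Step 2: region D1
        high = [v for v in values if v >= t]  # Step 2: region D2
        if not low or not high:             # degenerate split, stop early
            break
        t_new = (sum(low) / len(low) + sum(high) / len(high)) / 2  # Step 3
        converged = abs(t_new - t) < eps    # Step 4: stopping criterion
        t = t_new
        if converged:
            break
    return t


# Hypothetical intensities: a dark background and a bright object.
pixels = [10, 12, 11, 13, 200, 210, 205, 10]
threshold = iterative_threshold(pixels)
mask = [1 if v >= threshold else 0 for v in pixels]  # Step 5: binarization
```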

2.3 Image Denoising and Detection of Potential Apple Objects

Some background pixels with fruit-like color were not separated by ROI segmentation, and they became image noise that affected the accuracy of fruit detection. To solve this issue, morphological processing with a disk structuring element was used to reduce image noise. First, the image was eroded using Eq. (6). Then, the object region was dilated using Eq. (7) to eliminate noise in the image. Finally, a flood-fill algorithm was used to fill the holes in the apple regions of the processed image.

$$ A \ominus B = \left\{ {x|\left( B \right)_{x} \subseteq A} \right\} $$
(6)
$$ A \oplus B = \left\{ {x|\left( {\hat{B}} \right)_{x} \cap A \ne \varnothing } \right\} $$
(7)

In which A is the image being processed, B is the disk structuring element, \( (B)_{x} \) is B translated by x, and \( \hat{B} \) is the reflection of B.
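Erosion followed by dilation (a morphological opening) can be sketched on a small binary image as below. The helper names, the radius-1 disk and the toy image are assumptions for illustration; hole filling is not shown.

```python
def disk_offsets(radius):
    """(dy, dx) offsets of a discrete disk structuring element."""
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= radius * radius]


def _is_set(img, y, x):
    # Pixels outside the image are treated as background.
    return 0 <= y < len(img) and 0 <= x < len(img[0]) and img[y][x] == 1


def erode(img, offsets):
    """Keep a pixel only if the translated element fits inside the object."""
    return [[1 if all(_is_set(img, y + dy, x + dx) for dy, dx in offsets) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def dilate(img, offsets):
    """Set a pixel if the reflected, translated element hits the object."""
    return [[1 if any(_is_set(img, y - dy, x - dx) for dy, dx in offsets) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


image = [
    [0, 0, 0, 0, 0, 1],   # isolated noise pixel in the top-right corner
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
se = disk_offsets(1)
opened = dilate(erode(image, se), se)
```

In this toy case the isolated pixel disappears, while the 3 × 3 object survives with its corners rounded off by the disk.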

Much of the tiny noise was filtered out by morphological processing, but some larger noise regions remained and needed further treatment. Observation shows that apple fruits, relative to stems and leaves, are closer to circular: the eccentricities of the fruits were usually less than 0.9, and the areas of their connected regions were commonly larger. On this basis, blob analysis was applied for further image denoising. The concrete steps are as follows:

  • Step 1. Define every connected region as a blob; the eccentricity E of each blob is calculated by Eq. (8).

  • $$ E = \frac{(rd - nd)}{(rd + nd)} $$
    (8)
  • In which rd is the semi-major axis and nd is the semi-minor axis.

  • Step 2. Let the calculated set of blob eccentricities be Ei, i = 1…p, where p is the number of blobs. If Ei is greater than 0.9, set all pixels in the i-th blob to 0.

  • Step 3. Compute the pixel area of each blob remaining after the eccentricity analysis.

  • Step 4. Let the calculated set of blob pixel areas be Rj, j = 1…q, where q is the number of blobs. If Rj satisfies Eq. (9), set all pixels in the j-th blob to 0.

  • $$ R_{j} \le \gamma R_{\hbox{max} } $$
    (9)

In which \( \gamma \) is a constant, set to \( \gamma \) = 1/5 in this study, and Rmax is the pixel count of the largest blob in the image.
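The two filtering rules can be sketched as below, assuming the blobs have already been measured. The dictionary layout and the sample measurements are hypothetical, and taking Rmax over the blobs that survive the eccentricity test is an assumption of this sketch.

```python
def eccentricity(rd, nd):
    """Eq. (8): rd is the semi-major axis, nd the semi-minor axis."""
    return (rd - nd) / (rd + nd)


def filter_blobs(blobs, max_ecc=0.9, gamma=1 / 5):
    """Drop elongated blobs first, then blobs with area <= gamma * R_max."""
    round_enough = [b for b in blobs
                    if eccentricity(b["rd"], b["nd"]) <= max_ecc]
    if not round_enough:
        return []
    r_max = max(b["area"] for b in round_enough)
    return [b for b in round_enough if b["area"] > gamma * r_max]


# Hypothetical blob measurements (axes in pixels, area in pixel counts).
blobs = [
    {"name": "A", "rd": 30, "nd": 28, "area": 2600},  # round and large
    {"name": "B", "rd": 60, "nd": 2, "area": 400},    # elongated, E > 0.9
    {"name": "C", "rd": 10, "nd": 9, "area": 300},    # round but too small
    {"name": "D", "rd": 25, "nd": 20, "area": 1500},  # round and large
]
kept = [b["name"] for b in filter_blobs(blobs)]
```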

After the processing above, the ROI and the background were largely separated, but a few noise patches with eccentricity and pixel area similar to apple fruits still remained. This noise can no longer be handled with the color feature, which poses a challenge for fruit extraction under similar colors. Observation shows that although young and mature apples differ in size, both still present a spherical shape, so partial circular contours can be obtained [21].

Circular Hough transformation (CHT) is a variant of the standard Hough transformation. It first calculates the parameters of a circle from the pixels located on a known segment of its perimeter, and then fits the segment to a circle using those parameters [22]. The parametric equation of a circle with known center and radius is given by Eq. (10).

$$ \begin{aligned} x &= x_{0} + r\,\cos \theta \\ y &= y_{0} + r\,\sin \theta \end{aligned} $$
(10)

In which \( r \) is the radius of the circle, and \( \theta \) is the angle between the x axis and the line \( \overrightarrow {OP} \) joining the center point \( O(x_{0} ,y_{0} ) \) and a point \( P(x,y) \) on the circle.

CHT is robust to varying illumination, noise interference and partial occlusion.
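To make the voting idea concrete, the sketch below implements the classic CHT with one accumulator entry per (center, radius) triple. It is not the phase-coding variant; the function name, the 1-degree angular step and the sample edge points are assumptions for illustration.

```python
import math


def hough_circle(points, radii, step_deg=1):
    """Classic CHT: every edge point votes for all centers it could lie on."""
    votes = {}
    for x, y in points:
        for r in radii:
            for deg in range(0, 360, step_deg):
                theta = math.radians(deg)
                # Invert Eq. (10): a point on the circle implies this center.
                x0 = round(x - r * math.cos(theta))
                y0 = round(y - r * math.sin(theta))
                key = (x0, y0, r)
                votes[key] = votes.get(key, 0) + 1
    return max(votes, key=votes.get)  # accumulator cell with the most votes


# Integer points lying exactly on a circle of radius 5 centered at (10, 10).
edge_points = [(15, 10), (5, 10), (10, 15), (10, 5),
               (13, 14), (13, 6), (7, 14), (7, 6),
               (14, 13), (14, 7), (6, 13), (6, 7)]
center_x, center_y, radius = hough_circle(edge_points, radii=range(4, 7))
```

Restricting `radii` to a plausible range keeps the accumulator small, which is the same motivation as limiting the search to the expected fruit radii.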

Observation and analysis of the fruits in the images show that the radii of young apples are between 20 and 60 pixels, so this was used as the radius range of CHT detection to improve the detection efficiency for circular objects. However, traditional CHT detection votes for candidate circle centers separately for each fixed radius, and this strategy requires a 3D array to store the votes over radii, which takes more memory and processing time. To remedy this issue, Atherton et al. [23] proposed a phase-coding CHT that uses a 2D accumulator array, in which the radius is encoded as phase in the accumulator and accumulation is carried out with a phase-coded annulus kernel. This scheme estimates the circle radius faster [24] and can better satisfy the real-time requirement of young green apple detection. Given these considerations, the phase-coding CHT algorithm was adopted in this study for detecting potential fruit objects.

Two major parameters determine the results of circle extraction during circular object detection. One is the sensitivity factor, which reflects the sensitivity of the accumulator array: the greater the sensitivity, the stronger the ability to detect weak and blurred circles. The other is the edge gradient threshold, which reflects the sensitivity to edge pixels: the lower the threshold, the stronger the ability to detect weak edges. Experiments showed that circular objects were detected well when the sensitivity factor was 0.9 and the edge gradient threshold was 0.4.

2.4 Removal of False Apple Objects

After the procedure above, potential apple objects have been detected, but a few leaves might be identified as fruits, causing false positive errors. To remove false fruit objects from the image and improve the accuracy of fruit detection, a model combining the HOG feature and a GS-SVM classifier was proposed.

2.4.1 HOG Feature Extraction

The HOG feature is a descriptor of local image features. The main idea of the algorithm is that local object appearance and shape can often be characterized rather well by the distribution of local intensity gradients or edge directions in an image [25]. The HOG feature is largely insensitive to geometric and photometric deformation of the image, which weakens the influence of varying illumination and object pose. The HOG feature is currently used in pedestrian detection, but has not yet been applied to apple detection. The main processing steps of HOG feature extraction are as follows:

  • Step 1. Calculate the horizontal and vertical gradients of the image being processed. The gradients at pixel (x, y) are defined by Eq. (11) and Eq. (12):

  • $$ G_{x} (x,y) = I(x + 1,y) - I(x,y) $$
    (11)
  • $$ G_{y} (x,y) = I(x,y + 1) - I(x,y) $$
    (12)
  • In which \( G_{x} (x,y),G_{y} (x,y),I(x,y) \) are the horizontal gradient, vertical gradient and pixel value at pixel (x, y), respectively.

  • Step 2. Calculate the gradient magnitude and direction of every pixel from its horizontal and vertical gradients. The gradient magnitude \( G(x,y) \) and gradient direction \( \alpha (x,y) \) at pixel (x, y) are defined by Eqs. (13) and (14):

  • $$ G(x,y) = \sqrt {G_{x} (x,y)^{2} + G_{y} (x,y)^{2} } $$
    (13)
  • $$ \alpha (x,y) = \arctan \left( {\frac{{G_{y} (x,y)}}{{G_{x} (x,y)}}} \right) $$
    (14)
  • Step 3. Divide the image into cells of 8 × 8 pixels and quantize the gradient direction into 9 bins; the histogram of oriented gradients of every cell is then counted, forming the HOG descriptor of the cell.

  • Step 4. Combine every 2 × 2 adjacent cells into a block and concatenate the descriptors of its 4 cells to form the HOG descriptor of the block.

  • Step 5. Concatenate the HOG descriptors of all blocks to obtain the HOG descriptor of the image.
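A compact sketch of these five steps is given below. Several details are assumptions not stated in the text: unsigned orientations over 0 to 180 degrees, magnitude-weighted votes without interpolation, blocks that overlap with a stride of one cell, L2 block normalization, and zero gradients at the last row and column. The function name and the synthetic test patch are likewise illustrative.

```python
import math


def hog_descriptor(img, cell=8, bins=9):
    """Minimal HOG for a 2D list of gray values (see assumptions above)."""
    h, w = len(img), len(img[0])
    cells_y, cells_x = h // cell, w // cell
    hist = [[[0.0] * bins for _ in range(cells_x)] for _ in range(cells_y)]
    for y in range(cells_y * cell):
        for x in range(cells_x * cell):
            # Steps 1-2: forward differences, then magnitude and direction.
            gx = img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
            gy = img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(ang / (180.0 / bins)), bins - 1)
            # Step 3: accumulate the vote in the histogram of the pixel's cell.
            hist[y // cell][x // cell][b] += mag
    descriptor = []
    for by in range(cells_y - 1):
        for bx in range(cells_x - 1):
            # Step 4: concatenate the 2 x 2 cells of one block and normalize.
            block = (hist[by][bx] + hist[by][bx + 1]
                     + hist[by + 1][bx] + hist[by + 1][bx + 1])
            norm = math.sqrt(sum(v * v for v in block)) + 1e-6
            descriptor.extend(v / norm for v in block)  # Step 5
    return descriptor


# Synthetic 48 x 48 patch, used only to show the descriptor length.
patch = [[(x * y) % 256 for x in range(48)] for y in range(48)]
features = hog_descriptor(patch)
```

Under these assumptions a 48 × 48 patch gives 6 × 6 cells and 5 × 5 blocks, so the descriptor has 5 × 5 × 4 × 9 = 900 entries.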

2.4.2 Improved Support Vector Machine Based on Grid Search Optimization Algorithm

The HOG features extracted from the image need to be analyzed and classified to identify whether the detected circular objects are real fruits. The support vector machine (SVM) is a machine learning model suitable for binary classification with small training samples [28]. It uses a kernel function to map the nonlinear input space into a high-dimensional space, where the optimal classification hyperplane is solved [29]. Let \( (\overrightarrow {{x_{1} }} ,y_{1} ), \ldots (\overrightarrow {{x_{N} }} ,y_{\text{N}} ) \) be the training sample set for classification, where the sample feature vector \( \overrightarrow {x} \in R^{m} \), m is the dimension of the feature space, the class label \( y \in \{ - 1, + 1\} \), and N is the number of samples. Training on these samples yields a classification function that can classify test data correctly. For non-linear classification, constructing and solving the optimal classification function is defined by Eq. (15):

$$ \left\{ {\begin{array}{*{20}l} {\max L(\alpha ) = \sum\limits_{i = 1}^{N} {\alpha_{i} } - \frac{1}{2}\sum\limits_{i = 1}^{N} {\sum\limits_{j = 1}^{N} {\alpha_{i} \alpha_{j} y_{i} y_{j} K(x_{i} ,x_{j} )} } } \hfill \\ {s.t.\;\sum\limits_{i = 1}^{N} {\alpha_{i} y_{i} } = 0} \hfill \\ {0 \le \alpha_{i} \le c,\;i = 1, \ldots ,N} \hfill \\ \end{array} } \right. $$
(15)

In which c is the penalty factor, reflecting the degree of penalty for wrongly classified samples. The optimal solution \( \alpha^{ *} = \left( {\alpha_{1}^{*} ,\alpha_{2}^{*} , \cdots ,\alpha_{N}^{*} } \right)^{T} \) is obtained from the equation above, and the classification threshold \( b^{*} \) is given by Eq. (16), where j indexes any sample with \( 0 < \alpha_{j}^{*} < c \):

$$ b^{*} = y_{j} - \sum\limits_{i = 1}^{N} {y_{i} \alpha_{i}^{*} K(x_{i} ,x_{j} )} $$
(16)

The optimal classification function \( H\left( x \right) \) is then given by Eq. (17):

$$ H(x) = \text{sgn} \left( {\sum\limits_{i = 1}^{N} {\alpha_{i}^{*} y_{i} K(x_{i} ,x)} + b^{*} } \right) $$
(17)

In which \( \text{sgn} ( \cdot ) \) is the sign function and K is the kernel function.

Because the radial basis function (RBF) offers fast learning convergence and a strong ability to approximate non-linear functions [30], this study used the RBF as the kernel function, defined by Eq. (18):

$$ K(x_{i} ,x) = \exp ( - \left\| {x - x_{i} } \right\|^{2} /g) $$
(18)

In which g is the parameter of the kernel function.
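The kernel and the decision function can be sketched as follows, using the negative exponent of the standard Gaussian form. The support vectors, multipliers and threshold in the usage lines are made-up values standing in for a trained model, and treating a score of exactly zero as the positive class is an assumption.

```python
import math


def rbf_kernel(x, xi, g):
    """RBF kernel exp(-||x - xi||^2 / g) with kernel parameter g > 0."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / g)


def decide(x, support_vectors, labels, alphas, b, g):
    """Sign of the kernel expansion over the support vectors plus threshold b."""
    score = sum(a * y * rbf_kernel(sv, x, g)
                for sv, y, a in zip(support_vectors, labels, alphas)) + b
    return 1 if score >= 0 else -1


# Hypothetical two-support-vector model.
svs = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]
alphas = [1.0, 1.0]
near_positive = decide([1.9, 1.9], svs, labels, alphas, b=0.0, g=1.0)
near_negative = decide([0.1, 0.0], svs, labels, alphas, b=0.0, g=1.0)
```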

The penalty factor c and the kernel parameter g determine the training result of the classification model [31]. To solve the problem of the high classification error rate caused by improper parameter selection in the traditional SVM, the grid search algorithm, which searches exhaustively for the best parameters over a given range, was introduced to optimize the parameters. The optimization procedure is as follows:

  • Step 1. Mesh the parameters c and g over a given range, then traverse every point in the grid and take its (c, g) values.

  • Step 2. Calculate the classification accuracy on the validation set for the chosen c and g using the K-fold cross-validation (K-CV) algorithm.

  • Step 3. Repeat Step 2 for every grid point, and take the pair of c and g with the highest validation accuracy as the optimal parameters.
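The procedure can be sketched generically as below. The callback `score_fold`, which would train an SVM on the training indices and return its accuracy on the validation indices, is left to the caller; the toy scorer, the power-of-two grids and K = 5 are assumptions chosen only to make the example runnable.

```python
import math


def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for K-fold cross-validation."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


def grid_search(score_fold, n_samples, c_grid, g_grid, k=5):
    """Return (c, g, mean accuracy) of the best grid point."""
    best = (None, None, -1.0)
    for c in c_grid:                      # Step 1: traverse the grid
        for g in g_grid:
            scores = [score_fold(c, g, train, val)   # Step 2: K-CV accuracy
                      for train, val in k_fold_indices(n_samples, k)]
            mean = sum(scores) / len(scores)
            if mean > best[2]:            # Step 3: keep the best pair
                best = (c, g, mean)
    return best


def toy_score(c, g, train_idx, val_idx):
    # Stand-in for "train on train_idx, evaluate on val_idx"; it simply
    # peaks at c = 4 and g = 2**-8 and ignores the indices.
    return 1.0 - 0.1 * abs(math.log2(c) - 2) - 0.05 * abs(math.log2(g) + 8)


c_grid = [2.0 ** e for e in range(-2, 6)]
g_grid = [2.0 ** e for e in range(-10, 1)]
best_c, best_g, best_acc = grid_search(toy_score, 100, c_grid, g_grid)
```

Searching on a logarithmic grid is a common convention for c and g; the text does not state which ranges or step sizes were used.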

3 Experimental Results and Discussion

To verify the validity and accuracy of the approach, 108 young apple images were selected randomly as the training set, and another 108 images were selected randomly as the test set. All algorithms were programmed in Matlab 9.1 (MathWorks, Natick, Massachusetts, USA) and run on a PC with an Intel Core i5-4210M, 4 GB RAM, and the Windows 10 operating system. The flow chart of the proposed algorithm is shown in Fig. 2.

Fig. 2. Flow chart of the proposed algorithm

3.1 ROI Segmentation and Extracting

Figure 3 shows an example of two component images, two chromatic aberration maps, and their corresponding binary images. Figure 3(a) shows the resized original image. Figure 3(b), (c) and (d) show the G component image and its binary images after segmentation using the OTSU algorithm and the ITS algorithm, respectively. Figure 3(e), (f) and (g) show the B component image and its binary images after segmentation using the OTSU algorithm and the ITS algorithm, respectively. Figure 3(h), (i) and (j) show the general G-B image and its binary images after segmentation using the OTSU algorithm and the ITS algorithm, respectively. Figure 3(k), (l) and (m) show the adaptive G-B chromatic aberration map proposed in this study and its binary images after segmentation using the OTSU algorithm and the ITS algorithm, respectively. It can be seen that the adaptive chromatic aberration map reduced some over-saturated areas on the fruit surface, which prevents highlighted fruit pixels from being filtered out. In contrast, after the G component image, B component image, and general G-B image were segmented, the ROI was not separated from the background because a large amount of noise remained. Moreover, the ITS algorithm, relative to the OTSU algorithm, removed more background pixels while retaining more fruit pixels.

Fig. 3. Example of two component images, two chromatic aberration maps, and their corresponding binary images: (a) Original image, (b) G component map, (c) Binary image of G component using OTSU, (d) Binary image of G component using ITS, (e) B component map, (f) Binary image of B component using OTSU, (g) Binary image of B component using ITS, (h) General G-B image, (i) Binary image of general G-B using OTSU, (j) Binary image of general G-B using ITS, (k) Adaptive G-B image, (l) Binary image of adaptive G-B using OTSU, (m) Binary image of adaptive G-B using ITS.

3.2 Image Denoising

Most of the background pixels were eliminated by the initial segmentation described above. However, some noise remained, mainly because some stems and leaves under natural illumination had colors so similar to young apples that the fruits could no longer be segmented using color information. Hence, morphological processing and blob analysis were used to further separate fruits from background based on their difference in shape. Figure 4 shows an example of the denoising results for the segmentation output. Figure 4(a) and (b) show the results of the morphological opening operation and hole filling, respectively. Figure 4(c) and (d) show the results of blob analysis by eccentricity and by pixel area, respectively. It can be seen that the morphological operations removed tiny background noise and improved the integrity of the fruits in the image, but larger noise regions were not eliminated; these were removed by blob analysis.

Fig. 4. Example of denoising results for the image segmentation output: (a) Image after morphological opening operation, (b) Image after morphological hole filling, (c) Image after blob analysis with eccentricity, (d) Image after blob analysis with pixel area.

3.3 Detection of Potential Circular Fruit Objects and Its Discriminant Validation

A small amount of noise from stems and leaves whose eccentricity and pixel area resemble those of young apples still remained after the processing above. Therefore, because apple fruits resemble round objects, the improved circular Hough transformation (CHT) algorithm was used to detect potential circular apples. Figure 5 shows an example of potential circular apple detection. It can be seen that potential circular apples were detected well by the CHT algorithm, but a small number of leaves were fitted as circular objects, resulting in false positive errors. For higher detection accuracy, the HOG-SVM model, which combines the HOG feature descriptor and the GS-SVM classifier, was applied to judge whether the potential circular objects are apples.

Fig. 5. Example result of potential circular apple detection.

In total, 424 sample images were chosen randomly from the training set as the training samples of the SVM classifier; the positive sample set consisted of 214 young apple images and the negative sample set consisted of 210 images containing background such as stems and leaves. Every sample image was normalized to 48 × 48 pixels before the HOG feature was extracted, which yielded a 900-dimensional feature vector for every sample image. Optimizing the classification parameters of the SVM classifier gave an optimal penalty factor c of 5.278 and kernel parameter g of 0.0039. The SVM was trained with c, g and the HOG features of the training image set as input, generating the HOG-SVM classification model. When this model is in use, the HOG features of the circular objects detected by CHT are inspected by the trained SVM classifier to decide whether the objects are real apples. Figure 6 shows an example of the classification result using the HOG-SVM model. It can be seen that, in this example, the false fruit objects were completely removed by the proposed classification model.

Fig. 6. Classified result of an example using HOG-SVM classification model.

The test image set, which contains 108 images with 209 young apples, was used for further validation of the proposed method. The gray-level co-occurrence matrix (GLCM) and local binary patterns (LBP) were chosen as classification features for comparison with the proposed method. Table 1 shows the comparison of the detection performance of the three methods. It can be seen that the distinguishing abilities of the GLCM-SVM and LBP-SVM models were weaker, although they could distinguish most of the apples correctly. Serious false positive errors occurred when using the GLCM-SVM and LBP-SVM models, which led to a lower overall correct detection rate. The total correct detection rate of the proposed method reached 93.13%, the false negative rate was 11.49%, and the false positive rate was 4.84%. These results show that the proposed method had a higher correct detection rate for young green apples and a stronger ability to distinguish them from the background.

Table 1. Comparison of recognition performance of three methods

4 Conclusion

An approach to detecting young green apples in natural conditions was developed. To use shape, texture and color information, adaptive chromatic aberration, blob analysis, the CHT algorithm and the HOG-SVM classification model were implemented, and the various detections were merged for the final result. This preliminarily verified the capacity of machine detection for young green apples in natural environments. According to the results, the following conclusions can be drawn: (1) The OTSU algorithm and the ITS algorithm were each applied to the segmentation of the general G-B chromatic aberration map and the adaptive G-B chromatic aberration map. The results show that the OTSU algorithm was unable to separate fruits from the background. The proposed segmentation method, consisting of adaptive G-B chromatic aberration and the ITS algorithm, performs better for young green apples. There is higher intensity contrast between apples and background in the adaptive G-B chromatic aberration map, and the ITS algorithm has stronger segmentation ability than the OTSU algorithm under complex illumination. (2) Blob analysis and morphological operations can remove most of the noise in the segmented image. The improved CHT algorithm has good adaptability and robustness: it can effectively overcome the interference of incomplete fruit contours resulting from partial occlusion, and performs well in extracting and fitting potential circular fruits. (3) Within the framework of the SVM model, the histogram of oriented gradients feature used in the proposed method was compared with the GLCM and LBP features. The results indicate that the HOG-SVM model proposed in this study has better overall detection performance.

The method proposed in this study might help robots improve their ability to detect young green fruits, and it has good application prospects. However, the detection capacity for non-circular fruits and fruits with a large calyx remains to be improved. Further work will focus on the detection of these fruits in complex natural environments.