Camera Recognition and Laser Detection based on EKF-SLAM in the Autonomous Navigation of Humanoid Robot

The ability of autonomous navigation of the humanoid robot under unknown environment is very important to real-life applications. EKF-SLAM based on the camera recognition and laser detection for humanoid robot NAO is presented in this paper. Camera recognition is used to recognize if the object is a landmark. Because the computational resources needed for the feature-based position estimation are quite expensive, the laser instead of the camera provides the position of the landmark. A fractional order proportional-integral (PI) controller is designed to reduce the derivation of the NAO robot from the desired path during autonomous navigation. Experiments show that the proposed method is valid and reliable for autonomous navigation of the NAO robot under unknown environment.


introduction
With the progress of science and technology, service robots at home have been widely used for people.Widespread attention has been paid on the humanoid robot as one of the most active research direction of the robot research areas.The study focused on the fields of a complex environment to complete some human tasks, just to name a few, such as human assistance, home care and transportation [1].So it is necessary to study the robot to walk from one point to another independently, which is an essential component for basic action of robots.
Object recognition is a basic research in the field of machine vision, and it plays an important role in the realization of the target image understanding.In the process of moving object recognition and localization of mobile robot, the target position in the image, scale and viewpoint and the background images will change with the vision sensor of the robot.It increases the difficulty for the robot to identify and locate moving target [2].Target recognition based on color features, invariance of rotation, translation, scaling is suitable for the target with bright color.However, when the color of an object is similar to the target in the environment, it will lead to the failure of target recognition [3].So target / Published online: November 2017 recognition based on color features in this paper is used to judge the contour of the object.Thus the object recognition is more accurate and reliable [4].In this paper, the NAO robot will identify the target object with every step.
SLAM can estimate the structure of the surrounding world (the map), by using moving exteroceptive sensors, while it can simultaneously locate [5].SLAM is derived from the Bayesian framework, and the consistency issue is a fundamental problem.EKF-SLAM can produce inconsistency in the linearization process [6].EKF-SLAM uses the extended Kalman filter (EKF), which is developed from Kalman filter (KF).As the robot moves, this map keeps updating by the EKF.There are two critical steps, prediction (robot motion model) and correction (the robot sensor detects the landmarks that had been mapped before) [7].
In this paper, the robot using camera recognizes if the object which the robot sees is a landmark.Because the computational resources needed for the feature-based position estimation provided by the camera are quite expensive and the method cannot be run during the entire journey of the robot, a laser scanner is used to detect the position of the landmarks [8,9].The information of landmark does not need to be stored in the NAO robot before experiment.The method is introduced in section III(A).The parameters will be modified if the landmark changes.If the object is a landmark, the laser module is called to give the position of the robot.The proposed method in this paper is applied on a humanoid NAO robot [10].Camera is used to capture landmark for robot navigation based on EKF-SLAM.The laser sensor is used to improve the accuracy and simplify the computation of EKF-SLAM algorithm.The fractional order controller is also added to reduce the deviation when the NAO robot walks [11].In this paper, the NAO robot recognizes the predefined graphical markers through its camera, and obtains the landmark information.
In our early work in [12], the object recognition was completed by hand drawing contour through the third party API in choregraph.Then the NAO robot calls Vision Reco.module to complete the recognition.In this paper, the camera of NAO robot does not need to use the third party API to complete the object recognition.NAO robot can automatically obtain the image and recognize the object.Aldebaran Robotics company develops a graphical programming platform-Choregraphe for NAO robot, and also develops many functional models for this platform, such as audio, communication , data edit, motions, system, trackers, vision etc.In ref. [12], the authors used the vision model to realize the identification function of NAO robot.Because of strict requirement of object shape, low rate of identification, drawing the outline of the object in advance etc, this method is extremely limited.In this paper, the method we proposed could make the robot identify the object automatically, we just need to pre-establish the parameters of object in Python code.In our early work in [12], only laser was used to provide the position of the landmark.As long as the object appears in the detection range of the laser, the laser will give the position information of object no matter the object is landmark or not.This will result in over computation and wasting time.In this paper, the NAO robot provides the position information only when it recognizes the landmark.Thus the proposed method can reduce the computational time and make the robot walk autonomously.
The organization of this paper is as follows.Section 2 introduces the basic knowledge used in this paper, including the NAO robot, object recognition in the visual database of OpenCV, laser based EKF-SLAM algorithm and the fractional PI controller design.Section 3 describes the process of research.The results of simulation and experiment are showed in Section 4. Section 5 draws the conclusion.

NAO Humanoid Robot
NAO humanoid robot, as shown in Fig. 1, has a total of 25 degrees of freedom, flexible movement.It is also equipped with an inertial navigation unit (a gyroscope and an accelerometer).The use of some action engines and gait planning algorithms can make the NAO walk steadily, even realize some highly difficult movements [13].
NAO has two identical cameras, the angle between the top and bottom center of the camera is 40 degrees, the maximum available 30 frames / s, with a resolution of 640 × 480 images.The camera can be used to capture the far and near, and near (about 50cm) can only be used under the camera.Some parameters of the camera are shown in Table 1 [14].Although NAO has two cameras, only one camera is active at any instance where the settings of the parameters can be changed through the camera API.There is one laser scanner on the robot NAO head.The type of the laser scanner used in our experiment is URG-04LX.The function of the laser is provided in Table 2 [15].

EKF-SLAM
EKF-SLAM uses mobile probability distribution model and observation model to obtain the state vector and covariance matrix.The input and output of EKF-SLAM are shown in Fig. 2 [17].The state matrix X predicts the position of the robot and the state vector of the landmark.Its dimension is 3 + 2N where N denotes the number of signs [18][19][20].
The covariance P is a square matrix representing the uncertainty of system or deviation.Equation 2 is the covariance matrix, including many sub-variance matrices.
Where P RR is the robot variance, P MM is the signs variance, P RM is the associated variance of the robot and road signs, P MR is the transpose of P RM .EKF is a kind of minimum variance estimation method for nonlinear system equations.It can express system equations of linear system and estimate the state vector of nonlinear system.With a given initial state values, the covariance matrix and the observed value at time t, recursive the computation of state vector of t moments.

Fractional PI Controller
When the NAO moves, it may be interfered by sensor noise, environmental noise and other factors.It can lead to deviation of the target trajectory.In order to minimize the deviation, a fractional order PI controller is designed to compensate the deviation [12].In ref. [12], simulation results show that the fractional controller based on EKF-SLAM can reduce the position deviation during the NAO walking under the unknown indoor environment.In this paper, the controller design is cited in ref. [12].The key information is as follows.
This paper uses the fractional order P I (PI α ) as controller, whose transfer function is:  Al-Alaoui operator discretized, Eq. 3 can be written as: where, 4is changed through the three order PSE [21] expansion: where, So Eq. 5 can be expressed as Reshuffling the terms in Eq. 7, we have We define the control quantity in the time domain as u(t) based on Eq. 7, we can get u(t) as follows: The difference between the real position and estimation position is denoted as e(t) which is employed as the input of controller.u(t) is the output of controller which is employed to correct the walking process.Based on the results from a large number of experiments, when K p = 0.9, K i = 0.2, α = 0.1, T = 0.1, the best result is achieved, i.e., e(t) is minimized.

Object Recognition
To identify landmarks or obstacles, visual detection based on the camera of NAO robot is used.Firstly, semiautonomous learning is used in order to access to the target threshold in the detection process.It can improve the light adaptability of visual processing, and perform object segmentation by using global threshold method [22] in BGR color space.Secondly, the pixel is traversed to extract target feature points.Finally, the outline of the object is extracted, and the contour perimeter, area, and other parameters are calculated.Thus the measured object is determined if it is a target object.It is a valid method to determine the target object in our study.
We use color histogram and color moment to extract the color feature of the target object [23,24].Color histogram describes the global distribution of the color of the image.The color histogram calculation divides the color space into several short intervals.Each interval is a bin histogram.This process is called color quantization.Then, the color histogram is obtained by counting the number of pixels in each short interval.Color quantization uses vector quantization method.The calculation formula is as follows: where, k represents the characteristic value of image, L is the number of possible characteristic values, n k is the number of image pixels with k characteristic value, N is the total number of pixels.Different color space will generate different color histogram.The RGB space is the most common method.The color histogram could be obtained by statistics from R, G and B channels.It could be expressed by three one dimensional color histograms or one three dimensional color histogram.
Any distribution of color could be expressed by its moments.Because the distribution information of color converges on the low order moments, mean (μ i ), variance (σ i ) and skewness (s i ) of color are enough to express the distribution of image color.They are expressed as the following equations.
where p ij is the i th color component of j th pixel in the image.
The color moments of the image need three color components and three low order moments of each color component.Compared with other expression of color distribution, the color moment is more succinct.In fact, the color moments is always combined with other image characters.This can avoid the disadvantages that the ability of judgment of low order moments is weak.In general, the color moments is firstly used to narrow down before using other image characters.
Images in Fig. 3 are obtained by using the NAO robot's camera and converted to numpy format.3) is extracted conveniently.The result of image segmentation is shown in Fig. 4.
Then image in Fig. 4 is converted into HSV format by using cv2.cvtColor.HSV (Hue, Saturation, Value) based on a color space creates the visual characteristics of color, which is known as hexagonal pyramid model.The model parameters are the color: hue (H), saturation (S), value (V).
Here RGB format will be converted into HSV format, which is designed to reduce the optical effect of light on target recognition.The separation channel through the cv2.split method, the hue, saturation and brightness of the image information are stored in the H, S, V, but the standard H range should be 0-360.H is in the range of standard half 0-180 in the OpenCV HSV format.We set the threshold of saturation, hue, brightness three properties respectively, then binary image.The color of the recognized object in this paper is green.The green color values are in the range of 35-70, which satisfy the original pixels hue threshold value.The threshold value is set to 255, while the other position is set to 0. The above work is the pretreatment of the main Fig. 4 The results of image segmentation Fig. 5 The results of recognition of objects working objects recognition.According to the saturation and brightness control to distinguish the background and the object, other colors background interference are excluded, and finally the image median filtering can reduce noise in the image point.
Due to environments may also contain other similar colored objects to the target object, target object shape information is regarded as the primary way to be recognized.Firstly, the preprocessed binarized image can find all of the contour in the figure by using the cv2.findContours function.According to the contour of the area and perimeter, the ratio of area and perimeter square can exclude part of the background profile.Secondly, the contour of the moment is calculated by cv2.moments function.'nu20' and 'nu02' which are the normalized values in the x and y contour direction of second-order central moments are used to determine the contour of target object.Because the contour of the object is approximately rectangular and set upright, 'nu02' is greater than 'nu20'.Finally, the center position of the object in the image is calculated, which is shown in eq.( 11) and eq.( 12).
where c ['m10'] is the first moment for the contour in x direction, c ['m01'] is the first moment for the contour in y direction, c ['m00'] is the area of contour.By using this method to recognize the object, the conditions are more stringent, then judgment is more accurate.Figure 5 shows the result of recognition of objects.

Location of Landmarks
NAO humanoid robot can get the location of objects by its own sensors, such as laser, sonar or camera.Sonar can only measure the distance between the robot and the object.
Positioning error provided by camera is large.However, laser can provide accurate position of target object.The detection area of laser and its coordinate axis are shown in Fig. 6.There is error when NAO robot walks.The error will increase with accumulation.The robot walks five steps which will produce negligible error.So the robot NAO in the experiment stops in every five steps to retrieve the position of the target object, and the new location information will be send back to the robot.When the object is recognized as target object by the camera, the position information of the target object is given by laser.The laser information includes x-y coordinates, the distance and angle values of the target object.K-means clustering method [25] is used to extract useful readings from laser and exclude invalid value, such as zero.
The data is stored in a two dimensional 684 × 4 array.The data includes distance between laser and objects, angle, x and y in Cartesian space.The steps are shown as follows [26].
Step 1: Assign each observation to the cluster whose mean is closest to it.
where x p is a set of observations {x 1 • • • x n } and assigned to exactly one S i , m i is an initial set of k means Fig. 6 The characteristics of laser Fig. 7 The position information of NAO robot and landmarks Step 2: Update the new means as the centroids of the observations in the new clusters.
The algorithm stops when all m i do not change any more.

experiment Results
The work in [12,27] used laser to detect the location information of landmarks.But laser cannot judge whether the object is the target object (that is landmark) in its detection range.The laser data is from all the objects in its detection range.This is the limitation in the work [12].In this paper, camera and laser are combined to be used in SLAM.
Camera is used to recognize if the object is a landmark.If the object is a landmark, then laser data is called.If it is not a landmark, NAO robot will not call laser to detect the location information of landmark.This method saves the computational resource of NAO robot and time.Fractional PI controller is used in this paper to decrease the error in the walking of NAO robot.The work in [27] used PI controller to decrease the error.However, PI controller is not better than Fractional PI controller [12].

The Scene of Two Landmarks
We consider two kind of experiment scenes.The first scene is that two landmarks which need to be recognized.The second scene is about one landmark and one non-landmark.The position information of NAO robot and landmarks is shown in Fig. 7.The distance between the NAO robot and the landmark on its left side is about 0.8m.The distance between another landmark from the NAO robot is about 1.6m.We set the safety distance between NAO robot and landmark to be 0.3m.The experiment step is shown in Fig. 8.The NAO robot recognizes the target object by using camera before walking.When it finds two landmarks, then the NAO robot calls the laser data of both landmarks.If the transversal distance between the two landmarks is greater than 0.6m, the robot can walk straight in the middle of both landmarks.NAO robot will stop walking if the distance between NAO robot and landmark is less than 0.3m.The experimental scene is shown in Fig. 9.
The NAO robot recognizes the landmarks and plans the path by the recognized landmarks.Figure 10 shows the recognition process when the NAO robot walks.Figure 10a is the recognition result before the NAO robot begins to walk.The left figure in Fig. 10a is the processing results of the hue (H), saturation (S), brightness (V) treatment of objects which the NAO robot views.The right figure in Fig. 10a is the contour of extracted landmarks.Figure 10b and c are the second and third recognition results respectively.These results show that the proposed recognition method in this paper is feasible and valid when NAO robot walks.
The unit of data obtained by laser is millimeter.In order to make the calculation more easily, we transform the unit into meter.Laser will create values and store them in the array by scanning the object.Firstly, NAO robot transposes this array and filters the invalid values.Then the obtained data is divided into several parts by clustering method according to the amount of objects.Lastly, the mean value of distance and angel data detected by laser are calculated and transformed into coordinate values.Table 4 shows a part of data obtained by laser.

The Scene of One Landmark and One non-Landmark
In order to verify that the recognition process is only valid for the specified landmark, we do an experiment that the position information of NAO robot and landmark and nonlandmark are shown in Fig. 12.The experiment step is shown in Fig. 13.The green object is the recognized landmark.The Figure 15 shows the recognition processing of the NAO robot under one landmark and one non-landmark in the experiment.We can see that the NAO robot can identify the landmark accurately.Figure 15 shows the process of recognition.Figure 15abc are the first, second and third recognition results.
When one landmark and one non-landmark coexist in the same environment, the NAO robot can call position information of the landmark by laser when it meets and recognizes the landmark.That means the NAO robot cannot call position information of the landmark by laser when it sees other non-landmark.Because laser can detect all the position information of the object, we need only extract the position information of landmark by clustering method.Then the NAO robot can walk toward the landmarks.The part of data detected by laser in our experiment is shown in Table 5.
Figure 16 shows the walking trajectory of NAO robot under the experiment environment of one landmark and one non-landmark.We can see that the trajectory of the NAO robot is good.It shows the proposed method in this paper is valid.
In [12,26], laser is used to detect the landmarks location based on the EKF-SLAM algorithm.But the NAO robot is unable to judge if the object is a landmark in the range of laser detection.So the NAO robot needs to regard all objects in the range of laser detection as landmarks.The limitation is to reduce the range of laser detection in order to prevent non-landmarks from emerging in the range of laser detection.This is not appropriate for the real environment.In this paper, camera object recognition is used to judge if the object is a landmark before laser detects the location of the landmark.If the object is a landmark, then the NAO robot calls laser to provide the location information of landmark.Otherwise, the NAO robot will not call laser.This method improves the real-time of the EKF-SLAM algorithm.
Figure 11 shows the NAO robot walking trajectory under two landmarks in the experiment.In Figs.11 and

Fig. 16
The NAO robot walking process under one landmark and one non-landmark that the walking error of NAO robot reduced greatly by using fractional PI controller.Figure 15 demonstrates that the NAO robot can judge the landmark and non-landmark correctly.Figure 16 shows that laser can also detect the location information of non-landmark.The NAO robot running time is listed in Table 6.In this paper, the running time means the interval time including from sending instructions by computer to receiving instructions, executing instructions and returning the results to the computer by the NAO robot.

Conclusion and Future Work
In this paper, EKF-SLAM based on the camera recognition and laser detection was developed.EKF-SLAM algorithm is used under indoor environment simultaneous localization and map building.Fractional controller is used to reduce the error during the walking process of the NAO robot.Camera and laser are combined to recognize the landmarks and provide the position information of the landmarks.The experiment results show that the proposed method is feasible, and can finish walking of the NAO robot autonomously under indoor environment.This paper only used two landmarks.We could study more complex situation, such as more landmarks.Better recognition method will be used to improve the result of object recognition.In the real environment, the limitation of this paper is that we do not improve the EKF-SLAM algorithm.Future work is to improve the EKF-SLAM algorithm.

Fig. 2
Fig. 2 The observed model and mobile model in EKF-SLAM connection

Fig. 3
Fig.3The picture taken by the camera of NAO robot Though computer vision library based on image processing -OpenCV, three channel images are B, G, R instead of the usual R, G, B. So it is necessary to make conversion.The split function in the OpenCV library can separate image channels.The channel values are stored in R, G, B format, then they use merge function to merge channels.Thus the converted image is obtained.In the first image preprocessing, image segmentation using cv2.pyrMeanShiftFiltering(The function as show in Table

Fig. 11
Fig.11The NAO robot walking process under two landmarks

Fig. 12 13
Fig.12 The position information of NAO robot and landmark and non-landmark Fig. 14 The experiment scene of landmark and non-landmark

Fig. 15
Fig.15 The recognition process of NAO robot in walking (landmark and non-landmark)

Table 2
Main characteristics of the NAO URG-04LX laser

Table 3
Function of OpenCV library

Table 4
Partial data of two landmarks detected by laser

Table 6
Two scene Running timeThe scene of one landmark The scene of two and one non-landmark landmarks