1 Introduction

Polishing the surfaces and edges of metal products is an indispensable step in manufacturing. With the development of the manufacturing industry and rising labor costs, traditional manual polishing can no longer meet the current demands of intelligent manufacturing upgrades. Consequently, using industrial robots to assist or even replace manual grinding has become a mainstream development trend.

Grinding has gradually become a common application area for industrial robots, and many manufacturers have launched relatively mature products for the market. However, existing products are still far from fully automated and flexible: the specifications of the grinding objects are fixed, the objects must be clamped in fixed positions, and grinding programs must be frequently switched or even rewritten for different objects. It is therefore necessary to equip industrial robots with “eyes” that, like humans, perceive the location and other information of external grinding objects, and to realize the autonomous planning and automatic generation of grinding trajectories accordingly. This improves the operational flexibility of grinding robots and their adaptability to different grinding objects, and lays a good foundation for the automation and intelligent transformation of grinding processes. This paper focuses on the key technologies of the vision-guided grinding robot and their application, which is an effective implementation of this idea. Vision-guided grinding robots mainly involve two research areas: machine vision inspection and grinding robots.

The essence of machine vision is to correlate and convert two-dimensional image information and three-dimensional spatial position information by collecting images of target objects and combining them with the calibration results of the vision system. At present, machine vision inspection technology is widely used in education, medicine, agriculture, machinery manufacturing, cell phone screen inspection, and many other industries, and many valuable results have emerged. For example, Singh et al. [1] proposed a framework for automatic surface defect detection based on machine vision and convolutional neural networks, which solves the problem of detecting common surface defects in the centerless grinding of tapered rollers and helps traditional labor-intensive industries improve the automation of their inspection processes. Yang et al. [2] proposed a surface defect detection method for steel rails based on machine vision and the YOLO v2 deep learning model, which can accurately detect and locate defects with an average accuracy of 97.11% and shows good robustness. Ding et al. [3] developed a laser-based machine vision measurement system that can measure the 3D contours of deformed surfaces and perform deformation displacement analysis on the measured contour data. Liu [4] proposed a physical education teaching evaluation method based on artificial intelligence and machine vision to address the inefficiency and error-proneness of human evaluation; it uses artificial intelligence algorithms for data analysis and machine vision to recognize the teaching process, providing good assistance and support for physical education teachers. Ansari et al. [5] proposed a visual inspection method for rice seed variety purity that uses machine vision combined with multivariate analysis of color, morphology, and texture features, paving the way for an automated rice seed purity inspection system and even an automated germination rate monitoring system. Jian et al. [6] proposed an improved algorithm for defect identification and segmentation in cell phone screen images affected by vibration-induced misalignment, and further developed an automated detection system that effectively detects various types of screen defects. Zhong et al. [7] proposed a machine vision-based structured light 3D measurement method that uses machine learning for the measurement, reducing the complexity of the operation and the computation time and making real-time measurement possible.

In contrast, the essence of a grinding robot is to attach a floating spindle and an abrasive tool to the end of an industrial robot and to rely on the drive of each robot joint to move the abrasive along a preset trajectory over the workpiece surface, thereby completing the grinding operation. As an important field of industrial robot application, grinding has been thoroughly studied by many scholars. For example, Jeon et al. [8] developed an automatic grinding robot system for machining the oil grooves of engine cylinder liners, which can adjust its position by itself according to various types of oil grooves. Ge [9] established a robotic weld in-line grinding system incorporating laser vision sensors for off-line grinding of weld seams with uneven surface quality. Wan et al. [10] combined machine vision technology with industrial robots to design a robotic system covering loading to grinding, which solved the problems of difficult workpiece positioning and complicated trajectory teaching and realized automated operation from workpiece loading through grinding to discharging. Guo et al. [11] proposed a robot grinding motion planning method, including robot motion planning and weld grinding, and verified its effectiveness through grinding experiments on pipe fitting welds. Xu et al. [12] designed and fabricated a novel prototype of a wheeled pipe polishing and grinding robot consisting of a moving structure, a positioning structure, and a polishing structure, with the advantages of compact structure, adaptability, and high grinding efficiency. Wang et al. [13] proposed an improved whale optimization algorithm and applied it to grinding trajectory optimization, which helps reduce impact effects during grinding and improves grinding smoothness.

The combination of machine vision and industrial robots can be likened to the collaboration between human eyes and limbs, achieving a deep integration of external perception and action execution. This combination gives robots more intelligent characteristics, enabling them to better replace humans with machines. For example, material sorting robots [14], firefighting robots [15], and agricultural harvesting robots [16] that apply machine vision have shown good performance in their respective applications. At the same time, existing research on vision systems for grinding robots provides a good foundation for this paper. Diao et al. [17] proposed a 3D vision system that can be easily integrated into an intelligent grinding system and is suitable for industrial sites. Zhao et al. [18] proposed a vision-based grinding strategy for a mobile robot, which was proven to support the reconstruction of the workpiece surface and the generation of the grinding path from measured point clouds. Wan et al. [19] set up a grinding workstation incorporating machine vision; their case study showed that the workstation could determine the object position and generate the robotic grinding trajectory according to the shape of the burrs on the object surface. The works cited above prove, to a certain extent, that the integrated application of machine vision technology and grinding robots is technically feasible and potentially advantageous.

Based on the above analysis, this paper conducts an in-depth study of a vision-guided grinding robot prototype system and its application to a flexible titanium surface edge grinding requirement proposed by a titanium manufacturer. First, this study analyzes the shortcomings of the existing robotic grinding process and improves it by introducing machine vision technology. Subsequently, machine vision and image processing algorithms are used to achieve high-quality recognition and high-precision positioning of metal surface edges. Then, the D–H parameter model of the grinding robot is established, and the grinding trajectory is planned and simulated using the position information of the identified grinding edges. Finally, the simulation-validated grinding trajectory is loaded into the grinding robot, and the effectiveness of the proposed scheme is verified by actual grinding experiments.

2 Analysis of the existing robot grinding process and its improvement

Nowadays, industrial robots are used in many grinding operations, as shown in Fig. 1, and their use has improved the efficiency of grinding to a certain extent. In general, however, the position and attitude of the grinding head at the robot end are still controlled mainly through the robot FlexPendant, and the ability to automatically plan the grinding process for a specific object is still lacking.

Fig. 1
figure 1

Example of robot grinding process

At present, the process flow of most grinding robots is not flexible enough, as shown in Fig. 2. For a grinding object of a specific specification, the object must first be fixed in the same position; a corresponding program is then written according to the requirements of the grinding operation, and this program generates the grinding trajectory that drives the robot to complete the operation. The area of the grinding teeth on the grinding head is usually large enough that the operation can be completed even if the plate position deviates slightly during fixing. However, the limitations of this process are significant, as different grinding targets require different grinding programs to be established in advance. If a wide range of metal surface edges needs to be ground in practice, the operator must frequently switch or even create new grinding programs, which greatly inconveniences the control and operation of the grinding robot. In addition, although industrial robots are used in grinding operations, the large proportion of time spent on manual teaching to generate grinding programs and the relatively uniform specification of grinding objects still limit the application scenarios, which is not conducive to the promotion and popularization of robotic grinding systems.

Fig. 2
figure 2

Process flow of existing robotic grinding

In this study, a vision system is added to the existing grinding robot, and the position coordinates of the edge to be ground on the metal surface are obtained through the vision system. On this basis, combined with the inverse kinematics analysis of the robot, the planning and simulation of the robot grinding trajectory are realized. The introduction of the machine vision system allows the grinding robot to automatically detect and locate the spatial position of the grinding object, so the specifications of the grinding object need not be kept constant, nor does an object of a given specification need to be fixed in the same position. This makes the grinding robot considerably more flexible and adaptable. The improved robot grinding process is shown in Fig. 3.

Fig. 3
figure 3

Process flow of vision-guided robotic grinding

Taking the grinding of metal surface edges as an example, the improved robot grinding process differs from the existing one mainly in two aspects. First, the improved grinding robot uses machine vision to identify and locate the grinding edge, detecting and extracting its spatial position information. Second, the robot plans and simulates the grinding trajectory based on the edge position information obtained from vision detection. These two aspects also constitute the core content of this paper.

3 Recognition and localization of grinding edges based on machine vision

In a vision-guided metal edge grinding robot system, the main function of the vision part is to identify and locate the grinding edge. This is achieved by first calibrating the vision camera and the robot, establishing the conversion from the vision image to the spatial position of the grinding object and then to the spatial position in the grinding robot's coordinate system. On this basis, image acquisition, pre-processing, and edge extraction are performed to obtain high-quality edge contour information of the grinding object. Finally, using the system calibration results, the extracted edge information is transformed into spatial position information that the industrial robot can use.

3.1 Vision camera calibration

The imaging of an industrial camera is determined by its imaging geometry, which essentially uses the pinhole imaging principle to map a point in the scene onto the image plane through the camera lens. The imaging process of the camera is therefore also called a projective geometric transformation.

Suppose the true position of any point P in space is represented in the world coordinate system by the coordinates pw (Xw, Yw, Zw); its position in the camera coordinate system is generally not the same as pw. This is because the origin of the camera coordinate system is the camera's optical center, the optical axis is by default the Z-axis, and the projection direction is the positive Z-axis. When using the vision system for spatial localization, the first problem to solve is converting the position of the photographed object from the world coordinate system to the camera coordinate system. If the coordinates of point P in the camera coordinate system are defined as p (X, Y, Z), then p and pw satisfy the conversion relationship shown in Eq. (1).

$$\left[ {\begin{array}{*{20}c} X \\ Y \\ Z \\ 1 \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {\mathbf{R}} & {\mathbf{T}} \\ 0 & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {X_{w} } \\ {Y_{w} } \\ {Z_{w} } \\ 1 \\ \end{array} } \right] = {\mathbf{M}}_{{\mathbf{1}}} \left[ {\begin{array}{*{20}c} {X_{w} } \\ {Y_{w} } \\ {Z_{w} } \\ 1 \\ \end{array} } \right]$$
(1)

where R is a 3 × 3 rotation matrix, T is a 3 × 1 translation vector, and M1, composed of the two, is the external parameter matrix of the camera, representing the coordinate transformation between the world coordinate system and the camera coordinate system.

A two-dimensional planar image is obtained when the camera captures the scene, and point P appears as a two-dimensional point in the image. Generally, the pixel coordinate system takes the upper left corner of the image as its origin and the two mutually perpendicular edges starting from that corner as its coordinate axes. If the pixel coordinates of P are p0 (u, v), then p0 and p satisfy the transformation shown in Eq. (2).

$$Z\left[ {\begin{array}{*{20}c} u \\ v \\ 1 \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} \frac{1}{dx} & 0 & {u_{0} } \\ 0 & \frac{1}{dy} & {v_{0} } \\ 0 & 0 & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} f & 0 & 0 \\ 0 & f & 0 \\ 0 & 0 & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} X \\ Y \\ Z \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} \frac{f}{dx} & 0 & {u_{0} } \\ 0 & \frac{f}{dy} & {v_{0} } \\ 0 & 0 & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} X \\ Y \\ Z \\ \end{array} } \right] = {\mathbf{M}}_{{\mathbf{2}}} \left[ {\begin{array}{*{20}c} X \\ Y \\ Z \\ \end{array} } \right]$$
(2)

where f is the camera focal length; dx and dy denote the physical size of a single pixel along the two image coordinate axes, respectively; and (u0, v0) are the pixel coordinates of the intersection of the camera optical axis with the image plane. M2 is the internal parameter matrix of the camera, representing the conversion between camera coordinates and pixel coordinates.

By combining Eq. (1) with Eq. (2), the conversion relationship between p0 and pw is obtained as follows:

$$Z\left[ {\begin{array}{*{20}c} u \\ v \\ 1 \\ \end{array} } \right] = {\mathbf{M}}_{{\mathbf{2}}} {\mathbf{M}}_{{\mathbf{1}}} \left[ {\begin{array}{*{20}c} {X_{w} } \\ {Y_{w} } \\ {Z_{w} } \\ 1 \\ \end{array} } \right]$$
(3)

The purpose of vision camera calibration is to obtain the conversion matrices from the world coordinate system to the pixel coordinate system, i.e., the internal and external parameters of the camera. The most widely used method is the checkerboard calibration method proposed by Zhang [20], which solves for the internal and external parameters using images of the calibration target in different orientations, as shown in Fig. 4. The method defines the plane of the checkerboard as the Xw–Yw plane of the world coordinate system, takes a specific corner point extracted from the checkerboard as the origin Ow, and takes the two checkerboard edges passing through Ow as the Xw and Yw axes. Since the size of the checkerboard squares is known, the world coordinates of all corner points can be obtained, and their Zw values are identically 0. The pixel coordinates of all corner points can then be easily extracted using a corner detection algorithm.

Fig. 4
figure 4

Checkerboard calibration target image

After the pixel coordinates of the corner points and their corresponding world coordinates are obtained, a mathematical model relating the two can be built using Eq. (3). From Eq. (2), the camera intrinsic parameters represent inherent properties of the camera and are independent of the actual placement of the calibration target. Multiple sets of correspondences can therefore be established by capturing images of the calibration target in different orientations, and a suitable M2 can be solved by nonlinear least squares; on this basis, M1 can be further obtained in the same way. It is worth noting that the external parameters represent the spatial transformation between the world coordinate system and the camera coordinate system, so once they are solved, the camera position must not be changed; otherwise M1 must be re-calibrated.

Since Zhang's calibration method is very mature and widely used, this paper does not describe it in detail.
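For readers implementing this step, the short sketch below shows how Zhang-style calibration is typically carried out with OpenCV. The board geometry (9 × 6 inner corners, 10 mm squares) and the image folder `calib/` are illustrative assumptions, not the exact setup of this paper.

```python
import glob
import cv2
import numpy as np

# Assumed board geometry: 9 x 6 inner corners, 10 mm squares (illustrative).
pattern_size = (9, 6)
square_size = 10.0  # mm

# World coordinates of the corners; the board plane is Zw = 0 by construction.
objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
objp *= square_size

obj_points, img_points = [], []
for path in sorted(glob.glob("calib/*.png")):  # hypothetical image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# Nonlinear least-squares estimation of the intrinsics M2, the lens
# distortion, and the per-view extrinsics (R as a rotation vector, T).
rms, M2, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("reprojection RMS:", rms, "\nM2:\n", M2)
```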

3.2 Hand-eye calibration of robot

The vision camera calibration enables the conversion from pixel coordinates to world coordinates. However, the grinding operation is driven by the robot, so the edge position information of the metal surface to be ground must also be converted into the robot coordinate system to associate it with the robot's motion system.

In practice, the camera–robot configuration is of two types, “eye-in-hand” and “eye-to-hand”. In the former, the camera is mounted directly on the robot; in the latter, it is mounted at a fixed position outside the robot. Both have advantages and disadvantages. Considering the stability and safety of the robot grinding operation, this paper adopts the eye-to-hand configuration.

The heights of the grinding table and the grinding object are easy to measure, and the imaging plane is placed parallel to the metal surface to be ground. Accordingly, this paper uses a monocular camera acting on a 2D plane and obtains the conversion between image coordinates and robot end coordinates by an affine transformation, as shown in Eq. (4).

$$\left[ {\begin{array}{*{20}c} {x^{\prime}} \\ {y^{\prime}} \\ \end{array} } \right] = {\mathbf{R^{\prime}}}\left[ {\begin{array}{*{20}c} x \\ y \\ \end{array} } \right] + {\mathbf{C}}$$
(4)

where p′ (x′, y′) represents the coordinates of point P in the robot coordinate system with the height information ignored; p1 (x, y) represents the physical-scale image coordinates of P on the two-dimensional image, which satisfy the conversion relation with the pixel coordinates shown in Eq. (5); R′ is a 2 × 2 rotation matrix; and C is a 2 × 1 translation vector.

$$\left\{ {\begin{array}{*{20}c} {x = udx - u_{0} dx} \\ {y = vdy - v_{0} dy} \\ \end{array} } \right.$$
(5)

It can be seen from Eq. (4) that at least three pairs of corresponding points are needed to calculate R′ and C and thus obtain the conversion between the image coordinate system and the robot coordinate system, achieving the purpose of hand-eye calibration. To perform the calibration, the robot end is moved to at least three calibration points; the coordinates of these points in the robot base coordinate system are read from the FlexPendant and combined with the corresponding image coordinates for the calculation. To reduce errors introduced by the robot and other sources, it is common practice to collect nine points: the robot end is moved sequentially to the nine positions on the calibration target, and a more accurate hand-eye calibration result is obtained by least-squares averaging over all correspondences. The nine-point calibration target commonly used for robot hand-eye calibration is shown in Fig. 5.

Fig. 5
figure 5

Nine-point calibration target
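As an illustration of how Eqs. (4)–(5) can be solved from the nine correspondences, the sketch below stacks the points into a linear system and fits R′ and C by least squares. `solve_hand_eye_2d` is a hypothetical helper, and its image-coordinate input is assumed to have already been converted from pixels via Eq. (5).

```python
import numpy as np

def solve_hand_eye_2d(img_pts, robot_pts):
    """Least-squares fit of Eq. (4): robot = R' @ img + C.

    img_pts, robot_pts: (N, 2) arrays of corresponding points, N >= 3.
    Returns the 2 x 2 matrix R' and the 2 x 1 vector C.
    """
    n = img_pts.shape[0]
    # Unknowns: [r11, r12, r21, r22, cx, cy]; two equations per point pair.
    A = np.zeros((2 * n, 6))
    b = robot_pts.reshape(-1)          # [x'1, y'1, x'2, y'2, ...]
    A[0::2, 0:2] = img_pts             # r11*x + r12*y + cx = x'
    A[0::2, 4] = 1.0
    A[1::2, 2:4] = img_pts             # r21*x + r22*y + cy = y'
    A[1::2, 5] = 1.0
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params[:4].reshape(2, 2), params[4:].reshape(2, 1)
```

With nine points instead of the minimum three, the redundant equations average out measurement noise from the robot and the image detection.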

3.3 Recognition and localization of grinding edges

Once the vision system and robot are calibrated, the grinding object can be fixed on the grinding table for image capture. There are no special requirements on the fixing position, other than that the edges of the metal surface to be ground lie within the working space of the robot and the field of view of the camera.

Considering that the acquired image may contain noise that affects the subsequent edge recognition and positioning accuracy, the image is first filtered to further improve its quality. Three commonly used image denoising methods are mean filtering, median filtering, and Gaussian filtering. Experimental comparison shows that Gaussian filtering [21] makes the edges of objects in the image smoother and blurs the metal surface without significantly disturbing the edge information. This best matches the need for refined metal surface edges in this paper, so Gaussian filtering is used to pre-process the captured images. The principle of Gaussian filtering is shown in Eq. (6).

$$I_{\sigma } = I * G_{\sigma }$$
(6)

where Iσ is the output image, I the input image, * the convolution operation, and Gσ a two-dimensional Gaussian kernel with standard deviation σ. The standard deviation σ determines the degree of smoothing and often needs to be chosen by several trials according to the specific situation.
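In an OpenCV implementation, Eq. (6) reduces to a single call; the file name, kernel size, and σ below are illustrative trial values rather than the settings used in the experiments.

```python
import cv2

img = cv2.imread("grind_object.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
# I_sigma = I * G_sigma (Eq. (6)); kernel size and sigma chosen by trial.
blurred = cv2.GaussianBlur(img, (5, 5), sigmaX=1.5)
```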

To enhance the distinction between the grinding object and the background in the captured image, and thereby create favorable conditions for the subsequent edge extraction, the filtered image is further binarized and processed with a morphological opening operation.

Binarization presets a threshold value: a pixel is set to 255 when its grayscale value is above the threshold and to 0 when it is below, making the image distinctly black and white. The key to binarization is determining a reasonable threshold. A fixed threshold requires repeated trial selection and image processing experiments, which is inefficient, and the resulting threshold is chosen somewhat blindly. In this paper, Otsu's method [22] is used instead: it counts the frequency of each pixel value and then iterates through all possible thresholds (256 in total) to obtain the best segmentation. The optimal threshold is selected according to Eq. (7).

$$\min S = I_{A} S_{A} + I_{B} S_{B}$$
(7)

where IA is the proportion of pixels whose values are greater than the current threshold; IB is the proportion of pixels whose values are less than the current threshold; SA is the variance of the pixel values greater than the current threshold; SB is the variance of the pixel values less than the current threshold; and S is the weighted sum of the variances of the two classes under the current threshold. The threshold that minimizes S is the optimal threshold for image binarization.

After binarization, a small number of noise points may remain in the image. The morphological opening operation can then be used for further processing. Opening is a filter based on geometric operations: it processes the image by erosion followed by dilation, thereby removing isolated noise, as shown in Fig. 6.

Fig. 6
figure 6

Open operation principle
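Continuing the pre-processing sketch above, both steps are available directly in OpenCV; the 3 × 3 structuring element is an assumed choice.

```python
import cv2
import numpy as np

# Otsu's method scans all 256 candidate thresholds and keeps the one that
# minimizes the weighted intra-class variance S of Eq. (7).
_, binary = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Opening = erosion followed by dilation (Fig. 6); removes isolated specks.
kernel = np.ones((3, 3), np.uint8)
opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
```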

After the opening operation, the image can be used for edge recognition. After comparing current edge detection operators, the Canny operator [23] is selected in this paper to extract the edge contour of the metal surface to be ground. The basic idea of the Canny operator is to compute the gradient magnitude of each pixel from its derivatives along the x- and y-axes of the image coordinate system, keep the local maxima of the gradient magnitude as candidate edges, and finally set the pixel values at non-edge positions to 0, thereby extracting the edge contour.
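A corresponding sketch for the edge extraction step; the hysteresis thresholds 50 and 150 are common illustrative values that would need tuning to the actual lighting and material.

```python
import cv2

# Gradient computation, non-maximum suppression, and hysteresis linking
# are all performed inside cv2.Canny.
edges = cv2.Canny(opened, 50, 150)

# Ordered contour points for the subsequent localization step.
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
```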

After the contour of the grinding edge is identified, its pixel coordinates are known and can be substituted into the camera and hand-eye calibration results. The spatial position of the grinding edge in the robot base coordinate system is thus obtained, realizing the positioning detection of the grinding edge.
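Putting the two calibration results together, the final conversion might look like the hedged sketch below; `pixels_to_robot` is a hypothetical helper, with dx, dy, u0, v0 taken from the intrinsic calibration and R′, C from the nine-point calibration.

```python
import numpy as np

def pixels_to_robot(uv, dx, dy, u0, v0, R_prime, C):
    """Map edge pixels (u, v) to robot base XY via Eq. (5) then Eq. (4).

    uv: (N, 2) pixel coordinates of the extracted contour;
    R_prime: 2 x 2 matrix and C: 2 x 1 vector from hand-eye calibration.
    """
    xy = np.empty_like(uv, dtype=float)
    xy[:, 0] = (uv[:, 0] - u0) * dx    # Eq. (5)
    xy[:, 1] = (uv[:, 1] - v0) * dy
    return (R_prime @ xy.T + C).T      # Eq. (4), one point per row
```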

4 Kinematics modeling and its application to grinding trajectory planning

After the position of the grinding edge in the robot base coordinate system is obtained, the grinding trajectory can be planned. Robot grinding trajectory planning is based on the inverse solution of robot kinematics [24, 25]: the robot end poses are known, and the individual joint angles are solved in reverse. The continuous trajectory is then computed by polynomial interpolation to obtain the joint angle profiles, which provide the basis for finally realizing the grinding trajectory driven by the joint angle variations.

4.1 Robot D–H model and inverse kinematics analysis

The D–H model [26, 27] is established by the D–H parameter method and describes the transformations between the joint coordinate systems of the manipulator. The D–H parameter convention is a common method for establishing a robot model. Taking a six-axis industrial robot as an example, its end effector has three translational and three rotational degrees of freedom, six in total. The principle of the D–H parameter method is to form the 4 × 4 homogeneous transformation matrix Ai between two adjacent axes, establish the position and attitude relationship between the joints, and obtain the position and attitude of the robot end in the robot base coordinate system. The definition of Ai is shown in Eq. (8):

$$A_{i} = \left[ {\begin{array}{*{20}c} {\cos \theta_{i} } & { - \sin \theta_{i} \cos \alpha_{i} } & {\sin \theta_{i} \sin \alpha_{i} } & {a_{i} \cos \theta_{i} } \\ {\sin \theta_{i} } & {\cos \theta_{i} \cos \alpha_{i} } & { - \cos \theta_{i} \sin \alpha_{i} } & {a_{i} \sin \theta_{i} } \\ 0 & {\sin \alpha_{i} } & {\cos \alpha_{i} } & {d_{i} } \\ 0 & 0 & 0 & 1 \\ \end{array} } \right]$$
(8)

where i is the joint index, i = 1–6 for a six-axis industrial robot; di, ai, αi, and θi are the D–H parameters, representing the link offset, link length, link twist, and joint angle of joint i, respectively.

By multiplying the six homogeneous transformation matrices together, the pose matrix \(A_{0}^{6}\) from the robot base coordinate system to the robot end coordinate system is obtained, as expressed by Eq. (9).

$$A_{0}^{6} = A_{1} A_{2} A_{3} A_{4} A_{5} A_{6} = \left[ {\begin{array}{*{20}c} {n_{x} } & {o_{x} } & {a_{x} } & {p_{x} } \\ {n_{y} } & {o_{y} } & {a_{y} } & {p_{y} } \\ {n_{z} } & {o_{z} } & {a_{z} } & {p_{z} } \\ 0 & 0 & 0 & 1 \\ \end{array} } \right]$$
(9)
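The two definitions translate directly into code. The sketch below implements Eq. (8) and composes Eq. (9); the D–H parameter values themselves depend on the specific robot and are left as inputs.

```python
from functools import reduce
import numpy as np

def dh_matrix(theta, d, a, alpha):
    """Homogeneous transform A_i of Eq. (8), standard D-H convention."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(dh_params, thetas):
    """A_0^6 = A_1 A_2 ... A_6 (Eq. (9)).

    dh_params: six (d_i, a_i, alpha_i) tuples; thetas: six joint angles in rad.
    """
    mats = [dh_matrix(t, d, a, al) for t, (d, a, al) in zip(thetas, dh_params)]
    return reduce(np.matmul, mats)
```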

The inverse kinematics is solved from the pose of the grinding head at the robot end, and the rotation angle θi of each joint is obtained from the transformation matrices in turn. Taking the solution of θ1 as an example, premultiply both sides of Eq. (9) by \(A_{1}^{-1}\); then, using the fact that the resulting matrices on the two sides are equal element by element, θ1 is obtained as shown in Eq. (10):

$$\theta_{1} = \arctan \left( {\frac{{m_{1} }}{{n_{1} }}} \right) - \arctan \left( {\frac{{d_{4} }}{{ \pm \sqrt {m_{1}^{2} + n_{1}^{2} - d_{4}^{2} } }}} \right)$$
(10)

where \(m_{1} = a_{y} d_{6} + p_{y}\) and \(n_{1} = a_{x} d_{6} + p_{x}\). In the same way, θ2–θ6 can be solved by similar matrix transformations and element correspondences. Since robot inverse kinematics analysis is well established, it is not repeated here.
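As an illustration of Eq. (10), the sketch below reads ax, ay, px, py from the pose matrix of Eq. (9) and returns both branches of the ± sign; atan2 replaces the arctangent of a ratio for quadrant safety, and the function name is hypothetical.

```python
import numpy as np

def theta1_candidates(pose, d4, d6):
    """Both solutions of Eq. (10) for theta_1, given the end pose A_0^6."""
    ax, ay = pose[0, 2], pose[1, 2]    # third column: approach vector a
    px, py = pose[0, 3], pose[1, 3]    # fourth column: position p
    m1 = ay * d6 + py
    n1 = ax * d6 + px
    root = np.sqrt(m1**2 + n1**2 - d4**2)  # real only if the pose is reachable
    return [np.arctan2(m1, n1) - np.arctan2(d4, s * root) for s in (1.0, -1.0)]
```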

4.2 Research on robot trajectory planning and simulation

Based on the D–H parameters of the robot, a simulation model of the robot, i.e., the D–H parameter model, can be constructed, as shown in Fig. 7. The essence of this model is to define a series of links and connect them by joints to form a spatial open kinematic chain. Taking a six-axis industrial robot as an example, each axis is established by setting the joint angle, joint distance, link length, link twist, and joint type, and the axes are finally connected to build the robot simulation model.

Fig. 7
figure 7

Simulation example of one industrial robot

Based on the industrial robot simulation model, the trajectory of the robot end can be planned. When simulating the motion trajectory, the joint angles, velocities, and accelerations at the initial and final points must be specified. With the velocity and acceleration at both points set to zero, the end poses corresponding to the two points are solved in Cartesian space, the joint angles are obtained from these poses by inverse kinematics, and finally the joint angles, angular velocities, and angular accelerations along the trajectory are computed by polynomial interpolation.
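A minimal sketch of the quintic interpolation for a single joint, assuming zero boundary velocity and acceleration as stated above; the boundary angles and segment duration are placeholders.

```python
import numpy as np

def quintic_coeffs(q0, qf, T):
    """Coefficients a0..a5 of q(t) with q(0)=q0, q(T)=qf and zero
    velocity and acceleration at both ends."""
    M = np.array([
        [1, 0, 0,    0,       0,        0],
        [0, 1, 0,    0,       0,        0],
        [0, 0, 2,    0,       0,        0],
        [1, T, T**2, T**3,    T**4,     T**5],
        [0, 1, 2*T,  3*T**2,  4*T**3,   5*T**4],
        [0, 0, 2,    6*T,     12*T**2,  20*T**3],
    ], dtype=float)
    b = np.array([q0, 0, 0, qf, 0, 0], dtype=float)
    return np.linalg.solve(M, b)

def evaluate(a, t):
    """Joint angle, angular velocity, and angular acceleration at time t."""
    q = sum(a[k] * t**k for k in range(6))
    dq = sum(k * a[k] * t**(k - 1) for k in range(1, 6))
    ddq = sum(k * (k - 1) * a[k] * t**(k - 2) for k in range(2, 6))
    return q, dq, ddq

# Placeholder example: move one joint from 0 to 45 degrees in 2 s.
coeffs = quintic_coeffs(0.0, np.pi / 4, 2.0)
```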

5 Case study

5.1 Construction of prototype system of vision-guided grinding robot

The hardware of the experimental grinding robot mainly consists of an industrial robot, an industrial camera with lens, and a floating grinding spindle. The specific parameters are shown in Table 1.

Table 1 Hardware components of the vision-guided grinding robot

According to the lighting environment of the laboratory, the grinding platform is placed on the left side of the robot, and the industrial camera is fixed above it. The grinding head is then fixed to the robot flange through a connecting plate, and an air pump and a frequency converter are connected. The grinding object is placed on the working platform within the field of view of the camera. The vision-guided grinding robot is designed to grind obvious defects such as metal edge burrs and metal surface bulges, reducing the manual workload in the rough grinding stage. Since the grinding operation does not require high precision, a floating grinding head that uses air pressure to achieve compliant grinding is selected. The laboratory vision-guided grinding robot system is shown in Fig. 8.

Fig. 8
figure 8

Vision-guided grinding robot system

5.2 Vision system calibration

After the vision-guided robot grinding system is set up, the vision camera and the industrial robot can be calibrated, realizing the conversion from the pixel coordinates of the captured image to coordinates in the base coordinate system of the industrial robot. The heights of the grinding platform and the grinding object are fixed and can be obtained by measurement. In addition, when calibrating the vision camera, the checkerboard calibration target is kept coplanar with the metal surface to be ground, so a 2D camera can be used for image acquisition. In this experiment, the internal and external parameters of the camera were calibrated using the 12 checkerboard calibration images shown in Fig. 4, which have different positions and orientations but are all coplanar with the metal surface to be ground.

First, the corner pixel coordinates of the twelve checkerboard calibration target images are extracted, as shown in Fig. 9.

Fig. 9
figure 9

Corner point pixel extraction in calibration target

Subsequently, Zhang's calibration method and the nonlinear least squares method are used to obtain the intrinsic parameter matrix of the vision camera as follows:

$${\mathbf{M}}_{{\mathbf{2}}} = \left[ {\begin{array}{*{20}c} {2838.9666} & 0 & {389.6784} \\ 0 & {2838.5841} & {303.8005} \\ 0 & 0 & 1 \\ \end{array} } \right]$$

On the basis of M2, the external parameters can be obtained in a similar way. Because the plane of the calibration board and its distance from the camera are known, the rotation matrix R in the extrinsic parameters can be expressed compactly as a rotation vector. Taking the world coordinate system constructed on the first of the 12 calibration board images as an example, the external parameters obtained by calibration are as follows:

$$\left\{ {\begin{array}{*{20}c} {{\mathbf{R}}{ = }\left[ {\begin{array}{*{20}c} { - 0.04843203} \\ { - 0.00613504} \\ {1.14953739} \\ \end{array} } \right]} \\ {{\mathbf{T = }}\left[ {\begin{array}{*{20}c} {1.26226667} \\ { - 4.92489213} \\ {62.85090402} \\ \end{array} } \right]} \\ \end{array} } \right.$$

After the camera calibration, the hand-eye calibration is performed to obtain the transformation between the camera coordinate system and the robot base coordinate system. The nine-point calibration plate is placed at the same height as the metal surface to be ground. By manually jogging the industrial robot, the tool tip at the robot flange is brought into one-to-one correspondence with the centers of the nine circles on the calibration plate, and the coordinates (x′, y′) of the robot in the base coordinate system at each point are recorded. Then, the pixel coordinates (u, v) of the nine circle centers are extracted from the captured image of the calibration plate. Finally, according to the correspondence between the pixel coordinates and the recorded robot end positions, the hand-eye calibration result is obtained by solving Eqs. (4)–(5).

With the nine-point calibration method, the pixel coordinates of the nine points and their coordinates in the robot base coordinate system are shown in Table 2.

Table 2 Nine-point calibration coordinate parameters

According to Sect. 3.2, the hand-eye calibration results can be calculated from the data in Table 2 as follows:

$$\left\{ {\begin{array}{*{20}c} {{\mathbf{R^{\prime}}}{ = }\left[ {\begin{array}{*{20}c} { - 0.00485561} & {0.241485} \\ {0.483952} & {0.0191143} \\ \end{array} } \right]} \\ {{\mathbf{C = }}\left[ {\begin{array}{*{20}c} {809.073} \\ { - 740.146} \\ \end{array} } \right]} \\ \end{array} } \right.$$

5.3 Extraction and localization of grinding edges

After calibration, a titanium alloy product is fixed in place as the grinding object (as shown in Fig. 10), and the camera is used to capture and preprocess its image. The captured image and the preprocessing results are shown in Fig. 11.

Fig. 10
figure 10

Fixed grinding object

Fig. 11
figure 11

Image collection and pre-processing

After preprocessing, the Canny operator is used to extract the contour of the edge to be ground. In this example, the edge of the metal surface is approximately rectangular, so the contour information of the grinding edge can be obtained by extracting the four vertices of the edge and fitting the contour. The extraction and fitting of the grinding edge contour are shown in Fig. 12.

Fig. 12
figure 12

Contour extraction

According to the vertex pixel coordinates obtained in Fig. 12, the coordinates of the four vertices of the grinding edge in the robot base coordinate system are obtained by substituting the vision system calibration results (as shown in Table 3), thereby realizing the spatial positioning of the grinding edge.

Table 3 Vertex localization results

5.4 Grinding trajectory planning and simulation

According to the D–H parameters of the grinding robot shown in Table 1, the simulation model of the industrial robot is established. It is worth noting that the working point at the end of the simulation model built from the D–H parameters is the center of the flange rather than the center of the diamond grinding disc. During grinding, contact between the grinding disc and the grinding object must be ensured, so a tool coordinate system based on the grinding disc needs to be established, transferring the working zero point and working direction from the flange to the disc. Establishing the tool coordinate system mainly involves TCP (tool center point) calibration and TCF (tool coordinate frame) calibration [28,29,30]. The main method for both is the N-point method (N ≥ 3); the calibration accuracy depends on the actual operation and generally improves as N increases. Because the working area of the grinding disc is large, the 3-point method basically meets the accuracy requirements of this experiment. Taking the 3-point method for TCP calibration as an example, a fixed reference point is first chosen in the robot workspace. The robot is then moved so that the grinding disc center reaches the fixed reference point in three different attitudes, as shown in Fig. 13. The coordinate values and corresponding Euler angles are recorded each time the disc center arrives at the reference point. Finally, the position transformation between the flange center and the grinding disc center is obtained by calling the robot system routine or using an open-source toolbox. TCP calibration requires a fixed arrival position with different attitudes, which differs from TCF calibration: in the 3-point TCF calibration, the grinding disc center keeps a fixed attitude and only the positions change. Apart from this difference, the TCP and TCF calibration processes are similar, so they are not repeated here.

Fig. 13
figure 13

TCP calibration by using 3-point method
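The paper obtains the TCP through the robot system routine or an open-source toolbox; purely for illustration, one common least-squares formulation of the N-point idea is sketched below. `tcp_from_poses` is a hypothetical helper, and its inputs are the flange rotation matrices and positions (in the base frame) recorded each time the disc center touches the reference point.

```python
import numpy as np

def tcp_from_poses(rotations, translations):
    """TCP offset in the flange frame from N >= 3 poses touching one point.

    Each pose satisfies R_i @ p_tcp + t_i = p_fixed; subtracting pairs gives
    (R_i - R_j) @ p_tcp = t_j - t_i, a linear system solvable for p_tcp.
    """
    A_rows, b_rows = [], []
    for i in range(len(rotations)):
        for j in range(i + 1, len(rotations)):
            A_rows.append(rotations[i] - rotations[j])
            b_rows.append(translations[j] - translations[i])
    A = np.vstack(A_rows)
    b = np.concatenate(b_rows)
    p_tcp, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p_tcp
```

The more attitudes (larger N) are recorded, the better conditioned this system becomes, which matches the observation above that accuracy improves with N.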

Subsequently, the joint angles of the industrial robot at the four vertex positions are obtained by substituting the data in Table 3 into the inverse kinematics of the robot. To ensure that the metal edge is fully ground and the vertices are ground smoothly, the trajectory is extended by 5 cm beyond the four vertices. A transfer point is also set directly above the first vertex as the working point before and after grinding. The trajectories from the zero point to the transfer point at the start and from the transfer point back to the zero point after grinding are arcs, while the remaining segments are straight lines. The joint angles, angular velocities, and angular accelerations along the grinding trajectory are obtained by quintic polynomial interpolation. The simulation results of the grinding trajectory are shown in Fig. 14.

Fig. 14
figure 14

Simulation of grinding trajectory

5.5 Metal surface edge grinding experiment

The grinding trajectory verified by simulation is imported into the robot FlexPendant, and a grinding program is run to grind the edge of the metal surface. According to Fig. 12b, the grinding robot needs to traverse four sides, and the grinding sequence is D → A → B → C → D. Figure 15 shows the initial position of the grinding robot, and Fig. 16 shows the actual grinding operation.

Fig. 15
figure 15

Initial position of the robot

Fig. 16
figure 16

Grinding operation process

The comparison of metal surface edges before and after grinding is shown in Fig. 17.

Fig. 17
figure 17

Comparison before and after grinding

To show the change of the metal surface edge before and after grinding more intuitively, texture analysis is used for a further comparison, as shown in Fig. 18.

Fig. 18
figure 18

Texture analysis comparison chart

Observation of Figs. 17 and 18 shows that the edge texture before grinding is rough, while the surface edge after grinding is smoother, achieving the purpose of grinding the metal surface edge.

At present, it is still difficult for the vision-guided grinding robot to completely replace manual grinding. After communication with the cooperating enterprise, the research goal was set to polish the easily ground metal surfaces and remove edge burrs, thus reducing the workload of subsequent grinding operators. This also lays a good foundation for fully automatic robot grinding in the future. In the experimental environment of the vision-guided grinding robot, the surface roughness of the grinding object, measured by a surface roughness meter, was reduced from the original 2–3 μm to less than 0.8 μm, and the edge burrs were clearly removed. This significantly improves the quality of the metal surfaces and their edges and meets the application requirements of the enterprise.

6 Conclusion

As two core technologies in the field of intelligent manufacturing, machine vision and industrial robots, when combined, endow robots with more intelligent characteristics and support the better realization of replacing people with machines. On this basis, a prototype vision-guided grinding robot system is developed to meet the flexible grinding demand for titanium surface edges put forward by a titanium production enterprise, and experimental verification is carried out on test titanium provided by the enterprise; the experimental results show that the proposed scheme is feasible. The research work in this paper is derived from the needs of our cooperating enterprise, with the goal of solving its practical problems. The starting point is to improve the enterprise's robot grinding process by introducing machine vision technology, and the experimental results prove that this application can effectively replace manual work in the rough grinding stage, exhibiting the typical characteristics of multi-technology fusion.

Compared with the traditional grinding robot that depends on preset trajectories, the prototype system developed in this paper is significantly more flexible and adaptable in operation, but it is still some distance from fully flexible grinding. In particular, for grinding complex surfaces or cavities, the limited camera field of view and the complexity of the trajectories to be planned pose challenges for practical application. In addition, the prototype system is still at the experimental validation stage, and the integration and packaging of the system need further improvement. In the future, improving the operational flexibility and system integration of the grinding robot so that it can be better applied in enterprise production practice will be the key research direction.