Digitisation of a moving assembly operation using multiple depth imaging sensors

Several manufacturing operations continue to be manual even in today’s highly automated industry because the complexity of such operations makes them heavily reliant on human skills, intellect and experience. This work aims to aid the automation of one such operation, the wheel loading operation on the trim and final moving assembly line in automotive production. It proposes a new method that uses multiple low-cost depth imaging sensors, commonly used in gaming, to acquire and digitise key shopfloor data associated with the operation, such as motion characteristics of the vehicle body on the moving conveyor line and the angular positions of alignment features of the parts to be assembled, in order to inform an intelligent automation solution. Experiments are conducted to test the performance of the proposed method across various assembly conditions, and the results are validated against an industry standard method using laser tracking. Some disadvantages of the method are discussed, and improvements are suggested. The proposed method has the potential to be adopted to enable the automation of a wide range of moving assembly operations in multiple sectors of the manufacturing industry.


Introduction
The automotive industry has always been open to technological transformation because of its need to adapt to a continuously changing marketplace and to stay competitive [1]. The industry was one of the early adopters of automation, using automated machines and robots for material handling, processing, assembly and inspection operations, and it continues to be highly automated [2]. However, a few operations in vehicle production have not yet been automated, such as those in the trim and final assembly line where the vehicle gets its seats, internal and external trims, and wheels. This is because the installation of components on a constantly moving vehicle body is a complex task that is as yet best performed by skilled human operators [3].
The focus of this work is to enable the automation of the wheel loading operation in the trim and final assembly line. Though this operation would seem straightforward for a human operator, it is one of the most complex manufacturing assembly activities to automate because it requires accurate tracking of the moving vehicle body, which sways unpredictably on the conveyor line, and real-time recognition of alignment features for successful assembly. In this operation, the human operator grabs a wheel, uses vision and cognition to track the motion of the vehicle body on the conveyor line to anticipate when and where to load, loads the wheel on the wheel hub of the vehicle axle and fastens the bolts to secure the wheel, all within a fixed time of about 10 s [4]. Chen et al. have indicated that the wheel loading operation alone can cost automotive manufacturers up to US$1.5 million a year, thereby justifying the need to automate this operation [5]. The potential threat of operators developing musculoskeletal disorders, caused by manoeuvring heavy wheels in uncomfortable body postures during installation despite using weight compensation gantries, further reinforces the need for automation [6].
The main enabler for developing an automated solution for wheel loading is a cost-effective method to accurately track in real-time the motion characteristics of the moving vehicle body and simultaneously recognise the misalignment between the to-be-loaded wheel and the hub that receives the wheel. This data can be used by an automation solution to gather intelligence about the operation in real-time, enabling it to successfully perform the wheel loading operation. This research investigates the use of gaming interface technologies such as depth imaging sensors (Microsoft Kinect™), as a potential replacement for human senses, to capture and digitise the above data in real-time and make it available to an automation solution. An example of an automation solution could be the use of expert systems to make human-like decisions during the operation to control an industrial robot arm that aligns and loads the wheels on to the moving vehicle body.
In this work, the assembly line conditions are emulated in a simplified manner in a laboratory setting in which the Kinect sensors are used. The main intent behind this work is to evaluate the feasibility of using multiple Kinect sensors to obtain live shopfloor data and validate their performance against an industry standard method in order to explore an alternative low-cost method to the current commercial shopfloor data acquisition methods. However, there is not enough evidence in literature of the Kinect sensors being successfully used in a factory environment, primarily because the sensors themselves are not certified for use in industry settings and do not yet have the software interfaces to communicate with industrial production systems. The next step in this work is to extensively test the performance of the proposed method using multiple Kinect sensors in an actual wheel loading assembly line environment over a long period in order to evaluate the industrial applicability of this work.

Related work
In the past three decades, robotic automation has played a significant role in the automotive industry in manufacturing processes such as stamping, welding, material handling and painting, but currently, there are no commercial robotic assembly solutions on the trim and final vehicle assembly line [7]. From this moving assembly line, researchers have identified the wheel loading operation as one of the early targets of automation.
The difficulty of using a rigidly programmed industrial robot to load wheels in a dynamic environment has been recognised in literature. Since most current industrial robots have to be pre-programmed with very little flexibility in their task execution, it is difficult for them to cater to complex requirements, as in wheel loading [5]. A few papers have reported attempts to automate the wheel loading operation by proposing industrial sensor-based methods to replace the human skills of simultaneously tracking the moving vehicle body to anticipate the precise loading moves and aligning the wheel with the wheel hub for successful assembly.
Early works in automation of moving assemblies proposed the use of conveyor motion tracking using data from the motion encoders and synchronising it to the assembly robot motion [4,6]. In this method, conveyor motion data along the direction of flow is obtained without the use of any additional sensors. However, in real production lines, the objects on the conveyor line are subjected to random motion in up to 6 degrees of freedom (oscillations and rotations in all 3 axes), and therefore, conveyor tracking is insufficient, and additional systems that recognise and track the objects moving on the conveyor line are required [7].
Cho et al. have reported the use of a visual tracking manipulator using a camera on the wheel gripper mounted on an industrial robot that loads the wheel to track the centre of the wheel hub on the moving vehicle body [4]. The visual tracking method is divided into macro tracking that monitors the velocity of the moving vehicle body and micro tracking that monitors the fine positional errors to assist in precision wheel loading. However, there is no mention of a proposed alignment correction method. Chen et al. have reported a method of visual servoing to track the motion of the vehicle body in 2 axes to determine the wheel loading instance and position [5]. Force sensors that measure the loading force along all 3 axes are used for precisely controlling the final movement of the robot towards the wheel hub to perform loading according to set values of compliant contact forces between the robot tool and the wheel hub. Misalignment between the wheel and the wheel hub is also checked by the visual servoing system, and a transformation is applied to correct it. Shi has reported a preliminary analysis of dynamic conveyor motion and presented the typical motion characteristics of industrial conveyors such as speed, acceleration and multi-axis deviations in motion [8]. Based on that study, Shi and Menassa have proposed a method in which a coarse vision camera tracks the general motion characteristics of the moving vehicle body with lower accuracy and a fine vision camera tracks the deviations in vehicle body motion just before loading is performed [9]. A vision camera placed at the end of the industrial robot arm loading the wheel is used to locate the wheel hub studs for alignment. Lange et al. have also proposed a coarse and fine sensing system and a compliant force-torque sensor in the robot end effector to control the loading step to compensate for final temporal or spatial offsets [10]. Predictive modelling of robot motion trajectory control in addition to the computed trajectories based on vision system inputs is used to enhance the loading precision. The camera on the robot end effector identifies the positions of the wheel hub studs with respect to the positions of the wheel bores to determine misalignment. Schmitt and Cai [7] report the use of a monocular camera mounted on the assembly robot end effector to track the moving object and estimate its motion trajectory in order to guide the motion of the assembly robot. In this work, only a single point on the moving object is tracked and the trajectory is estimated offline after all the measurement images are obtained.
In all of the above reported research, industrial vision systems such as stereovision and monocular cameras are used for object tracking and feature recognition. Firstly, these systems require computationally expensive real-time image processing and pattern-matching algorithms. Secondly, because they use the colour values of pixels to isolate target objects from the background, ambient light plays an important part in image processing accuracy, and therefore, active computational intervention is required to compensate for changes in lighting conditions. Thirdly, vision systems can only effectively provide object tracking along 2 axes, and additional force sensors are required to provide the same along the third axis.
In recent years, many researchers have studied the potential and limitations of depth imaging sensors, but few have considered using this technology in an industrial manufacturing setting. Prabhu et al. have used depth imaging sensors such as the Kinect to track components of a pen in real-time to digitise the assembly [11]. Such et al. have used them to track the progress of a composite layup process to provide look-ahead instructions for the operator laying up the composite fibre material on a tool [12]. Since depth data is provided by infrared (IR) and not visible light (RGB) imaging, these reported methods are not influenced by changes in ambient light conditions.
Prabhu et al. have previously proposed a depth sensor-based method to determine the misalignment between the wheel and the wheel hub mounted on a constantly moving vehicle body and to compute the optimum alignment manoeuvre [13]. They investigated the use of a single depth imaging sensor to recognise alignment features on both the wheel and the wheel hub over a short stretch of 400 mm of a 2.5-m long typical wheel loading workstation. This paper reports an extension of that work and investigates the use of multiple depth imaging sensors to capture not only the misalignment but also the motion tracking data along the entire length of the workstation. The motion tracking data obtained from the depth sensors is also compared to that obtained from a highly accurate laser motion tracker. The main objective of this comparison is to gauge the accuracy of the consumer-grade depth imaging sensors and evaluate whether such sensors can be used to enable the automation of the wheel loading operation in the industry.

Method
In this paper, a novel method of obtaining live shopfloor data using multiple low-cost depth imaging sensors is presented. The operation targeted is that of wheel loading on the trim and final assembly line of automotive production, with the vision to automate that operation in the future. The authors have previously demonstrated the possibility of using a depth imaging sensor to capture live shopfloor data for dynamic alignment of components to be assembled on a moving line [13]. This paper reports an extension of that work.

Mimicking the wheel loading operation
In order to collect live data pertaining to a wheel loading operation and given that it was not possible to do this in an actual production line (Fig. 1) at this stage of research, the key elements of the operation are mimicked in controlled laboratory conditions.
An important element of the operation is the moving conveyor line carrying the vehicle body with the wheel hubs mounted on its front and back axles. In a future scenario where wheel loading is automated, the wheel loading robot would have to constantly track the moving wheel hub to know when, where and how to load the wheel. A method to automatically recognise the wheel hub and track its motion along the workstation is thus needed. Therefore, the motion of the wheel hub on the conveyor line is reproduced in the laboratory by mounting the wheel hub on to a robot arm and programming the robot arm to mimic typical conveyor line motion. The conveyor motion characteristics such as out-of-plane deviations are programmed as sinusoidal oscillations in the following five patterns:
1. Linear motion of the wheel hub along x-axis without any deviations at a speed of around 67 mm/s, as reported by Shi as the average conveyor line speed [8].
2. Linear motion along x-axis at around 67 mm/s with stop-start movements to mimic the vehicle body jerks on the conveyor line.
3. Linear motion along x-axis at around 67 mm/s with sinusoidal motion deviations along the vertical y-axis to mimic the bounce of the vehicle body on the conveyor line.
4. Linear motion along x-axis at around 67 mm/s with sinusoidal motion deviations along the perpendicular z-axis to mimic the sway of the vehicle body on the conveyor line.
5. Linear motion along x-axis at around 67 mm/s with sinusoidal motion deviations along both y-axis and z-axis to mimic the composite effect of both bounce and sway of the vehicle body on the conveyor line.
Fig. 5 The experiment setup for optimising sensor positioning parameters
Fig. 6 Wheel hub centre point identification
Fig. 7 The moving wheel hub tracked within the far and near motion sensing zones
According to the data received from a Tier 1 manufacturer, a typical vehicle body in motion on a conveyor will deviate from linear motion with out-of-plane oscillations having amplitude of ±10 mm and a frequency of 1 Hz [15].
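The programmed motion patterns can be illustrated with a minimal sketch. It assumes the figures quoted above (67 mm/s, ±10 mm, 1 Hz); the function names, the 30 Hz sampling rate and the choice of in-phase bounce and sway for the combined pattern are hypothetical, and the stop-start pattern is not modelled. It is not the robot program used in the experiments.

```python
import math

SPEED_MM_S = 67.0      # average conveyor speed quoted in the text
AMPLITUDE_MM = 10.0    # out-of-plane oscillation amplitude quoted in the text
FREQUENCY_HZ = 1.0     # oscillation frequency quoted in the text


def hub_position(t, bounce=False, sway=False):
    """Return the (x, y, z) wheel hub position in mm at time t in seconds.

    bounce adds the sinusoidal deviation along the vertical y-axis,
    sway adds it along the z-axis (assumed in phase with the bounce).
    """
    x = SPEED_MM_S * t
    offset = AMPLITUDE_MM * math.sin(2.0 * math.pi * FREQUENCY_HZ * t)
    y = offset if bounce else 0.0
    z = offset if sway else 0.0
    return x, y, z


def sample_pattern(duration_s, rate_hz=30.0, **kwargs):
    """Sample a pattern at a fixed rate (30 Hz is an assumed value)."""
    n = int(round(duration_s * rate_hz)) + 1
    return [hub_position(i / rate_hz, **kwargs) for i in range(n)]
```

For example, `sample_pattern(2.0, bounce=True, sway=True)` would give two seconds of the combined bounce-and-sway pattern.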
The second important element of the operation is the radial alignment between the wheel and the wheel hub so that the bores of the wheel are in the same angular position as the studs on the wheel hub at the time of loading.The misalignment scenarios are also reproduced during the experiments by positioning the wheel hub on the robot arm with varying angular positions.

Experiment setup
The wheel loading workstation, simulated in the laboratory, is divided into two motion sensing zones, namely the far sensing zone and the near sensing zone. In the far sensing zone, the coarse motion of the moving wheel hub is tracked whereas in the near sensing zone, the motion characteristics are closely monitored. In the near sensing zone, the alignment features on the moving wheel hub are also recognised. Two depth imaging sensors, one in the far sensing zone (called the 'far sensor') and one in the near sensing zone (called the 'near sensor'), are used (Fig. 2). The wheel hub is mounted on the robot arm in such a way that the studs face the depth sensors. The robot used is a Comau NM-45, a 6-axis industrial robot arm with a maximum payload of 45 kg and position reproducibility of 0.06 mm. The robot is programmed to mimic the 5 conveyor motion patterns listed in Section 3.1, and each pattern is repeated 10 times to obtain multiple data sets to determine reproducibility of results.

Depth sensor positioning
The depth sensors are reported to have maximum accuracy over the distance range of 1 to 3 m from the sensor with an effective field of view of 54.0° horizontal and 39.1° vertical [16]. Khoshelham and Elberink have reported that the random error of measurement results increases quadratically with increasing distance from the sensor and reaches 40 mm at the maximum range of 5 m [17]. These inputs influenced the positioning of the depth sensors in the far and near sensing zones, along with the primary requirement to cover the entire length of the wheel loading workstation with the combined fields of view of the two sensors.
Therefore, the far sensor is placed at a perpendicular distance of 2 m from the moving wheel hub plane, covering a horizontal field of view of about 2 m. The near sensor is placed at a distance of 850 mm, covering a horizontal field of view of about 800 mm. The two sensors are laterally separated by a distance of 1 m to attain a view overlap of 400 mm and are placed at the same height from the ground as the moving wheel hub. The two sensors together cover a length of 2.4 m of the workstation, which has a total length of 2.5 m at a typical assembly line (Fig. 3).
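The positioning figures above can be cross-checked with a short calculation. The sketch below assumes an ideal pinhole geometry for the horizontal coverage and a purely quadratic error law anchored at the reported 40 mm at 5 m; both are simplifying assumptions, and the function names are hypothetical.

```python
import math

H_FOV_DEG = 54.0  # horizontal field of view quoted in the text


def horizontal_coverage_mm(distance_mm, fov_deg=H_FOV_DEG):
    """Width seen by the sensor on a plane at the given distance."""
    return 2.0 * distance_mm * math.tan(math.radians(fov_deg / 2.0))


def combined_coverage_mm(far_cov_mm, near_cov_mm, overlap_mm):
    """Length covered by two sensors whose views overlap."""
    return far_cov_mm + near_cov_mm - overlap_mm


def random_error_mm(distance_m, error_at_max_mm=40.0, max_range_m=5.0):
    """Quadratic error scaling (assumed to pass through zero at 0 m)."""
    return error_at_max_mm * (distance_m / max_range_m) ** 2
```

Under these assumptions, the ideal coverage is roughly 2.04 m at 2 m and 0.87 m at 850 mm, of the same order as the values quoted above, and the quoted coverages of 2 m and 800 mm with a 400 mm overlap combine to 2.4 m. The same error law gives about 6.4 mm of random error at 2 m and about 1.2 mm at 850 mm, which is consistent with using the far sensor only for coarse tracking.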
In addition to the two depth sensors, a laser motion tracker (Leica Absolute Tracker AT402) is also used to track the motion of the moving wheel hub (Fig. 2). The laser tracker uses a laser beam that is reflected off a reflector attached to the wheel hub to track its motion. It has a resolution of 0.1 μm, accuracy of ±10 μm and repeatability of ±5 μm, which makes it a very accurate device for tracking motion; it is therefore used as a gold standard in the industry and in this work to validate the motion tracking data obtained from the depth sensors.
The third depth sensor is placed directly in front of the wheel placed on the storage rack and at the same height from the ground as the centre of the wheel (Fig. 4). This sensor recognises the alignment features on the wheel, which are the 4 tapped bores, and measures their angular position. In an actual shopfloor setting, the near sensor described above could also be mounted in an eye-in-hand configuration with the camera mounted on the robot end effector as an extended fixture of the wheel loading robot. This setup will enable the near motion tracking and alignment feature detection of the moving wheel hub as well as alignment feature detection of the wheel to be performed by the same depth sensor.

Sequence of events
This experiment imitates the wheel loading operation as it is performed in an actual automotive trim and final assembly line. The sequence of events reproduced is as follows:
(a) The robot arm moves the wheel hub linearly across the workstation (along x-axis) for a distance of about 2.5 m at an average speed of about 67 mm/s.
(b) The wheel hub first enters the far sensing zone in which the far sensor tracks its centre and records its spatial position in all 3 axes. The speed of motion is also computed.
(c) The wheel hub then enters the near sensing zone in which the near sensor tracks its centre and records its spatial position in all 3 axes. The speed of motion is also computed. The near sensor also recognises the alignment features, the 4 studs, on the moving wheel hub and records their angular positions.
(d) The depth sensor placed in front of the stationary wheel recognises the alignment features, the 4 bores, on the wheel and records their angular positions.
(e) The data generated in each of the above steps enables the automated wheel loading scenario of the future to make critical human-like decisions such as when, where and how to load the wheel onto the moving wheel hub and to dynamically correct the misalignments, if any, between the wheel and the wheel hub before loading. Any data that is out of the tolerance limits can be used to trigger an abort.

Determination of optimum setup
The use of depth imaging sensors in a real manufacturing environment to obtain live moving assembly data is a relatively new concept in literature. Little is known on the optimum sensor positioning parameters and their influence on data capture precision and accuracy, such as the perpendicular distance of the sensor from the objects to be tracked and the sensor face plane angle with respect to the assembly line plane. Therefore, a few experiments are conducted to determine the effects of variations in sensor positions on the measured data and to obtain the optimum setup for data capture.
The test case of wheel feature recognition is used to conduct these experiments (Fig. 5). In the first experiment, the effect of sensor face angle with respect to the wheel face plane was investigated. Instead of varying the sensor face plane angle, the wheel face plane angle was varied for convenience of setup. This angle was varied from −20° to +20°, and the optimum angle range in which the wheel features were successfully recognised was determined. In the second experiment, the distance of the sensor from the wheel face was varied from 700 to 950 mm, and the optimum distance for successful feature recognition was determined. In the third experiment, the number of depth image frames used to apply cumulative averaging of feature recognition data was varied to obtain the optimum number that gave the best-averaged result.
Each of the 3 experiments is repeated for 10 iterations to check for reproducibility of the results. The setup that produces results with the least standard deviation is chosen for positioning the depth sensors in the digitisation of the simulated wheel loading operation.
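The selection rule described above can be sketched as follows. The use of the sample standard deviation and the function name are assumptions, and any input values passed to it here are purely illustrative rather than measured data.

```python
import statistics


def least_spread_setup(measurements_by_setup):
    """Pick the setup whose repeated measurements vary the least.

    measurements_by_setup maps a setup label (for example a sensor
    distance in mm) to the angles measured over repeated iterations.
    Returns the best label and the standard deviation of every setup.
    """
    spreads = {
        setup: statistics.stdev(values)
        for setup, values in measurements_by_setup.items()
    }
    best = min(spreads, key=spreads.get)
    return best, spreads
```

Each setup needs at least two iterations for the sample standard deviation to be defined; the experiments described above use 10.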
Despite optimum setup, the sources of errors that could impact the measurement results in this work are as follows:

Motion tracking and feature recognition of the moving wheel hub
The algorithms used to track the moving wheel hub and recognise the angular positions of the alignment features of both the moving wheel hub and the stationary wheel are based on the comparison of depth values of the pixels that belong to the object with those of the background, as reported by Prabhu et al. [13]. To track the moving wheel hub, the depth images provided by the far sensor are continuously monitored for depth values of pixels belonging to the wheel hub motion plane, represented by the far motion sensing zone, in the range of 800 to 900 mm, indicating the presence of the wheel hub. The algorithm then scans the zone to determine object extremities in the horizontal and vertical directions.
Once the extreme points are identified, computing and averaging their midpoints results in identification of the wheel hub centre point (Fig. 6). Since the Kinect sensor produces up to 30 depth images per second, the continuous identification of the centre point within these images results in tracking the motion of the moving wheel hub.
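A minimal sketch of this centre point identification is given below. It assumes the depth image is available as a list of rows of depth values in mm and simplifies the extremity search to the outermost rows and columns that contain in-range pixels; the function name and data layout are hypothetical, and the algorithm reported in [13] may differ in detail.

```python
def hub_centre(depth_mm, near_mm=800, far_mm=900):
    """Locate the hub centre as the midpoint of its extremities.

    depth_mm is a list of rows, each a list of depth values in mm.
    Returns (row, column) of the centre in pixel units, or None when
    no pixel falls inside the depth range that indicates the hub.
    """
    if not depth_mm:
        return None
    # Rows and columns that contain at least one in-range pixel.
    rows = [r for r, line in enumerate(depth_mm)
            if any(near_mm <= d <= far_mm for d in line)]
    cols = [c for c in range(len(depth_mm[0]))
            if any(near_mm <= line[c] <= far_mm for line in depth_mm)]
    if not rows:
        return None
    # Midpoints of the vertical and horizontal extremities.
    return (rows[0] + rows[-1]) / 2.0, (cols[0] + cols[-1]) / 2.0
```

Calling this function on every incoming depth frame would yield the sequence of centre points from which the motion is tracked.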
The new concept introduced in this work is the use of multiple depth sensors and the segregation of the workstation into far and near motion sensing zones (Fig. 7). The moving wheel hub first enters the field of view of the far sensor in the far motion sensing zone. This zone covers the pre-loading area where the x, y and z positions of the centre of the wheel hub are tracked, and its motion speed is constantly computed. Any deviations or disruptions in motion along any of the 3 axes are captured and recorded. Since the far sensor is placed at a relatively larger distance from the wheel hub motion plane, it can track a wider area but is also less accurate and therefore can only measure coarse motion characteristics.
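The speed computation can be illustrated with a finite-difference sketch. It assumes each tracked centre point carries a timestamp, since the sensor delivers up to, rather than exactly, 30 frames per second; the function name and argument layout are hypothetical.

```python
def speeds_mm_s(timestamps_s, x_positions_mm):
    """Speed along the x-axis between consecutive tracked positions."""
    samples = list(zip(timestamps_s, x_positions_mm))
    return [
        (x1 - x0) / (t1 - t0)
        for (t0, x0), (t1, x1) in zip(samples, samples[1:])
    ]
```

A stop-start disruption would appear in this output as speed values dropping towards zero between otherwise steady readings near the conveyor speed.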
The wheel hub then moves into the field of view of the near sensor in the near motion sensing zone. This zone covers the loading area, and therefore, the fine motion is tracked with more accuracy and precision than in the far sensing zone. In this zone, along with tracking the wheel hub centre point, the alignment features of the moving wheel hub are also recognised. To recognise the alignment feature (stud), the depth images provided by the near sensor are continuously monitored to identify the wheel hub centre point. From the centre point, the pitch centre diameter (PCD) line, which is located at a radial distance of 108 mm from the centre point, is tracked in order to find pixels that have depth values around 900 mm, indicating the presence of the stud (Fig. 8). Once the stud is identified, its centre point is obtained by locating its extremities.
The centre points of the remaining three studs are computed easily since the studs are radially located 90° apart. The angular positions of the studs are represented by the angular position of the stud located within the 90° to 180° quadrant (Fig. 9).
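The angular bookkeeping described above can be sketched as follows. The sketch assumes angles are measured counter-clockwise from the positive x-axis with the y-axis pointing up (image coordinates with rows increasing downwards would flip the sign), and that the quadrant is taken as the half-open interval from 90° up to but not including 180°; all function names are hypothetical.

```python
import math


def stud_angle_deg(centre_xy, stud_xy):
    """Angle of a stud centre about the hub centre, in [0, 360)."""
    dx = stud_xy[0] - centre_xy[0]
    dy = stud_xy[1] - centre_xy[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0


def all_stud_angles_deg(first_angle_deg, n_studs=4):
    """Angles of all studs, given one stud and equal angular spacing."""
    step = 360.0 / n_studs
    return sorted((first_angle_deg + k * step) % 360.0 for k in range(n_studs))


def representative_angle_deg(any_stud_angle_deg):
    """Angle of the stud that lies in the 90 to 180 degree quadrant."""
    return 90.0 + (any_stud_angle_deg % 90.0)
```

For instance, a stud detected at 30° implies studs at 30°, 120°, 210° and 300°, so the representative angular position would be 120°.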

Motion tracking and feature recognition of the stationary wheel
The depth sensor placed in front of the stationary wheel also uses the same edge detection algorithm to detect the wheel centre as the one used to detect the moving wheel hub centre (Fig. 10a). The 4 bores located 90° apart from each other at a pitch centre diameter of 108 mm from the wheel centre (Fig. 10b) are recognised, and their angular positions are measured in terms of the angle of the bore located within the 90° to 180° quadrant.
The difference between the angular positions of the alignment features on the wheel and those on the wheel hub denotes a misalignment (Fig. 11) that needs to be corrected before loading can take place.
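One way to express this misalignment is as a signed angular difference. The sketch below assumes that, because the 4 features repeat every 90°, the smallest correcting rotation lies within ±45°; this wrapping rule and the function name are assumptions and not necessarily the alignment manoeuvre computed in [13].

```python
def misalignment_deg(bore_angle_deg, stud_angle_deg, n_features=4):
    """Signed bore-minus-stud angle, wrapped to half the feature pitch."""
    period = 360.0 / n_features
    diff = (bore_angle_deg - stud_angle_deg) % period
    if diff > period / 2.0:
        diff -= period
    return diff
```

A result of zero would indicate that the bores and studs are already aligned, and the sign indicates the direction in which the wheel is rotated relative to the hub.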

Results
Digitisation of the wheel loading operation is the process of obtaining key shopfloor data that will be used by the automated version of the operation in the future. The results of this digitisation process are presented in this section in the following order:
1. Identification of wheel features and measurement of the angular positions of the bores.
2. Motion tracking of the moving wheel hub and identification of the angular positions of the wheel studs for the following simulated motion patterns:
(a) Linear motion along x-axis with no deviations in y-axis and z-axis.
(b) Jerky motion along x-axis with no deviations in y-axis and z-axis.
(c) Linear motion along x-axis with sinusoidal deviation in y-axis.
(d) Linear motion along x-axis with sinusoidal deviation in z-axis.
(e) Linear motion along x-axis with sinusoidal deviation in y-axis and z-axis.

Recognition of alignment features of the wheel
The depth sensor captures depth images of the stationary wheel at the rate of up to 30 frames per second. From within each depth image, the 4 bores of the wheel are recognised and their angular positions, represented by the angle of the bore located within the 90° to 180° quadrant (the 'first bore'), are measured. To improve the accuracy of this method, the angle obtained is cumulatively averaged over 45 depth frames before it is recorded. Ten iterations of the experiment are conducted, and the results are tabulated in Table 1.
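The cumulative averaging step can be sketched as a running mean over per-frame angle measurements. The function names are hypothetical; because the measured angle always lies in the 90° to 180° quadrant, a plain arithmetic mean is assumed to be adequate without circular wrap-around handling.

```python
def cumulative_average(angles_deg):
    """Running mean of the angle after each successive frame."""
    averages = []
    total = 0.0
    for count, angle in enumerate(angles_deg, start=1):
        total += angle
        averages.append(total / count)
    return averages


def averaged_angle(angles_deg, n_frames=45):
    """Angle recorded after cumulatively averaging n_frames frames."""
    window = angles_deg[:n_frames]
    if len(window) < n_frames:
        raise ValueError("not enough frames to average")
    return cumulative_average(window)[-1]
```

The value returned by `averaged_angle` corresponds to the single angle recorded per iteration once 45 frames have been processed.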

Motion tracking and alignment feature recognition of the wheel hub
The far and near depth sensors track the motion of the wheel hub by continuously detecting the centre point of the hub and recording its x, y and z coordinates along with its speed in the direction of motion (x-axis). In the far sensing zone, the far sensor tracks the position and speed of the wheel hub whereas in the near sensing zone, the near sensor also identifies and records the angular positions of the studs of the moving wheel hub.
The motion tracking data obtained from the far and near sensors is compared to that obtained from the laser tracker that tracks the same motion. Since the laser tracker and the depth sensor are not synchronised during motion tracking, the two sets of data cannot be plotted and visualised on the same chart. The motion tracking results for the five simulated motion patterns are presented below. Since each motion pattern is run for 10 iterations, the wheel hub position and speed values are averaged over the 10 iterations.
In the far sensing zone
Wheel hub motion tracked along all 3 axes by the far sensor and the laser tracker when the wheel hub is in the far sensing zone is presented in the charts shown in Fig. 12. The corresponding speed computed from the far sensor and laser tracker motion data is plotted in the charts shown in Fig. 13.
In the near sensing zone
Wheel hub motion tracked along all 3 axes by the near sensor and the laser tracker when the wheel hub is in the near sensing zone is presented in Fig. 14. The corresponding speed computed from the near sensor and laser tracker motion data is plotted in the charts shown in Fig. 15. In this zone, the angular position of the wheel hub stud located in the 90° to 180° quadrant (the 'first stud') is also measured for 10 iterations as shown in Table 2.

Jerky motion along x-axis with no deviations in y-axis and z-axis
Since jerky motion is along the x-axis only, the y-axis and z-axis motion tracking charts are not shown.
In the far sensing zone
Figures 16 and 17 show the motion and speed charts produced by the far sensor and the laser tracker, respectively.
In the near sensing zone
Figures 18 and 19 show the motion and speed charts produced by the near sensor and the laser tracker, respectively. Table 3 shows the angular positions of the wheel hub measured over 10 iterations for this motion pattern.

Linear motion at 67 mm/s along x-axis with sinusoidal deviations in y-axis
Since the oscillations are along the y-axis only, the x-axis and z-axis motion tracking charts are not shown.
In the far sensing zone
Figure 20 shows the motion charts produced by the far sensor and the laser tracker.
In the near sensing zone
Figure 21 shows the motion charts produced by the near sensor and the laser tracker, and Table 4 shows the angular positions of the wheel hub measured over 10 iterations.

Linear motion at 67 mm/s along x-axis with sinusoidal deviations in z-axis
Since the oscillations are along the z-axis only, the x-axis and y-axis motion tracking charts are not shown.
In the far sensing zone
Figure 22 shows the motion charts produced by the far sensor and the laser tracker.
In the near sensing zone
Figure 23 shows the motion charts produced by the near sensor and the laser tracker. Table 5 shows the angular positions of the wheel hub measured over 10 iterations for this motion pattern.

Linear motion at 67 mm/s along x-axis with sinusoidal deviations in y-axis and z-axis
Since the oscillations are along the y-axis and z-axis, the x-axis motion tracking chart is not shown.
In the far sensing zone
Figures 24 and 25 show the motion charts produced by the far sensor and the laser tracker for y-axis and z-axis deviations, respectively.
In the near sensing zone
Figures 26 and 27 show the motion charts produced by the near sensor and the laser tracker for y-axis and z-axis deviations, respectively. Table 6 shows the angular positions of the wheel hub measured over 10 iterations for this motion pattern.

Discussion
In this paper, a unique method of obtaining live moving assembly data from the shopfloor using low-cost depth imaging sensors is presented. The moving assembly operation chosen is wheel loading in the trim and final assembly line in automotive production. The eventual aim of this research is to enable cost-effective automation of this complex operation. Therefore, to replace the human operator, it is necessary to have a solution that can observe the assembly operation and capture critical operation data, make decisions based on the data and implement actions based on the decisions made to successfully perform the assembly. This work proposes a method of using low-cost depth imaging sensors as a replacement for the human operator's ability to observe and digitise the operation. The proposed method uses three depth sensors in total: one sensor to recognise the alignment features of the to-be-loaded stationary wheel and the other two sensors to capture the motion characteristics of the moving wheel hub as well as recognise its alignment features. The data obtained from these three sensors could be used by a decision support/expert system to make load/abort, align/not align, when-to-load and where-to-load decisions for the automated wheel loading solution. However, before the method is implemented, it is necessary to determine the optimum sensor positioning on the shopfloor and optimum sensor parameters to obtain the best possible results. The alignment feature recognition of the stationary wheel is used as a test case for this purpose, and the results are discussed.

The impact of distance of the sensor from the observed object
From a study of the impact of sensor distance on the measurement precision of the alignment feature angle, it was observed that at a distance of 950 mm and above, the features of the wheel were too small to be rendered in the depth image, whereas the minimum distance below which the feature recognition algorithm did not work was 700 mm. Therefore, the distance between the sensor and the wheel was varied from 700 to 950 mm, and the optimum distance of 850 mm was obtained with the least standard deviation of 0.28° (Fig. 28).

The impact of sensor face angle with respect to the object plane
The feature recognition algorithm uses depth values of the pixels corresponding to the object being tracked to recognise features and measure their angular positions. Therefore, the sensor face (Fig. 29) is expected to be perfectly parallel (relative angle is zero) to the object face plane every single time. However, this is not possible in a real shopfloor scenario, and therefore, it was necessary to find the angle range within which the proposed method would work. The results show that the feature recognition works reliably only within the −10° to +10° range (Fig. 30).

Fig. 26 Wheel hub positions captured by the near sensor and the laser tracker (y-axis)

Impact of number of frames used for averaging
In this work, the cumulative averaging technique is used to reduce the measurement errors while obtaining the angular positions of the alignment features on the wheel and the moving wheel hub. The sensor produces 30 depth image frames per second, and the algorithm processes each image to recognise the features and measure their angular positions. Because of the noise present in the sensor depth data, a measurement obtained from only 1 frame does not suffice. Therefore, angular positions measured from multiple frames are averaged to determine the final angular positions. The number of frames averaged was varied from 1 frame (no averaging) to 120 frames, and the optimum number of frames was found to be 45, with the least standard deviation of 0.17° (Fig. 31). Averaging over 45 frames results in a delay of 1.5 s to obtain the angular position result, which is satisfactory.
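The averaging step and its latency can be illustrated with a minimal Python sketch. It is not the code used in this work: the function names are hypothetical, and it assumes that the per-frame angles do not straddle the 0°/360° wrap-around, so that a plain arithmetic mean is meaningful.

```python
from typing import Iterable, List

FRAME_RATE_HZ = 30.0  # depth frames per second, as stated in the text


def cumulative_average(angles_deg: Iterable[float]) -> List[float]:
    """Return the running mean of the per-frame angles after each frame.

    Assumption: the angles do not wrap around 0/360 degrees.
    """
    means: List[float] = []
    mean = 0.0
    for n, angle in enumerate(angles_deg, start=1):
        # Incremental update: mean_n = mean_(n-1) + (x_n - mean_(n-1)) / n
        mean += (angle - mean) / n
        means.append(mean)
    return means


def averaging_delay_s(num_frames: int, frame_rate_hz: float = FRAME_RATE_HZ) -> float:
    """Time needed to collect num_frames frames at the given frame rate."""
    return num_frames / frame_rate_hz
```

With 45 frames at 30 frames per second, `averaging_delay_s(45)` gives the 1.5 s delay quoted above; the last element of `cumulative_average(...)` is the angle that would be reported after all frames have been processed.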

Impact of IR interference between the two depth sensors
Depth imaging sensors are infrared (IR) light emitting devices which measure depth by processing the IR waves that are reflected back to them from the surfaces in their field of view. Therefore, when two or more sensors are used to observe the same scene, as in this research, the IR waves emitted by the sensors interfere with each other, causing significant noise in the depth data obtained from the sensors [18]. Due to this constraint, the far and near sensors were time-multiplexed to operate at different times during the operation. When the wheel hub entered the far sensing zone, the near sensor was switched off, and when it entered the near sensing zone, the far sensor was switched off. This way, no two sensors operated at the same time, thereby avoiding noise due to IR interference. Figure 32 shows the effect of interference on the quality of depth data. However, the second generation of the depth sensors, such as the Kinect v2 [19], does not present any interference problems when multiple sensors are used at the same time.
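The time-multiplexing rule can be sketched in a few lines of Python. This is an illustration only: the zone boundary position, the convention that the hub position increases along the conveyor direction from the far zone into the near zone, and all names are assumptions that are not taken from this work.

```python
from enum import Enum
from typing import Dict


class Zone(Enum):
    FAR = "far"
    NEAR = "near"


def active_sensor(hub_x_mm: float, zone_boundary_mm: float) -> Zone:
    """Select the single sensor allowed to emit IR for the current hub position.

    Assumption: the hub is in the far zone until it reaches zone_boundary_mm
    (a hypothetical parameter) and in the near zone from there onwards.
    """
    return Zone.FAR if hub_x_mm < zone_boundary_mm else Zone.NEAR


def emitter_states(hub_x_mm: float, zone_boundary_mm: float) -> Dict[Zone, bool]:
    """Return on/off states such that exactly one sensor is on at any time."""
    active = active_sensor(hub_x_mm, zone_boundary_mm)
    return {zone: zone is active for zone in Zone}
```

Because the two states are derived from a single decision, the sensors can never be on simultaneously, which is the property the time-multiplexing relies on.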

Accuracy of depth sensor motion tracking
In this work, motion data captured by the depth sensors is compared with that obtained from the industry standard laser tracker. Hence, the error in depth sensor motion data is computed relative to the motion data produced by the laser tracker. The motion parameters used for the relative error computation are the values of motion speed, deviation amplitude and deviation frequency for all 5 motion patterns, averaged over 10 iterations each (Table 7).
From the above results, it can be noted that the near sensor is more accurate in motion tracking than the far sensor due to its closer proximity to the moving wheel hub, which enables it to capture better depth images of the wheel hub. It can also be noted from Table 7 that the near sensor is able to better track the motion deviations of the wheel hub, with lower deviation frequency error than the far sensor. Therefore, there is significantly less lag in tracking motion deviations of the moving hub in the near sensing zone than in the far sensing zone while maintaining the error difference between them. Finally, from the error values of motion patterns 4 and 5, it can be observed that the depth sensors are less accurate in tracking motion in the depth axis (z-axis) than in the other 2 axes. This phenomenon could be linked to the way in which the sensors calculate the depth values of pixels in the 3D scene by way of interpolation based on the structured light technique [20] rather than absolute depth measurement of each pixel.
The measurement of wheel hub speed along the direction of motion is critical in determining the position and time at which to load the wheel. In this work, the error in the measurement of average wheel hub speed ranges from 0.15 to 8.06 mm/s (Table 7). Contrary to expectation, the far sensor average speed errors are lower than those of the near sensor for all motion patterns. However, on closer observation, the error spread along the entire tracked motion is more erratic for the far sensor than for the near sensor, an example of which is shown in Fig. 33.
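As a rough illustration of how a speed error relative to the laser tracker could be computed, the following Python sketch derives an average speed from time-stamped x-axis positions and takes the absolute difference from the reference value. The function names and the choice of an end-to-end average are assumptions; the exact computation used in this work is not specified here.

```python
from typing import Sequence


def average_speed(times_s: Sequence[float], positions_mm: Sequence[float]) -> float:
    """Average speed (mm/s) from the first to the last time-stamped position."""
    if len(times_s) != len(positions_mm) or len(times_s) < 2:
        raise ValueError("need at least two matching time/position samples")
    return (positions_mm[-1] - positions_mm[0]) / (times_s[-1] - times_s[0])


def error_vs_reference(sensor_value: float, reference_value: float) -> float:
    """Absolute error of a depth sensor value against the laser tracker value."""
    return abs(sensor_value - reference_value)
```

The same `error_vs_reference` helper applies to deviation amplitude and deviation frequency, provided both instruments report the parameter in the same units.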

Performance of the proposed method
For the wheel loading use case selected in this work, the wheel bores are 20 mm in diameter and the wheel hub studs are 12 mm in diameter. Therefore, an assembly tolerance of 4 mm is required for successful wheel loading irrespective of the motion patterns. It can be noted from table x and y that the error in measuring motion deviation amplitude, especially in the crucial near motion sensing zone, is less than the required assembly tolerance of 4 mm. For the measurement of the angular position of the wheel hub studs, the maximum standard deviation noted was 1.54°, which is equivalent to 1.46 mm and is also less than the required 4 mm tolerance. Therefore, the proposed method is feasible for wheel loading operations that use the specifications of the wheel and the wheel hub used in this work.
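The arithmetic behind these figures can be checked with a short Python sketch. The 4 mm tolerance follows from the radial clearance between bore and stud, (20 − 12)/2 mm. The conversion from 1.54° to 1.46 mm is assumed here to be an arc length s = rθ on the stud circle; the radius is not stated in the text, and the value of roughly 54 mm used in the usage note is back-calculated from the quoted numbers, so it should be treated as an assumption.

```python
import math


def radial_clearance_mm(bore_diameter_mm: float, stud_diameter_mm: float) -> float:
    """Radial gap between a centred stud and its bore."""
    return (bore_diameter_mm - stud_diameter_mm) / 2.0


def arc_length_mm(angle_deg: float, radius_mm: float) -> float:
    """Linear displacement on a circle of radius_mm for a given angular error."""
    return math.radians(angle_deg) * radius_mm


def radius_for_arc_mm(arc_mm: float, angle_deg: float) -> float:
    """Radius implied by an arc length and the angle that subtends it."""
    return arc_mm / math.radians(angle_deg)
```

For example, `radial_clearance_mm(20, 12)` returns 4.0, and `radius_for_arc_mm(1.46, 1.54)` gives the implied stud circle radius of about 54 mm under the arc length assumption.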

Future work
According to Chen et al., the minimum assembly tolerance used in the industry is 2 mm [5]. The maximum error of 2.78 mm recorded in this work for composite y-axis and z-axis deviations renders the proposed method unsuitable for the industry in its current version. However, the Kinect v2, with its higher resolution of depth images (512×424 pixels), and better object recognition algorithms will be investigated in an attempt to bring the error values below the required 2 mm.
In order to apply the proposed method of capturing wheel loading operation data on the actual wheel loading workstation, firstly, the area between the Kinect sensors and the plane of wheel hub motion on the conveyor line must be kept free of any disturbances. Secondly, the distance constraint of 850 mm from the near Kinect sensor to the plane of wheel hub motion may limit the operational space of the wheel loading robot. Both these requirements could be met if the near Kinect sensor is mounted on the wheel loading robot itself. This setup saves the space taken up by the near sensor on the shopfloor and at the same time enables the robot arm to be controlled precisely as the wheel hub motion is tracked in real time.
In a real wheel loading scenario, the wheel hub mounted on the vehicle axle carries additional components such as the brake disc callipers or, in some cases, a drum brake setup. Therefore, the tracking method proposed here will need to be amended to recognise the wheel hub and its alignment features in the presence of such components. Also, the motion of the car body on the conveyor line and its effect on the resulting motion of the wheel hub, considering all 6 degrees of freedom, is complex and needs extensive investigation by studying the motion characteristics of an actual assembly line at the wheel loading workstation. IR interference between the far and the near sensors compelled the motion sensing zone to be divided into mutually exclusive far and near motion sensing zones. The Kinect v2 sensors are expected to be significantly less affected by IR interference, and therefore, a motion tracking setup with the far sensor tracking the moving wheel hub along the entire length of the loading workstation can be used. This setup will enable the far sensor to constantly track the moving hub for major disruptions, whereas the near sensor can track the motion more precisely and determine misalignments more accurately.
In this work, the depth sensors are not calibrated apart from the calibration done by the manufacturer, and therefore, object recognition and tracking quality degrades as the object moves away from the centre of the field of view of the sensor. A calibration method is needed to enhance the accuracy of coordinate mapping between the sensor coordinate system and the real world coordinate system, and this is expected to enhance the accuracy of motion tracking and feature recognition.
Finally, as an extension of this work, visual servoing methods will need to be developed in order to precisely control the motion and the wheel loading action of the industrial robot arm. The input data to such methods will be the real-time motion tracking data provided by the Kinect sensors using the method proposed in this work.

Conclusions
In this paper, a new method to digitise the wheel loading operation, which is one of the most complex, cost-intensive and yet to be automated moving assembly operations in the automotive industry, is presented. Digitisation involved simultaneous motion tracking of the vehicle body on the conveyor line, including the measurement of its motion characteristics, and measuring the misalignments between the moving wheel hub and the wheel to produce data which is critical to the success of the wheel loading operation. The novelty of this research is the use of multiple consumer-grade depth imaging sensors, commonly used in gaming, to obtain the motion and misalignment data in real time. The proposed method is also unique in its approach of dividing the wheel loading workstation into far and near motion sensing zones to use the depth sensors at varying accuracy levels as demanded at different times of the operation. This feature attempts to overcome the relatively lower resolution of the depth sensors in view of their lower cost of ownership and operation as compared to some of their more expensive and highly accurate industrial counterparts.
In this work, the wheel loading operation was simulated in a controlled lab environment with the wheel hub mounted on an industrial robot arm to mimic the conveyor line motion. Two depth imaging sensors, one each in the far and near motion sensing zones, were used to track and record the coarse and fine motion of the moving wheel hub, respectively, along all 3 axes. The alignment features on both the stationary wheel and the moving wheel hub were also recognised by the depth sensors, and the misalignment between them was measured. The two zones were mutually exclusive in time and never operated at the same time, to avoid IR interference between the two depth sensors. Five different wheel hub motion patterns were simulated, and experiments to track motion and recognise alignment features were conducted for each pattern with 10 iterations each. A laser tracker was used to simultaneously capture and record the motion of the wheel hub in each of the experiments. This allowed the authors to gauge the motion tracking accuracy of the depth sensors in the two motion sensing zones vis-à-vis the laser tracker, which is considered the gold standard in this research on account of its high accuracy.
The results show that the depth sensors and the associated image processing code are able to track the moving wheel hub and measure its motion characteristics in real time. The use of far and near motion sensing zones separates the low and high accuracy needs of the wheel loading operation while capturing the entire workstation length of 2.5 m. Despite the relatively low resolution of the depth imaging sensor, placing it at a short perpendicular distance of 850 mm from the moving wheel hub plane resulted in a maximum recorded motion tracking error of 2.78 mm. This method therefore meets the assembly tolerance of 4 mm, which is dictated by the type of wheel and wheel hub used in this work. However, this error is higher than the reported minimum assembly tolerance of 2 mm used in the industry. With the recent launch of the second generation of the depth imaging sensors (Kinect v2), with almost double the depth resolution, coupled with improved depth-based object detection algorithms, it is anticipated that this method will be able to achieve a motion tracking error of less than 2 mm and therefore be suitable for adoption by the industry.
This paper exploits the consumer-grade nature of the depth imaging sensors to propose a cost-effective method to digitise moving assembly operations with the aim of enabling the automation of these operations in the future. The constant market-driven upgrading of these sensors as a result of their widespread use in the gaming sector will result in the development of better imaging capability, making complex object tracking and feature recognition possible in the future. The low cost of ownership and operation, the consumer-proven robustness and the resulting simplification of image processing algorithms due to the availability of a constantly improving 3rd dimension along the depth axis will therefore open up a whole new area of moving assembly digitisation research.

Fig. 3 Depth sensor positioning for far and near motion sensing

Fig. 8 Location of wheel hub studs

Fig. 10 a Depth image of the wheel with bores recognised and b 2D drawing of the wheel

Fig. 12 Wheel hub positions captured by the far sensor and the laser tracker

Fig. 13 Wheel hub motion speed (x-axis) captured by the far sensor and the laser tracker

Fig. 15 Wheel hub motion speed (x-axis) captured by the near sensor and the laser tracker

Fig. 16 Wheel hub positions captured by the far sensor and the laser tracker (x-axis)

Fig. 17 Wheel hub speed captured by the far sensor and the laser tracker (x-axis)

Fig. 19 Wheel hub speed captured by the near sensor and the laser tracker (x-axis)

Fig. 20 Wheel hub positions captured by the far sensor and the laser tracker (y-axis)

Fig. 22 Wheel hub positions captured by the far sensor and the laser tracker (z-axis)

Fig. 23 Wheel hub positions captured by the near sensor and the laser tracker (z-axis)

Fig. 24 Wheel hub positions captured by the far sensor and the laser tracker (y-axis)

Fig. 27 Wheel hub positions captured by the near sensor and the laser tracker (z-axis)

Fig. 28 Impact of sensor distance on feature recognition precision

Fig. 31 Impact of the number of frames averaged on data accuracy

Fig. 33 Speed values computed by a the far sensor and b the near sensor and the laser tracker

Table 1 Angle of the first bore on the stationary wheel

Table 2 Angle of the first wheel hub stud

Table 3 Angle of the first wheel hub stud

Table 4 Angle of the first wheel hub stud

Table 5 Angle of the first wheel hub stud and its standard deviation measured over 10 iterations

Table 6 Angle of the first wheel hub stud

Table 7 Relative errors in depth sensor motion tracking data