1 Introduction

The manufacturing of products usually requires multiple steps, often split among several working stations. In general, the more complex a product is, the more steps are needed to manufacture it. Additionally, a complex product often requires other sub-products and may come in variations that address the specific needs of a particular customer. This introduces considerable complexity into the manufacturing process, and the path a product takes through the factory reflects that complexity. To make any statement about the performance of a factory layout or production pipeline, a general overview and understanding of the material flow are necessary. When and where a workpiece is produced, and in what time frame, is critical information for the assessment of productivity and efficiency. Real-time locating system (RTLS)-based approaches reduce the work required to gather this information. Instead of labor-intensive manual tracking of workpieces or expensive full-scale robotic automation or digitalization of factories, only a pair of tracking devices needs to be used. This makes the solution especially interesting for small and midsized companies without an abundance of either of these resources. RTLS is used in factories for many different applications, such as layout planning or adjusting raw material buy orders. [4, 7] Hammerin et al. [8] proposed the use of RTLS for real-time management of production environments. Thiede et al. [9, 10] concentrated on optical and AI-based image recognition RTLS approaches capable of identifying humans in factory settings. Wolf et al. [11] derived efficiency data from human-centered ultra-wideband (UWB) tracking, while Löcklin et al. [12] focused on the prediction of human movement to avoid accidents. Other technologies, like radio-frequency identification (RFID), were shown by Arkan et al. [13] to work in different scenarios. Küpper et al. [14] investigated the application of 5G for RTLS. Sullivan et al. [18] introduced value stream mapping using UWB as a valid method for the decision-making process in manufacturing systems. The visualization, classification of the data, and detection of outliers are some of the contributions that we add to enhance the capabilities of this approach.

While the technical aspects of RTLS are thoroughly explored, the application and data analysis remain ambiguous. In order to derive knowledge and make informed decisions, the raw data has to be processed and displayed in an effective way. Type and quality of visualizations influence our decision-making process. [15] Visual analysis tools are therefore key components of any material flow optimization approach. The scalability of RTLS approaches also needs to be considered. If it is necessary to manually review each data point, automating the data collection phase does not add much efficiency. To bring up important information while minimizing the analyst's workload, an intelligent framework is required.

Typically, a company may want to review its manufacturing processes on a regular basis, but especially after new machines, products, or employees are introduced into the process. The analyst tags certain workpieces with the tracking device and gathers the data automatically. The data can then be processed and displayed to determine what changes are required to maximize productivity (e.g., an additional machine or employee is required, milling must be done differently to produce fewer workpieces that must be reworked, etc.).

To detect anomalies in the material flow, simple statistics are often not sufficient. Fig. 1 plots the total duration of each workpiece trajectory in our data; each column corresponds to one trajectory. Workpiece trajectories may differ in duration because of traffic building up, the reworking of a product, or any other reason. One would therefore expect to spot all outliers this way. However, although the total duration correctly indicates an anomaly for trajectory index 3, it misses the anomalies at indexes 0, 5, and 6. Further analysis methods are needed to identify them.

Fig. 1. Total time of material paths of the unfiltered dataset. Anomalies in paths 0, 5, and 6 cannot be identified.

To improve on all that and enable a proper analysis of RTLS tracking data, we propose a visual analysis framework and processing pipeline. Our core contributions are

  • the generation of a UWB material flow dataset, its filtering, and its preprocessing

  • the specification and automated detection pipeline of anomalies for bottleneck identification

  • the workpiece-specific, graph-based trajectory visualization for comparability

  • the concept of data type similarity between streamlines in flow fields and RTLS trajectory data

  • and the embedding and clustering of workpiece trajectories for automated anomaly detection and pattern identification

In the following, we first describe the data acquisition and experimental setup and then explain, in the preprocessing section, how the data needs to be filtered and transformed. The visualization section goes into detail about the analysis and anomaly detection. Finally, the discussion section covers the limitations and future work, followed by the conclusion.

2 Method

2.1 Dataset

Since the research on RTLS in the context of material flow is relatively new, no state-of-the-art datasets have been made available for benchmarks or comparisons. In other domains of research (e.g., computer vision, machine learning, etc.) there is a consensus among researchers to compare their work on the same datasets to enable an objective assessment of the quality of their models. [1,2,3] This development has not yet taken place in the manufacturing research community. However, it will become increasingly important in the future to establish these types of datasets in order to enable a standardized common ground and build more complex models for automated and reliable systems. The generation of such datasets comes with its own set of unique challenges.

Oftentimes, companies decide to withhold the publication of their data so as not to give competitors insights or other advantages. This isolates the R&D departments of different companies, drives up costs, and hinders innovation. Some companies have adopted a middle course, cooperating with universities and sharing insights into non-critical processes.

Another challenge is the generation of material flow data itself. Data collection under real-world manufacturing conditions takes time if there is no steady production (as in universities and laboratories). This leads to the current situation, with few or no publicly available datasets.

To still enable research on this topic, we recorded our own dataset to demonstrate our approach and provide one building block to close that data gap. It can easily be expanded or modified in the future and serves to provide reproducibility.

Several RTLS technologies are known, each with advantages and disadvantages. [4] Radio-frequency identification (RFID), for example, provides high spatial accuracy but has a very limited operating range (1 m). Bluetooth Low Energy (BLE) suffers from a similar range limitation. With different WiFi localization methods, the range is between 150–200 m, with an accuracy of 1–5 m. Similar accuracy is achieved by 5G, whose range is virtually unlimited for factory-scale RTLS. The same holds for GPS, which, however, has the worst accuracy (2–10 m) and is inherently limited to outdoor use.

We selected UWB tracking since its accuracy (~0.5 m) is the best among these technologies and its range (150–200 m) can cover a significant part of the factory floor. The tracking was done by a station and a client device. Since UWB tracking is not yet a standard feature of consumer-grade smartphones, a modified Raspberry Pi was used as a handheld device to track the position. It is small enough to be moved together with any workpiece through a factory.

For our dataset, we used an existing factory hall and set up virtual stations that can represent any kind of material processing, like milling, drilling, or quality control. These stations do not exist in reality, since the actual manufacturing of workpieces would massively exceed the scope of this research and does not contribute directly to the quality of the data. Instead, the stations are simulated using cardboard boxes and tables. Similar to a factory setting, the material is introduced into the factory at some position (the start point). A sketch in Fig. 2 illustrates the qualitative layout of our setup. After the material is present in the manufacturing environment, the tracking device is associated with a single workpiece. It then moves to the first station. In our case, the Raspberry Pi was taken and moved to one of our artificial stations.

Fig. 2. Sketch of the factory layout for artificial material flow data generation.

The workpiece is then processed at that station. For our setup, this means that we let the tracking device lie near the station, just as it would in a real setting while the workpiece is processed. The benefit of our approach is that we can shorten the time span compared to the real processing of workpieces. A fixed amount of time, usually on the order of a few seconds, is enough to simulate the processing of a workpiece. This lets us record more data without any loss in quality. The tracking then continues from station to station until our fictional product is completed. To enhance the dataset, pre-planned anomalies were introduced at certain steps. This allows the findings of detection algorithms to be compared with the ground truth. In Table 1, the recorded trajectories and the incorporated anomalies are listed for the first dataset. This clearly defines what an anomaly is and what an algorithm is supposed to identify. Depending on the manufacturing context, the notion of what constitutes an anomaly might change. For our setup, a significant amount of extra waiting time at a particular station or a different route (rework at the previous station) is considered a deviation from the normal production flow.

Table 1. List of pre-planned anomalies introduced in the first dataset. Paths 2, 3, 5, and 8 do not include any anomaly and are thus not shown here.

We generated three additional datasets, each corresponding to a fictional product with its own path and anomalies, in the same manner as described above. The additional datasets enable us to split the data into training and validation sets. The robustness and generalizability of any data processing and exploration approach are critical properties. The effects of shifts in scale, noise, time, and other factors can be examined by cross-validation. In the following, we use these datasets to present our visual analysis framework and anomaly detection pipeline.

2.2 Preprocessing

Although UWB tracking is among the most precise technologies available and is theoretically resilient to multi-path interference [4, 6], it is prone to noise and to interference from metal objects in its path. [5] In our setup, disconnects result in duplicates or extreme outliers in the recorded positions. For the data to be used in any analysis tool, it first has to be cleaned and filtered so that structures become visible. We did this by removing duplicates and extreme outliers from the trajectories, which makes these structures identifiable. All trajectories from the first dataset are shown in Fig. 3, which lets us clearly identify the working stations and the rough layout of the factory.

Fig. 3. Unfiltered material flow trajectory dataset with outliers already removed. Paths and working stations can be identified, but the signal is still very noisy.

However, the signal is still very noisy. To facilitate further processing, filtering has to be applied. While there are many filtering techniques available, we used our knowledge of the system and basic filtering methods to avoid any distortions and preserve the underlying structure.

Since the measurements are taken in the real world, the recorded coordinates correspond to physical units, and the movement of the workpiece is bound by the laws of physics. If the velocity or acceleration implied by a recorded point falls outside the expected range (e.g., a velocity above 10 km/h), the point is likely an outlier and should be removed. After that, we apply a moving average filter to suppress high-frequency noise.
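The two filtering steps described above can be sketched as follows. This is a minimal illustration, not our exact implementation: the sample layout `(t, x, y)` in seconds and metres, the function names, and the window size are assumptions, and only the velocity bound (not the acceleration bound) is checked. The sketch also assumes that the first sample of a trajectory is valid.

```python
from math import hypot

# Assumed sample layout: (t_seconds, x_metres, y_metres).
MAX_SPEED = 10 / 3.6  # the 10 km/h example bound, expressed in m/s


def drop_speed_outliers(samples, max_speed=MAX_SPEED):
    """Keep a sample only if it is reachable from the last kept sample
    without exceeding max_speed. The first sample is assumed to be valid."""
    if not samples:
        return []
    kept = [samples[0]]
    for t, x, y in samples[1:]:
        t0, x0, y0 = kept[-1]
        dt = t - t0
        if dt <= 0:
            # Duplicate or out-of-order timestamp, e.g. after a disconnect.
            continue
        if hypot(x - x0, y - y0) / dt <= max_speed:
            kept.append((t, x, y))
    return kept


def moving_average(samples, window=5):
    """Centered moving average over x and y; timestamps stay unchanged.
    The window is truncated at both ends of the trajectory."""
    half = window // 2
    smoothed = []
    for i, (t, _, _) in enumerate(samples):
        lo, hi = max(0, i - half), min(len(samples), i + half + 1)
        n = hi - lo
        mean_x = sum(s[1] for s in samples[lo:hi]) / n
        mean_y = sum(s[2] for s in samples[lo:hi]) / n
        smoothed.append((t, mean_x, mean_y))
    return smoothed
```

A larger window removes more noise but also rounds off corners of the path, so the window size would have to be tuned to the sampling rate of the tracker.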

An important component of the data is centered around the working stations. The length of time a workpiece remains there and the order in which it is visited are useful pieces of information for any type of analysis. For this, we extract stations out of the trajectories by detecting clusters of signals. A cluster is a place where the workpiece stayed for an extended period of time. This way, the working stations and all the points associated with them can be identified. To create the finished material flow path, they are combined, corrected for timestamps, and collapsed if part of a small loop. In Fig. 4 a single filtered trajectory with identified working stations is displayed. It is the basis for further analysis.

This approach may also lead to stations being falsely identified if a workpiece stays in between stations for a longer time. In comparison to other routes, an additional station will pop out immediately. This allows for easy identification of material flow traffic jams in factories.
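The station extraction can be illustrated with a simple dwell-based detector. This is a sketch under assumptions rather than our exact clustering: a station is reported whenever consecutive samples stay within a hypothetical `radius` of the first sample of the run for at least `min_dwell` seconds. Both thresholds and the `(t, x, y)` sample layout are illustrative, and the subsequent merging of stations and collapsing of small loops is not shown.

```python
from math import hypot


def detect_stations(samples, radius=1.0, min_dwell=3.0):
    """Return a list of (x, y, t_enter, t_leave) tuples, one for every run of
    consecutive samples that stays within `radius` metres of the run's first
    sample for at least `min_dwell` seconds."""
    stations = []
    i = 0
    while i < len(samples):
        t0, x0, y0 = samples[i]
        j = i + 1
        # Extend the run while the workpiece stays close to where it started.
        while j < len(samples) and hypot(samples[j][1] - x0, samples[j][2] - y0) <= radius:
            j += 1
        run = samples[i:j]
        t_leave = run[-1][0]
        if t_leave - t0 >= min_dwell:
            # Use the centroid of the run as the station position.
            cx = sum(s[1] for s in run) / len(run)
            cy = sum(s[2] for s in run) / len(run)
            stations.append((cx, cy, t0, t_leave))
            i = j
        else:
            i += 1
    return stations
```

With such a detector, a workpiece that waits between two stations for longer than `min_dwell` yields an additional entry, which corresponds to the falsely identified station described above.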

Fig. 4. Single filtered trajectory. The path can be clearly seen, and the stations have been identified.

2.3 Visualization

To quickly allow the identification of the cause of an anomaly, it is important to display the data in a meaningful way. With a visual understanding of why there has been an outlier in the data, it becomes easier to determine what actions need to be taken in order to mitigate the problem. Numerical information about the duration at each station is only useful in the second step.

For this, we chose a graph-based approach. This has several benefits. The general positions of the stations and the trajectory of workpieces remain the same. This lets the user grasp the spatial domain better and understand the scope of the problem. Secondly, the graph-based layout is intuitive because it is used in other applications. The user is therefore already used to it and does not need extensive training to understand what is displayed.

The primary requirement for a material flow visualization in this application is the ability to display and distinguish its properties. For that, we choose a directed graph since the order of workpiece processing is generally relevant. Another property is the timing, which is cumulatively encoded as the thickness of the arrow or circle. It represents the time it took for the workpiece to traverse it. If the trajectory has one of a set of known anomalies that can be detected in the filtered data, it can be marked with a different color to quickly draw attention to it.
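The cumulative encoding described above can be sketched as a small aggregation step. The input format, a list of `(station, t_enter, t_leave)` visits in chronological order, and the function name are assumptions made for illustration. The resulting node and edge times would then be mapped to circle size and arrow thickness by whatever drawing library is used.

```python
from collections import defaultdict


def build_flow_graph(visits):
    """Aggregate one trajectory into a directed graph.

    visits: chronological list of (station, t_enter, t_leave).
    Returns (node_time, edge_time), where node_time[station] is the total
    time spent at that station and edge_time[(a, b)] is the total transport
    time from a to b. Repeated visits and repeated transitions accumulate.
    """
    node_time = defaultdict(float)
    edge_time = defaultdict(float)
    for k, (station, t_in, t_out) in enumerate(visits):
        node_time[station] += t_out - t_in
        if k + 1 < len(visits):
            nxt, nxt_in, _ = visits[k + 1]
            edge_time[(station, nxt)] += nxt_in - t_out
    return dict(node_time), dict(edge_time)


def revisited_stations(visits):
    """Stations that occur more than once, e.g. because of rework."""
    seen, repeated = set(), set()
    for station, _, _ in visits:
        (repeated if station in seen else seen).add(station)
    return repeated
```

Because the edges are keyed by ordered pairs, a rework loop such as S3 → S4 → S3 → S4 produces both a forward and a backward edge, which keeps the processing order visible in the graph.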

2.3.1 Graph Visualization

The benefits of this visualization lie in its ability to display trajectory data without excessive visual clutter. Positional information is still conveyed without details about the exact position of the workpiece. For efficiency analysis purposes, it is irrelevant if a workpiece moves a couple of centimeters more to the right or left of the path. Time information, in contrast, is tightly linked to efficiency and is therefore displayed cumulatively. A workpiece should be built in the same amount of time regardless of the time of day. Using scale as the channel to display duration information has the benefit that it lets users qualitatively compare the quantities, which is needed if outliers are to be identified. However, the representation of duration as size also comes with known human-centered biases. It is known that an area is underestimated (by an exponent of ~0.7) while other visual stimuli are overestimated. [17] One might want to adjust for this factor to aid visual perception, but this sacrifices absolute comparability. For our visualization, we decided against a perception-based correction. Lastly, the choice to use the color of the glyph to highlight specific elements is natural due to the effect of warning colors on human perception.

All this allows for intuitive exploration and quick assessment of different outliers.
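For reference, the perception-based correction that we decided against can be written down in a few lines. The sketch assumes a simple power-law model in which the perceived area is the drawn area raised to the exponent of ~0.7, with areas normalized to a reference glyph of size 1. The function names and the normalization are illustrative assumptions.

```python
AREA_EXPONENT = 0.7  # approximate exponent for perceived area quoted in the text


def perceived(area, exponent=AREA_EXPONENT):
    """Perceived size of a glyph drawn with the given (normalized) area."""
    return area ** exponent


def compensated(area, exponent=AREA_EXPONENT):
    """Area to draw so that the perceived size matches the intended value."""
    return area ** (1.0 / exponent)
```

Under this model, a node that should appear twice as large would have to be drawn with roughly 2.7 times the area, which is why the corrected glyphs are no longer comparable in absolute terms.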

Fig. 5. Graph visualization. The thickness of arrows and stations displays the cumulative time it took the material to traverse them. The anomaly (double time at the topmost station, marked in red) can be identified for the left dataset (a) in comparison to the right dataset (b).

An example can be seen in Fig. 5, where two trajectories of workpieces from the same product group are displayed. Their positions and directions all match up, and the layout of the factory can be conceptualized easily. Through careful examination, we are able to identify that (a), the left workpiece trajectory, remains twice as long at the topmost (3rd) station as (b), the right one. Because excess time at a station is one of the a priori known anomaly types, we can also color the affected node. The visualization conveys not only the presence of the anomaly but also its magnitude accurately. Neither the recording nor the filtering changed the qualitative scale of the trajectory. The node is twice the size, meaning the workpiece remained there twice as long, which is correct in comparison to the ground truth.

Other anomalies, like additional stations caused by traffic jams or extremely slow transportation times, can be detected in much the same way.

Fig. 6. Graph visualization. The anomaly (revisiting station 3 and then going back to station 4, marked in red) can be identified for the left dataset (a) in comparison to the right dataset (b).

Another example is shown in Fig. 6, where on the left side (a) the workpiece travels back to station 3 before it continues to station 4 again until it is finished. In comparison, on the right side (b) the regular path does not involve loops. In our product specification, this is considered an anomaly: the workpiece has some kind of defect, which needs to be fixed at the previous station. Depending on the manufacturing procedure, however, this may be part of the normal production cycle.

2.3.2 Overlapping Graph Visualization

There are also more advanced visualization techniques that suit the needs of particular applications or improve the workflow. An example is shown in Fig. 7 (a), where additional information in the form of overlaid trajectories may aid in the investigation of certain events.

Increased practicability may be achieved by the overlapping graph visualization of Fig. 7 (b), where less visual memory is required for the comparison between two trajectories.

Fig. 7. Graph visualization with an overlapping detailed trajectory plot on the left (a) and overlapping graphs for direct comparison on the right (b).

2.3.3 Embedding Visualization

The graph visualization is useful for single workpiece trajectories. The properties are visualized intuitively. However, for larger amounts of data, it becomes tedious to compare and analyze individual graphs with each other. This requires automatically detecting patterns in the data and focusing on outliers or groups of trajectories rather than individual ones.

For this, we use the work of Rossl et al. [16], originally designed for the embedding of streamlines. They compute their embedding with multidimensional scaling (MDS) based on the Hausdorff distance between pairs of streamlines.

Fig. 8. Embedding using the Hausdorff distance and MDS. Clusters in the embedding (left) represent similar workpiece trajectories (right). The embedding can be used for pattern recognition and the identification of outliers.

Similarly, we can apply this to our data and generate an embedding, as seen in Fig. 8. Workpiece trajectories can be thought of as streamlines of material flow inside a factory. Using this insight, we can apply methods designed for fluid flow and streamlines to our data. With their approach, the workpiece trajectories are embedded into a lower-dimensional space (here 2D), where a small Euclidean distance indicates high similarity. Similar workpiece trajectories thus naturally form clusters in the embedding, while unique, dissimilar ones appear as outliers or anomalies. In Fig. 8, the embedding on the left contains a single point in the top left-hand corner. It corresponds to the trajectory of the workpiece that had to revisit the previous station. Trends or structures that are more common in the dataset can also be identified this way. If a roughly equal share of the dataset is shifted, for instance because small delays are introduced half of the time and then propagate further, simple statistics may not show any abnormality. For example, the three points selected in the middle have all spent more time at a station than the rest. Even though they form a considerable portion of the dataset, it is still possible to identify them as different. With this tool, it is possible to detect larger trends and patterns in the data, which can then be individually analyzed using the graph visualization.
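The embedding step can be sketched in plain Python as follows. This is an illustration under assumptions and not the implementation of [16]: trajectories are treated as lists of 2D points, the symmetric Hausdorff distance is computed by brute force, and classical (Torgerson) MDS with power iteration is swapped in for the MDS optimization. All function names are hypothetical.

```python
import random
from math import hypot, sqrt


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point lists [(x, y), ...]."""
    def directed(p, q):
        return max(min(hypot(px - qx, py - qy) for qx, qy in q) for px, py in p)
    return max(directed(a, b), directed(b, a))


def distance_matrix(trajectories):
    """Pairwise Hausdorff distances between all trajectories."""
    return [[hausdorff(a, b) for b in trajectories] for a in trajectories]


def classical_mds(dist, dims=2, iters=200, seed=0):
    """Classical MDS: eigenvectors of the double-centered squared distance
    matrix, found by power iteration with deflation. Dimensions without a
    clearly positive eigenvalue are left at zero."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    tol = 1e-9 * (max(abs(x) for row in b for x in row) or 1.0)
    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = sqrt(sum(x * x for x in w))
            if norm <= tol:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        if lam <= tol:
            break
        for i in range(n):
            coords[i][d] = sqrt(lam) * v[i]
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]  # deflate the found component
    return coords
```

Points that end up far from all others in `classical_mds(distance_matrix(trajectories))` are candidates for anomalies, while tight groups correspond to recurring patterns. The brute-force distance is quadratic in the number of samples per trajectory, so a library implementation would be preferable for larger datasets.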

3 Discussion

Explorative visual data analysis is important for examining workpiece trajectories recorded by RTLS. We showed how automatic processing and visualization can be constructed to aid in the identification of anomalies and production bottlenecks. Because this paper pairs technical advancements in RTLS with established visual analysis tools, it also raises the question of how such a system should be designed.

It can be argued that the simplicity of the presented visualizations could be exchanged in favor of more sophisticated target-specific visualizations and proper training of personnel. The filtering and preprocessing of the data also allow for a variety of techniques. Spatio-temporal data might benefit from dedicated trajectory filtering. Advanced approaches were not necessary for our data but may be needed for other factory settings with more metal surfaces causing interference for UWB receivers. The scalability of our graph visualization is also limited by the number of nodes and arrows that can intersect each other before the result becomes too cluttered. A dynamic alpha value for an interactive exploration of very long paths (high alpha values for nodes close to the selected time) could be one solution. This was not necessary for our data, though. Future work might also produce more specialized solutions for factory-level feature analysis. Lastly, we expect advancement to be dependent on the availability of public datasets. For this, different kinds of anomalies and other problems in manufacturing can be introduced as an extension to our dataset. Our approach has also only been evaluated on a limited number of experimental workpiece trajectories. Real manufacturing environments may include additional obstacles like obscuring objects or more volatile movements. Systems for real-world applications may have to deal with these additional technical challenges.

Another promising research topic would be the utilization of more advanced methods for analysis, such as machine learning. The application of traditional streamline, flow, or other domain-specific algorithms on workpiece trajectory data has already proven useful and surely holds more opportunities for further optimization.

4 Conclusion

In this paper, a visual analysis and automated processing pipeline for UWB trajectory data was introduced. We used UWB tracking as the RTLS to generate datasets for material flow in manufacturing environments. Through filtering and preprocessing, we enabled the automated detection of anomalies and were able to accurately identify bottlenecks. A workpiece-specific, graph-based trajectory visualization allowed the intuitive and quick comparison of individual paths, while bigger datasets could be examined with approaches developed for streamlines in fluid flow visualization because of the similarity of the data types. We showed that cluster selection in embeddings greatly increases the scalability of anomaly detection and enables the systematic examination of factory material flow efficiency. In the future, we expect more work on the automated detection and analysis of this data, together with the rise of Industry 4.0, to utilize the computational advancements in other fields and increase the efficiency of manufacturing factories.