Perspectives on data-driven models and their potential in metal forming and blanking technologies

Today, the design and operation of manufacturing processes rely heavily on models, whether analytical, empirical or numerical, e.g. finite element simulations. Models reflect reality only as well as their design and structure allow, and in many cases they are based on simplifying assumptions and abstractions. Reality in production, reflected by measures such as forces, deflections, travels and vibrations during process execution, is strongly characterised by noise and fluctuations that reveal a stochastic nature. In metal forming, the impact of such effects on the produced part can today neither be explained in detail nor captured by the aforementioned models. In industrial manufacturing, the way process data is handled has changed completely, and engineers have learned to value the information contained in such digital signals. Process data gained from real process environments in many cases contains plenty of technological information, which may help to increase production efficiency, reduce downtime or avoid scrap. For this reason, the authors focus on process data gained from numerous metal forming technologies and sheet metal blanking in order to use it for process design objectives. The supporting idea is a potential combination of conventional process design strategies with new models based purely on digital signals captured by sensors, actuators and production equipment in general.
To utilise established models combined with process data, the following obstacles have to be addressed: (1) acquired process data is biased by sensor artifacts and often fails to meet data quality requirements; (2) mathematical models such as neural networks rely heavily on large quantities of training data with good quality and sufficient context, but such quantities are often unavailable or impossible to obtain; (3) data-driven black-box models often lack interpretability of their results, which makes it difficult to assess their plausibility and to extract new knowledge. This paper gives an insight into the use of available data science methods such as feature engineering and clustering on metal forming and blanking process data. It is complemented by recent approaches to data-driven models and methods for capturing, revealing and explaining previously invisible process interactions. In addition, the authors describe recent findings and current challenges of four practical use cases taken from different domains in metal forming and blanking. Finally, the authors present and discuss a structure for data-driven process modelling as an approach to extend existing data-driven models and to derive process knowledge from process data, aiming at a robust metal forming system design. The paper also aims to identify future research demands in this challenging field of increasing the robustness of such manufacturing processes.


Introduction
A model is a consciously constructed reproduction of reality and is based on structures, functions or analogies. Models are used to solve tasks where direct operations on the original are possible only with difficulty, are too costly or are not expedient [1]. In production technology, the design and operation of manufacturing processes rely on different types of models for process layout, observation, improvement and control. However, solving these tasks requires a detailed multi-level procedure that starts with engineering and designing manufacturing equipment into which measuring devices, sensors and additional actuators can be integrated to a certain extent. In addition, suitable sensors and signal amplifiers have to be specified to ensure sufficient quality of the raw process data. Within a comprehensive data management, the data dimension usually must be reduced by modern feature engineering techniques in order to generate suitable data sets [2].
In the following literature review, we cover these aspects by presenting current trends in developing data-driven models for different objectives in sheet and bulk metal forming. Recent research is briefly discussed to illustrate the current use of white-, grey- and black-box models in this field of manufacturing. Feature extraction and feature selection also play an important role in data science, since feature extraction can be performed in the time, frequency or mixed domains. The overview introduces current trends in using neural networks and other tools of artificial intelligence (AI) for process control and for predicting the quality of metal formed or blanked parts. It is motivated by the fact that state-of-the-art models such as numerical process simulation are limited when engineering and operating metal forming processes. In particular, a sophisticated explanation of the effects of uncritical process noise and critical process fluctuations on component quality appears to be out of reach. The literature in this field discloses only limited and incremental progress of numerical process simulation codes during the last two decades in terms of predicting part quality and explaining transient process interactions. For that reason, the authors see a future-oriented approach in supplementing existing numerical process simulations and sensitivity calculations with data-driven (black-box) models based on heuristic empirical knowledge. This is due to a particular challenge in metal forming: numerous physical effects (e.g. wear, elastic deflections, thermal expansion) often have a simultaneous and spatially resolved nonlinear effect on individual forming stages or on the entire process sequence, and thus also on part quality [3]. These complex interrelationships are accompanied by various impacts of process noise, e.g. batch variations of the incoming semi-finished product, which is sequentially processed in one or more forming tools. Currently, these so far unexploited effects can be observed in large amounts of process data captured by multiple sensors [4]. For that reason, the main chapter of this contribution contains four use cases that highlight specific challenges in gaining sufficient process data, extracting features efficiently and interpreting the process outcome reliably. The final discussion raises the question to what extent process data gained from previous production runs and process simulations can be used for data-driven modelling by informed machine learning algorithms [5].
Next to the integration of domain knowledge, newly available explainable artificial intelligence (XAI) may help to increase transparency, interpretability and explainability [6] and thus to derive an optimised design of forming tools and their active surfaces for the next series production.

Classification
Volk et al. reviewed the classification and characterisation of models with a focus on metal forming technology [7]. To identify and select proper models for specific purposes, the capability of existing models can be evaluated against the following criteria. Accuracy describes the deviation between real and modelled output, while precision refers to consistent results with low spread [8]. Execution and response time evaluate the computational effort and latency of a model. Robustness refers to a constant precision of results under different boundary conditions. Flexibility describes the transferability and adaptation of models to different and non-intended tasks. Explainability and transparency are evaluated by the degree of knowledge gain. Production engineers are mainly interested in increasing the degree of process knowledge gain, which may be based on physically influenced models.
A white-box model is based completely on physical and mathematical correlations and therefore provides a high degree of knowledge gain and transparency. Such deterministic models, based on valid physical correlations, often reduce complexity by means of simplifying assumptions to reach a suitable response time, which is often accompanied by a lack of accuracy. In the field of metal forming, such closed mathematical descriptions and semi-analytical methods are limited to simple load cases, e.g. the estimation of bending processes [9]. Nevertheless, these models serve as a basis for the development of numerical process simulations, e.g. by the finite element method (FEM). Hybrid models involving physical and empirical knowledge, so-called grey-box models, are commonly used and have been further developed over the last decades in many use cases in the tool and process design of metal forming applications. Recent developments of grey-box models in this field consider the description of general physical phenomena such as friction or the impact of varying material properties in combination with fundamental physical experiments. Current and future developments address the modelling of series production effects by stochastic methods with an increased computational effort [10]. In contrast, empirical black-box models are based exclusively on data [2]. Correlations between input and output can be determined by stochastic or physically supported methods or by self-learning algorithms. This results in data-driven models that perform well regarding the accuracy of the predicted process outcome as well as execution and response time, but that are disadvantageous with respect to transparency and gain of knowledge [2]. In the field of metal forming, black-box models based on process data may help to describe complex phenomena in series production such as wear, material scatter and other superimposed effects without increased effort in modelling and computation.
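The distinction between accuracy and precision made above can be illustrated with a minimal Python sketch. The function name and the press-force values are hypothetical and serve only as an illustration; taking the mean deviation as a measure of accuracy and the standard deviation of the deviations as a measure of precision is one common convention, not a definition taken from [7] or [8].

```python
from statistics import mean, stdev

def accuracy_and_precision(measured, predicted):
    """Return (mean deviation, spread of deviations) between model and reality."""
    deviations = [p - m for p, m in zip(predicted, measured)]
    bias = mean(deviations)     # accuracy: systematic offset of the model
    spread = stdev(deviations)  # precision: scatter of the deviations
    return bias, spread

# Hypothetical maximum press forces in kN (illustrative values only)
measured = [100.0, 102.0, 98.0, 101.0]
predicted = [103.0, 105.0, 101.0, 104.0]
bias, spread = accuracy_and_precision(measured, predicted)
```

In this constructed example the model is precise (the deviations do not scatter) but not accurate (every prediction is offset by the same amount).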

Data acquisition, transformation and modelling
During the last few years, great efforts have been made to digitise metal forming processes and to analyse process characteristics in real time or based on historical data in order to predict their outcome and to improve process understanding [11]. However, the complexity of metal forming processes is in many cases increased by the variety of operating principles, which hinders the transfer of discovered knowledge from one process to another.
In manufacturing industries, the quality of a produced product is often characterised by data with respect to material and geometric specifications. The product quality can also be associated with corresponding process data. Recent developments were made in combining product and process data with data acquired from sensors, actuators and the forming machine itself. Transforming these data into mathematical spaces for data modelling may in the future allow error propagation within the process sequence to be reduced and process robustness to be increased [12].

Data acquisition
Data sets are usually captured by sensors or gained from actuators, drives, deformation or force gauges and other electrical components consuming energy. Measurement is normally based on direct or indirect measurement principles [13]. Direct measurement requires a sensor capable of capturing the desired physical value directly. In indirect measurement, sensors determine auxiliary values whose correlation to the actually desired physical value is known through a non-deterministic transfer function [14]. In manufacturing processes, a multitude of sensors, e.g. pressure, temperature, speed and position sensors, are used to acquire real-time status information on different physical actions and effects [15] observed, for example, in the forming machine or tool. Considering the technological context of metal forming, each forming process is characterised by specific measurement conditions, which pose technology-related challenges for data acquisition that have to be taken into account. For instance, hot forming processes such as drop forging are characterised by high temperatures and forming forces as well as long motion paths of the workpiece volume on the active die surface during the process [16]. The sensors used must therefore be suitable for the corresponding temperature and force ranges. The temperatures and high mechanical loads occurring during forging are applied cyclically, resulting in alternating loads. A harsh forging environment is often also associated with high stroke rates and correspondingly high frequencies [17], which require high-speed and high-volume data recording [18]. In addition, spray cooling or dust poses further challenges to capturing the exact process status in hot forging [19]. Comparable constraints in data acquisition are also found in other bulk forming technologies.
In metal blanking, another field of forming technology, the measured forces commonly result from peak loads when the tool hits the workpiece and when the material breaks. This leads to nonlinear and transient time series that represent the physics of these processes. High accelerations and short tool engagement times therefore have to be considered when integrating sensors into blanking tools as well as when designing the measurement chain [20]. In this context, the sensor type as well as its location in the tool or process should be taken into account [21].
Referring to the term "garbage in, garbage out" in the context of data processing, high demands should be placed on data acquisition, on the validity check of data and on data quality [22]. Data quality is often comprised of multiple dimensions [23], with accuracy, consistency, time series data and completeness being the most commonly referenced and widely respected dimensions [24]. Considering those dimensions, it needs to be taken into account that poor data quality results in poor data output [25]. This makes it necessary to evaluate the quality level of collected data [18]. In addition to the challenge of ensuring good data quality, three main aspects of data acquisition, namely variety, volume and velocity, must be addressed [23].
• Variety: The data acquired originates from a multitude of sources and has a heterogeneous character [26]. In general, data can be divided into structured, semi-structured and unstructured types of data. While structured data is tagged and ready to be sorted as well as analysed, unstructured data is random, which makes it difficult to process [27].
• Volume: The sensors and other measurement devices normally do generate time-dependent series data during operation, which increasingly accumulate into very large data sets. Therefore, distributed storage of massive sensor data is required [15].
• Velocity: The expression velocity depends primarily on the speed of data acquisition as well as the reliability of data transmission. Besides that, the efficiency of data storage and the speed of discovery of useful knowledge also needs to be considered when use of data is aimed on advanced feature engineering objectives [18].
Given these aspects, the vast number of data sets, which also contain a great amount of content with many attributes, makes manual data analysis infeasible [28]. Automated methods and approaches for data transformation and modelling are therefore needed and are presented hereafter.
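As a small illustration of the automated evaluation of data quality levels mentioned above, the following Python sketch checks a recorded force channel for completeness and for values outside a plausible sensor range. The function names, the sample values and the range limits are assumptions made for illustration and are not taken from the cited literature.

```python
import math

def completeness(samples):
    """Fraction of samples that are present and finite."""
    if not samples:
        return 0.0
    valid = [s for s in samples if s is not None and math.isfinite(s)]
    return len(valid) / len(samples)

def out_of_range(samples, lower, upper):
    """Indices of valid samples that lie outside the plausible sensor range."""
    return [i for i, s in enumerate(samples)
            if s is not None and math.isfinite(s) and not lower <= s <= upper]

# Hypothetical force channel in kN with a dropout, a NaN and a spike
force_kn = [0.0, 12.5, None, 480.0, float("nan"), 9999.0, 3.2]
ratio = completeness(force_kn)               # share of usable samples
flags = out_of_range(force_kn, 0.0, 1000.0)  # positions of implausible values
```

Checks of this kind can run per stroke or per batch, so that incomplete or implausible recordings are flagged before they enter feature extraction.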

Data transformation
Following recent trends in the digitisation of manufacturing processes, the related optimisation of sensor systems and the development of high-performance measurement software, companies in the field of metal forming technology are faced with the heterogeneous nature of data captured in manufacturing due to different data types, properties or sources.
Bellman [29] describes this problem from the perspective of machine learning as the "curse of high dimensionality" to point out that algorithms that work well in low dimensions become intractable with high-dimensional inputs. The complexity of models grows exponentially with dimensionality (the number of features), which hinders the generalisability of such models [30]. Furthermore, an advanced data acquisition procedure results in a data space that includes redundant features, which add no further information to the model while drastically increasing the computational effort of the learning algorithm. Feature reduction techniques thus aim to reduce the dimensionality and to remove noise and redundancy from the given data without removing relevant information [31]. For the practical application of black-box modelling approaches, this data transformation is divided into the two steps of feature extraction and feature selection (see Fig. 1).
During feature extraction, the dimensionality of the data is automatically reduced by a transformation operation. However, this reduces the engineering interpretability of the extracted features and entails a loss of information. According to Li [32], features are extracted from time signals in the time domain, the frequency domain or the time-frequency domain, or by a model-based approach. Features in the time domain do not require a special transformation operation and can be determined directly from the given data set. Mostly, these features are statistical parameters such as extreme values, the statistical moments of first to fourth order such as mean values, standard deviations, skewness and kurtosis, or the root mean square [33]. Spectral and frequency analyses can be used to transform time signals into the frequency domain, where spectral features such as the power spectrum, maximum frequencies or spectral entropy can be determined. Since forming technology mainly acquires sensor-based data such as forces, accelerations or torques, which have a transient signal characteristic, the use of conventional spectral analysis techniques is limited [34]. In this context, advanced transformation techniques such as the Haar and Hilbert-Huang transforms are becoming increasingly important [35,36]. Since spectral transformations show poor resolution in the time domain and information is lost when transforming to the frequency domain, approaches from the time-frequency domain such as the wavelet transform or the Wigner-Ville distribution offer the possibility to consider features from both domains simultaneously [37]. In addition, model-based approaches for feature extraction in manufacturing processes, which transform high-dimensional data into a lower two- or three-dimensional space for visualisation, can increasingly be found in the literature. Examples are autoregressive models, principal component analysis (PCA) and t-distributed stochastic neighbour embedding (t-SNE). One of the most promising learning techniques for dimensionality reduction is Uniform Manifold Approximation and Projection (UMAP), which is based on the t-SNE approach and receives increasing attention in the literature due to its low computational cost and good graphical separability of classes [38].
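The time-domain features named above can be computed directly from a sampled signal. The following Python sketch is a minimal illustration using only the standard library. The function name, the dictionary keys and the example force values are assumptions, and the population (biased) definitions of the central moments are used, which is one of several common conventions.

```python
import math

def time_domain_features(signal):
    """Statistical time-domain features of one sampled signal segment."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    variance = sum(c ** 2 for c in centred) / n  # second central moment
    m3 = sum(c ** 3 for c in centred) / n        # third central moment
    m4 = sum(c ** 4 for c in centred) / n        # fourth central moment
    return {
        "max": max(signal),
        "min": min(signal),
        "mean": mean,
        "std": math.sqrt(variance),
        "skewness": m3 / variance ** 1.5 if variance > 0 else 0.0,
        "kurtosis": m4 / variance ** 2 if variance > 0 else 0.0,
        "rms": math.sqrt(sum(x ** 2 for x in signal) / n),
    }

# Hypothetical force samples of a single stroke in kN (illustrative only)
stroke_force = [0.0, 35.0, 120.0, 210.0, 95.0, 10.0, 0.0]
features = time_domain_features(stroke_force)
```

Each stroke is thereby reduced from a full time series to seven scalar values, which is the dimensionality reduction that the feature extraction step aims for.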
In the subsequent step of feature selection, irrelevant features are removed and an optimised feature space for the model is derived. In addition to automated algorithms, feature selection techniques often use a heuristic approach and consequently place high demands on the engineer, since selecting relevant features manually requires deep process knowledge [39]. In principle, feature selection techniques can be divided into filter, wrapper and hybrid methods. When applying the filter method, a criterion is defined which quantifies the information value of single features for the performance of the model. The relevant features are selected by statistically evaluating the correlation between each feature and the performance of the model. In contrast, wrapper methods do not quantify the informational value of each feature for the model, but estimate the performance of the model for a certain feature subset. Filter methods exclusively investigate the influence of specific features on model performance and are therefore computationally efficient, whereas wrapper methods establish correlations between model performance and a specific feature space, which requires high computational effort [40]. Hybrid methods combine the advantages of filter and wrapper methods. Table 1 gives an overview of feature extraction and selection approaches in metal forming technology that currently belong to the state of the art. This survey shows that in a manufacturing context the transformation of data is mainly based on statistical features. These features are used for monitoring the actual process state by thresholds, envelopes or flat lines. Since transformation approaches in the time or time-frequency domain are mainly suitable for periodic or transient signals, they are rarely used for applications in forming technology.
Although model-based transformation approaches have great potential to reduce the amount of data with very little loss of information, they are also rarely implemented in real production environments. This is mainly because the generated features are not interpretable by qualified personnel, as they represent the physics of the process through abstract features. Here, techniques of explainable artificial intelligence (XAI) may help to derive correlations between these abstract features and the physical state of the process and thus promote acceptance of such transformation approaches.
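The filter method described above can be sketched in a few lines of Python. In this illustration the absolute Pearson correlation between each feature and a target quantity serves as the filter criterion. This criterion is an assumption chosen for simplicity, and other statistical criteria are equally possible. The feature names and all numerical values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def filter_select(features, target, k):
    """Rank features by |correlation| with the target and keep the best k."""
    scores = {name: abs(pearson(values, target))
              for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical per-stroke features and a measured quality value
features = {
    "max_force": [1.0, 2.0, 3.0, 4.0, 5.0],
    "rms_accel": [2.0, 1.0, 4.0, 3.0, 5.0],
    "ambient_temp": [3.0, 3.0, 3.0, 3.0, 3.0],  # constant, carries no information
}
quality = [2.0, 4.0, 6.0, 8.0, 10.0]
selected, scores = filter_select(features, quality, k=2)
```

Each feature is scored independently of any trained model, which is why filter methods are computationally cheap. A wrapper method would instead retrain and evaluate the model for each candidate feature subset.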

Modelling
After data transformation, models based solely on data are used to identify correlations between the transformed data (features) and the system output, which is typically measured experimentally. The recently emerging black-box models in particular generally perform well in determining and representing those correlations. However, the disadvantage of this approach often lies in the high computational effort of finding the most useful model and, depending on its type, a gradual loss of explainability [2]. Here, explainability refers to the amount of reliable knowledge gain and allows deeper insights into the phenomena acting in processes [7]. While common black-box models such as regressions and support vector machines (SVM) are comprehensible and therefore widely used in the production engineering environment, the results generated by training a neural network are usually no longer transparent for the user [2], which creates a demand for extended research in the field of explainable black-box modelling. A selection of typical use cases of black-box models in metal forming production engineering is given in Table 2. These so-called 'deep learning approaches' are indispensable with regard to further performance improvements, since they allow correlations to be determined for universal problems without specific proximity functions. Belfiore et al. [57], for example, presented a novel approach for the calculation of abrasive wear in grinding processes using neural networks. Analogous to the Archard model, the variables contact pressure, sliding velocity and temperature are employed to determine a functional relationship for wear estimation, but without presuming the initial equation of Archard. This approach achieved a high prediction quality with regard to wear evaluation without the restriction of an analytic model imposed by the need for proximity functions. However, it must be mentioned that the neural network was provided with pre-selected data channels for training, which are known to correlate highly with wear. This circumstance meets the typical problem in industrial applications that 'any' data is often recorded already today, which may not offer any significance for modelling the result variable. For this reason, common scientific approaches in metal forming for other investigation objectives in production engineering often include the evaluation of highly relevant process parameters, for example the press force, or a measurement of produced parts or tool geometries. With models derived in this way, it was possible in individual cases to achieve a high level of predictive quality for the considered problem.

Table 1 Feature extraction and selection approaches in metal forming technology
Feature extraction
• Blanking (force and displacement) [41]
• Roll forming (temperature and force) [42]
• Blanking (force and AE) [43]
• Deep drawing (AE) [44]
• Blanking (acceleration and force) [45]
• Deep drawing (AE) [46]
• Blanking (acceleration and force) [47]
• Progressive stamping (force) [48]
• Blanking (acceleration and force) [49]
• Forging (force) [50]
• Stamping (force) [51]
• Fine blanking (force) [52]
• Blanking (force) [53]
Feature selection (wrapper and filter methods)
• Blanking and bending (force) [54]
• Blanking (force) [55]
• Forging (force) [50]
• Roll forming (force and rotational speed) [56]
Due to the specific use of the black-box models, however, transferability to other applications is hindered, since efforts have so far been limited to demonstrating general applicability. Consequently, there is no commonly accepted understanding of the practical guidelines that black-box modelling approaches have to follow in order to be firmly adopted into the repertoire of knowledge-building techniques.
Table 2 Typical use cases of black-box models in metal forming
• fine blanking - roll height prediction based on FE-calculated training data [65]
• stamping - identification of critical process states (poor workpiece quality) by monitoring press force signals [48]
Prognostic regression
• sheet metal forming - forming force prediction based on other process parameters, and flexible rolling - quality prediction based on sheet thickness measurement [56]
• sheet metal forming - spring back control by evaluation of press force and tool geometry [66]
• deep drawing and bending - real-time process model based on historical force data to predict part quality of cutting operations [67]
• metal forming - estimating product-to-product variations in metal forming using force measurements [68]
Neural network
• hot steel rolling - defect prediction by evaluation of self-organizing maps [69]
• bulge testing - identification of material parameters using pressure displacement curves [70]
• incremental sheet metal forming - machine learning-based parameterization of local support in robot-based incremental sheet forming [71]
• flow forming - wear prediction based on experimental block-on-ring testing [57]

For future investigations, explainability appears extremely important for achieving a wide acceptance of models and new dimensions of process-related calculations. A more far-reaching but practical example, in which traceability of calculations is required by law, is the development of the openly accessible code_aster FE calculation kernel, which was initially intended for the design of nuclear facilities [58]. The authors see this example as transferable to the manufacturing processes of the future, which with increasing complexity will very likely also have to withstand increasingly complex legal testing processes. To overcome this general issue, two main solutions appear conceivable. First, novel modelling approaches can be supported by existing domain knowledge, but this limits the expected development steps.
The research field of XAI algorithms shows promising potential by trying to transform the results of neural networks back into an explainable dimension space by means of further processing techniques [59].
Recently developed XAI models such as SHAP introduce the possibility to automatically generate new "surrogate parameters" from the individual inputs [60]. Other models such as DeepLIFT use a scoring approach to identify relevant features for dimension reduction [61], while the LIME approach aims to explain the initial black-box prediction through local approximations [62]. However, according to the literature, the evaluation of time series or sensor data sets presents a major challenge, since a time signal itself is unsuitable as an input parameter for a learning process and thus requires segmentation (feature engineering) [63]. Hu et al. [64] concluded that error deviations of less than 10% can be achieved when using perfectly time-synchronised data sets. In metal forming technology, this time segmentation can readily be implemented via a time trigger synchronised, e.g., to the top dead center of the press ram. In this way, individual features can be extracted from the data signals and related to a defined number of cycles or strokes, whereby the time series challenge described in the literature can be circumvented. Despite the steadily increasing number of new model approaches, XAI modelling can be considered a relatively new approach in the scientific world, and no prominent examples of its application in metal forming are known to date. However, there are high expectations of covering location- and time-dependent phenomena in complex forming processes with the help of prospective XAI models. Compared to the capabilities of the models shown so far, high prediction accuracies can be expected for the evaluation of serial forming processes, which is supported by the following use cases in forging and blanking technology.
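The stroke-wise segmentation described above can be sketched as follows. The example assumes that the sample indices at which the press ram passes its top dead center are already available from a trigger signal. The function names and the force values are hypothetical and are not taken from the cited works.

```python
def segment_by_trigger(signal, trigger_indices):
    """Split a continuous signal into strokes, one per trigger interval."""
    bounds = list(trigger_indices) + [len(signal)]
    return [signal[start:end] for start, end in zip(bounds[:-1], bounds[1:])]

def per_stroke_feature(signal, trigger_indices, feature=max):
    """Apply one feature function to every non-empty stroke segment."""
    return [feature(stroke)
            for stroke in segment_by_trigger(signal, trigger_indices)
            if stroke]

# Hypothetical continuous force recording covering three strokes
force = [0, 5, 9, 2, 0, 6, 11, 3, 0, 4, 8, 1]
tdc_indices = [0, 4, 8]  # assumed sample indices of top dead center
peak_forces = per_stroke_feature(force, tdc_indices)
```

Each stroke then contributes one feature value, so that the learning process receives a table of strokes and features instead of a raw time series.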

Practical approaches of data science in metal forming
Based on the state of the art, the authors recognise in the long term a high potential of data-driven models for answering questions in metal forming technology far beyond numerical process simulation. The traditional and commonly used approach for gaining engineering knowledge about metal forming process interactions has so far consisted of building a feed-forward model, starting with domain knowledge based on experience, physical analysis and numerical simulation code. Given the conventional capability of numerical models and calculation methods for achieving low-scrap production in sheet and bulk metal forming, improvements in the prediction accuracy of the expected part quality will be needed in future. This target can be achieved by data-driven models based on large quantities of process data recorded from continuously running series production processes. Furthermore, incorporating existing knowledge by informed machine learning ensures enhanced model quality and explainability. This approach enables a reusable knowledge gain that does not require a time-consuming and experience-based representation of formalised knowledge. However, the challenge is to find a new generalised methodology with which the described modelling approaches can be implemented. In the following section, the authors present four exemplary use cases in which multiple aspects, current limitations and potentials of data-driven modelling were applied to achieve the mentioned goals in the long term, focusing on conventional hot forming and metal cutting processes.

Hot forging
Bulk forming is one of the oldest disciplines in production and metal forming technology. New process management strategies address a higher forging process capability by means of reduced press force levels and preheating of the components. However, the additional temperature-dependent process control results in conflicting objectives during the manufacturing of components. On the one hand, preheating of the material and tools leads to a significant thermo-mechanical load on the active tool surface, which increases wear during series production. On the other hand, controlled temperature management in the component volume offers a high potential for the specific adjustment of e.g. lifetime properties and thus for an overall increase in component quality. The following chapter therefore presents two current research trends in which digitisation concepts are used to improve the prediction and assurance of component quality and tool wear under thermo-mechanical loading.

Digitisation of aluminum hot forging processes
The joint research project 'Increase of Performance of forging through development and integration of digital technologies-EMuDig 4.0', carried out at the Institute for Metal Forming Technology Stuttgart together with partners, addressed the implementation of digitisation in bulk forming technologies. The publicly funded project aims to create a self-learning database for improved end-to-end product engineering and a significant increase in drop forging process capability. High requirements on forged components and the corresponding tools under varying production conditions, such as in-going material composition or lubrication during pressing, place high demands on the quality and explainability of a suitable model. To meet these requirements, a feed-forward controllable and digitised process sequence for exemplary aluminum hot forging was first designed and operated successfully under laboratory conditions. Afterwards, the elaborated data-driven modelling methodology was transferred to real series production processes of the project partners in forging companies, in particular to improve process control and failure detection efficiently during the manufacturing of steel components and aluminum wheel forgings. Figure 2 provides a schematic representation of the realised laboratory forging process chain, including the flow of material and data. The delivered raw material was separated into billets, while geometric and material properties were captured manually. After a controlled and supervised inductive heating stage, the aluminum billet was formed in a two-step hot forging process and finally annealed. Part handling between the forming and optical measurement operations, as well as lubrication, was realised by a 6-axis robot. Real-time process data management was supported by factory-cloud-based workpiece tracking, aided by a newly developed online analytical processing (OLAP) database [72].
Besides the development and implementation of sensors, actuators and automation components, a valid numerical model of the realised two-stage forging process served as a knowledge and data base for tool design and process control. Both forming operations were modelled with the 2D finite element method, using simplifications such as part symmetry and a rigid tool structure. Afterwards, the results of more than 1,200 stochastic numerical experiments were used as the initial database for data-driven modelling of both forging stages, in order to identify relevant actuated variables and to obtain first estimates of the most sensitive parameter correlations.
With the help of this numerically gained synthetic forging domain knowledge, a recurrent neural network (RNN) with autoencoder structure was pre-structured for each forming operation (see Fig. 3). On the one hand, this informed machine learning approach had the advantage of providing explainable correlations between input and output parameters such as in-going material properties, heating time, workpiece dimensions and microstructure. On the other hand, owing to batch variations of raw materials, process fluctuations and numerical modelling assumptions, the accuracy of the data-driven model was not sufficient for proper part quality prediction and process control. Therefore, real heterogeneous process data from production experiments were recorded and unified by a programmable logic controller (PLC). Figure 3 represents the iterative refinement of the knowledge-based RNN with experimental data, which increased the predictive quality of the model after only a short time and a small number of strokes [74]. The refined data-driven model thus overcomes the previous lack of accuracy. Furthermore, the control logic implemented on laboratory scale suggested autonomous decisions for each part and for different materials, as well as changes of the process sequence according to the specific situation, in order to keep the tolerance window of the final product as small as possible [73]. The method described above, developed within the framework of the joint research project EMuDig 4.0, was transferred to two different forging companies for quality assurance and control purposes. Both industrial partners belonged to the project consortium in order to support the transfer of the method into real forging processes, namely the production of car steering components made of steel and a hot forging process sequence for aluminum wheel rims.
Fig. 2 Schematic representation of the controlled hot forging process (laboratory scale) showing flow of material and process data according to [72] and [73]
To detect process anomalies such as underfilling and other kinds of scrap by means of data-driven modelling, a long short-term memory (LSTM) network for time-dependent behavior was combined with a sequence-to-sequence network for predicting the expected part quality. Feature extraction on the process force signals by discrete wavelet transformation (DWT) and an autoencoder achieved a high reduction of data volume with minimum loss of information for the various recorded heterogeneous process signals. In this way, the numerically pre-structured data-driven model was able to distinguish between nominal and abnormal process signals (see Fig. 4), to suggest countermeasures automatically for a more robust part quality, and additionally to predict the wear of active tool surfaces from its correlation with the data reconstruction error [75]. In conclusion, this project delivered an enhanced data-driven model for quality prediction of hot forged parts by informed machine learning, based on domain knowledge from numerical simulation and enhanced by process data. This enhancement exhibits a typical black-box characteristic, which makes the model difficult to comprehend and impedes its use in forging tool and process design. This raises research questions regarding the transparency and explainability of data-driven models. The information contained in collected process data, such as correlations, anomalies and interactions, may be uncovered by newly emerged data science technologies like XAI. In future work, process and material fluctuations might then be considered without such an increased effort in numerical modeling, which would make data-driven modelling accessible for robust tool and process design in metal forming.
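The idea of compressing a force signal and using the reconstruction error as an anomaly indicator can be illustrated with a minimal sketch. It is not the LSTM and autoencoder pipeline of the project: a plain Haar wavelet is used as a stand-in, all detail coefficients are simply discarded, and the signals, the number of decomposition levels and all function names are assumptions made for illustration only.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT; returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def compress(signal, levels):
    """Keep only the approximation coefficients after `levels` Haar steps."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_dwt(approx)
    return approx


def reconstruct(approx, levels):
    """Rebuild a signal from approximation coefficients, details set to zero."""
    for _ in range(levels):
        approx = haar_idwt(approx, [0.0] * len(approx))
    return approx


def reconstruction_error(signal, levels):
    """Root-mean-square error between a signal and its compressed version."""
    rebuilt = reconstruct(compress(signal, levels), levels)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(signal, rebuilt)) / len(signal))


# Synthetic stand-ins for a nominal force curve and one with a local disturbance.
nominal = [math.sin(math.pi * i / 63) for i in range(64)]
abnormal = list(nominal)
abnormal[40] += 0.5  # assumed anomaly: a short force spike

nominal_error = reconstruction_error(nominal, levels=2)
abnormal_error = reconstruction_error(abnormal, levels=2)
```

With two levels, the 64 samples are reduced to 16 coefficients. The disturbed signal cannot be represented as well by the compressed form, so its reconstruction error is larger than that of the nominal curve, which is the kind of quantity a threshold or a trend analysis could be built on.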

Tool life prediction in hot forging determined by digital process data enhanced by FE-simulation
The dimensional precision of forged components is mainly linked to the durability of the active surfaces of the forging tools, the process conditions and the scatter of material properties. To this day, the large-scale production of forgings suffers severely from unavoidable wear and structural damage caused by extreme hot forging conditions, such as high cyclic mechanical and thermo-mechanical loads on the active tool surface. Consequently, the service life of hot forging tools is typically subject to significant fluctuations [76], as depicted in Fig. 5. The reasons lie in various causes of failure, e.g. crack formation propagating from the active surface into the depth of the tool, and abrasive wear, which occurs unpredictably owing to general fluctuations of process parameters.
In manufacturing, no in-situ measurable parameters are available for proper defect identification or prognosis. Tool changes or tool rework are often carried out in advance and based on experience. In contrast, FE simulations have so far been used to calculate individual types of wear in isolation, relying on specific and mostly empirical models. A well-known white-box model is the Archard model for calculating abrasive wear. This model enables a comprehensible calculation of wear locations by evaluating local contact stresses, material sliding velocity and local tool hardness [77]. While the conventional Archard model treats the tool hardness as constant, current literature agrees that cyclic thermo-mechanical loading of the surface leads to extensive changes in the microstructure and hardness of the surface layer during running production [78,79]. With regard to data-driven modelling of this phenomenon, a test methodology was developed at the Institute of Metal Forming and Forming Machines (IFUM) that realistically replicates this load accumulation in a forming dilatometer, so that the resulting changes in the hardness of the forging tool material can be analysed afterwards [80,81]. These measurements generate detailed and applicable domain knowledge that reduces the uncertainties of existing models and allows local wear effects in the forging tool to be predicted more reliably. The results were finally processed into hardness evolution curves (H = f(T,σ,n)), so that realistic changes of hardness can be estimated over the course of the forging operation. Process planners in industry need such a function, which considers the calculated process peak temperature T and the comparative stress σ in conjunction with a specified number of forging cycles n. The function was successfully implemented in common FE codes by means of self-developed user subroutines and can now be applied directly to assess wear in more detail after a typical calculation of a hot forging process [82].
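The difference between a constant-hardness Archard calculation and one that uses a hardness evolution function H = f(T,σ,n) can be sketched as follows. The softening law below is a purely hypothetical placeholder and is not the measured IFUM curve; the wear coefficient, loads and temperatures are likewise illustrative assumptions, and the sliding distance per cycle stands in for sliding velocity multiplied by contact time.

```python
import math


def hardness(temp_c, stress_mpa, cycle, h0=6000.0, temper_temp_c=550.0,
             stress_ref_mpa=1000.0, rate=1e-4, floor=0.5):
    """Hypothetical hardness evolution H = f(T, sigma, n) in MPa.

    Placeholder law: logarithmic softening with the cycle count above an
    assumed tempering temperature, scaled by stress. Not fitted to any data.
    """
    excess = max(temp_c - temper_temp_c, 0.0)
    loss = rate * excess * (stress_mpa / stress_ref_mpa) * math.log1p(cycle)
    return h0 * max(1.0 - loss, floor)


def archard_increment(k, pressure_mpa, sliding_mm, hardness_mpa):
    """Local Archard wear depth per forging cycle in mm: k * p * s / H."""
    return k * pressure_mpa * sliding_mm / hardness_mpa


def accumulated_wear(n_cycles, k, pressure_mpa, sliding_mm, temp_c, stress_mpa,
                     constant_hardness=False):
    """Sum the per-cycle wear depth, optionally freezing H at its initial value."""
    depth = 0.0
    for n in range(1, n_cycles + 1):
        cycle = 0 if constant_hardness else n
        current_hardness = hardness(temp_c, stress_mpa, cycle)
        depth += archard_increment(k, pressure_mpa, sliding_mm, current_hardness)
    return depth


# Illustrative numbers only; none of them are taken from the cited studies.
params = dict(k=1e-5, pressure_mpa=800.0, sliding_mm=2.0,
              temp_c=650.0, stress_mpa=800.0)
wear_constant = accumulated_wear(1000, constant_hardness=True, **params)
wear_evolving = accumulated_wear(1000, **params)
```

Because the placeholder law only lets the hardness decrease, the accumulated wear with evolving hardness is larger than with constant hardness. In an FE code the same per-cycle update would be evaluated locally at each surface node inside a user subroutine.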
However, it must be acknowledged that the results of an FE calculation always represent one particular outcome, based on a specific set of model and process parameters, of what is in reality a complex system of interactions. Owing to the lack of data coverage and to fluctuations in the process (see Fig. 5), errors or deviations from the calculated results can hardly be predicted or explained by FE models. The authors expect that future research can overcome such problems by using data-driven models, which can lead to a significant improvement of wear prediction capabilities in forging process engineering. To generate suitable input data, the continuous recording of extensive sensor data (for instance stroke paths, press forces, temperatures, etc.) is of high interest. These data could serve as input for a dynamic feature processing model (Fig. 6) that provides a continuously updated tool life based on a pre-trained prediction model. Pre-training is carried out on the one hand by incorporating the domain knowledge available for the specific forming operation, which is integrated by means of FE simulations or experimentally gathered material charts, as demonstrated by recent examples in the literature [83]. On the other hand, XAI algorithms could be used to transform the features and mechanisms of the data models into a comprehensible domain, which in turn will enable a deeper understanding for further optimisation of tooling surfaces or process settings. This approach creates a basis for future process enhancements, for example the implementation of a continuously updated tool life counter directly at the press control center. To this end, the wear mechanism has to be modelled precisely, considering the full range of process conditions as well as the tool and part materials.
The future approach of data-driven modelling may fulfill the requirements of tool life prediction and enable an increased explainability of wear mechanisms and apparently stochastic tool life.

Digitisation in sheet metal blanking technologies
Shear cutting in general and fine-blanking of sheet metal both belong to the group of material separating processes and share a similar operating principle. Both processes also have in common that their economic efficiency is significantly influenced by the wear of the blanking tool components, which directly affects the quality of the punched workpiece. However, in-situ wear detection is a challenge in itself: since the tool remains completely closed while the sheet metal is sheared, the emerging physical effects and technological interactions cannot be observed visually or assessed directly. Recent research proposes measuring the wear indirectly during the running process. Indirectly measured signals such as forces or vibrations show considerable variations on a stroke-to-stroke basis as well as long-term trends, but they are influenced not only by wear but also by e.g. inhomogeneous in-going material properties or the dynamics of the machine tool [84]. Together with the sensitivity of the sensors, this poses challenges for potential wear monitoring systems, which need to filter the important wear-related information from auxiliary signals and derive estimators for specific wear effects in complex tool geometries. The first study presented here shows a visualisation of long-term trends in force and mechanical vibration signals captured during running fine-blanking processes. It is followed by a further study presenting a new approach to predicting the current wear state in shear cutting within an experimental environment.
Fig. 6 Concept of a feature-based data processing for real time tool life calculation based on numerical process design and real time data measurement

Long-and short-term variations in fine-blanking process data
Fine-blanking and shear cutting processes in general can, once set up correctly, be regarded as stable processes that produce consistent high-quality outcomes in mass production. However, over extended stroke series, quality and process variations occur owing to varying material properties and increasing wear of active tool components, both of which affect process performance. To understand the variations occurring in the signals, the fluctuations on a stroke-to-stroke basis and the long-term trends have to be quantified and analysed individually. This is a prerequisite for discovering their causal relationship to specific physical conditions that define the process, e.g. wear of tool components.
In the fine-blanking press used in this study, nine piezoelectric sensors were integrated into the tool structure to measure the relevant process forces, namely punch, counterpunch and part holder force [52]. In addition, acoustic emission sensors were attached to the upper pressure plate close to the punch positions [85]. These sensors detect elastic structural deflections as complex vibration signals, which are emitted from various points in the tool structure and press machine (Fig. 7). Force sensors [86] as well as acoustic sensors [87] have been shown to capture important information about the development of wear of tool components, such as the rounding that emerges at the cutting edges of punch and die during the lifetime of the tool. Each fine-blanking operation was represented by a number of time series containing the sensor signals from the start to the end of the operation. In this study, each time series consisted of at least 10,500 data points, which were regarded as input features of the blanking operation for subsequent analysis. Feature engineering, i.e. the process of manipulating the raw sensor signals to extract, select and condense relevant features for further processing steps, was applied to reduce the number of features and to increase data point density. In the presented case, the raw sensor signal was first cleaned and subsequently segmented into phases that represent the stages of the shearing operation, such as blanking and stripping. In a next step, features from different feature domains, namely spectral, temporal and statistical features, were extracted using feature templates from the time series feature extraction library (TSFEL) [33]. Using the predefined feature templates, a total of 360 features, some of them highly correlated, were extracted for each stage of the forming process.
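The segmentation into phases followed by per-phase feature extraction can be sketched as below. This is a hand-written stand-in with five features per phase and does not call TSFEL or reproduce its 360-feature templates; the sampling rate, the phase boundaries, the feature names and the synthetic signal are all assumptions chosen for illustration. NumPy is assumed to be available.

```python
import numpy as np


def extract_features(segment, fs):
    """Return a few statistical, temporal and spectral features of one segment."""
    x = np.asarray(segment, dtype=float)
    power = np.abs(np.fft.rfft(x)) ** 2          # one-sided power spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)  # matching frequency axis in Hz
    total = power.sum()
    centroid = float((freqs * power).sum() / total) if total > 0 else 0.0
    return {
        "stat_mean": float(x.mean()),
        "stat_std": float(x.std()),
        "temp_abs_energy": float(np.sum(x ** 2)),
        "temp_mean_abs_diff": float(np.mean(np.abs(np.diff(x)))),
        "spec_centroid_hz": centroid,
    }


def stroke_features(signal, phases, fs):
    """Extract features for every phase; `phases` maps name -> (start, stop)."""
    features = {}
    for name, (start, stop) in phases.items():
        for key, value in extract_features(signal[start:stop], fs).items():
            features[f"{name}_{key}"] = value
    return features


# Synthetic stand-in for one stroke with 10,500 samples (assumed 10 kHz sampling).
FS = 10_000.0
t = np.arange(10_500) / FS
signal = np.sin(2 * np.pi * 100.0 * t)

# Hypothetical phase boundaries; in practice they come from the segmentation step.
PHASES = {"blanking": (0, 6_000), "stripping": (6_000, 10_500)}
features = stroke_features(signal, PHASES, FS)
```

Each stroke is thereby condensed from thousands of samples to one short feature vector per phase, which is the form that dimensionality reduction and clustering methods operate on.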
To further reduce the redundancy and the amount of data, the dimensionality reduction technique uniform manifold approximation and projection (UMAP) [88] was utilised. UMAP is based on manifold learning techniques and ideas from topological data analysis and essentially projects high-dimensional data into a lower-dimensional space while preserving the local structure of the data. The presented feature engineering approach was applied to two case studies on fine-blanking: force measurements of 1488 consecutive strokes, and the variation in acoustic emission (AE) signals over the course of 14,000 strokes in an industrial blanking process. Both data sets were modelled with UMAP as an unsupervised non-linear projection algorithm [88]. In this context, UMAP embeds the features extracted from the signals into two-dimensional Euclidean space for visualisation purposes by computing two dimensionless components. The results are presented in the two-dimensional plots in Fig. 8, which show the embedding of the force and acoustic emission data for 1488 and 14,000 consecutive strokes, respectively. Both signals show a long-term trend of the derived two-dimensional representations that correlates with the stroke index (specifically features from the time, statistical and frequency domains of the acoustic emission data), as well as stroke-to-stroke and short-term variations (specifically the force data). This structure was also found in higher-dimensional embeddings and is being investigated in order to understand the presented variations, to quantify them, e.g. by the stroke-wise Euclidean distance, and to correlate this derived metric with the increase of wear or other phenomena observable during process execution. Recent studies already indicate that this variation can indeed be connected to tool wear phenomena [89].
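How a stroke-wise Euclidean distance could separate short-term variation from long-term drift in such an embedding is sketched below. The two metrics, their names and the reference window are assumptions for illustration and not the metrics used in the cited studies; the embedding is given as a plain list of points, so any projection method could have produced it.

```python
import math


def stroke_to_stroke_distance(embedding):
    """Distance between consecutive strokes: a measure of short-term variation."""
    return [math.dist(a, b) for a, b in zip(embedding, embedding[1:])]


def drift_from_reference(embedding, n_reference):
    """Distance of each stroke to the centroid of the first strokes: long-term trend."""
    reference = embedding[:n_reference]
    centroid = [sum(coords) / len(reference) for coords in zip(*reference)]
    return [math.dist(point, centroid) for point in embedding]


# Tiny hand-made embedding with two components per stroke.
embedding = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)]
short_term = stroke_to_stroke_distance(embedding)
long_term = drift_from_reference(embedding, n_reference=1)
```

For this toy input the short-term metric drops back to zero once the process settles at its new position, while the drift metric stays elevated. Since UMAP preserves mainly local structure, distances in the embedding should be interpreted with care and ideally cross-checked in the original feature space.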
The presented plots are found representative for several series of experiments conducted with the fine-blanking process. For further analysis, reliable metrics that can grasp the amount of variation on a stroke-to-stroke basis [85] as well as long-term trends [90] need to be researched in detail.
The current results of this study are limited for the following reasons: (1) UMAP and other non-linear projection methods are only marginally explainable. An automated extraction of knowledge about which specific data points are important for the projection is therefore missing, which limits the value of the approach. (2) Neural networks offer the possibility to learn and extract compact features and representations of process signals automatically, but they require large quantities of sample data for reliable training. The current data sets are large, but still do not reach the amount needed for deep neural network training. This can be countered by using data augmentation techniques together with domain knowledge to support neural networks during learning and to lower the amount of data needed for training. (3) The projections presented above use the data of only one sensor, although data from several more sensors are available. Each forming operation can potentially be represented by sets of sensors based on force, mechanical vibration, tension or power consumption. In addition, data from the process itself can be complemented by sensors that detect varying material properties online during process execution, which further increases the complexity of the analysis. (4) The results have to be validated by measuring the process outcome and the wear increase of active tool components, and complemented by white-box models that are already available, instead of relying only on (explainable) black-box models. To meet these challenges and overcome the listed limitations in the near future, the potential of XAI models to understand, and eventually reduce, the variations and trends in the execution of sheet metal forming processes can be leveraged.

Wear prediction during blanking using a multiclass support vector machine
Almost every product manufactured by sheet metal forming involves one or more cutting operations within its manufacturing process. Especially with increasing requirements for processing high-strength materials combined with increasing production rates, wear is a phenomenon that significantly influences the quality and productivity of this process [91]. Workpiece quality is affected by severe wash-out of the sheared edges and of the workpiece outline, while productivity is affected by tool downtime, increased maintenance and a high risk of breakage of highly loaded punch corners. Currently, this challenge is addressed by conventional monitoring systems that provide binary control of the process state (e.g. scrap part/good part) [92]. However, predicting the amount of wear at cutting tools during series production is currently not possible. In this context, a black-box model based on a support vector machine (SVM), which allows the current wear state to be predicted from in-process force signals, was developed in preliminary work [53]. For this purpose, five abrasive wear states were characterised by the increase of the cutting edge radii r_i of the blanking tool during processing and are estimated by an SVM-based classification model. Force signals were acquired at a sampling rate of 90 kHz with a multi-sensor blanking tool on a BRUDERER mechanical blanking press operated at 200 strokes per minute, using two different sensor types: a strain gauge applied to the press frame (S) and a piezoelectric force washer integrated into the upper part of the tool (P). For each wear state, 100 experiments (time series with 4950 data points each) were conducted, leading to an aggregated force matrix F with F ∈ ℝ^{600×4950}. Figure 9 shows the experimental setup and a representative time series of each sensor type.
In addition, it was necessary to link the acquired process data with a label quantified by the cutting edge radii r_i (quality data) of the blanking tool. Since the wear state was not measured for each captured time series, a small amount of labeled data is combined with a large amount of unlabeled data to train the model; the case shown is therefore considered a semi-supervised learning approach. A cold rolled steel (1.0347) with a thickness of 2 ± 0.002 mm was used for the experiments.
Owing to the high stroke rates of industrial blanking processes, the amount of data available for modelling increases rapidly. To reduce the amount of data with as little loss of information as possible, principal component analysis (PCA) was used as a model-based transformation technique. For further investigations, only two principal components were used as input for the SVM. Finally, to show that the wear state can be classified during blanking, the SVM model was trained on these two principal components of the force signals. A grid search was then used to optimise the hyperparameters of the SVM and to find a suitable kernel function (kernel function: linear; regularisation parameter: 4.9321; margin of tolerance: 0.0042). To validate the model, the data set was split into a training data set (80%) and a test data set (20%). Figure 10 shows the results of the classification model, which delivers reliable results for both types of sensors. To quantify the performance, the number of correctly predicted wear classes in relation to the total number of predictions was determined, together with the separability of the classes, quantified by the Mahalanobis distance [93]. The wear state classified from the strain gauge signal shows a separability twenty times worse than that classified from the piezoelectric sensor signal. This is due to the physical distance of the strain gauge sensor from the actual forming zone and to the susceptibility of this sensor type to electrical noise, which is superimposed on the physically relevant component of the force signal. Despite the poorer signal quality of the strain gauge sensor, an intelligent transformation of the data in combination with an optimised SVM improves the accuracy of the model to 97%.
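The chain of PCA, linear SVM, 80/20 split and Mahalanobis-based separability described above can be sketched with scikit-learn, which is assumed to be installed. The force signals are synthetic, the assumed effect of wear on the curve shape is invented for illustration, and the signals are much shorter than the measured ones. Only the linear kernel and the regularisation parameter are taken from the text; the reported margin of tolerance is not mapped to any parameter here, and the grid search is omitted.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def make_synthetic_forces(n_classes=5, n_per_class=100, n_points=200, seed=0):
    """Generate noisy force curves; wear is assumed to raise and widen the peak."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_points)
    signals, labels = [], []
    for wear in range(n_classes):
        base = (1.0 + 0.1 * wear) * np.exp(-((t - 0.5) ** 2) / (0.01 + 0.002 * wear))
        signals.append(base + 0.02 * rng.standard_normal((n_per_class, n_points)))
        labels.append(np.full(n_per_class, wear))
    return np.vstack(signals), np.concatenate(labels)


def mahalanobis_separation(scores, labels, a, b):
    """Mahalanobis distance between the centroids of classes a and b,
    using the average of the two class covariance matrices."""
    xa, xb = scores[labels == a], scores[labels == b]
    pooled = (np.cov(xa, rowvar=False) + np.cov(xb, rowvar=False)) / 2.0
    diff = xa.mean(axis=0) - xb.mean(axis=0)
    return float(np.sqrt(diff @ np.linalg.solve(pooled, diff)))


F, y = make_synthetic_forces()
F_train, F_test, y_train, y_test = train_test_split(
    F, y, test_size=0.2, stratify=y, random_state=0)

# Reduce each force curve to two principal components, fitted on training data only.
pca = PCA(n_components=2).fit(F_train)
clf = SVC(kernel="linear", C=4.9321).fit(pca.transform(F_train), y_train)

accuracy = clf.score(pca.transform(F_test), y_test)
separation = mahalanobis_separation(pca.transform(F), y, 0, 1)
```

Fitting the PCA on the training split only keeps the test data unseen by the whole pipeline. Comparing `separation` for signals from two different sensors would give the kind of separability ratio reported for the strain gauge and the piezoelectric sensor.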
However, it must be conceded that black-box models are able to quantify process states based on acquired sensor data, but fail to derive knowledge for system improvement. In particular, the abstract description by black-box models of the non-linear interlinkages between varying process states, such as wear, tool and press elasticity or material fluctuation, and the acquired process variables is difficult to interpret from an engineering point of view. To identify system improvements that reduce varying process states, it is therefore necessary to explain the black-box models used, the transformation techniques, and the system knowledge they store in the form of abstract models or features. XAI in particular makes it possible to relate features, or the model itself, to physical states of the blanking process. This allows black-box models to be generalised by creating an understanding of the interdependencies between the process state, the extracted features and the structure of the model. In addition, XAI can help to select robust, feature-based data in order to ensure a model-based description of production systems via black-box models even with varying process parameters and increasing uncertainty.

Research objectives for an explainable data-driven modelling in metal forming
The complexity of metal forming processes is essentially caused by the contact conditions in the deformation zone between the workpiece and the tool. Furthermore, their variation in time and location, depending on the geometry, the state of the process and the material properties, increases the level of complexity. As a result of these process conditions, the microstructure of the component and its entire geometry change after each forming step. In addition, high force densities in bulk forming or large area loads in sheet metal forming, for example, can cause considerable elastic deformation of the acting structure. Accurate physical modelling of these deformations is limited, but in reality a detrimental effect on the surfaces between the die and the workpiece under load must be acknowledged. In many forming operations, these interactions are always superimposed by unavoidable process noise, i.e. transient process effects, so that their overall stochastic effects on the final component quality are often not comprehensible and their causality remains unclear. As indicated by the aforementioned use cases, current attempts to grasp and model the stochastic nature of forming processes with black-box approaches yield promising intermediate results. This motivates further investigations into the combination of existing domain knowledge with new data-driven models. In particular, the presented results lack the interpretability needed to validate the plausibility of black-box model behaviour. A combination of white-, grey- and black-box approaches, together with cutting-edge methods that enhance the explainability of data-driven models based on production as well as experimental data sets, may overcome the current limits in modelling stochastic effects. Furthermore, it may improve the understanding and design of forming processes and tools.
This can be achieved by taking explicitly available heuristic expert knowledge, numerical calculation results and digitized information on the semifinished product, the condition of the forming tool and press machine as well as transient process conditions into account.
In this context, the following scientific questions contain consecutive hypotheses and subsequently define important milestones along future work in this field:
1. How can sufficiently accurate digital representations of metal forming operations be described mathematically and combined with domain-specific knowledge in an explicit manner?
2. How can the combined use of explicit knowledge and process data isolate process noise, decompose it into short- and long-term variations, quantify and relate it to key characteristics for influencing factors of the forming process and tool surface geometries?
3. How can the integration of the derived models into real production environments provide new and more explainable knowledge for a more efficient, demand-oriented determination of the process parameters and the effective surface of the die, and how can the explanatory power be increased?
4. How can the newly developed method and derived model be used to draw conclusions about process design to adjust system parameters based on a domain knowledge-supported, data-driven evaluation of process data?
The use cases in Sect. 3 represent first working steps towards answering the above-mentioned research questions on a laboratory scale. While the first and second research questions can be tackled by interdisciplinary research between mathematics, computer science and engineering departments alone, the third and fourth require extensive involvement of industry partners to gather data and sharpen the studied cases. For production engineering researchers, the aim is to disclose new scientific cause-effect relationships in industrially relevant metal forming processes. At the same time, the characteristics of the acquired process data can be explored and used to develop and compare new methods for evaluating these data. Finally, model quality and transferability are assessed by comparison with existing state-of-the-art methods for tool and process design and by application to new and unknown data sets of comparable forming processes. Thus, the research need is to decode the interrelationships and isolate the scatter of different process stages, while at the same time exploring new, transparent mathematical modelling approaches that exhibit high interpretability and accuracy (see Fig. 11). New data science techniques like XAI can be used for decoding and interpretable modelling of previously unknown dependencies. In doing so, new approaches for optimising the processes within one or several forming stages can be derived, and new statements about process stability and tool design can be made. Only by combining explicit domain knowledge in its various forms, derived from years of research, with modern emerging tools in mathematics and computer science can the potential of the increasing amount of field and experimental data be leveraged.