1 Introduction

Rubber extrusion is an integrated manufacturing technique used extensively across industries to produce complex rubber geometries and seals [1, 2]. Because raw materials vary strongly and the input parameters are difficult to balance, process control poses significant challenges for ensuring consistent product quality. One of the key factors affecting the consistency and quality of extruded rubber is temperature control [3]. Since even minor temperature deviations can alter the rubber’s rheological behavior and, for example through scorching, its subsequent application properties, precise control mechanisms are essential for sustainable and efficient production [4].

To address these challenges, recent advances in Machine Learning (ML), particularly Neural Networks (NN), offer promising methods for process analysis and optimization in manufacturing [5, 6, 7]. Applying these computational methods to rubber extrusion not only offers the potential to further refine process efficiency but also aligns with previous research on the optimization of rubber manufacturing processes. Takada et al. illustrated the potential of random forest algorithms in enhancing the blending process for polyphenylene sulfide with elastomer using twin-screw extruders [8]. Brause and Pietruschka proposed an artificial neural network (ANN) model that predicts the metal profile shape required for rubber belt extrusion based on the specified rubber profile target and serves as an additional source of information for the human operator [9]. Similarly, Huri and Mankovits conducted research on the design of automotive rubber components, utilizing support vector machines (SVM) to predict the shape of rubber products [10]. In another domain, González-Marcos et al. introduced an ANN model that enhances rubber extrusion efficiency by predicting rubber properties from its composition and mixing conditions, reducing the need for costly laboratory testing [11]. However, existing research primarily focuses on value prediction to assist the human operator and does not explore the potential of implementing an ANN-based system for real-time rubber extrusion process control. Consequently, no live performance benchmarks have yet been established for systems that depend solely on ANNs to manage all variables specific to rubber extrusion.

In contrast to conventional proportional-integral-derivative (PID) control methods, which combine proportional, integral, and derivative terms to minimize the error between a desired value and the actual process variable, ANN-based systems offer distinct advantages. PID controllers rely on predefined parameters and are effective for linear systems. However, they often struggle with complex, non-linear, and batch-specific process dynamics. ANN-based systems, on the other hand, can dynamically learn and adapt to these complexities. Nevertheless, the extended computation times often associated with ANNs pose a challenge for their real-time application due to high processing requirements [12]. Beyond computationally intensive training, the practical use of ANNs in production systems presents additional hurdles. This is particularly evident in state-of-the-art applications, where the requirements for graphics processing unit (GPU) computing power are substantial [13]. Even with considerable computational resources available, achieving real-time detection remains elusive in many contexts [14]. The requirement for high GPU utilization not only increases operating costs but also limits scalability, which is a significant barrier to widespread adoption. Hsia and Nithesh have demonstrated that it is possible to reduce the computational load of ANNs in object recognition tasks by streamlining the network’s parameters [13, 14]. With these parameter-reduction strategies, real-time object recognition on non-GPU edge devices was achieved, with a decrease in floating-point operations (FLOPs) exceeding 75% and detection speeds reaching up to 4.5 frames per second (FPS). However, these advances are mainly limited to the area of object recognition, particularly to the underlying data and models, and have yet to be effectively transferred to real-time control applications, such as those found in rubber extrusion.

Additionally, while traditional PID controllers have established methods for stability analysis, such as the Lyapunov–Krasovskii functional [15], ANN-based systems require empirical inline testing for their stability assessment, given their non-linear and adaptive nature. Despite extensive research in data-driven applications for rubber manufacturing, a research gap remains in deploying ML systems for real-time temperature control of rubber extrusion processes. In the context of rubber extrusion, where cycle time is inherently absent, evaluation of real-time capability requires consideration of the system’s significant inertia. This characteristic requires rapid adjustments in order to effectively prevent excessive overshooting when the operating tolerances are exceeded (see “Inline validation on extrusion system”). Based on this requirement, our study aims to fill this research gap by exploring the central question: How can ML techniques be effectively employed for real-time, adaptive temperature control in rubber extrusion lines to enhance product quality across varying operational conditions?

2 System overview

Figure 1 introduces the structure of the control architecture, comprising the core components: mixing and extrusion setup, vulcanization line, product, data mining model, measurement components, databases, and process control modules. The interplay between these components is engineered to create a control system that dynamically adapts to process variability by maintaining user-defined tolerances. The following sections will provide a more detailed view of each subsystem.

Fig. 1 System overview: high-level structure

2.1 Experimental setup and extrusion parameters

The extrusion setup in this paper is used to produce windshield seals from an ethylene propylene diene monomer (EPDM)-based compound. For each training (see “Artificial neural network-based data mining algorithm”) and control validation test series (see “Inline validation on extrusion system”), a distinct EPDM compound was extruded. As a result, the training data stems from different EPDM compounds than the validation test series, in which the control algorithm is applied live on an operational extrusion line. Consequently, the algorithm’s ability to generalize and handle material combinations not represented in training datasets is evaluated. Figure 2 shows the extrusion process, with the production direction running from right to left. The process begins with the introduction of the base material as a feeding strip into the feeding zone. Subsequently, the material is sheared by a rotating screw and inserted pins, which further facilitates the homogenization of the material. As the material is conveyed, it is heated in distinct zones (see “Control parameters (input)”). In the following phase, the heated material is extruded through a replaceable die, which shapes it into the desired final geometry. Finally, the extrudate is transported away on a conveyor belt and afterward vulcanized in a hot air channel.

Fig. 2 Experimental setup: extruder and measurement technology

Various measuring systems have been integrated. The temperature of the rubber compound is measured at three locations in the screw channel (TS1–TS3). In addition, pressure in the screw channel is recorded at the corresponding temperature measuring point (p1–p3). A temperature measuring blade was integrated at the end of the screw channel, which measures the temperature of the rubber compound at seven points across the channel diameter (T1–T7). The eighth location measures the temperature of the screw channel itself (T8). After the die, the extrudate geometry is determined optically.

3 Control parameters (input)

Figure 3 illustrates the control variables applied during the experimental test series. Control parameters consist of the screw speed and temperatures across five distinct zones. The underlying dataset comprises three different test series, each utilizing distinct EPDM compounds, amounting to a total of 14,923 measurement points. The temperature control trajectory following the first seconds after changeover diverges from the actual system performance due to high inertia, which must be considered in the modeling (see “Artificial neural network-based data mining algorithm”).
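As a rough illustration of how this inertia could be handled when preparing training data, the sketch below keeps only the tail of each run of constant control settings. The 20% tail fraction is the value reported for the data mining model in “Artificial neural network-based data mining algorithm”; the function name `stabilized_indices` and the representation of setpoints as tuples are assumptions made here for illustration, not the authors' implementation.

```python
import math


def stabilized_indices(setpoints, tail_fraction=0.2):
    """Return row indices lying in the last `tail_fraction` of each
    segment of constant setpoints (a segment ends at a changeover).

    `setpoints` is a sequence with one entry per 1-s sample, e.g. a tuple
    (screw_speed, zone_temperature, ...). Hypothetical helper.
    """
    keep = []
    start = 0
    n = len(setpoints)
    for i in range(1, n + 1):
        # A segment closes at the end of the data or when the setpoints change.
        if i == n or setpoints[i] != setpoints[start]:
            length = i - start
            tail = max(1, math.ceil(length * tail_fraction))
            keep.extend(range(i - tail, i))
            start = i
    return keep


# Example: 10 s at 10 rpm / 70 °C, then 5 s at 20 rpm / 70 °C.
series = [(10, 70)] * 10 + [(20, 70)] * 5
print(stabilized_indices(series))
```

Only the samples recorded shortly before the next changeover are retained, so the rows where the setpoints and the measured response still diverge are left out of training.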

Fig. 3 Input variables of the extrusion process as a function of time

Within these test series, control parameters were varied to provide the algorithm with a diverse dataset, covering a spectrum of operational scenarios to ensure robust training. Almost static temperatures were maintained in the first test series, with the screw speed varied between 10, 20, and 30 rpm to observe the effects of velocity changes. Conversely, the second series implemented frequent temperature variations while holding the screw speed constant at 25 rpm. The final series offered a detailed analysis of temperature changes by reducing temperature increments to 2 °C and adjusting the screw speed only once, from 10 to 20 rpm.

4 Measurement parameters (output)

Figure 4 illustrates the output variables as a function of time, showing the impact of different settings across the test series. All data points were recorded at 1-s intervals. A temperature offset between the individual values of the temperature measuring blade is present, resulting from the sensor arrangement. The temperature profiles in the screw area exhibit behavior comparable to those recorded at the measuring blade. In contrast, pressure values, especially p2 and p3, exhibit greater variability than temperature measurements. The aim of the following model development is to predict the measurement parameters from the corresponding input variables.

Fig. 4 Output variables of the extrusion process as a function of time

4.1 Artificial neural network-based data mining algorithm

Our data mining model, iteratively developed using TensorFlow [16], aims to predict the extrusion process output parameters using a feedforward neural network (FNN) (see Fig. 5b). The NN architecture, consisting of three fully connected layers, was designed to optimize the balance between prediction accuracy and computational efficiency. The primary objective was to minimize the number of parameters, reducing the FLOPs required per function call. This reduction enhances the system’s real-time processing capabilities, as detailed in “Real-time capability.” Through the use of 32 neurons and the rectified linear unit (ReLU) activation function, the initial layer transforms the input features. ReLU was selected for its efficiency, as it reduces the time and resources required for training by effectively turning off a portion of the neurons during the forward pass [17]. The following layer, also with 32 neurons, leverages the ReLU activation alongside L2-regularization to prevent overfitting. The final layer linearly combines the processed features into 10 outputs through linear activation, representing the model’s temperature predictions. In total, the model consists of 1610 parameters, encapsulating a comprehensive yet computationally efficient framework. The model’s architecture was refined to lower the mean absolute percentage error (MAPE) function using the Adam optimizer, with hyperparameters fine-tuned through a hybrid method that combines iterative development and systematic optimization via the Hyperopt framework [4, 18]. To ensure the training data accurately reflects the system’s actual behavior, only data from the stabilized state, represented by the last 20% following a changeover, was utilized for training the algorithm.
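The stated total of 1610 parameters can be checked with a short calculation. The sketch below assumes six input features (screw speed plus the five zone temperatures listed under “Control parameters (input)”); the input size is not stated alongside the architecture, so it is an assumption here, although it is the value that reproduces the reported count. The helper `dense_params` is a hypothetical name.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_inputs * n_units + n_units


# (inputs, units) per layer; the 6 input features are an assumption.
layers = [(6, 32), (32, 32), (32, 10)]

per_layer = [dense_params(n_in, n_out) for n_in, n_out in layers]
total_params = sum(per_layer)

# Multiply-accumulate operations per forward pass (one per weight).
macs_per_call = sum(n_in * n_out for n_in, n_out in layers)

print(per_layer, total_params, macs_per_call)
```

Under this assumption the three layers contribute 224, 1056, and 330 parameters, which sum to the reported 1610, and a single prediction needs roughly 1536 multiply-accumulate operations.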

Fig. 5 Data mining: prediction of T4 (a) and neural network architecture (b)

Performance-wise, the model’s temperature predictions for the validation data were accurate, with a mean squared error (MSE) of 0.63 °C² and an MAPE of 0.68% across all temperature values. This accuracy was consistently maintained during the cross-series prediction of different EPDM compounds. Using the conventional 80/20 split for validation, the model predicted temperature variations with an error of less than 1 °C. Figure 5a illustrates the training (orange, representing 80% of the data) and prediction (blue, representing 20% of the data) data for the example of T4 from test series 3 in a stabilized state.
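For readers less familiar with the two reported metrics, the following sketch shows their standard definitions in plain Python. The sample values are made up for illustration and are not taken from the study.

```python
def mse(y_true, y_pred):
    """Mean squared error; for temperatures in °C the unit is °C²."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only (°C).
measured = [80.0, 100.0]
predicted = [79.0, 102.0]
print(mse(measured, predicted), mape(measured, predicted))
```

Because MAPE is relative to the measured value, a fixed absolute error weighs more heavily at low temperatures than at high ones, which is worth keeping in mind when comparing the extrusion and mixing results.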

To initiate the integration of extrusion line components and assess the algorithm’s generalizability, we tested it in the context of predicting mixing temperatures, as illustrated in Fig. 6. The dataset utilized for this purpose included measurement data from two distinct recipes within the mixing process. The first recipe involved three different batches (R1B1, R1B2, R1B3), while the second recipe was represented by a single batch (R2B1). This diverse dataset aimed to challenge the model with a range of operating conditions, thereby testing its adaptability and prediction accuracy in varied contexts. In line with the methodology used for the extrusion temperature prediction model, the data was split into 80% for training and 20% for validation. Importantly, to enhance the training’s robustness and ensure a diverse data distribution, the validation dataset was randomly selected from all batches and recipes. This method was key to avoiding potential bias and ensuring that all recipes and batches are represented in both the training and validation sets. After adapting the input and output layers to the new context, we applied the existing architecture. During the validation phase, the algorithm achieved an MAPE of 5.71% and was primarily effective in identifying trends, although it struggled to capture detailed variations. This result underscores the inherent differences between the continuous process of extrusion and the batch nature of mixing. To address these complexities more accurately, we enhanced the NN architecture by incorporating an additional layer and increasing the neuron count to 128. Additionally, the number of training epochs was increased to allow the model more iterations to learn from the mixing data. Following these adjustments, the model demonstrated high precision in predicting mixing temperatures, evidenced by an MSE of 4.69 °C² and an MAPE of 1.68% for the validation dataset. These results provide the critical foundation for interlinking extrusion line components, such as enabling a cross-process control system.
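The random 80/20 split across all batches described above might look like the following sketch. The batch labels follow the text, while the function name `pooled_split`, the fixed seed, and the dummy rows are assumptions for illustration.

```python
import random


def pooled_split(batches, val_fraction=0.2, seed=0):
    """Pool the rows of all batches, shuffle them, and split off a validation set.

    `batches` maps a batch label (e.g. "R1B1") to a list of rows.
    Returns (train, validation) as lists of (label, row) pairs.
    """
    rows = [(label, row) for label, data in batches.items() for row in data]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(rows)
    n_val = round(len(rows) * val_fraction)
    return rows[n_val:], rows[:n_val]


# Dummy data: 25 rows per batch, values are placeholders.
batches = {name: list(range(25)) for name in ("R1B1", "R1B2", "R1B3", "R2B1")}
train, validation = pooled_split(batches)
print(len(train), len(validation))
```

A purely random draw makes it very likely, but not certain, that every batch appears in both sets; a per-batch (stratified) split would be needed if that representation has to be guaranteed strictly.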

Fig. 6 Data mining: prediction of mixing temperature

4.2 System architecture of the extrusion process control

Figure 7 illustrates our system architecture of the automatic process control, comprising its main components: user-input, inverse model, data mining model, and hardware interface.

Fig. 7 System architecture logic of the automatic extrusion control

User-input

The control process initiates with the entry of input parameters (Xn) and the upper (Y+) and lower (Y-) tolerance limits through a graphical user interface (GUI). These limits are used to trigger the inverse model within the control adjustment process. Additionally, the system calculates both upper (Y+p) and lower (Y-p) preset values from these limits, serving as process targets. Users are provided with the flexibility to adjust these values as needed. The measured variables (Yn) are stored directly in 1-s intervals in the InfluxDB, where they are continuously monitored as part of subsequent processes.
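The monitoring step can be pictured as a simple band check on each new measurement. The sketch below is a minimal illustration under assumptions: the class name `ToleranceBand`, its field names, and the choice to trigger as soon as a limit is reached (rather than strictly exceeded) are not specified in the text, and the preset values are passed in by the user because the rule for deriving them from the limits is not given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToleranceBand:
    lower: float         # Y-  lower tolerance limit
    upper: float         # Y+  upper tolerance limit
    preset_lower: float  # Y-p target used after a lower-limit breach
    preset_upper: float  # Y+p target used after an upper-limit breach

    def target_for(self, measured: float) -> Optional[float]:
        """Return the preset target if a limit is reached, else None."""
        if measured >= self.upper:
            return self.preset_upper
        if measured <= self.lower:
            return self.preset_lower
        return None  # within tolerance: no adjustment needed


# Values as used in the inline validation series (limits 70/90 °C, preset 80 °C).
band = ToleranceBand(lower=70.0, upper=90.0, preset_lower=80.0, preset_upper=80.0)
for value in (85.0, 90.0, 69.7):
    print(value, band.target_for(value))
```

In a running system this check would be applied to each 1-s sample read from the database, and a non-empty result would hand the preset target to the inverse model.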

Inverse model

Upon surpassing the tolerance limits, an adjustment process is activated through the inverse algorithm. Functionally, the inverse model essentially mirrors the data mining model, but with the input–output relationship reversed. Therefore, the inverse model consists of 11 individual models corresponding to each output parameter in order to calculate the necessary input variables. Each model is deployed by loading its trained weights from the .h5 file and its architecture from the .json file. When a deviation is detected, the algorithm performs a dynamic model selection process based on user-defined criteria and selects the appropriate model (T1 or T2 or …). It then uses the selected model to calculate the necessary input variables (XIM,1, XIM,2, …, XIM,n) with the aim of adjusting the system output to either the positive or negative preset target value (Y+p/Y-p). This process allows the system to adaptively respond to variations by creating an additional layer of validation in conjunction with the data mining model.

Data mining model

Validation of these calculated input variables (XIM,1, XIM,2, …, XIM,n) then follows in a subsequent phase, utilizing the data mining algorithm. The algorithm predicts the expected output variables (YDM) and compares them with the desired preset values (Y+p/Y-p). This validation occurs within an iterative loop, where a multiplier adjusts incrementally to minimize the discrepancy between the predicted (YDM) and desired preset values (Y+p/Y-p). This loop persists until the predicted and targeted values converge within a defined accuracy margin (< 1%).
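The interplay of inverse model, data mining model, and multiplier can be sketched as a small loop. This is an illustration under stated assumptions, not the authors' code: how the multiplier enters the calculation is not specified, so here it scales the target handed to the inverse model; the step size, the function names, and the two toy linear models standing in for the trained networks are all hypothetical.

```python
def refine_inputs(inverse_model, forward_model, target,
                  rel_tol=0.01, step=0.005, max_loops=100):
    """Iterate until the forward prediction is within rel_tol of the target.

    Assumes a non-zero target. Returns (inputs, predicted_output, loops_used).
    """
    multiplier = 1.0
    for loop in range(1, max_loops + 1):
        inputs = inverse_model(target * multiplier)  # candidate setpoints
        predicted = forward_model(inputs)            # validation by forward model
        rel_error = (predicted - target) / target
        if abs(rel_error) < rel_tol:
            return inputs, predicted, loop
        # Nudge the multiplier against the sign of the error.
        multiplier += -step if rel_error > 0 else step
    raise RuntimeError("no convergence within max_loops")


# Toy stand-ins: the inverse model is deliberately slightly inconsistent
# with the forward model, so more than one loop is needed.
def toy_forward(x):
    return 0.9 * x + 8.0


def toy_inverse(y):
    return (y - 6.5) / 0.9


inputs, predicted, loops = refine_inputs(toy_inverse, toy_forward, target=80.0)
print(loops, round(predicted, 2))
```

With the toy models the first two candidates miss the 1% margin and the third is accepted, which mirrors the one to two loops reported for the inline tests only in spirit; the real loop count depends on how well the trained inverse and forward models agree.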

Hardware (HW) interface

Upon reaching the accuracy margin, the calculated input variables are transmitted to the hardware interface through an MQTT Broker, utilizing the open-source microcontroller Arduino. This broker serves as an intermediary within the MQTT protocol framework, ensuring an efficient transmission of values between the ANN-based control system and the extrusion line.

5 Experimental results

This section outlines the validation process of the automatic control system using two different methods: simulation-based validation and live testing on the actual extrusion line.

5.1 Validation by simulation

The first approach involved simulating the extrusion system using a subset of data sets from various test series. This approach was based on the assumption that the simulation would closely mimic the real extrusion behavior. Each test data row was gradually made available to the control system at 1-s intervals, mirroring the measurement system and InfluxDB. Subsequently, the established control logic, as outlined in Fig. 7, comprising the inverse and data mining models, was applied. Instead of utilizing the HW interface, calculated output variables were displayed in the GUI. This test series effectively showcased the iterative optimization loop’s functionality, achieving parameter adjustments with a deviation of less than 1 °C.

Besides the multiple inverse model strategy, alternative approaches were also explored. For instance, an iterative optimization using a single model involves substituting the out-of-tolerance value with the predefined preset value (Y±p) and then employing all measured variables at the time of tolerance breach, along with this substituted value, as inputs for the inverse algorithm. For this approach, a high number of iteration loops and out-of-bound temperatures were occasionally observed, particularly when the preset value differed significantly from the tolerance limit. This issue arises because certain combinations of measured variables cannot occur in the actual extrusion system owing to their mutual dependencies (such as symmetry). As a result, these combinations lead to incorrect predictions. Consequently, the first approach was more suitable and was adopted for the inline validation on the extrusion system.

5.2 Inline validation on extrusion system

The second validation approach involves live testing on an operational extrusion line. This setting is based on the assumption that a live environment provides a practical evaluation of the system’s real-world performance, enabling an analysis under varying operational scenarios (see Fig. 8).

Fig. 8 AI-based control: output variables of the extrusion process

The objective of this test series is to maintain T3 within the range of 69.0 to 91.0 °C. For this purpose, the user sets the lower and upper tolerance limits to 70.0 °C (Y-) and 90.0 °C (Y+), respectively, which leaves a safety buffer of 1 °C on each side to account for the extrusion system inertia. For this test series, the preset value (Y+p/Y-p) is specified at 80.0 °C as the target temperature to which the system is driven once a tolerance limit is exceeded.

The series starts with a warm-up phase, setting Tc, Ts, and Td to 75 °C while maintaining Ti and Tf consistently at 70 °C and 50 °C throughout (see Fig. 9). The screw speed is initially set to 18 rpm. After the output values have stabilized at 200 s, Tc, Ts, and Td are manually increased to 88 °C. A further manual increase of Tc, Ts, and Td to 95 °C and of the screw speed to 20 rpm is performed after 800 s, which raises the output temperatures through increased shear and thus challenges the upper tolerance limit.

Fig. 9 AI-based control: input variables of the extrusion process

At 1300 s, the upper tolerance limit is reached, triggering the control system to intervene automatically. Tc, Ts, and Td are adjusted to 73 °C, and the screw speed is simultaneously reduced to 13 rpm. The control system adjustment duration, including calculation and automated value transfer, is below 44 ms (see “Real-time capability” for a detailed analysis). In this test case, only a single iteration loop is necessary to achieve the desired accuracy margin. As expected, due to the extrusion system’s significant inertia, T3 exceeds the tolerance limit for a few seconds, reaching a peak of 90.4 °C, yet consistently staying below the safety buffer threshold of 91.0 °C. The system reaches the target temperature of 80 °C at 1600 s and then drops slightly below this level. This behavior is again due to the high inertia of the extrusion system. Despite this, the deviation is minimal (< 1 °C), underlining the effectiveness of the ANN-driven control mechanism.

The subsequent cooling phase is accelerated by manually reducing the screw speed to 10 rpm and Tc, Ts, and Td to 60 °C at 1700 s, aiming to challenge the lower tolerance limit. This adjustment leads to another intervention by the control system at 2300 s, in which Tc, Ts, and Td are set to 73 °C, and the screw speed is raised to 13 rpm. In this test case, the control system completes its adjustment process within two iteration loops, with a total duration of less than 68 ms. During this phase, T3 falls below the lower tolerance limit for a few seconds, recording a minimum value of 69.7 °C while consistently remaining above the safety buffer threshold of 69.0 °C.

5.3 Real-time capability

The total adjustment duration is mainly composed of the following key elements: network ping latency, the control algorithm’s processing time, and the time required for variable transmission from the InfluxDB and to the Arduino. Variable transmission by the InfluxDB and Arduino typically occurs in under 20 ms, depending on network latency. The algorithm’s total processing time includes loading the inverse and data mining models, input data preprocessing, and model inference. The inverse and data mining models are intentionally preloaded at program initiation and subsequently only referenced during each adjustment iteration, resulting in a significant process time reduction per loop. Consequently, the main factors influencing computation duration are data preprocessing and model inference, which are strongly dependent on the NN architecture and the specifications of the utilized computational HW. In the conducted inline test series, computations were performed using an Intel Core i7 1255U central processing unit (CPU), operating at 1.70 GHz, to process the inverse and data mining model. The comprehensive inference time per model encompasses the transfer of data to and from the CPU, the conversion of input data formats, the allocation of memory or computational resources, and the initialization of TensorFlow’s operations for prediction, averaging 12 ms. Consequently, each loop, incorporating both the inverse and data mining model, requires a total of 24 ms to complete. The overall duration thus depends on the number of loops, generally fewer than three, resulting in a total time of 68 ms for a two-loop adjustment. Given the system’s high inertia and sampling interval of 1 s, the real-time capability of the control system would be ensured even with a very high number of iteration loops (see Table 1).

In contrast, PID controllers operate on a feedback loop [15] directly linked to the sensor speed, excelling in simple linear systems but struggling with complex, non-linear processes. Adjustments are based on error values obtained from sensor feedback, which can be slower, especially when the sensor rate is constrained to a 1-s sampling interval. In conclusion, both simulation-based and inline validations underscored the system’s stability in maintaining process variables within specified tolerances, effectively addressing the challenge of managing real-time control in rubber extrusion processes.
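The timing budget described above reduces to simple arithmetic, sketched below with the reported figures. Treating the "under 20 ms" transmission time as a fixed 20 ms upper bound is an assumption, as are the constant and function names.

```python
TRANSMISSION_MS = 20          # InfluxDB + Arduino transfer, reported upper bound
INFERENCE_MS_PER_MODEL = 12   # average inference time per model
MODELS_PER_LOOP = 2           # inverse model + data mining model
SAMPLING_INTERVAL_MS = 1000   # 1-s measurement interval


def adjustment_time_ms(loops):
    """Estimated duration of one control adjustment with `loops` iterations."""
    return TRANSMISSION_MS + loops * MODELS_PER_LOOP * INFERENCE_MS_PER_MODEL


def max_loops_within_sample():
    """Largest loop count that still finishes within one sampling interval."""
    loop_ms = MODELS_PER_LOOP * INFERENCE_MS_PER_MODEL
    return (SAMPLING_INTERVAL_MS - TRANSMISSION_MS) // loop_ms


print(adjustment_time_ms(1), adjustment_time_ms(2), max_loops_within_sample())
```

This reproduces the 44 ms and 68 ms figures for one and two loops, and under the same assumptions about 40 loops would fit into a single 1-s sampling interval.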

Table 1 Summary of real-time capability key elements

To enhance the real-time capability and scalability of the control system, future efforts should focus on integrating edge devices, such as the Raspberry Pi 5, rather than relying solely on traditional CPUs. This shift can enhance scalability and reduce latency by facilitating data processing closer to the source. Additionally, optimizing the parameters of the inverse and data mining models will decrease inference times, thereby reducing the overall execution time of the system. For this approach, it is crucial to ensure that high levels of prediction accuracy are maintained despite the reduced parameter count. Furthermore, implementing online learning techniques allows the control system to adapt dynamically to changing operational conditions without periodic retraining, a topic to be elaborated on in the following conclusion.

6 Conclusion

Based on our previous research, which initiated the development of an FNN without real-world empirical application [4], this study presents the first ANN-based control system for rubber extrusion lines. The primary advantage of the system lies in its real-time adaptability in maintaining temperature within user-defined tolerances, ensuring consistent product quality even with batch-specific material variations. By integrating data mining with inverse models, an additional validation layer is introduced, enhancing the system’s predictive accuracy and ensuring robust performance. Validation results confirm the effectiveness of the system, with a deviation of less than 1 °C in temperature control. Moreover, the system’s capability to handle variable combinations not represented in the training dataset highlights its robustness, marking a significant advancement from the earlier FNN model to a fully functional, empirically validated application.

Compared to traditional PID control systems, our ANN-based system offers superior adaptability. Although PID controllers are typically simpler and require fewer computational resources, they often struggle with complex, non-linear system dynamics common in processes like rubber extrusion. In contrast, our ANN-based system excels at managing such nonlinearities, as evidenced by an MSE of 0.63 °C² and an MAPE of 0.68% for predicting extrusion temperatures, and an MSE of 4.69 °C² and an MAPE of 1.68% for the mixing temperature. This generalization capability, demonstrated across different processes, forms the basis for developing interconnected control systems.

Despite its strengths, the system faces challenges common to ANN-based models, such as dependency on large, diverse datasets for training and potential overfitting issues. To counteract these requirements when applying the system to other manufacturing processes, approaches such as data augmentation could be employed to enhance the robustness and effectiveness of the model without the need for excessively large datasets. While our ANN-based approach requires more computational resources compared to traditional PID controllers, efforts can be made to optimize the efficiency of the neural networks. By further reducing the number of parameters, it is possible to decrease inference times, thereby reducing the overall execution time of the system.

Looking ahead, there is potential for expanding the applicability of this system to interlink various manufacturing processes beyond rubber extrusion [19]. Additionally, integrating more comprehensive datasets and exploring further algorithmic refinements could further improve precision and speed. The potential integration of online learning techniques, in particular the online deep learning (ODL) framework [20], promises to improve the adaptability of our ANN-based control system. ODL’s hedge backpropagation method, which allows for real-time updates of NN parameters, directly addresses the convergence challenges we face with streaming extrusion data. Incorporating this framework has the potential to enable our system to dynamically adapt to varying scenarios with even greater accuracy.