A Machine Learning Strategy for Race-Tracking Detection During Manufacturing of Composites by Liquid Moulding

This work presents a supervised machine learning (ML) model to detect race-tracking disturbances during the liquid moulding manufacturing of structural composites. Race-tracking is generated by unexpected resin channels at mould edges that may induce dry spots and porosity formation. The ML model uses the pressure signals recorded by a sensor network as input, providing a classification of the race-tracking event from a set of possible scenarios, and a subsequent regression of its position, size and strength. The model is based on the residual network (ResNet), a well-known artificial intelligence architecture that makes use of convolutional neural networks for image recognition. Training of the ML classifier and regressors was carried out with a synthetic data set obtained through computational fluid dynamics simulations. The time evolution of the pressure sensor signals was encoded as grey-level images, or footprints, which served as inputs to the ResNet model. The trained model was able to recognise the presence of race-tracking channels from the pressure data, yielding good accuracy in terms of label prediction as well as position, size and strength. The model correlation was carried out with a set of injection experiments performed with a constant thickness closed mould containing induced race-tracking channels. The ability of ML models to provide an approximation to the inverse problem, relating the pressure sensor distortions to the cause of such events, is analysed and discussed.


Introduction
Fibre-reinforced polymer matrix composites (PMCs) are nowadays widely used in applications that require lightweight materials, such as those found in the aerospace, automotive, energy, health-care and sports sectors, among others [1,2]. However, at present, the dependence of processing on external factors or disturbances is still remarkable in some strategic areas such as aeronautics, aerospace or the automotive sector, which means that the number of parts to be inspected and either rejected or reworked has a considerable impact on the final cost distribution. With that in mind, the cumulative and recurrent costs of not implementing appropriate diagnosis policies in manufacturing, which could allow recovery from failure, will have a negative impact on the competitiveness of the industrial sector in the future.
During the last decade, enormous advances in high-performance computer modelling have enabled concurrent material design, optimisation and a way of manufacturing through virtual processing (VP) within an integrated computational materials engineering (ICME) framework [3][4][5]. This is especially relevant in the case of manufacturing of structural composites, as the material performance is strongly path dependent. There is a need to incorporate such knowledge from the early stages of the design phase. The future of ICME and, in particular, VP involves a concurrent and seamless hybridisation with data science and artificial intelligence (AI) in a new modelling paradigm [6]. Such a paradigm will be dedicated to predicting the evolution of material properties through the entire production chain, including processing disturbances, enabling right-first-time concepts and zero-waste production.
Within the family of manufacturing methods for structural composites, liquid moulding (LM), and more precisely, resin transfer moulding (RTM) emerged as the out-of-autoclave technique that best provides high-quality parts at optimised costs. The RTM process starts with a dry fabric preform which is draped and placed into a mould for its impregnation with a liquid resin driven by a pressure gradient [7,8]. After the part is totally impregnated, the resin is cured, normally by the action of heat sources until the part becomes solid and can be demoulded. However, one of the major drawbacks associated with RTM arises from the inherent uncertainty as regards the flow patterns produced during impregnation, which are seriously affected by processing disturbances that lead in the end to dry spots and void formation. Such dry spots are generated by unexpected resin channels at mould edges, known as race-tracking channels [9,10], resulting in non-homogeneous resin flow diverted directly to the outlet gates. In addition, resin velocity variations, mainly attributed to local values of permeability and viscosity, may induce void formation due to unbalanced capillary and viscous forces in textile reinforcements. In the last two decades, efforts have been made in the field to propose automated systems that perform corrective actions after detection of manufacturing disturbances in RTM. The concept of online filling in RTM developed by Advani and co-researchers is a good example of such a philosophy [11][12][13][14]. These strategies rely on smart use of sensors for flow and pressure detection [15,16] distributed over the surface of the mould.
AI is opening revolutionary opportunities across engineering disciplines such as continuum mechanics [17] and fluid mechanics [18], as well as manufacturing [19,20]. However, the permeation of practical knowledge in manufacturing is still limited and applications are at an initial stage [21][22][23]. Focusing on structural composites, it is worth mentioning some of the latest contributions in the area [24][25][26][27]. Iglesias et al. [24] used Bayesian inference concepts in combination with the inversion method to address the problem of fabric permeability and uncertainties in RTM by using pressure sensors. Successful applications to detect local variations in fabric porosity and permeability in RTM were presented in [26], although the predictions strongly rely on the geometrical approximation of the fluid front position through flow detectors. The detection of regions of dissimilar permeability based on signals recorded by pressure sensors was demonstrated in [25]. These authors used a convolutional neural network (CNN) operating with pressure footprints influenced by the presence of a dissimilar permeability region. CNN regression was able to predict the presence, position, extension and strength of a dissimilar material region based on knowledge gained through the use of synthetic data sets. A different approach was presented recently in [27]. These authors divided a rectangular RTM mould into equal-size rectangular regions, using the synthetic pressure signals obtained by simulation to detect regions of dissimilar permeability that induce voids. The classification problem was approximated by a gradient-boosting method that provided reasonable results. However, from an industrial perspective, the number of sensors used in that work is still unrealistic, and minimising their number without losing information becomes mandatory.
Machine learning (ML) is a paradigm within AI which is focused on solving tasks without being explicitly programmed for them. It is the data that allow ML models to begin to learn the specifics of the problem and, through trial and error, the model ends up understanding the patterns hidden in the data. Learning from inputs and outputs can be considered a gradual task that improves continuously as experience is gained. To some extent, this paradigm imitates the way humans learn. The main purpose of this work is to provide a first experimental exploration of ML methods to detect automatically flow disturbances caused by the presence of race-tracking channels in RTM. The detection capabilities of the model rely on the analysis of pressure changes recorded by a distributed network of only five sensors. The model was developed to address the problem of the presence of race-tracking regions in a square flat RTM mould. The experimental system was intentionally kept simple with the purpose of exploring ML capacities for diagnosis in manufacturing.
The primary focus of this manuscript is to describe the general ML methodology and correlate it with data collected from the experiments. The basic idea is to demonstrate the possibility of training ML models with only synthetic data extracted from simulations, using them for classification and regression purposes without performing an extensive and costly experimental campaign. Training ML models using only empirical data is prohibitive in many manufacturing systems, so efforts are made to use virtual processing information as much as possible. Future improvements in the ML model will be related to the inclusion of a new portfolio of race-tracking types and/or new processing disturbance scenarios such as non-homogeneous fibre volume fraction and deformations induced in the fabric textile while preforming, among others.

The general description of the experimental set-up and the RTM tests with induced race-tracking channels is presented in Sect. 2. The systematic generation of a fully simulation-based data set corresponding to the physical problem is presented in Sect. 3, while the general architecture of the ML model, its training and deployment are described in Sect. 4. A general discussion on the accuracy of the model for synthetic and experimental data is carried out in Sect. 5.2. Lastly, some remarks are made and conclusions drawn in Sect. 6.

Materials and Experiments
Resin injection experiments were carried out by using a 600 × 300 mm² closed mould as shown in Fig. 1a. The upper half of the mould was manufactured with transparent polymethyl methacrylate (PMMA) providing optical access to its interior while the other half was steel-made. In order to avoid excessive deflections of the transparent PMMA cover due to the injected fluid pressure, the mould was externally reinforced with a set of four steel beams helping to maintain a constant cavity thickness of 3 mm. Injection experiments were performed by imposing an overall uniaxial flow between the inlet and outlet gates. To this end, a single injection point was placed in the middle of one of the 300 mm edges to guide the flow inside a triangular channel to approximately half of the total mould length. The RTM system was kept as simple as possible to serve as a database for the correlation of the models. The race-tracking channels strongly distorted the fluid flow, providing a double correlation benchmark in which the physical model (Darcy's flow) and the ML classifier-regressors are validated.
The geometry of such a channel was created simply by using vacuum bag sealant tape. The outlet gate was placed at the opposite 300 mm edge and connected to a resin trap open to the atmosphere. The effective fabric dimensions inside the mould were L = 300 × 300 mm², as shown in Fig. 1b. The two mould halves were closed with 16 screws generating a uniform clamping pressure on the mould cavity. A silicone seal surrounding the mould contour was used to prevent any fluid leaks during the injection.
The inlet was connected to a pressure pot in which fluid pressure was controlled externally with a compressed-air line. Fluid pressure in the panel was measured with five pressure transducers (Omega PX61V0-100AV). Four of them, namely $P_1$ to $P_4$, were distributed equally in a uniform square array while a fifth sensor, $P_0$, was placed at the fabric entrance, Fig. 1b. The transducers were screwed into the transparent PMMA cover through specific cylindrical cavities machined in it, exposing the sensing area to the fluid action without direct contact with the fabric. An additional sensor was used to control the electro-valve activity of the pressure pot, avoiding line pressure disturbances during the tests. All the tests were performed by applying a constant pressure at the injection pot of $p_{in} = 0.7$ bar. The pressure sensor signals were recorded by using a Catman Quantum HBM data acquisition system. A video camera installed on top of the mould was used to track the flow front evolution for direct comparison with the simulations.
A blend of corn syrup with water at nominal concentrations of ≈ 70-30% was used as the injection fluid for all the experiments. The blends were first degassed for 20 min by using a vacuum container. The viscosity of the fluid was measured individually and immediately before the experiments were performed by means of a rotational viscosimeter (Fungilab) with an L2 type spindle at 150 rpm. No shear strain rate effects were taken into account, assuming that the fluid behaves as a Newtonian one. The viscosity values measured ranged between 130 and 160 mPa·s for all the experiments, corresponding to slight variations in the blend as well as in the daily laboratory temperature.
The fabric lay-up involved stacking four plies of a polyethylene terephthalate (PET) non-woven felt of 280 g/m² areal mass with a volume fraction of reinforcement of ≈ 30%. The combination of the aforementioned corn syrup blend with the PET non-woven felt provided a good optical contrast for fluid front visualisation purposes. Race-tracking was induced by cutting lateral channels of ≈ 10 mm width along the flow direction with a blade. Special care was taken to prevent leaks in regions with non-race-tracking edges into which resin flow may divert. In this work, four race-tracking types were studied, as shown in Fig. 2a. In all the cases, race-tracking channels were introduced at the edges parallel to the flow direction. Type 1 and Type 2 can be considered the same race-tracking type but flipped with respect to the flow direction axis. Type 3 race-tracking contains the two channels starting from the fabric entrance. Type 4 was generated by producing centred channels totally disconnected from the entrance and exit of the panel. A set of four snapshots corresponding to one case with the Type 3 race-tracking configuration is plotted in Fig. 2b. The upper and lower lengths of the race-tracking channels were set to $l_1 = 0.075$ m and $l_2 = 0.225$ m, respectively, for this case. As expected, the fluid progressed faster through the channels, which produced a clear deviation of the flow front from the theoretical linear unidirectional case.

Fig. 1 a General overview of the RTM testing system, b Mould for RTM experiments with details of sensor locations $P_0$ to $P_4$. The testing area corresponded to a L = 300 × 300 mm² square. Coordinate x-y origin at the lower-left corner of the fabric. The test presented corresponded to a no race-tracking case with homogeneous flow along the x direction
The pressure signals corresponding to the five cases analysed (no race-tracking, Type 1, Type 2, Type 3 and Type 4) are presented in Fig. 3. The presence of sensor $P_0$ at the fabric entrance requires a specific comment. In all the cases analysed, the controlled pressure at the injection pot was ≈ 0.7 bar, which differed considerably from the readings of sensor $P_0$ at the fabric entrance. A significant pressure drop of ≈ 30% was attributed to viscous flow and local losses produced along the flow path from the pot to the fabric entrance, which made its direct measurement necessary. In addition, the pressure recorded by sensor $P_0$ was not constant in time and depended on the fabric resistance to flow. This has a considerable effect on the filling time, which deviates from the theoretical value obtained under the constant pressure assumption. In this case, by assuming a linear pressure gradient over the length $L$ of the fabric and a constant inlet pressure $p_{max}$ with no race-tracking effects, the filling time can be obtained by direct integration of the Darcy equation. Such an integration yields the well-known relation $t^0_{ff} = \mu L^2/(2 K p_{max})$, where $\mu$ and $K$ stand for the fluid viscosity and fabric permeability. However, this expression is no longer valid if the inlet pressure is not held constant in time. If the pressure evolution at sensor $P_0$ is given by $p_0(t) = p_{max} f(t)$, the proportionality between injected length and time, $t \propto x^2(t)$, is no longer fulfilled and the filling time is delayed to $t_{ff} = \lambda\, t^0_{ff}$. The factor $\lambda$ depends on the shape of the pressure build-up function $f(t)$. Therefore, when $f(t) = 1$, the classical constant pressure case is recovered with $\lambda = 1$.
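The origin of the delay in the filling time can be made explicit with a short derivation. The sketch below assumes one-dimensional Darcy flow with a linear pressure profile behind the front, as stated in the text, and absorbs the fabric porosity into $K$, as the quoted filling-time relation implies. The symbols $\mu$ (viscosity) and $\lambda$ (delay factor) are notational assumptions of this sketch.

```latex
\begin{align}
\frac{dx}{dt} &= \frac{K}{\mu}\,\frac{p_{max}\,f(t)}{x(t)}
  && \text{front velocity, linear pressure drop over } x(t) \\
\frac{x^2(t)}{2} &= \frac{K\,p_{max}}{\mu}\int_0^t f(\tau)\,d\tau
  && \text{separation of variables, } x(0)=0 \\
t^0_{ff} &= \frac{\mu L^2}{2 K p_{max}}
  && f(t)=1,\; x(t^0_{ff})=L \\
\int_0^{t_{ff}} f(\tau)\,d\tau &= t^0_{ff}
  && \text{general } f(t),\; x(t_{ff})=L
\end{align}
```

Because $f(t) \le 1$ by the definition of $p_{max}$, the last relation can only be satisfied with $t_{ff} \ge t^0_{ff}$, that is, with a delay factor $\lambda = t_{ff}/t^0_{ff} \ge 1$.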
The permeability of the fabric $K$ in the absence of race-tracking channels was characterised first, see Fig. 3a. In this case, the flow progressed homogeneously and the readings of the sensor pairs $P_1$-$P_2$ and $P_3$-$P_4$ were similar in time and lay within the experimental scatter. The direct measurement of the filling time as a function of the injected length, see the $x(t)$ length in Fig. 1b, was used for the permeability characterisation. To this end, three tests without race-tracking channels were carried out. The term $\mu x^2(t)/(2 p_{max})$ is then represented against the integral of the experimental function $f(t)$ obtained from the sensor $P_0$ reading, $\int_0^t f(\tau)\,d\tau$, with the slope of the relationship being the permeability of the fabric $K$. A Bayesian marginalisation of said slope, the permeability $K$, is computed by means of the EMCEE Python implementation of the affine-invariant Markov chain Monte Carlo (MCMC) ensemble sampler. The probability density function is then obtained, providing the averaged value of $K$ and the 5% and 95% confidence percentiles as $K = 4.6515^{+0.0974}_{-0.0994} \cdot 10^{-10}$ m². The pressure evolution curves for the remaining cases analysed are gathered in Fig. 3b to e. The distortion produced by the presence of the race-tracking channels was evident, with effects on the maximum pressure, arrival times and pressure build-up. This information is stored in a non-dimensional space in the form of a pressure footprint image as shown in Fig. 4. The footprint is defined as a pixel-based image of $n \times 5$ size in which the vertical axis of the array is the non-dimensional time $\hat{t} = t/t^0_{ff}$ and the horizontal axis the specific sensor number ($P_0$ to $P_4$). The pixel intensity is assigned to the pressure value $\hat{p} = p/p_{max}$. The time size of the images was set to $n = 150$, with this value being justified in Sect. 4.
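The construction of a footprint from raw sensor records can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `pressure_footprint`, the image time span `t_end` and the synthetic signals used in the demonstration are all assumptions.

```python
import numpy as np

def pressure_footprint(t, p, t_ff0, p_max, n=150, t_end=1.5):
    """Return an (n, n_sensors) array of non-dimensional pressures.

    t      : 1D array of increasing sample times
    p      : 2D array (len(t), n_sensors), one column per sensor P0..P4
    t_ff0  : reference filling time used to scale the time axis
    p_max  : reference pressure used to scale the pixel intensity
    t_end  : last non-dimensional time stored in the image (assumed value)
    """
    t_hat = np.asarray(t, dtype=float) / t_ff0
    p_hat = np.asarray(p, dtype=float) / p_max
    grid = np.linspace(0.0, t_end, n)  # n rows, one per time division
    # Interpolate every sensor column onto the common time grid.
    cols = [np.interp(grid, t_hat, p_hat[:, j]) for j in range(p_hat.shape[1])]
    return np.clip(np.stack(cols, axis=1), 0.0, 1.0)

# Synthetic stand-in signals (not experimental data): five exponential build-ups.
t = np.linspace(0.0, 90.0, 400)
tau = np.array([5.0, 20.0, 20.0, 40.0, 40.0])
p = 0.5e5 * (1.0 - np.exp(-t[:, None] / tau))
img = pressure_footprint(t, p, t_ff0=60.0, p_max=0.5e5)
print(img.shape)  # (150, 5)
```

Each column of `img` is one sensor and each row one non-dimensional time step, so the array can be displayed directly as a grey-level image.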

Mould Filling Model
The physical model used in this work is based on the resolution of the Darcy equation for fluid flow through a porous medium. The Darcy equation establishes a linear relation between the average fluid velocity through the fibre preform $\mathbf{v}(\mathbf{x}, t)$ and the pressure gradient $\nabla p(\mathbf{x}, t)$, with the proportionality factor being related to the fabric permeability tensor $\mathbf{K}(\mathbf{x})$ and the fluid viscosity $\mu$ as

$$\mathbf{v}(\mathbf{x}, t) = -\frac{\mathbf{K}(\mathbf{x})}{\mu}\,\nabla p(\mathbf{x}, t) \qquad (1)$$

In this equation, $\mathbf{x}$ and $t$ stand for, respectively, the position of a given point in the fabric and the time. Assuming flow continuity, $\nabla \cdot \mathbf{v}(\mathbf{x}, t) = 0$, the governing equation for the pressure field can be obtained as

$$\nabla \cdot \left( \frac{\mathbf{K}(\mathbf{x})}{\mu}\,\nabla p(\mathbf{x}, t) \right) = 0 \qquad (2)$$

Initial ($t = 0$) and boundary conditions should be given to determine the evolution of the pressure and velocity fields in time. Such a problem is normally defined as a moving boundary problem because the flow front position $\Gamma(\mathbf{x}, t)$ evolves in time until the preform is completely filled. Normally, if the position of the flow front is known for a given time $t$, the pressure and velocity fields are determined by standard finite element modelling. Once such information is acquired, an update of the flow front position for time $t + \Delta t$ can be obtained. Several numerical techniques were developed in the past to solve such problems in liquid moulding manufacturing of composites. For instance, the finite element/control volume method uses regions associated with every node to update the filling factors through the flow rates obtained from the velocity fields. Other ways to simulate mould filling processes are based on the direct solving of the two-phase flow problem by using the Navier-Stokes equations for incompressible fluids. In this case, the continuity equation $\nabla \cdot \mathbf{v}(\mathbf{x}, t) = 0$ is accompanied by the linear momentum equation reading as

$$\frac{\partial (\rho \mathbf{v})}{\partial t} + \nabla \cdot (\rho\, \mathbf{v} \otimes \mathbf{v}) = -\nabla p + \nabla \cdot (\mu \nabla \mathbf{v}) + \mathbf{S} \qquad (3)$$

where $\rho$ is the fluid density and $\mathbf{S}$ is known as the Darcy-Forchheimer sink term,

$$\mathbf{S} = -\left( \mu D + \frac{1}{2}\,\rho\,|\mathbf{v}|\,F \right) \mathbf{v} \qquad (4)$$

and where $D$ and $F$ stand for material parameters. The second term in Eq. (4) is related to inertial effects, which are negligible for the case of liquid moulding of composites under low Reynolds number flow. In this situation, the inertial terms related to the velocity can be neglected, recovering the standard Darcy equation with the factor $D$ being the inverse permeability of the fabric. In two-phase flow, the interface between the two fluids, namely resin and air in liquid moulding, is tracked by means of the volume of fluid (VOF) approach by using $\alpha$ as a phase variable. This variable takes the values $\alpha = 1$ and $\alpha = 0$ for the resin and air fluids, respectively. Within this approach, the variable $\alpha$ is continuously updated during simulation time by using the advection equation given by

$$\frac{\partial \alpha}{\partial t} + \nabla \cdot (\alpha\, \mathbf{v}) = 0 \qquad (5)$$

Fig. 4 Pressure footprints for: a no race-tracking, b race-tracking Type 1, c race-tracking Type 2, d race-tracking Type 3, e race-tracking Type 4. Each image is formed by 150 × 5 pixels corresponding to five sensors ($P_0$ to $P_4$) and $n = 150$ temporal divisions

OpenFoam (Open-Source Field Operation And Manipulation) [29] is a free open-source computational fluid dynamics (CFD) software that can be used to solve the problems related to filling in the liquid moulding of composites. OpenFoam includes interFoam, a solver for two-phase incompressible, isothermal and immiscible fluids that tracks interfaces with the VOF method. Details and performance of the algorithms used to track interfaces, as well as pressure and velocity solvers for two-phase flow, can be found in [30].
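The moving boundary character of the problem can be illustrated with its simplest instance, a one-dimensional front driven by a prescribed inlet pressure. The sketch below is not the OpenFoam model: it assumes a linear pressure profile behind the front, neglects porosity, and uses illustrative parameter values and a hypothetical function name `fill_time_1d`.

```python
import math

def fill_time_1d(K, mu, L, p_inlet, dt):
    """Time for a 1D Darcy front to travel a length L.

    Integrates d(x^2)/dt = 2 K p_inlet(t) / mu with the midpoint rule.
    This follows from the front velocity v = (K/mu) * p_inlet(t) / x for
    a linear pressure profile behind the front.
    """
    s, t = 0.0, 0.0  # s = x^2, squared front position
    while s < L * L:
        s += dt * 2.0 * K * p_inlet(t + 0.5 * dt) / mu
        t += dt
    return t

# Illustrative values only: K [m^2], mu [Pa s], L [m], p_max [Pa].
K, mu, L, p_max = 4.65e-10, 0.145, 0.3, 0.5e5
t_exact = mu * L**2 / (2.0 * K * p_max)  # constant-pressure closed form
dt = t_exact / 10_000

t_const = fill_time_1d(K, mu, L, lambda t: p_max, dt)
t_ramp = fill_time_1d(K, mu, L, lambda t: p_max * (1.0 - math.exp(-t / 30.0)), dt)
print(t_exact, t_const, t_ramp)
```

With a constant inlet pressure the numerical filling time agrees with the closed-form value to within the time step, while a gradual pressure build-up delays the filling.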

Data Set Generation
OpenFoam provided the massive synthetic data set required for the ML model. Given that obtaining a data set of such size from experiments alone is prohibitive, the experiments were replaced by numerical simulations. Computations were carried out in a non-dimensional space by dividing length, time and pressure by $L$, $t^0_{ff}$ and $p_{max}$, respectively, so that the model could be used for fabrics of different panel size $L$, permeability $K$ and fluid viscosity $\mu$. Macroscopic unidirectional flow was induced in a unit square fabric of $1 \times 1 \times 1/100$ by imposing a time-dependent pressure boundary condition at the inlet ($\hat{x} = 0$) following $\hat{p}_0(\hat{t}) = 1 - e^{-\hat{\gamma}\hat{t}}$, while the outlet pressure was kept at $\hat{p} = 0$ ($\hat{x} = 1$). The time-evolution parameter $\hat{\gamma}$ controls the pressure build-up in sensor $P_0$ in non-dimensional time. The pressure build-up at the inlet was introduced to follow the experimental observations, with the exponential function with a single parameter $\hat{\gamma}$ being good enough to capture the general evolution. However, additional terms can be added to the model without loss of generality. Free-slip conditions were applied to the remaining faces of the mould. Race-tracking channels were modelled as dissimilar regions of width 1/30 and lengths $\hat{l}_1$ and $\hat{l}_2$ with isotropic permeability given by $K_{ch} = \beta K$, where $K$ is the permeability of the bulk fabric and $\beta = K_{ch}/K$ is the race-tracking strength. A uniform brick cell discretisation of the domain was used with in-plane dimensions of 1/100 and a single cell in the through-the-thickness direction of the mould. The models contained a total of 10000 brick cells, which were judged fine enough to capture accurately the evolution of the flow front position in time, with the effect of discretisation being negligible at the pressure sensor points. The viscosity $\mu$ and the permeability $K$ were arbitrarily chosen in this approach to provide $t^0_{ff} = 1$ in the non-dimensional space.
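The effect of the exponential inlet condition on the filling time of a case without race-tracking can be estimated directly. In the non-dimensional space the one-dimensional front obeys $\hat{x}^2(\hat{t}) = \int_0^{\hat{t}} \hat{p}_0\,d\hat{t}$, so the filling time is the root of $\lambda - (1 - e^{-\hat{\gamma}\lambda})/\hat{\gamma} = 1$. The sketch below solves this by bisection; the names `fill_factor`, `gamma_hat` and `lam` are assumptions of this example, and the one-dimensional relation is a simplification of the simulated problem.

```python
import math

def fill_factor(gamma_hat, tol=1e-12):
    """Non-dimensional filling time for p0(t) = 1 - exp(-gamma_hat * t).

    Solves  lam - (1 - exp(-gamma_hat * lam)) / gamma_hat = 1  by bisection.
    """
    def g(lam):
        return lam - (1.0 - math.exp(-gamma_hat * lam)) / gamma_hat - 1.0

    # g(1) < 0 and g(1 + 1/gamma_hat) > 0, so the root is bracketed.
    lo, hi = 1.0, 1.0 + 1.0 / gamma_hat
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for gamma_hat in (2.0, 10.0, 25.0):
    print(gamma_hat, round(fill_factor(gamma_hat), 3))
```

Slower pressure build-up (smaller `gamma_hat`) gives a longer filling time, and fast build-up approaches the constant-pressure value of 1.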
A non-dimensional four-tuple $X = [\hat{l}_1, \hat{l}_2, \beta, \hat{\gamma}]$ defines the characteristics of an individual non-dimensional OpenFoam case used for the synthetic data set generation. The $X$ variable space was uniformly covered with the Latin hypercube sampling technique by means of the PyDOE software package [31]. All the regression variables were assumed to follow uniform probability distributions: $U_{\hat{l}_1} = U_{\hat{l}_2} = U(0.05, 0.95)$, $U_{\beta} = U(10, 2000)$ and $U_{\hat{\gamma}} = U(2, 25)$. The computation time was kept constant for the simulations and equal to $t_{ff} = \lambda\, t^0_{ff}$ with $\lambda = 1.5$. The latter value of $\lambda = 1.5$ corresponds to the solution of the integral equation $\int_0^{\lambda} \hat{p}_0(\hat{t})\,d\hat{t} = 1$. The set of simulations with different four-tuples $X = [\hat{l}_1, \hat{l}_2, \beta, \hat{\gamma}]$ for each of the types of race-tracking, as well as without it, was generated with OpenFoam. The automation of the computational process was carried out with PyFoam, a Python library that manipulates and controls OpenFoam running cases. Instances of the problem corresponding to each of the four-tuples were generated by modifying the input parameters of the topoSetDict and controlDict dictionaries. Once an individual simulation was finished (in approximately 180 s on a standard computer), the pressure probe data $p_i(t)$ at the sensor positions of the experiments were stored. However, the computational time may increase significantly for problems with more complicated geometries and boundary conditions. The process was followed by a new job submission until the whole data set was created. A total of 4000 OpenFoam simulations were run to serve as a database for model benchmarking. Data sets were serialised into a Python pickle, providing easy access for subsequent ML training tasks.
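The sampling of the four-tuple space can be sketched without any external dependency. The code below is a plain-Python stand-in for the PyDOE call used in the paper, not a reproduction of it: the function name `latin_hypercube`, the seed and the number of cases per race-tracking type are assumptions.

```python
import random

def latin_hypercube(bounds, n, seed=0):
    """Latin hypercube sample of n points.

    The range of each variable is cut into n equal strata and every
    stratum is used exactly once per variable.
    """
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        # One random point inside each stratum, then shuffle the strata
        # so that the variables are paired at random.
        pts = [lo + (hi - lo) * (k + rng.random()) / n for k in range(n)]
        rng.shuffle(pts)
        columns.append(pts)
    return [tuple(col[i] for col in columns) for i in range(n)]

# Bounds quoted in the text for the two channel lengths, the race-tracking
# strength and the pressure build-up parameter.
bounds = [(0.05, 0.95), (0.05, 0.95), (10.0, 2000.0), (2.0, 25.0)]
cases = latin_hypercube(bounds, n=1000)  # n per type is illustrative
print(cases[0])
```

Each tuple in `cases` would then parametrise one simulation, for instance by writing the channel lengths and permeability ratio into the case dictionaries before submission.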

Machine Learning Classifier-Regressor Structure
ML scripts were built by using the end-to-end open-source platform PyTorch [32]. In this work, among those available in the published literature, a kernel architecture made with the deep learning Residual Network (ResNet) [33] was selected. Similar results with other top-performing DL models, such as DenseNet [34] or EfficientNet [35], could be obtained; however, the comparative performance analysis is beyond the scope of this work. The ML model proposed here is a combination of two basic and subsequent operations. First, a classifier that predicts the race-tracking labels (no race-tracking, Type 1, Type 2, Type 3, Type 4) as a function of the pressure footprints and, second, a regressor that provides an approximation for the four-tuple.
ResNet is developed by mimicking pyramidal cells existing in the cerebral cortex, with skip connections jumping over convolutional layers. Convolutional networks present some degree of inconvenience with regard to back-propagation and gradient vanishing. This happens when the depth of the network is considerably large, meaning that the back-propagated gradients shrink to zero after several training epochs. As a result, network weights are not updated properly and learning is not produced. In order to address this, the ResNet architecture introduced skip connections between layers. The structure of the ResNet model used in this work is presented in Fig. 5. It is composed of three subsequent block operations applied to the input images of the pressure footprints, namely convolution (Conv2D), batch normalisation (BN) and activation with a rectified linear unit (ReLU), followed by average pooling:
• Convolution2D (Conv2D). This corresponds to an image operation based on the application of a given set of filters to enhance specific features of the image. The 2D convolution of an individual image, namely $\mathbf{O}$, may be obtained from an input image, namely $\mathbf{I}$, by applying the kernel function $\mathbf{k}$ as $\mathbf{O} = \mathbf{I} * \mathbf{k}$, where $\mathbf{k}$ stands for the filter applied and $*$ the convolution operator. In this work, a total of 17 convolutions with kernel sizes between 1 × 1 and 3 × 3 were applied, using stride=1 and stride=2 (the stride being the pixel step with which the filter is displaced through the image) and padding=1 to avoid trimming at the image edges.
• Batch normalisation (BN). This seeks to alleviate the shifts produced in the distributions of the internal nodes of the network with the intention of accelerating training. Those shifts are avoided via a normalisation step, constraining the means and variances of the layer inputs and saving them during the training process. Furthermore, it reduces the need for dropout, local response normalisation and other regularisation techniques [36].
• Rectified linear unit (ReLU). This is the activation function $g(x) = \max(0, x)$ used to introduce non-linearity between layers.
• Average pooling (AP). This is applied to reduce the dimensions of the convolution-filtered images with the purpose of obtaining more efficient and robust characteristics. The model uses a 4 × 4 pooling filter after the last convolution block and before the fully connected layers, taking the averaged value of the pixels in the neighbourhood as the result for a given point.
The footprint images of 150 × 5 are reshaped into 28 × 28, adding padding to match the dimensions. The reshaped images are first subjected to a first set of operations (Conv2D, BN and ReLU) to increase the number of features before using the deep residual connections. At this point, the results of this first operation set are submitted to four sequential layers, each of which contains two blocks with internal skip connections. All the mentioned layers end with an average pooling AP which averages the resulting values. Lastly, the set of features resulting from the previous operations feeds a fully connected neural network with 128 neurons directly linked to either the class label (no race-tracking, Type 1, Type 2, Type 3, Type 4) or the four-tuple ($\hat{l}_1$, $\hat{l}_2$, $\beta$, $\hat{\gamma}$) regression parameters. Each of the blocks in Fig. 5 contains the aforementioned ResNet skip connections. The input skips the block layers and feeds the layer prior to the entrance to the activation function to alleviate the vanishing gradient problem [37]. By adding the input image $x$ to the block-transformed value obtained with the Conv2D+BN+ReLU+Conv2D+BN operations, namely $h(x)$, the composed response $H(x) = h(x) + x$ is less sensitive to the vanishing gradient, as the derivative of $H(x)$ with respect to $x$ remains close to unity. Thus, given that the deep network with skip connections learns close-to-identity layers, it produces an optimum configuration.
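The ingredients described above can be assembled into a small PyTorch sketch. It is deliberately much smaller than the 17-convolution network of Fig. 5 and is not the authors' implementation: the class names, the channel count, the number of blocks and the way the zero padding is appended before reshaping are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def to_square(footprints):
    """Flatten (batch, 150, 5) footprints, zero-pad 750 -> 784 values and
    reshape to single-channel images of shape (batch, 1, 28, 28)."""
    flat = footprints.reshape(footprints.shape[0], -1)
    flat = F.pad(flat, (0, 28 * 28 - flat.shape[1]))
    return flat.reshape(-1, 1, 28, 28)

class ResidualBlock(nn.Module):
    """Conv2D+BN+ReLU+Conv2D+BN with an identity skip: H(x) = h(x) + x."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        h = self.bn2(self.conv2(F.relu(self.bn1(self.conv1(x)))))
        return F.relu(h + x)  # the input is added before the activation

class TinyResNet(nn.Module):
    def __init__(self, n_out, channels=16):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(1, channels, 3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU())
        self.blocks = nn.Sequential(ResidualBlock(channels), ResidualBlock(channels))
        self.pool = nn.AvgPool2d(4)  # 28 x 28 -> 7 x 7
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(channels * 7 * 7, 128), nn.ReLU(),
            nn.Linear(128, n_out))

    def forward(self, footprints):
        x = self.stem(to_square(footprints))
        return self.head(self.pool(self.blocks(x)))

classifier = TinyResNet(n_out=5)  # five race-tracking labels
regressor = TinyResNet(n_out=4)   # four regression parameters
batch = torch.rand(8, 150, 5)     # random stand-in footprints
logits, params = classifier(batch), regressor(batch)
print(logits.shape, params.shape)
```

The same backbone serves both tasks; only the size of the last linear layer and the loss function change between the classifier and the regressors.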
The race-tracking classifier and each of the regressors used were trained individually on an NVIDIA RTX 3090 24 GB GPU system. Minimisation was carried out with the stochastic gradient descent (SGD) algorithm by using the cross-entropy (CE) and mean squared error (MSE) loss functions for the optimisation of the classifier and the regressors, respectively. The first question to address is the number of simulations necessary to ensure model accuracy and generalisation. To this end, a data set containing 4000 simulations was generated following the proposed approach and was then randomly divided into training and test sets following a ratio of 75/25. The training set is presented to the model to learn patterns while the test set is used to evaluate it with never-seen data. Subsets of 100, 500, 1000, 2000 and 3000 simulations were randomly extracted from the training set and validated over the same test set to determine the model variances. The model is said to generalise when the losses corresponding to the training and test data sets are close together within a numerical scatter band. Each subset was trained until asymptotic behaviour of the CE and MSE losses was obtained, which typically occurred after 400 epochs. The CE and MSE obtained for each of the subsets in the last 20 training epochs were recorded, with the mean values and variances being presented in the form of box-and-whisker plots in Fig. 6a, b. The CE and its variance diminished strongly when the data set contained more than 500 simulations, indicating that the information embedded in the pressure footprints was enough for classification purposes. However, in the case of the regression variables, the trends were slightly different. Figure 6b shows the MSE corresponding to the race-tracking Type 1 for the different subset sizes. 
Once again, the model was trained to a highly effective degree, with a very small MSE even when the data set used involved 100 or 500 simulations. However, from the results obtained, it could be argued that generalisation occurs only with data sets of at least 2000 simulations. Therefore, the analyses presented in the following sections are obtained with such a data set size. Still, the data set can be reduced even to 500 per condition without significantly impacting the error. A compromise should be found between the error obtained in the regression with the synthetic data and the experimental scatter.
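The data partition used for this convergence study can be sketched with the standard library alone. The function name `split_and_subsets` and the seed are assumptions; the sizes are those quoted in the text.

```python
import random

def split_and_subsets(n_total=4000, train_frac=0.75,
                      subset_sizes=(100, 500, 1000, 2000, 3000), seed=0):
    """Random 75/25 split of simulation indices plus training subsets.

    Every subset is drawn from the training indices only, so all of them
    can be evaluated against the same untouched test set.
    """
    rng = random.Random(seed)
    idx = list(range(n_total))
    rng.shuffle(idx)
    n_train = int(train_frac * n_total)
    train, test = idx[:n_train], idx[n_train:]
    subsets = {m: rng.sample(train, m) for m in subset_sizes}
    return train, test, subsets

train, test, subsets = split_and_subsets()
print(len(train), len(test), sorted(subsets))
```

Keeping the test indices fixed across all subset sizes is what makes the losses in the convergence study comparable with each other.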

Synthetic Data Machine Learning Accuracy
Although not presented here, the ResNet confusion matrix for the race-tracking types is almost a perfect identity matrix. Such a confusion matrix compares the labels predicted by the classifier with the ground-truth labels. From the ML model perspective, the classification of race-tracking produced no misleading results: the accuracy was close to 100%, with no significant numbers of false positives or false negatives. It should be remarked that this level of accuracy is attained using only the information given by five sensors equally distributed over the mould surface.
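For reference, a confusion matrix of the kind described above can be assembled in a few lines. The sketch below assumes integer class labels (0 for no race-tracking, 1 to 4 for Types 1 to 4); this encoding and the function names are illustrative assumptions, not taken from the original work.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes=5):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (np.asarray(true_labels), np.asarray(predicted_labels)), 1)
    return cm

def accuracy(cm):
    """Fraction of samples on the diagonal (correctly labelled)."""
    return np.trace(cm) / cm.sum()
```

A classifier without labelling errors yields a purely diagonal matrix and an accuracy of exactly 1; any off-diagonal entry identifies which pair of race-tracking classes is being confused.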
The comparisons between the ground truth and the predicted values of the race-tracking regression variables are gathered in Fig. 7a to c. For the sake of simplicity, only data corresponding to race-tracking Type 1 are presented. Training (in red) and test (in green) data sets were included in the comparison. It should be noted that the test set corresponded to never-seen data which was not used for model training. The results for $\hat{l}_2$ were not included in the figure because of their similarity to those obtained for $\hat{l}_1$. Again, accuracy plots for the remaining race-tracking cases were omitted for simplicity. The relative regression deviations $(X_{\mathrm{pred}} - X_{\mathrm{truth}})/X_{\max}$, computed by using only the test data set (green dots), were quantified by means of their probability density functions, and the results are plotted in Fig. 7d. The distributions followed the expected trend, centred on the truth values $X_{\mathrm{truth}}$ with a slight skew. The distributions of $\hat{l}_1$, $\hat{l}_2$ and the pressure build-up parameter were narrower than that of the race-tracking strength. Such distributions were used to construct the 5-95% confidence offsets to the one-to-one slope in Fig. 7a to c. Maximum deviations of ≈ 3.6%, ≈ 14.2% and ≈ 1.48% were found for $\hat{l}_1$, the race-tracking strength and the pressure build-up parameter, respectively, for this confidence interval. Thus, it can be assumed that the accuracy level meets the requirements, at least for the synthetic data, and that the classifier and the set of regressors could be used to ascertain the race-tracking classes and regression parameters. However, the experimental accuracy should first be evaluated.
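The relative deviations and the 5-95% offsets discussed above can be computed as sketched below. The use of empirical percentiles is an assumption made here for illustration; the original work may have derived the offsets from fitted probability density functions instead.

```python
import numpy as np

def relative_deviation(x_pred, x_truth, x_max):
    """Relative regression deviation (X_pred - X_truth) / X_max."""
    return (np.asarray(x_pred, dtype=float)
            - np.asarray(x_truth, dtype=float)) / x_max

def confidence_offsets(deviations, lower=5.0, upper=95.0):
    """Empirical lower/upper percentile offsets to the one-to-one line."""
    lo, hi = np.percentile(deviations, [lower, upper])
    return lo, hi
```

The two returned offsets, multiplied by the maximum value of the variable, give the band drawn around the one-to-one slope in a truth-versus-prediction plot.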

Experimental Data Machine Learning Accuracy
The experimental values of the pressure sensors for different race-tracking cases were presented to the trained ResNet ML model to ascertain the accuracy level obtained. To this end, the experimental pressure and time were first projected to the non-dimensional space by dividing them by the corresponding values of $p_{\max}$ and $t_{ff}^{0}$, respectively. The theoretical filling time in the absence of race-tracking was computed for each individual test by using the measured permeability $K = 4.6515 \cdot 10^{-10}$ m$^2$ and the corresponding test viscosity. Such experimental footprints were also resampled to share the same pixel dimensions, 150 × 5, as those used for the ResNet training set. The grey level of each pixel corresponds to the value of the non-dimensional pressure $p(t)$ at the corresponding time.
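A minimal sketch of how a recorded pressure history might be converted into such a 150 × 5 footprint is given below. Several details are assumptions made for illustration only: the non-dimensional time axis is taken to span [0, 1], resampling uses linear interpolation, and grey levels are clipped to [0, 1]. The function and argument names are hypothetical.

```python
import numpy as np

def pressure_footprint(time, pressure, p_max, t_fill, n_pixels=150):
    """Build a grey-level footprint from sensor pressure histories.

    time     : (n_samples,) increasing acquisition times
    pressure : (n_samples, n_sensors) recorded pressures
    p_max    : reference pressure used for normalisation
    t_fill   : theoretical filling time without race-tracking
    Returns an (n_pixels, n_sensors) array with values in [0, 1].
    """
    t_hat = np.asarray(time, dtype=float) / t_fill
    p_hat = np.clip(np.asarray(pressure, dtype=float) / p_max, 0.0, 1.0)
    grid = np.linspace(0.0, 1.0, n_pixels)  # assumed non-dimensional window
    columns = [np.interp(grid, t_hat, p_hat[:, s])
               for s in range(p_hat.shape[1])]
    return np.column_stack(columns)
```

Each column of the returned array is the history of one sensor, so an experimental footprint built this way has the same layout as the synthetic images, provided both use the same time window.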
Experimental footprints were first presented to the trained ResNet classifier. Again, the operation resulted in no labelling errors for the experimental data set composed of the no race-tracking, Type 1, Type 2, Type 3 and Type 4 tests. The features extracted by the ResNet ML model from the pressure sensor footprints were therefore adequate for classification purposes. This latter result on the classification problem deserves additional comment. When training the ResNet model with synthetic data, the accuracy level obtained in the classification tests was very close to 100%. It should be noted that similar results can be obtained when other, less sophisticated models, such as decision trees, are used. However, those models fail when trying to predict the class corresponding to the experimental results. In addition, regressing the position, size and strength of the race-tracking in an actual situation is a challenging task requiring modern deep learning techniques such as ResNet.
After classification, each regressor was applied to the experimental results, yielding the corresponding non-dimensional four-tuple: the lengths $\hat{l}_1$ and $\hat{l}_2$, the race-tracking strength and the pressure build-up parameter. The results were back-projected to the experimental testing space with the appropriate fabric dimensions $L$ and filling time $t_{ff}^{0}$. The numerical results obtained are gathered in Fig. 8, including the ResNet predictions and the true experimental values of the race-tracking variables. Clearly, the accuracy in the regression parameters was modest compared with the classification problem, but lay within the expected range. The values of $l_1$ and $l_2$ were compared with the imposed race-tracking lengths in Fig. 8a and b. The ResNet prediction for the parameter of the exponential pressure build-up at the fabric entrance was compared with that obtained by direct fitting of the $P_0$ sensor pressure signal, Fig. 8d. Unfortunately, the race-tracking strength, defined as the ratio $K_{RT}/K$, cannot be compared with any imposed value in the test. Only crude comparisons can be made by using the theoretical equivalent race-tracking permeability of a rectangular duct of width $w$ and height $h$ under Stokes flow. For instance, by assuming a channel depth of $h = 3$ mm and channel widths ranging over $w = 1 - 10$ mm, the race-tracking strength ranges between 140 and 1300, values similar to those found in the regression of this variable. Such a reference value is presented in Fig. 8c, but only for comparison purposes.
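The order of magnitude quoted above can be checked with the classical series solution for fully developed laminar flow in a rectangular duct. It is an assumption here that this is the relation used for the equivalent race-tracking permeability; the sketch below simply evaluates it for the stated channel depth and widths, with the measured fabric permeability as reference.

```python
import math

K_FABRIC = 4.6515e-10  # measured fabric permeability in m^2 (from the text)

def duct_permeability(width, height, n_terms=50):
    """Equivalent permeability (m^2) of a rectangular duct.

    Classical series solution for laminar flow:
    K = h^2/12 * [1 - 192 h / (pi^5 w) * sum_{n odd} tanh(n pi w / 2h) / n^5]
    """
    series = sum(math.tanh(n * math.pi * width / (2.0 * height)) / n**5
                 for n in range(1, 2 * n_terms, 2))
    return height**2 / 12.0 * (
        1.0 - 192.0 * height / (math.pi**5 * width) * series)

def race_tracking_strength(width, height, k_fabric=K_FABRIC):
    """Ratio of the duct permeability to the fabric permeability."""
    return duct_permeability(width, height) / k_fabric

strength_narrow = race_tracking_strength(1.0e-3, 3.0e-3)  # w = 1 mm
strength_wide = race_tracking_strength(10.0e-3, 3.0e-3)   # w = 10 mm
```

Under these assumptions the two limiting widths give strengths of roughly 140 and 1300, consistent with the range quoted in the text.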
In summary, the level of accuracy found was reasonable for the regression of the variables, given that they were predicted using only the information from five pressure sensors distributed on the mould surface. Of course, it could be argued that better estimates could be obtained by installing sensors at specific positions in order to gain sensitivity. After the regressions, OpenFoam simulations were run again using the predicted parameters presented in Fig. 8 as inputs.
The numerical values of the evolution of the sensor pressure signals are plotted in Fig. 9 and compared with the corresponding experimental results. The shape of the curves was accurately captured, and the agreement includes relevant features such as the shape of the pressure build-up and the fluid arrival time. The degree of agreement in the case of race-tracking Type 4 was modest, and the main reason was attributed to the poor prediction of the pressure build-up at sensor $P_0$. It should be remarked that this sensor signal was assumed to follow a single-parameter exponential function (one minus a decaying exponential of the non-dimensional time, governed by the regressed build-up parameter) which, in this case, deviates from the experimental values, as already noted in Fig. 8d. Better predictions could potentially be obtained by including additional parameters for the description of the pressure build-up at the fabric entrance, but this analysis is considered beyond the scope of the present work and will be addressed in future research. Direct comparisons of the numerical and experimental snapshots of the flow front evolution at different times are presented in Fig. 10a to d. The numerical results were overlaid in a transparency mode so that the direct comparison with the experiments could be carried out easily. The experimental flow front position is highlighted with a white dotted line in all the snapshots. The agreement between the simulated and experimental flow front evolution was notable in shape and in arrival time at specific positions, bearing in mind the discrete information recorded by the sensors.
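The direct fitting of the single-parameter exponential build-up to the inlet sensor signal can be sketched as follows. The linearised least-squares approach, the saturation threshold and the name `rate` for the build-up parameter are assumptions introduced here for illustration; the original work does not specify the fitting procedure.

```python
import numpy as np

def fit_buildup_rate(t_hat, p_hat, saturation=0.99):
    """Fit p(t) = 1 - exp(-rate * t) to a non-dimensional pressure signal.

    Uses the linearisation -log(1 - p) = rate * t and a least-squares
    line through the origin. Samples at t = 0 or close to saturation
    are discarded to keep the logarithm well conditioned.
    """
    t = np.asarray(t_hat, dtype=float)
    p = np.asarray(p_hat, dtype=float)
    mask = (t > 0.0) & (p < saturation)
    y = -np.log1p(-p[mask])
    return float(np.sum(t[mask] * y) / np.sum(t[mask] ** 2))
```

A signal that departs from the exponential shape, as reported for the Type 4 test, shows up as a systematic residual between the measured curve and `1 - exp(-rate * t)`, which is one way to judge whether additional build-up parameters would be needed.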

Conclusions
A new, simple and robust methodology based on ML algorithms for race-tracking and non-homogeneous flow detection is presented in this work. The presence of race-tracking channels that may induce dry spots and voids is a common processing disturbance that occurs during liquid moulding of structural composites. The presence and strength of race-tracking channels were detected from the signals recorded by a discrete network of five sensors distributed over the mould surface. The readings obtained from the pressure sensors were used as inputs of the ML model, delivering the race-tracking class among a set of possible scenarios (no race-tracking, Type 1, Type 2, Type 3, Type 4) and regression variables for geometry (length and position) and strength. In this work, pressure sensors were preferred over standard flow sensors because of the superior information captured by the former. Flow sensors only record the time at which the resin passes through a given position. Pressure sensors provide, in addition, information on the resistance to flow in terms of pressure build-up.
It is often assumed that ML models require a significant amount of data for training purposes. This stringent limitation is reduced if physics-informed ML is used instead [6]. In the approach offered by this paper, it was decided to replace experiments by numerical simulations obtained through the extensive use of computational fluid dynamics.
The CFD forward model, based on Darcy's flow through a porous medium, permits the complete generation of a set of pressure footprints from the type of race-tracking event and its size, position and strength. However, the engineering challenge involves the approximation of the inverse problem, relating the distortion of the pressure footprints to its cause or manufacturing disturbance. This was addressed by using an ML model based on the Residual Network (ResNet), which is one of the current outstanding convolutional network architectures used for image analysis and recognition. By applying a sequence of residual blocks operating on the input footprint images, our model was able to learn how to classify race-tracking problems among a set of different possible scenarios and to provide insights into the extension and intensity of the disturbance event. The ML model was trained with 2000 synthetic simulations obtained through OpenFoam.
The accuracy of the ML model was checked with both synthetic and experimental data. The correlation against synthetic unseen data not used during model training (known as the test data set) was excellent in both classification and regression tasks. The confusion matrix obtained with the OpenFoam synthetic data was almost diagonal, and the error of the regression parameters was below ≈ 4% for all the variables with the exception of the race-tracking strength. In this latter case, the maximum error from the test data set was ≈ 15%.
The ResNet predictions were also good when compared with experimental results. In this case, the experimental pressure footprints were used as inputs of the ML model, yielding the race-tracking class and the regression parameters. The classification problem was again solved without labelling errors, although care must be taken because of the limited number of experimental tests available. The regression accuracy when compared with experimental results was modest, and errors of ≈ 40% in the prediction of the race-tracking lengths and strength were obtained in some cases. A number of factors may influence such a loss of accuracy; among them, the most relevant ones are the experimental scatter associated with stochastic fabric permeability and model inadequacy.
As mentioned previously, the primary purpose of the paper was to demonstrate the possibility of training a model with synthetic data generated with standard computational fluid dynamics software, such as OpenFoam, and using it to detect critical disturbances occurring during the liquid moulding manufacturing of structural composites, such as a race-tracking situation. Thus, the challenge was to prove whether modern machine learning models, trained exclusively with synthetic data, can generalise and predict race-tracking conditions in a real experiment.

Perspectives and Improvements
In order to overcome some of the limitations encountered in the present model, the following improvements can be introduced in the future:
• Data set hybridisation. In the current situation, data sets are built exclusively by using CFD simulations. All the prediction capabilities of the model rely on accurate modelling of the physics involved in the process. However, it will be possible in the future to include experimental results by developing hybrid data sets containing experimental and simulation data. Experimental data will naturally incorporate into the ML model the inadequacies and discrepancies that arise when the current physics of impregnation does not explain all the experimental facts.
• Optimisation of sensor positions to improve the resolution of the technique. The pressure sensors were distributed homogeneously on the mould surface in this work. Considering that the effect of the race-tracking channels depends on the distance between the sensor and the disturbances, it is clear that sensor positions could be improved. The paper introduces a general methodology for the analysis that can be wrapped into optimisation algorithms that would automatically provide a solution.
• The ResNet structure used in this work was trained and experimentally validated with only four different race-tracking scenarios, which can be understood as a small set of an infinite number of possibilities. Although this is true, ML models can be trained incrementally, so additional data sets can be included to increase the predictive capability of the model. Such different scenarios may include other race-tracking channel events, the generation of regions with different permeability caused by local variations in volume fraction, deformations induced in the textile preform or uneven mould clamping pressure.
• Without loss of generality, the present model is able to provide useful information about possible processing disturbances occurring during liquid moulding of composites by learning from pressure sensor signals. Pressure signals are analysed immediately after manufacturing, providing an instant classification and regression of the race-tracking problem, which can be considered a diagnosis procedure. However, the final intention is the continuous interaction with the injection system in order to apply the necessary on-the-fly corrective actions to alleviate the defects induced by the presence of processing disturbances.
• Trustworthy ML. Industrial applications require the use of trusted ML. It is necessary that the ML system not only delivers an accurate and robust response in terms of classification and regression, but also that users can rely on how the algorithm decisions are taken. In this respect, physics-informed ML constrains responses by aligning them with well-known physical laws [6].