Artificial neural network tools for predicting the functional response of ultrafast laser textured/structured surfaces

Artificial Neural Networks (ANNs) are well-established knowledge acquisition systems with proven capacity for learning and generalisation. Therefore, ANNs are widely applied to solve engineering problems and are often used in laser-based manufacturing applications. There are different pattern recognition and control problems where ANNs can be effectively applied, and one of them is laser structuring/texturing for surface functionalisation, e.g. in generating Laser-Induced Periodic Surface Structures (LIPSS). They are a particular type of sub-micron structures that are very sensitive to changes in laser processing conditions due to processing disturbances like varying Focal Offset Distance (FOD) and/or Beam Incident Angle (BIA) during the laser processing of 3D surfaces. As a result, the functional response of LIPSS-treated surfaces might be affected, too, and typically needs to be analysed with time-consuming experimental tests. Also, there is a lack of sufficient process monitoring and quality control tools available for LIPSS-treated surfaces that could identify processing patterns and interdependences. These tools are needed to determine whether the LIPSS generation process is in control and consequently whether the surface’s functional performance is still retained. In this research, an ANN-based approach is proposed for predicting the functional response of ultrafast laser structured/textured surfaces. It was demonstrated that the processing disturbances affecting the LIPSS treatments can be classified, and then, the surface response, namely wettability, of processed surfaces can be predicted with a very high accuracy using the developed ANN tools for pre- and post-processing of LIPSS topography data, i.e. their areal surface roughness parameters. A Generative Adversarial Network (GAN) was applied as a pre-processing tool to significantly reduce the number of required experimental data. 
The number of areal surface roughness parameters needed to fully characterise the functional response of a surface was minimised using a combination of feature selection methods. Based on statistical analysis and evolutionary optimisation, these methods narrowed down the initial set of 21 elements to a group of 10 and 6 elements, according to redundancy and relevance criteria, respectively. The validation of ANN tools, using the salient surface parameters, yielded accuracy close to 85% when applied for identification of processing disturbances, while the wettability was predicted within an r.m.s. error of 11 degrees, equivalent to the static water contact angle (CA) measurement uncertainty.


Introduction
Artificial Neural Networks (ANNs) are popular and well-established learning systems that employ the principles of biological nervous systems. They are typically composed of several layers of simple nonlinear processing units called neurons. The first layer buffers the input data, after which the signal is processed by a variable number of interconnected hidden layers. Lastly, an output layer provides the ANN's response [1]. Given ANNs' ability to approximate any given function, they are a proven tool with applications to a wide range of industrial problems such as functional prediction or system modelling. Thanks to their learning and generalisation capabilities, ANNs are particularly useful in cases where physical processes are unknown or too complex to be described analytically [2]. ANN development and applications are not limited to specific areas: they can be successfully employed not only in engineering and manufacturing but also in finance, medicine and many other fields [3].
In recent years, ANN developments applicable to laser-based manufacturing processes have gained considerable research interest as a novel alternative to physics-based analytical and numerical methods. Most commonly, machine learning algorithms were employed to predict the dimensions of laser-ablated profiles [4][5][6], along with forecasting surface quality and material removal rates based on the input of the key laser processing parameters [7,8]. ANNs were also used to identify the optimum laser pulse energy needed to obtain the desired crater depth and diameter for different materials [9]. Furthermore, ANNs were effectively applied to monitor and control laser processes, and to identify defects by non-destructive detection methods. This was achieved by building a system that identifies defects based on significant measurement data extracted by image processing alone [10]. Other methods focused on the analysis of acoustic emissions from the laser-induced plasma [11] or on in situ speckle pattern observations [12]. Across these tasks, where the input/output dataset pairs differed significantly, trained neural networks were able to achieve very high prediction accuracy.
The key to obtaining good results when applying ANN tools to manufacturing processes is to select an appropriate ANN topology, learning method and suitable data preparation techniques [2]. In addition, a large amount of experimental data is required to train ANNs for optimal performance. Ideally, they should obtain all the relevant information to successfully carry out the desired task. However, building a system from sufficiently big data sets is time-consuming, problematic and in most cases not viable. A common solution to this issue is to augment the available training data, and such an approach was already successfully applied in simulating complex systems based only on small experimental datasets [13,14]. One of the novel augmentation techniques is Generative Adversarial Networks (GANs). They are composed of two convolutional neural networks and were originally designed to generate artificial images that are indistinguishable from the real ones [15]. GANs were already utilised as a predictive visualisation method in laser machining. Laser-ablated topographies were recreated based on spatial laser intensity profiles [16] or by transforming the key laser parameters into predicted 3D surface profiles [17].
Another area where ANNs can be effectively applied is laser structuring/texturing for surface functionalisation.
A particular type of sub-micron structures, generated by ultrafast lasers, are Laser-Induced Periodic Surface Structures (LIPSS). Low Spatial Frequency LIPSS are especially attractive to researchers due to their vast applicability and the wide range of achievable surface functionalities, e.g. modifying wettability, enhancing cell proliferation or structural colouring, to name a few [18]. The functional response of LIPSS surfaces is mostly dictated by their topological characteristics, i.e. periodicity, amplitudes and regularity of ripples. LIPSS are all sensitive to changes in laser processing conditions, in particular in cases where processing disturbances affect the laser structuring process, e.g. when LIPSS are generated on 3D and freeform surfaces. The most common disturbances are variations in the Beam Incident Angle (BIA) and Focal Offset Distance (FOD). The relationship between disturbances and their influence on LIPSS topographies has been studied, and it was shown that BIA affects their periodicity while FOD mostly influences ripples amplitudes [19,20]. Thus, any variations in processing conditions due to structuring disturbances during the LIPSS generation affect the surface functionality, too. Typically, the surface responses are analysed experimentally to confirm whether the functional performance is still within acceptable limits [21,22]. However, obtaining functional performance data from the laser treated surfaces is often time-consuming, limited to specific processing settings, and requires special instruments and measurement setups. Another issue related to LIPSS generation in the presence of processing disturbances is the lack of adequate process monitoring and quality control tools to maintain the process in control. ANNs can offer promising solutions for condition monitoring during the laser structuring process and consequently indirectly to judge whether the surface's functional performance is still within some predefined limits.
In this research, ANN tools were developed for pre- and post-processing of LIPSS topography data, i.e. their areal surface roughness parameters, for two main tasks. The first is the identification of whether any processing disturbances were present during the laser structuring process. The second is the mapping of the LIPSS topographies to their functional responses, here wettability. For both tasks, a small representative experimental dataset augmented with GAN-generated LIPSS topographies was used to develop and train ANN classifiers, while the validation was performed on a larger unseen dataset. The pre-processing step involved the application of feature selection methods to minimise the number of data attributes based on their relevance and redundancy. The next section outlines the experimental methods used to create representative data sets of LIPSS topographies. These data sets are required to develop and validate the proposed ANN tools. Then, the pre-processing methods (GAN and feature selection) and the ANN structure optimisation tools are described, together with the ANN tools developed for the two tasks. Subsequently, the experimental results of the implementation of the proposed methods are presented and discussed. Finally, conclusions are drawn about the effectiveness of the investigated feature extraction methods and ANN tools, and their applicability to the two classification and prediction tasks associated with the use of LIPSS treatments.

Experimental methods
Laser structuring was performed using an ultrafast Ytterbium-doped laser source (Satsuma from Amplitude Systems) with a near-infrared wavelength (λ) of 1032 nm, pulse duration of 310 fs, and maximum average power and pulse energy of 5 W and 10 µJ, respectively. A linearly polarised Gaussian laser beam was focused with a 100 mm telecentric lens on workpieces to deliver a beam spot size of 40 µm. The laser processing of surfaces was realised by employing a 3D scan head. A motorised rotational stage was employed, and a dynamic focusing module with a working range of ±3 mm from the focal plane was used to control the laser focusing for the samples produced with varied BIA. The LIPSS treatments were performed on 1.5 mm thick, mirror polished, 304 stainless steel plates.
Optimised laser settings and strategy for generating regular and uniform LIPSS, obtained from initial trials, were used, in particular: peak fluence of 0.28 J/cm2, pulse repetition rate of 10 kHz, 40 mm/s scanning speed and 6 µm hatching distance between the pulse trains, which yielded a pulse distance of 4 µm and 6 µm in the x and y direction, respectively. The relatively low scanning speed was chosen due to the limitations of the dynamic focusing module. The schematic representation of the described laser processing strategy is presented in Fig. 1a. The laser processing settings were kept constant, while structuring disturbances were present and controlled as shown in Fig. 1b. Square fields of 8 mm × 8 mm were produced with varying disturbances, i.e. from 0 to +900 µm with an increment of 100 µm for FOD, and separately from 0 to 35 deg with an increment of 2.5 deg for BIA. Each field with a different set of disturbances was produced three times. Additionally, 15 supplementary LIPSS topographies were produced with the same scanning strategy, without disturbances but with varied peak fluence in the range from near-threshold 0.16 J/cm2 to 0.54 J/cm2. The topographies of the LIPSS-treated surfaces were analysed using an Atomic Force Microscope (AFM), a Digital Instruments D3100 with NanoScope controller. In total, 87 scans of 20 µm × 20 µm (256 px × 256 px) fields were analysed, and all necessary topography data were acquired. Then, each surface sample was used to extract 16 reference images (100 px × 100 px) by using an overlapping sliding window every 50 px. Pre-processed images were fed into the Alicona MeasureSuite software to calculate 21 standardised areal surface roughness parameters according to ISO 25178. These parameters, comprising sets of height, spatial, hybrid and functional parameters, are the most commonly used to characterise surfaces and are listed in Table 1.
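The extraction of 16 reference images per scan follows from the window geometry: a 100 px window moved in 50 px steps fits at four positions along each axis of a 256 px scan. The following is a minimal sketch of such a sliding window, assuming the scan is stored as a nested list of height values; the function name `extract_patches` and the placeholder scan content are illustrative and not taken from the original work.

```python
def extract_patches(scan, window=100, stride=50):
    """Return every full window x window patch taken every `stride` pixels."""
    rows, cols = len(scan), len(scan[0])
    patches = []
    for top in range(0, rows - window + 1, stride):
        for left in range(0, cols - window + 1, stride):
            # Slice `window` rows, then `window` columns from each row.
            patch = [row[left:left + window] for row in scan[top:top + window]]
            patches.append(patch)
    return patches

# Hypothetical 256 px x 256 px height map filled with placeholder values.
scan = [[(r * 256 + c) % 65536 for c in range(256)] for r in range(256)]
patches = extract_patches(scan)
print(len(patches))  # 4 window positions per axis, i.e. 16 patches
```

With this geometry the last 6 px along each axis are not covered by any window; whether the original pre-processing handled the border differently is not stated.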
The wettability of laser structured surfaces was analysed with the contact angle (CA) goniometer (OCA 15EC, Data Physics GmbH). The static CA on each laser-processed surface was measured 4 times employing the sessile drop arrangement for optical measurement of CA by using a drop shape analysis. Droplets of de-ionised water were deposited with 1 µl/s speed to form a droplet of 4 µl, and then, they were carefully placed on the laser-processed field. Prior to CA measurement, each test sample was carefully cleaned in an ultrasonic bath, first in acetone and then in 99.8% ethanol solution for 3 min. Next, the analysed surfaces were rinsed with deionised water and dried with compressed air after each bath. The reason for such rigorous sample preparation procedure was the necessity to minimise the effects of varying surface chemistry, and the presence of organic residuals after laser irradiation, which affect the resulting wettability of LIPSS-treated surfaces [23]. All CA tests were repeated more than 6 months after the laser processing while the samples were stored in ambient conditions. The CA of as-received steel substrates was 73.3 ± 10 degrees.
The produced samples were split into a small experimental subset, i.e. Set A, and a much larger validation set, i.e. Set B. Set A consisted of 18 surface samples, where 5 were produced with varying FOD and another 7 with varying BIA. The remaining 6 samples were chosen from the supplementary set produced with optimised laser settings but varying laser fluence. From each sample, 16 topography images were extracted, for a total of 288 created topographies. Set B comprised the remaining 69 surfaces from the conducted 87 AFM scans. Again, from each scan of the validation set, 16 topography images were created, for a total of 1104 LIPSS topographies.

Generative Adversarial Networks for data augmentation
In this research, GAN, as a novel data augmentation technique, was used to generate additional realistic artificial LIPSS topographies based on Set A. The extracted LIPSS topographies were treated as height maps/depth images. The respective AFM data were converted into 16-bit greyscale height maps that contain the coordinates of each point on the surface in a three-dimensional Cartesian system, i.e. the known Z resolution (nm per greyscale value) and pixel size value for X and Y [24]. One GAN was trained separately for each laser structured surface sample, using the 16 extracted height maps as reference images. The schematic representation of a GAN is shown in Fig. 2. The main role of the Generator is to produce artificial images that the Discriminator cannot distinguish from the reference images, and this is the basis for the training procedure. That is, the Generator's aim is to learn to create images of progressively higher similarity to the reference ones. The aim of the Discriminator is to learn to distinguish the reference images from the artificial ones. After completing the training, the Discriminator was discarded and only the Generator was used to create 20 artificial images. The 100 px × 100 px height maps created by the Generator were imported into the Alicona software, and the areal surface roughness parameters were calculated for each of them.
The GAN architecture was determined by trial-and-error during a preliminary process of parameter fine tuning. The detailed learning process is described hereafter. Each artificial image created by the Generator, whose architecture is presented in Table 2, was based on a vector of 100 random scalar values fed as input to the network. Using a sequence of upscaling and convolutional layers, a matrix of 100 × 100 elements (normalised in [-1,1]) was produced. The final image was generated by re-scaling the matrix elements to 16-bit unsigned integers. The Discriminator architecture, described in Table 3, was composed of an alternating stack of convolutional and dropout layers. Both the Generator and Discriminator were trained together using the Adam optimiser, albeit with different learning rates. For each epoch, an equal number of real images (sampled with replacement from the reference images) and artificial images, created by the Generator, were fed to the Discriminator, which was trained against a binary label (i.e. real = 1, fake = 0). The Generator was trained on a complemented value of the Discriminator loss, in a zero-sum fashion. The training parameters are given in Table 4. To improve the early convergence of the Generator, a measure of noise was added to the data used by the Discriminator. Each time a reference (i.e. real) image was fed to the Discriminator, the associated label was randomly flipped (with p = 0.5). This was not performed in the case of the artificial images. This regularisation procedure limited the Discriminator's potential to greatly outperform the Generator in the early stage of the learning process, which would otherwise hinder the Generator's ability to learn to generate good quality images.
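Two of the steps described above lend themselves to a short illustration: the re-scaling of the Generator output from [-1,1] to 16-bit unsigned integers, and the one-sided label noise applied to real images. The sketch below assumes a linear mapping onto [0, 65535] with clipping, which the text does not specify; the function names `to_uint16` and `noisy_real_labels` are hypothetical.

```python
import random

def to_uint16(matrix):
    """Map values from [-1, 1] linearly onto [0, 65535] (assumed mapping)."""
    return [
        [round((min(1.0, max(-1.0, v)) + 1.0) / 2.0 * 65535) for v in row]
        for row in matrix
    ]

def noisy_real_labels(n_real, flip_prob, rng):
    """Labels for real images: 1, flipped to 0 with probability flip_prob."""
    return [0 if rng.random() < flip_prob else 1 for _ in range(n_real)]

rng = random.Random(0)
# Real (reference) images receive noisy labels with the stated p = 0.5.
real_labels = noisy_real_labels(16, 0.5, rng)
# Artificial images always keep the label 0; they are never flipped.
fake_labels = [0] * 16

print(to_uint16([[-1.0, 1.0]]))
```

Only the labels of real images are perturbed, so the Discriminator receives a deliberately weakened training signal while the Generator's own targets are unaffected.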
Overall, the set of GAN-generated images (henceforth Set GAN) consisted of 360 artificial topographies (18 × 20); each was described by 21 areal surface roughness parameters and one CA value. The procedure of assigning the CA to the GAN topographies (as well as to Set A and Set B) was as follows: the mean (μ) and standard deviation (σ) were calculated for the obtained CA values for each surface sample. Then, one CA value was assigned randomly to each topography from a uniform CA distribution within the interval (μ − σ, μ + σ).
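The CA assignment procedure can be sketched as follows. The four CA readings are invented for illustration, the use of the sample (rather than population) standard deviation is an assumption, and `assign_contact_angles` is a hypothetical helper name.

```python
import random
import statistics

def assign_contact_angles(measured_ca, n_topographies, rng):
    """Draw one CA per topography uniformly from (mu - sigma, mu + sigma)."""
    mu = statistics.mean(measured_ca)
    sigma = statistics.stdev(measured_ca)  # sample standard deviation (assumption)
    return [rng.uniform(mu - sigma, mu + sigma) for _ in range(n_topographies)]

# Hypothetical CA readings (degrees) for one surface sample.
measured = [61.0, 64.5, 58.2, 66.3]
rng = random.Random(42)
assigned = assign_contact_angles(measured, 16, rng)
print(min(assigned), max(assigned))
```

Every topography extracted from the same surface thus receives a slightly different target value, which reflects the measurement scatter rather than a single averaged CA.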

Feature selection and ANN structure optimisation
The feature selection analysis, ANN optimisation and validation procedure were run three times in parallel to assess the usefulness of the GAN-generated artificial LIPSS topographies. By using only the small Set A, it was intended to test the feasibility of performing the whole study using a limited amount of experimental data. Then, the quality of the GAN-generated topographies was assessed based on tests run only on the Set GAN. Finally, the benefits of augmenting the available experimental data with the artificially generated ones were evaluated on the merged Sets A and GAN. All three cases were also validated on Set B. Feature selection methods were applied to the datasets to filter out redundant and irrelevant attributes among the ISO areal surface parameters and jointly perform ANN structure optimisation [16]. A parameter/feature is considered relevant when it conveys useful information for a given task, and redundant when it does not add additional information that has not been already provided by other parameters. The purpose of feature selection was to find the smallest number of most related areal surface roughness parameters, without significantly reducing the ANN's accuracy for the two specific tasks. The first task was a classification problem, where the ANN had to be trained to detect either the presence of processing disturbances (FOD or BIA in this research), or the use of optimised laser settings during the structuring process (labelled as class N). Then, the same group of surface parameters was applied to the second task. The second task amounted to a regression problem, where the ANN had to learn the relationship between the identified group of areal surface roughness parameters, and the static water CA of the laser-treated surfaces. It is important to state that the ability to detect alterations in LIPSS topographies due to any processing disturbances might also help to foresee potential variations in the surface performance. 
Therefore, the results from the classification task can indicate potential changes in the surface functional response and can be used to trigger some corrective processing routines to keep it within predefined limits.

Feature redundancy analysis
Data feature (attribute) redundancy was assessed by using the well-known Pearson correlation coefficient [25]. In this study, two data attributes (areal surface parameters) were considered highly correlated and hence redundant if the absolute value of their correlation coefficient, |r_xy|, was higher than 0.8. After the analysis, redundant parameters were removed sequentially, starting with the one that showed significant similarities with the largest number of other parameters. Once this surface parameter had been removed, the one amongst the remaining that had the largest number of significant similarities with the others was eliminated, and so forth until no redundant parameters were left.
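The sequential elimination described above can be sketched as a greedy loop. The code below is an illustration under stated assumptions rather than the implementation used in the study: ties are broken by the order in which the parameters are listed, the parameter values are invented, and features with zero variance are not handled.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def remove_redundant(features, threshold=0.8):
    """features: dict name -> list of values. Returns the retained names."""
    kept = list(features)
    while True:
        # For each retained feature, count partners with |r| above the threshold.
        counts = {
            a: sum(
                1 for b in kept
                if b != a and abs(pearson(features[a], features[b])) > threshold
            )
            for a in kept
        }
        worst = max(kept, key=lambda name: counts[name])
        if counts[worst] == 0:
            return kept  # no redundant pairs remain
        kept.remove(worst)

# Invented values for four ISO 25178 parameters (illustration only).
features = {
    "Sa": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Sq": [1.1, 2.0, 3.2, 3.9, 5.1],
    "Sz": [2.0, 4.0, 6.0, 8.0, 10.5],
    "Ssk": [0.3, -0.2, 0.5, -0.4, 0.1],
}
retained = remove_redundant(features)
print(retained)
```

In this toy example Sa, Sq and Sz are mutually correlated, so two of them are removed and one representative is retained alongside the uncorrelated Ssk.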

Feature relevance analysis and ANN structure optimisation
Feature relevance is usually assessed by some measure of correlation between the feature and the target variable. The analysis of relevance is complicated by the fact that analysing one feature on its own, as done in univariate feature selection approaches [26], may lead to the removal of elements that are not significantly correlated with the target variable, but that might become highly informative in combination with other features. For this reason, a multivariate method based on the evolutionary ANN Evolver (ANNE) algorithm was used. ANNE is specially designed for the optimisation of ANN classifiers [27,28] and can be regarded as an embedded feature selection method that simultaneously performs feature selection, ANN structure optimisation and weight training [29]. ANN optimisation and feature selection were carried out for the processing disturbances classification task, and then, the results were re-used for the wettability prediction task. A Multi-Layer Perceptron (MLP) ANN [30] was used as classifier in the first task, and predictor in the second. Preliminary tests revealed that one hidden layer of units was enough to attain a very high accuracy.
ANNE was run using the group of surface parameters obtained after redundancy analysis and thus was employed only for relevance-based feature selection. The feature relevance selection and ANN optimisation procedure consisted of two stages as shown in Fig. 3. In the first stage, ANNE was used to optimise the MLP structure, that is, to define the size of its hidden layer and to evolve minimal sets of relevant areal parameters. In the second stage, the MLP structure was set to the optimal configuration evolved by ANNE. Exploiting the feature selection results from ANNE, a number of candidate groups of surface parameters were formed, and their suitability was evaluated on the MLP ability to learn the classification task. The MLP was trained using the standard back-propagation (BP) procedure [31]. The main parameters of the MLP, and the learning parameters of the ANNE and BP algorithms were experimentally optimised and are listed in Table 5. The remaining parameters were set as in [27].
A final tuning step was performed to adjust the number of iterations required for the BP procedure because of the different nature of the final learning task. The learning curves were analysed, and the duration of the learning procedure was set in order to avoid overfitting. Training data overfitting occurred in both classification and regression tasks. Hence, the learning procedure had to be restricted to, respectively, 100 and 200 iterations, as stated in Table 5.
Following a common practice, a pre-processing step was performed where the areal surface roughness data were normalised using the mean-variance procedure. Due to the stochastic variability of the learning procedure, 10 independent runs of the ANNE algorithm were performed for each experiment, and the results were statistically analysed. For each learning trial, the data set (Set A, Set GAN, or Set A + GAN) was randomly divided into a training set containing 80% of the samples, and a validation set containing the remaining 20%. For the BP algorithm, 100 independent runs of the procedure were performed for each experiment. The reason for the different number of repetitions is the computational cost associated with the two algorithms, namely about 16 min for ANNE and 3 s for the BP algorithm.
Fig. 3 Steps of feature relevance analysis split into two stages. In the first stage (blue lines), the ANNE procedure was used to optimise the MLP structure and generate candidate groups of surface parameters. In the second stage (red lines), the parameter groups were evaluated on the learning results of the MLP (using BP training) and a final minimal group of relevant areal surface parameters was generated.
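The normalisation and the random 80/20 division of each data set can be sketched as follows. The mean-variance procedure is interpreted here as the usual z-score standardisation (subtract the mean, divide by the standard deviation), which is an assumption; `standardise` and `split_80_20` are hypothetical helper names.

```python
import random
import statistics

def standardise(column):
    """Zero-mean, unit-variance scaling of one feature column."""
    mu = statistics.mean(column)
    sigma = statistics.pstdev(column)
    return [(v - mu) / sigma for v in column]

def split_80_20(samples, rng):
    """Randomly divide samples into 80% training and 20% validation."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = round(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Set A contains 288 topographies; indices stand in for the samples here.
train, val = split_80_20(list(range(288)), random.Random(1))
print(len(train), len(val))  # 230 58

z = standardise([1.0, 2.0, 3.0, 4.0])
```

A new random split per learning trial, as described in the text, corresponds to calling `split_80_20` with a freshly seeded generator for each run.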

Datasets
Examples of LIPSS topographies from Set A and artificially generated topographies from Set GAN, with and without the presence of processing disturbances, are presented in Fig. 4 together with the respective average CA values. The influence of FOD and BIA on the LIPSS characteristics is clearly visible. The FOD increase entailed a decrease in LIPSS amplitudes that eventually led to spots where LIPSS were no longer generated, e.g. when FOD = 0.8 mm. With regard to the influence of BIA, two types of ripple periodicities were present on the surface, which is typical for LIPSS generated with a p-type polarised beam that is not normal to the surface [33]. Samples produced with lower BIA resulted in a dominant periodicity above the one achieved with optimised laser settings, while higher BIAs led to only smaller periods. Generally, the LIPSS topographies selected for the representative Set A, and consequently the ones generated by the GAN, had widely varied dimensional characteristics, which led to diverse areal surface parameter values and altered their wettability. All of the laser-treated surfaces showed hydrophilic behaviour and the obtained CA values ranged from 26 to 80 deg, with a measurement uncertainty of approximately 10 deg. In Table 6, the number of input topographies in each Set, the distributions of the classes and the range of output CA values are summarised.

Feature redundancy analysis
The correlation analysis revealed that several features, i.e. areal surface roughness parameters, were redundant in all Sets, i.e. Set A, Set GAN and the largest Set B. Set B was analysed only for reference purposes, since it was reserved for validation. The redundancy analysis performed on the small Set A differed from that conducted on Set B: out of 210 pairwise feature redundancy checks, 23 (11%) were different. Overall, despite some discrepancies, the analysis performed on Set A was in good agreement with the one performed on Set B. Thus, it can be judged that Set A is a representative example of the larger population of Set B. The analysis performed on Set GAN also differed from that of Set B. It should be noted that Set GAN was created using the samples of Set A, and thus 'inherited' the inaccuracies of the latter. Out of 210 pairwise feature redundancy results, the analyses on Set A and Set GAN differed in 30 cases (14%). The results show a satisfactory agreement between the two sets, indicating that the GAN technique of …
Table 7 shows the results of the elimination procedure for the three data sets. Redundancy elimination gave the same results for Set A and Set A + GAN, leading to a reduction in their attributes, i.e. ISO parameters, from 21 to only 10. On Set GAN, redundancy elimination reduced the set attributes to 9, where 6 of them are shared with the other two sets.
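The figure of 210 pairwise checks corresponds to the number of unordered pairs among the 21 ISO parameters, and the quoted percentages follow directly from it. The short sketch below reproduces this arithmetic; the helper `disagreement`, which compares two collections of pairs flagged as redundant, is a hypothetical illustration and not part of the original analysis.

```python
from math import comb

def disagreement(pairs_a, pairs_b, n_features=21):
    """Count feature pairs flagged as redundant in exactly one of two sets."""
    total = comb(n_features, 2)
    differing = len(set(pairs_a) ^ set(pairs_b))
    return differing, total, 100.0 * differing / total

total_pairs = comb(21, 2)
print(total_pairs)                    # 210
print(round(100 * 23 / total_pairs))  # 11 (Set A versus Set B)
print(round(100 * 30 / total_pairs))  # 14 (Set A versus Set GAN)
```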

Feature relevance analysis based on ANNE algorithm
The averages of the feature selection and structure optimisation results, and the classification accuracies obtained for the validation set (20% of examples of the data set in consideration) are reported in Table 8. For reference, the results obtained using the full set of the ISO parameters are also included in the table. The frequency of each data attribute that was selected in the 10 runs of ANNE is shown in Table 9 for the three data sets. The results, presented in Table 8, obtained using the three data sets indicated that some ISO parameters might be further discarded due to being less relevant. However, it is important to note that the actual ISO parameters selected differed from set to set. Nevertheless, there was considerable agreement in the size of the surface parameters group using the full 21 ISO parameters and the reduced group after redundancy analysis. In general, when all surface parameters were considered, ANNE tended to select slightly more relevant attributes. In terms of the selected ISO parameters and their selection frequency, the results obtained considering all or a reduced group of parameters, as shown in Table 9, cannot be compared. This is due to the fact that redundant attributes are equally likely to be selected, and the selection frequency is not necessarily an indication of their relevance.
Table 6 Summary of LIPSS topographies within Sets A, GAN and B that were used to classify laser processing disturbances (the first task). Class N refers to samples produced without processing disturbances but with varying peak fluence. The output values of minimum and maximum CAs are also provided for the regression task, i.e. the wettability prediction
Table 7 Redundancy analysis of the feature selection procedure. Retained ISO parameters (attributes) of the three sets are depicted with ''
Regarding the classification accuracy, the most evident result is the poor accuracies on Set A attained by ANNE.
The analysis of the learning curves did not indicate significant overfitting of the training data. The most plausible explanation is the small size of Set A, which affected ANNE's ability to evolve to high performing solutions. For Sets GAN and A + GAN, the results suggest that MLP could be trained to identify the processing disturbances with high accuracy. There was no distinguishable difference in the accuracy between the results obtained using all or only a smaller subset of attributes, i.e. ISO parameters.

Evaluation of candidate surface parameters subsets
Based on the results obtained using ANNE, the MLP structure was fixed to one hidden layer of 5 nodes even though it was slightly larger than proposed in Table 8. The reason for this was that the smaller Set A alone might under-represent the complexity of LIPSS topographies. Using the results in Table 9, a number of candidate ISO parameter groups were created for each of the three sets as shown in Table 10. These candidate groups were based on the selection frequencies, starting with a minimal subset of most frequently selected ISO parameters and successively adding more attributes. These candidate sets were evaluated on the learning results of the MLP after it was trained using BP and with only the selected ISO parameters. The results are shown for each ISO parameters' group and data set in Table 11. For the sake of comparison, the results include those obtained using the full set and the set generated after redundancy analysis.
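The construction of candidate groups from the selection frequencies can be sketched as a simple ranking: parameters are ordered by how often ANNE selected them, and groups of increasing size are taken from the top of the ranking. The selection counts below are invented, the group sizes are illustrative, and the nesting of the groups is an assumption, since the actual groups in Table 10 may have been composed differently.

```python
def candidate_groups(selection_counts, sizes):
    """Build groups of the most frequently selected parameters, by size."""
    # Rank by descending selection count; ties are broken alphabetically.
    ranked = sorted(selection_counts, key=lambda name: (-selection_counts[name], name))
    return {f"F{k}": ranked[:k] for k in sizes}

# Hypothetical number of ANNE runs (out of 10) in which each parameter was selected.
counts = {"Sa": 9, "Sq": 3, "Sz": 7, "Ssk": 10, "Sku": 5, "Sdq": 8}
groups = candidate_groups(counts, sizes=[2, 4, 6])
print(groups["F2"])  # ['Ssk', 'Sa']
print(groups["F4"])  # ['Ssk', 'Sa', 'Sdq', 'Sz']
```

Each group would then be evaluated by training the MLP with only the listed parameters as inputs, as described in the text.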
In terms of classification accuracy, the results reported in Table 11 are in good agreement with those obtained using ANNE and confirm again that high accuracy results can be obtained with a significantly reduced number of surface parameters. The classification accuracies obtained using Set A are lower than those obtained using the other two data sets, although the differences are significantly smaller than those recorded for ANNE.
Table 8 Feature selection and structure optimisation results (ANN hidden nodes) obtained by the ANNE algorithm for the three sets. A summary of the classification accuracies achieved for Task 1 on the validation set (20% of examples of the data set in consideration) is included in the table, too. The results are calculated over 10 runs of the algorithm. In the table, 'all' refers to the trials run using the full 21 surface parameters, 'reduced' refers to the parameters group obtained after the redundancy analysis. The significance of the differences in the classification accuracies obtained using the full and reduced ISO parameters is analysed using Mann-Whitney tests and the p-values are provided in the …
Table 11 shows that the removal of redundant features had marginal to no effect on the learning accuracy of the classifier for the data sets that included the artificial topographies.
On Set A, the differences are more marked although still moderate. The same effect was observed on the learning results after the elimination of irrelevant ISO parameters. The results in Table 11 suggest that the feature selection affects the classifier performance mostly for Set A. For this set, the most conservative choice would be to use the group of non-redundant ISO parameters F 10 or, if some further reduction in performance is acceptable, the parameter group F 6. If Set GAN is used, the surface parameters can be trimmed down to the six data attributes of F 6 without significantly affecting the performance. If Set A + GAN is utilised, the tests show that the classifier accuracy will suffer only a very modest deterioration (less than 1%) using group F 10 of non-redundant ISO parameters and only a modest one (around 2%) using the F 8 data attributes. These final choices are validated in the last step, where the MLP is tested against the previously unseen Set B. Figure 5 reports the accuracies achieved in identifying the processing disturbances by the classifiers obtained after 100 independent runs of the BP algorithm on Sets A, GAN, and A + GAN, with the validation performed on Set B.
Table 10 Candidate surface parameter groups tested on data Sets A, GAN and A + GAN. Their size is indicated by their group coding in the first column (e.g. F 6 has six ISO parameters). Selected parameters in each group are marked in the table.
Table 11 A summary of the MLP classification accuracies obtained on the validation set (20% of the whole data set) using the parameter groups in Table 10. For each data set, the statistics refer to 100 learning trials using the BP algorithm. For reference, the results of training the MLP using all ISO surface parameters are also given. The significance of the differences in the classification accuracies obtained using all parameters and the candidate attribute sets is analysed using pairwise Mann-Whitney tests and reported as p-values.
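The pairwise Mann-Whitney comparisons referred to in the captions of Tables 8 and 11 can be sketched as follows. This is a simplified, standard-library-only illustration that uses the normal approximation without tie or continuity correction, and it is not necessarily the procedure used in this work. In practice an established routine such as `scipy.stats.mannwhitneyu` would normally be preferred. The accuracy values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U statistic of sample x, approximate p-value). Tied values receive
    average ranks; no tie or continuity correction is applied to the variance.
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, p


# Hypothetical validation accuracies from repeated learning trials.
acc_all = [0.90, 0.91, 0.92, 0.93, 0.94]
acc_group = [0.80, 0.81, 0.82, 0.83, 0.84]
u_stat, p_value = mann_whitney_u(acc_all, acc_group)
print(u_stat, p_value)
```

A small p-value indicates that the two sets of accuracies are unlikely to come from the same distribution. As noted for the results above, such a difference can be statistically significant and still small in practical terms.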
In general, the accuracy results were inferior to those obtained in the feature selection steps (as shown in Table 11). The deterioration of the performance was most dramatic in the learning trials performed using only Set GAN, and least severe when only Set A was used. Set A + GAN achieved only slightly worse accuracy than Set A. It is worth stating that the learning tasks in the feature selection and classification stages were different: the first required generalisation to unseen samples of already introduced surfaces, and the second required generalisation to different samples of previously unseen surfaces. The lower classification accuracies achieved in the latter experiments are likely to reflect the more challenging nature of the task. The artificial LIPSS topographies generated by applying GAN appeared to capture, at least partly, the overall characteristics of Set A. However, MLPs trained on Set GAN were very poor at generalising the learning results when applied to Set B. This result shows that the GAN-generated data were not representative of the full distribution of Set B. Given that MLPs trained using Set A did generalise well, the results seem to indicate that the problem lies within the GAN procedure itself, rather than in a poor quality of the scans that were fed to GAN. One reason for this result may be that the GAN learning process was interrupted too early. In this study, the duration of the GAN learning reflected a trade-off between the computational cost and the visual appearance of the generated topographies. Further tests could investigate whether extending the GAN learning time is worthwhile.
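The difference between the two generalisation tasks discussed above can be illustrated with a simplified sketch. It contrasts a sample-level hold-out, where validation scans come from surfaces already seen in training, with a surface-level hold-out, where all scans of some surfaces are withheld. The record layout, surface identifiers and function names are hypothetical. In this work Set B was a separate data set rather than a partition of Set A, so the second split is only an analogy.

```python
import random


def sample_level_split(records, holdout_fraction, seed=0):
    """Hold out a random fraction of scans; surfaces may appear on both sides."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def surface_level_split(records, holdout_surfaces):
    """Hold out every scan of the listed surfaces; no surface is shared."""
    train = [r for r in records if r["surface"] not in holdout_surfaces]
    test = [r for r in records if r["surface"] in holdout_surfaces]
    return train, test


# Hypothetical data: five surfaces with four scans each.
records = [
    {"surface": s, "scan": k}
    for s in ("S1", "S2", "S3", "S4", "S5")
    for k in range(4)
]
train_a, val_a = sample_level_split(records, holdout_fraction=0.2)
train_b, val_b = surface_level_split(records, holdout_surfaces={"S5"})
print(len(train_a), len(val_a), len(train_b), len(val_b))
```

With the surface-level split, a classifier cannot benefit from having seen other scans of the same surface, which is consistent with the lower accuracies reported on Set B.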

Task 1: Classification of laser processing disturbances
Since the use of data samples from real surface scans produced the best learning results, the next step was to use Set A to re-train the MLP with the ISO parameter groups selected using Set GAN and Set A + GAN. This experiment aimed at evaluating the quality of the feature selection results obtained using artificial data samples. The results, shown in Fig. 6, are very similar, with average classification accuracies mostly ranging between 82 and 84%. The only exception was the learning trials performed using the minimal parameter group F 6 selected using Set A, where the average accuracy was 80.4%. Two surface parameter groups were tested for each dataset and the results were compared to those obtained using all 21 surface parameters. Sets A and A + GAN had the same parameter group F 10, as indicated in Table 10. The best learning results were obtained using surface parameter group F 10, i.e. the non-redundant data attributes selected after analysing Sets A and A + GAN. The removal of irrelevant surface parameters instead had a statistically significant negative effect on the accuracy results, although in practical terms this was very modest. Table 12 reports the confusion matrices for the classification results on Set B presented in Fig. 6, using MLPs trained on Set A with the minimal parameter groups F 6 (selected on Set A), F 6 (selected on Set GAN), and F 8 (selected on Set A + GAN). In absolute terms, the largest source of misclassifications was FOD topographies being identified as BIA. In proportion to the number of examples per class, the largest sources of incorrect classifications were samples from class N identified as BIA in the case of the surface parameters F 6 of Sets A and GAN, and FOD identified as BIA for F 8 obtained using Set A + GAN. It was also observed that in all three tested cases a similar number of samples from class FOD were identified as class N.
This could be attributed to the supplementary LIPSS samples produced with laser peak fluence close to the ripples' threshold, which had similar topographies to the ones obtained with higher FOD. Hence, those samples were more prone to be misclassified.
Figure 7 shows the root mean square (r.m.s.) error of the CA predictions from the validation step performed on Set B. Similar to the processing disturbance classification task, a further experiment was run employing Set A to train the MLP using the ISO parameter groups selected using Sets GAN and A + GAN. The results of this last experiment are given in Fig. 8. They are fairly similar, with an average r.m.s. error of around 11 degrees for all combinations of training data sets and ISO parameter groups. In general, feature selection helped the MLPs to learn the CA prediction with marginally better results. Data augmentation also appeared to play a beneficial role, since the best accuracy results were obtained by training the MLPs using Set A + GAN. It should be noted that, although statistically significant, the measured differences in accuracy were always within 1.5 degrees. The best results were obtained when training the MLPs on the augmented Set A + GAN of examples and using ISO parameter groups F 8 or F 10 to describe the samples. In general, it can be stated that the average r.m.s. errors obtained in the experiments are comparable with the CA measurement uncertainty, and most probably this limited the MLP learning abilities. Obtaining more accurate CA measurements might improve the MLP training and allow the MLP to differentiate better the usefulness of different data sets and ISO parameter groups.
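For reference, the r.m.s. error used above to assess the CA predictions can be computed as in the following minimal sketch. The function name and the contact angle values (in degrees) are hypothetical and serve only to illustrate the metric.

```python
import math


def rms_error(predicted, measured):
    """Root mean square error between predicted and measured values."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical contact angles in degrees (not measured data).
predicted_ca = [95.0, 110.0, 130.0]
measured_ca = [100.0, 100.0, 120.0]
error = rms_error(predicted_ca, measured_ca)
print(round(error, 2))
```

Since the metric is expressed in the same units as the CA itself, it can be compared directly with the CA measurement uncertainty, as done in the discussion above.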

Conclusions
In this research, an approach is presented for applying ANNs to classification and prediction tasks in ultrafast laser surface structuring/texturing. ANN tools were developed and validated for pre- and post-processing of laser surface treatment data, especially areal surface roughness parameters of LIPSS topographies, and proved to be sufficiently effective. In particular, high prediction accuracies were achieved by MLP classifiers in detecting the laser processing disturbances that affect the LIPSS generation. MLPs were also used to predict with high accuracy the functional response, i.e. wettability, of LIPSS-treated surfaces.
Regarding the applied ANN tools, using a small experimental dataset augmented with GAN-created artificial topographies proved to be beneficial for the tools' development. GAN-generated data were especially valuable when utilised for feature relevance analysis employing the evolutionary ANNE algorithm. Although the GAN-based artificial data reproduced the statistics of the real samples well, the GAN-generated topographies were less useful in supporting the MLP generalisation capabilities on the laser processing disturbance classification task. This was attributed to an insufficient GAN learning process, in particular its premature interruption.
A range of feature selection methods was applied. By combining their capabilities, it was possible to identify the salient areal roughness parameters needed to characterise the surfaces, without any significant negative effect on the MLP performance. Specifically, feature redundancy analysis revealed that the initial 21 ISO parameters can be narrowed down to only 10, and such a small subset of data attributes was enough to achieve a high MLP prediction accuracy, especially in the laser processing disturbance classification task. Further trimming of irrelevant attributes down to even smaller subsets of 6 or 8 surface parameters led to fairly similar prediction accuracies. Such substantial reductions in data attributes can have a valuable impact on the practical aspects of data acquisition procedures, because they can reduce the number of costly, time-consuming, and sometimes complex measurements.
Finally, the validation of the ANNs on a larger unseen dataset showed that the identification of processing disturbances could be accomplished with an accuracy close to 85%. The wettability of LIPSS-treated surfaces was predicted within the static water CA measurement uncertainty of approximately 10 degrees. Considering these encouraging findings, it can be concluded that the developed ANN-based tools can represent a generic approach for monitoring LIPSS treatment operations. These tools can map the resulting areal parameters of processed surfaces to any disturbances present during the process, and consequently also to their desired functional performance.
Author contribution Luca Baronti was involved in conceptualisation, methodology, software, validation, formal analysis, investigation, data curation, writing-original draft; Aleksandra Michalek took part in conceptualisation, methodology, investigation, validation, writing-original draft, visualisation; Marco Castellani was involved in conceptualisation, methodology, software, validation, formal analysis, investigation, writing-original draft, visualisation, supervision; Pavel Penchev took part in conceptualisation, methodology, experimental trials, data collection; Tian Long See took part in supervision; Stefan Dimov was involved in conceptualisation, methodology, writing-review & editing, supervision.
Fig. 8 Accuracy results for the CA prediction task with MLPs being trained on Set A using the parameter groups identified for Sets GAN and A + GAN and validated on Set B. Two surface parameter groups were tested for each dataset and the results compared to those obtained using all 21 surface parameters.