Review of image segmentation techniques for layup defect detection in the Automated Fiber Placement process

The aerospace industry has established the Automated Fiber Placement process as a common technique for manufacturing fibre reinforced components. In this process, multiple composite tows are placed simultaneously onto a tool. Currently, manual inspection in such processes often requires up to 50% of the manufacturing duration. Moreover, the accuracy of quality assurance varies significantly with the inspector in charge. Thus, inspection automation provides an effective way to increase efficiency. However, to achieve a proper inspection performance, the segmentation of layup defects needs to be examined. In order to improve such defect detection systems, this paper performs a comprehensive ranking of segmentation techniques. To this end, 29 statistical, spectral and structural algorithms from related work were evaluated based on nine substantial criteria derived from literature and process requirements. For reasons of determinism and easy technology transferability without the need for much training data, the development of new Machine Learning algorithms is not part of this paper. Afterwards, seven of the most promising algorithms were studied experimentally. For this purpose, laser line scan sensor depth maps of fibre placement defects were utilised. Furthermore, noisy images were generated and applied for testing algorithm robustness. The test data contained five defect categories with 50 samples per class. It was concluded that Adaptive Thresholding and Cell Wise Standard Deviation Thresholding work best, yielding detection accuracies mostly > 97%. It is noteworthy that disturbed input data can affect the detection results. Feasible algorithms with sensible parameter settings were able to perform reliable defect segmentation for laid material.


Introduction
Nowadays, lightweight structures are widely used in aerospace manufacturing. Examples of the increasing demand for such lightweight components are the wing cover and fuselage production of the Airbus A350 XWB and the Boeing 787 (Marsh 2010; McIlhagger et al. 2020). Carbon Fiber Reinforced Plastic (CFRP) offers superior stiffness and strength properties compared to metallic materials. For this reason, lightweight structures are often made from CFRP. The production of these mostly complex lightweight structures is usually quite expensive. To make manufacturing economical, fast and efficient production techniques are essential. On this account, the relatively novel Automated Fiber Placement (AFP) technique is increasingly used in industry, which is why its inspection process is investigated further in this paper. To meet the high safety requirements in aerospace, a visual inspection step follows the fibre layup process.
Today, this manual inspection typically takes between 32% (Rudberg et al. 2014) and 50% (Eitzinger 2019) of the overall production time. Because the inspection is carried out manually, the inspection quality is often insufficient with regard to the specifications. This offers great potential for improvements in terms of quality and speed.
The Laser Line Scan Sensor (LLSS) is frequently used in research and development for the inline inspection of AFP processes. For this reason, we focus on grey scale depth images from such a sensor. An LLSS obtains geometrical data based on the principle of triangulation. For this purpose, a laser beam is projected onto a surface and reflected back to a camera sensor, which is mounted at a slightly different position than the emitting laser.
One critical aspect in the automated inline inspection is the robust and reliable segmentation of defects within a sensor image. Previously, Hanbay et al. (2016), Mahajan et al. (2009) and Kumar (2008) summarised research from the field of fabric defect segmentation in the textile industry. Additionally, Bulnes et al. (2014) mentioned predominantly periodic defects as critical for the web industry. Because textiles resemble CFRP in several respects, research from this field can be very helpful for the algorithm selection. Later on, Sacco et al. (2018) and Zambal et al. (2019) published Neural Network based defect detection methods for laser scanner depth images of AFP defects. Moreover, Tabernik et al. (2019) investigated a deep learning system for crack segmentation in components. In particular, they emphasised the necessity of a small initial setup data set for establishing a system in an industrial application. These deep learning methods work quite well (Du et al. 2020), but face the challenge of determinism and traceability of Neural Network decisions (Lee et al. 2021). Segmentation algorithms which do not rely on advanced, more abstract models can be visually verified in terms of their correct operation (Joshi et al. 2018). Simple image metrics might be sufficient as an indicator for evaluating the validity of a calculation (Meng et al. 2020). Such an evaluation and traceability of the machine decision is much more difficult for Neural Networks or other advanced models (Lee et al. 2021), which is the reason for categorically excluding them from this study. Meister et al. (2020) reported our first results on defect detection algorithms for fiber placement processes, with a focus on a general overview of feasible algorithms.
Based on this previous work, a detailed investigation of suitable algorithms with regard to different geometrical groups of layup defects and a varying quality of the input data is necessary.
In this study, we focus on defect detection for different geometrical characteristics of defects. This image segmentation step is essentially the first task in the inspection of composite parts. It serves to reduce the amount of data and to prepare the image data for subsequent analysis procedures. Prior to the determination of the defect types or geometrical measures, as many as possible of these manufacturing deviations have to be detected on the produced component. Afterwards, in a following stage, a detailed analysis of the conspicuous image regions is carried out using techniques of varying complexity. Such a sequential inspection approach is similarly used by Kuo et al. (2016) for the inspection of light emitting diodes. The detailed defect analysis and interpretation will be considered separately in further research and is therefore not part of this paper. In addition, the presented procedure can serve as a partially automated input generator for the synthesis of further defect images in AFP inspection, as described in our publication Meister et al. (2021), or in similar fields, as mentioned by Jain et al. (2020). For this reason, the following research questions are selected for this publication:
- Which algorithms are feasible to perform rapid defect detection for previously defined layup defect types under consideration of various geometrical clusters?
- What is the actual performance of feasible image segmentation methods under consideration of a varying data quality?
Regarding these research questions, this paper uses findings from previous fabric inspection and fiber layup defect detection research and reviews these algorithms in the field of AFP defect segmentation. For this, various geometrical clusters of defects are examined. Thus, our research provides a novel perspective on segmentation algorithms and their application to fibre placement inspection. The findings of this study should be beneficial for manufacturers of inline inspection systems for AFP processes, as they can support the development and configuration of defect detection algorithms for various materials. Furthermore, the findings will support the certification process of such systems, especially in the aerospace industry. This article initially presents the related research on sensor and algorithm development. Subsequently, all algorithms from the literature are summarised and evaluated according to a defined procedure. The seven most promising algorithms are then implemented and studied based on exemplary fibre layup defects.

Related research
This section provides an overview of the state of research and industrial developments of the manufacturing process, data acquisition techniques as well as algorithms for image smoothing and defect segmentation.

Manufacturing process
In the following, the corresponding manufacturing process is explained. To begin with, various Fiber Placement techniques are currently available on the market. Very common methods are Automated Fiber Placement (AFP), Dry Fiber Placement (DFP), Automated Tape Laying (ATL) and Direct Roving Placement (DRP) (Maass 2012; Lengsfeld et al. 2014; Grohmann et al. 2016). These processes apply CFRP material layer by layer onto a tool. Campbell (2004) explained this procedure, which is also schematically shown in Fig. 1. AFP is a technique which is needed to manufacture complex composite structures. Nowadays, this technique is becoming more and more established in industrial aerospace manufacturing. This means that the process is quite novel, but already used in industry. In order to achieve good transferability of the research results, we selected this method for further investigation. During the AFP process, multiple narrow preimpregnated material stripes (tows) are placed along a previously programmed path (course) (Oromiehie et al. 2019). Each tow is supplied from an individual spool. The supplied material consists of the prepreg material itself and a release film. This release film is removed during the placement process and stored on another spool. Within the AFP process, composite material, e.g. carbon prepreg material, is fed into an effector. This effector guides the material to the tool's surface. Then, the material is heated at the moment of tow placement to achieve better tack properties (Lengsfeld et al. 2014). Following the material deposition, a compaction roller presses the material onto the mold. Thus, each component is made from many CFRP prepreg layers (Campbell 2004). The AFP process explained above can be used to manufacture various part geometries. For this reason, Rudberg (2019) expects that the AFP method will be used more and more frequently in future applications. Various defects can occur during fibre layup.
These defects are often directly related to the layup process itself (Oromiehie et al. 2019). Harik et al. (2018) investigated the link between the AFP defects and the process planning, layup strategies and machining. Furthermore, Potter (2009) analysed factors for the variability in AFP production. According to Potter (2009) and Harik et al. (2018), all defects which can occur during fibre layup result in geometrical changes and deviations from an accurate layup surface. Common AFP defect types are wrinkles, twists, gaps, overlaps and foreign materials (foil) (Heinecke and Willberg 2019; Oromiehie et al. 2019). A sketch and a corresponding exemplary LLSS scan image are presented in Fig. 2 for each defect type. The associated geometric measures and characteristics of these defects are summarised in Table 1. The wrinkle and twist defects have varying but distinct topologies. Both of these defect types protrude from the layup surface. This results in large height differences and forms clear defect edges. In longitudinal direction, wrinkles form one clear edge. Twists, in contrast, show a very small growth in altitude over their length. Gaps and overlaps are very similar to each other regarding their geometrical characteristics. These two defects are very flat and hardly show any topology changes. Gaps reveal two slight edges at the beginning and end of the defect, transverse to the fibre direction. However, overlaps form three small edges in this direction, since these defect types are mostly a combination of a gap and a tow overlap. In contrast, gaps and overlaps have almost no edges apparent along the tows. Due to their similarities, the differentiation between these two types is often quite difficult. The thin shape of these defects gives the opportunity to analyse algorithms for this scenario. These defect types are also often used as example defects by other researchers (Heinecke and Willberg 2019; Oromiehie et al. 2019). Nardi et al. (2018), for instance, emphasise especially the disturbing influences of gaps and overlaps in the AFP prepreg layup for fibre metal laminates. Furthermore, foils are very common foreign materials in manufacturing processes. They show a very different reflection behaviour compared to the deposited fibre material (Miesen et al. 2015; Potter 2009). In the following, feasible techniques for the recording of these defects are explained.
Fig. 1 Basic principle common to all fiber layup technologies, consisting of an effector with heating system and a compaction roller. F is the compaction force and v gives the layup velocity
Fig. 2 Illustration of the defect types considered. Schematic drawings of the individual defect types are presented at the top. Below, the associated, smoothed scan images of the LLSS are displayed. These are used as input for the image processing. l: length axis, w: width axis, $w_t$: tow width
Table 1 (note) The range of the length-to-width (l/w) ratio is given here, due to the large variance of the geometry within each defect category. CPT in this application case is about 0.125 mm. For the thickness measure, + indicates an increase in thickness and - means a decrease in thickness

Sensors for data acquisition
Nowadays, the inline inspection for AFP processes is a highly discussed topic in research and industry. For this reason, various sensors are investigated to record the required inspection data. Sun et al. (2020) provide a comprehensive overview of the currently available systems and their performances. The Fraunhofer Institute for Integrated Circuits (IIS) investigated polarisation camera based systems (Atkinson et al. 2018; Schöberl et al. 2016). Furthermore, the National Aeronautics and Space Administration (NASA) (Gregory and Juarez 2018) as well as the Institute of Production Engineering and Machine Tools (IFW) of the University of Hanover focused on thermographic imaging inspection (Denkena et al. 2016; Schmidt et al. 2019a). Both types of sensors capture only 2D images.
In addition, InFactory Solutions (Weimer et al. 2016), Profactor (Gardiner 2018) and Danobat Composites (Black 2018) have developed LLSS based systems for inline Quality Assurance (QA) of AFP processes. These companies use a single LLSS for monitoring the entire course. In contrast, Electroimpact (Cemenska et al. 2015) applied multiple LLSS systems to observe each individual tow.
A big advantage of these LLSS systems is the capability to provide topographical information of a surface. This may be the reason for the success of this measurement principle in AFP inspection (Weimer et al. 2016). Schmitt et al. (2007, 2008) began studies on LLSS based methods for the contour scanning of fabrics and preforms in 2007. They also investigated edge detection methods for the determination of preform misplacement errors. They observed sub-pixel accuracy for the contour measurement and showed that an LLSS is a suitable system for fabric and preform inspection. Subsequently, Faidi et al. (2012) investigated laser triangulation systems for CFRP manufacturing. They aimed to find and measure wires with a diameter between 0.5 mm and 1 mm, which were incorporated into CFRP laminates. Miesen et al. (2015) suggested a method to sense defects with a point measurement laser displacement system. Additionally, they discussed influencing factors for deviations and analysed the accuracy of such a system. Furthermore, they presented various defect types and their corresponding geometrical dimensions. Tonnaer et al. (2017) also showed a technique for LLSS based monitoring of AFP processes. They investigated the measurement precision of Otsu's algorithm (Otsu 1979) for the edge detection on LLSS depth images. Further methods for data usage and processing are described below.

Image processing techniques
This section describes related work for defect detection and image smoothing, in order to process the sensor data. Hanbay et al. (2016), Mahajan et al. (2009) and Kumar (2008) presented plenty of algorithms for fabric defect segmentation in the period between 2008 and 2016. The corresponding methods mostly excluded the use of Neural Networks. In the main, they all clustered the methods into structural, statistical and spectral approaches as well as other more advanced techniques and discussed their individual strengths and weaknesses. Hanbay et al. mentioned that in particular the statistical and the spectral analysis methods lead to good fabric defect detection results. They also pointed out that spectral techniques work better for regular texture patterns like fabrics. Mahajan et al. concurred with this view. Kumar goes so far as to suggest that Gabor filters perform well individually, but can lead to even better results in combination with other methods. Meister et al. (2020) summarised various defect segmentation algorithms from other computer vision fields and gave a rough overview of their advantages and limitations. Further crucial tasks in the field of image processing are the smoothing of sensor data and the adjustment of pixel values. For this purpose, the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm is a suitable method for image contrast equalisation, as Meister et al. (2020, 2021) have already explained. This technique was developed in the mid-1980s and Pizer et al. (1986) first used it in the field of medical imaging. The CLAHE method considers regionally limited small image areas called 'tiles' and performs a histogram equalisation for each of these regions. To counteract the over-amplification of image noise, the contrast is limited to a defined value. Histogram counts above this relative threshold are redistributed to other bins.
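The clipping step of CLAHE can be illustrated for a single tile. The following is a minimal NumPy sketch under stated assumptions, not the implementation used in the cited work: it equalises one 8-bit tile, clips the histogram at a relative limit and redistributes the excess uniformly over all bins, which is one common variant. The function name `clipped_equalise_tile` and the parameter `clip_limit` are illustrative. Full CLAHE additionally interpolates between neighbouring tiles, which is omitted here.

```python
import numpy as np

def clipped_equalise_tile(tile, clip_limit=2.0, n_bins=256):
    """Histogram-equalise one 8-bit tile with a clipped histogram (illustrative)."""
    tile = np.asarray(tile, dtype=np.uint8)
    hist = np.bincount(tile.ravel(), minlength=n_bins).astype(np.float64)

    # Relative clip limit: a multiple of the mean bin count.
    limit = clip_limit * tile.size / n_bins
    excess = np.sum(np.maximum(hist - limit, 0.0))
    hist = np.minimum(hist, limit)

    # Redistribute the clipped counts uniformly so the total stays constant.
    # (Assumption: uniform redistribution; bins may end slightly above the limit.)
    hist += excess / n_bins

    # Map grey values through the normalised cumulative distribution.
    cdf = np.cumsum(hist)
    cdf /= cdf[-1]
    lut = np.round(cdf * (n_bins - 1)).astype(np.uint8)
    return lut[tile]
```

A smaller `clip_limit` flattens the mapping and therefore limits how strongly noise in a nearly uniform tile is amplified.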
As mentioned above, plenty of algorithms and sensor systems from various fields are already available. Especially algorithms from textile inspection seem feasible to adapt to our use case. Within this paper, we apply a suitable choice of algorithms to LLSS depth data with varying data quality. In addition, we gather types of defects into appropriate clusters and perform a detailed analysis of the defect segmentation for them. Subsequently, the methodology for the algorithm evaluation is discussed.

Methodology
The following section describes the experimental setup, the algorithm selection approach and their actual implementation as well as the procedure for results evaluation. Within this paper we use the abbreviations from Table 2 for the algorithms. Moreover, Table 3 gives the notations corresponding to the results evaluation.

Experimental setup
For the performance tests of the desired segmentation algorithms, a manageable and heterogeneous set of defect types was selected. Figure 2 presents these defects and Table 1 gives the corresponding geometrical measures. It should be noted that other defect types were not examined in this paper. In order to review image analysis techniques for the AFP application case, representative data needs to be recorded. This collection of fiber layup images must be generated reproducibly and should be representative of the actual fiber placement process. For this reason, a setting was used which is not affected by disturbances from the manufacturing process such as heat radiation, undesirable contamination or geometric tilting of the layup machine. A suitable test setup is presented in Fig. 3. This arrangement consisted of an articulated robot from KUKA, the Automation Technology GmbH (AuTech) C5 sensor (Automation Technology GmbH 2019) and a CFRP prepreg material sample which contained the defects from Fig. 2. The linearly moving robotic arm recorded the data along the entire sample with a velocity of 200 mm/s. The AuTech C5 sensor captured 4096 (w) × 625 (h) px, 16 bit grey scale depth images of the 250 × 150 mm fibre layup sample. The width of the image indicates the maximum resolution of the sensor in width direction. The height resolution is affected by the exposure time per pixel row and the time between frames. On this account, the image resolution decreases with increasing exposure time, assuming the same sample size and scanning speed. A laser voltage of 5 V and the FIR-PEAK laser line detection mode (Automation Technology GmbH 2014) were used to calculate precise topological data. Within this device, the FIR-PEAK algorithm is implemented as a derivative filter which detects the zero crossing point of the first derivative of the laser intensity image. The data was transmitted via an Ethernet connection using the GenICam protocol (EMVA 2009). The image processing was carried out on a computer with an Intel Xeon Gold 5122 @ 3.60 GHz CPU, 48 GB RAM and an NVIDIA Quadro P6000 GPU, using OpenCV 3.4.1 (Bradski 2000) with Python. The image analysis processes were preferably executed on the CPU. The GPU was only used for the visual display. The following section explains the selection of the algorithms to be investigated and the procedure for evaluation.
Fig. 3 The assembly for collecting defect data from CFRP prepreg material. It makes use of a KUKA robot with attached C5 LLSS, which conducts a linear motion parallel to the surface. Additionally, a close-up view of an example test sample containing a wrinkle defect is displayed. This sample has the dimensions length $l_s$ = 250 mm and width $w_s$ = 150 mm. It is made from tows with the width $w_t$ = 1/4 in = 6.35 mm
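The relation between scanning speed, sample length and image height in the experimental setup can be checked with a short calculation. The sketch below uses only the figures stated in the text (250 mm sample length, 625 image rows, 200 mm/s). It assumes that the 625 rows cover exactly the sample length at a constant profile rate, which the text does not state explicitly; the function names are illustrative.

```python
def row_pitch_mm(sample_length_mm, n_rows):
    """Spacing between consecutive scan lines along the travel direction."""
    return sample_length_mm / n_rows

def profile_rate_hz(speed_mm_s, pitch_mm):
    """Number of laser profiles per second needed for the given row pitch."""
    return speed_mm_s / pitch_mm

pitch = row_pitch_mm(250.0, 625)      # 0.4 mm per image row
rate = profile_rate_hz(200.0, pitch)  # about 500 profiles per second
print(f"row pitch: {pitch:.2f} mm, profile rate: {rate:.0f} Hz")
```

Under these assumptions, a longer exposure time lowers the achievable profile rate and thus increases the row pitch, which is consistent with the statement that the image resolution decreases with increasing exposure time.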

Evaluation of detection methods
Subsequently, the evaluation procedure for segmentation algorithms is presented. With regard to the defect characteristics described above, edge-based features can be valuable image features for most of the chosen defects. In addition, the regular structures of gaps and overlaps may be well represented by frequency-based attributes. Some defect segmentation algorithms are already known from similar applications such as textile inspection, as discussed in Sect. 2.3. These algorithms are evaluated theoretically here on the basis of the literature. The summary in Table 4 contains all structural, statistical and spectral algorithms found during the literature review in the fields of medical imaging, autonomous driving and textile inspection. The individual evaluation criteria are listed above each column. The corresponding references are given in the right column. In order to review these algorithms, the entire inspection process chain needs to be taken into account. As a consequence, the defect segmentation must be calculated quickly and find most defects, whereas a small positioning error during the image segmentation is tolerable. For these reasons, the nine performance characteristics number of invariances, calculation speed, implementation effort, detection accuracy, false positive rate, false negative rate, localising accuracy, robustness and adaptability to input were chosen and weighted according to their importance. The weighting of the individual criteria is specified subjectively, based on the process and system requirements from Sect. 2.1. Subsequently, the algorithms were evaluated against these criteria. The accumulated rating $v_a$ is calculated as presented in Eq. 1. For each considered criterion, $w_i$ describes the absolute weighting within the interval [1, 5].
Further, $c_i$ expresses the algorithm's rating for the individual criterion. Due to the subjective weighting and evaluation of the algorithms based on the literature, it is essential to consider the robustness of the assessment results. In this way, the premature rejection of potentially suitable algorithms shall be avoided. Therefore, the individual weights $w_i$ were randomly varied within the defined value range $w_e - 0.5 \le w_i \le w_e + 0.5$. Based on this, we performed 20 individual Monte Carlo observations with the expected value $w_e$. The corresponding results are presented in Table 4. In summary, a weighted performance value was calculated. The ten algorithms with the best rating, which implies a mean performance value ≥ 3.21, were considered for the following investigations. The defect segmentation with the Gradient Thresholding (GT), Image Projection (IP), Cell Wise Standard Deviation Thresholding (CT), OTSU Thresholding (OT), Adaptive Thresholding (AT), Morphological Segmentation (MS) and Gabor Filter Segmentation (GF) algorithms has been identified for continuing study in this paper. These methods are highlighted in bold italic in Table 4 and were applied for the validation. This selection and the exclusion of certain algorithms are briefly justified in the following. For reasons of traceability and easy technology transfer, it is important to use simply adaptable algorithms which do not require comparative data. This causes the Local Binary Patterns segmentation to be discarded. The Rank-order histogram method sorts the bins of grey values according to the number of contained pixels. In this manner, the distribution of pixel values can be displayed, but it is not possible to distinguish between areas with and without defects. Apart from this, it is a global criterion that does not allow the localisation of prominent areas.
On this account, the Rank-order histogram cannot be used for a proper defect detection. Thus, the algorithms Local Binary Patterns and Rank-order histogram were rejected despite a high score. In addition, we would like to emphasise that this paper aims at comparing various appropriate methods for defect detection in the AFP process. Accordingly, algorithms with different operating principles and varying degrees of complexity are considered. For different methods, this can lead to varying efforts in identifying suitable algorithm parameters. However, for the performance comparison in this paper, only those configurations are considered which yield the highest detection accuracies for each algorithm. The operating principle of the chosen algorithms is described below.
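The robustness check of the subjective weighting can be sketched as follows. This is an illustration under assumptions, not the evaluation code of this study: the accumulated rating is assumed to be a weighted mean of the criterion ratings (Eq. 1 is not reproduced here), the perturbation of the weights is assumed to be uniform, and the example weights and ratings are invented placeholders rather than values from Table 4.

```python
import random

def weighted_rating(weights, ratings):
    """Assumed form of the accumulated rating: weighted mean of the ratings."""
    return sum(w * c for w, c in zip(weights, ratings)) / sum(weights)

def monte_carlo_rating(expected_weights, ratings, n_runs=20, spread=0.5, seed=0):
    """Mean rating over n_runs draws, each weight in [w_e - spread, w_e + spread]."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        weights = [w_e + rng.uniform(-spread, spread) for w_e in expected_weights]
        results.append(weighted_rating(weights, ratings))
    return sum(results) / n_runs

# Placeholder values for nine criteria (hypothetical, not taken from Table 4).
expected_weights = [3, 5, 2, 5, 3, 5, 2, 4, 3]
ratings = [4, 3, 5, 4, 3, 4, 2, 3, 4]
mean_rating = monte_carlo_rating(expected_weights, ratings)
print(round(mean_rating, 2))
```

An algorithm whose mean rating stays above the selection limit across such draws is unlikely to have been selected or rejected only because of one particular choice of weights.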

Implementation of selected algorithms
This subsection briefly explains the implementation of the previously selected algorithms and their operating principle. The CT method determines the 'mean' and the 'standard deviation' of the pixel values of each individual cell in a large, image-spanning grid. Cells with an ascertained standard deviation above a given threshold value are marked as a conspicuous region. These are then used to estimate and localise the defects. For signal evaluation using the wavelet transform, a base function is applied to interpret the input signal. To analyse an image, a matrix with the corresponding base function is shifted over the image and thus individual areas are examined. A popular and easy to use base function is the Gaussian function, which is used for the so called Gabor filtering. On this account, we decided to apply the OpenCV implementation of the GF (OpenCV 2018a), which is also frequently used in the textile inspection.
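The cell-wise principle of the CT method can be sketched in a few lines. This is a minimal NumPy illustration, not the implementation used in this study: the cell size and the threshold are hypothetical values, only the standard deviation is evaluated (the cell mean mentioned above is not used here), and the function name `cellwise_std_mask` is illustrative.

```python
import numpy as np

def cellwise_std_mask(image, cell=50, std_threshold=10.0):
    """Mark grid cells whose pixel standard deviation exceeds a threshold."""
    image = np.asarray(image, dtype=np.float64)
    mask = np.zeros(image.shape, dtype=bool)
    rows, cols = image.shape
    for top in range(0, rows, cell):
        for left in range(0, cols, cell):
            block = image[top:top + cell, left:left + cell]
            # A cell with strongly varying depth values is treated as conspicuous.
            if block.std() > std_threshold:
                mask[top:top + cell, left:left + cell] = True
    return mask
```

The localisation accuracy of such a sketch is bounded by the cell size, since a whole cell is marked as soon as its standard deviation exceeds the threshold.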
For the GT, a Sobel filter in combination with a masking operation was applied. For this purpose, the convolution $G_i = S_i * A$ is performed on the image matrix $A$, where two separate convolution masks $S_i$ are applied for the width and height directions of the image. The results are aggregated according to Eq. 2. Finally, the edges were approximated by applying a binary thresholding operation to the matrix $G$ (OpenCV 2019a).
The IP method first computes a dimensional reduction vector $r_k = (r_{k,a}, \ldots, r_{k,p})$. Every $a$-th entry of $r_k$ is calculated for each row $i$ as $r_r(i)$ by Eq. 3. Analogously, $r_c(j)$ is determined for every column $j$ using Eq. 4 (OpenCV 2019b). This IP algorithm treats each image row or column as an individual vector and accumulates the corresponding values. Subsequently, a matrix of ones $J_{n,m}$ with size equal to the original image is multiplied by the projection vectors $r_k$. This can be expressed as Eq. 5, $P_r = J_{n,m} r_r$ and $P_c = J_{n,m} r_c$. Afterwards, the total projection matrix $P$ is computed by $P = P_r + P_c$. In order to estimate the defect regions, a binary thresholding was applied to this matrix $P$.
The OT searches for the threshold $t$ which minimises the within-class variance of the two classes separated by $t$. This is defined as the weighted sum of the variances of these two classes (OpenCV 2018b) and can be written as Eq. 6, where $w_i$ are the weights and $\sigma_i^2$ are the variances of these classes (Otsu 1979).
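The operating principles of GT, IP and OT can be sketched with NumPy only. This is an illustration under assumptions and not the OpenCV code used in this study: the two Sobel responses are assumed to be aggregated by the Euclidean norm, the projection is assumed to accumulate plain row and column sums, and the multiplication with the matrix of ones is replaced by broadcasting. All function names are illustrative.

```python
import numpy as np

def sobel_magnitude(a):
    """Gradient magnitude from 3x3 Sobel responses (edge-replicated border)."""
    p = np.pad(np.asarray(a, dtype=np.float64), 1, mode="edge")
    # Width direction: right column minus left column, weights 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Height direction: bottom row minus top row, weights 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)  # assumed aggregation of both responses

def projection_matrix(a):
    """P = P_r + P_c: row sums and column sums spread back to image size."""
    a = np.asarray(a, dtype=np.float64)
    r_r = a.sum(axis=1, keepdims=True)  # one accumulated value per row
    r_c = a.sum(axis=0, keepdims=True)  # one accumulated value per column
    return r_r + r_c                    # broadcasting yields an n x m matrix

def otsu_threshold(a, n_bins=256):
    """Threshold t minimising the weighted within-class variance (8-bit input)."""
    hist = np.bincount(np.asarray(a, dtype=np.uint8).ravel(), minlength=n_bins)
    prob = hist / hist.sum()
    levels = np.arange(n_bins)
    best_t, best_var = 0, np.inf
    for t in range(1, n_bins):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue  # one class would be empty
        mu0 = (levels[:t] * prob[:t]).sum() / w0
        mu1 = (levels[t:] * prob[t:]).sum() / w1
        var0 = (((levels[:t] - mu0) ** 2) * prob[:t]).sum() / w0
        var1 = (((levels[t:] - mu1) ** 2) * prob[t:]).sum() / w1
        within = w0 * var0 + w1 * var1
        if within < best_var:
            best_t, best_var = t, within
    return best_t  # pixels with value >= best_t form the upper class
```

In each case, a binary defect mask would follow from a comparison such as `sobel_magnitude(a) > limit`, where `limit` is a tuning parameter that is not specified here.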
AT converts an input image $I(i, j)$ into a binary image $I_b$. This is expressed in Eq. 7 using $T(i, j)$ as a pixel-based individual threshold value (OpenCV 2018b).
Lastly, the MS algorithm was implemented here as a morphological closing method. Hence, a dilation operation followed by an erosion operation was calculated. This can be expressed as $(I \oplus K) \ominus K$ with the input image $I$ and the morphological kernel $K$ (OpenCV 2018c).
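Both AT and the morphological closing operate on local neighbourhoods, which can be sketched with NumPy sliding windows. This is an illustration under assumptions and not the OpenCV code used in this study: the pixel-wise threshold is assumed to be the local mean plus a hypothetical `offset`, the kernel is assumed to be a square of ones, and all names are illustrative. `sliding_window_view` requires NumPy 1.20 or newer.

```python
import numpy as np

def _windows(a, k, pad_value=None):
    """All k x k neighbourhoods (odd k) as a view of shape (n, m, k, k)."""
    r = k // 2
    if pad_value is None:
        p = np.pad(a, r, mode="edge")
    else:
        p = np.pad(a, r, mode="constant", constant_values=pad_value)
    return np.lib.stride_tricks.sliding_window_view(p, (k, k))

def adaptive_threshold(image, k=11, offset=5.0):
    """Binary image: 1 where I(i, j) > T(i, j), with T the local mean plus offset."""
    a = np.asarray(image, dtype=np.float64)
    t = _windows(a, k).mean(axis=(2, 3)) + offset  # pixel-wise threshold T(i, j)
    return (a > t).astype(np.uint8)

def dilate(binary, k=3):
    """Dilation with a k x k square kernel; the border counts as background."""
    return _windows(np.asarray(binary, dtype=np.uint8), k, pad_value=0).max(axis=(2, 3))

def erode(binary, k=3):
    """Erosion with a k x k square kernel; the border counts as foreground."""
    return _windows(np.asarray(binary, dtype=np.uint8), k, pad_value=1).min(axis=(2, 3))

def closing(binary, k=3):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(binary, k), k)
```

Closing fills holes and gaps smaller than the kernel while leaving the outer extent of a region unchanged, which is why it can merge fragmented defect responses into one connected area.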

Image pre-processing and parameter adjustment
Subsequently, the image pre-processing procedure and the parameter optimisation approach are explained. Initially, the image borders were cropped to compensate for artefacts from the image acquisition. Due to the large number of zero value pixels within the raw image, pre-processing is necessary to reduce their influence. For this purpose, the images were first dilated to increase the number of information-containing image pixels. Afterwards, a contrast equalisation was carried out in order to utilise the entire available value range. Gaussian filtering was then applied to smooth the edges from the dilation step. Finally, the image was resized to a constant size of 1000 × 1000 px. This raw image pre-processing is shown in the flow chart in Fig. 4. For the initial cropping step, the image borders were cropped by 60 px at the top and the bottom as well as 40 px at the left and right edge to eliminate artefacts from the data recording. Regarding the image dilation, the kernel size $k_{size}$ = 3 × 3 was chosen from preliminary tests. In order to find the configuration with the best performance, the number of dilation steps was varied from 4 to 12. Then, the contrast equalisation was achieved using the CLAHE algorithm with varying tile sizes, ranging from 5 to 25 px squared. Further, its clipping limit was varied between 2 and 26 to find the most effective configuration. Afterwards, an image padding with the same parameters as in the previous cropping step was applied to compensate for the resulting error in the defect position. For the subsequent Gaussian filtering, the kernel size was set to $k_{size}$ = 5 × 5 with σ = 0.5. The CLAHE configuration ranges above were based on the settings from Ma et al. (2017) and Muniyappan et al. (2013).
Fig. 4 This flowchart describes the calculation rule for the raw image pre-processing using the CLAHE method
With respect to our optimisation strategy, the settings of the pre-processing are expected to have a great influence on the defect detection. For this reason, every feasible combination of a pre-processing setting and a defect detection algorithm configuration was tested crosswise by the computer. The value ranges for the algorithm tests were determined on the basis of the corresponding references for each algorithm from Table 4. The most suitable parameter combination for pre-processing and defect segmentation for this application case was experimentally determined for each algorithm, as stated above. The related settings are presented in Table 5 for the pre-processing and in Table 6 for the defect segmentation algorithms. Each parameter was modified with a given increment. The configurations with the best performance for a given combination of pre-processing and segmentation settings were used for further investigations in this paper. The respective groups of defects were taken into account for this parameter optimisation. Regarding the manufacturing process, the lowest possible number of undetected manufacturing defects is desirable. For this reason, the reduction of undiscovered defect areas was chosen as a criterion for the optimisation, which in turn means a decrease in false negatives. As already mentioned in Sect. 1, the defect segmentation examined here is only the first stage of the overall defect analysis within the entire inspection system. Therefore, an excess area of up to 100% can be tolerated. This implies that the segmented area can be up to double the area defined as 'ground truth'. This iterative parameter optimisation was conducted for each segmentation algorithm investigated.
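The crosswise test of pre-processing configurations can be sketched as an exhaustive search. This is an illustration under assumptions, not the optimisation code of this study: the value ranges follow the pre-processing description, but the step sizes are invented, the scoring function `dummy_evaluate` is a placeholder for a real segmentation run on labelled images, and only pre-processing parameters are combined here.

```python
from itertools import product

# Ranges from the text; the step sizes are assumptions for illustration.
dilation_steps = range(4, 13, 2)   # 4 ... 12
tile_sizes = range(5, 26, 5)       # 5 ... 25 px
clip_limits = range(2, 27, 4)      # 2 ... 26

def select_best(configs, evaluate, max_excess=1.0):
    """Return the configuration with the fewest false negatives whose
    excess area does not exceed max_excess (1.0 = 100% of ground truth)."""
    best, best_fn = None, float("inf")
    for cfg in configs:
        false_negatives, excess_area = evaluate(cfg)
        if excess_area <= max_excess and false_negatives < best_fn:
            best, best_fn = cfg, false_negatives
    return best

def dummy_evaluate(cfg):
    """Placeholder score; a real run would segment the labelled images."""
    steps, tile, clip = cfg
    return abs(steps - 8) + abs(tile - 15), clip / 26.0

configs = list(product(dilation_steps, tile_sizes, clip_limits))
best = select_best(configs, dummy_evaluate)
```

The excess-area constraint acts as a filter, while the number of false negatives is the quantity that is minimised, which mirrors the priority given to undetected defects.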
The parameters are incrementally varied with a given step size to determine the configuration that produces the best detection results for each segmentation algorithm. The optimisation criterion for this step was the maximisation of the defect detection accuracy, ideally up to its optimum of 100%. Following both optimisation steps, the results of each algorithm under review were evaluated and then compared. The aim here was to approximate the parameter setting with the best overall detection performance with respect to the previously mentioned optimisation constraints. The findings discussed in this publication are always based on the most suitable configuration for the particular case under consideration. All the investigated algorithms were applied to the entire scan image directly, which is subsequently termed the Full Image (FI) variant. Additionally, a Grid Based (GB) variant was implemented. Here, the pre-processed measurement image with 1000 × 1000 px was split into nine equally sized partial images of 333 × 333 px each. The examined algorithm was executed for each of these partial images. In this way, slight differences in the height of the defect sample, and the resulting variations in image brightness, were to be reduced. Below, the evaluation of the individual performance characteristics is outlined.
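The Grid Based variant can be sketched as follows. The `mean_threshold` function is a hypothetical stand-in for any of the examined segmentation algorithms, and the small 6 × 6 px image replaces the real 1000 × 1000 px depth map. Note that 3 × 333 = 999, so with the stated tile size one pixel row and column of a 1000 px image stays uncovered in this sketch; how the original implementation treats this remainder is not stated.

```python
def grid_tiles(tile=333, cells=3):
    """Return (row_start, row_end, col_start, col_end) bounds of the grid cells."""
    return [(r * tile, (r + 1) * tile, c * tile, (c + 1) * tile)
            for r in range(cells) for c in range(cells)]


def segment_grid_based(image, segment, tile=333, cells=3):
    """Run `segment` on each tile and stitch the binary masks together."""
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]
    for r0, r1, c0, c1 in grid_tiles(tile=tile, cells=cells):
        part = [row[c0:c1] for row in image[r0:r1]]
        part_mask = segment(part)
        for dr, mask_row in enumerate(part_mask):
            mask[r0 + dr][c0:c1] = mask_row
    return mask


def mean_threshold(part):
    """Hypothetical segmentation stand-in: mark pixels above the mean value."""
    values = [v for row in part for v in row]
    mean = sum(values) / len(values)
    return [[1 if v > mean else 0 for v in row] for row in part]


# Small example: brightness rises in steps from top to bottom, one defect pixel.
image = [[100 * (r // 2) for _ in range(6)] for r in range(6)]
image[4][4] += 50

full_image_mask = mean_threshold(image)
grid_based_mask = segment_grid_based(image, mean_threshold, tile=2, cells=3)
```

In this toy case the Full Image variant marks the whole bright band, whereas the Grid Based variant isolates the single raised pixel, which mirrors the motivation of reducing brightness variations given in the text.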

Evaluation of results
For the analysis of the detection performance, eight attributes were examined. These are based on the criteria available in the literature and presented in Table 4. For this purpose, 50 defect samples per class were used, which is the maximum amount of test data currently available. For the automated evaluation of the algorithms' segmentation behaviour, the previously acquired image data was manually labelled. For this task, the Python tool 'LabelImg' (Tzutalin 2015) was used to provide 'ground truth' data for the comparison of different algorithms. At first, the processing duration was evaluated to reject algorithms with an excessive computing time. Afterwards, the detection accuracy d and the false negatives rate N as its counterpart were analysed to determine the detection performance of the algorithms. The detection accuracy d is the ratio of the number of computationally determined defects n_d to the number of labelled ground truth defects n_gt. These values refer to all defects of an observation. This leads to Eq. 8.
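Eq. 8 itself is not reproduced in this text. From the verbal definition above it would read as follows; treating N as the complement of d is an assumption drawn from the description of N as the counterpart of d.

```latex
d = \frac{n_d}{n_{gt}}, \qquad N = 1 - d
```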
Initially, the detection results are presented considering all types of defects. Afterwards, separate investigations are carried out considering two geometrically distinguishable defect groups. Group 1: wrinkle, twist; Group 2: gap, overlap. Foreign materials are difficult to assign to a distinct group. Therefore, they are not considered separately here. For the following detailed analysis, the false positive values were measured and divided into the count of false positive detections P_{n,i} and the pixel area of false positive estimations P_{a,i}. This separation aims to distinguish between the size and the number of these measurements. For correctly detected defects, the excess area A_{e,i} and the overlap area A_{o,i} were calculated in order to identify unnecessarily selected pixels and dispensable marked areas. The following calculations X_i were performed individually for each examined defect sample image I. For the results presented in Sect. 4, the mean value and the standard deviation were calculated from all individual observations X_i. Here, n_{d,i} gives the number of computationally determined defects for each sample image I. Moreover, K = n_{gt,i} describes the corresponding number of ground truth defects. n_{sd,i} is defined as the number of superfluously predicted defects with the corresponding superfluously recognised defect area A_{sd,i}. The variable A_{gt,i} is the manually labelled and accumulated defect area for the evaluated defect sample. C represents the number of detected defects which can be linked to a corresponding ground truth region. The individual detection accuracy d_i and false negatives rate N_i are given by Eq. 9.
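Since Eq. 9 is not reproduced here, the per-image evaluation can only be sketched under assumptions. The sketch below assumes d_i = C / K and N_i = 1 − d_i, which follows the verbal definitions of C and K but is not confirmed by the text. The use of the population standard deviation and the example counts are likewise assumptions.

```python
from statistics import mean, pstdev


def detection_rates(linked_detections, ground_truth_count):
    """Return (d_i, N_i) for one sample image as fractions in [0, 1].

    Assumes d_i = C / K and N_i = 1 - d_i, with C the number of detections
    linked to a ground truth region and K the number of ground truth defects.
    """
    if ground_truth_count <= 0:
        raise ValueError("each evaluated image needs at least one labelled defect")
    d_i = min(linked_detections, ground_truth_count) / ground_truth_count
    return d_i, 1.0 - d_i


# Hypothetical (C, K) pairs for three sample images.
observations = [(2, 2), (1, 2), (3, 3)]
rates = [detection_rates(c, k) for c, k in observations]

mean_accuracy = mean(d for d, _ in rates)
std_accuracy = pstdev(d for d, _ in rates)  # population form is an assumption
mean_false_negatives = mean(n for _, n in rates)
```

Averaging the per-image values X_i in this way corresponds to the mean and standard deviation reported in Sect. 4.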
The position accuracy per defect was calculated as described in Eq. 10. Here, E_{d,i} ∈ R_0^+ represents the Euclidean distance between the computationally estimated and the corresponding ground truth defect position.
The number-based P_{n,i} and area-based P_{a,i} false positive rates are given in Eq. 11.
The calculation of the excess area A_{e,i} and the overlap area A_{o,i} is presented in Eq. 12. Furthermore, the two most powerful algorithms were analysed more closely. The aim here was a more detailed examination of the segmentation behaviour of each algorithm. For this purpose, the findings considering all types of defects are presented first. Afterwards, the results are again split up into the two groups of defects mentioned above. Moreover, with the aim of investigating the robustness of the algorithms for other input data, noise was applied to the input images. This noise was Gaussian distributed for an individual sample image I at position (x, y), with each pixel value I(x, y) as its mean, μ(x, y) = I(x, y), and a standard deviation of σ = 0.5. Afterwards, the above investigations were repeated. The algorithms were executed without the previously required adjustments of the pre-processing and algorithm settings. This was also intended to check whether the related parameters have a high influence on the performance of the defect detection, as claimed before. Furthermore, this investigation served to examine the reproducibility of the results using different input data but the given configurations. All the related results are presented in the following section.
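The noise model described above can be sketched in a few lines. Each output pixel is drawn from a Gaussian distribution whose mean is the original pixel value and whose standard deviation is σ = 0.5. The function name, the seed parameter and the list-of-lists image representation are assumptions for illustration; whether the noisy values were clipped or rounded afterwards is not stated in the text and is therefore left out.

```python
import random


def add_gaussian_noise(image, sigma=0.5, seed=None):
    """Draw every pixel from N(mu=I(x, y), sigma) and return a new image."""
    rng = random.Random(seed)  # a fixed seed makes the experiment repeatable
    return [[rng.gauss(value, sigma) for value in row] for row in image]


# Hypothetical flat 100 x 100 px depth map with a constant value of 10.
flat_image = [[10.0] * 100 for _ in range(100)]
noisy_image = add_gaussian_noise(flat_image, sigma=0.5, seed=1)
```

Keeping the pre-processing and algorithm settings fixed while feeding such noisy images is what turns this step into a robustness test rather than a second optimisation run.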

Results
This section analyses the performed experiments and their results. The individually modified pre-processing settings from Table 7 and the segmentation algorithm settings from Table 8 were used for all subsequent experiments. These settings were identified using our individual optimisation procedure. This means that for each algorithm considered, all settings for the pre-processing and the algorithm itself are individually adjusted. All parameter combinations are tested crosswise and the setting with the highest detection accuracy is then applied. The corresponding parameter adjustments are listed in Table 5 for the pre-processing settings and in Table 6 for the detection algorithm parameters. At first, the execution times of the seven considered algorithms are investigated. Figure 5 displays the calculation times for each selected algorithm. Table 9 summarises the corresponding values including their standard deviations. The MS, AT and OT algorithms were the swiftest to perform the segmentation. They require less than 31 ms for the FI implementation and below 10 ms for the GB version on each entire input image. The GF algorithms perform worst with an average computing time of more than 300 ms. The calculation time in this case is independent of the content of the input image. In order to assess the calculation times, we should remember that the scan of the considered fibre material sample would be performed in 0.25 s at a maximum AFP process speed of 1 m/s. The goal must be the completion of the defect segmentation of a measurement section before the scan of the next section is completed. Thus, all algorithms with < 250 ms execution time are basically applicable. However, a faster defect detection helps to react as quickly as possible to arising defects. Thus, a very short calculation time is desired.
Afterwards, the detection performances of these seven techniques are examined. Figure 6a displays the results for the average detection accuracies d and the corresponding false negative rates N, considering all the available data from all defect types. Figure 6b shows the results for the gap and overlap defect detection. Figure 6c presents only the findings for the segmentation of wrinkles and twists.
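The execution-time budget discussed above follows from simple arithmetic, sketched below. The section length of 0.25 m is inferred from the stated scan time of 0.25 s at 1 m/s and is an assumption, as are the function names. The runtimes 31 ms and 300 ms are the bounds quoted in the text for the fastest FI implementations and for GF, not individual measured values.

```python
def time_budget_ms(section_length_m, process_speed_m_per_s):
    """Time available for segmenting one scanned section, in milliseconds."""
    return 1000.0 * section_length_m / process_speed_m_per_s


def is_applicable(mean_runtime_ms, budget_ms):
    """An algorithm qualifies if it finishes before the next scan is complete."""
    return mean_runtime_ms < budget_ms


budget = time_budget_ms(section_length_m=0.25, process_speed_m_per_s=1.0)

fast_enough = is_applicable(31.0, budget)   # upper bound for FI MS, AT and OT
too_slow = is_applicable(300.0, budget)     # lower bound reported for GF
```

A slower layup speed would widen the budget proportionally, so the 250 ms limit should be read as the worst case at maximum process speed.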
In order to assess the results, we should bear in mind that the defect detection is implemented at the beginning of the image processing chain. Thus, it is important to segment as many defects as possible and pass them to the following image processing. Unnecessarily segmented areas are less problematic due to the subsequent classification step. Moreover, for the requirements in the aerospace industry it is essential to have as few defect misses as possible. Hence, detection accuracies of ≥ 95% are highly desirable. To begin with, we look at the detection performance considering all defect types. Here, we can recognise that the AT and CT algorithms in the FI and the GB implementations perform best for this experimental setup. The fifth most accurate is the FI GT algorithm. All the other techniques have detection rates of less than 50%, which is not suitable for this application.
For an independent segmentation of wrinkles and twists with a pre-processing optimised for this purpose, we see that the detection rates for the GT, CT and AT methods are almost 100%. On the other hand, the GF algorithm finds no defects at all. The remaining algorithms all have detection rates between about 45% and 85%. This trend is very similar to the analysis of all defects. These results change slightly for the individual detection of gaps and overlaps. In this case, the findings for both CT implementations as well as the calculated detection results for the GB AT are > 95%. Except for the FI OT, GF and IP implementations, all the other algorithms generate detection results > 80%. These results reveal the previously predicted major influence of image pre-processing and algorithm parametrisation. Furthermore, the categories of defects defined in this paper seem to be meaningful, since the detection performance for the individual groups increases significantly. For a realistic application of the defect segmentation with initially unknown defect types, running the image pre-processing and defect segmentation in parallel with several different settings could be beneficial. The individual results could then be linked in a suitable manner. Due to the good performance of the AT and CT techniques, they are investigated in detail below. Especially the detection results for the analysis of all considered defects are much better than for the competing algorithms. Their detailed findings are presented in Fig. 7. Figure 7a shows the corresponding segmentation performance considering all defects. Figure 7c presents the detection results only for gaps and overlaps. Similarly, Fig. 7d displays the findings for the detection of wrinkles and twists. In order to investigate the robustness of these algorithms, Fig. 7b visualises the detection results for noisy input images using the same algorithm settings that are used for the outcome in Fig. 7a.
The findings of the analysis of all defects belonging to Fig. 7a are additionally listed in Table 10. To begin with, we examine the results presented in Fig. 7a for all types of defects. In this case, the GB AT algorithm yields the highest overall score of 100% detection accuracy. The GB CT method then follows with a detection rate of 98.2% (σ = 4.1%). A quite comparable effect can be seen for the accuracies of the defect position. Here, the GB AT and both CT algorithms have positional accuracies of about 78%. Then, the FI AT algorithm follows with a positioning accuracy of around 68%. Unexpectedly, the number-based false positive rates are much smaller for the two FI implementations than for the GB ones. It is also noteworthy that the average ground truth areas and the automatically segmented regions overlap by only about 15% in both methods. However, the protruding areas for detected defects are rather large. In particular, the detection results for the evaluation of all considered defects are significantly superior to those of competing methods. The detection accuracy is > 95% with a false negative rate < 5%. Here, the FI AT method for overlaps and gaps constitutes an exception with around 83% detection accuracy. The defect position accuracy for wrinkles and twists is about 65%. For gaps and overlaps this value is somewhat greater at around 85%. Among these, the FI AT forms an exception with about 75% positioning accuracy. A contradictory behaviour can be recognised for the area-based false positive assessment. Gaps and overlaps generate false positive values for less than 20% of cases. For wrinkles and twists this value is larger than 35%, apart from the GB CT algorithm with about 19%. The number of false positive detections is always between about 20% and 42%.

Table 7 The ascertained, most capable parameter configuration for the image pre-processing in relation to the best segmentation result is presented
This means that the size and presumably the shape of a defect area change due to the defect geometry, but the numbers of false positive detections are very similar. The values for the excess areas are between 45% and 80% for all defects and algorithms. The overlap of the ground truth regions and the detected areas is between 5% and 20%, whereby gaps and overlaps show slightly better overlap results in this context. These findings show that the examined algorithms represent the area of a defect poorly. In order to use them in an inspection application, the region of interest must be artificially enlarged. However, due to the sufficient positional accuracy of the selected algorithms, this extension of the viewing area is reasonably achievable. The shape of the expanded area could be connected to the algorithm configuration, since this is of varying sensitivity for different defect types.

Fig. 5
The radar diagrams compare the calculation times of the individual segmentation algorithms, each for the FI and GB versions, in milliseconds [ms]
The findings for the input images with noise are slightly different. They are displayed in Fig. 7b. These outcomes are generated from the same data set as well as the same pre-processing and algorithm configuration as for the results in Fig. 7a, but with additionally applied artificial noise. As already introduced in Sect. 3.5, this modification of the input data serves to test the robustness of the algorithms and of the settings for modified images with constant configurations. The requirement for a data-dependent parameter adjustment has already been claimed above and is also reviewed in this way. This noise leads to a reduction of the detection accuracies for the GB AT algorithm by about 10% and for both CT methods by around 40%. The number of false positive detections stays equal for the AT algorithms. Surprisingly, the number of false positive detections for the CT algorithms decreases significantly by about 10%-20% compared to the noise-free data. An improved setting, especially for noisy images, may yield much better results. However, statistical noise similar to that applied by way of example in this experiment can be caused by disturbing scattering or beam propagation behaviour of the laser on the material. This can lead to local noise deviations in the image, depending on the geometry of the component. This must be taken into account for a real application. In this case, a local image pre-processing technique might lead to even better detection results. In order to compare the theoretical ranking of the algorithms from Table 4 with the experimental test results, these findings are comprehensively summarised in Table 11. For this purpose, the same expectation weights are used as for the theoretical appraisal. What is noticeable here is the misjudgement of the GF algorithm. Based on the literature from other use cases, the GF algorithm was judged to be very promising for the defect segmentation. However, the results in our experiments show the opposite. All the other algorithms behave approximately as expected. The findings presented above are discussed in the following section.

The values belong to Fig. 5 and are given in milliseconds [ms]

Discussion
In the following, the experimental results are discussed and reviewed in the context of related studies. Additionally, the research questions are answered and an outlook on further research is proposed. Compared to the defect detection findings based on Neural Networks from e.g. Schmidt et al. (2019a, b), the classical image segmentation techniques investigated in this paper also provide quite good results. The chosen segmentation algorithms worked quite well except for the GF algorithms, which yield very poor detection results. On this matter, Hanbay et al. (2016) pointed out in their publication that the GF is suitable only for fairly uniformly structured image data and otherwise works rather poorly. The LLSS topology data of the prepreg material in this paper contains unevenly distributed artefacts. Considering this aspect, the performance of the GF algorithms is comprehensible. The OT methods also provide weaker performance in comparison to Tonnaer et al. (2017), which is probably due to the brightness gradient in the images. This effect causes the algorithms to segment a very large region of the image. An improved pre-processing, which is more tolerant of changes in brightness, may reduce this issue. It should also be noted that the CLAHE image adjustment algorithm used in this paper is very sensitive to changes in the "clip limit" parameter. This is a previously defined value; areas of the histogram above this clip limit are spread across the entire base of the histogram.

Fig. 7 The detection results for the well performing methods AT and CT are displayed. The crossbars each represent the corresponding standard deviations

Table 10 (surviving row) Overlap area (A_o): 11.6 (10.6), 9.0 (10.1), 9.9 (9.1), 15.7 (11.6). The values belong to Fig. 7a and are given as <value> (<standard deviation>) in %

Table 11 (caption notes) Assessment results are given in bold. The italic row gives the weights of the assessment criteria for the evaluation
From the results it can be seen that the pre-processing is very important for noise reduction. This stage needs to be individually adapted for each setup and manufacturing process. Furthermore, the selection of algorithms mainly based on findings from fabric inspection is considered valuable for our use case. Regarding the research questions, it should be noted that a sufficiently good defect detection can be performed with the AT and the CT algorithms for CFRP prepreg material. The GT algorithms also provide satisfactory defect detection results, but only for geometrically larger defects or as the GB variant. For most of the defect types considered, the AT and CT algorithms have a detection accuracy of > 95%. For prominent defects, the mean detection accuracy is even up to 100%. Moreover, the defect position is usually determined with sufficient accuracy. In concrete terms, this means a position deviation of less than 30% from the marked ground truth defect position. Unfortunately, the actual defect segmentation in terms of the segmented region is much worse, for both overlap and excess area. Below, valuable future research opportunities are outlined.

Future work and prospect
In future work, the effects of different CFRP materials and their interaction with the LLSS settings must be investigated. In particular, varying optical material characteristics can have a great influence on the performance of certain algorithms. Thus, these optical material parameters have to be examined in detail and linked to the current results. It is also worth noting that the pre-processing can be improved for better detection performance, though this was not considered part of this work. In addition, it is conceivable to supplement the defect segmentation procedure presented in this paper with a machine learning approach. For instance, a filter-based technique could be retained for better comprehensibility of the algorithm's behaviour, while the parameter setting might be carried out through a machine learning approach. Furthermore, the pre-processing parameters could be adjusted through machine learning techniques for optimised performance. Beyond that, a combined one-stage segmentation and classification of the defects via a deep learning algorithm is also possible. However, for the traceability of such advanced machine learning or even deep learning algorithms and their application in the aviation sector, additional methods for the explainability of the deep learning decision must be applied (Lee et al. 2021). Besides, the necessary amount and quality of training data need to be available, as explained earlier.
The key findings of this paper are briefly repeated in the subsequent section.

Conclusion
Below the main results of the investigations are summarised.
The Grid Based Adaptive Thresholding and the two Cell Wise Standard Deviation Thresholding algorithms achieve detection accuracies of > 97%. Furthermore, the defect position is mostly estimated sufficiently well with > 75% position accuracy. This implies a position deviation of less than 25% from the ground truth defect position. The actual segmentation accuracy is much worse: the average overlap rate is less than 20%. Nevertheless, due to the sufficient detection accuracy this can be compensated with an extended defect region. Thus, the findings meet the requirements described above for defect detection in the aerospace industry very well. For this reason, the research in this paper provides a sound foundation for the selection and parametrisation of suitable defect detection algorithms for LLSS based inspection systems in composite production. Furthermore, the methodology from this paper can be applied by operators and developers of defect detection systems to configure their algorithms for a considered material. It should also be noted that the type and configuration of the pre-processing as well as the quality of the input data strongly influence the performance of the algorithms under consideration. Additionally, the laser scattering and beam propagation within the fibre material might lead to disturbing effects. Thus, these optical material parameters have to be examined in detail and linked to the current results. For this purpose, it might be helpful to investigate the laser line shape and disturbing influences from the recorded camera image. Accordingly, the development of an appropriate quality metric can be very beneficial for the efficient parametrisation of the image pre-processing.
Funding: Open Access funding enabled and organized by Projekt DEAL. The research was carried out within the framework of the German Aerospace Center's core funded research.

Availability of data and material:
The image data used will be provided on request.

Conflict of interest:
The authors declare that they have no conflict of interest.
Code availability: The corresponding Python code is made available via 4TU portal https://www.doi.org/10.4121/c.5180657 in case the paper is accepted.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.