Automatic reconstruction of irregular shape defects in pulsed thermography using deep learning neural network

Quantitative defect and damage reconstruction plays a critical role in industrial quality management. Accurate defect characterisation in Infrared Thermography (IRT), one of the most widely used Non-Destructive Testing (NDT) techniques, always demands adequate pre-knowledge, which poses a challenge to automatic decision-making in maintenance. This paper presents an automatic and accurate defect profile reconstruction method that takes advantage of deep learning Neural Networks (NN). First, a fast Finite Element Modelling (FEM) simulation of IRT is introduced to simulate defective specimens. A Mask Region-based Convolutional NN (Mask-RCNN) is then proposed to detect and segment the defect using a single thermal frame. A dataset with a single defect shape is tested to validate the feasibility. Then, a dataset with three mixed defect shapes is inspected to evaluate the method's capability in defect profile reconstruction, where an Intersection over Union (IoU) of over 90% is achieved. The results are compared with several state-of-the-art post-processing methods in IRT to demonstrate the method's superiority at detailed defect corners and edges. This research provides solid evidence that AI deep learning algorithms can be utilised to provide accurate defect profile reconstruction in thermography NDT, which will contribute to the research community in material degradation analysis and structural health monitoring.


Introduction
Quantitative damage and degradation evaluation for key components, structures and materials is widely recognised as an indispensable cornerstone of industrial modernisation, maintenance science and life-cycle engineering management. The requirements of Structural Health Monitoring (SHM), manufacturing quality control, and damage inducement and propagation analysis raise an increasing demand for inspection accuracy, resolution, and decision reliability. Non-Destructive Testing and Evaluation (NDT&E), as one of the main techniques to achieve this, takes the lead in the deployment of practical failure and damage evaluation in industrial and defence applications [1]. It senses discontinuities and anomalies in a physical field to identify defects in an object without causing any modification or destruction. The defects can be characterised by quantitatively analysing the type, distribution, features, and pattern of the anomaly.
Among the diverse NDT&E techniques, Infrared Thermography (IRT) is a powerful technique that offers rapid, non-contact, robust and non-invasive inspection, and it has been widely applied for component integrity measurement and material characterisation [2]. As the most popular active IRT technique, pulsed thermography (PT), sketched in Fig. 1, is especially attractive for subsurface defect detection because it employs a transient external source to heat the specimen, facilitating rapid, widely applicable, intuitive and relatively low-cost thermal imaging inspection. In principle, it uses the thermal contrast that develops on the object surface to detect the subsurface defect, supported by thermal pattern enhancement algorithms to estimate defect size and profile. However, because traditional PT heats and observes the same surface, it is challenging to accurately reconstruct the profile of deep and small defects from their blurred thermal patterns.
Traditional PT post-processing methods aim to improve the defect contrast and Signal-to-Noise Ratio (SNR) to better measure the location, size, and depth of defects. Typical methods, including Thermographic Signal Reconstruction (TSR) [3], Principal Component Thermography (PCT) [4] and Pulse Phase Thermography (PPT) [5], have been widely used. However, accurate reconstruction of the defect profile with these methods is time-consuming, needs numerous calibrations and human pre-knowledge, and is sometimes even impossible for small and deep defects. Furthermore, deviations in the heating, heat transfer, and capturing processes result in uncertainty and make decision-making in periodic inspection difficult. For material degradation analysis and structural health monitoring, defect profile characterisation, especially at the sharp corners and edges of irregular defects, is essential, as it can indicate the cause of the degradation and support defect propagation prediction. Therefore, an accurate defect profile reconstruction method that requires less inspection pre-knowledge is in high demand and would advance inspection automation.
Unlike traditional thermography characterisation methods, this work focuses on the reconstruction of irregular defect spatial profiles by proposing an end-to-end deep learning automatic inspection method. Using a fast FEM simulation tool for PT, a database of multiple defect profiles can be generated efficiently for deep learning network training. This provides a powerful tool for comprehensive database generation and augmentation, which is especially suitable for data-driven machine learning techniques. Then, a Mask-RCNN-based network is applied to investigate the defect profile reconstruction performance on a mixed-shape defect dataset. This paper is structured as follows: the related works are discussed in Sect. 2. Section 3 presents a fast FEM simulation tool for PT inspection dataset generation. Section 4 provides a Mask-RCNN-based network for defect characterisation. Section 5 reports the inspection results in comparison to traditional methods. Section 6 discusses the network performance, and the conclusions are given in Sect. 7.

Related works
Machine learning methods, as a data-driven approach, are known for their comprehensive learning and remarkable automatic decision support. State-of-the-art Neural Networks (NNs) have been reported to be effective in defect identification using different patterns of thermographic signals, such as the pixel-wise temporal signal or the spatial correlation within one frame. For thermal temporal information, Recurrent NNs (RNNs) are useful in defect feature learning and extraction. Fang and Maldague [6] focused on the thermal temporal information and proposed a defect depth quantification strategy that employs a Gated Recurrent Unit (GRU) model to process the temporal features of a small region through a fully connected layer. Luo et al. [7] proposed the application of Long Short-Term Memory (LSTM), which uses the transient information on both sides of the temperature peak combined with spatial pixel transformation to acquire a reliable probability of detection. Another LSTM model was reported to classify four types of defects (water, oil, air, and adhesive filled) by learning the thermal conductivity deviation of defects in temporal responses [8]. Similarly, Duan et al. [9] focused on defect classification by training a parametric standard NN with 2 hidden layers using feature vectors extracted from TSR coefficients. These results demonstrate that RNN-based methods are remarkable in defect depth measurement and defect conductivity classification, but their characterisation of the defect's spatial profile is mediocre.
Additionally, image-based Convolutional Neural Networks (CNNs) have also proved effective in IRT [10]. One common approach is to reduce the data dimension from 3D to 2D by rearranging thermal image pixels. Xie et al. [11] proposed an unsupervised Autoencoder model to improve the visibility of rear surface cracks in inductive thermography. Marani et al. [12] used a CNN to detect defects with multiple sizes and depths. These two methods consider more temporal information than spatial information because each image is rearranged into a 1D vector, which undermines the thermal spatial correspondence; they are therefore less capable of defect profile regression than of identification. Another type of CNN-based method explores a single thermal image, where VGG, CNN variations [13] and transfer learning models have been applied to cooling radiators [14], solar photovoltaic modules [15], CFRP manufacturing [16,17], machine fault diagnosis [18] and crack detection [19,20]. In combination networks, transfer learning and deep learning are structured to improve defect detection in composites [21]. A U-net [22] architecture combined with the traditional PCT method was proposed to perform defect profile segmentation by analysing the likelihood between defective areas and sound areas. These models focus on the exploration of spatial information rather than temporal information to characterise defects. Furthermore, a comparative study investigated the detectability of multiple-size defects using YOLO [23], Fast-RCNN [24], and Mask-RCNN on a fusion dataset of synthetic and experimental data [25]. It reports superior overall performance in defect detection and segmentation using the Mask-RCNN network. However, these studies mainly used regular defect shapes such as circles or squares, and very few studies focus on irregularly shaped defects.

Fig. 1 The principle of pulsed thermography (PT)

In real applications, complex defect profiles with sharp inflexion edges are common and difficult to reconstruct using the existing methods. These types of defects introduce mechanical stress concentration and often lead to defect propagation. However, the surface thermal pattern of these inflexion points and sharp corners is vulnerable to aliasing with the surrounding heat and sensitive to noise. Therefore, a study to automatically reconstruct irregular defect profiles is valuable and in high demand.

Overall methodology
The overall methodology of the proposed method is shown in Fig. 2. The automatic inspection method starts with dataset preparation. To ensure the network can learn the thermal response of defects, a general dataset of PT inspection results covering multiple defects with different irregular profiles is generated. A FEM simulation tool for PT inspection is applied, which reduces the repetitive calculation burden across different defective models. Then, by considering the realistic deviations in heating (pulse energy and duration noise) and capturing (thermal image noise), the general dataset is further processed and augmented. The dataset is divided into training, validation and testing sets for Mask-RCNN. Finally, the testing sets are employed to measure the model performance. To validate the network step by step, three sub-datasets selected from the general dataset are used for different objectives. First, a single-shape defect subset is tested to validate the network feasibility. Then, a mixed-shape defect subset is used to demonstrate the method's capability to reconstruct multiple irregular profiles. Thirdly, the frame selection is varied to test the network stability on thermal images from different times.

A fast PT FEM simulation
To establish a general dataset, a large number of accurate FEM models of known defects is indispensable, so fast generation and efficient memory use are valuable. In this research, model-based FEM forward modelling with a fast solver is adopted.
The governing equation of the PT heat transfer can be written as Eq. (1), and its FEM governing form is shown in Eq. (2).
where $T$ and $\{\nabla\}$ denote the temperature and the gradient vector, and $\bar{Q}_{\Gamma}$ is the excitation heat flux applied on the surface region ($\Gamma$). $\kappa$, $\rho$ and $c$ denote the heat conductivity, density, and specific heat of the material, respectively. By using the Fourier transformation, the response signal can be calculated based on the frequency domain summation method. The transformed FEM governing equations are presented as Eqs. (3) and (4), where $\{Q\}$ denotes the excitation heat source, and $[K]$ and $[C]$ are the coefficient matrices of the FEM equations. To guarantee precision and efficiency, two strategies are applied. Firstly, the number of necessary harmonic frequency components is reduced by using an interpolation strategy [26]. Secondly, a fast forward scheme based on the database strategy is applied to realize the rapid calculation of the temperature perturbation caused by a bottom-hole defect [27].
A numerical model of the thermography problem is shown in Fig. 3. The symbol $\Omega$ denotes the whole defect-free object; $\Omega_c$ denotes the defect region; and $\Omega_0$ denotes a selected suspect region that contains the defect. To approximate the defect, $\Omega_0$ is chosen as a flat-bottom region with a thickness equal to that of the defect. The fast forward scheme based on the database strategy can be described as follows using this numerical model.
Upon subtracting the single-frequency governing equation for the unflawed model from that for the flawed model, one obtains the governing equations for the field perturbation $T^f$ of single-frequency IRT problems. After separating all the nodes into three parts, the FEM equations of the IRT problem become Eq. (5),
where $[K]$ is the FEM discrete coefficient matrix of the flawless model, $[K_{ij}]$ is the coefficient matrix corresponding to the defect model, $\{T_1\}$ is the temperature at the nodes of the defect elements, $\{T_3\}$ is the temperature at the inspection surface nodes and $\{T_2\}$ collects the remaining unknowns. By denoting the inverse matrix of $[K]$ as $[H]$, Eq. (6), which relates the unflawed model to the defect perturbation field in $\Omega_c$, can be obtained, and the surface perturbation field response can then be derived from Eq. (7).
Since the coefficient matrices $[H_{11}]$, $[H_{31}]$ and the temperature vectors $\{T^0_1\}$, $\{T^0_3\}$ are independent of the defect geometry, once they have been calculated a priori and stored as databases, there is no need to recalculate the whole model for defects of different profiles. Thus, the fast scheme significantly improves calculation efficiency by greatly reducing the dimension of the governing equations [27].
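The database idea can be illustrated numerically. The sketch below is a minimal real-valued analogue and not the authors' solver: it assumes that the defect changes the coefficient matrix only in the block coupling the defect nodes (called `dK11` here, a hypothetical name), so that the perturbation satisfies $[K]\{T^f\} = -[\Delta K](\{T^0\} + \{T^f\})$. All function and variable names are illustrative, and the random matrices merely stand in for an assembled FEM system.

```python
import numpy as np


def fast_perturbation(H11, H31, dK11, T0_1):
    """Solve for the defect-induced perturbation using only small blocks.

    H11, H31 : blocks of H = inv(K) of the unflawed model (precomputed database)
    dK11     : change of the coefficient matrix on the defect nodes (assumed form)
    T0_1     : unflawed temperature at the defect nodes (precomputed database)
    """
    n1 = H11.shape[0]
    # (I + H11 dK11) Tf_1 = -H11 dK11 T0_1: a system of defect-region size only
    Tf_1 = np.linalg.solve(np.eye(n1) + H11 @ dK11, -H11 @ dK11 @ T0_1)
    # The surface response follows from a matrix-vector product, no further solve
    Tf_3 = -H31 @ dK11 @ (T0_1 + Tf_1)
    return Tf_1, Tf_3


def demo(n=30, n1=4, n3=6, seed=0):
    """Compare the fast scheme with a full re-solve on a random test system."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((n, n))
    K = M @ M.T + n * np.eye(n)          # symmetric positive definite stand-in
    Q = rng.standard_normal(n)
    defect = np.arange(n1)               # indices of the defect nodes (part 1)
    surface = np.arange(n - n3, n)       # indices of the surface nodes (part 3)

    D = rng.standard_normal((n1, n1))
    dK11 = -0.1 * (D @ D.T)              # small symmetric change on defect nodes

    # Database quantities: computed once for the unflawed model
    H = np.linalg.inv(K)
    T0 = H @ Q
    _, Tf_3 = fast_perturbation(H[np.ix_(defect, defect)],
                                H[np.ix_(surface, defect)],
                                dK11, T0[defect])

    # Reference: assemble and solve the full flawed system
    K_flawed = K.copy()
    K_flawed[np.ix_(defect, defect)] += dK11
    Tf_full = np.linalg.solve(K_flawed, Q) - T0
    return Tf_3, Tf_full[surface]
```

In this toy setting the only linear solve per defect has the size of the defect region, while `H` and `T0` are reused for every new defect profile, which is the source of the speed-up described above.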

Defect modelling
The thermal inspection dataset consists of synthetic simulation data of a plate structure with multiple flat-bottom hole defects. This work aims to investigate the capability of defect profile reconstruction. To avoid mutual interference between multiple defects in one specimen, each defective specimen is modelled individually in the same plate model, and the parameters of the excitation source are fixed. The model used is illustrated in Fig. 4, where a triangle-shaped flat-bottom hole at 3.5 mm depth is shown. The thermodynamic material parameters for the model are listed in Table 1.
For the defect models, firstly, 9 square defects are designed with three sizes (121 mm², 49 mm², 1 mm²) at three depths (4.5 mm, 3.5 mm, 2.5 mm from the surface). This single-shape, multiple-size design aims to study the feasibility of the proposed approach. Secondly, three types of flat-bottom hole shapes are modelled using the above simulation. The planar rectangular and triangular shapes consist of 20 defects (10 for each shape) at the same depth (3.5 mm from the top surface). These models are designed to evaluate whether the network can distinguish the target among various shapes and then mask the sharp corners and edges accurately. The specifications and shape profiles of all models are listed in Fig. 5, where the boundaries of the rectangular and triangular flat-bottom hole defects are shown using a binary colormap because they share the same depth (3.5 mm from the surface). In contrast, the square defects are shown in three grayscale levels because they have different depths.

Dataset pre-processing
To increase the applicability, the PT simulation database is pre-processed by data augmentation that accounts for systematic noise, including camera capturing noise and heating variation. For the heating variation, five heating energy magnitudes relative to the nominal $1 \times 10^{6}$ W/m² (96%, 98%, 100%, 102%, 104%) are simulated for each model. In addition, five pulse durations around the nominal 0.2 s (0.18 s, 0.19 s, 0.2 s, 0.21 s, 0.22 s) are simulated for each model. In summary, 25 heating variants are simulated for each defect. To simulate noise during data capture, white noise with different signal-to-noise ratio (SNR) levels is added to each thermal image. More details about the added noise can be found in our previous work [28]. For Mask R-CNN training, the profile image of each defect shown in Table 2 is annotated as ground truth and stored in JSON files for convenient use with the ResNet network code. Based on this pre-processing for data augmentation, a general database including 3625 PT thermal inspection image sequences is established for further processing.
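The augmentation described above can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, and the SNR definition used here (mean signal power over noise power, with zero-mean Gaussian noise) is an assumption, since the exact noise model is given in [28] rather than in this section.

```python
import itertools

import numpy as np

ENERGY_SCALES = (0.96, 0.98, 1.00, 1.02, 1.04)     # relative to 1e6 W/m^2
PULSE_DURATIONS = (0.18, 0.19, 0.20, 0.21, 0.22)   # seconds


def heating_variants(nominal_flux=1e6):
    """Return the 25 (heat flux, pulse duration) combinations per defect model."""
    return [(nominal_flux * scale, duration)
            for scale, duration in itertools.product(ENERGY_SCALES, PULSE_DURATIONS)]


def add_white_noise(image, snr_db, rng):
    """Add zero-mean Gaussian noise so that the result has the requested SNR.

    Assumption: SNR (dB) = 10 * log10(mean signal power / noise power).
    """
    image = np.asarray(image, dtype=float)
    signal_power = np.mean(image ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return image + rng.normal(0.0, np.sqrt(noise_power), size=image.shape)
```

If each of the 29 defect models (9 square, 10 rectangular, 10 triangular) is combined with these 25 heating variants and 5 noise levels, the count is 29 × 25 × 5 = 3625, which agrees with the database size stated above; the exact composition of the database is an assumption here.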

Network architecture
Mask R-CNN is employed as the defect-recognition framework in this study. As a state-of-the-art object recognition algorithm, it has a strong capability for instance segmentation, which provides the shape of the detected object and can be used to investigate the defect profile. The proposed framework is a two-stage structure, as shown in Fig. 5. The first stage aims to propose the potential regions of interest (ROIs) from the raw image. In this stage, the ResNet network is employed as the backbone network for feature extraction and is followed by a Region Proposal Network (RPN), which predicts the region proposals that include the objects. Then the ROIAlign layer is used to preserve the spatial information, which is the key to mask prediction. It solves the misalignment between the ROI and the extracted features by using bilinear interpolation, which removes the quantisation of the pooling operation. The second stage has two branches. In the classification branch, several fully connected layers perform the classification and bounding box regression for the proposed ROIs. The mask branch predicts the pixel-wise segmentation mask on each ROI through convolution and deconvolution operations.
In the proposed framework, as shown in Fig. 5, ResNet 101 [29] is employed to produce the feature map, followed by an RPN to propose the ROIs. Instead of using the ROIAlign layer to produce the aligned feature map, the PointRend rendering neural network is applied in this framework to enhance the detailed edge detection capability, mitigating the over-smooth boundary issue for sharp corners and irregular shapes. It refines the coarse mask by selecting a set of points whose labels are predicted by a multi-layer perceptron, using an adaptive subdivision step that employs bilinear interpolation to compute the edge at a higher resolution. This framework is developed and implemented in Python using the TensorFlow 2.0 deep learning framework.
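Both ROIAlign and the PointRend subdivision step rely on bilinear interpolation to read a feature map at non-integer coordinates. The following is a minimal NumPy sketch of that single operation, written for illustration; it is not taken from the implementation used in this work, and the function name is hypothetical.

```python
import numpy as np


def bilinear_sample(feature, y, x):
    """Sample a 2D feature map at a real-valued location (y, x)."""
    h, w = feature.shape
    # Clamp the query point to the valid coordinate range
    y = float(np.clip(y, 0, h - 1))
    x = float(np.clip(x, 0, w - 1))
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y
    top = (1 - wx) * feature[y0, x0] + wx * feature[y0, x1]
    bottom = (1 - wx) * feature[y1, x0] + wx * feature[y1, x1]
    return (1 - wy) * top + wy * bottom
```

Because the sampled value varies continuously with the query position, no rounding of ROI or point coordinates to the pixel grid is needed, which is what preserves the fine boundary detail.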
The loss function of Mask RCNN is written as:
$$L = w_C L_C + w_B L_B + w_M L_M \quad (8)$$
where $L_C$, $L_B$ and $L_M$ denote the loss functions of the classification, bounding box and mask generation processes, respectively, and $w_C$, $w_B$ and $w_M$ denote the corresponding weight coefficients.
For ResNet101, transfer learning is applied by initialising the model with weights pre-trained on the COCO dataset, which improves the training efficiency. Adam was used as the optimiser. The weight decay was set to 0.0001, the learning rate to 0.001 and the learning momentum to 0.9. The number of training steps and validation steps per epoch are listed in Table 2. This research aims to characterise the defect boundary profile of a single defect in each model separately. Hence, the key focus is the recall of the profile rather than the classification of the defect type. To evaluate the performance, the mean Intersection over Union (mIoU) is calculated to measure the overlap between the predicted profile and the ground truth, written as Eq. (9).
$$\mathrm{IoU} = \frac{A_{\cap}}{A_{\cup}} \quad (9)$$
where $A_{\cap}$ and $A_{\cup}$ denote the areas of overlap and union between the predicted segmentation and the ground truth, respectively.
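For binary masks, the overlap measure can be computed directly from pixel counts. The sketch below is illustrative: the function names are hypothetical, and returning 1.0 when both masks are empty is a convention chosen here, not something specified in this work.

```python
import numpy as np


def mask_iou(pred, truth):
    """Intersection over Union of two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as perfect agreement (convention)
    return float(np.logical_and(pred, truth).sum() / union)


def mean_iou(preds, truths):
    """Average the per-image IoU over a set of predictions."""
    return float(np.mean([mask_iou(p, t) for p, t in zip(preds, truths)]))
```

As a worked case, a 10 × 10 pixel square predicted one pixel to the side of the true square overlaps in 90 pixels and covers 110 pixels in total, giving an IoU of 90/110 ≈ 0.82. A shift of a single pixel on a small defect therefore already costs a large share of the score.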

Datasets for defect reconstruction
Single defect shape dataset
In the beginning, to validate the detection feasibility of the proposed method, single-shape defect detection using the square defect models is conducted to test the architecture. This process consists of 9 sub-datasets with different sizes and depths for training, validation, and testing. It not only demonstrates the detection feasibility but also investigates the detectability (detectable depth-to-size ratio) of this single-shape learning process. For each dataset, a total of 225 inspected thermal images, including 25 systematic deviations (5 heating energies × 5 pulse durations) and 5 SNR levels of noise (42-58 dB), are prepared. Among these data, 180 images are randomly selected as the training dataset, 22 as the validation dataset and 23 as the test dataset (8:1:1).

Mixed defect shape datasets
In the second part, different defect shapes are mixed in one dataset to test the method's capability in defect profile reconstruction. In this dataset, the selected thermal images of 10 triangular and 10 rectangular defects are mixed. Then, by adding 20 SNR levels of noise (from 41 dB to 60 dB) to each defect, 400 images (20 models × 20 noise levels each) are used in total. The ratio of the training, validation, and testing datasets is 8:1:1, which means that they contain 320, 40, and 40 thermal images, respectively.
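The 8:1:1 partition used for both datasets can be reproduced with a simple shuffled index split. This is a sketch under the assumption that the training and validation sizes are rounded down and the remainder goes to the test set, which matches the counts quoted in this section; the function name and the seed are illustrative.

```python
import random


def split_8_1_1(n_items, seed=0):
    """Randomly split item indices into training, validation and test sets."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = n_items * 8 // 10      # 80%, rounded down
    n_val = n_items // 10            # 10%, rounded down
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder
    return train, val, test
```

With 225 images this gives 180, 22 and 23 images, and with 400 images it gives 320, 40 and 40.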

Model performance for different sizes and depths (square-shaped)
This section presents the results of detecting square-shaped defects with different sizes and depths using each single-model dataset. Figure 6(a) shows the 110th thermal frame after the pulse with 48 dB random noise for all models. Figure 6(b) presents the fusion of the detection results of the proposed method and the ground truth. The green area and the green dotted boundary indicate the ground truth mask and ground truth bounding box, respectively. The brown area and the brown dotted boundary indicate the detected mask and detected bounding box, respectively. In this experiment, each model was trained for 50 epochs separately. The left part of Fig. 7 shows that, for M01, the biggest and shallowest defect (Width/Depth Ratio (WDR) of 4.4), the defect position is easy to observe and the thermal pattern can be recognised as a square shape. The smallest and deepest defect, M09 (WDR of 0.22), is invisible. The detection results in the right part of Fig. 7 show the successful characterisation of the defect profile. All detected defects are masked in the correct position and align with the ground truth boundary. The IoU between the detected defects and the ground truth is shown in Fig. 8. The big and middle-size defects can be detected with around 90% IoU for all three depths, which covers defect WDRs from 4.4 to 1.5. For the smallest defects (M03, M06, M09), the IoUs are around 70%. This result suggests that a big, shallow defect (M01) has better reconstruction accuracy than a small, deep defect (M09), which agrees with established knowledge of thermographic defect inspection. The square defect test results demonstrate the feasibility of the proposed network and exhibit stable detectability for defects with a WDR down to 1.5 (all three middle-size defects).
It should be noted that the WDRs of M03, M06 and M09 are lower than 1, so these defects are very difficult to detect accurately. These 3 defects are invisible in the thermal images in the left part of Fig. 7. To verify whether the proposed method can achieve stable detection of these defects, the total loss and the mask loss of all 9 models are presented in Fig. 9. Both loss functions suggest that the big and middle-size defects at all 3 depths converge quickly within the first 20 epochs. For M03, M06 and M09, however, the loss functions have not converged within the first 50 epochs, which indicates that the network cannot overcome the noise and learn the defect thermal pattern effectively. The detection performance would not be consistently correct for different end epochs, and no reasonable sign can be used to determine the optimal end epoch for consistent detection. The reason is that the defect is too small to be learned effectively through all the NN layers, and the noise has a strong influence on the defect contrast. In deep multi-layer networks, the input raw image is convolved multiple times, so the defect contrast contributes very little and cannot compete with noise, especially within the receptive field of the feature maps in the last several layers. This is why the learning loss cannot converge and produce stable results. Small object detection is generally a challenging topic for most deep learning networks. Therefore, the results for M03, M06 and M09 should not be regarded as stable reconstructions. Improving the reconstruction of small or deep defects requires further study of an improved network.
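The WDR values quoted above can be reproduced from the defect geometry, assuming that the width of a square defect is the square root of its area. The helper names below are hypothetical, and the table is keyed by (area, depth) rather than by model label, because the mapping of labels M01 to M09 onto sizes and depths is only partly stated in the text.

```python
import math

AREAS_MM2 = (121.0, 49.0, 1.0)   # square defect areas
DEPTHS_MM = (2.5, 3.5, 4.5)      # defect depths from the inspected surface


def width_depth_ratio(area_mm2, depth_mm):
    """Width/Depth Ratio of a square defect (width = sqrt(area))."""
    return math.sqrt(area_mm2) / depth_mm


def wdr_table():
    """WDR for every (area, depth) combination, rounded to two decimals."""
    return {(area, depth): round(width_depth_ratio(area, depth), 2)
            for area in AREAS_MM2 for depth in DEPTHS_MM}
```

The largest, shallowest defect gives 11 / 2.5 = 4.4 and the smallest, deepest one gives 1 / 4.5 ≈ 0.22, as stated. The middle-size defects range from 7 / 2.5 = 2.8 down to 7 / 4.5 ≈ 1.56, and all three 1 mm² defects fall below 1.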

Mixed defect shape detection
To further explore the network applicability, the network is trained on the mixed-shape dataset to investigate its performance in detecting defects of different shapes. Ten rectangular and ten triangular defects at the same depth are studied, and the detection results are shown in Fig. 10. The thermal images in the first and fourth columns show the defect contrast, but with rounded edges and corners. The proposed method produces highly accurate masks of the defect profile for both rectangular and triangular defects. Most defects can be detected with an IoU over 90%. This demonstrates that the network can reconstruct the defect profile from the thermal-profile relationship (e.g. the thermal transfer response of the defect structure). It also shows stable identification and segmentation performance on a mixed-profile dataset, especially for the straight edges and acute corners in defect profiles.
On closer analysis, among the rectangular defects, the network shows a slight performance drop for defects with a bigger aspect ratio (IoU of R01: 88%, R10: 90%), and the output mask cannot perfectly fit the corners of the rectangle. In contrast, for a defect with a smaller aspect ratio (R05), the reconstructed mask shows improved coverage of the corner area. A similar phenomenon can be observed for the triangular defects. The corners of the small defect with acute corners (T06) are not fully covered by the mask, which results in an obvious IoU loss. Even within one triangular defect such as T05, the obtuse corner is better reconstructed than the other two acute corners. The reason is that corners with a big width/height ratio and acute triangle corners have weaker thermal contrast, which aliases with the surrounding (sound area) thermal pattern during heat transfer. These cramped areas can easily lose their shape information under the influence of noise. In addition, most edges in the reconstructed mask stay straight and aligned with the ground truth boundary but miss several pixels. This demonstrates that, after the RPN finds the ROI, the mask rendering network has learned the thermal-profile relationship (e.g. the thermal transfer response of the defect structure) but is still affected by noise to some degree. It also shows that the proposed method is fundamentally different from traditional edge detection image processing methods, which normally output distorted or curved boundaries.

Comparison with traditional thermography methods
To qualitatively assess the capability of the proposed network, a comparison between the proposed Mask-RCNN network and traditional thermography image post-processing algorithms is presented in this section. Thermographic Signal Reconstruction (TSR) and Principal Component Thermography (PCT) are two well-known algorithms for defect contrast improvement. The TSR [24] method improves the temporal resolution and reduces the time-domain noise of the thermogram sequence, and consequently enhances the time-domain sensitive features. PCT is derived from the Empirical Orthogonal Functions (EOFs) in PCA; it extracts the spatial features and reduces undesirable noise by projecting the original data onto a system of orthogonal components. Figure 11 shows a comparison between the defect profile produced by the network and the TSR 1st derivative peak image, the TSR 2nd derivative peak image and the PCT 1st order image. The TSR and PCT images show better thermal contrast than the raw images, but shape distortion (rounded corners and edge loss) is clearly inevitable. These methods are image-based edge enhancement or kurtosis detection algorithms, in which no ground truth is used for training to learn the thermal-profile relationship (e.g. the rounded thermal transfer contour of the true defect structure). Based on these enhanced images, it would be difficult to determine the accurate defect boundaries at the acute corners and straight edges using threshold-type methods. In contrast, the proposed network outperforms the traditional algorithms by successfully reconstructing edges and corners.
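For readers unfamiliar with TSR, its core step can be sketched as a per-pixel polynomial fit of the cooling curve in the log-log domain, followed by analytic differentiation of the fitted polynomial. The code below is a generic illustration of that idea and not the implementation used to produce Fig. 11. The function name, the default polynomial degree, and the requirement that the input is a strictly positive temperature rise sampled at positive times are all assumptions.

```python
import numpy as np


def tsr_derivatives(seq, t, degree=5):
    """First and second log-log derivatives of a thermogram sequence.

    seq    : array (n_frames, height, width), temperature rise, strictly positive
    t      : array (n_frames,), acquisition times after the pulse, strictly positive
    degree : order of the fitted polynomial (must be at least 2)
    """
    seq = np.asarray(seq, dtype=float)
    nt, ny, nx = seq.shape
    x = np.log(np.asarray(t, dtype=float))
    y = np.log(seq.reshape(nt, -1))

    # One least-squares fit per pixel; coefficients are ordered highest power first
    coef = np.polyfit(x, y, degree)                 # shape (degree + 1, n_pixels)
    powers = np.arange(degree, -1, -1)              # degree, ..., 1, 0

    # Differentiate the polynomial analytically (power rule, applied twice)
    d1_coef = coef[:-1] * powers[:-1, None]
    d2_coef = d1_coef[:-1] * powers[1:-1, None]

    # Evaluate the derivative polynomials at every sampled log-time
    d1 = np.vander(x, degree) @ d1_coef
    d2 = np.vander(x, degree - 1) @ d2_coef
    return d1.reshape(nt, ny, nx), d2.reshape(nt, ny, nx)
```

For an ideal defect-free surface cooling as $t^{-1/2}$, the first derivative is a constant $-0.5$ and the second derivative is zero, so departures from these values highlight subsurface anomalies. The derivative images remain smooth functions of the diffused temperature field, which is consistent with the rounded corners noted above.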

Discussion
Collected data of active thermography contains a time series of images showing the defect contrast. For a specific defective structure and inspection condition, the defect would appear and disappear in a specific period of thermal sequence, which is one important pre-knowledge in thermography. Traditionally, the inspector has to manually select the optimal frame for defect characterization. The proposed network uses one of the thermal frames for network training. Even though, one of the key advantages of a deep learning network for active thermography NDT is that it can reduce the reliance on pre-knowledge, and improve its resistance to noise uncertainty. However, the training frame selection would still affect the network capability to some extent. Therefore, the thermal frame selection should be taken into consideration. Different thermal frames in the inspection sequence are compared to investigate their influence on network performance. The frames starting from 50th to 190th after the pulse are selected for training and corresponding inspection results and training loss are analysed. Figure 12a and b show the total loss and mask loss of different training frames, where 100 epochs are used. The losses using the frame after 110th decrease quickly at the beginning stage, and reach the convergence status after 60 epochs. However, the models using the frame at 50th and 70th cannot reduce the learning loss and have not reached convergence within 100 epochs. The model using the 90th frame is a mediocre option. This demonstrates the frame before the 90th frame provides the network with poor defect information, and the network and has a ''blind'' learning area in the initial state after the pulse.  Furtherly, the IoU inspection performance of these models after 50 epochs are compared in Figure 12(c). It shows that after convergence, the inspection IoU of networks after the 90th frame is around 90 %. 
On the contrary, the models using the 50th or 70th frame cannot characterise the defect profile even though the performance improves with more epochs. To intuitively demonstrate this, Figure 13 shows the inspected defect profile using different frames (50th, 70th, 110th, 190th). Two triangle and two rectangle defects are listed. In the 50th frame row, all four defects are poorly detected. The triangle shape is wrongly inspected as a rectangle shape. In the 70th frame row, better shape profiles are acquired but still cannot cover the corner. For the 110th and 190th frames, both networks can provide good and similar defect profiles. A very small IoU drop can be observed on the 190th frame compared to the 110th frame. It proves that the proposed method has a limited performance when the initial thermal frames are chosen for analysis. On the other hand, it also demonstrates that the proposed method has the resistance capability to the frame selection uncertainty in the proper frame period (110th-190th), which validates the compatibility advantages of the proposed automatic decision-making network over the manual inspection.
It should also be noted that the deep-learning-based technique delivers fast, accurate characterisation only after a time-consuming and computationally heavy training process. Once trained, the networks can contribute directly to on-site inspection in similar inspection scenarios. However, thermography is an inspection technology rather than a structural health monitoring technology, so real-time performance is not critical, and thermography post-processing is usually performed offline.

Conclusion
This paper proposed an automatic defect profile characterisation technique for pulsed thermography inspection using an end-to-end deep learning network. The proposed method shows higher accuracy and robustness at the sharp corners and edges of irregular defect profiles, which are commonly difficult for traditional processing methods. The contributions of this study are the following:
1) A fast thermography FEM modelling technique was proposed and employed to efficiently generate the inspection thermal image database of multiple single-defect specimens. It provides a powerful tool to overcome the limitation that sufficient defect samples cannot easily be obtained in experiments.
2) The feasibility of the method is validated by training with single square-shaped defect datasets. The detectability of the proposed method is investigated, and the minimal detectable width/depth ratio for a single square-shaped defect is 1.5. The network then demonstrates remarkable profile characterisation performance on a database of multiple mixed-shape defects, with over 90% IoU against the ground truths.
3) Compared with traditional thermography post-processing methods, the proposed method shows its superiority in the detailed characterisation of sharp corners and edges, with strong resistance to shape distortion in thermography NDT.
Finally, the analysis of network results obtained with different transient thermal frames shows that the technique is robust to frame selection variation, demonstrating its advantages of requiring less pre-knowledge and offering better resistance to inspection uncertainties.
This study demonstrates the superiority of the AI deep learning algorithm for highly accurate defect profile characterisation in thermography NDT, rather than detection only. The technique will contribute to the research community in degradation and health assessment. It should be noted that the current technique is suited to the characterisation of flat-bottom hole defects, and its capability for other defect types such as cracks or corrosion remains to be investigated. In addition, the combined reconstruction of defect profile and depth is worth investigating, and an improved network will be required to deal with defect depth variation in future. The efficiency of the defect characterisation is also an important topic, and the optimisation of training time and computational burden remains to be developed for realistic deployment of the proposed technique.

Declarations
Conflict of interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.