Introduction

Manufacturers must deliver new products to market over ever-shorter time intervals for several reasons, including globalization, the rapid introduction of new technologies, and shorter product life cycles (Bellgran & Säfsten, 2009; Chryssolouris, 2013; Ishikura, 2001). The increasing demand for rapid, flexible, and cost-effective manufacturing is a driving force towards novel rapid manufacturing technologies such as additive manufacturing (AM) (Conner et al., 2014; Hendricks & Singhal, 2008). Academic and industrial investments in advancing AM technology continue to rise with the aim of revolutionizing manufacturing.

There are currently over 20 recognized AM processes with different layer sintering methods (Norazman & Hopkinson, 2014). AM technology can generally be classified according to the state of the raw material used by the process: liquid-based processes (e.g., Stereolithography), solid-based processes (e.g., Fused Deposition Modeling, Robocasting), and powder-based processes (e.g., Laser Sintering, High Speed Sintering). Following the American Society for Testing and Materials (ASTM) classification, single-step AM technologies for metallic materials include sheet lamination, laser powder bed fusion (L-PBF), and directed energy deposition (DED) (ASTM et al., 2015).

At present, AM technologies generally encounter the same challenges regardless of the specific process employed. The main challenges include relatively high unit costs due to expensive raw materials and machinery, part-to-part and machine-to-machine repeatability, and reject rates for finished parts (Conner et al., 2014; Tofail et al., 2017). Overcoming the cost-related AM limitations and ensuring repeatable, sufficient AM part quality (mechanical properties and geometry) requires improving both the underlying AM technology and the quality control framework of the process (Gao et al., 2015). In the literature, process modeling, monitoring, and control strategies have been deployed to overcome some of these drawbacks.

Literature review

In metal additive manufacturing, controlling porosity formation and the microstructure is essential since they dictate the sintered part's mechanical properties. Porosity is one of the most significant concerns for part durability: high cycle fatigue (HCF) and low cycle fatigue (LCF) properties tend to be substandard compared to conventional manufacturing processes (DebRoy et al., 2018). In-situ monitoring and control of the microstructure and defect formation in AM is therefore necessary. The microstructure and defect formation in AM processes are very sensitive to slight deviations in the process parameters (Clymer et al., 2017). Therefore, AM processes have tight operating windows for different materials. Sintering outside the experimentally-validated operating windows is either unexplored or associated with defects such as lack of fusion, porosity, and key-holing (Dass & Moridi, 2019).

AM technologies such as DED of metal incorporate many aspects including 3-D part design, material selection, manufacturing, and quality evaluation. Laser-based AM technologies typically form an \(\alpha '\) martensite microstructure as a result of fast cooling and therefore offer higher tensile strength and lower ductility. In contrast, parts manufactured via electron beam melting (EBM) tend to have lower tensile strength and higher ductility due to exhibiting an \(\alpha \)–\(\beta \) microstructure from the slower cooling in a vacuum atmosphere. Irrespective of the material, fine-grained microstructures (columnar, a mix of columnar and equiaxed, and equiaxed) are usually observed in AM. Grains in mixed and equiaxed microstructures are typically on the length scale of a deposited layer, while columnar microstructures can grow epitaxially over several layers (in the build direction).

A single AM metal part can have several microstructural features and porosity types that dictate its mechanical properties. The microstructure and defects are formed in-situ and primarily depend on the process parameters and material used (Galarraga et al., 2016). For example, a higher laser scanning speed can lead to an increase of the heat-affected zone (HAZ), which can lead to lack-of-fusion porosities and the formation of larger grain sizes and more \(\alpha \) generations (Ge et al., 2019; Wang et al., 2021). It is possible to use design of experiments (DOE) methods to estimate the process–property–performance (PPP) linkage and find the optimal parameter settings that minimize defects and achieve the desired microstructure and mechanical properties. However, AM is a multiple-input–multiple-output multi-physics process, which makes it costly and time-consuming to rely on DOE alone. Additionally, the highly non-linear dynamics of AM result in a large uncertainty of the DOE response surface mapping in areas that are untested (i.e. extrapolation) (Childs & Washburn, 2019).

In Xia et al. (2016) the authors highlight that the hatch spacing results in different heat-transfer behaviors and that a larger hatch spacing reduces the peak temperature and temperature gradient of the molten liquid. The laser power, scanning speed, and hatch spacing have a significant effect on the temperature of the molten pool, the phase transitions, the thermal behavior, and defect formation (Pei et al., 2017; Song et al., 2012).

To overcome the challenges associated with DOE, numerical and data-driven models can be utilized to provide insights and allow real-time process monitoring and control of the process parameters. Real-time process monitoring and control are desirable to achieve the desired microstructure, higher manufacturing repeatability, fewer defects, and potentially functionally-graded AM capabilities. In the context of AM monitoring and control, there are three primary data collection and modeling themes:

  1. Thermal models and thermal data collection

  2. Melt-pool geometry models

  3. Numerical and data-driven process–property–performance linkage models

Numerical simulations vary from the metal powder scale to the 3-dimensional part scale, and focus on one or two characteristics of the AM process due to a lack of in-depth understanding of the AM process. In much of the published work on laser-based AM process monitoring and control, assumptions such as a constant CNC feed-rate and powder mass flow-rate are utilized to simplify the physics-based modeling problem and determine the effect of laser power on process outputs such as dilution (Boddu et al., 2001). However, in real-time application, the process variables are likely not constant. For example, the powder flow-rate will change depending on the volume of powder in the hopper. The CNC feed-rate will also vary and may lead to overfills and under-fills, due to acceleration and deceleration while cladding at the edges of the part. The clad profile and microstructure of AM parts change significantly with small deviations in the powder mass flow-rate. With the current physics-driven numerical methods, it is impractical to predict the whole AM process (from process parameters to resultant microstructure) quickly and accurately (Qi et al., 2019). Therefore, data-driven models have been widely utilized in AM to avoid the need to model the entire underlying process physics and the computational expense of running numerical simulations in-situ.

Relying solely on numerical models to simulate in-situ the linkage between process parameters and microstructure formation is not practical for the following reasons:

  1. Numerical models of the thermal dynamics in AM are either accurate but computationally expensive or computationally efficient but inaccurate, and therefore cannot be used in-situ to estimate and control the melt pool shape and temperature profile (Tan et al., 2019).

  2. Thermal data acquired in-situ often suffers from inaccuracies due to sensor calibration difficulties (Tapia & Elwany, 2014).

  3. Micro-scale numerical models of the microstructure are computationally expensive and therefore cannot be used in-situ to estimate and control the microstructure (Tan et al., 2019).

  4. Assumptions are often utilized to simplify the physics-based modeling problem, but they lead to inaccuracies (Boddu et al., 2001).

Analytical process–property–performance models

In Ning et al. (2020) the authors develop a physics-based regression model to predict part porosity evolution. The authors define porosity evolution as the change of volume fraction between the pre-processed powder bed void and the post-processed part porosity. The molten pool dimensions were calculated by a closed-form temperature solution considering heat transfer. A regression model is employed to find the correlation between the molten pool dimensions and the porosity evolution. The total porosity percentage results were validated against experimental porosity data of a 1.3 mm by 1.7 mm section of a 10 mm by 10 mm by 5 mm part.

Wang et al. (2022) developed a two-dimensional analytical model to calculate the keyhole porosity using the molten pool geometries, average pore size, melt flow velocity, and pore formation frequency as inputs. The 2D model is based on the 2D explanation of bubble formation and trapping in the keyhole melting of L-PBF. The keyhole pore average radius and porosity percentage under various process conditions were predicted and validated against experimental literature measurements of Ti6Al4V in L-PBF. The experimental average pore sizes under five different combinations of process parameters were obtained via cross-sectional optical microscopic pictures and image analysis. Across the five combinations of process parameters (energy densities), the predicted pore average radius and porosity percentage were in acceptable agreement with the experimental results. Wang et al. also found a linear regression relationship between the average pore size and the energy density. Wang and Liang (2022) progress the two-dimensional analytical model to a three-dimensional analytical method and take into account the process of bubble emission near the bottom of the vapor depression and trapping by the solidification front. From a sensitivity analysis, the authors found that the keyhole porosity percentage has a positive correlation with the laser power and a negative correlation with the scan speed.

Analytical solutions like the ones discussed do not utilize FEM or iteration-based simulations and thus are computationally efficient. However, analytical methods tend to make simplifying assumptions about temperature-dependent material properties and, to the authors' knowledge, have not been successfully developed to localize defects across varying process parameters with rigorous experimental validation.

Data-driven process–property–performance models

In the literature, data-driven models have been applied to the following parts of AM: design, in-situ monitoring, and the process–property–performance linkage. Machine learning neural networks (NNs) have been applied to estimate one of the following properties at a time: dimensional accuracy, surface roughness, shrinkage, density, compressive strength, and porosity (Chen & Zhao, 2015; Garg et al., 2014; Shen et al., 2004; Sood et al., 2010).

AM does not have any large open-source experimental data sets since it is expensive to collect training data and data labeling requires expert knowledge (Qi et al., 2019). Utilizing ML NNs on physics-derived synthetic data sets without using domain knowledge to guide the learning process leads to poor extrapolation capabilities (Popova et al., 2017). If, for example, the ML algorithm happens to inaccurately extrapolate the process parameters in-situ due to data noise or outliers, then the manufacturing process is likely to fail. Similarly, it would not be possible to use traditional ML algorithms in an offline setting to explore new materials, process parameters, and microstructures.

In Dass and Moridi (2019) the authors demonstrate AM operating windows for different materials across a plot of powder feed rate versus linear heat input. Sintering outside the experimentally-validated operating windows is either unexplored or associated with defects such as lack of fusion, porosity, and key-holing. In Clymer et al. (2017), the authors provide a theory-based operating window within the absorbed power versus beam scanning velocity space for 316L stainless steel to achieve the desired yield strength, density, etc. Beyond AM, other fields also function within nominal operating windows. Friction stir welding (FSW) is also bounded by operating windows to avoid weld flaws (Kah et al., 2015; Sattari et al., 2012). For example, Fraser et al. (2018) provide a theory-based and experimentally-validated (AA6061-T6 butt joints) process window for FSW. However, extrapolating beyond the operating windows via physics-consistent and interpretable ML models has the following impacts:

  1. Knowledge discovery: exploring new materials, AM process parameters, and microstructures through reliable extrapolation and model interpretability.

  2. Physics-consistency: within the process operating windows (i.e. the training data), theory-guided ML models would perform more accurately than purely physics-based or purely data-driven models.

  3. Additive manufacturing capabilities: through knowledge discovery and physics-consistency, theory-guided ML models can be used in real-time process control to facilitate achieving the desired microstructure, higher manufacturing repeatability, fewer porosities, and potentially functionally-graded AM capabilities. Functionally-graded materials have inhomogeneous mechanical properties achieved through the microstructure undergoing a gradual change along some direction.

Notable prior-guided and data-driven ML hybrid approaches for microstructure predictions include Jha et al. (2018) and Popova et al. (2017). Jha et al. (2018) look at alloy design (rather than AM) and utilize machine learning (the k-NN algorithm) to create meta-models from computationally-expensive CALPHAD solidification data (soft magnetic alloys). The combined ML-CALPHAD theory-guided data mining (TGDM) approach successfully captured relationships between the processing variables (composition, temperature, annealing time) and the alloy structure (mean radius and volume fraction).

Popova et al. (2017) employed a prior-guided data-driven approach and used data from the SPPARKS kinetic Monte Carlo (kMC) simulation suite to develop a multivariate polynomial regression model that can estimate microstructure features as a function of the AM process parameters. They use the chord length distribution (CLD) algorithm along with principal component analysis to quantify features and perform feature selection on the kMC-generated microstructure data. The authors note that the normalized mean absolute error (MAE) for some of the predicted CLDs in the testing data set was significant (\(>0.25\) compared to roughly \(<0.15\) for other test data points). They discovered through a visual plot of the training/testing data that the test cases with high MAE were outside the range of the training data set (extrapolation). The work in Popova et al. (2017) was not validated against experimental data.

Fu et al. review the applications of ML to metal laser-based AM for defect detection (Fu et al., 2022). The authors point out the need for methods capable of dealing with varying sensor accuracies and spatio-temporal resolutions. They argue that defect detection should focus on detecting and evaluating the defect impact rather than just detecting the defect, highlighting the importance of a complete process-property-performance modeling framework. They also highlight the importance of feedback control and in-situ correction systems to mitigate the impact of defects. In Lin et al. (2022) the authors further argue that in order to achieve closed-loop control, it is necessary to employ a hybrid physics-guided data-driven model, and give the example of physics-guided loss functions.

Guo et al. (2022) discuss the challenges for physics-informed ML in metal AM. The challenges include limited data availability and a data imbalance where most data is considered compliant/defect-free. They bring attention to the challenges of data modality and the importance of developing data preprocessing methods that can be run in an in-situ manner. Regarding data integrity, they highlight that AM sensor monitoring data is imperfect and can be corrupted with noise, missing data, and incorrect measurements.

In another review, Meng et al. (2020) bring attention to the lack of research on uncertainty quantification in ML for AM. Uncertainty quantification and minimization for AM regression and classification process-property-performance linkage tasks are integral to corrective feedback control and decision making.

In Herzog et al. (2023) the authors review the application of various ML methods towards in-situ defect detection in laser-based AM. The authors highlight that thermal and visible light images and convolutional neural networks are the most widely used approach for porosity classification. However, thermal and visible light process monitoring images create large data volumes leading to storage and transmission difficulties.

In Gaja and Liou (2018) the authors detect and differentiate between porosities and cracks on a per-layer basis using feature extraction on acoustic emission (AE) data. The porosities and cracks were generated by mixing Ti-6Al-4V powder with H13 tool steel powder in the laser metal deposition (LMD) printing process. The data was collected from 15 mm long and 5 mm long deposits. The AE signal was recorded during sintering, and the feature extraction was performed at the end of the deposited layer. The data labels were acquired via unsupervised learning in Gaja and Liou (2017). The experiment resulted in 5 crack defects and 32 porosities. Using the 37 samples, the authors train a logistic regression model and a neural network classifier to predict whether the AE signals correspond to a porosity or a crack defect. The models were validated on a second AE signal acquired under the same experimental conditions.

Liu et al. propose using machine learning regression with physics-informed features to separately predict the mean pore diameter, median pore diameter, maximum pore diameter, and pore spacing (Liu et al., 2021). For each regression task, for example predicting the mean pore diameter, they categorize the porosity data into pass, flag, and fail clusters via domain knowledge. They train each model individually per porosity cluster for improved performance; training per cluster is a means of dealing with the data imbalance in the skewed porosity size distribution. Although the authors use k-fold cross-validation, we note the potential for their proposed framework to over-fit. Over-fitting can occur due to neglecting the information available in porosity-free data, splitting the regression outputs into one per model, and splitting each regression output into three sub-categories. Omitting the porosity-free data and splitting the data into several sub-categories leads to a tailored model that is less likely to generalize to new data. The authors propose a novel physics-informed pore generation explanation method. They visualize the relationship between the physics effects and porosity via physics-porosity correlation maps such as power-intensity versus maximum-pore-diameter. The pore generation explanation method provides porosity-suppressing regions and porosity-encouraging regions.

In Gobert et al. (2018) the authors investigate the use of high resolution digital single-lens reflex (DSLR) cameras for defect detection in metal L-PBF. They collect multiple images where images one to three are captured following powder re-coating, and images four to eight are collected following the laser fusion step. Per three-dimensional neighborhood, they extract multi-dimensional features that span multiple layers via 3D convolution filters. They classify neighborhoods as anomalous or nominal via linear SVM classification and ensemble classification schemes. Via cross-validation experiments, the results show an in situ defect detection accuracy greater than 80% for flaws with a spherical equivalent diameter (SED) larger than 47 \(\upmu \)m. The authors also highlight the importance of assessing features through multiple layers to account for the melt pool remelting zone and accurately identify discontinuities.

In Jafari-Marandi et al. (2019) the authors propose a cost-driven decision-making framework which uses in-situ melt pool images and considers the cost of the spatial distribution of microstructural defects. The proposed framework recommends corrective actions based on the cost of the spatial distribution of the defects.

Okaro et al. use a Gaussian mixture model and an expectation-maximization algorithm in a semi-supervised framework (Okaro et al., 2019). Photodiode data was acquired during the L-PBF of 49 tensile test bars, and ultimate tensile strength (UTS) tests were then used to categorize each bar as faulty or acceptable. The authors apply feature extraction to the photodiode data and validate the models via 2-fold cross-validation. The results show a 77% classification success rate on the 'acceptable' parts. The semi-supervised approach utilizes the labeled UTS data but also leverages unlabeled data.

Tian et al. develop deep learning neural networks and a data fusion method to predict porosity in laser AM via melt pool thermal history data (Tian et al., 2021). The first model proposed is a convolutional NN (PyroNet) to correlate the in-situ pyrometry images with the layer-wise porosity. A second long-term recurrent convolutional network (IRNet) is developed to correlate sequential thermal images with the layer-wise porosity. The PyroNet and IRNet predictions are fused to improve the prediction accuracy of the layer-wise porosity. The proposed approach is validated via 6-fold cross-validation on a Ti–6Al–4V thin-wall. The authors discuss and analyze the results per fold of the cross-validation. The authors' methods are not tested on a previously unseen test data set.

In Taherkhani et al. (2022) the authors employ an unsupervised self-organizing map (SOM) learning method to detect defects from co-axial data emitted from the melt pool during L-PBF. They test the proposed methods on samples with different process parameters, intentional micro-voids, and normally occurring porosities. The predicted defects' position and size are compared to computed tomography scans via a volumetric segmentation method and a confusion matrix. The prediction algorithm achieved a 61% to 94% sensitivity rate and a 69% to 93% specificity rate. The authors note that the sensitivity and specificity depend highly on the process parameters. The SOM algorithm detected intentional defects ranging from 100 to 320 \(\upmu \)m in diameter.

In Jayasinghe et al. (2022) the authors apply unsupervised clustering methods to features extracted from photodiode data to classify the build density of L-PBF sintered cubes with an accuracy of 93.54%. They also use supervised regression to predict the build density with a root-mean-square (RMS) error of 3.65%.

In Snow et al. (2022) the authors collect layer-wise imagery, multi-spectral emissions, and laser scan vector data of cylindrical L-PBF builds. X-ray computed tomography (XCT) and automated defect recognition software were utilized to identify internal porosities. To create a balanced ML data set, a random nominal sample was selected for each flaw from the same part. Given a porosity/nominal coordinate, a local neighborhood of sensor data is extracted for each data sample. Each neighborhood corresponded to roughly \(940.00 \times 940.00 \times 660.00\) \(\upmu \)m (11 layers in the z coordinate). Convolutional neural networks were trained to discriminate porosities from nominal samples through a 70% training and 30% validation data split. The proposed approach is tested on a previously unseen data set collected from an independent build. The authors' methods can detect porosities with a SED of 200.00 \(\upmu \)m and larger with a 93.90% classification accuracy.

In Mao et al. (2023), the authors propose a porosity detection-prediction framework based on CNNs and a convolutional recurrent neural network (ConvLSTM) that predicts artificial porosity in the next layer based on the thermal signatures of the previous layers. The proposed methods are validated on Ti6Al4V parts with six artificially introduced porosities per region distributed in a hexagonal pattern and varying in diameter across regions. The porosity diameters examined range from 0.10 to 1.10 mm. The proposed approach successfully predicts the artificial porosity against computerized tomography scans with an F1-score of 0.75. One limitation of this work is that the artificial porosities re-occur at the exact same locations within the part across the layers. Varying the porosity locations across layers would be a valuable test to examine the generalizability of the proposed methods.

In Oster et al. (2023), the authors propose a supervised CNN architecture for predicting porosity in local clusters via thermogram features. The proposed methods are validated on AISI 316L stainless steel parts where the volumetric energy density was increased at three distinct build heights to force the formation of pores. The proposed approach is validated via computerized tomography scans and successfully predicts the porosity with an F1-score of 0.86 for porosity above a threshold of 0.10% in clusters with dimensions of \(700.00 \times 700.00 \times 50.00\) \(\upmu m^3\). The authors report that one drawback of the proposed approach is the limited real-time capability of the framework, which depends on the number of included subsequent layers.

We summarize the gaps in the literature of L-PBF data-driven in-situ porosity localization:

  1. The limiting need for three-dimensional neighborhoods of sensory data from multiple sintered layers before making porosity predictions. Three-dimensional neighborhoods require collecting sensory data across several layers before a porosity localization prediction can be made, which prohibits within-layer porosity detection and mitigation. Detecting and localizing porosity only after multiple layers have been sintered may hinder the possibility and quality of porosity mitigation.

  2. Abundant information-rich porosity-free sensory data is often omitted to avoid an imbalanced data set.

  3. Porosities are often simulated via mixing different powders and are larger than or equal to 100 \(\upmu \)m SED. However, the porosity formation heat dynamics that occur when mixing powders are not necessarily the same as those that occur when using a single powder. Gas porosities in L-PBF are typically smaller than 100 \(\upmu \)m, and as small as 10 \(\upmu \)m SED (Choo et al., 2019; King et al., 2014). Higher gas porosity counts provide crack initiation and propagation paths, resulting in a decrease in impact strength (Girelli et al., 2019; Kan et al., 2022). Research should address the challenge of localizing naturally occurring micro-porosities (\(<100\,\upmu \)m).

We address the discussed gaps in metal AM porosity localization by proposing a framework with the following contributions:

  1. For the first time in the literature, the capability to detect and localize porosities in-situ via a local neighborhood of within-hatch-stripe sensory data, as opposed to a local three-dimensional neighborhood of inter-layer sensory data.

  2. A framework based on prior-guided neural networks that utilizes the often abundant PBF nominal data to (1) improve the detection and localization of micro-porosities with a SED smaller than 50.00 \(\upmu \)m and (2) increase confidence in the model predictions via reducing the standard deviation of the classification log loss error. The often abundant PBF nominal data is utilized in the form of a prior training loss and a prior validation error.

  3. To the authors' knowledge, this is the first work in the literature capable of localizing naturally occurring porosities as small as 38.118 \(\upmu \)m SED (\(0.029\times 10^{-3}\) mm\(^3\)), smaller than the 47 \(\upmu \)m SED in Gobert et al. (2018). To the authors' knowledge, the proposed method's localization of porosities as small as 38.118 \(\upmu \)m SED is also more than a five-fold improvement on the smallest SED porosity localization via photodiode sensory data (Snow et al., 2022).

Our contributions are further supported via:

  1. The proposed porosity detection and localization is performed on both the bulk core of the part and the border of the part, the latter of which is often unused in the literature.

  2. Decreasing the standard deviation of the classification log loss error via utilizing a portion of the excess porosity-free data in the form of a prior loss function under the framework of prior-guided neural networks.

  3. Thorough experimental validation and testing on single-material cubes sintered using various process parameters via a randomized Box–Behnken design with 3 factors and 3 replicates.

  4. Thorough model validation and testing via repeated k-fold cross-validation and multiple random model weight initializations.

The remainder of the paper is organized as follows. Section “Methodology” presents the problem formulation and methodology, including the data preprocessing, feature extraction, and machine learning. Section “Evaluation” covers the machine learning model variants and the training, validation, and testing. In section “Results and analysis” we present the results and analysis. Finally, in section “Conclusion” we discuss the proposed methods’ findings, strengths, and limitations.

Methodology

Problem formulation and preliminaries

In this work we use the Renishaw AM 500M PBF machine, which uses a 500.00 W ytterbium fiber laser for sintering, and the Renishaw InfiniAM spectral emissions system for monitoring. The Renishaw InfiniAM spectral emissions system includes three high-precision co-axial single-channel photodiodes. The LaserVIEW system photodiode monitors the laser power. The MeltVIEW infrared light photodiode (1080.00 to 1700.00 nm) and the MeltVIEW visible light photodiode (700.00 to 1050.00 nm) monitor the melt-pool plume characteristics.

The part design is a cube with a 10.00 mm edge length and includes three smaller cubes at the top surface with a 2.00 mm edge length (Fig. 1). The three smaller cubes are geometric markers designed to facilitate alignment between the PBF and the XCT data.

A total of 45 cubes were sintered using high strength aluminum A205. The hatch spacing \({\mathcal {H}}\) and layer thickness \({\mathcal {L}}_{\mathrm{{t}}}\) are a constant 0.05 mm across the cubes. The border power, border speed, and hatch offset \({\mathcal {H}}_{\mathrm{{o}}}\) were varied following a randomized Box–Behnken design with 3 factors and 3 replicates (a construction sketch follows the list below). The values utilized for the independent process parameter variables are:

  1. Border power: 100.00, 300.00, and 500.00 W

  2. Border hatch offset (a): 0.00, 0.05, and 0.10 mm

  3. Border speed: 50.00, 175.00, and 300.00 mm/s
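To make the design concrete, the sketch below shows one way such a randomized 3-factor Box–Behnken design could be constructed. The factor names and levels come from the list above; the use of three center points per replicate (12 edge-midpoint runs + 3 center runs = 15 runs, hence 45 runs across 3 replicates, matching the 45 cubes) and the seeded shuffle are illustrative assumptions.

```python
import itertools
import random

import numpy as np

# Factor levels taken from the list above.
levels = {
    "border_power_W": [100.0, 300.0, 500.0],
    "border_hatch_offset_mm": [0.0, 0.05, 0.10],
    "border_speed_mm_s": [50.0, 175.0, 300.0],
}

def box_behnken_3factor(levels, replicates=3, center_points=3, seed=0):
    """Randomized 3-factor Box-Behnken design: for each pair of factors,
    all four low/high combinations are run while the third factor is held
    at its center level; center runs complete each replicate."""
    names = list(levels)
    runs = []
    for a, b in itertools.combinations(range(3), 2):
        for sa, sb in itertools.product((0, 2), repeat=2):  # 0 = low, 2 = high
            run = [levels[n][1] for n in names]              # all centers
            run[a] = levels[names[a]][sa]
            run[b] = levels[names[b]][sb]
            runs.append(run)
    runs += [[levels[n][1] for n in names] for _ in range(center_points)]
    runs = runs * replicates
    random.Random(seed).shuffle(runs)                        # randomize run order
    return np.array(runs)

design = box_behnken_3factor(levels)
print(design.shape)  # (45, 3): 15 runs x 3 replicates -> the 45 cubes
```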

Fig. 1

The part design and dimensions (in mm) examined in this work

Fig. 2

Porosity visualization of cube three: a the raw X-ray computed tomography data showing the porosity shape and volume and b the pre-processed dichotomous classification \(\{0,1\}\) data after removing 3.00 mm from the bottom of the cube along the z-plane to omit the bandsaw porosities/saw marks

Fig. 3

The XCT porosity spherical equivalent diameter distribution across the 44 cubes (a) and a zoom-in on the distribution of spherical equivalent diameters smaller than 50.00 \(\upmu \)m (b)

X-ray computed tomography data

In Fig. 2 we see a visualization of the XCT porosities in cube three. The raw XCT data for each cube includes the x coordinate, y coordinate, z coordinate, and porosity volume. The coordinates are in inches, and the porosity volume is in mm\(^3\). We combine the text files into one matrix and label each sample with its respective cube index. The x, y, and z coordinates are converted from inches to mm. The porosity volume across the 44 cubes is right-skewed, as seen in Fig. 3. Below is a list of the porosity volume statistics across the 44 cubes; the quoted SED values follow from the sphere-volume conversion sketched after the list.

  1. Number of porosities: 4733

  2. Total porosity volume: 1.977 mm\(^3\)

  3. Porosity volume mean: 4.177e\(-\)04 mm\(^3\) (layer thickness: 500.00e\(-\)04 mm)

  4. Porosity volume standard deviation: 6.099e\(-\)04 mm\(^3\)

  5. Porosity volume median: 2.02e\(-\)04 mm\(^3\)

  6. Porosity volume IQR [0.25, 0.75]: [0.094e\(-\)03, 0.475e\(-\)03] mm\(^3\)

  7. Minimum porosity volume (SED): 0.23e\(-\)04 mm\(^3\) (35.28 \(\upmu \)m)

  8. Maximum porosity volume (SED): 83.00e\(-\)04 mm\(^3\) (251.00 \(\upmu \)m)
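The SED values quoted alongside the volumes follow from taking the diameter of a sphere of equal volume, \(d = (6V/\pi )^{1/3}\). A quick helper to reproduce the conversions above:

```python
import math

def sed_um_from_volume_mm3(volume_mm3: float) -> float:
    """Spherical equivalent diameter (micrometers) of a pore of a given volume (mm^3)."""
    d_mm = (6.0 * volume_mm3 / math.pi) ** (1.0 / 3.0)
    return d_mm * 1000.0

print(sed_um_from_volume_mm3(0.23e-4))   # ~35.3 um (minimum pore)
print(sed_um_from_volume_mm3(83.0e-4))   # ~251 um (maximum pore)
```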

After sintering, the cubes were sawed off the base printing plate via an automatic bandsaw. The automatic bandsaw removed approximately 2 mm from each cube (saw kerf). After sawing, the bandsaw leaves a diagonal pattern of porosities/saw marks on the base of the cubes. We remove 1 mm of the XCT data from the bottom of the cube along the z-plane to omit the saw marks. A total of 3.00 mm from the bottom of each cube along the z-plane are omitted from the study. Seven large artifact porosity samples from the XCT scans were outside the cubes' edges in the three-dimensional scan images (cubes 30, 35, 37, and 38). The seven artifact samples were omitted from the study.

The zero position of the XCT data is in the cube’s center with respect to the x, y, and z planes, excluding the three upper features. We shift the x, y, and z coordinates by adding [5, 5, 8] mm, such that x and y span [0, 10] and z spans [0, 9]. Due to the cube base unevenness and the non-smooth border surface, there are 1 or 2-dimensional geometric translational errors in the XCT centering of some cubes. Cubes with translational errors can be visually and computationally detected via searching for porosity locations outside the cube edge ranges of [0, 10] for the x and y plane and [0, 9] for the z plane. Note that the search for translational error in the XCT data must also separately consider the three smaller feature cubes. The cubes with translational error identified are cubes number 2, 3, 5, 10, 17, 18, 19, 20, 23, 30, 33, 37, and 40. Cubes with a translational error are shifted and centered around [5.0, 5.0, 4.5] along the x, y, and z planes, respectively. The number of porosities per cube is shown in Fig. 4.
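Below is a minimal sketch of the out-of-range check and re-centering described above; the data frame column names (x, y, z, cube) are assumptions, and centering on the mean of each cube's porosity cloud is one plausible implementation of the shift rather than the exact procedure used.

```python
import pandas as pd

def find_translated_cubes(xct: pd.DataFrame, tol: float = 0.0) -> list:
    """Flag cubes whose porosity coordinates fall outside the expected
    [0, 10] mm x/y range or [0, 9] mm z range; the three smaller feature
    cubes must be checked against their own ranges separately."""
    flagged = []
    for cube, g in xct.groupby("cube"):
        out_xy = ((g[["x", "y"]] < -tol) | (g[["x", "y"]] > 10 + tol)).any(axis=None)
        out_z = ((g["z"] < -tol) | (g["z"] > 9 + tol)).any()
        if out_xy or out_z:
            flagged.append(cube)
    return flagged

def recenter(xct: pd.DataFrame, cubes: list) -> pd.DataFrame:
    """Shift flagged cubes so their porosity cloud is centered on (5, 5, 4.5) mm."""
    xct = xct.copy()
    mask = xct["cube"].isin(cubes)
    for axis, target in zip("xyz", (5.0, 5.0, 4.5)):
        centers = xct.loc[mask].groupby("cube")[axis].transform("mean")
        xct.loc[mask, axis] += target - centers
    return xct
```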

Fig. 4

The X-ray computed tomography porosity count distribution per cube

PBF sensor data

The raw data acquired from the Renishaw AM 500M machine and the Renishaw InfiniAM spectral emissions system are 241 files containing the following features in columns: time stamp in \(\upmu \)s, duration in \(\upmu \)s (\({\underline{t}}\)), x and y coordinates of the laser in mm (\(\underline{\acute{x}}\) and \(\underline{\acute{y}}\)), LaserVIEW normalized photodiode laser power (\({\underline{p}}\)), and MeltVIEW photodiode intensities (\({\underline{q}}\) and \({\underline{r}}\)). The 241 files correspond to the 241 layers sintered. The raw files are compiled into one matrix, and the z coordinate data is added as a feature (\(\underline{\acute{z}}\)). Each sample (row-wise) is assigned its corresponding cube number for reference.

We remove the [0.00, 3.00] mm range of the machine/sensor data from the bottom of the cube along the z-plane to match the XCT. For ease of data readability, we shift each cube's x, y, and z coordinates, such that x and y span the [0.00, 10.00] mm range and z coordinates span the [0.00, 9.00] mm range. Note that the x and y coordinate data for layers 101 and 102 have a geometry translation error and have been shifted to match the other layers. The geometry translation error is expected to be due to the acquisition system.

Due to the sensor acquisition rate, there are occasions where multiple consecutive samples are acquired at the same x and y coordinates. For example, if a given x and y coordinate is sintered for 40 \(\upmu \)s, two sensor readings spanning 20 \(\upmu \)s each could be acquired. Consecutive samples per cube per z-axis layer with the same x and y coordinates are combined into one sample. The duration feature is summed, and the three sensor data features are averaged. The cube index feature is also updated accordingly. After combining samples with repeated x and y coordinates, the total number of samples is reduced by 1,758,037 samples from 267,867,956 to 266,109,919. The mean number of samples per cube is 5.953e+06 and the mean number of samples per 1.00 mm\(^3\) is 8.222e+03.
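A sketch of this merging step, assuming the compiled PBF matrix is held in a pandas data frame with hypothetical column names (cube, z, x, y, duration_us, and the photodiode channels p, q, r):

```python
import pandas as pd

def merge_repeated_coordinates(pbf: pd.DataFrame) -> pd.DataFrame:
    """Merge consecutive samples that share the same cube, layer (z), and
    laser (x, y) coordinate: durations are summed and the three photodiode
    channels are averaged."""
    key_cols = ["cube", "z", "x", "y"]
    # A new group starts whenever cube, z, x, or y changes between rows.
    group_id = (pbf[key_cols] != pbf[key_cols].shift()).any(axis=1).cumsum()
    merged = pbf.groupby(group_id).agg(
        cube=("cube", "first"), z=("z", "first"),
        x=("x", "first"), y=("y", "first"),
        duration_us=("duration_us", "sum"),
        p=("p", "mean"), q=("q", "mean"), r=("r", "mean"),
    )
    return merged.reset_index(drop=True)
```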

Feature addition

To improve the models' prediction accuracy we compute additional features including a border label, hatch offset, scan speed, and energy density.

The core of the 45 cubes is sintered with the same parameters, but the border sintering parameters vary. It is necessary to have a border label for each sample to avoid a data imbalance when training the NN models. We clarify that border labels are only needed when training and would not be needed when running a trained model in-situ. The border labeling procedure per PBF sample is discussed in Appendix A. In Fig. 5 we demonstrate the border labels for cube 5 across all 180 layers.

The hatch spacing feature, \({\underline{w}}\), in meters is created via the cube number and border label. The hatch spacing for the cube cores is set as 0.050 mm. The hatch spacing for the cube border is set as the hatch offset value, a, depending on the cube number. The border samples of cubes with a null hatch offset, \(a=0\), are assigned a hatch spacing equal to the single-track border spot size of 0.080 mm.

The scan speed, \({\underline{v}}\), in meters per second is calculated via the duration and the x and y coordinates (Eq. 1).

$$\begin{aligned} {\underline{v}}_i = \frac{ \sqrt{ \left( \Delta \underline{\acute{x}}_i \right) ^2 + \left( \Delta \underline{\acute{y}}_i \right) ^2 } }{ {\underline{t}}_i }, \qquad i \in \{1,\ldots ,I-1\}, \end{aligned}$$
(1)

where \({\underline{v}}\) is the scan speed (m/s) and \({\underline{t}}\) is the scan duration (s).
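A minimal sketch of Eq. 1, assuming coordinates in mm and durations in \(\upmu \)s (hence the factor of 1e3 to obtain m/s), and assuming the duration of sample i is associated with the step from sample i to sample i+1:

```python
import numpy as np

def scan_speed_m_per_s(x_mm: np.ndarray, y_mm: np.ndarray, t_us: np.ndarray) -> np.ndarray:
    """Scan speed per Eq. (1): Euclidean step length over step duration.
    mm/us equals km/s, so multiplying by 1e3 yields m/s."""
    dx = np.diff(x_mm)
    dy = np.diff(y_mm)
    return np.sqrt(dx**2 + dy**2) / t_us[:-1] * 1e3
```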

Fig. 5

A visualization of cube five showing the result of the border labeling procedure

Fig. 6

The unitless energy density distribution across the core and border samples

A unitless energy density, e, is calculated via the scan speed, hatch spacing, LaserVIEW system photodiode feature, and the layer thickness (Eq. 2). The energy density distribution across the core and border samples is shown in Fig. 6 and clarifies why border labeling is important to avoid a data imbalance when training NN models. The assumptions made in the calculation of the energy density via Eq. 2 are covered in Prashanth et al. (2017). Although used widely in the literature, the energy density calculation in Eq. 2 may be misrepresentative when used for PBF process parameter optimization. In this work, the assumptions made for the energy density calculation in Eq. 2 are less relevant in terms of porosity detection and localization, although a more accurate energy density measure may improve porosity detection and localization.

$$\begin{aligned} {\underline{e}}_i = \begin{cases} 0 & \text {for } {\underline{v}}_i = 0 \\ \dfrac{{\underline{p}}_i}{b \, {\underline{v}}_i \, {\underline{w}}_i} & \text {for } {\underline{v}}_i \ne 0 \end{cases} \qquad i \in \{1,\ldots ,I\}, \end{aligned}$$
(2)

where \({\underline{e}}\) is the unitless energy density and b is the layer thickness.
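A sketch of Eq. 2 under the stated definitions, taking b as the 0.05 mm layer thickness expressed in meters:

```python
import numpy as np

def energy_density(p: np.ndarray, v: np.ndarray, w: np.ndarray,
                   b: float = 50e-6) -> np.ndarray:
    """Unitless energy density per Eq. (2): normalized LaserVIEW power p
    over layer thickness b (m), scan speed v (m/s), and hatch spacing w (m).
    Zero-speed samples are assigned zero energy density."""
    e = np.zeros_like(p, dtype=float)
    moving = v != 0
    e[moving] = p[moving] / (b * v[moving] * w[moving])
    return e
```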

Porosity-free coordinates

The PBF data is imbalanced with most of the samples being porosity-free. To create a balanced Bernoulli distributed classification data set, we need to identify 4733 porosity-free coordinates in the 45 cubes to match the \(C=\) 4733 XCT porosity samples. Additionally, we note the data imbalance in the porosity locations where most of the porosities are in the cube core relative to the cube borders. The porosity location imbalance is significant since the core samples have the same sintering settings while the border sintering settings vary across cubes. For a balanced data set, it is necessary to ensure an equal number of core and border samples between the porosity and porosity-free coordinates.

We partition the PBF data of each cube into 0.25 mm\(^3\) blocks. A block is identified and handled via its center coordinates. We use the XCT porosity locations \([\underline{\breve{x}}, \underline{\breve{y}}, \underline{\breve{z}}]\) to identify the porosity and porosity-free blocks. The border label \({\underline{u}}\) is used to identify the border blocks relative to the core blocks. According to the 0.25 mm\(^3\) block boundaries, the XCT samples can be divided into 3304 core and 1429 border porosities. A block is considered a border block if it contains one or more border samples according to \({\underline{u}}\). Similarly, according to the 0.25 mm\(^3\) block boundaries, we identify 4050 core and 4212 border block coordinates that are porosity-free across all 45 cubes. Therefore, there are 182,250 and 189,540 porosity-free core and border block coordinate and cube number combinations, respectively.

Fig. 7

A visualization of the X-ray computed tomography porosities in cube three and the sensory data surrounding one porosity: a the porosities in cube three, b the sensory samples surrounding one porosity, and c the within-hatch stripe sensory samples (five samples) utilized in the feature extraction

We randomly pick 3304 core and 1429 border porosity-free block coordinates and cube numbers to represent the porosity-free data. We define a matrix \(\pmb {B}\) including the porosity-free block coordinates and cube numbers and the porosity XCT coordinates and cube numbers. The matrix \(\pmb {B}\), therefore, has 9466 rows and 4 columns containing the x, y, and z coordinates and cube numbers. We randomly pick an additional 3304 core and 1429 border porosity-free block coordinates and cube numbers (\(\pmb {C}\)) to represent a prior data set. The use of the prior data set is discussed in section 4.7. When populating the prior data set \(\pmb {C}\), we ensure that none of the samples in \(\pmb {B}\) are repeated in \(\pmb {C}\) to avoid data leakage. For matrices \(\pmb {B}\) and \(\pmb {C}\) we define vectors \({\underline{y}} \in \{0,1\}\) and \({\underline{z}} \in \{0,1\}\) where porosities are assigned the value 1 and porosity-free coordinates are assigned 0.
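Below is a sketch of how the porosity-free halves of \(\pmb {B}\) and \(\pmb {C}\) could be drawn without overlap; the data frame layout (columns x, y, z, cube, is_border listing all porosity-free 0.25 mm\(^3\) block centers) is an assumption.

```python
import numpy as np
import pandas as pd

def sample_disjoint_porosity_free(blocks: pd.DataFrame, n_core=3304,
                                  n_border=1429, seed=0):
    """Draw two disjoint porosity-free sets: one for the balanced data set
    (matrix B) and one for the prior data set (matrix C), so that no
    sample leaks between the two."""
    rng = np.random.default_rng(seed)
    b_parts, c_parts = [], []
    for is_border, n in ((False, n_core), (True, n_border)):
        pool = blocks[blocks["is_border"] == is_border]
        idx = rng.choice(pool.index, size=2 * n, replace=False)
        b_parts.append(pool.loc[idx[:n]])   # first half -> matrix B
        c_parts.append(pool.loc[idx[n:]])   # second half -> prior matrix C
    return (pd.concat(b_parts, ignore_index=True),
            pd.concat(c_parts, ignore_index=True))
```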

Ideally, a balanced data set would have an equal number of porosity and porosity-free samples per combination of process parameters. As seen in Fig. 4, the porosity count distribution varies between the cubes, and some process parameter combinations result in more XCT porosity samples. Repeating the experiments several times to obtain an approximately equal number of porosity samples per combination of process parameters is time-costly and requires expert knowledge. However, we note the choice taken here to have an approximately equal number of porosity-free samples across the cubes to improve the data imbalance.

Feature extraction

Feature extraction is a process of dimensionality reduction which reduces a set of raw or pre-processed data to more manageable groups for processing and modeling tasks (Atwya & Panoutsos, 2019). The objective of this subsection is to determine which PBF samples should be used to represent the 9466 porosity and porosity-free coordinates and cube numbers (\(\pmb {B}\)). The number of PBF samples is \(I=\) 266,109,919. Given that the largest porosity volume is 83.00e\(-\)04 mm\(^3\) and the layer thickness and hatch spacing are 500.00e\(-\)04 mm, the largest porosity is smaller than the Renishaw PBF data resolution. Assuming that each sample in \(\pmb {B}\) corresponds to the nearest PBF sample in terms of the x, y, and z three-dimensional Euclidean distance creates two challenges. The first challenge is correctly identifying the PBF sample closest to the XCT sample due to geometric and dimensional errors when mapping the XCT scans on the PBF data. The second challenge is that the sintered part properties and the presence of porosity at a specific coordinate are affected by the surrounding hatch paths and the preceding and succeeding layers due to the laser spot overlapping and the melt pool remelting zone (section 2).

Instead of using the pre-processed PBF data set (\(I=\) 266,109,919 samples), we propose a framework where we use feature extraction measures on the five closest PBF samples to each XCT sample and porosity-free coordinate in terms of the three-dimensional Euclidean distance. We note that using only the nearest five PBF samples results in in-situ porosity localization via exclusively one-dimensional hatch stripe and two-dimensional surrounding hatch stripe (laser spot overlap) data. The proposed framework does not utilize a three-dimensional neighborhood of data (i.e. does not make use of preceding/succeeding layers). Given a porosity or porosity-free coordinate, the five closest PBF samples are extracted from an equivalent \(0.04\pm 0.01\) hatch stripe length (Fig. 7c).

In Eq. 3 we define \(\pmb {A}_{ic}\), the three-dimensional Euclidean distance between each XCT sample (c) and every PBF sample (i). We find the five closest PBF samples to each XCT sample in terms of the Euclidean distance \(\pmb {A}_{ic}\). Accordingly, for each sample in \(\pmb {B}\) we have five representative PBF samples. The same procedure is repeated for the porosity-free block coordinates and cube numbers of the prior matrix \(\pmb {C}\). We note that utilizing a pre-defined three-dimensional neighborhood of PBF samples is flawed in terms of an ML framework, as it would lead to a different number of PBF samples per neighborhood due to the scan speed and hatch spacing/offset varying between cubes.

$$\begin{aligned} \pmb {A}_{ic} = \sqrt{ \left( \underline{\acute{x}}_i - \pmb {B}_{c1} \right) ^2 + \left( \underline{\acute{y}}_i - \pmb {B}_{c2} \right) ^2 + \left( \underline{\acute{z}}_i - \pmb {B}_{c3} \right) ^2 }, \qquad i \in \{1,\ldots ,I\}, \; c \in \{1,\ldots ,2C\}, \end{aligned}$$
(3)

where C is the number of XCT porosity samples and \(\pmb {A}_{ic}\) is the three-dimensional Euclidean distance between the x, y, and z coordinates of each PBF and XCT sample.

We choose the three spectral emission sensors, scan speed, hatch spacing, and energy density as the machine learning input variables of interest. For the five representative PBF samples per coordinate in \(\pmb {B}\) and \(\pmb {C}\), we utilize the mean, variance, and skewness as feature extraction measures. The supervised machine learning input data set consists of 15 columns including (1) the mean, variance, and skewness of the three spectral emission sensors (\({\underline{p}}\), \({\underline{q}}\), and \({\underline{r}}\)) and (2) the mean and variance of the scan speed, hatch spacing, and energy density (\({\underline{v}}\), \({\underline{w}}\), and \({\underline{e}}\)).
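A sketch of this feature extraction step; a k-d tree stands in for the full distance matrix \(\pmb {A}\) of Eq. 3 for tractability, and in practice the nearest-neighbor search would be restricted to PBF samples from the same cube.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import skew

def extract_features(pbf_xyz: np.ndarray, pbf_signals: np.ndarray,
                     coords: np.ndarray) -> np.ndarray:
    """For each porosity / porosity-free coordinate, find the five nearest
    PBF samples and compute per-channel statistics.

    pbf_xyz: (I, 3) laser coordinates; pbf_signals: (I, 6) columns
    [p, q, r, v, w, e]; coords: (N, 3) query coordinates. Returns (N, 15).
    """
    tree = cKDTree(pbf_xyz)
    _, idx = tree.query(coords, k=5)      # (N, 5) nearest-sample indices
    hoods = pbf_signals[idx]              # (N, 5, 6) neighborhoods
    mean = hoods.mean(axis=1)             # (N, 6)
    var = hoods.var(axis=1)               # (N, 6)
    skw = skew(hoods[:, :, :3], axis=1)   # (N, 3): skewness of p, q, r only
    return np.hstack([mean, var, skw])    # 6 + 6 + 3 = 15 features
```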

Supervised machine learning

The input (extracted features) and labeled target data present a Bernoulli distributed classification problem with a nominal dichotomous target \(\{0,1\}\), where 0 indicates porosity-free and 1 indicates a porosity. To model the non-linear relationship between the input and target, we use the two-layer feed-forward multi-layer perceptron (MLP), a universal approximator that is non-linear in its parameters, within the framework of prior-guided neural networks. The MLP in this work consists of 15 input neurons, one hidden layer, and an output layer with one neuron. We use a non-linear rectified linear unit (ReLU) activation function for the hidden layer and a logistic activation function for the output layer. The NN cost function in this work consists of an empirical loss function \({\mathcal {L}}_{\mathrm{{e}}}\) and a regularization loss function \({\mathcal {L}}_{\mathrm{{r}}}\) (Eq. 4).

To utilize the prior data set, we use a prior-guided neural network as in Atwya and Panoutsos (2022), Liu and Wang (2019), Muralidhar et al. (2018), and Jagtap et al. (2020). This class of prior-guided neural networks incorporates prior-based loss functions into the cost function of a vanilla NN. We add the prior knowledge-based loss function \({\mathcal {L}}_{\mathrm{{p}}}\) to the cost function in Eq. 5. Atwya and Panoutsos (2022) cover the structure optimization of prior-knowledge guided neural networks and the impact of data complexity, hyperparameters, and the data set size.

The prior-guided framework utilized leads to three hyper-parameters: regularization loss weight (\(\rho _{\mathrm{{r}}}\ge 0\)), prior-based loss weight (\(\rho _{\mathrm{{p}}}\ge 0\)), and the hidden unit number (Eqs. 4 and 5). The work in Atwya and Panoutsos (2022) shows that structure-optimized prior-guided neural networks are not affected by non-adaptive regularization and prior-based loss weights. In this work, we utilize the structure-optimization methods in Atwya and Panoutsos (2022) and set the regularization and prior-based loss weights as constants. We discuss the benchmark and prior-guided model variants examined in section 5.1.

$$\begin{aligned} {\mathcal {L}}_{NN} = {\mathcal {L}}_{\mathrm{{e}}} + \rho _{\mathrm{{r}}} {\mathcal {L}}_{\mathrm{{r}}} \end{aligned}$$
(4)

where \({\mathcal {L}}_{\mathrm{{e}}}\) is the empirical loss, \({\mathcal {L}}_{\mathrm{{r}}}\) is the regularization loss, and \(\rho _{\mathrm{{r}}}\) is the regularization loss weight.

$$\begin{aligned} {\mathcal {L}}_{PGNN} = {\mathcal {L}}_{\mathrm{{e}}} + \rho _{\mathrm{{r}}} {\mathcal {L}}_{\mathrm{{r}}} + \rho _{\mathrm{{p}}} {\mathcal {L}}_{\mathrm{{p}}}, \end{aligned}$$
(5)

where \({\mathcal {L}}_{\mathrm{{p}}}\) is the prior loss and \(\rho _{\mathrm{{p}}}\) is the prior-based loss weight.

Evaluation

In this Section, we discuss the benchmark models, the prior-guided model variants, and the model training, validation, and testing framework.

Model variants

To benchmark the proposed prior-guided neural network approach, we test a support vector machine (SVM), a random forest (RF), and a vanilla multi-layer perceptron neural network (NN). Below is a list of the prior-guided and benchmark models:

  • PGNN: A prior-guided neural network with a prior loss and one hidden layer with 100 hidden units.

  • PGNN\(_1\): A prior-guided neural network with a prior loss and one hidden layer where the number of hidden units is optimized via a line search.

  • NN: A vanilla multi-layer perceptron NN with 100 hidden units (no prior usage).

  • SVM: A support vector machine model with a radial basis function (RBF) kernel (no prior usage).

  • RF: A random forest model with 500 trees and one minimum observation per leaf (no prior usage).

Model training

We split the data set (9466 samples) such that half is used to train and validate the models and the second half is used to test the models. The second half of the data, referred to as the out-of-sample (OOS) data, is reserved and only used to test the model performance after training and validation. Before splitting the data, the input and target row order is randomized. The minimum and maximum porosity volumes in the training/validation and testing data splits are provided below. Note that the minimum porosity volumes provided below are for reference and exclude the porosity-free samples.

  • Training/validation data minimum porosity volume (SED): 0.23e\(-\)04 mm\(^3\) (35.28 \(\upmu \)m)

  • Training/validation data maximum porosity volume (SED): 62.72e\(-\)04 mm\(^3\) (228.81 \(\upmu \)m)

  • Testing data minimum porosity volume (SED): 0.29e\(-\)04 mm\(^3\) (38.12 \(\upmu \)m)

  • Testing data maximum porosity volume (SED): 82.80e\(-\)04 mm\(^3\) (251.00 \(\upmu \)m)

The empirical loss \({\mathcal {L}}_{\mathrm{{e}}}\) and prior loss \({\mathcal {L}}_{\mathrm{{p}}}\) were computed via the cross-entropy cost function (Eqs. 6 and 7). The regularization loss \({\mathcal {L}}_{\mathrm{{r}}}\) was computed via L2 regularization (Eq. 8). The regularization weight was set as \(1e-3\) for the NN model. The regularization and prior weights were both set as \(1e-3\) for the PGNN models. Each input feature was transformed via z-score standardization (i.e. zero mean and unit standard deviation). The NN and PGNN model weights were initialized via the He initialization method (He et al., 2015). The loss functions in Eqs. 4 and 5 were minimized via the scaled conjugate gradient (SCG) optimization algorithm to find the optimal model weights. The SCG algorithm was allowed a maximum of 1e3 iterations to find the solution. Note that the mean and standard deviation of the input features in the training data set were used to normalize (z-score standardization) the input features in the prior data sets.

The SVM model was trained by minimizing the \(L_1\)-norm of the slack variables via the Sequential Minimal Optimization algorithm (Fan et al., 2005). The Sequential Minimal Optimization algorithm was allowed 1e8 iterations to find the solution.

In the RF ensemble model, each tree was created via bootstrap samples of the input data. Note that for each tree, data samples not included in the bootstrap sample are considered the out-of-bag samples. Breiman’s random forest algorithm (Breiman, 2001) is utilized to train the RF ensemble via the CART algorithm (Breiman, 2017) and Gini’s diversity index.

$$\begin{aligned} {\mathcal {L}}_{\mathrm{{e}}} = -\sum _{i=1}^{I} \bigg ( {\underline{y}}_i \ln \left( \underline{{\hat{y}}}_i \right) + \left( 1 - {\underline{y}}_i \right) \ln \left( 1 - \underline{{\hat{y}}}_i \right) \bigg ), \end{aligned}$$
(6)

where I is the number of data points and \({\hat{y}}_i\) and \(y_i\) are the prediction and target for the \(i^{th}\) input vector.

$$\begin{aligned} {\mathcal {L}}_{\mathrm{{p}}} = -\sum _{i=1}^{I} \bigg ( {\underline{z}}_i \ln \left( \underline{{\hat{z}}}_i \right) + \left( 1 - {\underline{z}}_i \right) \ln \left( 1 - \underline{{\hat{z}}}_i \right) \bigg ), \end{aligned}$$
(7)

where I is the number of data points and \({\hat{z}}_i\) and \(z_i\) are the prediction and target for the ith input vector of the prior data set.

$$\begin{aligned} {\mathcal {L}}_{\mathrm{{r}}} = \frac{1}{2} \underline{w}^T\underline{w}, \end{aligned}$$
(8)

where \(\underline{w}\) is a vector of the model weights.
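A compact sketch of the cost functions in Eqs. 4-8; the clipping constant guarding the logarithms is a numerical-stability assumption, and the model weights are assumed flattened into a single vector.

```python
import numpy as np

def cross_entropy(y_hat: np.ndarray, y: np.ndarray) -> float:
    """Binary cross-entropy summed over samples (Eqs. 6 and 7)."""
    eps = 1e-12                          # numerical guard (assumption)
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def pgnn_cost(y_hat, y, z_hat, z, weights, rho_r=1e-3, rho_p=1e-3) -> float:
    """Total PGNN cost per Eq. (5): empirical loss on the labeled data,
    L2 regularization (Eq. 8), and the prior loss evaluated on the
    porosity-free prior data set (targets z = 0). Setting rho_p = 0
    recovers the vanilla NN cost of Eq. (4)."""
    l_e = cross_entropy(y_hat, y)
    l_r = 0.5 * float(np.dot(weights, weights))
    l_p = cross_entropy(z_hat, z)
    return l_e + rho_r * l_r + rho_p * l_p
```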

Model validation

The optimization of the number of hidden units and the use of the prior loss (PGNN and PGNN\(_1\)) are performed as in Atwya and Panoutsos (2022). The coarse search over the number of hidden units was performed across a linearly spaced vector of 8 values within [30, 100]. The coarse search is followed by a fine search with a vector of 10 linearly spaced points in the range of \(\pm 10\) around the best coarse estimate of the number of hidden units. We use the same model training, validation, and testing framework as in Atwya and Panoutsos (2022).

To validate the NN and PGNN models, we use three-time-repeated 5-fold cross-validation (CV). The 5-fold CV is repeated three times for every randomly initialized set of weights using re-divided data subsets. The empirical and prior-based CV errors were measured via the cross-entropy function.
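A sketch of this validation loop using scikit-learn's RepeatedKFold; fit_pgnn is a hypothetical training routine standing in for the SCG-based training described above, and cross_entropy is the helper from the previous sketch.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold

def repeated_cv_errors(X: np.ndarray, y: np.ndarray, fit_pgnn, seed=0) -> np.ndarray:
    """Three-time-repeated 5-fold CV with re-divided subsets; returns the
    per-fold cross-entropy validation errors."""
    rkf = RepeatedKFold(n_splits=5, n_repeats=3, random_state=seed)
    errors = []
    for train_idx, val_idx in rkf.split(X):
        model = fit_pgnn(X[train_idx], y[train_idx])   # hypothetical trainer
        p_val = model.predict(X[val_idx])              # predicted probabilities
        errors.append(cross_entropy(p_val, y[val_idx]))
    return np.array(errors)
```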

The RBF kernel scale and the box constraint penalty hyperparameters of the SVM model are optimized via a grid search. The hyperparameter search is performed over a log-scaled grid in the range \([1e-3,1e3]\). The SVM model was validated via 5-fold CV with the \(L_1\)-norm of the slack variables as the empirical CV performance index.

The RF model was validated via the out-of-bag data. Roughly two-thirds of the input data is selected for training each tree, and the remaining one-third is used as the out-of-bag validation data set.

Each NN/PGNN model was trained and validated 64 times using randomly initialized weights, and the mean/median results are provided in section 6. The SVM model was trained and validated 64 times using randomly re-partitioned data in the CV framework. The RF model was also trained and validated 64 times using randomly re-partitioned data in the out-of-bag framework.

Model testing

Following training and validation, the mean and standard deviation of the input features in the training data set were used to normalize (z-score standardization) the input features in the testing OOS data set. The classification performance of each model is tested on the OOS data set. Note that the OOS data set were not used in the training/validation steps.

Fig. 8 The classification accuracy mean and \(95.00\%\) confidence interval across the 64 solutions of each model type. The results are color coded and labeled according to the porosity spherical equivalent diameter range (Color figure online)

Results and analysis

Because the NN and PGNN variants are trained with the log-loss (cross entropy) function, they predict the probability of a sample belonging to a class along the interval [0, 1]. Across the 64 randomized solutions of the NN and PGNN model variants, we apply Youden’s Index cutoff threshold to the model predictions, providing a trade-off between the hit and false-alarm rates. Youden’s Index cutoff threshold is a decision-making step in which the porosity status of a sintered coordinate is decided from the classifier’s predicted probability. Utilizing the standard Youden’s Index cutoff threshold also implies that sensitivity and specificity are treated as equally important when making classifications. In this work, the SVM and RF models output a binary value of either 0 for predicted nominal samples or 1 for predicted porosity samples.
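Youden’s Index selects the cutoff that maximizes \(J = \text {sensitivity} + \text {specificity} - 1\). A minimal sketch, assuming scikit-learn’s roc_curve and a hypothetical youden_threshold helper:

```python
import numpy as np
from sklearn.metrics import roc_curve

def youden_threshold(y_true, y_prob):
    """Return the probability cutoff maximizing Youden's J = TPR - FPR."""
    fpr, tpr, thresholds = roc_curve(y_true, y_prob)
    return thresholds[np.argmax(tpr - fpr)]

# Usage: binarize probabilistic predictions at the Youden cutoff.
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8])
cutoff = youden_threshold(y_true, y_prob)
y_pred = (y_prob >= cutoff).astype(int)
```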

Using the raw model predictions on [0, 1], the NN, PGNN, and PGNN\(_1\) achieve an area under the curve of \(0.77\pm 0.01\), \(0.77\pm 0.01\), and \(0.79\pm 0.00\), respectively. The NN, PGNN, and PGNN\(_1\) Youden’s Index values are \(0.35\pm 0.01\), \(0.36\pm 0.02\), and \(0.37\pm 0.03\), respectively. Using the Youden’s Index cutoff threshold results for the NN and PGNN variants, we assess the mean and standard deviation of the classification accuracy. We report the median and interquartile ranges for the true positive count, true negative count, false positive count, false negative count, precision, recall, and F-score. The structurally optimized prior guided neural network (PGNN\(_1\)) achieved the best cross entropy error at a median [interquartile range] number of hidden units of 20.00 [20.00, 22.00].

Figure 8 shows the classification accuracy mean and standard deviation across the entire OOS data set (combined) and across different porosity volume ranges within the OOS data. Over the entire OOS data (combined), the RF model achieves the highest classification accuracy (\(72.4\pm 0.04 \%\)), followed by PGNN\(_1\) (\(70.66\pm 0.19\%\)). The NN and PGNN variants outperform the SVM and RF models at detecting the porosity samples as opposed to the porosity-free samples. For porosities with a SED \(<50.00\upmu \)m, PGNN\(_1\) outperforms NN, RF, and SVM by \(2.68\%\), \(21.22\%\), and \(49.83\%\), respectively. The RF and SVM models, in turn, outperform the PGNN\(_1\) model by \(19.97\%\) and \(16.49\%\), respectively. The classification results show that the NN and PGNN variants have a higher recall but lower precision than the SVM and RF models. As the porosity volume increases, the classification accuracy discrepancy between the NN/PGNN models and the SVM/RF models decreases. As expected, the saturation of classification accuracy at larger porosity volumes shows that larger porosities are easier to detect. The precision and recall discrepancy between the SVM/RF and NN/PGNN models is likely due to (1) the large data set size being more suited to the NN/PGNN models relative to the SVM, and (2) the porosity samples’ input features or characteristics being more difficult to detect or less well-represented in the data. Based on classification accuracy alone, the RF model would seem to be the best option. However, in additive manufacturing applications, detecting porosities is more important than detecting nominal samples, and PGNN\(_1\) is a more suitable choice for porosity localization.

Fig. 9 The precision, recall, and F-score box plot across the 64 solutions of each model type. The box plot shows the median, interquartile range, minimum, and maximum values. Outliers are marked as black dots; an outlier is a value that is more than 1.5 times the interquartile range away from the bottom or top of the box (Color figure online)

Fig. 10 The true positive (TP), true negative (TN), false positive (FP), and false negative (FN) box plot across the 64 solutions of each model type. The box plot shows the median, interquartile range, minimum, and maximum values. Outliers are marked as black dots; an outlier is a value that is more than 1.5 times the interquartile range away from the bottom or top of the box (Color figure online)

Fig. 11 The classification accuracy mean and \(95.00\%\) confidence interval of the SVM, RF, and NN models across the 64 solutions of each model type; a porosities SED \(< 60.00\) \(\upmu \)m and b porosities SED \(\ge 60.00\) \(\upmu \)m

Fig. 12 The classification accuracy mean and \(95.00\%\) confidence interval of the NN and PGNN models across the 64 solutions of each model type; a porosities SED \(< 60.00\) \(\upmu \)m and b porosities SED \(\ge 60.00\) \(\upmu \)m

The precision and recall results in Fig. 9 confirm the classification accuracy results: the NN and PGNN variants have a higher recall but lower precision relative to the SVM and RF models. The F-score [0, 1] is defined as the harmonic mean of the model’s precision and recall; an F-score of 1 implies a perfect model. In this study, we use the standard F-score, where precision and recall are equally weighted. The F-score provides a more balanced view of the models’ performance relative to the classification accuracy, which only considers the number of correct classifications. The SVM model has the lowest F-score median and interquartile ranges of 0.687 [0.687, 0.688]. The PGNN\(_1\) model has the largest F-score median and interquartile ranges of 0.735 [0.731, 0.737]. In AM porosity detection and localization, especially for safety-critical applications such as aerospace, false negatives (misclassified porosities) are more costly than false positives, and a higher recall is more important. Since PGNN\(_1\) has the highest recall and F-score performance, it is a more suitable choice for porosity localization. As our data set is balanced, the F-score results and findings agree with the classification accuracy results. The true positive (TP), true negative (TN), false positive (FP), and false negative (FN) results in Fig. 10 further demonstrate that the NN and PGNN variants are less likely to misclassify porosity samples.
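For reference, the equally weighted F-score used here is

$$\begin{aligned} F = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}, \end{aligned}$$

so a model only scores highly when precision and recall are both high.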

Figure 11 shows the SVM, RF, and NN classification accuracy and \(95.00\%\) confidence interval for different porosity volumes across the 64 solutions of each model type. The SVM model has no standard deviation in its predictions across the 64 solutions, even though the data was re-partitioned randomly for each solution. This is likely due to the SVM model failing to capture the variation in the data (under-fitting) and making the same predictions for the porosity class. The NN has a wider confidence interval relative to the RF model, but is still likely to outperform the RF model with a \(95.00\%\) level of confidence. In Fig. 11a and 11b, we can also observe an increase in the classification accuracy as the porosity SED value increases, indicating, as expected, that larger porosities are easier to detect. Figure 12 shows the NN, PGNN, and PGNN\(_1\) classification accuracy and \(95.00\%\) confidence interval for different porosity volumes across the 64 solutions of each model type. The NN, PGNN, and PGNN\(_1\) models have similar confidence intervals across the porosity volume range.

Table 1 The classification accuracy average percentage change of each model type relative to the neural network model
Table 2 The log loss error average percentage change of each model type relative to the neural network model

Tables 3 and 4 in Appendix B show the classification accuracy and log loss error average and standard deviation on the testing OOS data across a finer porosity spherical equivalent diameter range. We utilize the two-sample Hotelling’s T\(^2\) test for independent samples to determine whether the classification accuracy percentage change and log loss error change are statistically significant. To decide whether to perform a homoscedastic or a heteroscedastic T\(^2\) test, we use the multivariate statistical test Box’s M to check the equality of multiple covariance matrices. The Box’s M test assumes multivariate normality (Schumacker, 2015). The classification accuracy and log loss errors in Tables 3 and 4 mostly have a univariate and multivariate (relative to the NN model results) normal distribution according to quantile-quantile plots, with the exception of the [200, 250] and 251.002 SED results (quantile-quantile plots omitted). The [200, 250] and 251.002 SED results are therefore omitted from the T\(^2\) test analysis. The SVM model is also omitted from the T\(^2\) test analysis as it lacks variance in its predictions across the 64 solutions with randomized cross-validation re-partitioning. In this work, Box’s M takes on a Chi\(^2\) approximation, which is accurate for group sample sizes larger than or equal to 20.

A non-significant Box’s M test result (P \(>0.05\)) indicates that the covariance matrices are not significantly different. If the covariance matrices are not significantly different (homoscedastic) and the groups’ sample size is at least 50, Hotelling’s T\(^2\) test takes a Chi\(^2\) approximation, otherwise it takes an F approximation. If the covariance matrices are significantly different, Hotelling’s T\(^2\) test takes a Chi\(^2\) approximation. We perform the Box’s M and two-sample Hotelling’s T\(^2\) tests to assess whether the prior-guided neural networks have a statistically significant (P \(<0.05\)) effect on the classification accuracy and the log loss error relative to the NN model (Tables 6, 7). We also perform the Box’s M and two-sample Hotelling’s T\(^2\) tests to assess whether the RF model has a statistically significant (P \(<0.05\)) effect on the classification accuracy relative to the NN model (Table 5).
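As a minimal sketch of the homoscedastic branch of this procedure, the pooled-covariance two-sample Hotelling’s T\(^2\) test with the F approximation can be written as follows; hotelling_t2 is a hypothetical helper, not code from this work:

```python
import numpy as np
from scipy import stats

def hotelling_t2(x, y):
    """Two-sample Hotelling's T^2 test with pooled covariance (homoscedastic
    case) and the F approximation; x is (n, p), y is (m, p), with p >= 2."""
    n, p = x.shape
    m = y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled covariance, justified when Box's M does not reject equality.
    s_pooled = ((n - 1) * np.cov(x, rowvar=False)
                + (m - 1) * np.cov(y, rowvar=False)) / (n + m - 2)
    t2 = (n * m) / (n + m) * diff @ np.linalg.solve(s_pooled, diff)
    # Convert T^2 to an F statistic with (p, n + m - p - 1) degrees of freedom.
    f_stat = (n + m - p - 1) / (p * (n + m - 2)) * t2
    p_value = stats.f.sf(f_stat, p, n + m - p - 1)
    return t2, p_value
```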

Table 1 shows the classification accuracy percentage change of the SVM, RF, PGNN, and PGNN\(_1\) models relative to the NN model. The highlighted cells indicate a statistically significant percentage change (P \(<0.05\)) based on Hotelling’s test. The RF model has a positive statistically significant effect on the porosity-free samples relative to the NN model. However, the RF model has a negative statistically significant impact on the porosity samples relative to the NN model. Overall, the RF model has a \(4.31\%\) classification accuracy percentage improvement relative to the NN model; however, as noted earlier, the NN outperforms the RF model in terms of recall and F-score and is more suited to AM applications. Across the OOS data set, the PGNN model has no statistically significant change in the classification accuracy relative to the NN model. The PGNN\(_1\) model has an overall statistically significant classification accuracy percentage improvement of \(1.74\%\) relative to the NN model. Specifically, relative to the NN model, the PGNN\(_1\) model statistically significantly improves the classification accuracy of the porosity-free class and the porosity SED ranges [44.00, 48.00], [60.00, 80.00], and [150.00, 200.00]. The PGNN\(_1\) model’s classification accuracy improvement and relatively balanced recall and precision show the benefits of utilizing the abundant nominal data in the framework of structurally optimized prior-guided neural networks (Atwya & Panoutsos, 2022).

Table 2 shows the log loss error percentage change of the PGNN and PGNN\(_1\) models relative to the NN model. The highlighted cells indicate a statistically significant percentage change (P \(<0.05\)) based on Hotelling’s test. The PGNN model has no statistically significant impact on the log loss error relative to the NN model. The PGNN\(_1\) model statistically significantly reduces the log loss error over the OOS data set by \(11.63\%\) relative to the NN model. Specifically, PGNN\(_1\) statistically significantly reduces the log loss error for the porosity-free class and for the porosity SED range \([40.00, 200.00] \upmu \)m. The PGNN\(_1\) model’s improvement of the log loss error increases the likelihood of a prediction being correct. An improved log loss error is significant for porosity mitigation control applications, where the control action can be proportionate to the model’s prediction.

Conclusion

This work examines the use of photodiode sensory data in a machine learning framework for PBF in-situ porosity detection and localization. We propose, for the first time in the literature, a framework that can make in-situ porosity localization predictions via a neighborhood of within-hatch-stripe data. In-situ porosity localization via within-hatch-stripe sensory data achieves quicker predictions without the need for surrounding hatch stripes or preceding/succeeding layer data. The capability to make porosity localization predictions exclusively from within-stripe data potentially facilitates quicker within-layer defect mitigation via altering the sintering process settings of the surrounding hatch stripes. Within-layer defect localization and mitigation can improve the quality of AM parts and improve AM compliance with the reliability and safety certification requirements of industries such as the aerospace industry (Blakey-Milner et al., 2021).

As proposed in this work, utilizing the abundant information-rich nominal PBF data in the form of a prior loss (PGNN\(_1\)) statistically significantly reduces the log loss score standard deviation and results in higher confidence in the model’s predictions. An improved log loss error is significant for porosity mitigation control applications where the control action can be proportionate to the model’s prediction.

In-situ porosity localization via within-hatch-stripe sensory data also significantly reduces the amount of sensory data stored in-situ (towards making predictions) and offline. In our work, only five sensory samples from the current hatch stripe are utilized in making a prediction, rather than a three-dimensional neighborhood of data from preceding/succeeding layers. In this work, the InfiniAM sensory data is 18.80 GB for a total sinter volume of 46.08 cm\(^3\). In an industrial reliability and safety certification setting where the data associated with the porosities and an equal amount of porosity-free data is stored for reference, our methods reduce the amount of data stored to 0.33 GB, cutting data storage costs.

To the authors’ knowledge, this is the first work in the literature capable of localizing micro-porosities with a SED \(\le 50.00\) \(\upmu \)m, and as small as 38.118 \(\upmu \)m SED, with an average classification accuracy and \(95.00\%\) confidence interval of \(73.13\pm 1.57\%\). The proposed method’s localization of porosities as small as 38.118 \(\upmu \)m SED is also more than a five-fold improvement on the smallest SED porosity localization via photodiode sensory data in the literature (Snow et al., 2022).

This work has demonstrated the potential of utilizing prior-guided neural networks and spectral emissions sensors towards in-situ porosity localization in LPBF. Further research may investigate applying the proposed methods in an in-situ manner to monitor the build’s quality and facilitate the early stopping/correction of builds.

A research question yet to be addressed in the literature is determining the informativeness of sensory samples within a three-dimensional porosity neighborhood. Typically, in the literature, the porosity status for a given coordinate within a layer is predicted via all the data from the layer in question and a given number of preceding and succeeding layers. In our work, we proposed utilizing the closest five sensory samples to a given coordinate in terms of the three-dimensional Euclidean distance. However, future research can focus on quantifying the importance of a given sensory sample within a three-dimensional porosity neighborhood towards making a porosity prediction. A greater focus on data fidelity could significantly improve the empirical accuracy of porosity localization and shed light on the porosity formation mechanism. A better understanding of data fidelity could also enable the detection of porosities before they occur and contribute towards Right First Time manufacturing.

A general shortcoming of NNs and PGNNs is interpretability. With AM porosity detection and quantification as a motivator, further research might explore developing new prior incorporation frameworks to combine physics-based computational fluid dynamics models and PGNNs towards more interpretable modeling solutions. Further research might also explore whether ML frameworks could be standardized across materials, varying part geometry, and sintering machines for porosity detection and localization. The question is where to draw the line between ML frameworks for generalizable applications of low-quantity/high-variety products and tailored high-quantity/low-variety products.

An overarching limitation in additive manufacturing research and application is the variation in sensory systems across different sintering machines and the limited open-source data available. An open-source multi-material platform with open-source data collation and an online repository is necessary and would significantly speed up progress in industrial applications, standardization, and research.