Introduction

Nuclear Forensics (NF) is a research discipline that focuses on the study of radioactive materials. For more than 50 years, international safeguards have been employed to confirm that nuclear materials disclosed by a state to the IAEA are used for peaceful purposes [2]. One of the safeguards regime's main strategies is based on environmental sampling and analysis to discover radioactive fingerprints of undeclared activity. Environmental sampling as a routine part of safeguards inspections was approved in 1995 [3]. Environmental sample analysis provides information about nuclear activity in terms of the form, size, and surface structure of uranium microparticles in industrial dust at nuclear plants, often using scanning electron microscopy [4].

Particle isotopic analysis in NF has grown quickly in the past two decades due to advances in technologies for particle isotopic composition determination [5,6,7,8]. Forensic measurements in this field support action against illicit use of materials, guiding nations to improve their nuclear defense capabilities using scientific methods, evidence, and additional sources, including law enforcement and intelligence agencies [9].

NF uses analytical techniques for measurements of the isotopic, chemical, and physical properties of nuclear material, as well as traditional forensic evidence such as DNA samples, hair, fingerprints, tool marks, and gunpowder residue, to identify fissile radioactive material and establish connections to individuals, locations, and incidents. Fission Track Analysis (FTA) detects fissile compounds. Radioactive samples from nuclear incident sites are collected for study. After collection on dedicated sampling paper, the samples are sandwiched between two polycarbonate Lexan resin detector sheets and bombarded with a thermal neutron flux to create fission tracks. The tracks form star-like formations (fission track clusters) that trace the trajectories of nuclear fission products. These remnants can reveal details about the measured substance's origin. Taken together with advanced statistical modelling methods to extract additional information from radiological NF investigations [4], FTA can advance the utility of NF measurements.

Machine Learning (ML) is a branch of AI that enables systems to learn from experience without explicit programming [10,11,12]. The software learns patterns in a sample set to decide how to handle new information. Recently, ML and its derivatives, particularly deep learning (DL), have evolved considerably due to the exponential growth of information capture, storage, and processing capacities [13]. This approach is well suited to computer vision (CV) applications [14, 15], and integrates across various academic and engineering fields [16,17,18,19]. In recent years, FT cluster identification has moved from manual procedures on a database of thousands of images to automation, easing researchers' labor and providing more robust and accurate analysis. This study aims to improve field automation and propose an innovative DL-based solution for FTA using segmentation and classification of nuclear fission track shapes in microscopic images, as part of our efforts to improve FTA as applied to NF.

Theory

This section provides an overview of the FTA process and presents a broad perspective on the selected technologies used for identification purposes in this study.

FTA

One advantage of the FTA method is its ability to distinguish fissile materials from background elements; roughly one particle in 1,000,000 is fissile and significant. Sample collection is carried out using suitable safety and contamination control procedures. Potentially radioactive samples are collected on particle-free paper from items in the prescribed area, and these particles are sealed in Lexan foil. To simplify collection, a mini-bulk model was built. This approach determines the average enrichment and concentration of the material, guiding the micro-bulk transfer quantity. The mean quantity of particles can indicate the existence of artificial chemicals, such as uranium isotopes, providing an initial forensic investigative trail. Mini-bulk sampling involves sampling a reference item and a suspected contaminated item. To dissolve the particles, the pieces are cut into 1 cm² squares and soaked in heptane (C7H16) in a test tube. The sample preparation method follows three steps: (1) completion of a mini-bulk preparation for ICP-MS testing, (2) preparation of a 300–500 picogram sample, and (3) attachment of the sample to a 31 × 30 mm Lexan surface to construct a Lexan detector.

After sample preparation, a reactor irradiates the detector, assembled as a sandwich of several detectors, with thermal neutrons. This irradiation exposes the detector to a thermal neutron flux of 5 × 10¹³ n/cm²·s for 20 min, inducing forced fissions in the detector. After irradiation, the detector sandwich cools for three days before being returned to the lab. After cooling, markers are securely mounted to the top and lower detectors at three locations, creating a coordinate system for navigation between the detectors. Then, fission tracks from the material under test are developed and imaged in a lab. The particle's spatial placement determines its identification, and several chemical methods will test it. There are several star configurations that represent the orbits of nuclear fission products [20]. About 700 images are acquired using a transmitted light microscope equipped with a motorized XY scanning stage and automated Z-focus. There is a list of landmarks. The shapes are cut to verify the precision of preparation, casting, screening, and cutting. If an FT cluster is visible in the configuration, the procedure was correct and the substance is real.

The FTA method can identify fissile radioactive material via spontaneous or induced fissions. Detector grooves show fission trail signatures and indicate the fission site. Since the fission fragments are emitted isotropically, the fission products form a star with a circular border. FTA offers exact data on FT location, so that the sample associated with any FT cluster can be excised and sent for ICP-MS investigation. If uranium is present and the enrichment level is abnormal, forensic proof can be obtained. Generally accepted guidelines for enrichment levels are as follows: natural: 0.72%, commercial power reactors: 4–5%, research reactors: 20%, and nuclear weapons: 90% and higher. Figure 1 illustrates a schematic representation of the procedure.
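For orientation, the total thermal neutron fluence delivered during such an irradiation follows directly from the stated flux and exposure time:

$$\Phi = 5\times 10^{13}\ \mathrm{n\,cm^{-2}\,s^{-1}} \times (20\times 60)\ \mathrm{s} = 6\times 10^{16}\ \mathrm{n\,cm^{-2}}$$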

Fig. 1 Schematic flow of process of creating fission track cluster

Identification processes

Until 2017, FT cluster detection was done manually, like "finding a needle in a haystack". Recently, automated algorithms have been developed for FT cluster identification [21, 22]. Image processing software with a suitable user interface was used for identification. This technology overcomes biases associated with manual identification. For example, when people know the source of a forensic sample, they tend to work harder to find FT clusters. In one study [23], a 10% increase in cluster identification (false positives, FP) was observed when the examiner understood the relevance of the sample. The deviation between testers with similar training was found to be 15% [23]. AI-based systems may have advantages over previous technology; the objectivity of autonomous systems is a major asset. Manual scanning and detection of FT clusters requires two staff members and two days. Modern image processing methods can detect FT clusters from microns to roughly 300 µm in size. Depending on the importance of the sample, repeated testing by the same examiner yielded 90% accuracy, while using different testers reduced accuracy to 85% and raised the false positive rate by 5% [23]. Figure 2 shows an example artifact and a variety of stars observed in our lab. The tested material is natural sand containing uranium [24]. The stars are located and the quantity of tracks characterizes the material as fissile. It is important to note that a large natural uranium particle can form a star like that of a small highly enriched particle; the shape of a star only shows the amount of fissile material present, not its enrichment.

Fig. 2 Types of stars: (A) black center with a long halo, (B) huge without a black center, (C) small black center, (D) rich, (E) poor, (F) artifact

This study used a U-Net neural network architecture, based on a Fully Convolutional Network (FCN) [25], to segment and classify fission track clusters in microscope images. Neural networks, also known as artificial neural networks, are computational models inspired by biological neural networks and used in machine learning. Textured star shapes are difficult to recognize because of the wide range of shapes and sizes they can take in an image, and because contaminants can be mistaken for FT clusters. Noise and blur are common problems with microscope images, and typical image processing models are excessively sensitive to these problems. Clusters have highly varied shapes and occasionally overlap each other, causing standard segmentation and classification procedures to lose accuracy. Neural network-based semantic segmentation algorithms for image processing have advanced recently, so these tools are now available to address the shortcomings of classical image processing workflows. Semantic segmentation labels each image pixel (e.g. person, road, sky, ocean, car).

Semantic segmentation is utilized in autonomous cars, industrial inspections, satellite image-based terrain classification, and medical imaging analysis [26,27,28,29]. Currently, FCN algorithms are considered the state of the art in practical neural network performance [25]. These algorithms have yielded positive results in many segmentation challenges [30, 31]. The FCN architecture has only locally connected layers, such as convolution, pooling, and up-sampling. Avoiding dense layers reduces the number of parameters, making network training faster. U-Net networks [32] were developed by adapting the FCN architecture for biomedical image segmentation. This adaptation allows training with fewer images and improves segmentation accuracy. For segmentation and classification purposes, an algorithm based on a U-Net network is a viable option, since the U-Net captures the data's different shapes and textures, enabling accurate prediction and labeling of the stars. The U-Net topology has skip links between the encoder and decoder, pairs of convolutional layers with ReLU activation functions and normalization, and max-pooling layers. Convolution creates image feature maps: kernels, or square two-dimensional filters, are applied to the input data at fixed spatial locations, and the convolution between the kernel and the input produces the output.

The selection of the U-Net architecture was motivated by its known efficacy and resilience in handling diverse spatial variations in object size, texture, and shape [33]. A model can be assessed using precision and recall (sensitivity). Precision measures the proportion of relevant instances among the retrieved instances, while recall measures the proportion of relevant instances that were successfully retrieved, where:

$$Recall=\frac{TP}{TP+FN} ;\,\, Precision=\frac{TP}{TP+FP}$$
(1)
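As a minimal illustration of Eq. (1) (not the evaluation code used in this study), pixel-level recall and precision can be computed in MATLAB from a predicted binary mask and its ground truth; the small masks below are hypothetical:

% Minimal sketch: pixel-level recall and precision (Eq. 1) for a binary segmentation.
% The 5 x 5 masks below are hypothetical stand-ins for a prediction and its ground truth.
predMask = logical([0 1 1 0 0; 0 1 1 0 0; 0 0 1 0 0; 0 0 0 0 0; 0 0 0 0 1]);
gtMask   = logical([0 1 1 0 0; 0 1 1 0 0; 0 1 1 0 0; 0 0 0 0 0; 0 0 0 0 0]);

TP = nnz( predMask &  gtMask);   % cluster pixels correctly detected
FP = nnz( predMask & ~gtMask);   % background pixels wrongly marked as cluster
FN = nnz(~predMask &  gtMask);   % cluster pixels that were missed

recall    = TP / (TP + FN);      % sensitivity
precision = TP / (TP + FP);
fprintf('Recall = %.2f, Precision = %.2f\n', recall, precision);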

Contemporary cluster detection methods achieve 80% precision. However, manual scanning causes inconsistency, so professional quality control is required, and this is time-consuming. Cluster detection and classification are difficult because of the clusters' varied forms and sizes. This makes it hard to distinguish stars from other non-star phenomena such as scratches, dirt, and shadows, resulting in many false positives. This study seeks a solution to these issues.

Our current focus in forensic segmentation and identification is demonstrating FTA segmentation capacity. An image processing workflow based on the Fiji distribution of ImageJ [35] was developed as part of an ongoing research collaboration between the International Atomic Energy Agency and the Israel Atomic Energy Commission. This algorithm finds three types of prominent FT clusters [22, 36]. The primary challenge lies in identifying sparse FT clusters. We will use DL methods to improve the existing approach by reclassifying sparse clusters as a discrete category rather than noise. At the beginning of our inquiry, we detected dense clusters using only 30 observations; for sparse clusters, adding data improved performance. The ultimate goal is to recognize roughly seven classes of clusters, including floating, asymmetric, and complex clusters, and eventually to automate FT cluster identification for about ten categories. No methodology currently supports this.

The U-Net technique was chosen for its robust performance on image segmentation problems with relatively small datasets. In this study, the dataset had over 100 labeled observations per category.

Experimental

The dataset acquisition process had two stages. First, fission stars were collected using a Lexan nuclear fission track detector. Second, an optical microscope generated 700 images with a 10% overlap. Each image was then divided into 250 × 250 µm pieces for further processing.
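The tiling step can be sketched as follows; the pixel pitch (0.5 µm/pixel, giving 500 × 500 pixel tiles) and the file names are assumptions made only for illustration, not the calibration used in this study:

% Sketch: split one microscope image into 250 x 250 um tiles.
umPerPixel = 0.5;                          % assumed pixel pitch (illustrative)
tilePix    = round(250 / umPerPixel);      % 500 x 500 pixel tiles
I = imread('scan_0001.tif');               % hypothetical source image
[rows, cols, ~] = size(I);
k = 0;
for r = 1:tilePix:(rows - tilePix + 1)
    for c = 1:tilePix:(cols - tilePix + 1)
        k = k + 1;
        tile = I(r:r + tilePix - 1, c:c + tilePix - 1, :);
        imwrite(tile, sprintf('tile_%04d.png', k));
    end
end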

Methodology

Intensity thresholding together with morphological criteria was used to divide an image into foreground and background components in order to retrieve stars. High-intensity areas in the binary image were classified as potential FT clusters. Measurements were based on star detection thresholds, which were set considering characteristics such as cluster size, criteria to distinguish clusters from scratches or dirt, the ratio of pixels in the dark area to the bounding box (fill ratio), circularity, and eccentricity.
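An illustrative MATLAB sketch of this screening step is given below; the threshold values are arbitrary placeholders rather than the calibrated settings used in this study:

% Sketch: candidate FT-cluster extraction by intensity thresholding followed by
% morphological screening (area, fill ratio, circularity, eccentricity).
I  = imread('tile_0001.png');              % hypothetical tile image
G  = im2gray(I);
bw = ~imbinarize(G);                       % clusters are dark on a light background
bw = bwareaopen(bw, 50);                   % discard specks smaller than 50 pixels

stats = regionprops(bw, 'Area', 'BoundingBox', 'Circularity', 'Eccentricity');
keep  = false(numel(stats), 1);
for i = 1:numel(stats)
    bbArea    = stats(i).BoundingBox(3) * stats(i).BoundingBox(4);
    fillRatio = stats(i).Area / bbArea;    % dark pixels vs. bounding box
    keep(i)   = stats(i).Area > 200 && fillRatio > 0.3 && ...
                stats(i).Circularity > 0.4 && stats(i).Eccentricity < 0.9;
end
candidates = stats(keep);                  % potential FT clusters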

Data preparation

Cluster categories were defined to standardize the nomenclature used to describe objects of interest, and the categorization parameters were set by these criteria in order to create a complete cluster catalog. The study used four classification criteria: size (large/small), richness (poor/rich), center (black center/no black center), and halo length. Asymmetric, complex, and hovering stars were also identified. The star categorization symbol is wxyz, where each letter denotes a binary value of 0 or 1 for one criterion: w represents the size, x the richness, y the black core region, and z the halo extent. Tables 5, 6, 7 present the binary notation for cluster classification, along with typical images and color settings for cluster classes and types. All three tables appear in the Appendix.
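For illustration only, the wxyz notation can be generated programmatically; the helper below is hypothetical and simply mirrors the convention described above:

% Sketch: encode the four binary criteria into a class label such as 'ST_1011'
% (w = size, x = richness, y = black center, z = halo length).
w = 1; x = 0; y = 1; z = 1;                       % example: large, poor, black center, long halo
classLabel = sprintf('ST_%d%d%d%d', w, x, y, z);  % -> 'ST_1011'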

Machine learning

Machine learning algorithms use mathematical models to learn and predict from input data. The model input data is split into different datasets; model development often uses three: training, validation, and test. The model is trained on the training set to optimize its parameters using known cases. For each input vector in the training set, the model generates an output and compares it to the ground truth, and model parameters are modified based on the comparison results and the learning technique. A validation set uses the fitted model to estimate performance on data not in the training set; it impartially evaluates the model's performance during training and is used to detect overfitting. A naive test set is then used to test the model.

Cross-validation was used for multiple model testing and training iterations. This method optimizes data use across arrays. In this study a fivefold cross-validation technique was employed, partitioning the data into five subsets, with 60% for training, 20% for validation, and 20% for testing. A full data replacement process was designed to ensure the independence of the validation and test sets. This was repeated five times, changing the composition of the training data.
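A sketch of this fivefold rotation is shown below (it requires the Statistics and Machine Learning Toolbox); the number of labeled observations and the fold assignment are illustrative:

% Sketch: five subsets, with three (60%) for training, one (20%) for validation
% and one (20%) for testing, rotated over five rounds.
N   = 500;                                 % hypothetical number of labeled observations
cvp = cvpartition(N, 'KFold', 5);          % five folds of roughly 20% each
for r = 1:5
    isTest  = test(cvp, r);                % this round's test fold
    isVal   = test(cvp, mod(r, 5) + 1);    % the next fold serves as validation
    isTrain = ~(isTest | isVal);           % remaining three folds (~60%)
    % ... train on isTrain, monitor on isVal, evaluate on isTest ...
end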

Data must be carefully labeled at the pixel level to provide the ground truth for DL training. The raw image has a light background and dark objects, and the fission track clusters need to be labeled. The images include artifacts such as shadows, stones, dirt, scrapes, and other features. For training, each pixel must be classified correctly. The FT clusters were labeled with white pixels (255), and all background and artifacts were black (zero). Figure 3 shows a source image on the left and its corresponding ground truth. All dataset images undergo this treatment. Source image file size was reduced from 20 to 4 MB during training data preparation, due to storage and training session limits.
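Assuming the tiles and their binary masks are stored in separate folders (the paths below are hypothetical), the image/ground-truth pairing can be set up in MATLAB as:

% Sketch: pair source tiles with binary ground-truth masks for training.
classNames = ["cluster", "background"];
labelIDs   = [255, 0];                     % white = FT cluster, black = everything else
imds = imageDatastore('tiles/');           % source images (hypothetical folder)
pxds = pixelLabelDatastore('masks/', classNames, labelIDs);   % ground-truth masks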

Fig. 3 (A) Original image containing a poor star with a dirty background. (B) The image after manual editing in preparation for training, used as the tagged image

Architecture

The tasks of data preparation, running training sessions, completing segmentation, and other operations were executed using MATLAB software. As previously stated, this study's model uses the U-Net neural network architecture. The name "U-Net" describes the network's architecture: the network has a contracting path and an expansive path, forming a U shape. In the contracting path, the network applies iterative convolutions, ReLU activations, and pooling operations.

Contraction reduces spatial information and increases feature information. In the expansive path, a series of convolutions combined with high-resolution features from the contracting path merges feature and spatial information. This study's U-Net architecture is shown in Fig. 4, and Table 1 details our model's network layers. The model has three skip connections and uses convolution throughout.
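A comparable network can be built with MATLAB's unetLayers function (Computer Vision Toolbox); the input size below is an assumption, and the exact layer configuration of our model is the one listed in Table 1:

% Sketch: a U-Net with three encoder/decoder stages, mirroring the three
% skip connections described above.
imageSize  = [512 512 3];                  % assumed input tile size
numClasses = 2;                            % cluster vs. background
lgraph = unetLayers(imageSize, numClasses, 'EncoderDepth', 3);
analyzeNetwork(lgraph);                    % optional: inspect the resulting layer graph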

Fig. 4 Schematic diagram of the network layers in a U structure

Table 1 Network layers arranged by groups

One limitation of deep learning techniques is the requirement for a substantial volume of data; for deep learning networks to achieve optimal performance, datasets on the order of millions of records are typically needed, and the quality of training outcomes improves as the quantity of available information increases. As previously stated, the U-Net model has an inherent advantage in its ability to effectively utilize small amounts of data; however, a substantial amount of data is still required. Generating data, particularly when it involves tagging information, is a formidable undertaking that requires significant resources. The generation of artificial data, known as augmentation, can substantially enlarge the available data.
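A minimal augmentation sketch is shown below; it applies the same random flip and rotation to an image and its mask and is only an illustrative stand-in for the augmentation pipeline used in this study:

function [Iaug, Maug] = augmentPair(I, M)
% Sketch: apply an identical random flip / 90-degree rotation to an image and
% its ground-truth mask, so that every augmented pair stays consistent.
    if rand > 0.5
        I = flip(I, 2);  M = flip(M, 2);   % horizontal flip
    end
    k = randi([0 3]);                      % 0, 90, 180 or 270 degrees
    Iaug = rot90(I, k);
    Maug = rot90(M, k);
end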

Image labeling

As part of this work, we created a new classification system. A custom MATLAB application was used for image labeling, following sequential steps: we started with the source images for analysis, then created labeling categories. Within these categories, we established a reference label that can be applied to clusters of various categories and to non-cluster objects such as dirt or scratches; the label highlights star features. Finally, we compiled and exported all of the data to a file for analysis.

Experimental—segmentation processes

Once the data are ready, training starts. This study's system trained for single- and multi-class classification. Effective training produces a model that can segment images and detect stars. The semantic segmentation procedure and evaluation are covered in this section.

Semantic segmentation

Semantic segmentation labels each image pixel as animal, human, road, sky, car, etc. The aim is to recognize certain features in an image, not to categorize the image as a whole; this approach distinguishes visual entities. In our work, segmentation requires labeling each pixel for the different types of clusters, non-cluster objects, and the background. A MATLAB function named semanticseg was used. For segmentation, the function takes the trained neural network and an image, semantically segments the input image, and assigns classification scores to each classified label. An array of scores is extracted and grouped by pixel or voxel in the supplied image. After this phase, the layers are blended using label overlay to create a composite image with several labels, and an image-scaled-color display generates a classification results map. Figure 5 shows how semantic segmentation creates a multi-layer tagged image from a source image. This is achieved through training with an integrated mask image (ground truth), where each mask in the ground truth image represents a cluster type.
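Assuming a trained network net and a tile whose size matches the network input (file name hypothetical), the steps above reduce to a few MATLAB calls:

% Sketch: segment one tile, overlay the labels, and display a score map.
I = imread('tile_0001.png');
[C, ~, allScores] = semanticseg(I, net);       % per-pixel labels and per-class scores
B = labeloverlay(I, C, 'Transparency', 0.6);   % blend labels over the source tile
figure, imshow(B), title('Segmented FT clusters')
figure, imagesc(allScores(:, :, 1)), colorbar  % score map of the first class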

Fig. 5 An example of creating a tagged image using semantic segmentation

Segmentation evaluation

We created a model that maximizes the area under the ROC curve. In this section we explain the ROC curve, the ideas behind it, and how to generate it, and we examine how to test our model against this curve.

The curve structure: the ROC curve shows binary classifier performance at different decision thresholds. The curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) across acceptance levels. The true positive rate, or sensitivity, is a coverage measure in machine learning; the false positive rate is 1 minus the specificity. The receiver operating characteristic (ROC) curve thus shows how the significance level affects sensitivity. Binary classification divides the results into positive and negative groups, and the classification outcomes are true positive (TP), false positive (FP), true negative (TN), and false negative (FN). Table 2 displays these values in a confusion matrix. In our case the matrix is formed after training, during activation of the trained model on the test set, and can be used to calculate ROC values.
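As a sketch only, the ROC curve and AUC can be computed from pixel scores with MATLAB's perfcurve function (Statistics and Machine Learning Toolbox); gtMask and allScores are assumed to come from a segmented tile and its ground truth, as in the earlier sketches, and the first score channel is assumed to be the cluster class:

% Sketch: ROC curve and AUC from per-pixel classification scores.
labels = double(gtMask(:));                    % 1 = FT cluster pixel, 0 = background
scores = allScores(:, :, 1);                   % assumed cluster-class score channel
[fpr, tpr, ~, auc] = perfcurve(labels, scores(:), 1);
plot(fpr, tpr), xlabel('False positive rate'), ylabel('True positive rate')
title(sprintf('ROC curve, AUC = %.2f', auc))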

Table 2 Confusion matrix (in parentheses: our case)

Visually, as depicted in Fig. 6, the diagonal line serves as a demarcation between positive items on the left and negative items on the right. The classifier is represented by an egg-shaped boundary, encompassing positive objects within its confines and excluding negative items outside of it.

Fig. 6 True Positive (TP), False Positive (FP), True Negative (TN), False Negative (FN)

When considering errors in fission track analysis, it is important not to miss actual FT clusters (false negatives), whereas marking non-stars as clusters (false positives) is a less serious error. This is because an FN means radioactive material that has been missed, whereas an FP means a non-radioactive item that may be falsely marked as radioactive.

Automation

The above steps for labeling, training, and application of a DL model were developed into a set of computer codes that can be used to automate the processing of large numbers of images and FT clusters.

Results and discussion

This section provides an overview of the training and segmentation workflows for both single class and multi class classification.

Single class training—poor & rich stars

Figure 7 shows two training outcomes for the poor-star detection model. The primary training consisted of either 20 epochs (94 min), 50 epochs (135 min), 100 epochs (193 min), or 150 epochs (492 min). Accuracy increases with repetitions and run duration. The figure shows two runs: 20 epochs and 1580 iterations (top image) and 150 epochs and 11,850 iterations (bottom image). Both runs use a validation frequency of 120 iterations. The accuracy chart at the top of each image shows the Mini-Batch Accuracy graph as a blue line and the Validation Accuracy as black dots. The Mini-Batch Loss (brown) and Validation Loss (black dots) are shown in the lower part of the figure.

Fig. 7 Training results of single poor star model in 20 (A) and 150 (B) epochs

Table 3 displays the outcomes of the four training sessions.

Table 3 Training results for 20, 50, 100, and 150 epochs

In comparison to prior investigations [23], training for 20 epochs yields an accuracy of 84.55%, similar to the roughly 85% reported for several human evaluators. Training for 50 epochs enhances the accuracy, enabling results comparable to a single tester (90%). Training for 150 epochs further improves the accuracy to 92.04%, surpassing the highest level attained by other methods thus far (90%) by approximately 2%. Figure 8 displays a collection of segmented FT clusters.

Fig. 8 Results of stars segmentation after training with 20, 50, 100 and 150 epochs

Some examples of artifact objects are shown in Fig. 9. The blue parts on the turquoise background represent segmentation with different numbers of epochs (20, 150). Five images were segmented after two classifiers were trained. The first three images show stones and dirt, the fourth an unclassified star object, and the fifth a scratch. The objective is for the blue marking (FP) to be minimal in all of these objects. As the number of epochs increases, the intensity of the blue marking decreases, meaning that FP pixels are reduced.

Fig. 9 Non-star objects segmentation results after training with 20 and 150 epochs

Table 4 presents a collection of focused segmentation images, illustrating a comparative analysis of outcomes achieved for objects classified as FT clusters and those classified as artifacts. For FT clusters, a higher degree of blue marking indicates a more complete detection (more true positive pixels). Conversely, for other objects, the objective is to minimize the blue marking and achieve a greater resemblance to the background hue.

Table 4 Comparison of segmentation between stars and non-star objects

Figure 10 shows an image that has undergone segmentation, with the resulting segmentation image displayed on the left side and the corresponding classification map on the right side. The visual attention is directed towards a star, distinctly highlighted in green, and an item that is not classified as a star, marked in red. Table 4 presents a shared view of these two objects.

Fig. 10 Comparison between star and non-star object—a segmentation image and a classification map

This work produced two ROC curves using segmented images from a single-classifier model of a black-centered rich cluster, as shown in Fig. 11, together with their calculated AUC values. The ROC curve shows how a binary classifier performs as its decision threshold varies; the graph plots the TPR–FPR relationship across acceptance criteria.

Fig. 11 ROC curves produced after completing model training for a single classifier

The AUC is the area under the ROC curve, with typical values ranging from 0.5 to 1. AUC = 0.5 implies a random prediction; in that case the model does not improve star categorization. A model with a higher AUC value identifies and classifies better, and any outcome above 0.5 is better than random selection. The plots show a reasonable AUC of 0.68 and a superior one of 0.84 in our model.

Table 8 in the Appendix summarizes one example of the results for 4 images and shows the pixel intensity, TP, FN, FP, recall, and precision of each image. The model has fair TP and FP counts and is biased toward zero FN in the specified images, as these images were processed with an adaptive threshold algorithm developed for background cleaning.

Multi class training—poor & rich FT clusters

After training and segmenting rich and poor single cluster images, we trained a multi-class model for both, so that the model identified poor (type 0001) and rich (type 1111) clusters. The training had 20 epochs, 4880 iterations, and 250 validation iterations. Training results: Mini-Batch Accuracy 96.43%, Validation Accuracy 86.30%, Mini-Batch Loss 0.1461, Validation Loss 1.4974. The training outcome is good in terms of Validation Accuracy, and loss performance is good until epoch 12; however, the results oscillate after this. Several factors may explain this. One possibility is that pre-processing techniques, such as zero-meaning and normalization, were applied to the training set but not to the validation set, or vice versa. Another explanation could be overfitting, where the model has learned to perform well on the training data but fails to generalize effectively to the validation data. Because the same code path was used for both sets, the first explanation is unlikely; the second is more probable, so models with more data were created. The dataset was enhanced via augmentation, and nodes were randomly excluded during training using dropout. In addition, noise reduction and cross-validation were used to improve data quality. This algorithm improved the accuracy rate to 86.30%, compared to 85% for two human testers. Compared to a single manual tester's 90% recognition rate, the model's accuracy is still lower.

Figure 12 shows segmentation with classifiers: ST_0001, ST_1111, and background. The left half of the illustration shows images of ST_0001 stars with blue markings on yellow backgrounds. The right side shows ST_1111 type stars with green markings on a yellow background. The file names are centrally placed, indicating that the left and right images came from a single file.

Fig. 12 Results of segmentation and identification of two types of stars after training

Two ROC charts in Fig. 13 show the performance of several classifiers, including AUC values; each chart shows multiple classifier curves. Source and labeled images were used to generate the segmentation images and charts. One chart shows the results for two star categories in a trained model image, with improved results of AUC = 0.99 and 0.96. An alternate model was used to explore type 0001 and type 1111 stars; as indicated in the other chart, type 1111 stars have an AUC of 0.66, while type 0001 stars show less satisfactory results, with an AUC of 0.24. The value for type 1111 is acceptable: although below 0.7, it exceeds the 0.5 expected from a random guess. The result for type 0001 is below 0.5, indicating a technical problem.

Fig. 13 ROC charts obtained after training different models with two classifiers

We found that the threshold vector from the MATLAB perfcurve function is cut off at 0.33 instead of decreasing to 0, which appears to be a software fault. New training with segmentation and ROC calculations must be carried out, and the results studied, to determine the root cause of these false results.

Synthesized FT clusters

After performing multi-class training and segmentation, we trained a model with FT clusters that were generated using a specialized software tool designed for creating synthetic FT clusters [34]. The process created 50 clusters with 50 leaves per cluster, 50 clusters with 100 leaves, and 15 clusters with 300 leaves. The synthesized-cluster training model used a retrain procedure consisting of 40 epochs in the first stage, 10 in the second, and 40 in the third, with the following results: Mini-Batch Accuracy of 74.04%, Validation Accuracy of 73.65%, Mini-Batch Loss of 0.1893, and Validation Loss of 1.9106.
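The staged retrain procedure can be sketched as follows; the optimizer, learning rate, and folder names are assumptions, and the datastore variables follow the earlier sketches rather than the exact training scripts of this study:

% Sketch: continue training an existing network on the synthesized clusters
% in three stages of 40, 10 and 40 epochs.
dsTrain = combine(imageDatastore('synthetic/tiles/'), ...
                  pixelLabelDatastore('synthetic/masks/', classNames, labelIDs));
stageEpochs = [40 10 40];
lgraph = layerGraph(net);                      % start from the previously trained model
for s = 1:numel(stageEpochs)
    opts = trainingOptions('adam', ...
        'MaxEpochs', stageEpochs(s), ...
        'InitialLearnRate', 1e-4, ...
        'Verbose', false);
    net    = trainNetwork(dsTrain, lgraph, opts);
    lgraph = layerGraph(net);                  % carry the learned weights into the next stage
end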

Figure 14 shows ROC charts and AUC computations for segmentation images from multiple models. The top charts correspond to a model trained on synthesized StarType_50, StarType_100, and StarType_200 clusters; one chart shows the performance for StarType_50 and StarType_200, with AUC scores of 0.99 for the former and 0.94 for the latter, and the chart on the left shows the same model with all three classes. The bottom plots show a model trained with A50, A100, and A300 stars. The right graph provides good AUC scores for the three classifiers: 0.94, 0.91, and 0.89. The left graph provides AUC values of 0.95, 0.96, and 0.61 for the three classifications. All results are good, except for the A300 classification, which is only a little better than random guessing (better than 0.5).

Fig. 14 ROC charts obtained for segmentation images from three-classifier models

Conclusions

This paper has introduced a novel methodology for detecting FT clusters in microscope images using advanced deep learning segmentation and classification. A U-Net FCN model was developed to accurately segment star-like patterns in single-class and multi-class scenarios. The methodologies described in this study have been effectively employed as a novel analytical technique for the identification of fission track clusters.

A model was created for clusters with a diameter of less than 60 microns (200 pixels), fewer than 10 leaves, and no black center (type 0001); it achieved a ROC area (AUC) of 0.84. A computational model was also created to evaluate type 1111 stars of greater size and brightness; this model had an outstanding ROC curve area of 0.90. Early models were also designed to detect cluster shape and size variations simultaneously. It was found that model accuracy increases with the number of epochs: after 20 epochs, the ML model attained ~ 85% accuracy, comparable to human manual identification; after 50 epochs, it attained 90% accuracy, surpassing manual identification; and after 150 epochs, it reached 92%.

The research included creating a new FT cluster database, classifying cluster types, characterizing the model, designing an architecture, and optimizing. The research also established segmentation measures, optimized the number of epochs and validation frequency, and investigated background noise filtering thresholds. In addition, detection capabilities and adaptive threshold definitions were improved and an Image Labeler tool was created to generate labeling information semi-automatically, followed by Auto Labeling automation. However, the labeling tools are not covered in this paper. A software tool to create synthetic FT clusters was created in the project's latter phases, and used to train a DL model. The initial outcome of the Retrain approach in this collaboration yielded a classification accuracy of 73.65%. Three different types of clusters were included in the images which were used to train a DL model.

We are currently working to extend this work. First, improvements in the classical methods for processing the ground truth images, to improve their SNR, will assist in creating larger numbers of labeled images for training, validation, and testing. Similarly, improvement of the FT cluster simulator will allow the creation of large simulated datasets for training. The use of synthetic data for AI projects has been widely investigated in recent years.

Other DL architectures can be tested, with different optimizers, etc., in order to optimize the resulting DL network that is to be deployed.

It is also feasible to expand from a segmentation model to a detection model by adding bounding boxes to the ground truth. The investigation can be extended to identify the precise coordinates of the detected clusters, as well as to investigate cluster class frequencies and cluster densities per unit area. A correlation test can be used to compare laboratory data with ICP-MS particle observations of particle size, isotopic composition, and cluster shape. Another option is to apply this method to a parallel study of single track detection in a high-flux gamma detector based on uranium foil that converts the gamma flux into fissions (fγ).