1 Introduction

Detecting anomalous radioactive sources in urban areas is a critical national security function, since such sources could endanger citizens. One approach to radiation detection is portal monitoring (Chambers et al. 1974), where a single sensor is deployed at each choke point along a road. Detection using a single sensor is simple and cost efficient; however, it is impractical over large areas (Brennan et al. 2004). Detection in large areas is typically performed using sensor networks (Nemzek et al. 2004; Stephens and Peurrung 2004; Chandy et al. 2008; Liu et al. 2011; Kumar et al. 2015; Liu and Abbaszadeh 2019; Zhao et al. 2019), where the sensors can be static or mobile. Using a distributed sensor network for detection and localization requires fusion of the measured data; that is, the data collected from multiple sensors is further processed to localize radiation sources (Sen et al. 2016; Wu et al. 2019). Generally, the detection and localization process uses the data collected from any sensor whose reading exceeds a preset threshold (Bai 2015).

In this work, radiation detection is considered through a distributed sensor network that consists of sensor nodes capable of transmitting processed data. Sensors in the network are equipped with Global Positioning System (GPS) receivers to provide position information (Nemzek et al. 2004); hence, the collected data consist of the sensors' readings and positions. Many techniques have been developed for radiation source localization. In the considered problem, source localization corresponds to the estimation of the source's location along with its intensity (strength) (Bai 2015; Iyengar and Brooks 2016; Gunatilaka et al. 2007; Chin et al. 2008; Morelande and Ristic 2009). In what follows, some of the popular localization methods in the literature are briefly introduced.

The maximum likelihood estimation (MLE) algorithm is the most popular approach for estimating radioactive source parameters (location and intensity) (Bai 2015; Gunatilaka et al. 2007; Vilim and Klann 2009; Deb 2013; Cordone 2019). MLE is a statistical estimation technique that provides a solution through the maximization of a likelihood function (Kay 1993). The maximization is carried out numerically because the likelihood function has no closed-form solution for radioactive source parameter estimation problems; consequently, MLE presents significant computational challenges. Many methods have been proposed to address this computational burden by determining initial values for the numerical solution of MLE (i.e., the MLE grid search).

In Deb (2013), a special case of Newton's iterative method was proposed to find the MLE solution. The method provides an approximate distribution for the source intensity using the Expectation Maximization (EM) approach and uses the peaks of the distribution as initial estimates to bootstrap the iterative MLE process. Solving the MLE problem using Newton's method can speed up the maximization; however, a local maximum can be selected instead of the required global maximum. The work in Bai (2015) presented an algorithm in which an initial estimate is generated using an averaging process that requires a large number of measurements. It was shown that the method asymptotically converges to the conventional maximum likelihood source estimate.

Multi-resolution MLE is another method that performs the maximization using fewer iterations than the standard MLE (Hesterman et al. 2010). It is based on a series of progressively higher-resolution MLE grid searches, where the initial estimates of each search are set using the final estimates of the previous one. An improved multi-resolution MLE algorithm is presented in Cordone (2019), with a modification that avoids capturing a local maximum: the grid search is expanded by a small factor after each layer of the multi-resolution algorithm.

Another popular method for estimating radioactive source parameters is the Bayesian approach (Liu et al. 2011; Jarman et al. 2011; Hite et al. 2016; Tandon et al. 2016; Bukartas et al. 2019). It requires prior distributions for the unknown parameters, including the intensity of the background radiation (Liu et al. 2011). In Hite et al. (2016), Markov Chain Monte Carlo (MCMC) sampling is utilized to generate a full posterior probability density for the Bayesian estimation process. In Tandon et al. (2016), a Bayesian Aggregation (BA) technique was proposed, which learns the expected Signal-to-Noise Ratio (SNR) as a function of source strength using a nonparametric Bayesian model. Attenuation and scattering factors are typically neglected in the estimation problem; however, the work in Jarman et al. (2011) considers these factors within the Bayesian estimation procedure, based on an approximate distribution.

Lately, spatial statistics algorithms have been utilized for the detection of radiation sources (Zhao et al. 2019; Reinhart 2013; Sullivan 2016). Among these algorithms, the Kriging approach is the most commonly applied technique for radioactive source localization. It is a geostatistical interpolation method in which the measurements at given positions are used to estimate unknown measurements at other positions (Stein 2012). Employing the Kriging model for estimation is challenging because the mean and variance of the distribution describing the unknown measurements are required. In Sullivan (2016), Kalman filters are used to address the unknown mean parameter of the universal Kriging algorithm. In Zhao et al. (2019), on the other hand, the Poisson Kriging approach is employed with constant values for both the mean and the variance.

The existing localization methods are based on statistical approaches, most of which require prior knowledge of unknown parameters; hence, the estimation accuracy is largely affected by the technique used for evaluating the statistical parameters. To the best of our knowledge, machine learning has not been exploited for solving the localization problem of radiation sources through a network of detectors. In this paper, we utilize supervised machine learning for the estimation process. A regression algorithm is employed to predict the source's intensity and location using only the sensors' readings and positions; thus, the estimation is performed without the need for any unknown parameters, although a training phase is required for the learning process. In this work, a feature extraction method is proposed to represent the sensors' data effectively. The extracted features are then used as independent variables in the decision tree regression algorithm for radioactive source parameter estimation.

The rest of the paper is organized as follows. Section 2 presents the preliminaries of the work, which briefly introduces the radiation measurement fundamentals and the decision tree algorithm. In Sect. 3, the proposed work is explained. The experimental results are demonstrated in Sect. 4. Finally, the work is concluded in Sect. 5.

2 Preliminary

2.1 Radiation measurements

Any radioactive element emits ionizing radiation, a form of energy released due to atomic or nuclear processes. The radiation behaves as a stream of particles, as well as a wave, that can propagate through space or other media (Gunatilaka et al. 2007). It is basically categorized into alpha particles, beta particles, and gamma rays (Kumar et al. 2015). In this work, we consider only the detection of gamma rays, due to their ability to travel longer distances than alpha and beta particles. Radiation sources are either natural or man-made. The natural sources are known as NORM (Naturally Occurring Radioactive Materials) and can be found in cosmic rays, soil, or buildings. Man-made sources, on the other hand, are synthetic radioactive isotopes, such as Cesium-137 and Cobalt-60, that are used for many purposes, including medical, industrial, and research applications (Gary et al. 2005).

Many types of detectors can be used for radiation measurements (Kraner 1981). Usually, the detector measures the dose rate, which can be represented by a count rate (counts per second or counts per minute) corresponding to the source's intensity. The detector's reading value, denoted R, can be modeled as a random variable that follows a Poisson distribution (Liu and Abbaszadeh 2019; Gunatilaka et al. 2007; Kraner 1981; Morelande et al. 2007; Hellfeld et al. 2019), as shown in (1).

$$ P\left( {R = c} \right) = \frac{{e^{ - \lambda T} \cdot \left( {\lambda T} \right)^{c} }}{c!} $$
(1)

where P(R = c) is the probability that the reading R is equal to c counts per second within the measuring time T. The value λ is the average count rate, which depends on the following (Kraner 1981): (a) the type of the radioisotope source; (b) the activity of the source; (c) the distance between the source and the detector; (d) the background radiation released from the naturally occurring radioactive materials (NORM) in the environment surrounding the detector; and (e) the detector’s parameters. Accordingly, the average count rate can be modeled as follows:

$$ \lambda = \zeta \cdot \frac{{\Gamma \cdot {\text{SI}}}}{{d^{2} }} \cdot e^{ - \rho d} + B $$
(2)

where ζ and Γ are constant values that depend on the detector and the type of the radiation source, respectively. SI and ρ are the source's intensity and the attenuation coefficient, respectively, while d is the distance between the source and the detector. The value B represents the detected background radiation. Generally, for the radioactive source detection and localization problem, the following assumptions are made (Cordone 2019):

  • All detectors are identical.

  • The attenuation coefficient is close to zero.

  • The detection is established for a certain type of radioisotope.

  • The background radiation is known.

  • The only variables are the source’s intensity and location.

Hence, the calculation of the average count rate λ can be simplified as follows:

$$ \lambda \approx \delta \cdot \frac{SI}{{d^{2} }} + B $$
(3)

where the term δ is a constant factor that can be determined through calibration (Bukartas et al. 2019).
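To make the measurement model concrete, the following sketch simulates a single detector reading from the simplified model in (3); the calibration factor `delta` and the background count rate are placeholder values chosen for illustration, not the paper's calibrated settings.

```python
import numpy as np

def simulate_reading(si, src_xy, det_xy, delta=1.0, background=10.0, rng=None):
    """Draw one detector reading (cps) from the simplified model of Eq. (3).

    `delta` and `background` are illustrative placeholder values, not the
    paper's calibrated settings.
    """
    rng = rng or np.random.default_rng(0)
    d_sq = (src_xy[0] - det_xy[0]) ** 2 + (src_xy[1] - det_xy[1]) ** 2
    lam = delta * si / d_sq + background   # average count rate, Eq. (3)
    return int(rng.poisson(lam))           # Poisson-distributed count rate, Eq. (1)

reading = simulate_reading(si=5000.0, src_xy=(10.0, 10.0), det_xy=(12.0, 14.0))
```

Here the squared distance in the denominator reflects the inverse-square attenuation of (3), and the Poisson draw reflects the counting statistics of (1).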

2.2 Decision tree

The decision tree is a popular machine learning algorithm. It is a tree-like model that can be used for classification or regression, where a target value is predicted based on a set of binary rules (Han et al. 2011). Decision trees are widely applied in many areas, such as medical diagnosis, financial analysis, manufacturing, movie preferences, and spam filtering. The decision tree has many advantages over other machine learning techniques, such as:

  • Easier to read and interpret, without requiring statistical knowledge, due to its flowchart-like structure.

  • Requires less effort for preprocessing the data, since no data normalization or scaling is required.

  • Outliers and missing values in the training data have a lower impact on the classification accuracy.

  • Requires fewer computations for predicting the target value.

  • Can be used for predicting categorical or continuous variables.

In a decision tree, a hierarchical (tree) structure is used for classification (or regression), such that each internal (non-leaf) node represents a test on an attribute (feature), while each branch indicates an outcome of the test (Han et al. 2011). A leaf (terminal) node represents a class label (in classification) or a target value (in regression), and the topmost node is referred to as the root node. One significant property of a tree is its maximum depth, which is the longest path from the root to a leaf. Figure 1 illustrates an example of a decision tree, with a maximum depth of 4, where one of five target values (y1, y2, y3, y4, and y5) is predicted according to the values of four attributes (A1, A2, A3, and A4). In this example, the value of the ith attribute Ai is examined through a comparison with the corresponding ith threshold value vi. In practice, decision tree models are generated from training data provided to the model in a training phase.

Fig. 1

Example of a decision tree applied to four attributes

The decision tree is modeled through a process referred to as induction (Han et al. 2011). More than one technique can be used to model a decision tree, such as ID3 (Iterative Dichotomiser), C4.5 (a successor of ID3), and CART (Classification and Regression Trees) (Han et al. 2011). These techniques are usually based on a greedy approach, where the tree is constructed in a top-down manner using an attribute selection method: a metric, such as the information gain or the Gini index, is calculated to choose the best attribute at each node for splitting the dataset into two subsets according to a threshold value. More information about decision tree modeling is presented in Han et al. (2011). In the proposed work, the decision tree is implemented using the Scikit-learn module in Python (Pedregosa et al. 2011), which uses the CART method for induction.
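The greedy split selection described above can be illustrated with a short sketch. For regression, CART-style induction picks, at each node, the (attribute, threshold) pair that minimizes the weighted variance of the two resulting subsets. This toy single-split search is for illustration only, not Scikit-learn's optimized implementation:

```python
import numpy as np

def best_split(X, y):
    """CART-style greedy split search for regression: choose the
    (feature, threshold) pair minimizing the sum of squared deviations
    (weighted variance) over the two child subsets."""
    best = (None, None, np.inf)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:          # candidate thresholds
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            score = left.var() * len(left) + right.var() * len(right)
            if score < best[2]:
                best = (j, t, score)
    return best[:2]                                 # (feature index, threshold)

# Two clearly separated target clusters: the split should land between them.
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y = np.array([5.0, 5.1, 4.9, 20.0, 19.8, 20.2])
feature, threshold = best_split(X, y)   # splits at x <= 3
```

A full induction would apply this search recursively to each child subset until a stopping criterion (e.g., maximum depth) is met.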

3 The proposed work

3.1 Problem definition

As mentioned in Sect. 2.1, the Poisson distribution model can be used to calculate the detector's reading for a given intensity and location of a radioactive source. However, the model cannot be used to evaluate the intensity and location of the source from the detector's reading. In this work, the considered problem is to estimate the intensity (SI) and location (Sloc) of a radiation source detected by a group of sensors. Figure 2 shows the given inputs and the required outputs of the estimation process, where Ri and Loci are the reading and location of the ith sensor, and M denotes the number of sensors used. The estimated values of the source's intensity and location are referred to as SIest and Slocest, respectively. Note that the location is represented by two-dimensional Cartesian coordinates (x, y).

Fig. 2

Block diagram showing the inputs and outputs of the estimation process

The proposed method employs supervised machine learning to build a regression model for the estimation process. However, the model requires a training phase, which needs a dataset of values representing the source's parameters (intensity and location) and the corresponding feature vectors, which depend on the sensors' data (readings and positions). The next sub-section describes the method used for data generation.

3.2 Network setup and data generation

In this sub-section, we explain the network setup and the generation of the datasets used for training and testing the regression model. In the proposed work, the localization process is performed within an area of size Dx × Dy. The given area is divided into non-overlapping regions, each of area Drg × Drg, as shown in Fig. 3. Let Rgi be the ith region and Nrg be the total number of regions, which can be calculated as follows:

$$ N_{rg} \approx \frac{{D_{x} }}{{D_{rg} }} \cdot \frac{{D_{y} }}{{D_{rg} }} $$
(4)
Fig. 3

A Dx × Dy area, starting at location (0, 0), divided into Nrg regions

Sensor nodes are deployed in each region to monitor the environment. In particular, we assume that each region is covered by a cluster of M sensors connected through a star topology to a cluster head node, as shown in Fig. 4. We assume that the head node is located at the center of the corresponding region, while the rest of the cluster members are randomly deployed within the region. Moreover, communication between neighboring clusters is carried out through the corresponding head nodes. To maintain energy-efficient communication between the cluster members and the head node, the region should not span a large geographical area; note that the number of regions can be tuned to enable energy-efficient operation.

Fig. 4

Illustration of a star topology connection between the M nodes of the cluster, where M = 8

For data generation, we consider a hypothetical radiation source randomly located inside the given area, with an intensity expected to lie in the interval [SImin, SImax]. It is assumed that all values within the expected range are equally likely. Hence, the source's intensity (SI) is uniformly distributed in the interval [SImin, SImax] and can be expressed as follows:

$$ {\text{SI}} = {\text{SI}}_{\min } + {\text{rand}}\;\left( {0,1} \right) \cdot \left( {{\text{SI}}_{\max } - {\text{SI}}_{\min } } \right) $$
(5)

where SImin and SImax are the minimum and maximum considered values of the source's intensity, respectively. The function rand(0,1) generates a random value between 0 and 1 following a uniform distribution. Similarly, the location Sloc of the radiation source is expected to lie inside the considered area without any bias toward certain locations. Consequently, the source location Sloc is generated according to a uniform distribution as follows:

$$ \left[ {\begin{array}{*{20}c} {x_{s} } \\ {y_{s} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {x_{\min } } \\ {y_{\min } } \\ \end{array} } \right] + {\text{ rand}}\left( {0,1} \right) \cdot \left[ {\begin{array}{*{20}c} {x_{\max } - x_{\min } } \\ {y_{\max } - y_{\min } } \\ \end{array} } \right] $$
(6)

where (xs, ys) are the Cartesian coordinates of the source location Sloc. The boundaries of the area, where the source is located, are represented by the coordinates (xmin, ymin) and (xmax, ymax) such that:

$$ \left[ {\begin{array}{*{20}c} {x_{\min } } \\ {y_{\min } } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} 0 \\ 0 \\ \end{array} } \right], \left[ {\begin{array}{*{20}c} {x_{\max } } \\ {y_{\max } } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {D_{x} } \\ {D_{y} } \\ \end{array} } \right] $$
(7)

On the other hand, the random position (xi, yi) of the ith sensor in a given cluster follows a uniform distribution and can be expressed as follows:

$$ \left[ {\begin{array}{*{20}c} {x_{i} } \\ {y_{i} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {x_{c} } \\ {y_{c} } \\ \end{array} } \right] + I\left( {i \ne 0} \right) \cdot \frac{{D_{rg} }}{2} \cdot \left[ {\begin{array}{*{20}c} {{\text{rand}}_{x} \left( { - 1,1} \right)} \\ {{\text{rand}}_{y} \left( { - 1,1} \right)} \\ \end{array} } \right], 0 \le i \le M - 1 $$
(8)

where (xc, yc) is the center position of the corresponding region. In this work, the head node is represented by the ith sensor with i = 0. The indicator function I(.), which returns 1 if its input condition is true and 0 otherwise, is used to place the head node at the region center. The functions randx(− 1,1) and randy(− 1,1) generate random values between − 1 and 1 following a uniform distribution for the x and y coordinates, respectively.

A constraint is applied to the randomly generated location of any node (sensor or source) such that the distance between two adjacent nodes is larger than a user-defined value denoted by Dismin. The dataset upon which the training and testing phases are carried out is generated by calculating Ndata values for SI and Sloc using (5) and (6). The source intensity (SI) and source location (Sloc) datasets are presented in (9) and (10), respectively.

$$ {\text{SI}}\;{\text{dataset}} = \left[ {\begin{array}{*{20}c} {{\text{SI}}_{1} } \\ {{\text{SI}}_{2} } \\ . \\ . \\ . \\ {{\text{SI}}_{{N_{{{\text{data}}}} }} } \\ \end{array} } \right] $$
(9)
$$ {\text{Sloc}} \;{\text{dataset}} = \left[ {\begin{array}{*{20}c} {{\text{Sloc}}_{1} } \\ {{\text{Sloc}}_{2} } \\ . \\ . \\ . \\ {{\text{Sloc}}_{{N_{{{\text{data}}}} }} } \\ \end{array} } \right] $$
(10)
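As a sketch of the generation procedure in (5)–(8), the following draws one sample of source intensity, source location, and cluster sensor positions. The Dismin spacing constraint is omitted for brevity, and drawing each coordinate independently is a simplifying assumption of this sketch:

```python
import numpy as np

rng = np.random.default_rng(42)

def generate_sample(Dx, Dy, Drg, M, si_min, si_max, region_center):
    """Draw one dataset sample per Eqs. (5)-(8): a uniform source
    intensity and location, the head node at the region center, and
    M - 1 members uniform within the region."""
    si = si_min + rng.uniform() * (si_max - si_min)       # Eq. (5)
    src = rng.uniform(size=2) * np.array([Dx, Dy])        # Eq. (6), (x_min, y_min) = (0, 0)
    xc, yc = region_center
    sensors = [(xc, yc)]                                  # i = 0: cluster head, Eq. (8)
    for _ in range(M - 1):
        sensors.append((xc + rng.uniform(-1, 1) * Drg / 2,
                        yc + rng.uniform(-1, 1) * Drg / 2))
    return si, src, np.array(sensors)

si, src, sensors = generate_sample(90, 90, 30, 8, 1e3, 1e5, (15.0, 15.0))
```

Repeating this draw Ndata times produces the SI and Sloc datasets of (9) and (10).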

In a given area, the region closest to the radioactive source is affected by the radiation more than farther regions and can hence provide more accurate estimates of the source parameters. Accordingly, in our approach, the M sensors in the cluster whose head node records the highest reading (and is hence closest to the source) are selected for the estimation process. Note that the estimation procedure is carried out at the head node. For each sensor, the reading is evaluated according to the Poisson model described in Sect. 2: first, the average count rate λ is calculated; then, the count rate value is generated randomly using the Poisson distribution.

3.3 Feature extraction

In this sub-section, we illustrate the process of evaluating the feature vector using the data collected from the M sensors that detected the radioactive source. As shown in Fig. 5, the information required by the estimation method is the reading and the location of each sensor in the cluster. The main idea of the proposed feature extraction is to represent both the location and the reading of each sensor with one value; in other words, we fuse a sensor's location and reading into a single value. The feature vector FV holds M feature values, each of which corresponds to the data of a single sensor, as shown in (11). Hence, we have

$$ FV = \left[ {\begin{array}{*{20}c} {f_{0} } \\ {f_{1} } \\ . \\ . \\ {f_{M - 1} } \\ \end{array} } \right] $$
(11)

where fi is the feature value that represents the data collected from the ith sensor, with i = 0 denoting the head node of the region. As mentioned earlier, the cluster head node is always located at the center of the region. Hence, the reading of the head node (R0) is sufficient to represent the corresponding feature value f0, which is calculated as follows:

$$ f_{0} = R_{0} $$
(12)

On the other hand, for the randomly deployed sensors, the location information must be considered when calculating their feature values; that is, one value must indicate both the reading and the location of a sensor, as shown in Fig. 5. In the following, the evaluation of the ith feature value (fi) is explained, where 1 ≤ i ≤ M − 1. To incorporate the location information as an input feature to the machine learning model, we propose a zoning operation.

Fig. 5

Block diagram showing the inputs and output of the feature value evaluation process

The zoning operation converts loci from the Cartesian coordinates (xi, yi) to a zone number zi. Figure 6 shows a Drg × Drg region, whose center is located at (xc, yc), divided into zones of size Δx × Δy each. In particular, zones are obtained by dividing the x and y coordinates into equally spaced intervals of length Δx and Δy, respectively. We then define zonex and zoney, the zone interval indices corresponding to xi and yi, respectively, as shown in Fig. 6. Note that the lengths of the intervals zonex and zoney are Δx and Δy, respectively. The total numbers of intervals, denoted Nzx and Nzy, for zonex and zoney can be calculated as follows:

$$ Nz_{x} = \left\lfloor \frac{{D_{rg} }}{{\Delta_{x} }}\right\rfloor ,\quad Nz_{y} = \left\lfloor \frac{{D_{rg} }}{{\Delta_{y} }} \right\rfloor $$
(13)

where \(\left\lfloor . \right\rfloor\) is the floor operator. Accordingly, the zone number zi lies in [0, NzxNzy − 1], as illustrated in the figure. Algorithm 1 presents the pseudocode for evaluating the zone number from the Cartesian coordinates. Note that (xorigin, yorigin) are the coordinates of the origin, located at the top-left corner of the region of interest.

Fig. 6

A region, of size Drg × Drg, divided into zones of size Δx × Δy

[Algorithm 1: pseudocode for evaluating the zone number from the Cartesian coordinates]
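Since Algorithm 1 itself is given as a figure, the following sketch shows one plausible implementation of the zoning operation; the row-major numbering from the top-left origin is an assumption consistent with the description above:

```python
import math

def zone_number(x, y, x_origin, y_origin, dx, dy, nz_x):
    """Convert Cartesian coordinates to a zone number (a sketch of what
    Algorithm 1 computes). The origin is the top-left corner of the
    region, so the row index grows as y decreases; zones are numbered
    row-major from 0."""
    zone_x = int(math.floor((x - x_origin) / dx))   # column index
    zone_y = int(math.floor((y_origin - y) / dy))   # row index (y axis points up)
    return zone_y * nz_x + zone_x

# A 30 x 30 m region with 3 m zones: Nz_x = Nz_y = 10, zone numbers 0..99.
z = zone_number(x=4.0, y=28.0, x_origin=0.0, y_origin=30.0, dx=3.0, dy=3.0, nz_x=10)
```

With these illustrative settings, the point (4.0, 28.0) falls in the second zone of the top row.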

After zoning, the second step is to evaluate the feature value fi from the reading value Ri and the zone number zi. In the proposed approach, a quantization process is employed to quantize the value Ri. The reading Ri depends on the unknown source intensity SI, which can vary from small to very large values; accordingly, a large quantization error might result from setting constant values for the quantization parameters (quantization step and number of quantization levels). To solve this problem and maintain a low quantization error, the quantization is applied to the difference Diffi between the reading Ri of the ith sensor and the reading R0 of the head sensor. Note that the distance between any cluster member and the cluster head is less than \({D}_{rg}/\sqrt{2}\) under the assumed setting, where the head node is at the middle of the region. In general, even if the head node is not exactly at the middle, the sensed values are not expected to vary drastically, since the sensors are within the same cluster. Hence, a limited value of Diffi is expected. Figure 7 shows the quantization levels used for quantizing the magnitude of Diffi, where the quantization step is denoted by ΔQ. The number of quantization levels Nlevels is calculated as follows:

$$ N_{{{\text{levels}}}} = \left\lfloor \frac{{{\text{Diff}}_{\max } - {\text{Diff}}_{\min } }}{{\Delta_{Q} }}\right\rfloor $$
(14)

where Diffmax is a user-defined value indicating the expected maximum of |Diffi|. The minimum value, denoted Diffmin, is equal to zero. The value of each level is mΔQ, where m is an integer in the range [0, Nlevels − 1]. Accordingly, the quantized value of |Diffi|, denoted QDiffi, can be represented by QleveliΔQ such that:

$$ Q{\text{level}}_{i} = \left\lfloor \frac{{\left| {{\text{Diff}}_{i} } \right|}}{{\Delta_{Q} }} \right\rfloor $$
(15)

where Qleveli is the quantization level corresponding to the value of |Diffi|. After evaluating the zone number zi and the quantization level Qleveli, the feature value fi is calculated. A matrix is constructed whose rows and columns represent the possible zone numbers and quantization levels, respectively; hence, the size of the matrix is Nz × Nlevels, where Nz is the total number of zones. The value of each matrix element is its location number, where numbering is performed from left to right and top to bottom, starting from zero, as shown in Fig. 8. The number of the element located at the rith row and cith column can be calculated as follows:

$$ i^{th} \;{\text{element's}} \;{\text{number}} = r_{i} \cdot N_{{{\text{levels}}}} + c_{i} $$
(16)

where Nlevels is the total number of columns. The magnitude of the feature value fi is selected from the constructed matrix according to the zone number zi and the quantization level Qleveli. According to (16), |fi| can be evaluated using the following relation:

$$ \left| {f_{i} } \right| = z_{i} \cdot N_{{{\text{levels}}}} + Q{\text{level}}_{i} $$
(17)
Fig. 7

Quantization levels for evaluating the quantized value for |Diffi|, which is referred to as QDiffi

Fig. 8

The matrix from which the magnitude of the feature value fi is selected according to the zone number (zi) and the quantization level (Qleveli)

The pseudocode for the feature value evaluation is presented in Algorithm 2. As noted, the quantization is applied to the absolute value of Diffi to reduce the number of quantization levels. To preserve the sign of the difference Diffi, the magnitude of the feature value |fi| is multiplied by the factor Diffi/|Diffi|. Eventually, the feature vector carries information about both the sensor location (zone number) and the sensor reading.

[Algorithm 2: pseudocode for the feature value evaluation]
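A sketch of the feature value computation of Algorithm 2, combining (15) and (17) with the sign factor, might look as follows. Clamping the quantization level to Nlevels − 1 and treating Diffi = 0 as positive are assumptions of this sketch; the default parameter values follow the paper's experimental settings (ΔQ = 10 cps, Diffmax = 1000 cps):

```python
import math

def feature_value(r_i, r_0, z_i, delta_q=10.0, diff_max=1000.0):
    """Fuse a sensor's reading and zone number into one feature value
    via Eqs. (14), (15), and (17), with the sign of Diff_i restored."""
    n_levels = int(math.floor(diff_max / delta_q))                 # Eq. (14), Diff_min = 0
    diff = r_i - r_0                                               # difference from head reading
    q_level = min(int(math.floor(abs(diff) / delta_q)),
                  n_levels - 1)                                    # Eq. (15), clamped (assumption)
    magnitude = z_i * n_levels + q_level                           # Eq. (17)
    sign = 1 if diff >= 0 else -1                                  # factor Diff_i / |Diff_i|
    return sign * magnitude

f = feature_value(r_i=250.0, r_0=180.0, z_i=3)   # diff = +70 cps, level 7
```

With 100 levels per zone, zone 3 and level 7 give a magnitude of 307, signed by the direction of the difference.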

3.4 Estimation using regression

In this work, the estimation of the radioactive source parameters is performed through a regression model that is designed in a training phase and evaluated in a testing phase. We generate data using the models explained in the previous sections: recall that the reading values (count rates) follow the Poisson distribution, while the locations follow uniform random distributions. The generated data (Ndata samples) are divided into training and testing sets. Following Müller and Guido (2016), it is typical to use the larger share of the data for training the machine learning algorithm, such that:

$$ N_{{{\text{tr}}}} \approx \frac{3}{4} N_{{{\text{Data}}}} , \quad N_{{{\text{ts}}}} \approx \frac{1}{4} N_{{{\text{Data}}}} $$
(18)

where Ntr and Nts are the numbers of values in the training and testing datasets, respectively. The training data provided to the model comprise the generated SI and Sloc datasets along with their corresponding feature vector (FV) dataset. In the testing phase, only the features are provided, and the regression models estimate SI and Sloc.
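The 3:1 split of (18) can be sketched as a shuffled index split (Scikit-learn's train_test_split offers the same functionality):

```python
import numpy as np

def split_dataset(features, targets, train_frac=0.75, seed=0):
    """Shuffle and split a dataset 3:1 into training and testing parts,
    matching Eq. (18)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(features))
    n_tr = int(train_frac * len(features))
    tr, ts = idx[:n_tr], idx[n_tr:]
    return features[tr], targets[tr], features[ts], targets[ts]

X = np.arange(20).reshape(10, 2)   # 10 toy feature vectors
y = np.arange(10)                  # 10 toy targets
X_tr, y_tr, X_ts, y_ts = split_dataset(X, y)   # 7 training rows, 3 testing rows
```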

Note that most traditional regression models provide a single output value for a given feature vector input. Accordingly, two regression models are used to estimate SI and Sloc separately. Moreover, the Sloc coordinates (xs, ys) are represented by a zone number Sz using Algorithm 1. Figure 9 shows the procedure for evaluating the FV dataset and the source zone Szone dataset from the SI and Sloc datasets. First, the count rate reading of each sensor is evaluated using the Poisson model, based on the source's location and intensity. Then, the feature vector is calculated according to the feature extraction method. The source's zone number Sz is computed from the source's location and the cluster head location (at the center of the region).

Fig. 9

Block diagram for the procedure used for evaluating the feature vector and source zone datasets

In Fig. 10, a block diagram illustrates the training phase of the two regression models, which are implemented using the decision tree algorithm. The trained models, referred to as SI_DTmodel and Sz_DTmodel, are used to estimate the SI and Sz values, respectively. Finally, the trained decision tree models can be employed to estimate the source's intensity SIest and location Slocest, as shown in Fig. 11. Note that the Szest value is converted to the Slocest coordinates using the conversion method presented in Algorithm 3, where the source's location (xs, ys) is estimated to be at the center of the corresponding zone. The testing phase examines the accuracy of the estimated values using the testing datasets. The next section presents the performance evaluation results.
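Since Algorithm 3 is given as a figure, the following sketch shows one plausible conversion from an estimated zone number back to the coordinates of the zone center; the row-major, top-left-origin numbering is an assumption consistent with the zoning description in Sect. 3.3:

```python
def zone_to_location(z, x_origin, y_origin, dx, dy, nz_x):
    """Sketch of Algorithm 3: map an estimated zone number to the
    Cartesian coordinates of that zone's center, assuming row-major
    numbering from a top-left origin."""
    row, col = divmod(z, nz_x)
    x_s = x_origin + (col + 0.5) * dx   # center of the zone column
    y_s = y_origin - (row + 0.5) * dy   # center of the zone row (y axis points up)
    return x_s, y_s

# Zone 1 in a 10 x 10 grid of 3 m zones maps to the center (4.5, 28.5).
xs, ys = zone_to_location(1, x_origin=0.0, y_origin=30.0, dx=3.0, dy=3.0, nz_x=10)
```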

Fig. 10

Training of two decision trees for estimating SI and Sz

Fig. 11

The proposed procedure for estimating the source intensity SI and location Sloc from the sensors’ readings and locations using the SI_DTmodel and Sz_DTmodel, respectively

[Algorithm 3: pseudocode for converting the estimated zone number to the Cartesian coordinates of the zone center]

4 Experimental results

In this section, the performance of the proposed approach is evaluated and compared to that of existing methods. The simulation experiments were performed using Python 3.5 on a PC with 4 GB of RAM and a 2.1 GHz processor. Table 1 shows the parameter values used for the data generation, corresponding to the radioactive source and the detector. Note that, in a sensor network, the sensor density is defined as the number of sensors divided by the region size. The work in Cooper et al. (2012) reported that a sensor density of 0.009 sensors/m2 can determine, with high confidence, the location of radiological material in a given area; hence, for a region of size 30 × 30 m2, eight sensors are sufficient for effective detection. The constant δ in (3) is set through calibration using the simulated source intensities presented in Zhao et al. (2019). Table 2 shows the simulated intensities for the radioisotope 137Cs (Cesium-137), measured in cps using the D3S (Discreet Dual Detector) detector located 1 m away from the source (Zhao et al. 2019). The simulation was performed through GADRAS (Gamma Detector Response and Analysis Software) without considering the background radiation. In the proposed work, however, the count rate representing the background radiation is set to a constant value, as shown in Table 1. The range of intensity values [SImin, SImax] is set according to the values commonly used for evaluating radiation source localization methods (Zhao et al. 2019; Bukartas et al. 2019; Zhao and Sullivan 2019). For the feature extraction process, the expected maximum difference (Diffmax) is set to 1000 cps, while the quantization step (ΔQ) is set to 10 cps. The maximum depth of the decision tree is set to 30.

Table 1 The values of the parameters used for the data generation
Table 2 Some of the simulated source intensities measured in counts per second, at 1 m away from the source, presented in Zhao et al. (2019)

The performance of the proposed work is compared with that of other existing approaches in terms of estimation accuracy and execution time using the testing data. Table 3 describes the four methods used for the performance comparison. The methods are implemented according to their descriptions in Zhao et al. (2019), Bai (2015), Cordone (2019), and Zhao and Sullivan (2019). Note that the three methods presented in Bai (2015), Cordone (2019), and Zhao and Sullivan (2019) are based on the MLE algorithm; however, the likelihood optimization is performed differently. The method in Bai (2015) applies a grid search to estimate the source's parameters, while the method in Zhao and Sullivan (2019) is based on Newton–Raphson optimization. The estimation in Cordone (2019), on the other hand, is performed using a multi-resolution MLE. In this work, the parameters of the Kriging method in Zhao et al. (2019) are adjusted experimentally according to the testing data.

Table 3 Description of the methods used for the performance comparison

For the source intensity estimation, the accuracy is measured using the NRMSE (Normalized Root-Mean-Square Error) that is calculated as follows:

$$ \text{NRMSE} = \frac{1}{\text{SI}_{\max} - \text{SI}_{\min}} \cdot \sqrt{\frac{\sum_{i=1}^{N_{\text{ts}}} \left( \text{SI}_{i} - \text{SI}_{\text{est}_{i}} \right)^{2}}{N_{\text{ts}}}} $$
(19)

where \({\mathrm{SI}}_{i}\) and \({\mathrm{SI}}_{{\mathrm{est}}_{i}}\) are the ith actual and estimated values of the source's intensity, respectively, and \(N_{\mathrm{ts}}\) is the number of test cases. Figure 12 compares the proposed approach and the methods in Zhao et al. (2019), Bai (2015), Cordone (2019), and Zhao and Sullivan (2019) in terms of the intensity estimation error measured using NRMSE. As shown in Fig. 12, the proposed method provides superior intensity estimation accuracy. The results also show that the method in Bai (2015) provides better estimation accuracy than the methods in Cordone (2019) and Zhao and Sullivan (2019) because its likelihood optimization is performed using a grid search.
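The NRMSE of Eq. (19) is straightforward to compute; a minimal sketch (with made-up toy intensities, not the paper's test data) is:

```python
import numpy as np

def nrmse(si_true, si_est, si_min, si_max):
    """NRMSE of Eq. (19): RMSE over the test set, normalized by the intensity range."""
    si_true = np.asarray(si_true, dtype=float)
    si_est = np.asarray(si_est, dtype=float)
    rmse = np.sqrt(np.mean((si_true - si_est) ** 2))
    return rmse / (si_max - si_min)

# toy example: three test cases with illustrative intensity values (cps)
err = nrmse([1000, 2000, 3000], [1100, 1900, 3050], si_min=500, si_max=5000)
```

Normalizing by the range [SImin, SImax] makes the error comparable across experiments that use different intensity ranges.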

Fig. 12
figure 12

Comparison between the proposed approach and the four methods, described in Table 3, in terms of the SI (source intensity) estimation error measured using NRMSE

On the other hand, the accuracy of the estimated source location is represented by location error that is evaluated using the Euclidean distance as follows:

$$ \text{location error} = \begin{bmatrix} \text{dis}\left( \text{Sloc}, \text{Sloc}_{\text{est}} \right)_{1} \\ \text{dis}\left( \text{Sloc}, \text{Sloc}_{\text{est}} \right)_{2} \\ \vdots \\ \text{dis}\left( \text{Sloc}, \text{Sloc}_{\text{est}} \right)_{N_{\text{ts}}} \end{bmatrix} $$
(20)

where dis(Sloc, Slocest)i is the Euclidean distance between the ith actual and estimated source locations. Figure 13 shows a box plot of the location error achieved using the proposed method and the four methods described in Table 3. The results shown in Fig. 13 are tabulated in Table 4, which lists some descriptive statistics of the location error. As seen from Fig. 13 and Table 4, the proposed method is outperformed only by method 1 in terms of location estimation accuracy, owing to the exhaustive search that method employs (as shown next, this accuracy comes at the expense of much higher execution time). The proposed approach provides better accuracy than the remaining methods.
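The per-test-case Euclidean distance of Eq. (20) can be sketched as follows (the coordinates below are toy values for illustration):

```python
import numpy as np

def location_error(sloc_true, sloc_est):
    """Eq. (20): Euclidean distance per test case; rows are (x, y) locations in meters."""
    sloc_true = np.asarray(sloc_true, dtype=float)
    sloc_est = np.asarray(sloc_est, dtype=float)
    return np.linalg.norm(sloc_true - sloc_est, axis=1)

# two toy test cases: a 3-4-5 offset and a perfect estimate
errs = location_error([[10.0, 10.0], [20.0, 5.0]],
                      [[13.0, 14.0], [20.0, 5.0]])
# errs[0] == 5.0, errs[1] == 0.0
```

The resulting vector of distances is exactly what the box plot in Fig. 13 and the descriptive statistics in Table 4 summarize.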

Fig. 13
figure 13

Box plot representation for the location error, measured in meter (m), achieved by the proposed approach and the four methods described in Table 3

Table 4 Comparison between the proposed method and the four methods, described in Table 3, according to some of the descriptive statistics for the location error measured in meter (m)

Finally, the performance is examined according to the execution time of the estimation process. Table 5 presents a comparison between the proposed algorithm and the methods described in Table 3 in terms of the execution time measured in seconds (s). Note that the time is measured for estimating both the source's intensity and location. As the results show, the proposed method is less time-consuming than the other methods. From Tables 3 and 5, as expected, the method in Bai (2015) is the most time-consuming algorithm due to the exhaustive grid search of the MLE method.

Table 5 Comparison between the proposed approach and the four methods, described in Table 3, in terms of the execution time measured in seconds (s)

It should be noted that the performance of the proposed work was measured using generated data rather than real measurements. The accuracy achieved in practice therefore depends on the degree of similarity between the parameters assumed when generating the synthetic data and the parameters of the real measurement tools and environment: the closer the two sets of parameters are, the closer the achieved accuracy will be to that obtained with synthetic data.

5 Conclusion

In this paper, we presented a method for radioactive source localization that uses machine learning to estimate the source's location and intensity. A distributed sensor network is used for detecting the radiation source in a given area through a group of clusters, each of which includes a fixed number of detectors. The estimation process is carried out within the cluster in which the radiation source is detected, and is performed via a regression algorithm using the reading and position of each sensor. A simple and effective feature extraction method was proposed, through which a feature vector is generated using synthetic data created with a Poisson model. We proposed a zoning operation to incorporate location information as a feature of the machine learning model, while the sensor's reading is approximated using a quantization process. Each feature value is then evaluated using the zone number and the quantized value of the reading. In the proposed method, any traditional regression algorithm can be utilized for the estimation procedure using the calculated feature vector; here, the decision tree regression model was examined. Two separate decision trees were employed, each estimating one of the radiation source's parameters (location or intensity). A performance comparison was carried out between the proposed work and other recently published localization methods. The experimental results showed that the proposed approach provides superior performance to most of the compared approaches in terms of estimation accuracy and execution time. However, the exhaustive MLE grid search provided higher location estimation accuracy at the expense of much higher latency.
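The two-tree setup summarized above can be sketched with scikit-learn's `DecisionTreeRegressor` as a stand-in for the paper's SI_DTmodel and Sz_DTmodel. Note that the feature matrix below is a random placeholder, not the paper's zone/quantization features, and the target ranges are only illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 8))                    # one placeholder feature per sensor
y_intensity = rng.uniform(500, 5000, 200)   # source intensity targets (cps)
y_location = rng.uniform(0, 30, (200, 2))   # (x, y) location targets in a 30 x 30 m region

# one tree per source parameter, with the max depth of 30 used in the experiments
si_model = DecisionTreeRegressor(max_depth=30).fit(X, y_intensity)
loc_model = DecisionTreeRegressor(max_depth=30).fit(X, y_location)

si_hat = si_model.predict(X[:1])    # estimated intensity for one test case
loc_hat = loc_model.predict(X[:1])  # estimated (x, y) location for one test case
```

Using two independent trees keeps each model focused on a single target, and the multi-output location tree predicts both coordinates jointly.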