1 Introduction

The amount of space debris has increased greatly in recent decades due to intense outer space exploration, steadily deteriorating the Earth orbit environment. Detecting, dodging and removing space debris has become a significant international issue, and the surveillance of space debris is accordingly a hot topic in space exploration. In particular, the improvement of space debris surveillance and early warning systems, especially the detection of dim small targets, is a key technology.

Due to the complexity of the space imaging environment, the detection of dim small targets in star images is hindered by the low signal-to-noise ratio (SNR), the scarcity and instability of target features, and the similarity between stars and targets. Because the imaging distance is very large and the space debris itself is small, the target spot in the star image occupies only 3–100 pixels and has no obvious texture features. Many commonly used feature extraction operators, such as Canny, LBP and SIFT, therefore cannot effectively extract features from the star image. The target is approximately Gaussian-distributed on the image, with the brightest pixel at the center and the surrounding pixels scattered into a circular or elliptical spot whose gray values are lower than that of the center pixel. In an image sequence, owing to the motion of the target and the background noise, intuitive features of the target such as gray level, area and SNR change constantly.

Due to these characteristics of star images, most existing detectors handle the problem from two directions. The first is to choose a better threshold for segmentation. Considering uneven illumination, adaptive local threshold segmentation methods [1,2,3] were proposed, in which the segmentation threshold is calculated in each divided sub-image instead of setting a global threshold [4,5,6]. However, these methods do not take into account the relationship between adjacent target pixels or the distribution of the gray levels. The second direction makes use of other features of the image, such as the target geometry: the detection method based on mathematical morphology [7], the target detection method based on a statistical model [8], the target detection method based on wavelet analysis [9], and the genetic algorithm [10], which is robust to complex backgrounds. Pixel-based methods usually use multi-frame images to detect targets; for example, the inter-frame difference method [11] and the self-adaptive optical flow method [12] need at least two images to determine the motion of the targets. Although these algorithms have improved accuracy, they are still limited in detecting low-SNR targets.

In this paper, we consider the problems mentioned above and propose a dim small target detection method based on feature learning. The main contributions of our work can be summarized as follows:

  • We design a filter bank based on the imaging characteristics of small debris and noise, which makes full use of the correlation between adjacent pixels of the target. The experimental results demonstrate that the features extracted by the designed filter bank perform better than traditional features such as gray level.

  • We cast the detection problem as a classification problem. An SVM classifier trained on labeled star images is used to detect the stars and potential targets. The training process of the SVM classifier is simple but effective.

2 The Spatial-Frequency Domain Features Based Dim Small Target Detection

2.1 Dim Small Target Imaging Model

The target imaging model in a single frame is shown in Eq. (1):

$$ T^{t} + B^{t} = \alpha_{{(x_{i} ,y_{j} )}}^{t} \delta^{t} (x_{i} ,y_{j} ) \otimes h_{o} \otimes h_{T}^{t} + n^{t} (x,y)\quad x,y \in N_{T} $$
(1)

where \( T^{t} \) is the ideal imaging model of the target at time t, and \( B^{t} \) is the imaging model of the background at time t. The real target image is the superposition of the target itself and the noise. \( \alpha_{{(x_{i} ,y_{j} )}}^{t} \) is the brightness of the target, \( \delta^{t} (x_{i} ,y_{j} ) \) is the impulse function, \( h_{o} \) is the optical system blur kernel, which arises from the design and manufacture of the camera optics, \( h_{T}^{t} \) is the target motion blur kernel, which arises from the relative motion between the camera and the target, and \( n^{t} (x,y) \) is the noise gray level at pixel \( (x,y) \).

The only difference between stars and targets is the motion blur kernel. In this paper, we focus on detection in a single frame, so stars and debris targets are treated as the same class; they are distinguished by their motion features over multiple frames in subsequent steps.
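As a concrete illustration, the imaging model of Eq. (1) can be simulated with a single Gaussian point-spread function standing in for the combined blur kernels \( h_{o} \otimes h_{T}^{t} \). The following sketch is illustrative only (function names and parameter values are not from the paper):

```python
import numpy as np

def render_target(h, w, cx, cy, amplitude, sigma_psf, noise_std, rng=None):
    """Toy single-frame rendering of Eq. (1): an impulse of given brightness
    at (cx, cy), blurred by a Gaussian PSF, plus additive noise n(x, y)."""
    rng = np.random.default_rng(rng)
    y, x = np.mgrid[0:h, 0:w]
    spot = amplitude * np.exp(-((x - cx)**2 + (y - cy)**2) / (2 * sigma_psf**2))
    return spot + rng.normal(0.0, noise_std, size=(h, w))

# a dim spot of a few pixels on a noisy 32x32 background
img = render_target(32, 32, cx=16, cy=16, amplitude=50.0, sigma_psf=1.2,
                    noise_std=2.0, rng=0)
print(img.shape)  # (32, 32)
```

The rendered spot is brightest at the center and decays radially, matching the approximately Gaussian target distribution described above.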

2.2 Dim Small Target Detection

In this paper, we use a filter bank consisting of 29 filters to extract the features of each pixel in the star image. The features extracted from labeled real star images are then used to train an SVM classifier. After the optimal parameters are obtained, targets in new star images can be detected by the trained classifier (Fig. 1).

Fig. 1.
figure 1

The workflow of the proposed method

Feature Extraction with Filter Bank.

Considering the distribution of the gray level and the motion blur, we design a filter bank consisting of 29 filters to extract the features of the stars and potential targets at different scales (Fig. 2).

Fig. 2.
figure 2

The filter bank is a mix of edge, bar and spot filters at multiple scales and orientations. It has a total of 29 filters: first- and second-order Gaussian derivative filters at 6 orientations (12 filters), 8 Laplacian of Gaussian filters, 4 Gaussian filters and 5 S filters.

The labeled images are convolved with each of the 29 filters. The feature vector of every pixel in the image is thus 29-dimensional.

$$ Feature_{i} = I_{train} \otimes filter_{i} \quad i = 1, \ldots ,29 $$
(2)
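The per-pixel feature extraction of Eq. (2) amounts to one convolution per filter. A minimal sketch in Python (the paper's implementation is in MATLAB; the function name and the two-filter toy bank here are illustrative):

```python
import numpy as np
from scipy.ndimage import convolve

def extract_features(image, filter_bank):
    """Convolve the image with every filter of the bank; each pixel gets
    one feature per filter, as in Eq. (2)."""
    h, w = image.shape
    feats = np.empty((h, w, len(filter_bank)), dtype=np.float64)
    for i, f in enumerate(filter_bank):
        feats[:, :, i] = convolve(image.astype(np.float64), f, mode="reflect")
    return feats  # shape (H, W, n_filters)

# toy example: a 2-filter "bank" applied to an impulse image
bank = [np.ones((3, 3)) / 9.0,                                    # mean filter
        np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], float)]   # Laplacian
img = np.zeros((8, 8))
img[4, 4] = 1.0
F = extract_features(img, bank)
print(F.shape)  # (8, 8, 2)
```

With the full 29-filter bank, every pixel of the training images yields one 29-dimensional feature vector.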

The Gaussian filters.

The Gaussian filters consist of 4 rotationally invariant filters of the form.

$$ G(x,y) = \frac{1}{{2\pi \sigma^{2} }}e^{{ - \frac{{x^{2} + y^{2} }}{{2\sigma^{2} }}}} $$
(3)

where x, y are the two-dimensional coordinates and \( \sigma \) takes the values \( \sqrt{2}, 2, 2\sqrt{2}, 4 \).
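Sampled on a discrete grid, the four Gaussian filters of Eq. (3) can be generated as follows (the support of about ±3σ is an implementation choice, not specified in the paper):

```python
import numpy as np

def gaussian_filter(sigma, size=None):
    """2-D Gaussian of Eq. (3), sampled on an odd-sized square grid."""
    if size is None:
        size = 2 * int(np.ceil(3 * sigma)) + 1   # cover roughly +-3 sigma
    r = size // 2
    x, y = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1))
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)

# the 4 rotationally invariant scales used in the bank
gaussians = [gaussian_filter(s) for s in (np.sqrt(2), 2, 2 * np.sqrt(2), 4)]
print(len(gaussians))  # 4
```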

The Gaussian derivative filters.

The Gaussian derivative filters consist of first- and second-order derivative filters at 6 orientations.

$$ \left[ {\begin{array}{*{20}c} {x^{{\prime }} } \\ {y^{{\prime }} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {\cos \,\theta } & { - \sin \,\theta } \\ {\sin \,\theta } & {\cos \,\theta } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} x \\ y \\ \end{array} } \right] $$
(4)

where x, y are the two-dimensional coordinates, \( x^{{\prime }} ,y^{{\prime }} \) are the coordinates after rotation, \( \theta \) is the direction, respectively, taking 0°, 30°, 60°, 90°, 120° and 150°.

$$ \frac{\partial G}{{\partial x^{{\prime }} }} = \left( { - \frac{1}{{2\pi \sigma_{x}^{4} }}} \right)x^{{\prime }} e^{{ - \frac{{x^{{{\prime }2}} + y^{{{\prime }2}} }}{{2\sigma_{x}^{2} }}}} ,\frac{\partial G}{{\partial y^{{\prime }} }} = \left( { - \frac{1}{{2\pi \sigma_{y}^{4} }}} \right)y^{{\prime }} e^{{ - \frac{{x^{{{\prime }2}} + y^{{{\prime }2}} }}{{2\sigma_{y}^{2} }}}} $$
(5)
$$ \frac{{\partial^{2} G}}{{\partial x^{{{\prime 2}}} }} = \left( { - \frac{1}{{2\pi \sigma_{x}^{4} }}} \right)\left( {1 - \frac{{x^{{{\prime 2}}} }}{{\sigma_{x}^{2} }}} \right)e^{{ - \frac{{x^{{{\prime }2}} + y^{{{\prime }2}} }}{{2\sigma_{x}^{2} }}}} ,\frac{{\partial^{2} G}}{{\partial y^{{{\prime 2}}} }} = \left( { - \frac{1}{{2\pi \sigma_{y}^{4} }}} \right)\left( {1 - \frac{{y^{{{\prime 2}}} }}{{\sigma_{y}^{2} }}} \right)e^{{ - \frac{{x^{{{\prime }2}} + y^{{{\prime }2}} }}{{2\sigma_{y}^{2} }}}} $$
(6)

where \( \sigma_{x} \) is \( 6\sqrt 2 \), \( \sigma_{y} \) is \( 2\sqrt 2 \).
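Eqs. (4)–(6) define first- and second-order Gaussian derivatives in rotated coordinates. One common reading, assumed in the sketch below in the style of Leung–Malik-type banks, is an elongated edge filter (first derivative) and a bar filter (second derivative) per orientation, giving 12 filters in total; the anisotropic-Gaussian envelope and the grid size are assumptions:

```python
import numpy as np

def derivative_bank(sigma_x=6 * np.sqrt(2), sigma_y=2 * np.sqrt(2), size=49):
    """One edge (first-derivative) and one bar (second-derivative) filter
    per orientation, following Eqs. (4)-(6): 6 orientations -> 12 filters."""
    r = size // 2
    x, y = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1))
    bank = []
    for deg in (0, 30, 60, 90, 120, 150):
        t = np.deg2rad(deg)
        xp = np.cos(t) * x - np.sin(t) * y      # rotated coordinates, Eq. (4)
        yp = np.sin(t) * x + np.cos(t) * y
        g = np.exp(-(xp**2 / (2 * sigma_x**2) + yp**2 / (2 * sigma_y**2)))
        edge = -yp / (2 * np.pi * sigma_y**4) * g                       # Eq. (5)
        bar = -(1 - yp**2 / sigma_y**2) / (2 * np.pi * sigma_y**4) * g  # Eq. (6)
        bank += [edge, bar]
    return bank

bank = derivative_bank()
print(len(bank))  # 12
```

The edge filters are antisymmetric about the filter axis and so have zero mean; the bar filters respond to elongated bright structures such as motion-blurred streaks.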

The Laplacian of Gaussian filters.

The LoG filters consist of 8 rotationally invariant filters.

$$ LoG(x,y) = - \frac{1}{{\pi \sigma^{4} }}\left[ {1 - \frac{{x^{2} + y^{2} }}{{2\sigma^{2} }}} \right]e^{{ - \frac{{x^{2} + y^{2} }}{{2\sigma^{2} }}}} $$
(7)

where \( \sigma \) takes the values \( \sqrt{2}, 2, 2\sqrt{2}, 4, 3\sqrt{2}, 6, 6\sqrt{2}, 12 \), respectively.
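The eight LoG filters of Eq. (7) can be sampled directly (again with an assumed support of about ±3σ):

```python
import numpy as np

def log_filter(sigma, size=None):
    """Laplacian of Gaussian of Eq. (7), sampled on an odd-sized grid."""
    if size is None:
        size = 2 * int(np.ceil(3 * sigma)) + 1
    r = size // 2
    x, y = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1))
    r2 = x**2 + y**2
    return (-1.0 / (np.pi * sigma**4)
            * (1 - r2 / (2 * sigma**2))
            * np.exp(-r2 / (2 * sigma**2)))

sigmas = [np.sqrt(2), 2, 2 * np.sqrt(2), 4, 3 * np.sqrt(2), 6, 6 * np.sqrt(2), 12]
log_bank = [log_filter(s) for s in sigmas]
print(len(log_bank))  # 8
```

Each LoG filter has a negative center surrounded by a positive ring, so it responds strongly to compact bright spots of a matching scale, which is exactly the blob-like signature of a dim target.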

The Schmid (S) filters.

The S filters [13] consist of 5 rotationally invariant filters.

$$ S(\text{r,}\sigma \text{,}\tau ) = \cos \left( {\frac{\pi \tau r}{\sigma }} \right)\exp \left( { - \frac{{r^{2} }}{{2\sigma^{2} }}} \right) $$
(8)

where a constant \( F_{0}(\sigma ,\tau ) \) is added to obtain a zero DC component, and the \( (\sigma ,\tau ) \) pairs take the values (2, 1), (4, 1), (6, 1), (8, 1) and (10, 1). The filters have rotational symmetry.
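A sketch of the Schmid filters of Eq. (8); subtracting the mean is one simple way to realize the zero-DC constant \( F_{0} \) on a discrete grid (the grid size is an assumption):

```python
import numpy as np

def schmid_filter(sigma, tau, size=49):
    """Rotationally symmetric Schmid filter of Eq. (8); the mean is
    subtracted so the discrete filter has a zero DC component."""
    r0 = size // 2
    x, y = np.meshgrid(np.arange(-r0, r0 + 1), np.arange(-r0, r0 + 1))
    r = np.hypot(x, y)                                    # radial coordinate
    f = np.cos(np.pi * tau * r / sigma) * np.exp(-r**2 / (2 * sigma**2))
    return f - f.mean()                                   # zero-mean version

pairs = [(2, 1), (4, 1), (6, 1), (8, 1), (10, 1)]
s_bank = [schmid_filter(s, t) for (s, t) in pairs]
print(len(s_bank))  # 5
```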

In this paper, we use Eq. (9) to normalize the data range to [−1, 1], in order to remove the influence of scale differences between features.

$$ x^{*} = - 1 + 2 \times \frac{{x - x_{\hbox{min} } }}{{x_{\hbox{max} } - x_{\hbox{min} } }},x \in D,x^{*} \in D^{*} $$
(9)

where D is the dataset to be normalized, \( D^{*} \) is the normalized dataset, x and \( x^{*} \) are elements of D and \( D^{*} \), respectively, and \( x_{\hbox{min} } \) and \( x_{\hbox{max} } \) are the smallest and largest elements of D.
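The min-max normalization to [−1, 1] is a one-liner; a sketch (whether scaling is applied globally or per feature channel is an implementation choice not fixed by the paper):

```python
import numpy as np

def normalize_features(feats):
    """Linearly rescale values to [-1, 1]: the minimum maps to -1
    and the maximum maps to +1."""
    f = np.asarray(feats, dtype=np.float64)
    fmin, fmax = f.min(), f.max()
    return -1.0 + 2.0 * (f - fmin) / (fmax - fmin)

out = normalize_features([0.0, 5.0, 10.0])
print(out)  # [-1.  0.  1.]
```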

Balance of the sample.

In the star image, the number of background pixels far exceeds the number of target pixels, by more than one order of magnitude. When the numbers of positive and negative samples are highly imbalanced, a classifier that correctly classifies only the majority class can still achieve high precision. Therefore, to ensure a balanced amount of training samples, we select only a part of the background pixels, from the feature space corresponding to the star image, as the negative samples for training.

Since inconsistent background illumination may influence the performance, we divide the image into several regions and take background pixels randomly from every region, so that the pixel samples are distributed over the whole image. In this paper, every training image is divided into 16 regions, and the same number of background pixels are randomly sampled in each region as negative training samples.
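The region-balanced sampling can be sketched as follows, assuming a 4×4 partition of the label image (the function name and the convention that label 0 marks background are illustrative):

```python
import numpy as np

def sample_background(label_mask, n_per_region, grid=(4, 4), rng=None):
    """Randomly pick the same number of background pixels (label 0) from
    each of the grid[0] x grid[1] image regions; returns (rows, cols)."""
    rng = np.random.default_rng(rng)
    h, w = label_mask.shape
    rows, cols = [], []
    for i in range(grid[0]):
        for j in range(grid[1]):
            r0, r1 = i * h // grid[0], (i + 1) * h // grid[0]
            c0, c1 = j * w // grid[1], (j + 1) * w // grid[1]
            bg = np.argwhere(label_mask[r0:r1, c0:c1] == 0)
            pick = bg[rng.choice(len(bg), size=n_per_region, replace=False)]
            rows.append(pick[:, 0] + r0)   # shift back to full-image coords
            cols.append(pick[:, 1] + c0)
    return np.concatenate(rows), np.concatenate(cols)

mask = np.zeros((64, 64), dtype=int)      # toy label image, all background
r, c = sample_background(mask, n_per_region=10, rng=0)
print(len(r))  # 160 negatives: 10 from each of the 16 regions
```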

SVM Classifier Training.

In the feature space obtained by the above processing, the training data set T can be expressed as:

$$ T = \left\{ {\left( {{\mathbf{x}}_{1} ,{\mathbf{y}}_{1} } \right),\left( {{\mathbf{x}}_{2} ,{\mathbf{y}}_{2} } \right), \ldots \left( {{\mathbf{x}}_{\text{N}} ,{\mathbf{y}}_{\text{N}} } \right)} \right\},{\mathbf{x}}_{i} \in \chi = R^{n} ,1 \le i \le N $$
(10)

where \( {\mathbf{x}}_{i} \) is the n-dimensional feature vector of the ith training sample, \( {\mathbf{y}}_{i} \in \{ - 1, + 1\} \) is the class label of the ith training sample, and N is the number of samples in the training data set.

Each pixel has an n-dimensional feature vector. In this n-dimensional feature space the classes are linearly inseparable, so it is necessary to map the samples into a higher-dimensional space before classifying them.

The original linear space is \( \chi \subset {\mathbf{R}}^{n} ,{\mathbf{x = }}\left( {x^{(1)} ,x^{(2)} , \cdots ,x^{(n)} } \right)^{T} \in \chi \), where \( \chi \) is the low-dimensional input space. The mapped high-dimensional space is \( \text{Z} \subset {\mathbf{R}}^{m} ,{\mathbf{z = }}\left( {z^{(1)} ,z^{(2)} , \cdots ,z^{(m)} } \right)^{T} \in Z,(m > n) \). Here, the radial basis function (RBF) kernel is used to map the nonlinear samples to the high-dimensional space.

$$ K({\mathbf{x}}_{i} ,{\mathbf{x}}_{j} ) = \exp \left( { - \gamma \left\| {{\mathbf{x}}_{i} - {\mathbf{x}}_{j} } \right\|^{2} } \right)\quad \gamma > 0 $$
(11)

where \( {\mathbf{x}}_{i} \) and \( {\mathbf{x}}_{j} \) are feature vectors in the original space. The kernel value equals the inner product of their images \( {\mathbf{z}}_{i} \) and \( {\mathbf{z}}_{j} \) in the high-dimensional space.
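The RBF kernel of Eq. (11) can be computed pairwise between feature vectors; a small self-contained check (the helper name is illustrative):

```python
import numpy as np

def rbf_kernel(X1, X2, gamma):
    """Kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2), Eq. (11)."""
    d2 = ((X1[:, None, :] - X2[None, :, :])**2).sum(-1)  # squared distances
    return np.exp(-gamma * d2)

X = np.array([[0.0, 0.0], [1.0, 0.0]])
K = rbf_kernel(X, X, gamma=1.0)
print(K[0, 0], K[0, 1])  # 1.0 and exp(-1) ~ 0.3679
```

Identical vectors give a kernel value of 1, and the value decays toward 0 as the distance between vectors grows, with γ controlling the decay rate.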

The maximum-margin principle is then used to find the separating hyperplane of Eq. (12) for the training data set.

$$ w^{\text{T}} {\mathbf{z + }}b = 0 $$
(12)

where w and b are the normal vector and the intercept of the classification hyperplane, respectively. The corresponding decision function is Eq. (13).

$$ f(x) = sign\left( {w^{T} {\mathbf{z + }}b} \right) = \left\{ {\begin{array}{*{20}l} 1 \hfill & {w^{T} {\mathbf{z + }}b > 0} \hfill \\ { - 1} \hfill & {w^{T} {\mathbf{z + }}b < 0} \hfill \\ \end{array} } \right. $$
(13)

The training of the SVM classifier minimizes the objective function in Eq. (14). The aim is to maximize the margin and, at the same time, minimize the number of misclassified samples.

$$ \begin{array}{*{20}c} {\mathop {\hbox{min} }\limits_{\omega ,b,\xi } \frac{1}{2}\left\| w \right\|^{2} + C\sum\limits_{i = 1}^{N} {\xi_{i} } } \\ {s.t\,\,\,y_{i} \left( {w^{T} z_{i} + b} \right) \ge 1 - \xi_{i} ,\xi_{i} \ge 0,i = 1,2, \ldots ,N} \\ \end{array} $$
(14)

where \( \xi_{i} \ge 0 \) is the slack variable and \( C > 0 \) is the penalty parameter.
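The soft-margin RBF formulation of Eqs. (11) and (14) is exactly what off-the-shelf SVM solvers implement. A sketch with scikit-learn on synthetic data standing in for the 29-dimensional pixel features (the class means, C and γ values are illustrative, not the paper's):

```python
import numpy as np
from sklearn.svm import SVC

# synthetic two-class data in a 29-D feature space
rng = np.random.default_rng(0)
X_bg = rng.normal(0.0, 0.5, size=(200, 29))   # "background" pixels
X_tg = rng.normal(2.0, 0.5, size=(200, 29))   # "star/target" pixels
X = np.vstack([X_bg, X_tg])
y = np.array([-1] * 200 + [1] * 200)

# soft-margin SVM with RBF kernel: C is the penalty, gamma the kernel width
clf = SVC(kernel="rbf", C=8.0, gamma=0.25)
clf.fit(X, y)
print(clf.score(X, y))  # close to 1.0 on this well-separated toy data
```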

The optimal classification parameters C and \( \gamma \) of the SVM classifier are determined by a two-stage grid search. The parameter ranges are divided into grids according to a certain separation scheme, and the algorithm iterates through the points of each grid to determine the relatively optimal parameters for the classifier.

Larger grids are used in the first search. After the best interval is found, smaller grids are created within that interval for the second search. The specific search method is as follows (Table 1):

Table 1. The process of grid search

Here we use 5-fold cross-validation to evaluate the classification performance of the classifier in a single training run. The original data is divided into five parts, four of which form the training set and the remaining one the validation set. The classifier is trained on the training set and then evaluated on the validation set; the average of the five results is taken as the classification performance of the classifier.
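The coarse-then-fine grid search with 5-fold cross-validation can be sketched with scikit-learn; the grid ranges and step sizes below are illustrative assumptions, and the data is synthetic:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(2, 1, (100, 5))])
y = np.array([0] * 100 + [1] * 100)

# stage 1: coarse grid over powers of two for C and gamma, 5-fold CV
coarse = {"C": 2.0**np.arange(-5, 11, 2), "gamma": 2.0**np.arange(-9, 3, 2)}
gs1 = GridSearchCV(SVC(kernel="rbf"), coarse, cv=5).fit(X, y)

# stage 2: finer grid centered on the best coarse point
c0 = np.log2(gs1.best_params_["C"])
g0 = np.log2(gs1.best_params_["gamma"])
fine = {"C": 2.0**np.arange(c0 - 1, c0 + 1.25, 0.25),
        "gamma": 2.0**np.arange(g0 - 1, g0 + 1.25, 0.25)}
gs2 = GridSearchCV(SVC(kernel="rbf"), fine, cv=5).fit(X, y)
print(gs2.best_params_)
```

The second stage refines the search only inside the interval selected by the first stage, which keeps the total number of cross-validated fits small.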

As shown in Fig. 3, each blue circle represents one grid-search evaluation. When the penalty parameter C is 3 and the kernel parameter \( \gamma \) is −2 (on the logarithmic search grid, since \( \gamma > 0 \)), the classification performance is optimal, with a classification accuracy of 99.626%.

Fig. 3.
figure 3

The results of grid search (Color figure online)

3 Experiments

In this section, more than 3000 labeled star images containing hot pixels, cosmic rays and other noise are processed with the proposed method. The experimental environment is Windows 7 with a Core i3 processor and 6 GB of memory; the program is written in MATLAB R2013a.

3.1 Detection Rate

In this paper, detection is performed at the pixel level, so the detection rate is defined as the proportion of correctly classified target pixels among the ground-truth target pixels (Fig. 4).
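This pixel-level detection rate is simply the recall over target pixels; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def detection_rate(pred_mask, gt_mask):
    """Fraction of ground-truth target pixels that the detector also
    classified as target (pixel-level recall)."""
    gt = np.asarray(gt_mask, dtype=bool)
    pred = np.asarray(pred_mask, dtype=bool)
    return np.count_nonzero(pred & gt) / np.count_nonzero(gt)

gt = np.zeros((10, 10), bool)
gt[4:6, 4:6] = True        # 4 ground-truth target pixels
pred = np.zeros((10, 10), bool)
pred[4:6, 4:5] = True      # detector found 2 of them
print(detection_rate(pred, gt))  # 0.5
```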

Fig. 4.
figure 4

The detection rate of target with different SNR

Fig. 5.
figure 5

The ROC line graph of different methods

The Filters-SVM method consistently achieves a better detection rate when processing the same image sequence: it remains higher than 95% for targets of different SNRs, whereas the average detection rate of adaptive local threshold segmentation drops to 92.36% for low-SNR targets.

The ROC curve of our method lies above those of the other methods, and the AUC value of Filters-SVM is 0.97, greater than that of adaptive local threshold segmentation (0.96) and of the Gray-SVM method (0.89).

3.2 Shape Retention Ability

Haralick [14] established four criteria for evaluating segmentation performance. (a) For features such as gray level and texture, each region should be coherent and uniform. (b) Each region should be simple and free of many holes. (c) Adjacent regions should differ significantly. (d) The boundary of each region should be simple and smooth, and spatially accurate.

The target shape is a key factor in target centroid localization: the more complete the target shape, the more accurate the centroid positioning. The shape retention of the three methods is analyzed, and the results are shown in Fig. 6.

Fig. 6.
figure 6

The results of detection with different method

The proposed method (Filters-SVM) is compared with adaptive local threshold segmentation (TS) and gray-level-based SVM (Gray-SVM), which uses only the gray level as the feature to train the SVM classifier. The results demonstrate that the Filters-SVM method (in Fig. 6) misclassifies the fewest background noise pixels, and the edges of the detected targets are smooth.

The centroid position of the target in the image is later used to determine the orbit, and a very small centroid error can lead to a track position error of several kilometers. In this paper, we therefore also evaluate the detection methods by their centroid positioning error, using the energy accumulation centroid positioning method proposed in [15] (Table 2).

Table 2. The error of centroid positioning

3.3 Speed

Due to the real-time requirements of dim small target detection in star images, the processing time of a single frame is also an important criterion for a small target detection method. In this experiment, the single-frame processing time is analyzed.

Table 3 shows the single-frame processing time of the three target detection methods. The result of adaptive local threshold segmentation contains many wrongly classified pixels and is generally followed by a post-processing step to remove them; this experiment therefore also considers the impact of post-processing on the single-frame processing time. The Filters-SVM method meets the requirements of real-time processing.

Table 3. The average processing time in single frame

4 Conclusion

This paper proposes a detection method for low-SNR star points in star images based on high-dimensional features. The target detection task is solved as a pixel classification problem. First, the designed filter bank is used to extract high-dimensional features; then the classifier is trained on these features; finally, potential targets and stars are detected by the trained classifier. The experimental results show that our method can detect the dim small targets and stars in star images while effectively suppressing noise interference. Weak targets with small spots, low brightness and low signal-to-noise ratio are well extracted.