1 Introduction

LAMOST, the Large Area Multi-Object Fiber Spectroscopic Telescope, can obtain data for 4000 targets simultaneously in one observation. More details about LAMOST are given in [1, 2]. The light from the celestial targets is transmitted through two different cameras [3] and imaged onto the slit of each spectrograph; the spectra are finally recorded by CCD cameras.

Processing two-dimensional spectral images involves many steps [4]. Flux extraction is one of the most essential, because it strongly affects all subsequent processing. The energy of each spectrum spreads not only in the spatial direction but also, slightly, in the wavelength direction. Flux extraction aims to recover the flux of each spectrum at each pixel along these two directions.

Current flux extraction methods fall into two kinds: one-dimensional (1D) and two-dimensional (2D). 2D methods [5, 6], such as profile fitting, choose an appropriate function to model the profile of each spectrum in both the spatial and the wavelength direction. However, the variation of energy along the wavelength direction carries the spectral information of the target object, so extracting flux in the wavelength direction may damage the structure and integrity of that information. Therefore, the methods applied in LAMOST are still based on 1D data.

The original 1D method is aperture extraction [7], which chooses an appropriate aperture around each fiber center and sums all the flux within the aperture along the spatial direction. The method is simple and fast. However, it depends strongly on the aperture size and cannot resolve cross-talk [8] between adjacent fibers. An optimal aperture method [9] was later introduced, which assigns a different weight to each pixel. It improves the signal-to-noise ratio (SNR), but it also depends heavily on the aperture parameter settings.
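As a rough illustration of aperture extraction and of its sensitivity to the aperture size, the following minimal sketch sums one CCD row inside a window around a fiber center. The function name `aperture_extract`, the sample row and the half-widths are hypothetical and are not taken from the LAMOST pipeline.

```python
def aperture_extract(row, center, half_width):
    """Sum the flux of one CCD row inside [center - half_width, center + half_width]."""
    lo = max(0, int(round(center - half_width)))
    hi = min(len(row) - 1, int(round(center + half_width)))
    return sum(row[lo:hi + 1])


# Hypothetical row: one fiber profile centred on pixel 5,
# followed by the rising wing of a neighbouring fiber (pixels 9 and 10).
row = [0, 0, 1, 4, 9, 12, 9, 4, 1, 3, 8]

narrow = aperture_extract(row, center=5, half_width=2)  # pixels 3..7
wide = aperture_extract(row, center=5, half_width=4)    # pixels 1..9, includes neighbour flux
```

In this toy row the wider aperture collects more of the fiber's own wings but also picks up flux from the neighbouring fiber, which is the cross-talk problem mentioned above.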

Another method is profile fitting [10], which chooses an appropriate function, usually a Gaussian, to approximate the profile in the spatial direction and then obtains the flux at each pixel from the fitted function. This method can overcome the cross-talk problem, but it has a disadvantage. Each spectral image contains 250 spectra, and each fiber has a different profile function at each wavelength. Profile fitting is therefore time-consuming and not well suited to the massive LAMOST spectral data.
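To illustrate why fitting profiles can separate adjacent fibers, the following minimal sketch recovers the fluxes of two overlapping Gaussian profiles by linear least squares. It assumes that the centers and the common width are already known, which is a simplification of a real profile fit; the function names, the centers (8 and 14) and the width (2 pixels) are hypothetical and are not LAMOST values.

```python
import math


def gaussian(x, center, sigma):
    """Unit-area Gaussian profile evaluated at pixel x."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def fit_two_fibers(row, c1, c2, sigma):
    """Least-squares fluxes of two overlapping Gaussian profiles (known centers and width)."""
    p1 = [gaussian(x, c1, sigma) for x in range(len(row))]
    p2 = [gaussian(x, c2, sigma) for x in range(len(row))]
    # Normal equations of the 2-parameter linear model row = f1 * p1 + f2 * p2.
    s11 = sum(a * a for a in p1)
    s22 = sum(b * b for b in p2)
    s12 = sum(a * b for a, b in zip(p1, p2))
    t1 = sum(a * d for a, d in zip(p1, row))
    t2 = sum(b * d for b, d in zip(p2, row))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det


# Synthetic, noise-free row with two overlapping fibers of flux 1000 and 400.
row = [1000.0 * gaussian(x, 8.0, 2.0) + 400.0 * gaussian(x, 14.0, 2.0) for x in range(24)]
flux1, flux2 = fit_two_fibers(row, 8.0, 14.0, 2.0)
```

Because both profiles are fitted together, the flux that leaks from one fiber into its neighbour is attributed to the correct fiber. Repeating such a fit for every fiber at every wavelength is what makes the approach slow.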

An improved approach uses an RBF neural network [11], with a Gaussian function as the radial basis function, to fit the profiles. The method works well, but it cannot achieve a higher SNR and a lower time cost simultaneously.

In this paper, we propose an improved method that extracts spectral flux with a GRNN (General Regression Neural Network), a method frequently used to fit nonlinear functions. GRNN is an improved RBF network with four layers. We compare our method with the RBF method using the same radial basis function and the same parameters. The results show that the proposed method not only achieves higher precision and a higher signal-to-noise ratio but also costs less time.

The rest of the paper is organized as follows. The principle and architecture of GRNN are detailed in Sect. 2. The flux extraction algorithm is presented in Sect. 3. Experimental results and comparisons are given in Sect. 4, followed by a conclusion in Sect. 5.

2 The Principle and Architecture of GRNN

2.1 The Principle of GRNN

The General Regression Neural Network, which is based on nonlinear regression analysis [12], has been commonly used for nonlinear fitting in recent years. GRNN has a strong nonlinear mapping ability and a fast learning speed. It does not need a weight training process, so it is faster than RBFNN, which requires a layer decision and a time-consuming training phase. In addition, GRNN is able to deal with unstable data.

Assume there are two random variables \( {\text{x}} \) and \( {\text{y}} \), and let \( {\text{f}}\left( {{\text{x, y}}} \right) \) denote their joint continuous probability density function. If the observed value of \( {\text{x}} \) is \( {\text{x}}_{0} \), the regression of \( {\text{y}} \) on \( {\text{x}}_{0} \), that is, the conditional expectation of \( {\text{y}} \) given \( {\text{x}}_{0} \), is:

$$ {\text{E}}\left( {{\text{y|x}}_{0} } \right) = \frac{{\mathop \smallint \nolimits_{ - \infty }^{ + \infty } {\text{yf}}\left( {{\text{x}}_{0} , {\text{y}}} \right){\text{dy}}}}{{\mathop \smallint \nolimits_{ - \infty }^{ + \infty } {\text{f}}\left( {{\text{x}}_{0} , {\text{y}}} \right){\text{dy}}}} $$
(1)

Here \( {\text{y}}\left( {{\text{x}}_{0} } \right) = {\text{E}}\left( {{\text{y|x}}_{0} } \right) \) is the predicted output. However, the density function \( {\text{f}}\left( {{\text{x}}_{0} ,{\text{y}}} \right) \) is usually unknown. According to non-parametric estimation, we estimate \( {\text{f}}\left( {{\text{x}}_{0} , {\text{y}}} \right) \) from the sample set \( \left\{ {x_{i} ,y_{i} } \right\}_{i = 1}^{n} \) as follows:

$$ {\text{f}}\left( {\text{x}}_{0} , {\text{y}} \right) = \frac{1}{{\text{n}}\left( 2\uppi \right)^{\frac{{\text{p}} + 1}{2}} \upsigma^{{\text{p}} + 1}} \sum\limits_{{\text{i}} = 1}^{\text{n}} {\text{e}}^{ - {\text{d}}\left( {\text{x}}_{0} , {\text{x}}_{\text{i}} \right)} {\text{e}}^{ - {\text{d}}\left( {\text{y}} , {\text{y}}_{\text{i}} \right)} $$
(2)
$$ {\text{d}}\left( {\text{x}}_{0} , {\text{x}}_{\text{i}} \right) = \sum\limits_{{\text{j}} = 1}^{\text{p}} \left[ \left( {\text{x}}_{0{\text{j}}} - {\text{x}}_{\text{ij}} \right) / \upsigma \right]^{2} , \quad {\text{d}}\left( {\text{y}} , {\text{y}}_{\text{i}} \right) = \left[ {\text{y}} - {\text{y}}_{\text{i}} \right]^{2} $$
(3)

where \( p \) is the dimension of the variable \( x \) and \( n \) is the number of observed samples. Substituting this estimate into Eq. (1), the final output can be calculated by:

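$$ {\text{y}}\left( {\text{x}}_{0} \right) = \frac{\sum\nolimits_{{\text{i}} = 1}^{\text{n}} \left( {\text{e}}^{ - {\text{d}}\left( {\text{x}}_{0} , {\text{x}}_{\text{i}} \right)} \int\nolimits_{ - \infty }^{ + \infty } {\text{y}}\,{\text{e}}^{ - {\text{d}}\left( {\text{y}} , {\text{y}}_{\text{i}} \right)} {\text{dy}} \right)}{\sum\nolimits_{{\text{i}} = 1}^{\text{n}} \left( {\text{e}}^{ - {\text{d}}\left( {\text{x}}_{0} , {\text{x}}_{\text{i}} \right)} \int\nolimits_{ - \infty }^{ + \infty } {\text{e}}^{ - {\text{d}}\left( {\text{y}} , {\text{y}}_{\text{i}} \right)} {\text{dy}} \right)} $$
(4)

Since \( \int\nolimits_{ - \infty }^{ + \infty } {\text{y}}\,{\text{e}}^{ - {\text{d}}\left( {\text{y}} , {\text{y}}_{\text{i}} \right)} {\text{dy}} = \sqrt{\uppi}\,{\text{y}}_{\text{i}} \) and \( \int\nolimits_{ - \infty }^{ + \infty } {\text{e}}^{ - {\text{d}}\left( {\text{y}} , {\text{y}}_{\text{i}} \right)} {\text{dy}} = \sqrt{\uppi} \), the common factor \( \sqrt{\uppi} \) cancels, which gives:

$$ {\text{y}}\left( {\text{x}}_{0} \right) = \frac{\sum\nolimits_{{\text{i}} = 1}^{\text{n}} {\text{y}}_{\text{i}}\,{\text{e}}^{ - {\text{d}}\left( {\text{x}}_{0} , {\text{x}}_{\text{i}} \right)}}{\sum\nolimits_{{\text{i}} = 1}^{\text{n}} {\text{e}}^{ - {\text{d}}\left( {\text{x}}_{0} , {\text{x}}_{\text{i}} \right)}} $$
(5)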

2.2 The Architecture of GRNN

GRNN is a variant of RBFNN [13] with four layers. The first is the input layer, which in this paper receives the input spectral signal. The second is the hidden layer, also called the radial basis layer, which applies a chosen transfer function; many transfer functions exist, and the most commonly used one is the Gaussian basis function. The third is the summation layer, and the last is the output layer. The architecture of the network is shown in Fig. 1.

Fig. 1. The network architecture of GRNN.
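The four layers can be sketched as a single forward pass. The code below is a minimal illustration with a Gaussian kernel that uses the squared scaled distance \( {\text{d}} \) of Eq. (3); the function name `grnn_predict` and the toy samples are hypothetical, and the sketch is not the implementation used in the experiments.

```python
import math


def grnn_predict(x0, samples_x, samples_y, sigma):
    """GRNN output at x0: kernel-weighted average of the stored sample outputs."""
    # Input layer: x0 is a list of p input values.
    # Radial basis (hidden) layer: one Gaussian unit per stored sample.
    weights = []
    for xi in samples_x:
        d = sum(((a - b) / sigma) ** 2 for a, b in zip(x0, xi))
        weights.append(math.exp(-d))
    # Summation layer: weighted sum of sample outputs and plain sum of weights.
    numerator = sum(w * y for w, y in zip(weights, samples_y))
    denominator = sum(weights)
    # Output layer: ratio of the two sums.
    return numerator / denominator


# Toy one-dimensional samples lying on the line y = x.
samples_x = [[0.0], [1.0], [2.0]]
samples_y = [0.0, 1.0, 2.0]

smooth = grnn_predict([1.0], samples_x, samples_y, sigma=1.0)
sharp = grnn_predict([0.0], samples_x, samples_y, sigma=0.1)
```

No weights are trained: the samples are stored directly and \( \upsigma \) is the only free parameter, which is why no iterative training phase is needed. If `x0` lies very far from all samples, the denominator can underflow to zero, so a practical version would need to guard against that case.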

3 Flux Extraction Algorithm

3.1 Preprocessing the Spectra

The spectral images processed in this paper are flat-field spectral images, as shown in Fig. 2. The vertical and horizontal directions are the wavelength and spatial directions, respectively; the spatial direction is also the flux extraction direction. The flat-field images are preprocessed by bias subtraction and fiber center tracing [14].

Fig. 2. Part of a complete flat-field image.

3.2 The Network Structure and Training Algorithm

As introduced above, there are 250 fiber spectra on each CCD image, and these 250 spectra are the input data. The second step is to choose an appropriate radial basis function. In this paper, we use the double Gaussian function [15] to extract flux and verify the feasibility of the algorithm. The double Gaussian function (DGF) is as follows:

$$ {\text{G}}\left( {\text{x}} \right) = \frac{\text{A}}{{\sqrt {2\uppi } \upsigma }}\exp \left\{ {\frac{{ - \left( {{\text{x}} - {\text{x}}_{\text{c}} -\Delta {\text{x}}} \right)^{2} }}{{2\upsigma^{2} }}} \right\} + \frac{\text{A}}{{\sqrt {2\uppi } \upsigma }}\exp \left\{ {\frac{{ - \left( {{\text{x}} - {\text{x}}_{\text{c}} +\Delta {\text{x}}} \right)^{2} }}{{2\upsigma^{2} }}} \right\} $$
(6)

where

$$ {\text{A}} = {\text{F}}_{\text{peak}} \cdot \frac{{\sqrt {2\uppi } \upsigma }}{2} \cdot { \exp }\left\{ {\frac{{ - \left( {\Delta {\text{x}}} \right)^{2} }}{{2\upsigma^{2} }}} \right\}^{ - 1} $$
(7)

The two single Gaussian functions have the same amplitude A; \( {\text{x}}_{\text{c}} \) is the center of each fiber; \( \Delta {\text{x}} \) is the offset of each Gaussian component from the center \( {\text{x}}_{\text{c}} \); \( \upsigma \) is the standard deviation (width) of each Gaussian component; and \( {\text{F}}_{\text{peak}} \) is the peak value of each spectral profile.
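Equations (6) and (7) can be checked numerically with a short sketch. The function names and the sample values below are hypothetical; the code only evaluates the two formulas as written.

```python
import math


def dgf_amplitude(f_peak, dx, sigma):
    """Amplitude A of Eq. (7), chosen so that G(x_c) equals f_peak."""
    return f_peak * math.sqrt(2.0 * math.pi) * sigma / 2.0 / math.exp(-dx ** 2 / (2.0 * sigma ** 2))


def double_gaussian(x, x_c, dx, sigma, f_peak):
    """Double Gaussian of Eq. (6): two equal Gaussians centred at x_c - dx and x_c + dx."""
    a = dgf_amplitude(f_peak, dx, sigma)
    norm = a / (math.sqrt(2.0 * math.pi) * sigma)
    left = math.exp(-((x - x_c + dx) ** 2) / (2.0 * sigma ** 2))
    right = math.exp(-((x - x_c - dx) ** 2) / (2.0 * sigma ** 2))
    return norm * (left + right)


# Hypothetical fiber centred on pixel 10 with a peak value of 500.
at_center = double_gaussian(10.0, 10.0, 1.7, 1.7, 500.0)
left_wing = double_gaussian(7.0, 10.0, 1.7, 1.7, 500.0)
right_wing = double_gaussian(13.0, 10.0, 1.7, 1.7, 500.0)
```

With this choice of A the profile equals \( {\text{F}}_{\text{peak}} \) at \( {\text{x}}_{\text{c}} \) and is symmetric about it, so only \( \Delta {\text{x}} \) and \( \upsigma \) control the shape.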

Three parameters need to be determined: \( {\text{x}}_{\text{c}} \), \( \Delta {\text{x}} \) and \( \upsigma \). As mentioned above, \( {\text{x}}_{\text{c}} \) is a constant value for each profile, so the crucial step is to determine \( \Delta {\text{x}} \) and \( \upsigma \). For the single Gaussian function, the initial \( \upsigma \) is 3.5 (it depends on the FWHM). For the double Gaussian function used in this paper, we set the initial values of \( \Delta {\text{x}} \) and \( \upsigma \) both to 1.7. We use the following formulas to adjust \( \Delta {\text{x}} \) and \( \upsigma \):

$$ {\text{error}}_{\text{cen}} = \frac{{{\text{F}}_{\text{cen}} - {\text{G}}_{\text{cen}} }}{{{\text{F}}_{\text{cen}} }}, \quad {\text{error}}_{\text{wing}} = \frac{{{\text{F}}_{\text{wing}} - {\text{G}}_{\text{wing}} }}{{{\text{F}}_{\text{wing}} }} $$
(8)
$$ \Delta {\text{x}} =\Delta {\text{x}} \cdot \left( {1 + {\text{error}}_{\text{cen}} } \right)\quad \upsigma = \upsigma \cdot \left( {1 + {\text{error}}_{\text{wing}} } \right) $$
(9)

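where \( {\text{F}}_{\text{cen}} \) and \( {\text{G}}_{\text{cen}} \) are the actual flux around the center and its approximation by the DGF, respectively, and \( {\text{F}}_{\text{wing}} \) and \( {\text{G}}_{\text{wing}} \) are the corresponding values in the wings of the profile. The two relative errors are used to adjust \( \Delta x \) and \( \upsigma \).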
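One adjustment step of Eqs. (8) and (9) can be written as a small function. This is a minimal sketch: the name `update_parameters` and the sample flux values are hypothetical, and how the center and wing fluxes are measured on the image is left to the caller.

```python
def update_parameters(dx, sigma, f_cen, g_cen, f_wing, g_wing):
    """Apply Eqs. (8) and (9) once and return the adjusted (dx, sigma)."""
    error_cen = (f_cen - g_cen) / f_cen        # relative error around the center
    error_wing = (f_wing - g_wing) / f_wing    # relative error in the wings
    return dx * (1.0 + error_cen), sigma * (1.0 + error_wing)


# Start from the initial values used in the text.
dx0, sigma0 = 1.7, 1.7

# Hypothetical case: the model is 10% too low at the center and 10% too high in the wings.
dx1, sigma1 = update_parameters(dx0, sigma0, f_cen=100.0, g_cen=90.0, f_wing=50.0, g_wing=55.0)

# When the model already matches the data, the parameters are left unchanged.
dx_same, sigma_same = update_parameters(dx0, sigma0, 100.0, 100.0, 50.0, 50.0)
```

In this example \( \Delta {\text{x}} \) grows to about 1.87 and \( \upsigma \) shrinks to about 1.53. The step would presumably be repeated until both relative errors are small; the source does not state a stopping criterion.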

4 The Experimental Results and Analysis

In this section, we compare our method with the aperture and RBFNN methods. Some of the flux extraction results are shown in Figs. 3 and 4.

Fig. 3. Flux extraction results of the aperture, RBF and GRNN methods.

Fig. 4. Flux extraction results of the aperture, RBF and GRNN methods.

To evaluate the proposed method, we consider two aspects: the signal-to-noise ratio (SNR) and the time consumption. We first use the same method as [15] to calculate the SNR:

$$ {\text{SNR}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \frac{{F_{M}^{i} }}{{\left| {F_{i} - F_{M}^{i} } \right|}} $$

where \( N \) is the number of selected pixels, \( F_{M}^{i} \) is the corresponding continuum flux, calculated by a median filter with a width of 11 pixels, and \( F_{i} \) is the extracted flux of the \( i{\text{th}} \) pixel. The experimental results are shown in Fig. 5. We calculate the average SNR of the three methods in different pixel areas; as shown in Table 1, the SNR of the proposed method is higher than that of the RBF and aperture methods.

Fig. 5. The SNR of the three methods for some of the spectra.
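The SNR measure defined above can be sketched as follows. The continuum is a running median over 11 pixels, as in the text. Two choices are assumptions of this sketch and not of the source: only pixels where the full window fits are used, and pixels whose flux equals the continuum exactly are skipped to avoid division by zero. The function name `mean_snr` and the test spectrum are hypothetical.

```python
from statistics import median


def mean_snr(flux, width=11):
    """Mean of F_M / |F - F_M| over pixels, with F_M a running median of `width` pixels."""
    half = width // 2
    ratios = []
    # Assumption: use only pixels where the full median window fits.
    for i in range(half, len(flux) - half):
        continuum = median(flux[i - half:i + half + 1])
        residual = abs(flux[i] - continuum)
        if residual > 0:  # Assumption: skip pixels with zero residual.
            ratios.append(continuum / residual)
    if not ratios:
        raise ValueError("no pixel with a non-zero residual")
    return sum(ratios) / len(ratios)


# Hypothetical extracted spectrum: a flat continuum near 100 with alternating +/-1 deviations.
flux = [101.0 if i % 2 == 0 else 99.0 for i in range(21)]
snr = mean_snr(flux)
```

For this toy spectrum the result is close to 50, as expected for deviations of about 2 relative to a continuum near 100.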

Table 1. The average SNR of aperture, RBF and GRNN methods

Moreover, we compare the time consumption of the three methods, as shown in Table 2. Although the aperture method has a lower time consumption, it cannot improve the SNR of flux extraction. The GRNN method costs less time than the RBF method and also achieves a higher SNR.

Table 2. The time consumption of three methods

5 Conclusion

In this paper, we proposed a novel method that uses GRNN to extract the flux of LAMOST spectral data. We use the double Gaussian basis function to approximate the profile of the spectra in the spatial direction. Comparative experiments were carried out on the same spectral data. The experimental results show that the proposed GRNN method achieves a higher SNR than the aperture and RBFNN methods and a lower time cost than the RBFNN method. It is therefore more suitable for LAMOST, which produces such massive data.