Introduction

Feature extraction from remotely sensed data is an important step in many remote sensing applications, such as band-to-band registration, geometric correction, and image normalization; consequently, it has received considerable attention [1]. Over the past decades, numerous studies have proposed local feature extraction techniques, such as the scale-invariant feature transform (SIFT) algorithm [1, 2]. These methods also include Speeded Up Robust Features (SURF) by Bay et al. [2] and the Gradient Location and Orientation Histogram described by Mikolajczyk and Schmid [3].

Image matching

Image matching underlies several problems in remote sensing and computer vision, including object recognition, band registration, motion tracking, and geometric correction of images [1]. Lowe [4] described image features with properties that make them well suited to matching images of an object. These features are invariant to image rotation and scaling and partially invariant to changes in viewpoint and illumination. They are well localized in both the frequency and spatial domains, which reduces the probability of disruption by clutter, occlusion, or noise. Large numbers of features can be extracted from different satellite images using efficient approaches.

Ground control points

Ground control points (GCPs) are ground-truth data collected in the study area during fieldwork using instruments such as GPS receivers. In contrast, control points (CPs) are selected from remote sensing imagery or from Google Earth [2, 3]. Collecting GCPs is a critical stage in many remote sensing applications, and this step has therefore received considerable attention [1, 2]. A robust and flexible technique for extracting GCPs automatically is thus always needed, as improving the automatically extracted GCPs in turn improves application accuracy [1]. The refined GCPs can then be used to determine the transformation coefficients in different types of remote sensing applications. Several algorithms have been used to perform feature extraction, such as those reported in [1,2,3,4].

The SIFT algorithm

The SIFT algorithm was used in this study because it is one of the most effective algorithms for extracting features from images automatically [4,5,6,7,8,9]. However, applying SIFT to remote sensing imagery, such as multi-sensor and/or near-equatorial orbit (NEqO) satellite images, performs poorly or fails completely, producing false CPs that lead to matching errors between sequential images of the same features. The SIFT CPs are the control points extracted by SIFT; these CPs serve as reference points for geometric correction and/or image registration. Images captured at different times span a wide range of frequencies (pixel intensity values), and images captured at different frequencies exhibit different responses [10,11,12,13]. Therefore, the correctness of matched CP pairs is important [3, 4, 14]. Location errors in SIFT CPs are very common [4, 9, 15, 16], and finding an accurate method of refining CP quality is a difficult task [16,17,18]. Remotely sensed imagery, especially from near-equatorial satellites and multiple sensors, contains nonlinear geometric distortions [9, 16, 17, 19, 20].

Modified SIFT approaches

These errors are non-systematic, and it is impossible to overcome them by collecting CPs for NEqO images with conventional techniques because of differences in altitude (sensor and topographic terrain), attitude (pitch, roll, and yaw), capture time, illumination, viewpoint, sun zenith and azimuth, and sensor zenith and azimuth during image capture [16, 21, 22]. Modified SIFT approaches have been widely adopted for different types of data, such as synthetic aperture radar (SAR) imagery [3]. The extraction of CPs from remote sensing images with the SIFT algorithm has also been improved. Shragai et al. [23] applied the SIFT approach to extract CPs from aerial imagery with good results. Wessel et al. [24] modified a technique to extract GCPs for near-real-time SAR images and integrated it with a digital elevation model (DEM). Liu et al. [25] modified SIFT into bilateral filter SIFT (BFSIFT), which uses a bilateral filter instead of the Gaussian filter in the pyramid construction. Liu and Yu [26] used the SIFT algorithm to match the sensed and reference images after performing edge extraction on SAR images. Chureesampant and Susaki [27] compared the performance of SIFT on SAR images in different polarizations. These works provide a picture of the challenges facing SIFT approaches with remote sensing images.

Objective of this research

The NEqO satellite system is a very new generation of optical satellite; unfortunately, no online publications were found that used RazakSAT satellite imagery, and the satellite was lost in space after transmitting some images. Therefore, a robust and flexible technique is necessary to automatically extract CPs. The objective of this study is to propose a new methodology that uses the extracted CPs to improve feature extraction, leading to more accurate results when performing remote sensing applications such as pattern recognition, band-to-band registration, and geometric correction on images captured by different satellite systems.

Materials and methods

A new methodology to automatically refine and improve the generated SIFT features is presented in this research and is described in Fig. 1. The proposed approach starts by selecting the reference and slave images, after which image compression is performed on both. Next, a sharpening filter is applied. The SIFT algorithm is then applied to generate features and/or CPs. After that, the generated CPs are refined by employing the sum of absolute differences (SAD) algorithm, which measures the similarity in brightness values (intensities) between the CPs in the reference and slave images, so that bad CPs and errors in image matching are avoided. Finally, the refined SIFT CPs are evaluated by comparing their results with those of the original SIFT.

Fig. 1

Flowchart of the methodology of improving feature extraction

Study area description and dataset

The study area for this research is located in Penang state in northwest Malaysia, between 100°09′08″–100°21′29″E and 05°13′04″–05°30′00″N, as indicated in Fig. 2. Penang has an area of about 1048 km², and its population reached 1.767 million in 2018. Malaysian RazakSAT sensor imagery was adopted in this article. RazakSAT is a new-generation optical NEqO satellite system orbiting at an altitude of about 685 km [28]. A RazakSAT image covers an area of approximately 20 × 100 km and has four multispectral channels (red, green, blue, and near-infrared) and one panchromatic channel [18]. The spatial resolutions of the spectral and panchromatic bands are 5 and 2.5 m, respectively [29]. The image used here was captured over Malaysia on August 1, 2009, and covers part of Penang Island. Table 1 lists the wavelengths of the RazakSAT image bands. The RazakSAT near-equatorial satellite traverses the Earth in an equatorial orbit within ± 10° north and south of the equator [30]. Figure 3 shows the NEqO satellite image bands of the RazakSAT image obtained over the study area. The processed RazakSAT image has only 10% cloud cover.

Fig. 2

Study area in Penang Island

Table 1 RazakSAT satellite image bandwidth specifications
Fig. 3

NEqO satellite image bands of the RazakSAT satellite

Using remotely sensed data from an NEqO satellite system has many advantages, especially in equatorial countries. These countries lie in the tropics, where skies are cloudy for much of the day and high humidity gives rise to various kinds of risk. It is therefore difficult to monitor the environment and hazards such as floods, landslides, and earthquakes in these regions. A new generation of satellites is needed that can monitor these countries during the day and supply around 14 images per day [3, 28]. NEqO satellite images are not widely available because this kind of satellite is very new, and until now its images have been used mainly in the private sector.

Dataset processing and analyzing

Master and sensed image selection

Imagery from the NEqO satellite system needs to be corrected geometrically; however, this is difficult because of the nonlinear distortion. The first step of the proposed technique is the selection of master and sensed images. One of the difficulties encountered in this study was selecting the reference image because only one satellite image was available. Fortunately, the bands of this image exhibit high (nonlinear) distortion, which suits the purpose of this study. Each band of the image is treated as an individual image, and these four bands are used to implement the refined SIFT method. All the bands have a similar amount of noise, skewness, stretching, and rotation, with only slight differences observed across the study area [31, 32]. However, the green band shows the fewest defects and was selected as the reference (master) image, while the remaining red (R), near-infrared (NIR), and blue (B) bands are treated as slave images. In addition, the green band has less noise (missing areas) than the others, as indicated in Fig. 4, where circles mark the noise in each band relative to the green band image.

Fig. 4

RazakSAT image bands noise

Image compression

The second stage of this research is compression of the image bands. Image compression reduces the amount of data required to represent imagery by encoding the original image in fewer bits [32]. This is an important stage of this study simply because the RazakSAT image had a large storage size of more than 3.5 gigabytes; MATLAB software was used for this stage. Data of this size are difficult to process and require large drive storage, which increases processing time and computer storage usage. To reduce the storage requirements and processing time, both the reference band and the sensed bands of the RazakSAT image were first converted to grayscale, normalizing the values to the interval 0–255. Image compression was then performed in the MATLAB environment: all bands were converted to JPEG format to minimize their size [26]. The processing time before image compression was 2 days; after band compression, it dropped dramatically to only 2 min.

This procedure made the processing of the images more flexible and reliable. The specifications of the laptop computer that we used are as follows:

  1. The RAM was 8 gigabytes.
  2. The memory card was 8 gigabytes.
  3. The CPU was a Core i7.
  4. The storage size was 1 terabyte.
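As a rough illustration of the grayscale normalization step described above (not the authors' MATLAB code), a band can be linearly rescaled to the 0–255 interval as follows; `to_grayscale_8bit` is a hypothetical name, and the 12-bit input band is synthetic:

```python
import numpy as np

def to_grayscale_8bit(band: np.ndarray) -> np.ndarray:
    """Linearly rescale a raw band to the 0-255 interval (8-bit grayscale)."""
    band = band.astype(np.float64)
    lo, hi = band.min(), band.max()
    if hi == lo:                       # flat band: map everything to zero
        return np.zeros(band.shape, dtype=np.uint8)
    scaled = (band - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)

# Example: a synthetic 12-bit band (values 0-4095)
raw = np.array([[0, 1024], [2048, 4095]], dtype=np.uint16)
gray = to_grayscale_8bit(raw)
```

The rescaled 8-bit array could then be written out as JPEG with any imaging library to obtain the size reduction described in the text.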

Image sharpening

For research such as the current study, filtering the satellite images with a sharpening filter makes CP extraction easier and more effective [2, 32, 33]. Hence, a high-pass filter (HPF) was selected to filter the RazakSAT image bands. The HPF is one of the basic filters used for sharpening. Image sharpening enhances the contrast between adjoining areas that have little variation in brightness or darkness. The HPF works by retaining the high-frequency information in an image while reducing the low-frequency information [32, 33]: it removes the image's low-frequency components while the high-frequency components remain. The default HPF in ENVI 5.0 Classic uses a 3 × 3 kernel with a value of 8 for the center pixel and − 1 for all exterior pixels. Figure 5 illustrates the image bands before and after applying the HPF.
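The 3 × 3 kernel described above (center weight 8, exterior weights − 1) can be applied as a plain convolution. The numpy sketch below is illustrative only, not ENVI's implementation; edge pixels are handled by replicating the border:

```python
import numpy as np

# 3x3 high-pass kernel: centre weight 8, all exterior weights -1
HPF_KERNEL = np.array([[-1, -1, -1],
                       [-1,  8, -1],
                       [-1, -1, -1]], dtype=np.float64)

def high_pass_filter(img: np.ndarray) -> np.ndarray:
    """Convolve an image with the 3x3 HPF kernel (border replicated)."""
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += HPF_KERNEL[dy, dx] * padded[dy:dy + img.shape[0],
                                               dx:dx + img.shape[1]]
    return out

flat = np.full((5, 5), 7.0)                    # uniform region -> zero response
step = np.zeros((5, 5)); step[:, 3:] = 10.0    # vertical edge -> strong response
```

On a uniform region the response is zero (8·v − 8·v), while intensity edges produce large values, which is exactly the sharpening behavior described in the text.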

Fig. 5

Filtered RazakSAT image bands: a the image bands before applying the HPF, b the image bands after applying the HPF

Automatic feature extraction

For this study, we employed the SIFT approach, which transforms imagery into scale-invariant relative coordinates, as reported by Lowe [4, 5]. An important aspect of this procedure is that it creates large numbers of features that cover the imagery over all locations and scales [3, 4, 34]. Feature quantity is important for object recognition; reliably detecting small objects in cluttered backgrounds requires at least three features per object [5]. The CPs are robust to changes in image scaling, skewing, illumination, and rotation, as well as changes in viewpoint [4, 5]. SIFT has been applied in many fields, including computer vision, remote sensing, object recognition, medicine, and robotics [21, 27, 34]. CPs were automatically extracted from the reference and sensed bands using a feature-based approach employing the SIFT algorithm in three steps. In the first step, CPs were extracted from the reference image and stored in a database. In the second step, SIFT features were extracted from the sensed image.

The major stages of computation used to generate a set of image features through the use of the SIFT algorithm are presented below [4, 5]:

  1. Scale-space extrema detection: CPs are detected using a cascade filtering approach that employs efficient algorithms to identify candidate CP locations, that is, locations and scales that can be repeatedly assigned under differing views of the same object. The scale space of an image is defined as a function L(X, Y, σ) produced by the convolution of a variable-scale Gaussian G(X, Y, σ) with the input image I(X, Y):

$$L\left( {X,Y,\sigma } \right) = G\left( {X, Y, \sigma } \right)*I\left( {X, Y} \right)$$
(1)

where * denotes the convolution operation in X and Y, and

$$G\left( {X, Y, \sigma } \right) = \frac{1}{{2\pi \sigma^{2} }}e^{{ - \left( {x^{2} + y^{2} } \right)/2\sigma^{2} }}$$
(2)

A method to efficiently detect stable CP locations in scale space was proposed by Lowe [5] through the adoption of scale-space extrema in the difference-of-Gaussian function convolved with the image, D(X, Y, σ), which can be computed from the difference of two nearby scales separated by a constant multiplicative factor k:

$$D\left( {X, Y, \sigma } \right) = \left( {G\left( {X, Y,k\sigma } \right) - G\left( {X, Y, \sigma } \right)} \right)*I\left( {X,Y} \right)$$
(3)
$$D\left( {X, Y, \sigma } \right) = L\left( {X, Y,k\sigma } \right) - L\left( {X, Y, \sigma } \right)$$
(4)

There are several reasons to select this function. First, it is efficient to compute: the smoothed images L must be computed in any case for the scale-space feature description, so D can be obtained by simple image subtraction. Second, the difference-of-Gaussian function provides a close approximation to the scale-normalized Laplacian of Gaussian, σ²∇²G [35]. In each scale-space octave (the number of octaves and scales depends on the original image size and must be chosen when implementing SIFT; Lowe [4], who created the SIFT approach, suggested that four octaves and five blur levels are ideal), the image is convolved with Gaussians to generate the set of scale-space images. Adjacent Gaussian images are then subtracted to produce the difference-of-Gaussian images, after which the Gaussian image is down-sampled by a factor of 2 and the process is repeated. The factor σ² is required for true scale invariance, and the extrema (maxima and minima) of σ²∇²G produce the most stable CPs [36]. The relationship between D and σ²∇²G can be understood from the heat diffusion equation (parameterized in terms of σ rather than the more usual t = σ²):

$$\frac{\partial G}{{\partial \sigma }} = \sigma \nabla^{2} G$$
(5)

∇²G can then be computed through a finite difference approximation to ∂G/∂σ, using the difference of nearby scales at (kσ) and (σ):

$$\sigma \nabla^{2} G \approx \frac{\partial G}{{\partial \sigma }} \approx \frac{{G\left( {x,y,k\sigma } \right) - G\left( {x,y,\sigma } \right)}}{{k\sigma - \sigma }}$$
(6)

and, therefore,

$$G\left( {x, \, y, \, k\sigma } \right) - G\left( {x,y,\sigma } \right) \approx \left( {k - 1} \right) \, \sigma^{2} \nabla^{2} G$$
(7)

where (k − 1) is constant over all scales, so it does not affect the extrema locations. With k = 2^(1/s) (generally s = 2, giving k = √2), each octave of scale space is divided into s + 3 images, and the image beginning each new octave is down-sampled by a factor of 2.
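The octave construction described above (s + 3 Gaussian images per octave with scales separated by k = 2^(1/s), differenced per Eq. (4)) can be sketched in pure numpy. This is an illustrative, unoptimized sketch; `gaussian_blur` and `dog_octave` are hypothetical names:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur via two 1-D convolutions (reflect padding)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    conv = lambda v: np.convolve(np.pad(v, radius, mode="reflect"), g, "valid")
    out = np.apply_along_axis(conv, 1, img.astype(float))   # blur rows
    out = np.apply_along_axis(conv, 0, out)                 # blur columns
    return out

def dog_octave(img, s=2, sigma0=1.6):
    """One octave: s + 3 Gaussian images, differenced into s + 2 DoG images (Eq. 4)."""
    k = 2 ** (1.0 / s)
    gauss = [gaussian_blur(img, sigma0 * k**i) for i in range(s + 3)]
    return [g2 - g1 for g1, g2 in zip(gauss, gauss[1:])]

img = np.random.default_rng(0).random((16, 16))
dogs = dog_octave(img)
```

With s = 2 this yields five Gaussian images and four DoG images per octave, matching the five blur levels suggested in the text; extrema would then be sought across adjacent DoG images.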

  2. Keypoint localization: accurate localization and filtering of the candidate CPs are conducted using the equations below [5]:

    $$D\left( x \right) = D + \frac{{\partial D^{T} }}{\partial x}x + \frac{1}{2}x^{T} \frac{{\partial^{2} D}}{{\partial x^{2} }}x$$
    (8)

    where (D) and its derivatives are evaluated at the sample point and [x = (x, y, σ)T] is the offset from this point. The extremum location (\(\hat{X}\)) is determined by taking the derivative of this function with respect to (x) and setting it to zero, giving:

    $$\hat{X} = - \frac{{\partial^{2} D^{ - 1} }}{{\partial x^{2} }} \cdot \frac{\partial D}{{\partial x}}$$
    (9)

The Hessian and derivative of (D) are approximated using differences of neighboring sample points. The result is a 3 × 3 linear system that can be solved with minimal cost, and \(\left[ {D\left( {\hat{X}} \right)} \right]\), the function value at the extremum, is used to reject unstable extrema with low contrast (Brown and Lowe [5]):

$$D\left( {\hat{X}} \right) = D + \frac{1}{2}\frac{{\partial D^{T} }}{\partial x}\hat{X}$$
(10)
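Equation (9) amounts to solving a 3 × 3 linear system for the sub-pixel offset. The synthetic check below (illustrative, not part of the original experiments) builds a quadratic D with a known extremum and verifies that the Eq. (9) solve recovers it:

```python
import numpy as np

# Illustrative quadratic D(x) over x = (x, y, sigma) with a known extremum
# at x_true; the sample point is the origin, so the offset solved from
# Eq. (9), x_hat = -(d2D/dx2)^-1 (dD/dx), should recover x_true exactly.
x_true = np.array([0.3, -0.2, 0.1])
H = np.diag([-2.0, -2.0, -1.0])   # Hessian d2D/dx2 (negative definite: a maximum)
grad = -H @ x_true                # gradient dD/dx evaluated at the sample point

x_hat = -np.linalg.solve(H, grad) # Eq. (9)
```

In a real implementation the gradient and Hessian are finite differences of neighboring DoG samples, and offsets larger than 0.5 in any dimension trigger re-localization at a neighboring sample.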
  3. Eliminating edge responses: the principal curvatures are computed from a 2 × 2 Hessian matrix H [36], computed at the location and scale of the keypoint as follows:

$$H = \left[ {\begin{array}{*{20}c} {D_{xx} } & {D_{xy} } \\ {D_{xy} } & {D_{yy} } \\ \end{array} } \right]$$
(11)

where H is the Hessian matrix. The derivatives are estimated by taking differences of neighboring sample points. The eigenvalues of H are proportional to the principal curvatures of D. Borrowing from the approach of Harris and Stephens [35], explicitly computing the eigenvalues can be avoided by focusing on their ratio. Let α be the eigenvalue with the larger magnitude and β the smaller one. The sum of the eigenvalues is then given by the trace of H, and their product by the determinant:

$$Tr\left( H \right) = D_{xx} + D_{yy} = \alpha + \beta$$
(12)
$${\text{Det}}\left( H \right) = D_{xx} D_{yy} - \left( {D_{xy} } \right)^{2} = \alpha \beta$$
(13)

where α and β are the eigenvalues. When the determinant is negative, the curvatures have different signs, and the point is not considered an extremum. Let (r) be the ratio between the larger and smaller magnitude eigenvalues, so that α = rβ; then:

$$\frac{{Tr(H)^{2} }}{{{\text{Det}}\left( H \right)}} = \frac{{\left( {\alpha + \beta } \right)^{2} }}{\alpha \beta } = \frac{{\left( {r\beta + \beta } \right)^{2} }}{{r\beta^{2} }} = \frac{{\left( {r + 1} \right)^{2} }}{r}$$
(14)

The quantity [(r + 1)2/r] is at its minimum when the two eigenvalues are equal and increases with the value of (r). To check that the ratio of principal curvatures is below a threshold (r), it is therefore sufficient to check:

$$\frac{{Tr(H)^{2} }}{{{\text{Det}}\left( H \right)}} < \frac{{\left( {r + 1} \right)^{2} }}{r}$$
(15)

Fewer than 20 floating-point operations are required to test each CP. In this research, the experiments used r = 10, eliminating every CP whose ratio of principal curvatures is equal to or greater than 10.
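The curvature-ratio test of Eqs. (12)–(15) with r = 10 can be sketched as follows; `passes_edge_test` and the two sample Hessians are illustrative:

```python
import numpy as np

def passes_edge_test(H: np.ndarray, r: float = 10.0) -> bool:
    """Keep a keypoint only if Tr(H)^2 / Det(H) < (r + 1)^2 / r (Eqs. 12-15)."""
    tr = H[0, 0] + H[1, 1]
    det = H[0, 0] * H[1, 1] - H[0, 1] * H[1, 0]
    if det <= 0:                 # curvatures of opposite sign: not an extremum
        return False
    return bool(tr**2 / det < (r + 1) ** 2 / r)

corner = np.array([[5.0, 0.0], [0.0, 4.0]])   # similar curvatures -> keep
edge = np.array([[50.0, 0.0], [0.0, 1.0]])    # eigenvalue ratio 50 -> reject
```

Keypoints on a corner-like structure (similar curvatures in both directions) pass, while keypoints along an edge (one large and one small curvature) are rejected, as described in the text.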

  4. Orientation assignment: by assigning a consistent orientation to each CP based on local image properties, the CP descriptor can be represented relative to this orientation, thereby achieving invariance to image rotation. This approach contrasts with the orientation-invariant descriptors of Schmid and Mohr [36]. The scale of the CP is used to select the Gaussian-smoothed image, L, with the closest scale, so that the computations are performed in a scale-invariant manner. For each image sample, L (x, y), the gradient magnitude, m (x, y), and orientation, θ (x, y), are computed using pixel differences:

$$m\left( {x,y} \right) = \surd \left( {\left( {L\left( {x + 1,y} \right) - L\left( {x - 1,y} \right)} \right)^{2} + \left( {L\left( {x,y + 1} \right) - L\left( {x,y - 1} \right)} \right)^{2} } \right)$$
(16)
$$\theta \left( {x,y} \right) = \tan^{ - 1} \left[ {\left( {L\left( {x,y + 1} \right) - L\left( {x,y - 1} \right)} \right)/\left( {L\left( {x + 1,y} \right) - L\left( {x - 1,y} \right)} \right)} \right]$$
(17)
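Equations (16) and (17) can be sketched directly; `gradient_mag_ori` is an illustrative helper, and the horizontal ramp image is synthetic:

```python
import numpy as np

def gradient_mag_ori(L: np.ndarray, x: int, y: int):
    """Pixel-difference gradient magnitude and orientation (Eqs. 16-17).
    L is indexed L[y, x] (row, column)."""
    dx = L[y, x + 1] - L[y, x - 1]
    dy = L[y + 1, x] - L[y - 1, x]
    m = np.hypot(dx, dy)           # sqrt(dx^2 + dy^2), Eq. (16)
    theta = np.arctan2(dy, dx)     # orientation in radians, Eq. (17)
    return m, theta

# Horizontal ramp: the gradient points along +x
L = np.tile(np.arange(5, dtype=float), (5, 1))
m, theta = gradient_mag_ori(L, 2, 2)
```

In the full algorithm these per-pixel values feed a 36-bin orientation histogram around each CP, and the histogram peak defines the CP's assigned orientation.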

The third step is matching the SIFT features of the reference and sensed images. Each feature from the sensed image is compared individually with its counterparts in the database, and candidate matching features are identified based on the Euclidean distance of their feature vectors, using the expression below [5]:

$${\text{d}}\left( {p,q} \right) = \sqrt {\mathop \sum \limits_{i = 1}^{n} \left( {qi - pi} \right)^{2} }$$
(18)

where p = (p1, p2, ..., pn) and q = (q1, q2, ..., qn) are two points in Euclidean n-space. The keypoint descriptors are highly distinctive; hence, a single feature can find its correct match with good probability in a large database of features. However, in cluttered imagery, many features will not have a correct match [4, 5].
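A simple nearest-neighbor match under Eq. (18) can be sketched as below. This illustrates only the distance computation, not Lowe's full matching strategy (which additionally compares the nearest and second-nearest distances); the 2-D descriptors are toy data:

```python
import numpy as np

def match_descriptors(sensed: np.ndarray, reference: np.ndarray):
    """For each sensed descriptor, return the index of (and distance to) the
    nearest reference descriptor by Euclidean distance, Eq. (18)."""
    # pairwise distance matrix of shape (n_sensed, n_reference)
    d = np.linalg.norm(sensed[:, None, :] - reference[None, :, :], axis=2)
    return d.argmin(axis=1), d.min(axis=1)

ref = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
sen = np.array([[9.0, 1.0], [1.0, 9.0]])
idx, dist = match_descriptors(sen, ref)
```

Real SIFT descriptors are 128-dimensional, but the distance computation is identical.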

Improving the extracted features

Using the extracted SIFT CPs directly results in numerous false and incorrect matches [2, 3, 7, 9, 15]. This increases image matching errors and negatively affects the use of the images in remote sensing applications [13, 37]. In this study, a new technique was employed to refine and improve the features extracted by the SIFT approach: the false SIFT CPs that lead to incorrect matches are removed by employing the SAD algorithm. The SAD approach measures the similarity between image blocks by computing the absolute difference between each pixel in the original block of the reference image and the corresponding pixel in the block of the slave image being used for comparison. All the differences are then summed to establish a simple metric of block similarity [38, 39]. Several studies have employed the SAD algorithm in applications such as object recognition, motion estimation, and video compression [38,39,40]. Other researchers have optimized SAD in various ways to make it faster, reduce computation time, and obtain better matching between CPs [32, 41]. The SAD approach can be expressed [39] as follows:

$${\text{SAD}} = \sum\limits_{x = 0}^{N - 1} {\sum\limits_{y = 0}^{N - 1} {\left| {A\left( {x,y} \right) - B\left( {x,y} \right)} \right|} }$$
(19)

where A and B are blocks, x and y are the pixel indices of matrices A and B, respectively, and N is the number of rows and columns in the images. The location coordinates of the SIFT CPs were input into the refining process together with the SAD algorithm to automatically extract the most accurate SIFT CPs [39, 40].
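Equation (19) can be sketched in a few lines; `sad` is an illustrative helper and the 2 × 2 blocks are toy data:

```python
import numpy as np

def sad(block_a: np.ndarray, block_b: np.ndarray) -> float:
    """Sum of absolute differences between two equally sized blocks (Eq. 19)."""
    return float(np.abs(block_a.astype(float) - block_b.astype(float)).sum())

a = np.array([[10, 20], [30, 40]])
identical = sad(a, a)        # identical blocks: perfect similarity
shifted = sad(a, a + 5)      # 4 pixels, each differing by 5
```

A SAD of zero indicates identical blocks; larger values indicate lower intensity similarity, which is what the refinement threshold exploits.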

Results and discussion

The SAD algorithm was employed in this research in a different manner from all previous studies [39, 40]. Previous researchers applied the SAD algorithm to the entire reference and sensed images to identify a similarity matrix based on area correlation [31, 38, 42,43,44,45,46]. In this study, by contrast, the SAD algorithm was used to refine the extracted SIFT CPs. This differs from previous studies in that the weaknesses of both SIFT and SAD on this kind of image are overcome by removing the incorrectly matched CPs after the SIFT algorithm produces them. The SAD function measures the intensity similarity between the reference and sensed images by calculating the absolute differences between each SIFT CP in the image window and the corresponding CP in the search window based on Eq. (19). These differences are then summed so that the similarity between the two images can be identified. Using an empirical threshold and a suitable kernel size helps to identify and remove the false SIFT CP pairs that are matching errors.

The steps employed are as follows: (1) the CPs are automatically extracted by SIFT; (2) the reference and sensed images, together with the image coordinates of the extracted SIFT CPs of both images, are entered into the SAD algorithm; and (3) the SAD algorithm is run to measure the similarity in intensity between the areas around the CPs of the sensed image and the corresponding CPs in the reference image. In terms of Eq. (19), the image window represents block A, located in the reference image and containing a generated SIFT CP, and the search window represents the corresponding block B in the sensed image, containing the CP matched to the one in the image window. False CP matches can then be removed using the empirical threshold. Accordingly, we input the green band image as the reference image and the red and blue band images as the sensed images into the SIFT algorithm to extract the SIFT features (CPs), and then applied the SAD algorithm as explained above with different thresholds and the same frame size. Figures 6a, b and 7a, b and Tables 2, 3, and 4 show the results of this processing and analysis.
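The three steps above can be sketched as follows. This is an illustrative sketch, not the authors' MATLAB implementation: `refine_cps` is a hypothetical name, the images are synthetic, and the 3 × 3 frame and threshold of 250 follow the values the paper reports as empirically best:

```python
import numpy as np

def refine_cps(ref_img, sen_img, cps, frame=3, threshold=250):
    """Keep a matched CP pair only if the SAD between the frame x frame
    windows centred on the pair is below the empirical threshold.
    `cps` is a list of ((x_ref, y_ref), (x_sen, y_sen)) pixel pairs."""
    r = frame // 2
    kept = []
    for (xr, yr), (xs, ys) in cps:
        wa = ref_img[yr - r:yr + r + 1, xr - r:xr + r + 1].astype(float)
        wb = sen_img[ys - r:ys + r + 1, xs - r:xs + r + 1].astype(float)
        if wa.shape != (frame, frame) or wb.shape != (frame, frame):
            continue                         # window falls off the image
        if np.abs(wa - wb).sum() < threshold:  # Eq. (19) over the window
            kept.append(((xr, yr), (xs, ys)))
    return kept

ref = np.zeros((10, 10)); ref[4:7, 4:7] = 100   # bright patch in both bands
sen = np.zeros((10, 10)); sen[4:7, 4:7] = 100
good = ((5, 5), (5, 5))    # windows agree -> SAD 0, kept
bad = ((5, 5), (1, 1))     # patch matched to background -> SAD 900, removed
kept = refine_cps(ref, sen, [good, bad])
```

The false pair is rejected because its window SAD (900) exceeds the threshold, while the correct pair (SAD 0) survives, mirroring the refinement behavior described in the text.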

Fig. 6

Comparison of a applying SIFT between the green and red bands and b applying SIFT and SAD between the green and red bands for the same area

Fig. 7

Comparison of a applying SIFT between the green and blue bands and b applying SIFT and SAD between the green and blue bands

Table 2 Applying SIFT and SAD on green–red bands image
Table 3 Applying SIFT and SAD on green–blue bands image
Table 4 Applying SIFT and SAD on G–NIR bands image

From Table 2, it can be seen that the numbers of SIFT CPs extracted before applying the proposed methodology from the reference green band image and the sensed red band image were 1500 and 2000, respectively, in each experiment. After applying the new technique, the SAD algorithm removed false SIFT CPs in the range of 700–1358, leaving 750, 552, and 92 correct SIFT CPs in the three experiments, respectively. As shown in Figs. 6a, b and 7a, b, the result was obtained by empirically selecting the threshold: a value that was neither extremely high nor extremely small was selected to obtain a good number of CPs [47, 48]. Based on the experimental results, the best frame size and threshold value in this study for removing the false CPs produced by SIFT were 3 × 3 and 250, respectively. Tables 2, 3, and 4 show the results of applying the SIFT and SAD algorithms with different threshold values and frame sizes for all images. The processing was performed in the MATLAB environment.

Figure 6 illustrates the SIFT algorithm applied between the green band image and the red band image; the extracted CPs were then refined using the proposed method. In Figs. 6 and 7, the matching between the CPs generated by SIFT is shown by colored (yellow) lines connecting the features in image (a) with the corresponding CPs in image (b). However, some of the matching lines represent incorrect matches produced by the SIFT algorithm, and these matching errors are circled to make them easy to recognize. The number of CPs decreases after the false SIFT CPs are removed. The first experimental threshold used was 200, as indicated in Table 2.

From Table 3, it can be seen that SIFT CPs were extracted from the reference green band image and the sensed blue band image before applying the proposed methodology in each experiment. After applying the new technique, the SAD algorithm removed false SIFT CPs in the range of 228–383, leaving between 73 and 228 correct SIFT CPs across the three experiments.

Table 2 shows the operation of the SAD algorithm with different threshold values and frame sizes in three experiments on the reference image (green band) and the slave image (red band). First, the SAD algorithm was used with a threshold value of 200 and a frame size of 3 × 3. The numbers of extracted SIFT CPs were 1500 and 2000 keypoints in the reference and slave images, respectively, and the number of matched CPs was 1450. After the SAD algorithm was applied to the SIFT CPs, the number of matched CPs decreased to 750; however, the falsely matched CPs could not all be removed with this frame size and threshold, so some matching errors remained. In the second experiment, the threshold value was changed to 250 with the same frame size (3 × 3). The false CPs were removed, and the most accurately matched CPs were obtained by using both the SIFT and SAD algorithms; these keypoints represent the most accurate SIFT CPs. In the third experiment, the frame size was changed to 5 × 5 with the same threshold value, and 92 matched CPs were obtained. The experimental results indicate that a threshold value of 250 and a frame size of 3 × 3 provide the maximum number of correct SAD CPs and the most precisely matched CPs, with the bad CPs removed from the slave and reference images. According to Tables 2, 3, and 4, the false SIFT CPs between the reference and slave images numbered 898, 228, and 117, respectively. Moreover, applying SIFT and SAD between the G–NIR bands gives more accurate results than applying SIFT and SAD to the other band pairs.

Figure 7a, b shows the application of SIFT and then SAD to the green and blue bands. Figure 6a shows the result of applying the SIFT algorithm alone, in which the false CPs produce obvious matching errors, and Fig. 6b presents the result of applying the SAD algorithm to the extracted SIFT CPs, with the biased CPs removed. Clearly, using the SAD algorithm based on the extracted features performs better than using it based on area correlation. In the same way, Tables 3 and 4 compare the results of the proposed refined SIFT method with those of the original SIFT: the refined SIFT selects the true and accurately matched CPs based on the experimental results. Moreover, manually collecting CPs is difficult and time consuming, particularly when the amount of data is large [23]. Thus, refined SIFT is more reliable, flexible, and accurate in extracting and improving CPs [3, 28]. Figures 8, 9, and 10 compare the false and true extracted CPs when applying SIFT and SAD to the green, red, blue, and NIR bands.

Fig. 8

Comparison between the three experiments of applying SIFT and SAD on the green–red bands image

Fig. 9

Three experiments of applying SIFT and SAD on green–blue bands image

Fig. 10

Three experiments of applying SIFT and SAD on G–NIR bands image

Figure 8 illustrates the results of the three experiments in terms of matched, true, and false extracted CPs before and after applying the SAD algorithm. When only the SIFT method was applied, the number of matched CPs was 1450 in all three experiments. In the first experiment, applying the SAD algorithm to the extracted SIFT CPs removed 700 false and incorrectly matched CPs, leaving only 750 true, corrected CPs instead of 1450. In the second and third experiments, the false SIFT CPs (898 and 1358) were removed by the SAD method, leaving 552 and 92 corrected CPs, respectively.

Applying the SAD approach to NEqO satellite images removes the false extracted SIFT CPs and refines the extracted CPs to about 50% of all extracted CPs. Figure 9 illustrates the results of the three experiments in terms of the matched, true, and false extracted CPs before and after applying the SAD algorithm. When only the SIFT method was applied, the number of matched CPs was 456 in all three experiments. Applying the SAD algorithm to the extracted SIFT CPs removed the 263 falsely matched CPs, leaving only 193 true corrected CPs instead of 456 in the first experiment. In the second and third experiments, applying the SAD method removed 228 and 383 false SIFT CPs, leaving 228 and 73 corrected CPs, respectively.

Figure 10 illustrates the results of the three experiments in terms of the matched, true, and false extracted CPs before and after applying the SAD algorithm. When only the SIFT method was applied, the number of matched CPs was 850 in all three experiments. Applying the SAD algorithm to the extracted SIFT CPs removed the 194 falsely matched CPs, leaving only 656 true corrected CPs instead of 850 in the first experiment. In the second and third experiments, applying the SAD method removed 117 and 294 false SIFT CPs, leaving 733 and 556 corrected CPs, respectively. These SAD CPs serve as the reference points for band-to-band image registration, radiometric normalization, image geometric correction, and other applications. The results demonstrate that applying the SAD algorithm removes the false SIFT CPs and yields the true CPs.
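The retention rates implied by the first-experiment counts in Figures 8–10 can be checked with a short helper. The band-pair labels and the dictionary structure below are illustrative; the matched/corrected counts are those reported above.

```python
def retention_percent(matched, corrected):
    """Percentage of SIFT-matched CPs that survive SAD refinement."""
    return 100.0 * corrected / matched

# (matched SIFT CPs, corrected SAD CPs) for the first experiment
# of each band pair, as reported for Figs. 8-10
first_experiment = {
    "green-red": (1450, 750),
    "green-blue": (456, 193),
    "green-NIR": (850, 656),
}
rates = {pair: retention_percent(m, c)
         for pair, (m, c) in first_experiment.items()}
```

For the green–red pair the retention is roughly half of the matched CPs, consistent with the "about 50%" figure quoted for the dataset, while the other pairs fall somewhat below and above that value.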

Conclusions

Feature extraction methods play a highly effective role in image registration and geometric correction of data collected from different sensor systems that exhibit linear image distortion. Different kinds of algorithms have been adopted for this task; in this research, the SIFT method was applied. Automatic CP extraction by SIFT alone is not adequate for the new generation of NEqO optical satellites or for multi-sensor images captured from different viewpoints, at different times, and under different illumination. This paper presents a technical workflow for large-scale mapping based on automatically refining the extracted SIFT CPs using the SAD algorithm. Unlike the conventional approach, it feeds the extracted CP coordinates of the reference and sensed images into the SAD algorithm and applies an empirical threshold. The data used to implement this study were obtained from the new generation of near-equatorial orbital satellite systems: bands of a RazakSAT satellite image covering Penang Island, Malaysia, were adopted to examine the proposed methodology. The refined SIFT was applied to remove the false CPs that exhibited matching errors. Finally, we evaluated the refined CP scenario by comparing the original extracted SIFT CPs with the result of the proposed method. The experimental results show that applying the SAD approach to NEqO satellite images removes the false SIFT CPs and refines the extracted CPs to about 50% of all extracted CPs over all processed and analyzed band images, in all three experiments, using a frame size of 3 × 3 for the first and second experiments and 5 × 5 for the third.
The number of extracted control points (CPs) was reduced from 2000 to 750, 552, and 92 for the green and red bands, from 678 to 193, 228, and 73 between the green and blue bands, and from 1995 to 656, 733, and 556 between the green and near-infrared bands, respectively. The results also indicate the reliability, effectiveness, and robustness of the proposed method, as well as its high precision, which meets the requirements of different remote sensing applications, such as band-to-band registration, geometric correction, radiometric normalization, and change detection processing of near-equatorial satellite images. This result encourages further research to improve feature extraction approaches.