1 Introduction

Interpolation is one of the most common operations in remote sensing image and video analysis. Applications that rely on interpolation to estimate unknown pixels include image compression, high-rate video transmission, image and video watermarking, image reconstruction, restoration, and magnification. For instance, in [1], a modified scheme was proposed for converting standard-definition television (SDTV) frames to the high-definition television (HDTV) standard [2] for use in video transmission technologies such as DVB-T. Research on interpolation algorithms is broad, and some of it is reviewed below. The two best-known interpolators are bi-cubic convolution (usually abbreviated as BC) and bi-linear (BL) [3]. BC and BL are classified as non-adaptive techniques because they do not take local edges into account; in effect, they are two linear reconstruction filters [4]. Both methods are also available in image processing software tools for remote sensing such as ENVI and ERDAS. Here, however, we focus on a newer and more efficient family of interpolators, the edge-guided interpolation algorithms. Edge-guided methods are often applied to image and video reconstruction problems [5]. In [6], a technique was presented that assumes every image can be modeled as a locally stationary Gaussian process. Under this assumption, the local covariance coefficients are estimated from the low-resolution (LR) image, and the interpolation is then performed using the geometric duality between the LR and high-resolution (HR) covariances. The key issue that makes this method unsuitable for ViSAR frames is that it relies on statistical assumptions that do not hold in practice. In [5], a scheme was proposed that uses a tensor-based tool to realize edge-guided interpolation; this method outperformed some existing methods.

In this research, we propose a new edge-guided interpolator based on statistical estimation. The proposed method uses an adaptive weighting mechanism that makes it edge-guided, non-linear, and fully greedy. Our scheme extends the method discussed in [7,8,9] for remote sensing applications. In [7], a basic edge-guided interpolation based on linear minimum mean square error estimation (LMMSE) was introduced for benchmark images; some evaluations of it were reported in [10]. LMMSE consists of two phases: directional filtering using a pre-interpolator, and data fusion of the two orthogonal directions. The LMMSE scheme for remote sensing images was discussed in [9]. This interpolator is only partially adaptive, because its directional filtering requires a pre-interpolator based on linear filtering, e.g., linear or cubic interpolation. In the current work, we propose a fully adaptive version of LMMSE for ViSAR remote sensing data that does not need any pre-interpolator. It is natural for LMMSE to outperform linear methods such as BL or BC, because they are used as the pre-interpolator inside the LMMSE structure (although as two one-dimensional components). Our proposed method, named adaptive LMMSE (ALMMSE), fuses the directional observations without any linear pre-interpolator and nevertheless outperforms the linear interpolators. Our experiments show that it outperforms five conventional techniques that are among the most popular non-adaptive/linear reconstruction filters.

The proposed approach can also be used to magnify multispectral images such as IKONOS and QuickBird images, or images from other high-resolution optical remote sensing sensors [11,12,13]. In addition, interpolation algorithms have many other applications that ALMMSE can assist, e.g., data hiding [14,15,16,17,18], interpolation-based image denoising and demosaicking [19,20,21], SDTV to HDTV conversion (SD2HD) [2] in video processing, color processing [22], information fusion [8, 9], and shadow detection [23]. As mentioned, the main focus of this research is interpolation-based image/video compression [10, 24]. For compression, we first down-sample the video frames to reduce the data size at the sender side and then reconstruct them with an interpolator at the receiver side. Consequently, ALMMSE can be used in various stages of remote sensing image processing.

The rest of this paper is organized as follows. Section 2 reviews LMMSE and some of its applications in digital image processing, Section 3 presents the proposed scheme (ALMMSE), and Section 4 evaluates it. The evaluations show that locally adaptive estimation yields better quality on ViSAR frames than many conventional techniques. Section 5 concludes the paper.

2 Related work

LMMSE is a quartered interpolator, i.e., it produces an interpolated image four times larger than its input, and it is widely used in applications such as enlargement (zooming) [7], noise removal (denoising) [19], color demosaicking [20], and image compression [10]. In this technique, each non-existing pixel is computed from its four nearest neighboring pixels, which are already known. Fig. 1 shows the mechanism of LMMSE schematically for two scenarios with orthogonal directions.

Fig. 1

General positions of the pixels to be estimated. There are two general positions for the pixels (square symbol) that an LMMSE-based interpolator has to compute [19]: the left part shows a position with two orthogonal directions of 0 and 90 degrees, and the right part shows a position with orthogonal directions of 45 and 135 degrees [7]

The main shortcoming in the design of the LMMSE interpolator is that it assigns equal weights to the two pixels that lie in the same direction. In the sense of greedy algorithms, LMMSE is not a fully adaptive algorithm, because its computations rely on a general assumption about images. However, it can be regarded as a partially adaptive interpolator compared to linear interpolators such as BL and BC. The aim of our study is to create a new LMMSE-based interpolator for reconstructing a kind of compressed remote sensing data without any pre-interpolation step.

In [21], the authors proposed an LMMSE-based interpolator for color demosaicking. Demosaicking is a particular type of interpolation that is commonly applied to color images, for example those of the Kodak dataset [20]. The demosaicking algorithm in [20] is based on LMMSE and strongly outperforms the BL interpolator [3]. Another application of LMMSE is noise reduction. A denoising algorithm is in practice a low-pass filter that suppresses the high-frequency variations of an image; in other words, it reduces or removes the noise.

Most interpolators are built on an averaging process, which is equivalent to low-pass filtering. Therefore, LMMSE-based interpolators can be used for noise reduction. For example, in [19], an LMMSE-based denoising algorithm was proposed for a wide range of digital images; the quality of that scheme can be seen in [19]. Interpolation in the spatial domain is also a route to image compression; for more details about LMMSE-based image compression, refer to [10]. Thus, in addition to direct applications such as magnification, interpolation is widely used for image restoration and reconstruction. In other studies, interpolation-based data hiding [8, 14, 15, 18, 25] and multispectral image fusion (pan-sharpening) [9] have been carried out with it. For example, in [9], LMMSE was applied as a magnifier and achieved better quality in the pan-sharpening of Landsat-8 images than a linear interpolator. Another application of LMMSE is denoising to improve classification accuracy in digital images, because noise reduces the accuracy of classifiers (a pre-processing step based on noise reduction is normally essential before classification).

3 Proposed method

In the proposed scheme, an interpolation that makes no assumption about the estimation weights is applied to reconstruct compressed frames [4]. There are no default weights; all of them are computed adaptively. The scheme estimates the non-existing pixels adaptively so as to preserve edge information as well as possible. In this section, we discuss our ALMMSE interpolation method, an edge-guided scheme that uses the four nearest neighbors from two orthogonal directions to estimate each targeted pixel; it is therefore suitable for Markov random field (MRF)-based neighborhood systems of order 1 or 2, which describe many remote sensing images. An important property of ALMMSE is that it is a fully adaptive, non-linear approach that needs no pre-interpolation, in contrast to linear schemes based on polynomials (non-adaptive methods) and to traditional LMMSE (which requires a pre-interpolator).

To compress ViSAR frames with the proposed method, we create down-sampled, lower-resolution versions of the HR frames (the LR frames) and then reconstruct the HR frames from the LR frames with our interpolator. For example, a down-sampling algorithm such as Algorithm 1 removes 75% of the pixels, and we estimate them using only the remaining 25% (the sample pixels). In such experiments, the video size is reduced to one fourth of the original. Therefore, we use ALMMSE as a regular interpolator on a quartered template.
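As an illustration of this compression ratio, the following minimal Python sketch keeps one pixel out of every 2 × 2 block. Algorithm 1 is not reproduced in this section, so the choice of which pixel to keep (the top-left one) and the function name `downsample_2x` are assumptions made only for this example.

```python
def downsample_2x(frame):
    """Keep every second pixel of every second row (assumed sampling grid)."""
    return [row[::2] for row in frame[::2]]


# A 4 x 4 toy frame stored as a list of rows.
hr_frame = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
    [40, 41, 42, 43],
]
lr_frame = downsample_2x(hr_frame)

# Fraction of the original pixels that survive down-sampling.
kept = (len(lr_frame) * len(lr_frame[0])) / (len(hr_frame) * len(hr_frame[0]))
```

For the toy frame, a quarter of the pixels are kept, and the interpolator at the receiver side has to estimate the other three quarters.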

Here, we discuss the details of ALMMSE interpolation using the template in Fig. 2a. Fig. 2b shows five sample pixels, for which we use the simplified notation of Eq. (1):

$$ {\displaystyle \begin{array}{l}{x}_l\left(i,j\right)={x}_1\\ {}{x}_l\left(i,j+1\right)={x}_2\\ {}{x}_l\left(i+1,j\right)={x}_3\\ {}{x}_l\left(i+1,j+1\right)={x}_4\\ {}{x}_h\left(2i,2j\right)={\hat{x}}_h\end{array}} $$
(1)
Fig. 2

The steps in estimating non-existing pixels [18]. a Original and non-existing pixels. b, c Computing the estimated values of the 75% non-existing pixels in different directions. After these three steps, all the non-existing pixels are reconstructed

To calculate the non-existing pixels in LMMSE-based interpolated frames, i.e., \( {\hat{x}}_h \) (where xh is an unknown ideal value and \( {\hat{x}}_h \) is an estimate of xh) and all pixels in equivalent positions, we use a linear combination of original pixels of the LR frame. These original pixels are the nearest neighbors of the targeted pixel, and they generate the estimated value according to Eq. (2) (in some scenarios, two of the four nearest neighbors are themselves pixels estimated in a prior step). Although the estimate is a linear combination, the interpolation weights are determined adaptively rather than fixed, so the final ALMMSE-based reconstruction filter is non-linear [22]. In [22], the traditional LMMSE was discussed extensively with the aim of keeping the adaptivity to gray levels in every edge area. Following [22], a general form of LMMSE can be written as Eq. (4) for a first/second-order MRF system with the simplest pre-interpolator, which is based on two 1D linear estimates (Eq. (3) gives the two directional estimates, obtained through a kind of bi-linear interpolation). The weights in Eq. (4) are defined by Eq. (5) and computed through Eqs. (6–8). The ALMMSE closed form is analogous and is given in Eq. (9):

$$ {\displaystyle \begin{array}{l}{\hat{x}}_h={w}_a\;{\hat{x}}_a+{w}_b\;{\hat{x}}_b\kern0.6em \\ {}\kern0.72em {w}_a+{w}_b=1\end{array}} $$
(2)
$$ {\displaystyle \begin{array}{l}{\hat{x}}_a=\frac{x_1+{x}_4}{2}\;\\ {}{\hat{x}}_b=\frac{x_2+{x}_3}{2}\;\end{array}} $$
(3)
$$ {\hat{x}}_h={w}_a\frac{x_1+{x}_4}{2}+{w}_b\frac{x_2+{x}_3}{2}=\frac{w_a}{2}{x}_1+\frac{w_b}{2}{x}_2+\frac{w_b}{2}{x}_3+\frac{w_a}{2}{x}_4 $$
(4)
$$ \left\{{w}_a,{w}_b\right\}=\underset{w_a+{w}_b=1}{\mathrm{argmin}}\kern0.36em E\;\left\{{\left({\hat{x}}_h-{x}_h\right)}^2\right\} $$
(5)
$$ {w}_a=\frac{{\left({x}_2-\overline{x}\right)}^2+{\left({x}_3-\overline{x}\right)}^2+{\left({\hat{x}}_b-\overline{x}\right)}^2}{{\left({x}_1-\overline{x}\right)}^2+{\left({x}_2-\overline{x}\right)}^2+{\left({x}_3-\overline{x}\right)}^2+{\left({x}_4-\overline{x}\right)}^2+{\left({\hat{x}}_a-\overline{x}\right)}^2+{\left({\hat{x}}_b-\overline{x}\right)}^2} $$
(6)
$$ {\displaystyle \begin{array}{l}{w}_b=\frac{{\left({x}_1-\overline{x}\right)}^2+{\left({x}_4-\overline{x}\right)}^2+{\left({\hat{x}}_a-\overline{x}\right)}^2}{{\left({x}_1-\overline{x}\right)}^2+{\left({x}_2-\overline{x}\right)}^2+{\left({x}_3-\overline{x}\right)}^2+{\left({x}_4-\overline{x}\right)}^2+{\left({\hat{x}}_a-\overline{x}\right)}^2+{\left({\hat{x}}_b-\overline{x}\right)}^2}\\ {}\kern0.84em =1-{w}_a\end{array}} $$
(7)
$$ \overline{x}=\frac{1}{4}\;\sum \limits_{i=1}^4{x}_i=\frac{{\hat{x}}_a+{\hat{x}}_b}{2} $$
(8)
$$ {\displaystyle \begin{array}{l}{\hat{x}}_h={w}_1\;{x}_1+{w}_2\;{x}_2+{w}_3\;{x}_3+{w}_4\;{x}_4=\sum \limits_{i=1}^4{w}_i{x}_i\kern0.48em \\ {}\kern4.199998em \sum \limits_{i=1}^4{w}_i=1\end{array}} $$
(9)
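A minimal Python sketch of Eqs. (3), (4), and (6–8) for a single target pixel is given below. The function name `lmmse_estimate` and the equal-weight fallback for a perfectly flat neighborhood (where the denominator of Eq. (6) is zero) are assumptions for illustration; they are not taken from [22].

```python
def lmmse_estimate(x1, x2, x3, x4):
    """Return (w_a, w_b, estimate) for one pixel from its four neighbors."""
    xa = (x1 + x4) / 2.0                      # directional estimates, Eq. (3)
    xb = (x2 + x3) / 2.0
    mean = (x1 + x2 + x3 + x4) / 4.0          # local mean, Eq. (8)

    # Squared deviations from the local mean, collected per direction.
    dev_a = (x1 - mean) ** 2 + (x4 - mean) ** 2 + (xa - mean) ** 2
    dev_b = (x2 - mean) ** 2 + (x3 - mean) ** 2 + (xb - mean) ** 2
    total = dev_a + dev_b

    if total == 0.0:
        # Flat patch: Eq. (6) would be 0/0, so equal weights are assumed here.
        w_a = w_b = 0.5
    else:
        w_a = dev_b / total                   # Eq. (6)
        w_b = dev_a / total                   # Eq. (7)
    return w_a, w_b, w_a * xa + w_b * xb      # fused estimate, Eq. (4)


w_a, w_b, estimate = lmmse_estimate(10, 0, 30, 12)
```

In this example, the pair (x1, x4) varies much less than the pair (x2, x3), so nearly all of the weight goes to direction a. This is how the weighting follows an edge, and it also shows that x1 and x4 always share the weight w_a/2.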

The four weights w1, w2, w3, and w4 in Eq. (9) must now be computed. There are many ways to do this; we present a fully adaptive solution inspired by the traditional LMMSE. The proposed ALMMSE is a heuristic extension of LMMSE that is fully adaptive (it does not assign the same weight to collinear pixels) and needs no linear pre-interpolator, which would add computational complexity. To eliminate the pre-interpolation step, we use the approximation in Eq. (10) to simplify Eqs. (6–7) and then expand the LMMSE weights heuristically to obtain the ALMMSE weights (Eq. (11)). As Eqs. (12–16) show, the resulting ALMMSE weights are very similar to the LMMSE weights. The four nearest pixels are not constrained to share weights, so the approach is fully adaptive, whereas LMMSE always assigns the same weight to collinear pixels. In addition, LMMSE has to compute a directional estimate for each pair of collinear pixels. ALMMSE instead treats each pixel as a separate sample, regardless of collinearity; therefore, it needs no pre-estimation of directional estimates, which are no longer defined.

$$ {\displaystyle \begin{array}{c}{\left({\hat{x}}_a-\overline{x}\right)}^2\approx 0\\ {}{\left({\hat{x}}_b-\overline{x}\right)}^2\approx 0\end{array}} $$
(10)
$$ \left\{{w}_1,{w}_2,{w}_3,{w}_4\right\}=\underset{\sum \limits_{i=1}^4{w}_i=1}{\mathrm{argmin}}\kern0.36em E\;\left\{{\left({\hat{x}}_h-{x}_h\right)}^2\right\} $$
(11)
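The simplification can be made explicit. Writing \( d_i={\left({x}_i-\overline{x}\right)}^2 \) and \( S={d}_1+{d}_2+{d}_3+{d}_4 \) (a shorthand introduced only for this remark), Eq. (10) reduces Eqs. (6–7) to the first line below, and Eqs. (12–15) can be written compactly as the second line, from which the constraint of Eq. (9) follows:

$$ {\displaystyle \begin{array}{l}{w}_a\approx \frac{d_2+{d}_3}{S},\kern1em {w}_b\approx \frac{d_1+{d}_4}{S}\\ {}{w}_i=\frac{S-{d}_i}{3S},\kern1em \sum \limits_{i=1}^4{w}_i=\frac{4S-S}{3S}=1\end{array}} $$

Because \( 0\le {d}_i\le S \), every ALMMSE weight lies between 0 and 1/3, so the estimate is a convex combination of the four neighbors and cannot leave their range of gray levels.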

The estimation is repeated until all of the 75% of pixels to be estimated have been computed, as illustrated in Fig. 2c.

$$ {w}_1=\frac{{\left({x}_2-\overline{x}\right)}^2+{\left({x}_3-\overline{x}\right)}^2+{\left({x}_4-\overline{x}\right)}^2}{3\left({\left({x}_1-\overline{x}\right)}^2+{\left({x}_2-\overline{x}\right)}^2+{\left({x}_3-\overline{x}\right)}^2+{\left({x}_4-\overline{x}\right)}^2\right)} $$
(12)
$$ {w}_2=\frac{{\left({x}_1-\overline{x}\right)}^2+{\left({x}_3-\overline{x}\right)}^2+{\left({x}_4-\overline{x}\right)}^2}{3\left({\left({x}_1-\overline{x}\right)}^2+{\left({x}_2-\overline{x}\right)}^2+{\left({x}_3-\overline{x}\right)}^2+{\left({x}_4-\overline{x}\right)}^2\right)} $$
(13)
$$ {w}_3=\frac{{\left({x}_1-\overline{x}\right)}^2+{\left({x}_2-\overline{x}\right)}^2+{\left({x}_4-\overline{x}\right)}^2}{3\left({\left({x}_1-\overline{x}\right)}^2+{\left({x}_2-\overline{x}\right)}^2+{\left({x}_3-\overline{x}\right)}^2+{\left({x}_4-\overline{x}\right)}^2\right)} $$
(14)
$$ {\displaystyle \begin{array}{l}{w}_4=\frac{{\left({x}_1-\overline{x}\right)}^2+{\left({x}_2-\overline{x}\right)}^2+{\left({x}_3-\overline{x}\right)}^2}{3\left({\left({x}_1-\overline{x}\right)}^2+{\left({x}_2-\overline{x}\right)}^2+{\left({x}_3-\overline{x}\right)}^2+{\left({x}_4-\overline{x}\right)}^2\right)}\\ {}\kern0.84em =1-\sum \limits_{i=1}^3{w}_i\end{array}} $$
(15)
$$ \overline{x}=\frac{1}{4}\;\sum \limits_{i=1}^4{x}_i $$
(16)
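Eqs. (9) and (12–16) translate directly into a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and the equal-weight fallback for a flat neighborhood, where Eqs. (12–15) evaluate to 0/0, is an assumption that the text does not specify.

```python
def almmse_weights(neighbors):
    """Weights w1..w4 of Eqs. (12)-(15) for the four nearest neighbors."""
    mean = sum(neighbors) / 4.0                          # Eq. (16)
    dev = [(x - mean) ** 2 for x in neighbors]           # squared deviations
    total = sum(dev)
    if total == 0.0:
        # Flat patch: the equations give 0/0, so equal weights are assumed.
        return [0.25] * 4
    # Each weight collects the deviations of the other three pixels.
    return [(total - d) / (3.0 * total) for d in dev]


def almmse_estimate(neighbors):
    """Estimate of the missing pixel as the weighted sum of Eq. (9)."""
    weights = almmse_weights(neighbors)
    return sum(w * x for w, x in zip(weights, neighbors))


weights = almmse_weights([10, 0, 30, 12])
estimate = almmse_estimate([10, 0, 30, 12])
```

The neighbor farthest from the local mean (30 in this example) receives the smallest weight, so a pixel lying across an edge contributes least to the estimate, and no two pixels are forced to share a weight.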

Note that each targeted pixel is computed from its four nearest neighbors. In some positions, however, two of these four neighbors are themselves estimated pixels, so the estimate in practice draws on six neighbors, four of which are not within the MRF neighborhood. In the next section, the proposed scheme is evaluated and shown to be effective on the ViSAR dataset.

Moreover, the evaluation is based on objective and subjective quality assessment metrics. In addition to the proposed method, a pre-processing step precedes the re-sampling process; it contains two blocks, down-sampling and up-sampling. Suppose that the input image is an M × N matrix (for simplicity, M and N are even). Algorithm 1 and Algorithm 2 describe these two blocks.
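Algorithm 2 is not reproduced in this section. As a rough sketch of how the three steps of Fig. 2 could be organized, the Python code below places the LR samples on the even grid positions, estimates the pixels with two odd coordinates from their diagonal neighbors, and then estimates the remaining pixels from their horizontal and vertical neighbors. The sampling grid, the mirrored border handling, the flat-patch fallback, and all function names are assumptions for this example and should not be read as the authors' Algorithm 2.

```python
def almmse_estimate(neighbors):
    """Estimate one pixel from four neighbors, Eqs. (9) and (12)-(16)."""
    mean = sum(neighbors) / 4.0
    dev = [(x - mean) ** 2 for x in neighbors]
    total = sum(dev)
    if total == 0.0:
        return mean  # flat patch: assumed fallback, the equations give 0/0
    return sum((total - d) / (3.0 * total) * x for d, x in zip(dev, neighbors))


def reflect(k, size):
    """Mirror an out-of-range index back into [0, size) (assumed border rule)."""
    if k < 0:
        return -k
    if k >= size:
        return 2 * size - 2 - k
    return k


def upsample_2x(lr):
    """Rebuild a 2M x 2N frame from an M x N frame in three steps."""
    rows, cols = 2 * len(lr), 2 * len(lr[0])
    hr = [[0.0] * cols for _ in range(rows)]

    # Step a: copy the known samples to the even/even positions.
    for i, row in enumerate(lr):
        for j, value in enumerate(row):
            hr[2 * i][2 * j] = float(value)

    def at(r, c):
        return hr[reflect(r, rows)][reflect(c, cols)]

    # Step b: odd/odd positions from the four diagonal (45/135 degree) samples.
    for r in range(1, rows, 2):
        for c in range(1, cols, 2):
            hr[r][c] = almmse_estimate(
                [at(r - 1, c - 1), at(r - 1, c + 1), at(r + 1, c - 1), at(r + 1, c + 1)]
            )

    # Step c: remaining positions from the 0/90 degree neighbors, two of which
    # are original samples and two of which were estimated in step b.
    for r in range(rows):
        for c in range(1 - r % 2, cols, 2):
            hr[r][c] = almmse_estimate(
                [at(r - 1, c), at(r + 1, c), at(r, c - 1), at(r, c + 1)]
            )
    return hr


hr = upsample_2x([[0, 10], [20, 30]])
```

Mirroring without repeating the border pixel keeps the parity of every index, so each neighbor fetched in steps b and c has already been filled when it is read.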

For more details about the impact of different down-sampling and up-sampling models on quartered interpolators, see the detailed discussion in [10]. Table 1 gives a qualitative comparison of LMMSE, ALMMSE, BL, and BC.

Table 1 Qualitative descriptions of the interpolation methods

4 Results

For evaluation, we use the ViSAR frames shown in Fig. 3. PSNR and SSIM are the main metrics in all evaluations. They are defined in Eq. (17) and Eq. (18), respectively, for two 8-bit images x and y of the same size N1 × N2. In Eq. (18), ux and uy denote the means of the images, \( {\sigma}_x^2 \) and \( {\sigma}_y^2 \) their variances, and σxy the covariance between them. Re-sampling is done with the proposed scheme (ALMMSE) and with several conventional methods: BL, BC, Lanczos (with parameters 2 and 3, as an approximation of the sinc function), and the box kernel [3, 26]. None of these methods has a pre-interpolation step, which makes the comparison fair. Also for fairness, the down-sampling process is the same for all methods; the up-sampling in our method follows Algorithm 2, and the linear methods follow the MathWorks definitions.

Fig. 3

The ViSAR dataset includes three different videos; one frame from each video is shown

All simulations were implemented in MATLAB and show that our scheme clearly outperforms the non-adaptive/linear methods. The detailed PSNR and SSIM results are listed in Tables 2, 3, and 4, and Fig. 4 shows the averages over all videos. We observe that the proposed scheme, being fully adaptive, performs well on the real ViSAR dataset and outperforms the other methods.

$$ PSNR=20\kern0.36em {\log}_{10}\kern0.24em \frac{2^8-1}{\sqrt{\frac{1}{N_1\times {N}_2}\;{\sum}_{i=1}^{N_1}{\sum}_{j=1}^{N_2}{\left({x}_{ij}-{y}_{ij}\right)}^2}} $$
(17)
$$ SSIM=\frac{2{u}_x{u}_y}{u_x^2+{u}_y^2}\times \frac{2{\sigma}_x{\sigma}_y}{\sigma_x^2+{\sigma}_y^2}\times \frac{\sigma_{xy}}{\sigma_x{\sigma}_y} $$
(18)
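For reference, Eqs. (17) and (18) can be evaluated as in the following Python sketch, which computes the statistics globally over the whole image, exactly as the formulas are written. The function names are hypothetical, the logarithm is taken to base 10 as is conventional for decibels, and returning infinity for identical images is an assumption. Eq. (18) is undefined when either image is constant, a case this sketch does not guard against.

```python
import math


def _flatten(img):
    """Turn a list of rows into one flat list of floats."""
    return [float(v) for row in img for v in row]


def psnr(x, y, bits=8):
    """PSNR in dB between two equally sized images, Eq. (17)."""
    a, b = _flatten(x), _flatten(y)
    mse = sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)
    if mse == 0.0:
        return math.inf  # identical images (assumed convention)
    return 20.0 * math.log10((2 ** bits - 1) / math.sqrt(mse))


def ssim_global(x, y):
    """Product of the three terms of Eq. (18), using whole-image statistics."""
    a, b = _flatten(x), _flatten(y)
    n = len(a)
    ux, uy = sum(a) / n, sum(b) / n
    var_x = sum((p - ux) ** 2 for p in a) / n
    var_y = sum((q - uy) ** 2 for q in b) / n
    cov = sum((p - ux) * (q - uy) for p, q in zip(a, b)) / n
    sx, sy = math.sqrt(var_x), math.sqrt(var_y)

    luminance = 2.0 * ux * uy / (ux ** 2 + uy ** 2)
    contrast = 2.0 * sx * sy / (var_x + var_y)
    structure = cov / (sx * sy)
    return luminance * contrast * structure


reference = [[100, 110], [120, 130]]
shifted = [[101, 111], [121, 131]]      # every pixel brighter by one gray level
quality_db = psnr(reference, shifted)
similarity = ssim_global(reference, shifted)
```

A uniform brightness shift leaves the contrast and structure terms at 1, so only the luminance term lowers the SSIM value in this example.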
Table 2 Results for video 1
Table 3 Results for video 2
Table 4 Results for video 3
Fig. 4

Average results of PSNR (dB) and SSIM

5 Conclusions

In recent years, data processing for IoT has become an active research topic [27,28,29,30]. In this study, we proposed the ALMMSE interpolation algorithm for remote sensing ViSAR frames captured by imaging radars in IoT-enabled radar networks of drones and airplanes [31]. The scheme is a new edge-guided interpolator based on non-linear statistical estimation; it makes no assumption about the local weights and does not need any pre-interpolator. Its main feature is that it is more adaptive than another edge-guided interpolator and the conventional interpolation techniques. We compared it with several linear interpolators that likewise need no pre-interpolator. All experiments indicate the superiority of the proposed method. As future work, a more accurate version of ALMMSE with lower computational complexity could be developed. Evaluating the proposed method on data from other remote sensing devices may point to further directions.