1 Introduction

Super-resolution reconstruction of Internet of Things images refers to a digital image processing technique that reconstructs a high-resolution image from one or more low-resolution images. According to the type of image, super-resolution reconstruction mainly covers the reconstruction of color images and the reconstruction of depth images [1]. In image processing, super-resolution reconstruction techniques are generally used to increase the resolution of acquired low-resolution images. The clarity of a single-frame Internet of Things image is the prerequisite for subsequent image processing; a high degree of blur aggravates image problems in the Internet of Things and leads to unsatisfactory application effects [2, 3].

Internet of Things image super-resolution reconstruction has important application value in many fields, so super-resolution reconstruction methods for single-frame character images need to be analyzed and studied. Liu et al. proposed a single-frame image super-resolution reconstruction method based on multi-level convolutional neural network learning. This method constructs a PMJ model for super-resolution reconstruction and performs preliminary feature extraction on the image during the sensing phase; however, it cannot effectively remove the noise in a single-frame character image, and its anti-interference performance is poor [4]. Peng et al. reported a single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization, in which image structure can be well preserved; during the reconstruction phase, non-negative neighborhood embedding is used to select the number of neighbors, and finally the non-local similarity of the image is used to construct non-local regular terms to modify the reconstruction result. The process of this method is complex and prone to errors, resulting in lower resolution of the reconstructed image [5]. It also ignores the errors caused by acquiring super-resolution images with cameras under visible light communication [6]. Yuan et al. proposed a single-frame image super-resolution reconstruction method based on support vector regression (SVR). This method uses raster scanning to scan high- and low-resolution training images and extracts input vectors and label pixels from the blocks respectively; SVR tools are then used to regress the label pixels of the corresponding super-resolution image block to complete the reconstruction. This method cannot remove the Gaussian noise in a single-frame character image, and its anti-interference performance is poor [7].

In summary, a super-resolution reconstruction method for single-frame character Internet of Things images based on a wavelet neural network is proposed. Combined with the pinhole imaging model and camera self-calibration, Internet of Things image acquisition is completed. Based on the acquisition method, an image degradation model is established to describe the degradation from a high-resolution image to a low-resolution image. A wavelet threshold denoising method is used to remove the noise in the single-frame character image. The wavelet neural network reflection model is then used to reconstruct the single-frame character image and improve its resolution. Experimental results show that, compared with traditional methods, the proposed method has a better denoising effect, lower image blur after reconstruction, and better practical application effect.

2 Internet of things image acquisition based on solid state image sensor

The main function of the Internet of Things image acquisition unit is to select the image sensor model and configure the working mode of the sensor so that effective images can be collected normally. The structure of the image acquisition unit is shown in Fig. 1.

Fig. 1
figure 1

Structure of image acquisition unit

As shown in Fig. 1, the image sensor model is selected first. The image sensor converts the light image on the photosensitive surface into the corresponding electrical signal by using the photoelectric conversion function of the photoelectric device. Widely used image sensors are mainly divided into two types: solid-state image sensors and photoconductive camera tubes. The comparison of the two is shown in Table 1.

Table 1 Comparison between solid-state image sensor and photoconductive tube

The comparison shows that the solid-state image sensor has the advantages of light weight, high integration, small size, long life, and low power consumption. For this reason, the system uses a solid-state image sensor to collect images.

Secondly, the working mode of the image sensor is configured, which mainly involves designing the image sensor interface, as shown in Fig. 2.

Fig. 2
figure 2

Interface design of image sensor

Using the above image sensor interface design, the working mode of the sensor is determined. On this basis, a self-calibrating camera model is constructed to complete the image acquisition of the Internet of Things [8].

The camera adopts the pinhole imaging model, and the imaging of any point Xi in space is expressed as follows:

$$ {\lambda}_i\left[\begin{array}{c}{u}_i\\ {}{v}_i\\ {}1\end{array}\right]={P}_i{X}_i $$
(1)

In formula (1):

$$ {\displaystyle \begin{array}{c}\kern1.5em {P}_i={K}_i\left({R}_i\left|{T}_i\right.\right)\\ {}{X}_i={\left[{X}_{wi}\kern0.30em {Y}_{wi}\kern0.50em {Z}_{wi}\kern0.50em 1\right]}^T\end{array}} $$
(2)

In formulas (1) and (2), λi represents the projection depth, \( {K}_i=\left[\begin{array}{ccc}{f}_u& s& {u}_0\\ {}0& {f}_v& {v}_0\\ {}0& 0& 1\end{array}\right] \) represents the camera internal parameter matrix, fu and fv represent the effective focal lengths of the camera, s is the skew factor of the camera array, (u0, v0) is the coordinate of the camera principal point, Ri and Ti represent the rotation matrix and translation matrix between the i-th camera coordinate system and the world coordinate system respectively, and Xi is the homogeneous coordinate of the object point in the world coordinate system.

If the world coordinate system is {O, Ex, Ey, Ez} and the camera coordinate system is {o, ex, ey, ez}, then the origin of the camera coordinate system is the camera focal center C = (0e, 0e, 0e). The xeye plane is parallel to the image plane of the camera, and the image plane is described as ze = f [9].

In the camera coordinate system, any point Xe = (Xe, Ye, Ze)T in space is mapped to a point xe = (xe, ye, ze)T in the image plane. The transformation process is described as follows:

$$ \left\{\begin{array}{c}{x}_e={X}_e\frac{f}{Z_e}\\ {}{y}_e={Y}_e\frac{f}{Z_e}\end{array}\right. $$
(3)

The formula (3) is transformed into homogeneous coordinate form

$$ \lambda \left[\begin{array}{c}{x}_e\\ {}{y}_e\\ {}1\end{array}\right]=\left[\begin{array}{cccc}f& 0& 0& 0\\ {}0& f& 0& 0\\ {}0& 0& 1& 0\end{array}\right]\left[\begin{array}{c}{X}_e\\ {}{Y}_e\\ {}{Z}_e\\ {}1\end{array}\right] $$
(4)

In formula (4), λ = Ze.

After the transformation to homogeneous coordinates, the initial mapping relationship can be expressed linearly, and the matrix linking 3D and 2D points is called the projection matrix.
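
To make the projection of formulas (4) and (5) concrete, the following sketch projects a world point into pixel coordinates with the matrix K[R | −Rt]. It is only an illustration; the intrinsic and extrinsic values used here are assumed rather than taken from this paper.

```python
import numpy as np

# Assumed intrinsics (f_u, f_v: focal lengths; (u0, v0): principal point; zero skew)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Assumed extrinsics: rotation R and translation t with X_e = R (X_E - t)
R = np.eye(3)
t = np.array([0.0, 0.0, -5.0])

def project(X_world):
    """Project a 3D world point to pixels via lambda*[x, y, 1]^T = K [R | -R t] X_homog."""
    P = K @ np.hstack([R, (-R @ t).reshape(3, 1)])   # 3x4 projection matrix
    X_h = np.append(X_world, 1.0)                    # homogeneous world point
    x = P @ X_h                                      # lambda * [u, v, 1]
    return x[:2] / x[2]                              # divide out the projection depth lambda

print(project(np.array([0.5, -0.2, 3.0])))           # pixel coordinates of the example point
```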

The world coordinate system {O, Ex, Ey, Ez} and the camera coordinate system {o, ex, ey, ez} are not unified. In general, the relative position of the 3D object and the camera is unknown, so it is necessary to distinguish the world coordinate system from the camera coordinate system. There is a Euclidean transformation Xe = R(XE − t) from the coordinate system of the 3D target to the camera coordinate system, where R represents the rotation matrix and t represents the translation vector. At this point, formula (4) is converted into the following form:

$$ \lambda \left[\begin{array}{c}{x}_e\\ {}{y}_e\\ {}1\end{array}\right]=K\left[R\mid -Rt\right]\left[\begin{array}{c}{X}_E\\ {}{Y}_E\\ {}{Z}_E\\ {}1\end{array}\right] $$
(5)

Projective geometry shows that a group of lines L1, L2, L3, …, Ln that are parallel in space intersect at a point at infinity. Let the image of this point be denoted vpx. If these lines are parallel to the image plane, vpx is imaged at infinity; if they are not parallel to the image plane, vpx is imaged at a finite distance, and such points are called vanishing points. Lines in the image that pass through the vanishing point vpx correspond to lines in space that are parallel to one another [10].

Suppose that after perspective imaging of a cube, lines a, d, and h intersect at a common vanishing point vpx, lines c, i, and f intersect at a common vanishing point vpy, and lines b, e, and g intersect at vpz.

The vanishing point differs from other points in the imaging plane: it describes the direction of the straight lines, and 3D reconstruction structure data can be obtained by a comprehensive analysis of the vanishing points.

Under homogeneous coordinates, a point M = [X; Y; Z; 1] in space is projected onto a point m = [u; v; 1] in the image; a line in space is projected as l = m1 × m2 in the image plane, where m1 and m2 are the two endpoints of the projected line l.

If l1 and l2 represent the projection of any pair of parallel lines, the vanishing point is vp = l1 × l2.

For a group of parallel-line projections li, i = 1…n, in space, the vanishing point can be calculated by the following least-squares solution:

$$ J(vp)=\underset{vp}{\min}\sum \limits_{i=1}^n{\left({l}_i^T vp\right)}^2 $$
(6)
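
As a hedged illustration of formula (6), the sketch below estimates the vanishing point of a set of image lines as the unit vector minimizing Σ(liTvp)2, i.e., the right singular vector of the stacked line matrix associated with the smallest singular value. The line endpoints used in the example are made up for demonstration.

```python
import numpy as np

def vanishing_point(lines):
    """Estimate the vanishing point of image lines given in homogeneous coordinates
    l_i = m1 x m2, by minimizing J(vp) = sum_i (l_i^T vp)^2 subject to ||vp|| = 1."""
    L = np.asarray(lines, dtype=float)           # n x 3 matrix of line coefficients
    _, _, vt = np.linalg.svd(L)                  # smallest right singular vector minimizes ||L vp||^2
    vp = vt[-1]
    return vp / vp[2] if abs(vp[2]) > 1e-12 else vp   # normalize if the point is finite

# Illustrative example: each image line is built from two endpoints (l = m1 x m2)
m = lambda u, v: np.array([u, v, 1.0])
l1 = np.cross(m(10, 10),  m(200, 60))
l2 = np.cross(m(10, 120), m(200, 140))
l3 = np.cross(m(10, 230), m(200, 220))
print(vanishing_point([l1, l2, l3]))             # least-squares intersection of the three lines
```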

If the vanishing points in three mutually perpendicular directions in space are known to be vpx, vpy, and vpz, they all satisfy formula (7). The formula is transformed into a system of linear equations in the elements of ω; solving this system gives ω, from which the camera internal parameter matrix can be established to complete camera self-calibration.

$$ v{p}_x^T\omega v{p}_y=v{p}_x^T\omega v{p}_z=v{p}_y^T\omega v{p}_z=0 $$
(7)

When calculating the 3D coordinates of matching points, the world coordinate system is placed at the position of the first camera coordinate system, and the camera projection matrices are as follows [11]:

$$ \left\{\begin{array}{c}{P}_1={M}_1\left[I\left|0\right.\right]\\ {}{P}_2={M}_1\left[R\left|t\right.\right]\end{array}\right. $$
(8)

In the formula, M1 is the camera internal parameter matrix and R is the rotation matrix. For the projection matrix Pi, assume that Pi1, Pi2, and Pi3 are the row vectors of Pi, (ui, vi, 1)T is the homogeneous coordinate of the i-th matching point, and \( {\overset{\sim }{X}}_i \) is the homogeneous coordinate of the corresponding space point:

$$ {Z}_c\left[\begin{array}{c}u\\ {}v\\ {}1\end{array}\right]=\left[\begin{array}{cccc}{\alpha}_u& 0& {u}_0& 0\\ {}0& {\alpha}_v& {v}_0& 0\\ {}0& 0& 1& 0\end{array}\right]\left[\begin{array}{cc}R& t\\ {}{0}^T& 1\end{array}\right]\left[\begin{array}{c}{X}_w\\ {}{Y}_w\\ {}{Z}_w\\ {}1\end{array}\right]={M}_1{M}_2{X}_w=M{X}_w $$
(9)

Where αu = f/dx and αv = f/dy are the scale factors along the x and y axes, and Zc can be taken as a constant factor. Combining formulas (8) and (9), we can obtain:

$$ \left[\begin{array}{c}{P}_{13}{u}_i-{P}_{11}\\ {}{P}_{13}{v}_i-{P}_{12}\\ {}{P}_{23}{u}_i-{P}_{21}\\ {}{P}_{23}{v}_i-{P}_{22}\end{array}\right]{\overset{\sim }{X}}_i=A{\overset{\sim }{X}}_i=0 $$
(10)

From formula (10), the value of \( {\overset{\sim }{X}}_i \) can be obtained by solving three unknowns with four equations. The eigenvector corresponding to the minimum eigenvalue of ATA is the solution for \( {\overset{\sim }{X}}_i \).
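
The following sketch illustrates one way of solving formula (10): the four linear constraints from the two projection matrices of formula (8) are stacked into A, and the eigenvector of ATA with the smallest eigenvalue (equivalently, the smallest right singular vector of A) gives the homogeneous point. The intrinsic matrix and camera displacement below are assumed for demonstration only.

```python
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Linear triangulation: stack the four constraints of formula (10) for two views
    and return the homogeneous point minimizing ||A X||^2 with ||X|| = 1."""
    u1, v1 = uv1
    u2, v2 = uv2
    A = np.vstack([u1 * P1[2] - P1[0],
                   v1 * P1[2] - P1[1],
                   u2 * P2[2] - P2[0],
                   v2 * P2[2] - P2[1]])
    _, _, vt = np.linalg.svd(A)        # smallest right singular vector of A
    X = vt[-1]
    return X[:3] / X[3]                # back to inhomogeneous 3D coordinates

# Illustrative projection matrices P1 = M1 [I | 0], P2 = M1 [R | t] with assumed M1, R, t
M1 = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
P1 = M1 @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = M1 @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 4.0, 1.0])
uv1 = (P1 @ X_true)[:2] / (P1 @ X_true)[2]
uv2 = (P2 @ X_true)[:2] / (P2 @ X_true)[2]
print(triangulate(P1, P2, uv1, uv2))   # recovers approximately (0.2, -0.1, 4.0)
```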

Based on the Internet of Things images collected by the solid-state image sensor, a mathematical model of Internet of Things image observation is constructed below.

3 Mathematical model for image observation in the internet of things

In order to better solve the problem of super-resolution reconstruction, the process of low-resolution image acquisition needs to be modeled [12]. The degradation of an ideal high-resolution image to a low-resolution image is simulated, and Fig. 3 shows the process of image degradation.

Fig. 3
figure 3

Internet of things image degradation process.

The specific causes of motion deformation, blurring, downsampling, and additive noise in Fig. 3 are as follows:

  1. (1)

    Motion deformation: there are global motion and local motion. Global motion is caused by camera motion; after global motion, the image is deformed, and every object in the image shares the same motion characteristics and parameters, so the motion can be compensated by estimating the parameters of a two-dimensional or multi-dimensional model. Local motion is caused by the motion of individual objects in the scene; after local motion, each object in the image has its own motion characteristics and parameters, which are relatively complicated to handle [13].

  2. (2)

    Blurring: there are mainly three types: motion blur caused by relative motion, optical blur caused by defocus of the optical imaging system, the diffraction limit, and other factors, and blur introduced by the low-resolution sensor. In single-image super-resolution reconstruction, these blurs are usually characterized by point spread functions [14].

  3. (3)

    Sampling: the light emitted by the object is converted into an electric signal on the sensor. In order to display and store the electric signal, it needs to be sampled. The sampling process may cause signal distortion, resulting in a decrease in the sharpness of the image and output of a low-resolution image [15].

  4. (4)

    Additive noise: noise formed by the mutual interference of various components in the system, and noise introduced during sampling.

Let X be an ideal high-resolution image. It can be seen from Fig. 3 that after X undergoes motion deformation, blurring, sampling, and noise during imaging, the quality of the actually obtained image decreases, resulting in a low-resolution image. The low-resolution image may be the result of one imaging or of multiple imagings.

The low-resolution image of the k-th frame, obtained by imaging the same scene K times, is denoted as Yk. The mathematical description of this process is:

$$ {Y}_k=D{B}_k{M}_kX+{n}_k $$
(11)

Where k = 1, 2, ⋯, K is the frame index of the image sequence, K is the total number of frames of the image sequence in the same scene; Yk is the low-resolution image of the k-th frame; X is the ideal high-resolution image; D is the downsampling matrix; Bk is the blur matrix of Yk; Mk is the deformation matrix of Yk; nk is the additive noise matrix.

After the scene is imaged once, a single low-resolution image is obtained, which is denoted as Y. This process is not affected by a geometric transformation matrix. The mathematical description of the process is:

$$ Y= DBX+n $$
(12)

For ease of description, the noise term is omitted to obtain a simplified expression:

$$ Y= DBMX\ \mathrm{or}\ Y= DBX $$
(13)
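
To illustrate the simplified observation model Y = DBX + n of formulas (12) and (13), the sketch below blurs a high-resolution array with a Gaussian point spread function, decimates it, and adds Gaussian noise. The kernel size, decimation factor, and noise level are illustrative assumptions, not values from this paper.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Separable Gaussian kernel acting as the blur matrix B (point spread function)."""
    ax = np.arange(size) - size // 2
    k = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    k = np.outer(k, k)
    return k / k.sum()

def degrade(X, factor=2, sigma=1.0, noise_std=5.0):
    """Simulate Y = D B X + n: blur with the PSF, downsample by `factor`, add Gaussian noise."""
    k = gaussian_kernel(5, sigma)
    Xp = np.pad(X, 2, mode='edge')
    blurred = np.zeros_like(X, dtype=float)
    for i in range(X.shape[0]):                       # direct 2D convolution (B X)
        for j in range(X.shape[1]):
            blurred[i, j] = np.sum(Xp[i:i + 5, j:j + 5] * k)
    low = blurred[::factor, ::factor]                 # downsampling matrix D
    noisy = low + np.random.normal(0, noise_std, low.shape)   # additive noise n
    return np.clip(noisy, 0, 255)

X = np.random.randint(0, 256, (64, 64)).astype(float)  # stand-in high-resolution image
Y = degrade(X)
print(X.shape, '->', Y.shape)                           # (64, 64) -> (32, 32)
```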

4 A super-resolution reconstruction method for a single frame character image of internet of things

4.1 Wavelet threshold denoising

Wavelet threshold denoising can be divided into the following steps [16]:

  1. (1)

    Wavelet decomposition: appropriate wavelet basis and decomposition level are selected to decompose the image.

  2. (2)

    Threshold selection and quantification: the corresponding threshold is chosen for decomposed high-frequency coefficients of each layer, and the threshold is quantified.

  3. (3)

    Wavelet reconstruction: according to the low-frequency coefficients of the N-th level of the wavelet decomposition and the high-frequency coefficients of each level after thresholding, the image is reconstructed with the inverse wavelet transform. In wavelet threshold denoising, the selection and quantification of thresholds are very important and directly determine the quality of denoising.

Hard thresholding and soft thresholding are two common wavelet threshold methods. The hard threshold algorithm sets the wavelet coefficient Wi, j to zero when its magnitude is not larger than the threshold λ, i.e.:

$$ {W}_{i,j}=\left\{\begin{array}{cc}0& \left|{W}_{i,j}\right|\le \lambda \\ {}{W}_{i,j}& \left|{W}_{i,j}\right|>\lambda \end{array}\right. $$
(14)

The soft threshold algorithm also replaces the wavelet coefficient Wi, j with zero if its magnitude is not larger than the threshold λ, and otherwise shrinks the coefficient toward zero by λ, i.e.:

$$ {W}_{i,j}=\left\{\begin{array}{cc}0& \left|{W}_{i,j}\right|\le \lambda \\ {}\operatorname{sgn}\left({W}_{i,j}\right)\left(\left|{W}_{i,j}\right|-\lambda \right)& \left|{W}_{i,j}\right|>\lambda \end{array}\right. $$
(15)
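
The hard and soft threshold rules of formulas (14) and (15) can be written compactly as in the following sketch; the example coefficients and the threshold value are arbitrary.

```python
import numpy as np

def hard_threshold(W, lam):
    """Formula (14): zero out coefficients with |W| <= lambda, keep the rest unchanged."""
    return np.where(np.abs(W) <= lam, 0.0, W)

def soft_threshold(W, lam):
    """Formula (15): zero out small coefficients, shrink the rest toward zero by lambda."""
    return np.sign(W) * np.maximum(np.abs(W) - lam, 0.0)

coeffs = np.array([-3.2, -0.4, 0.1, 0.9, 2.5])   # example high-frequency wavelet coefficients
print(hard_threshold(coeffs, 1.0))                # [-3.2, 0., 0., 0., 2.5]
print(soft_threshold(coeffs, 1.0))                # [-2.2, 0., 0., 0., 1.5]
```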

A local adaptive threshold selection method based on the wavelet decomposition level, the local contrast, and the statistical characteristics of the high-frequency coefficients is adopted. In this method, the high-frequency coefficient matrices in the horizontal, vertical, and diagonal directions of the lifting wavelet decomposition are processed in blocks to obtain several sub-coefficient matrices, each of which corresponds to local information of the image [17]. Based on the number of decomposition levels, the local contrast, and the absolute median, the following threshold selection model is used:

$$ {T}_{ijk}=\frac{\lambda_{ijk}{N}_{ijk}}{2^i-1} $$
(16)

In the formula, k = 1, 2, 3 represents the horizontal, vertical, and diagonal directions respectively; i represents the decomposition level; λijk represents the local contrast of the j-th sub-matrix in the k-th direction of the i-th level; and Nijk represents the absolute median of the high-frequency coefficients of that sub-matrix. Because the decomposition level i, the local contrast, and the absolute median of the high-frequency coefficients differ between sub-matrices, a more adaptive threshold is selected.

The Daubechies (9/7) wavelet transform is a biorthogonal wavelet transform with linear phase, and biorthogonal wavelet transforms have been widely used in image processing. The biorthogonal wavelet decomposition formulas for a discrete signal are [18]:

$$ {y}_i=\sum \limits_k{h}_k{x}_{2i-k} $$
(17)
$$ {z}_i=\sum \limits_k{g}_k{x}_{2i-k} $$
(18)

In the formulas, h and g represent the low-pass and high-pass decomposition filters, x is the original signal, and y and z are the low-frequency and high-frequency signals obtained after decomposition. The reconstruction formula is:

$$ {x}_i=\sum \limits_k\left({\overline{h}}_{i-2k}{y}_k+{\overline{g}}_{i-2k}{z}_k\right) $$
(19)
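
The sketch below implements one level of the decomposition formulas (17) and (18) directly; for simplicity it uses the Haar filter pair as a stand-in for the Daubechies 9/7 filters, so the numbers are only illustrative.

```python
import numpy as np

def analysis(x, h, g):
    """One level of formulas (17)-(18): y_i = sum_k h_k x_{2i-k}, z_i = sum_k g_k x_{2i-k}."""
    n = len(x) // 2
    y = np.zeros(n)
    z = np.zeros(n)
    for i in range(n):
        for k in range(len(h)):
            idx = (2 * i - k) % len(x)        # periodic extension at the borders
            y[i] += h[k] * x[idx]
            z[i] += g[k] * x[idx]
    return y, z

# Haar filters used as a simple stand-in for the Daubechies 9/7 pair
h = np.array([1.0, 1.0]) / np.sqrt(2.0)       # low-pass decomposition filter
g = np.array([1.0, -1.0]) / np.sqrt(2.0)      # high-pass decomposition filter

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 7.0])
y, z = analysis(x, h, g)
print(y)   # low-frequency (smooth) part
print(z)   # high-frequency (detail) part
```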

4.2 Feature point extraction and matching

Feature points are points with obvious features in the image, such as gray-level extrema and corner points. The obtained feature points are called interest operators or favorable operators. Generally speaking, interest points have a typical local property and can be located by a local detection operator. In this paper, the Moravec feature point extraction operator is adopted, which follows these principles: first, the target image is converted to gray scale, the minimum gray-level variance over the principal directions is taken to represent the change of pixel gray value, and then the point with the highest interest value in a local area of the image is selected as the feature point. The detailed algorithm steps are as follows [19]:

Step 1: Let the gray value of pixel (u, v) be gu, v. To obtain the interest value of pixel (u, v), the sums of squared gray-level differences of adjacent pixels in four directions are computed within a 5 × 5 window centered on pixel (u, v). With k = int(n/2) = 2, the sums of squares are calculated as follows:

$$ {V}_1=\sum \limits_{i=-k}^{k-1}{\left({g}_{u+i,v}-{g}_{u+i+1,v}\right)}^2 $$
(20)
$$ {V}_2=\sum \limits_{i=-k}^{k-1}{\left({g}_{u,v+i}-{g}_{u,v+i+1}\right)}^2 $$
(21)
$$ {V}_3=\sum \limits_{i=-k}^{k-1}{\left({g}_{u+i,v+i}-{g}_{u+i+1,v+i+1}\right)}^2 $$
(22)
$$ {V}_4=\sum \limits_{i=-k}^{k-1}{\left({g}_{u+i,v-i}-{g}_{u+i+1,v-i+1}\right)}^2 $$
(23)

The smallest of these is taken as the interest value of pixel (u, v):

$$ I{V}_{u,v}=\min \left\{{V}_1,{V}_2,{V}_3,{V}_4\right\} $$
(24)

Step 2: Using a known threshold, the points whose interest value is higher than this threshold are taken as candidate feature points. The threshold should be chosen so that the candidate points contain the required feature points without including too many useless points.

Step 3: Among the candidate points, select the point with the maximum interest value within a fixed-size window as the required feature point; all other candidate points in the window are removed, so that only the point with the maximum interest value is retained as the unique feature point.
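
A minimal sketch of Steps 1 and 2 of the Moravec operator follows: it computes the four directional sums of formulas (20)–(23), takes their minimum as in formula (24), and keeps candidates above a threshold. The non-maximum suppression of Step 3 is omitted, and the threshold value and test image are assumptions for illustration.

```python
import numpy as np

def moravec_interest(gray, u, v, k=2):
    """Interest value IV_{u,v} of formulas (20)-(24): minimum of the four directional
    sums of squared gray-level differences inside a (2k+1) x (2k+1) window."""
    g = gray.astype(float)
    v1 = sum((g[u + i, v]     - g[u + i + 1, v])         ** 2 for i in range(-k, k))
    v2 = sum((g[u, v + i]     - g[u, v + i + 1])         ** 2 for i in range(-k, k))
    v3 = sum((g[u + i, v + i] - g[u + i + 1, v + i + 1]) ** 2 for i in range(-k, k))
    v4 = sum((g[u + i, v - i] - g[u + i + 1, v - i + 1]) ** 2 for i in range(-k, k))
    return min(v1, v2, v3, v4)

def moravec_candidates(gray, threshold, k=2):
    """Step 2: keep pixels whose interest value exceeds the threshold as candidate feature points."""
    h, w = gray.shape
    points = []
    for u in range(k, h - k - 1):
        for v in range(k, w - k - 1):
            iv = moravec_interest(gray, u, v, k)
            if iv > threshold:
                points.append((u, v, iv))
    return points

img = np.random.randint(0, 256, (32, 32))        # stand-in grayscale image
print(len(moravec_candidates(img, threshold=5000)))
```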

The traditional feature point matching method based on image gray level detects the gray features of the regions related to the standard image and the target to be matched. In this paper, image feature points are matched by finding the maximum gray-level covariance of the region to be detected, and the least-squares image matching method is used in the matching process [20].

Suppose there are two digital images A and B to be matched. If GAij and GBij denote the gray values in row i and column j of an N × N pixel array in images A and B respectively, the gray mean and variance are expressed by the following formulas:

$$ \left\{\begin{array}{c}\overline{GA}=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^NG{A}_{ij}\\ {}\overline{GB}=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^NG{B}_{ij}\end{array}\right. $$
(25)
$$ \left\{\begin{array}{c}{\sigma}_A=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^N{\left(G{A}_{ij}-\overline{GA}\right)}^2\\ {}{\sigma}_B=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^N{\left(G{B}_{ij}-\overline{GB}\right)}^2\end{array}\right. $$
(26)

The formula of covariance between pixel A and pixel B is as follows:

$$ {C}_{AB}=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^N\left(G{A}_{ij}-\overline{GA}\right)\left(G{B}_{ij}-\overline{GB}\right)=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^NG{A}_{ij}G{B}_{ij}-\overline{GA GB} $$
(27)

When the covariance CAB reaches its maximum, the N × N pixel arrays in A and B are regarded as the matched image arrays, and the matching is completed according to their center points.
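
The sketch below follows formulas (25)–(27): it slides an N × N patch over the second image and keeps the position with the largest covariance CAB. The patch size, the test images, and the translation used to build them are assumptions for illustration only.

```python
import numpy as np

def covariance(GA, GB):
    """Formula (27): covariance between two N x N gray-level arrays."""
    return np.mean(GA * GB) - GA.mean() * GB.mean()

def match_patch(A, B, top_left, N=11):
    """Slide the N x N patch of A located at `top_left` over B and return the
    position where the covariance C_AB of formula (27) is maximal."""
    r0, c0 = top_left
    patch = A[r0:r0 + N, c0:c0 + N].astype(float)
    best, best_pos = -np.inf, None
    for r in range(B.shape[0] - N + 1):
        for c in range(B.shape[1] - N + 1):
            cab = covariance(patch, B[r:r + N, c:c + N].astype(float))
            if cab > best:
                best, best_pos = cab, (r, c)
    return best_pos, best

A = np.random.randint(0, 256, (40, 40))
B = np.roll(A, shift=(3, 5), axis=(0, 1))          # B is A translated by (3, 5)
print(match_patch(A, B, top_left=(10, 10)))        # expected best position near (13, 15)
```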

4.3 Super-resolution image reconstruction based on wavelet neural network

The conversion from a low-resolution image to a super-resolution image is defined as F(⋅), the input-layer vector of the network is ai, j = (pi, j, qi, j, 1)T, σ(⋅) is the activation (limiting) function, and Ωp − q is the coordinate (p, q) of a point on the 2D image plane [21].

The 3D image constructed by the wavelet neural network reflection model can be defined as:

$$ {R}_{i,j}\left(p,q\right)=F\left(w,v,{a}_{i,j}\right)=\sigma \left[\sum \limits_{k=1}^N{V}_k\sigma \left({W}_k{a}_{i,j}\right)+\theta \right] $$
(28)
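
A minimal sketch of the forward computation in formula (28) follows, assuming a sigmoid for σ(⋅) and random placeholder weights; the number of hidden nodes, the weights, and the bias are not values from this paper.

```python
import numpy as np

def sigma(x):
    """Assumed activation sigma(.) of formula (28); a sigmoid is used here for illustration."""
    return 1.0 / (1.0 + np.exp(-x))

def wnn_forward(a, W, V, theta):
    """Forward pass of formula (28): R = sigma( sum_k V_k * sigma(W_k a) + theta ).
    a is the input vector (p, q, 1); W has one row per hidden node, V one weight per hidden node."""
    hidden = sigma(W @ a)                # hidden-layer responses sigma(W_k a)
    return sigma(V @ hidden + theta)

rng = np.random.default_rng(0)
N = 8                                    # assumed number of hidden nodes
W = rng.normal(size=(N, 3))              # hidden weights W_k (placeholders)
V = rng.normal(size=N)                   # output weights V_k (placeholders)
theta = 0.1                              # output bias (placeholder)

a = np.array([0.25, 0.75, 1.0])          # input vector a_{i,j} = (p_{i,j}, q_{i,j}, 1)
print(wnn_forward(a, W, V, theta))       # reconstructed reflectance value R_{i,j}(p, q)
```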

The height of the image surface is calculated [22]:

$$ {z}_{i,j}\left(n+1\right)={z}_{i,j}(n)-\frac{\varepsilon }{4}\left({f}_{i,j}+{g}_{i,j}\right) $$
(29)

In the equation, zi, j represents the height of the image surface, and fi, j and gi, j represent the width and length of the image surface. The expression of the reflection model based on the wavelet neural network output layer is [23]:

$$ V=\sum \limits_{i,j}\left({\phi}_{i,j},{\phi}_{i,j}^T\right) $$
(30)

Where ϕi, j is the image feature vector. The expression of the hidden-layer reflection model based on the wavelet neural network is:

$$ {w}_k\left(n+1\right)={w}_k(n)+{\eta}_w\left[-\frac{E_T}{w_k}+E\left({w}_k\right)\right] $$
(31)

Where w represents the space formed by the detail information described by the wavelet function, ET represents the optimal nonlinearity parameter, E(wk) represents the least-optimized objective function, and the single-frame character image super-resolution reconstruction is evaluated by the energy function E(θ) of the network. The higher the value of E(θ), the higher the resolution of the reconstructed single-frame character image. E(θ) is calculated as [24, 25]:

$$ E\left(\theta \right)=F{\left({W}_k,v,a\right)}^2+\lambda $$
(32)

Wavelet threshold denoising, feature point extraction and matching, and super-resolution image reconstruction together complete the super-resolution reconstruction of single-frame character Internet of Things images based on the wavelet neural network. The application performance of this method is verified in the following section.

5 Experimental results and analysis

To verify the overall effectiveness of the proposed method, the single-frame character image super-resolution reconstruction method based on the wavelet neural network is tested. The test platform is Matlab running on Windows 7. In the experiments, several indexes are used to evaluate the methods: anti-interference performance, signal-to-noise ratio (SNR), resolution of the reconstructed image, local and global reconstruction effect, clarity of the reconstructed image, feature extraction precision, and retention of image detail. Using these indicators, the proposed method is compared with the methods of reference [5] and reference [7]. The specific experimental results are as follows:

5.1 Anti-jamming performance test of different methods

The single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization proposed by Peng et al. [5] and the single-frame image super-resolution reconstruction method based on support vector regression (SVR) proposed by Yuan et al. [7] are used as comparison methods, and the anti-interference performance of the three methods is tested. The test results are shown in Fig. 4.

Fig. 4
figure 4

Test results of three different methods. (a) Results of the proposed method, (b) Results of the method based on non-negative neighborhood embedding and non-local regularization, (c) Results of the method based on SVR.

Figure 4(a) shows the test results of the single-frame character image super-resolution reconstruction method based on the wavelet neural network. Comparing the signal frequency before and after adding interference, it can be seen from Fig. 4(a) that the frequency fluctuation of the signal in the image is small after the interference is added, which verifies that the proposed method has strong anti-interference performance. Figure 4(b) shows the test results of the single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization, and Fig. 4(c) shows the results of the single-frame image super-resolution reconstruction method based on SVR. Figures 4(b) and (c) show that for these two methods the frequency fluctuation of the signal in the image after adding interference is large, which indicates that their anti-interference performance is weaker.

5.2 SNR comparison of different methods

In order to further verify the anti-interference ability of the proposed method, the single-frame character image super-resolution reconstruction method based on the wavelet neural network is compared with the method based on non-negative neighborhood embedding and non-local regularization (method 1) and the method based on SVR (method 2), and the SNR of the reconstructed 512 × 512 × 8 images is compared. Equation (33) is used to calculate the SNR; the test results are shown in Table 2.

$$ {R}_{PSNR}=10{\log}_{10}\left(\frac{255^2}{D}\right) $$
(33)
Table 2 SNR of three methods.
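
A short sketch of formula (33) follows, under the assumption that D denotes the mean squared error between the reference and the reconstructed image; the test images are synthetic placeholders.

```python
import numpy as np

def psnr(reference, reconstructed):
    """Formula (33), with D taken as the mean squared error between the two images."""
    D = np.mean((reference.astype(float) - reconstructed.astype(float)) ** 2)
    return float('inf') if D == 0 else 10.0 * np.log10(255.0 ** 2 / D)

ref = np.random.randint(0, 256, (64, 64))
rec = np.clip(ref + np.random.normal(0, 5, ref.shape), 0, 255)
print(round(psnr(ref, rec), 2))        # PSNR in dB for the illustrative pair
```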

Analysis of the data in Table 2 shows that, for the seven test images, the SNR of the image does not decrease with increasing input noise intensity when the proposed method is used, which shows that the single-frame character image super-resolution reconstruction method based on the wavelet neural network can effectively remove the noise in the image. When the method based on non-negative neighborhood embedding and non-local regularization and the SVR-based method are used, the SNR of the image decreases as the input noise intensity increases, which proves that these two methods cannot remove the noise effectively when the noise intensity is high. The test results show that the SNR of the proposed method is higher than that of the comparison methods under different noise intensities, which verifies that the single-frame character image super-resolution reconstruction method based on the wavelet neural network is more effective.

5.3 Image resolution comparison under different methods

The single-frame character image super-resolution reconstruction method based on the wavelet neural network, the single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization, and the single-frame image super-resolution reconstruction method based on SVR are tested separately, and the resolution of the images reconstructed with the three methods is compared. The test results are shown in Fig. 5.

Fig. 5
figure 5

Image resolution obtained with three different methods. (a) Image resolution obtained by the proposed method, (b) Results of the method based on non-negative neighborhood embedding and non-local regularization, (c) Results of the method based on SVR.

Figure 5(a) shows the image resolution obtained by reconstructing a single-frame character image with the proposed method, Fig. 5(b) shows the resolution obtained with the method based on non-negative neighborhood embedding and non-local regularization, and Fig. 5(c) shows the resolution obtained with the SVR-based method. Comparing Fig. 5(a), (b), and (c), it can be seen that the image resolution obtained by the proposed method is higher than that of the two comparison methods. This is because the methods based on non-negative neighborhood embedding and non-local regularization and on SVR have poor anti-interference performance: during reconstruction the image contains too much noise, resulting in a lower resolution of the reconstructed image.

5.4 Contrast of local reconstruction effects of different methods

Based on the above experimental results, the local reconstruction effect of the proposed method on a sample image is compared with that of the single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization in reference [5] and the SVR-based single-frame image super-resolution reconstruction method in reference [7]. Figure 6 is the sample image of this experiment; Fig. 7(a) is the image after local reconstruction using the method of reference [5], Fig. 7(b) is the image after local reconstruction using the method of reference [7], and Fig. 7(c) is the image after local reconstruction using the proposed method.

Fig. 6
figure 6

Experimental sample image

Fig. 7
figure 7

Comparison of image local reconstruction effects of different methods. (a) Method in literature [5], (b) Method in literature [7], (c)The proposed method

Fig. 8
figure 8

Image global reconstruction test sample. (a) Experimental sample image 1, (b) Experimental sample image 2

As shown in Fig. 7, there are obvious occluded parts of the animal body. From the images it can be clearly seen that the local reconstruction effect of the methods in references [5] and [7] is poor. There are obvious noise points in the local reconstruction result of the method in reference [5], which leads to the loss of fine features of the animal fur. Although the noise problem of the method in reference [7] is significantly improved, its local reconstruction structure is incomplete. In contrast, the proposed method not only suppresses image blur but also preserves the details and texture of the image and reconstructs the occluded target with high definition.

5.5 Comparison of global image reconstruction effects with different methods

On the basis of verifying the local reconstruction effect, in order to further verify the image reconstruction effect of the proposed method, two experimental samples are used to compare the global image reconstruction effect of the different methods. Figure 8 shows the two sample images of this experiment. The single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization proposed by Peng et al. in reference [5] and the SVR-based single-frame image super-resolution reconstruction method proposed by Yuan et al. in reference [7] are used as experimental controls, and their results are compared with those of the proposed method. The specific comparison results are shown in Fig. 9 and Fig. 10.

Fig. 9
figure 9

Sample 1 Global image reconstruction effect. (a) Method in literature [5], (b) Method in literature [7], (c)The proposed method

Fig. 10
figure 10

Sample 2 Global image reconstruction effect. (a) Method in literature [5], (b) Method in literature [7], (c)The proposed method

It can be seen from the experimental results in Fig. 9 and Fig. 10 that the global reconstruction effects of the different methods differ considerably for experimental sample images 1 and 2. The global reconstruction results of references [5] and [7] suffer from blurred image details and ghosting, which leads to a large amount of information loss and incomplete image features after reconstruction. In contrast, the global reconstruction effect of the proposed method is more satisfactory: image clarity is ensured, the key features of the image are more complete, and the detail retention rate is better.

5.6 Contrast of image sharpness after reconstruction under different methods

In order to further verify the clarity of the image processed by the proposed method, the blur degree of the detail representation after image reconstruction is compared for the three methods. The lower the blur degree, the better the image reconstruction effect. The experimental results are shown in Fig. 11:

Fig. 11
figure 11

Experimental comparison results

According to the analysis of Fig. 11, after image reconstruction by the traditional methods, the overall blur value of the image is high, reaching a maximum of 45%; in addition, the blur degree changes greatly during the experiment and does not remain at a stable level. Throughout the experiment, the blur degree of the proposed method remains at a low level and is basically unaffected by the number of experiments. The results show that the single-frame character Internet of Things image reconstructed by the proposed method maintains high definition.

5.7 Comparison of image feature extraction accuracy of different methods

The accuracy of image feature extraction is a key index that affects the image reconstruction effect, and high extraction precision is an important prerequisite for image reconstruction. In this section, the accuracy of image feature extraction is taken as the evaluation index to compare the different methods. The specific experimental results are as follows:

It can be seen from the experimental results in Fig. 12 that there is no significant difference in image feature extraction accuracy between the methods of reference [5] and reference [7]. Over 700 experimental iterations, their accuracy remains below 50%, and at the 100th iteration the feature extraction accuracy is lowest, at about 10%. These data show that the traditional methods do not meet the requirements of image processing in this field and cannot complete high-standard image reconstruction. Throughout the experiment, the accuracy of the proposed method is maintained at 80%–90%. This is because the Moravec feature point extraction operator is introduced in this paper: the point with the highest interest value is selected as the feature point in the local area of the image, which greatly improves the accuracy of image processing.

Fig. 12
figure 12

Experimental comparison results

5.8 Comparison of image detail retention rates by different methods

In order to verify the effect of the proposed method on the retention of image detail information, the following experiment is carried out. The sample images in the experiment come from the ImageNet database (http://www.image-net.org/). From this database, 100 images collected by the Internet of Things are selected as the research objects of this experiment. The 100 images are reconstructed using the method of reference [5], the method of reference [7], and the proposed method, and the detail retention rates of the processed images are compared. The experimental results are shown in Fig. 13.

Fig. 13
figure 13

Comparison of detail information retention rate of different methods

According to the experimental data in Fig. 13, after image reconstruction the image detail retention rate of the method in reference [5] is about 40%–60%; in other words, this traditional method loses about half of the detail information during image reconstruction, which cannot meet the requirements of the current image processing field. The method in reference [7] has similar defects: its image information retention rate varies from 40% to 70%, and although its reconstruction effect is slightly better, it still does not meet the requirements of image processing. In contrast, the image detail retention rate of the proposed method is more stable; throughout the experiment, as the number of test images increases, the detail retention rate remains at 80%–95%. These data show that the practical application effect of the proposed method is better.

6 Conclusion

This paper proposes a single-frame character image super-resolution reconstruction method based on the wavelet neural network. The super-resolution reconstruction of the single-frame character image is completed, the noise in the image is effectively removed, and a single-frame character image with higher resolution is obtained. The experimental results show that the blur degree of the reconstructed image is less than 5%, the feature extraction accuracy of the method is maintained at 80%–90% throughout the experiment, and the image detail retention rate is maintained at 80%–95%. These data show that, compared with the traditional methods, the proposed method achieves better image reconstruction results.

Super-resolution image reconstruction is one of the hot topics in recent years. Finding a highly efficient and simple reconstruction algorithm is of great significance. There are still many problems to be solved in the study of super-resolution image reconstruction.

  1. (1)

    Improvement of the super-resolution reconstruction model: The reconstruction model of the Internet of Things image in different research areas is different, especially in the research of super-resolution reconstruction of single-frame character images, it is necessary to combine the formation process of single-frame character images to establish a reconstruction model more suitable for single-frame character images.

  2. (2)

    Evaluation of reconstructed Internet of Things image quality: In practical applications, other evaluation indicators must be considered to achieve an accurate evaluation of the reconstructed image quality.