1 Introduction

Super-resolution reconstruction of Internet of Things images refers to a digital image processing technique that reconstructs a high-resolution image from one or more low-resolution images. According to the type of image, super-resolution reconstruction mainly covers the reconstruction of color images and the reconstruction of depth images [1]. In image processing, super-resolution reconstruction techniques are generally used to increase the resolution of acquired low-resolution images. The clarity of a single-frame Internet of Things image is the prerequisite for subsequent image processing; a high degree of blur aggravates image problems in the Internet of Things and leads to unsatisfactory application effects [2, 3].

Internet of Things image super-resolution reconstruction has important application value in many fields, so super-resolution reconstruction methods for single-frame character images need to be analyzed and studied. Liu et al. proposed a single-frame image super-resolution reconstruction method based on multi-level convolutional neural network learning. This method constructs a PMJ model for super-resolution reconstruction and performs preliminary feature extraction on the image during the sensing phase; however, it cannot effectively remove the noise in a single-frame character image, and its anti-interference performance is poor [4]. Peng et al. reported a single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization, in which image structure can be well preserved; during the reconstruction phase, non-negative neighborhood embedding is used to select the number of neighbors, and finally the non-local similarity of the image is used to construct non-local regular terms to modify the reconstruction result. The process of this method is complex and prone to errors, resulting in lower resolution of the reconstructed image [5]. It also ignores the errors caused by acquiring super-resolution images with cameras under visible light communication [6]. Yuan et al. proposed a single-frame image super-resolution reconstruction method based on support vector regression (SVR). This method uses raster scanning to scan high- and low-resolution training images and extracts input vectors and label pixels from the blocks respectively; SVR tools are then used to regress the label pixels of the corresponding super-resolution image block to complete the reconstruction. This method cannot remove the Gaussian noise in a single-frame character image, and its anti-interference performance is poor [7].

In summary, a super-resolution reconstruction method for single-frame character Internet of Things images based on a wavelet neural network is proposed. Combined with the pinhole imaging model and camera self-calibration, Internet of Things image acquisition is completed. Based on the acquisition method, an image degradation model is established to describe the degradation from a high-resolution image to a low-resolution image. A wavelet threshold denoising method is used to remove the noise in the single-frame character image. The wavelet neural network reflection model is then used to reconstruct the single-frame character image and improve its resolution. Experimental results show that, compared with traditional methods, the proposed method has a better denoising effect, lower image blur after reconstruction, and better practical application effect.

2 Internet of things image acquisition based on solid state image sensor

The main function of the Internet of Things image acquisition unit is to select the image sensor model and configure the working mode of the sensor so that effective images can be collected normally. The structure of the image acquisition unit is shown in Fig. 1.

Fig. 1
figure 1

Structure of image acquisition unit

As shown in Fig. 1, the image sensor model is selected first. The image sensor converts the light image on the photosensitive surface into the corresponding electrical signal by using the photoelectric conversion function of the photoelectric device. Widely used image sensors are mainly divided into two types: solid-state image sensors and photoconductive camera tubes. The comparison of the two is shown in Table 1.

Table 1 Comparison between solid-state image sensor and photoconductive tube

The comparison shows that the solid-state image sensor has the advantages of light weight, high integration, small size, long life, and low power consumption. For this reason, the system uses a solid-state image sensor to collect images.

Secondly, the working mode of the image sensor is configured, which mainly involves designing the image sensor interface, as shown in Fig. 2.

Fig. 2
figure 2

Interface design of image sensor

Using the above image sensor interface design, the working mode of the sensor is determined. On this basis, a self-calibrating camera model is constructed to complete the image acquisition of the Internet of Things [8].

The camera adopts the pinhole imaging model, and the imaging of any point Xi in space is expressed as follows:

$$ {\lambda}_i\left[\begin{array}{c}{u}_i\\ {}{v}_i\\ {}1\end{array}\right]={P}_i{X}_i $$
(1)

In formula (1):

$$ {\displaystyle \begin{array}{c}\kern1.5em {P}_i={K}_i\left({R}_i\left|{T}_i\right.\right)\\ {}{X}_i={\left[{X}_{wi}\kern0.30em {Y}_{wi}\kern0.50em {Z}_{wi}\kern0.50em 1\right]}^T\end{array}} $$
(2)

In formulas (1) and (2), λi represents the projection depth, \( {K}_i=\left[\begin{array}{ccc}{f}_u& s& {u}_0\\ {}0& {f}_v& {v}_0\\ {}0& 0& 1\end{array}\right] \) represents the camera internal parameter matrix, fu and fv represent the effective focal lengths of the camera, s is the skew factor of the camera array, (u0, v0) is the coordinate of the camera principal point, Ri and Ti represent the rotation matrix and translation matrix between the i-th camera coordinate system and the world coordinate system respectively, and Xi is the homogeneous coordinate of the object point in the world coordinate system.

If the world coordinate system is {O, Ex, Ey, Ez} and the camera coordinate system is {o, ex, ey, ez}, then the origin of the camera coordinate system is the camera focal center C = (0e, 0e, 0e). The xeye plane is parallel to the image plane of the camera, and the image plane is described as ze = f [9].

In the camera coordinate system, any point Xe = (Xe, Ye, Ze)T in space is mapped to a point xe = (xe, ye, ze)T in the image plane. The transformation process is described as follows:

$$ \left\{\begin{array}{c}{x}_e={X}_e\frac{f}{Z_e}\\ {}{y}_e={Y}_e\frac{f}{Z_e}\end{array}\right. $$
(3)

The formula (3) is transformed into homogeneous coordinate form

$$ \lambda \left[\begin{array}{c}{x}_e\\ {}{y}_e\\ {}1\end{array}\right]=\left[\begin{array}{cccc}f& 0& 0& 0\\ {}0& f& 0& 0\\ {}0& 0& 1& 0\end{array}\right]\left[\begin{array}{c}{X}_e\\ {}{Y}_e\\ {}{Z}_e\\ {}1\end{array}\right] $$
(4)

In formula (4), λ = Ze.

After the transformation to homogeneous coordinates, the initial mapping relationship can be expressed linearly, and the matrix linking 3D and 2D points is called the projection matrix.
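
To make the projection of formulas (4) and (5) concrete, the following sketch projects a world point into pixel coordinates with the matrix K[R | −Rt]. It is only an illustration; the intrinsic and extrinsic values used here are assumed rather than taken from this paper.

```python
import numpy as np

# Assumed intrinsics (f_u, f_v: focal lengths; (u0, v0): principal point; zero skew)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Assumed extrinsics: rotation R and translation t with X_e = R (X_E - t)
R = np.eye(3)
t = np.array([0.0, 0.0, -5.0])

def project(X_world):
    """Project a 3D world point to pixels via lambda*[x, y, 1]^T = K [R | -R t] X_homog."""
    P = K @ np.hstack([R, (-R @ t).reshape(3, 1)])   # 3x4 projection matrix
    X_h = np.append(X_world, 1.0)                    # homogeneous world point
    x = P @ X_h                                      # lambda * [u, v, 1]
    return x[:2] / x[2]                              # divide out the projection depth lambda

print(project(np.array([0.5, -0.2, 3.0])))           # pixel coordinates of the example point
```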

The world coordinate system {O, Ex, Ey, Ez} and the camera coordinate system {o, ex, ey, ez} are not unified. In general, the relative position of the 3D object and the camera is unknown, so it is necessary to distinguish the world coordinate system from the camera coordinate system. There is a Euclidean transformation Xe = R(XE − t) from the coordinate system of the 3D target to the camera coordinate system, where R represents the rotation matrix and t represents the translation vector. At this point, formula (4) is converted into the following form:

$$ \lambda \left[\begin{array}{c}{x}_e\\ {}{y}_e\\ {}1\end{array}\right]=K\left[R\mid -Rt\right]\left[\begin{array}{c}{X}_E\\ {}{Y}_E\\ {}{Z}_E\\ {}1\end{array}\right] $$
(5)

Projective geometry shows that a group of lines L1, L2, L3, …, Ln that are parallel in space intersect at a point at infinity. Let the image of this point be denoted vpx. If these lines are parallel to the image plane, vpx is imaged at infinity; if they are not parallel to the image plane, vpx is imaged at a finite distance, and such points are called vanishing points. Lines in the image that pass through the vanishing point vpx correspond to lines in space that are parallel to one another [10].

Suppose that after perspective imaging of a cube, lines a, d, and h intersect at a common vanishing point vpx, lines c, i, and f intersect at a common vanishing point vpy, and lines b, e, and g intersect at vpz.

The vanishing point differs from other points in the imaging plane: it describes the direction of the straight lines, and 3D reconstruction structure data can be obtained by a comprehensive analysis of the vanishing points.

Under homogeneous coordinates, a point M = [X; Y; Z; 1] in space is projected onto a point m = [u; v; 1] in the image; a line in space is projected as l = m1 × m2 in the image plane, where m1 and m2 are the two endpoints of the projected line l.

If l1 and l2 represent the projection of any pair of parallel lines, the vanishing point is vp = l1 × l2.

For a group of parallel-line projections li, i = 1…n, in space, the vanishing point can be calculated by the following least-squares solution:

$$ J(vp)=\underset{vp}{\min}\sum \limits_{i=1}^n{\left({l}_i^T vp\right)}^2 $$
(6)
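
As a hedged illustration of formula (6), the sketch below estimates the vanishing point of a set of image lines as the unit vector minimizing Σ(liTvp)2, i.e., the right singular vector of the stacked line matrix associated with the smallest singular value. The line endpoints used in the example are made up for demonstration.

```python
import numpy as np

def vanishing_point(lines):
    """Estimate the vanishing point of image lines given in homogeneous coordinates
    l_i = m1 x m2, by minimizing J(vp) = sum_i (l_i^T vp)^2 subject to ||vp|| = 1."""
    L = np.asarray(lines, dtype=float)           # n x 3 matrix of line coefficients
    _, _, vt = np.linalg.svd(L)                  # smallest right singular vector minimizes ||L vp||^2
    vp = vt[-1]
    return vp / vp[2] if abs(vp[2]) > 1e-12 else vp   # normalize if the point is finite

# Illustrative example: each image line is built from two endpoints (l = m1 x m2)
m = lambda u, v: np.array([u, v, 1.0])
l1 = np.cross(m(10, 10),  m(200, 60))
l2 = np.cross(m(10, 120), m(200, 140))
l3 = np.cross(m(10, 230), m(200, 220))
print(vanishing_point([l1, l2, l3]))             # least-squares intersection of the three lines
```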

If the vanishing points in three mutually perpendicular directions in space are known to be vpx, vpy, and vpz, they all satisfy formula (7). The formula is transformed into a system of linear equations in the elements of ω; solving this system gives ω, from which the camera internal parameter matrix can be established to complete camera self-calibration.

$$ v{p}_x^T\omega v{p}_y=v{p}_x^T\omega v{p}_z=v{p}_y^T\omega v{p}_z=0 $$
(7)

When calculating the 3D coordinates of matching points, the world coordinate system is placed at the position of the first camera coordinate system, and the camera projection matrices are as follows [11]:

$$ \left\{\begin{array}{c}{P}_1={M}_1\left[I\left|0\right.\right]\\ {}{P}_2={M}_1\left[R\left|t\right.\right]\end{array}\right. $$
(8)

In the formula, M1 is the camera internal parameter matrix and R is the rotation matrix. For the projection matrix Pi, assume that Pi1, Pi2, and Pi3 are the row vectors of Pi, (ui, vi, 1)T is the homogeneous coordinate of the i-th matching point, and \( {\overset{\sim }{X}}_i \) is the homogeneous coordinate of the corresponding space point:

$$ {Z}_c\left[\begin{array}{c}u\\ {}v\\ {}1\end{array}\right]=\left[\begin{array}{cccc}{\alpha}_u& 0& {u}_0& 0\\ {}0& {\alpha}_v& {v}_0& 0\\ {}0& 0& 1& 0\end{array}\right]\left[\begin{array}{cc}R& t\\ {}{0}^T& 1\end{array}\right]\left[\begin{array}{c}{X}_w\\ {}{Y}_w\\ {}{Z}_w\\ {}1\end{array}\right]={M}_1{M}_2{X}_w=M{X}_w $$
(9)

Where αu = f/dx and αv = f/dy are the scale factors along the x and y axes, and Zc can be taken as a constant factor. Combining formulas (8) and (9), we can obtain:

$$ \left[\begin{array}{c}{P}_{13}{u}_i-{P}_{11}\\ {}{P}_{13}{v}_i-{P}_{12}\\ {}{P}_{23}{u}_i-{P}_{21}\\ {}{P}_{23}{v}_i-{P}_{22}\end{array}\right]{\overset{\sim }{X}}_i=A{\overset{\sim }{X}}_i=0 $$
(10)

From formula (10), the value of \( {\overset{\sim }{X}}_i \) can be obtained by solving three unknowns with four equations. The eigenvector corresponding to the minimum eigenvalue of ATA is the solution for \( {\overset{\sim }{X}}_i \).
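
The following sketch illustrates one way of solving formula (10): the four linear constraints from the two projection matrices of formula (8) are stacked into A, and the eigenvector of ATA with the smallest eigenvalue (equivalently, the smallest right singular vector of A) gives the homogeneous point. The intrinsic matrix and camera displacement below are assumed for demonstration only.

```python
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Linear triangulation: stack the four constraints of formula (10) for two views
    and return the homogeneous point minimizing ||A X||^2 with ||X|| = 1."""
    u1, v1 = uv1
    u2, v2 = uv2
    A = np.vstack([u1 * P1[2] - P1[0],
                   v1 * P1[2] - P1[1],
                   u2 * P2[2] - P2[0],
                   v2 * P2[2] - P2[1]])
    _, _, vt = np.linalg.svd(A)        # smallest right singular vector of A
    X = vt[-1]
    return X[:3] / X[3]                # back to inhomogeneous 3D coordinates

# Illustrative projection matrices P1 = M1 [I | 0], P2 = M1 [R | t] with assumed M1, R, t
M1 = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
P1 = M1 @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = M1 @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 4.0, 1.0])
uv1 = (P1 @ X_true)[:2] / (P1 @ X_true)[2]
uv2 = (P2 @ X_true)[:2] / (P2 @ X_true)[2]
print(triangulate(P1, P2, uv1, uv2))   # recovers approximately (0.2, -0.1, 4.0)
```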

Based on the Internet of Things images collected by the solid-state image sensor, a mathematical model of Internet of Things image observation is constructed below.

3 Mathematical model for image observation in the internet of things

In order to better solve the problem of super-resolution reconstruction, the process of low-resolution image acquisition needs to be modeled [12]. The degradation of an ideal high-resolution image to a low-resolution image is simulated, and Fig. 3 shows the process of image degradation.

Fig. 3
figure 3

Internet of things image degradation process.

The specific causes of motion deformation, blurring, downsampling, and additive noise in Fig. 3 are as follows:

  1. (1)

    Motion deformation: there are global motion and local motion. Global motion is caused by camera motion; after global motion, the image is deformed, and every object in the image shares the same motion characteristics and parameters, so the motion can be compensated by estimating the parameters of a two-dimensional or multi-dimensional model. Local motion is caused by the motion of individual objects in the scene; after local motion, each object in the image has its own motion characteristics and parameters, which are relatively complicated to handle [13].

  2. (2)

    Blurring: there are mainly three types: motion blur caused by relative motion, optical blur caused by defocus of the optical imaging system, the diffraction limit, and other factors, and blur introduced by the low-resolution sensor. In single-image super-resolution reconstruction, these blurs are usually characterized by point spread functions [14].

  3. (3)

    Sampling: the light emitted by the object is converted into an electric signal on the sensor. In order to display and store the electric signal, it needs to be sampled. The sampling process may cause signal distortion, resulting in a decrease in the sharpness of the image and output of a low-resolution image [15].

  4. (4)

    Additive noise: noise formed by the mutual interference of various components in the system, and noise introduced during sampling.

Let X be an ideal high-resolution image. It can be seen from Fig. 3 that after X undergoes motion deformation, blurring, sampling, and noise during imaging, the quality of the actually obtained image decreases, resulting in a low-resolution image. The low-resolution image may be the result of one imaging or of multiple imagings.

The low-resolution image of the k-th frame, obtained by imaging the same scene K times, is denoted as Yk. The mathematical description of this process is:

$$ {Y}_k=D{B}_k{M}_kX+{n}_k $$
(11)

Where k = 1, 2, ⋯, K is the frame index of the image sequence, K is the total number of frames of the image sequence in the same scene; Yk is the low-resolution image of the k-th frame; X is the ideal high-resolution image; D is the downsampling matrix; Bk is the blur matrix of Yk; Mk is the deformation matrix of Yk; nk is the additive noise matrix.

After the scene is imaged once, a single low-resolution image is obtained, which is denoted as Y. This process is not affected by a geometric transformation matrix. The mathematical description of the process is:

$$ Y= DBX+n $$
(12)

For ease of description, the noise term is omitted to obtain a simplified expression:

$$ Y= DBMX\ \mathrm{or}\ Y= DBX $$
(13)
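
To illustrate the simplified observation model Y = DBX + n of formulas (12) and (13), the sketch below blurs a high-resolution array with a Gaussian point spread function, decimates it, and adds Gaussian noise. The kernel size, decimation factor, and noise level are illustrative assumptions, not values from this paper.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Separable Gaussian kernel acting as the blur matrix B (point spread function)."""
    ax = np.arange(size) - size // 2
    k = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    k = np.outer(k, k)
    return k / k.sum()

def degrade(X, factor=2, sigma=1.0, noise_std=5.0):
    """Simulate Y = D B X + n: blur with the PSF, downsample by `factor`, add Gaussian noise."""
    k = gaussian_kernel(5, sigma)
    Xp = np.pad(X, 2, mode='edge')
    blurred = np.zeros_like(X, dtype=float)
    for i in range(X.shape[0]):                       # direct 2D convolution (B X)
        for j in range(X.shape[1]):
            blurred[i, j] = np.sum(Xp[i:i + 5, j:j + 5] * k)
    low = blurred[::factor, ::factor]                 # downsampling matrix D
    noisy = low + np.random.normal(0, noise_std, low.shape)   # additive noise n
    return np.clip(noisy, 0, 255)

X = np.random.randint(0, 256, (64, 64)).astype(float)  # stand-in high-resolution image
Y = degrade(X)
print(X.shape, '->', Y.shape)                           # (64, 64) -> (32, 32)
```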

4 A super-resolution reconstruction method for a single frame character image of internet of things

4.1 Wavelet threshold denoising

Wavelet threshold denoising can be divided into the following steps [16]:

  1. (1)

    Wavelet decomposition: appropriate wavelet basis and decomposition level are selected to decompose the image.

  2. (2)

    Threshold selection and quantification: the corresponding threshold is chosen for decomposed high-frequency coefficients of each layer, and the threshold is quantified.

  3. (3)

    Wavelet reconstruction: according to the low-frequency coefficients of the N-th level of the wavelet decomposition and the high-frequency coefficients of each level after thresholding, the image is reconstructed with the inverse wavelet transform. In wavelet threshold denoising, the selection and quantification of thresholds are very important and directly determine the quality of denoising.

Hard thresholding and soft thresholding are two common wavelet threshold methods. The hard threshold algorithm sets the wavelet coefficient Wi, j to zero when its magnitude is not larger than the threshold λ, i.e.:

$$ {W}_{i,j}=\left\{\begin{array}{cc}0& \left|{W}_{i,j}\right|\le \lambda \\ {}{W}_{i,j}& \left|{W}_{i,j}\right|>\lambda \end{array}\right. $$
(14)

The soft threshold algorithm also replaces the wavelet coefficient Wi, j with zero if its magnitude is not larger than the threshold λ, and otherwise shrinks the coefficient toward zero by λ, i.e.:

$$ {W}_{i,j}=\left\{\begin{array}{cc}0& \left|{W}_{i,j}\right|\le \lambda \\ {}\operatorname{sgn}\left({W}_{i,j}\right)\left(\left|{W}_{i,j}\right|-\lambda \right)& \left|{W}_{i,j}\right|>\lambda \end{array}\right. $$
(15)
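
The hard and soft threshold rules of formulas (14) and (15) can be written compactly as in the following sketch; the example coefficients and the threshold value are arbitrary.

```python
import numpy as np

def hard_threshold(W, lam):
    """Formula (14): zero out coefficients with |W| <= lambda, keep the rest unchanged."""
    return np.where(np.abs(W) <= lam, 0.0, W)

def soft_threshold(W, lam):
    """Formula (15): zero out small coefficients, shrink the rest toward zero by lambda."""
    return np.sign(W) * np.maximum(np.abs(W) - lam, 0.0)

coeffs = np.array([-3.2, -0.4, 0.1, 0.9, 2.5])   # example high-frequency wavelet coefficients
print(hard_threshold(coeffs, 1.0))                # [-3.2, 0., 0., 0., 2.5]
print(soft_threshold(coeffs, 1.0))                # [-2.2, 0., 0., 0., 1.5]
```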

A local adaptive threshold selection method based on the wavelet decomposition level, the local contrast, and the statistical characteristics of the high-frequency coefficients is adopted. In this method, the high-frequency coefficient matrices in the horizontal, vertical, and diagonal directions of the lifting wavelet decomposition are processed in blocks to obtain several sub-coefficient matrices, each of which corresponds to local information of the image [17]. Based on the number of decomposition levels, the local contrast, and the absolute median, the following threshold selection model is used:

$$ {T}_{ijk}=\frac{\lambda_{ijk}{N}_{ijk}}{2^i-1} $$
(16)

In the formula, k = 1, 2, 3 represents the horizontal, vertical, and diagonal directions respectively; i represents the decomposition level; λijk represents the local contrast of the j-th sub-matrix in the k-th direction of the i-th level; and Nijk represents the absolute median of the high-frequency coefficients of that sub-matrix. Because the decomposition level i, the local contrast, and the absolute median of the high-frequency coefficients differ between sub-matrices, a more adaptive threshold is selected.

The Daubechies (9/7) wavelet transform is a biorthogonal wavelet transform with linear phase, and biorthogonal wavelet transforms have been widely used in image processing. The biorthogonal wavelet decomposition formulas for a discrete signal are [18]:

$$ {y}_i=\sum \limits_k{h}_k{x}_{2i-k} $$
(17)
$$ {z}_i=\sum \limits_k{g}_k{x}_{2i-k} $$
(18)

In the formulas, h and g represent the low-pass and high-pass decomposition filters, x is the original signal, and y and z are the low-frequency and high-frequency signals obtained after decomposition. The reconstruction formula is:

$$ {x}_i=\sum \limits_k\left({\overline{h}}_{i-2k}{y}_k+{\overline{g}}_{i-2k}{z}_k\right) $$
(19)
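
The sketch below implements one level of the decomposition formulas (17) and (18) directly; for simplicity it uses the Haar filter pair as a stand-in for the Daubechies 9/7 filters, so the numbers are only illustrative.

```python
import numpy as np

def analysis(x, h, g):
    """One level of formulas (17)-(18): y_i = sum_k h_k x_{2i-k}, z_i = sum_k g_k x_{2i-k}."""
    n = len(x) // 2
    y = np.zeros(n)
    z = np.zeros(n)
    for i in range(n):
        for k in range(len(h)):
            idx = (2 * i - k) % len(x)        # periodic extension at the borders
            y[i] += h[k] * x[idx]
            z[i] += g[k] * x[idx]
    return y, z

# Haar filters used as a simple stand-in for the Daubechies 9/7 pair
h = np.array([1.0, 1.0]) / np.sqrt(2.0)       # low-pass decomposition filter
g = np.array([1.0, -1.0]) / np.sqrt(2.0)      # high-pass decomposition filter

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 7.0])
y, z = analysis(x, h, g)
print(y)   # low-frequency (smooth) part
print(z)   # high-frequency (detail) part
```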

4.2 Feature point extraction and matching

Feature points are points with obvious features in the image, such as gray-level extrema and corner points. The obtained feature points are called interest operators or favorable operators. Generally speaking, interest points have a typical local property and can be located by a local detection operator. In this paper, the Moravec feature point extraction operator is adopted, which follows these principles: first, the target image is converted to gray scale, the minimum gray-level variance over the principal directions is taken to represent the change of pixel gray value, and then the point with the highest interest value in a local area of the image is selected as the feature point. The detailed algorithm steps are as follows [19]:

Step 1: Let the gray value of pixel (u, v) be gu, v. To obtain the interest value of pixel (u, v), the sums of squared gray-level differences of adjacent pixels in four directions are computed within a 5 × 5 window centered on pixel (u, v). With k = int(n/2) = 2, the sums of squares are calculated as follows:

$$ {V}_1=\sum \limits_{i=-k}^{k-1}{\left({g}_{u+i,v}-{g}_{u+i+1,v}\right)}^2 $$
(20)
$$ {V}_2=\sum \limits_{i=-k}^{k-1}{\left({g}_{u,v+i}-{g}_{u,v+i+1}\right)}^2 $$
(21)
$$ {V}_3=\sum \limits_{i=-k}^{k-1}{\left({g}_{u+i,v+i}-{g}_{u+i+1,v+i+1}\right)}^2 $$
(22)
$$ {V}_4=\sum \limits_{i=-k}^{k-1}{\left({g}_{u+i,v-i}-{g}_{u+i+1,v-i+1}\right)}^2 $$
(23)

The smallest of these is taken as the interest value of pixel (u, v):

$$ I{V}_{u,v}=\min \left\{{V}_1,{V}_2,{V}_3,{V}_4\right\} $$
(24)

Step 2: Using a known threshold, the points whose interest value is higher than this threshold are taken as candidate feature points. The threshold should be chosen so that the candidate points contain the required feature points without including too many useless points.

Step 3: Among the candidate points, select the point with the maximum interest value within a fixed-size window as the required feature point; all other candidate points in the window are removed, so that only the point with the maximum interest value is retained as the unique feature point.
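
A minimal sketch of Steps 1 and 2 of the Moravec operator follows: it computes the four directional sums of formulas (20)–(23), takes their minimum as in formula (24), and keeps candidates above a threshold. The non-maximum suppression of Step 3 is omitted, and the threshold value and test image are assumptions for illustration.

```python
import numpy as np

def moravec_interest(gray, u, v, k=2):
    """Interest value IV_{u,v} of formulas (20)-(24): minimum of the four directional
    sums of squared gray-level differences inside a (2k+1) x (2k+1) window."""
    g = gray.astype(float)
    v1 = sum((g[u + i, v]     - g[u + i + 1, v])         ** 2 for i in range(-k, k))
    v2 = sum((g[u, v + i]     - g[u, v + i + 1])         ** 2 for i in range(-k, k))
    v3 = sum((g[u + i, v + i] - g[u + i + 1, v + i + 1]) ** 2 for i in range(-k, k))
    v4 = sum((g[u + i, v - i] - g[u + i + 1, v - i + 1]) ** 2 for i in range(-k, k))
    return min(v1, v2, v3, v4)

def moravec_candidates(gray, threshold, k=2):
    """Step 2: keep pixels whose interest value exceeds the threshold as candidate feature points."""
    h, w = gray.shape
    points = []
    for u in range(k, h - k - 1):
        for v in range(k, w - k - 1):
            iv = moravec_interest(gray, u, v, k)
            if iv > threshold:
                points.append((u, v, iv))
    return points

img = np.random.randint(0, 256, (32, 32))        # stand-in grayscale image
print(len(moravec_candidates(img, threshold=5000)))
```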

The traditional feature point matching method based on image gray level detects the gray features of the regions related to the standard image and the target to be matched. In this paper, image feature points are matched by finding the maximum gray-level covariance of the region to be detected, and the least-squares image matching method is used in the matching process [20].

Suppose there are two digital images A and B to be matched. If GAij and GBij denote the gray values in row i and column j of an N × N pixel array in images A and B respectively, the gray mean and variance are expressed by the following formulas:

$$ \left\{\begin{array}{c}\overline{GA}=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^NG{A}_{ij}\\ {}\overline{GB}=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^NG{B}_{ij}\end{array}\right. $$
(25)
$$ \left\{\begin{array}{c}{\sigma}_A=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^N{\left(G{A}_{ij}-\overline{GA}\right)}^2\\ {}{\sigma}_B=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^N{\left(G{B}_{ij}-\overline{GB}\right)}^2\end{array}\right. $$
(26)

The formula of covariance between pixel A and pixel B is as follows:

$$ {C}_{AB}=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^N\left(G{A}_{ij}-\overline{GA}\right)\left(G{B}_{ij}-\overline{GB}\right)=\frac{1}{N^2}\sum \limits_{i=1}^N\sum \limits_{j=1}^NG{A}_{ij}G{B}_{ij}-\overline{GA GB} $$
(27)

When the covariance CAB reaches its maximum, the N × N pixel arrays in A and B are regarded as the matched image arrays, and the matching is completed according to their center points.
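
The sketch below follows formulas (25)–(27): it slides an N × N patch over the second image and keeps the position with the largest covariance CAB. The patch size, the test images, and the translation used to build them are assumptions for illustration only.

```python
import numpy as np

def covariance(GA, GB):
    """Formula (27): covariance between two N x N gray-level arrays."""
    return np.mean(GA * GB) - GA.mean() * GB.mean()

def match_patch(A, B, top_left, N=11):
    """Slide the N x N patch of A located at `top_left` over B and return the
    position where the covariance C_AB of formula (27) is maximal."""
    r0, c0 = top_left
    patch = A[r0:r0 + N, c0:c0 + N].astype(float)
    best, best_pos = -np.inf, None
    for r in range(B.shape[0] - N + 1):
        for c in range(B.shape[1] - N + 1):
            cab = covariance(patch, B[r:r + N, c:c + N].astype(float))
            if cab > best:
                best, best_pos = cab, (r, c)
    return best_pos, best

A = np.random.randint(0, 256, (40, 40))
B = np.roll(A, shift=(3, 5), axis=(0, 1))          # B is A translated by (3, 5)
print(match_patch(A, B, top_left=(10, 10)))        # expected best position near (13, 15)
```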

4.3 Super-resolution image reconstruction based on wavelet neural network

The conversion from a low-resolution image to a super-resolution image is defined as F(⋅), the input-layer vector of the network is ai, j = (pi, j, qi, j, 1)T, σ(⋅) is the activation (limiting) function, and Ωp − q is the coordinate (p, q) of a point on the 2D image plane [21].

The 3D image constructed by the wavelet neural network reflection model can be defined as:

$$ {R}_{i,j}\left(p,q\right)=F\left(w,v,{a}_{i,j}\right)=\sigma \left[\sum \limits_{k=1}^N{V}_k\sigma \left({W}_k{a}_{i,j}\right)+\theta \right] $$
(28)
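
A minimal sketch of the forward computation in formula (28) follows, assuming a sigmoid for σ(⋅) and random placeholder weights; the number of hidden nodes, the weights, and the bias are not values from this paper.

```python
import numpy as np

def sigma(x):
    """Assumed activation sigma(.) of formula (28); a sigmoid is used here for illustration."""
    return 1.0 / (1.0 + np.exp(-x))

def wnn_forward(a, W, V, theta):
    """Forward pass of formula (28): R = sigma( sum_k V_k * sigma(W_k a) + theta ).
    a is the input vector (p, q, 1); W has one row per hidden node, V one weight per hidden node."""
    hidden = sigma(W @ a)                # hidden-layer responses sigma(W_k a)
    return sigma(V @ hidden + theta)

rng = np.random.default_rng(0)
N = 8                                    # assumed number of hidden nodes
W = rng.normal(size=(N, 3))              # hidden weights W_k (placeholders)
V = rng.normal(size=N)                   # output weights V_k (placeholders)
theta = 0.1                              # output bias (placeholder)

a = np.array([0.25, 0.75, 1.0])          # input vector a_{i,j} = (p_{i,j}, q_{i,j}, 1)
print(wnn_forward(a, W, V, theta))       # reconstructed reflectance value R_{i,j}(p, q)
```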

The height of the image surface is calculated [22]:

$$ {z}_{i,j}\left(n+1\right)={z}_{i,j}(n)-\frac{\varepsilon }{4}\left({f}_{i,j}+{g}_{i,j}\right) $$
(29)

In the equation, zi, j represents the height of the image surface, and fi, j and gi, j represent the width and length of the image surface. The expression of the reflection model based on the wavelet neural network output layer is [23]:

$$ V=\sum \limits_{i,j}\left({\phi}_{i,j},{\phi}_{i,j}^T\right) $$
(30)

Where ϕi, j is the image feature vector. The expression of the hidden-layer reflection model based on the wavelet neural network is:

$$ {w}_k\left(n+1\right)={w}_k(n)+{\eta}_w\left[-\frac{E_T}{w_k}+E\left({w}_k\right)\right] $$
(31)

Where w represents the space formed by the detail information described by the wavelet function, ET represents the optimal nonlinearity parameter, E(wk) represents the least-optimized objective function, and the single-frame character image super-resolution reconstruction is evaluated by the energy function E(θ) of the network. The higher the value of E(θ), the higher the resolution of the reconstructed single-frame character image. E(θ) is calculated as [24, 25]:

$$ E\left(\theta \right)=F{\left({W}_k,v,a\right)}^2+\lambda $$
(32)

Wavelet threshold denoising, feature point extraction and matching, and super-resolution image reconstruction together complete the super-resolution reconstruction of single-frame character Internet of Things images based on the wavelet neural network. The application performance of this method is verified in the following section.

5 Experimental results and analysis

To verify the overall effectiveness of the proposed method, the single-frame character image super-resolution reconstruction method based on the wavelet neural network is tested. The test platform is Matlab running on Windows 7. In the experiments, several indexes are used to evaluate the methods: anti-interference performance, signal-to-noise ratio (SNR), resolution of the reconstructed image, local and global reconstruction effect, clarity of the reconstructed image, feature extraction precision, and retention of image detail. Using these indicators, the proposed method is compared with the methods of reference [5] and reference [7]. The specific experimental results are as follows:

5.1 Anti-jamming performance test of different methods

The single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization proposed by Peng et al. [5] and the single-frame image super-resolution reconstruction method based on support vector regression (SVR) proposed by Yuan et al. [7] are used as comparison methods, and the anti-interference performance of the three methods is tested. The test results are shown in Fig. 4.

Fig. 4
figure 4

Test results of three different methods. (a) Results of the proposed method, (b) Results of the method based on non-negative neighborhood embedding and non-local regularization, (c) Results of the method based on SVR.

Figure 4(a) shows the test results of the single-frame character image super-resolution reconstruction method based on the wavelet neural network. Comparing the signal frequency before and after adding interference, it can be seen from Fig. 4(a) that the frequency fluctuation of the signal in the image is small after the interference is added, which verifies that the proposed method has strong anti-interference performance. Figure 4(b) shows the test results of the single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization, and Fig. 4(c) shows the results of the single-frame image super-resolution reconstruction method based on SVR. Figures 4(b) and (c) show that for these two methods the frequency fluctuation of the signal in the image after adding interference is large, which indicates that their anti-interference performance is weaker.

5.2 SNR comparison of different methods

In order to further verify the anti-interference ability of the proposed method, the single-frame character image super-resolution reconstruction method based on the wavelet neural network is compared with the method based on non-negative neighborhood embedding and non-local regularization (method 1) and the method based on SVR (method 2), and the SNR of the reconstructed 512 × 512 × 8 images is compared. Equation (33) is used to calculate the SNR; the test results are shown in Table 2.

$$ {R}_{PSNR}=10{\log}_{10}\left(\frac{255^2}{D}\right) $$
(33)
Table 2 SNR of three methods.
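
A short sketch of formula (33) follows, under the assumption that D denotes the mean squared error between the reference and the reconstructed image; the test images are synthetic placeholders.

```python
import numpy as np

def psnr(reference, reconstructed):
    """Formula (33), with D taken as the mean squared error between the two images."""
    D = np.mean((reference.astype(float) - reconstructed.astype(float)) ** 2)
    return float('inf') if D == 0 else 10.0 * np.log10(255.0 ** 2 / D)

ref = np.random.randint(0, 256, (64, 64))
rec = np.clip(ref + np.random.normal(0, 5, ref.shape), 0, 255)
print(round(psnr(ref, rec), 2))        # PSNR in dB for the illustrative pair
```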

Analysis of the data in Table 2 shows that, for the seven test images, the SNR of the image does not decrease with increasing input noise intensity when the proposed method is used, which shows that the single-frame character image super-resolution reconstruction method based on the wavelet neural network can effectively remove the noise in the image. When the method based on non-negative neighborhood embedding and non-local regularization and the SVR-based method are used, the SNR of the image decreases as the input noise intensity increases, which proves that these two methods cannot remove the noise effectively when the noise intensity is high. The test results show that the SNR of the proposed method is higher than that of the comparison methods under different noise intensities, which verifies that the single-frame character image super-resolution reconstruction method based on the wavelet neural network is more effective.

5.3 Image resolution comparison under different methods

The single-frame character image super-resolution reconstruction method based on the wavelet neural network, the single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization, and the single-frame image super-resolution reconstruction method based on SVR are tested separately, and the resolution of the images reconstructed with the three methods is compared. The test results are shown in Fig. 5.

Fig. 5
figure 5

Image resolution obtained with three different methods. (a) Image resolution obtained by the proposed method, (b) Results of the method based on non-negative neighborhood embedding and non-local regularization, (c) Results of the method based on SVR.

Figure 5(a) shows the image resolution obtained by reconstructing a single-frame character image with the proposed method, Fig. 5(b) shows the resolution obtained with the method based on non-negative neighborhood embedding and non-local regularization, and Fig. 5(c) shows the resolution obtained with the SVR-based method. Comparing Fig. 5(a), (b), and (c), it can be seen that the image resolution obtained by the proposed method is higher than that of the two comparison methods. This is because the methods based on non-negative neighborhood embedding and non-local regularization and on SVR have poor anti-interference performance: during reconstruction the image contains too much noise, resulting in a lower resolution of the reconstructed image.

5.4 Contrast of local reconstruction effects of different methods

Based on the above experimental results, the local reconstruction effect of the proposed method on a sample image is compared with that of the single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization in reference [5] and the SVR-based single-frame image super-resolution reconstruction method in reference [7]. Figure 6 is the sample image of this experiment; Fig. 7(a) is the image after local reconstruction using the method of reference [5], Fig. 7(b) is the image after local reconstruction using the method of reference [7], and Fig. 7(c) is the image after local reconstruction using the proposed method.

Fig. 6
figure 6

Experimental sample image

Fig. 7
figure 7

Comparison of image local reconstruction effects of different methods. (a) Method in literature [5], (b) Method in literature [7], (c)The proposed method

Fig. 8
figure 8

Image global reconstruction test sample. (a) Experimental sample image 1, (b) Experimental sample image 2

As shown in Fig. 7, there are obvious occluded parts of the animal body. From the images it can be clearly seen that the local reconstruction effect of the methods in references [5] and [7] is poor. There are obvious noise points in the local reconstruction result of the method in reference [5], which leads to the loss of fine features of the animal fur. Although the noise problem of the method in reference [7] is significantly improved, its local reconstruction structure is incomplete. In contrast, the proposed method not only suppresses image blur but also preserves the details and texture of the image and reconstructs the occluded target with high definition.

5.5 Comparison of global image reconstruction effects with different methods

On the basis of verifying the local reconstruction effect, in order to further verify the image reconstruction effect of the proposed method, two experimental samples are used to compare the global image reconstruction effect of the different methods. Figure 8 shows the two sample images of this experiment. The single-frame image super-resolution reconstruction method based on non-negative neighborhood embedding and non-local regularization proposed by Peng et al. in reference [5] and the SVR-based single-frame image super-resolution reconstruction method proposed by Yuan et al. in reference [7] are used as experimental controls, and their results are compared with those of the proposed method. The specific comparison results are shown in Fig. 9 and Fig. 10.

Fig. 9
figure 9

Sample 1 Global image reconstruction effect. (a) Method in literature [5], (b) Method in literature [7], (c)The proposed method

Fig. 10
figure 10

Sample 2 Global image reconstruction effect. (a) Method in literature [5], (b) Method in literature [7], (c)The proposed method

It can be seen from the experimental results in Fig. 9 and Fig. 10 that the global reconstruction effects of the different methods differ considerably for experimental sample images 1 and 2. The global reconstruction results of references [5] and [7] suffer from blurred image details and ghosting, which leads to a large amount of information loss and incomplete image features after reconstruction. In contrast, the global reconstruction effect of the proposed method is more satisfactory: image clarity is ensured, the key features of the image are more complete, and the detail retention rate is better.

5.6 Contrast of image sharpness after reconstruction under different methods

In order to further verify the clarity of the image processed by the proposed method, the blur degree of the detail representation after image reconstruction is compared for the three methods. The lower the blur degree, the better the image reconstruction effect. The experimental results are shown in Fig. 11:

Fig. 11
figure 11

Experimental comparison results

According to the analysis of Fig. 11, after image reconstruction by the traditional methods, the overall blur value of the image is high, reaching a maximum of 45%; in addition, the blur degree changes greatly during the experiment and does not remain at a stable level. Throughout the experiment, the blur degree of the proposed method remains at a low level and is basically unaffected by the number of experiments. The results show that the single-frame character Internet of Things image reconstructed by the proposed method maintains high definition.

5.7 Comparison of image feature extraction accuracy of different methods

The accuracy of image feature extraction is a key index that affects the image reconstruction effect, and high extraction precision is an important prerequisite for image reconstruction. In this section, the accuracy of image feature extraction is taken as the evaluation index to compare the different methods. The specific experimental results are as follows:

It can be seen from the experimental results in Fig. 12 that there is no significant difference in image feature extraction accuracy between the methods of reference [5] and reference [7]. Over 700 experimental iterations, their accuracy remains below 50%, and at the 100th iteration the feature extraction accuracy is lowest, at about 10%. These data show that the traditional methods do not meet the requirements of image processing in this field and cannot complete high-standard image reconstruction. Throughout the experiment, the accuracy of the proposed method is maintained at 80%–90%. This is because the Moravec feature point extraction operator is introduced in this paper: the point with the highest interest value is selected as the feature point in the local area of the image, which greatly improves the accuracy of image processing.

Fig. 12
figure 12

Experimental comparison results

5.8 Comparison of image detail retention rates by different methods

In order to verify the effect of the proposed method on the retention of image detail information, the following experiment is carried out. The sample images in the experiment come from the ImageNet database (http://www.image-net.org/). From this database, 100 images collected by the Internet of Things are selected as the research objects of this experiment. The 100 images are reconstructed using the method of reference [5], the method of reference [7], and the proposed method, and the detail retention rates of the processed images are compared. The experimental results are shown in Fig. 13.

Fig. 13
figure 13

Comparison of detail information retention rate of different methods

According to the experimental data in Fig. 13, after image reconstruction the image detail retention rate of the method in reference [5] is about 40%–60%; in other words, this traditional method loses about half of the detail information during image reconstruction, which cannot meet the requirements of the current image processing field. The method in reference [7] has similar defects: its image information retention rate varies from 40% to 70%, and although its reconstruction effect is slightly better, it still does not meet the requirements of image processing. In contrast, the image detail retention rate of the proposed method is more stable; throughout the experiment, as the number of test images increases, the detail retention rate remains at 80%–95%. These data show that the practical application effect of the proposed method is better.

6 Conclusion

This paper proposes a single-frame character image super-resolution reconstruction method based on the wavelet neural network. The super-resolution reconstruction of the single-frame character image is completed, the noise in the image is effectively removed, and a single-frame character image with higher resolution is obtained. The experimental results show that the blur degree of the reconstructed image is less than 5%, the feature extraction accuracy of the method is maintained at 80%–90% throughout the experiment, and the image detail retention rate is maintained at 80%–95%. These data show that, compared with the traditional methods, the proposed method achieves better image reconstruction results.

Super-resolution image reconstruction is one of the hot topics in recent years. Finding a highly efficient and simple reconstruction algorithm is of great significance. There are still many problems to be solved in the study of super-resolution image reconstruction.

  1. (1)

    Improvement of the super-resolution reconstruction model: The reconstruction model of the Internet of Things image in different research areas is different, especially in the research of super-resolution reconstruction of single-frame character images, it is necessary to combine the formation process of single-frame character images to establish a reconstruction model more suitable for single-frame character images.

  2. (2)

    Evaluation of reconstructed Internet of Things image quality: In practical applications, other evaluation indicators must be considered to achieve an accurate evaluation of the reconstructed image quality.