Fuzzy‑based video compression using bilinear fuzzy relation equations

We present a novel color video compression method that uses the greatest solution of a system of bilinear fuzzy relation equations (BFRE) to assess the similarity between frames. The frames in each band are treated separately, and each frame is classified as an Intra frame or a Predictive frame. A frame is labelled as a Predictive frame, and compressed more strongly than an Intra frame, if its similarity with the previous Intra frame is higher than a selected threshold; a pre-processing phase is performed to select the optimal threshold value of the similarity between frames. The proposed method provides high-quality reconstructed frames and has the advantage of not requiring high CPU time or memory storage. It was tested on color videos from the Fast-Moving Objects dataset; the results show that it performs better than the Lukasiewicz similarity-based video compression method and comparably to MPEG-4 and the deep learning video compression method DVC_Pro. In particular, the quality of the reconstructed frames obtained with BFRE is comparable with that of DVC_Pro, while the computational complexity is lower, which yields a higher video encoding speed.


Introduction
The exponential evolution of digital technologies and the need to use and archive video streams in numerous applications have led to the development of the standard video compression methods and approaches used today in devices and video communication applications (Abomhara et al. 2010; Beach and Owen 2018).
Recently, some authors have presented new video compression algorithms that aim to balance information loss with computational speed. In particular, some deep neural network-based autoencoders for video compression were proposed. These methods minimize the rate distortion of frames, setting or maximizing the number of bits (or bit rate) for the compression, in order to obtain a higher quality of the reconstructed frame. Their main benefits are their adaptiveness to the input and their non-linearity, which makes them flexible in detecting spatial variations in frames. However, even if deep learning video compression algorithms improve the rate-distortion by providing a high quality of the reconstructed frames, they have the disadvantage of being slower than traditional video compression algorithms, and they require a high memory capacity in the learning phase.
For this purpose, in this research we propose a new video compression algorithm based on bilinear fuzzy relation equations (for short, BFRE), which are used to measure the similarity between frames. The algorithm improves the quality of the reconstructed frames compared with traditional algorithms and requires less CPU time and storage capacity than deep learning video compression algorithms.
The main goal of this research is to build an efficient and computationally fast video compression method that measures the similarity between a frame and the previous frames by applying solutions of BFRE systems. We consider the decomposition of the video frames into two types of frames, called Intra-frames (for short, I-frames) and Predictive frames (for short, P-frames), as proposed in (Nobuhara 2006). A P-frame is a frame whose similarity with the previous I-frame is higher than a prefixed threshold. Therefore, it can be compressed at a higher compression rate than that used to compress an I-frame.
We apply a similarity measure based on the bilinear fuzzy relation equations described in (Di Martino and Sessa 2017, 2018). The authors prove that the BFRE similarity index is more robust to noise than similarity measures based on the Lukasiewicz t-norm.
The iterative frame classification process proposed in (Nobuhara 2006; Di Martino et al. 2010) is used to classify frames; we treat the three bands Red (R), Green (G) and Blue (B) separately and then, for each band, we apply the BFRE similarity measure to compare a frame with the previous I-frame, where each frame is normalized in [0,1] to represent a fuzzy relation. Instead of setting the similarity threshold a priori, we execute a pre-processing phase in which the first k frames are analyzed; the first frame is marked as an Intra frame (I-frame) and the following frames as Predictive frames (P-frames). Each of the (k − 1) P-frames is subsequently reconstructed, and the trend of the PSNR is analyzed as the BFRE similarity index changes; the similarity threshold is chosen as the value of the similarity index below which the PSNR trend decreases rapidly.
Fig. 1 shows a scheme of the pre-processing phase.
The pre-processing phase ends after the similarity threshold value has been selected; then the BFRE compression process is executed, in which each frame is labelled and compressed as an I-frame or a P-frame depending on whether or not its BFRE similarity with the previous I-frame is less than the selected threshold. The BFRE decompression process reconstructs the original frame; finally, the three reconstructed frames in the three bands are merged to obtain the frame of the reconstructed color video.
The main performance benefits that characterize the proposed color video compression method are the following:
- it uses an algorithm based on the BFRE similarity between images to label frames; it allows color videos to be reconstructed with less information loss than in (Di Martino et al. 2010), in which a similarity measure based on Lukasiewicz's triangular norm is used. Indeed, in (Di Martino and Sessa 2017, 2018) it was demonstrated that the BFRE similarity is more robust to noise than the Lukasiewicz t-norm similarity;
- the adopted pre-processing phase determines, for each band, an optimal value of the similarity threshold, guaranteeing a lower loss of information in the reconstructed frames than that obtained by selecting other threshold values;
- the proposed method is independent of the image compression algorithm used to compress the frames;
- unlike deep learning video compression algorithms, it does not require high CPU time and memory storage.

Related work
In recent years the need to improve the quality of the reconstructed frames has prompted many researchers to propose video compression methods that minimize the rate distortion (RDO) of the reconstructed frames.
In (Jridi and Meher 2017) a video compression technique based on the Discrete Cosine Transform (for short, DCT) is applied; it performs better than other DCT-based video compression techniques, but it has high processing times. In (Wang et al. 2017) a video compression method based on a quadtree plus binary tree block partition structure is proposed; this approach represents a tradeoff between encoding complexity and quality performance, but it causes a non-negligible decrease in the quality of the reconstructed videos.
In (Haitham 2018) a new video-compression technique based on a Contourlet Transform used with a motion compensation technique (MC) is proposed.
Recently, deep neural network-based autoencoders for video compression were proposed. These methods minimize the rate distortion (RDO), setting or maximizing the number of bits (or bit rate) for the compression, in order to obtain a higher quality of the reconstructed frame. Their main benefits are their adaptiveness to the input and their non-linearity, which makes them flexible in detecting spatial variations in frames.
A neural video compression algorithm is proposed in (Chen et al. 2020) to predict the spatiotemporal relationship between blocks in video frames, minimizing the residual block error. In (Honggui and Trocan 2018) a deep neural framework is proposed in which multiple deep neural networks are trained to predict pixels of intra-frames based on pixels of previous frames.
An end-to-end neural network scheme is proposed in (Lu et al. 2018) to replace the standard codec structure. In (Lu et al. 2021) a hybrid variation of this end-to-end neural video compression framework is proposed to improve the compression efficiency and optimize the rate distortion. The results show that this hybrid video compression method provides better quality reconstructed frames than those obtained using traditional algorithms, but at the expense of high computational complexity.
A Deep Convolutional Neural Network with three hidden layers is used in (Putra et al. 2022) for intra-frame-based video compression; test results performed on color videos show that the quality of the reconstructed frames is better than that obtained using discrete cosine transform and fractal video-compression algorithms.
Surveys of deep learning video compression algorithms are provided in (Zhang et al. 2020; Bidwe et al. 2022). These studies show that deep learning video compression approaches improve the rate-distortion; however, they are slower than traditional video compression algorithms and require higher memory consumption.
In order to reach a trade-off between the quality of the reconstructed frames and CPU times, we propose a new color video compression method based on BFRE, in which the solutions of bilinear fuzzy relation equations are applied to measure the similarity between frames. (Loia and Sessa 2005) applied fuzzy relation equations in an iterative video compression method in which the Lukasiewicz fuzzy triangular norm measures the similarity between frames. The first frame is labelled as an I-frame; then, a t-norm based similarity measure is applied to compare the next frame with the previous I-frame. If this similarity value is higher than the selected threshold, the current frame is labelled as a P-frame; otherwise it is labelled as an I-frame and becomes the I-frame to be compared with the next frame. The frames are compressed by applying the fuzzy relation equations image compression method (Di Martino et al. 2003a, b; Nobuhara 2006). In (Di Martino et al. 2010) the video compression method proposed in (Loia and Sessa 2005) is applied using the Fuzzy transform (for short, F-transform) image compression algorithm (Perfilieva 2006; Di Martino and Sessa 2007; Di Martino et al. 2008) to compress each frame. The authors compare their method with the video compression method in (Loia and Sessa 2005), showing that better results in terms of quality of the reconstructed video are obtained by using the F-transform algorithm.
(Di Martino and Sessa 2013) proposed a new video compression method in which the frames are classified into I-frames, P-frames, and Bi-directional frames (for short, B-frames), following the MPEG-4 algorithm classification (Pereira and Ebrahimi 2002; Richardson 2010). A B-frame is reconstructed by using either the previous or the successive I-frame. The authors compare their method with the MPEG algorithm and the video compression method of (Loia and Sessa 2005), using the Peak Signal to Noise Ratio (for short, PSNR) index to measure the quality of the reconstructed frames. They show that the performances of the three algorithms are similar.
One of the main critical issues of fuzzy-based video compression algorithms is the selection of the similarity threshold between frames. In (Nobuhara et al. 2000; Loia and Sessa 2005) the Lukasiewicz t-norm operator is used to calculate the similarity between two frames considered as fuzzy relations; the similarity threshold is experimentally set at 0.95, as the quality of the reconstructed frames decreases exponentially when the similarity measure is below this value. However, this threshold value could vary according to the characteristics and type of frames. In the proposed method a preprocessing phase is adopted to optimize the choice of the similarity threshold in each band.
In Sect. 2 we introduce the BFRE image similarity measure, summarizing the method used to measure the similarity between images based on the BFRE greatest solution. In Sect. 3 the proposed BFRE color video compression method is presented. The results of experimental tests applied on color videos are shown and discussed in Sect. 4. Final considerations are discussed in Sect. 5.
Let F and G be two M × N grey images. We denote with $F_{p,q}$ and $G_{p,q}$ the values of the pixels of F and G at the p-th row and at the q-th column, with p = 1,…,M and q = 1,…,N. Let L be the number of grey levels of the two images.
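We construct a 3 × 3 window around the pixel $F_{p,q}$, with components $a_{ij}$ given by the values of the pixels of F in the window, normalized in [0,1] (1), and a corresponding 3 × 3 window around the pixel $G_{p,q}$, with components $b_{ij}$ obtained in the same way from G (2), where i = 1,…,3 and j = 1,…,3. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be the two resulting 3 × 3 fuzzy relations; we construct the 3 × 3 BFRE system:

$$\bigvee_{j=1}^{3} \left( a_{ij} \wedge x_j \right) = \bigvee_{j=1}^{3} \left( b_{ij} \wedge x_j \right), \qquad i = 1, 2, 3 \tag{3}$$

where $x = (x_1, x_2, x_3)$ is a vector of three fuzzy variables and the union and intersection operators are given, respectively, by the max s-norm and by the min t-norm.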
In (Li 1992) an algorithm, called the BFRE algorithm, is proposed to find the greatest solution of a BFRE system in the form (3).
The BFRE algorithm is applied in (Di Martino and Sessa 2018) to find the greatest solution of (3), given by $\hat{x} = (\hat{x}_1, \hat{x}_2, \hat{x}_3)^T$. Taking $\check{x} = (0,0,0)^T$ as the smallest solution of (3), the following measure of similarity between the two fuzzy relations A and B is defined in (Di Martino and Sessa 2018):

$$Sim^{BFRE}_{pq} = \frac{1}{3} \sqrt{\hat{x}_1^2 + \hat{x}_2^2 + \hat{x}_3^2} \tag{4}$$

The index $Sim^{BFRE}_{pq}$ varies between 0 and $\sqrt{3}/3$. Its value is 0 if $a_{ij} = 0$ (1) and $b_{ij} = 1$ (0) for each i = 1,2,3 and j = 1,2,3 (maximum difference between the two windows), and is $\sqrt{3}/3 \approx 0.577$ if the greatest solution of (3) is $\hat{x} = (1, 1, 1)^T$ (the two 3 × 3 windows are identical).
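As a worked illustration, consider the two windows below. The numerical values are chosen here for the example and are not taken from the cited papers, and the local index is assumed to be the Euclidean norm of the greatest solution divided by 3, a form that matches the stated range $[0, \sqrt{3}/3]$.

```latex
A = \begin{pmatrix} 0.9 & 0.3 & 0.3 \\ 0.3 & 0.9 & 0.3 \\ 0.3 & 0.3 & 0.9 \end{pmatrix},
\qquad
B = \begin{pmatrix} 0.5 & 0.3 & 0.3 \\ 0.3 & 0.9 & 0.3 \\ 0.3 & 0.3 & 0.9 \end{pmatrix}

% Rows 2 and 3 of A and B coincide, so only the first equation constrains x:
(0.9 \wedge x_1) \vee (0.3 \wedge x_2) \vee (0.3 \wedge x_3)
  = (0.5 \wedge x_1) \vee (0.3 \wedge x_2) \vee (0.3 \wedge x_3)

% If x_1 \le 0.5 both first terms equal x_1 and the equation holds;
% if x_1 > 0.5 the left side exceeds 0.5 while the right side is at most 0.5.
\hat{x} = (0.5,\, 1,\, 1)^T,
\qquad
\frac{1}{3}\sqrt{0.5^2 + 1^2 + 1^2} = \frac{1.5}{3} = 0.5
```

A single differing pixel in the window therefore lowers the local index from about 0.577 to 0.5 in this example.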
In (Di Martino and Sessa 2018) the following similarity index between the images F and G is defined:

$$Sim^{BFRE}(F, G) = \frac{1}{M \cdot N} \sum_{p=1}^{M} \sum_{q=1}^{N} Sim^{BFRE}_{pq} \tag{5}$$

where M and N are the number of rows and columns of the two images.
If F and G are color images, in (Di Martino and Sessa 2018) the similarity between the two images is given by the average of the similarities in the three bands Red (R), Green (G) and Blue (B):

$$Sim^{BFRE}(F, G) = \frac{1}{3} \left( Sim^{BFRE}_{R}(F, G) + Sim^{BFRE}_{G}(F, G) + Sim^{BFRE}_{B}(F, G) \right) \tag{6}$$

The similarity measure (6) is compared in (Di Martino and Sessa 2018) with other image similarity measures based on fuzzy relations. The authors measure the similarity between images distorted by Gaussian noise, proving that the similarity measure (6) is more robust to noise than other image similarity indices.
The algorithm used to calculate the similarity between two grey-level images, in which formula (5) is used, is schematized below in pseudocode.

Input: M × N grey images F and G with L grey levels
Output: Index of similarity between F and G, SIM_BFRE(F,G)
1. …
2. For i = 1 to M
3. For j = 1 to N
4. Construct the 3 × 3 fuzzy relation A by (1)
…
11. Next i
12. …
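The similarity computation can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the greatest solution is found with a simple monotone lowering iteration (not necessarily the algorithm of Li 1992), the local index is assumed to be the Euclidean norm of the greatest solution divided by 3, pixels are normalized by dividing by L − 1, and border pixels are handled by edge replication. The function names are hypothetical.

```python
import numpy as np


def greatest_solution(A, B):
    """Greatest x in [0,1]^n with max_j min(a_ij, x_j) == max_j min(b_ij, x_j) for every row i.

    Starts from x = (1,...,1) and only lowers components that no solution
    below the current x could keep, so it stops at the greatest solution.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    x = np.ones(A.shape[1])
    while True:
        left = np.minimum(A, x).max(axis=1)    # max-min composition A o x
        right = np.minimum(B, x).max(axis=1)   # max-min composition B o x
        violated = np.flatnonzero(left != right)
        if violated.size == 0:
            return x
        i = violated[0]
        # Lower the components that push the larger side above the smaller one.
        bigger, target = (A, right[i]) if left[i] > right[i] else (B, left[i])
        x = np.where(np.minimum(bigger[i], x) > target, target, x)


def local_similarity(A, B):
    """Assumed local index: Euclidean norm of the greatest solution, divided by 3."""
    return float(np.linalg.norm(greatest_solution(A, B)) / 3.0)


def image_similarity(F, G, levels=256):
    """Mean of the local indices over all pixels of two grey images of equal size."""
    F = np.asarray(F, dtype=float) / (levels - 1)   # assumption: normalize by L - 1
    G = np.asarray(G, dtype=float) / (levels - 1)
    M, N = F.shape
    Fp = np.pad(F, 1, mode="edge")                  # assumption: replicate borders
    Gp = np.pad(G, 1, mode="edge")
    total = 0.0
    for p in range(M):
        for q in range(N):
            total += local_similarity(Fp[p:p + 3, q:q + 3], Gp[p:p + 3, q:q + 3])
    return total / (M * N)
```

The double loop is written for readability; a practical implementation would vectorize or compile the per-window computation, since one small system is solved per pixel.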

The proposed video compression method
We propose a method based on the BFRE similarity measure (6) to compress color videos. We use the BFRE similarity measure to classify individual frames as I-frames or P-frames, treating videos in the three color bands R, G and B separately.
The first frame is always labeled as I-frame; then, the similarity (6) between the I-frame and the next frame is measured.If this similarity value is greater than a fixed threshold the next frame is labeled as P-frame, otherwise it is labeled as I-frame and is considered in the comparisons with the following frames.This process is iterated until the last frame of the video has been analyzed.
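The labelling loop described above can be sketched as follows. The function `classify_frames` and the stand-in `toy_similarity` are hypothetical names introduced only for this illustration; in the proposed method the similarity would be the BFRE index computed on a single band.

```python
import numpy as np


def classify_frames(frames, similarity, threshold):
    """Label each frame 'I' or 'P' by comparing it with the most recent I-frame."""
    if len(frames) == 0:
        return []
    labels = ["I"]                 # the first frame is always an I-frame
    reference = frames[0]
    for frame in frames[1:]:
        if similarity(reference, frame) > threshold:
            labels.append("P")     # similar enough to the current I-frame
        else:
            labels.append("I")     # becomes the reference for the next frames
            reference = frame
    return labels


def toy_similarity(F, G):
    """Hypothetical stand-in for the BFRE index: 1 minus the mean absolute difference."""
    return 1.0 - float(np.mean(np.abs(np.asarray(F, float) - np.asarray(G, float))))


frames = [np.zeros((4, 4)), np.zeros((4, 4)), np.ones((4, 4)), np.ones((4, 4))]
labels = classify_frames(frames, toy_similarity, threshold=0.9)
```

In this sketch a frame whose similarity equals the threshold is labelled as an I-frame, following the wording of the paragraph above.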
Following (Di Martino et al. 2010), we apply the Fuzzy Transform image compression method to compress the frames. The I-frames are compressed with a weak compression rate ρ_I and the P-frames are compressed with a strong compression rate ρ_P.
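To illustrate the role of the compression rate, the following is a minimal sketch of a two-dimensional discrete F-transform with uniform triangular fuzzy partitions. It makes several assumptions that are not stated in this section: the transform is applied to the whole frame rather than to blocks, the compression rate is read as ρ = (m · n)/(M · N) with the same reduction factor on both axes, and all function names are hypothetical.

```python
import math

import numpy as np


def triangular_partition(size, n_nodes):
    """Uniform triangular basis functions over pixel positions 0..size-1.

    Returns an (n_nodes, size) matrix whose columns sum to 1 (Ruspini condition).
    """
    nodes = np.linspace(0.0, size - 1.0, n_nodes)
    h = nodes[1] - nodes[0]
    positions = np.arange(size, dtype=float)
    return np.clip(1.0 - np.abs(positions[None, :] - nodes[:, None]) / h, 0.0, None)


def ft_compress(image, rho):
    """Direct F-transform: weighted means of the image under each pair of basis functions."""
    image = np.asarray(image, dtype=float)
    M, N = image.shape
    m = max(2, int(round(M * math.sqrt(rho))))   # assumed reading of the rate
    n = max(2, int(round(N * math.sqrt(rho))))
    Ar, Ac = triangular_partition(M, m), triangular_partition(N, n)
    weights = np.outer(Ar.sum(axis=1), Ac.sum(axis=1))
    return (Ar @ image @ Ac.T) / weights


def ft_decompress(components, shape):
    """Inverse F-transform: recombine the components with the same basis functions."""
    M, N = shape
    m, n = components.shape
    Ar, Ac = triangular_partition(M, m), triangular_partition(N, n)
    return Ar.T @ components @ Ac


frame = np.full((8, 12), 0.4)
compressed = ft_compress(frame, rho=0.25)        # 8 x 12 frame reduced to 4 x 6
restored = ft_decompress(compressed, frame.shape)
```

With ρ = 0.25 each axis is halved, and with ρ = 0.0625 each axis is reduced by a factor of four, which is why a frame compressed at the stronger rate loses more detail.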
In order to compress the P-frames with a strong compression rate, in (Loia and Sessa 2005; Di Martino et al. 2010) an approach was used in which the P-frame G is transformed into a difference frame D, called Δ-frame, defined by (7). As the Δ-frame D carries a low quantity of information, it can be compressed with a strong compression rate. If F̂ is the decoded I-frame and D̂ the decoded Δ-frame, the P-frame G can be reconstructed by formula (8). The proposed video compression algorithm is schematized below.

Algorithm 2: BFRE video compression
Input: A single-band M × N video with L grey levels
Output: Compressed video
1. Label the first frame as an I-frame
…
6. if Sim < Sim_th then
7. Label the current frame as an I-frame
8. …
9. Compress the current frame with the compression rate ρ_I
10. else
11. Label the current frame as a P-frame
12. …
13. Compress the Δ-frame D with the compression rate ρ_P
14. …
15. Until all frames are labeled
16. Store the compressed frames

The following algorithm schematizes the decompression process, in which a P-frame is reconstructed by (8).

Algorithm 3: BFRE video decompression
Input: Compressed single-band video
Output: Decompressed video
1. F := the first compressed frame
2. Decompress F using the compression rate ρ_I, obtaining F̂
3. Repeat
4. …
5. Decompress D using the compression rate ρ_P, obtaining D̂
6. …
7. Until all frames are reconstructed
8. Store the decompressed video

To fix the similarity threshold Sim_th, a preprocessing phase is executed in which the first k frames of the single-band video are analyzed; in this phase the first frame is marked as an I-frame and the following k − 1 frames are considered P-frames. Each compressed P-frame is subsequently reconstructed, and the PSNR measure is calculated to evaluate the quality of the reconstructed frame Ĝ compared with the corresponding original frame G. Then, the trend of the PSNR index with respect to the BFRE similarity between the frame and the first frame is constructed. The PSNR index between the original frame G and the reconstructed frame Ĝ is given by

$$PSNR = 20 \log_{10} \frac{L - 1}{RMSE}$$

where:

$$RMSE = \sqrt{\frac{1}{M \cdot N} \sum_{p=1}^{M} \sum_{q=1}^{N} \left( G_{p,q} - \hat{G}_{p,q} \right)^2}$$

The threshold Sim_th is selected as the value of the similarity index below which the PSNR trend decreases rapidly.
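The quality measure and the threshold choice can be sketched as follows. The PSNR uses the standard definition with peak value L − 1. The text does not define "decreases rapidly" numerically, so `select_threshold` implements one possible reading, labeled here as an assumption: the samples are sorted by decreasing similarity and the threshold is the similarity just above the largest single PSNR drop. Both function names and the sample values are hypothetical.

```python
import numpy as np


def psnr(original, reconstructed, levels=256):
    """Peak signal-to-noise ratio in dB for grey images with the given number of levels."""
    diff = np.asarray(original, dtype=float) - np.asarray(reconstructed, dtype=float)
    rmse = np.sqrt(np.mean(diff ** 2))
    if rmse == 0:
        return float("inf")        # identical frames
    return float(20.0 * np.log10((levels - 1) / rmse))


def select_threshold(similarities, psnrs):
    """Assumed heuristic: similarity value just above the largest PSNR drop."""
    order = np.argsort(similarities)[::-1]           # decreasing similarity
    s = np.asarray(similarities, dtype=float)[order]
    q = np.asarray(psnrs, dtype=float)[order]
    drops = q[:-1] - q[1:]                           # PSNR lost at each step down
    return float(s[int(np.argmax(drops))])


# Illustrative values only, not measurements from the paper.
sims = [0.56, 0.55, 0.54, 0.53, 0.52]
quality = [31.0, 30.9, 30.8, 28.0, 27.5]
sim_th = select_threshold(sims, quality)
```

Other readings, such as a slope criterion on a smoothed trend, would serve equally well; the choice affects only how the knee of the PSNR curve is located.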
The preprocessing method applied to select the similarity threshold is schematized in pseudocode in Algorithm 4.

Algorithm 4: BFRE similarity threshold selection
Input: The first k frames of an M × N single-band video with L grey levels
Output: The similarity threshold Sim_th
1. Label the first frame as an I-frame
2. F := the first frame
3. For i = 2 to k
4. G := the i-th frame
5. Sim …
6. Compress G with the compression rate ρ_P
7. Decompress G with the compression rate ρ_P
8. PSNR …
9. Next i
10. Fix Sim_th as the similarity value below which the PSNR trend decreases rapidly
11. Return Sim_th

We tested our method by applying it to color videos with different sizes. To measure the performance of the BFRE video compression algorithm, we compared the results with those obtained using the MPEG-4 video compression method, the video compression method based on fuzzy transform (Di Martino and Sessa 2007) and the end-to-end Deep Video Compression DVC_Pro in (Lu et al. 2021). The results are shown and discussed in the next section.

Results
To test the BFRE video compression algorithm, we executed it on a set of 16 color videos extracted from the Fast-Moving Object (FMO) web page http://cmp.felk.cvut.cz/fmo. The video frames have sizes 1920 × 1080, 1280 × 820 and 960 × 540 pixels. An Intel(R) Core(TM) i7 processor with a CPU clock speed of 2.90 GHz was used for all tests.
For the sake of brevity, only the results obtained on the color videos darts_window1, made up of 179 frames, and ping_pong_side, made up of 445 frames, are shown in detail. Fig. 2a and b show the first frames of the two color videos.
All the frames were decomposed into the three bands Red, Green and Blue, and the frames in each band were treated separately. In these tests the values ρ_I = 0.25 and ρ_P = 0.0625 were selected for the compression rates of I-frames and P-frames, respectively. We first show the results obtained for the color video darts_window1. The first 40 frames are used in the preprocessing phase to fix the similarity threshold. The first frame is considered an I-frame and the other frames are all considered P-frames. Fig. 3 shows the trend of the PSNR index with respect to the BFRE similarity in the three bands.
All three trends in Fig. 3 show a rapid decrease below the similarity value 0.535; hence the selected similarity threshold is Sim_th = 0.535 in every band. After selecting the similarity threshold, the BFRE video compression and BFRE video decompression algorithms were executed to obtain the reconstructed video. The first two reconstructed frames are shown in Fig. 4a-b.
The plot in Fig. 5 shows the trend of the PSNR index with respect to the BFRE similarity measure obtained by executing the BFRE video compression and decompression algorithms. The PSNR value computed for each frame is given by the average of the PSNR measured in the three bands. The PSNR index rises as the similarity increases, oscillating between the values 29.6 and 30.9.
To analyze the performance of the BFRE video compression method, we compared the results with those obtained by executing the fuzzy transform-based video compression algorithm in (Di Martino et al. 2010) (FTR), the ISO/IEC standard MPEG-4 (Pereira and Ebrahimi 2002) (MPEG) and the end-to-end Deep Video Compression DVC_Pro in (Lu et al. 2021). In the comparison, the same ρ_I and ρ_P compression rates selected for I-frames and P-frames are used when performing FTR. A bit rate of 10 Mbps was selected for the MPEG-4 compression. The learning-based optical flow model Spynet (Ranjan and Black 2017) is applied in DVC_Pro to assess motion information. Table 1 shows the mean, standard deviation, min and max values of the PSNR index measured for all the reconstructed frames by executing BFRE, FTR, MPEG and DVC_Pro.
These results show that BFRE provides better results than FTR and PSNR values similar to those obtained by running MPEG and DVC_Pro. Moreover, BFRE shows better stability than FTR, MPEG and DVC_Pro, providing the lowest standard deviation of the PSNR value over all frames. We now show the results obtained for the color video ping_pong_side. Fig. 6a-c show the trend of the PSNR index with respect to the BFRE similarity measure in the three bands.
The trends of the PSNR in the bands G and B (Fig. 6b and c) show a rapid decrease below the similarity value 0.500; hence we select a similarity threshold Sim_th = 0.500 in these two bands. The trend in the R band shows a rapid decrease below a BFRE similarity of 0.540; this value is selected as the similarity threshold in the R band.
The first two reconstructed frames are shown, respectively, in Fig. 7a and b.
Table 2 shows the mean, standard deviation, min and max values of the PSNR index measured for all the reconstructed frames by executing BFRE, FTR and MPEG.
The results in Table 2 confirm that the BFRE algorithm provides better performance than FTR in terms of mean PSNR. Moreover, the mean PSNR is comparable with the one measured by executing MPEG-4 and DVC_Pro. In addition, BFRE shows better stability than FTR, MPEG and DVC_Pro, showing the lowest value of the standard deviation measure.

Fig. 1 Architectural scheme of the BFRE video compression pre-processing phase

Algorithm 2 BFRE video compression. Input: a single-band M × N video with L grey levels. Output: compressed video

Algorithm 4 BFRE similarity threshold selection. Input: the first k frames of an M × N single-band video with L grey levels. Output: the similarity threshold Sim_th

Fig. 3 a PSNR trend, R band. b PSNR trend, G band. c PSNR trend, B band

Fig. 6 a PSNR trend, R band. b PSNR trend, G band. c PSNR trend, B band

Table 3 Mean PSNR and standard deviation obtained for the color videos in the FMO dataset. The highest values of mean PSNR and the lowest values of PSNR standard deviation are shown in bold