1 Introduction

In the HEVC standard, the performance of motion estimation (ME) depends heavily on the advanced motion vector prediction (AMVP) technology [1]. The motion vector predictor (MVP) is selected from a candidate list that consists of one motion vector from the neighboring units to the left of the current coding unit, one motion vector from the neighboring units above it, and the motion vector at the same spatial position in the previously encoded frame. The candidate with the minimum cost is selected as the final MVP. AMVP is deliberately simplified to provide a good trade-off between coding efficiency and implementation cost. However, its fixed decision pattern does not consider the reliability of the surrounding motion vectors, which lowers its estimation accuracy.

Several previous works have been proposed to improve the performance of MV coding. The spatial and temporal MVP candidate schemes for video coding share one assumption: the motion of neighboring blocks is similar [2,3,4,5,6,7,8]. Lin et al. present a new location for the temporal motion vector predictor and a priority-based derivation algorithm for spatial and temporal MVPs [1, 2]. Jung et al. propose the motion vector competition (MVC) scheme, which selects one motion vector predictor among the given candidates [3]. These methods can increase the efficiency of motion vector coding in HEVC. Yang et al. define a predictor candidate set from which the motion vector predictor is generated by a minimum motion vector difference criterion [4]. Chien et al. design an enhanced AMVP mechanism to obtain an accurate motion vector predictor [5]. However, the spatial and temporal MVP candidates lack precision, so these approaches yield only limited improvements in motion vector coding.

Motion estimation is a core part of video coding and improves the coding efficiency significantly, but it also increases the coding complexity significantly. Several adaptive search range (SR) methods have been presented to reduce the encoding complexity. According to how they determine a suitable SR, they can be classified into two categories: MV-based methods and methods based on the sum of absolute differences (SAD).

The MV-based methods [10,11,12,13,14,15] rely on the observation that, when the distribution of the motion vector difference (MVD) is concentrated around zero with a small variance, the SR can be reduced. Lou et al. present an adaptive motion search range method to reduce the memory access bandwidth [10,11,12]. In this method, a probability model is used to choose an SR that is likely to contain the optimal MV. In Dai’s work [13], an adaptive SR method based on a Cauchy distribution is proposed. These methods can reduce the encoding complexity significantly.

The SAD-based methods [16, 17] set a threshold on the SAD value to decide whether the video content contains motion. However, the SAD-based methods are unreliable.

In summary, a universal motion vector prediction framework is proposed in this work to improve the coding efficiency. The main disadvantages of the fixed MV coding pattern in HEVC and of the previous work are insufficient precision and low robustness. Therefore, a novel motion vector prediction method is first used to generate the optimal MV. In addition, the fixed SR adopted in the motion search causes a large amount of redundant computation. Thus, an adaptive search range selection method is developed to reduce the encoding complexity of ME, benefiting from the accurate motion vector prediction. Unlike state-of-the-art methods, the proposed approach treats the MVP selection and the SR selection jointly.

2 Proposed Method

2.1 Novel Motion Vector Prediction (NMVP)

For video content with strong spatial correlation, the motion vector predictor of the current CU can be generated from the adjacent CUs. The novel motion vector prediction is based on the previous work [9]. Unlike the fixed-pattern AMVP technology, it uses a spatial neighborhood cluster G composed of all spatially neighboring CUs, as shown in Fig. 1, where \(CU_L\), \(CU_{TR}\) and \(CU_{TL}\) denote the left, top-right and top-left CUs of the current CU, respectively. The cluster G is defined as

$$\begin{aligned} G = \{CU_L,\ldots CU_{TL},\ldots CU_{TR}\} \end{aligned}$$
(1)

Fig. 1. Spatial correlation neighborhood cluster.

The MVs and depth information of G can be used to predict the MVP of the current CU. However, checking all of this information has a high computational complexity. Thus, a relatively reliable sub-cluster should be used for the MVP. To utilize the spatial correlation, the sub-cluster M is defined as

$$\begin{aligned} M = \{CU_L, CU_{TL}, CU_{TR}\} \end{aligned}$$
(2)

The sub-cluster M is contained in the cluster G (\(M\subset G\)). \(MV_{L}\), \(MV_{TR}\) and \(MV_{TL}\) denote the MV candidates of the left, top-right and top-left CUs of the current CU, respectively.

The basic idea of the proposed MVP method is to predict the MVP of the current CU in advance from the MVs of the spatially adjacent CUs. When the sub-cluster M is available, the information of M is used to predict the MVP of the current CU. In contrast, when the sub-cluster M is unavailable, the information of G is used instead.

When the MVs of sub-cluster M are available, an MV can simply be selected from M as the optimized MV for the current CU. Conversely, when the MVs of sub-cluster M are not available, the reliability of the candidate MVs is the lowest, and it is hard to obtain an accurate MVP with the fixed AMVP mechanism. In this case, the best MVP may lie near the left of the CU or near the top of the CU. Thus, all available MVs of the spatial neighborhood cluster G need to be checked: all surrounding MVs of G are added to the candidate MVs, and the cost of each candidate is evaluated to obtain the optimized MVP.
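The selection rule above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes that "M is available" means all three MVs of M exist, that the candidate with the lowest cost is taken in both branches, and that the cost function is supplied by the caller. The names `select_mvp`, `cluster_g`, and the position keys are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) motion vector

# Hypothetical keys for the sub-cluster M = {CU_L, CU_TL, CU_TR}.
SUB_CLUSTER_M = ("L", "TL", "TR")


def select_mvp(cluster_g: Dict[str, Optional[MV]],
               cost: Callable[[MV], float]) -> Optional[MV]:
    """Pick an MVP from M when all of M is available, otherwise from all of G.

    cluster_g maps a neighbor position to its MV, or None if unavailable.
    cost is an assumed caller-supplied cost (e.g. a rate-distortion cost).
    """
    sub_m = [cluster_g.get(key) for key in SUB_CLUSTER_M]
    if all(mv is not None for mv in sub_m):
        # Reliable case: only the three MVs of M are checked.
        candidates = sub_m
    else:
        # Unreliable case: every available MV of G becomes a candidate.
        candidates = [mv for mv in cluster_g.values() if mv is not None]
    if not candidates:
        return None
    return min(candidates, key=cost)
```

With a toy cost such as the L1 distance to a known best MV, the first branch checks three candidates, while the fallback branch checks every available neighbor, which is where the extra encoding time of the method would come from.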

2.2 Adaptive Search Range Selection (ASRS)

An accurate MVP is critical to SR reduction, and this subsection studies that relationship. When the optimized MV is close to the MVP, a smaller SR can be used in the ME. In contrast, a larger SR should be used if the optimized MV is far from the MVP. The difference between the optimized MV and the MVP is called the MV prediction difference (MVPD). Thus, when the distribution of the MVPD is concentrated near the center with a small variance, a smaller SR can be adopted.

Previous adaptive search range selection (ASRS) methods are reported in [10, 12]. They use the MV variance (\(\sigma ^2\)) of the spatially and temporally neighboring CUs to adjust the search range in ME, and they model the distribution of the MVPD with a Cauchy distribution. In [10], the probability density function of the MVPD is used to calculate the probability that the optimized MV lies within the SR. The probability density function of the zero-mean Cauchy distribution can be defined as

$$\begin{aligned} f_{ME}(x,y) = f_{ME,X}(x)\cdot f_{ME,Y}(y) \end{aligned}$$
(3)
$$\begin{aligned} f_{ME,X}(x) =\frac{C_x}{|\frac{x}{\zeta _x}|^\frac{5}{3} +1} \end{aligned}$$
(4)
$$\begin{aligned} f_{ME,Y}(y) =\frac{C_y}{|\frac{y}{\zeta _y}|^\frac{5}{3} +1} \end{aligned}$$
(5)

where \(C_x\) and \(C_y\) are normalization constants, and \(\zeta _x\) and \(\zeta _y\) are parameters of the modified zero-mean Cauchy function, which can be computed from the sample variances of the MVPDs (\(\sigma ^2\)). Thus, the probability that the optimized MV lies within the SR can be defined as

$$\begin{aligned} F_{ME}(SR_x, SR_y) = F_{ME,X}(SR_x)\cdot F_{ME,Y}(SR_y) \end{aligned}$$
(6)
$$\begin{aligned} F_{ME,X}(SR_x) =\int _{-SR_x-0.5}^{SR_x+0.5} f_{ME,X}(x)dx \end{aligned}$$
(7)
$$\begin{aligned} F_{ME,Y}(SR_y) =\int _{-SR_y-0.5}^{SR_y+0.5} f_{ME,Y}(y)dy \end{aligned}$$
(8)

where \(SR_x\) and \(SR_y\) are the horizontal and vertical components of the search range, respectively. For a given probability \(C_{prob}\) that the search range contains the optimized MV, the dynamic search range is the smallest SR whose probability is no less than \(C_{prob}\), and it satisfies

$$\begin{aligned} F_{ME,X}(SR_x) = F_{ME,Y}(SR_y)=\sqrt{C_{prob}} \end{aligned}$$
(9)

where the computed dynamic search range is denoted by \(SRx\_dyn\) and \(SRy\_dyn\) for the horizontal and vertical components, respectively. In this work, the default SR is denoted by \(SR\_def\). When the MVP is accurate enough, the minimal search range (denoted by \(SRx\_min\) and \(SRy\_min\)) is set to 1/8 of the default SR.
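As a rough illustration of Eqs. (4), (7) and (9), the following sketch searches for the smallest integer SR in one dimension whose probability reaches \(\sqrt{C_{prob}}\). It rests on assumptions that the text does not state: the density is normalized over the whole real line, which gives the closed form \(C = \sin(3\pi /5)\cdot 5/(6\pi \zeta )\); the integral is approximated with a midpoint rule; and \(\zeta \) is passed in directly because the mapping from \(\sigma ^2\) to \(\zeta \) is not given here. All function names are hypothetical.

```python
import math

P = 5.0 / 3.0  # exponent of the modified Cauchy model in Eqs. (4)-(5)


def norm_const(zeta):
    """Normalization constant, assuming the pdf integrates to 1 over the reals.

    Uses the identity: integral_0^inf du / (1 + u^p) = (pi/p) / sin(pi/p).
    """
    return math.sin(math.pi / P) / (2.0 * zeta * (math.pi / P))


def pdf(x, zeta):
    """One-dimensional density of Eq. (4)."""
    return norm_const(zeta) / (abs(x / zeta) ** P + 1.0)


def prob_within(sr, zeta, steps=2000):
    """Eq. (7): probability mass on [-sr-0.5, sr+0.5] (midpoint rule)."""
    a, b = -sr - 0.5, sr + 0.5
    h = (b - a) / steps
    return h * sum(pdf(a + (k + 0.5) * h, zeta) for k in range(steps))


def dynamic_sr(zeta, c_prob, sr_def=64):
    """Smallest integer SR with prob_within >= sqrt(c_prob), capped at sr_def."""
    target = math.sqrt(c_prob)  # Eq. (9), per dimension
    for sr in range(sr_def + 1):
        if prob_within(sr, zeta) >= target:
            return sr
    return sr_def  # target not reachable inside the default range
```

Because the model has a heavy tail, a larger \(\zeta \) (a wider MVPD spread) pushes the returned SR up quickly, and a very high \(C_{prob}\) simply falls back to the default range.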

Fig. 2. Search range refinement.

Recall that the spatial-correlation-based MVP decision algorithm was described in the previous subsection, where the reliability of the neighboring candidate MVs is used to obtain the optimized MVP. Because the MVP selection and the SR reduction are treated jointly, the SR can be small when a high-precision MVP close to the optimized MV is selected. An example of the search range refinement is shown in Fig. 2. In this case, the modified search range is set to \(SR_x\) and \(SR_y\), while the default SR (\(SR\_def\)) is set to \(64\times 64\) in the ME of the HEVC reference software.

2.3 Universal Motion Vector Prediction Framework

Combining the NMVP and ASRS methods, the universal motion vector prediction algorithm is shown in Algorithm 1. First, the optimized MVP is decided from the MVs of the spatially neighboring CUs. Second, the search range is adjusted according to the probability that the optimized MV lies within the SR and the reliability of the MV candidates. Note that, when the reliability of the neighboring candidate MVs is the highest, the SR can be reduced significantly and is set to (\(SRx\_min\), \(SRy\_min\)). When the reliability of the neighboring candidate MVs is the lowest, the SR is set to {\(\min \)(\(SR\_def\), \(SRx\_min+SRx\_dyn\)), \(\min \)(\(SR\_def\), \(SRy\_min+SRy\_dyn\))}.

Algorithm 1. Universal motion vector prediction.
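The search range rule of the framework can be sketched as below. This is a simplified illustration, not a reproduction of Algorithm 1: it assumes the reliability of the candidate MVs is reduced to a single boolean (highest versus lowest), that the default SR is 64, and that the minimal SR is the integer 1/8 of the default. The function name and parameters are hypothetical.

```python
def refine_search_range(reliable, sr_x_dyn, sr_y_dyn, sr_def=64):
    """Return (SR_x, SR_y) for the current CU.

    reliable: True when the neighboring candidate MVs are considered most
              reliable (assumed here to be a simple boolean).
    sr_x_dyn, sr_y_dyn: dynamic search ranges from the probability model.
    """
    sr_min = sr_def // 8  # minimal SR is 1/8 of the default SR
    if reliable:
        # Highest reliability: shrink to the minimal search range.
        return sr_min, sr_min
    # Lowest reliability: widen by the dynamic range, capped at the default.
    return (min(sr_def, sr_min + sr_x_dyn),
            min(sr_def, sr_min + sr_y_dyn))
```

For instance, with the default range of 64, a reliable CU is searched within 8 samples in each direction, while an unreliable CU with dynamic ranges of 20 and 60 is searched within 28 and 64 samples.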

3 Experiment Results

The proposed algorithm is implemented and verified on the HEVC test model HM16.12. The test conditions are set to evaluate the performance of the proposed algorithm under different configurations (RA and LD) [18]. The quantization parameters (\(QP_i\)) are set to 22, 27, 32 and 37. In this work, the search strategy is TZSearch, and the SR is adjusted adaptively.

The performance of the proposed algorithm is evaluated using the Bjøntegaard delta bitrate (BDBR) [19], and the average time increase (TI) is defined as

$$\begin{aligned} TI(\%)=\frac{1}{4}\sum _{i=1}^{i=4}\frac{T_{Pro}(QP_i)-T_{HM}(QP_i)}{T_{HM}(QP_i)} \times 100\% \end{aligned}$$
(10)

where \(T_{HM}(QP_i)\) and \(T_{Pro}(QP_i)\) are the encoding times of the HEVC reference software and the proposed method, respectively, at each \(QP_i\).
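Eq. (10) amounts to averaging the relative time difference over the tested QPs. A minimal sketch follows; the function name is hypothetical, and the timings in the usage note are made-up numbers for illustration only, not measurements from this work.

```python
def time_increase(t_pro, t_hm):
    """Average encoding-time increase in percent, following Eq. (10).

    t_pro, t_hm: encoding times of the proposed method and of HM,
                 one entry per QP (four entries in this work).
    """
    if len(t_pro) != len(t_hm) or not t_hm:
        raise ValueError("need one timing per QP for both encoders")
    ratios = [(p - h) / h for p, h in zip(t_pro, t_hm)]
    return 100.0 * sum(ratios) / len(ratios)
```

For example, if the proposed encoder needed 150 s and HM needed 100 s at each of the four QPs, `time_increase([150] * 4, [100] * 4)` would give a TI of 50%.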

Table 1. Performance of universal motion vector prediction.
Fig. 3. R-D curve of BasketballDrive (RA).

Table 1 shows the performance of the universal motion vector prediction algorithm. In the RA case, the bitrate is reduced by 5.49% on average, while the encoding time increases by 51.13%. In the LD case, the bitrate is reduced by 5.34% on average, while the encoding time increases by 44.96%. The proposed algorithm thus improves the coding efficiency significantly.

Notably, the proposed algorithm also improves the performance for sequences with severe motion, which is the main contribution of this paper. For the BasketballDrive sequence in the RA case, the bitrate is reduced by 7.85%, and the R-D curve is shown in Fig. 3.

It should be noted that the proposed method raises the encoding complexity along with the coding efficiency. However, for applications that prioritize coding efficiency over real-time encoding, it is an efficient approach for improving the coding efficiency of HEVC.

4 Conclusion

In this work, a universal motion vector prediction framework is presented to improve the performance of HEVC. The simulation results demonstrate that the proposed overall algorithm improves the encoding efficiency by 5.34–5.49% on average, while the encoding time increases by about 45–51%.