A Novel Kite Cross Hexagonal Search Algorithm for Fast Block Motion Estimation

The quality and search speed of a Block Matching (BM) algorithm are affected by the shapes and sizes of the search patterns it uses. In this paper, Kite Cross Hexagonal Search (KCHS) is proposed. This algorithm uses different search patterns (kite, cross, and hexagonal) to search for the best Motion Vector (MV). In the first step, KCHS uses a cross search pattern. In the second step, it uses one of four kite search patterns (up, down, left, or right, depending on the result of the first step). In subsequent steps, it uses large and small Hexagonal Search (HS) patterns. The new algorithm is compared with several known fast block matching algorithms. Comparisons are based on the number of search points and the Peak Signal to Noise Ratio (PSNR). According to the results obtained in this paper, KCHS needs less search time than the other algorithms and gives acceptable quality.


Introduction
Efficient coding of video sequences is essential in many multimedia and communication applications. These applications require a very high compression ratio because of the limited channel bandwidth for real video playback [1].
Block Matching (BM) algorithms have been widely used in video coding standards such as the MPEG series and H.26x because they are efficient and simple to implement in both software and hardware. In block matching, the current frame is divided into Macro Blocks (MBs) of equal size. Each MB is compared with the block at the same position, and with the adjacent blocks, in the previous frame (the reference frame). The motion vector (MV) represents the movement of a macro block from one location to another in the reference frame. The MVs of all blocks in the frame are taken as the estimated motion of that frame. The search area for a good macro block match is limited to p pixels on all four sides of the corresponding macro block in the previous frame, where p is called the search parameter. Larger motions require a larger search parameter, which makes motion estimation more computationally expensive. The output of a cost function is used to match one macro block with another. The cost function is based on a Block Distortion Measure (BDM), and the block with the minimum cost is considered the closest to the current block. Several cost functions are in use. Mean Absolute Difference (MAD), given by equation (1), is the most popular and is less computationally expensive. Mean Squared Error (MSE) is another cost function and is given by equation (2).
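\[ \mathrm{MAD} = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| C_{ij} - R_{ij} \right| \tag{1} \]
\[ \mathrm{MSE} = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left( C_{ij} - R_{ij} \right)^2 \tag{2} \]
where N is the side length of the macro block, and C_{ij} and R_{ij} are the pixels being compared in the current macro block and the reference macro block, respectively.

steps to find a small region where the best motion vector is expected to lie. Finally, SDSP is used to find the final MV [6]. The Cross Hexagonal Search (CHS) algorithm was proposed by S. Zhu et al. in 2009. In this algorithm, an SCSP is applied in each of the first two steps. Each step has a halfway stop when the minimum BDM point is at the center. An LCSP is applied in the third step, which also has a halfway stop. Then, an LHSP is applied repeatedly until the minimum BDM point is at the center. The final step applies SHSP to find the final motion vector [7].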
A new algorithm was proposed by R. Bali and V.K. Govindan in 2014. It combines Three Step Search (TSS) and Simple and Efficient Search (SES). The search window in the reference frame is divided into four quadrants. SES is used to select the best of these quadrants, while TSS is used to find the final MV within the selected quadrant. The experiments show that this algorithm gives better results than both TSS and SES [8].
S.KU. Chhotary et al. proposed a new hybrid algorithm that combines TSS and DS in 2016. The first step of TSS is applied in the initial step. The next action depends on the position of the minimum BDM point:
a) If it is the center, step 3 of TSS is applied and the search stops.
b) If it is one of the corners, the search continues with the same steps of TSS.
c) If it is on one of the axes, LDSP is applied, followed by the last step of TSS [9].
A new algorithm called Three Step Cross Search (TSCS) was proposed by R. Bhandari and A. Vyas in 2016. This algorithm consists of three steps. In the first two steps, a cross-shaped (+) pattern is applied with step size (S) equal to 4 in the first step and 2 in the second. In the last step, the 8 points surrounding the center point found in the previous step are evaluated to get the final motion vector [10].
S.K. Sahu and D. Shukla proposed a new algorithm called Star Diamond-Diamond Search (SDDS) in 2018. In the first step, a star diamond search pattern consisting of 9 points is applied. In the following steps, an SDSP is applied repeatedly until the minimum BDM point is at the center. The star pattern is applied only once, to detect whether the motion is stationary, vertical, diagonal, or horizontal [11].

Block matching algorithms
Several of the most popular and widely used block matching algorithms are described briefly in the subsections below.

Full Search (FS)
Full search (also called Exhaustive Search (ES)) calculates the cost function at every possible location in the search window. When the search window range in the reference frame is ±w in both directions, FS checks (2w + 1)^2 search points to find the solution. It gives the optimal solution and the highest PSNR, but it is the most computationally expensive algorithm, and the number of computations grows as the search window gets larger. It is therefore not suitable for coding real-time video [12]. For this reason, many block matching algorithms with different search strategies and patterns have been proposed to reduce computation while still giving good results.
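The exhaustive procedure can be sketched in a few lines. The following is a minimal illustration, not the implementation used in this paper: it assumes frames are stored as lists of rows of integer pixel values, uses MAD as the cost function, and skips candidate blocks that fall outside the frame. The names `mad` and `full_search` and their parameters are chosen here for illustration only.

```python
def mad(cur, ref, bx, by, dx, dy, n):
    """Mean absolute difference between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
    return total / (n * n)


def full_search(cur, ref, bx, by, n, w):
    """Check every displacement in [-w, w] x [-w, w] and return
    (best motion vector, its cost, number of points checked)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost, checked = (0, 0), float("inf"), 0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            x, y = bx + dx, by + dy
            # Assumption: candidates partly outside the frame are ignored.
            if x < 0 or y < 0 or x + n > width or y + n > height:
                continue
            checked += 1
            cost = mad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost, checked
```

For a block far enough from the frame border, `checked` equals (2w + 1)^2, for example 225 points when w = 7, which is the cost that the fast algorithms below try to avoid.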

Three Step Search (TSS)
TSS is one of the oldest fast block matching algorithms. Koga et al. proposed it in 1981 [13]. It consists of three steps. In each step, it checks the eight points at distance ±S around the center point, where S is the step size; the center point itself is checked only in the first step. The point with the least cost becomes the center point for the next step.
Step size is first set to four (i.e. S = 4 if search parameter is 8) and is divided by two after each step. After completing the three steps, the MV is set to the position of the minimum BDM point (i.e. the best match) [9].
TSS is based on the assumption of a unimodal error surface, meaning that the error increases monotonically as the search moves away from the global minimum. In practice, this assumption does not always hold, for reasons such as inconsistent block segmentation of moving objects and background, and the aperture problem [14]. The TSS procedure is shown in figure 2. The advantages of TSS are 1) it is simple, 2) it reduces computation significantly, and 3) it has good performance. For these reasons it has been used in many applications. The disadvantages of TSS are 1) it is not suitable for small motions because it uses a uniform search pattern, and 2) it uses a large square search pattern (9x9) in the first step, so there is a high probability that it gets trapped in a local minimum.
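The three steps described above can be sketched as follows. This is an illustrative sketch only: it assumes the block-matching cost is supplied as a function `cost(dx, dy)` of the candidate displacement (for instance a MAD evaluation), and it does not clip candidates to the frame. The function name and parameters are not taken from the paper.

```python
def three_step_search(cost, step=4):
    """Three Step Search around displacement (0, 0).

    `cost(dx, dy)` is assumed to return the block distortion for a
    candidate displacement. Returns (motion vector, cost, points checked).
    """
    cx, cy = 0, 0
    best = cost(cx, cy)      # the center is evaluated only in the first step
    checked = 1
    while step >= 1:
        bx, by = cx, cy
        # Eight points at distance +/-step around the current center.
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue
                c = cost(cx + dx, cy + dy)
                checked += 1
                if c < best:
                    best, bx, by = c, cx + dx, cy + dy
        cx, cy = bx, by      # least-cost point becomes the next center
        step //= 2           # 4 -> 2 -> 1 -> 0 (loop ends)
    return (cx, cy), best, checked
```

With the default step size of 4, the sketch always evaluates 1 + 3 x 8 = 25 points, regardless of the motion, which illustrates why a fixed uniform pattern is wasteful for small motions.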

Cross Search (CS)
This algorithm was proposed by Ghanbari in 1990 [15]. It starts with a 5x5 search pattern forming an (X) shape and a step size of 4 when the maximum motion displacement is 8. In each step, the four locations at the ends of the (X) pattern are checked. The minimum BDM point becomes the center point of the next step and the step size is divided by 2. This procedure is repeated until the step size becomes equal to 1. At this step size, if the minimum BDM point of the previous step is at the center, upper left, or lower right, CS uses a (+) search pattern; otherwise, CS uses an (X) search pattern. The minimum BDM point in this step gives the final motion vector [16]. The CS procedure is shown in figure 3.

Four Step Search (FSS)
This algorithm was proposed by L.M. Po and W.C. Ma in 1996 [17]. FSS uses a square search pattern of size 5x5 that consists of nine search points. FSS consists of four steps. The step size is fixed at 2 (i.e. S = 2) during the first three steps, regardless of the value of the search parameter p. In the fourth step, the window size is reduced to 3x3, i.e. S = 1. In the first step, FSS checks the 8 points at distance ±S around the center point in addition to the center point (i.e. 9 points). In the second and third steps, three or five points are checked, depending on the position of the minimum BDM point in the previous step, because the other points overlap with points already checked. In the fourth step, the 8 points at distance ±S around the center point are checked and the final MV is found. If the center point is the minimum BDM point in step one or step two, the algorithm skips the remaining intermediate steps and immediately executes the fourth step [12]. The FSS procedure is shown in figure 4.

Diamond Search (DS)
S. Zhu and K.K. Ma introduced this algorithm in 2000 [18]. DS is similar to FSS, but it uses a diamond search pattern instead of a square search pattern, and the number of steps is not restricted. DS uses two search patterns: the Large Diamond Search Pattern (LDSP), consisting of 9 checking points as shown in figure 5a, and the Small Diamond Search Pattern (SDSP), consisting of 5 checking points as shown in figure 5b. In the first step, LDSP is applied. In the second step, if the center point is the minimum BDM point, SDSP is applied and the search stops; otherwise, LDSP is applied repeatedly until the center point is the minimum BDM point. After that, SDSP is applied and the final MV is found. When LDSP is applied repeatedly, some checking points overlap with the previous step, so only 3 or 5 new points are checked [19]. The DS procedure is shown in figure 6.
The advantages of DS are 1) it uses a diamond search pattern of moderate size, which is better than a rectangular search pattern, so the global minimum can be found accurately [12], and 2) the probability of being trapped in a local minimum is reduced because DS does not limit the number of steps taken to reach the final solution.

Hexagonal Search (HS)
HS uses two search patterns: the Large Hexagonal Search Pattern (LHSP) and the Small Hexagonal Search Pattern (SHSP). In the first step, LHSP is applied. In the second step, if the center point is the minimum BDM point, SHSP is applied and the search stops; otherwise, LHSP is applied repeatedly until the center point is the minimum BDM point. After that, SHSP is applied and the final MV is found. When LHSP is applied repeatedly, some checking points overlap with the previous step, so only 3 new points are checked [19]. The HS procedure is shown in figure 8.
In these repeated steps HS checks only 3 new points because the other points overlap with the previous step, which makes it faster than DS, which checks 3 or 5 points in these steps. The disadvantage of HS is that a large search pattern with few points can sometimes trap the algorithm in a local minimum.
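DS and HS share the same control flow and differ only in the offsets of the large and small patterns, which the sketch below makes explicit. It is an illustration under stated assumptions: the cost is supplied as a function `cost(dx, dy)`, the pattern offsets are the commonly used ones (they are not listed in the text above), y grows downward, and a search limit `w` is added so candidates stay inside the window. Already-evaluated points are cached so that overlapping points are not counted twice.

```python
# Offsets relative to the current center (the center itself is implicit).
LDSP = [(0, -2), (1, -1), (2, 0), (1, 1), (0, 2), (-1, 1), (-2, 0), (-1, -1)]
SDSP = [(0, -1), (1, 0), (0, 1), (-1, 0)]
LHSP = [(-2, 0), (2, 0), (-1, -2), (1, -2), (-1, 2), (1, 2)]
SHSP = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def pattern_search(cost, large, small, w=7):
    """Repeat the large pattern until its center is the minimum, then
    refine once with the small pattern.
    Returns (motion vector, cost, number of distinct points checked)."""
    cache = {(0, 0): cost(0, 0)}

    def probe(center, offsets):
        best = center
        for dx, dy in offsets:
            p = (center[0] + dx, center[1] + dy)
            if max(abs(p[0]), abs(p[1])) > w:
                continue                      # outside the search window
            if p not in cache:
                cache[p] = cost(*p)           # only new points cost anything
            if cache[p] < cache[best]:
                best = p
        return best

    center = (0, 0)
    while True:
        nxt = probe(center, large)
        if nxt == center:                     # minimum is at the center
            break
        center = nxt
    center = probe(center, small)             # final refinement
    return center, cache[center], len(cache)
```

Calling `pattern_search(cost, LDSP, SDSP)` gives the DS behaviour and `pattern_search(cost, LHSP, SHSP)` gives the HS behaviour. For a stationary block the first call evaluates 9 + 4 = 13 points and the second 7 + 4 = 11 points.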

Flat Hexagonal Search (FHS)
This algorithm was introduced by C.H. Chen and Y.F. Li in 2004 [20]. FHS uses the Flatted Hexagonal Search Pattern (FHSP) shown in figure 9. In the first step, FHSP is applied. In the second step, if the center point is the minimum BDM point, SHSP is applied and the search stops; otherwise, FHSP is applied repeatedly until the center point is the minimum BDM point. Finally, SHSP is applied and the final MV is found. When FHSP is applied repeatedly, some checking points overlap with the previous step [21]. The FHS procedure is shown in figure 10.

Input
Reference frame, Current frame.

Output
Motion vector of current frame.

Begin
Step 1 (SCSP): The Small Cross Search Pattern (SCSP) is applied at the center of the search window (5 points). If the minimum BDM point is the center point, the search stops; otherwise, go to Step 2.
Step 2 (KSP): One of the four types of Kite Search Pattern (KSP) is applied, depending on the position of the minimum BDM point found in Step 1, which becomes the new center point for the KSP. If the minimum BDM point is the center point, the search stops; otherwise, go to Step 3.
Step 3 (Hexagonal searching): A new LHSP is applied, centered on the minimum BDM point of Step 2. If the new minimum BDM point is the center point, go to Step 4; otherwise, repeat this step with the new minimum BDM point as the center.
Step 4 (Ending, converging step): SHSP is applied and the minimum BDM point is found. This point is taken as the motion estimation solution.
End

KCHS requires 5 search points for a stationary block and 9 search points for a quasi-stationary block, whereas DS requires 13 search points and HS requires 11 search points for both stationary and quasi-stationary blocks. The KCHS procedure is shown in figure 12.
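The control flow of Steps 1 to 4 can be sketched as below. This is a sketch under explicit assumptions, not the authors' implementation. The kite offsets are not given in the text, so the table `PLACEHOLDER_KITES` is hypothetical: it was chosen only so that each kite adds 4 new points, which reproduces the 5 and 9 point counts stated above. The real kite shapes should be taken from the KCHS pattern figures. The cost is assumed to be a function `cost(dx, dy)`, y grows downward, the SCSP, LHSP and SHSP offsets are the commonly used ones, and the window limit `w` is an added assumption.

```python
SCSP = [(0, -1), (1, 0), (0, 1), (-1, 0)]
LHSP = [(-2, 0), (2, 0), (-1, -2), (1, -2), (-1, 2), (1, 2)]
SHSP = [(0, -1), (1, 0), (0, 1), (-1, 0)]

# HYPOTHETICAL kite offsets, relative to the Step 1 minimum, keyed by the
# direction of that minimum. Placeholders only, not the paper's patterns.
PLACEHOLDER_KITES = {
    (0, -1): [(-1, 0), (1, 0), (0, -1), (0, -2)],   # up kite
    (0, 1): [(-1, 0), (1, 0), (0, 1), (0, 2)],      # down kite
    (-1, 0): [(0, -1), (0, 1), (-1, 0), (-2, 0)],   # left kite
    (1, 0): [(0, -1), (0, 1), (1, 0), (2, 0)],      # right kite
}


def kchs(cost, kites, w=7):
    """Return (motion vector, cost, number of distinct points checked)."""
    origin = (0, 0)
    cache = {origin: cost(0, 0)}

    def probe(center, offsets):
        best = center
        for dx, dy in offsets:
            p = (center[0] + dx, center[1] + dy)
            if max(abs(p[0]), abs(p[1])) > w:
                continue                      # outside the search window
            if p not in cache:
                cache[p] = cost(*p)
            if cache[p] < cache[best]:
                best = p
        return best

    # Step 1: small cross; stop if the center is the minimum.
    first = probe(origin, SCSP)
    if first == origin:
        return origin, cache[origin], len(cache)
    # Step 2: kite selected by the direction of the Step 1 minimum.
    center = probe(first, kites[first])
    if center == first:
        return center, cache[center], len(cache)
    # Step 3: repeat the large hexagon until its center is the minimum.
    while True:
        nxt = probe(center, LHSP)
        if nxt == center:
            break
        center = nxt
    # Step 4: small hexagon refinement gives the final motion vector.
    center = probe(center, SHSP)
    return center, cache[center], len(cache)
```

Passing the kite table as a parameter keeps the control flow separate from the pattern geometry, so the placeholder table can be swapped for the actual kite patterns without touching the search logic.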

Simulations and results
In this paper, experiments are run in MATLAB 2018a. The computer used has the following specifications: Intel® Core™ i3 CPU 3217U @ 1.8 GHz processor, 3 MB Smart Cache, 1 GB video card memory, and 64-bit Windows® 7 Ultimate.
The parameters used in this simulation are given in table 1. The first 30 frames of each video sequence are used. The BM algorithms are compared on the number of search points required to find the motion vectors for each frame, and on the PSNR between the original frames and the reconstructed frames.
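For reference, the PSNR between an original and a reconstructed frame can be computed as in the sketch below. It uses the standard definition, PSNR = 10 log10(peak^2 / MSE), and assumes 8-bit frames (peak value 255) stored as lists of rows. The function name and the `peak` parameter are illustrative and are not taken from the MATLAB code used in the experiments.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak Signal to Noise Ratio in dB between two equally sized frames.

    Assumption: 8-bit samples, so the default peak value is 255.
    Returns infinity when the frames are identical (MSE of zero).
    """
    total, count = 0, 0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            total += (a - b) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak * peak / mse)
```

A higher PSNR means the motion-compensated reconstruction is closer to the original frame, so it serves as the quality measure alongside the search point count.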

Conclusions
In this paper, eight BM algorithms are executed and compared on two performance measures: average search points per frame and PSNR per frame.
This paper shows that the Kite Cross Hexagonal Search (KCHS) algorithm, which combines kite, cross, and hexagonal patterns, is faster than all other BM algorithms considered here (it requires fewer search points), regardless of the type of video. At the same time, its accuracy is similar to that of the other algorithms. It is especially suitable for video conferencing.