Video scene change detection based on histogram analysis for hiding message

The rapid growth of internet technology has greatly increased the opportunity for secrecy communication. Most of communications implemented compression method in digital applications due to fast transferring data in the limited bandwidth. In addition, the information can be changed by a third party in communication. This paper proposed a new hiding technique by modifying DCT coefficients in the video frames. The scene changes of the video data are identified based on histogram analysis for hiding locations. Scene changes among video frames are detected by measuring the significant difference of histogram analysis. The proposed hiding technique in video is also evaluated against MPEG-4 compression. The experimental results show that the proposed scheme achieved high imperceptibility with minimum visual distortion on the video quality. The proposed hiding scheme also can recover the concealed data from the compressed video. The results show that the extracted hidden message is able to resistant against MPEG compression.


Introduction
Internet has always been the most popular used medium to transfer valuable information between users. The valuable information can be changed or lost by a third party in communication. The hiding data should provide a highly secured way without leaking to any authorized users or hackers, Video steganography is one of concealing techniques in communication by hiding valuable information in the video data [1]. Data hiding techniques can help users to embed secret data inside a cover object without the existence of secret data being realized. Video steganography techniques are developed to be used for transmission of data without being perceptible to others and withstand against attacks from any unauthorized parties [2]. Secret data such as text, audio, images or videos can be embedded for communication. Videos are one of the popular cover media that has been widely used in communication. The video is the combination of audio and a collection of sequence images; it has large capacity for hiding data. Video provides a large redundant area that can be used for hiding data. Therefore, video steganography becomes a popular technique for concealing messages.
Most video steganography techniques pay attention to produce high imperceptibility results. In video steganography, the message must be embedded in the specific frame without affecting the quality of the video [3]. Video steganography allows a large amount of messages to be embedded in the video, and the hidden data can't be noticed by human eyes [4][5][6]. Steganography can be performed in frequency and spatial domains. In spatial domain, the message was encoded by altering the pixel value directly. The hiding data based on spatial domain will produce high imperceptibility. In this technique, the hidden data in the stego object can be affected by adding the noise. Hiding data in the spatial domain will be destroyed due to compression method. In the frequency domain, the data was not embedded directly into the image pixels. This technique is less vulnerable to steganalysis attacks because it has high 2 security due to its complexity of hiding the messages. Besides that, hiding data in the frequency domain has possibility to produce robust message, it allows the stego object will be compressed before transmitting the data. The most widely used of transform domain in the steganography is Discrete Cosine Transformation (DCT).
In addition, video transmission always been compressed before transferring the data. Most of current video steganography does not satisfy to achieve robustness under compression method [7]. Researchers investigate the scene change in the sequence of video frames for hiding secret-data [8]. The hidden data into scene change has potential to maintain the video quality. The video scene change has not been fully investigated that the hidden data can be secured and robust against compression method. This paper presents a new hiding technique in scene change of video frames based on the histogram analysis. The difference histogram analysis between current and next video frame is calculated to identify the scene change. The frames that have significant difference value are selected for concealing message. The proposed hiding scheme modifies the selected DCT coefficients for hiding data. The scene change of video frame has a potential for concealing message and it provides high imperceptibility.

DCT
The Discrete Cosine Transformation (DCT) transformed video frame for non-overlapping blocks. The DCT coefficients consist of high-frequency, mid frequency and low-frequency sub-bands of the DCT coefficients. The DCT is defined by: The inverse of DCT is given by:

Entropy
Entropy is related to intensity variation and has great value for homogeneous regions and small values for selected regions [9] [10]. The entropy can be used to estimate the required bits of pixel to present it. The lowest entropy value has less image information to the human visual systems (HVS) [11] [12]. The entropy has been widely implemented to choose suitable location for hiding data, so it can achieve high imperceptibility [13] - [14]. The entropy represents the most significant texture of the image pixels. The entropy is defined by: where i= 1,2, …, n. E represents the entropy value, ri represent the accurrence probability of an event i

Proposed Scheme
The experiments use five videos [15] with avi format to test the proposed scheme. The block diagram of the proposed hiding scheme is visualized in Figure 1. The scene change of video frame was detected by estimating each video frame's different histogram analysis as shown in Algorithm 1.

Figure 1. Block diagram of the proposed hiding scheme
A video is divided into a sequence of video frames. Then, the histogram analysis is performed to find scene change of video frames. The significant difference between histogram of the original video frame and the histogram of the next video frame. The video frames that have highest number of scene change on the sub-blocks are selected for hiding data. The selected video frames are computed by entropy to determine the hiding data location. The coordinate locations of the sub-blocks with lowest entropy are saved into a database. The sub-block of the video frame that have lowest entropy are computed by DCT to obtain DCT coefficients. The proposed hiding data is performed by following Algorithm 3. Thus, perform inverse DCT for the selected sub-block and merge the sub-block into video frames. Then, merge all video frames to obtain stego-video.

Algorithm 1: Scene change detection
Step 1: Read video file.
Step 2: Divide the video into a sequence of video frames, each component of video frame consists of red, green and blue channels.
Step 3: Find histogram difference between the current video frame and the next video frame.
where Ri denotes as the current video frame and Ri+1 represents the next video frame Step 4: Compute the overall histogram difference from i =1 to i =T -1, where T is the last video frame.
Step 5: Compute the mean and standard deviation of Dhist.  Step 6: Compare threshold with Dhist.

Mean of Frame Difference
if Dhist > Threshold Ri+1 is marked as scene change frame End Step 7: Sort the video frame that have scene change frame, highest histogram difference and save it into a database.

Hiding Data
The hiding data into the video is discussed in the following algorithm.
Step 1: Read video file.
Step 2: Select the video frame that have scene change frame, highest histogram difference from database Step 3: Convert the selected frame into YCbCr, then select Y channel Step 4: Divide the selected video frame into 2×2 pixels, then it is computed by entropy method Step 5: Sort sub-block image according to the lowest entropy value, then save the coordinates of the selected sub-block into a database Step 6: Perform DCT on each selected sub-block, embed the message bits into the selected sub-blocks using Algorithm 2.

Algorithm 2: Hiding data Input: A= G(1,2), B= G(2,1), a Output: Embedded sub-block If message(i) ==1 If abs(A) < abs (B) C = B B = A A = C + a else A = A + a B = B end else if message(i) == 0 If abs(A) < abs(B) A = A B = B + a end else C = A A = B B = C + a end
where A represents DCT coefficients of G (1,2), B denotes by DCT coefficients of G(2,1) and a is a scaling factor obtained from Algorithm 3.
Step 7: Perform inverse DCT on the selected sub-block Step 8: Convert embedded video frame into RGB channel.
Step 9: Merge all video frames to obtain stego-video.

Algorithm 3: Threshold
Input: A= G(1,2), B= G(2,1) where T is a user-defined for scaling threshold, and the amount of T is investigated to achieve high video quality and prevent the embedded information in the DCT block.

Extracting of Hidden Data
The extracting of hidden message is discussed in Algorithm 4 as follows: Algorithm 4: Extracting data Input: Stego-video, coordinates of the selected frames and sub-block Output: message recovery Step 1: Divide a video into a sequence of video frames Step 2: Select video frames according by referring to the database Step 3: Select sub-block images based on the coordinates of selected subblocks in the database Step 4: The selected sub-block is transformed by two-dimensional DCT Step 5: Compare the DCT coefficients of G(1,2) and G(2,1) as follows: If abs (G(1,2)) < abs (G(2,1)) If the DCT coefficients of G (1,2) is lower than G(2,1), the message bit is 0, else the message bit is 1.

Evaluation of stego-video
The stego-video is evaluated by Structural SIMilarity index (SSIM) and True Acceptance Rate (TAR) to measure the imperceptibility and message recovery under compression method. SSIM index is defined by: where S represents the sequence number of selected video frames. μx is the mean over a window in image x, μy is the mean over a window in image y and c1, c2 are constants. TAR is used to estimate the correctness of the message recover compared to the original message. TAR is utilized to quantify the correctness of the characters that have been hidden into the video under compression method. TAR can be defined by:

Experimental Results
The experiments were conducted to conceal the message with different amounts of message size. The proposed scheme was tested with the amount of message which are 500 characters, 1000 characters and According to Table 1, the proposed scheme achieved slightly lower of SSIM value than other schemes. The proposed video steganography is designed to robust under video compression. MPEG-4 compression is selected to test the proposed hiding message. MPEG-4 has been widely implemented in multimedia applications. The experiments investigate the embedding strength for hiding message in order to achieve an optimal balancing between video quality and resistance against compression method. The experiments increment the embedding strength T one at a time under compression method. The optimal embedding strength for hiding message is shown in Figure 2.  Table 2. Referring to Table 2, it shows a significant difference between the proposed scheme and other scheme especially LSB methods. The hiding data using LSB method does not resistant against compression. The hidden message has been destroyed when the stego-video was compressed by MPEG-4. The  proposed scheme demonstrates that it can achieve good imperceptibility with SSIM value of about 0.992 for hiding message 2000 character and resistance against compression with TAR value of about 1.

Conclusion
This paper proposed a new hiding message into the video scene change based on histogram analysis. The message has been concealed by examining selected DCT coefficients. The video frames that have scene change were selected based on the different histogram analysis. The proposed scheme has been tested under MPEG-4 compression. The scheme was evaluated by hiding message with different amount of hiding data. The experimental results prove that the proposed concealing message achieve high robustness under MPEG compression. The extracted message under MPEG-4 compression achieved high acceptance rate of recovered message. The proposed scheme is able to maintain the video quality with SSIM value of about 0.992.