A Cooperative Detection of DDoS attacks based on CNN-BiLSTM in SDN

To address the high overhead and low detection efficiency of traditional DDoS attack detection in SDN, a novel detection approach is proposed. This cooperative method leverages information entropy and deep learning techniques to divide the detection task between the data plane and the control plane. An improved CNN-BiLSTM model with batch normalization and an attention mechanism is used to identify DDoS attack traffic. Experimental results demonstrate that this method offers better accuracy, detection rate, and false alarm rate than prior approaches. Moreover, the switch-controller collaborative detection method proposed in this research reduces the controller's CPU occupancy rate compared with the conventional single-point detection method.


Introduction
Currently, there are two main methods for detecting DDoS attack traffic in SDN: entropy-based and machine learning-based. Entropy-based detection is fast and lightweight, but has a high false alarm rate. Machine learning-based detection methods offer improved accuracy, but they rely on extracted traffic characteristics, which can result in slower detection than entropy-based methods. As a result, current research focuses on designing a DDoS attack detection method that achieves both high accuracy and fast detection.
Mousavi et al. proposed using the information entropy of destination IP addresses to detect DDoS attacks, but this method has a high risk of false positives [1]. JunJ et al. improved the entropy calculation method by adding "packet speed", but the threshold cannot adjust dynamically, leading to false positives [2]. Entropy calculation is quick but may lead to many false alarms. It is also not very scalable, as the threshold needs to be adjusted when the network size changes.
Nisha et al. proposed using SAE-MLP to detect DDoS attacks, considering detection accuracy but not detection time [3]. Jin et al. suggested using 6-tuple features and SVM to detect traffic in SDN, where accuracy depends on feature selection [4].
Despite offering higher detection accuracy than entropy-based detection methods, the machine learning approach still faces certain challenges. In SDN networks, where traffic information collection, statistics, and detection are all performed on the controller, network expansion leads to significant overhead, causing delays in attack detection [5]-[7].
This paper proposes a collaborative detection approach that combines information entropy and deep learning to overcome the challenges of existing methods. Distributing part of the detection task to the switch enables the method to identify DDoS attack traffic across different network layers. This approach effectively reduces the workload on the controller, resulting in improved detection efficiency. The control plane uses an enhanced CNN-BiLSTM model with batch normalization and an attention mechanism. This model captures the spatial-temporal characteristics of traffic data more comprehensively, leading to higher accuracy and lower false positive rates than conventional machine learning techniques. Additionally, a simulation platform is constructed to better demonstrate the efficacy of the proposed collaborative detection approach, instead of relying solely on dataset testing.

CNN Model
In SDN, network flow data is represented by a $k$-dimensional vector for each stream (denoted by $x_i$).
The stream data can be input as $x_{1:n}$. Conv1D applies a filter, represented by a convolution kernel in $\mathbb{R}^{hk}$, to a window of $h$ streams, which represents a set of stream characteristics. This method differs significantly from traditional approaches.
After convolution, pooling is performed to enhance feature robustness and reduce model overfitting. Max pooling keeps the maximum neuron value in each pooling region, and the key features are chosen through these operations.
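The convolution and pooling steps can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the function names, the ReLU activation, the non-overlapping pooling regions, and the toy input values are all assumptions chosen for illustration.

```python
def conv1d(sequence, kernel, bias=0.0):
    """Slide `kernel` (h rows of k weights) over `sequence` (n rows of k features)."""
    h = len(kernel)
    features = []
    for start in range(len(sequence) - h + 1):
        window = sequence[start:start + h]
        total = bias
        for row, weights in zip(window, kernel):
            total += sum(x * w for x, w in zip(row, weights))
        features.append(max(0.0, total))  # ReLU activation (an assumption)
    return features


def max_pool1d(features, pool_size=2):
    """Keep the largest activation in each non-overlapping region."""
    return [max(features[i:i + pool_size])
            for i in range(0, len(features) - pool_size + 1, pool_size)]


# Five hypothetical streams with k = 2 features each; the kernel spans h = 2 streams.
streams = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]
feature_map = conv1d(streams, kernel)
pooled = max_pool1d(feature_map)
```

Each entry of `feature_map` summarizes one window of `h` consecutive streams, and `pooled` keeps only the strongest response per region, which is what makes the extracted features less sensitive to small shifts in the input.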

BiLSTM Model
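The candidate cell state and the hidden-state output of the LSTM unit are computed as
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C), \qquad h_t = o_t \odot \tanh(C_t) \quad (8)$$
where $x_t$ is the input at time $t$, $h_{t-1}$ is the previous hidden state, $C_t$ is the cell state, and $o_t$ is the output gate.
BiLSTM is an advanced type of LSTM that combines a forward and a backward LSTM to provide a more accurate analysis of traffic data. This allows for more precise calculations through a two-way information analysis process. The calculation combines the forward and backward hidden states at each time step.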
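For reference, a commonly used formulation of the bidirectional combination is given below. It is a sketch under stated assumptions rather than the exact equations of this paper: the weight matrices $W_{\overrightarrow{h}}$ and $W_{\overleftarrow{h}}$, the bias $b_y$, and the output symbol $y_t$ are notation introduced here for illustration.

```latex
\begin{aligned}
\overrightarrow{h}_t &= \mathrm{LSTM}\left(x_t, \overrightarrow{h}_{t-1}\right) \\
\overleftarrow{h}_t  &= \mathrm{LSTM}\left(x_t, \overleftarrow{h}_{t+1}\right) \\
y_t &= W_{\overrightarrow{h}}\,\overrightarrow{h}_t + W_{\overleftarrow{h}}\,\overleftarrow{h}_t + b_y
\end{aligned}
```

The forward pass reads the sequence from past to future and the backward pass from future to past, so the output at step $t$ depends on context from both directions.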

Method Description
Fig. 1 The DDoS attack detection program
The switch's initial detection module relies on information entropy-based detection to identify anomalies in real-time traffic and promptly alerts the controller. Under normal network conditions, the controller forwards traffic using pre-set policies. Once abnormal traffic is detected, the deep detection module is activated to identify any DDoS attack traffic. If an attack is detected, the controller alerts the administrator and instructs the switch to discard the traffic.
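The two-stage flow described above can be sketched as a single decision function. This is a simplified illustration, not the paper's P4 or controller code: the function name, the action labels, and the assumption that entropy below the threshold signals an anomaly are all hypothetical.

```python
def cooperative_detect(window_entropy, threshold, deep_classifier, flow_features):
    """Return the action taken for one traffic window.

    All names are hypothetical. The threshold direction (low entropy means
    suspicious) is an assumption based on destination-IP concentration.
    """
    if window_entropy >= threshold:
        return "forward"          # normal traffic: use pre-set policies
    # The switch reports the anomaly; the controller runs the deep detection module.
    if deep_classifier(flow_features):
        return "alert_and_drop"   # DDoS confirmed by the deep module
    return "forward"              # false alarm raised by the initial module


# Stand-in classifiers used only to exercise the control flow.
always_attack = lambda features: True
never_attack = lambda features: False
```

The point of the split is that `deep_classifier` runs only for windows the switch has already flagged, so the controller does no deep inspection while traffic looks normal.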

Initial Detection Module Based on Information Entropy
Assume a set of packets in which $p_i$ is the probability of occurrence of the $i$th destination IP address. The Shannon entropy formula can be expressed as $H = -\sum_i p_i \log p_i$ (12). Formula (12) shows that increased sample concentration results in decreased entropy, whereas decreased sample concentration results in increased entropy. Formula (13) then normalizes the computed entropy value so that results can be compared.
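A minimal sketch of the entropy calculation over a window of destination IP addresses follows. The normalization used here, dividing by $\log_2 n$ for $n$ distinct destinations, is a common choice and is an assumption, since the exact form of the normalization formula is not reproduced above; the function name and the sample addresses are also hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(destinations):
    """Shannon entropy of destination IPs, divided by log2(n) for n distinct IPs."""
    counts = Counter(destinations)
    total = len(destinations)
    n = len(counts)
    if n <= 1:
        return 0.0  # zero or one destination: fully concentrated
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(n)  # normalization by log2(n) is an assumption


# Evenly spread traffic versus traffic concentrated on one victim.
uniform = normalized_entropy(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])
skewed = normalized_entropy(["10.0.0.1"] * 97 + ["10.0.0.2", "10.0.0.3", "10.0.0.4"])
```

Evenly spread destinations give a normalized value close to 1, while traffic concentrated on a single victim pushes the value toward 0, which matches the concentration behavior described for the entropy formula.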

Batch Normalization
By using Batch Normalization, the neural network keeps the distribution of its layer inputs consistent, which preserves trainability and reduces shifts in the distribution of node activations. This results in faster convergence while preserving the representation ability of the network.
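The normalization step can be sketched for a single feature across one mini-batch. This follows the standard batch normalization transform in training mode; the function name, default parameter values, and sample batch are assumptions for illustration.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(variance + eps) + beta for x in batch]


# A hypothetical mini-batch of four values for one feature.
normalized = batch_norm([2.0, 4.0, 6.0, 8.0])
```

After the transform the batch has approximately zero mean and unit variance, while the learnable `gamma` and `beta` let the network restore any scale and offset it needs, which is how representation ability is preserved.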

CNN-BiLSTM Module
Fig. 2 Training process of CNN-BiLSTM model
The feature vector is fed into the CNN layer to extract spatial hierarchical features, enhancing model fitting. The feature map is downsampled using MaxPooling to boost computing speed and increase the robustness of the extracted features. Batch Normalization is the subsequent layer, normalizing input batches to avoid vanishing gradients during training. The attention mechanism is introduced to calculate the importance of each attribute feature, focusing more on crucial features to achieve improved intrusion detection. Lastly, to determine the correlation between the previously extracted features and the output classification result, a Dense layer is used for non-linear transformation.
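The attention step can be illustrated with a simple softmax-weighted pooling over hidden states. The exact attention form used in the model is not specified above, so this is one common variant and an assumption; `score_weights` stands in for a learned scoring vector and the input values are hypothetical.

```python
import math


def attention_pool(hidden_states, score_weights):
    """Weight each hidden state by a softmax over its score, then sum."""
    scores = [sum(h * w for h, w in zip(state, score_weights))
              for state in hidden_states]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [sum(w * state[j] for w, state in zip(weights, hidden_states))
               for j in range(len(hidden_states[0]))]
    return weights, context


# Three hypothetical hidden states of dimension 2.
weights, context = attention_pool([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]], [1.0, 1.0])
```

The weights sum to 1 and the state with the highest score dominates `context`, which is the sense in which the mechanism places greater emphasis on the more important features.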

Experimental Environment
The experiment used the following hardware: an AMD Ryzen 8 5800H@4.20 GHz 16 core CPU, with Ubuntu 16.04 LTS as the operating system. We created an information entropy-based detection module in the P4 language, developed a CNN-BiLSTM model-based deep detection module using the Keras deep learning framework, and simulated a real network environment using Mininet, as depicted in Figure 3. The TFN tool is used to generate DDoS attack traffic, which includes mixed SYN flood and ICMP flood. In this simulation, h1 is attacked by h7.
Fig. 3 The SDN environment simulations in the mininet

Analysis of the Initial Detection Module
Figure 4 shows how the presence or absence of the initial detection module affects the number of Packet-In packets the controller receives. The curve's peak value and rate are lower when the initial detection module is in place than when it is not. Furthermore, the number of Packet-In packets received by the controller returns to the normal level roughly 2 seconds sooner. This is because the initial detection module categorizes traffic as DDoS attack traffic early and facilitates early mitigation, whereas the CNN-BiLSTM model's detection process is time-consuming, causing a delay in emergency response. Therefore, having an initial detection module improves emergency response efficiency.
The comparison in Figure 5 shows that the controller's overall CPU occupancy rate is lower when the initial detection module is deployed.

Fig. 4 Packet-In packets changes
Fig. 5 The CPU occupancy change of the controller
The deep detection module is activated by an abnormal report message from the switch, resulting in low CPU usage. Without an initial detection module on the switch, real-time network traffic detection relies on the controller using the CNN-BiLSTM model. This approach incurs high detection costs and leads to increased CPU utilization.
During a DDoS attack (9s-20s), both methods require time to respond due to the high attack traffic rate, and CPU utilization reaches a peak of 100%. With the initial detection module in place, fewer Packet-In messages are sent to the controller, enabling CPU utilization to return to normal levels faster.
In contrast, if the switch lacks an initial detection module, it continuously sends Packet-In messages to the controller until the CNN-BiLSTM model provides a detection result. This continuous handling of Packet-In messages by the controller delays the return of the CPU utilization rate to normal.
In conclusion, the information entropy-based initial detection module deployed on the switch can preliminarily identify DDoS attack traffic in the network, achieving early mitigation and reducing the overhead of the controller.

Dataset
To replicate real-world conditions, the "sdn_dataset" [10] public dataset is used to train and test the model. To normalize the features, scaling is applied to handle values that are too large or too small, ensuring they fall within the range (0,1). These features are then transformed into 65-dimensional vectors and used as input for training and testing the CNN-BiLSTM model.
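The feature scaling can be sketched with min-max normalization. The text does not name the scaling method, so min-max scaling is an assumption here; note that it maps the smallest and largest values exactly to 0 and 1. The function name and sample values are hypothetical.

```python
def min_max_scale(column):
    """Rescale one numeric feature column into the range [0, 1]."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]  # constant column carries no information
    return [(value - low) / (high - low) for value in column]


# Hypothetical raw values for one flow feature.
scaled = min_max_scale([10.0, 20.0, 40.0])
```

Scaling each feature independently keeps large-valued features, such as byte counts, from dominating small-valued ones during training.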

Parameter Setting
After conducting a grid search and optimizing hyperparameters, the optimal configuration was determined to be a batch size of 128, a learning rate of 0.01, and 50 training cycles. To avoid overfitting and improve the model's generalization, dropout is applied before the Dense layer.

Result Analysis
Table 4 shows that the CNN-BiLSTM model introduced in this study surpasses the SVC-RF method, the best-performing method from the literature [12]. Compared to the SVC-RF method, the CNN-BiLSTM model's accuracy and detection rate are higher by 0.82% and 1.49% respectively, and its false alarm rate is lower by 1.46%. Compared to the BiLSTM model, accuracy and detection rate are higher by 0.87% and 0.83%, and the false alarm rate is lower by 1.08%. Moreover, against the RF method from the literature [13], accuracy and detection rate are higher by 2.22% and 3.76%, and the false alarm rate is lower by 3.97%.

$$DR = \frac{TP}{TP + FN}, \qquad ACC = \frac{TP + TN}{TP + TN + FN + FP}$$
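The evaluation metrics used in this comparison can be computed from confusion-matrix counts as in the sketch below. The false alarm rate formula, FP / (FP + TN), is the standard definition and is an assumption here; the function name is hypothetical and the counts are made up for illustration, not taken from the experiments.

```python
def detection_metrics(tp, tn, fp, fn):
    """Return accuracy, detection rate, and false alarm rate from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    detection_rate = tp / (tp + fn)
    false_alarm_rate = fp / (fp + tn)  # standard definition, assumed here
    return accuracy, detection_rate, false_alarm_rate


# Hypothetical counts for illustration only; not results from the experiments.
acc, dr, far = detection_metrics(tp=90, tn=95, fp=5, fn=10)
```

A higher accuracy and detection rate together with a lower false alarm rate indicate better performance, which is why the false alarm rate differences in the comparison are reported as reductions.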
BiLSTM outperforms LSTM in accuracy because it incorporates both forward and backward LSTMs to analyze traffic data bidirectionally. LSTM and BiLSTM can only capture temporal patterns in network traffic, but network traffic exhibits both temporal and high-dimensional spatial characteristics. Hence, the CNN-BiLSTM technique enables a more comprehensive extraction of network traffic characteristics. Additionally, the CNN-BiLSTM method can further capture latent network traffic characteristics and enhance representation ability through batch normalization. By incorporating the attention mechanism, the significance of each attribute feature can be quantified, and greater emphasis is placed on the more critical attributes. This strategy results in a more effective intrusion detection approach. Consequently, the CNN-BiLSTM method outperforms the SVC-RF method.
In conclusion, the CNN-BiLSTM approach outperforms all the aforementioned methods.

Conclusion
This research presents a novel collaborative approach to identifying DDoS attacks in SDN settings. It combines information entropy with the CNN-BiLSTM model for detection. In the initial phase, the switch uses information entropy calculations to preliminarily identify DDoS attack traffic. The deep detection module, implemented on the controller, then uses the CNN-BiLSTM model to detect malicious traffic. This technique effectively reduces the detection burden on the controller and significantly reduces the overall detection time.
The entropy is computed over the packets received by the switch with different destination IP addresses.

Table 1. Criteria for Judgment
During a DDoS attack, the entropy threshold for the destination IP address is set to ( ) of the destination IP address. Table 1 displays the corresponding decision table.

Feature Extraction
The selection of flow characteristics can greatly impact the detection model: a poor selection leads to increased model complexity, higher overhead, and reduced detection efficiency [9]. The controller obtains flow table details via the OpenFlow protocol following an exception report. The deep detection module's CNN-BiLSTM model uses selected fields from flow table entries as well as calculated features, totaling 19 features, as listed in Table 2.

Table 3. Division of the Dataset
The public dataset used to train and test the model comprises 104345 flow records, with 63561 normal records and 40784 DDoS attack records. Please refer to Table 3 for the breakdown of normal and attack samples. The dataset contains 23 traffic characteristics, and Table 2 lists the 19 extracted features. Non-numeric attributes such as Source_IP, Destination_IP, and Protocol are one-hot encoded during data preprocessing.
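One-hot encoding of a non-numeric attribute can be sketched as follows. The function name and the sample protocol values are hypothetical and used only for illustration; the actual category sets come from the dataset.

```python
def one_hot_encode(values):
    """Map each categorical value to a 0/1 vector, one position per category."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        vector = [0] * len(categories)
        vector[index[value]] = 1
        encoded.append(vector)
    return categories, encoded


# Hypothetical Protocol column values.
categories, encoded = one_hot_encode(["TCP", "ICMP", "UDP", "TCP"])
```

Each category becomes its own input dimension, so encoding attributes with many distinct values widens the feature vector considerably beyond the number of original attributes.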

Table 4. Different models and CNN-BiLSTM models