Software Defined Security Flow Scheduling Method Based on PSO Algorithm

To address link load imbalance and bandwidth waste in software defined security (SDS), this paper proposes a secure traffic scheduling method based on particle swarm optimization (PSO). The method uses the software defined security controller to obtain the network topology and the current link bandwidth status. An objective function is then established from two criteria, the matching degree between link bandwidth and network flow size and the functional matching degree between link nodes and the network flow service chain, subject to specific constraints. The optimized particle swarm optimization algorithm selects the best scheduling path for network traffic from the set of shortest paths. Experimental results show that, compared with other algorithms, the proposed algorithm improves the average network throughput and maintains a low packet loss rate during traffic transmission, thereby reducing the waste of bandwidth resources and better balancing the load of software defined network security traffic.


Introduction
In recent years, with the continuous progress of computer technology, the Internet has grown in scale, the number of network applications and network users has increased, and the protocols carried by network equipment have become more diverse. These new features require the core network to be more dynamic and flexible. In the traditional Internet architecture with TCP/IP at its core, network functions are often enclosed in independent network devices, so the delivery and deployment of network services amount to the delivery and deployment of hardware devices. This closed design makes network optimization and customized network function services more difficult, which limits the traditional Internet in dealing with a large number of network applications. Software Defined Network (SDN) separates functions at the technical architecture level of network equipment, offers an open API (Application Programming Interface), and supports flexible software-based deployment, which can effectively solve the problems of traditional networks [1].
Inspired by the architecture of SDN, security vendors proposed a new security solution alongside the design of SDN, called Software Defined Security (SDS) [2]. SDS borrows the features of SDN: it separates the security control plane from the security data plane, protects the entire network environment through the security control plane, and adopts a unified protocol to analyze and deploy security services [3]. In the SDS architecture, the deployment of network security traffic is an important research topic. Because network bandwidth is limited, choosing a traffic transmission path without considering the traffic characteristics may leave the link bandwidth load uneven.
The Equal Cost Multi-Path (ECMP) algorithm [4] is one of the traditional algorithms for network traffic planning. When routes are delivered using the ECMP algorithm, multiple route entries with the same cost are generated. When a certain path becomes congested, the ECMP algorithm switches to another route entry with the same cost to achieve network load balancing. However, because bandwidth and delay differ between paths and the ECMP algorithm cannot perform congestion sensing or traffic classification, it tends to disperse traffic onto links with little remaining bandwidth, exacerbating link congestion. Hao et al. [5] proposed the rerouting based on path criticality (RPC) algorithm, which takes link load and delay as evaluation indexes and uses heuristic algorithms to solve the link congestion problem. The RPC algorithm improves delay and throughput to some extent, but the selection of the communication mode between hosts and the selection of evaluation parameters for the path still need further refinement. Al-Fares et al. [6] applied the global first fit (GFF) algorithm to elephant flow scheduling. By calculating the link load, the elephant flow is dynamically scheduled to a path with a low link load to avoid congestion. However, the randomness of the GFF algorithm in route selection and the frequent interaction between the controller and the switches make the instruction overhead too high. Based on the above analysis, current traditional data center traffic scheduling methods are suitable for traffic scheduling in software-defined networks, but they cannot be directly applied to the SDS network architecture.
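The load-oblivious behaviour of ECMP described above can be illustrated with a minimal sketch. It assumes the common hash-based variant, in which a flow's 5-tuple is hashed onto one of the equal-cost paths; the function name, the example paths, and the flow tuple are hypothetical and are not taken from the paper.

```python
import hashlib


def ecmp_select(flow, paths):
    """Pick one equal-cost path by hashing the flow's header fields.

    Assumption: static hash-based ECMP. The choice ignores link load,
    which is why large flows can collide on the same path.
    """
    if not paths:
        raise ValueError("no equal-cost paths available")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(paths)
    return paths[index]


# Hypothetical equal-cost paths between two edge switches.
paths = [
    ["e1", "a1", "c1", "a3", "e3"],
    ["e1", "a2", "c3", "a4", "e3"],
]
# Hypothetical 5-tuple: src IP, dst IP, protocol, src port, dst port.
flow = ("10.0.0.1", "10.0.0.5", 6, 40000, 80)
chosen = ecmp_select(flow, paths)
```

Because the mapping depends only on the header fields, the same flow always takes the same path regardless of the remaining bandwidth on that path.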
At present, many optimization methods have been used to improve traffic scheduling algorithms, such as Bayesian Optimization [7], Particle Swarm Optimization (PSO) [8], the Sparrow Search Algorithm [9], and the Simulated Annealing Algorithm (SAA) [10]. Among them, PSO is widely used and has the advantages of fast convergence, few control parameters, and strong convergence ability [11]. An SDS traffic scheduling method must take into account both the performance of the network and the functional arrangement of the specific service chain.
In this paper, an efficient and secure traffic scheduling method is designed. Based on the matching degree between link bandwidth and network flow size and the matching degree between link nodes and network flow service chain functions, the particle swarm optimization algorithm improved by simulated annealing (SAA-PSO) is adopted as the core model-solving method to form a new SDS traffic scheduling method. The results show that the proposed algorithm reduces SDS network delay and effectively improves the performance of the SDS network.

System framework
In software-defined security, network security devices at the data layer are abstracted into resources in a security resource pool, and their access modes, deployment modes, and implemented functions are decoupled. At the controller level, security device information from the data layer is saved, and data layer traffic is transmitted in a unified and intelligent manner. To schedule SDS network traffic reasonably, this paper adopts the system framework shown in Figure 1. The specific functions of each module in the system are described below.
The topology discovery module discovers and saves the topology of the nodes in the data layer. The controller sends packets carrying LLDP (Link Layer Discovery Protocol) frames to a specified port on a switch. After receiving them, the switch forwards the packets to its neighboring switches, and each neighboring switch then sends the packet back to the controller. The controller obtains the network topology by combining the packet information with the LLDP header information.
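As a minimal sketch of the last step of topology discovery, the following code turns a list of discovered links into an adjacency map. The tuple layout (source switch, source port, destination switch, destination port) and the switch names are assumptions made for illustration; the paper does not specify the controller's data structure.

```python
from collections import defaultdict


def build_topology(discovered_links):
    """Build an adjacency map from LLDP-derived link records.

    Each record is assumed to be
    (src_switch, src_port, dst_switch, dst_port).
    The result maps switch -> {neighbor: local port toward that neighbor}.
    """
    adjacency = defaultdict(dict)
    for src, src_port, dst, dst_port in discovered_links:
        adjacency[src][dst] = src_port  # port on src that leads to dst
        adjacency[dst][src] = dst_port  # port on dst that leads back to src
    return {node: dict(neighbors) for node, neighbors in adjacency.items()}


# Hypothetical links reported back to the controller.
links = [("s1", 1, "s2", 1), ("s2", 2, "s3", 1)]
topo = build_topology(links)
```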
The node information statistics module collects information about links and nodes. The controller sends a statistics request packet to the switch, and the switch answers with a statistics reply packet. The controller parses the reply to obtain the statistics and saves them in a data structure. The statistics include the traffic information of each link and the security service deployment mode of the core layer nodes.
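One way the per-link traffic information could be derived from two consecutive statistics replies is sketched below. It assumes the replies expose cumulative byte counters per port, as OpenFlow port statistics do; the function names are hypothetical and counter wrap-around is ignored.

```python
def link_rate_bps(prev_bytes, curr_bytes, interval_s):
    """Estimate link throughput in bit/s from two cumulative byte counters."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return (curr_bytes - prev_bytes) * 8 / interval_s


def residual_bandwidth_bps(capacity_bps, prev_bytes, curr_bytes, interval_s):
    """Remaining bandwidth on a link, never reported below zero."""
    used = link_rate_bps(prev_bytes, curr_bytes, interval_s)
    return max(capacity_bps - used, 0.0)


# A 1 Gb/s link that carried 25 MB during a 1 s polling interval.
remaining = residual_bandwidth_bps(1e9, 0, 25_000_000, 1.0)
```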
After a security task is delivered, the service chain resolution module resolves the specific task. According to the specific function of each process, security services are divided into seven types: encryption, digital signature, digital integrity, access control, authentication exchange, service flow filling, and route control. After service chain analysis, with the service chain requirements as constraints, the K shortest paths (KSP) algorithm module finds the set of K shortest paths between the source host and the destination host of the flow. To avoid unnecessary computation, the subsequent optimization algorithm searches only within these K shortest paths.
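The KSP step can be sketched as follows. The paper does not state which KSP algorithm is used, so this sketch assumes hop count as the path cost and uses a simple best-first enumeration of loop-free paths rather than, for example, Yen's algorithm; the topology and names are hypothetical, and service chain constraints are not applied here.

```python
import heapq


def k_shortest_paths(adjacency, source, target, k):
    """Return up to k loop-free paths from source to target, fewest hops first.

    Partial paths are expanded in order of hop count, so complete paths
    are produced in non-decreasing length.
    """
    found = []
    queue = [(0, [source])]  # entries are (hop count, path so far)
    while queue and len(found) < k:
        hops, path = heapq.heappop(queue)
        node = path[-1]
        if node == target:
            found.append(path)
            continue
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in path:  # keep the path loop-free
                heapq.heappush(queue, (hops + 1, path + [neighbor]))
    return found


# Hypothetical topology with two equal-length routes between h1 and h2.
adjacency = {
    "h1": ["s1"],
    "s1": ["h1", "s2", "s3"],
    "s2": ["s1", "s4"],
    "s3": ["s1", "s4"],
    "s4": ["s2", "s3", "h2"],
    "h2": ["s4"],
}
candidates = k_shortest_paths(adjacency, "h1", "h2", k=2)
```

This enumeration is exponential in the worst case, which is acceptable for a small illustration but not for a large topology.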
Finally, the optimized algorithm module determines the optimal transmission path for the SDS traffic by using the optimized particle swarm optimization algorithm, based on the obtained traffic statistics and service chain analysis information. The controller sends packets to the switch to deliver the flow table information and installs the flow table on the switch. The switch then forwards traffic based on the flow table information.
The system program execution flow chart is shown in Figure 2. After the security task sent by the user is received, service chain analysis is performed first to determine the execution flow of the security task. The node information statistics module then collects link and node information, and this information and the service chain analysis results are input into the optimization algorithm module to obtain the optimal transmission path for the security traffic. Finally, the flow table is sent to the router at the data layer to complete secure traffic transmission.
[Figure: flow chart. Box labels: The user delivers a security task; Service chain resolution; Topology discovery; Statistics on traffic and node information; Optimized PSO algorithm; Deliver the flow table; Implement security task.]

Principle of particle swarm optimization
PSO is an optimization algorithm inspired by the foraging behavior of birds or fish. It uses parallel and structured strategies to search a high-dimensional space in a random but guided way. PSO places "particles" in the solution space for iterative search and uses the cooperation and information sharing between particles during the search to obtain the optimal parameter values.
In PSO, each particle keeps a memory of its velocity and direction. The velocity represents how fast the particle moves in the solution space, and the direction represents the direction in which it moves. The subsequent state of a particle is determined by its own flight history and the flight history of the other particles. Particles should have a moderate speed, because too high a speed is likely to make them miss the global optimal solution. The deviation between the position of each particle and the global or local optimal solution is measured by the fitness function. The moving speed of any particle $i$ in the PSO population is determined by Formula (1), and the position is determined by Formula (2):

$$v_i(n+1) = w\,v_i(n) + c_1 r_1^n \big(P_i(n) - x_i(n)\big) + c_2 r_2^n \big(P_g(n) - x_i(n)\big) \qquad (1)$$

$$x_i(n+1) = x_i(n) + v_i(n+1) \qquad (2)$$

where $n$ is the number of cycles, $w$ is the inertia weight, $P_i(n)$ is the best position found so far by particle $i$, $P_g(n)$ is the best position found so far by the whole population, $r_1^n$ and $r_2^n$ are random numbers uniformly distributed in the interval (0, 1), and $c_1$ and $c_2$ are learning factors. Each particle keeps tracking the local and global optimal values in the search space until the specified number of iterations is reached or the error threshold is satisfied.
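A minimal sketch of one particle update, assuming the standard inertia-weight form of the PSO velocity and position rules. The velocity clamp `v_max` is an added assumption that reflects the remark about moderate speed; the function and parameter names are illustrative only.

```python
import random


def pso_step(x, v, p_best, g_best, w, c1, c2, v_max, rng=random):
    """Advance one particle by one iteration.

    x, v      : current position and velocity (lists of floats)
    p_best    : best position found so far by this particle
    g_best    : best position found so far by the population
    w, c1, c2 : inertia weight and learning factors
    v_max     : assumed per-dimension speed limit
    """
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, p_best, g_best):
        r1, r2 = rng.random(), rng.random()  # uniform random numbers in [0, 1)
        vel = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        vel = max(-v_max, min(v_max, vel))   # keep the speed moderate
        new_v.append(vel)
        new_x.append(xi + vel)
    return new_x, new_v


# A particle already sitting on both best positions only keeps its inertia term.
pos, vel = pso_step([1.0, 2.0], [0.5, -0.5], [1.0, 2.0], [1.0, 2.0],
                    w=0.5, c1=2.0, c2=2.0, v_max=1.0)
```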

Problem modeling
To address the unbalanced link bandwidth load and the waste of bandwidth resources in current software-defined secure traffic scheduling algorithms, the following model is established. It considers the matching degree between link bandwidth and network flow size and the matching degree between link nodes and network flow service chain functions. The topology in this paper is denoted as the graph $G = (V, E)$, where $V$ represents the set of all switch nodes and $E$ represents the set of all links.
The $n$ secure traffic flows are recorded as a set. The link capacity is $C$, and the bandwidth requirement of network flow $s$ is $B_s$. The initial path set is obtained by the KSP algorithm. Given the available bandwidth of each link $j$ on path $i$, the available bandwidth of path $i$ is the minimum of the available bandwidths of its links. The average link bandwidth $B_{average}$ of path $i$ is expressed by Formula (3).
When scheduling paths are calculated for network flows, a smaller deviation between the available link bandwidth and the average link bandwidth gives, to a certain extent, a better load balancing effect. The link bandwidth evaluation function $ef_i$ is therefore defined by Formula (4).
An evaluation function $ec_i$ is established for the matching degree between link nodes and network flow service chain functions. First, the security service is divided into specific processes. According to how each process is implemented, security services can be divided into seven functions: encryption, digital signature, digital integrity, access control, authentication exchange, service flow filling, and route control. It is assumed that the service chain $F$ of the network flow to be deployed contains $K$ network functions, represented by the set $\{A_1, A_2, \ldots, A_K\}$, where $A_i$ is the $i$-th network function. The mapping between network functions and hosts in the service chain is represented by $P_{ij}$, together with the total number of network security functions that can be deployed by $V_j$. When the scheduling path for a network flow is calculated, a smaller value of the matching measure between the functions required by the service chain and the functions implemented on the link corresponds to better utilization of the nodes by the secure network flow. The matching evaluation function of the network flow service chain is therefore defined by Formula (5).
To consider both matching degrees together, the objective function $F$ is established in Formula (6) as a weighted combination of $ef_i$ and $ec_i$, in which one term carries a weight in the interval $(0, 1)$ and the other carries one minus that weight.
The objective function has two constraints. The first is the link bandwidth constraint, Formula (7), which constrains the link bandwidth utilization of the $j$-th link on the $i$-th path. The second is the law of flow conservation, Formula (8), which must be satisfied during flow scheduling: the inflow and outflow of every intermediate node must be equal.
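A minimal sketch of how an objective of this kind could be evaluated for one candidate path. The concrete forms below are assumptions for illustration and are not the paper's formulas: the bandwidth term is taken as the normalized mean absolute deviation of link bandwidth from the path average, the service chain term as the fraction of required functions that no node on the path can host, and `alpha` as the hypothetical weight placed on the bandwidth term.

```python
def bandwidth_deviation(link_bw):
    """Assumed stand-in for ef_i: mean absolute deviation from the path
    average, normalized by that average (0 means perfectly even links)."""
    avg = sum(link_bw) / len(link_bw)
    return sum(abs(b - avg) for b in link_bw) / (len(link_bw) * avg)


def chain_mismatch(chain, node_functions):
    """Assumed stand-in for ec_i: share of service chain functions that
    no node on the path can deploy (0 means every function is covered)."""
    available = set().union(*node_functions)
    missing = [f for f in chain if f not in available]
    return len(missing) / len(chain)


def path_is_feasible(link_bw, demand):
    """Bandwidth constraint: the bottleneck link must fit the flow's demand."""
    return min(link_bw) >= demand


def objective(link_bw, chain, node_functions, alpha=0.5):
    """Weighted combination of the two terms; smaller is better here."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in (0, 1)")
    return (alpha * bandwidth_deviation(link_bw)
            + (1 - alpha) * chain_mismatch(chain, node_functions))


# Hypothetical path: three even links, one of two required functions covered.
score = objective(
    [100.0, 100.0, 100.0],
    ["encryption", "access control"],
    [{"encryption"}, {"route control"}],
)
```

Under these assumptions a path with evenly loaded links whose nodes cover the whole service chain scores 0, and the particle swarm would search the K candidate paths for the lowest score among feasible ones.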

Improved particle swarm optimization algorithm
To make better use of the particle swarm optimization algorithm, the linear decreasing weight (LDW) method is adopted to dynamically update the inertia weight $w$. The update formula is:

$$w = w_1 - (w_1 - w_e)\left(\frac{g}{I_k}\right) \qquad (9)$$

where $w_1$ is the initial value, $w_e$ is the value reached at the maximum evolution generation, $I_k$ is the maximum number of iterations, and $g$ is the current iteration number. The LDW method lets the algorithm search rapidly over the global range in the initial stage. Once the search has converged to a certain area, the algorithm can conduct a fine local search in that area in the later stage. This method gives a dynamically changing inertia weight, but the rate of change of the inertia weight stays constant throughout the optimization. To give the inertia weight a varying rate of change, a chaos coefficient is added to the linear decline, so that the inertia weight oscillates below the linear curve as much as possible while it declines and its rate of change is no longer fixed, which enhances the global search ability. The chaotic linearly decreasing inertia weight is given in Formula (10).
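The following sketch illustrates the two weight schedules. The linear schedule follows the LDW description directly. The chaotic variant is an assumption: the chaos value is generated with the logistic map, a commonly used chaotic sequence, and multiplied into the linear weight so that the result stays on or below the linear curve. The default values 0.9, 0.4, 0.3, and 4.0 are illustrative and are not taken from the paper.

```python
def ldw_weight(g, max_iter, w_start=0.9, w_end=0.4):
    """Linearly decreasing inertia weight: w_start at g=0, w_end at g=max_iter."""
    return w_start - (w_start - w_end) * g / max_iter


def logistic_map(f_prev, mu=4.0):
    """One step of the logistic map, used here as the chaos value update."""
    return mu * f_prev * (1.0 - f_prev)


def chaotic_weights(max_iter, f0=0.3, mu=4.0, w_start=0.9, w_end=0.4):
    """Assumed chaotic schedule: chaos value in [0, 1] times the LDW weight."""
    weights, f = [], f0
    for g in range(max_iter):
        f = logistic_map(f, mu)
        weights.append(f * ldw_weight(g, max_iter, w_start, w_end))
    return weights


schedule = chaotic_weights(50)
```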
where $f_t$ is the chaos value of the $t$-th iteration, updated according to Formula (11), in which $\mu$ is the chaos coefficient:

$$f_t = \mu f_{t-1}\,(1 - f_{t-1}) \qquad (11)$$

Compared with other algorithms, PSO converges quickly and has fewer control parameters, but it easily falls into a local optimum during iteration. SAA can simulate probabilistic jumps during the search, which can effectively prevent the search from becoming trapped in a local minimum. The principle of the simulated annealing algorithm is to accept not only good solutions but also, with a certain probability, bad solutions during iteration. The probability of accepting an inferior solution is controlled by the temperature parameter and decreases as the temperature decreases. The simulated annealing algorithm is used here to optimize the particle swarm optimization algorithm. The probability $p$ is calculated by Formula (12), in which Fit denotes the fitness function and $T$ is the current temperature. The temperature is updated by Formula (13):

$$T_{n+1} = C\,T_n \qquad (13)$$

where $C$ is the temperature change rate, which simulates the process of slow cooling and generally ranges from 0.95 to 0.99. The particle updating steps of particle swarm optimization using the simulated annealing algorithm (SAA-PSO) are as follows:
(1) If the fitness value of the current particle, $F(x_i(n+1))$, is less than $F(P_i(n))$, the optimal position of the particle is updated to $x_i(n+1)$.
(2) If the fitness value of the current particle, $F(x_i(n+1))$, is greater than $F(P_i(n))$, the optimal position of the particle is updated with probability $p$.
(3) When $p$ is greater than random(0,1), $P_i(n+1) = x_i(n+1)$.
(4) When $p$ is less than random(0,1), $P_i(n+1) = P_i(n)$.
To further improve the local search ability of the algorithm, a Cauchy mutation operation is added to the population iteration. The approach degree $cp$ is defined. After several generations (such as 5 generations), if $cp$ is less than its threshold value, the diversity of each dimension variable of the population is calculated as shown in Formula (14), using the maximum and minimum values of the $j$-th dimension variable in the $k$-th generation of the population and the search domain of that variable. If the diversity index of a dimension variable is lower than its threshold value, Cauchy mutation is carried out on that dimension variable as shown in Formula (15), where $x^k_{q,j}$ is the $j$-th dimension variable of the $q$-th particle and $C(0,1)$ is a random variable obeying the Cauchy distribution with parameters (0,1).
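The acceptance steps and the Cauchy mutation can be sketched as follows. The acceptance probability is assumed to take the usual Metropolis form, exp(-(new fitness - best fitness) / T), for a minimization problem, and the mutation is assumed to add a scaled standard Cauchy sample to the variable; both are standard choices rather than the paper's exact Formulas (12) and (15), and all names are hypothetical.

```python
import math
import random


def accept_probability(new_fit, best_fit, temperature):
    """Assumed Metropolis rule: always accept improvements, otherwise
    accept with a probability that shrinks as the temperature drops."""
    if new_fit <= best_fit:
        return 1.0
    return math.exp(-(new_fit - best_fit) / temperature)


def update_personal_best(x_new, p_best, fitness, temperature, rng=random):
    """Steps (1)-(4): return the particle's next best-known position."""
    p = accept_probability(fitness(x_new), fitness(p_best), temperature)
    return list(x_new) if p > rng.random() else list(p_best)


def cool(temperature, rate=0.97):
    """Slow cooling: multiply the temperature by the change rate."""
    return rate * temperature


def cauchy_mutate(value, scale=1.0, rng=random):
    """Perturb one dimension with a standard Cauchy sample (inverse CDF)."""
    u = rng.random()
    return value + scale * math.tan(math.pi * (u - 0.5))


def sphere(x):
    """Toy fitness function used only for this illustration."""
    return sum(c * c for c in x)


better = update_personal_best([0.0, 0.0], [1.0, 1.0], sphere, temperature=1.0)
```

At a high temperature a worse position is accepted fairly often, which helps a particle leave a local optimum; as the temperature falls, the rule approaches the ordinary greedy update.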

Experimental environment
This paper uses Mininet as the simulation platform and Ryu as the controller. A Fat-Tree structure was used as the experimental topology, and Iperf was used to generate traffic. The maximum link bandwidth between switches was set to 1 Gb/s. The bwm-ng tool was used at an interval of 1 second to check the statistics of the data transmitted over each port.

Simulation parameter setting
The relevant parameters of the algorithm proposed in this paper are shown in Table 1.

Simulation result analysis
In the simulation experiment, the average throughput and the average packet loss rate are used as evaluation indexes for network performance and service quality. To ensure the reliability and accuracy of the experiment, 10 simulation experiments are conducted under different modes, and the average of the experimental results is taken as the final result.

Average throughput under different network loads
Average network throughput refers to the average amount of data transferred under the current network load. It is a key measure of network transmission performance. Figure 3 shows the average network throughput of the proposed SAA-PSO algorithm, RPC, ECMP, and GFF at different secure stream transmission rates. ECMP, GFF, SAA-PSO, and RPC all perform relatively well when the stream transmission rate is low. As the flow transmission rate increases, ECMP allocates paths equally to network flows regardless of the link status, which makes it easy for multiple network flows to be scheduled on the same path. GFF always takes the first transmission path that meets the network demand, which leads to wasted network bandwidth. The RPC algorithm reroutes with link load and delay as its key indexes. Network bandwidth and service chain requirements are not considered in its model, so its performance is affected to a certain extent when the flow transmission rate increases. The algorithm in this paper also considers the link state and the service chain suitability of the network flow, which reduces the waste of link bandwidth resources. In addition, when the forwarding path for a flow is calculated, the core layer node with more matching functions is selected to determine the scheduling path, which reduces the possibility of traffic collisions. Therefore, the average throughput obtained by the algorithm in this paper is relatively higher.

Average packet loss rate under different network loads
The packet loss rate is another important measure of network transmission performance and indicates the congestion status of network flows. Figure 4 shows the average packet loss rate of the proposed SAA-PSO algorithm, RPC, ECMP, and GFF under different network loads. When the network load is low, the network flows are small and many links are idle, so the packet loss rate of all four algorithms is almost 0. As the load increases, the possibility of network congestion increases, and the packet loss rate of the four algorithms becomes more noticeable. The SAA-PSO algorithm avoids the local optimum caused by fast convergence. Therefore, when the network load reaches 0.5, the average packet loss rate of the SAA-PSO algorithm is 76.7%, 69.8%, and 29.3% lower than that of the ECMP, GFF, and RPC algorithms, respectively.
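For reference, a small sketch of how a packet loss rate and a percentage reduction of this kind are typically computed. It assumes the reported reductions are relative to each baseline algorithm's loss rate, which the text does not state explicitly; the numbers in the example are hypothetical and are not experimental results.

```python
def packet_loss_rate(sent, received):
    """Fraction of sent packets that did not arrive."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return (sent - received) / sent


def relative_reduction(baseline, proposed):
    """Reduction of the proposed value relative to the baseline value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - proposed) / baseline


# Hypothetical counters: 1000 packets sent, 900 received.
loss = packet_loss_rate(1000, 900)
# Hypothetical loss rates: baseline 0.4, proposed 0.1.
gain = relative_reduction(0.4, 0.1)
```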

Conclusion
To address secure traffic scheduling in software defined security, this paper proposes a network traffic scheduling strategy for data centers based on a particle swarm optimization algorithm that considers the matching degree between link bandwidth and network flow size and the matching degree between link nodes and network flow service chain functions. The optimal position of the particles in the iterative process is taken as the optimal transmission path of the network traffic. To keep the particle swarm optimization algorithm from falling into a local optimum, it is optimized by a simulated annealing algorithm. The linear decreasing weight method is used so that the particles search finely in the late iterations. The simulation results show that the proposed algorithm can adapt to the software defined security environment, improve the network throughput to a certain extent, and reduce the network packet loss rate. Compared with RPC, ECMP, GFF, and other traditional algorithms, the proposed algorithm performs better when the network load is heavy.
Table 2 shows the list of abbreviations used in this paper.

The link capacity is $C$, and the bandwidth requirement of network flow $s$ is $B_s$. The initial path set is obtained by the KSP algorithm, and the available bandwidth of each link $j$ on path $i$ is recorded.
The diversity in Formula (14) is computed from the maximum and minimum values of the $j$-th dimension variable in the $k$-th generation of the population and from the search domain of the $j$-th dimension variable. If the diversity index of a dimension variable of the population is lower than its threshold value, Cauchy mutation is carried out on that dimension variable, as shown in Formula (15).

Table 1.
We set up 10 nodes in the core layer of the network. They are classified into Class A, Class B, and Class C nodes according to their functions. There are two Class A nodes, which can deploy all seven network security functions. Five Class B nodes can deploy three different network security functions. Three Class C nodes can deploy different network functions. The locations of the three types of nodes and the network functions that Class B and Class C nodes can provide are changed every minute.