Using Grey Wolf Algorithm to Solve the Capacitated Vehicle Routing Problem

The capacitated vehicle routing problem (CVRP) is a class of the vehicle routing problems (VRPs). In CVRP a set of identical vehicles having fixed capacities are required to fulfill customers' demands for a single commodity. The main objective is to minimize the total cost or distance traveled by the vehicles while satisfying a number of constraints, such as: the capacity constraint of each vehicle, logical flow constraints, etc. One of the methods employed in solving the CVRP is the cluster-first route-second method. It is a technique based on grouping of customers into a number of clusters, where each cluster is served by one vehicle. Once clusters are formed, a route determining the best sequence to visit customers is established within each cluster. The recently bio-inspired grey wolf optimizer (GWO), introduced in 2014, has proven to be efficient in solving unconstrained, as well as, constrained optimization problems. In the current research, our main contributions are: combining GWO with the traditional K-means clustering algorithm to generate the ‘K-GWO’ algorithm, deriving a capacitated version of the K-GWO algorithm by incorporating a capacity constraint into the aforementioned algorithm, and finally, developing 2 new clustering heuristics. The resulting algorithm is used in the clustering phase of the cluster-first route-second method to solve the CVR problem. The algorithm is tested on a number of benchmark problems with encouraging results.


Introduction
The Capacitated Vehicle Routing Problem (CVRP) is one of the variants of the classical Vehicle Routing Problem (VRP). It was first introduced in [1]. In the CVRP, a fleet of homogeneous vehicles with fixed capacity depart from a central depot. These vehicles must satisfy the demands of a number of spatially separated customers at minimum cost. This must be done without exceeding any vehicle's capacity. The cost considered in the problem is either the total distance travelled or the total travelling time of all the vehicles. Several methods for solving CVRP have been reported in the literature. These methods are generally categorized into exact methods, heuristics, metaheuristics or a combination of two or more of them. For a detailed review of CVRP solution methods, interested readers are referred to [2]. Cluster-first route-second method is one of the heuristics employed to solve CVRP, where clustering represents the core step within the heuristic [3]. In the cluster-first route-second method, customers are first grouped into feasible clusters that maintain the vehicle capacity constraint. Hence, the clustering phase can be viewed as a capacitated clustering problem (CCP). One vehicle is assigned to serve the customers within each cluster. Efficient routes are then established (route second). Route establishing requires determining the sequence to visit customers, by each vehicle, within its assigned cluster. Grey Wolf Optimizer (GWO) is a new metaheuristic inspired by the leadership hierarchy, as well as the hunting mechanism observed in grey wolves. It was first introduced by [4] in 2014. In GWO, four types of wolves: alpha, beta, delta, and omega are used to simulate the chain of command as illustrated in figure 1, taken from reference [4]. As can be seen from figure 1, the dominance is decreased from top to bottom. Alpha wolves dominate the beta wolves, which in turn dominate the delta wolves. Finally, the omega wolves are dominated by all the higher ranked wolves. The grey wolf hunting behaviour is mainly divided into three steps: tracking, encircling and attacking the prey. The wolves estimate the prey location, through an iterative process. The GWO algorithm starts by a population of randomly generated wolves. The population of randomly generated wolves represent solutions. Similar to the grey wolves' behaviour, the population of wolves (solutions) estimate the optimum solution (the prey) through an iterative process. The three best search agents: alpha, beta and delta are used to update the rest of the population's location. In GWO, tracking the prey represents the exploration phase. The exploitation phase is represented by attacking the prey. The balance between the exploration phase and the exploitation phase represents a key issue in any metaheuristic. Results obtained on uni-modal and multimodal functions verified, respectively, the exploitation and exploration abilities of GWO. Being a new algorithm, modest applications of the GWO algorithm were reported in the literature. In [5], the GWO algorithm was employed for optimum feature subset selection. When GWO is Compared against particle swarm optimization (PSO) and genetic algorithms (GA), GWO showed better accuracy. In [6], an improved version of Grey Wolf Optimizer (GWO) is employed for training q-Gaussian Radial Basis Functional-link nets neural networks. Competitive results were obtained when compared to other metaheuristic methods.
The K-means clustering algorithm is one of the widely used traditional partitioning approaches. It is simple and easy to implement. However, it suffers from the possibility of falling into local minima. To overcome this drawback, the K-means algorithm is usually hybridized with another optimization algorithm. The merits of combining a number of bio-inspired algorithms to K-means have been studied in [7] and [8]. Experimental validation showed the benefits of such integration.
In the current work, we integrate the GWO algorithm with K-means to introduce the (K-GWO), which stands for clustering with GWO. We further incorporate a capacity constraint to render the K-GWO suitable for solving capacitated clustering problems. We refer to this algorithm in this work as (capacitated K-GWO). We use the capacitated K-GWO to solve CVRP. The proposed capacitated K-GWO is used in the clustering phase of the cluster-first route-second technique to solve CVRP. To the best of our knowledge, this is the first time to employ such integration for solving CVRP. The rest of this work is organized as follows: in Section 2 we introduce the problem definition. Details of the solution methodology are given in Section 3, where grey wolf optimization algorithm is first introduced, followed by the K-GWO algorithm, and finally the capacitated K-GWO algorithm. Experimental computations and results are presented in section 4. Finally, concluding remarks are included in section 5.

Problem definition
The vehicle routing problem (VRP) is one of the most studied combinatorial optimization problems [9]. The classical VRP requires determining the optimal routes used by a fleet of vehicles to satisfy the demands of a number of spatially separated customers. Many variants of VRP exist in the literature and in practical applications. These variants usually depend on the different constraint(s) imposed on the problem. The CVRP considered in this work considers distribution of a single commodity from a central depot to a set of customers. A given number of identical vehicles are available. Each vehicle has a fixed capacity. The total demand assigned to each vehicle must not exceed the vehicle's capacity. CVRPs may consider deterministic or stochastic demands. In this work, we consider predetermined deterministic customers' demands. This assumption is based on the fact that many suppliers depend on long term contracts with their customers, and hence they can determine their customers' demands. Practical experience shows that customers prefer to be served once. Therefore, for the convenience of customers, demands may not be split between more than one vehicle. Accordingly, each customer should be visited exactly once by only one vehicle. The objective is to minimize the total distance traveled by all the vehicles to serve all the customers. Each vehicle starts at the depot, serves all its assigned customers, and ends the trip back at the depot. According to [10], VRP is an NP-Hard problem. Heuristics and metaheuristics are therefore considered to solve the problem and its associated variants [11].

Proposed solution methodology
Our approach to solve the CVRP is based on the cluster-first route-second method. In the clustering phase, customers are grouped into feasible clusters. A feasible cluster requires that the total customers' demands within the cluster do not exceed the capacity of a single vehicle. In the routing phase, customers within each cluster are suitably ordered, which amounts this phase to solving a traveling salesman problem TSP [12].
In the present work, clustering will be performed using the proposed capacitated K-GWO algorithm. Details of the proposed algorithm are presented below.

Grey Wolf Optimizer (GWO)
Previous research shows that bio inspired metaheuristics are efficient in solving CVRPs. Such efficiency is recorded in terms of solution quality, as well as computational time. Therefore, many researchers are motivated to examine different bio inspired metaheuristics in solving the CVRP. The Grey Wolf Optimizer (GWO) algorithm is a new bio inspired metaheuristic that has been introduced in 2014 by [4] . It is inspired by both the social hierarchy of wolves, as well as their hunting behavior. In GWO, the search starts by a population of randomly generated wolves (solutions). During hunting (optimization), these wolves estimate the prey's (optimum) location through an iterative procedure. In order to formulate the hierarchy of wolves, the fittest solution is referred to as the alpha ( ). While the second and third best solutions are called beta ( ) and delta ( ), respectively. The wolves , , and are responsible for guiding the search (hunting), while other wolves follow. The hunting behaviour is mainly divided into three steps: tracking, encircling and attacking the prey. The encircling behaviour is mathematically modelled according to the following equations: . (1) 1 . ( Where represents the distance between the position vector of both the prey and a wolf , and represents the current iteration number. and are coefficient vectors, and are calculated as follows:

(4)
Where and are random vectors ∈ 0,1 , and the value of is linearly decreased from 2 to 0 as iterations proceed.
According to (1) and (2), a wolf can randomly update its position in the area around the prey.
From (5), (6) and (7) it can be seen that the position of any wolf is determined by the position of the three best solutions. Position updating of grey wolf is shown in figure 2 taken from reference [4].

Figure 2. Position updating of grey wolf [4]
The hunting ends by attacking (approaching) the prey, this step represents the exploitation phase. It is performed by decreasing the value of , linearly from 2 to 0. This in turn decreases the value of . To avoid local stagnation, random values of greater than 1 are employed to force the wolf away from the prey. This step represents the exploration phase. Using adaptive values of and consequently , which is depicted by linearly decreasing the values of from 2 to 0, guarantees a balance between exploration and exploitation. This balance is achieved since half of the iterations are dedicated to exploration when |A| • 1, while the rest of the iterations is dedicated for exploitation when |A|

1.
This balance is one of the GWO's strengths. To sum up, values of |A| 1 oblige the search agents to move towards the prey while values of |A| >1 oblige them to diverge from it.
The value of represents another component that influences exploration, where ∈ 0,2 . From (1) we can conclude that represents the weight of the prey in defining the distance; values of 1 emphasize its effect while C 1 reduce it. Integrating the GWO with K-means algorithm to overcome the drawbacks of the K-means is discussed in the next section.

Clustering with grey wolf optimize
The K-means clustering algorithm assigns points to k clusters based on their proximity to the cluster's centroid. Initially centroids are selected at random and then iteratively reallocated until a predetermined stopping criterion is met. The K-means algorithm may be divided into an initialization phase, a cluster assignment and centroids update phase, and finally an exploration and evaluation phase [5]. K-means algorithm suffers from two main drawbacks, namely: the dependence on initial centroid values, and its tendency to fall into local optima. Its integration with bio-inspired optimization algorithms to overcome the aforementioned drawbacks is studied in [7], [13].
In this work, we introduce for the first time a combination of the K-means algorithm and the GWO. In our proposed K-GWO algorithm, wolves represent solutions. Each wolf holds a set of ccentroids that correspond to K clusters. Each centroid is a D dimensional vector. Consequently, each wolf is represented by a D vector. A population of N wolves collaboratively hunts for the best possible configuration of the clusters (prey). The best configuration of clusters is reflected by the optimal positions of the centroids. These centroids are subsequently used to grow clusters following the basic K-means principle: points are assigned to the cluster with nearest centroid.
The algorithm aims at minimizing an objective function defined as Where, k is the number of clusters, N is the number of points to be clustered, , is the data point belonging to the cluster, and is the centroid of cluster . As mentioned earlier, each search agent represents a set of k centroids ( , , … ) which, when used in equation (8), gives an indication on how fit this agent is. The fittest search agent is the one associated with the minimum value of . Wolves use equations (3) to (7) for updating their positions depending on the positions of , , and wolves. We further extend our proposed K-GWO, to incorporate a capacity constraint. Details are discussed in the next section.

Capacitated K-GWO
As mentioned earlier, the capacitated clustering problem (CCP) consists of imposing a capacity constraint on each cluster. Mainly, there exists three approaches to handle constraints: direct approach, Lagrange multipliers, and penalty method [14]. The penalty method is selected in our proposed method, since it is the simplest to implement in practice, without reducing the quality of the results. In our approach, constraints violations are incorporated in the objective function defining a penalized function. This function penalizes infeasible solutions relative to the amount of constraint's violation [15]. Accordingly, the new fitness function can be defined as Where: is the penalized fitness function, is the original fitness function, and ∆ is the amount of capacity violation, and are user defined penalty parameters. In the current study, these parameters are determined experimentally. Incorporating the penalized fitness function to the K-GWO algorithm yields a capacitated K-GWO algorithm appropriate for capacitated clustering problems.

Cluster assignment
The cluster assignment phase is among the phases of K-means clustering algorithm where points are assigned to their corresponding clusters. The classical method for assignment is performed according to minimum distance to centroid as illustrated in figure 3. In the CVRP, when customers are assigned to clusters based only on minimum distance to centoid, infeasible clusters might be formed. In order to obtain feasible clusters, we develop two new heuristics. In both heuristics, Euclidean distances between every customer and all the centroids are calculated. In our first developed heuristic, shown in figure 4, points are assigned to the cluster based on minimum distance until capacity limit is reached. At this stage, the remaining non-clustered customers will violate capacity constraints if assigned to any cluster. For every non-clustered customer, we calculate the amount of capacity violation if the customer is assigned to each cluster. Every non-clustered customer is then assigned to the cluster having the minimum capacity violation.

Figure 3. Pseudo code of assignment procedure used in KmeansO
In our second developed heuristic, shown in figure 5, points are assigned to the nearest cluster based on minimum distance and maximum demand [16]. In this case a ratio is calculated using (10). Assignment is conducted according to this ratio until capacity limit is reached. At this stage, nonclustered customers are assigned to clusters according to the minimum capacity violation explained in the first heuristic above.  . Pseudo code of assignment procedure used in KmeansP Based on the above mentioned heuristics, three versions of the capacitated K-GWO will be compared. The first version uses the classical assigning technique, where points are assigned to the nearest centroid, and, in this work, will be referred to as KmeansO. The second version will be referred to in this work as KmeansP, and uses our first assigning heuristic, where points are assigned to clusters with minimum distance and minimum violation. Finally, the third version, and in this work will be referred to as KmeansR, and uses our second assigning heuristic, where points are assigned to clusters with minimum ratio and minimum violation. The capacitated K-GWO is utilized in the clustering phase of the cluster-first route-second technique to solve the CVRP. Experimental computations are carried out in order to investigate the efficiency of the proposed algorithm as will be seen in the next section.

Computational results and analysis
In order to verify the efficiency of the proposed capacitated K-GWO algorithm, a set of experiments are carried out. The proposed algorithm is tested using benchmark problems downloaded from the web http://www.branchandcut.org/. In this work, 20 instances from three different sets are tested. These sets are known in the literature as set A, set B, and set P. All three sets A, B and P are developed in [17]. For the instances in set A and set B, customers' locations and demands are randomly generated. Instances in set P are modified instances from the literature. Each test instance has the following nomenclature: the first letter refers to the set name followed by the number of customers and finally the minimum number of required clusters. The minimum number of required clusters corresponds to the minimum number of vehicles to employ. The minimum number of vehicles to employ is obtained by dividing the total customers' demands by the capacity per vehicle, rounding up fractions. For each instance, the depot and customers' coordinates, as well as customers' demands, are given.
Computational experiments are coded using MATLAB 2009a, on a PC with Pentium Dual Core T3200 Processor 2.0GHz and 2.0 GB memory. In the experiments, we run each instance for 10 independent runs and the best result is reported. The stopping criterion is the number of generations. The parameters of the proposed algorithm were set experimentally after a number of test runs. Chosen values of the parameters are as follows: the wolf population size is 20 and the maximum number of generations is 500. Table 1 summarizes the results of these runs, where the three versions of the capacitated K-GWO are compared. The values reported in Table 1 are those of the total distance travelled by all vehicles after optimally routing the vehicles within each cluster. The best value among the three versions is reported in the last column named "Best". Table 1 shows that each of the three versions, KmeansFnO, KmeansFnP, KmeansFnP, are able to obtain the best value in 10, 9, and 11 instances out of the 20 test instances, respectively. This shows that the versions are comparable in terms of the solution quality each version can obtain. However, during the test runs, we noticed that the first version obtained many non-feasible solutions before being able to obtain a feasible solution. The second and third versions however, were able to reach feasible solutions faster than the first version. This is explained by the fact that the first heuristic minimizes the distance to centroids, and capacity constraints are only considered in the objective function penalty. Contrary to the other 2 versions, where capacity constraints are considered during cluster assignment, in addition to the penalty within the objective function. The computational time for solving the test problems is in the order of a few seconds for any of the three versions.
for each search agent calculate the distance between every customer and all centroids ( , , … ) calculate the ratio between the distance and customer's demand according to (