Adaptive Sampling Offspring Generation Strategy for Multi-objective Optimization

A covariance adaptive sampling offspring generation strategy (CASS) based on fuzzy clustering is proposed, and a multi-objective estimation of distribution algorithm (MEDCA) built on this strategy is introduced. A GK-FCM clustering partitioning strategy is designed to build a Gaussian model for each individual; together these models approximate the manifold of the Pareto solution set, and offspring are generated by sampling from them. An individual's survival generation is introduced to adapt its preference for exploration and exploitation: it is incorporated as a scaling factor of the covariance matrix in the sampling model, so that each individual's preference for exploitation or exploration is satisfied at different evolutionary stages. Through the covariance matrix adaptation sampling strategy and the scaling factor adaptation strategy, MEDCA's performance on complex multi-objective optimization problems is significantly improved. Experimental results demonstrate the advantages of MEDCA's model-sampling-based offspring generation strategy.


1. Introduction
Multi-objective evolutionary algorithms (MOEAs) combine evolutionary principles with classical genetic algorithms. The quality of offspring generation has a significant impact on the convergence and diversity of MOEAs [1]. In general, there are two types of offspring generation methods in evolutionary algorithms: recombination-based [2] and model-sampling-based [3]. Most current research focuses only on the macro-level evolutionary requirements of populations or subpopulations, without addressing the inherent flaws of model-sampling operators, which limits their applicability. We introduce a multi-objective estimation of distribution algorithm based on a covariance adaptive model [4]. The algorithm leverages the distinct evolutionary search preference of each individual to effectively enhance the evolutionary search process.
Many scholars have studied this issue. Offspring generation strategies based on genetic recombination mainly focus on how individuals are recombined in genetic algorithms [5]. For example, Zhang et al. [6] used a self-organizing map to establish neighborhood relationships among current solutions and improve the quality of offspring solutions; Yuan et al. [7] generated offspring using improved locally linear embedding in the early stage of the evolutionary search to maintain population diversity. Offspring generation strategies based on model sampling use statistical models to describe the distribution of high-quality individuals in the population, generate offspring by sampling the models, and then update the models [8]. For example, Sun et al. [9] used K-means clustering to adaptively approximate the manifold of the Pareto set and sampled a multivariate Gaussian model to expand the search range. Alternatively, multiple offspring generation strategies can be combined to make up for the shortcomings of model sampling. For example, Zhang et al. [10] detected the population distribution using an entropy criterion and fell back on crossover and mutation operators when the distribution showed no obvious regularity. Li et al. [11] introduced a local learning strategy based on non-dominated solution information into model sampling to guide offspring generation and improve the global search ability of the algorithm.
To address the limitations of existing methods, we propose a Covariance Adaptive Sampling Strategy (CASS) based on fuzzy clustering and introduce MEDCA based on this strategy. CASS uses GK-FCM [12] clustering to mine the regular features of the population distribution and quantifies each individual's preference for exploitation or exploration based on its survival generation [13]. It incorporates these preferences as scaling factors of the covariance matrix in the sampling model [14], so that individual preferences for exploitation or exploration are satisfied at different evolutionary stages.

Problem Definition
The mathematical model of the box-constrained continuous multi-objective optimization problem (MOP) [15] can be expressed as (1):

minimize F(x) = (f_1(x), f_2(x), ..., f_m(x))^T
subject to x ∈ Ω = ∏_{i=1}^{n} [a_i, b_i]    (1)

In (1), x_i ∈ [a_i, b_i] for all i ∈ [1, n]. Here m denotes the number of objective functions, x = (x_1, x_2, ..., x_n) is the n-dimensional vector of decision variables, and F: Ω → R^m is a continuous mapping from the decision space to the objective space, where f_j(x), j = 1, 2, ..., m, is the j-th objective to be minimized. Given the conflicting nature of the objectives, it is typically infeasible to find a single solution that optimizes all objectives simultaneously. The compromise solutions that trade the objectives off against one another are called Pareto optimal solutions x*. The set of all such solutions forms the Pareto optimal set (PS), and the set of all corresponding F(x*) values forms the Pareto front (PF).
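The Pareto-dominance relation underlying these definitions can be sketched in a few lines of Python (function names are illustrative, not from the paper):

```python
def dominates(fx, fy):
    """True if objective vector fx Pareto-dominates fy (minimization):
    no worse in every objective and strictly better in at least one."""
    return all(a <= b for a, b in zip(fx, fy)) and any(a < b for a, b in zip(fx, fy))

def non_dominated(front):
    """Filter a list of objective vectors down to its non-dominated subset."""
    return [f for f in front if not any(dominates(g, f) for g in front if g is not f)]
```

For example, `non_dominated([(1, 3), (2, 2), (3, 1), (2, 3)])` discards only `(2, 3)`, which is dominated by `(1, 3)`.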

MEDCA Framework
In this section, the MEDCA algorithm is proposed. The main characteristic of MEDCA is the introduction of CASS (Covariance Adaptive Sampling Strategy) within the framework of the multi-objective estimation of distribution algorithm [16]. It dynamically balances the exploitation and exploration phases of evolution by adapting to different search stages. In MEDCA, CASS applies GK-FCM clustering [12] to represent the population distribution: it divides the population into several clusters, shares the cluster covariance within each cluster to build a Gaussian model for each individual so that the models jointly approximate the manifold of the Pareto optimal set, and generates offspring through sampling. During offspring generation, each individual's scaling factor is used to scale the covariance matrix of its Gaussian model, steering the individual toward exploitation or exploration. The scaling factor is adaptively updated based on the individual's survival generation [13] as the population evolves, so that individual preferences for exploitation and exploration are satisfied at different evolutionary stages. The framework of the MEDCA algorithm is shown in Algorithm 1.
In Algorithm 1, first, N solutions are randomly generated from the decision space as the initial population, and an archive set is constructed (line 1). After that, the adaptive scaling factor vector s^g and the generation vector b are initialized.

Algorithm 1: MEDCA Framework
Input: population size N, number of clusters K, history length H.
Output: non-dominated solution set in the population P.
1 Initialize the population P with N random solutions and construct the archive set A;
2 Initialize the adaptive scaling factor vector s^g and the generation vector b;
3 while the termination condition is not met do
4   Partition P into K clusters using GK-FCM clustering;
5   for each individual x_n in P do
6     Construct the adaptive covariance matrix Σ_n for x_n;
7     Generate a trial solution ts by sampling around x_n;
8     Evaluate ts with the environmental selection operator and update A and b;
9   end
10  Calculate the adaptive scaling factor vector s^(g+1) from b;
11  g ← g + 1;
12  Update the population P using the archive set A;
13 end
14 return the non-dominated solution set in P.
Both s^g and b are 1×N row vectors corresponding to the N individuals in the population, and each element is initialized to 1 (line 2). During the evolution process, the population is first partitioned into K clusters by GK-FCM clustering (line 4). Then each individual x_n in the population is evolved in turn. First, the adaptive covariance matrix Σ_n for individual x_n is constructed (line 6): the covariance matrix of the cluster to which x_n belongs is scaled by the scaling factor s_n^g obtained through the evolution of x_n, and the scaled matrix is used as the covariance matrix of the sampling model of x_n. Afterwards, a trial solution ts is generated by sampling around x_n (line 7). The environmental selection operator evaluates ts and updates the archive set and the generation vector b (line 8). After one pass over the population, the adaptive scaling factor vector s^(g+1) is calculated from the generation vector b (line 10). In this calculation, only the survival generations of individuals within the past H generations are considered; beyond this threshold, an individual's survival generation no longer accumulates. Then the population is updated using the archive set (line 12). When the termination condition is met, MEDCA returns the non-dominated solution set in the population (line 14). MEDCA updates the population using an improved method that combines the DE operator [17] with an environmental selection operator based on the hypervolume indicator [18].
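The GK-FCM partitioning of line 4 can be illustrated with a minimal Gustafson-Kessel fuzzy c-means sketch. This is an illustrative reimplementation under our own parameter choices, not the authors' code; its key property for CASS is that each cluster carries its own fuzzy covariance matrix, which the individuals of that cluster then share:

```python
import numpy as np

def gk_fcm(X, K, m=2.0, iters=50, seed=0):
    """Minimal Gustafson-Kessel FCM sketch: like fuzzy c-means, but each cluster
    uses a Mahalanobis norm induced by its own volume-normalized fuzzy covariance,
    so clusters can align with locally linear pieces of the PS manifold."""
    rng = np.random.default_rng(seed)
    N, n = X.shape
    U = rng.random((K, N))
    U /= U.sum(axis=0)                                  # fuzzy membership matrix
    for _ in range(iters):
        W = U ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)      # cluster centers
        D = np.empty((K, N))
        covs = []
        for k in range(K):
            diff = X - V[k]
            F = (W[k, :, None] * diff).T @ diff / W[k].sum() + 1e-9 * np.eye(n)
            covs.append(F)                              # shared fuzzy cluster covariance
            A = np.linalg.det(F) ** (1.0 / n) * np.linalg.inv(F)
            D[k] = np.einsum('ij,jk,ik->i', diff, A, diff)  # squared GK distances
        D = np.maximum(D, 1e-12)
        E = D ** (1.0 / (m - 1.0))
        U = (1.0 / E) / (1.0 / E).sum(axis=0)           # membership update
    return U, V, covs
```

The returned `covs` list plays the role of the per-cluster covariance that Algorithm 1, line 6 scales into Σ_n.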

CASS Framework
In the evolutionary process, individuals belonging to the same cluster may have different preferences for exploitation and exploration. CASS uses a clustering model to represent the population distribution and introduces a scaling factor to perturb the search region centered on the evolving individual. The scaling factor controls the sampling range by scaling the covariance matrix of the sampling model, thereby assisting individuals in both exploitation and exploration [14]. CASS generates offspring by constructing a multivariate Gaussian sampling model from the covariance between individuals and their clusters. By sharing the covariance within each cluster, it reduces the computational complexity of the sampling model and achieves better exploration capability at lower computational cost. The steps of CASS are shown in Algorithm 2.

Algorithm 2: CASS
Input: individual x_n to be evolved.
Output: trial solution ts.
1 Determine the adaptive covariance matrix Σ_n for x_n;
2 Perform Cholesky decomposition on Σ_n, Σ_n = L L^T;
3 Generate ts = x_n + L z, where z is a random vector drawn from a standard normal distribution;
4 Perform polynomial mutation on the trial solution ts;
5 Repair the trial solution ts, where a_i and b_i are the lower and upper bounds of the i-th gene, i = 1, 2, ..., n;
6 Return the trial solution ts.
In Algorithm 2, first, the adaptive covariance matrix Σ_n for the individual to be evolved, x_n, is determined (line 1). Then, Cholesky decomposition is applied to Σ_n (line 2). Next, a Gaussian sampling model is constructed with x_n as the mean and Σ_n as the covariance matrix, and a random sample ts is generated (line 3). Subsequently, the trial solution ts undergoes polynomial mutation, and any genes outside the decision space are repaired (lines 4-5). Finally, the trial solution is returned (line 6).
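The steps of Algorithm 2 can be sketched as follows. This is a minimal numpy version under stated assumptions: the mutation operator is standard Deb polynomial mutation with our own default `eta` and `pm`, and the trial solution is pre-clipped before mutation so the distance-to-bounds terms are well defined:

```python
import numpy as np

def polynomial_mutation(x, lower, upper, eta=20.0, pm=None, rng=None):
    """Deb's polynomial mutation: each gene is perturbed with probability pm
    using a polynomial distribution controlled by the index eta."""
    rng = np.random.default_rng() if rng is None else rng
    y = x.copy()
    n = len(y)
    pm = 1.0 / n if pm is None else pm
    for i in range(n):
        if rng.random() < pm:
            u = rng.random()
            span = upper[i] - lower[i]
            d1 = (y[i] - lower[i]) / span
            d2 = (upper[i] - y[i]) / span
            if u <= 0.5:
                dq = (2 * u + (1 - 2 * u) * (1 - d1) ** (eta + 1)) ** (1 / (eta + 1)) - 1
            else:
                dq = 1 - (2 * (1 - u) + 2 * (u - 0.5) * (1 - d2) ** (eta + 1)) ** (1 / (eta + 1))
            y[i] += dq * span
    return y

def cass_offspring(x_n, sigma_cluster, s_n, lower, upper, rng):
    """Algorithm 2 sketch: scale the shared cluster covariance by s_n (line 1),
    Cholesky-decompose it (line 2), sample ts = x_n + L z (line 3), apply
    polynomial mutation (line 4), and repair to the box constraints (line 5)."""
    n = len(x_n)
    sigma_n = s_n * sigma_cluster + 1e-12 * np.eye(n)   # adaptive covariance Σ_n
    L = np.linalg.cholesky(sigma_n)
    ts = x_n + L @ rng.standard_normal(n)               # line 3: ts = x_n + L z
    ts = np.clip(ts, lower, upper)                      # pre-repair before mutation
    ts = polynomial_mutation(ts, lower, upper, rng=rng) # line 4
    return np.clip(ts, lower, upper)                    # line 5: repair genes
```

Scaling `sigma_cluster` by a scalar `s_n` before decomposition is what lets one shared Cholesky-cost model serve every individual in the cluster.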

Adaptive Scaling Factor Update
The survival generation of an individual is a comprehensive record of the environmental selection outcomes over a certain number of evolutionary generations. It is defined as (2) [19]:

g̃_n = min(g − b_n, H)    (2)

Here, g̃_n denotes the number of generations for which individual x_n has persisted in the population as the population evolves up to generation g. b_n is the generation of individual x_n, i.e., if x_n was generated in generation g0, then b_n = g0. H is a predefined parameter that caps the count: only the generations survived within the past H generations are considered, and beyond this threshold the survival generation no longer accumulates. As an attribute of evolving individuals, the survival generation count not only reflects an individual's quality in the current population but also indicates its evolutionary status and search preference. Specifically, individuals with higher survival generation counts are of relatively better quality and tend to prefer exploitation-oriented search to further improve; conversely, individuals with lower survival generation counts prefer exploration. In MEDCA, an adaptive scaling factor is constructed from the survival generation count to scale the covariance matrix of the sampling model, thereby controlling the sampling range to match the individual's exploitation or exploration preference. Specifically, the survival generation counts of the individuals in the population are normalized with respect to their mean and mapped into the range [0, 1], such that individuals with higher survival generation counts receive smaller scaling factors, enabling local search, while individuals with lower counts receive larger scaling factors for global search [20]. Algorithm 3 gives the detailed steps for updating the adaptive scaling factor.
In Algorithm 3, first, the survival generation count g̃_n of each individual in the population over the past H generations is calculated; its maximum value is H, and g denotes the current generation of the population (line 1). Then, based on the survival generation count of each individual, the scaling factor s_n^(g+1) for the covariance matrix in the next evolution is calculated (line 2).

Experimental Design
To verify the effectiveness of CASS and assess the performance of MEDCA, we selected a set of benchmark algorithms comprising three classical MOEAs (NSGA-II [21], SMS-EMOA [22], and RM-MEDA [23]) and three advanced MOEAs (SMEA [6], IM-MOEA [24], and MOEA/D-CMA [25]). Among them, IM-MOEA, RM-MEDA, and MOEA/D-CMA are representative of model-based sampling offspring generation strategies, while SMEA is representative of algorithms that use genetic operations for offspring generation. RM-MEDA is the first multi-objective estimation of distribution algorithm that explicitly exploits the regularity property of MOPs. SMEA defines neighborhood relationships among individuals through self-organizing maps and enhances local search by pairing and recombining within neighborhoods. IM-MOEA builds an inverse model mapping the objective space to the decision space using Gaussian processes and generates new offspring directly in the objective space. MOEA/D-CMA incorporates the covariance matrix adaptation evolution strategy (CMA-ES) from single-objective optimization into the MOEA/D framework via subproblem aggregation.
In the experiments, the GLT test set [26] was selected as the benchmark problems. The decision variables of the GLT problems are strongly nonlinearly correlated, and the objectives have complex PF characteristics: GLT1, GLT4, and GLT6 have discontinuous PFs; GLT2 and GLT3 have complex PS structures; GLT2-GLT6 have convex PFs; GLT1-GLT4 are bi-objective, and the remaining problems are tri-objective. To evaluate the overall performance of MEDCA and the benchmark algorithms, we selected the inverted generational distance (IGD) [23] and the hypervolume (HV) [27] as evaluation metrics. Smaller IGD values and larger HV values indicate an approximation front that is closer to the Pareto front and more uniformly distributed. When computing HV, all objective values are normalized to [0, 1], and the reference point is set to r* = (1.1, ..., 1.1)^T. To minimize random error, each test problem was run independently 31 times for every algorithm on the same platform (CPU: Intel Core i7-12700KF, Memory: 32 GB, GPU: RTX 2060). All algorithms used the same common parameter settings; detailed per-algorithm parameter settings are given in Table 1.
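For the bi-objective GLT problems, the HV computation with reference point r* can be sketched by a standard sweep. The paper does not specify its HV implementation, so this is a minimal 2-D version assuming objectives already normalized to [0, 1]:

```python
def hv_2d(front, ref=(1.1, 1.1)):
    """Hypervolume of a bi-objective front (minimization) w.r.t. a reference
    point: sweep points by ascending first objective and accumulate the slab
    each non-dominated point adds to the dominated region."""
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y < prev_y:                 # non-dominated step contributes a slab
            hv += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return hv
```

For example, the single point (0.5, 0.5) gives HV = (1.1 − 0.5)² = 0.36, and adding a dominated point leaves the value unchanged.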

Experimental Results
To ensure reliable statistical conclusions, MEDCA and the benchmark algorithms were executed independently 31 times on each test problem. Performance was assessed using the mean and standard deviation of the IGD and HV measurements. For each test problem, the IGD and HV measurements were sorted in ascending and descending order, respectively. Subscript numbers denote standard deviations, and numbers in square brackets give the algorithm's rank on that test problem; overall rankings across the test problems were then computed. Furthermore, Wilcoxon rank-sum tests at the 5% significance level were conducted to test whether the differences in performance between MEDCA and the six benchmark algorithms are significant. The symbols "†", "§", and "≈" indicate that MEDCA's measurements are significantly better than, significantly worse than, or not significantly different from those of the benchmark algorithm, respectively, and the number of "†", "§", and "≈" outcomes was tallied. Table 2 reports the mean and standard deviation of the IGD and HV measurements of the approximation fronts obtained by MEDCA and the benchmark algorithms over 31 independent runs on the GLT test set. According to the overall rankings, MEDCA holds the top position, followed by MOEA/D-CMA, RM-MEDA, SMEA, IM-MOEA, NSGA-II, and SMS-EMOA. Compared to the six benchmark algorithms, MEDCA achieved the best average IGD (HV) measurement in 11 of the 12 cases and the second best in the remaining case. According to the Wilcoxon rank-sum tests, MEDCA showed a significant advantage over NSGA-II, SMS-EMOA, RM-MEDA, SMEA, IM-MOEA, and MOEA/D-CMA across the 12 cases. Compared to the second-ranked algorithm, MOEA/D-CMA, MEDCA had 11 superior measurements and one inferior measurement: on GLT1, the average HV measurement of MOEA/D-CMA was better than that of MEDCA. Furthermore, since MEDCA, RM-MEDA, SMEA, and IM-MOEA all exploit the regularity property of the problem during individual recombination, the statistics also suggest that the use of this regularity knowledge through CASS improves the quality of offspring solutions on the GLT test set.

To evaluate the search efficiency of MEDCA and the benchmark algorithms, Fig. 1 presents the evolution of the average IGD measurement of the search population over 31 independent runs on the GLT test set for the seven algorithms. As Fig. 1 shows, MEDCA consistently achieves the lowest average IGD with minimal standard deviation on the GLT test set. Compared to the six benchmark algorithms, MEDCA exhibits the best convergence behavior, especially on GLT3, GLT5, and GLT6, indicating that MEDCA effectively balances exploitation and exploration and searches efficiently on the GLT test set. MEDCA converges quickly to an approximation front with good convergence and distribution properties compared to SMS-EMOA, IM-MOEA, RM-MEDA, and NSGA-II. On GLT1, GLT4, and GLT5, MEDCA converges slightly more slowly than MOEA/D-CMA in the early stage of the evolutionary search, but ultimately achieves a smaller average IGD than MOEA/D-CMA. On GLT2, MEDCA, RM-MEDA, and SMEA perform similarly, but MEDCA is more stable. In conclusion, MEDCA has the best search efficiency on the GLT test set.
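The IGD metric used above can be sketched directly. This is a straightforward numpy reimplementation of the standard definition; the source of the reference (true PF) sample is assumed:

```python
import numpy as np

def igd(approx_front, ref_front):
    """Inverted generational distance: mean Euclidean distance from each
    reference (true PF) point to its nearest neighbor in the approximation
    front; smaller values indicate better convergence and coverage."""
    A = np.asarray(approx_front, dtype=float)
    R = np.asarray(ref_front, dtype=float)
    d = np.linalg.norm(R[:, None, :] - A[None, :, :], axis=2)  # |R| x |A| distances
    return d.min(axis=1).mean()
```

An approximation front identical to the reference sample yields IGD = 0; any gap in coverage raises the value because uncovered reference points are far from their nearest approximation point.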

4. Conclusion
To address the limitations of model-sampling-based offspring generation strategies, we propose a Covariance Adaptive Sampling Strategy (CASS) based on fuzzy clustering and introduce a multi-objective estimation of distribution algorithm (MEDCA) built on the covariance adaptive model. CASS uses GK-FCM clustering to characterize the population distribution and divides the population into several clusters. For each individual within a cluster, a Gaussian model is constructed so that the models jointly approximate the manifold of the Pareto solution set, and offspring are generated by sampling from these models. Each individual's preference for exploration and exploitation is adaptively quantified from its survival generation and introduced as a scaling factor of the covariance matrix of the sampling model, ensuring that individual preferences are met at different stages of evolution. Sharing covariance within a cluster allows effective exploration at minimal computational cost.
Experimental results demonstrate that MEDCA effectively exploits the regularity property of the problem to enhance the quality of offspring solutions. Furthermore, MEDCA surpasses the comparison algorithms in average IGD and HV measurements, search efficiency, and the distribution of the obtained non-dominated solutions.



The survival generation of an individual essentially represents the individual's "age". Based on the definition above, a larger value of g̃_n indicates an individual of better quality, whereas a smaller value indicates that the individual has only recently been generated and its quality is not yet established.

Algorithm 3: Adaptive Scaling Factor Update
Input: generation vector b.
Output: adaptive scaling factor vector s^(g+1).
1 Calculate the survival generation count g̃_n of each individual x_n;
2 Calculate the scaling factor s_n^(g+1) of the covariance matrix for each individual x_n from g̃_n and the population mean g̃_mean;
3-5 If g̃_n is close to g̃_mean, repair the covariance matrix of x_n;
6 return the adaptive scaling factor vector s^(g+1).

The scaling factor s_n^(g+1) ranges from 0 to 1: a larger g̃_n corresponds to a smaller s_n^(g+1), and a smaller g̃_n corresponds to a larger s_n^(g+1). When an individual's survival generation count is close to the population's average, the individual's covariance matrix is repaired to prevent high-quality individuals from being trapped in local optima by a search range narrowed toward exploitation (lines 3-5).

Figure 1. The dynamics of the average and standard deviation of IGD values across the GLT test suite.

Table 2. Average and Standard Deviation Values of IGD and HV Metrics on GLT Test Suites.