A genetic algorithm application in backcross breeding problem

In this paper we discuss a mathematical model of a goat breeding strategy, namely backcross breeding. The model is aimed at obtaining a strategy for producing a better variant of the species. In this strategy, a female (doe) of a lesser quality goat is bred with a male (buck) of an exotic goat of better quality. We explore the problem of how to harvest the population optimally. A genetic algorithm (GA) approach is devised to obtain the solution of the problem. We run several trials of the GA implementation, which give different sets of solutions that are nevertheless relatively close to each other in terms of the resulting total revenue, with a few exceptions. Further study needs to be done to obtain a GA solution that is closer to the exact solution.


Introduction
In this paper we discuss a mathematical model of a goat breeding strategy, namely backcross breeding. The model first appeared in [1] and was explored further in [2]. The backcross breeding strategy is aimed at obtaining a better variant of the species. In this strategy, a female (doe) of a lesser quality goat, in terms of traits such as meat and milk quality, is bred with a male (buck) of an exotic goat of better quality (a fullblood). It is common practice in Indonesia to breed a doe of Kambing Kacang (Capra aegagrus hircus) with an exotic male goat from another variant of the same subspecies (such as the Boer goat or the Jamnapari goat). The resulting offspring of the first generation is called filial 1 (F1). When the F1 is a female, she is bred with another exotic male goat of the same variant to produce an F2 female, which is subsequently bred with another exotic male goat of the same variant to produce F3. This process continues to produce a better quality goat, which is often called a purebred goat. It is easy to show that, in terms of quality, $\lim_{n \to \infty} F_n = \text{fullblood}$. In fact the process converges quickly, following the sequence 50% (1/2), 75% (3/4), 88% (7/8), 94% (15/16), 97% (31/32) for $n = 1, 2, 3, 4, 5$, respectively. Hence the F5 is regarded as a good quality goat (a purebred, whose appearance is very close to that of the fullblood). In this paper we adopt the model in [1], which has the schematic diagram shown in Figure 1. Referring to Figure 1, suppose that the number of does in each age class is known at $t = 0$, with:
• $F_t(0,0)$ is the number of lesser quality (local) does belonging to the first age class;
• $F_t(0,1)$ is the number of lesser quality (local) does belonging to the second age class;
• $F_t(1,0)$ is the number of hybrid does belonging to the first age class;
• $F_t(1,1)$ is the number of hybrid does belonging to the second age class.
The authors in [1] argued that the number of does at time $t = 0$ is $F_{t=0}(G,T) = F_t(0,0) + F_t(0,1) + F_t(1,0) + F_t(1,1)$. In a more compact form, this can be written as the vector $\mathbf{F}_{t=0}(G,T) = [F_t(0,0), F_t(0,1), F_t(1,0), F_t(1,1)]^{\top}$. The vector $\mathbf{F}_t(G,T)$ is called the initial distribution vector at time $t$. Further, by the same argument, $F_{t+1}(G,T) = F_{t+1}(0,0) + F_{t+1}(0,1) + F_{t+1}(1,0) + F_{t+1}(1,1)$, or more compactly, in vector form, $\mathbf{F}_{t+1}(G,T) = [F_{t+1}(0,0), F_{t+1}(0,1), F_{t+1}(1,0), F_{t+1}(1,1)]^{\top}$, with the components determined by the birth and survival rates of the model in [1]. Hence the population growth model can be written in matrix form, or more compactly as $\mathbf{F}_{t+1}(G,T) = \mathbf{M}\,\mathbf{F}_t(G,T)$, where $\mathbf{M}$ denotes the matrix of birth and survival rates. The model is easy to solve iteratively; the solution for the next $n$ years is $\mathbf{F}_{t+n}(G,T) = \mathbf{M}^n\,\mathbf{F}_t(G,T)$.
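The iterative solution of the matrix model can be illustrated with a minimal C++ sketch. It assumes the four-component state vector described above and a 4x4 transition matrix, here called `M` (a placeholder name); the entries of the matrix are not given in this section, so any values passed in are hypothetical. The names `Vec4`, `Mat4`, `step`, and `project` are likewise illustrative only.

```cpp
#include <array>
#include <cassert>

// State vector [F(0,0), F(0,1), F(1,0), F(1,1)] and a 4x4 transition matrix.
// The matrix entries are NOT taken from the paper; callers supply them.
using Vec4 = std::array<double, 4>;
using Mat4 = std::array<Vec4, 4>;

// One time step: returns the matrix-vector product M * f.
Vec4 step(const Mat4& M, const Vec4& f) {
    Vec4 next{};  // zero-initialised accumulator
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            next[i] += M[i][j] * f[j];
    return next;
}

// n time steps: applying M repeatedly is equivalent to computing M^n * f.
Vec4 project(const Mat4& M, Vec4 f, int n) {
    assert(n >= 0);
    for (int t = 0; t < n; ++t)
        f = step(M, f);
    return f;
}
```

Repeated multiplication is used instead of forming the matrix power explicitly, which is adequate for a short time horizon.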
In this paper we pose the problem of how to harvest the population optimally. If $Q_{1k}$, $Q_{2k}$, $Q_{3k}$, and $Q_{4k}$ are the numbers of individuals harvested from the subpopulations $F(0,0)$, $F(0,1)$, $F(1,0)$, and $F(1,1)$, respectively, then the problem is to maximize the profit $\Pi$ over the time horizon $T$, where $p_i$ is the unit price associated with subpopulation $i$ in the profit function and the discounting factor is represented by $\rho$. We use a genetic algorithm approach [3] to devise the solution of the problem.

Genetic Algorithm
Genetic algorithms (GA) were introduced by John Holland and further developed by David Goldberg. GAs are numerical optimization algorithms inspired by both natural selection and natural genetics [3], and they are regarded as a heuristic approach. Compared to other optimization methods, a GA has several advantages: it is a direct, parallel, and powerful technique that can optimize continuous or discrete variables; it requires no derivative information; it searches simultaneously over a large domain of the objective function; it deals with a large number of variables; it optimizes variables with extremely complex objective functions; and it provides several optimum solutions instead of one [1,4,5]. This heuristic approach is applicable to many fields of optimization, which makes GA attractive to many researchers [6]. The basic algorithm of GA, as suggested by Goldberg, is shown in the flowchart of Figure 2.

Genetic Algorithm implemented in Backcross Breeding Matrix
As explained earlier, our attention in this research is directed to the population growth model given in [1]. In [1] it is explained that
$a_i$ = upgrading birth rate, with $i = 0, 1, 2, \ldots, n$
$\alpha_i$ = non-upgrading birth rate, with $i = 0, 1, 2, \ldots, n$
$b_i$ = survival rate, with $i = 0, 1, 2, \ldots, n-1$
$b_n$ = survival rate of the last class
$\tau$ = birth rate of the 1st age class
We then consider the maximization problem (3), subject to the constraint (4). Our purpose is to solve the maximization problem (3), with the constraint on $\mathbf{F}(G,T)$ as in (4), using the genetic algorithm method. We built the GA routine as a C++ program. The following are some remarks on the construction of the GA program.
• The objective function is the function $\Pi$, which is to be maximized.
• The fitness for maximization is calculated using $\mathrm{eval}(v_k) = \Pi$, $k = 1, 2, \ldots, N$, where $N$ is the population size. We take $N = 10$ in this case.
Generally, the genetic algorithm process includes the following important steps: encoding, selection, crossover, and mutation [8]. The steps are explained in detail in this section.

Pre-step 1: Initialization
We begin the process with initialization as follows:

Step 1: Initial Population
A random process is used to obtain an initial population of 10 individuals. Each of these 10 random individuals is a binary string of length 8. The decimal value of each individual is calculated by substituting the decimal value of each substring into formula (5).
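The initialization can be sketched in C++ as follows. This is only an illustration under stated assumptions: each gene is drawn as 0 or 1 with equal probability, and substrings are read most significant bit first. Formula (5) is not reproduced here, so only the substring-to-decimal conversion that would feed it is shown. The names `Chromosome`, `initPopulation`, and `substringToDecimal` are hypothetical.

```cpp
#include <cassert>
#include <random>
#include <vector>

using Chromosome = std::vector<int>;  // each gene is 0 or 1

// Builds popSize random binary chromosomes with `length` genes each.
// Assumption: every gene is 0 or 1 with probability 1/2.
std::vector<Chromosome> initPopulation(int popSize, int length, std::mt19937& rng) {
    assert(popSize > 0 && length > 0);
    std::bernoulli_distribution coin(0.5);
    std::vector<Chromosome> pop(popSize, Chromosome(length));
    for (auto& chrom : pop)
        for (auto& gene : chrom)
            gene = coin(rng) ? 1 : 0;
    return pop;
}

// Decimal value of the substring chrom[start, start + len),
// read with the most significant bit first (an assumption).
int substringToDecimal(const Chromosome& chrom, int start, int len) {
    assert(start >= 0 && len >= 0);
    assert(start + len <= static_cast<int>(chrom.size()));
    int value = 0;
    for (int i = start; i < start + len; ++i)
        value = 2 * value + chrom[i];
    return value;
}
```

With the settings in the text, a call such as `initPopulation(10, 8, rng)` would produce the 10 binary strings of length 8.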

Step 2: Chromosome Evaluation
Since our problem is a maximization problem, the fitness value of a chromosome is the value of the objective function $\Pi$. The fitness value of each chromosome is evaluated in this step in order to find the best (maximum) value of $\Pi$.

Step 3: Calculation of Population Convergence Percentage
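The population convergence percentage is calculated as $k = \frac{n}{N} \times 100\%$, where $n$ is the largest number of individuals sharing the same fitness value and $N$ is the number of individuals. The generation counter is updated by the formula $cg = cg + 1$.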
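A small C++ sketch of the convergence percentage is given below. It assumes the percentage is the largest group of identical fitness values divided by the population size, which agrees with the value of 10% quoted for 10 individuals with no repeated fitness. Fitness values are compared for exact equality, on the assumption that identical chromosomes yield bit-identical results. The function name `convergencePercent` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

// Percentage of the population that shares the most common fitness value.
// Assumption: equal chromosomes give exactly equal fitness values,
// so exact comparison of doubles is acceptable here.
double convergencePercent(const std::vector<double>& fitness) {
    assert(!fitness.empty());
    std::map<double, int> counts;
    int mostCommon = 0;  // size of the largest group seen so far
    for (double f : fitness)
        mostCommon = std::max(mostCommon, ++counts[f]);
    return 100.0 * mostCommon / static_cast<double>(fitness.size());
}
```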

Step 4: Termination Criteria
The genetic algorithm process is terminated when the generation counter $cg$ reaches the defined number of generations ($jg = 1000$) or the population convergence $k$ reaches the defined threshold level ($\theta = 90\%$). After the initialization of the population, $cg = 0$, and if no two individuals have the same fitness value, then $k = 10\%$, so the genetic algorithm process continues.
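The termination test can be written as a single predicate. The sketch below uses the limits quoted in the text (1000 generations, 90% convergence) as default arguments; the function name `shouldTerminate` and the choice to treat "reach" as "greater than or equal to" are assumptions.

```cpp
#include <cassert>

// True when the generation counter cg has reached jg, or when the
// convergence percentage k has reached the threshold theta.
bool shouldTerminate(int cg, double k, int jg = 1000, double theta = 90.0) {
    assert(cg >= 0 && k >= 0.0 && k <= 100.0);
    return cg >= jg || k >= theta;
}
```

Right after initialization, with `cg = 0` and `k = 10.0`, the predicate is false and the process continues.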
Step 5: Selection
A new set of 10 chromosomes is selected by spinning the roulette wheel 10 times, each time drawing a random number $r \in [0,1]$. The value of $r$ determines which individual is selected to replace the $k$-th current individual.
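Roulette wheel selection can be sketched as follows. Since the exact selection criteria are not shown in this section, the sketch assumes the common cumulative-probability rule: individual $k$ is chosen when $r$ falls in its slot of the cumulative distribution of fitness shares. It also assumes non-negative fitness values with a positive sum. The names `spinWheel` and `rouletteSelect` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Index of the individual whose cumulative-probability slot contains r.
// Assumption: fitness values are non-negative and their sum is positive.
std::size_t spinWheel(const std::vector<double>& fitness, double r) {
    assert(!fitness.empty());
    const double total = std::accumulate(fitness.begin(), fitness.end(), 0.0);
    assert(total > 0.0);
    double cumulative = 0.0;
    for (std::size_t k = 0; k < fitness.size(); ++k) {
        cumulative += fitness[k] / total;
        if (r <= cumulative) return k;
    }
    return fitness.size() - 1;  // guards against rounding when r is near 1
}

// Spins the wheel once per individual and returns the selected indices.
std::vector<std::size_t> rouletteSelect(const std::vector<double>& fitness,
                                        std::mt19937& rng) {
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::vector<std::size_t> chosen;
    chosen.reserve(fitness.size());
    for (std::size_t i = 0; i < fitness.size(); ++i)
        chosen.push_back(spinWheel(fitness, unit(rng)));
    return chosen;
}
```

Individuals with a larger share of the total fitness occupy a wider slot and are therefore selected more often.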
Step 6: Crossover
Crossover is applied to the new population. In this paper, the Single Point Crossover (SPX) method is used. SPX is the simplest crossover method, in which a random number is generated in the interval $[1, m]$ as the crossover point. According to [7], the SPX algorithm is as follows:
• Generate a random number in the interval $[1, m]$, and use this number as the starting position of the genes in a chromosome that will undergo crossover. Mark the alleles that will be crossed over (from the starting position to the last gene).
• Exchange the marked alleles between parent 1 and parent 2.
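The two SPX steps can be sketched in C++ as below, assuming both parents have the same length $m$ and that the crossover point is a 1-based position in $[1, m]$. The crossover point is passed in as an argument so that the random draw stays outside the function; the names `Chromosome` and `singlePointCrossover` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Chromosome = std::vector<int>;  // each gene is 0 or 1

// Exchanges the genes from 1-based position `point` to the last gene
// between the two parents, modifying both in place.
void singlePointCrossover(Chromosome& parent1, Chromosome& parent2, std::size_t point) {
    assert(parent1.size() == parent2.size());
    assert(point >= 1 && point <= parent1.size());
    const auto offset = static_cast<std::ptrdiff_t>(point - 1);
    std::swap_ranges(parent1.begin() + offset, parent1.end(),
                     parent2.begin() + offset);
}
```

Note that a crossover point of 1 exchanges the two chromosomes entirely, which follows from allowing the whole interval $[1, m]$.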
Step 7: Mutation
The number of genes in one generation is $m \times N = 8 \times 10 = 80$. If the mutation probability is set to $p_m$, then the number of mutations $nm$ is calculated as $nm = m \times N \times p_m$. In this simulation we use $p_m = 0.05$, so in the mutation process we randomly generate $nm = 4$ integers in the interval $[1, N \times m]$. These 4 numbers locate the genes that will be mutated. In the replacement process, an allele of 1 is replaced by 0 and an allele of 0 is replaced by 1. After the mutation process, the calculation for one generation is finished. The next step is to repeat the process from Step 2 until the termination criteria are satisfied.
Step 8: Decoding
Decoding is the process of converting the genes in a chromosome back into the original value (the value before encoding).
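The mutation count and the bit flip of Step 7 can be sketched as follows. The sketch assumes that gene positions are numbered from 1 across the whole population, chromosome by chromosome, and that positions are drawn independently, so the same gene may occasionally be drawn twice and flipped back. The names `mutationCount`, `flipGene`, and `mutate` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

using Chromosome = std::vector<int>;  // each gene is 0 or 1

// Number of mutations per generation: m * N * pm, rounded to an integer.
int mutationCount(int geneLength, int popSize, double pm) {
    return static_cast<int>(std::lround(geneLength * popSize * pm));
}

// Flips the gene at 1-based position pos, counted across the whole
// population (assumption: chromosome by chromosome, left to right).
void flipGene(std::vector<Chromosome>& pop, int pos) {
    assert(!pop.empty());
    const int m = static_cast<int>(pop.front().size());
    assert(pos >= 1 && pos <= m * static_cast<int>(pop.size()));
    int& gene = pop[(pos - 1) / m][(pos - 1) % m];
    gene = 1 - gene;  // 1 becomes 0 and 0 becomes 1
}

// Draws the mutation positions at random and flips the located genes.
void mutate(std::vector<Chromosome>& pop, double pm, std::mt19937& rng) {
    assert(!pop.empty());
    const int m = static_cast<int>(pop.front().size());
    const int N = static_cast<int>(pop.size());
    std::uniform_int_distribution<int> pick(1, m * N);
    const int nm = mutationCount(m, N, pm);
    for (int i = 0; i < nm; ++i)
        flipGene(pop, pick(rng));
}
```

With $m = 8$, $N = 10$, and $p_m = 0.05$, `mutationCount` gives the 4 mutations used in the simulation.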
We ran the GA implemented in C++ for 10 trials, each with a different initial population of solutions. We only consider a short time horizon, i.e. $T = 2$, and we ran 10 trials just to illustrate the method. The results of one of the trials are presented in Table 1. The solution automatically taken from Table 1 is $[Q_1, Q_2, Q_3, Q_4]^{\top} = [6, 9, 6, 3]^{\top}$. The total revenue from selling this population is 84,545,454 currency units. Figure 3 shows the plot of the total revenue from selling the goat population according to the GA results for consecutive generations, associated with the consecutive rows in Table 1, for all 10 different initial solutions. The final generation shows that at least 90% of the evolving solutions have already converged to the same solution. Table 2 shows the results for all 10 trials done in the experiment. The table reveals that the GA gives different sets of solutions that are nevertheless relatively close to each other in terms of the resulting total revenue, with a few exceptions.