Genetic Algorithms

1. Introduction

A genetic algorithm (GA) is a stochastic search technique based on the principles of biological evolution, natural selection, and genetic recombination, simulating "survival of the fittest" in a population of potential solutions or individuals. GAs are capable of globally exploring a solution space, pursuing potentially fruitful paths while also examining random points to reduce the likelihood of settling for a local optimum [Goldberg, 1989]. They are conceptually simple yet computationally powerful, making them attractive for use in complex domains, and have been demonstrated on a wide variety of problems. We begin by describing the basic genetic algorithm framework and introducing the vocabulary of the field. In subsequent sections we examine in more detail the issues which govern genetic algorithm design decisions and the trade-offs which have given rise to variations on the basic algorithm.

2. The Basic Genetic Algorithm

All genetic algorithms begin with the premise that all of the information required to specify the evolved system can be encoded as a fixed-length string of characters drawn from a finite alphabet. This string is referred to as a chromosome. Sites on the chromosome corresponding to specific characteristics of the encoded system are called genes, and may assume any of a set of possible values, or alleles. The instantiation of a string is its genotype, and the behavior pattern exhibited or expressed by the genotype is its phenotype. By analogy, each genotype represents a particular point in the solution space of the function being optimized. Niches are subdomains of the search space, and species are individuals with a common characteristic or set of characteristics.

 Figure 1. The Basic Genetic Algorithm

The genetic algorithm begins with a population of strings generated either randomly or from some set of known specimens, and cycles through three steps: evaluation, selection, and reproduction. Each string is evaluated according to a given performance criterion or fitness function, and assigned a fitness score. The fitness function must be designed such that the fitness score of an individual or group of individuals moves toward an extremum as its performance on a given task improves. Once all of the individuals have been assigned a fitness score, a decision must be made as to which individuals will be permitted to produce offspring and with what probability; this is the selection step. One selection strategy, for instance, is for all individuals to reproduce with probability proportionate to their fitness score, with mating pairs then selected randomly on this basis. In the third step, the reproduction of a pair of strings proceeds by copying bits from one string until a randomly triggered crossover point, after which bits are copied from the other string. As each bit is copied, there is also some probability that a mutation will occur. The most common mutation is the inversion (flip) of a bit; however, any function, including adding or deleting bits, can be labeled a mutation. An implicit component of reproduction is replacement: the determination of which members of the current population are forced to perish in order to make room for their offspring.
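The evaluation/selection/reproduction cycle can be sketched as follows. This is a hypothetical minimal implementation, not the only possible one: it assumes bit-string chromosomes, fitness-proportionate selection, one-point crossover, per-bit mutation, and whole-generation replacement, and the parameter values and the "one-max" fitness function in the example are illustrative only.

```python
import random

def basic_ga(fitness, n_bits=16, pop_size=20, generations=50,
             p_cross=0.6, p_mut=0.001, seed=0):
    """Evaluate / select / reproduce, repeated for a fixed number of generations."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]          # evaluation
        total = sum(scores)

        def pick():                                     # fitness-proportionate selection
            r, acc = rng.uniform(0, total), 0.0
            for ind, s in zip(pop, scores):
                acc += s
                if acc >= r:
                    return ind
            return pop[-1]

        nxt = []
        while len(nxt) < pop_size:                      # reproduction
            a, b = pick(), pick()
            # one-point crossover with probability p_cross, else copy parent a
            cut = rng.randrange(1, n_bits) if rng.random() < p_cross else n_bits
            child = a[:cut] + b[cut:]
            # bit-flip mutation, applied independently to each copied bit
            nxt.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = nxt                                       # replacement: whole generation at once
    return max(pop, key=fitness)

# illustrative run: maximize the number of 1-bits ("one-max")
best = basic_ga(sum)
```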

 

Figure 2. Genetic Operators

The cycle of evaluation, selection, and reproduction continues for a predetermined number of generations, or until an acceptable performance level is achieved. In addition, individuals in a population will become increasingly similar with each generation, converging upon a specific portion of the solution space due to the effects of repeated reproduction over a diminishing region. As the population becomes more homogeneous, the genetic operators become less effective. Therefore, a termination condition based upon the homogeneity of the population may also be specified.

Variations on the basic genetic algorithm differ in the details of how each of these steps are implemented. The specifics of a particular genetic algorithm implementation should be dictated by the nature of the given problem; a technique that works well in one situation may not be as appropriate in another. Since the components are essentially modular, each can be selected individually and assembled into the overall algorithm. In addition, hybrid methodologies (e.g. evolving a population broadly into a number of species, then further evolving each species more specifically separately) or dynamic algorithms (e.g. gradually changing the crossover and mutation probabilities) may be employed as required to solve the problem.

3. String Encoding

The two most critical factors in selecting a string encoding scheme are the range and granularity of each system component or parameter. Encoding a parameter in eight bits means that it can only assume 256 different values, regardless of the range. The system designer must assess the sensitivity of each parameter, and allocate enough bits to adequately cover the range to the desired sensitivity. Although lengthening the representation of a parameter increases the dimensionality of the search space and may make acceptable solutions statistically harder to find, we have found that this is often not the case, and that a solution may actually be converged upon more quickly. Of course, longer strings are also more computationally expensive to evolve, and the time required to complete the evolution should also be considered when designing the representation. In certain applications the length of the strings may be permitted to change during evolution, and all of the strings need not be the same length.

Typically, the system parameters are encoded into a string as the concatenation of their binary representations. Alternative representations include Gray codes, which guarantee that similar strings represent similar systems, or codes representing specific actions (e.g., the bit strings 00, 01, 10, and 11 might represent "stop," "turn right," "turn left," and "go straight," respectively). Floating-point numbers or integers may be used as parameters instead of bit strings when a high degree of accuracy is required. Trees and tables are often used for combinatorial optimization problems, data structures encoding conditional (IF-THEN-ELSE) information for expert system evolution, and tree structures for representing computer programs (genetic programming). Many of these techniques require specialized crossover and mutation operators, and are particularly useful when applied to specific problem domains.
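The binary encoding and the Gray-code alternative can be illustrated with a short sketch; the helper names and the [lo, hi] quantization scheme are our own illustrative choices, not a standard interface.

```python
def encode(value, lo, hi, n_bits):
    """Map a real parameter in [lo, hi] onto an n-bit binary string."""
    step = (hi - lo) / (2 ** n_bits - 1)
    return format(round((value - lo) / step), f'0{n_bits}b')

def decode(bits, lo, hi):
    """Recover the (quantized) parameter value from its bit string."""
    return lo + int(bits, 2) * (hi - lo) / (2 ** len(bits) - 1)

def to_gray(bits):
    """Binary-reflected Gray code: adjacent values differ in exactly one bit."""
    n = int(bits, 2)
    return format(n ^ (n >> 1), f'0{len(bits)}b')

# eight bits give only 256 distinct values over the range, whatever its width
bits = encode(0.3, 0.0, 1.0, 8)   # '0' and '1' characters, ready for crossover
```

Note that with the plain binary code, numerically adjacent values such as 127 and 128 differ in many bits; the Gray code removes exactly this "Hamming cliff."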

4. Evaluation

Systems produced using genetic algorithms are sensitive to the selection pressures imposed by the system designer. The range and distribution of the test cases must be representative of those encountered by the evolved system. If the test cases are unevenly distributed, then the evolved systems will fail to learn the outliers correctly. It is generally preferable to maintain a consistent set of test cases throughout evolution so that the optimal distribution of test cases can be preserved.

The performance criterion selected is probably the most important factor in successful evolution. It is usually best to measure the performance of an individual with respect to a single measure. If performance is predicated on more than one measure, their relative weights must be carefully balanced. Individuals that perform well on one criterion, to the exclusion of all others, may be favored if their performance exceeds that of the more "well rounded" individuals. Eventually, sensitivity to all but that single criterion may be selected out of the population, often with unintended results. All other things being equal, genetic algorithms will tend to take the "path of least resistance," evolving the most superficial traits first. For example, in one of the authors' early experiments, individuals were awarded a bonus that was inversely proportional to the length of the string. Within two generations, this bonus superseded the performance of the individuals on the actual problem, and the result was a population of single-bit individuals, none of which could solve the problem. Genetic algorithms tend to implement systems exactly as mandated by the evaluation function, even when that is not necessarily what is expected.

Finally, fitness value scaling must be considered. For instance, with a range of payoff values [0, 100], the population could very quickly converge to a point where most individuals score in the range [99, 100]. At this point, the selection differentials will not be sufficient to select between individuals. Therefore, the payoff values must be scaled to restore a usable differential. One way to accomplish this is to select individuals based on relative fitness within the population rather than on absolute raw fitness scores.
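The effect of relative (rank-based) fitness on a nearly converged population can be seen in a small sketch; the linear rank-to-probability mapping used here is one common choice among several.

```python
def proportionate_probs(scores):
    """Selection probabilities computed directly from raw fitness scores."""
    total = sum(scores)
    return [s / total for s in scores]

def rank_probs(scores):
    """Selection probabilities from rank within the population (rank 1 = worst)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    total = sum(ranks)
    return [r / total for r in ranks]

raw = [99.0, 99.5, 100.0]          # a nearly converged population
flat = proportionate_probs(raw)    # nearly uniform: almost no selection pressure
sharp = rank_probs(raw)            # 1/6, 2/6, 3/6: a usable differential
```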

5. Test Fitness Functions

While the true test of a genetic algorithm is its performance with respect to a given problem domain, it is possible to evaluate a methodology using a set of test fitness functions. While there is no "standard" set of functions, the following five functions are commonly used to characterize the behavior of a genetic algorithm, to highlight its computational strengths and weaknesses, and to attempt to determine its appropriateness to a given problem domain. Functions F1(x) and F2(x) are shown in Figure 3 and are defined by:

F1(x) = sin^6(5.1 * pi * x + 0.5)

F2(x) = exp(-4 (ln 2) (x - 0.0667)^2 / 0.64) * sin^6(5.1 * pi * x + 0.5)
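Transcribed directly from the two definitions above, these functions are straightforward to implement:

```python
import math

def F1(x):
    """Five equal peaks of height 1 on [0, 1]."""
    return math.sin(5.1 * math.pi * x + 0.5) ** 6

def F2(x):
    """F1 modulated by a Gaussian envelope centered near x = 0.0667,
    so the five peaks decrease in height from left to right."""
    return math.exp(-4 * math.log(2) * (x - 0.0667) ** 2 / 0.64) * F1(x)
```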

Figure 3. Five-Optima Sine Functions

Function F3(x, y), called "Shekel's Foxholes" (Figure 4), has twenty-five optima. The peaks are located on the grid intersections formed by the values x = [32, 16, 0, -16, -32] and y = [32, 16, 0, -16, -32]. All 25 peaks have the same base width. The height of peak i is given by the expression 0.002 + 1/i, where the peaks are located at (32, 32), (16, 32), ..., (-32, -32).

Figure 4. Shekel's Foxholes (Inverted)

Function F4(x, y), on the left side of Figure 5, contains two global optima with the same height and width, but located far apart. Function F5(x, y), the sample on the right, contains five optima with height, width, and location chosen at random in every run. Both of these functions are defined by:

 where p indicates the number of peaks in the function, (Xi, Yi) the coordinates of peak i, Ai the height of peak i, and Wi the width of the base of peak i. Table 1 summarizes the parameters for functions F4 and F5 shown in Figure 5.

Of course, we cannot, in general, assume or conclude that a genetic algorithm will perform well based on its results on a different problem; it is usually best to select a test function that is related to the problem domain and that tests the genetic algorithm in extreme cases. The central idea behind the use of these functions is to determine whether a genetic algorithm method will tend to converge to a single local optimum, to the global optimum, or to several global optima simultaneously. If the objective is to design a genetic algorithm to find all of the peaks (or the k-best peaks), then it is best to select a test function having many peaks, each having a different height and each located at a random location. Conversely, if the global maximum is desired, the test function should be designed to have a single large peak surrounded by several "teaser" peaks that would attract local convergence.

Table 1: Parameters for functions F4 and F5.

Figure 5. Functions F4(x, y) and F5(x, y)

6. Premature Convergence

As a stochastic search method, successful genetic algorithms must maintain a balance between the exploration of a wide area of the solution space so as to avoid premature convergence, and the exploitation of beneficial aspects of existing solutions in order to improve them. If weighted too far in favor of exploration, the algorithm essentially jumps randomly around the solution space. If weighted too far in favor of exploitation, the population becomes homogeneous and the genetic search degenerates to hill-climbing. Convergence and homogeneity are not typically problems in simulations that last only a few generations (up to 50 or 100); however, in simulations which last hundreds or thousands of generations, these issues become critical to the success of the experiment and must be considered in the design of the genetic algorithm.

One way to overcome this problem is to enforce constraints on reproduction which promote diversity. For instance, a string might be added to the population only if it differs from all other strings by some minimum distance metric, or individuals that are very similar may be prohibited from mating. The primary drawback of these approaches is that they significantly increase the time required to produce each generation. One encouraging approach is the use of crowding selection to encourage niching and speciation [Cedeño and Vemuri, 1994]. In this way, GAs can facilitate convergence to more than one extremum in a multi-modal search space.
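The minimum-distance admission rule can be sketched as follows, using Hamming distance over bit strings as the metric; the function names and the threshold value are illustrative assumptions.

```python
def hamming(a, b):
    """Number of bit positions in which two strings differ."""
    return sum(x != y for x, y in zip(a, b))

def admit(candidate, population, min_dist=2):
    """Add a candidate only if it differs from every current member by at
    least min_dist bits; this full scan is what slows each generation down."""
    if all(hamming(candidate, ind) >= min_dist for ind in population):
        population.append(candidate)
        return True
    return False
```

Each admission test is O(population size x string length), which is the cost referred to above.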

7. Selection

Once the population has been evaluated, individuals must be selected to produce the subsequent generation. The most common techniques are variations of fitness proportionate reproduction (FPR) or roulette wheel selection, where all of the individuals in the population are eligible for reproduction and the probability with which a string is selected for reproduction is proportional to its fitness score. The most notable drawback of FPR is that as the population converges, fitness score differentials may not be sufficient to exert the necessary selection pressures to continue improvement. Nevertheless, FPR is the basis for the selection process in most GA applications.

In ranking selection, the individuals in the population are sorted from best to worst according to their fitness values. Each individual is then assigned, in order, a value generated by a non-increasing function, and FPR is performed according to that assigned value. Ranking selection overcomes the fitness score differential problem by calculating probabilities based upon the relative performance of each individual within the population.

Another variation of FPR is tournament selection. Groups of individuals are chosen randomly from the population. The best-scoring individual from each group is selected to reproduce with the "winner" from another randomly selected group. This process is repeated until an entire new generation is produced. This method is closely related to ranking selection in that the number of times that an individual is selected for reproduction (i.e. wins a tournament) is roughly proportional to its rank in the population.
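Tournament selection is compact to express in code. The sketch below assumes a tournament size k of 3 and pairs the winners of two independent tournaments; both choices are illustrative.

```python
import random

def tournament(pop, fitness, k=3, rng=random):
    """Return the best-scoring member of a randomly chosen group of k."""
    return max(rng.sample(pop, k), key=fitness)

def select_pairs(pop, fitness, k=3, rng=random):
    """Winners of two independent tournaments mate; repeated until a
    full generation's worth of parent pairs has been chosen."""
    return [(tournament(pop, fitness, k, rng), tournament(pop, fitness, k, rng))
            for _ in range(len(pop) // 2)]
```

Raising k sharpens the selection pressure; with k equal to the population size, only the single best individual can ever win.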

The most significant drawback of selection techniques based on FPR is the possibility of premature convergence. Once a candidate with an above-average fitness score is identified, the FPR rule begins to favor that candidate from generation to generation, unless a better candidate emerges through crossovers and mutations. This means that FPR simply assigns an exponentially increasing number of mating chances to those members of the population that exhibit above-average survival rates. This is analogous to the process of convergence to a local minimum from an initial guess in the neighborhood of that minimum. These techniques are therefore best used for quickly locating one peak and converging toward it; however, they will often fail to thoroughly explore a complex search space with multiple peaks.

One selection technique which does work well for multi-modal search is crowding selection. In this method, every individual in a population has the same probability of reproducing, but its mate is selected as the individual to which it is most similar (as measured by phenotypic distance or performance, rather than genotypic distance or bit-wise similarity) in a randomly selected subset of the population. Therefore, individuals from the same niche are most likely to mate, although on occasion matings between individuals from different niches occur also. Here, the fitness function is used not to determine whether or how often an individual mates, but rather with whom.
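The mate-choice rule can be sketched as below. For simplicity this assumes the phenotype can be summarized as a single number whose absolute difference serves as the phenotypic distance; the function name and subset size are illustrative.

```python
import random

def crowding_mate(ind, pop, phenotype, subset_size=5, rng=random):
    """Choose ind's mate as the most phenotypically similar member of a
    randomly drawn subset of the population (excluding ind itself)."""
    others = [p for p in pop if p is not ind]
    subset = rng.sample(others, subset_size)
    return min(subset, key=lambda other: abs(phenotype(ind) - phenotype(other)))
```

Because the subset is random, a distant-niche mate is occasionally chosen, which is exactly the occasional cross-niche mating described above.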

An elitist strategy which automatically singles out the highest scoring strings and copies them directly into the next generation may be combined with any of the above selection techniques to achieve strictly non-decreasing performance maxima throughout the genetic search. This also guarantees that the highest scoring instance in the most recent generation is the best solution found by the genetic search (assuming that the performance criteria do not change from generation to generation).
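An elitist step fits naturally at the point where the next generation is assembled; in this sketch, the breeding of the remaining slots is abstracted behind a caller-supplied function, which is our own illustrative interface.

```python
def next_generation_with_elitism(pop, scores, breed, n_elite=1):
    """Copy the n_elite best strings unchanged into the next generation,
    then fill the remaining slots with bred offspring."""
    ranked = sorted(zip(scores, pop), key=lambda t: t[0], reverse=True)
    elite = [list(ind) for _, ind in ranked[:n_elite]]
    return elite + breed(len(pop) - n_elite)
```

Since the best string is copied verbatim each generation, the best score seen so far can never decrease.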

8. Replacement

In implementations where individuals can only reproduce once, the pair of generated offspring usually replaces its parents. However, when an individual can participate in the generation of several offspring, it is most convenient to produce the entire next generation in a temporary buffer and then replace the current generation all at once. Other variations replace individuals with the lowest scores or the individuals that are most similar to each other. In some GA implementations where fitness is determined by the number of similar individuals rather than the fitness score of any given individual, a mating pair may be replaced by a large number of offspring.

9. Genetic Parameters

Besides the selection strategy, an acceptable balance between exploration and exploitation can also be accomplished through appropriate genetic parameter settings (population size, crossover, and mutation rates). Unfortunately, there is no way to definitively specify genetic parameters. Rather, they are usually set to "traditionally accepted values" or adjusted by trial and error based upon the experience of the system designer.

The first systematic study of genetic parameters was performed by DeJong [1975]. In his classic experiment, he applied genetic algorithms using various parameter settings to several tasks and determined an "acceptable" set of parameters which is still used by many researchers today: crossover probability 60%, mutation probability 0.1%, and population size of 50 to 100. While these values are typically taken as the starting point for genetic experiments, they are often tuned by trial and error to improve performance, and some studies have found them to be inadequate. This should not be surprising, since each parameter is responsible for controlling some aspect of the genetic search, and the relative degrees of exploration and exploitation required may differ depending on the topology of the solution space. Several other attempts have been made to formalize the relationship between population size, crossover rate, and mutation rate, but none have been consistently successful in practice. Ultimately, it is left to the individual system designer to set the genetic parameters, since each selection involves tradeoffs between competing (and often incompatible) requirements, and success depends upon striking a balance between these factors. In fact, simulations which succeed quickly with the appropriate genetic parameter settings often fail completely when the parameters are specified even slightly differently.

Some general trends, however, have been identified and a few generalizations can be made. Genetic search tends to improve with population size as long as diversity is maintained. In small populations, combining a high mutation rate with an elitist strategy tends to produce wide exploration while still "remembering" acceptable solutions. In most cases, the probability of mutation remains constant across all bits and over all generations. However, increasing the probability of mutation in later generations can sometimes overcome the effects of premature convergence, and varying the probability within each string may permit selected exploration of specific parts of the genome while leaving intact other portions of the string, possibly identified as being more successful. Overall, crossover plays its most important role near the beginning of the simulation, when the population is diverse and good performance depends on constructing new combinations and allowing them to survive. Mutation then becomes more important later in the evolutionary process, when a fairly homogeneous population seeks to make incremental improvements. In our experiments [Cooper, 1994] we have found success in varying both the crossover and mutation rates within a predefined range. Another approach that has received recent attention is the evolution of the genetic parameters along with each individual in the population. In other words, each string is prefaced with bits that encode each of the genetic parameters for that individual.
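One simple way to realize a mutation rate that rises in later generations is a linear schedule between two bounds; the bounds and the linear form here are illustrative assumptions, not values from the studies cited above.

```python
def mutation_rate(generation, max_generations, low=0.001, high=0.05):
    """Linearly raise the per-bit mutation probability over the course of
    the run, shifting emphasis from crossover early to mutation late."""
    frac = generation / max_generations
    return low + frac * (high - low)
```

The same template applies to the crossover rate, typically run in the opposite direction (high early, lower late).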

10. References

Cedeño, W. and V. Vemuri (1994). Multi-niche crowding in genetic algorithms and its application to the assembly of DNA restriction fragments. Evolutionary Computation, Vol. 2, No. 4, pp. 321-345. MIT Press.

Cooper, M.G. (1994). Genetic Design of Rule-Based Fuzzy Controllers. PhD dissertation, Department of Computer Science, University of California, Los Angeles.

DeJong, K.A. (1975). Analysis of the Behavior of a Class of Genetic Adaptive Systems. PhD dissertation, Department of Computer and Communication Sciences, University of Michigan.

Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley.

 


rvemuri@ucdavis.edu
Thursday the 15th, August 1996