Recent Articles



































Genetic algorithm



         


A genetic algorithm (GA) is an algorithm used to find approximate solutions to difficult-to-solve problems through application of the principles of evolutionary biology to computer science. Genetic algorithms use biologically-derived techniques such as inheritance, mutation, natural selection, and recombination. Genetic algorithms are a particular class of evolutionary algorithms.

Genetic algorithms are typically implemented as a computer simulation in which a population of abstract representations (called chromosomes) of candidate solutions (called individuals) to an optimization problem evolves toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but different encodings are also possible. The evolution starts from a population of completely random individuals and happens in generations. In each generation, multiple individuals are stochastically selected from the current population, modified (mutated or recombined) to form a new population, which becomes current in the next iteration of the algorithm.

[Top]

Operation of a GA

The problem to be solved is represented by a list of parameters which can be used to drive an evaluation procedure, called chromosomes or genomes. Chromosomes are typically represented as simple strings of data and instructions, in a manner not unlike instructions for a von Neumann machine, although a wide variety of other data structures for storing chromosomes have also been tested, with varying degrees of success in different problem domains.

Initially several such parameter lists or chromosomes are generated. This may be totally random, or the programmer may seed the gene pool with "hints" to form an initial pool of possible solutions. This is called the first generation pool.

During each successive generation, each organism (or individual) is evaluated, and a value of goodness or fitness is returned by a fitness function. The pool is sorted, with those having better fitness (representing better solutions to the problem) ranked at the top. Notice that "better" in this context is relative, as initial solutions are all likely to be rather poor.

The next step is to generate a second generation pool of organisms, which is done using any or all of the genetic operators: selection, crossover (or recombination), and mutation. A pair of organisms are selected for breeding. Selection is biased towards elements of the initial generation which have better fitness, though it is usually not so biased that poorer elements have no chance to participate, in order to prevent the solution set from converging too early to a sub-optimal or local solution. There are several well-defined organism selection methods; roulette wheel selection and tournament selection are popular methods.

Following selection, the crossover (or recombination) operation is performed upon the selected chromosomes. Most genetic algorithms will have a single tweakable probability of crossover (Pc), typically between 0.6 and 1.0, which encodes the probability that two selected organisms will actually breed. A random number between 0 and 1 is generated, and if it falls under the crossover threshold, the organisms are mated; otherwise, they are propagated into the next generation unchanged. Crossover results in two new child chromosomes, which are added to the second generation pool. The chromosomes of the parents are mixed in some way during crossover, typically by simply swapping a portion of the underlying data structure (although other, more complex merging mechanisms have proved useful for certain types of problems.) This process is repeated with different parent organisms until there are an appropriate number of candidate solutions in the second generation pool.

The next step is to mutate the newly created offspring. Typical genetic algorithms have a fixed, very small probability of mutation (Pm) of perhaps 0.01 or less. A random number between 0 and 1 is generated; if it falls within the Pm range, the new child organism's chromosome is randomly mutated in some way, typically by simply randomly altering bits in the chromosome data structure.

These processes ultimately result in a second generation pool of chromosomes that is different from the initial generation. Generally the average degree of fitness will have increased by this procedure for the second generation pool, since only the best organisms from the first generation are selected for breeding. The entire process is repeated for this second generation: each organism in the second generation pool is then evaluated, the fitness value for each organism is obtained, pairs are selected for breeding, a third generation pool is generated, etc. The process is repeated until an organism is produced which gives a solution that is "good enough".

A slight variant of this method of pool generation is to allow some of the better organisms from the first generation to carry over to the second, unaltered. This form of genetic algorithm is known as an elite selection strategy.


[Top]

Observations

There are several general observations about the generation of solutions via a genetic algorithm:

[Top]

Variants

The simplest algorithm represents each chromosome as a bit string. Typically, numeric parameters can be represented by integers, though it is possible to use floating point representations. The basic algorithm performs crossover and mutation at the bit level.

Other variants treat the chromosome as a list of numbers which are indexes into an instruction table, nodes in a linked list, hashes, objects, or any other imaginable data structure. Crossover and mutation are performed so as to respect data element boundaries. For most data types, specific variation operators can be designed. Different chromosomal data types seem to work better or worse for different specific problem domains.

There have also been attempts to introduce other evolutionary operations such as movement of genes, in the manner of transposons. These movements change the schema of the chromosome making effects of linkage visible.

There are also parallel implementations of genetic algorithms that use computers as 'islands' and implement migrations of populations from one computer to another over a network.

Some other variants introduce a variable fitness function. In classical genetic algorithms, the fitness function is unchanged over time. In simulated annealing the fitness function is changed over time and in artificial life, the fitness of any individual is affected by all individuals in the population with which it interacts.

[Top]

Efficiency

Genetic algorithms are known to produce good results for some problems. Their major disadvantage is that they are relatively slow, being very computationally intensive compared to other methods, such as random optimization.

Recent speed improvements have focused on speciation, where crossover can only occur if individuals are closely-enough related.

Genetic algorithms are extremely easy to adapt to parallel computing and clustering environments. One method simply treats each node as a parallel population. Organisms are then migrated from one pool to another according to various propagation techniques.

Another method, the Farmer/worker architecture, designates one node the farmer, responsible for organism selection and fitness assignment, and the other nodes as workers, responsible for recombination, mutation, and function evaluation.

[Top]

Problem domains

Problems which appear to be particularly appropriate for solution by genetic algorithms include timetabling and scheduling problems, and many scheduling software packages are based on GAs. GAs have also been applied to engineering. Genetic algorithms are often applied as an approach to solve global optimization problems. Genetic algorithms have been successfully applied to the study of neurological evolution (see NeuroEvolution by Augmented Topologies).

As a general rule of thumb genetic algorithms might be useful in problem domains that have a complex fitness landscape as recombination is designed to move the population away from local minima that a traditional hill climbing algorithm might get stuck in.

[Top]

History

John Holland was the pioneering founder of much of today's work in genetic algorithms, which has moved on from a purely theoretical subject (though based on computer modelling) to provide methods which can be used to actually solve some difficult problems today.

[Top]

Pseudo-code Algorithm

Choose initial population Evaluate each individual's fitness Repeat Select best-ranking individuals to reproduce Mate pairs at random Apply crossover operator Apply mutation operator Evaluate each individual's fitness Until terminating condition (see below)

Terminating conditions often include:

[Top]

Applications

[Top]

Related techniques

Genetic programming is a related technique developed by John Koza, in which computer programs, rather than function parameters, are optimised. Genetic programming often uses tree-based internal data structures to represent the computer programs for adaptation instead of the list, or array, structures typical of genetic algorithms. Genetic programming algorithms typically require running time that is orders of magnitude greater than that for genetic algorithms, but they may be suitable for problems that are intractable with genetic algorithms.

Interactive genetic algorithms are genetic algorithms that use human evaluation. They are usually applied to domains where it is hard to design a computational fitness function, for example, evolving images, music, artistic designs and forms to fit user's aesthetic preference.

[Top]

See also

artificial life, automatic label placement, bio-inspired computing, combinatorial optimization, eight queens puzzle, Evolutionary art, Evolutionary computation, fitness landscape, genetic programming, Human-based genetic algorithm, Interactive evolutionary computation, Interactive genetic algorithm, local search, meta heuristics, neuroevolution, NEAT, simulated annealing, tabu search.

[Top]

References

[Top]




  View Live Article   This article is from Wikipedia. All text is available under the terms of the GNU Free Documentation License