Genetic Optimization / Maximization
This is part two in a series of articles on Genetic Optimization / Maximization. Please visit part1 to get up to speed.
What is it?
Genetic based optimization uses the principals of survival of the fittest in order to determine the ideal parameters. This happens by first creating a random population of parameter sets (or starting with a seed population), and evaluating each parameter set’s performance. The algorithm then “mates” the highest performing parameter sets together, and includes the genetic concepts of mutation and crossover to create “children”. Wikipedia does a great job of covering what Genetic Algorithms are, so I won’t repeat the theory, but will dive into their application to a rules based system.
The basic components of Genetic Algorithm are:
- Encoding & Mating
- The Fitness Function
- Training
I will attack each of these in a different article, and begin here with encoding & mating.
Applying the Genetic Algorithm to our Rules
The basic concept here is that we have to “encode” our parameters in such a way that we can apply crossover and mutation during each mating. This will depend largely on the number of parameters you are tuning for, but there are really only two options: binary encoding or chain encoding.
Binary Encoding
Binary encoding is when you convert all of your parameters to their raw binary form and string them together to effectively create a genome for your parameter set. In our example above, we are using two double types, which encoded into binary form would consist of 128 bits of data. Thus to create the random seed population, a uniform distribution is used to randomly select each of the 128 bits of data. The data is then read back as doubles and each parameter set is tested, and their results are fed into the fitness function to select the parents of the next generation.
The nice part of binary encodings is that they are very easily mated. Once the two parents have been selected (I will cover the fitness function below) the mating is performed with cross over, effectively selecting groupings from each parent, and then with mutation, randomly changing some of the children’s bits. The algorithm for this is pseudocoded below.
Function bit[] BitEncodeMate(bit[] mom, bit[] dad,
double crossoverPercent, double mutationPercent)
{
bit[] child = new bit[mom.length];
bool fromMom = true;
for(int ix = 0; ix < mom.length; ix++)
{
if(RandDouble(0,1) < crossoverPercent)
fromMom = !fromMom;
if(fromMom)
child[ix] = mom[ix];
else
child[ix] = dad[ix];
if(RandDouble(0,1) < mutationPercent)
child[ix] = !child[ix];
}
return child;
}
As you can see, this function is a very simplistic model to mate two parents. I introduced two variables, crossoverPercent, and mutationPercent. These variables are critical to the performance of the genetic algorithm, and are the topic for another blog post. Suffice it to say I am using a simple uniform distribution model (RandDouble), and crossoverPercent will drive how many crossovers occur, and the mutation percent should be quite low (think 5% or below) in order to provide good coverage, but still to converge.
The trouble with the bitwise encoding is that you cannot bound your variables easily. This encoding will randomly cover the entire range of the double variables, including infinity and several uninterpretable(NaN) values. It could be bounds checked after a mate to check validity and simply constrain all of the children to values within acceptable ranges, but as you can see this allows you very little control over the distribution of each parameter.
To summarize, bit encoding provides a very easy way to encode binary data and allows for crossover to occur despite a small parameter space. Unfortunately, bit encoding provides very little control over the ranges of the parameters, nor the distribution that the mutations will take. Despite this, it is still the most common form of encoding, and works much better for integers than for floating point numbers. You can thus constrain the number of bits for a particular integer to constrain its range, but cannot control its distribution.
Chain Encoding
Chain encoding works best for a very high parameter space where large speed gains may be had with bounding each parameter. This is the more typical case for rules based engines. While the function I provided above only has two parameters, it is often much more common to have 10 to 20 parameters which need tuning.
Here the parameter sets are simply encoded as an array of doubles (or whatever datatypes you want), and the crossover operation is taken by parameter rather than by bit. The big advantage now is that mutation can be carefully controlled. If you are able to provide a close starting point, the algorithm may use a normal probability distribution to mutate about the current value. You may also continue to operate on a uniform distribution, and simply constrain each parameter to its range. The following is the pseudocode for a function which mates two parents that are encoded with chain encoding, and whose mutations are constrained to a range.
Function double[] ChainEncodeMate(double[] mom, double[] dad,
double crossoverPercent, double mutationPercent,
double[] maxRange, double[] minRange)
{
double[] child = new double[mom.length];
bool fromMom = true;
for(int ix = 0; ix < mom.length; ix++)
{
if(RandDouble(0,1) < crossoverPercent)
fromMom = !fromMom;
if(fromMom)
child[ix] = mom[ix];
else
child[ix] = dad[ix];
if(RandDouble(0,1) < mutationPercent)
child[ix] = RandDouble(minRange[ix], maxRange[ix]);
if(child[ix] < minRange[ix])
child[ix] = minRange[ix];
if(child[ix] > maxRange[ix])
child[ix] = maxRange[ix];
}
return child;
}
Clearly this code is very similar to that used for binary encoding, but the power to control each parameter can make a dramatic effect on run time (again, this will be analyzed in a later article). This algorithm would have some trouble with narrow parameter scopes though, as in the example above, everytime a parent mates, effectively the child would very closely resemble the parents, or simply conduct a random walk of the parameter space. It is again also important to control the mutationPercent, while we are now in control of the distribution, we do not want to modify all of the parameters very often. As much as possible, we would prefer to allow the mutation to happen in one dimension, and to let the parental variability converge to the solution.
The above chart illustrates the results of 1,674 generations of running against a rules system for one of seven parameters. As you can see, the mutations caused the parameter to be exposed over a large part of its given range (0 to 100) while generally sticking with values it has found to be profitable (hence the vertical piers).
That’s all for encoding and Mating – next up we will cover the fitness function!
Recent Comments