
Trading System Framework

The core of our architecture rests on a universal trading system framework. This framework abstracts all of the basic market interfaces, allowing us to write generic strategies that run on any market, including simulation.

[Figure: Trading System Framework architecture diagram, 2010-11-22]

As you can see in the central box of the diagram above, our trading system abstracts several core functionalities.

  • Settings Management – the entire trading system is configured via a straightforward XML configuration file. The actual storage and management of this file is abstracted by the particular profile. For live running, these settings are version-controlled and managed in a central, replicated SQL database. For simulation, they are stored as a simple file provided to a console-based simulator. For optimization purposes, these files serve as the basis for chromosomes in the genetic optimizer (with an optimization file providing the constraints for the search space). At the end of the day: develop a simple, generic settings-management system that can be retargeted to different backends.
  • Contract Manager / Base Contract – The core component of any system is the instrument you are trading or measuring. The contract manager provides position-management and risk-management abstractions, as well as contract-locating functionality. Ultimately, any object that requires a contract goes through the contract manager and is given an abstraction of a base contract. The base contract can be a futures contract, an equity, a bond, etc. This provides a universal interface to subscribe to market data and to issue and monitor orders.
  • Strategy Engine / Base Strategies – The strategy engine is the very heart of any trading system. This basic class subscribes to message pumps and processes the messages to handle orders. It is the most versatile object in the trading system, allowing for nearly any type of strategy.
  • Charting – Few systems put enough emphasis on thorough charting, but I find it critical for visualizing the results of a simulation, as well as for determining what is happening during live trading. All contracts and strategies implement a simple IChartable interface that lets them output highly configurable charts, right down to the Graphics handles. This allows the charts to be presented in a live Windows Forms view, or painted to a Bitmap class for saving to disk.
  • Logging – At the end of the day, traceability is critical. Every trade made needs to be serialized to disk or a database in order to reconcile with your clearing house. Furthermore, every strategy needs to output useful tracing information to aid in debugging. Beyond the obvious tracing, strategies also need to implement a reporting interface that provides live state information to the user interface, so the user can determine how a strategy is behaving and, if necessary, modify its parameter set or debug it. This, again, is abstracted just like settings and charting, routing to different destinations based on the target of the trading engine: for simulation it outputs to the simulation results, whereas in live trading we write to easily queried database engines.
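
To make the settings abstraction concrete, here is a minimal Python sketch of how a flat XML settings file might be parsed into a dictionary of parameters — the same flat structure the genetic optimizer can later treat as a chromosome. The `load_settings` name and the element layout are my own illustrative assumptions, not the system's actual interface:

```python
import xml.etree.ElementTree as ET

def load_settings(xml_text):
    """Parse a flat XML settings document into a dict of parameters.

    Numeric values become floats so the same dict can later serve as a
    chromosome in the genetic optimizer; everything else stays a string.
    """
    root = ET.fromstring(xml_text)
    settings = {}
    for node in root:
        text = (node.text or "").strip()
        try:
            settings[node.tag] = float(text)
        except ValueError:
            settings[node.tag] = text
    return settings

example = """
<settings>
  <shortMATime>5</shortMATime>
  <longMATime>20</longMATime>
  <exchange>SIM</exchange>
</settings>
"""
params = load_settings(example)
# params == {'shortMATime': 5.0, 'longMATime': 20.0, 'exchange': 'SIM'}
```

The same dict could then be filled from a SQL row for live running, or mutated directly by the optimizer, without the consuming code knowing the difference.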

Next up I want to cut into application design and multithreading. There is a lot to cover, and I am swamped, so expect the articles to continue to appear as I have time. And if you have any questions, feel free to email me.

Genetic Optimization and Maximization – Fitness Function

This is part 3 in a series on Genetic Optimization; please visit part 1 and part 2 to catch up.

What Does the Fitness Function Do?

The fitness function is the basis of the “survival of the fittest” premise of genetic algorithms. It is responsible for evaluating each parameter set and choosing which sets mate. The most difficult part is designing the function so that it produces parameters that remain reliable and effective on data outside of the training set.

It helps to consider nature’s fitness function: we are the result of millions of years of genetic optimization, yet we retain neither the brawn of a gorilla, nor the size of a sauropod (dinosaurs estimated at up to 209 tons), nor the predatory skills of a Tyrannosaurus. A genetic algorithm does not just optimize for the strongest creature, but for the creature that can survive and thrive in all circumstances. Dinosaurs were clearly at the top of the food chain and thriving 65 million years ago, yet they were easily outlived by insects, whose adaptability let them survive the harsh aftermath of the Cretaceous–Tertiary extinction event. (Can you tell I have been researching a lot about dinosaurs since starting this blog?)

My point is that you need a fitness function that yields a parameter set which performs well during a bull run and a bear run, and also survives a market crash. A parameter set that makes a fortune on rallies but bleeds on sideways markets and reversals is no better off than the dinosaurs: ultimately it will perish, taking a lot of your equity with it.

What Makes a Good Fitness Function?

A fitness function can be as simple as the profit generated by running your rules over the training data, but that is likely to exploit one-time events in the data rather than placing an emphasis on reliability.

A good fitness function does the following:

  • Understands Risk – it evaluates not only profit, but how much capital the rules placed at risk to earn that profit.
  • Punishes Losses Heavily – by punishing a parameter set more heavily for losses than it is rewarded for profits, you train it to favor consistent profits over volatile returns.
  • Punishes High Risk – any set of rules can earn a lot on a good day by loading up on beta; you want to train your algorithm to seek true alpha.
  • Does Not Punish Zero Gains – it is important to let your algorithm learn when to enter the market and when to stay clear. Providing some incentive to simply not take a loss can be just as important as providing an incentive to take a large gain.
  • Runs on a Reasonable Time Frame – a fitness function should evaluate each day (or possibly shorter period) of sample data on its own, accumulating the results for a particular parameter set.
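
As a rough illustration of these guidelines, here is a sketch of a per-day fitness score in Python. The function name, the `loss_weight` knob, and the input format (parallel lists of daily P&L and capital at risk) are my own assumptions, not the system's actual interface:

```python
def fitness(daily_pnl, capital_at_risk, loss_weight=2.0):
    """Score one parameter set from its per-day results.

    daily_pnl       -- list of daily profits (negative = loss), one per day
    capital_at_risk -- capital the rules exposed each day (> 0 on days the
                       strategy traded, 0 on days it stayed out)
    loss_weight     -- losses are penalized this many times as heavily as
                       profits are rewarded, favoring consistency
    """
    score = 0.0
    for pnl, risk in zip(daily_pnl, capital_at_risk):
        if risk == 0:
            continue  # staying out of the market is not punished
        r = pnl / risk  # return per unit of capital placed at risk
        score += r if r >= 0 else loss_weight * r
    return score

# A volatile set and a steady set with the same total profit of 200:
volatile = fitness([300, -200, 300, -200], [1000] * 4)
steady = fitness([50, 50, 50, 50], [1000] * 4)
# steady scores higher, because its profit came without drawdowns
```

Normalizing by capital at risk covers the first and third guidelines, the asymmetric `loss_weight` covers the second, skipping zero-risk days covers the fourth, and scoring day by day covers the fifth.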

Following these guidelines, the fitness function must rank each parameter set and select mates.

Mate Selection

Once the parameter sets have been ranked, they must undergo selection. The obvious approach would be to let only the top-ranked parameter sets mate, but that can ignore other minima that lesser-ranked sets are exploring.

The chart above illustrates the importance of occasionally exploring lesser-ranked parameter sets. The green lines represent the highest-ranked parameter sets, but as we can see in the parameter space, the red line sits at the base of the global minimum while the green lines are merely exploring local minima. The best way to allow for this is to select mates using an absolute-valued normal distribution over the ranks. The choice of probability distribution and standard deviation has a large effect on how fast a genetic algorithm converges, an analysis of which will come in a future article. For now, the normal distribution proves more than adequate.
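
A sketch of that selection rule in Python — the population is assumed to be sorted best-first, and the standard deviation (here a fraction of the population size) is an arbitrary choice that, as noted, deserves its own analysis:

```python
import random

def select_mate_index(population_size, sigma_fraction=0.2):
    """Pick an index into a fitness-ranked population (0 = best).

    Draws |N(0, sigma)| over the ranks, so the best-ranked sets are chosen
    most often while lower-ranked sets still mate occasionally, keeping
    other regions of the parameter space in play.
    """
    sigma = sigma_fraction * population_size
    while True:
        ix = int(abs(random.gauss(0.0, sigma)))
        if ix < population_size:  # redraw the rare out-of-range sample
            return ix

random.seed(7)
picks = [select_mate_index(100) for _ in range(10000)]
# Top ranks dominate, but the tail of the distribution still gets picked:
top_share = sum(1 for p in picks if p < 20) / len(picks)
```

With sigma at 20% of the population, roughly two thirds of all picks land in the top fifth of the ranking, while the rest explore further down.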

As you can see, the fitness function has a huge impact on the output of your maximization: it defines what the ideal parameter set should do.

Tune in for more Genetic Optimization in Part 4 where I will talk about Training.

Genetic Optimization and Maximization – Encoding

Genetic Optimization / Maximization

This is part two in a series of articles on Genetic Optimization / Maximization. Please visit part 1 to get up to speed.

What is it?

Genetic optimization uses the principles of survival of the fittest to determine ideal parameters. It works by first creating a random population of parameter sets (or starting with a seed population) and evaluating each parameter set’s performance. The algorithm then “mates” the highest-performing parameter sets together, applying the genetic concepts of mutation and crossover to create “children”. Wikipedia does a great job of covering what genetic algorithms are, so I won’t repeat the theory, but will dive into their application to a rules-based system.

The basic components of a Genetic Algorithm are:

  1. Encoding & Mating
  2. The Fitness Function
  3. Training

I will attack each of these in a separate article, beginning here with encoding & mating.

Applying the Genetic Algorithm to our Rules

The basic concept here is that we have to “encode” our parameters in such a way that we can apply crossover and mutation during each mating. This will depend largely on the number of parameters you are tuning for, but there are really only two options: binary encoding or chain encoding.

Binary Encoding

Binary encoding is when you convert all of your parameters to their raw binary form and string them together to effectively create a genome for your parameter set. In our example above, we are using two double types, which encoded into binary form would consist of 128 bits of data. Thus to create the random seed population, a uniform distribution is used to randomly select each of the 128 bits of data. The data is then read back as doubles and each parameter set is tested, and their results are fed into the fitness function to select the parents of the next generation.
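
As a concrete sketch, Python's `struct` module can demonstrate that 128-bit genome for two doubles. The helper names here are mine, purely for illustration:

```python
import random
import struct

def doubles_to_bits(params):
    """Encode a tuple of doubles as a flat list of 0/1 bits (64 per double)."""
    raw = struct.pack(">%dd" % len(params), *params)
    return [(byte >> shift) & 1 for byte in raw for shift in range(7, -1, -1)]

def bits_to_doubles(bits):
    """Decode the flat bit genome back into doubles."""
    raw = bytes(
        sum(bit << shift for bit, shift in zip(bits[i:i + 8], range(7, -1, -1)))
        for i in range(0, len(bits), 8)
    )
    return struct.unpack(">%dd" % (len(bits) // 64), raw)

genome = doubles_to_bits((5.0, 20.0))   # two parameters -> 128 bits
assert len(genome) == 128
assert bits_to_doubles(genome) == (5.0, 20.0)

# A random individual for the seed population: uniformly chosen bits
seed = [random.randint(0, 1) for _ in range(128)]
```

The round trip is bit-exact, and the uniformly random `seed` individual is exactly the kind of genome that, as noted below, can decode to infinities or NaN.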

The nice part of binary encodings is that they are very easily mated. Once the two parents have been selected (I will cover the fitness function in the next article), mating is performed with crossover, effectively selecting groupings of bits from each parent, and then with mutation, randomly flipping some of the child’s bits. The algorithm is pseudocoded below.

Function bit[] BitEncodeMate(bit[] mom, bit[] dad,
                             double crossoverPercent, double mutationPercent)
    bit[] child = new bit[mom.length];
    bool fromMom = true;
    for(int ix = 0; ix < mom.length; ix++)
        if(RandDouble(0,1) < crossoverPercent)
            fromMom = !fromMom;
        if(fromMom)
            child[ix] = mom[ix];
        else
            child[ix] = dad[ix];
        if(RandDouble(0,1) < mutationPercent)
            child[ix] = !child[ix];
    return child;

As you can see, this function is a very simplistic model for mating two parents. I introduced two variables, crossoverPercent and mutationPercent. These variables are critical to the performance of the genetic algorithm and are the topic of another blog post. Suffice it to say I am using a simple uniform distribution (RandDouble); crossoverPercent drives how many crossovers occur, and mutationPercent should be quite low (think 5% or below) in order to provide good coverage while still converging.
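
For reference, here is a runnable Python sketch of the bit-encoded mating step, with crossover toggling which parent supplies each bit and mutation flipping individual bits:

```python
import random

def bit_encode_mate(mom, dad, crossover_percent, mutation_percent):
    """Mate two bit genomes (lists of 0/1) into a child genome."""
    child = []
    from_mom = True
    for m, d in zip(mom, dad):
        if random.random() < crossover_percent:
            from_mom = not from_mom  # switch to copying from the other parent
        bit = m if from_mom else d
        if random.random() < mutation_percent:
            bit ^= 1  # rare random flip keeps the search exploring
        child.append(bit)
    return child

random.seed(1)
mom = [0] * 16
dad = [1] * 16
child = bit_encode_mate(mom, dad, crossover_percent=0.2, mutation_percent=0.0)
# With mutation off, every child bit comes from one parent or the other
```

Using all-zero and all-one parents makes the crossover pattern directly visible in the child: runs of 0s came from mom, runs of 1s from dad.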

The trouble with bitwise encoding is that you cannot easily bound your variables. This encoding will randomly cover the entire range of the double type, including infinities and uninterpretable (NaN) values. The children could be bounds-checked after a mate and simply constrained to values within acceptable ranges, but as you can see, this gives you very little control over the distribution of each parameter.

To summarize, bit encoding provides a very easy way to encode binary data and allows crossover to occur despite a small parameter space. Unfortunately, it provides very little control over the ranges of the parameters or the distribution that mutations will take. Despite this, it is still the most common form of encoding, and it works much better for integers than for floating-point numbers: you can constrain the number of bits for a particular integer to constrain its range, but you still cannot control its distribution.

Chain Encoding

Chain encoding works best for a very large parameter space, where big speed gains can be had by bounding each parameter. This is the more typical case for rules-based engines. While the function I provided above has only two parameters, it is much more common to have 10 to 20 parameters that need tuning.

Here the parameter sets are simply encoded as an array of doubles (or whatever data types you want), and the crossover operation is performed per parameter rather than per bit. The big advantage is that mutation can now be carefully controlled. If you can provide a close starting point, the algorithm may use a normal probability distribution to mutate about the current value. You may also continue to operate on a uniform distribution and simply constrain each parameter to its range. The following pseudocode mates two parents that are chain-encoded and whose mutations are constrained to a range.

Function double[] ChainEncodeMate(double[] mom, double[] dad,
                                  double crossoverPercent, double mutationPercent,
                                  double[] maxRange, double[] minRange)
    double[] child = new double[mom.length];
    bool fromMom = true;
    for(int ix = 0; ix < mom.length; ix++)
        if(RandDouble(0,1) < crossoverPercent)
            fromMom = !fromMom;
        if(fromMom)
            child[ix] = mom[ix];
        else
            child[ix] = dad[ix];
        if(RandDouble(0,1) < mutationPercent)
            child[ix] = RandDouble(minRange[ix], maxRange[ix]);
        if(child[ix] < minRange[ix])
            child[ix] = minRange[ix];
        if(child[ix] > maxRange[ix])
            child[ix] = maxRange[ix];
    return child;

Clearly this code is very similar to that used for binary encoding, but the power to control each parameter can have a dramatic effect on run time (again, this will be analyzed in a later article). This algorithm would have some trouble with narrow parameter scopes, though: as in the example above, every time a parent mates, the child either very closely resembles its parents or simply conducts a random walk of the parameter space. It is again important to control mutationPercent; while we are now in control of the distribution, we do not want to modify all of the parameters very often. As much as possible, we would prefer mutation to happen in one dimension at a time, and let the parental variability converge to the solution.
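
A runnable Python sketch of chain-encoded mating, with per-parameter crossover, uniform in-range mutation, and the range clamp kept as a defensive measure:

```python
import random

def chain_encode_mate(mom, dad, crossover_percent, mutation_percent,
                      min_range, max_range):
    """Mate two parameter chains (lists of doubles) into a child chain."""
    child = []
    from_mom = True
    for ix in range(len(mom)):
        if random.random() < crossover_percent:
            from_mom = not from_mom  # crossover happens per parameter
        value = mom[ix] if from_mom else dad[ix]
        if random.random() < mutation_percent:
            # mutation redraws the parameter uniformly within its own range
            value = random.uniform(min_range[ix], max_range[ix])
        # clamp defensively in case a parent carried an out-of-range value
        value = max(min_range[ix], min(max_range[ix], value))
        child.append(value)
    return child

random.seed(3)
child = chain_encode_mate(
    mom=[5.0, 20.0], dad=[8.0, 35.0],
    crossover_percent=0.3, mutation_percent=0.05,
    min_range=[1.0, 10.0], max_range=[60.0, 240.0])
# Every child parameter stays within its configured bounds
```

The example ranges (moving-average windows from 1 to 60 and 10 to 240 minutes) are illustrative; the point is that each parameter carries its own bounds through every generation.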

The chart above illustrates 1,674 generations of running against a rules system, for one of seven parameters. As you can see, the mutations exposed the parameter over a large part of its given range (0 to 100), while it generally stuck with values it had found to be profitable (hence the vertical pillars).

That’s all for encoding and mating – next up we will cover the fitness function!


Rules-Based Optimization

Before I discuss my second Automated Trading System (ATS), I need to explain the principle on which it operates. A rules-based ATS depends on carefully chosen thresholds and parameters to determine when a particular stock should be entered and exited (long or short). Experience and theory can provide an excellent starting point, but to perform really well for a particular stock, it is useful to maximize these parameters over historical data.

Think about your set of rules as though they are a function.

function rules(double shortMATime, double longMATime)
    “Enter long when the shortMATime-minute moving average crosses above the longMATime-minute moving average.”
    “Exit when the longMATime-minute moving average crosses above the shortMATime-minute moving average.”

In this example, two rules are executed over live or historical data, with two parameters, shortMATime and longMATime. We would like to select values for these parameters such that they would have made the most money over the last week (or any time frame), assuming this closely represents what values will make the most money tomorrow. This function is very difficult to maximize because it is not continuous: small adjustments to either parameter can cause huge swings in the profitability of the system.
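
To make the example concrete, here is a sketch of those two rules as a Python backtest over a list of per-minute prices. The function name and the plain price-list input are illustrative assumptions, not the actual system:

```python
def run_rules(prices, short_ma_time, long_ma_time):
    """Backtest the two example rules over a list of per-minute prices.

    Enter long when the short moving average crosses above the long one;
    exit on the opposite cross. Returns total profit in price points.
    """
    def ma(ix, n):  # simple moving average ending at bar ix
        window = prices[max(0, ix - n + 1):ix + 1]
        return sum(window) / len(window)

    profit, entry = 0.0, None
    for ix in range(1, len(prices)):
        above_now = ma(ix, short_ma_time) > ma(ix, long_ma_time)
        above_before = ma(ix - 1, short_ma_time) > ma(ix - 1, long_ma_time)
        if above_now and not above_before and entry is None:
            entry = prices[ix]            # short MA crossed above: enter long
        elif above_before and not above_now and entry is not None:
            profit += prices[ix] - entry  # long MA crossed above: exit
            entry = None
    if entry is not None:
        profit += prices[-1] - entry      # close any open position at the end
    return profit

prices = [100, 99, 98, 100, 103, 106, 108, 107, 104, 101]
p = run_rules(prices, short_ma_time=2, long_ma_time=4)
# p == 1.0: one round trip, entered at 103 and exited at 104
```

Note how sensitive the result is to the two window lengths: shifting either by a single bar changes where the crosses land, which is exactly the discontinuity that makes this function hard to maximize.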

In this particular case, you might consider running the entire variable space through the function, setting each parameter from 0 to 1000 minutes, incrementing by one second, and taking the maximum output when you are done. This works out to roughly (1000 × 60)² = 3,600,000,000 runs. Assuming a very fast 5 seconds per run over a week’s worth of data, this would take 570.77 years to process. Clearly a better maximization method is needed, but it cannot depend on the derivative of the function, nor can it require first-order continuity. This is exactly where genetic algorithms shine.

My next series of articles will cover Genetic Optimization in detail. Stay tuned for more updates.