Archive for the 'auto trading system' Category

Trading System Framework

The core of our architecture rests on a universal trading system framework. This framework abstracts all of the basic market interfaces, allowing us to write generic strategies that run on any market, including simulation.

2010-11-22 Trading System Framework

As you can see in the above central box, our trading system abstracts several core functionalities.

  • Settings Management – The entire trading system is configured via a straightforward XML configuration file. The actual storage and management of this file is abstracted by the particular profile. For live running, the settings are version controlled and managed in a central replicated SQL database. For simulation, they are stored as a simple file provided to a console-based simulator. For optimization, these files serve as the basis for chromosomes in the genetic optimizer (with an optimization file providing the constraints for the search space). The takeaway: develop a simple, generic settings management system that can be abstracted for different targets.
  • Contract Manager / Base Contract – The core component of any system is the instrument that you are trading or measuring. The contract manager provides position management and risk management abstractions, as well as contract lookup. Any object that requires a contract goes through the contract manager and is given an abstraction of a base contract. The base contract can be a futures contract, equity, bond, etc. This provides a universal interface for subscribing to market data and for issuing and monitoring orders.
  • Strategy Engine / Base Strategies – The strategy engine is the very heart of any trading system. This basic class subscribes to message pumps and processes the messages to handle orders. It is the most versatile object in the trading system, allowing for nearly any type of strategy.
  • Charting – Few systems put enough emphasis on thorough charting, but I find it critical for visualizing the results of a simulation, as well as for determining what is happening during live trading. All contracts and strategies implement a simple IChartable interface that allows them to output highly configurable charts, right down to the Graphics handles. This allows the charts to be presented in a live Windows Forms view, or painted to a Bitmap class for saving to disk.
  • Logging – At the end of the day, traceability is critical. Every trade made needs to be serialized to disk or a database in order to reconcile with your clearing house. Furthermore, every strategy needs to output useful tracing information to aid in debugging. Beyond the obvious tracing, strategies also need to implement a reporting interface that provides live state information to the user interface, so that you can see how a strategy is behaving and, if necessary, modify its parameter set or debug it. This again is abstracted, just like settings and charting, to go to different destinations based on the target of the trading engine. For simulation it outputs to the simulation results, whereas in live trading we work against easily queried database engines.
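To make the contract and strategy abstractions above concrete, here is a minimal Python sketch. It is not the framework's actual code: the names `BaseContract`, `SimulatedContract`, `BuyBelowStrategy`, and `Tick` are hypothetical, and the immediate-fill behavior of the simulated contract is an assumption made to keep the example short. The point it illustrates is that the strategy only ever talks to the base contract interface, so the same strategy class could be handed a live or a simulated contract.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tick:
    symbol: str
    price: float


class BaseContract(ABC):
    """Hypothetical universal interface: market data in, orders out."""

    def __init__(self, symbol: str) -> None:
        self.symbol = symbol
        self._listeners: List[Callable[[Tick], None]] = []

    def subscribe(self, listener: Callable[[Tick], None]) -> None:
        """Register a callback that receives every tick for this contract."""
        self._listeners.append(listener)

    def publish(self, tick: Tick) -> None:
        """Deliver a tick to all subscribers (called by the data feed)."""
        for listener in self._listeners:
            listener(tick)

    @abstractmethod
    def place_order(self, quantity: int, price: float) -> None:
        """Send an order to whatever market sits behind this contract."""


class SimulatedContract(BaseContract):
    """Simulation target (assumption: every order fills immediately)."""

    def __init__(self, symbol: str) -> None:
        super().__init__(symbol)
        self.fills: List[Tuple[str, int, float]] = []

    def place_order(self, quantity: int, price: float) -> None:
        self.fills.append((self.symbol, quantity, price))


class BuyBelowStrategy:
    """Toy strategy: buys one unit whenever price trades below a threshold."""

    def __init__(self, contract: BaseContract, threshold: float) -> None:
        self.contract = contract
        self.threshold = threshold
        contract.subscribe(self.on_tick)

    def on_tick(self, tick: Tick) -> None:
        if tick.price < self.threshold:
            self.contract.place_order(1, tick.price)


# Replay a few ticks through the simulated contract.
contract = SimulatedContract("ES")
strategy = BuyBelowStrategy(contract, threshold=100.0)
for price in (101.0, 99.5, 100.0, 98.0):
    contract.publish(Tick("ES", price))

print(contract.fills)  # [('ES', 1, 99.5), ('ES', 1, 98.0)]
```

A live implementation would subclass `BaseContract` and route `place_order` to a broker API instead of a list; the strategy itself would not change.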

Next up I want to cut into application design and multithreading. There is a lot to cover, and I am swamped, so expect the articles to continue to appear as I have time. If you have any questions, feel free to email me.

Six Pillars of Automated Trading

There are six major components to an automated trading system.

2010-11-04 Automated Trading Overview

  • Live Trading Engine – Any given system will start with the live trading engine. This is the piece of software which runs in real time and actually places orders and reacts to market data.
  • Simulation Engine – When developing strategies, you often need to back test them. In an ideal world back testing would demonstrate profitability, but in reality it is just used to verify that your strategy does what you think it does. The key to a good simulation engine is that you run the exact same code in simulation as you do in production. I can’t overstate that last sentence, so I’ll state it again: the key to a good simulation engine is that you run the exact same code in simulation as you do in production.
  • Historical Service – This runs hand in hand with the simulation engine. You need a tick database for simulation. It is the backbone of all research applications: from back testing strategies to developing market models, you need a thorough, indexed tick database. You can also build bar data from ticks, but you had better have ticks available for simulation.
  • Optimization Engine – All of your automated strategies require parameterization. Generally speaking these are best optimized by hand through selection of sensible variables. Sometimes however, you need to parameterize a simple strategy for a large number of symbols, in which case you want an automated system for optimization. Our system uses a cloud computing service to distribute instances of our simulation engine which run chromosomes from a centralized genetic optimization engine.
  • Analytics – You need to ruthlessly track your trading performance. At the core of any solid trading engine is a solid analytics engine which tracks your various strategies.
  • Reconciler – This was the biggest surprise when moving from retail brokers to institutional brokers: everyone makes mistakes. Sometimes the exchange will fail to tell your clearing house about trades you made; other times your clearing house will accidentally include another client’s trades in your account. At the end of every day you need to reconcile every fill you think you made with the statements you receive from your clearing house, and immediately resolve any discrepancies with your clearing house and the exchange.
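As an illustration of the reconciler idea, here is a minimal Python sketch. It assumes, purely for the example, that both your own records and the clearing statement can be reduced to `(symbol, signed quantity, price)` tuples; the `reconcile` function and the sample fills are hypothetical. Real statements carry far more detail (trade IDs, timestamps, fees), and prices would normally be compared as decimals or integer ticks rather than floats.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# (symbol, signed quantity, price); an assumed, simplified fill record.
Fill = Tuple[str, int, float]


def reconcile(
    our_fills: Iterable[Fill], statement_fills: Iterable[Fill]
) -> Tuple[List[Fill], List[Fill]]:
    """Return (fills we made that the statement lacks,
    fills on the statement that we do not recognize)."""
    ours = Counter(our_fills)
    theirs = Counter(statement_fills)
    # Counter subtraction keeps only positive counts, so duplicate fills
    # at the same price are matched one for one.
    missing_from_statement = sorted((ours - theirs).elements())
    unknown_on_statement = sorted((theirs - ours).elements())
    return missing_from_statement, unknown_on_statement


ours = [("ES", 2, 1180.25), ("ES", -2, 1182.0), ("CL", 1, 84.1)]
statement = [("ES", 2, 1180.25), ("CL", 1, 84.1), ("GC", 1, 1350.0)]

missing, unknown = reconcile(ours, statement)
print("Not on statement:", missing)   # [('ES', -2, 1182.0)]
print("Not in our records:", unknown)  # [('GC', 1, 1350.0)]
```

Both lists should be empty on a clean day; anything left in either one is a break to raise with the clearing house.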

Next up, I will cover the major components of the Trading Engine.

Black Box Development

In late 2008/early 2009 I made the transition from full time engineering to full time Black Box trading software and strategy development. The past several months have certainly been exciting times in the financial markets, and have proven to be very good for automated strategies.

I will still be maintaining the IbAPI open source library (just saw IB posted a 9.62 beta), and if anything will be more responsive now.

I am also always interested in discussing interesting opportunities, so please continue to drop me a line.

Good Trading!

-Karl

*.*.*.2 Bug Fix Release

Interactive Brokers’ specification for "m_right" is:

String m_right

Specifies a Put or Call. Valid values are: P, PUT, C, CALL.

 

I chose to make the RightType enumeration translate to "PUT" and "CALL". A bug report from the Yahoo forums showed that these values are no longer accepted, and that "P" and "C" are the only accepted values.
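The shape of the fix can be sketched as follows. This is a Python illustration only, not the library's actual source: `RightType` mirrors the enumeration named above, while `parse_right` is a hypothetical helper. The enumeration's wire values become "P" and "C", and the longer spellings are accepted only on input.

```python
from enum import Enum


class RightType(Enum):
    """Option right; the value is what gets sent as m_right."""
    PUT = "P"
    CALL = "C"


def parse_right(text: str) -> RightType:
    """Hypothetical helper: accept P, PUT, C, or CALL in any letter case."""
    key = text.strip().upper()
    if key in ("P", "PUT"):
        return RightType.PUT
    if key in ("C", "CALL"):
        return RightType.CALL
    raise ValueError(f"not a valid option right: {text!r}")


print(parse_right("call").value)  # C
print(RightType.PUT.value)        # P
```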

Please download the bug fix versions

9.2.0.2 and 9.1.0.2

Both also available under utilities.

Genetic Optimization and Maximization – Fitness Function

This is part 3 in a series on Genetic Optimization; please visit part 1 and part 2 to catch up.

What Does the Fitness Function Do?

The fitness function is the basis of the “survival of the fittest” premise of genetic algorithms. It is responsible for evaluating each parameter set and choosing which parameter sets mate. The most difficult part of designing a fitness function is getting it to produce parameters that are reliable and effective on data outside of the training set.

It helps to consider nature’s fitness function. We are the result of millions of years of genetic optimization, yet we retain neither the brawn of a gorilla, nor the size of a sauropod (a dinosaur that weighed 209 tons), nor the predatory skills of a Tyrannosaurus. A genetic function does not just optimize for the strongest creature, but for the creature that can survive and thrive in all circumstances. Dinosaurs were clearly at the top of the food chain and thriving 65 million years ago, but were easily outlived by insects, which were able to survive the harsh aftermath of the Cretaceous-Tertiary extinction event. (Can you tell I have been researching a lot about dinosaurs since starting this blog?)

My point is that you need a fitness function which results in a set of parameters that performs well during a bull run and a bear run, and also survives a market crash. A parameter set that makes a fortune on rallies but bleeds on sideways patterns and reversals is no better than the dinosaurs: ultimately it will perish, taking a lot of your equity with it.

What Makes a Good Fitness Function?

A fitness function can be as simple as the profit generated by running your rules over training data, but this is likely to exploit one-time events in the data, and places no emphasis on reliability.

A good fitness function does the following:

  • Understands Risk – It does not evaluate only profit, but also how much capital the rules placed at risk to earn that profit.
  • Punishes Losses Heavily – By punishing the parameter set more heavily for losses than it rewards profits, you are training it to favor consistent profits over volatile returns.
  • Punishes High Risk – Any rules can earn a lot on a good day by loading up on beta; you want to train your algorithm to seek true alpha.
  • Does Not Punish Zero Gains – It is important to let your algorithm learn when to enter the market and when to stay clear. Providing some incentive to simply not take a loss can be just as important as providing incentives to take a large gain.
  • Runs on a Reasonable Time Frame – A fitness function should evaluate each day (or possibly a shorter period) of sample data on its own, accumulating the results for a particular parameter set.

Following these guidelines, the fitness function must rank each parameter set and select mates.
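One way these guidelines could translate into code is sketched below. This is an assumption-laden illustration, not the fitness function used in this series: the names `daily_score` and `fitness`, the `loss_penalty` multiplier of 2, and the choice of "return on capital at risk" as the risk measure are all mine. Each day is scored on its own, losses are weighted more heavily than gains, profit is divided by the capital put at risk, and a day spent out of the market scores zero rather than being penalized.

```python
from typing import Sequence, Tuple

# One day of results: (profit or loss, capital placed at risk that day).
DayResult = Tuple[float, float]


def daily_score(pnl: float, capital_at_risk: float, loss_penalty: float = 2.0) -> float:
    """Score a single day as return on capital at risk, with losses amplified."""
    if capital_at_risk <= 0.0:
        # Stayed out of the market: neither rewarded nor punished.
        return 0.0
    daily_return = pnl / capital_at_risk
    if daily_return < 0.0:
        return loss_penalty * daily_return
    return daily_return


def fitness(days: Sequence[DayResult], loss_penalty: float = 2.0) -> float:
    """Accumulate per-day scores for one parameter set."""
    return sum(daily_score(pnl, risk, loss_penalty) for pnl, risk in days)


# Two parameter sets with the same net profit (200) over three days.
steady = [(100.0, 10_000.0), (100.0, 10_000.0), (0.0, 0.0)]
volatile = [(600.0, 10_000.0), (-400.0, 10_000.0), (0.0, 0.0)]

print(fitness(steady) > fitness(volatile))  # True
```

A plain profit-based fitness would rank the two sets equally; under this sketch the steady set wins because the volatile set's losing day counts double.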

Mate Selection

Once the parameter sets have been ranked, they must undergo selection. The obvious solution would be to select only the top-ranked parameter sets to mate, but this may ignore other minima that lesser-ranked parameter sets are exploring.

The chart above illustrates the importance of occasionally exploring lesser-ranked parameter sets. The green lines represent the highest-ranked parameter sets, but as we can see in the parameter space, the red line is at the base of the global minimum, while the green lines are just exploring local minima. The best way to allow for this is to select mates using the absolute value of a normally distributed random variable, so that top-ranked sets are chosen most often while lower-ranked sets are still chosen occasionally. The choice of probability distribution and standard deviation has a large effect on how fast a genetic algorithm converges; an analysis of this will be in a future article. For now, the normal distribution proves to be more than adequate.
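A minimal sketch of this kind of mate selection follows. The details are assumptions rather than the series' actual implementation: the population is taken to be sorted best first, the absolute value of a zero-mean normal draw is truncated to an integer and used as a rank index, `sigma` is expressed in rank positions, and draws that fall past the end of the population are simply redrawn. The function name `select_mate` is hypothetical.

```python
import random
from collections import Counter
from typing import Sequence, TypeVar

T = TypeVar("T")


def select_mate(ranked: Sequence[T], sigma: float, rng: random.Random) -> T:
    """Pick from a best-first ranking using |N(0, sigma)| as the rank index."""
    if not ranked:
        raise ValueError("population is empty")
    while True:
        index = int(abs(rng.gauss(0.0, sigma)))
        if index < len(ranked):
            return ranked[index]
        # Draw landed beyond the last rank: try again.


rng = random.Random(42)        # seeded so the run is repeatable
population = list(range(10))   # stand-ins for parameter sets, rank 0 is best
picks = Counter(select_mate(population, sigma=3.0, rng=rng) for _ in range(10_000))

# Top ranks dominate, but lower ranks are still chosen from time to time.
for rank in population:
    print(rank, picks[rank])
```

A smaller `sigma` concentrates selection on the top few ranks and converges faster; a larger one spreads selection out and explores more of the parameter space.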

As you can see, the fitness function has a huge impact on the output of your maximization: it defines what the ideal function should do.

Tune in for more Genetic Optimization in Part 4, where I will talk about Training.