Should your backtest be vectorized?
There is a point in most strategy designs and quantitative investment rationales nowadays where you realize you need a backtest. There is usually no turning back from this inflection point: understanding the power of quantitative decision making can be a game changer for how you perceive and plan your investments.
When you reach this phase, you will inevitably look through the myriad trading platforms and backtesting engines that exist online, from open-source Python and R modules to fully fledged platforms where quick code snippets produce your desired performance reports in hours or days.
For tech-savvy investors and businesses that can afford to build everything in-house, the build-it-from-the-ground-up C++ solution will of course come up. But is this the right approach? Or should you build your backtesting engine in the cloud, using one of the popular providers such as AWS, Google Cloud or Azure?
The choice and the planning are not easy, and they depend on your specific strategy (regrettably, this is the answer to most tech problems!).
Let’s do a breakdown of the possible paths and expand a bit on the hidden dangers you might find.
Event-driven backtesting
This approach is quite popular, as it powers the backtesting engines of well-known trading platforms such as TradingView, MultiCharts and NinjaTrader.
There are variants within this method. NinjaTrader, for one, uses NinjaScript, which is based on C#. This allows for a much more flexible approach, but the time to get your code to production will also be much higher unless you are already an expert on the platform.
Also, flexibility comes with a compromise. You need to be careful with your code structure and methods. You have the option to build with unmanaged methods, which requires handling everything yourself: all order states, and which events you will treat as potential drivers for your executions, cancellations, data calls…
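To give a feel for what "handling all order states" means, here is a minimal sketch in Python. It is a generic illustration, not NinjaScript and not any platform's real API: the state names, the transition table and the `Order` class are all assumptions made for the example.

```python
from enum import Enum, auto


class OrderState(Enum):
    # Hypothetical state names; real platforms define their own.
    SUBMITTED = auto()
    WORKING = auto()
    PARTIALLY_FILLED = auto()
    FILLED = auto()
    CANCELLED = auto()
    REJECTED = auto()


# Assumed transition table: which states may legally follow each state.
ALLOWED = {
    OrderState.SUBMITTED: {OrderState.WORKING, OrderState.REJECTED},
    OrderState.WORKING: {OrderState.PARTIALLY_FILLED, OrderState.FILLED,
                         OrderState.CANCELLED},
    OrderState.PARTIALLY_FILLED: {OrderState.PARTIALLY_FILLED, OrderState.FILLED,
                                  OrderState.CANCELLED},
    OrderState.FILLED: set(),
    OrderState.CANCELLED: set(),
    OrderState.REJECTED: set(),
}


class Order:
    def __init__(self, order_id, quantity):
        self.order_id = order_id
        self.quantity = quantity
        self.filled = 0
        self.state = OrderState.SUBMITTED

    def on_update(self, new_state, fill_qty=0):
        """Apply an update from the broker, refusing impossible transitions."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}")
        self.filled += fill_qty
        self.state = new_state

    @property
    def is_open(self):
        # An order is still open while some transition remains possible.
        return bool(ALLOWED[self.state])
```

In an unmanaged setup, every one of these transitions is something your own code has to anticipate, including the ones that arrive late or out of order.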
The most immediate question might be: is it possible to accidentally create a loop of orders that execute without any control? The short answer is yes. Luckily, NinjaTrader has a managed version of NinjaScript that lets us breathe and code our gap strategies, swing methods and VWAP crossover strategies without needing to look at every single state of the order.
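As a rough illustration of how such a runaway loop can appear, and of the kind of protection a managed layer might give you, consider a strategy that resubmits its order every time the broker rejects it. The sketch below is plain Python with hypothetical names (`OrderGuard`, `simulate_rejections`); it does not reproduce how any real platform implements its safeguards.

```python
class OrderGuard:
    """Hypothetical safety wrapper: refuses a new order while one is pending
    and caps the total number of orders sent per session."""

    def __init__(self, max_orders_per_session=10):
        self.max_orders = max_orders_per_session
        self.sent = 0
        self.pending = False

    def can_send(self):
        return not self.pending and self.sent < self.max_orders

    def on_sent(self):
        self.sent += 1
        self.pending = True

    def on_terminal(self):
        # Call when the order is filled, cancelled or rejected.
        self.pending = False


def simulate_rejections(guard, max_events=1000):
    """Strategy that retries on every rejection, against a simulated broker
    that rejects everything. Returns how many orders were actually sent."""
    submitted = 0
    for _ in range(max_events):
        if not guard.can_send():
            break  # the guard stops the loop instead of the strategy
        guard.on_sent()
        submitted += 1
        guard.on_terminal()  # rejection makes the order terminal, strategy retries
    return submitted


print(simulate_rejections(OrderGuard(max_orders_per_session=5)))  # 5
```

Without the cap, the same retry logic would keep sending orders for as long as events keep arriving.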
This managed flavour of event-driven programming is what you will see in other common retail trading platforms and in most institutional trading platforms.
Programming with this method is one of the most accurate, effective and robust choices when it comes to building a quantitative trading strategy. You will just need to trust that the platform you are working with is good at order handling, fast at transmitting messages, and resilient and scalable when processing events.
We should consider, though, that in the end you will also need to trust the broker, the data provider (more on this very soon…), the server where you host your strategy… It is a compromise, of course, but in relative terms it should be a safe choice.
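To make the event-driven idea concrete, here is a toy bar-by-bar loop in Python. Everything in it is an assumption for illustration: the `SmaCross` rule, the dictionary bar format and the fill model (orders decided on one bar are filled at the next bar's open). Real platforms add order types, slippage, commissions and much more.

```python
from collections import deque


class SmaCross:
    """Toy strategy: want to be long when the fast average is above the slow one."""

    def __init__(self, fast=2, slow=4):
        self.closes = deque(maxlen=slow)
        self.fast = fast

    def on_bar(self, bar):
        # The strategy only ever sees bars up to and including the current one.
        self.closes.append(bar["close"])
        if len(self.closes) < self.closes.maxlen:
            return 0
        recent = list(self.closes)
        fast_avg = sum(recent[-self.fast:]) / self.fast
        slow_avg = sum(recent) / len(recent)
        return 1 if fast_avg > slow_avg else 0


def run_backtest(bars, strategy):
    position = 0   # units currently held
    target = 0     # position requested on the previous bar
    cash = 0.0
    for bar in bars:
        # Orders decided on the previous bar are filled at this bar's open.
        if target != position:
            cash -= (target - position) * bar["open"]
            position = target
        target = strategy.on_bar(bar)
    return cash + position * bars[-1]["close"]  # mark any open position to market


closes = [10, 10, 10, 10, 11, 12, 13, 14, 13, 12, 11]
opens = [closes[0]] + closes[:-1]  # assume each bar opens at the previous close
bars = [{"open": o, "close": c} for o, c in zip(opens, closes)]
pnl = run_backtest(bars, SmaCross())
print(pnl)  # 1.0
```

Because the strategy is fed one bar at a time, it cannot peek at future data by construction, which is a large part of why this style is considered robust.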
Vectorized backtesting
Vectorized backtesting is all about speed. Instead of stepping through the history one candle at a time and reacting to each event, the computer applies each operation to the whole historical data set at once, in a single pass with no turning back.
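As a minimal sketch of what "processing the data as a whole" looks like, the example below runs a moving-average crossover over an entire price array with NumPy. The function names, the window lengths and the close-to-close PnL model are assumptions chosen for brevity, not a recommended strategy.

```python
import numpy as np


def sma(values, window):
    """Simple moving average; the first window-1 entries are NaN."""
    out = np.full(len(values), np.nan)
    csum = np.cumsum(np.insert(values, 0, 0.0))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out


def vectorized_backtest(closes, fast=2, slow=4):
    closes = np.asarray(closes, dtype=float)
    fast_avg = sma(closes, fast)
    slow_avg = sma(closes, slow)

    # Signal for every bar at once; bars without a full slow window stay flat.
    signal = np.zeros(len(closes))
    signal[slow - 1:] = fast_avg[slow - 1:] > slow_avg[slow - 1:]

    # Hold during bar t the position decided at the close of bar t-1.
    position = np.concatenate(([0.0], signal[:-1]))
    changes = np.diff(closes, prepend=closes[0])  # close-to-close price change
    return float(np.sum(position * changes))


print(vectorized_backtest([10, 10, 10, 10, 11, 12, 13, 14, 13, 12, 11]))  # 1.0
```

There is no loop over bars here: every step is an array operation, which is where the speed comes from.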
The most immediate problem of vectorized backtesting is look-ahead bias. With this method you can easily use data from the future to code your strategy rules. To an experienced coder this may be evident, but it adds another layer of care when building your code. You will need to be very careful not to introduce this bias into your backtest, as it can easily make your system look great when in fact you are just using information from the future to build your edge.
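A small, deliberately contrived example shows how easy the mistake is. The rule below, "be long on up bars", is only known once a bar has closed. If the signal is multiplied by the return of that same bar, the backtest earns the very move it used to decide. The data and the `pnl` function are hypothetical, for illustration only.

```python
import numpy as np


def pnl(closes, shift_signal):
    closes = np.asarray(closes, dtype=float)
    changes = np.diff(closes, prepend=closes[0])  # close-to-close price change
    signal = (changes > 0).astype(float)          # "up bar", known only at the bar's close

    if shift_signal:
        # Correct: act on the next bar, after the signal is actually known.
        position = np.concatenate(([0.0], signal[:-1]))
    else:
        # Look-ahead bias: the position earns the same change that produced it.
        position = signal
    return float(np.sum(position * changes))


zigzag = [10, 11, 10, 11, 10, 11, 10]
print(pnl(zigzag, shift_signal=False))  # 3.0, looks like a great system
print(pnl(zigzag, shift_signal=True))   # -3.0, what you would really have got
```

The only difference between the two results is a one-bar shift of the signal, which is exactly the kind of detail that is easy to miss in array code.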
We will expand on a specific strategy using vectorized backtesting in another article. The risk and the extra care you need when building your backtest are compensated by the enormous advantage in speed it gives in some cases.
Now that we have covered a bit of the actual backtesting engines that most trading platforms use, we can look at some technology stacks you could use to build your own backtesting engine.
Open-source libraries
There is a myriad of open-source libraries available for backtesting and live trading, both for cryptocurrencies and traditional markets.
The upside is obvious: most of the time you can start building on existing code (if you have the rights and permissions, which is not so trivial), so you have a starting point.
The catch is that you need to match the code you can use with your strategy. Once you code your specific set of rules in the open-source library or platform, the result can range from excellent (your strategies backtest and trade exactly as they are supposed to, withstanding time and inclement weather without issues) to the most common outcome: you need an update to some specific part of the engine and are therefore dependent on the creators of the library you are using.
C++ fully fledged rocket platform
Please don’t do this unless you need sub-millisecond latency for your high-frequency trading optimal allocation strategy, or you are looking at MEV in DeFi strategies. You will need a team of engineers working on optimal memory allocation for some time before you send your first order.
If you need the above and can print money before deploying your advanced strategies, do consider this option.
Cloud-based solution
Trendy. You can launch your own EC2 instance in AWS within minutes and, with an AMI suited to the job, start sending a few orders and producing performance reports. Should you do it?
Cloud solutions are the modern way of building your tech stack: it is relatively easy to scale your applications and interconnect your modules. Your backtesting, data warehouse and machine learning systems all in the same place… Well, this is true, but be prepared to study hard or hire an experienced cloud engineer or two.
Building a backtester in the cloud is an advanced task. Check pricing. Many moving parts affect your solution, and if you miss one of them you will have to sacrifice performance or accuracy. Or maybe your solution does not need such an advanced backtester, but then why not go with a standalone platform that handles all the backend requirements for you while you focus on building the strategy?
A lot of quantitative hedge funds use suboptimal solutions because they are determined to build everything in-house. Of course, there are other factors to consider, such as data protection, flexibility, security… However, proper planning based on your needs will likely save you from headaches and phantom orders in the future. It may also save you from trading a broken model.
By Jesús Martín García