Executive summary

Thanks and acknowledgements

Introduction

Subject

The first description of my internship project was given to me as:

The project is about benchmarking a new service we're building related to exchange connectivity. It would involve writing a program to generate load on the new service, preparing a test environment and analyzing the performance results. Time permitting might also involve making performance improvements to the services.

To understand this subject, we must start with an explanation of what exchange connectivity means at IMC: it is the layer in IMC's architecture that ensures the connection between internal trading services and external exchanges' own infrastructure and services. It is at this layer that exchange-specific protocols are normalised into IMC's own protocol messages, and vice versa.

Here is the list of tasks that I am expected to have accomplished during this internship:

  • become familiar with the service,
  • write a dummy load generator,
  • benchmark the system under the load,
  • analyze the measurements.

This kind of project is exactly the reason I was interested in working in finance and trading. The field is focused on achieving the highest performance possible, because being faster is directly tied to making more trades, which in turn results in higher profits.

Because I expressed this personal interest in working on high-performance systems and related subjects, I was given this internship project.

Context of the subject

The exchange connectivity layer must route orders as fast as possible in order to stay competitive and reduce transaction costs: higher latency can result in lost opportunities, and therefore lower profits.

It must also take on other duties, because it is closer to the exchange than the rest of the infrastructure. For example, a trading strategy can register conditional orders with this service: the service monitors the prices of products A and X, and if A's price rises above X's, it must start selling product B at price Y.
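The conditional order described above can be sketched in a few lines. This is only an illustration: the `ConditionalOrder` class, its field names, and the order tuple format are hypothetical and do not reflect the gateway's actual API. The sketch also assumes the rule fires once, whereas the real service may keep selling while the condition holds.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ConditionalOrder:
    """Hypothetical rule: sell `target` at `limit_price` once `watched` is priced above `reference`."""

    watched: str        # product A in the text
    reference: str      # product X in the text
    target: str         # product B in the text
    limit_price: float  # price Y in the text
    triggered: bool = False

    def on_prices(self, prices: dict[str, float]) -> list[tuple[str, str, float]]:
        """Return the orders to send for this price update (empty if nothing to do)."""
        if self.triggered:
            # Assumption: the rule fires only once.
            return []
        if prices[self.watched] > prices[self.reference]:
            self.triggered = True
            return [("SELL", self.target, self.limit_price)]
        return []


rule = ConditionalOrder(watched="A", reference="X", target="B", limit_price=101.5)
first = rule.on_prices({"A": 99.0, "X": 100.0})    # A below X: no order
second = rule.on_prices({"A": 100.5, "X": 100.0})  # A above X: sell B at Y
third = rule.on_prices({"A": 102.0, "X": 100.0})   # already triggered: no order
```

Keeping the rule close to the exchange means the price check and the resulting order do not need a round trip back to the trading strategy.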

A new exchange connectivity service, called the Execution Gateway, is being built at IMC. The eventual goal is to migrate all trading strategies to this gateway for sending orders to exchanges, which will allow it to be scaled more appropriately. However, care must be taken to maintain the current performance throughout the migration in order to stay competitive, and the only way to ensure this is to measure it.

With that context, let's review my expected tasks once more, and expand on each of them:

  • Become familiar with the service: before writing the code for the benchmark, I must first understand what goes into the process of a trade at IMC, and what is needed from the gateway and from its clients in order to run them and execute orders. There is a lot of code at IMC: having different teams working at the same time on different trading services results in a lot of churn. The global execution team was created to centralise the work on core services that must be provided to the rest of the IMC workforce. The global execution gateway is one such project, aiming to consolidate all trading strategies under a single method for sending orders to their exchanges.
  • Write a dummy load generator: we want to send orders under different conditions in order to run multiple scenarios that model varying cases of execution. Having more data for varying corner cases makes us more confident in the robustness and efficiency of the service. This is especially needed because of the various roles that the gateway must fulfill: not only must it act as a bridge for communication between exchanges and traders, it must also act as an order executor. All those cases must be accounted for when writing the different scenarios.
  • Benchmark the system under the load: once we can run those scenarios smoothly, we can start taking measurements. The main one that IMC is interested in is wall-to-wall latency (abbreviated W2W): the time it takes for an order to go from a trading strategy to an exchange. The lower this time, the more opportunities there are to make good trades. FIXME: probably more context in my notes
  • Analyze the measurements: the global execution team has some initial expectations of the gateway's performance. A divergence from those expectations could mean that the measurements are flawed in some way, or that the gateway is not performing as expected. Further analysis can look at the difference between the mean execution time and the 99th percentile, and analyze the tail of the timing distribution: the smaller it is, the better. Consistent timing is more important than a lower average, because we must be absolutely confident that an order is going to be executed smoothly, and inconsistent latency can result in bad trades.
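To make the last two tasks more concrete, here is a minimal sketch of timing a scenario and comparing the mean against the 99th percentile. Everything in it is an assumption made for illustration: `run_scenario`, `percentile`, `summarize` and the stub `send_order` callable are hypothetical names, and timing a local function call only stands in for a real W2W measurement, which would span the path from the trading strategy to the exchange.

```python
import statistics
import time


def run_scenario(send_order, n_orders):
    """Send n_orders through `send_order` and record each call's duration in nanoseconds."""
    latencies = []
    for seq in range(n_orders):
        start = time.perf_counter_ns()
        send_order(seq)  # stand-in for submitting an order to the gateway
        latencies.append(time.perf_counter_ns() - start)
    return latencies


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in [0, 100]."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, clamped to at least 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(latencies_ns):
    """Mean, median and 99th percentile of a list of latencies."""
    return {
        "mean": statistics.fmean(latencies_ns),
        "p50": percentile(latencies_ns, 50),
        "p99": percentile(latencies_ns, 99),
    }


# Timing a stub that does nothing, just to show the shape of the data.
measured = run_scenario(lambda seq: None, 1000)

# Synthetic samples: 98 fast orders and 2 slow ones.
synthetic = [100] * 98 + [10_000] * 2
summary = summarize(synthetic)
# summary == {"mean": 298.0, "p50": 100, "p99": 10000}
```

In the synthetic example, two slow orders out of a hundred raise the mean only to 298 ns while the 99th percentile sits at 10,000 ns, which is why the tail is examined separately from the average.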

Internship roadmap

Engineering practices

Illustrated analysis of acquired skills

Added value

Conclusion

Annex

About IMC

Results & Comments