Compare commits

2 commits

Author SHA1 Message Date
Bruno BELANYI 05f6ab2845 report: address a few FIXMEs 2021-08-01 15:27:29 +02:00
Bruno BELANYI 52bc23c554 report: misc fixes 2021-08-01 15:14:14 +02:00

@ -67,10 +67,10 @@ allowing for further analysis of a single run and comparison of their evolution
as time goes on.
This initial work being finished, I integrated my framework with the tooling in
use at IMC to allow for smoother use of the runner, either locally for
development purposes or remotely for measurements. This is also used to test for
breakage in the Continuous Integration pipeline, to keep the benchmarks runnable
as changes are merged into the code base.
use at IMC to allow for running it more easily, either locally for development
purposes or remotely for measurements. This is also used to test for breakage in
the Continuous Integration pipeline, to keep the benchmarks runnable as changes
are merged into the code base.
Once that was done, I then picked up a user story about compatibility testing:
with the way IMC deploys software, we want to ensure that both the gateway and
@ -99,10 +99,10 @@ protocols are normalised into IMC's own protocol messages, and vice versa.
Here is the list of tasks that I am expected to have accomplished during this
internship:
* become familiar with the service,
* write a dummy load generator,
* benchmark the system under the load,
* analyze the measurements.
* Become familiar with the service.
* Write a dummy load generator.
* Benchmark the system under the load.
* Analyze the measurements.
This kind of project is exactly the reason that I was interested in working in
finance and trading. It is a field that is focused on achieving the highest
@ -136,18 +136,15 @@ conditional orders with this service: it must monitor the price of product A and
X, if product A's cost rises above X's, then it must start selling product B at
price Y.
## The competition
FIXME: what can I even say about them?
## Strategy
A new exchange connectivity service, called the Execution Gateway, is being
built at IMC, the eventual goal being to migrate all trading strategies to using
this gateway to send orders to exchanges. This will allow it to be scaled more
appropriately. However, care must be taken to maintain the current performance
during the entirety of the migration in order to stay competitive, and the only
way to ensure this is to measure it.
A new exchange connectivity service called the Execution Gateway, and its
accompanying *Execution API* to communicate with it, are being built at IMC. The
eventual goal is to migrate all trading strategies to using this gateway to
send orders to exchanges. This will allow it to be scaled more appropriately.
However, care must be taken to maintain the current performance during the
entirety of the migration in order to stay competitive, and the only way to
ensure this is to measure it.
## Roadmap
@ -175,20 +172,18 @@ scenarios.
* Benchmark the system under the load: once we can run those scenarios smoothly
we can start taking multiple measurements. The main one that IMC is interested
in is wall-to-wall latency (abbreviated W2W): the time it takes for a trade to
go from a trading strategy to an exchange. The lower this time, the more
occasions there are to make good trades. FIXME: probably more context in my
notes
in is wire-to-wire latency (abbreviated W2W): the time it takes for a trade
to go from a trading strategy to an exchange. The lower this time, the more
occasions there are to make good trades.
* Analyze the measurements: the global execution team has some initial
expectations of the gateway's performance. A divergence on that part could mean
that the measurements are flawed in some way, or that the gateway is not
performing as expected. Further analysis can be done to look at the difference
between mean execution time and the 99th percentile, and analyse the tail of the
timing distribution: the smaller it is the better. Consistent timing is more
important than a lower average, because we must be absolutely confident that a
trade order is going to be executed smoothly, and introducing inconsistent
latency can result in bad trades.
between median execution time and the 99th percentile, and analyse the tail of
the timing distribution: the smaller it is the better. Having a low execution
time is necessary, however consistent timing also plays an important role to
make sure that an order will actually be executed by the exchange reliably.
## Internship positioning amongst company works
@ -223,13 +218,12 @@ for the benchmark. This has allowed me to get acquainted with their development
process.
After writing that proof of concept, we were now certain that the benchmark was
a feasible project, with very few actual dependencies to be run: the only one
that we needed to be concerned with it called the RDS server. The RDS server
is responsible for holding the information about all trade-able instruments at
an exchange. The gateway connects to it to receive a snapshot of the state of
those instruments, for example the mapping from IMC IDs to the ones used by the
exchange. I wrote a small module that could be used as a fake RDS server by the
benchmark framework to provide its inputs to the gateway being instrumented.
a feasible project, with very few actual dependencies to be run. The low amount
of external dependencies meant fewer moving parts for the benchmarks, and a
lower amount of components to set up.\
For the ones that were needed, I had to write small modules that would model
their behaviour, and be configured as part of the framework to provide them as
input to the gateway under instrumentation.
## The framework
@ -267,15 +261,18 @@ picked up another story related to testing the Execution API. Before then, all
Execution API implementations were tested using what is called the *method-based
API*, using a single process to test its behavior. This method was favored
during the transition period to Execution API, essentially being an interface
between it and the legacy *drivers* which connect directly to the exchange: it
allowed for lower transition costs while the rest of the execution API
between it and the *drivers* which connect directly to the exchange: it allowed
for lower transition costs while the rest of the *Execution API* was being
built.
This poses two long-term problems:
* The *request-based API*, making use of a network protocol and a separate
gateway binary, cannot be mocked/tested as easily. Having a way to test the
integration between client and server in a repeatable way that is integrated
with the Continuous Integration pipeline is valuable to avoid regressions.
gateway binary, inherently relies on the interaction between a gateway and its
clients. None of the tests so far were able to check the behaviour of client and
server together using this API. Having a way to test the integration between
both components in a repeatable way that is integrated with the Continuous
Integration pipeline is valuable to avoid regressions.
* Some consumers of the *request-based API* in production are going to be in use
for long periods of time without a possibility for upgrades due to
@ -344,10 +341,9 @@ the gateway binary in order to instrument it under different scenarios.
I worked on writing those components in a way that was usable for the benchmark,
making sure that they were working and tested along the way. One such component
was writing a fake version of the RDS (FIXME: what does it mean again?) that
would be populated from the benchmark scenario, and provided the information
about financial instruments to the gateway in order to use them in the scenario,
e.g: ordering a stock.
was writing a fake version of the RDS that would be populated from the benchmark
scenario, which provided the information about financial instruments to the
gateway in order to use them in the scenario, e.g: ordering a stock.
I went on to write a first version of the benchmark framework for a specific
gateway and a specific exchange: this served as the basis for further iteration
@ -370,7 +366,7 @@ framework.
I have delivered a complete, featureful product from scratch to finish, complete
with documentation and demonstration of its use. This is at the heart of our
schooling at EPITA: making us well-rounded engineers that can deliver their work
to completion. FIXME: wax a bit more poetics
to completion.
## Acquiring new skills and knowledge
@ -547,7 +543,7 @@ results directly in more profits being made.
During my internship, I got to work on a large code base, interact with smart
and knowledgeable colleagues, and tinker on what constitutes the basic bricks of
IMC's production software (FIXME: phrasing).
IMC's production software.
Working at IMC was my first experience with such a large code base, a dizzying
amount of code. It is impossible to wrap your head around *everything* that is
@ -601,9 +597,9 @@ correct tool set to deal with those problems.
## Introspection
Working abroad, with the additional COVID restrictions, is a harsh (FIXME: find
softer term) transition from the routine of school. However, both the company
and the team have made it easy to adjust.
Working abroad, with the additional COVID restrictions, is a harsh transition
from the routine of school. However, both the company and the team have made it
easy to adjust.
* The daily stand-up meeting, and weekly retrospective seem more important than
ever when you can potentially not talk to your colleagues for days due to