From 0e7c2a7b0598686689b57d8ec6503000333f6f06 Mon Sep 17 00:00:00 2001
From: Bruno BELANYI
Date: Mon, 19 Jul 2021 11:03:34 +0200
Subject: [PATCH] report: update roadmap

---
 report.md | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 91 insertions(+), 2 deletions(-)

diff --git a/report.md b/report.md
index e405f8c..d646886 100644
--- a/report.md
+++ b/report.md
@@ -168,6 +168,8 @@ and I could reuse most of the tools developed for the framework to that end.
 
 # Internship roadmap
 
+## Getting acquainted with the code base
+
 The first month was dedicated to familiarizing myself with the vocabulary at
 IMC, understanding the context surrounding the team I am working in, and
 learning about the different services that are currently being used in their
@@ -185,8 +187,95 @@ those instruments, for example the mapping from IMC IDs to the ones used by the
 exchange. I wrote a small module that could be used as a fake RDS server by the
 benchmark framework to provide its inputs to the gateway being instrumented.
 
-I am now currently beginning to write the benchmark framework, using what
-I wrote and learned during the previous month.
+## The framework
+
+With the exploratory phase done, writing the framework was my next task. The
+first step was to ensure I could run all the necessary components locally,
+without yet worrying about correct behaviour. Once I had the client
+communicating with the gateway, and the gateway connected to the fake
+exchange, I wrote a few basic scenarios to check that everything was working
+correctly and reliably.
+
+After writing the basis of the framework and ensuring it was in working
+order, I integrated it with the build tools used by the developers and with
+the Continuous Integration pipeline. A single command now builds and runs the
+benchmark on a local machine, which makes for easier iteration when
+integrating the benchmark framework with a new exchange, and lets regressions
+be caught by the test pipelines that are run before merging patches into the
+code base.
+
+Once this was done, I modified the framework further to allow the benchmark
+to be run on remote machines, using a lab set-up built specifically to
+replicate the production environment in a sand-boxed way. Local and remote
+runs can be selected transparently, without modifying either the benchmark
+scenarios or the framework implementation for each exchange.
+
+Under this set-up, a component of the benchmark framework records and dumps
+the performance data collected and emitted by the gateway, which let us
+examine the timings under different scenarios. The results were close to the
+expected values, and demonstrated that the framework is a viable way to
+collect this information.
+
+## Compatibility testing
+
+After writing the benchmark framework and integrating it for one exchange, I
+picked up another story related to testing the Execution API. Until then, all
+Execution API implementations were tested through what is called the
+*method-based API*, within a single process.
+This method was favored during the transition period to the Execution API, as
+it is essentially an interface between it and the legacy *drivers* which
+connect directly to the exchange: it allowed for lower transition costs while
+the rest of the Execution API was still being developed.
+
+This poses two long-term problems:
+
+* The *request-based API*, which uses a network protocol and a separate
+gateway binary, cannot be mocked and tested as easily. Having a way to test
+the integration between client and server that is repeatable and integrated
+with the Continuous Integration pipeline is valuable to avoid regressions.
+
+* Some consumers of the *request-based API* in production will stay in use
+for long periods of time without the possibility of an upgrade due to
+conformance testing. To avoid any problems in production, it is of the utmost
+importance that the *behavior* stays compatible between versions.
+
+To that end, I made the necessary modifications to the existing test
+framework to allow running the tests against the actual gateway binary. This
+meant the following:
+
+* Running the tests without relying on precise timings: due to the
+asynchronous nature of the Execution API, and the use of network
+communication between the client, gateway, and exchange, some timing
+expectations in the tests needed to be relaxed.
+
+* Replacing any hard-coded port values with dynamic port discovery: because
+many tests may run at the same time, this is needed to run them all in
+parallel without any fear of cross-talk or interference.
+
+Once those changes were made, the tests implemented, and some bugs squashed,
+we could use those tests to ensure compatibility not just at the protocol
+level, but up to the observable behaviour.
+
+## Documenting my work
+
+With that work done, I now need to ensure that the relevant knowledge is
+shared across the team. This work is two-fold:
+
+* Give a presentation about the benchmark framework: because it only provides
+the basic tools needed to run benchmarks, other engineers will need to pick
+it up to write new scenarios, or to implement the benchmark for new
+exchanges. To that end, I FIXME
+
+* Document how to debug problems in benchmark and compatibility test runs:
+due to the unconventional set-up required to run them, investigating a
+problem in either requires specific steps and a different approach. To
+improve productivity during such investigations, I explain how to reproduce
+the test set-up easily, and describe a few of the methods I used to debug
+problems encountered during their development.
+
+## Gantt diagram
+
+FIXME
 
 # Engineering practices