|Bruno BELANYI 8d89bdf31d
crate::core module contain three new-type-style tuple-struct definitions:
TxAmount. The reason for new-typing instead of using
u32 values is to avoid potentially mixing up different kinds of
The implementation for
TxAmount makes use of a
fpdec::Dec instead of
a floating point value: using floats for financial values is a terrible idea,
instead we should use integers to represent exact decimal fractions.
fpdec::Dec was used instead of a custom implementation for convenience. It
does allow storing values with a precision that is way higher than is needed in
our case (input/output should be limited 4 digits after the decimal point) but
this does not affect us negatively. It would be easy to swap in a custom numeric
type if it turns out to be more appropriate for our needs down the line.
Unfortunately, most of the code relating to (de)serialization is less than
ideal, from a code aesthetic view-point, under-using the idiomatic
library. Since CSV as an input/output format is very limited when paired with
serde, either some hacks were needed to massage the input into the correct
format for consumption by the payment processor (see
crate::transaction::TransactionRecord) or for output (see
crate::Ledger::dump_csv). The reasons are outlined in the related commits for
these features (see 1, 2, 3 for upstream issues).
Another thing to note:
Ledger::dump_csv outputs the accounts in order (even
though they are stored unordered), to simplify diff-ing and testing.
Accounts are represented as a map from
AccountInfo. The funds
are split between
available_funds (relating to disputes). The
total funds are computed when needed but otherwise not stored to avoid
dis-synchronization between a potential
total_funds and its `held_funds
- available_funds` equivalent.
Once a client account has been frozen (after a
Chargeback transaction) then
any attempt to modify the balance of this account will result in an error.
It is assumed that each transaction id is unique, however the key to map into
transaction-related data is
(ClientId, TxId). This is done to simplify the
error-handling in case a valid transaction is used with an invalid user and
vice-versa. Since all of
chargeback reference both
ids together, we should check that both of them are correct before further
All behaviour testing was done using unit tests runnable with
rust-analyzer fame) was used to easily compare between the
final state of a ledger and its expected value, and for the ease of writing new
tests (or updating them, should a bug be found and squashed).
All errors that are raised from
Ledger::process are non-fatal. They are the
result of invalid input, and mean that the transaction's processing was stopped,
and it is ignored.
The few errors states that could arise, from internal invariants not being
upheld, are instead surfaced through panics as further processing cannot go on
once an invariant is broken. The few
expect calls inside
Ledger::get_past_transaction_info are used for this reason.
Currently, the code is single threaded, reading the input CSV in a streaming
fashion and processing transactions one-at-a-time. If we wanted to expose the
ledger in a way that allows multiple transactions to transpire at once (e.g:
a REST API), we would need to guard data accesses to ensure that no invariants
are broken. Similarly, TOCTOU issues would be a big deal (for example, trying to
chargeback the same
disputed transaction twice at the same time: the first
one could finish processing after the second one already checked that it was
dispute state but before applying the chargeback, resulting in
a double charge). To deal with that we could serialize all accesses with a big
Mutex, but this would be slow and no different from handling each transaction
serially. Or we could instead have a lock per account affected by a transaction
which would be obtained at the very beginning of that transaction's processing
before any other action is taken. This would allow transactions affecting
different accounts to be processed in parallel, and would not impact the
correctness of the ledger or affect its internal invariants. This would be done
thanks to a concurrent hash-map with fine-grained locking (either bucket-level
It is unclear whether
dispute can be applied to both
transactions, or only the later. I took the more general position that it should
be possible to do both, but the point could be made that, rationally, who in
their right minds would dispute against some unexpected funds showing up in
their accounts (it's free money!).
Handling disputes is otherwise a very simple state machine, so I implemented it
TxState enum. This meant that any of
chargeback are handled very similarly, and resulted in (IMHO) quite elegant
code to guard against illegal states (like trying to dispute a transaction that
was already disputed).