doc: add 'README.md'
This commit is contained in:
parent
e66f4f687d
commit
8d89bdf31d
111
README.md
Normal file
111
README.md
Normal file
|
@ -0,0 +1,111 @@
|
||||||
|
# Payment processor
|
||||||
|
|
||||||
|
## Implementation details
|
||||||
|
|
||||||
|
### Strong typing
|
||||||
|
|
||||||
|
The `crate::core` module contain three new-type-style tuple-struct definitions:
|
||||||
|
`ClientId`, `TxId`, and `TxAmount`. The reason for new-typing instead of using
|
||||||
|
bare `u16` or `u32` values is to avoid potentially mixing up different kinds of
|
||||||
|
ids.
|
||||||
|
|
||||||
|
### Monetary values
|
||||||
|
|
||||||
|
The implementation for `TxAmount` makes use of a `fpdec::Dec` instead of
|
||||||
|
a floating point value: using floats for financial values is a terrible idea,
|
||||||
|
instead we should use integers to represent exact decimal fractions.
|
||||||
|
`fpdec::Dec` was used instead of a custom implementation for convenience. It
|
||||||
|
does allow storing values with a precision that is way higher than is needed in
|
||||||
|
our case (input/output should be limited 4 digits after the decimal point) but
|
||||||
|
this does not affect us negatively. It would be easy to swap in a custom numeric
|
||||||
|
type if it turns out to be more appropriate for our needs down the line.
|
||||||
|
|
||||||
|
### Serialization/deserialization
|
||||||
|
|
||||||
|
Unfortunately, most of the code relating to (de)serialization is less than
|
||||||
|
ideal, from a code aesthetic view-point, under-using the idiomatic `serde`
|
||||||
|
library. Since CSV as an input/output format is very limited when paired with
|
||||||
|
`serde`, either some hacks were needed to massage the input into the correct
|
||||||
|
format for consumption by the payment processor (see
|
||||||
|
`crate::transaction::TransactionRecord`) or for output (see
|
||||||
|
`crate::Ledger::dump_csv`). The reasons are outlined in the related commits for
|
||||||
|
these features (see [1], [2], [3] for upstream issues).
|
||||||
|
|
||||||
|
Another thing to note: `Ledger::dump_csv` outputs the accounts in order (even
|
||||||
|
though they are stored unordered), to simplify diff-ing and testing.
|
||||||
|
|
||||||
|
[1]: https://github.com/BurntSushi/rust-csv/issues/211
|
||||||
|
[2]: https://github.com/BurntSushi/rust-csv/issues/172
|
||||||
|
[3]: https://github.com/BurntSushi/rust-csv/issues/98
|
||||||
|
|
||||||
|
### Account information
|
||||||
|
|
||||||
|
Accounts are represented as a map from `ClientId` to `AccountInfo`. The funds
|
||||||
|
are split between `held_funds` and `available_funds` (relating to disputes). The
|
||||||
|
total funds are computed when needed but otherwise not stored to avoid
|
||||||
|
dis-synchronization between a potential `total_funds` and its `held_funds
|
||||||
|
+ available_funds` equivalent.
|
||||||
|
|
||||||
|
Once a client account has been frozen (after a `Chargeback` transaction) then
|
||||||
|
any attempt to modify the balance of this account will result in an error.
|
||||||
|
|
||||||
|
### Transaction log
|
||||||
|
|
||||||
|
It is assumed that each transaction id is unique, however the key to map into
|
||||||
|
transaction-related data is `(ClientId, TxId)`. This is done to simplify the
|
||||||
|
error-handling in case a valid transaction is used with an invalid user and
|
||||||
|
vice-versa. Since all of `dispute`, `resolve`, and `chargeback` reference both
|
||||||
|
ids together, we should check that both of them are correct before further
|
||||||
|
processing.
|
||||||
|
|
||||||
|
### Testing
|
||||||
|
|
||||||
|
All behaviour testing was done using unit tests runnable with `cargo test`.
|
||||||
|
`expect-test` (of `rust-analyzer` fame) was used to easily compare between the
|
||||||
|
final state of a ledger and its expected value, and for the ease of writing new
|
||||||
|
tests (or updating them, should a bug be found and squashed).
|
||||||
|
|
||||||
|
### Error handling
|
||||||
|
|
||||||
|
All errors that are raised from `Ledger::process` are non-fatal. They are the
|
||||||
|
result of invalid input, and mean that the transaction's processing was stopped,
|
||||||
|
and it is ignored.
|
||||||
|
|
||||||
|
The few errors states that could arise, from internal invariants not being
|
||||||
|
upheld, are instead surfaced through panics as further processing cannot go on
|
||||||
|
once an invariant is broken. The few `expect` calls inside
|
||||||
|
`Ledger::get_past_transaction_info` are used for this reason.
|
||||||
|
|
||||||
|
### Parallelisation
|
||||||
|
|
||||||
|
Currently, the code is single threaded, reading the input CSV in a streaming
|
||||||
|
fashion and processing transactions one-at-a-time. If we wanted to expose the
|
||||||
|
ledger in a way that allows multiple transactions to transpire at once (e.g:
|
||||||
|
a REST API), we would need to guard data accesses to ensure that no invariants
|
||||||
|
are broken. Similarly, TOCTOU issues would be a big deal (for example, trying to
|
||||||
|
`chargeback` the same `dispute`d transaction twice at the same time: the first
|
||||||
|
one could finish processing after the second one already checked that it was
|
||||||
|
initially in `dispute` state but before applying the chargeback, resulting in
|
||||||
|
a double charge). To deal with that we could serialize all accesses with a big
|
||||||
|
`Mutex`, but this would be slow and no different from handling each transaction
|
||||||
|
serially. Or we could instead have a lock per account affected by a transaction
|
||||||
|
which would be obtained at the very beginning of that transaction's processing
|
||||||
|
before any other action is taken. This would allow transactions affecting
|
||||||
|
different accounts to be processed in parallel, and would not impact the
|
||||||
|
correctness of the ledger or affect its internal invariants. This would be done
|
||||||
|
thanks to a concurrent hash-map with fine-grained locking (either bucket-level
|
||||||
|
or entry-level).
|
||||||
|
|
||||||
|
### Disputes
|
||||||
|
|
||||||
|
It is unclear whether `dispute` can be applied to both `deposit` and `withdraw`
|
||||||
|
transactions, or only the later. I took the more general position that it should
|
||||||
|
be possible to do both, but the point could be made that, rationally, who in
|
||||||
|
their right minds would dispute against some unexpected funds showing up in
|
||||||
|
their accounts (it's free money!).
|
||||||
|
|
||||||
|
Handling disputes is otherwise a very simple state machine, so I implemented it
|
||||||
|
using the `TxState` enum. This meant that any of `dispute`, `resolve`, and
|
||||||
|
`chargeback` are handled very similarly, and resulted in (IMHO) quite elegant
|
||||||
|
code to guard against illegal states (like trying to dispute a transaction that
|
||||||
|
was already disputed).
|
Loading…
Reference in a new issue