- MacOS 14.4.1
- CMake v3.28.0 - minimum requirement: 3.20
- Clang v17.0.6 - compiler needs to support at least the C++17 standard
To build this project, run follows in the project root directory:
mkdir build && cd build
cmake ..
make
Then, you'll find a generated /bin directory under root. Now you can access through command line under /bin:
./main
After builiding, in /bulid run:
ctest
To perform query, you should write query directly in cpp source file. the query syntax is:
auto result = engine.queryOrderBookSnapshots(
processor,
{"SCS", "SCH"}, // symbols
0, // start time
6, // end time
{3, 2, 1}, // bids index level
{1, 2, 3}, // asks index level
true, // whether to output last trade price
true // whether to output last trade quantity
);This is a quick example in src/main.cpp:
#include <iostream>
#include "trading/execution/engine.h"
#include "trading/processor/processor.h"
int main() {
trading::OrderBookProcessor processor;
processor.ReadMarketData("./data/trade_test_data.log");
std::vector<std::string> symbol{"APPL"};
processor.ProcessMarketData(symbol[0], 6);
auto engine = trading::ExecutionEngine();
auto result = engine.queryOrderBookSnapshots(
processor, symbol, 0, 6, {3, 2, 1}, {1, 2, 3}, true, true);
std::cout << result << std::endl;
return 0;
}
And the result of example should be:
symbol, epoch, bid3q@bid3p bid2q@bid2p bid1q@bid1p X ask1q@ask1p ask2q@ask2p ask3q@ask3p, last trade price, last trade quantity
APPL, 1, 0@0.000 0@0.000 4@9.600 X 0@0.000 0@0.000 0@0.000, nan, NA
APPL, 2, 0@0.000 6@9.500 4@9.600 X 0@0.000 0@0.000 0@0.000, nan, NA
APPL, 3, 0@0.000 6@9.500 4@9.600 X 5@9.700 0@0.000 0@0.000, nan, NA
APPL, 4, 0@0.000 6@9.500 4@9.600 X 15@9.700 0@0.000 0@0.000, nan, NA
APPL, 5, 0@0.000 6@9.500 4@9.600 X 11@9.700 0@0.000 0@0.000, 9.700000, 4
APPL, 6, 0@0.000 0@0.000 6@9.500 X 11@9.700 0@0.000 0@0.000, 9.600000, 4
The desgin is mainly modularized in three parts, just acts like the question description:
- OrderBook
- Processor
- Execution Engine
- price-level data storage. I have utilized the
std::mapdata structure to store price levels based on their prices. This choice allows for sorted ordering of the price levels, ensuring efficient O(logn) time complexity for price queries. An alternative approach could have been usingstd::vector, which offers O(1) complexity for data append and deletion. However, finding a specific price level using a vector would have taken O(n) time. Thus, the vector data structure would have been more suitable for situations involving a small number of price levels or frequent changes in price levels. - order storage in same price level. I choose
std::vector. With FIFO feature,std::queuecould be a choice, but when perform CANCEL order operation,std::queuehas to pop a lot of times, which is costly.
This processor achieved three operations for Orderbook: NEW, CANCEL, TRADE
NEWoperation:- Find order in its price level.
- Push it in the back of
Orders
CANCELoperation:- Find order to be canceled by price level first, then find the order with same order_id in the
Orders, erase it
- Find order to be canceled by price level first, then find the order with same order_id in the
TRADEoperation:- Trade with the best price level first, then with
Ordersin FIFO order. - if the quantity at that price level is 0, erase this level in
OrderBook.
- Trade with the best price level first, then with
Apart from that, the processor will store current state of orderbook and snapshot for optimization on multi query. This means snapshot will only be processed once if we query 0-1000 then 0-500, snapshot will only be processed 1000 times.
The engine is relatively simple. As we only define what type of output we want, I just use a for-loop to find the field that satisfied demand, concat it, and finally print it. The result format should pretty similar as the description.
For implementation, when perform snapshot query, the ExecutionEngine will first look at whether the Processor has stored snapshot cache:
-
if yes,
ExectuionEnginewill find data fromSnapShotCacheinProcessor. -
if no,
ExectuionEnginewill callProcessorwill process data, up to the lastest time as query needed.