Tutorial: Statistical Arbitrage Strategy

In this tutorial we will create a statistical arbitrage trading application. It monitors market data for two different contracts in the same market and trades when the price gap between them (measured here as a price ratio) deviates from its average over the preceding period by more than a set number of standard deviations.

Front-leg trades are placed by one strategy, Arb1Strategy, and back-leg trades are placed by another strategy, Arb2Strategy.

When Arb1Strategy completes a front-leg trade, the overall risk of the account is passed to the back leg, which places the corresponding hedge and so completes the arbitrage.
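
The hand-off from front leg to back leg can be pictured with a small sketch. This only illustrates the idea and is not how Arb2Strategy is implemented: the function names, the position dictionary, and the clipping rule are all hypothetical. The weights of 1.0 per contract mirror the Port_Risk risk formula in the example configuration.

```python
def portfolio_risk(positions, weights):
    """Weighted sum of per-contract positions (signed notional)."""
    return sum(weights[symbol] * positions.get(symbol, 0.0) for symbol in weights)


def hedge_order(positions, weights, hedge_symbol, max_notional):
    """Signed notional the back leg would trade to bring portfolio risk to zero.

    Hypothetical helper: the result is clipped to +/- max_notional so that a
    single hedge never exceeds the cap.
    """
    risk = portfolio_risk(positions, weights)
    needed = -risk / weights[hedge_symbol]
    return max(-max_notional, min(max_notional, needed))


# Assumed weights, matching the 1.0 / 1.0 components of Port_Risk.
WEIGHTS = {"BSVUSDTSWAP": 1.0, "BSVUSDSWAP": 1.0}
```

For example, after the front leg buys 100 of BSVUSDTSWAP, hedge_order({"BSVUSDTSWAP": 100.0}, WEIGHTS, "BSVUSDSWAP", 1000.0) returns -100.0, meaning the back leg would sell 100 of BSVUSDSWAP.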

Steps to create an arbitrage strategy

To create a flexible strategy, you will need to do the following:

Decide which exchange to trade.
Add the symbols you plan to trade or listen for market data.
Add price models.
Add variables to generate signals.
Add models.
Add strategies.

For more details, see the quantlib_configuration reference.

Configure

We use a JSON file to configure the trading application. The file contains the API keys, the log path, and the strategy components. Please update the API key information in the example configuration below:

{
    "instance": {
        "license_id":"TRAIL001",
        "license_key":"apifiny123456",
        "log_path": "/data/cc/logs",
        "name": "sim1"
    },
    "sim": {
        "ioc_only": false,
        "use_tbbo": true,
        "delay_o2a_us": 0,
        "delay_a2m_us": 0
    },
    "fees": {
        "OKEX_SWAP": {
            "make": 0.0002,
            "take": 0.0004
        },
        "OKEX": {
            "make": 0.0003,
            "take": 0.0005
        }
    },
    "players": [
        ["BSVUSDTSWAP.OKEXSWAP_Player", ["CobJsonPlayer", {"port": ["BSVUSDTSWAP", "OKEX_SWAP"], "path": "/data/cc/cob_data"}]], 
        ["BSVUSDSWAP.OKEXSWAP_Player", ["CobJsonPlayer", {"port": ["BSVUSDSWAP", "OKEX_SWAP"], "path": "/data/cc/cob_data"}]]
    ],
    "risk_formulas": [
        ["Port_Risk", ["RiskFormula", {"components": [[["BSVUSDTSWAP", "OKEX_SWAP"], 1.0], [["BSVUSDSWAP", "OKEX_SWAP"], 1.0]]}]]
    ],
    "accounts": [
        [10001, ["Account", {"risk_formulas": ["Port_Risk"], "id": 10001}]]
    ],
    "symbols": [
        {"port": ["BSVUSDTSWAP", "OKEX_SWAP"], "cid": 10001}, 
        {"port": ["BSVUSDSWAP", "OKEX_SWAP"], "cid": 10002}
    ],
    "samplers": [
        ["std_sampler", ["TimeSampler", {"halflife": 1800, "msecs": 60000}]]
    ],
    "pricing_models": [
        ["BSVUSDTSWAP.OKEX_SWAP_askpx", ["AskPx", {"port": ["BSVUSDTSWAP", "OKEX_SWAP"]}]], 
        ["BSVUSDTSWAP.OKEX_SWAP_bidpx", ["BidPx", {"port": ["BSVUSDTSWAP", "OKEX_SWAP"]}]], 
        ["BSVUSDTSWAP.OKEX_SWAP_midpx", ["MidPx", {"port": ["BSVUSDTSWAP", "OKEX_SWAP"]}]],
        ["BSVUSDSWAP.OKEX_SWAP_askpx", ["AskPx", {"port": ["BSVUSDSWAP", "OKEX_SWAP"]}]],
        ["BSVUSDSWAP.OKEX_SWAP_bidpx", ["BidPx", {"port": ["BSVUSDSWAP", "OKEX_SWAP"]}]],
        ["BSVUSDSWAP.OKEX_SWAP_midpx", ["MidPx", {"port": ["BSVUSDSWAP", "OKEX_SWAP"]}]]
    ],
    "variables": [
        ["VAR_A_askpx", ["PriceVar", {"pm": "BSVUSDTSWAP.OKEX_SWAP_askpx"}]], 
        ["VAR_A_bidpx", ["PriceVar", {"pm": "BSVUSDTSWAP.OKEX_SWAP_bidpx"}]], 
        ["VAR_A_midpx", ["PriceVar", {"pm": "BSVUSDTSWAP.OKEX_SWAP_midpx"}]], 
        ["VAR_B_askpx", ["PriceVar", {"pm": "BSVUSDSWAP.OKEX_SWAP_askpx"}]], 
        ["VAR_B_bidpx", ["PriceVar", {"pm": "BSVUSDSWAP.OKEX_SWAP_bidpx"}]],
        ["VAR_B_midpx", ["PriceVar", {"pm": "BSVUSDSWAP.OKEX_SWAP_midpx"}]],
        ["RGAP_SBBA", ["Ratio", {"v1": "VAR_B_askpx", "v2": "VAR_A_askpx"}]], 
        ["RGAP_BBSA", ["Ratio", {"v1": "VAR_B_bidpx", "v2": "VAR_A_bidpx"}]], 
        ["RM_BDA", ["Ratio", {"v1": "VAR_B_midpx", "v2": "VAR_A_midpx"}]],
        ["EMA_SBBA", ["VarEma", {"variable": "RGAP_SBBA", "sampler": "std_sampler"}]],
        ["EMA_BBSA", ["VarEma", {"variable": "RGAP_BBSA", "sampler": "std_sampler"}]],
        ["STD_SBBA", ["VarStd2", {"variable": "RGAP_SBBA", "init_var_value": 1,"init_std_value": 0.0006,"sampler": "std_sampler"}]],
        ["STD_BBSA", ["VarStd2", {"variable": "RGAP_BBSA", "init_var_value": 1,"init_std_value": 0.0006,"sampler": "std_sampler"}]],
        ["UP", ["Add", {"v1": "EMA_SBBA", "v2": "STD_SBBA"}]],  
        ["DOWN", ["Sub", {"v1": "EMA_BBSA", "v2": "STD_BBSA"}]],  
        ["SELL_UP", ["GreaterThan", {"v1": "RGAP_SBBA", "v2": "UP"}]],  
        ["BUY_DOWN", ["LessThan", {"v1": "RGAP_BBSA", "v2": "DOWN"}]]
    ],
    "models": [
        ["model_a", ["SimpleModel", {"variable": "RM_BDA"}]], 
        ["model_b", ["SimpleModel", {"variable": "RM_BDA"}]]
    ],
    "strategies": [
        ["Arb01", ["Arb1Strategy", {"symbol": "BSVUSDTSWAP", "trade_market": "OKEX_SWAP","risk_id":0, "use_margin": true, "account": 10001, "use_separate_logs": true, "model": "model_a","rbda": "RM_BDA","sell_up": "SELL_UP","buy_down": "BUY_DOWN", "order_notional": 100, "max_notional": 1000, "max_risk": 500, "start_time": "00:30:00", "end_time": "23:59:59"}]],
        ["Arb02", ["Arb2Strategy", {"symbol": "BSVUSDSWAP", "trade_market": "OKEX_SWAP","risk_id":0,  "use_margin": true, "account": 10001, "use_separate_logs": true, "model": "model_a", "max_notional": 1000, "max_risk": 500, "start_time": "00:30:00", "end_time": "23:59:59"}]]
    ]
}
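
Most entries in "variables" refer to other entries by name, so a typo in a name is easy to miss. The sketch below is a hypothetical helper, not part of the algo SDK. It loads a configuration and reports references that do not resolve. It assumes that "pm" points to a pricing model, that "sampler" points to a sampler, and that "v1", "v2" and "variable" point to a variable defined earlier in the list. These assumptions are inferred from the example above rather than taken from the SDK documentation.

```python
import json


def undefined_references(config):
    """Return (variable_name, key, missing_target) for references that do not resolve."""
    pricing_models = {name for name, _ in config.get("pricing_models", [])}
    samplers = {name for name, _ in config.get("samplers", [])}
    defined_vars = set()
    problems = []
    for name, (_type, params) in config.get("variables", []):
        for key, target in params.items():
            if key == "pm":
                known = pricing_models
            elif key == "sampler":
                known = samplers
            elif key in ("v1", "v2", "variable"):
                known = defined_vars  # assumed: must appear earlier in the list
            else:
                continue  # plain parameter such as init_std_value
            if target not in known:
                problems.append((name, key, target))
        defined_vars.add(name)
    return problems


def check_file(path):
    """Load a JSON configuration file and return its unresolved references."""
    with open(path) as handle:
        return undefined_references(json.load(handle))
```

An empty list from check_file means every reference in "variables" resolved under the assumptions above. A file that is not valid JSON fails earlier, inside json.load.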

Principles of Statistical Arbitrage

Statistical arbitrage uses statistical analysis of historical data to estimate the probability distribution of related variables and, combined with fundamental analysis, to guide arbitrage trading. Compared with risk-free arbitrage, it takes on a small amount of additional risk, but it offers several times as many arbitrage opportunities.

The basic idea is to study the historical relationship between a set of related prices, check how stable that relationship has been, and estimate its probability distribution. The extreme region of that distribution is treated as the rejection region. When the real price relationship enters the rejection region, the relationship is considered unlikely to persist for long, and the arbitrageur has a high probability of success.

Steps of this Statistical Arbitrage

sbba = ask price of contract B / ask price of contract A
bbsa = bid price of contract B / bid price of contract A
sbba stands for the action: sell contract B and buy an equal value of contract A.
bbsa stands for the action: buy contract B and sell an equal value of contract A.
Compute the EMA and standard deviation of sbba over the past period: sbba_ema and sbba_std.
Compute the EMA and standard deviation of bbsa over the past period: bbsa_ema and bbsa_std.
When the real-time sbba is greater than sbba_ema + sbba_std, perform the sbba action.
When the real-time bbsa is less than bbsa_ema - bbsa_std, perform the bbsa action.
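
The rules above can be sketched in a few lines of Python, with the lower band written as in the BUY_DOWN variable of the configuration (bbsa below bbsa_ema - bbsa_std). This is a minimal illustration, not the SDK's VarEma or VarStd2 implementation: the class and function names are hypothetical, and the exponentially weighted variance update is one common formula chosen for the example. The half-life is expressed in samples. Reading "halflife": 1800 as seconds with one sample every 60 seconds would give 30 samples, but that reading is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class EwmStats:
    """Exponentially weighted mean and standard deviation of one series."""
    alpha: float  # weight given to each new sample, 0 < alpha <= 1
    mean: float   # starting value of the mean
    var: float    # starting value of the variance

    def update(self, x: float) -> None:
        # Incremental exponentially weighted mean/variance update.
        delta = x - self.mean
        self.mean += self.alpha * delta
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * delta * delta)

    @property
    def std(self) -> float:
        return math.sqrt(self.var)


def make_stats(halflife_samples: float, init_mean: float, init_std: float) -> EwmStats:
    # Convert a half-life measured in samples into a per-sample weight.
    alpha = 1.0 - 0.5 ** (1.0 / halflife_samples)
    return EwmStats(alpha=alpha, mean=init_mean, var=init_std ** 2)


def signals(ask_a, bid_a, ask_b, bid_b, sbba_stats, bbsa_stats):
    """Return (sell_b_buy_a, buy_b_sell_a) for one quote update."""
    sbba = ask_b / ask_a
    bbsa = bid_b / bid_a
    sell_up = sbba > sbba_stats.mean + sbba_stats.std
    buy_down = bbsa < bbsa_stats.mean - bbsa_stats.std
    return sell_up, buy_down
```

In use, one EwmStats object is kept per ratio, for example make_stats(30, 1.0, 0.0006) to mirror the initial values in the configuration. signals is called on each quote, and update is called with the latest sbba or bbsa each time the sampler fires, so the bands always reflect past samples only.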

Get data and algo sdk

Get the simulation data (quantlib_data.tar.gz) and the algo SDK from the download page.

Unpack the algo SDK to algo_sdk/bin/xlibs.
Unpack quantlib_data.tar.gz to /data/cc/cob_data.

Run

The arbitrage strategy depends on several shared libraries, so you need to set up some environment variables first.

Set the environment variable ALGO_HOME to the path of your algo_sdk, e.g. /data/cc/algo_sdk:

export ALGO_HOME=YOUR_ALGO_SDK_PATH

Set up the other environment variables:

export TZ=UTC
export LD_LIBRARY_PATH=${ALGO_HOME}/bin:$LD_LIBRARY_PATH
export PATH=${ALGO_HOME}/bin:$PATH

You can now run the application in simulation. The example command below passes the path to a JSON configuration file, followed by a date in YYYYMMDD format (20220705):

ccc_sim_trader ${ALGO_HOME}/examples/arbitrage/cfg/arbitrage001.json 20220705

You can plot the simulation results of the strategy as a graph:

cd ~/code/algo_sdk/scripts
python3 sim_ana.py -p /data/cc/logs -sd 20220705 -ed 20220705

The results look like this: