mts1b-riskengine

Live risk enforcement: pretrade gates, synthetic stops, drawdown halt, broker-exit reconciler.

Repo: github.com/MTS1B/mts1b-riskengine
Layer: 3
Depends on: foundation, platform, portfolio, quantkit
Audience: mts1b-oms (every order passes through)

What it is

The enforcement layer between strategy and execution. mts1b-oms submits every order to mts1b-riskengine, which either passes it through all 7 gates or rejects it with a structured reason.

Pure-math kernels (Sharpe, MaxDD computations, etc.) live in mts1b-quantkit. This repo is enforcement + workers.

The 7 gates

See concept: risk envelopes for the full hierarchy. Summary:

#	Gate	What it checks
1	Idempotency	Dedupe on `Order.idempotency_key`
2	Schema	pydantic validation
3	Static risk	allowed_brokers, max_order_notional, allowed_order_types
4	Position risk	max_position_pct, gross/net exposure, sector/asset-class concentration
5	Drawdown halt	daily/weekly/monthly loss thresholds → halt when breached
6	Short-side	enable_shorting flag, borrow fee ceiling, locate check (v2)
7	CRO veto (optional)	LLM-based edge-case override; fail-OPEN with 5s timeout
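
The table implies a short-circuit pipeline: gates run in order and the first rejection ends the check. A minimal sketch of that shape follows, assuming a much smaller order type than the real schema. The `Order` fields, the `Rejection` type, the factory functions, and the rejection codes are illustrative stand-ins, not the repo's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Order:
    # Hypothetical minimal order; the real schema has more fields.
    idempotency_key: str
    broker: str
    notional: float


@dataclass
class Rejection:
    gate: str
    code: str
    reason: str


# A gate returns None to pass, or a Rejection to stop the pipeline.
Gate = Callable[[Order], Optional[Rejection]]


def make_idempotency_gate() -> Gate:
    seen: set[str] = set()  # in-memory stand-in for a durable store

    def gate(order: Order) -> Optional[Rejection]:
        if order.idempotency_key in seen:
            return Rejection("idempotency", "DUPLICATE_ORDER", "key already seen")
        seen.add(order.idempotency_key)
        return None

    return gate


def make_static_gate(allowed_brokers: set[str], max_order_notional: float) -> Gate:
    def gate(order: Order) -> Optional[Rejection]:
        if order.broker not in allowed_brokers:
            return Rejection("static", "BROKER_NOT_ALLOWED", f"broker {order.broker!r}")
        if order.notional > max_order_notional:
            return Rejection(
                "static",
                "MAX_ORDER_NOTIONAL_EXCEEDED",
                f"{order.notional} > {max_order_notional}",
            )
        return None

    return gate


def check_order(order: Order, gates: list[Gate]) -> Optional[Rejection]:
    """Run gates in order; return the first rejection, or None if all pass."""
    for gate in gates:
        rejection = gate(order)
        if rejection is not None:
            return rejection
    return None
```

Putting the cheap, deterministic gates first means most bad orders are rejected before any position or P&L state is read.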

Module layout

mts1b_riskengine/
├── gates/
│   ├── idempotency.py
│   ├── static.py
│   ├── position.py
│   ├── drawdown.py
│   ├── short.py
│   └── cro_veto.py
├── envelope/
│   ├── loader.py           # listen on mts.v1.risk.envelope.updated
│   └── audit.py            # envelope version history
├── workers/
│   ├── synthetic_stop_runner.py
│   ├── broker_exit_reconciler.py
│   └── position_arm_sweeper.py
└── service/
    └── grpc.py             # gRPC server called by mts1b-oms

gRPC interface

mts1b_riskengine.proto
service RiskEngine {
  rpc CheckOrder(Order) returns (RiskDecision);
  rpc GetEnvelope(EnvelopeRequest) returns (RiskEnvelope);
}

message RiskDecision {
  string order_id = 1;
  bool approved = 2;
  string rejecting_gate = 3;       // empty if approved
  string rejection_code = 4;       // "MAX_POSITION_EXCEEDED" etc.
  string reason = 5;
  map<string, string> details = 6;
  string envelope_id = 7;
}

OMS calls CheckOrder synchronously for every order. p99 target: ≤ 5 ms.
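
One way to check a latency budget like this is to time repeated calls and read off the 99th percentile. The sketch below uses only the standard library. `measure_ms` and `p99_ms` are hypothetical helper names, and `call` stands in for whatever client wrapper actually invokes `CheckOrder`.

```python
import statistics
import time
from typing import Callable

P99_TARGET_MS = 5.0  # the target stated above


def measure_ms(call: Callable[[], object], samples: int = 1000) -> list[float]:
    """Time `call` repeatedly and return per-call latencies in milliseconds."""
    latencies = []
    for _ in range(samples):
        start = time.perf_counter()
        call()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def p99_ms(latencies_ms: list[float]) -> float:
    """99th percentile of the samples.

    statistics.quantiles(n=100) returns 99 cut points; the last is p99.
    """
    return statistics.quantiles(latencies_ms, n=100)[-1]


def within_budget(latencies_ms: list[float], target_ms: float = P99_TARGET_MS) -> bool:
    return p99_ms(latencies_ms) <= target_ms
```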

Synthetic stops

We don't rely on broker-side stop-loss orders, which can fill far from the stop level when price gaps through it. Instead, every filled position registers a synthetic stop:

# After fill at $500, stop at $500 × (1 - 0.10) = $450
synthetic_stop = fill_price * (1 - envelope.synthetic_stop_pct)

synthetic_stop_runner.py polls quotes every N seconds (default 5s for crypto, 60s for equities). When price crosses the stop, the worker emits a closing order via the OMS — same path, same gates.

This means stops are honored even on:

Broker outage (broker-side stop wouldn't fire)
Flash crash (broker stop might be jumped)
After-hours gaps (RTH-only stops miss them)
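
The stop arithmetic and trigger check can be sketched as below for a long position. This is a minimal illustration, not the worker's code: the function names and the asset-class keys are hypothetical, `Decimal` is used only to keep the money arithmetic exact, and treating a price equal to the stop as a trigger is an assumption.

```python
from decimal import Decimal

# Default poll intervals from the text above; the key names are assumptions.
DEFAULT_POLL_INTERVAL_S = {"crypto": 5, "equities": 60}


def synthetic_stop_price(fill_price: Decimal, stop_pct: Decimal) -> Decimal:
    """Stop level for a long position: fill_price * (1 - stop_pct)."""
    return fill_price * (1 - stop_pct)


def stop_crossed(last_price: Decimal, stop_price: Decimal) -> bool:
    """True once the last price is at or below the stop (long positions only)."""
    return last_price <= stop_price


def poll_interval_s(asset_class: str) -> int:
    """Seconds between quote polls for the given asset class."""
    return DEFAULT_POLL_INTERVAL_S[asset_class]
```

With a fill at 500 and a 10% stop, `synthetic_stop_price` gives 450, matching the comment in the snippet above.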

Broker-exit reconciler

Periodically reconciles OMS positions ↔ broker positions:

# every 120s
for fund_id in active_funds:
    oms_positions = await oms.get_positions(fund_id=fund_id)
    broker_positions = await broker.get_positions()

    discrepancies = diff(oms_positions, broker_positions)
    if discrepancies:
        await alert(f"position drift detected: {discrepancies}")
        # then either: auto-reconcile (configurable) or halt

This catches lost-fill bugs, broker reconciliation lag, and manual broker trades before the drift snowballs.
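
The `diff` step is not shown in the source. One plausible shape, assuming both sides can be reduced to a mapping from symbol to signed quantity, is sketched here. The `Positions` alias and the return format are illustrative choices.

```python
from decimal import Decimal

# Assumed representation: symbol -> signed quantity (negative for shorts).
Positions = dict[str, Decimal]


def diff(
    oms_positions: Positions, broker_positions: Positions
) -> dict[str, tuple[Decimal, Decimal]]:
    """Return {symbol: (oms_qty, broker_qty)} for every symbol that disagrees.

    A symbol missing from one side is treated as a zero quantity there.
    """
    zero = Decimal(0)
    discrepancies = {}
    for symbol in sorted(oms_positions.keys() | broker_positions.keys()):
        oms_qty = oms_positions.get(symbol, zero)
        broker_qty = broker_positions.get(symbol, zero)
        if oms_qty != broker_qty:
            discrepancies[symbol] = (oms_qty, broker_qty)
    return discrepancies
```

An empty result is falsy, so it fits the `if discrepancies:` check in the loop above.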

Drawdown halt — the kill switch

This is the most important production safety mechanism. When a daily, weekly, or monthly loss threshold is breached:

# In drawdown.py
if today_pl_pct <= -envelope.daily_loss_halt_pct:
    await publish_halt(severity="DAILY_LOSS", fund_id=fund_id)
    # mts1b-oms stops accepting new risk-on orders
    # existing positions NOT auto-closed (don't sell into a crash)

Halt is sticky: lifting it requires an operator to run `mts cmd resume <fund_id>`. This is deliberate, so the safety net can't be removed by accident.
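
To make the sticky behavior concrete, here is a small in-memory sketch. It assumes P&L and thresholds are expressed as fractions (so -0.06 means a 6% loss). The `HaltState` class and its methods are hypothetical, and a real implementation would persist this state rather than hold it in a dict.

```python
from dataclasses import dataclass, field


@dataclass
class HaltState:
    # fund_id -> severity of the halt currently in force
    halted: dict[str, str] = field(default_factory=dict)

    def check_daily(
        self, fund_id: str, today_pl_pct: float, daily_loss_halt_pct: float
    ) -> bool:
        """Record a halt on breach; return whether the fund is halted now."""
        if today_pl_pct <= -daily_loss_halt_pct:
            self.halted.setdefault(fund_id, "DAILY_LOSS")
        # Sticky: a later recovery in P&L does not clear the halt.
        return fund_id in self.halted

    def resume(self, fund_id: str) -> None:
        """Explicit operator action; the only way to lift a halt."""
        self.halted.pop(fund_id, None)
```

Because `check_daily` only ever adds a halt, no sequence of P&L readings can lift it. Only `resume` removes the entry.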

CRO veto (optional, advisory)

For edge cases the deterministic gates can't catch, an optional LLM-based veto:

# Reads the order + recent market context, decides
result = await mts1b_llm.persona("CRO").veto(
    order=order,
    market_context=...,
    envelope=envelope,
    timeout=5.0,
)

if result.veto and result.confidence > 0.85:
    return RiskDecision(approved=False, ...)

Fail-OPEN: if the LLM doesn't respond within 5 s, the order passes. Failing closed would make every order depend on LLM availability.
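
The fail-open timeout can be expressed with `asyncio.wait_for`. In this sketch `ask_cro` is a hypothetical stand-in for the LLM call, and `VetoResult` mirrors only the two fields used in the snippet above. Only timeouts are treated as fail-open, since the source does not say how other errors are handled.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class VetoResult:
    veto: bool
    confidence: float


async def cro_allows(
    ask_cro: Callable[[], Awaitable[VetoResult]],
    timeout_s: float = 5.0,
    confidence_threshold: float = 0.85,
) -> bool:
    """Return True if the order may proceed past the CRO gate."""
    try:
        result = await asyncio.wait_for(ask_cro(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Fail-open: no answer in time means the order passes.
        return True
    # Block only on a veto that clears the confidence threshold.
    return not (result.veto and result.confidence > confidence_threshold)
```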

Configuration

services/mts1b-riskengine/config.yaml
grpc:
  port: 50052

envelope:
  reload_interval_s: 60
  versions_kept: 100        # for audit trail

synthetic_stop:
  enabled: true
  poll_interval_s: 5        # crypto
  poll_interval_equities_s: 60

broker_exit_reconciler:
  enabled: true
  interval_s: 120
  alert_threshold: 1        # alert on any discrepancy

cro_veto:
  enabled: false            # opt-in; defaults off
  timeout_s: 5.0
  confidence_threshold: 0.85

Build + test

pip install -e ".[dev]"
pytest -v                   # gate logic + worker tests
pytest -m integration       # with NATS + Postgres

Roadmap

Version	Items
0.1 (Wave 1)	7 gates + synthetic stops + broker-exit reconciler + drawdown halt
0.2 (Wave 2)	Borrow locate (hard-to-borrow check)
0.3 (Wave 2)	Per-strategy envelopes (today only per-fund)
0.4 (Wave 3)	Reg-reporting (CAT/OATS/MiFID) integration
1.0 (LTS)	Stable enforcement interface
