Risk Controls

All of rocky-bot's risk machinery lives in three mechanisms: RiskCaps (threshold config), CircuitBreaker (runtime trip), and the Position-Cap Gate (strategy-layer pre-emptive avoidance). This page also covers the backend's LEVERAGE_V1 = 10 constant — the key fix for the margin-leak saga.

If you haven't read the strategy details, see Strategy Loops first.

1. RiskCaps — threshold config

Source: rocky_bot/risk.py

@dataclass
class RiskCaps:
    max_loss_usdc: float = 50.0          # cumulative PnL loss → CB trips
    max_notional_usdc: float = 200.0     # per-account notional cap (overridden to 150 in main)
    max_leverage: int = 10               # (currently only recorded, not enforced)
    api_errors_to_trip: int = 5          # consecutive API errors → CB trips
    pause_seconds: int = 60              # cooldown duration after trip
    feed_stale_seconds: int = 10         # Binance feed-stale window

One RiskCaps instance per account, injected by main.py:

# rocky_bot/main.py
circuits = {
    acc.id: CircuitBreaker(RiskCaps(max_notional_usdc=150.0))
    for acc in accounts
}

main.py overrides max_notional_usdc to 150.0 because the default of 200 proved too loose. This number directly caps each account's maximum position size (150 USDC of notional value) and is the key parameter for the funnel to operate stably.
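To make the effect of the override concrete, the sketch below repeats the dataclass defaults from the listing above and works out the margin that a position sitting exactly at the cap would lock at a leverage of 10. The helper margin_at_cap is hypothetical and not part of rocky_bot; it only illustrates the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class RiskCaps:
    # Defaults copied from the listing above.
    max_loss_usdc: float = 50.0
    max_notional_usdc: float = 200.0
    max_leverage: int = 10
    api_errors_to_trip: int = 5
    pause_seconds: int = 60
    feed_stale_seconds: int = 10


def margin_at_cap(caps: RiskCaps, leverage: int = 10) -> float:
    """Hypothetical helper: margin locked by a position exactly at the notional cap."""
    return caps.max_notional_usdc / leverage


default_caps = RiskCaps()
main_caps = RiskCaps(max_notional_usdc=150.0)  # the override used in main.py

print(margin_at_cap(default_caps))  # 20.0
print(margin_at_cap(main_caps))     # 15.0
```

Only the overridden field changes; every other threshold keeps its default.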

2. CircuitBreaker — runtime trip

Source: rocky_bot/risk.py

Each CircuitBreaker maintains:

class CircuitBreaker:
    caps: RiskCaps
    _consecutive_errors: int
    _cumulative_pnl: float        # cumulative wallet PnL (real-time delta)
    _opened_at: float | None      # trip timestamp; None = not tripped
    _open_reason: str | None
    _last_wallet: float | None    # last seen wallet, used to compute delta

2.1 Three trip conditions

Trigger	Condition	Action
API error streak	`_consecutive_errors >= api_errors_to_trip`	Trip for `pause_seconds`
Cumulative loss	`_cumulative_pnl <= -max_loss_usdc`	Trip permanently (until external reset)
Feed staleness	`feed.mid()` raises `StaleFeedError` (handled by the strategy, not the CB)	Skip this iteration; not counted toward the API error streak, so the CB does not trip
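The two conditions that actually trip the breaker can be sketched as a small state machine. This is a minimal illustration, not the code in rocky_bot/risk.py: the class name SketchBreaker, the record_pnl_delta method, the reason strings, and the injectable clock are all assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RiskCaps:
    max_loss_usdc: float = 50.0
    api_errors_to_trip: int = 5
    pause_seconds: int = 60


class SketchBreaker:
    """Illustrative only; not the rocky_bot implementation."""

    def __init__(self, caps: RiskCaps, clock: Callable[[], float] = time.monotonic):
        self.caps = caps
        self._clock = clock  # injectable so the cooldown can be tested without sleeping
        self._consecutive_errors = 0
        self._cumulative_pnl = 0.0
        self._opened_at: Optional[float] = None
        self._open_reason: Optional[str] = None

    def _trip(self, reason: str) -> None:
        if self._open_reason == "max_loss":
            return  # a loss trip is never downgraded to a timed trip
        self._opened_at = self._clock()
        self._open_reason = reason

    def record_api_success(self) -> None:
        self._consecutive_errors = 0

    def record_api_error(self) -> None:
        self._consecutive_errors += 1
        if self._consecutive_errors >= self.caps.api_errors_to_trip:
            self._trip("api_errors")

    def record_pnl_delta(self, delta: float) -> None:
        self._cumulative_pnl += delta
        if self._cumulative_pnl <= -self.caps.max_loss_usdc:
            self._trip("max_loss")

    def is_open(self) -> bool:
        if self._opened_at is None:
            return False
        if self._open_reason == "max_loss":
            return True  # stays open until something external resets it
        if self._clock() - self._opened_at >= self.caps.pause_seconds:
            # Cooldown elapsed: close the breaker and start a fresh error streak.
            self._opened_at = None
            self._open_reason = None
            self._consecutive_errors = 0
            return False
        return True
```

Passing a fake clock (for example `clock=lambda: now[0]`) lets a test advance time by hand and observe the timed trip closing after pause_seconds while the loss trip stays open.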

2.2 Behavior after trip

async def iterate_once(self):
    if self.circuit.is_open():
        return    # ← every strategy loop checks this first
    ...

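Every iteration starts by checking the CB. If it is open, the iteration returns immediately and the loop waits for the next one. After an API-error trip the CB recovers once pause_seconds have elapsed, so the next iteration passes; a cumulative-loss trip stays open until an external reset.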
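The early return, together with the stale-feed skip from section 2.1, can be exercised with stubs. In this sketch StubCircuit, StubFeed, SketchStrategy, and the returned status strings are hypothetical stand-ins; only is_open(), feed.mid(), and StaleFeedError are names taken from this page.

```python
import asyncio


class StaleFeedError(Exception):
    """Stand-in for the error the feed raises when its data is too old."""


class StubCircuit:
    def __init__(self, is_open: bool):
        self._is_open = is_open

    def is_open(self) -> bool:
        return self._is_open


class StubFeed:
    def __init__(self, stale: bool):
        self._stale = stale

    def mid(self) -> float:
        if self._stale:
            raise StaleFeedError("feed is stale")
        return 100.0


class SketchStrategy:
    def __init__(self, circuit: StubCircuit, feed: StubFeed):
        self.circuit = circuit
        self.feed = feed

    async def iterate_once(self) -> str:
        if self.circuit.is_open():
            return "cb_open"      # breaker open: do nothing this iteration
        try:
            self.feed.mid()
        except StaleFeedError:
            return "stale_feed"   # skipped, and not recorded as an API error
        return "quoted"           # a real strategy would place or refresh orders here


async def demo() -> list:
    return [
        await SketchStrategy(StubCircuit(True), StubFeed(False)).iterate_once(),
        await SketchStrategy(StubCircuit(False), StubFeed(True)).iterate_once(),
        await SketchStrategy(StubCircuit(False), StubFeed(False)).iterate_once(),
    ]


print(asyncio.run(demo()))  # ['cb_open', 'stale_feed', 'quoted']
```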

2.3 State update points

self.circuit.record_api_success()    # after every successful API call, clear _consecutive_errors
self.circuit.record_api_error()      # after every failed API call, +1
self.circuit.update_wallet(usdc)     # after balance() each iter, updates _cumulative_pnl
self.circuit.record_realised_pnl()   # (not called from bot, reserved)

Example from ladder.py:

balances = await self.client.balance()
self.circuit.record_api_success()
for b in balances:
    if b["asset"] == "USDC":
        try:
            self.circuit.update_wallet(float(b["balance"]))
        except (KeyError, ValueError):
            pass
        break
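The field comments in section 2 suggest that update_wallet turns successive wallet readings into a running PnL by summing deltas. A minimal sketch of that idea follows, assuming the first reading only sets the baseline; WalletPnlTracker and usdc_balance are hypothetical names, not rocky_bot functions.

```python
from typing import List, Optional


class WalletPnlTracker:
    """Accumulates PnL as the sum of wallet deltas between readings."""

    def __init__(self) -> None:
        self._last_wallet: Optional[float] = None
        self._cumulative_pnl = 0.0

    def update_wallet(self, usdc: float) -> float:
        if self._last_wallet is not None:
            self._cumulative_pnl += usdc - self._last_wallet
        self._last_wallet = usdc  # the first reading only establishes the baseline
        return self._cumulative_pnl


def usdc_balance(balances: List[dict]) -> Optional[float]:
    """Return the USDC balance as a float, or None if it is missing or malformed."""
    for b in balances:
        if b.get("asset") == "USDC":
            try:
                return float(b["balance"])
            except (KeyError, ValueError, TypeError):
                return None
    return None


tracker = WalletPnlTracker()
print(tracker.update_wallet(100.0))  # 0.0 (baseline)
print(tracker.update_wallet(97.5))   # -2.5
print(tracker.update_wallet(98.0))   # -2.0
print(usdc_balance([{"asset": "BTC", "balance": "1"}, {"asset": "USDC", "balance": "98.0"}]))  # 98.0
```

With max_loss_usdc = 50, a breaker built on this tracker would trip once the running total reaches -50 or below.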

3. Position-Cap Gate — strategy-layer avoidance

Not all risks can wait for the CB to trip — CB is the after-the-fact emergency brake, while the position cap is the active pre-emptive avoider.

3.1 Why we need it

Historical lesson (full saga in Deployment & Operations § 5-round saga):

Earlier ladder versions had no position-cap check. Every fill caused the ladder to quote another same-side order → one-sided inventory accumulated → wallet margin was exhausted → a flood of -2010 (insufficient balance) rejections → no orders could be placed → empty book.

Within 30 minutes a single account's locked margin went from $0 to $98 (near the wallet ceiling). The CB couldn't even trip in time (no API errors, just backend rejecting new orders).

3.2 The fix: pre-place would-be check

All three strategies (ladder / anchor / taker) do this check before place_order:

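positions = await self.client.position_risk(symbol=binance_sym)
# float() guards against string-encoded numbers, as in the balance() example above
pos_amt = float(positions[0]["positionAmt"])
mark = float(positions[0]["markPrice"])

# sign(side): +1 for a buy, -1 for a sell (assumes side is "BUY" or "SELL")
direction = 1 if side == "BUY" else -1
would_be = pos_amt + direction * qty
if abs(would_be * mark) > caps.max_notional_usdc:
    # cancel the existing same-side order so it cannot fill and add more
    if live_order:
        await self.client.cancel_order(order_id=live_order["orderId"])
    return     # don't place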

3.3 Key details

Gate runs before cancel-replace — judge cap first, then drift, so you don't "cancel an old one then immediately place a same-size new one"
Opposite-side always allowed — the cap only restricts adding; reducing is always free, letting opposing orders / takers shrink the position
Re-read positionRisk every iter — position may have changed between iters (taker or other maker filled against it)
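The would-be check and the rule that reducing orders are always allowed can be combined into one pure function. This is a sketch under assumptions: gate_allows is a hypothetical name, sides are taken to be the strings "BUY" and "SELL", and "reducing" is interpreted as the absolute position size not growing.

```python
def gate_allows(pos_amt: float, mark: float, side: str, qty: float,
                max_notional_usdc: float) -> bool:
    """Return True if an order of `qty` on `side` may be placed."""
    direction = 1.0 if side == "BUY" else -1.0
    would_be = pos_amt + direction * qty
    if abs(would_be) <= abs(pos_amt):
        return True  # the order shrinks exposure, so the cap does not apply
    return abs(would_be * mark) <= max_notional_usdc


CAP = 150.0
MARK = 100.0

print(gate_allows(1.4, MARK, "BUY", 0.2, CAP))    # False: 1.6 * 100 exceeds 150
print(gate_allows(1.4, MARK, "SELL", 0.2, CAP))   # True: reduces the long
print(gate_allows(1.6, MARK, "SELL", 0.05, CAP))  # True: still over the cap, but shrinking
print(gate_allows(-1.4, MARK, "SELL", 0.2, CAP))  # False: adds to the short
```

The third case is why the explicit reducing check matters: a bare notional comparison would block an order that moves an over-cap position back toward the cap.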

3.4 Per-strategy behavior on gate trigger

Strategy	Action when gated
Ladder	Cancel current same-side order + return (skip this iter)
Anchor	Cancel only that side's order + skip that side; the other side proceeds normally
Taker	Flip side to the reducing direction (not skip — actively reverses)
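The table can be read as a small dispatch on strategy type. The sketch below encodes it as data so the three behaviors are easy to compare; plan_when_gated, the strategy strings, and the returned dictionary shape are hypothetical, and the taker is assumed to have no resting order to cancel.

```python
from typing import Dict, List


def opposite(side: str) -> str:
    return "SELL" if side == "BUY" else "BUY"


def plan_when_gated(strategy: str, gated_side: str) -> Dict[str, List[str]]:
    """Sides to cancel and sides still allowed to place when `gated_side` hits the cap."""
    if strategy == "ladder":
        # Cancel the same-side order and skip the rest of the iteration.
        return {"cancel": [gated_side], "place": []}
    if strategy == "anchor":
        # Only the gated side is cancelled and skipped; the other side proceeds.
        return {"cancel": [gated_side], "place": [opposite(gated_side)]}
    if strategy == "taker":
        # The taker does not skip: it flips to the reducing direction.
        return {"cancel": [], "place": [opposite(gated_side)]}
    raise ValueError(f"unknown strategy: {strategy}")


for name in ("ladder", "anchor", "taker"):
    print(name, plan_when_gated(name, "BUY"))
```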

4. LEVERAGE_V1 — backend fix

Strictly speaking this is not part of the bot, but without this fix every bot-side cap was defeated.

4.1 Symptom

In backend apply_trade_matched (the fill ledger applier):

let leverage = (notional / order_margin).round_dp(0);    // old version

This derives leverage from notional and the original order_margin. Failure modes:

After partial fills, order_margin has been proportionally released
Price drift makes fill price differ from placement price
Combined, notional / order_margin could round to 7, 9, or 11 instead of 10

Result: apply_margin_recompute computed new_locked with the wrong leverage, so it did not match the order margin that decrement_with_margin_release had released. Each affected trade left a phantom increment of $1–3 in accounts.locked.
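A short numeric sketch shows how a rounded ratio drifts away from 10 and how that changes the margin computed from it. The input pairs are hypothetical, chosen only so the ratio rounds to 10, 9, 11, and 7; the half-even rounding mode and the notional / leverage margin formula are assumptions for illustration, not taken from the backend code.

```python
from decimal import Decimal, ROUND_HALF_EVEN

LEVERAGE_V1 = Decimal(10)


def derived_leverage(notional: Decimal, order_margin: Decimal) -> Decimal:
    """Mimics the old derivation: round(notional / order_margin) to an integer."""
    return (notional / order_margin).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)


# (label, notional, order_margin): hypothetical inputs, not production data
cases = [
    ("no drift", Decimal("100.00"), Decimal("10.00")),          # ratio 10.00 -> 10
    ("filled below quote", Decimal("94.60"), Decimal("10.00")),  # ratio 9.46 -> 9
    ("filled above quote", Decimal("105.20"), Decimal("10.00")), # ratio 10.52 -> 11
    ("ratio near seven", Decimal("72.00"), Decimal("10.00")),    # ratio 7.20 -> 7
]

for label, notional, order_margin in cases:
    lev = derived_leverage(notional, order_margin)
    margin_derived = notional / lev        # what a recompute would lock
    margin_fixed = notional / LEVERAGE_V1  # what a fixed leverage of 10 would lock
    print(f"{label}: leverage={lev} derived={margin_derived:.4f} fixed={margin_fixed:.4f}")
```

Whenever the derived leverage is not 10, the two margin figures disagree, which is the kind of mismatch that shows up as a discrepancy in the locked balance.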

4.2 Diagnostic data

After deploying an invariant logger, 956 violations were collected in 30 minutes. In 180 trades the diff went up; it never went down. pos_sum values contained the recurring digits 142857 (the decimal expansion of 1/7), which is consistent with a derived leverage of 7.
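The 142857 clue works because dividing by 7 produces that repeating digit cycle whenever the result is not a whole number. A minimal sketch of spotting it follows; has_sevenths_pattern is a hypothetical helper, not the invariant logger itself.

```python
from decimal import Decimal

# The six rotations of the repeating cycle of 1/7 (the expansions of 1/7 through 6/7).
CYCLE = "142857"
ROTATIONS = [CYCLE[i:] + CYCLE[:i] for i in range(len(CYCLE))]


def has_sevenths_pattern(value: Decimal) -> bool:
    """True if the decimal digits contain any rotation of the 1/7 cycle."""
    digits = str(value).replace(".", "").replace("-", "")
    return any(rot in digits for rot in ROTATIONS)


by_seven = Decimal("100") / Decimal("7")  # a notional divided by a leverage of 7
by_ten = Decimal("100") / Decimal("10")   # the same notional at leverage 10

print(by_seven)
print(has_sevenths_pattern(by_seven))  # True
print(has_sevenths_pattern(by_ten))    # False
```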

4.3 The fix

// services/internal-ledger/src/apply.rs
const LEVERAGE_V1: u32 = 10;
let taker_leverage = LEVERAGE_V1;
let maker_leverage = LEVERAGE_V1;

Four lines replace 12 lines of derivation. Premise: every layer above already hardcodes lev=10 (api-gateway/routes_orders.rs:96: leverage: 10, // default until /v1/leverage endpoint lands), so this is a true single source of truth.

4.4 Verification

After deploy, 5-min smoke + 30-min monitor:

0 invariant violations
max_locked = $27.31 (within cap)
over_80 == 0
-2010 count == 0
After 63 minutes max_locked stayed stable

Full 5-round margin-leak investigation: see rocky.interface/docs/superpowers/specs/2026-05-25-leverage-derivation-fix-design.md.

5. All risk config — quick reference

Param	Location	Current value	Effect
`max_notional_usdc`	`main.py` (override)	150	Per-account notional cap
`max_loss_usdc`	`RiskCaps` default	50	Cumulative loss trip threshold
`api_errors_to_trip`	`RiskCaps` default	5	API error count trip threshold
`pause_seconds`	`RiskCaps` default	60	CB cooldown
`LEVERAGE_V1`	`apply.rs` (backend)	10	Matching layer's locked-margin leverage
`TAKER_AGGRESSION`	`taker.py`	0.005 (50 bps)	How far taker crosses
`DRIFT_BPS` (ladder)	`ladder.py`	0.0002 (2 bps)	Live-order drift threshold
`DRIFT_BPS` (anchor)	`anchor.py`	0.0001 (1 bps)	Same, anchor more sensitive
`interval_s` (ladder)	`ladder.py`	3.0	Main loop period
`interval_s` (anchor)	`anchor.py`	2.0	Same
`base_interval_s` (taker)	`taker.py`	30.0	Same
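The fractional constants in the table are easiest to read against a concrete price. The sketch below assumes the drift threshold is compared with the relative distance between the live order price and the target price, and that taker aggression is applied as a relative offset from mid; both formulas and the function names are illustrative assumptions, not taken from ladder.py, anchor.py, or taker.py.

```python
LADDER_DRIFT = 0.0002     # 2 bps
ANCHOR_DRIFT = 0.0001     # 1 bps
TAKER_AGGRESSION = 0.005  # 50 bps


def needs_replace(live_price: float, target_price: float, drift: float) -> bool:
    """True if the live order has drifted further than `drift` from the target."""
    return abs(live_price - target_price) / target_price > drift


def taker_limit_price(mid: float, side: str, aggression: float = TAKER_AGGRESSION) -> float:
    """Price that crosses the book by `aggression` relative to mid."""
    if side == "BUY":
        return mid * (1 + aggression)
    return mid * (1 - aggression)


target = 100.0
print(needs_replace(100.015, target, LADDER_DRIFT))  # False: 1.5 bps is within 2 bps
print(needs_replace(100.015, target, ANCHOR_DRIFT))  # True: 1.5 bps exceeds 1 bps
print(needs_replace(100.03, target, LADDER_DRIFT))   # True: 3 bps exceeds 2 bps
print(round(taker_limit_price(100.0, "BUY"), 4))
print(round(taker_limit_price(100.0, "SELL"), 4))
```

On a mid of 100, the ladder tolerates a 0.02 move before re-quoting, the anchor only 0.01, and the taker is willing to pay up to about 0.50 through mid.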

6. Related

Strategy Loops: how these risk mechanisms are invoked in each strategy
Deployment & Operations: metrics for checking that risk is working, plus historical postmortems
