Self-Hardening Governance

MUTATION LIFECYCLE

Every Change is Paper-Tested First

When your agent proposes a strategy improvement, it does not go live immediately. The mutation state machine ensures every change proves itself against real market data before it ever touches your capital.

IDLE → ANALYZING → PAPER TESTING → EVALUATING → PROMOTED or ROLLED BACK
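As a rough illustration, the lifecycle above can be modeled as a small state machine that rejects any transition the diagram does not show. This is a minimal sketch, not the product's implementation: the names `MutationState`, `ALLOWED`, and `transition` are hypothetical, and it assumes only the arrows shown exist (for example, it does not model an early rollback from PAPER TESTING).

```python
from enum import Enum


class MutationState(Enum):
    IDLE = "IDLE"
    ANALYZING = "ANALYZING"
    PAPER_TESTING = "PAPER_TESTING"
    EVALUATING = "EVALUATING"
    PROMOTED = "PROMOTED"
    ROLLED_BACK = "ROLLED_BACK"


# Allowed transitions, read off the diagram (assumption: no other edges exist).
# PROMOTED and ROLLED_BACK are treated as terminal for a single mutation.
ALLOWED = {
    MutationState.IDLE: {MutationState.ANALYZING},
    MutationState.ANALYZING: {MutationState.PAPER_TESTING},
    MutationState.PAPER_TESTING: {MutationState.EVALUATING},
    MutationState.EVALUATING: {MutationState.PROMOTED, MutationState.ROLLED_BACK},
    MutationState.PROMOTED: set(),
    MutationState.ROLLED_BACK: set(),
}


def transition(current: MutationState, target: MutationState) -> MutationState:
    """Return the new state, or raise if the diagram has no such edge."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

With this shape, a mutation cannot reach PROMOTED without passing through PAPER TESTING and EVALUATING first.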

SHADOW EXECUTION

Paper Testing Against Live Data

The proposed mutation runs as a shadow strategy alongside your live agent. It receives the exact same market data feed -- same prices, same timestamps, same order book conditions. But it trades with simulated money.

After the required number of paper trades, the system evaluates the results. If the mutation clears the configured benchmark and passes all harness cases, it gets promoted. If not, it gets rolled back. Your live strategy never changes until the mutation proves itself.

Same market data as your live agent
Configurable number of test trades (default: 10)
Must pass all active harness cases
Baseline staleness detection (fail-closed on concurrent changes)
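The promotion decision described above can be sketched as a single function. Everything here is an assumption made for illustration: the `PaperResult` fields, the `evaluate` function, and the choice of win rate as the benchmark (the page only says the mutation must clear "the configured benchmark"). The fail-closed behavior is modeled by rolling back whenever the baseline hash captured at the start no longer matches the current one.

```python
from dataclasses import dataclass


@dataclass
class PaperResult:
    wins: int
    trades: int
    blocking_failures: int        # blocking harness cases the mutation triggered
    baseline_hash_at_start: str   # hash of the live strategy when testing began
    baseline_hash_now: str        # hash of the live strategy at evaluation time


def evaluate(result: PaperResult,
             required_trades: int = 10,
             benchmark_win_rate: float = 0.52) -> str:
    """Return the next state for a mutation under test (hypothetical logic)."""
    if result.trades < required_trades:
        return "PAPER_TESTING"    # not enough paper trades yet
    if result.baseline_hash_at_start != result.baseline_hash_now:
        return "ROLLED_BACK"      # fail closed: the live strategy changed underneath
    if result.blocking_failures > 0:
        return "ROLLED_BACK"      # must pass all active blocking harness cases
    if result.wins / result.trades <= benchmark_win_rate:
        return "ROLLED_BACK"      # did not clear the benchmark
    return "PROMOTED"
```

For example, 7 wins in 10 paper trades against a 52% benchmark, with no blocking failures and an unchanged baseline, yields `"PROMOTED"`.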

mutation_lifecycle.log (LIVE)

32:01  ANALYZE  Learning triggered (drawdown spike)
32:03  ANALYZE  23 losing trades in low-volume mkts
32:05  PROPOSE  Add volume filter, lower threshold
32:05  PROPOSE  Code hash: a7f2...3b1c
32:06  PAPER    Shadow strategy instantiated
32:06  PAPER    Baseline hash captured: d4e1...8f2a
45:22  PAPER    Trade 1/10: YES FOMC-MAR @ 42c +$2.40
52:18  PAPER    Trade 5/10: NO LAL-BOS @ 61c -$1.20
18:44  PAPER    Trade 10/10 complete. Win: 7/10
18:44  EVAL     Paper win rate: 70% vs live 52%
18:44  EVAL     Harness: 4/4 passed (0 blocking)
18:44  EVAL     Baseline hash unchanged (no drift)
18:45  PROMOTE  Mutation deployed to live strategy

LOSS CLASSIFICATION

Not Every Loss is a Mistake

When your agent takes a losing trade, the AI classifies it into one of three categories. Only structural failures create safety rules. Noise is recorded as an observation only, because a bad trade today could be a good trade tomorrow, and sizing issues adjust position parameters without blocking the trade type.

Structural

The signal logic was fundamentally wrong for the market conditions. This trade should not have been taken.

Creates harness rule

Noise

Right thesis, wrong timing. A fakeout or random variance. The same setup could win next time.

Observation only

Sizing

Right idea, but the position was too large for the risk. The trade type is fine, the allocation was not.

Adjusts parameters
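Once a loss has been classified, routing it to the right follow-up is a simple lookup. The sketch below covers only that deterministic step, not the AI classification itself; the enum, the action names, and `action_for` are hypothetical labels for the three outcomes listed above.

```python
from enum import Enum


class LossClass(Enum):
    STRUCTURAL = "structural"
    NOISE = "noise"
    SIZING = "sizing"


# Follow-up per category, as described above (action names are illustrative).
ACTIONS = {
    LossClass.STRUCTURAL: "create_harness_rule",
    LossClass.NOISE: "record_observation",
    LossClass.SIZING: "adjust_parameters",
}


def action_for(loss_class: LossClass) -> str:
    """Map a classified loss to its follow-up action."""
    return ACTIONS[loss_class]
```

Keeping the mapping in one table makes it easy to audit that only structural losses ever create a harness rule.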

HARNESS CASES

Safety Rules That Only Grow

When a structural loss is classified, the AI generates a harness case -- a boolean rule that blocks future mutations from making the same type of mistake. These rules accumulate over time. They are never automatically deleted.

Every future strategy mutation must pass all active harness cases during paper testing. If a mutation would trigger a known structural failure pattern, it gets blocked before it ever reaches your live capital.

Monotonic accumulation: safety only grows
Blocking rules prevent promotion, warning rules log only
User can deactivate rules (never deleted, always auditable)
Full evidence trail for every classification

harness_cases (4 ACTIVE)

Date    Trade  Rule                                              Severity
Feb 14  #47    Block YES trades when bid-ask spread > 15c        BLOCKING
Feb 11  #38    Block entries when market volume < 500 contracts  BLOCKING
Feb 8   #29    Warn on YES side when confidence < 0.6            WARNING
Feb 3   #14    Block trades on markets expiring within 2 hours   BLOCKING
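One way to picture harness cases is as named boolean predicates over a proposed trade, each marked blocking or warning. The sketch below encodes the four example rules shown above under stated assumptions: the `HarnessCase` class, the trade dictionary keys (`side`, `spread_cents`, `volume`, `confidence`, `hours_to_expiry`), and the `check` function are all hypothetical, and "within 2 hours" is read as two hours or less.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HarnessCase:
    name: str
    violates: Callable[[dict], bool]  # True when the trade matches the failure pattern
    blocking: bool                    # False means warning only
    active: bool = True               # users can deactivate, but cases are never deleted


CASES = [
    HarnessCase("YES trade with spread > 15c",
                lambda t: t["side"] == "YES" and t["spread_cents"] > 15, blocking=True),
    HarnessCase("market volume < 500 contracts",
                lambda t: t["volume"] < 500, blocking=True),
    HarnessCase("YES side with confidence < 0.6",
                lambda t: t["side"] == "YES" and t["confidence"] < 0.6, blocking=False),
    HarnessCase("market expires within 2 hours",
                lambda t: t["hours_to_expiry"] <= 2, blocking=True),
]


def check(trade: dict, cases: list) -> tuple:
    """Return (blocking_failures, warnings) as lists of case names."""
    blocking, warnings = [], []
    for case in cases:
        if case.active and case.violates(trade):
            (blocking if case.blocking else warnings).append(case.name)
    return blocking, warnings
```

A mutation whose paper trades produce any entry in the first list would be held back from promotion, while entries in the second list are logged only.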

EXECUTION GOVERNOR

The Final Gate Before Every Trade

Every signal passes through the Execution Governor before it becomes a real trade. Circuit breaker, rate limiter, and harness enforcement -- three layers of protection that run on every single signal.

Strategy → LLM Review → Governor Gate → Execute

Circuit Breaker: OK
Consecutive Losses: 1 / 5
Rate Limit: 3 / 10
Harness Cases: 4
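The three layers of the gate can be sketched as one check that runs on every signal. This is an illustration under assumptions, not the product's code: the `ExecutionGovernor` class and its method names are hypothetical, the rate limit is modeled as a sliding 60-second window that counts only signals that were let through, and a winning trade is assumed to reset the loss streak.

```python
import time
from collections import deque


class ExecutionGovernor:
    def __init__(self, max_consecutive_losses=5, max_signals_per_minute=10,
                 clock=time.monotonic):
        self.max_losses = max_consecutive_losses
        self.max_signals = max_signals_per_minute
        self.clock = clock            # injectable so the gate can be tested
        self.consecutive_losses = 0
        self.recent = deque()         # timestamps of recently allowed signals

    def record_result(self, won: bool) -> None:
        # Assumption: any win resets the streak; any loss extends it.
        self.consecutive_losses = 0 if won else self.consecutive_losses + 1

    def allow(self, harness_blocked: bool = False):
        """Return (allowed, reason) for one incoming signal."""
        now = self.clock()
        # Drop timestamps that have aged out of the 60-second window.
        while self.recent and now - self.recent[0] >= 60:
            self.recent.popleft()
        if self.consecutive_losses >= self.max_losses:
            return False, "circuit_breaker"
        if len(self.recent) >= self.max_signals:
            return False, "rate_limit"
        if harness_blocked:
            return False, "harness"
        self.recent.append(now)
        return True, "ok"
```

The order of the checks is a design choice: the circuit breaker is evaluated first so that a tripped breaker stops everything regardless of rate or harness state.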

EVIDENCE LAYER

Every Decision on Record

The evidence layer creates an immutable record of every governance decision. Mutation state transitions, paper trade results, loss classifications, harness evaluations, and promotion decisions -- all stored as INSERT-only records that are never modified.

This gives you a complete audit trail of how your agent evolved, why it made changes, and what safety checks it passed or failed along the way.

INSERT-only: records are never modified or deleted
Six evidence types: mutation, classification, harness, performance, vote, risk
Filterable by agent, type, and mutation ID
Accessible via dashboard and API
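An INSERT-only store can be approximated in SQLite by adding triggers that abort every UPDATE and DELETE. The sketch below is one possible way to get that property and makes no claim about the actual storage engine or schema: the table name, column names, and the `open_store`, `record`, and `query` helpers are hypothetical. Only the six evidence types and the three filters come from the list above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE evidence (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    agent_id      TEXT NOT NULL,
    evidence_type TEXT NOT NULL CHECK (evidence_type IN
        ('mutation', 'classification', 'harness', 'performance', 'vote', 'risk')),
    mutation_id   TEXT,
    detail        TEXT NOT NULL
);
CREATE TRIGGER evidence_no_update BEFORE UPDATE ON evidence
BEGIN SELECT RAISE(ABORT, 'evidence is INSERT-only'); END;
CREATE TRIGGER evidence_no_delete BEFORE DELETE ON evidence
BEGIN SELECT RAISE(ABORT, 'evidence is INSERT-only'); END;
"""


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
    conn.executescript(SCHEMA)
    return conn


def record(conn, agent_id, evidence_type, detail, mutation_id=None) -> None:
    conn.execute(
        "INSERT INTO evidence (agent_id, evidence_type, mutation_id, detail) "
        "VALUES (?, ?, ?, ?)",
        (agent_id, evidence_type, mutation_id, detail),
    )


def query(conn, agent_id, evidence_type=None, mutation_id=None):
    """Filter by agent, and optionally by type and mutation ID."""
    sql = "SELECT evidence_type, detail FROM evidence WHERE agent_id = ?"
    params = [agent_id]
    if evidence_type is not None:
        sql += " AND evidence_type = ?"
        params.append(evidence_type)
    if mutation_id is not None:
        sql += " AND mutation_id = ?"
        params.append(mutation_id)
    return conn.execute(sql + " ORDER BY id", params).fetchall()
```

Any attempt to run `UPDATE` or `DELETE` against the table raises a `sqlite3` error, so existing rows stay as written.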

evidence_trail (IMMUTABLE)

18  MUTATION  State: EVALUATING -> PROMOTED (mutation_a7f2)
18  HARNESS   4/4 cases passed. 0 blocking failures.
18  PERF      Paper results: 10 trades, 70% win, +$14.20 PnL
52  CLASSIFY  Trade #52 -> NOISE (right thesis, fakeout)
45  MUTATION  State: PAPER_TESTING -> EVALUATING (10/10)
32  MUTATION  State: IDLE -> ANALYZING (trigger: drawdown_spike)
28  CLASSIFY  Trade #47 -> STRUCTURAL (wide spread, low liquidity)
28  HARNESS   New rule: block when spread > 15c (BLOCKING)
10  CLASSIFY  Trade #46 -> SIZING (position too large for vol)
55  PERF      Hourly stats: 3 trades, 67% win, +$6.40 PnL

CONFIGURATION

Opt-In, Fully Configurable

The control plane is off by default. Enable it per agent and tune every parameter.

agent_config.json (EDITABLE)

{
  "control_plane_enabled": true,
  "paper_test_trades": 10,
  "staleness_window_secs": 30.0,
  "learning_enabled": true,
  "auto_approve": false,
  "rollback_on_drawdown": 0.15
}

Circuit breaker trips after 5 consecutive losses. Rate limiter allows 10 signals per minute. Harness cases accumulate forever.
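For readers wiring this into their own tooling, a typed loader can catch misspelled keys and out-of-range values before they reach an agent. This is a sketch under assumptions: `ControlPlaneConfig` and `load_config` are hypothetical names, and the validation ranges are guesses. Only two defaults are stated on this page (the control plane is off, and `paper_test_trades` is 10); the other default values below are copied from the sample file and may not match the real defaults.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ControlPlaneConfig:
    control_plane_enabled: bool = False   # stated: off by default
    paper_test_trades: int = 10           # stated: default 10
    staleness_window_secs: float = 30.0   # from the sample file (assumed default)
    learning_enabled: bool = True         # from the sample file (assumed default)
    auto_approve: bool = False            # from the sample file (assumed default)
    rollback_on_drawdown: float = 0.15    # from the sample file; assumed to be a fraction


def load_config(raw: dict) -> ControlPlaneConfig:
    """Build a config from a parsed JSON object, rejecting unknown keys."""
    known = {f.name for f in fields(ControlPlaneConfig)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    cfg = ControlPlaneConfig(**raw)
    if cfg.paper_test_trades < 1:
        raise ValueError("paper_test_trades must be at least 1")
    if not 0 < cfg.rollback_on_drawdown < 1:
        raise ValueError("rollback_on_drawdown must be between 0 and 1")
    return cfg
```

Calling `load_config({})` yields a configuration with the control plane disabled, which mirrors the opt-in behavior described above.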

Your Agents Protect Themselves