Market-Making System: Comprehensive Analysis & Statement of Work

Date: 2026-01-31
Author: AI Code Review
Status: Phase 1 Ready to Start
Workspace: .builders/0013-market-maker-mvp


Executive Summary

The market-making tool has a solid technical foundation but is roughly 40-50% complete against its stated MVP goals. The core regime detection engine works well, but the Grid Exit Strategy (the primary value proposition) is only partially implemented.

CRITICAL ISSUE IDENTIFIED: Hardcoded dummy values in the restart-gates evaluation make the metrics YAMLs untrustworthy. This must be fixed immediately.

Completion Estimate: 180-250 hours total work (9-12 weeks at 20h/week, or 4.5-6 weeks at 40h/week)


Table of Contents

  1. Current State Analysis
  2. Critical Gaps Identified
  3. Data Quality Issues (CRITICAL)
  4. Statement of Work
  5. Phase Breakdown
  6. Risk Assessment
  7. Success Criteria

Current State Analysis

What’s Working ✅

1. Regime Detection Engine

Location: repos/market-making/metrics-service/src/regime/

Capabilities:

  • Hourly OHLCV analysis from KuCoin API
  • 4 regime classifications: RANGE_OK, RANGE_WEAK, TRANSITION, TREND
  • 15+ metrics per analysis:
    • Bollinger Band analysis
    • Mean reversion strength (OU half-life, Z-scores)
    • Volatility metrics (ATR, expansion ratios)
    • Trend indicators (swing structure, EMA crossovers)
    • Range confidence scores
  • Git-backed storage in market-maker-data repository

Assessment: ✅ SOLID - Core regime detection logic is well-implemented

2. Infrastructure

Location: repos/market-making/infra/

Components:

  • Kubernetes CronJob running hourly at :01
  • ExternalSecrets integration for KuCoin API keys
  • Docker image build/deploy workflow
  • ArgoCD deployment patterns
  • Git-based persistence (no database required)

Assessment: ✅ PRODUCTION-READY - Infrastructure is well-designed

3. Notification System (Partial)

Location: repos/market-making/metrics-service/send_regime_notifications.py

Features:

  • Pushover integration working
  • Basic regime alerts functional
  • Entry evaluator module created
  • Rate limiting (4h minimum between same-state notifications)

Assessment: ⚠️ PARTIAL - Working but incomplete (see gaps below)

4. Grid Configuration Management

Location: repos/market-making/config/

Features:

  • YAML-based grid configurations
  • History tracking via Git
  • Configuration versioning
  • Grid state determination from history array

Assessment: ✅ FUNCTIONAL - Config management is solid


Critical Gaps Identified

Gap 1: Grid Exit Strategy - NOT IMPLEMENTED (P0)

Current Status: 30% complete - stub implementation only

Location: repos/market-making/metrics-service/src/exit_strategy/evaluator.py

What Exists:

  • Basic ExitState enum (NORMAL, WARNING, LATEST_ACCEPTABLE_EXIT, MANDATORY_EXIT)
  • Simple MANDATORY_EXIT trigger for TREND regime
  • Basic boundary violation check (≥2 consecutive closes outside range)

What’s Missing:

LATEST_ACCEPTABLE_EXIT Triggers (NOT IMPLEMENTED)

Per spec (.ai/projects/market-making/grid-exit-strategy/spec.md):

  • ❌ TRANSITION persistence tracking (≥2 consecutive 4h bars OR ≥4 consecutive 1h bars)
  • ❌ Mean reversion degradation (OU half-life ≥ 2× baseline)
  • ❌ Volatility expansion ratio > 1.25 threshold
  • ❌ Z-score reversion failure tracking

WARNING Triggers (NOT IMPLEMENTED)

Per spec:

  • ❌ TRANSITION probability ≥ 40% (configurable)
  • ❌ Regime confidence declining over 3 bars
  • ❌ Efficiency Ratio rising above range threshold
  • ❌ Mean reversion speed slowing
  • ❌ Volatility expansion 1.1-1.25× range
  • ❌ Require 2+ conditions to trigger (critical logic)

State Transition Tracking (NOT IMPLEMENTED)

  • ❌ Store previous exit states in Git
  • ❌ Track state durations
  • ❌ Prevent notification spam for same state

Historical Data Loading (NOT IMPLEMENTED)

  • ❌ Load last N metrics files for persistence checks
  • ❌ Cache recent history for performance
  • ❌ Multi-timeframe analysis (1h + 4h bars)

Impact: 🔴 CRITICAL - The system cannot be trusted to signal when to exit grids, which defeats its entire purpose

Effort: 50-70 hours


Gap 2: Position Risk Quantification - MISSING (P0)

Current Status: NOT IMPLEMENTED

What’s Missing:

Position Tracking

# DOES NOT EXIST - Need to implement:
class PositionTracker:
    def get_active_positions(self, grid_id: str) -> List[Position]:
        """Fetch actual open orders from KuCoin API"""
        
    def calculate_unrealized_pnl(self, positions: List[Position]) -> float:
        """Current unrealized P&L"""
        
    def get_inventory_imbalance(self, positions: List[Position]) -> float:
        """Fraction of grids stuck on one side"""

Capital Risk Calculator

# DOES NOT EXIST - Need to implement:
class CapitalRiskCalculator:
    def calculate_capital_at_risk(self, inventory, current_price, stop_loss) -> float:
        """Inventory value × (current_price - stop_loss) / current_price"""
        
    def estimate_profit_giveback(self, peak_pnl, current_pnl, delay_hours) -> Tuple[float, float]:
        """Range estimate: [min_giveback, max_giveback]"""
        
    def get_stop_distance_atr(self, current_price, stop_loss, atr) -> float:
        """Distance to stop in ATR units"""

Current Behavior: The risk assessment in history.py looks only at the grid config, not at actual positions

Impact: 🔴 CRITICAL - Notifications cannot show:

  • “Capital at risk: $120.50”
  • “Expected give-back if delayed 12h: $4-7”
  • “Stop-loss distance: 0.85 ATR”

Effort: 30-40 hours


Gap 3: Evaluation Cadence - WRONG (P1)

Current: Hourly (at :01 past the hour)
Required: Every 15 minutes

Impact: ⚠️ MEDIUM

  • 45-minute blind spot between evaluations
  • Could miss rapid regime transitions
  • Doesn’t meet “MANDATORY_EXIT → Immediately” requirement

Fix: Change CronJob from 1 * * * * to 1,16,31,46 * * * *

Effort: 0.5 hours


Gap 4: Audit Logging - MISSING (P2)

Current Status: NOT IMPLEMENTED

Required (per requirements):

  • Log all exit signals
  • Log operator responses (or lack thereof)
  • Track response time vs recommended window
  • Enable retrospective analysis of signal quality

Impact: 🟡 LOW - Cannot measure system effectiveness over time

Effort: 3-4 hours


Gap 5: Testing - NEARLY ZERO (P0)

Current Status: Minimal test coverage

Location: repos/market-making/metrics-service/tests/

What Exists:

  • Some integration test stubs
  • No unit tests for exit strategy
  • No backtesting capability

Required:

  • Unit tests for all metric calculations
  • Unit tests for exit trigger logic
  • Integration tests (regime → exit → notification flow)
  • Backtesting framework to validate signal quality

Impact: 🔴 CRITICAL - Cannot refactor or trust changes

Effort: 40-50 hours


Data Quality Issues (CRITICAL)

The Smoking Gun: Hardcoded Dummy Values

Location: repos/market-making/metrics-service/src/regime/engine.py

Lines 268-280 (and duplicated at 349-359):

# Mock values for now - these should come from actual analysis
# TODO: Extract these from the detailed_analysis once refined classification is implemented
trend_score = regime_state.trend_score or 50.0
mean_rev_score = regime_state.mean_rev_score or 50.0
adx = 25.0  # TODO: Extract from analysis
adx_history = [25.0] * 10  # TODO: Extract from analysis
normalized_slope = 0.1  # TODO: Extract from analysis
efficiency_ratio = 0.4  # TODO: Extract from analysis
lag1_autocorr = -0.1  # TODO: Extract from analysis
ou_half_life = 24.0  # TODO: Extract from analysis
atr = 1500.0  # TODO: Extract from analysis
atr_history = [1500.0] * 100  # TODO: Extract from analysis
bb_bandwidth = 0.02  # TODO: Extract from analysis
bb_bandwidth_history = [0.02] * 10  # TODO: Extract from analysis

Impact Analysis

Why This is Critical:

  1. Restart gates evaluation depends on these metrics
  2. Grid creation recommendations use these values
  3. Exit strategy (when implemented) would use these values
  4. Metrics YAMLs contain fake data - cannot trust historical analysis
  5. No way to validate regime classifications with dummy data

Affected Components:

  • Restart gates (Gate 1: Directional Energy Decay, Gate 2: Mean Reversion Return, Gate 3: Tradable Volatility)
  • Risk assessment in notifications
  • Historical backtesting (impossible with fake data)

User Impact:

“I often don’t trust the generated metrics yamls” - This is why!

Missing Calculations

Need to implement:

  1. ADX (Average Directional Index)

    • Measures trend strength
    • Range: 0-100 (>25 = trending, <20 = ranging)
    • Formula: Smoothed average of Directional Movement
  2. Normalized Slope

    • Price slope normalized by ATR
    • Measures trend direction relative to volatility
    • Formula: (current_price - price_N_bars_ago) / (ATR * N)
  3. Efficiency Ratio (Perry Kaufman)

    • Measures trend efficiency
    • Range: 0-1 (higher = more trending)
    • Formula: |net_change| / sum(abs(bar_changes))
  4. Lag-1 Autocorrelation

    • Measures mean reversion
    • Range: -1 to 1 (negative = mean reverting)
    • Formula: Pearson correlation of returns r[t] vs r[t-1] (computed on returns; raw prices are almost perfectly autocorrelated)
  5. Ornstein-Uhlenbeck Half-Life

    • Time for price to revert halfway to mean
    • Critical for grid viability
    • Formula: -log(2) / log(ar_coefficient)
  6. Bollinger Band Bandwidth

    • Normalized volatility measure
    • Formula: (upper_band - lower_band) / middle_band

Statement of Work

Phase 1: Data Trust & Quality (CRITICAL - P0)

Objective: Remove all hardcoded dummy values and implement real metric calculations

Duration: 2-3 weeks (40-60 hours)

Priority: 🔴 P0 - BLOCKER - Must complete before any other work

Tasks

1.1 Implement Missing Metric Calculations (20-30h)

Deliverables:

  • src/regime/metrics/adx.py - ADX calculation
  • src/regime/metrics/efficiency_ratio.py - Efficiency Ratio
  • src/regime/metrics/autocorrelation.py - Lag-1 autocorrelation
  • src/regime/metrics/ou_process.py - OU half-life estimation
  • src/regime/metrics/slope.py - Normalized slope
  • src/regime/metrics/bollinger.py - BB bandwidth calculation

Acceptance Criteria:

  • Each metric has dedicated module with docstrings
  • Input validation (e.g., ATR can’t be negative)
  • Output validation (e.g., correlation ∈ [-1, 1])
  • Type hints throughout

1.2 Extract Metrics from Regime Analysis (8-12h)

Current Flow:

regime_state = classify_regime(price_data)
# ❌ Use hardcoded values
adx = 25.0

Target Flow:

regime_state = classify_regime(price_data)
detailed_analysis = regime_state.detailed_analysis
# ✅ Extract real values
adx = detailed_analysis.get('adx', {}).get('current', None)
if adx is None:
    raise ValueError("ADX not calculated in regime analysis")

Files to Modify:

  • src/regime/engine.py (lines 268-280, 349-359)
  • src/regime/classifier.py (enhance to calculate all metrics)

Acceptance Criteria:

  • All 10 hardcoded values replaced with real calculations
  • No fallback to dummy values (fail fast if metrics missing)
  • Logging shows which metrics were calculated

1.3 Add Data Validation (8-12h)

Create: src/regime/validation/schema_validator.py

Validation Rules:

metrics_schema:
  adx:
    type: float
    range: [0, 100]
    required: true
  efficiency_ratio:
    type: float
    range: [0, 1]
    required: true
  lag1_autocorr:
    type: float
    range: [-1, 1]
    required: true
  ou_half_life:
    type: float
    range: [0.1, 1000]  # hours
    required: true
  atr:
    type: float
    min: 0
    required: true

Features:

  • Schema validation for all metrics YAMLs
  • Sanity checks per metric type
  • Automated validation on every metrics write

Acceptance Criteria:

  • All metrics YAMLs validated before Git commit
  • Validation errors logged with details
  • CI/CD fails on invalid metrics
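A validator enforcing these rules can be very small. The sketch below checks required-ness and numeric ranges for a plain metrics dict; the rule shapes mirror the YAML above, but all names are illustrative:

```python
def validate_metrics(metrics: dict, schema: dict) -> list[str]:
    """Return a list of violations; an empty list means the metrics pass."""
    errors = []
    for name, rules in schema.items():
        if name not in metrics:
            if rules.get("required"):
                errors.append(f"{name}: missing required metric")
            continue
        value = metrics[name]
        if not isinstance(value, (int, float)) or isinstance(value, bool):
            errors.append(f"{name}: expected a number, got {type(value).__name__}")
            continue
        lo, hi = rules.get("range", (float("-inf"), float("inf")))
        lo = rules.get("min", lo)  # support min-only rules like atr
        if not lo <= value <= hi:
            errors.append(f"{name}: {value} not in [{lo}, {hi}]")
    return errors

# Illustrative subset of the schema above
SCHEMA = {
    "adx": {"range": (0, 100), "required": True},
    "efficiency_ratio": {"range": (0, 1), "required": True},
    "lag1_autocorr": {"range": (-1, 1), "required": True},
    "atr": {"min": 0, "required": True},
}
```

Calling this before every Git commit of a metrics YAML (and failing the write on a non-empty error list) satisfies the fail-fast acceptance criterion.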

1.4 Create Data Quality Dashboard (4-6h)

Create: src/regime/quality/dashboard.py

Features:

  • Visual indicators: ✅ Real data vs ⚠️ Dummy data
  • Historical trend validation (detect sudden jumps)
  • Anomaly detection (e.g., ADX stuck at 25.0)
  • Comparison: before/after Phase 1

Output: HTML report showing data quality metrics

Acceptance Criteria:

  • Dashboard shows 100% real data after Phase 1
  • Can identify dummy data in historical files
  • Anomaly detection catches obvious errors
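The "ADX stuck at 25.0" class of anomaly can be caught by flagging any metric whose value never changes across recent runs. A minimal sketch (hypothetical helper, not an existing module):

```python
def find_stuck_metrics(history: list[dict], min_runs: int = 10) -> dict[str, float]:
    """Flag metrics whose value is identical across the last min_runs runs.

    A real metric should wiggle hour to hour; a value that is exactly constant
    (e.g. ADX at 25.0 for 10 straight runs) is almost certainly a hardcoded dummy.
    """
    stuck: dict[str, float] = {}
    if len(history) < min_runs:
        return stuck
    recent = history[-min_runs:]
    for name in recent[0]:
        values = [run.get(name) for run in recent]
        if all(v is not None and v == values[0] for v in values):
            stuck[name] = values[0]
    return stuck
```

Running this over the historical YAMLs would also identify how far back the dummy-data problem extends.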

1.5 Unit Tests (8-12h)

Test Coverage Required:

  • ADX calculation: 5+ test cases (trending, ranging, edge cases)
  • Efficiency Ratio: 4+ test cases
  • Autocorrelation: 3+ test cases
  • OU half-life: 4+ test cases (including non-stationary rejection)
  • Normalized slope: 3+ test cases
  • BB bandwidth: 3+ test cases

Test Data:

  • Known input/output pairs (validated manually)
  • Edge cases (empty data, single point, NaN handling)
  • Real market data samples

Acceptance Criteria:

  • 90%+ code coverage for new metric modules
  • All tests passing
  • Tests run in CI/CD

Total Phase 1 Effort: 40-60 hours


Phase 2: Complete Grid Exit Strategy (HIGH PRIORITY - P0)

Objective: Implement all missing exit triggers and state tracking

Duration: 2-3 weeks (50-70 hours)

Priority: 🔴 P0 - BLOCKER - Core value proposition

Tasks

2.1 LATEST_ACCEPTABLE_EXIT Triggers (8-12h)

Implement: src/exit_strategy/triggers/latest_acceptable.py

Requirements:

  1. TRANSITION Persistence Tracking

    def check_transition_persistence(history: List[Dict]) -> Tuple[bool, str]:
        """
        Trigger if:
        - ≥2 consecutive 4h bars with TRANSITION verdict, OR
        - ≥4 consecutive 1h bars with TRANSITION verdict
        
        Args:
            history: List of recent metrics (last 12 hours)
        
        Returns:
            (triggered: bool, reason: str)
        """
  2. Mean Reversion Degradation

    def check_mean_reversion_degradation(
        current_half_life: float,
        baseline_half_life: float,
        threshold_multiplier: float = 2.0
    ) -> Tuple[bool, str]:
        """
        Trigger if OU half-life ≥ 2× baseline
        
        Baseline = 7-day rolling average during RANGE_OK
        """
  3. Volatility Expansion

    def check_volatility_expansion(
        current_atr: float,
        baseline_atr: float,
        threshold: float = 1.25
    ) -> Tuple[bool, str]:
        """
        Trigger if volatility expansion ratio > 1.25
        """
  4. Z-Score Reversion Failure

    def check_zscore_reversion_failure(
        price_history: List[float],
        lookback_bars: int = 6
    ) -> Tuple[bool, str]:
        """
        Trigger if Z-score excursions fail to revert within expected bars
        """

Acceptance Criteria:

  • All 4 trigger functions implemented
  • Each trigger independently testable
  • Configurable thresholds via YAML
  • Unit tests for each trigger

2.2 WARNING Triggers (4-6h)

Implement: src/exit_strategy/triggers/warning.py

Requirements:

Require 2+ conditions to trigger WARNING:

  1. TRANSITION probability ≥ 40% (configurable)
  2. Regime confidence declining over 3 bars
  3. Efficiency Ratio rising above 0.6 (configurable)
  4. Mean reversion speed slowing
  5. Volatility expansion 1.1-1.25×

Logic:

def evaluate_warning_conditions(regime_history: List[Dict], config: Dict) -> Tuple[ExitState, List[str]]:
    """
    Evaluate all warning conditions.
    
    Returns WARNING if 2+ conditions met, else NORMAL.
    """
    conditions_met = []
    
    # Check each condition...
    if transition_probability >= config['warning_transition_threshold']:
        conditions_met.append("TRANSITION probability rising")
    
    # ... check others ...
    
    if len(conditions_met) >= 2:
        return ExitState.WARNING, conditions_met
    else:
        return ExitState.NORMAL, ["Single warning condition - not actionable"]

Acceptance Criteria:

  • 2+ conditions required to trigger
  • All 5 condition checks implemented
  • Configurable thresholds
  • Unit tests covering edge cases (1 condition, 2 conditions, all conditions)

2.3 State Transition Tracking (4-6h)

Implement: src/exit_strategy/state_tracker.py

Features:

  1. State History in Git

    market-maker-data/
      exit_states/
        ETH-USDT/
          2026-01-31.json  # Daily state log
    
  2. Track Transitions

    {
      "transitions": [
        {
          "timestamp": "2026-01-31T14:23:00Z",
          "from_state": "NORMAL",
          "to_state": "WARNING",
          "reasons": ["TRANSITION probability rising", "Confidence declining"],
          "regime_verdict": "RANGE_WEAK",
          "confidence": 0.48
        }
      ]
    }
  3. Prevent Notification Spam

    • Max 1 WARNING per 4h for same grid
    • Max 1 LATEST_ACCEPTABLE_EXIT per 2h
    • Max 1 MANDATORY_EXIT per 1h
    • Track last notification timestamp

Acceptance Criteria:

  • State transitions logged to Git
  • Rate limiting prevents spam
  • Can query: “When did we last alert for this grid?”
  • Unit tests for rate limiting logic

2.4 Historical Data Loading (4-6h)

Implement: src/exit_strategy/history_loader.py

Features:

  1. Load Last N Metrics Files

    def load_recent_metrics(
        symbol: str,
        data_repo: Path,
        hours: int = 12
    ) -> List[Dict]:
        """
        Load last N hours of metrics for persistence checks
        
        Returns sorted list (oldest first)
        """
  2. Multi-Timeframe Analysis

    def get_4h_bars(metrics_history: List[Dict]) -> List[Dict]:
        """Extract 4h bar data for structural confirmation"""
        
    def get_1h_bars(metrics_history: List[Dict]) -> List[Dict]:
        """Extract 1h bar data for rapid detection"""
  3. Caching for Performance

    • Cache last 24h of metrics in memory
    • Invalidate cache on new metrics arrival
    • Reduce Git reads

Acceptance Criteria:

  • Can load last 12-24 hours of metrics
  • Multi-timeframe extraction works
  • Caching reduces duplicate reads
  • Unit tests with mock file system

2.5 Integration & Testing (8-12h)

Tasks:

  1. Wire up all triggers in evaluator

    • Update ExitStateEvaluator.evaluate() to use new triggers
    • Ensure correct priority: MANDATORY → LATEST_ACCEPTABLE → WARNING → NORMAL
  2. Integration tests

    • Test full flow: metrics → history load → trigger eval → state classification
    • Test state transitions: NORMAL → WARNING → LATEST_ACCEPTABLE → MANDATORY
    • Test notification prevention (rate limiting)
  3. Real data validation

    • Run against last 7 days of actual metrics
    • Verify exit states make sense
    • Check for false positives/negatives

Acceptance Criteria:

  • All triggers integrated
  • Integration tests passing
  • Manual validation against real data shows reasonable behavior

2.6 Configuration & Documentation (4-6h)

Create: config/exit_strategy_config.yaml

exit_rules:
  latest_acceptable_exit:
    transition_persistence_4h_bars: 2
    transition_persistence_1h_bars: 4
    mean_reversion_halflife_multiplier: 2.0
    volatility_expansion_threshold: 1.25
    zscore_reversion_failure_bars: 6
    
  warning:
    transition_probability_threshold: 0.40
    regime_confidence_decline_bars: 3
    efficiency_ratio_threshold: 0.6
    volatility_expansion_min: 1.10
    volatility_expansion_max: 1.25
    
  mandatory_exit:
    consecutive_closes_outside_range: 2
    directional_swing_bars: 6
    stop_loss_buffer_atr: 0.1
    
notifications:
  rate_limits:
    warning_min_hours: 4
    latest_acceptable_min_hours: 2
    mandatory_min_hours: 1

Documentation:

  • Trigger logic explained
  • Configuration guide
  • Tuning recommendations

Acceptance Criteria:

  • All thresholds configurable
  • Configuration validated on load
  • Documentation complete

Total Phase 2 Effort: 50-70 hours


Phase 3: Position Risk Quantification (MEDIUM PRIORITY - P1)

Objective: Add real position tracking and capital risk calculations

Duration: 1-2 weeks (30-40 hours)

Priority: 🟡 P1 - Enhances notifications but not blocking

Tasks

3.1 KuCoin Position Tracker (8-12h)

Implement: src/position/tracker.py

Features:

  1. Fetch Active Positions

    class PositionTracker:
        def __init__(self, kucoin_client: KuCoinExchange):
            self.client = kucoin_client
        
        def get_active_grid_orders(self, grid_id: str) -> List[GridOrder]:
            """
            Fetch all open orders for a grid from KuCoin API
            
            Returns:
                List of GridOrder objects with price, size, side, etc.
            """
  2. Calculate Unrealized PnL

    def calculate_unrealized_pnl(
        self,
        orders: List[GridOrder],
        current_price: float
    ) -> float:
        """
        Calculate unrealized P&L based on current positions
        
        Formula: sum over positions of (current_price - entry_price) × size, with the sign flipped for short positions
        """
  3. Inventory Imbalance

    def get_inventory_imbalance(self, orders: List[GridOrder]) -> float:
        """
        Calculate fraction of grids stuck on one side
        
        Returns: 
            Value in [-1, 1] where:
            -1 = all positions short (sold too much)
            +1 = all positions long (bought too much)
            0 = balanced
        """

Acceptance Criteria:

  • Successfully fetches positions from KuCoin API
  • Calculates accurate PnL (validate against KuCoin UI)
  • Handles edge cases (no positions, API errors)
  • Unit tests with mocked KuCoin responses

3.2 Capital Risk Calculator (6-8h)

Implement: src/position/risk_calculator.py

Features:

  1. Capital at Risk

    def calculate_capital_at_risk(
        inventory: List[GridOrder],
        current_price: float,
        stop_loss: float
    ) -> float:
        """
        Calculate capital at risk if stop-loss hit
        
        Formula:
            For each position:
                risk = position_value × abs((current_price - stop_loss) / current_price)
            
            Total risk = sum(all positions)
        
        Returns:
            Dollar amount at risk
        """
  2. Profit Give-Back Estimation

    def estimate_profit_giveback(
        peak_pnl: float,
        current_pnl: float,
        delay_hours: int,
        volatility_atr: float
    ) -> Tuple[float, float]:
        """
        Estimate profit give-back if exit delayed
        
        Returns:
            (min_giveback, max_giveback) range in dollars
            
        Assumptions:
            - Price continues trending at 0.5-1.0 × ATR per hour
            - Min scenario: slow drift (0.5 ATR/h)
            - Max scenario: acceleration (1.0 ATR/h)
        """
  3. Stop-Loss Distance in ATR

    def get_stop_distance_atr(
        current_price: float,
        stop_loss: float,
        atr: float
    ) -> float:
        """
        Calculate distance to stop-loss in ATR units
        
        Returns:
            Number of ATR units to stop (e.g., 0.85 means 0.85 ATR away)
        """

Acceptance Criteria:

  • All 3 risk calculations implemented
  • Formulas validated against manual calculations
  • Edge case handling (no inventory, zero ATR, etc.)
  • Unit tests with known scenarios
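As a worked sketch of the three calculations (all names and the drift assumption are illustrative; the real versions operate on GridOrder objects):

```python
def capital_at_risk(position_values: list[float], current_price: float, stop_loss: float) -> float:
    """Dollars lost across positions if price moves from current to the stop."""
    move = abs(current_price - stop_loss) / current_price
    return sum(v * move for v in position_values)

def stop_distance_atr(current_price: float, stop_loss: float, atr: float) -> float:
    """Distance to the stop-loss expressed in ATR units (e.g. 0.85)."""
    return abs(current_price - stop_loss) / atr

def giveback_range(current_pnl: float, delay_hours: int,
                   atr: float, position_size: float) -> tuple[float, float]:
    """Give-back range assuming adverse drift of 0.5-1.0 ATR per hour.

    Both bounds are capped at current profit (an assumption: the grid is
    stopped before the position goes net negative).
    """
    min_gb = 0.5 * atr * delay_hours * position_size
    max_gb = 1.0 * atr * delay_hours * position_size
    return (min(min_gb, current_pnl), min(max_gb, current_pnl))
```

For example, $1,000 of inventory with price 2500 and stop 2400 puts $40 at risk, 0.8 ATR away when ATR is 125.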

3.3 Enhance Notifications (4-6h)

Modify: send_regime_notifications.py

Add to notifications:

  1. WARNING Alert

    ⚠️ ETH/USDT Grid WARNING
    
    Regime: TRANSITION probability rising (45%)
    Issues: Mean reversion slowing, volatility expanding (1.15x)
    
    Position:
    • Unrealized profit: $12.34
    • Capital at risk: $45.67
    
    Action: Review grid within 24h
    
  2. LATEST_ACCEPTABLE_EXIT Alert

    ⏳ ETH/USDT - LATEST ACCEPTABLE EXIT
    
    Grid assumptions failing:
    • TRANSITION persists 4 bars (4 hours)
    • Mean reversion half-life 2.3x baseline
    
    Position:
    • Unrealized profit: $12.34
    • Est. give-back if delayed 12h: $4.00-$7.00 (30-50% of profit)
    • Stop-loss distance: 0.85 ATR
    • Capital at risk: $120.50
    
    Action: STOP GRID within 4-12 hours to preserve 75-90% of profit
    
  3. MANDATORY_EXIT Alert

    🛑 ETH/USDT - MANDATORY EXIT
    
    TREND DETECTED - STOP GRID IMMEDIATELY
    
    Trigger: 2 consecutive closes outside range bounds
    
    Position:
    • Capital at risk: $120.50
    • Stop-loss distance: 0.6 ATR (CRITICAL)
    • Unrealized profit: $12.34 (will become loss if trend continues)
    
    ACTION REQUIRED: Stop grid NOW to protect capital
    

Acceptance Criteria:

  • All notification templates updated
  • Risk metrics included in every alert
  • Formatting clear and actionable
  • Integration tests validate notification content

3.4 Error Handling & Graceful Degradation (4-6h)

Requirements:

  1. KuCoin API Failures

    • Retry logic with exponential backoff
    • Fallback to last known position if API down
    • Log errors but don’t stop exit evaluation
  2. Missing Position Data

    • If can’t fetch positions, show in notification:
      Position: Unable to fetch from KuCoin API
      (Exit evaluation based on regime only)
      
  3. Circuit Breaker

    • After 3 consecutive API failures, skip position tracking
    • Resume after 15 minutes

Acceptance Criteria:

  • Graceful degradation on API failures
  • Exit evaluation continues even without position data
  • Clear indication in notifications when data missing
  • Unit tests for error scenarios
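The retry requirement can be sketched as below. Returning None (rather than raising) is what lets exit evaluation continue on regime data alone; the injectable `sleep` is there so unit tests run instantly. All names are placeholders:

```python
import time

def with_retries(fetch, attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call fetch(), retrying with exponential backoff; return None if all attempts fail.

    Delays are base_delay * 2**attempt between tries (1s, 2s, ... by default).
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return None
```

A circuit breaker then wraps this: after 3 consecutive None results, skip position tracking entirely and retry after the 15-minute cool-off.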

3.5 Testing & Validation (8-12h)

Test Coverage:

  1. Unit Tests

    • Position fetching with mocked KuCoin API
    • Risk calculations with known inputs
    • Notification formatting
    • Error handling scenarios
  2. Integration Tests

    • Full flow: fetch positions → calculate risk → generate notification
    • Test with real KuCoin testnet (if available)
    • Validate against actual grid positions
  3. Manual Validation

    • Compare PnL calculations with KuCoin UI
    • Validate capital-at-risk with manual math
    • Review notification content with real data

Acceptance Criteria:

  • 80%+ code coverage for new modules
  • Integration tests passing
  • Manual validation confirms accuracy

Total Phase 3 Effort: 30-40 hours


Phase 4: Testing & Validation (CRITICAL - P0)

Objective: Comprehensive test coverage and backtesting validation

Duration: 1 week (40-50 hours)

Priority: 🔴 P0 - Cannot deploy without tests

Tasks

4.1 Unit Tests for Metric Calculations (8-10h)

Coverage Required:

  1. ADX Calculation (5 test cases)

    • Trending market (ADX > 40)
    • Ranging market (ADX < 20)
    • Edge case: insufficient data
    • Edge case: flat price (zero movement)
    • Validate against known indicators (TradingView, TA-Lib)
  2. Efficiency Ratio (4 test cases)

    • Strong trend (ER > 0.8)
    • Weak trend (ER < 0.3)
    • Edge case: zero net change
    • Validate formula implementation
  3. Autocorrelation (3 test cases)

    • Mean reverting series (negative correlation)
    • Trending series (positive correlation)
    • Random walk (near zero)
  4. OU Half-Life (4 test cases)

    • Fast mean reversion (half-life < 10h)
    • Slow mean reversion (half-life > 50h)
    • Non-stationary rejection (no half-life)
    • Edge case: perfect mean reversion
  5. Normalized Slope (3 test cases)

    • Uptrend
    • Downtrend
    • Sideways
  6. BB Bandwidth (3 test cases)

    • High volatility (wide bands)
    • Low volatility (narrow bands)
    • Edge case: zero volatility

Test Data Sources:

  • Manually calculated examples
  • Known market scenarios (2020 crash, 2021 bull run)
  • Synthetic data with known properties

Acceptance Criteria:

  • All 22 test cases implemented
  • Tests pass consistently
  • Coverage ≥ 90% for metric modules

4.2 Unit Tests for Exit Triggers (10-12h)

Coverage Required:

  1. MANDATORY_EXIT Triggers (8 test cases)

    • TREND regime detected
    • 2 consecutive closes outside range
    • Directional structure (HH/HL pattern)
    • Stop-loss breached
    • Edge: 1 close outside (should NOT trigger)
    • Edge: TRANSITION but no closes outside
    • Edge: Boundary violation but reverses
    • Multiple triggers active simultaneously
  2. LATEST_ACCEPTABLE_EXIT Triggers (10 test cases)

    • TRANSITION persists 2× 4h bars
    • TRANSITION persists 4× 1h bars
    • OU half-life ≥ 2× baseline
    • Volatility expansion > 1.25
    • Z-score fails to revert
    • Edge: TRANSITION for 1 bar only (should NOT trigger)
    • Edge: Half-life exactly 2× baseline
    • Edge: Volatility 1.24× (just below threshold)
    • Multiple triggers active
    • Historical data insufficient
  3. WARNING Triggers (8 test cases)

    • 2 conditions met (minimum)
    • 3 conditions met
    • All 5 conditions met
    • Only 1 condition (should NOT trigger)
    • TRANSITION probability exactly 40%
    • Confidence declining over 3 bars
    • ER rising above threshold
    • Volatility expanding in warning range

Acceptance Criteria:

  • All 26 test cases implemented
  • Tests cover edge cases (boundary conditions)
  • Coverage ≥ 90% for trigger modules

4.3 Integration Tests (10-12h)

Test Scenarios:

  1. End-to-End Flow (5 scenarios)

    Scenario 1: NORMAL → WARNING → LATEST_ACCEPTABLE_EXIT → MANDATORY_EXIT
    Scenario 2: NORMAL → WARNING → back to NORMAL (false alarm)
    Scenario 3: NORMAL → MANDATORY_EXIT (rapid trend)
    Scenario 4: LATEST_ACCEPTABLE_EXIT → operator exits → reset
    Scenario 5: Notification rate limiting prevents spam
    
  2. Multi-Timeframe Analysis

    • 1h bars detect rapid transitions
    • 4h bars provide structural confirmation
    • Conflicting signals resolved correctly
  3. Git Integration

    • Metrics stored correctly
    • State transitions logged
    • History loading works across date boundaries
  4. Notification Delivery

    • Pushover receives correct messages
    • Rate limiting works
    • Error handling (API down)

Test Setup:

  • Mock market-maker-data Git repo
  • Fake Pushover API endpoint
  • Synthetic metrics spanning multiple days

Acceptance Criteria:

  • All 5 end-to-end scenarios pass
  • Integration tests run in CI/CD
  • Tests use isolated environment (no prod data)

4.4 Backtesting Framework (12-16h)

Objective: Validate exit strategy would have preserved profits historically

Implement: backtest/regime_exit_backtest.py

Features:

  1. Replay Historical Metrics

    class RegimeExitBacktest:
        def __init__(self, metrics_path: Path, start_date: str, end_date: str):
            """Load historical metrics for backtest period"""
        
        def run(self) -> BacktestResults:
            """
            Simulate exit strategy on historical data
            
            For each hour:
                1. Load metrics
                2. Evaluate exit state
                3. Record would-be action
                4. Calculate profit preservation
            
            Returns:
                Summary of exit quality, profit preservation, signal accuracy
            """
  2. Exit Quality Metrics

    @dataclass
    class BacktestResults:
        total_exit_signals: int
        mandatory_exits: int
        latest_acceptable_exits: int
        warnings: int
        
        # Profit preservation
        avg_profit_retention_ratio: float  # Target: ≥ 0.75
        peak_profit_captured_pct: float
        
        # Timeliness
        avg_exit_lead_time_hours: float  # Before trend confirmed
        exits_before_stop_loss_pct: float  # Target: ≥ 95%
        
        # Accuracy
        true_transition_detection_rate: float  # Target: ≥ 70%
        false_exit_rate: float  # Target: ≤ 30%
  3. Scenario Analysis

    • 2024 ETH Range: Sep-Nov (should show NORMAL mostly)
    • 2024 ETH Breakout: Dec (should trigger exits)
    • 2025 Volatility Spike: Jan (should warn/exit)

Validation Criteria:

  • Did exit signals fire before major trends?
  • Would we have preserved ≥75% of peak profit?
  • False positive rate acceptable (≤30%)?

Acceptance Criteria:

  • Backtesting framework working
  • Run on 3+ historical scenarios
  • Results show system would have worked
  • Report generated with charts

4.5 CI/CD Integration (4-6h)

Setup:

  1. GitHub Actions Workflow

    # .github/workflows/test-metrics-service.yml
    name: Metrics Service Tests
     
    on: [push, pull_request]
     
    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - name: Set up Python
            uses: actions/setup-python@v5
            with:
              python-version: '3.12'
          - name: Install dependencies
            run: |
              cd repos/market-making/metrics-service
              pip install -r requirements.txt
              pip install pytest pytest-cov
          - name: Run tests
            run: |
              cd repos/market-making/metrics-service
              pytest --cov=src --cov-report=xml --cov-report=html
          - name: Upload coverage
            uses: codecov/codecov-action@v4
  2. Pre-commit Hooks

    • Run unit tests before commit
    • Lint with black/flake8
    • Type check with mypy
  3. Quality Gates

    • Minimum 80% code coverage
    • All tests must pass
    • No lint errors

Acceptance Criteria:

  • CI/CD pipeline running
  • Tests execute on every commit
  • Coverage reports generated
  • Quality gates enforced

Total Phase 4 Effort: 40-50 hours


Phase 5: Operational Improvements (MEDIUM - P2)

Objective: Production readiness and observability

Duration: 1 week (20-30 hours)

Priority: 🟡 P2 - Nice to have but not blocking

Tasks

5.1 15-Minute Evaluation Cadence (0.5h)

Current: 1 * * * * (hourly at :01)
Target: 1,16,31,46 * * * * (every 15 minutes)

Files to Modify:

  • repos/market-making/infra/metrics-service/cronjob.yaml
# Before
schedule: "1 * * * *"
 
# After
schedule: "1,16,31,46 * * * *"

Alternative: Create separate CronJob for exit evaluation

  • Metrics collection: Hourly (heavy API usage)
  • Exit evaluation: Every 15 minutes (reads from Git)

Acceptance Criteria:

  • CronJob runs every 15 minutes
  • Logs show 4× evaluations per hour
  • No API rate limiting issues
5.2 Audit Logging (3-4h)

Implement: src/exit_strategy/audit_logger.py

Log Format:

# market-maker-data/exit_events/YYYY/MM/DD/HH-MM-SYMBOL.yaml
event_id: evt_2026-01-31T14:23:00_eth
grid_id: eth-v3
symbol: ETH/USDT
timestamp: "2026-01-31T14:23:00Z"
 
exit_state:
  current: LATEST_ACCEPTABLE_EXIT
  previous: WARNING
  transition_time: "2026-01-31T14:23:00Z"
 
triggers:
  - type: TRANSITION_PERSISTENCE
    description: "TRANSITION verdict for 4 consecutive 1h bars"
    bars_affected: [2026-01-31T11:00, 2026-01-31T12:00, ...]
  - type: MEAN_REVERSION_DEGRADATION
    description: "OU half-life 2.3x baseline"
    current_halflife_minutes: 1840
    baseline_halflife_minutes: 800
 
position_risk:
  capital_at_risk: 120.50
  unrealized_pnl: 12.34
  stop_loss_distance_atr: 0.85
 
notifications_sent:
  - channel: push
    status: delivered
    sent_at: "2026-01-31T14:23:15Z"
    delivery_confirmed_at: "2026-01-31T14:23:16Z"
 
operator_action:
  action_taken: null  # To be updated manually
  action_time: null
  reaction_time_seconds: null
  notes: null

Features:

  • Log every exit state transition
  • Log notification attempts (success/failure)
  • Placeholder for operator action (manual update)
  • Git-backed storage

Acceptance Criteria:

  • All transitions logged
  • Logs readable and parseable
  • Can query: “Show all MANDATORY_EXIT events last 30 days”
  • Unit tests for logger
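
The 30-day MANDATORY_EXIT query from the acceptance criteria could be sketched as below, following the `exit_events/YYYY/MM/DD/HH-MM-SYMBOL.yaml` layout shown in the log-format comment above (PyYAML is assumed to be available alongside the service's existing YAML tooling):

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path

import yaml  # PyYAML — an assumption; swap in the service's YAML reader


def mandatory_exits_since(data_root: Path, days: int = 30) -> list[dict]:
    """Return MANDATORY_EXIT events from the last `days` days of audit logs."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    events = []
    # exit_events/YYYY/MM/DD/HH-MM-SYMBOL.yaml, per the format above
    for path in sorted(data_root.glob("exit_events/*/*/*/*.yaml")):
        event = yaml.safe_load(path.read_text())
        ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
        if ts >= cutoff and event["exit_state"]["current"] == "MANDATORY_EXIT":
            events.append(event)
    return events
```

Because the store is plain YAML in Git, the same query also works with `git log`/`grep` in a pinch.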
5.3 KPI Tracking (4-6h)

Implement: src/exit_strategy/kpis.py

KPIs to Track (per .ai/projects/market-making/new-instructions.md):

  1. Exit Within Acceptable Window (EAW%)

    • Formula: ExitsBeforeMandatory / TotalExitEvents
    • Target: ≥ 90%
  2. Profit Retention Ratio (PRR)

    • Formula: RealizedProfitAtExit / MaxUnrealizedProfitBeforeExit
    • Target: ≥ 0.75
  3. Stop-Loss Avoidance Rate (SLAR)

    • Formula: ExitsBeforeStop / TotalGridsStopped
    • Target: ≥ 95%
  4. True Transition Detection Rate (TTDR)

    • Formula: TransitionExitsWithFollowThrough / TotalTransitionExits
    • Target: ≥ 70%
  5. Mandatory Exit Compliance (MEC%)

    • Formula: CompliedMandatoryExits / MandatoryExitSignals
    • Target: 100%

Monthly Report:

def generate_monthly_kpi_report(year: int, month: int) -> KPIReport:
    """
    Aggregate all exit events for the month and calculate KPIs
    
    Returns:
        KPIReport with metrics, charts, recommendations
    """

Acceptance Criteria:

  • All 5 KPIs calculable from audit logs
  • Monthly report generation works
  • Can track KPI trends over time
  • Unit tests for KPI calculations
5.4 Documentation (4-6h)

Create:

  1. Operational Runbook (docs/ops/runbook.md)

    • How to deploy
    • How to monitor
    • How to troubleshoot
    • Emergency procedures
  2. Troubleshooting Guide (docs/ops/troubleshooting.md)

    • Common errors and solutions
    • API failures
    • Notification not received
    • Exit signals not firing
  3. Metrics Interpretation Guide (docs/metrics_guide.md)

    • What each metric means
    • How to interpret exit states
    • When to override system recommendations
    • Tuning thresholds
  4. Configuration Reference (docs/configuration.md)

    • All config parameters explained
    • Recommended values
    • How to tune for different markets

Acceptance Criteria:

  • All 4 documents complete
  • Reviewed by operator (you)
  • Examples included
  • Links to relevant code
5.5 Monitoring & Alerting (4-6h)

Setup:

  1. Prometheus Metrics

    • Exit state distribution (gauge)
    • Notification success rate (counter)
    • API call latency (histogram)
    • Error rate (counter)
  2. Grafana Dashboard

    • Exit state timeline
    • Notification delivery status
    • System health (API errors, latency)
    • KPI trends
  3. Alerts

    • No metrics collected in 2 hours → Alert
    • Notification delivery failure → Alert
    • API error rate > 10% → Alert

Acceptance Criteria:

  • Prometheus metrics exposed
  • Grafana dashboard working
  • Alerts configured and tested
  • Can diagnose issues from dashboard
5.6 Performance Optimization (4-6h)

Optimizations:

  1. Caching

    • Cache last 24h of metrics in memory
    • Reduce Git reads by 90%
  2. Async Processing

    • Fetch position data async
    • Send notifications async
    • Don’t block exit evaluation
  3. Database for State (optional)

    • Consider SQLite for state tracking
    • Faster than Git for queries
    • Git remains source of truth
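
Item 1's cache could be a small TTL-keyed map. A sketch only (names illustrative), assuming the caller passes a loader closure that performs the actual Git read; the real service may prefer `functools.lru_cache` or an on-disk cache:

```python
import time


class MetricsCache:
    """Tiny in-memory TTL cache for recent metrics YAMLs (sketch only).

    Avoids re-reading Git-backed files on every 15-minute evaluation;
    Git remains the source of truth either way.
    """

    def __init__(self, ttl_seconds: float = 24 * 3600):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str, loader):
        """Return the cached value for key, calling loader() on a miss or expiry."""
        now = time.monotonic()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = loader()  # e.g. read the metrics YAML out of Git
        self._store[key] = (now, value)
        return value
```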

Benchmarks:

  • Exit evaluation: < 30 seconds per grid
  • Notification delivery: < 60 seconds
  • Git commit/push: < 10 seconds

Acceptance Criteria:

  • Performance benchmarks met
  • No degradation with multiple grids
  • Load testing passed (5 concurrent grids)

Total Phase 5 Effort: 20-30 hours


Risk Assessment

Technical Risks

| Risk | Impact | Likelihood | Mitigation |
|------|--------|------------|------------|
| Metric calculations incorrect | High | Medium | Extensive unit tests; validation against known indicators; manual verification with TradingView |
| False MANDATORY_EXIT signals | High | Medium | Require multiple confirming indicators; tune thresholds via backtesting; track False Exit Rate KPI |
| Missed regime transitions | High | Low | 15-min cadence; multi-timeframe confirmation; conservative thresholds |
| KuCoin API rate limiting | Medium | Low | Cache position data; implement backoff strategy; monitor API usage |
| Git push failures | Medium | Low | Retry logic with exponential backoff; local backup before push; alert on persistent failures |

Operational Risks

| Risk | Impact | Likelihood | Mitigation |
|------|--------|------------|------------|
| Operator misses notification | High | Medium | Multi-channel delivery (Pushover + Email); escalating urgency for MANDATORY_EXIT; track Mandatory Exit Compliance KPI |
| Notification fatigue | Medium | High | Smart rate limiting by exit state; clear urgency indicators; only actionable alerts |
| Grid stopped unnecessarily | Medium | Medium | Backtesting validates signal quality; track False Exit Rate; tunable thresholds |
| Phase 1 reveals more data issues | Medium | Medium | Allocate buffer time (20% contingency); iterative approach; daily code review |

Schedule Risks

| Risk | Impact | Likelihood | Mitigation |
|------|--------|------------|------------|
| Phase 1 takes longer than estimated | High | Medium | Priority-based execution; can proceed with partial completion; daily progress tracking |
| Testing reveals major bugs | Medium | Low | Test early and often; integration tests during development; code review before merge |
| Scope creep | Medium | Medium | Strict adherence to SOW; document future enhancements separately; phase gates |

Success Criteria

Phase 1 (Data Quality) - Complete When:

All hardcoded values replaced

  • Zero TODOs in regime/engine.py lines 268-359
  • All metrics calculated from real data
  • No fallback to dummy values

All metric calculations implemented

  • ADX, Efficiency Ratio, Autocorrelation, OU half-life, Normalized slope, BB bandwidth
  • Unit tests passing (90%+ coverage)
  • Validated against known indicators

Data validation in place

  • Schema validator working
  • All metrics YAMLs validated before commit
  • Quality dashboard shows 100% real data

Can trust metrics YAMLs

  • Manual inspection of recent files shows real data
  • Anomaly detection catches issues
  • User (Craig) confirms trust restored

Phase 2 (Exit Strategy) - Complete When:

All exit triggers implemented

  • MANDATORY_EXIT: 4 trigger types working
  • LATEST_ACCEPTABLE_EXIT: 4 trigger types working
  • WARNING: 5 condition checks, require 2+ to trigger
  • Unit tests covering all edge cases

State tracking working

  • Transitions logged to Git
  • Rate limiting prevents spam
  • Can query historical state

Historical data loading

  • Load last 12-24 hours of metrics
  • Multi-timeframe analysis (1h + 4h)
  • Caching reduces duplicate reads

Integration validated

  • End-to-end tests passing
  • Manual testing with real data looks good
  • Configuration complete and documented

Phase 3 (Position Risk) - Complete When:

Position tracking working

  • Successfully fetches from KuCoin API
  • PnL matches KuCoin UI
  • Graceful error handling

Risk calculations accurate

  • Capital-at-risk calculation validated
  • Profit give-back estimates reasonable
  • Stop-loss distance correct

Notifications enhanced

  • All templates include risk metrics
  • Clear and actionable
  • Tested with real positions

Phase 4 (Testing) - Complete When:

Comprehensive test coverage

  • Unit tests: 80%+ coverage
  • Integration tests: all scenarios passing
  • Backtesting: validates system would work

Backtesting shows success

  • Profit retention ratio ≥ 0.75 on historical data
  • Stop-loss avoidance ≥ 95%
  • False exit rate ≤ 30%

CI/CD pipeline working

  • Tests run on every commit
  • Quality gates enforced
  • Coverage reports generated

Phase 5 (Operational) - Complete When:

15-min cadence running

  • CronJob executing 4× per hour
  • No performance issues

Audit logging complete

  • All events logged
  • Can query historical data
  • Manual operator action tracking works

KPIs tracked

  • All 5 KPIs calculable
  • Monthly reports generated
  • Trends visible

Production ready

  • Documentation complete
  • Monitoring set up
  • Runbook tested

Project Timeline

Waterfall Approach (Sequential)

| Phase | Duration | Cumulative |
|-------|----------|------------|
| Phase 1: Data Quality | 2-3 weeks | 2-3 weeks |
| Phase 2: Exit Strategy | 2-3 weeks | 4-6 weeks |
| Phase 3: Position Risk | 1-2 weeks | 5-8 weeks |
| Phase 4: Testing | 1 week | 6-9 weeks |
| Phase 5: Operational | 1 week | 7-10 weeks |

Total: 7-10 weeks (1.5-2.5 months)

Agile Approach (Parallel where possible)

Sprint 1 (Week 1-2): Phase 1 + Start Phase 4 unit tests
Sprint 2 (Week 3-4): Phase 2 + Continue Phase 4
Sprint 3 (Week 5-6): Phase 3 + Complete Phase 4 (backtesting)
Sprint 4 (Week 7): Phase 5 (operational)

Total: 7 weeks (1.75 months) with parallel execution


Immediate Next Steps (Week 1)

Day 1-2: Setup & Planning

  • Review this SOW with stakeholder (Craig)
  • Set up development environment in .builders/0013-market-maker-mvp
  • Create feature branch: feature/phase-1-data-quality
  • Set up project tracking (GitHub issues/project board)

Day 3-5: Start Phase 1.1 (Metric Calculations)

  • Implement ADX calculation (src/regime/metrics/adx.py)
  • Implement Efficiency Ratio (src/regime/metrics/efficiency_ratio.py)
  • Unit tests for both (10 test cases)
  • Validate against TradingView/TA-Lib

Week 2: Continue Phase 1

  • Implement remaining metrics (autocorrelation, OU, slope, BB)
  • Complete all unit tests (22 test cases)
  • Extract metrics from regime analysis (modify engine.py)
  • Manual testing with real data

Week 3: Complete Phase 1

  • Data validation schema
  • Quality dashboard
  • Remove all TODOs
  • Code review
  • Merge to main
  • Deploy to test environment
  • User acceptance testing

Appendices

A. File Structure After Completion

repos/market-making/metrics-service/
├── src/
│   ├── regime/
│   │   ├── metrics/              # NEW
│   │   │   ├── adx.py
│   │   │   ├── efficiency_ratio.py
│   │   │   ├── autocorrelation.py
│   │   │   ├── ou_process.py
│   │   │   ├── slope.py
│   │   │   └── bollinger.py
│   │   ├── validation/           # NEW
│   │   │   └── schema_validator.py
│   │   ├── quality/              # NEW
│   │   │   └── dashboard.py
│   │   └── engine.py             # MODIFIED (TODOs removed)
│   ├── exit_strategy/
│   │   ├── triggers/             # NEW
│   │   │   ├── mandatory.py
│   │   │   ├── latest_acceptable.py
│   │   │   └── warning.py
│   │   ├── evaluator.py          # ENHANCED
│   │   ├── state_tracker.py      # ENHANCED
│   │   ├── history_loader.py     # NEW
│   │   ├── audit_logger.py       # NEW
│   │   └── kpis.py               # NEW
│   ├── position/                 # NEW
│   │   ├── tracker.py
│   │   └── risk_calculator.py
│   └── ...
├── tests/
│   ├── regime/
│   │   └── metrics/              # NEW (22 test cases)
│   ├── exit_strategy/
│   │   └── triggers/             # NEW (26 test cases)
│   ├── position/                 # NEW
│   └── integration/              # ENHANCED
├── backtest/                     # NEW
│   └── regime_exit_backtest.py
├── config/
│   └── exit_strategy_config.yaml # NEW
└── docs/                         # NEW
    ├── ops/
    │   ├── runbook.md
    │   └── troubleshooting.md
    ├── metrics_guide.md
    └── configuration.md

B. Configuration Reference

exit_strategy_config.yaml (complete example):

# Exit Strategy Configuration
# Version: 1.0.0
 
evaluation:
  cadence_minutes: 15
  lookback_hours: 12
  history_cache_hours: 24
 
exit_rules:
  # LATEST_ACCEPTABLE_EXIT triggers
  latest_acceptable_exit:
    transition_persistence_4h_bars: 2
    transition_persistence_1h_bars: 4
    mean_reversion_halflife_multiplier: 2.0
    volatility_expansion_threshold: 1.25
    zscore_reversion_failure_bars: 6
    
  # WARNING triggers (require 2+ conditions)
  warning:
    transition_probability_threshold: 0.40
    regime_confidence_decline_bars: 3
    efficiency_ratio_threshold: 0.6
    volatility_expansion_min: 1.10
    volatility_expansion_max: 1.25
    
  # MANDATORY_EXIT triggers
  mandatory_exit:
    consecutive_closes_outside_range: 2
    directional_swing_bars: 6  # For HH/HL pattern
    stop_loss_buffer_atr: 0.1  # Trigger before actual stop
 
# Notification configuration
notifications:
  rate_limits:
    warning_min_hours: 4
    latest_acceptable_min_hours: 2
    mandatory_min_hours: 1
    
  pushover:
    enabled: true
    priority_map:
      WARNING: 0           # Normal priority
      LATEST_ACCEPTABLE_EXIT: 1  # High priority
      MANDATORY_EXIT: 2    # Emergency priority
      
  email:
    enabled: true
    from: grid-alerts@example.com
    to:
      - craig@example.com
    subject_prefix: "[Grid Exit]"
 
# Position risk configuration
position_risk:
  api_timeout_seconds: 10
  retry_attempts: 3
  retry_backoff_multiplier: 2.0
  
  profit_giveback_estimation:
    min_atr_multiplier: 0.5  # Slow drift scenario
    max_atr_multiplier: 1.0  # Acceleration scenario
 
# KPI targets
kpis:
  review_cadence_days: 30
  targets:
    exit_within_acceptable_window_pct: 90
    profit_retention_ratio: 0.75
    stop_loss_avoidance_rate: 0.95
    true_transition_detection_rate: 0.70
    mandatory_exit_compliance_pct: 100
 
# Baseline calculation
baselines:
  ou_halflife:
    calculation_window_days: 7
    regime_filter: RANGE_OK  # Only calculate during ranging
    min_samples: 48  # Minimum 48 hours of data
  
  volatility:
    calculation_window_days: 30
    percentile_for_expansion: 0.8  # 80th percentile

C. Metric Calculation Formulas

ADX (Average Directional Index):

1. Calculate +DM and -DM (Directional Movement)
2. Calculate +DI and -DI (Directional Indicators)
3. DX = 100 × |(+DI) − (−DI)| / ((+DI) + (−DI))
4. ADX = smoothed average of DX (typically 14 periods)
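
The four steps can be sketched in Python. This uses Wilder's recursive smoothing seeded with the first raw value; charting platforms seed differently, so values may not match TradingView to the last decimal:

```python
import numpy as np


def adx(high, low, close, period: int = 14) -> float:
    """Latest ADX value; inputs are price arrays ordered oldest -> newest."""
    high, low, close = (np.asarray(a, dtype=float) for a in (high, low, close))
    up = np.diff(high)
    down = -np.diff(low)
    plus_dm = np.where((up > down) & (up > 0), up, 0.0)
    minus_dm = np.where((down > up) & (down > 0), down, 0.0)
    # True range against the previous close
    tr = np.maximum.reduce([
        high[1:] - low[1:],
        np.abs(high[1:] - close[:-1]),
        np.abs(low[1:] - close[:-1]),
    ])

    def wilder(x):  # Wilder's recursive smoothing, alpha = 1/period
        out = np.empty_like(x)
        out[0] = x[0]
        for i in range(1, len(x)):
            out[i] = out[i - 1] + (x[i] - out[i - 1]) / period
        return out

    atr = wilder(tr)
    plus_di = 100 * wilder(plus_dm) / atr
    minus_di = 100 * wilder(minus_dm) / atr
    dx = 100 * np.abs(plus_di - minus_di) / (plus_di + minus_di)
    return float(wilder(dx)[-1])
```

On a perfectly monotone uptrend, -DM is zero everywhere and ADX converges to 100.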

Efficiency Ratio:

ER = |Price[0] - Price[n]| / Σ|Price[i] - Price[i-1]|

Where:
- Numerator = Net price change
- Denominator = Sum of absolute bar-to-bar changes
- Range: [0, 1]
- Higher = more efficient trend
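
A direct transcription, assuming the input is ordered oldest to newest (only the endpoints and the bar-to-bar changes matter):

```python
import numpy as np


def efficiency_ratio(prices, n: int = 10) -> float:
    """Kaufman Efficiency Ratio over the last n bars."""
    p = np.asarray(prices[-(n + 1):], dtype=float)
    net_change = abs(p[-1] - p[0])           # numerator: net price change
    path_length = np.abs(np.diff(p)).sum()   # denominator: total path travelled
    return float(net_change / path_length) if path_length > 0 else 0.0


print(efficiency_ratio([1, 2, 3, 4, 5], n=4))  # → 1.0 (perfectly efficient trend)
print(efficiency_ratio([1, 2, 1, 2, 1], n=4))  # → 0.0 (pure chop)
```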

Lag-1 Autocorrelation:

r = Σ[(x[i] - x̄)(x[i-1] - x̄)] / Σ(x[i] - x̄)²

Where:
- x̄ = mean of series
- Range: [-1, 1]
- Negative = mean reverting
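
Transcribed directly, with x̄ subtracted once from the whole series:

```python
import numpy as np


def lag1_autocorrelation(x) -> float:
    """Lag-1 autocorrelation; negative values indicate mean reversion."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()  # demeaned series
    return float((d[1:] * d[:-1]).sum() / (d * d).sum())


# A perfectly alternating series is strongly mean reverting
print(lag1_autocorrelation([1, -1] * 5))  # → -0.9
```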

OU Half-Life:

1. Fit AR(1) model: x[t] = ϕ × x[t-1] + ε
2. Half-life = -log(2) / log(ϕ)

Requirements:
- |ϕ| < 1 (stationary)
- If |ϕ| ≥ 1, reject (non-stationary)
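
A sketch of the AR(1) fit, assuming the input series is already centered (e.g. Z-scores or deviations from a moving average) so no intercept is estimated; ϕ ≤ 0 is rejected alongside |ϕ| ≥ 1, since log(ϕ) is undefined there:

```python
import numpy as np


def ou_half_life(x):
    """Half-life of mean reversion (in bars) from a least-squares AR(1) fit.

    Returns None when phi falls outside (0, 1): non-stationary or undefined.
    """
    x = np.asarray(x, dtype=float)
    phi = float(x[1:] @ x[:-1]) / float(x[:-1] @ x[:-1])
    if not 0.0 < phi < 1.0:
        return None
    return float(-np.log(2) / np.log(phi))


# phi = 0.5 exactly: the deviation halves every bar, so half-life = 1 bar
print(ou_half_life([2.0 ** -i for i in range(20)]))  # → 1.0
```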

Normalized Slope:

slope = (Price[0] - Price[n]) / n
normalized_slope = slope / ATR

Where:
- n = lookback period (bars)
- Positive = uptrend
- Negative = downtrend

Bollinger Band Bandwidth:

bandwidth = (upper_band - lower_band) / middle_band

Where:
- middle_band = 20-period SMA
- upper_band = SMA + (2 × stddev)
- lower_band = SMA - (2 × stddev)
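
Since upper − lower = 2 × k × stddev, the bandwidth reduces to a two-line calculation over the most recent window. This sketch uses the population standard deviation; platforms differ on population vs sample, so small discrepancies are expected:

```python
import numpy as np


def bollinger_bandwidth(closes, period: int = 20, num_std: float = 2.0) -> float:
    """Bandwidth = (upper_band - lower_band) / middle_band over the last window."""
    window = np.asarray(closes[-period:], dtype=float)
    middle = window.mean()                    # middle_band = period SMA
    spread = 2 * num_std * window.std()       # upper_band - lower_band
    return float(spread / middle)


print(bollinger_bandwidth([99.0, 101.0] * 10))  # → 0.04 (±2σ band with σ = 1)
```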

Conclusion

This Statement of Work provides a comprehensive roadmap to bring the market-making system from approximately 40-50% complete to a production-ready MVP.

Critical Path:

  1. Phase 1 (Data Quality) - MUST complete first
  2. Phase 2 (Exit Strategy) - Core value proposition
  3. Phase 4 (Testing) - Cannot deploy without

Optional (defer if needed):

  • Phase 3 (Position Risk) - Enhances notifications but not blocking
  • Phase 5 (Operational) - Nice to have

Estimated Timeline: 7-10 weeks at 20-40h/week

Next Action: Start Phase 1.1 - Implement ADX calculation


Document Version: 1.0
Last Updated: 2026-01-31
Author: AI Code Review
Status: Ready for Approval