Market-Making System: Design Review & Implementation Roadmap

Date: 2026-01-31
Reviewer: AI Code Analysis
Status: Phase 1 In Progress (11% Complete)
Related Workspace: .builders/0013-market-maker-mvp


🚀 Implementation Progress (Session: 2026-01-31)

Phase 1: Data Quality - In Progress

Overall Progress: 11% (1/9 tasks complete, 4/44 hours invested)

✅ Completed Tasks

Task 1: ADX (Average Directional Index) - COMPLETE

  • Files: src/regime/metrics/adx.py, tests/regime/metrics/test_adx.py
  • Tests: 8/8 passing
  • Time: ~4 hours
  • Implementation: Welles Wilder’s ADX with double smoothing, safe division, proper error handling
  • Note: Requires 28+ bars for period=14 (double Wilder’s smoothing)
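
The 28-bar minimum follows from the two smoothing passes: the first needs `period` bar-to-bar changes to seed the smoothed true range and directional movement, and the second needs `period` DX values to seed ADX. The following is a minimal pure-Python sketch of that calculation, not the code in adx.py. The function name `wilder_adx` and its list-based signature are assumptions made for illustration.

```python
from typing import List, Sequence


def wilder_adx(high: Sequence[float], low: Sequence[float],
               close: Sequence[float], period: int = 14) -> float:
    """Latest ADX value via Wilder's double smoothing (illustrative sketch)."""
    n = len(close)
    if len(high) != n or len(low) != n:
        raise ValueError("high, low and close must have the same length")
    if period < 1:
        raise ValueError("period must be positive")
    if n < 2 * period:
        raise ValueError(f"need at least {2 * period} bars, got {n}")

    tr: List[float] = []
    plus_dm: List[float] = []
    minus_dm: List[float] = []
    for i in range(1, n):
        up = high[i] - high[i - 1]
        down = low[i - 1] - low[i]
        plus_dm.append(up if up > down and up > 0 else 0.0)
        minus_dm.append(down if down > up and down > 0 else 0.0)
        tr.append(max(high[i] - low[i],
                      abs(high[i] - close[i - 1]),
                      abs(low[i] - close[i - 1])))

    def dx(tr_sum: float, plus_sum: float, minus_sum: float) -> float:
        # Safe division: a flat market yields DX = 0 instead of an error.
        if tr_sum == 0:
            return 0.0
        plus_di = 100.0 * plus_sum / tr_sum
        minus_di = 100.0 * minus_sum / tr_sum
        total = plus_di + minus_di
        return 0.0 if total == 0 else 100.0 * abs(plus_di - minus_di) / total

    # First pass: Wilder running sums of TR, +DM and -DM.
    tr_s = sum(tr[:period])
    plus_s = sum(plus_dm[:period])
    minus_s = sum(minus_dm[:period])
    dx_values = [dx(tr_s, plus_s, minus_s)]
    for i in range(period, len(tr)):
        tr_s = tr_s - tr_s / period + tr[i]
        plus_s = plus_s - plus_s / period + plus_dm[i]
        minus_s = minus_s - minus_s / period + minus_dm[i]
        dx_values.append(dx(tr_s, plus_s, minus_s))

    # Second pass: ADX is the Wilder average of the DX values.
    adx = sum(dx_values[:period]) / period
    for value in dx_values[period:]:
        adx = (adx * (period - 1) + value) / period
    return adx
```

With `n` bars the first pass yields `n - period` DX values, so seeding the second pass requires `n >= 2 * period`, which is 28 bars for period=14.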

⏳ Remaining Phase 1 Tasks (40/44 hours)

#  Task              Files                        Est. Time  Status
2  Efficiency Ratio  efficiency_ratio.py + tests  4h         TODO
3  Autocorrelation   autocorrelation.py + tests   3h         TODO
4  OU Half-Life      ou_process.py + tests        9h         TODO
5  Normalized Slope  slope.py + tests             3h         TODO
6  BB Bandwidth      bollinger.py + tests         3h         TODO
7  Integration       Modify regime/engine.py      8h         TODO
8  Validation        schema_validator.py          6h         TODO
9  Dashboard         quality/dashboard.py         4h         TODO

Resume Instructions

Environment Setup:

cd /home/coder/src/repos/market-making/metrics-service
source .venv/bin/activate  # numpy, pytest installed
python -m pytest tests/regime/metrics/test_adx.py -v  # Verify: 8 passed

Next Task: Implement Efficiency Ratio (Task 2)

  • Reference: .ai/projects/market-making/phase-1-plan.md (Day 3-4)
  • Pattern: Write tests first (RED), implement (GREEN), refactor
  • Formula: ER = |Price[0] - Price[n]| / Σ|Price[i] - Price[i-1]|
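
As a starting point for Task 2, the formula above can be sketched as follows. This is only an illustration under assumptions: the function name `efficiency_ratio`, the choice to use the most recent `period + 1` prices, and returning 0.0 for a completely flat window are not taken from the plan.

```python
from typing import Sequence


def efficiency_ratio(prices: Sequence[float], period: int) -> float:
    """Kaufman Efficiency Ratio over the last `period` price changes."""
    if period < 1:
        raise ValueError("period must be positive")
    if len(prices) < period + 1:
        raise ValueError(f"need at least {period + 1} prices, got {len(prices)}")

    window = prices[-(period + 1):]
    # Net move between the two ends of the window.
    net_change = abs(window[-1] - window[0])
    # Total distance travelled bar by bar.
    path_length = sum(abs(window[i] - window[i - 1]) for i in range(1, len(window)))
    if path_length == 0:
        return 0.0  # assumption: a flat window counts as zero efficiency
    return net_change / path_length
```

A straight-line move gives ER = 1.0, while a window that ends where it started gives ER = 0.0, so both ends of the range make convenient first RED tests.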

Files Modified This Session:

  • repos/market-making/metrics-service/src/regime/metrics/__init__.py (new)
  • repos/market-making/metrics-service/src/regime/metrics/adx.py (new, 155 lines)
  • repos/market-making/metrics-service/tests/regime/metrics/__init__.py (new)
  • repos/market-making/metrics-service/tests/regime/metrics/test_adx.py (new, 125 lines, 8 tests)

Executive Summary

The market-making tool has a solid technical foundation but is approximately 40-50% complete toward stated MVP goals. The core regime detection engine works well, but the Grid Exit Strategy (primary value proposition) is only partially implemented.

CRITICAL ISSUE IDENTIFIED: Hardcoded dummy values in restart gates evaluation (repos/market-making/metrics-service/src/regime/engine.py lines 268-280, 349-359) make metrics YAMLs untrustworthy.

Completion Estimate

  • Total Effort: 180-250 hours
  • Timeline: 9-12 weeks at 20h/week, or 4.5-6 weeks at 40h/week
  • Priority Phases: P0 (Data Quality, Exit Strategy, Testing)

Current State Analysis

What’s Working ✅

1. Regime Detection Engine (src/regime/)

  • ✅ Hourly OHLCV analysis from KuCoin API
  • ✅ 4 regime classifications: RANGE_OK, RANGE_WEAK, TRANSITION, TREND
  • ✅ 15+ metrics per analysis (Bollinger Bands, mean reversion, volatility, trend)
  • ✅ Git-backed storage in market-maker-data repository

Assessment: Core regime detection logic is well-implemented.

2. Infrastructure (infra/)

  • ✅ Kubernetes CronJob running hourly at :01
  • ✅ ExternalSecrets integration for KuCoin API keys
  • ✅ Docker image build/deploy workflow
  • ✅ ArgoCD deployment patterns
  • ✅ Git-based persistence (no database required)

Assessment: Infrastructure is production-ready.

3. Notification System (Partial)

  • ✅ Pushover integration working
  • ✅ Basic regime alerts functional
  • ✅ Entry evaluator module created (src/exit_strategy/entry_evaluator.py)
  • ✅ Rate limiting (4h minimum between same-state notifications)
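
For reference, the 4-hour same-state rule can be sketched as below. This is not the existing notification code: the class name `SameStateRateLimiter`, keying by (grid, state), and the in-memory dictionary are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, Tuple


class SameStateRateLimiter:
    """Suppress repeat notifications for the same grid and state."""

    def __init__(self, min_interval: timedelta = timedelta(hours=4)) -> None:
        self.min_interval = min_interval
        # (grid_id, state) -> time the last notification was sent
        self._last_sent: Dict[Tuple[str, str], datetime] = {}

    def should_send(self, grid_id: str, state: str, now: datetime) -> bool:
        key = (grid_id, state)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.min_interval:
            return False  # same state notified too recently
        self._last_sent[key] = now
        return True
```

Because the job runs as a CronJob, an in-memory dictionary would be lost between runs, so the timestamps would have to be persisted somewhere (for example alongside the Git-backed state) for the limit to hold.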

Assessment: Working but incomplete.

4. Grid Configuration Management

  • ✅ YAML-based grid configurations
  • ✅ History tracking via Git
  • ✅ Configuration versioning
  • ✅ Grid state determination from history array

Assessment: Config management is functional.


Critical Gaps Identified

Gap 1: Data Quality - Hardcoded Dummy Values (P0 - CRITICAL)

Location: repos/market-making/metrics-service/src/regime/engine.py

The Problem (lines 268-280 and duplicated at 349-359):

# Mock values for now - these should come from actual analysis
# TODO: Extract these from the detailed_analysis once refined classification is implemented
trend_score = regime_state.trend_score or 50.0
mean_rev_score = regime_state.mean_rev_score or 50.0
adx = 25.0  # TODO: Extract from analysis
adx_history = [25.0] * 10  # TODO: Extract from analysis
normalized_slope = 0.1  # TODO: Extract from analysis
efficiency_ratio = 0.4  # TODO: Extract from analysis
lag1_autocorr = -0.1  # TODO: Extract from analysis
ou_half_life = 24.0  # TODO: Extract from analysis
atr = 1500.0  # TODO: Extract from analysis
atr_history = [1500.0] * 100  # TODO: Extract from analysis
bb_bandwidth = 0.02  # TODO: Extract from analysis
bb_bandwidth_history = [0.02] * 10  # TODO: Extract from analysis

Impact:

  • 🔴 Restart gates evaluation uses fake data
  • 🔴 Grid creation recommendations unreliable
  • 🔴 Metrics YAMLs contain static values
  • 🔴 Cannot trust historical analysis
  • 🔴 No way to validate regime classifications
  • 🔴 User quote: “I often don’t trust the generated metrics yamls”

Missing Calculations:

  1. ADX (Average Directional Index) - Trend strength measurement (implemented in Task 1, not yet wired into engine.py)
  2. Efficiency Ratio - Trend efficiency (Perry Kaufman formula)
  3. Lag-1 Autocorrelation - Mean reversion indicator
  4. OU Half-Life - Time for price to revert halfway to mean
  5. Normalized Slope - Price slope normalized by ATR
  6. Bollinger Band Bandwidth - Volatility measurement
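
To make two of the less familiar items concrete, here is a minimal pure-Python sketch of lag-1 autocorrelation and an OU half-life estimate. It rests on assumptions: the half-life comes from an ordinary least-squares AR(1) fit of price changes on lagged prices, the function names are illustrative, and the handling of non-reverting series (returning infinity) is a choice of this sketch rather than anything specified in the plan.

```python
import math
from typing import Sequence


def lag1_autocorrelation(values: Sequence[float]) -> float:
    """Sample lag-1 autocorrelation; negative values hint at mean reversion."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        return 0.0  # constant series: no measurable autocorrelation
    num = sum((values[i] - mean) * (values[i - 1] - mean) for i in range(1, n))
    return num / denom


def ou_half_life(prices: Sequence[float]) -> float:
    """Half-life in bars from fitting delta_t = a + b * price_(t-1)."""
    if len(prices) < 3:
        raise ValueError("need at least 3 prices")
    lagged = list(prices[:-1])
    deltas = [prices[i] - prices[i - 1] for i in range(1, len(prices))]
    mean_x = sum(lagged) / len(lagged)
    mean_y = sum(deltas) / len(deltas)
    var_x = sum((x - mean_x) ** 2 for x in lagged)
    if var_x == 0:
        return math.inf  # flat prices: nothing to revert
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, deltas))
    b = cov / var_x
    if b >= 0:
        return math.inf  # no pull toward the mean
    if b <= -1:
        return 0.0  # assumption: overshooting reversion treated as immediate
    # AR(1) coefficient is 1 + b; deviations shrink by that factor each bar.
    return -math.log(2) / math.log(1 + b)
```

A series that closes half of its gap to the mean every bar has a half-life of exactly one bar, which makes a handy deterministic test case.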

Effort: 40-60 hours (Phase 1)


Gap 2: Grid Exit Strategy - PARTIALLY IMPLEMENTED (P0)

Location: src/exit_strategy/evaluator.py

Current Status: 30% complete - stub implementation only

What Exists:

  • Basic ExitState enum (NORMAL, WARNING, LATEST_ACCEPTABLE_EXIT, MANDATORY_EXIT)
  • Simple MANDATORY_EXIT trigger for TREND regime
  • Basic boundary violation check (≥2 consecutive closes outside range)
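
The boundary rule in the last bullet can be illustrated as follows. This is a hedged sketch, not the code in evaluator.py: the function names and the inclusive treatment of the range edges are assumptions.

```python
from typing import Sequence


def consecutive_closes_outside(closes: Sequence[float], lower: float, upper: float) -> int:
    """Count how many of the most recent closes sit outside [lower, upper]."""
    count = 0
    for close in reversed(closes):
        if lower <= close <= upper:
            break  # the streak ends at the first close back inside the range
        count += 1
    return count


def boundary_violated(closes: Sequence[float], lower: float, upper: float,
                      min_closes: int = 2) -> bool:
    """True when at least `min_closes` consecutive recent closes are outside."""
    return consecutive_closes_outside(closes, lower, upper) >= min_closes
```

Counting from the newest bar backwards means a single close back inside the range resets the streak.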

What’s Missing:

LATEST_ACCEPTABLE_EXIT Triggers

Per spec (.ai/projects/market-making/grid-exit-strategy/spec.md):

  • ❌ TRANSITION persistence tracking (≥2 consecutive 4h bars OR ≥4 consecutive 1h bars)
  • ❌ Mean reversion degradation (OU half-life ≥ 2× baseline)
  • ❌ Volatility expansion ratio > 1.25 threshold
  • ❌ Z-score reversion failure tracking
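
As an illustration of the first trigger, TRANSITION persistence could be checked along these lines. The sketch assumes regime labels arrive as lists of strings ordered oldest to newest, one list per timeframe; the function names are hypothetical and the spec may define the inputs differently.

```python
from typing import Sequence


def trailing_run_length(regimes: Sequence[str], target: str = "TRANSITION") -> int:
    """Number of consecutive most-recent bars classified as `target`."""
    count = 0
    for regime in reversed(regimes):
        if regime != target:
            break
        count += 1
    return count


def transition_persisted(regimes_1h: Sequence[str], regimes_4h: Sequence[str],
                         min_1h_bars: int = 4, min_4h_bars: int = 2) -> bool:
    """True when TRANSITION has held for >=2 4h bars OR >=4 1h bars."""
    return (trailing_run_length(regimes_4h) >= min_4h_bars
            or trailing_run_length(regimes_1h) >= min_1h_bars)
```

This check depends on the historical data loading listed further down, since it needs the last several classifications rather than only the current one.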

WARNING Triggers

  • ❌ TRANSITION probability β‰₯ 40% (configurable)
  • ❌ Regime confidence declining over 3 bars
  • ❌ Efficiency Ratio rising above range threshold
  • ❌ Mean reversion speed slowing
  • ❌ Volatility expansion 1.1-1.25× range
  • ❌ Require 2+ conditions to trigger (critical logic)
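
The "2+ conditions" rule can be sketched as a simple count over named boolean checks. Only the thresholds stated above (40% probability, 3 bars, 1.1-1.25×) are used; the function names are hypothetical, "declining over 3 bars" is interpreted here as three strictly decreasing readings, and the Efficiency Ratio and mean-reversion checks are left to the caller because their thresholds are not given in this document.

```python
from typing import List, Mapping, Sequence, Tuple


def transition_probability_high(probability: float, threshold: float = 0.40) -> bool:
    return probability >= threshold


def confidence_declining(confidence: Sequence[float], bars: int = 3) -> bool:
    """True when the last `bars` confidence readings are strictly decreasing."""
    recent = list(confidence[-bars:])
    if len(recent) < bars:
        return False
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))


def volatility_in_warning_band(ratio: float, low: float = 1.10, high: float = 1.25) -> bool:
    # Above `high` belongs to LATEST_ACCEPTABLE_EXIT, not WARNING.
    return low <= ratio <= high


def evaluate_warning(conditions: Mapping[str, bool],
                     min_conditions: int = 2) -> Tuple[bool, List[str]]:
    """Return (triggered, names of the conditions that are currently met)."""
    active = [name for name, met in conditions.items() if met]
    return len(active) >= min_conditions, active
```

Returning the names of the active conditions alongside the verdict lets the notification explain why a WARNING fired.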

State Transition Tracking

  • ❌ Store previous exit states in Git
  • ❌ Track state durations
  • ❌ Prevent notification spam for same state

Historical Data Loading

  • ❌ Load last N metrics files for persistence checks
  • ❌ Cache recent history for performance
  • ❌ Multi-timeframe analysis (1h + 4h bars)

Impact: Cannot trust system to alert when to exit grids - defeats entire purpose.

Effort: 50-70 hours (Phase 2)


Gap 3: Position Risk Quantification - MISSING (P1)

Current Status: NOT IMPLEMENTED

What’s Missing:

Position Tracking

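# DOES NOT EXIST - Need to implement:
from typing import List

class PositionTracker:
    # "Position" is a record type that still has to be defined
    def get_active_positions(self, grid_id: str) -> List["Position"]:
        """Fetch actual open orders from KuCoin API"""
        raise NotImplementedError

    def calculate_unrealized_pnl(self, positions: List["Position"]) -> float:
        """Current unrealized P&L"""
        raise NotImplementedError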

Capital Risk Calculator

# DOES NOT EXIST - Need to implement:
from typing import Tuple

class CapitalRiskCalculator:
    def calculate_capital_at_risk(self, inventory, current_price, stop_loss) -> float:
        """Inventory value × (current_price - stop_loss) / current_price"""
        raise NotImplementedError

    def estimate_profit_giveback(self, peak_pnl, current_pnl, delay_hours) -> Tuple[float, float]:
        """Range estimate: (min_giveback, max_giveback)"""
        raise NotImplementedError

Current Behavior: Risk assessment in history.py only looks at config, not actual positions.

Impact: Notifications cannot show:

  • “Capital at risk: $120.50”
  • “Expected give-back if delayed 12h: $4-7”
  • “Stop-loss distance: 0.85 ATR”
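
Two of these figures follow directly from the formulas already named in this section. The sketch below assumes long inventory with the stop-loss below the current price and an inventory value expressed in quote currency; the standalone function names are illustrative, not the eventual class API.

```python
def capital_at_risk(inventory_value: float, current_price: float, stop_loss: float) -> float:
    """Inventory value x (current_price - stop_loss) / current_price."""
    if current_price <= 0:
        raise ValueError("current_price must be positive")
    return inventory_value * (current_price - stop_loss) / current_price


def stop_loss_distance_atr(current_price: float, stop_loss: float, atr: float) -> float:
    """Distance from the current price down to the stop, in ATR units."""
    if atr <= 0:
        raise ValueError("atr must be positive")
    return (current_price - stop_loss) / atr
```

For example, $1,000 of inventory at a price of 100 with a stop at 90 puts $100 at risk, and with an ATR of 5 the stop sits 2.0 ATR away.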

Effort: 30-40 hours (Phase 3)


Gap 4: Testing - NEARLY ZERO (P0)

Current Status: Minimal test coverage

Required:

  • Unit tests for all metric calculations
  • Unit tests for exit trigger logic
  • Integration tests (regime → exit → notification flow)
  • Backtesting framework to validate signal quality

Impact: Cannot refactor or trust changes without tests.

Effort: 40-50 hours (Phase 4)


Gap 5: Operational Improvements (P2)

Issues:

  • ⚠️ Evaluation cadence: Hourly (should be 15 minutes)
  • ❌ Audit logging: Not implemented
  • ❌ KPI tracking: Not implemented
  • ⚠️ Documentation: Minimal

Effort: 20-30 hours (Phase 5)


Implementation Roadmap

Phase 1: Data Trust & Quality (P0 - CRITICAL)

Duration: 2-3 weeks (40-60 hours)
Objective: Remove all hardcoded dummy values and implement real metric calculations

Tasks:

  1. Implement missing metric calculations (20-30h)

    • ADX (Average Directional Index)
    • Efficiency Ratio
    • Lag-1 Autocorrelation
    • OU Half-Life
    • Normalized Slope
    • Bollinger Band Bandwidth
  2. Extract metrics from regime analysis (8-12h)

    • Modify src/regime/engine.py lines 268-280, 349-359
    • Remove all hardcoded values
    • Extract from detailed_analysis dict
  3. Add data validation (8-12h)

    • Schema validator for metrics YAMLs
    • Sanity checks per metric type
    • Automated validation on Git commit
  4. Create data quality dashboard (4-6h)

    • Visual indicators: Real vs Dummy data
    • Historical trend validation
    • Anomaly detection
  5. Unit tests (8-12h)

    • 22+ test cases covering all metrics
    • Validation against TA-Lib/TradingView
    • 90%+ code coverage
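
The "sanity checks per metric type" in task 3 could start from something like the sketch below. The bounds are the mathematical ranges of each indicator; the key names mirror the variables in the hardcoded block shown under Gap 1, and assuming the metrics YAML uses the same keys is a guess. Flagging constant history arrays is a heuristic for catching placeholder data such as `[25.0] * 10`.

```python
import math
from typing import Any, Dict, List, Mapping, Tuple

# Inclusive bounds per metric (assumed key names).
RANGES: Dict[str, Tuple[float, float]] = {
    "adx": (0.0, 100.0),
    "efficiency_ratio": (0.0, 1.0),
    "lag1_autocorr": (-1.0, 1.0),
    "ou_half_life": (0.0, math.inf),
    "atr": (0.0, math.inf),
    "bb_bandwidth": (0.0, math.inf),
}
HISTORY_KEYS = ("adx_history", "atr_history", "bb_bandwidth_history")


def validate_metrics(metrics: Mapping[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty means the checks passed."""
    errors: List[str] = []
    for key, (low, high) in RANGES.items():
        value = metrics.get(key)
        if (not isinstance(value, (int, float)) or isinstance(value, bool)
                or math.isnan(value)):
            errors.append(f"{key}: missing or not a number")
            continue
        if not low <= value <= high:
            errors.append(f"{key}: {value} outside [{low}, {high}]")
    for key in HISTORY_KEYS:
        history = metrics.get(key)
        if isinstance(history, list) and len(history) > 1 and len(set(history)) == 1:
            errors.append(f"{key}: constant series, possibly placeholder data")
    return errors
```

Run against the current dummy values, every scalar passes its range check but all three history arrays are flagged, which is the kind of signal a commit-time validation hook could block on.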

Success Criteria:

  • ✅ All TODOs removed from regime/engine.py
  • ✅ All metrics calculated from real data
  • ✅ Data validation prevents bad data from being committed
  • ✅ Quality dashboard shows 100% real data
  • ✅ User confirms: “I trust the metrics YAMLs now”

Phase 2: Complete Grid Exit Strategy (P0)

Duration: 2-3 weeks (50-70 hours)
Objective: Implement all missing exit triggers and state tracking

Tasks:

  1. LATEST_ACCEPTABLE_EXIT triggers (8-12h)

    • TRANSITION persistence tracking
    • Mean reversion degradation checks
    • Volatility expansion detection
    • Z-score reversion failure
  2. WARNING triggers (4-6h)

    • 5 condition checks
    • Require 2+ conditions to trigger
    • Configurable thresholds
  3. State transition tracking (4-6h)

    • Store state history in Git
    • Prevent notification spam
    • Track state durations
  4. Historical data loading (4-6h)

    • Load last 12-24 hours of metrics
    • Multi-timeframe analysis (1h + 4h)
    • Caching for performance
  5. Integration & testing (8-12h)

    • Wire up all triggers
    • End-to-end tests
    • Real data validation
  6. Configuration & documentation (4-6h)

    • exit_strategy_config.yaml
    • Trigger logic documentation

Success Criteria:

  • ✅ All exit triggers implemented and tested
  • ✅ State tracking working
  • ✅ Historical data loading functional
  • ✅ Integration validated with real data

Phase 3: Position Risk Quantification (P1)

Duration: 1-2 weeks (30-40 hours)
Objective: Add real position tracking and capital risk calculations

Tasks:

  1. KuCoin Position Tracker (8-12h)

    • Fetch active positions from API
    • Calculate unrealized PnL
    • Inventory imbalance tracking
  2. Capital Risk Calculator (6-8h)

    • Capital-at-risk calculation
    • Profit give-back estimation
    • Stop-loss distance in ATR units
  3. Enhance notifications (4-6h)

    • Add risk metrics to all alerts
    • Update notification templates
  4. Error handling (4-6h)

    • Graceful degradation on API failures
    • Circuit breaker pattern
    • Clear error messaging
  5. Testing (8-12h)

    • Unit tests with mocked KuCoin
    • Integration tests
    • Manual validation vs KuCoin UI
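
For the error-handling task, a circuit breaker with a fallback value is one way to get graceful degradation when the KuCoin API is failing. The sketch below is generic and makes no assumptions about any KuCoin client: `func` is whatever callable performs the request, and the class name, thresholds, and injectable clock are illustrative choices.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Stop calling a failing dependency for a while and serve a fallback instead."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], T], fallback: T) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                return fallback  # open: skip the call entirely
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            return fallback
        # Success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

The fallback could be the last cached position snapshot, with the notification stating clearly that risk figures are stale.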

Success Criteria:

  • ✅ Position tracking working
  • ✅ Risk calculations accurate
  • ✅ Notifications enhanced with risk data

Phase 4: Testing & Validation (P0)

Duration: 1 week (40-50 hours)
Objective: Comprehensive test coverage and backtesting validation

Tasks:

  1. Unit tests for metrics (8-10h)

    • 22+ test cases for all calculations
    • Edge cases covered
    • 90%+ coverage
  2. Unit tests for exit triggers (10-12h)

    • 26+ test cases
    • Boundary conditions tested
  3. Integration tests (10-12h)

    • End-to-end flow tests
    • Multi-timeframe analysis
    • Git integration
    • Notification delivery
  4. Backtesting framework (12-16h)

    • Replay historical metrics
    • Evaluate exit quality
    • Profit preservation analysis
    • KPI validation
  5. CI/CD integration (4-6h)

    • GitHub Actions workflow
    • Quality gates
    • Coverage reports

Success Criteria:

  • ✅ 80%+ code coverage
  • ✅ All critical paths tested
  • ✅ Backtesting shows system would work
  • ✅ CI/CD pipeline running

Phase 5: Operational Improvements (P2)

Duration: 1 week (20-30 hours)
Objective: Production readiness and observability

Tasks:

  1. 15-minute evaluation cadence (0.5h)

    • Update CronJob schedule
    • Or create separate exit-evaluation job
  2. Audit logging (3-4h)

    • Log all exit state transitions
    • Track notification delivery
    • Record operator actions
  3. KPI tracking (4-6h)

    • Implement KPI calculations
    • Monthly report generation
    • Trend analysis
  4. Documentation (4-6h)

    • Operational runbook
    • Troubleshooting guide
    • Metrics interpretation guide
  5. Monitoring & alerting (4-6h)

    • Prometheus metrics
    • Grafana dashboard
    • Alert configuration
  6. Performance optimization (4-6h)

    • Caching
    • Async processing
    • Benchmarking

Success Criteria:

  • ✅ 15-min cadence running
  • ✅ Audit logging complete
  • ✅ KPIs tracked
  • ✅ Production ready

Risk Assessment

Technical Risks

Risk                           Impact  Likelihood  Mitigation
Metric calculations incorrect  High    Medium      Extensive unit tests, validation against known indicators
False MANDATORY_EXIT signals   High    Medium      Require multiple confirming indicators, tune thresholds
Missed regime transitions      High    Low         15-min cadence, multi-timeframe confirmation
KuCoin API rate limiting       Medium  Low         Cache data, backoff strategy
Git push failures              Medium  Low         Retry logic, local backup

Operational Risks

Risk                          Impact  Likelihood  Mitigation
Operator misses notification  High    Medium      Multi-channel delivery, escalating urgency
Notification fatigue          Medium  High        Smart rate limiting, clear urgency indicators
Grid stopped unnecessarily    Medium  Medium      Backtesting, tunable thresholds, track False Exit Rate

Success Metrics (KPIs)

Per .ai/projects/market-making/new-instructions.md:

Exit Quality KPIs

  1. Exit Within Acceptable Window (EAW%) ≥ 90%

    • Formula: ExitsBeforeMandatory / TotalExitEvents
  2. Profit Retention Ratio (PRR) ≥ 0.75

    • Formula: RealizedProfitAtExit / MaxUnrealizedProfitBeforeExit
  3. Stop-Loss Avoidance Rate (SLAR) ≥ 95%

    • Formula: ExitsBeforeStop / TotalGridsStopped
  4. True Transition Detection Rate (TTDR) ≥ 70%

    • Formula: TransitionExitsWithFollowThrough / TotalTransitionExits
  5. Mandatory Exit Compliance (MEC%) = 100%

    • Formula: CompliedMandatoryExits / MandatoryExitSignals
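
The five ratios above are simple enough to compute from a dictionary of counts. The sketch below is one possible shape for that calculation; the snake_case input keys and the function names are assumptions, and a KPI with a zero denominator is reported as `None` (no data yet) rather than as a pass or a failure value.

```python
from typing import Dict, Mapping, Optional

# Targets from the KPI list, as fractions (90% is 0.90).
TARGETS: Dict[str, float] = {
    "EAW": 0.90, "PRR": 0.75, "SLAR": 0.95, "TTDR": 0.70, "MEC": 1.00,
}


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when there is no data yet."""
    return numerator / denominator if denominator else None


def compute_kpis(c: Mapping[str, float]) -> Dict[str, Optional[float]]:
    return {
        "EAW": safe_ratio(c["exits_before_mandatory"], c["total_exit_events"]),
        "PRR": safe_ratio(c["realized_profit_at_exit"],
                          c["max_unrealized_profit_before_exit"]),
        "SLAR": safe_ratio(c["exits_before_stop"], c["total_grids_stopped"]),
        "TTDR": safe_ratio(c["transition_exits_with_follow_through"],
                           c["total_transition_exits"]),
        "MEC": safe_ratio(c["complied_mandatory_exits"], c["mandatory_exit_signals"]),
    }


def targets_met(kpis: Mapping[str, Optional[float]]) -> Dict[str, bool]:
    """A KPI with no data is reported as not met."""
    return {name: value is not None and value >= TARGETS[name]
            for name, value in kpis.items()}
```

Treating "no data" as not met is a conservative choice for monthly reporting; a report could equally show such KPIs as "n/a".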

Architecture Recommendations

1. Implement Exit State Engine First

# src/exit_strategy/state_engine.py
from typing import Dict

# ExitState is the existing enum (NORMAL, WARNING, LATEST_ACCEPTABLE_EXIT, MANDATORY_EXIT)

class ExitStateEngine:
    def evaluate(self, regime: Dict, grid: Dict) -> ExitState:
        """
        Main entry point. Returns one of:
        - NORMAL
        - WARNING
        - LATEST_ACCEPTABLE_EXIT
        - MANDATORY_EXIT
        """
        # Check the most severe state first so it always takes precedence
        if self._check_mandatory_exit(regime, grid):
            return ExitState.MANDATORY_EXIT
        if self._check_latest_acceptable_exit(regime, grid):
            return ExitState.LATEST_ACCEPTABLE_EXIT
        if self._check_warning(regime, grid):
            return ExitState.WARNING
        return ExitState.NORMAL

2. Refactor Notification System

Current: Monolithic script with mixed concerns
Proposed:

send_regime_notifications.py  (orchestrator)
    ├→ src/exit_strategy/state_engine.py  (exit state classification)
    ├→ src/exit_strategy/message_builder.py  (notification content)
    └→ src/exit_strategy/pushover_client.py  (delivery)

3. Add Position Tracker

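# src/position/tracker.py
from typing import List

class PositionTracker:
    # "Position" and "PnLSummary" are record types that still have to be defined
    def get_active_positions(self, grid_id: str) -> List["Position"]:
        """Fetch actual open orders from KuCoin API"""
        raise NotImplementedError

    def calculate_pnl(self, positions: List["Position"]) -> "PnLSummary":
        """Calculate unrealized PnL"""
        raise NotImplementedError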

4. Separate Risk Assessment from Metrics Collection

Current: metrics/history.py does both
Proposed:

src/
  metrics/
    collector.py           # Fetch & store metrics
  risk/
    assessor.py           # Analyze metrics → risk level
  exit_strategy/
    state_engine.py       # Risk + regime → exit state

File Structure After Completion

repos/market-making/metrics-service/
├── src/
│   ├── regime/
│   │   ├── metrics/              # NEW
│   │   │   ├── adx.py
│   │   │   ├── efficiency_ratio.py
│   │   │   ├── autocorrelation.py
│   │   │   ├── ou_process.py
│   │   │   ├── slope.py
│   │   │   └── bollinger.py
│   │   ├── validation/           # NEW
│   │   │   └── schema_validator.py
│   │   ├── quality/              # NEW
│   │   │   └── dashboard.py
│   │   └── engine.py             # MODIFIED (TODOs removed)
│   ├── exit_strategy/
│   │   ├── triggers/             # NEW
│   │   │   ├── mandatory.py
│   │   │   ├── latest_acceptable.py
│   │   │   └── warning.py
│   │   ├── evaluator.py          # ENHANCED
│   │   ├── state_tracker.py      # ENHANCED
│   │   ├── history_loader.py     # NEW
│   │   ├── audit_logger.py       # NEW
│   │   └── kpis.py               # NEW
│   ├── position/                 # NEW
│   │   ├── tracker.py
│   │   └── risk_calculator.py
│   └── ...
├── tests/
│   ├── regime/metrics/           # NEW (22 test cases)
│   ├── exit_strategy/triggers/   # NEW (26 test cases)
│   ├── position/                 # NEW
│   └── integration/              # ENHANCED
├── backtest/                     # NEW
│   └── regime_exit_backtest.py
├── config/
│   └── exit_strategy_config.yaml # NEW
└── docs/                         # NEW
    ├── ops/
    │   ├── runbook.md
    │   └── troubleshooting.md
    ├── metrics_guide.md
    └── configuration.md

Immediate Next Steps

Week 1: Start Phase 1

  1. Setup (Day 1-2)

    • Review this document
    • Set up development environment
    • Create feature branch: feature/phase-1-data-quality
  2. Implement Metrics (Day 3-5)

    • ADX calculation
    • Efficiency Ratio
    • Unit tests (10 test cases)
    • Validate against TradingView
  3. Continue (Week 2-3)

    • Remaining metrics
    • Integration with regime engine
    • Data validation
    • Quality dashboard

Related Documents

  • Detailed SOW: .builders/0013-market-maker-mvp/SYSTEM_ANALYSIS.md
  • Phase 1 Plan: .builders/0013-market-maker-mvp/PHASE_1_PLAN.md
  • Original Review: .ai/projects/market-making/SYSTEM_REVIEW.md
  • Exit Strategy Spec: .ai/projects/market-making/grid-exit-strategy/spec.md
  • Requirements: .ai/projects/market-making/regime-management/requirements.md
  • New Instructions: .ai/projects/market-making/new-instructions.md

Conclusion

The market-making system has a strong technical foundation but requires focused effort to complete:

  • 🔴 Phase 1 (Data Quality) - CRITICAL: Fix dummy data immediately
  • 🔴 Phase 2 (Exit Strategy) - CRITICAL: Core value proposition
  • 🟡 Phase 3 (Position Risk) - IMPORTANT: Enhances notifications
  • 🔴 Phase 4 (Testing) - CRITICAL: Cannot deploy without
  • 🟢 Phase 5 (Operational) - NICE TO HAVE: Polish and monitoring

Total: 180-250 hours. The per-phase durations sum to 7-10 weeks; at a steady 20h/week the same effort takes 9-12 weeks.

Recommendation: Start with Phase 1 immediately. This is a blocker for everything else and addresses the root cause of trust issues with the system.


Document Version: 1.0
Last Updated: 2026-01-31
Next Review: After Phase 1 completion