# Market-Making System: Design Review & Implementation Roadmap

**Date:** 2026-01-31
**Reviewer:** AI Code Analysis
**Status:** Phase 1 In Progress (11% Complete)
**Related Workspace:** `.builders/0013-market-maker-mvp`
## Implementation Progress (Session: 2026-01-31)

### Phase 1: Data Quality (In Progress)

Overall Progress: 11% (1/9 tasks complete, 4/44 hours invested)
### ✅ Completed Tasks

**Task 1: ADX (Average Directional Index) - COMPLETE**
- Files: `src/regime/metrics/adx.py`, `tests/regime/metrics/test_adx.py`
- Tests: 8/8 passing
- Time: ~4 hours
- Implementation: Welles Wilder's ADX with double smoothing, safe division, proper error handling
- Note: Requires 28+ bars for period=14 (double Wilder's smoothing)
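For reference when reviewing the completed task, the double-smoothing structure that drives the 28-bar requirement can be sketched as below. This is a minimal illustration of Wilder's algorithm, not the contents of `adx.py`; function names and structure are illustrative.

```python
def wilder_smooth(values, period):
    """Wilder's smoothing: seed with the mean of the first `period`
    values, then blend each new value in with weight 1/period."""
    smoothed = [sum(values[:period]) / period]
    for v in values[period:]:
        smoothed.append(smoothed[-1] + (v - smoothed[-1]) / period)
    return smoothed

def adx(highs, lows, closes, period=14):
    """Return the latest ADX value. Needs roughly 2*period bars,
    because DX is itself Wilder-smoothed a second time."""
    plus_dm, minus_dm, tr = [], [], []
    for i in range(1, len(closes)):
        up = highs[i] - highs[i - 1]
        down = lows[i - 1] - lows[i]
        plus_dm.append(up if up > down and up > 0 else 0.0)
        minus_dm.append(down if down > up and down > 0 else 0.0)
        tr.append(max(highs[i] - lows[i],
                      abs(highs[i] - closes[i - 1]),
                      abs(lows[i] - closes[i - 1])))
    atr = wilder_smooth(tr, period)
    pdi = [100 * p / a if a else 0.0
           for p, a in zip(wilder_smooth(plus_dm, period), atr)]
    mdi = [100 * m / a if a else 0.0
           for m, a in zip(wilder_smooth(minus_dm, period), atr)]
    dx = [100 * abs(p - m) / (p + m) if (p + m) else 0.0
          for p, m in zip(pdi, mdi)]
    return wilder_smooth(dx, period)[-1]
```

On a perfect monotonic uptrend this returns 100 (maximal trend strength), which is a useful smoke test.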
### ⏳ Remaining Phase 1 Tasks (40/44 hours)
| # | Task | Files | Est. Time | Status |
|---|---|---|---|---|
| 2 | Efficiency Ratio | efficiency_ratio.py + tests | 4h | TODO |
| 3 | Autocorrelation | autocorrelation.py + tests | 3h | TODO |
| 4 | OU Half-Life | ou_process.py + tests | 9h | TODO |
| 5 | Normalized Slope | slope.py + tests | 3h | TODO |
| 6 | BB Bandwidth | bollinger.py + tests | 3h | TODO |
| 7 | Integration | Modify regime/engine.py | 8h | TODO |
| 8 | Validation | schema_validator.py | 6h | TODO |
| 9 | Dashboard | quality/dashboard.py | 4h | TODO |
### Resume Instructions

Environment Setup:

```bash
cd /home/coder/src/repos/market-making/metrics-service
source .venv/bin/activate                              # numpy, pytest installed
python -m pytest tests/regime/metrics/test_adx.py -v   # Verify: 8 passed
```

Next Task: Implement Efficiency Ratio (Task 2)
- Reference: `.ai/projects/market-making/phase-1-plan.md` (Day 3-4)
- Pattern: Write tests first (RED), implement (GREEN), refactor
- Formula: `ER = |Price[0] - Price[n]| / Σ|Price[i] - Price[i-1]|`
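The ER formula translates almost directly into code. A minimal sketch (the function name and default `period` are illustrative, not taken from the codebase):

```python
def efficiency_ratio(prices, period=10):
    """Kaufman Efficiency Ratio over the last `period` bars:
    net directional move divided by total path length.
    1.0 = perfectly efficient trend, ~0.0 = pure noise."""
    window = prices[-(period + 1):]
    net_change = abs(window[-1] - window[0])
    volatility = sum(abs(window[i] - window[i - 1])
                     for i in range(1, len(window)))
    return net_change / volatility if volatility else 0.0
```

A strictly monotonic series yields 1.0; a series that oscillates back to its start yields 0.0.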
Files Modified This Session:
- `repos/market-making/metrics-service/src/regime/metrics/__init__.py` (new)
- `repos/market-making/metrics-service/src/regime/metrics/adx.py` (new, 155 lines)
- `repos/market-making/metrics-service/tests/regime/metrics/__init__.py` (new)
- `repos/market-making/metrics-service/tests/regime/metrics/test_adx.py` (new, 125 lines, 8 tests)
## Executive Summary

The market-making tool has a solid technical foundation but is approximately 40-50% complete toward the stated MVP goals. The core regime detection engine works well, but the Grid Exit Strategy (the primary value proposition) is only partially implemented.

**CRITICAL ISSUE IDENTIFIED:** Hardcoded dummy values in the restart-gates evaluation (`repos/market-making/metrics-service/src/regime/engine.py`, lines 268-280 and 349-359) make the metrics YAMLs untrustworthy.
### Completion Estimate
- Total Effort: 180-250 hours
- Timeline: 9-12 weeks at 20h/week, or 4.5-6 weeks at 40h/week
- Priority Phases: P0 (Data Quality, Exit Strategy, Testing)
## Current State Analysis

### What's Working ✅

1. Regime Detection Engine (`src/regime/`)
   - ✅ Hourly OHLCV analysis from the KuCoin API
   - ✅ 4 regime classifications: RANGE_OK, RANGE_WEAK, TRANSITION, TREND
   - ✅ 15+ metrics per analysis (Bollinger Bands, mean reversion, volatility, trend)
   - ✅ Git-backed storage in the `market-maker-data` repository

Assessment: Core regime detection logic is well implemented.
2. Infrastructure (`infra/`)
   - ✅ Kubernetes CronJob running hourly at :01
   - ✅ ExternalSecrets integration for KuCoin API keys
   - ✅ Docker image build/deploy workflow
   - ✅ ArgoCD deployment patterns
   - ✅ Git-based persistence (no database required)

Assessment: Infrastructure is production-ready.
3. Notification System (Partial)
   - ✅ Pushover integration working
   - ✅ Basic regime alerts functional
   - ✅ Entry evaluator module created (`src/exit_strategy/entry_evaluator.py`)
   - ✅ Rate limiting (4h minimum between same-state notifications)

Assessment: Working but incomplete.
4. Grid Configuration Management
   - ✅ YAML-based grid configurations
   - ✅ History tracking via Git
   - ✅ Configuration versioning
   - ✅ Grid state determination from the history array

Assessment: Config management is functional.
## Critical Gaps Identified

### Gap 1: Data Quality - Hardcoded Dummy Values (P0 - CRITICAL)

Location: `repos/market-making/metrics-service/src/regime/engine.py`

The Problem (lines 268-280, duplicated at 349-359):
```python
# Mock values for now - these should come from actual analysis
# TODO: Extract these from the detailed_analysis once refined classification is implemented
trend_score = regime_state.trend_score or 50.0
mean_rev_score = regime_state.mean_rev_score or 50.0
adx = 25.0                            # TODO: Extract from analysis
adx_history = [25.0] * 10             # TODO: Extract from analysis
normalized_slope = 0.1                # TODO: Extract from analysis
efficiency_ratio = 0.4                # TODO: Extract from analysis
lag1_autocorr = -0.1                  # TODO: Extract from analysis
ou_half_life = 24.0                   # TODO: Extract from analysis
atr = 1500.0                          # TODO: Extract from analysis
atr_history = [1500.0] * 100          # TODO: Extract from analysis
bb_bandwidth = 0.02                   # TODO: Extract from analysis
bb_bandwidth_history = [0.02] * 10    # TODO: Extract from analysis
```

Impact:
- 🔴 Restart gates evaluation uses fake data
- 🔴 Grid creation recommendations are unreliable
- 🔴 Metrics YAMLs contain static values
- 🔴 Historical analysis cannot be trusted
- 🔴 No way to validate regime classifications
- 🔴 User quote: "I often don't trust the generated metrics yamls"
Missing Calculations:
- ADX (Average Directional Index) - Trend strength measurement
- Efficiency Ratio - Trend efficiency (Perry Kaufman formula)
- Lag-1 Autocorrelation - Mean reversion indicator
- OU Half-Life - Time for price to revert halfway to mean
- Normalized Slope - Price slope normalized by ATR
- Bollinger Band Bandwidth - Volatility measurement
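The two mean-reversion metrics are worth sketching because they are easy to get subtly wrong. The sketch below assumes an OLS fit of an AR(1)-style model for the OU half-life, one common estimation approach; the project may choose a different estimator, and all names are illustrative:

```python
import math

def lag1_autocorr(returns):
    """Lag-1 autocorrelation; negative values suggest mean reversion."""
    n = len(returns)
    mean = sum(returns) / n
    num = sum((returns[i] - mean) * (returns[i - 1] - mean)
              for i in range(1, n))
    den = sum((r - mean) ** 2 for r in returns)
    return num / den if den else 0.0

def ou_half_life(prices):
    """Fit dp_t = a + b * p_{t-1} via OLS; half-life = -ln(2)/b bars.
    Returns infinity when b >= 0 (no mean reversion detected)."""
    x = prices[:-1]
    y = [prices[i] - prices[i - 1] for i in range(1, len(prices))]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    b = cov / var if var else 0.0
    return -math.log(2) / b if b < 0 else float("inf")
```

A deterministic series that decays 10% of the way to its mean each bar recovers a half-life of ln(2)/0.1 ≈ 6.93 bars, which makes a convenient unit-test fixture.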
Effort: 40-60 hours (Phase 1)
### Gap 2: Grid Exit Strategy - NOT IMPLEMENTED (P0)

Location: `src/exit_strategy/evaluator.py`

Current Status: ~30% complete (stub implementation only)

What Exists:
- Basic `ExitState` enum (NORMAL, WARNING, LATEST_ACCEPTABLE_EXIT, MANDATORY_EXIT)
- Simple MANDATORY_EXIT trigger for the TREND regime
- Basic boundary-violation check (≥2 consecutive closes outside the range)
What's Missing:

**LATEST_ACCEPTABLE_EXIT Triggers**

Per the spec (`.ai/projects/market-making/grid-exit-strategy/spec.md`):
- ❌ TRANSITION persistence tracking (≥2 consecutive 4h bars OR ≥4 consecutive 1h bars)
- ❌ Mean reversion degradation (OU half-life ≥ 2× baseline)
- ❌ Volatility expansion ratio > 1.25 threshold
- ❌ Z-score reversion failure tracking
**WARNING Triggers**
- ❌ TRANSITION probability ≥ 40% (configurable)
- ❌ Regime confidence declining over 3 bars
- ❌ Efficiency Ratio rising above the range threshold
- ❌ Mean reversion speed slowing
- ❌ Volatility expansion 1.1-1.25× the range
- ❌ Require 2+ conditions to trigger (critical logic)
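The 2-of-N gating in the last bullet might look like the sketch below. All field names, config keys, and the 1.5× half-life multiplier are illustrative assumptions, not values from the spec:

```python
def check_warning_conditions(metrics, cfg):
    """Evaluate the five WARNING checks; return (fired, names of hits).
    A single noisy indicator can never trigger an alert on its own."""
    conf = metrics["confidence_history"][-3:]
    conditions = {
        "transition_prob": metrics["transition_probability"]
                           >= cfg["transition_prob_threshold"],
        "confidence_declining": all(conf[i] > conf[i + 1]
                                    for i in range(len(conf) - 1)),
        "er_rising": metrics["efficiency_ratio"] > cfg["er_range_threshold"],
        "reversion_slowing": metrics["ou_half_life"]
                             > 1.5 * cfg["half_life_baseline"],
        "vol_expanding": 1.1 <= metrics["vol_expansion_ratio"] <= 1.25,
    }
    hits = [name for name, hit in conditions.items() if hit]
    return len(hits) >= 2, hits
```

Keeping each check as a named boolean makes the notification self-explanatory: the alert can list exactly which conditions fired.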
**State Transition Tracking**
- ❌ Store previous exit states in Git
- ❌ Track state durations
- ❌ Prevent notification spam for the same state
**Historical Data Loading**
- ❌ Load the last N metrics files for persistence checks
- ❌ Cache recent history for performance
- ❌ Multi-timeframe analysis (1h + 4h bars)
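The persistence rule from the spec (≥2 consecutive 4h bars OR ≥4 consecutive 1h bars) reduces to a trailing-run count over loaded history. A sketch, assuming regime labels ordered oldest to newest:

```python
def transition_persisted(regimes_1h, regimes_4h):
    """Spec rule: TRANSITION has persisted when the most recent bars show
    >=4 consecutive 1h TRANSITION bars OR >=2 consecutive 4h bars."""
    def trailing_run(regimes):
        # Count how many of the newest bars are TRANSITION.
        run = 0
        for regime in reversed(regimes):
            if regime != "TRANSITION":
                break
            run += 1
        return run
    return trailing_run(regimes_1h) >= 4 or trailing_run(regimes_4h) >= 2
```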
Impact: The system cannot be trusted to alert when to exit grids, which defeats its entire purpose.
Effort: 50-70 hours (Phase 2)
### Gap 3: Position Risk Quantification - MISSING (P1)

Current Status: NOT IMPLEMENTED

What's Missing:

**Position Tracking**
```python
# DOES NOT EXIST - Need to implement:
class PositionTracker:
    def get_active_positions(self, grid_id: str) -> List[Position]:
        """Fetch actual open orders from the KuCoin API"""

    def calculate_unrealized_pnl(self, positions: List[Position]) -> float:
        """Current unrealized P&L"""
```

**Capital Risk Calculator**

```python
# DOES NOT EXIST - Need to implement:
class CapitalRiskCalculator:
    def calculate_capital_at_risk(self, inventory, current_price, stop_loss) -> float:
        """Inventory value × (current_price - stop_loss) / current_price"""

    def estimate_profit_giveback(self, peak_pnl, current_pnl, delay_hours) -> Tuple[float, float]:
        """Range estimate: [min_giveback, max_giveback]"""
```

Current Behavior: The risk assessment in `history.py` only looks at the config, not actual positions.
Impact: Notifications cannot show:
- "Capital at risk: $120.50"
- "Expected give-back if delayed 12h: $4-7"
- "Stop-loss distance: 0.85 ATR"
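To make the first figure concrete, here is the capital-at-risk formula from Gap 3 with made-up numbers chosen to reproduce the $120.50 example (a 5% gap between price and stop on $2,410 of inventory):

```python
def capital_at_risk(inventory_value, current_price, stop_loss):
    """Loss realized on the inventory if price falls from the current
    level to the stop: inventory_value * (price - stop) / price."""
    return inventory_value * (current_price - stop_loss) / current_price

# Illustrative numbers only: $2,410 inventory, price $96,400, stop $91,580.
risk = capital_at_risk(2410.0, 96400.0, 91580.0)  # -> 120.50
```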
Effort: 30-40 hours (Phase 3)
### Gap 4: Testing - NEARLY ZERO (P0)

Current Status: Minimal test coverage

Required:
- Unit tests for all metric calculations
- Unit tests for exit trigger logic
- Integration tests (regime → exit → notification flow)
- Backtesting framework to validate signal quality

Impact: The code cannot be refactored, and changes cannot be trusted, without tests.
Effort: 40-50 hours (Phase 4)
### Gap 5: Operational Improvements (P2)

Issues:
- ⚠️ Evaluation cadence: hourly (should be 15 minutes)
- ❌ Audit logging: not implemented
- ❌ KPI tracking: not implemented
- ⚠️ Documentation: minimal
Effort: 20-30 hours (Phase 5)
## Implementation Roadmap

### Phase 1: Data Trust & Quality (P0 - CRITICAL)

Duration: 2-3 weeks (40-60 hours)

Objective: Remove all hardcoded dummy values and implement real metric calculations
Tasks:

1. Implement missing metric calculations (20-30h)
   - ADX (Average Directional Index)
   - Efficiency Ratio
   - Lag-1 Autocorrelation
   - OU Half-Life
   - Normalized Slope
   - Bollinger Band Bandwidth

2. Extract metrics from regime analysis (8-12h)
   - Modify `src/regime/engine.py` lines 268-280, 349-359
   - Remove all hardcoded values
   - Extract from the `detailed_analysis` dict

3. Add data validation (8-12h)
   - Schema validator for metrics YAMLs
   - Sanity checks per metric type
   - Automated validation on Git commit

4. Create data quality dashboard (4-6h)
   - Visual indicators: real vs. dummy data
   - Historical trend validation
   - Anomaly detection

5. Unit tests (8-12h)
   - 22+ test cases covering all metrics
   - Validation against TA-Lib/TradingView
   - 90%+ code coverage
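The sanity checks in task 3 could be as simple as a per-metric range table plus a flat-history check, which directly catches the old hardcoded dummies. Ranges and field names below are illustrative, not a defined schema:

```python
# Illustrative sanity ranges per metric; a value outside its range
# (or a suspiciously constant history) fails validation.
SANITY_RANGES = {
    "adx": (0.0, 100.0),
    "efficiency_ratio": (0.0, 1.0),
    "lag1_autocorr": (-1.0, 1.0),
    "bb_bandwidth": (0.0, 1.0),
}

def validate_metrics(doc):
    """Return a list of validation errors for one parsed metrics YAML."""
    errors = []
    for name, (lo, hi) in SANITY_RANGES.items():
        value = doc.get(name)
        if value is None:
            errors.append(f"{name}: missing")
        elif not lo <= value <= hi:
            errors.append(f"{name}: {value} outside [{lo}, {hi}]")
    # A perfectly flat history is the signature of the old dummies.
    history = doc.get("adx_history", [])
    if len(history) > 3 and len(set(history)) == 1:
        errors.append("adx_history: constant values look like dummy data")
    return errors
```

Running this in a pre-commit hook on the metrics repository would stop bad data from ever being committed.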
Success Criteria:
- ✅ All TODOs removed from `regime/engine.py`
- ✅ All metrics calculated from real data
- ✅ Data validation prevents bad data from being committed
- ✅ Quality dashboard shows 100% real data
- ✅ User confirms: "I trust the metrics YAMLs now"
### Phase 2: Complete Grid Exit Strategy (P0)

Duration: 2-3 weeks (50-70 hours)

Objective: Implement all missing exit triggers and state tracking

Tasks:

1. LATEST_ACCEPTABLE_EXIT triggers (8-12h)
   - TRANSITION persistence tracking
   - Mean reversion degradation checks
   - Volatility expansion detection
   - Z-score reversion failure

2. WARNING triggers (4-6h)
   - 5 condition checks
   - Require 2+ conditions to trigger
   - Configurable thresholds

3. State transition tracking (4-6h)
   - Store state history in Git
   - Prevent notification spam
   - Track state durations

4. Historical data loading (4-6h)
   - Load the last 12-24 hours of metrics
   - Multi-timeframe analysis (1h + 4h)
   - Caching for performance

5. Integration & testing (8-12h)
   - Wire up all triggers
   - End-to-end tests
   - Real-data validation

6. Configuration & documentation (4-6h)
   - `exit_strategy_config.yaml`
   - Trigger logic documentation
Success Criteria:
- ✅ All exit triggers implemented and tested
- ✅ State tracking working
- ✅ Historical data loading functional
- ✅ Integration validated with real data
### Phase 3: Position Risk Quantification (P1)

Duration: 1-2 weeks (30-40 hours)

Objective: Add real position tracking and capital risk calculations

Tasks:

1. KuCoin Position Tracker (8-12h)
   - Fetch active positions from the API
   - Calculate unrealized PnL
   - Inventory imbalance tracking

2. Capital Risk Calculator (6-8h)
   - Capital-at-risk calculation
   - Profit give-back estimation
   - Stop-loss distance in ATR units

3. Enhance notifications (4-6h)
   - Add risk metrics to all alerts
   - Update notification templates

4. Error handling (4-6h)
   - Graceful degradation on API failures
   - Circuit-breaker pattern
   - Clear error messaging

5. Testing (8-12h)
   - Unit tests with a mocked KuCoin client
   - Integration tests
   - Manual validation against the KuCoin UI
Success Criteria:
- ✅ Position tracking working
- ✅ Risk calculations accurate
- ✅ Notifications enhanced with risk data
### Phase 4: Testing & Validation (P0)

Duration: 1 week (40-50 hours)

Objective: Comprehensive test coverage and backtesting validation

Tasks:

1. Unit tests for metrics (8-10h)
   - 22+ test cases for all calculations
   - Edge cases covered
   - 90%+ coverage

2. Unit tests for exit triggers (10-12h)
   - 26+ test cases
   - Boundary conditions tested

3. Integration tests (10-12h)
   - End-to-end flow tests
   - Multi-timeframe analysis
   - Git integration
   - Notification delivery

4. Backtesting framework (12-16h)
   - Replay historical metrics
   - Evaluate exit quality
   - Profit preservation analysis
   - KPI validation

5. CI/CD integration (4-6h)
   - GitHub Actions workflow
   - Quality gates
   - Coverage reports
Success Criteria:
- ✅ 80%+ code coverage
- ✅ All critical paths tested
- ✅ Backtesting shows the system would have worked on historical data
- ✅ CI/CD pipeline running
### Phase 5: Operational Improvements (P2)

Duration: 1 week (20-30 hours)

Objective: Production readiness and observability

Tasks:

1. 15-minute evaluation cadence (0.5h)
   - Update the CronJob schedule
   - Or create a separate exit-evaluation job

2. Audit logging (3-4h)
   - Log all exit state transitions
   - Track notification delivery
   - Record operator actions

3. KPI tracking (4-6h)
   - Implement KPI calculations
   - Monthly report generation
   - Trend analysis

4. Documentation (4-6h)
   - Operational runbook
   - Troubleshooting guide
   - Metrics interpretation guide

5. Monitoring & alerting (4-6h)
   - Prometheus metrics
   - Grafana dashboard
   - Alert configuration

6. Performance optimization (4-6h)
   - Caching
   - Async processing
   - Benchmarking
Success Criteria:
- ✅ 15-min cadence running
- ✅ Audit logging complete
- ✅ KPIs tracked
- ✅ Production-ready
## Risk Assessment

### Technical Risks
| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Metric calculations incorrect | High | Medium | Extensive unit tests, validation against known indicators |
| False MANDATORY_EXIT signals | High | Medium | Require multiple confirming indicators, tune thresholds |
| Missed regime transitions | High | Low | 15-min cadence, multi-timeframe confirmation |
| KuCoin API rate limiting | Medium | Low | Cache data, backoff strategy |
| Git push failures | Medium | Low | Retry logic, local backup |
### Operational Risks
| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Operator misses notification | High | Medium | Multi-channel delivery, escalating urgency |
| Notification fatigue | Medium | High | Smart rate limiting, clear urgency indicators |
| Grid stopped unnecessarily | Medium | Medium | Backtesting, tunable thresholds, track False Exit Rate |
## Success Metrics (KPIs)

Per `.ai/projects/market-making/new-instructions.md`:

### Exit Quality KPIs

1. Exit Within Acceptable Window (EAW%) ≥ 90%
   - Formula: `ExitsBeforeMandatory / TotalExitEvents`

2. Profit Retention Ratio (PRR) ≥ 0.75
   - Formula: `RealizedProfitAtExit / MaxUnrealizedProfitBeforeExit`

3. Stop-Loss Avoidance Rate (SLAR) ≥ 95%
   - Formula: `ExitsBeforeStop / TotalGridsStopped`

4. True Transition Detection Rate (TTDR) ≥ 70%
   - Formula: `TransitionExitsWithFollowThrough / TotalTransitionExits`

5. Mandatory Exit Compliance (MEC%) = 100%
   - Formula: `CompliedMandatoryExits / MandatoryExitSignals`
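Each KPI is a plain ratio, so a tracker needs little more than the sketch below. Event field names are illustrative, and PRR is aggregated across events here (a per-exit average would be an equally valid reading of the formula):

```python
def exit_quality_kpis(events):
    """Compute KPI ratios from a list of closed grid episodes.
    Boolean fields count as 0/1 when summed."""
    total = len(events)
    if total == 0:
        return {}
    mandatory_signals = sum(e["mandatory_signaled"] for e in events)
    return {
        "EAW": sum(e["exited_before_mandatory"] for e in events) / total,
        "PRR": (sum(e["realized_profit"] for e in events)
                / sum(e["max_unrealized_profit"] for e in events)),
        "MEC": (sum(e["complied_with_mandatory"] for e in events)
                / mandatory_signals if mandatory_signals else 1.0),
    }
```

SLAR and TTDR follow the same pattern once stop-outs and follow-through are recorded per episode.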
## Architecture Recommendations

### 1. Implement the Exit State Engine First

```python
# src/exit_strategy/state_engine.py
class ExitStateEngine:
    def evaluate(self, regime: Dict, grid: Dict) -> ExitState:
        """
        Main entry point. Returns one of:
        - NORMAL
        - WARNING
        - LATEST_ACCEPTABLE_EXIT
        - MANDATORY_EXIT
        """
        if self._check_mandatory_exit(regime, grid):
            return ExitState.MANDATORY_EXIT
        elif self._check_latest_acceptable_exit(regime, grid):
            return ExitState.LATEST_ACCEPTABLE_EXIT
        elif self._check_warning(regime, grid):
            return ExitState.WARNING
        else:
            return ExitState.NORMAL
```

### 2. Refactor the Notification System
Current: Monolithic script with mixed concerns

Proposed:

```
send_regime_notifications.py (orchestrator)
├── src/exit_strategy/state_engine.py     (exit state classification)
├── src/exit_strategy/message_builder.py  (notification content)
└── src/exit_strategy/pushover_client.py  (delivery)
```
### 3. Add a Position Tracker

```python
# src/position/tracker.py
class PositionTracker:
    def get_active_positions(self, grid_id: str) -> List[Position]:
        """Fetch actual open orders from the KuCoin API"""

    def calculate_pnl(self, positions: List[Position]) -> PnLSummary:
        """Calculate unrealized PnL"""
```

### 4. Separate Risk Assessment from Metrics Collection
Current: `metrics/history.py` does both

Proposed:

```
src/
  metrics/
    collector.py     # Fetch & store metrics
  risk/
    assessor.py      # Analyze metrics → risk level
  exit_strategy/
    state_engine.py  # Risk + regime → exit state
```
## File Structure After Completion

```
repos/market-making/metrics-service/
├── src/
│   ├── regime/
│   │   ├── metrics/                  # NEW
│   │   │   ├── adx.py
│   │   │   ├── efficiency_ratio.py
│   │   │   ├── autocorrelation.py
│   │   │   ├── ou_process.py
│   │   │   ├── slope.py
│   │   │   └── bollinger.py
│   │   ├── validation/               # NEW
│   │   │   └── schema_validator.py
│   │   ├── quality/                  # NEW
│   │   │   └── dashboard.py
│   │   └── engine.py                 # MODIFIED (TODOs removed)
│   ├── exit_strategy/
│   │   ├── triggers/                 # NEW
│   │   │   ├── mandatory.py
│   │   │   ├── latest_acceptable.py
│   │   │   └── warning.py
│   │   ├── evaluator.py              # ENHANCED
│   │   ├── state_tracker.py          # ENHANCED
│   │   ├── history_loader.py         # NEW
│   │   ├── audit_logger.py           # NEW
│   │   └── kpis.py                   # NEW
│   ├── position/                     # NEW
│   │   ├── tracker.py
│   │   └── risk_calculator.py
│   └── ...
├── tests/
│   ├── regime/metrics/               # NEW (22 test cases)
│   ├── exit_strategy/triggers/       # NEW (26 test cases)
│   ├── position/                     # NEW
│   └── integration/                  # ENHANCED
├── backtest/                         # NEW
│   └── regime_exit_backtest.py
├── config/
│   └── exit_strategy_config.yaml     # NEW
└── docs/                             # NEW
    ├── ops/
    │   ├── runbook.md
    │   └── troubleshooting.md
    ├── metrics_guide.md
    └── configuration.md
```
## Immediate Next Steps

### Week 1: Start Phase 1

1. Setup (Day 1-2)
   - Review this document
   - Set up the development environment
   - Create a feature branch: `feature/phase-1-data-quality`

2. Implement Metrics (Day 3-5)
   - ADX calculation
   - Efficiency Ratio
   - Unit tests (10 test cases)
   - Validate against TradingView

3. Continue (Week 2-3)
   - Remaining metrics
   - Integration with the regime engine
   - Data validation
   - Quality dashboard
## Related Documentation

- Detailed SOW: `.builders/0013-market-maker-mvp/SYSTEM_ANALYSIS.md`
- Phase 1 Plan: `.builders/0013-market-maker-mvp/PHASE_1_PLAN.md`
- Original Review: `.ai/projects/market-making/SYSTEM_REVIEW.md`
- Exit Strategy Spec: `.ai/projects/market-making/grid-exit-strategy/spec.md`
- Requirements: `.ai/projects/market-making/regime-management/requirements.md`
- New Instructions: `.ai/projects/market-making/new-instructions.md`
## Conclusion

The market-making system has a strong technical foundation but requires focused effort to complete:
- 🔴 Phase 1 (Data Quality) - CRITICAL: fix the dummy data immediately
- 🔴 Phase 2 (Exit Strategy) - CRITICAL: the core value proposition
- 🟡 Phase 3 (Position Risk) - IMPORTANT: enhances notifications
- 🔴 Phase 4 (Testing) - CRITICAL: cannot deploy without it
- 🟢 Phase 5 (Operational) - NICE TO HAVE: polish and monitoring

Total: 180-250 hours over 7-10 weeks

Recommendation: Start Phase 1 immediately. It blocks everything else and addresses the root cause of the trust issues with the system.
Document Version: 1.0
Last Updated: 2026-01-31
Next Review: After Phase 1 completion