0004 - Regime Management: Snapshot Architecture

Summary

This specification describes the remaining work to complete the snapshot-based regime detection architecture for the market-making system. The core change separates data collection (creating comprehensive market snapshots) from regime analysis (consuming those snapshots), which makes the analysis testable in isolation, allows historical analysis, and clarifies the architecture.

Problem Statement

The current regime detection system has tight coupling between data collection and regime analysis. This creates several issues:

Testability: Cannot easily test regime analysis in isolation without mocking exchange APIs
Historical Analysis: Cannot run regime detection on past data without re-fetching from exchanges
Separation of Concerns: Metrics collection performs both data gathering and analysis in a single pass
Validation Coverage: Multiple checkpoint tests remain incomplete (grid configuration, enabled status, restart gates, snapshot architecture)

The system needs to complete the transition to a snapshot-based architecture where:

Data collection creates self-contained snapshot files with all data needed for regime analysis
Regime analysis operates solely on pre-collected snapshot files
Historical backtesting uses the same regime analysis code as live operation

Goals

Complete snapshot file reader (Task 12.1): Enable regime analysis to read from pre-collected YAML snapshot files containing minute-level price data
Refactor regime engine for snapshots (Task 12.2): Remove direct API dependencies from regime classification, using snapshot data instead
Implement snapshot-based market data service (Task 12.3): Create a market data layer that reads from snapshots rather than live APIs
Update recommendation generator (Task 12.4): Ensure recommendations can be generated from historical snapshot data
Add snapshot validation (Task 12.5): Implement integrity checks ensuring snapshots contain complete data (60 minute-level prices per hour)
Separate metrics collection from analysis (Task 13.1): Focus metrics collector purely on creating comprehensive snapshots
Create standalone regime analysis service (Task 13.2): Extract regime analysis into a service that consumes snapshot files
Update dashboard for snapshots (Task 13.3): Modify dashboard generation to use snapshot-based regime results
Update backtesting system (Task 13.4): Enable backtesting to run regime analysis on historical snapshot files
Validate checkpoints (Tasks 4.6, 7.2, 7.6, 10, 14): Complete all outstanding checkpoint test validations

Non-Goals

Adding new regime classification algorithms (the classification logic is already implemented)
Changing the YAML snapshot file format (the format is established)
Adding new exchange integrations
Modifying the grid trading execution logic
Implementing real-time streaming (system uses hourly batch collection)

Technical Approach

Phase 1: Snapshot File Infrastructure (Tasks 12.1, 12.5)

Create snapshot file reader and validation:

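# Snapshot file reader for regime analysis (interface only; bodies omitted)
from datetime import datetime
from typing import List

class SnapshotReader:
    # MarketSnapshot and ValidationResult are quoted because they are defined elsewhere
    def load_snapshot(self, market: str, timestamp: datetime) -> "MarketSnapshot": ...
    def load_range(self, market: str, start: datetime, end: datetime) -> List["MarketSnapshot"]: ...
    def validate_snapshot(self, snapshot: "MarketSnapshot") -> "ValidationResult": ...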

Validation checks:

60 minute-level prices per hour
Required fields present (market_summary, grid_config, regime_analysis)
Data consistency across related snapshots
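
The first two checks above could look roughly like the following. This is a minimal sketch, not the existing implementation: it assumes a snapshot has already been parsed into a dict, and the `minute_prices` key, the `errors` list, and the `ok` property are hypothetical names chosen for illustration. Cross-snapshot consistency is left out because the spec does not define it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Field names taken from the validation checks listed in this spec.
REQUIRED_FIELDS = ("market_summary", "grid_config", "regime_analysis")
EXPECTED_PRICES_PER_HOUR = 60


@dataclass
class ValidationResult:
    """Hypothetical result type; the real one may carry more detail."""
    errors: List[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def validate_snapshot(snapshot: Dict[str, Any]) -> ValidationResult:
    """Check required fields and minute-level completeness of one hourly snapshot."""
    result = ValidationResult()
    for name in REQUIRED_FIELDS:
        if name not in snapshot:
            result.errors.append(f"missing required field: {name}")
    prices = snapshot.get("minute_prices")  # assumed key name
    if not isinstance(prices, list):
        result.errors.append("minute_prices is missing or not a list")
    elif len(prices) != EXPECTED_PRICES_PER_HOUR:
        result.errors.append(
            f"expected {EXPECTED_PRICES_PER_HOUR} minute prices, got {len(prices)}"
        )
    return result
```

Collecting every error rather than stopping at the first makes a failed snapshot easier to diagnose before backfilling.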

Phase 2: Regime Engine Refactor (Tasks 12.2, 12.3)

Modify regime engine to accept snapshot data instead of calling APIs:

# Current: Engine calls exchange APIs directly
result = regime_engine.analyze(symbol, exchange_client)
 
# Target: Engine receives pre-collected snapshot data
snapshot = snapshot_reader.load_snapshot(symbol, timestamp)
result = regime_engine.analyze_snapshot(snapshot)

Create snapshot-based market data service:

Replace live API calls with snapshot file access
Add caching layer for frequently accessed data
Support time-range queries across multiple snapshot files
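
One possible shape for such a service is sketched below. It assumes hourly snapshots keyed by market and hour, an injected `loader` callable that stands in for reading a snapshot file, and a `minute_prices` key inside each snapshot; all three are illustrative assumptions. The cache here is an unbounded dict, which a real service would likely cap.

```python
from datetime import datetime, timedelta
from typing import Any, Callable, Dict, List, Tuple

Snapshot = Dict[str, Any]
Loader = Callable[[str, datetime], Snapshot]  # hypothetical file-reading hook


def _floor_to_hour(ts: datetime) -> datetime:
    return ts.replace(minute=0, second=0, microsecond=0)


class SnapshotMarketDataService:
    """Serves market data from hourly snapshots instead of live API calls."""

    def __init__(self, loader: Loader) -> None:
        self._loader = loader
        self._cache: Dict[Tuple[str, datetime], Snapshot] = {}

    def get_snapshot(self, market: str, ts: datetime) -> Snapshot:
        """Return the snapshot covering ts, loading it at most once."""
        key = (market, _floor_to_hour(ts))
        if key not in self._cache:
            self._cache[key] = self._loader(market, key[1])
        return self._cache[key]

    def get_prices(self, market: str, start: datetime, end: datetime) -> List[float]:
        """Concatenate minute prices from every hourly snapshot overlapping [start, end)."""
        prices: List[float] = []
        hour = _floor_to_hour(start)
        while hour < end:
            prices.extend(self.get_snapshot(market, hour)["minute_prices"])
            hour += timedelta(hours=1)
        return prices
```

Injecting the loader keeps the service testable without touching disk, which matches the testability goal stated earlier.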

Phase 3: Service Separation (Tasks 13.1, 13.2)

Split metrics collector responsibilities:

Metrics Collector (data collection only):

Fetch market data from exchange APIs
Collect grid status and configuration
Create comprehensive snapshot YAML files
No regime analysis logic

Regime Analysis Service (analysis only):

Read snapshot files
Perform regime classification
Generate recommendations
Support historical and current analysis
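
The boundary between the two services can be illustrated with a small sketch. Everything named here (`SnapshotStore`, `InMemoryStore`, the `fetch_prices` and `classify` callables, the `minute_prices` key) is hypothetical; the in-memory store stands in for the YAML snapshot files so the example runs without disk I/O.

```python
from typing import Any, Callable, Dict, List, Protocol, Tuple

Snapshot = Dict[str, Any]


class SnapshotStore(Protocol):
    """Hypothetical storage boundary shared by both services."""

    def write(self, market: str, hour: str, snapshot: Snapshot) -> None: ...
    def read(self, market: str, hour: str) -> Snapshot: ...


class InMemoryStore:
    """Stand-in for the YAML snapshot files."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, str], Snapshot] = {}

    def write(self, market: str, hour: str, snapshot: Snapshot) -> None:
        self._data[(market, hour)] = snapshot

    def read(self, market: str, hour: str) -> Snapshot:
        return self._data[(market, hour)]


class MetricsCollector:
    """Collection only: fetches data and writes snapshots, no regime logic."""

    def __init__(self, fetch_prices: Callable[[str], List[float]], store: SnapshotStore) -> None:
        self._fetch_prices = fetch_prices
        self._store = store

    def collect(self, market: str, hour: str) -> None:
        self._store.write(market, hour, {"minute_prices": self._fetch_prices(market)})


class RegimeAnalysisService:
    """Analysis only: reads snapshots and never calls the exchange."""

    def __init__(self, store: SnapshotStore, classify: Callable[[Snapshot], str]) -> None:
        self._store = store
        self._classify = classify

    def analyze(self, market: str, hour: str) -> str:
        return self._classify(self._store.read(market, hour))
```

The only thing the two classes share is the store, so the analysis side can be exercised against hand-written snapshots with no exchange client in scope.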

Phase 4: Dependent System Updates (Tasks 13.3, 13.4, 12.4)

Update systems that depend on regime analysis:

Dashboard reads regime results from snapshot files
Backtesting runs regime analysis on historical snapshots
Recommendation generator works with snapshot-based regime detection
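
For the backtesting case, the key property is that the loop calls the same analysis function the live path uses. A rough sketch follows, assuming each snapshot dict carries a `timestamp` key and that `analyze_snapshot` returns a regime label; both are assumptions made for illustration rather than details confirmed by this spec.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Snapshot = Dict[str, Any]


def run_backtest(
    snapshots: Iterable[Snapshot],
    analyze_snapshot: Callable[[Snapshot], str],
) -> List[Tuple[str, str]]:
    """Return (timestamp, regime) for each historical snapshot, in input order."""
    return [(s["timestamp"], analyze_snapshot(s)) for s in snapshots]


def regime_transitions(results: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (timestamp, previous_regime, new_regime) wherever the regime changes."""
    transitions = []
    for (_, prev), (ts, cur) in zip(results, results[1:]):
        if cur != prev:
            transitions.append((ts, prev, cur))
    return transitions
```

A dashboard view of historical regimes could be built from the same `(timestamp, regime)` pairs.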

Phase 5: Checkpoint Validation (Tasks 4.6, 7.2, 7.6, 10, 14)

Complete all checkpoint test validations:

Task	Checkpoint	Status	Remaining Work
4.6	Grid configuration management tests	Partial	Property tests 39-48
7.2	Integration tests for enhanced metrics	Partial	n8n webhook, per-market file tests
7.6	Enabled status awareness tests	Partial	Property tests 49-53
10	Restart gates tests	Partial	Property tests 54-74
14	Snapshot-based architecture tests	Not started	Full test suite

Success Criteria

Snapshot Independence: Regime analysis produces identical results whether run on live data or snapshot files
Complete Validation: All checkpoint tests pass (4.6, 7.2, 7.6, 10, 14)
Historical Analysis: Can run regime detection on any historical period with existing snapshots
Data Integrity: Snapshot validation catches incomplete or corrupted data before analysis
Service Separation: Clear boundary between data collection and regime analysis components
Test Coverage: Property tests 77-78 implemented for snapshot-based regime detection

Dependencies

Existing Implementation: Tasks 1-11 are largely complete, providing the foundation
Snapshot File Format: Established YAML format with minute-level price arrays
Regime Classification Engine: Fully implemented (range discovery, feature calculation, score aggregation)
Grid Restart Gates: Implemented but tests incomplete

Risks

Risk	Impact	Mitigation
Snapshot/API result divergence	High	Property tests comparing snapshot vs live results
Performance with large snapshot files	Medium	Implement caching layer, lazy loading
Historical data gaps	Medium	Validation catches missing data, backfill tooling exists
Regression in live system	High	Run parallel comparison before cutover
Test suite execution time	Low	Property tests are marked optional with `*`
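
The parallel comparison named as a mitigation above could be driven by a small harness like this one. It is a sketch under assumptions: `live_analyze` and `snapshot_analyze` are hypothetical callables that take the same key (for example a market and hour) and return comparable results.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass(frozen=True)
class Divergence:
    """One case where the live path and the snapshot path disagreed."""
    key: Any
    live_result: Any
    snapshot_result: Any


def find_divergences(
    keys: Iterable[Any],
    live_analyze: Callable[[Any], Any],
    snapshot_analyze: Callable[[Any], Any],
) -> List[Divergence]:
    """Run both paths for every key and collect the mismatches."""
    divergences = []
    for key in keys:
        live = live_analyze(key)
        snap = snapshot_analyze(key)
        if live != snap:
            divergences.append(Divergence(key, live, snap))
    return divergences
```

An empty result over a representative period would be one reasonable condition for cutover; the acceptance threshold itself is not defined in this spec.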

Appendix: High-Priority Task Details

Checkpoint Tests (Partial)

Task 4.6 - Grid Configuration Management

Property tests 39-48 not yet written
Tests probationary grid parameters, validation criteria, quick stop triggers
Tests configuration version consistency, metrics completeness

Task 7.2 - Enhanced Metrics Integration

Integration test for collection-to-dashboard workflow incomplete
n8n webhook integration testing needed
Per-market file creation verification needed

Task 7.6 - Enabled Status Awareness

Property tests 49-53 not yet written
Tests enabled status consideration, disabled grid recommendations
Tests grid repositioning recommendations

Task 10 - Restart Gates

Property tests 54-74 not yet written
Tests gate state initialization, condition evaluation, blocking behavior
Tests gate progression, regime transitions, fresh grid parameters

Snapshot Architecture Tasks (Partial)

Task 12.1 - Snapshot File Reader

Skeleton exists but needs completion
Needs minute-level price data parsing
Needs multi-market snapshot loading

Task 12.2 - Regime Engine Refactor

Engine structure exists but still has API dependencies
Needs snapshot data injection pattern
Needs removal of direct exchange calls

Task 12.3 - Snapshot Market Data Service

Not yet implemented
Needs caching layer
Needs time-range query support

Task 12.4 - Recommendation Generator Update

Works with current regime engine
Needs snapshot-based regime analysis integration
Needs grid comparison from snapshot data

Task 12.5 - Snapshot Validation

Basic structure exists
Needs 60-minute completeness check
Needs cross-file consistency validation

Service Separation Tasks (Not Started)

Task 13.1 - Metrics Collector Update

Remove regime analysis from collection
Focus on comprehensive snapshot creation
Ensure all regime analysis inputs captured

Task 13.2 - Regime Analysis Service

Extract into standalone service
Add historical period analysis API
Support batch analysis for backtesting

Task 13.3 - Dashboard Update

Modify to read snapshot-based regime results
Update chart generation for snapshot data
Support historical regime visualization

Task 13.4 - Backtesting Update

Use snapshot-based regime detection
Run analysis on historical snapshot files
Ensure consistency with live detection

Task 14 - Architecture Checkpoint

Full test suite for snapshot architecture
Verify data collection/analysis separation
Confirm regime detection works with snapshots
