tradai-common Design Document¶
Overview¶
tradai-common provides shared utilities, base classes, and AWS integrations for the TradAI platform. This library follows SOLID principles and applies DRY patterns throughout.
Architecture Decisions¶
What We Keep (Improved)¶
- base_service.py - Base class for services with Hydra config loading
  - ADD: Docstrings, settings_class validation
  - KEEP: LoggerMixin composition pattern
- base_settings.py - Pydantic settings with S3 support
  - ADD: Path validation, proper return types
  - KEEP: Strict mode, frozen entities
- logger_mixin.py - Logging mixin for composition
  - KEEP: Already well-designed
- common.py - Utility functions for time, quantization
  - KEEP: Excellent type hints
  - ADD: More validation where needed
- cache.py - Caching mechanisms
  - KEEP: Already well-designed with Final types
- mlflow_client.py - MLflow REST client
  - REMOVE: All test code (lines 530-550)
  - IMPROVE: Error handling
What We Fix (Security)¶
- secrets_manager.py (rename from secrets_maneger.py)
  - FIX: Thread-safe with lock
  - FIX: Secret name from environment variables
  - FIX: Proper error handling
  - REMOVE: Test code
- ecr_util.py - ECS job manager
  - FIX: All AWS config from environment variables
  - REMOVE: Hardcoded ARNs, subnets, security groups
  - REMOVE: Test code (lines 519-560)
  - KEEP: Async patterns, CloudWatch streaming
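The secrets_manager.py fixes (lock, secret name from the environment, explicit errors) can be illustrated with a minimal sketch. It is not the actual module: the class name `SecretCache`, the environment variable `TRADAI_SECRET_NAME`, and the injected `fetch` callable are all assumptions made for illustration, with `fetch` standing in for a Secrets Manager lookup.

```python
import os
import threading
from typing import Callable, Dict


class SecretCache:
    """Caches secret values behind a lock so each secret is fetched once."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        env_var: str = "TRADAI_SECRET_NAME",  # hypothetical variable name
    ) -> None:
        self._fetch = fetch  # stand-in for a Secrets Manager call
        self._env_var = env_var
        self._lock = threading.Lock()
        self._cache: Dict[str, str] = {}

    def get(self) -> str:
        # The secret name comes from the environment, never from source code.
        name = os.environ.get(self._env_var)
        if not name:
            raise RuntimeError(f"{self._env_var} is not set")
        # The lock covers both the lookup and the fetch, so concurrent
        # callers cannot trigger two fetches for the same secret.
        with self._lock:
            if name not in self._cache:
                self._cache[name] = self._fetch(name)
            return self._cache[name]
```

Injecting `fetch` keeps the sketch testable without AWS; in a real service it would wrap the boto3 Secrets Manager client.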
What We Add (SOLID)¶
- exceptions.py - Custom exception hierarchy
  - NEW: TradAIError base class
  - NEW: ValidationError, NotFoundError, ConfigurationError, etc.
- repositories.py - Repository pattern ABCs
  - NEW: Repository[T] generic base
  - NEW: DataRepository for market data
  - NEW: Dependency Inversion principle
- s3_utils.py - S3 path parsing (DRY)
  - NEW: S3Path frozen dataclass
  - NEW: Centralized parsing logic
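As an illustration of the S3Path idea, the sketch below shows one way a frozen dataclass can centralize S3 URI parsing. The field names and the `parse` classmethod are assumptions for illustration; the real S3Path may expose a different interface.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class S3Path:
    """Immutable bucket/key pair parsed from an s3:// URI."""

    bucket: str
    key: str

    @classmethod
    def parse(cls, uri: str) -> "S3Path":
        # Hypothetical constructor name; rejects anything that is not s3://bucket/...
        parsed = urlparse(uri)
        if parsed.scheme != "s3" or not parsed.netloc:
            raise ValueError(f"Not a valid S3 URI: {uri!r}")
        return cls(bucket=parsed.netloc, key=parsed.path.lstrip("/"))

    def __str__(self) -> str:
        return f"s3://{self.bucket}/{self.key}"
```

Because the dataclass is frozen, assigning to `bucket` or `key` after construction raises an error, which fits the immutability goal stated elsewhere in this document.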
What We Skip (Too Complex/Refactor Later)¶
- scope.py - Thread-unsafe, overly complex
  - SKIP: Will redesign with dependency injection later
  - REASON: 500+ lines, global state, tight coupling
- log.py - 50+ methods, too complex
  - SKIP: Will simplify later if needed
  - REASON: Logging is already handled by logger_mixin
- strategy_config_mixin.py - Service-specific logic
  - SKIP: Move to strategy service layer later
  - REASON: Violates Single Responsibility
Design Patterns Applied¶
1. Repository Pattern (Dependency Inversion)¶
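from abc import ABC, abstractmethod
from typing import Any

import pandas as pd

# Abstract interface
class DataRepository(ABC):
    @abstractmethod
    def get_ohlcv(self, *args: Any, **kwargs: Any) -> pd.DataFrame:
        """Return OHLCV candles (parameters elided here)."""

# Concrete implementation
class ArcticDBDataRepository(DataRepository):
    def get_ohlcv(self, *args: Any, **kwargs: Any) -> pd.DataFrame:
        # Implementation (ArcticDB read) goes here
        raise NotImplementedError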
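To show why the abstraction matters, here is a self-contained sketch of a generic repository with an in-memory implementation and a consumer that depends only on the interface. The method names `get` and `save`, and the `CandleCounter` consumer, are assumptions for illustration and not the actual Repository[T] API.

```python
from abc import ABC, abstractmethod
from typing import Dict, Generic, List, Optional, TypeVar

T = TypeVar("T")


class Repository(ABC, Generic[T]):
    """Abstract storage interface; callers never see the backend."""

    @abstractmethod
    def get(self, key: str) -> Optional[T]:
        ...

    @abstractmethod
    def save(self, key: str, item: T) -> None:
        ...


class InMemoryRepository(Repository[T]):
    """Dictionary-backed implementation, handy for unit tests."""

    def __init__(self) -> None:
        self._items: Dict[str, T] = {}

    def get(self, key: str) -> Optional[T]:
        return self._items.get(key)

    def save(self, key: str, item: T) -> None:
        self._items[key] = item


class CandleCounter:
    """High-level code that depends on the abstraction only."""

    def __init__(self, repo: Repository[List[float]]) -> None:
        self._repo = repo

    def count(self, symbol: str) -> int:
        candles = self._repo.get(symbol)
        return len(candles) if candles is not None else 0
```

Swapping `InMemoryRepository` for a database-backed class requires no change to `CandleCounter`, which is the point of Dependency Inversion.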
2. Frozen Entities (Immutability)¶
from pydantic import BaseModel, ConfigDict

class Strategy(BaseModel):
    # Pydantic v2 style: instances are immutable after construction
    model_config = ConfigDict(frozen=True)

    name: str
    version: str
3. Mixin Pattern (Composition over Inheritance)¶
import logging

class LoggerMixin:
    @property
    def logger(self) -> logging.Logger:
        ...

class MyService(LoggerMixin):
    # Gets logging for free
    pass
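The property body is elided above. One plausible way to fill it in, assuming a logger named after the concrete class (the real logger_mixin.py may choose names differently), is:

```python
import logging


class LoggerMixin:
    @property
    def logger(self) -> logging.Logger:
        # logging.getLogger returns the same object for the same name,
        # so no per-instance state is needed.
        cls = type(self)
        return logging.getLogger(f"{cls.__module__}.{cls.__name__}")


class MyService(LoggerMixin):
    def run(self) -> str:
        self.logger.info("service started")
        return self.logger.name
```

Each subclass gets its own logger name without defining anything, so log output can be filtered per service class.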
3a. Settings Mixin Pattern (DRY Configuration)¶
Reusable Pydantic settings mixins for shared configuration fields across services:
from pydantic import Field
from tradai.common.settings_mixins import ArcticSettingsMixin, MLflowSettingsMixin

# How the mixin is defined in settings_mixins.py (shown for reference):
# shared fields with sensible defaults
class ArcticSettingsMixin:
    arctic_s3_bucket: str = Field(default="", description="S3 bucket for ArcticDB")
    arctic_library: str = Field(default="futures", description="ArcticDB library name")
    arctic_s3_endpoint: str = Field(default="s3.us-east-1.amazonaws.com")
    arctic_region: str = Field(default="us-east-1")

# Services inherit and optionally override (Settings is the shared settings base class)
class DataCollectionSettings(ArcticSettingsMixin, Settings):
    # Override: required for data-collection
    arctic_s3_bucket: str = Field(..., description="Required for data-collection")

class StrategyServiceSettings(ArcticSettingsMixin, MLflowSettingsMixin, Settings):
    # Inherits all Arctic + MLflow fields with defaults
    pass
Available mixins:
- ArcticSettingsMixin: arctic_s3_bucket, arctic_library, arctic_s3_endpoint, arctic_region
- MLflowSettingsMixin: mlflow_tracking_uri, mlflow_username, mlflow_password, mlflow_verify_ssl
4. Thread-Safe Singleton with lru_cache¶
from functools import lru_cache

import boto3

@lru_cache(maxsize=1)
def get_secrets_client():
    # Created on the first call; later calls return the cached client
    return boto3.client("secretsmanager")
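The caching behaviour can be checked without AWS. In the sketch below, `FakeClient` is a hypothetical stand-in for the boto3 client, used only to count how often the factory body runs.

```python
from functools import lru_cache


class FakeClient:
    """Stand-in for a boto3 client so the sketch runs without AWS."""

    instances = 0

    def __init__(self) -> None:
        FakeClient.instances += 1


@lru_cache(maxsize=1)
def get_client() -> FakeClient:
    return FakeClient()


first = get_client()
second = get_client()  # served from the cache, same object as `first`

get_client.cache_clear()  # useful in tests that need a fresh client
third = get_client()  # the factory body runs again after the reset
```

One caveat from the Python documentation: the cache itself is thread-safe, but the wrapped function can still run more than once if a second thread calls it before the first call has finished and been cached. If exactly-once construction matters, an explicit lock is the safer choice.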
5. Factory Pattern (Already exists in source.py)¶
6. Handler Registry Pattern (entrypoint/base.py)¶
Enables dependency inversion for mode handlers, avoiding circular imports:
from tradai.common.entrypoint.base import StrategyEntrypoint
from tradai.common.entrypoint.settings import TradingMode
from tradai.common.protocols import ModeHandler  # protocol that handlers satisfy

# Services register handlers using decorator
@StrategyEntrypoint.register_handler(TradingMode.BACKTEST)
class BacktestHandler:
    def __init__(self, entrypoint: StrategyEntrypoint) -> None:
        self._ep = entrypoint

    def run(self) -> int:
        # Execute backtest logic
        return 0

# Handler is invoked via registry lookup (no direct import);
# `entrypoint` is the running StrategyEntrypoint instance
handler_factory = StrategyEntrypoint.get_handler(TradingMode.BACKTEST)
handler = handler_factory(entrypoint)
exit_code = handler.run()
Benefits:
- Breaks circular dependencies (common → strategy-service)
- Services own their handlers
- Base class remains generic
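For readers curious how such a registry can work internally, here is a minimal self-contained sketch. The classes reuse the names from the snippet above but are stand-ins written for illustration; the real entrypoint/base.py may store and validate handlers differently.

```python
from enum import Enum
from typing import Callable, Dict


class TradingMode(Enum):
    BACKTEST = "backtest"
    LIVE = "live"


class StrategyEntrypoint:
    # Class-level registry shared by all instances: mode -> handler class
    _handlers: Dict[TradingMode, type] = {}

    @classmethod
    def register_handler(cls, mode: TradingMode) -> Callable[[type], type]:
        def decorator(handler_cls: type) -> type:
            cls._handlers[mode] = handler_cls
            return handler_cls  # the decorated class is returned unchanged

        return decorator

    @classmethod
    def get_handler(cls, mode: TradingMode) -> type:
        if mode not in cls._handlers:
            raise LookupError(f"No handler registered for {mode}")
        return cls._handlers[mode]


@StrategyEntrypoint.register_handler(TradingMode.BACKTEST)
class BacktestHandler:
    def __init__(self, entrypoint: StrategyEntrypoint) -> None:
        self._ep = entrypoint

    def run(self) -> int:
        return 0
```

The base class never imports a handler module; registration happens as a side effect of the service importing its own handlers, which is what removes the circular dependency.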
Module Organization¶
tradai/common/
├── __init__.py # Public API exports
├── exceptions.py # Custom exception hierarchy
├── protocols.py # Shared protocols (ModeHandler, etc.)
├── settings.py # Settings base class (Pydantic)
├── settings_mixins.py # Reusable settings mixins (ArcticSettingsMixin, MLflowSettingsMixin)
├── logger_mixin.py # Logging mixin (composition pattern)
├── tracing.py # OpenTelemetry tracing utilities
├── validators.py # Shared field validators (S3 bucket names, etc.)
├── types.py # Type aliases
├── warmup_loader.py # Model warmup loading
├── aws/ # AWS integrations
│ ├── __init__.py # Re-exports (get_dynamodb_resource, etc.)
│ ├── _client_factory.py # Cached boto3 client/resource factory
│ ├── cloudwatch_logs.py # CloudWatch log retrieval
│ ├── cloudwatch_metrics.py # CloudWatch metrics publishing
│ ├── dynamodb/ # DynamoDB adapters (sync & async)
│ ├── ecs_deploy.py # ECS deployment helper
│ ├── s3_utils.py # S3 path parsing, upload/download
│ ├── async_s3_utils.py # Async S3 utilities
│ ├── state_repository.py # DynamoDB state repository
│ ├── step_functions.py # Step Functions client
│ └── constants.py # AWS constants
├── clients/ # External service clients
│ ├── data_collection_client.py # Data collection REST client
│ └── mlflow_client.py # MLflow REST client with circuit breaker
├── repositories/ # Generic repository patterns
│ ├── base.py # DynamoDBRepository base class
│ ├── codecs.py # DynamoDB codec system
│ ├── crud.py # Generic CRUD repository
│ ├── deployment.py # Deployment repository
│ ├── in_memory.py # InMemoryStatefulRepository (A/B test, shadow test)
│ ├── pagination.py # Paginated query support
│ ├── protocols.py # Repository protocols
│ ├── query.py # Query builder
│ └── trading_state.py # Trading state repository
├── entities/ # Domain entities (Pydantic, frozen)
│ ├── __init__.py # Barrel exports
│ ├── aws.py # S3Path, AWSConfig, JobStatus
│ ├── backtest.py # BacktestConfig, BacktestResult, BacktestJobStatus
│ ├── config_version.py # ConfigVersion, ConfigVersionStatus
│ ├── deployment.py # DeploymentRecord, DeploymentStatus, DeploymentDiff
│ ├── exchange.py # ExchangeConfig, TradingMode, OperatingMode
│ ├── hyperopt.py # OptunaHyperoptConfig, TrialResult, HyperparamSearchSpace
│ ├── identifiers.py # Arn, ExperimentName, TableName
│ ├── mlflow.py # ModelVersion, RegisteredModel, ExperimentRun
│ ├── pagination.py # PaginatedResponse
│ ├── retraining.py # RetrainingState, RetrainingDecision
│ ├── strategy.py # Strategy
│ └── trading_state.py # TradingState, StrategyPnL, LiveTrade
├── entrypoint/ # Strategy execution entrypoints
│ ├── base.py # StrategyEntrypoint with handler registry
│ ├── settings.py # TradingMode, EntrypointSettings
│ └── training/ # Training entrypoint (extracted from monolithic handler)
│ ├── __init__.py # Public API
│ ├── handler.py # TrainHandler (orchestrator only)
│ ├── manifest_builder.py # Config manifest construction
│ ├── metadata.py # Training metadata extraction
│ ├── mlflow_reporter.py # MLflow experiment logging
│ ├── model_registrar.py # Model registration
│ └── result_parser.py # Backtest result parsing
├── resilience/ # Resilience patterns
│ ├── circuit_breaker.py # CircuitBreaker with state machine
│ ├── execution.py # Resilient execution helpers
│ ├── policy.py # ResiliencePolicy (retry + CB)
│ └── retry.py # Retry with exponential backoff
├── auth/ # Authentication utilities
│ ├── __init__.py # JWT validator, credential reloader
│ └── fastapi_deps.py # Reusable FastAPI auth dependencies
├── ab_testing/ # A/B testing framework
│ ├── entities.py # ABTestConfig, ABTestState, ABTestResult
│ ├── manager.py # ABTestManager lifecycle
│ ├── repository.py # DynamoDB + in-memory repositories
│ ├── statistical_tester.py # Hypothesis testing (lazy-loaded)
│ └── traffic_router.py # Request routing to variants
├── alerting/ # SNS-based alert management
│ ├── entities.py # Alert, AlertContext, AlertSeverity
│ ├── service.py # AlertService (SNS publisher)
│ └── webhook.py # SlackNotifier, DiscordNotifier
├── config/ # Strategy configuration management
│ ├── catalog.py # StrategyConfig schema
│ ├── loader.py # StrategyConfigLoader (S3/MLflow)
│ ├── merge.py # ConfigMergeService (OmegaConf)
│ ├── repository.py # ConfigVersionRepository (DynamoDB)
│ ├── service.py # ConfigVersionService
│ └── validators.py # ConfigValidator (mode-specific)
├── drift/ # Model drift detection
│ ├── detector.py # DriftDetector (PSI-based, lazy-loaded)
│ └── entities.py # DriftResult, DriftSeverity, DriftThresholds
├── fastapi/ # FastAPI utilities
│ ├── app_factory.py # create_app(), AppConfig, create_lifespan()
│ ├── dependencies.py # validated_resource_name(), get_correlation_id()
│ ├── di_helpers.py # Dependency injection helpers
│ ├── error_schemas.py # ErrorResponse, ValidationErrorResponse
│ └── middleware.py # RequestTracingMiddleware
├── features/ # Feature engineering
│ ├── lineage.py # DataLineage tracking
│ └── schema.py # FeatureSchema, compute_dataframe_hash()
├── freqtrade/ # Freqtrade integration
│ ├── cli.py # FreqtradeCLIBuilder
│ ├── registry.py # FreqAIModelRegistry
│ └── runner.py # FreqtradeRunner, FreqtradeBacktester
├── health/ # Health check service
│ ├── aws_checkers.py # DynamoDBHealthChecker, S3HealthChecker, etc.
│ ├── base.py # BaseHealthChecker
│ ├── checkers.py # RedisHealthChecker, HTTPHealthChecker, etc.
│ ├── protocols.py # HealthChecker protocol
│ ├── reporter.py # HealthReporter (background heartbeat)
│ └── service.py # HealthService (aggregator)
├── http_client/ # HTTP client with retries
│ └── client.py # HttpClient, HttpClientConfig, HttpRetryConfig
├── lambda_/ # Lambda handler framework
│ ├── backtest_processor.py # SQSJobProcessor, BatchResult
│ ├── context.py # LambdaContext[T] DI container
│ ├── decorators.py # @lambda_handler, @lambda_handler_with_di
│ ├── entities.py # DriftState, HealthState, HeartbeatState
│ ├── response.py # LambdaResponse builder
│ └── settings.py # LambdaSettings + per-handler settings
├── mlflow/ # MLflow integration
│ ├── adapter.py # MLflowAdapter (REST + SDK)
│ ├── artifacts.py # Artifact management
│ ├── backtest_logger.py # BacktestMLflowLogger
│ ├── client.py # MLflow REST client
│ ├── experiments.py # Experiment management
│ ├── registry.py # Model registry operations
│ ├── tags.py # 40+ tag constants (TAG_STRATEGY_NAME, etc.)
│ └── utils.py # URI conversion, parsing
├── model_comparison/ # Model comparison (statistical)
│ ├── comparator.py # ModelComparator (champion/challenger)
│ └── entities.py # ComparisonResult, PromotionDecision
├── validation/ # Deployment validation
│ └── entities.py # DryRunValidationReport, GoLiveValidationReport
└── utils/ # Utility functions
  ├── __init__.py # Re-exports (git utils, etc.)
  ├── datetime.py # Date/time conversion helpers
  ├── decimal.py # Decimal precision utilities
  ├── git.py # Git commit/branch/dirty checks
  ├── profit.py # Profit calculation helpers
  └── seed.py # Reproducible random seeding
Module Details¶
config/ — Strategy Configuration Management¶
Manages loading, merging, versioning, and validating strategy configurations across environments.
Key Classes:
- StrategyConfigLoader - Loads strategy configs from S3 or MLflow, resolving by strategy name + version
- ConfigMergeService - Merges multiple config layers using OmegaConf (base → environment → override)
- ConfigVersionService - Content-addressable config versioning with hash-based deduplication
- ConfigVersionRepository - DynamoDB persistence for config versions
- ConfigValidator - Mode-specific validation (LIVE requires all fields; DRY_RUN allows partial configs)
- StrategyConfig - Canonical Pydantic schema for strategy configuration
Usage:
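A minimal sketch of the two core ideas, layered merging and content-addressable versioning, is given here under stated assumptions. It does not use the real ConfigMergeService or ConfigVersionService APIs: `merge_layers` is a plain recursive dict merge standing in for OmegaConf, and `config_hash` is a hypothetical helper that hashes a canonical JSON form.

```python
import hashlib
import json
from typing import Any, Dict


def merge_layers(*layers: Dict[str, Any]) -> Dict[str, Any]:
    """Merge layers left to right; later layers win, nested dicts merge by key."""
    result: Dict[str, Any] = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(result.get(key), dict):
                result[key] = merge_layers(result[key], value)
            else:
                result[key] = value
    return result


def config_hash(config: Dict[str, Any]) -> str:
    """SHA-256 of a canonical JSON form, so key order does not change the hash."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


base = {"timeframe": "1h", "parameters": {"rsi": 14, "ema": 50}}
environment = {"parameters": {"ema": 100}}
override = {"timeframe": "5m"}

merged = merge_layers(base, environment, override)
version_id = config_hash(merged)
```

Two configs with identical content produce the same `version_id`, which is what makes hash-based deduplication possible: a version is stored only if its hash has not been seen before.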
drift/ — Model Drift Detection¶
Detects distribution shifts between training and production data using Population Stability Index (PSI).
Key Classes:
- DriftDetector - PSI-based drift detection across feature distributions (lazy-loaded, requires numpy/scipy)
- DriftResult - Overall drift detection result with severity and per-feature breakdown
- DriftSeverity - Enum: NONE, MINOR, MODERATE, SEVERE
- DriftThresholds - Configurable PSI thresholds for severity classification
- FeatureDrift - Per-feature drift information with PSI value
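For reference, PSI over pre-binned proportions is the sum over bins of (actual - expected) × ln(actual / expected). The sketch below computes it in plain Python; it is not the DriftDetector implementation, and the `eps` floor for empty bins is an assumption.

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float], eps: float = 1e-6) -> float:
    """Population Stability Index over matching bins of proportions."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same number of bins")
    total = 0.0
    for e, a in zip(expected, actual):
        # Floor each proportion so empty bins do not cause log(0) or division by zero.
        e = max(e, eps)
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


training_bins = [0.25, 0.25, 0.25, 0.25]
production_bins = [0.10, 0.20, 0.30, 0.40]
score = psi(training_bins, production_bins)
```

Identical distributions score 0, and the score grows as the distributions diverge. A commonly quoted rule of thumb treats values under 0.1 as stable and values above 0.25 as a significant shift, but the cut-offs used here are whatever DriftThresholds is configured with.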
features/ — Feature Engineering¶
Provides feature schema validation and data lineage tracking for ML pipelines.
Key Classes:
- FeatureSchema - Defines and validates expected feature columns, types, and ranges
- DataLineage - Tracks data transformation provenance (source → transformations → output)
- compute_dataframe_hash() - Deterministic hash of DataFrame contents for reproducibility checks
freqtrade/ — Freqtrade Integration¶
Wraps Freqtrade CLI and subprocess management for backtesting and model registry operations.
Key Classes:
- FreqtradeCLIBuilder - Fluent builder for Freqtrade CLI commands (backtesting, hyperopt, etc.)
- FreqtradeRunner - Subprocess management with stdout/stderr capture
- FreqtradeBacktester - Specialized runner for backtest execution
- FreqAIModelRegistry - Discovery and management of FreqAI model directories
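The fluent-builder idea can be sketched as follows. `BacktestCommandBuilder` and its method names are hypothetical and do not reflect the FreqtradeCLIBuilder API; only the emitted flags (`--strategy`, `--timerange`, `--config`) are actual Freqtrade options.

```python
from typing import List


class BacktestCommandBuilder:
    """Builds an argv list for `freqtrade backtesting` step by step."""

    def __init__(self) -> None:
        self._args: List[str] = ["freqtrade", "backtesting"]

    def strategy(self, name: str) -> "BacktestCommandBuilder":
        self._args += ["--strategy", name]
        return self  # returning self is what makes the calls chainable

    def timerange(self, value: str) -> "BacktestCommandBuilder":
        self._args += ["--timerange", value]
        return self

    def config(self, path: str) -> "BacktestCommandBuilder":
        self._args += ["--config", path]
        return self

    def build(self) -> List[str]:
        # Return a copy so later builder calls cannot alter a built command.
        return list(self._args)


command = (
    BacktestCommandBuilder()
    .strategy("MyStrategy")
    .timerange("20240101-20240201")
    .config("config.json")
    .build()
)
```

Producing a list rather than a single string means the command can be passed to `subprocess.run` without a shell, which avoids quoting problems in strategy names or paths.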
lambda_/ — Lambda Handler Framework¶
Dependency injection framework for AWS Lambda handlers with consistent setup, error handling, and response formatting.
Key Classes:
- LambdaSettings - Pydantic BaseSettings for automatic env var loading per handler
- LambdaContext[T] - Generic DI container providing typed settings, publishers, and repositories
- @lambda_handler - Decorator for consistent Lambda setup, error handling, and response formatting
- @lambda_handler_with_di - DI variant that injects LambdaContext into handler
- SQSJobProcessor - Batch SQS message processing with partial failure support
- LambdaResponse - Builder for consistent API Gateway-compatible response formatting
Per-handler settings: HealthCheckSettings, DriftMonitorSettings, PulumiDriftSettings, RetrainingSchedulerSettings, HeartbeatSettings, ModelManagementSettings, SQSTriggerSettings, DynamoDBSettings
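To make the decorator idea concrete, the sketch below wraps a handler so that results and errors are always returned in an API Gateway-style shape. It is a simplified stand-in for `@lambda_handler`, not its actual implementation; the status-code mapping and the `health_check` example handler are assumptions.

```python
import functools
import json
from typing import Any, Callable, Dict

Event = Dict[str, Any]
Response = Dict[str, Any]


def lambda_handler(func: Callable[[Event, Any], Any]) -> Callable[[Event, Any], Response]:
    """Wrap a handler so every outcome becomes a statusCode/body response."""

    @functools.wraps(func)
    def wrapper(event: Event, context: Any) -> Response:
        try:
            result = func(event, context)
            return {"statusCode": 200, "body": json.dumps(result)}
        except ValueError as exc:
            # Treat ValueError as a client error in this sketch.
            return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
        except Exception:
            # Do not leak internal details to the caller.
            return {"statusCode": 500, "body": json.dumps({"error": "internal error"})}

    return wrapper


@lambda_handler
def health_check(event: Event, context: Any) -> Dict[str, str]:
    if "service" not in event:
        raise ValueError("missing 'service'")
    return {"service": event["service"], "status": "ok"}
```

Handlers then contain only business logic, and every Lambda in the platform returns the same response shape.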
health/ — Health Check Service¶
Aggregates health status from multiple dependencies with pre-built checkers and background heartbeat reporting.
Key Classes:
- HealthService - Aggregates results from multiple HealthChecker instances
- HealthResult - Overall health status with individual dependency check results
- BaseHealthChecker - Abstract base for implementing custom health checks
- HealthReporter - Background task that periodically reports health to DynamoDB

Pre-built Checkers:
- AWS: DynamoDBHealthChecker, S3HealthChecker, SQSHealthChecker, SNSHealthChecker
- Infrastructure: RedisHealthChecker, DatabaseHealthChecker, HTTPHealthChecker, MLflowHealthChecker
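The aggregation pattern can be sketched with a small protocol and two toy checkers. The class names mirror those listed above, but the fields, the `check_all` method, and the rule that a raising checker counts as unhealthy are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


class HealthChecker(Protocol):
    name: str

    def check(self) -> bool:
        ...


@dataclass(frozen=True)
class HealthResult:
    healthy: bool
    checks: Dict[str, bool]


class HealthService:
    def __init__(self, checkers: List[HealthChecker]) -> None:
        self._checkers = checkers

    def check_all(self) -> HealthResult:
        checks: Dict[str, bool] = {}
        for checker in self._checkers:
            try:
                checks[checker.name] = bool(checker.check())
            except Exception:
                # A checker that raises is reported as unhealthy, not propagated.
                checks[checker.name] = False
        return HealthResult(healthy=all(checks.values()), checks=checks)


class StaticChecker:
    """Toy checker with a fixed answer."""

    def __init__(self, name: str, ok: bool) -> None:
        self.name = name
        self._ok = ok

    def check(self) -> bool:
        return self._ok


class FailingChecker:
    """Toy checker that simulates an unreachable dependency."""

    name = "redis"

    def check(self) -> bool:
        raise ConnectionError("connection refused")
```

Because `HealthChecker` is a protocol, any object with a `name` and a `check()` method can be plugged in without inheriting from a base class.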
http_client/ — HTTP Client¶
HTTP client with configurable retry and circuit breaker integration.
Key Classes:
- HttpClient - Requests-based client with automatic retries, timeout, and header management
- HttpClientConfig - Configuration for base URL, timeout, headers
- HttpRetryConfig - Retry configuration (max attempts, backoff, retryable status codes)
- HttpResponse - Response wrapper with status, body, and headers
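The retry behaviour that HttpRetryConfig describes (max attempts, backoff) can be illustrated with a small helper. This is a sketch of exponential backoff in general, not the HttpClient internals; the `retry` function, its parameters, and the choice of retryable exceptions are assumptions.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on selected errors with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter lets tests record the delays instead of actually waiting.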
alerting/ — SNS-Based Alert Management¶
Multi-channel alerting infrastructure supporting SNS, Slack, and Discord notifications.
Key Classes:
- AlertService - Publishes alerts to SNS topics with severity-based routing
- Alert - Alert model with message, severity, context, and timestamp
- AlertSeverity - Enum: INFO, WARNING, ERROR, CRITICAL
- SlackNotifier - Sends formatted alerts to Slack via webhook
- DiscordNotifier - Sends formatted alerts to Discord via webhook
fastapi/ — FastAPI Utilities¶
App factory, middleware, DI helpers, and standardized error responses for FastAPI services.
Key Classes:
- create_app() - Factory function that builds FastAPI app with middleware, exception handlers, and health endpoints
- AppConfig - Configuration for app title, version, CORS, middleware
- RequestTracingMiddleware - Injects correlation IDs into request context
- ErrorResponse - Standardized error response schema
- get_correlation_id() - FastAPI dependency for extracting/generating correlation IDs
- validated_resource_name() - DI dependency for validating path parameters
mlflow/ — MLflow Integration¶
Comprehensive MLflow adapter for experiment tracking, model registry, and artifact management.
Key Classes:
- MLflowAdapter - Unified interface for MLflow REST API and Python SDK operations
- BacktestMLflowLogger - Logs backtest results, metrics, and artifacts to MLflow experiments
- MLflowURIConverter - Converts between MLflow artifact URIs and S3 paths
- 40+ tag constants - Standardized tag keys (TAG_STRATEGY_NAME, TAG_GIT_COMMIT, TAG_SHARPE_RATIO, etc.) ensuring consistent metadata across all experiments
model_comparison/ — Model Comparison¶
Statistical comparison of champion vs challenger models with automated promotion decisions.
Key Classes:
- ModelComparator - Compares two model versions across multiple metrics with statistical significance
- ComparisonResult - Overall comparison result with per-metric breakdown and promotion recommendation
- PromotionDecision - Enum: PROMOTE, KEEP_CHAMPION, INCONCLUSIVE
- ModelCandidate - Model version metadata and performance metrics
- MetricComparison - Per-metric comparison with direction (higher/lower is better)
validation/ — Deployment Validation¶
Validation reports for dry-run and go-live deployment gates.
Key Classes (Dry-Run):
- DryRunValidationReport - Aggregates candle metrics, order metrics, position metrics, and heartbeat uptime
- DryRunCheckResult - Individual check with severity (INFO, WARNING, CRITICAL)
- HeartbeatMetrics, CandleMetrics, OrderMetrics, PositionMetrics - Domain-specific metric containers

Key Classes (Go-Live):
- GoLiveValidationReport - Infrastructure readiness checks before live trading promotion
- GoLiveCheckResult - Individual infrastructure check result
- AlarmCheckResult, LambdaCheckResult, SNSCheckResult - AWS resource-specific checks
ab_testing/ — A/B Testing Framework¶
Manages canary-style A/B tests with traffic routing, statistical testing, and lifecycle management.
Key Classes:
- ABTestManager - Orchestrates test lifecycle: create → route traffic → collect metrics → decide
- ABTestConfig - Test configuration (variants, traffic split, duration, success criteria)
- ABTestState - Current test state with metrics and stage
- CanaryStage - Enum for progressive rollout stages: 5%, 25%, 50%, 100%
- StatisticalTester - Hypothesis testing for A/B results (lazy-loaded, requires scipy)
- TrafficRouter - Routes incoming requests to champion or challenger variant
- ABTestRepository / DynamoDBABTestRepository - Persistence for test state
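One common way to route traffic for a canary rollout is deterministic hashing, sketched below. It is offered as an illustration of the technique only: TrafficRouter may route differently, and the `route` function and `CANARY_STAGES` tuple are hypothetical names (the percentages come from the CanaryStage description above).

```python
import hashlib

CANARY_STAGES = (5, 25, 50, 100)  # percent of traffic sent to the challenger


def route(request_id: str, challenger_percent: int) -> str:
    """Deterministically assign a request to 'champion' or 'challenger'."""
    if not 0 <= challenger_percent <= 100:
        raise ValueError("challenger_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the hash to a stable bucket in the range 0..99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "challenger" if bucket < challenger_percent else "champion"


assignments = {stage: route("request-123", stage) for stage in CANARY_STAGES}
```

Hashing gives two useful properties: the same request ID always lands on the same variant at a given stage, and a request routed to the challenger at 5% stays on the challenger as the rollout widens to 25%, 50%, and 100%.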
Entity Types Catalog¶
The entities/ directory contains 12 entity files with ~50 Pydantic models:
| Entity File | Key Classes | Purpose |
|---|---|---|
| backtest.py | BacktestConfig, BacktestResult, BacktestJobStatus, BacktestJobMessage | Backtest lifecycle |
| deployment.py | DeploymentRecord, DeploymentStatus, DeploymentDiff | Deployment tracking |
| exchange.py | ExchangeConfig, TradingMode, OperatingMode | Exchange config |
| hyperopt.py | OptunaHyperoptConfig, TrialResult, HyperparamSearchSpace | Hyperparameter optimization |
| mlflow.py | ModelVersion, RegisteredModel, ExperimentRun | MLflow integration |
| retraining.py | RetrainingState, RetrainingDecision | Retraining workflows |
| trading_state.py | TradingState, StrategyPnL, LiveTrade | Live trading state |
| identifiers.py | Arn, ExperimentName, TableName | Type-safe AWS identifiers |
| aws.py | S3Path, AWSConfig, JobStatus | AWS infrastructure |
| config_version.py | ConfigVersion, ConfigVersionStatus | Config versioning |
| pagination.py | PaginatedResponse | Generic pagination |
| strategy.py | Strategy | Strategy entity |
Trading Mode Safety¶
LIVE Mode Fail-Fast (entrypoint/trading.py)¶
LIVE trading requires valid configuration - empty or invalid configs cause immediate failure:
def _load_strategy_config(self) -> dict[str, Any]:
    is_live = self._settings.trading_mode == TradingMode.LIVE
    try:
        config_dict = loader.load_config(...)
        self._validate_config_for_mode(config_dict)
        return config_dict
    except Exception as e:
        if is_live:
            # LIVE mode: fail immediately - never trade with empty config
            raise TradingError(f"Failed to load strategy config for LIVE trading: {e}") from e
        # Non-LIVE modes (e.g. DRY_RUN): warn and continue (for development/testing)
        self.logger.warning(f"Config load failed, using empty config: {e}")
        return {}
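The same fail-fast rule can be exercised in isolation. The sketch below is a standalone reduction of the method above, with stand-in definitions for `TradingMode` and `TradingError` and an injected `load` callable replacing the real loader; none of these reflect the actual module layout.

```python
import logging
from enum import Enum
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


class TradingMode(Enum):
    LIVE = "live"
    DRY_RUN = "dry_run"


class TradingError(RuntimeError):
    """Stand-in for the platform's trading exception."""


def load_strategy_config(
    mode: TradingMode, load: Callable[[], Dict[str, Any]]
) -> Dict[str, Any]:
    try:
        config = load()
        if not config:
            raise ValueError("config is empty")
        return config
    except Exception as exc:
        if mode is TradingMode.LIVE:
            # LIVE: refuse to continue, and keep the original error as the cause.
            raise TradingError(
                f"Failed to load strategy config for LIVE trading: {exc}"
            ) from exc
        # Other modes: log and fall back to an empty config.
        logger.warning("Config load failed, using empty config: %s", exc)
        return {}
```

Chaining with `from exc` preserves the underlying failure (for example an S3 error) in the traceback, which makes LIVE start-up failures easier to diagnose.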
Unified StrategyConfig (strategy_config_loader.py)¶
Single canonical schema for strategy configuration:
from typing import Any

from pydantic import BaseModel, ConfigDict, Field

class StrategyConfig(BaseModel):
    # Immutable after construction; unknown keys are kept rather than rejected
    model_config = ConfigDict(frozen=True, extra="allow")

    name: str = Field(..., min_length=1)
    version: str = Field(..., pattern=r"^\d+\.\d+\.\d+$")
    timeframe: str = Field(..., pattern=r"^\d+[mhd]$")
    pairs: list[str] = Field(default_factory=list)
    stake_currency: str = Field(default="USDT")
    exchange: str = Field(default="binance")
    parameters: dict[str, Any] = Field(default_factory=dict)
    buy_params: dict[str, Any] = Field(default_factory=dict)
    sell_params: dict[str, Any] = Field(default_factory=dict)
    freqai: dict[str, Any] | None = Field(default=None)
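The two `pattern` constraints are easy to misread, so the sketch below checks them with the standard `re` module. The helper function names are hypothetical; the patterns are copied from the schema above.

```python
import re

VERSION_PATTERN = re.compile(r"^\d+\.\d+\.\d+$")
TIMEFRAME_PATTERN = re.compile(r"^\d+[mhd]$")


def is_valid_version(value: str) -> bool:
    """True for three dot-separated integers such as 1.2.3."""
    return VERSION_PATTERN.match(value) is not None


def is_valid_timeframe(value: str) -> bool:
    """True for a number followed by m, h, or d, such as 5m or 1h."""
    return TIMEFRAME_PATTERN.match(value) is not None
```

Note what the patterns exclude: a version with a `v` prefix or only two components is rejected, and so is a weekly timeframe such as `1w`, since only minutes, hours, and days are allowed.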
Services should import StrategyConfig from tradai.common.
Security Improvements¶
- No Hardcoded Credentials
  - All secrets from AWS Secrets Manager
  - Secret names from environment variables
- No Hardcoded Infrastructure IDs
  - All AWS ARNs, subnets, security groups from env vars
  - Configuration factory pattern
- Thread Safety
  - Thread locks for shared state
  - Immutable entities with Pydantic frozen=True
- Input Validation
  - S3 path validation
  - File path sanitization
  - Pydantic validation throughout
Code Quality Standards¶
- Type Hints: 100% coverage (strict mypy)
- Docstrings: Google style for all public APIs
- Test Coverage: 85%+ target
- Absolute Imports: Enforced via Ruff
- No Dead Code: Remove all test blocks from production files
Dependencies¶
Core:
- pydantic>=2.0.0 (validation, immutability)
- pydantic-settings>=2.0.0 (environment-based config)
- boto3>=1.35.0 (AWS SDK)
Utilities:
- redis>=5.0.0 (caching)
- python-json-logger>=2.0.7 (structured logging)
- PyYAML>=6.0.1 (config files)
- requests>=2.32.0 (HTTP client)
- omegaconf>=2.3.0 (Hydra compatibility)
- mlflow-skinny>=2.18.0 (experiment tracking)
- docker>=7.0.0 (ECS job manager)
Testing Strategy¶
- Unit Tests (fast, isolated):
  - Mock all external dependencies
  - Test business logic only
  - Use pytest-mock
- Integration Tests (slower, real services):
  - LocalStack for AWS services
  - Test actual AWS interactions
- Test Structure:
tests/
├── unit/
│ ├── test_exceptions.py
│ ├── test_base_settings.py
│ ├── aws/
│ │ ├── test_secrets_manager.py
│ │ └── test_s3_utils.py
│ └── utils/
│ └── test_cache.py
└── integration/
  └── aws/
    └── test_ecr_util.py
Migration from Original¶
Removed (Security/Quality)¶
- ❌ secrets_maneger.py test code (lines 56-58)
- ❌ mlflow_client.py test code (lines 530-550)
- ❌ ecr_util.py test code (lines 519-560)
- ❌ ecr_util.py hardcoded ARNs (lines 16-26)
- ❌ Commented dead code in mlflow_client.py
Renamed (Correctness)¶
- ✅ secrets_maneger.py → secrets_manager.py
Skipped (Complexity)¶
- ⏭ scope.py (700+ lines, will redesign)
- ⏭ log.py (50+ methods, use logger_mixin instead)
- ⏭ strategy_config_mixin.py (move to services layer)
Added (Architecture)¶
- ✅ exceptions.py (custom exception hierarchy)
- ✅ repositories.py (repository ABCs)
- ✅ aws/s3_utils.py (DRY S3 parsing)
- ✅ types.py (type aliases)
Health & Risk (C3/H1)¶
Risk Controls (entities/risk_limits.py)¶
- RiskLimits - platform-level risk limits (drawdown, open trades, leverage, action on breach)
- validate_deployment_bounds() classmethod - shared pre-flight validation (CLI + backend)
- RiskBreach / RiskCheckResult - structured breach reporting
Risk Monitor (health/risk_monitor.py)¶
- Pure evaluator: receives metrics, returns RiskCheckResult (no I/O)
- Fail-closed: tracks consecutive metric failures, triggers breach after threshold
- Only accessed from single _heartbeat_loop thread - no lock needed
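The fail-closed behaviour can be sketched as a small pure evaluator. This is an illustration under stated assumptions, not the risk_monitor.py implementation: the `RiskMonitor` constructor arguments, the `evaluate` method, and the fields of `RiskCheckResult` are hypothetical, and only a drawdown limit is modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RiskCheckResult:
    breached: bool
    reason: str = ""


class RiskMonitor:
    """Evaluates metrics it is handed; performs no I/O itself."""

    def __init__(self, max_drawdown_pct: float, max_metric_failures: int = 3) -> None:
        self._max_drawdown_pct = max_drawdown_pct
        self._max_metric_failures = max_metric_failures
        self._consecutive_failures = 0

    def evaluate(self, drawdown_pct: Optional[float]) -> RiskCheckResult:
        if drawdown_pct is None:
            # Metrics unavailable: count the failure and breach once the
            # threshold is reached (fail-closed rather than fail-open).
            self._consecutive_failures += 1
            if self._consecutive_failures >= self._max_metric_failures:
                return RiskCheckResult(True, "metrics unavailable")
            return RiskCheckResult(False)
        # A successful reading resets the failure streak.
        self._consecutive_failures = 0
        if drawdown_pct > self._max_drawdown_pct:
            return RiskCheckResult(True, "drawdown limit exceeded")
        return RiskCheckResult(False)
```

The counter is plain instance state with no lock, which is safe only under the single-thread access pattern described above.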
Metrics Collector (health/metrics_collector.py)¶
- Transforms Freqtrade REST API responses → StrategyPnL + [LiveTrade]
- Calls /profit and /status exactly once per collect_all() invocation
- Handles ratio→percentage conversion (Freqtrade returns 0.0-1.0)
Integration (health/reporter.py)¶
- HealthReporter._heartbeat_loop() orchestrates: collect → heartbeat → risk check → pause/resume
- Metric snapshots persisted to DynamoDB via TradingStateRepository.update_metrics()
- Risk breach actions: pause Freqtrade, update DynamoDB, send CRITICAL alert
- _risk_breach_alerted flag prevents alert storms; recovery sends INFO alert
Success Criteria¶
- [ ] All 4 security vulnerabilities fixed
- [ ] 85%+ test coverage
- [ ] 100% type hint coverage
- [ ] No hardcoded credentials or infrastructure IDs
- [ ] All public APIs documented
- [ ] Thread-safe implementations
- [ ] Absolute imports only