
Configuration & Model Versioning

How strategy configurations and ML models are versioned, promoted, and managed across environments.

graph LR
    subgraph Config["Strategy Config"]
        C1["tradai.yaml"] --> C2["S3 Upload"]
        C2 --> C3["DRAFT"]
        C3 --> C4["ACTIVE"]
        C4 --> C5["DEPRECATED"]
    end

    subgraph Model["ML Model"]
        M1["Training"] --> M2["MLflow Registry"]
        M2 --> M3["None"]
        M3 --> M4["Staging"]
        M4 --> M5["Production"]
        M5 --> M6["Archived"]
    end

    C4 -.->|deployed together| M5

Strategy Configuration

tradai.yaml

Every strategy has a tradai.yaml at its root that defines how TradAI services interact with it:

strategy:
  name: "MyStrategy"
  version: "1.0.0"
  entry_point: "mystrategy.strategy:MyStrategy"
  category: "momentum"
  timeframe: "1h"

strategy_service:
  source:
    KIND: Binance
  adapter:
    KIND: AWS
    bucket_name: tradai-data
    library: ohlcv
  defaults:
    timerange: "20240101-20241201"
    symbols:
      - "BTC/USDT:USDT"
      - "ETH/USDT:USDT"
    stake_amount: 1000
    max_open_trades: 3

mlflow:
  tracking_uri: ${MLFLOW_TRACKING_URI:-http://localhost:5001}
  experiment_name: "strategies/mystrategy"
  auto_log_params: true

optimization:
  defaults:
    epochs: 100
    loss_function: sharpe
    spaces: [buy, sell]
  presets:
    quick:
      epochs: 50
      spaces: [buy]
    standard:
      epochs: 200
      spaces: [buy, sell]
    production:
      epochs: 1000
      spaces: [buy, sell, roi, stoploss, trailing]
      walk_forward: true

deployment:
  ecr:
    repository: "tradai/strategies/mystrategy"
  ecs:
    cpu: 512
    memory: 1024
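
The tracking_uri value above uses shell-style ${VAR:-default} syntax. How TradAI expands it is not shown on this page; the following is a minimal sketch, assuming expansion happens when the YAML is loaded. The helper name expand_env is hypothetical and not part of TradAI.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} and ${NAME:-default}. Hypothetical helper, not TradAI's loader.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${VAR:-default} placeholders using env (defaults to os.environ)."""
    env = os.environ if env is None else env

    def _substitute(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        current = env.get(name)
        # Shell semantics for ":-": fall back when the variable is unset or empty.
        if current:
            return current
        return default if default is not None else ""

    return _PLACEHOLDER.sub(_substitute, value)
```

With MLFLOW_TRACKING_URI unset, the placeholder resolves to http://localhost:5001; when the variable is set, its value is used instead.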

Config Storage

Configurations are stored in S3 and versioned in DynamoDB:

graph TD
    YAML["tradai.yaml"] -->|upload| S3["S3 Bucket - tradai-configs"]
    S3 -->|version| DDB["DynamoDB - config-versions"]
    DDB -->|ACTIVE version| ECS["ECS Task"]

    ENV[".env / Secrets Manager"] -->|merge| Merge["ConfigMergeService"]
    S3 -->|base config| Merge
    Merge --> ECS

| Component | Location | Purpose |
| --- | --- | --- |
| tradai.yaml | Strategy repo root | Source of truth for strategy config |
| S3 bucket | tradai-configs-{env} | Persisted config versions |
| DynamoDB table | tradai-config-versions-{env} | Version registry with lifecycle tracking |
| Secrets Manager | AWS Secrets Manager | Exchange credentials, API keys |

Config Version Lifecycle

Each config version follows a strict lifecycle:

stateDiagram-v2
    [*] --> DRAFT : create_version()
    DRAFT --> ACTIVE : activate()
    ACTIVE --> DEPRECATED : new version activated
    DEPRECATED --> [*] : TTL auto-cleanup (90 days)

    note right of DRAFT : Not yet validated
    note right of ACTIVE : Currently deployed (one per strategy)
    note right of DEPRECATED : Superseded, auto-deleted after 90d

Key rules:

  • Only one ACTIVE version per strategy at any time
  • Activating a new version automatically deprecates the previous one
  • Deprecated versions have a 90-day TTL for auto-cleanup in DynamoDB
  • Versions are content-addressed by a SHA256 hash of the normalized config, so uploading an identical config is detected as a duplicate
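
The rules above can be sketched as a small in-memory state machine. This is an illustration only: InMemoryVersions stands in for the DynamoDB table, the expires_at attribute is a hypothetical way to model the 90-day TTL, and rejecting activation of non-DRAFT versions is an assumption drawn from the state diagram rather than confirmed ConfigVersionService behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Dict, Optional

DEPRECATED_TTL = timedelta(days=90)


class Status(str, Enum):
    DRAFT = "DRAFT"
    ACTIVE = "ACTIVE"
    DEPRECATED = "DEPRECATED"


@dataclass
class Version:
    config_id: str
    status: Status = Status.DRAFT
    superseded_by: Optional[str] = None
    expires_at: Optional[datetime] = None  # hypothetical TTL attribute


class InMemoryVersions:
    """Toy registry for a single strategy; not the real service."""

    def __init__(self) -> None:
        self.versions: Dict[str, Version] = {}

    def create(self, config_id: str) -> Version:
        self.versions[config_id] = Version(config_id)
        return self.versions[config_id]

    def activate(self, config_id: str, now: Optional[datetime] = None) -> Version:
        now = now or datetime.now(timezone.utc)
        target = self.versions[config_id]
        if target.status is not Status.DRAFT:
            raise ValueError(f"{config_id} is {target.status.value}, expected DRAFT")
        # Deprecate the current ACTIVE version so only one remains active.
        for other in self.versions.values():
            if other.status is Status.ACTIVE:
                other.status = Status.DEPRECATED
                other.superseded_by = config_id
                other.expires_at = now + DEPRECATED_TTL
        target.status = Status.ACTIVE
        return target
```

Activating a second version leaves exactly one ACTIVE entry and stamps the previous one with superseded_by and an expiry 90 days out.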

Config Version Entity

Each version is tracked with these fields:

| Field | Type | Description |
| --- | --- | --- |
| strategy_name | string | Partition key (e.g., "PascalStrategy") |
| config_id | string | Sort key: v{version}-{hash[:8]} |
| config_hash | string | SHA256 of normalized config content |
| config_data | dict | Frozen config content |
| status | enum | DRAFT, ACTIVE, or DEPRECATED |
| version_number | int | Sequential version per strategy |
| created_at | datetime | When created |
| deployed_at | datetime | When activated (null if DRAFT) |
| superseded_by | string | config_id of newer version (if deprecated) |
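
To make the config_hash and config_id fields concrete, here is a minimal sketch of how such values could be derived. The normalization step (JSON with sorted keys and compact separators) is an assumption; the page does not say how TradAI normalizes config content, and both function names are hypothetical.

```python
import hashlib
import json


def config_hash(config_data: dict) -> str:
    """SHA256 over a normalized form of the config (assumed: sorted-key JSON)."""
    normalized = json.dumps(config_data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def make_config_id(version_number: int, digest: str) -> str:
    """Build the sort key in the documented v{version}-{hash[:8]} shape."""
    return f"v{version_number}-{digest[:8]}"


digest = config_hash({"timeframe": "1h", "symbols": ["BTC/USDT:USDT"]})
config_id = make_config_id(3, digest)
```

Because keys are sorted before hashing, two configs that differ only in key order produce the same hash, which is what makes duplicate detection possible.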

Managing Config Versions

# Upload and create a new config version
tradai strategy stage MyStrategy --version 1

# List registered strategies
tradai strategy list

The config service handles versioning programmatically:

from tradai.common.config.service import ConfigVersionService

service = ConfigVersionService(table_name="tradai-config-versions-dev")

# Create a new version (starts as DRAFT)
version = service.create_version(
    strategy_name="MyStrategy",
    config_data={"timeframe": "1h", "symbols": ["BTC/USDT:USDT"]},
    description="Updated symbols list",
)

# Activate it (auto-deprecates previous ACTIVE version)
active = service.activate("MyStrategy", version.config_id)

# List all versions for a strategy
versions = service.list_versions("MyStrategy")

# Get the currently active version
current = service.get_active("MyStrategy")

Config Loading at Runtime

When a strategy container starts, configs are loaded and merged from multiple sources:

graph TD
    S3["S3 Config - (base)"] --> Loader["StrategyConfigLoader"]
    MLflow["MLflow Tags - (model params)"] --> Loader
    ENV["Environment Vars - (overrides)"] --> Loader
    Loader --> Merge["ConfigMergeService"]
    Merge --> Validate["Validation"]
    Validate -->|pass| Config["StrategyConfig - (runtime)"]
    Validate -->|fail| Error["Startup Error"]

Priority order (highest first; a higher source overrides any source below it):

  1. Environment variables
  2. MLflow model tags
  3. S3 stored config
  4. tradai.yaml defaults
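
The priority order above amounts to layering the four sources from lowest to highest precedence. The sketch below assumes a recursive (deep) merge of nested dictionaries; whether ConfigMergeService merges deeply or only at the top level is not stated here, and the function names are hypothetical.

```python
from typing import Any, Dict, Mapping


def deep_merge(base: Mapping[str, Any], override: Mapping[str, Any]) -> Dict[str, Any]:
    """Return base updated with override, recursing into nested mappings."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(
    yaml_defaults: Mapping[str, Any],
    s3_config: Mapping[str, Any],
    mlflow_tags: Mapping[str, Any],
    env_overrides: Mapping[str, Any],
) -> Dict[str, Any]:
    """Apply layers lowest priority first so later layers win."""
    result: Dict[str, Any] = {}
    for layer in (yaml_defaults, s3_config, mlflow_tags, env_overrides):
        result = deep_merge(result, layer)
    return result


config = resolve_config(
    {"timeframe": "1h", "stake_amount": 1000, "limits": {"max_open_trades": 3}},
    {"stake_amount": 500},
    {"timeframe": "4h"},
    {"limits": {"max_open_trades": 1}},
)
```

Here the environment layer overrides max_open_trades, the MLflow layer overrides timeframe, and the S3 layer overrides stake_amount from the tradai.yaml defaults.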

Model Versioning (MLflow)

Model Lifecycle

Models are tracked in the MLflow Model Registry with four stages:

stateDiagram-v2
    [*] --> None : register()
    None --> Staging : stage()
    Staging --> Production : promote()
    Production --> Archived : new model promoted
    Archived --> Production : rollback()

    note right of None : Newly registered, not yet validated
    note right of Staging : Under validation, dry-run testing
    note right of Production : Live trading model (one per strategy)
    note right of Archived : Previous production, available for rollback

| Stage | Description | Who sets it |
| --- | --- | --- |
| None | Freshly registered, not yet reviewed | ModelRegistrar (automatic after training) |
| Staging | Under validation, dry-run testing | tradai strategy set-version or API |
| Production | Active production model | Promotion after validation passes |
| Archived | Previous production model, kept for rollback | Auto-archived when new model is promoted |
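
The transitions in the state diagram can be written down as a lookup table. The Stage enum below is a hypothetical stand-in for illustration; the project's own ModelStage enum may differ in names and values, and the real promotion code may permit transitions not drawn in the diagram.

```python
from enum import Enum


class Stage(str, Enum):
    NONE = "None"
    STAGING = "Staging"
    PRODUCTION = "Production"
    ARCHIVED = "Archived"


# Edges copied from the lifecycle diagram: register -> stage -> promote,
# auto-archive on promotion of a newer model, and rollback from Archived.
ALLOWED = {
    Stage.NONE: {Stage.STAGING},
    Stage.STAGING: {Stage.PRODUCTION},
    Stage.PRODUCTION: {Stage.ARCHIVED},
    Stage.ARCHIVED: {Stage.PRODUCTION},
}


def can_transition(current: Stage, target: Stage) -> bool:
    """True when the diagram contains an edge from current to target."""
    return target in ALLOWED[current]
```

Under this table a freshly registered model cannot jump straight to Production, while an Archived model can return to Production through a rollback.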

Model Registration

After a backtest or training run, models are automatically registered:

sequenceDiagram
    participant ECS as ECS Task - (Freqtrade)
    participant MLflow as MLflow - Registry
    participant DDB as DynamoDB - State

    ECS->>MLflow: Log metrics + params
    ECS->>MLflow: Log model artifacts
    ECS->>MLflow: Register model version
    MLflow-->>ECS: Version number
    ECS->>DDB: Update job status
    ECS->>MLflow: Tag with git_commit, strategy_name

The ModelRegistrar handles this automatically:

from tradai.common.entrypoint.training.model_registrar import ModelRegistrar

registrar = ModelRegistrar(mlflow_adapter=adapter)
result = registrar.register(config=training_config, result=training_result)
# result.model_version is now set

Model Comparison

Before promoting a new model, the ModelComparator compares it against the current champion:

graph TD
    Challenger["Challenger Model - (new version)"] --> Compare["ModelComparator"]
    Champion["Champion Model - (current Production)"] --> Compare
    Compare --> Decision{"Better?"}
    Decision -->|"yes"| Promote["Promote to Production"]
    Decision -->|"no"| Keep["Keep current champion"]
    Decision -->|"insufficient data"| Manual["Manual review needed"]

Key comparison metrics:

| Metric | Weight | Required | Threshold |
| --- | --- | --- | --- |
| total_profit | High | Yes | Must be positive |
| sharpe_ratio | High | Yes | >= 0.5 |
| max_drawdown | Medium | Yes | <= 30% |
| win_rate | Medium | No | >= 40% |
| total_trades | Low | Yes | >= 10 |
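
The required thresholds in the table can be checked with a few lines of code. This sketch covers only the threshold gate, not the weighted champion-versus-challenger comparison that ModelComparator performs. It assumes max_drawdown and win_rate are expressed as fractions (0.30 for 30%), and the names Rule, RULES, and failed_required are hypothetical.

```python
import operator
from typing import Callable, Dict, List, NamedTuple


class Rule(NamedTuple):
    metric: str
    required: bool
    op: Callable[[float, float], bool]
    limit: float


RULES = [
    Rule("total_profit", True, operator.gt, 0.0),   # must be positive
    Rule("sharpe_ratio", True, operator.ge, 0.5),
    Rule("max_drawdown", True, operator.le, 0.30),  # assumed fraction, not percent
    Rule("win_rate", False, operator.ge, 0.40),     # optional, so never blocks
    Rule("total_trades", True, operator.ge, 10),
]


def failed_required(metrics: Dict[str, float]) -> List[str]:
    """Return required metrics that are missing or miss their threshold."""
    failures = []
    for rule in RULES:
        if not rule.required:
            continue
        value = metrics.get(rule.metric)
        if value is None or not rule.op(value, rule.limit):
            failures.append(rule.metric)
    return failures
```

An empty result means every required threshold is met; a low win_rate alone does not block a candidate because that metric is not required.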

CLI Commands

# Stage a model version for validation
tradai strategy set-version MyStrategy 3 --stage Staging

# Promote to production (archives current champion)
tradai strategy set-version MyStrategy 3 --stage Production

# Promote without archiving previous version
tradai strategy set-version MyStrategy 3 --no-archive

# Preview promotion (dry run)
tradai strategy set-version MyStrategy 3 --dry-run

# Rollback to previous deployment
tradai deploy rollback MyStrategy --env dev

# Rollback to a specific deployment
tradai deploy rollback MyStrategy --env dev --deployment deploy-abc123

API Endpoints

The Strategy Service exposes model management APIs:

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/v1/strategies/{name}/models | List model versions |
| GET | /api/v1/strategies/{name}/models/rollback-candidates | List rollback candidates |
| POST | /api/v1/strategies/{name}/models/stage | Stage a model version |
| POST | /api/v1/strategies/{name}/models/promote | Promote to Production |
| POST | /api/v1/strategies/{name}/models/rollback | Rollback model |

Automated Retraining Pipeline

The retraining workflow is orchestrated by Step Functions:

graph TD
    Trigger["Trigger - Schedule / Drift / Manual"] --> Check["Check Retraining - Needed?"]
    Check -->|"yes"| Train["Train Model - (ECS + FreqAI)"]
    Check -->|"no"| Skip["Skip"]
    Train --> Compare["Compare Models - (Lambda)"]
    Compare -->|"better"| Promote["Promote Model - (Lambda)"]
    Compare -->|"worse"| Keep["Keep Champion"]
    Promote --> Notify["Notify - (SNS + Slack)"]
    Keep --> Notify

Lambda functions involved:

| Lambda | Role |
| --- | --- |
| check-retraining-needed | Evaluates drift scores and schedules |
| compare-models | Champion vs challenger comparison |
| promote-model | Transitions model to Production stage |
| model-rollback | Reverts to previous model version |
| drift-monitor | Monitors PSI for model/data drift |
| retraining-scheduler | Triggers retraining on schedule |
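
Assuming PSI here refers to the Population Stability Index, the drift score compares a baseline distribution with a recent one, bin by bin. The sketch below computes it from pre-binned proportions using the standard formula, the sum of (actual - expected) * ln(actual / expected). It is not the drift-monitor Lambda's code, and the binning strategy and alert thresholds that TradAI uses are not documented on this page.

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float], eps: float = 1e-6) -> float:
    """Population Stability Index over two aligned lists of bin proportions."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same number of bins")
    total = 0.0
    for e, a in zip(expected, actual):
        # Floor empty bins at eps to avoid log(0) and division by zero.
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


baseline = [0.25, 0.25, 0.25, 0.25]
recent = [0.10, 0.20, 0.30, 0.40]
score = psi(baseline, recent)
```

Identical distributions score zero, and the score grows as the recent distribution moves away from the baseline.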

Environment-Specific Configuration

Settings Hierarchy

Each service uses Pydantic settings with environment variable prefixes:

| Service | Prefix | Key Settings |
| --- | --- | --- |
| Backend | BACKEND_ | BACKEND_EXECUTOR_MODE, BACKEND_BACKTEST_QUEUE_URL |
| Strategy Service | STRATEGY_SERVICE_ | STRATEGY_SERVICE_MLFLOW_TRACKING_URI, STRATEGY_SERVICE_STRATEGY_PATH |
| Data Collection | DATA_COLLECTION_ | DATA_COLLECTION_EXCHANGES, DATA_COLLECTION_ARCTIC_S3_BUCKET |
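
The effect of a per-service prefix can be shown without Pydantic. The standard-library sketch below only illustrates the naming convention: each service sees the variables that start with its prefix, mapped to lowercase field names. The real services use Pydantic settings classes, and read_prefixed is a hypothetical helper.

```python
import os
from typing import Dict, Mapping, Optional


def read_prefixed(prefix: str, env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect variables starting with prefix, keyed by lowercase field name."""
    env = os.environ if env is None else env
    return {
        key[len(prefix):].lower(): value
        for key, value in env.items()
        if key.startswith(prefix)
    }


sample_env = {
    "BACKEND_EXECUTOR_MODE": "sqs",
    "STRATEGY_SERVICE_STRATEGY_PATH": "/strategies",
}
backend_settings = read_prefixed("BACKEND_", sample_env)
```

With this sample environment the Backend sees only executor_mode, and the Strategy Service variable is ignored because it carries a different prefix.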

Settings Mixins

Common settings are shared via mixins:

# MLflow settings (shared by Strategy Service and Backend)
class MLflowSettingsMixin:
    mlflow_tracking_uri: str   # MLFLOW_TRACKING_URI
    mlflow_username: str       # MLFLOW_USERNAME
    mlflow_password: str       # MLFLOW_PASSWORD

# ArcticDB settings (shared by Data Collection and Strategy Service)
class ArcticSettingsMixin:
    arctic_s3_bucket: str      # ARCTIC_S3_BUCKET
    arctic_library_name: str   # ARCTIC_LIBRARY_NAME (default: "ohlcv")
    arctic_s3_endpoint: str    # ARCTIC_S3_ENDPOINT

Per-Environment Differences

| Setting | Dev | Staging | Prod |
| --- | --- | --- | --- |
| Executor mode | local | sqs | stepfunctions |
| RDS instance | db.t4g.micro | db.t4g.micro | db.t4g.small |
| ECS launch type | EC2 (consolidated) | EC2 (consolidated) | Fargate |
| Log retention | 30 days | 30 days | 90 days |
| Deletion protection | Off | Off | On |
| MLflow URL | http://localhost:5001 | Service Discovery | Service Discovery |

Traceability

Four identifiers link each operation across services and storage:

graph LR
    TraceID["trace_id"] --> Backend["Backend API"]
    TraceID --> SQS["SQS Message"]
    TraceID --> SF["Step Functions"]
    TraceID --> ECS["ECS Task"]
    TraceID --> MLflow["MLflow Run"]
    TraceID --> DDB["DynamoDB - job record"]

    JobID["job_id"] --> DDB
    JobID --> S3["S3 Results"]

    RunID["mlflow_run_id"] --> MLflow
    RunID --> DDB

    GitSHA["git_commit"] --> MLflow
    GitSHA --> DDB

| Field | Description | Where stored |
| --- | --- | --- |
| trace_id | End-to-end correlation ID | DynamoDB, Step Functions input, ECS env |
| job_id | DynamoDB job identifier | DynamoDB, S3 result paths |
| mlflow_run_id | MLflow experiment run | DynamoDB, BacktestResult, MLflow |
| git_commit | Code version SHA | BacktestResult, MLflow tags |

Quick Reference

Config Commands

| Command | Description |
| --- | --- |
| tradai strategy list | List registered strategies |
| tradai strategy stage NAME --version V | Stage a strategy version |
| tradai strategy set-version NAME V | Set model version stage |
| tradai strategy set-version NAME V --stage Production | Promote model |
| tradai strategy set-version NAME V --dry-run | Preview promotion |
| tradai deploy strategy ./path --env dev | Deploy strategy to ECS |
| tradai deploy rollback NAME --env dev | Rollback deployment |

Key Source Files

| Component | Path |
| --- | --- |
| ConfigVersion entity | libs/tradai-common/src/tradai/common/entities/config_version.py |
| ConfigVersionService | libs/tradai-common/src/tradai/common/config/service.py |
| S3ConfigRepository | libs/tradai-common/src/tradai/common/aws/s3_config_repository.py |
| ConfigMergeService | libs/tradai-common/src/tradai/common/config/merge.py |
| StrategyConfigLoader | libs/tradai-common/src/tradai/common/config/loader.py |
| ModelStage enum | libs/tradai-common/src/tradai/common/entities/mlflow.py |
| MLflowAdapter | libs/tradai-common/src/tradai/common/mlflow/adapter.py |
| ModelComparator | libs/tradai-common/src/tradai/common/model_comparison/comparator.py |
| ModelRegistrar | libs/tradai-common/src/tradai/common/entrypoint/training/model_registrar.py |
| Promotion routes | services/strategy-service/src/tradai/strategy_service/api/promotion_routes.py |
| Config routes | services/strategy-service/src/tradai/strategy_service/api/config_routes.py |

See Also