Configuration & Model Versioning¶

This page describes how strategy configurations and ML models are versioned, promoted, and managed across environments.

graph LR
    subgraph Config["Strategy Config"]
        C1["tradai.yaml"] --> C2["S3 Upload"]
        C2 --> C3["DRAFT"]
        C3 --> C4["ACTIVE"]
        C4 --> C5["DEPRECATED"]
    end

    subgraph Model["ML Model"]
        M1["Training"] --> M2["MLflow Registry"]
        M2 --> M3["None"]
        M3 --> M4["Staging"]
        M4 --> M5["Production"]
        M5 --> M6["Archived"]
    end

    C4 -.->|deployed together| M5

Strategy Configuration¶

tradai.yaml¶

Every strategy has a tradai.yaml at its root that defines how TradAI services interact with it:

strategy:
  name: "MyStrategy"
  version: "1.0.0"
  entry_point: "mystrategy.strategy:MyStrategy"
  category: "momentum"
  timeframe: "1h"

strategy_service:
  source:
    KIND: Binance
  adapter:
    KIND: AWS
    bucket_name: tradai-data
    library: ohlcv
  defaults:
    timerange: "20240101-20241201"
    symbols:
      - "BTC/USDT:USDT"
      - "ETH/USDT:USDT"
    stake_amount: 1000
    max_open_trades: 3

mlflow:
  tracking_uri: ${MLFLOW_TRACKING_URI:-http://localhost:5001}
  experiment_name: "strategies/mystrategy"
  auto_log_params: true

optimization:
  defaults:
    epochs: 100
    loss_function: sharpe
    spaces: [buy, sell]
  presets:
    quick:
      epochs: 50
      spaces: [buy]
    standard:
      epochs: 200
      spaces: [buy, sell]
    production:
      epochs: 1000
      spaces: [buy, sell, roi, stoploss, trailing]
      walk_forward: true

deployment:
  ecr:
    repository: "tradai/strategies/mystrategy"
  ecs:
    cpu: 512
    memory: 1024
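
The `tracking_uri` value above uses a shell-style `${VAR:-default}` placeholder. How TradAI resolves it is not shown here. The following is a minimal sketch, assuming placeholders are expanded against environment variables when the file is loaded. The helper name `expand_env_placeholders` and the `http://mlflow.internal:5000` URL are hypothetical.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}; the syntax mirrors shell parameter expansion.
_PLACEHOLDER = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def expand_env_placeholders(value: str, environ=None) -> str:
    """Replace ${NAME:-default} placeholders using `environ` (hypothetical helper)."""
    env = os.environ if environ is None else environ

    def _substitute(match: re.Match) -> str:
        name = match.group("name")
        default = match.group("default")
        resolved = env.get(name)
        # Like the shell's ":-" form, an unset OR empty variable falls back to the default.
        if resolved:
            return resolved
        if default is not None:
            return default
        raise KeyError(f"environment variable {name} is not set and has no default")

    return _PLACEHOLDER.sub(_substitute, value)


raw = "${MLFLOW_TRACKING_URI:-http://localhost:5001}"
# With no variable set, the default after ":-" is used.
local_uri = expand_env_placeholders(raw, environ={})
# With the variable set, its value replaces the whole placeholder.
remote_uri = expand_env_placeholders(
    raw, environ={"MLFLOW_TRACKING_URI": "http://mlflow.internal:5000"}
)
```

Passing `environ` explicitly keeps the helper easy to test; in a real loader it would default to the process environment.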

Config Storage¶

Configurations are stored in S3 and versioned in DynamoDB:

graph TD
    YAML["tradai.yaml"] -->|upload| S3["S3 Bucket - tradai-configs"]
    S3 -->|version| DDB["DynamoDB - config-versions"]
    DDB -->|ACTIVE version| ECS["ECS Task"]

    ENV[".env / Secrets Manager"] -->|merge| Merge["ConfigMergeService"]
    S3 -->|base config| Merge
    Merge --> ECS

Component	Location	Purpose
`tradai.yaml`	Strategy repo root	Source of truth for strategy config
S3 bucket	`tradai-configs-{env}`	Persisted config versions
DynamoDB table	`tradai-config-versions-{env}`	Version registry with lifecycle tracking
Secrets Manager	AWS Secrets Manager	Exchange credentials, API keys

Config Version Lifecycle¶

Each config version follows a strict lifecycle:

stateDiagram-v2
    [*] --> DRAFT : create_version()
    DRAFT --> ACTIVE : activate()
    ACTIVE --> DEPRECATED : new version activated
    DEPRECATED --> [*] : TTL auto-cleanup (90 days)

    note right of DRAFT : Not yet validated
    note right of ACTIVE : Currently deployed (one per strategy)
    note right of DEPRECATED : Superseded, auto-deleted after 90d

Key rules:

Only one ACTIVE version per strategy at any time
Activating a new version automatically deprecates the previous one
Deprecated versions have a 90-day TTL for auto-cleanup in DynamoDB
Versions are content-addressed by the SHA256 hash of the normalized config, so duplicate configs are detected
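
The activation rules can be sketched in a few lines. This is an illustrative in-memory version, not the `ConfigVersionService` implementation: versions are plain dicts, and the `expires_at` attribute name is an assumption standing in for whatever TTL attribute the DynamoDB table uses.

```python
from datetime import datetime, timedelta, timezone

DEPRECATED_TTL = timedelta(days=90)


def activate(versions: list, config_id: str, now: datetime) -> dict:
    """Activate one DRAFT version and deprecate whichever version was ACTIVE."""
    target = next((v for v in versions if v["config_id"] == config_id), None)
    if target is None:
        raise KeyError(config_id)
    if target["status"] != "DRAFT":
        raise ValueError(f"only DRAFT versions can be activated, got {target['status']}")
    for version in versions:
        if version["status"] == "ACTIVE":
            version["status"] = "DEPRECATED"
            version["superseded_by"] = config_id
            # DynamoDB TTL attributes store the expiry time as epoch seconds.
            version["expires_at"] = int((now + DEPRECATED_TTL).timestamp())
    target["status"] = "ACTIVE"
    target["deployed_at"] = now
    return target


versions = [
    {"config_id": "v1-aaaaaaaa", "status": "ACTIVE"},
    {"config_id": "v2-bbbbbbbb", "status": "DRAFT"},
]
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
activate(versions, "v2-bbbbbbbb", now)
```

After the call, `v1-aaaaaaaa` is DEPRECATED with `superseded_by` pointing at `v2-bbbbbbbb`, and exactly one version is ACTIVE.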

Config Version Entity¶

Each version is tracked with these fields:

Field	Type	Description
`strategy_name`	string	Partition key (e.g., "PascalStrategy")
`config_id`	string	Sort key: `v{version}-{hash[:8]}`
`config_hash`	string	SHA256 of normalized config content
`config_data`	dict	Frozen config content
`status`	enum	DRAFT, ACTIVE, or DEPRECATED
`version_number`	int	Sequential version per strategy
`created_at`	datetime	When created
`deployed_at`	datetime	When activated (null if DRAFT)
`superseded_by`	string	config_id of newer version (if deprecated)
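
To make the `config_hash` and `config_id` fields concrete, here is a small sketch of how they could be derived. The normalization step is an assumption (canonical JSON with sorted keys); the actual normalization used by the library is not documented on this page, and both function names are hypothetical.

```python
import hashlib
import json


def compute_config_hash(config_data: dict) -> str:
    """SHA256 over a canonical JSON form, so key order does not affect the hash."""
    canonical = json.dumps(config_data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_config_id(version_number: int, digest: str) -> str:
    """Sort key in the documented `v{version}-{hash[:8]}` format."""
    return f"v{version_number}-{digest[:8]}"


# The same content in a different key order yields the same hash,
# which is what makes duplicate detection possible.
first = compute_config_hash({"timeframe": "1h", "symbols": ["BTC/USDT:USDT"]})
second = compute_config_hash({"symbols": ["BTC/USDT:USDT"], "timeframe": "1h"})
config_id = build_config_id(3, first)
```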

Managing Config Versions¶

# Upload and create a new config version
tradai strategy stage MyStrategy --version 1

# List registered strategies
tradai strategy list

# The config service handles versioning programmatically:

from tradai.common.config.service import ConfigVersionService

service = ConfigVersionService(table_name="tradai-config-versions-dev")

# Create a new version (starts as DRAFT)
version = service.create_version(
    strategy_name="MyStrategy",
    config_data={"timeframe": "1h", "symbols": ["BTC/USDT:USDT"]},
    description="Updated symbols list",
)

# Activate it (auto-deprecates previous ACTIVE version)
active = service.activate("MyStrategy", version.config_id)

# List all versions for a strategy
versions = service.list_versions("MyStrategy")

# Get the currently active version
current = service.get_active("MyStrategy")

Config Loading at Runtime¶

When a strategy container starts, configs are loaded and merged from multiple sources:

graph TD
    S3["S3 Config - (base)"] --> Loader["StrategyConfigLoader"]
    MLflow["MLflow Tags - (model params)"] --> Loader
    ENV["Environment Vars - (overrides)"] --> Loader
    Loader --> Merge["ConfigMergeService"]
    Merge --> Validate["Validation"]
    Validate -->|pass| Config["StrategyConfig - (runtime)"]
    Validate -->|fail| Error["Startup Error"]

Priority order, highest first (a higher-priority source overrides a lower one):

1. Environment variables
2. MLflow model tags
3. S3 stored config
4. tradai.yaml defaults
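
The priority order amounts to a layered merge in which later layers override earlier ones. The sketch below assumes nested sections are merged key by key rather than replaced wholesale; whether `ConfigMergeService` behaves this way is not stated here, and the function names and sample values are illustrative.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where `override` wins; nested dicts are merged key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_layers(*layers_lowest_first: dict) -> dict:
    """Apply layers from lowest to highest priority."""
    result = {}
    for layer in layers_lowest_first:
        result = deep_merge(result, layer)
    return result


yaml_defaults = {"timeframe": "1h", "defaults": {"stake_amount": 1000, "max_open_trades": 3}}
s3_config = {"defaults": {"max_open_trades": 5}}
mlflow_tags = {"model_version": "3"}
env_overrides = {"timeframe": "4h"}

# Lowest priority first, so environment variables are applied last and win.
runtime = merge_layers(yaml_defaults, s3_config, mlflow_tags, env_overrides)
```

Here `runtime["timeframe"]` comes from the environment layer, `max_open_trades` from the S3 config, and `stake_amount` falls through from the `tradai.yaml` defaults.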

Model Versioning (MLflow)¶

Model Lifecycle¶

Models are tracked in the MLflow Model Registry with four stages:

stateDiagram-v2
    [*] --> None : register()
    None --> Staging : stage()
    Staging --> Production : promote()
    Production --> Archived : new model promoted
    Archived --> Production : rollback()

    note right of None : Newly registered, not yet validated
    note right of Staging : Under validation, dry-run testing
    note right of Production : Live trading model (one per strategy)
    note right of Archived : Previous production, available for rollback

Stage	Description	Who sets it
None	Freshly registered, not yet reviewed	`ModelRegistrar` (automatic after training)
Staging	Under validation, dry-run testing	`tradai strategy set-version` or API
Production	Active production model	Promotion after validation passes
Archived	Previous production model, kept for rollback	Auto-archived when new model is promoted
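
The transitions in the state diagram can be expressed as a small lookup table. This is a sketch for illustration only: the `Stage` enum and `promote` function are hypothetical names and do not mirror the real `ModelStage` enum or the promotion code.

```python
from enum import Enum


class Stage(str, Enum):
    NONE = "None"
    STAGING = "Staging"
    PRODUCTION = "Production"
    ARCHIVED = "Archived"


# Allowed transitions, taken from the state diagram above.
ALLOWED = {
    Stage.NONE: {Stage.STAGING},
    Stage.STAGING: {Stage.PRODUCTION},
    Stage.PRODUCTION: {Stage.ARCHIVED},
    Stage.ARCHIVED: {Stage.PRODUCTION},  # rollback
}


def promote(stages: dict, version: int) -> dict:
    """Move `version` to Production and archive the current Production model."""
    if Stage.PRODUCTION not in ALLOWED[stages[version]]:
        raise ValueError(f"cannot promote a model in stage {stages[version].value}")
    updated = dict(stages)
    for other, stage in stages.items():
        if stage is Stage.PRODUCTION:
            updated[other] = Stage.ARCHIVED
    updated[version] = Stage.PRODUCTION
    return updated


stages = {1: Stage.ARCHIVED, 2: Stage.PRODUCTION, 3: Stage.STAGING}
after_promote = promote(stages, 3)          # version 2 is archived
after_rollback = promote(after_promote, 2)  # version 2 returns, version 3 is archived
```

A rollback is modelled here as promoting an Archived version again, which keeps the "one Production model per strategy" rule in a single code path.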

Model Registration¶

After a backtest or training run, models are automatically registered:

sequenceDiagram
    participant ECS as ECS Task - (Freqtrade)
    participant MLflow as MLflow - Registry
    participant DDB as DynamoDB - State

    ECS->>MLflow: Log metrics + params
    ECS->>MLflow: Log model artifacts
    ECS->>MLflow: Register model version
    MLflow-->>ECS: Version number
    ECS->>DDB: Update job status
    ECS->>MLflow: Tag with git_commit, strategy_name

The ModelRegistrar handles this automatically:

from tradai.common.entrypoint.training.model_registrar import ModelRegistrar

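# `adapter`, `training_config`, and `training_result` are assumed to have been
# created earlier in the training entrypoint.
registrar = ModelRegistrar(mlflow_adapter=adapter)
result = registrar.register(config=training_config, result=training_result)
# result.model_version now holds the registered model version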

Model Comparison¶

Before promoting a new model, the ModelComparator compares it against the current champion:

graph TD
    Challenger["Challenger Model - (new version)"] --> Compare["ModelComparator"]
    Champion["Champion Model - (current Production)"] --> Compare
    Compare --> Decision{"Better?"}
    Decision -->|"yes"| Promote["Promote to Production"]
    Decision -->|"no"| Keep["Keep current champion"]
    Decision -->|"insufficient data"| Manual["Manual review needed"]

Key comparison metrics:

Metric	Weight	Required	Threshold
`total_profit`	High	Yes	Must be positive
`sharpe_ratio`	High	Yes	>= 0.5
`max_drawdown`	Medium	Yes	<= 30%
`win_rate`	Medium	No	>= 40%
`total_trades`	Low	Yes	>= 10
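
The thresholds in the table can be read as a gate that a challenger must pass before any champion comparison. The sketch below covers only that gate. It assumes `max_drawdown` and `win_rate` are expressed as fractions (0.30 for 30%), and it leaves out the High/Medium/Low weights because their numeric values are not given here. The names are hypothetical and not part of `ModelComparator`.

```python
# metric name -> (required, predicate that returns True when the threshold is met)
THRESHOLDS = {
    "total_profit": (True, lambda v: v > 0),
    "sharpe_ratio": (True, lambda v: v >= 0.5),
    "max_drawdown": (True, lambda v: v <= 0.30),  # assumes a fraction: 0.30 == 30%
    "win_rate": (False, lambda v: v >= 0.40),     # assumes a fraction: 0.40 == 40%
    "total_trades": (True, lambda v: v >= 10),
}


def check_thresholds(metrics: dict) -> tuple:
    """Return (passed, failures, warnings) for one model's metrics."""
    failures, warnings = [], []
    for name, (required, meets_threshold) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None:
            problem = f"{name}: missing"
        elif not meets_threshold(value):
            problem = f"{name}: {value} does not meet the threshold"
        else:
            continue
        # Required metrics block promotion; optional ones only produce a warning.
        (failures if required else warnings).append(problem)
    return (not failures, failures, warnings)


challenger = {
    "total_profit": 12.5,
    "sharpe_ratio": 0.8,
    "max_drawdown": 0.18,
    "win_rate": 0.35,
    "total_trades": 42,
}
passed, failures, warnings = check_thresholds(challenger)
```

In this example the challenger passes, with a warning for `win_rate` because that metric is not required.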

CLI Commands¶

# Stage a model version for validation
tradai strategy set-version MyStrategy 3 --stage Staging

# Promote to production (archives current champion)
tradai strategy set-version MyStrategy 3 --stage Production

# Promote without archiving previous version
tradai strategy set-version MyStrategy 3 --no-archive

# Preview promotion (dry run)
tradai strategy set-version MyStrategy 3 --dry-run

# Rollback to previous deployment
tradai deploy rollback MyStrategy --env dev

# Rollback to a specific deployment
tradai deploy rollback MyStrategy --env dev --deployment deploy-abc123

API Endpoints¶

The Strategy Service exposes model management APIs:

Method	Path	Description
`GET`	`/api/v1/strategies/{name}/models`	List model versions
`GET`	`/api/v1/strategies/{name}/models/rollback-candidates`	List rollback candidates
`POST`	`/api/v1/strategies/{name}/models/stage`	Stage a model version
`POST`	`/api/v1/strategies/{name}/models/promote`	Promote to Production
`POST`	`/api/v1/strategies/{name}/models/rollback`	Rollback model

Automated Retraining Pipeline¶

The retraining workflow is orchestrated by Step Functions:

graph TD
    Trigger["Trigger - Schedule / Drift / Manual"] --> Check["Check Retraining - Needed?"]
    Check -->|"yes"| Train["Train Model - (ECS + FreqAI)"]
    Check -->|"no"| Skip["Skip"]
    Train --> Compare["Compare Models - (Lambda)"]
    Compare -->|"better"| Promote["Promote Model - (Lambda)"]
    Compare -->|"worse"| Keep["Keep Champion"]
    Promote --> Notify["Notify - (SNS + Slack)"]
    Keep --> Notify

Lambda functions involved:

Lambda	Role
`check-retraining-needed`	Evaluates drift scores and schedules
`compare-models`	Champion vs challenger comparison
`promote-model`	Transitions model to Production stage
`model-rollback`	Reverts to previous model version
`drift-monitor`	Monitors the Population Stability Index (PSI) to detect model/data drift
`retraining-scheduler`	Triggers retraining on schedule
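
For readers unfamiliar with PSI, the standard formula sums `(actual - expected) * ln(actual / expected)` over the bins of a distribution. The sketch below implements that formula in plain Python. It is not the `drift-monitor` Lambda's code, and the threshold that triggers retraining is not documented on this page.

```python
import math


def population_stability_index(expected: list, actual: list, epsilon: float = 1e-6) -> float:
    """PSI between two binned distributions, each given as per-bin proportions."""
    if len(expected) != len(actual):
        raise ValueError("both distributions must use the same bins")
    psi = 0.0
    for e, a in zip(expected, actual):
        # Clamp empty bins to a small value to avoid log(0) and division by zero.
        e = max(e, epsilon)
        a = max(a, epsilon)
        psi += (a - e) * math.log(a / e)
    return psi


baseline = [0.25, 0.25, 0.25, 0.25]
# Identical distributions give a PSI of exactly zero.
unchanged = population_stability_index(baseline, baseline)
# Mass moving toward the upper bins gives a positive PSI.
shifted = population_stability_index(baseline, [0.10, 0.20, 0.30, 0.40])
```

Every term in the sum is non-negative, so PSI grows as the live distribution moves away from the training baseline.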

Environment-Specific Configuration¶

Settings Hierarchy¶

Each service defines its settings as a Pydantic settings class with its own environment variable prefix:

Service	Prefix	Key Settings
Backend	`BACKEND_`	`BACKEND_EXECUTOR_MODE`, `BACKEND_BACKTEST_QUEUE_URL`
Strategy Service	`STRATEGY_SERVICE_`	`STRATEGY_SERVICE_MLFLOW_TRACKING_URI`, `STRATEGY_SERVICE_STRATEGY_PATH`
Data Collection	`DATA_COLLECTION_`	`DATA_COLLECTION_EXCHANGES`, `DATA_COLLECTION_ARCTIC_S3_BUCKET`
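
The effect of a prefix is that each service only sees its own variables, with the prefix stripped from the field name. Pydantic settings provides this through its `env_prefix` option. The standard-library sketch below shows the same mapping without the dependency. The helper name and the sample values are hypothetical.

```python
import os


def load_prefixed_settings(prefix: str, environ=None) -> dict:
    """Collect variables that start with `prefix`, keyed by the lowercased remainder."""
    env = os.environ if environ is None else environ
    return {
        name[len(prefix):].lower(): value
        for name, value in env.items()
        if name.startswith(prefix) and len(name) > len(prefix)
    }


environ = {
    "BACKEND_EXECUTOR_MODE": "sqs",
    "BACKEND_BACKTEST_QUEUE_URL": "https://sqs.example.invalid/backtests",
    "STRATEGY_SERVICE_STRATEGY_PATH": "/srv/strategies",
}
# Each service reads only the variables that carry its own prefix.
backend = load_prefixed_settings("BACKEND_", environ)
strategy_service = load_prefixed_settings("STRATEGY_SERVICE_", environ)
```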

Settings Mixins¶

Common settings are shared via mixins:

# MLflow settings (shared by Strategy Service and Backend)
class MLflowSettingsMixin:
    mlflow_tracking_uri: str   # MLFLOW_TRACKING_URI
    mlflow_username: str       # MLFLOW_USERNAME
    mlflow_password: str       # MLFLOW_PASSWORD

# ArcticDB settings (shared by Data Collection and Strategy Service)
class ArcticSettingsMixin:
    arctic_s3_bucket: str      # ARCTIC_S3_BUCKET
    arctic_library_name: str = "ohlcv"  # ARCTIC_LIBRARY_NAME
    arctic_s3_endpoint: str    # ARCTIC_S3_ENDPOINT

Per-Environment Differences¶

Setting	Dev	Staging	Prod
Executor mode	`local`	`sqs`	`stepfunctions`
RDS instance	`db.t4g.micro`	`db.t4g.micro`	`db.t4g.small`
ECS launch type	EC2 (consolidated)	EC2 (consolidated)	Fargate
Log retention	30 days	30 days	90 days
Deletion protection	Off	Off	On
MLflow URL	`http://localhost:5001`	Service Discovery	Service Discovery

Traceability¶

Every operation carries identifiers that link its records across services:

graph LR
    TraceID["trace_id"] --> Backend["Backend API"]
    TraceID --> SQS["SQS Message"]
    TraceID --> SF["Step Functions"]
    TraceID --> ECS["ECS Task"]
    TraceID --> MLflow["MLflow Run"]
    TraceID --> DDB["DynamoDB - job record"]

    JobID["job_id"] --> DDB
    JobID --> S3["S3 Results"]

    RunID["mlflow_run_id"] --> MLflow
    RunID --> DDB

    GitSHA["git_commit"] --> MLflow
    GitSHA --> DDB

Field	Description	Where stored
`trace_id`	End-to-end correlation ID	DynamoDB, Step Functions input, ECS env
`job_id`	DynamoDB job identifier	DynamoDB, S3 result paths
`mlflow_run_id`	MLflow experiment run	DynamoDB, BacktestResult, MLflow
`git_commit`	Code version SHA	BacktestResult, MLflow tags
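
One way to keep these identifiers consistent is to create them once and derive every downstream payload from the same context. The sketch below illustrates that idea. The function names, the UUID-based ID format, and the `TRACE_ID`/`JOB_ID` variable names are assumptions, not the project's actual conventions.

```python
import json
import uuid


def new_trace_context(git_commit: str) -> dict:
    """Create the identifiers once, at the point where the job is accepted."""
    return {
        "trace_id": str(uuid.uuid4()),
        "job_id": str(uuid.uuid4()),
        "git_commit": git_commit,
    }


def job_record(ctx: dict, mlflow_run_id: str = "") -> dict:
    """Job item that keeps every identifier side by side."""
    return {**ctx, "mlflow_run_id": mlflow_run_id}


def step_functions_input(ctx: dict, payload: dict) -> str:
    """Step Functions executions take their input as a JSON string."""
    return json.dumps({"trace_id": ctx["trace_id"], "job_id": ctx["job_id"], **payload})


def ecs_environment(ctx: dict) -> list:
    """ECS container overrides take environment variables as name/value pairs."""
    return [
        {"name": "TRACE_ID", "value": ctx["trace_id"]},
        {"name": "JOB_ID", "value": ctx["job_id"]},
    ]


ctx = new_trace_context(git_commit="0123abc")
record = job_record(ctx)
sfn_input = json.loads(step_functions_input(ctx, {"strategy": "MyStrategy"}))
env = {item["name"]: item["value"] for item in ecs_environment(ctx)}
```

Because all three payloads are built from `ctx`, a single `trace_id` lookup can join the job record, the workflow execution, and the task logs.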

Quick Reference¶

Config Commands¶

Command	Description
`tradai strategy list`	List registered strategies
`tradai strategy stage NAME --version V`	Stage a strategy version
`tradai strategy set-version NAME V`	Set model version stage
`tradai strategy set-version NAME V --stage Production`	Promote model
`tradai strategy set-version NAME V --dry-run`	Preview promotion
`tradai deploy strategy ./path --env dev`	Deploy strategy to ECS
`tradai deploy rollback NAME --env dev`	Rollback deployment

Key Source Files¶

Component	Path
ConfigVersion entity	`libs/tradai-common/src/tradai/common/entities/config_version.py`
ConfigVersionService	`libs/tradai-common/src/tradai/common/config/service.py`
S3ConfigRepository	`libs/tradai-common/src/tradai/common/aws/s3_config_repository.py`
ConfigMergeService	`libs/tradai-common/src/tradai/common/config/merge.py`
StrategyConfigLoader	`libs/tradai-common/src/tradai/common/config/loader.py`
ModelStage enum	`libs/tradai-common/src/tradai/common/entities/mlflow.py`
MLflowAdapter	`libs/tradai-common/src/tradai/common/mlflow/adapter.py`
ModelComparator	`libs/tradai-common/src/tradai/common/model_comparison/comparator.py`
ModelRegistrar	`libs/tradai-common/src/tradai/common/entrypoint/training/model_registrar.py`
Promotion routes	`services/strategy-service/src/tradai/strategy_service/api/promotion_routes.py`
Config routes	`services/strategy-service/src/tradai/strategy_service/api/config_routes.py`