TradAI Final Architecture - Canonical Configuration
Version: 9.3.0 | Date: 2026-03-28 | Status: CURRENT
TL;DR: Single source of truth for all infrastructure configuration. Defines VPC CIDRs, service specs (CPU/memory/ports), DynamoDB tables, S3 buckets, Lambda configs, and environment-specific overrides. All values in infra/shared/tradai_infra_shared/config.py.
This document mirrors infra/shared/tradai_infra_shared/config.py, which is the actual source of truth deployed by Pulumi. Update config.py first, then update this document to match.
graph TD
B["infra/shared/config.py<br/><b>Source of Truth (code)</b>"] --> A["10-CANONICAL-CONFIG.md<br/><b>Human-readable mirror</b>"]
B --> F["persistent stack"]
B --> G["foundation stack"]
B --> H["compute stack"]
B --> I["edge stack"]
A -.->|"read by"| C["Service Environment Variables<br/>ECS task definitions"]
A -.->|"read by"| D["Lambda Configuration<br/>Memory, timeout, VPC"]
A -.->|"read by"| E["Security Configuration<br/>SGs, NACLs, WAF, Cognito"]
style B fill:#d32f2f,color:#fff
style A fill:#1565c0,color:#fff
Immutable Values
The following values must NEVER change after initial deployment without a full migration plan: VPC CIDR (10.0.0.0/16), subnet CIDRs, DynamoDB partition keys (per table, see Section 3.2), S3 bucket naming pattern (tradai-{component}-{env}), and Cognito User Pool ID. Changing these will cause data loss or service disruption.
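A pre-deployment guard for these values could look like the sketch below. It is illustrative only: the flat key names (`vpc_cidr`, `subnet_cidrs`, and so on) and both helper functions are hypothetical and are not taken from config.py, which may organize these values differently.

```python
# Keys treated as immutable, following the list above.
# Hypothetical flat key names; config.py may structure these differently.
IMMUTABLE_KEYS = (
    "vpc_cidr",
    "subnet_cidrs",
    "dynamodb_partition_keys",
    "cognito_user_pool_id",
)


def bucket_name(component: str, env: str) -> str:
    """Build a bucket name following the tradai-{component}-{env} pattern."""
    return f"tradai-{component}-{env}"


def immutable_violations(deployed: dict, proposed: dict) -> list[str]:
    """Return the immutable keys whose value differs between two configs.

    Keys absent from `deployed` are ignored, since nothing has been
    deployed for them yet.
    """
    return [
        key
        for key in IMMUTABLE_KEYS
        if key in deployed and deployed[key] != proposed.get(key)
    ]
```

A non-empty result from `immutable_violations` would be the signal to stop and write a migration plan rather than deploy.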
Environment Overrides
Environment-specific values (RDS instance class, desired count, log retention, etc.) are controlled by the Pulumi stack name. The stack name IS the environment (dev, staging, prod). See Section 10 for the full environment differences matrix.
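As a rough sketch of how stack-name-driven overrides can be expressed in Python: the `EnvOverrides` class, the `overrides_for` function, and every value in the table are placeholders invented for illustration, not the values in config.py. In a Pulumi program the stack name would typically come from `pulumi.get_stack()`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvOverrides:
    """Hypothetical container for per-environment settings."""
    rds_instance_class: str
    desired_count: int
    log_retention_days: int


# Placeholder values for illustration only; the real ones live in config.py.
_OVERRIDES = {
    "dev": EnvOverrides("db.t4g.micro", 1, 7),
    "staging": EnvOverrides("db.t4g.small", 1, 14),
    "prod": EnvOverrides("db.t4g.medium", 2, 30),
}


def overrides_for(stack_name: str) -> EnvOverrides:
    """Look up overrides by stack name, rejecting unknown environments."""
    try:
        return _OVERRIDES[stack_name]
    except KeyError:
        known = ", ".join(sorted(_OVERRIDES))
        raise ValueError(
            f"unknown stack {stack_name!r}; expected one of: {known}"
        ) from None
```

Failing loudly on an unknown stack name keeps a mistyped stack from silently inheriting another environment's settings.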
Cost-effective NAT Instance with automatic failover (~$32/month savings vs NAT Gateway).
| Component | Value | Notes |
|---|---|---|
| Instance Type | t4g.nano | ARM-based, ~$3/month |
| AMI | Amazon Linux 2023 ARM64 | Latest AMI, auto-selected |
| ASG | min=1, max=1, desired=1 | Single instance for cost optimization |
| EIP | Static allocation | Attached by user data script |
| Lambda | tradai-update-nat-routes | Updates route on instance replacement |
| Subnet | Public-1 (eu-central-1a) | Single AZ deployment |
User Data Configuration:
- IP forwarding enabled (net.ipv4.ip_forward = 1)
- NAT masquerading via iptables
- Automatic EIP association
- Source/destination check disabled
Failover Flow:
1. ASG detects unhealthy instance (EC2 health check)
2. ASG terminates failed instance and launches replacement
3. ASG Lifecycle Hook triggers EventBridge rule
4. Lambda function tradai-update-nat-routes invoked
5. Lambda waits for instance running state
6. Lambda updates private route table (0.0.0.0/0 -> new instance)
7. Lambda completes lifecycle action
8. Traffic resumes (~2-3 minutes failover time)
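Steps 5 to 7 of the failover flow can be sketched with boto3 as follows. This is not the deployed tradai-update-nat-routes code: the function names, the `PRIVATE_ROUTE_TABLE_ID` environment variable, and the choice to pass the clients in as arguments are assumptions made so the logic can be exercised without AWS. The event fields read here are those carried in the `detail` of an Auto Scaling lifecycle action event delivered via EventBridge.

```python
import os


def update_nat_route(event, ec2, autoscaling, route_table_id):
    """Repoint the private default route at the replacement NAT instance."""
    detail = event["detail"]
    instance_id = detail["EC2InstanceId"]

    # Step 5: block until the replacement instance reaches the running state.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

    # Step 6: point 0.0.0.0/0 in the private route table at the new instance.
    ec2.replace_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        InstanceId=instance_id,
    )

    # Step 7: tell the ASG the lifecycle hook is done so the launch proceeds.
    autoscaling.complete_lifecycle_action(
        LifecycleHookName=detail["LifecycleHookName"],
        AutoScalingGroupName=detail["AutoScalingGroupName"],
        LifecycleActionToken=detail["LifecycleActionToken"],
        LifecycleActionResult="CONTINUE",
        InstanceId=instance_id,
    )
    return instance_id


def handler(event, context):
    """Lambda entry point; boto3 is imported lazily to keep tests AWS-free."""
    import boto3

    return update_nat_route(
        event,
        boto3.client("ec2"),
        boto3.client("autoscaling"),
        os.environ["PRIVATE_ROUTE_TABLE_ID"],  # hypothetical variable name
    )
```

Passing the clients in keeps the route-update logic testable with simple fakes. A production handler would also want error handling that completes the lifecycle action with `ABANDON` on failure, which this sketch omits.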
| Variable | Description | Example |
|---|---|---|
| STRATEGY | Strategy class name (or FREQTRADE_STRATEGY as fallback) | PascalStrategy |
| STRATEGY_ID | Unique strategy identifier (required for live/dry-run) | pascal-btc |
| TRADING_MODE | Execution mode | backtest, live, dry-run, train |
| MLFLOW_TRACKING_URI | MLflow endpoint | http://mlflow.tradai-{env}.local:5000/mlflow |
Note (Issue 10 Fix): Documentation previously used STRATEGY_NAME and STRATEGY_STAGE. The actual env vars are STRATEGY and STRATEGY_ID. Stage is determined by deployment config.
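The resolution rules for these variables can be sketched as a small helper. It is an illustration, not the service code: the function name and return shape are hypothetical, and treating TRADING_MODE as mandatory and letting STRATEGY take precedence over FREQTRADE_STRATEGY are assumptions read from the descriptions above.

```python
import os
from typing import Mapping, Optional

VALID_MODES = {"backtest", "live", "dry-run", "train"}
MODES_REQUIRING_ID = {"live", "dry-run"}


def resolve_strategy_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read and validate the strategy variables from an environment mapping."""
    env = os.environ if env is None else env

    # STRATEGY wins; FREQTRADE_STRATEGY is only consulted as a fallback.
    strategy = env.get("STRATEGY") or env.get("FREQTRADE_STRATEGY")
    if not strategy:
        raise ValueError("set STRATEGY (or FREQTRADE_STRATEGY)")

    mode = env.get("TRADING_MODE")
    if mode not in VALID_MODES:
        raise ValueError(f"TRADING_MODE must be one of {sorted(VALID_MODES)}")

    # STRATEGY_ID is only mandatory for live and dry-run execution.
    strategy_id = env.get("STRATEGY_ID")
    if mode in MODES_REQUIRING_ID and not strategy_id:
        raise ValueError(f"STRATEGY_ID is required when TRADING_MODE={mode}")

    return {"strategy": strategy, "strategy_id": strategy_id, "trading_mode": mode}
```

Accepting a mapping instead of reading `os.environ` directly makes the rules easy to check in isolation, for example `resolve_strategy_env({"STRATEGY": "PascalStrategy", "TRADING_MODE": "backtest"})`.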
Optional (Infrastructure):
| Variable | Description | Default |
|---|---|---|
| CONFIG_PATH | Freqtrade config file path | /freqtrade/user_data/config.json |
| DYNAMODB_TABLE | Workflow state table (also accepts WORKFLOW_STATE_TABLE) | |
All 12 tables use PAY_PER_REQUEST billing and server-side encryption. Point-in-time recovery is enabled on every table except the idempotency table. Prod tables have deletion protection enabled.
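These shared table settings can be summarized in a few lines of Python. The function, its dictionary keys, and the short table name `"idempotency"` are hypothetical; the actual definitions are in config.py and may be shaped differently.

```python
def table_settings(table: str, env: str) -> dict:
    """Return the common settings for one DynamoDB table in one environment.

    `table` is a short logical name; "idempotency" is assumed here to be
    the name of the one table without point-in-time recovery.
    """
    return {
        "billing_mode": "PAY_PER_REQUEST",
        "server_side_encryption": True,
        "point_in_time_recovery": table != "idempotency",
        "deletion_protection": env == "prod",
    }
```

For instance, `table_settings("idempotency", "prod")` yields deletion protection without point-in-time recovery, while any other table in dev yields the opposite pair.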
Note: NAT SG currently allows TCP only. DNS (port 53) and NTP (port 123) use UDP. If private subnet workloads need direct DNS resolution or NTP through NAT (rather than via VPC-provided DNS), UDP rules for ports 53 and 123 should be added to both ingress and egress. Currently VPC DNS handles resolution so this is not a blocker, but should be added if custom DNS or NTP is required.
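If the UDP rules are ever needed, they could be described as in this sketch, which builds rule dictionaries in the shape boto3 uses for `IpPermissions`. The helper names are hypothetical, and the CIDR choices (ingress from the whole VPC CIDR, egress to anywhere) are assumptions; the real security group may scope ingress to the private subnets only.

```python
VPC_CIDR = "10.0.0.0/16"  # VPC CIDR as listed under Immutable Values
UDP_PORTS = (53, 123)     # DNS and NTP


def udp_rule(port: int, cidr: str) -> dict:
    """One single-port UDP rule in boto3 IpPermissions form."""
    return {
        "IpProtocol": "udp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr}],
    }


def nat_udp_rules() -> dict:
    """UDP rules for the NAT security group, for both directions."""
    return {
        # Assumption: accept DNS/NTP from anywhere inside the VPC.
        "ingress": [udp_rule(port, VPC_CIDR) for port in UDP_PORTS],
        # Assumption: forward DNS/NTP to any destination.
        "egress": [udp_rule(port, "0.0.0.0/0") for port in UDP_PORTS],
    }
```

Each list could be passed as `IpPermissions` to `authorize_security_group_ingress` or `authorize_security_group_egress`, or translated into the equivalent Pulumi security group rules.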
All 12 DynamoDB tables match Section 3.2 (names, keys, GSIs, TTL)
All resources have required tags
Secrets exist in Secrets Manager
SSL enforcement enabled on RDS
Document Control:
- This document mirrors infra/shared/tradai_infra_shared/config.py, which is the actual source of truth. Update config.py first, then update this document.
- Changes require review and version increment
- When this document conflicts with config.py, config.py takes precedence
Added all 12 DynamoDB tables with keys/GSIs/TTL (was 1 of 12). Fixed ownership: config.py is source of truth, this doc mirrors it. Added Compute Mode row to env differences. Added UDP note to NAT SG. Fixed validation checklist.