TradAI Final Architecture - Cost Analysis

Version: 9.2 | Date: 2025-12-09


Cost Summary

Before vs After Optimization

| Metric | Original (v5.1) | Corrected Baseline | Optimized (v8.0) |
|---|---|---|---|
| Documented Cost | $109/mo | $220/mo | $78/mo |
| Accuracy | Low (missing components) | High | High |
| Savings | - | Baseline | 64% |

Detailed Cost Breakdown

1. Compute - Long-Running Services

| Service | vCPU | Memory | Hours/Month | Rate | Monthly Cost |
|---|---|---|---|---|---|
| Backend API | 0.5 | 1 GB | 730 | $0.020/hr | $14.60 |
| Data Collection | 0.25 | 512 MB | 730 | $0.010/hr | $7.30 |
| MLflow | 0.5 | 1 GB | 730 | $0.020/hr | $14.60 |
| Subtotal | | | | | $36.50 |

2. Compute - On-Demand Tasks

Assumptions: 50 backtests/month, 20 data syncs/month

| Task | vCPU | Memory | Avg Duration | Runs/Month | Monthly Cost |
|---|---|---|---|---|---|
| Strategy Service | 0.5 | 1 GB | 10 min | 50 | $1.67 |
| Strategy Container* | 1 | 2 GB | 30 min | 50 | $5.00 |
| Data Collection Task | 0.5 | 1 GB | 15 min | 20 | $1.00 |
| Subtotal (On-Demand) | | | | | $7.67 |
| With Fargate Spot (70% savings) | | | | | $1.92 |

*Strategy Container uses Fargate Spot for cost optimization

3. Networking

| Component | Configuration | Baseline Cost | Optimized Cost |
|---|---|---|---|
| NAT Gateway (2 AZs) | Standard | $64.80 | - |
| NAT Instance (t4g.nano) | 1 instance | - | $6.13 |
| ALB | 1 LB, minimal traffic | $17.00 | $17.00 |
| VPC Endpoints (ECR, Secrets) | 4 interface endpoints | $28.00 | $0* |
| Subtotal | | $109.80 | $23.13 |

*VPC Endpoints removed - using NAT Instance for all AWS API calls

4. Storage & Database

| Component | Configuration | Baseline Cost | Optimized Cost |
|---|---|---|---|
| RDS PostgreSQL | db.t4g.micro, Multi-AZ | $36.00 | $18.00* |
| S3 Storage | 100 GB | $2.50 | $1.00** |
| S3 Requests | 100K requests | $0.50 | $0.50 |
| ECR | 50 GB images | $5.00 | $5.00 |
| DynamoDB | On-demand, ~100 ops/day | $2.00 | $2.00 |
| Subtotal | | $46.00 | $26.50 |

*Single-AZ for dev environment
**With lifecycle policies (delete temp after 7 days)

5. Serverless

| Component | Configuration | Monthly Cost |
|---|---|---|
| AWS API Gateway | HTTP API, ~50K requests | $3.50 |
| Lambda (8 functions) | ~10K invocations | $0.00 (free tier) |
| Step Functions | ~500 transitions | $0.50 |
| SQS (FIFO) | ~1K messages | $0.40 |
| Subtotal | | $4.40 |

6. Observability

| Component | Configuration | Baseline Cost | Optimized Cost |
|---|---|---|---|
| CloudWatch Logs | 10 GB, 30-day retention | $10.00 | $5.00* |
| CloudWatch Metrics | Custom metrics | $3.00 | $3.00 |
| CloudWatch Alarms | 10 alarms | $1.00 | $1.00 |
| CloudWatch Dashboard | 1 dashboard | $3.00 | $3.00 |
| Subtotal | | $17.00 | $12.00 |

*7-day retention instead of 30-day

7. Security

| Component | Configuration | Monthly Cost |
|---|---|---|
| Secrets Manager | 5 secrets | $2.00 |
| CloudTrail | 1 trail | $2.00 |
| Cognito | < 50K MAU | $0.00 (free tier) |
| WAF | Web ACL + managed rules | $5.00 |
| Subtotal | | $9.00 |

8. Live Trading (v9.1)

Note: Live trading costs are per-strategy and added on top of the base platform.

| Component | Specification | Per Strategy/Month |
|---|---|---|
| ECS Fargate (24/7) | 0.5 vCPU, 1 GB | $15.33 |
| CloudWatch Logs | ~1 GB/month | $0.50 |
| DynamoDB | ~1M reads (heartbeats) | $0.25 |
| Secrets Manager | 2 secrets (exchange keys) | $0.80 |
| EventBridge | 8,640 invocations/month | $0.01 |
| Lambda (health-check) | 8,640 × 128 MB × 1 s | $0.11 |
| SNS | ~100 alerts | $0.01 |
| Infrastructure Subtotal | | $17.01 |
| Reserve (exchange fees, data feeds) | | +$29.00 |
| Total per Strategy | | ~$46/month |
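
The per-strategy total can be checked by summing the rows above. The following is a minimal sketch; the dictionary keys and the function name are illustrative, and the amounts are copied from the table.

```python
from decimal import Decimal

# Per-strategy monthly infrastructure costs in USD, copied from the table above.
LIVE_STRATEGY_COSTS = {
    "ecs_fargate": Decimal("15.33"),
    "cloudwatch_logs": Decimal("0.50"),
    "dynamodb": Decimal("0.25"),
    "secrets_manager": Decimal("0.80"),
    "eventbridge": Decimal("0.01"),
    "lambda_health_check": Decimal("0.11"),
    "sns": Decimal("0.01"),
}
RESERVE = Decimal("29.00")  # exchange fees, data feeds


def per_strategy_cost(costs=LIVE_STRATEGY_COSTS, reserve=RESERVE):
    """Return (infrastructure subtotal, total including the reserve)."""
    subtotal = sum(costs.values(), Decimal("0"))
    return subtotal, subtotal + reserve


subtotal, total = per_strategy_cost()
print(subtotal, total)  # 17.01 46.01
```

The total of $46.01 is what the table rounds to ~$46/month.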

Example: 3 Live Strategies

| Item | Monthly Cost |
|---|---|
| Base Platform (Phases 1-5) | $78-99 |
| Pascal Strategy (live) | $46 |
| Momentum Strategy (dry-run) | $46 |
| ML Trend Strategy (live) | $46 |
| Platform Total | $216-237 |

Total Cost Comparison

Monthly Cost Summary (Base Platform)

| Category | Baseline | Optimized | Savings |
|---|---|---|---|
| Long-Running Services | $36.50 | $36.50 | $0 |
| On-Demand Tasks | $7.67 | $1.92 | $5.75 |
| Networking | $109.80 | $23.13 | $86.67 |
| Storage & Database | $46.00 | $26.50 | $19.50 |
| Serverless | $4.40 | $4.40 | $0 |
| Observability | $17.00 | $12.00 | $5.00 |
| Security | $9.00 | $9.00 | $0 |
| BASE PLATFORM | $230.37 | $113.45 | $116.92 |
| With Reserved RDS (1yr) | | $99.45 | $130.92 |
| Final Target (Backtesting only) | | ~$78-99/mo | 57-66% |
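
The BASE PLATFORM row can be reproduced from the seven category rows. This is a small sketch using figures copied from the table; the `summarize` helper is a hypothetical name introduced only for illustration.

```python
from decimal import Decimal

# (baseline, optimized) monthly cost in USD per category, from the table above.
CATEGORIES = {
    "Long-Running Services": ("36.50", "36.50"),
    "On-Demand Tasks": ("7.67", "1.92"),
    "Networking": ("109.80", "23.13"),
    "Storage & Database": ("46.00", "26.50"),
    "Serverless": ("4.40", "4.40"),
    "Observability": ("17.00", "12.00"),
    "Security": ("9.00", "9.00"),
}


def summarize(categories):
    """Return (baseline, optimized, savings, savings percent)."""
    baseline = sum((Decimal(b) for b, _ in categories.values()), Decimal("0"))
    optimized = sum((Decimal(o) for _, o in categories.values()), Decimal("0"))
    savings = baseline - optimized
    percent = (savings / baseline * 100).quantize(Decimal("0.1"))
    return baseline, optimized, savings, percent


print(*summarize(CATEGORIES))  # 230.37 113.45 116.92 50.8
```

Measured against the same $230.37 baseline, the $78-99 final target corresponds to the 57-66% savings range in the last row.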

Monthly Cost Summary (With Live Trading - v9.1)

| Component | Cost |
|---|---|
| Base Platform (optimized) | $78-99 |
| Live Strategy #1 | $46 |
| Live Strategy #2 | $46 |
| Live Strategy #3 | $46 |
| TOTAL (3 strategies) | $216-237 |

Live trading costs scale linearly with number of strategies.
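
As a rough illustration of that linear scaling, the sketch below adds the rounded $46 per-strategy figure to the base platform range. The function and constant names are assumptions made for this example.

```python
BASE_PLATFORM_RANGE = (78, 99)  # optimized base platform, USD/month
PER_STRATEGY = 46               # rounded per-strategy total, USD/month


def platform_total(num_strategies, base=BASE_PLATFORM_RANGE, per_strategy=PER_STRATEGY):
    """Return the (low, high) monthly total for a number of live strategies."""
    if num_strategies < 0:
        raise ValueError("num_strategies must be non-negative")
    extra = num_strategies * per_strategy
    return base[0] + extra, base[1] + extra


for n in (0, 1, 3, 5):
    low, high = platform_total(n)
    print(f"{n} strategies: ${low}-{high}/month")
```

For three strategies this gives $216-237/month, matching the table.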

Cost by Category (Optimized)

┌─────────────────────────────────────────────────────────────┐
│                    Monthly Cost: $113.45                     │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Long-Running Services    ████████████████████  $36.50 (32%)│
│  Storage & Database       ███████████           $26.50 (23%)│
│  Networking               █████████             $23.13 (20%)│
│  Observability            █████                 $12.00 (11%)│
│  Security                 ███                   $9.00  (8%) │
│  Serverless               ██                    $4.40  (4%) │
│  On-Demand Tasks          █                     $1.92  (2%) │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Optimization Details

Optimization 1: NAT Instance vs NAT Gateway

NAT Gateway (2 AZs):
├─ Hourly charge: $0.045/hr × 730 hrs × 2 = $65.70
├─ Data processing: $0.045/GB × ~100 GB = $4.50
└─ Total: ~$70/month

NAT Instance (t4g.nano):
├─ Instance: $0.0042/hr × 730 hrs = $3.07
├─ EIP: $0 (attached to instance)
├─ Data: Included
└─ Total: ~$3/month + some overhead = $6.13

SAVINGS: $64/month (91% reduction)

Trade-offs:
- Lower bandwidth (5 Gbps vs 45 Gbps) - acceptable for our workload
- Self-managed (patching, monitoring) - minimal effort with ASG
- Single point of failure - mitigated with ASG health checks
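
The comparison above can be recomputed from the stated rates. This is a sketch only: the function names are illustrative, and the $6.13 all-in NAT Instance figure is taken from the document because the overhead component is not itemized.

```python
HOURS_PER_MONTH = 730  # same convention as the compute tables


def nat_gateway_monthly(hourly=0.045, azs=2, data_gb=100, per_gb=0.045):
    """Hourly charge for one gateway per AZ, plus per-GB data processing."""
    return hourly * HOURS_PER_MONTH * azs + per_gb * data_gb


def nat_instance_monthly(hourly=0.0042):
    """Instance-hours only; excludes the unitemized overhead in the text."""
    return hourly * HOURS_PER_MONTH


gateway = nat_gateway_monthly()
instance_all_in = 6.13  # figure from the document, including overhead
savings = gateway - instance_all_in

print(f"NAT Gateway:       ${gateway:.2f}")
print(f"NAT Instance base: ${nat_instance_monthly():.2f}")
print(f"Savings:           ${savings:.2f} ({savings / gateway:.0%})")
```

With the document's rates this works out to about $70.20 for the gateway, $3.07 for the bare instance, and $64.07 (91%) in savings.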

Optimization 2: Remove VPC Endpoints

VPC Interface Endpoints:
├─ ECR API: $7/mo × 2 AZs = $14
├─ ECR DKR: $7/mo × 2 AZs = $14
├─ Secrets Manager: $7/mo × 2 AZs = $14 (optional)
└─ Total: $28-42/month

Alternative: Route through NAT Instance
├─ Additional NAT traffic: ~10 GB/month
├─ Cost: $0.045/GB × 10 GB = $0.45/month
└─ Total: ~$0.45/month

SAVINGS: $28-42/month

Trade-offs:
- Slightly higher latency for AWS API calls (~10ms)
- Traffic goes through internet (still encrypted)
- Acceptable for non-high-frequency operations
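
The endpoint arithmetic follows a per-endpoint, per-AZ pattern, sketched below with the document's $7/month rate. The helper names are hypothetical.

```python
def interface_endpoints_monthly(endpoints, azs=2, per_endpoint_az=7.00):
    """Interface endpoints are billed per endpoint per AZ."""
    return endpoints * azs * per_endpoint_az


def nat_extra_traffic_monthly(extra_gb=10, per_gb=0.045):
    """Cost of routing the same traffic through the NAT instead."""
    return extra_gb * per_gb


for count in (2, 3):  # ECR API + ECR DKR, optionally Secrets Manager
    removed = interface_endpoints_monthly(count)
    added = nat_extra_traffic_monthly()
    print(f"{count} endpoints: remove ${removed:.2f}, add ${added:.2f}")
```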

Optimization 3: RDS Single-AZ

RDS Multi-AZ:
├─ Primary: $18/month
├─ Standby: $18/month
└─ Total: $36/month

RDS Single-AZ:
└─ Total: $18/month

SAVINGS: $18/month (50% reduction)

Trade-offs:
- No automatic failover
- 15-30 minute recovery on failure
- Acceptable for dev/staging environments

Production recommendation:
- Keep Multi-AZ for production
- Use Single-AZ for dev/staging

Optimization 4: Fargate Spot

Fargate On-Demand (Strategy Tasks):
├─ 50 runs × 0.5 hr × $0.04/hr = $1.00/month
└─ Total: ~$1/month

Fargate Spot:
├─ 70% discount on Fargate pricing
├─ 50 runs × 0.5 hr × $0.012/hr = $0.30/month
└─ Total: ~$0.30/month

SAVINGS: ~$0.70/month (70% reduction)

Trade-offs:
- Tasks may be interrupted (2-minute warning)
- Mitigation: Checkpoint progress to S3
- Acceptable for batch processing (backtests)
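
One way to make a backtest tolerate interruption is to stop cleanly on SIGTERM and record where to resume. The sketch below assumes the task receives SIGTERM as its interruption notice and writes the checkpoint to a local JSON file. In a real deployment the checkpoint would be uploaded to S3 as described above. `run_backtest`, `process_step`, and the checkpoint layout are illustrative names, not part of the platform.

```python
import json
import signal
import threading
from pathlib import Path

# Set when the task receives SIGTERM (sent ahead of a Spot interruption).
stop_event = threading.Event()


def install_sigterm_handler():
    """Register the handler; call once from the main thread at startup."""
    signal.signal(signal.SIGTERM, lambda signum, frame: stop_event.set())


def run_backtest(steps, checkpoint_path, process_step):
    """Process steps in order, resuming from a checkpoint if one exists.

    Returns True when all steps finished, False when stopped early.
    """
    path = Path(checkpoint_path)
    start = json.loads(path.read_text())["next_step"] if path.exists() else 0
    for index in range(start, len(steps)):
        if stop_event.is_set():
            # Save progress so a replacement task can resume from here.
            path.write_text(json.dumps({"next_step": index}))
            return False
        process_step(steps[index])
    path.unlink(missing_ok=True)  # finished, so drop any stale checkpoint
    return True


if __name__ == "__main__":
    install_sigterm_handler()
    finished = run_backtest(list(range(5)), "checkpoint.json", print)
    print("finished" if finished else "interrupted, checkpoint saved")
```

Checking the flag between steps keeps each step atomic. The step size should therefore be small enough to finish within the 2-minute warning.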

Optimization 5: CloudWatch Logs Retention

30-day retention:
├─ 10 GB × $0.50/GB ingestion = $5
├─ 10 GB × $0.03/GB storage × 30 days = $9
└─ Total: ~$14/month

7-day retention:
├─ 10 GB × $0.50/GB ingestion = $5
├─ 10 GB × $0.03/GB storage × 7 days = $2.10
└─ Total: ~$7/month

SAVINGS: $7/month (50% reduction)

Trade-offs:
- Shorter debugging window
- Mitigation: Export important logs to S3 for long-term storage
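
Applying the shorter retention is a one-call change per log group. The sketch below assumes a boto3 CloudWatch Logs client is passed in and uses its `put_retention_policy` call. The helper name and the log group names in the usage comment are hypothetical.

```python
def apply_retention(logs_client, log_group_names, days=7):
    """Set the same retention period (in days) on each log group."""
    for name in log_group_names:
        logs_client.put_retention_policy(logGroupName=name, retentionInDays=days)


# Usage (requires boto3 and AWS credentials; group names are placeholders):
#   import boto3
#   apply_retention(boto3.client("logs"), ["/ecs/tradai/backend-api"])
```

Passing the client in keeps the function easy to exercise with a stub before it is pointed at a real account.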

Optimization 6: S3 Lifecycle Policies

Without lifecycle:
├─ Temp files accumulate
├─ 100 GB × $0.023/GB = $2.30
└─ Growing over time

With lifecycle:
├─ Temp files deleted after 7 days
├─ Results archived to Glacier after 30 days
├─ ~40 GB average = $1.00
└─ Stable cost

SAVINGS: $1.30/month + prevents cost growth
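
The two lifecycle rules described above could be expressed as the configuration below, in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The `temp/` and `results/` prefixes and the rule IDs are assumptions; the document does not state the bucket layout.

```python
def build_lifecycle_configuration(temp_prefix="temp/", results_prefix="results/"):
    """Build an S3 lifecycle configuration with the two rules from the text."""
    return {
        "Rules": [
            {
                # Temp files deleted after 7 days.
                "ID": "expire-temp-files",
                "Filter": {"Prefix": temp_prefix},
                "Status": "Enabled",
                "Expiration": {"Days": 7},
            },
            {
                # Results archived to Glacier after 30 days.
                "ID": "archive-results",
                "Filter": {"Prefix": results_prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
            },
        ]
    }


# Usage (requires boto3; the bucket name is a placeholder):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket",
#       LifecycleConfiguration=build_lifecycle_configuration(),
#   )
```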

Optimization 7: Reserved Capacity (Optional)

RDS On-Demand:
└─ db.t4g.micro: $18/month

RDS Reserved (1-year, no upfront):
└─ db.t4g.micro: $12/month

SAVINGS: $6/month (33% reduction)

Requirements:
- 1-year commitment
- Wait 1 month to verify stable usage

Cost Scaling Analysis

Cost at Different Usage Levels

| Backtests/Month | 10 | 50 | 100 | 200 |
|---|---|---|---|---|
| On-Demand Tasks | $0.40 | $1.92 | $3.84 | $7.68 |
| Step Functions | $0.10 | $0.50 | $1.00 | $2.00 |
| S3 (results) | $0.50 | $1.00 | $2.00 | $4.00 |
| Variable Cost | $1.00 | $3.42 | $6.84 | $13.68 |
| Fixed Cost | $95.03 | $95.03 | $95.03 | $95.03 |
| Total | $96.03 | $98.45 | $101.87 | $108.71 |
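
The 50, 100, and 200 columns follow a simple model: fixed cost plus a linear per-backtest cost. The sketch below derives the per-backtest rates from the 50-backtest column, which is an assumption about how the table was built. The 10-backtest column sits slightly above this line, so it is not reproduced here.

```python
from decimal import Decimal

FIXED_MONTHLY = Decimal("95.03")

# Per-backtest rates implied by the 50-backtest column (cost / 50).
PER_BACKTEST = {
    "on_demand_tasks": Decimal("1.92") / 50,
    "step_functions": Decimal("0.50") / 50,
    "s3_results": Decimal("1.00") / 50,
}


def monthly_total(backtests):
    """Fixed cost plus a linear per-backtest variable cost, in USD."""
    variable = sum(PER_BACKTEST.values(), Decimal("0")) * backtests
    return (FIXED_MONTHLY + variable).quantize(Decimal("0.01"))


for n in (50, 100, 200):
    print(n, monthly_total(n))
```

The implied variable rate is $0.0684 per backtest, which rounds to the $0.07 quoted in the summary.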

Break-Even Analysis

Current architecture supports 50-200 backtests/month at ~$100/month

At 500+ backtests/month:
- Consider provisioned Step Functions
- Consider reserved Fargate capacity
- Estimated cost: ~$150/month

At 1000+ backtests/month:
- Consider dedicated EC2 instances for backtesting
- Consider Kubernetes (EKS) for orchestration
- Estimated cost: ~$300/month

Cost Monitoring

Budget Alerts

Budget Configuration:
├─ Monthly Budget: $100
├─ Alert 1: 50% ($50) - Email notification
├─ Alert 2: 80% ($80) - Email + Slack notification
├─ Alert 3: 100% ($100) - Email + Slack + PagerDuty
└─ Forecast Alert: 110% projected - Email notification
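
The budget above maps onto the request shape used by the AWS Budgets `create_budget` API. The sketch below only builds the request and assumes email subscribers. Slack and PagerDuty delivery would typically go through an SNS topic, which is not shown. The budget name, account ID, and address are placeholders.

```python
def build_budget_request(account_id, email, limit_usd=100, name="tradai-monthly"):
    """Build keyword arguments for a monthly cost budget with four alerts."""

    def notification(kind, threshold_percent):
        return {
            "Notification": {
                "NotificationType": kind,  # "ACTUAL" or "FORECASTED"
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": float(threshold_percent),
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
        }

    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": name,
            "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            notification("ACTUAL", 50),
            notification("ACTUAL", 80),
            notification("ACTUAL", 100),
            notification("FORECASTED", 110),
        ],
    }


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   request = build_budget_request("123456789012", "ops@example.com")
#   boto3.client("budgets").create_budget(**request)
```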

Cost Allocation Tags

Required Tags:
├─ Application: tradai
├─ Environment: production | staging | dev
├─ Service: backend-api | data-collection | mlflow | strategy
├─ Owner: team-name
└─ CostCenter: trading-platform
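
A small check like the following could run in CI or a deployment script to catch untagged resources early. It is a sketch: the function name is hypothetical, and the allowed values are taken from the list above (Owner is left unconstrained because its values are not enumerated).

```python
REQUIRED_TAGS = ("Application", "Environment", "Service", "Owner", "CostCenter")

ALLOWED_VALUES = {
    "Application": {"tradai"},
    "Environment": {"production", "staging", "dev"},
    "Service": {"backend-api", "data-collection", "mlflow", "strategy"},
    "CostCenter": {"trading-platform"},
}


def tag_problems(tags):
    """Return a list of human-readable problems; empty means the tags pass."""
    problems = []
    for key in REQUIRED_TAGS:
        if key not in tags:
            problems.append(f"missing tag: {key}")
        elif key in ALLOWED_VALUES and tags[key] not in ALLOWED_VALUES[key]:
            problems.append(f"unexpected value for {key}: {tags[key]!r}")
    return problems


print(tag_problems({"Application": "tradai", "Environment": "prod"}))
```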

Cost Explorer Queries

Monthly by Service:
- Filter: Application = tradai
- Group by: Service tag
- Time: Last 3 months

Daily Trend:
- Filter: Application = tradai
- Group by: Day
- Time: Last 30 days

Anomaly Detection:
- Enable AWS Cost Anomaly Detection
- Threshold: 20% above normal
- Alert: Email + Slack
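
The first two queries can also be run programmatically through the Cost Explorer `get_cost_and_usage` API. The sketch below builds the request parameters only. The helper name is an assumption, and the tag keys must already be activated as cost allocation tags for the filter to match.

```python
from datetime import date


def cost_query(start, end, granularity="MONTHLY", group_by_service=True):
    """Build get_cost_and_usage parameters filtered to Application = tradai.

    `end` is exclusive, following the Cost Explorer convention.
    """
    request = {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": granularity,
        "Metrics": ["UnblendedCost"],
        "Filter": {"Tags": {"Key": "Application", "Values": ["tradai"]}},
    }
    if group_by_service:
        request["GroupBy"] = [{"Type": "TAG", "Key": "Service"}]
    return request


# Monthly by Service, three months (example dates):
monthly = cost_query(date(2025, 9, 1), date(2025, 12, 1))
# Daily Trend, thirty days, no grouping:
daily = cost_query(date(2025, 11, 9), date(2025, 12, 9), "DAILY", group_by_service=False)

# Usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("ce").get_cost_and_usage(**monthly)
```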

Future Cost Optimizations (Phase 2)

| Optimization | Potential Savings | Effort | Priority |
|---|---|---|---|
| Graviton instances for ECS | 20% on compute | Medium | High |
| Spot instances for MLflow | $5/month | Low | Medium |
| S3 Intelligent Tiering | $0.50/month | Low | Low |
| Reserved Fargate | 30% on compute | Low | Medium |
| Right-sizing after 1 month | 10-20% | Medium | High |

Summary

| Metric | Value |
|---|---|
| Final Monthly Cost | $78-99 |
| Cost per Backtest | $0.07 |
| Fixed Costs | $95/month |
| Variable Costs | $0.07/backtest |
| Savings vs Baseline | 57-66% |

Next Steps

  1. Review 08-IMPLEMENTATION-ROADMAP.md for deployment plan
  2. Set up budget alerts in AWS Budgets
  3. Apply cost allocation tags to all resources