Pulumi Module Reference

Complete reference for all 28 infrastructure modules in /infra/modules/.

Module Inventory

Phase 0: State Management

| Module | File | Task ID | Description |
| --- | --- | --- | --- |
| Pulumi Backend | pulumi_backend.py | - | S3 bucket for Pulumi state, IAM roles for CI/CD |

Phase 1: Foundation

| Module | File | Task ID | Description |
| --- | --- | --- | --- |
| VPC Network | vpc.py | IF002 | VPC, 6 subnets (2 AZs, 3 tiers), IGW, route tables |
| VPC Endpoints | vpc_endpoints.py | SEC002 | Gateway endpoints for S3 and DynamoDB |
| VPC Flow Logs | vpc_flow_logs.py | SEC004 | Flow logs to CloudWatch for audit |
| Network ACLs | nacl.py | SEC005 | Stateless firewall rules per subnet tier |
| S3 Buckets | s3.py | IS001 | 5 buckets: configs, results, arcticdb, logs, mlflow |
| CloudTrail | cloudtrail.py | SEC003 | Audit logging to S3 and CloudWatch |
| DynamoDB Tables | dynamodb.py | IS003 | 8 tables for workflow/health/trading state |
| SNS Topics | sns.py | MN001, SR016 | Alert notifications and registration events |
| Security Groups | security_groups.py | IF004 | 5 SGs: ALB, ECS, Lambda, RDS, NAT |
| NAT Instance | nat_instance.py | IF003 | t4g.nano NAT with ASG for HA |
| RDS Database | rds.py | IS002 | PostgreSQL (v15.4) for MLflow |
| Secret Rotation | secret_rotation.py | SEC006 | RDS secret rotation (30-day schedule) |
| ECR Repositories | ecr.py | IS004 | 12 repos: 4 services + 8 Lambda images |
| CodeArtifact | codeartifact.py | SR003 | Private Python package repository |

Phase 2: Compute

| Module | File | Task ID | Description |
| --- | --- | --- | --- |
| IAM Roles | iam.py | IC001 | ECS execution role + task role |
| ECS Cluster | ecs.py | IC001, BE007 | Fargate cluster, strategy task definition |
| ALB | alb.py | IC002 | Application Load Balancer, listeners, target groups |
| SQS Queues | sqs.py | IO001 | Backtest queue + DLQ |
| Cognito Auth | cognito.py | DK005 | User pool with MFA, M2M client |
| ECS Services | ecs_services.py | IC003 | 4 services: backend, strategy, data, mlflow |
| Lambda Functions | lambda_funcs.py | IC004 | 8 container-image Lambdas |
| API Gateway | api_gateway.py | IC005 | HTTP API with 11 routes, Cognito auth |
| WAF | waf.py | SEC001 | WebACL with rate limiting |

Phase 3: Orchestration

| Module | File | Task ID | Description |
| --- | --- | --- | --- |
| Step Functions | step_functions.py | IO002, BE008 | Backtest workflow state machine |

Phase 4: Monitoring

| Module | File | Task ID | Description |
| --- | --- | --- | --- |
| CloudWatch Alarms | cloudwatch_alarms.py | MN003, INF007 | Composite alarm, heartbeat detection, Lambda errors |
| CloudWatch Dashboard | cloudwatch_dashboard.py | OB001 | Trading platform metrics dashboard |

Dependency Graph

                              ┌─────────────────┐
                              │ pulumi_backend  │
                              └────────┬────────┘
               ┌───────────────────────┼───────────────────────┐
               │                       │                       │
        ┌──────▼──────┐         ┌──────▼──────┐         ┌──────▼──────┐
        │     vpc     │         │     s3      │         │  dynamodb   │
        └──────┬──────┘         └──────┬──────┘         └──────┬──────┘
               │                       │                       │
    ┌──────────┼──────────┐           │                       │
    │          │          │    ┌──────▼──────┐                │
┌───▼───┐ ┌────▼────┐ ┌───▼───┐│ cloudtrail  │                │
│  vpc  │ │  nacl   │ │  vpc  ││             │                │
│ endpt │ │         │ │ flow  │└─────────────┘                │
└───────┘ └─────────┘ │ logs  │                               │
                      └───────┘                               │
               │                                              │
        ┌──────▼──────┐                                       │
        │security_grps│                                       │
        └──────┬──────┘                                       │
               │                                              │
    ┌──────────┼──────────┬───────────────────┐              │
    │          │          │                   │              │
┌───▼───┐ ┌────▼────┐ ┌───▼───┐         ┌─────▼─────┐        │
│  nat  │ │   rds   │ │  alb  │         │    sns    │────────┤
│ inst  │ └────┬────┘ └───┬───┘         └─────┬─────┘        │
└───────┘      │          │                   │              │
               │          │                   │              │
        ┌──────▼──────┐   │            ┌──────▼──────┐       │
        │secret_rotat │   │            │cw_alarms    │       │
        └─────────────┘   │            └─────────────┘       │
                          │                                  │
               ┌──────────┼──────────┐                       │
               │          │          │                       │
        ┌──────▼──────┐   │   ┌──────▼──────┐               │
        │     iam     │   │   │   cognito   │               │
        └──────┬──────┘   │   └──────┬──────┘               │
               │          │          │                       │
        ┌──────▼──────┐   │          │                       │
        │     ecs     │───┼──────────┤                       │
        └──────┬──────┘   │          │                       │
               │          │          │                       │
    ┌──────────┼──────────┤          │                       │
    │          │          │          │                       │
┌───▼───┐ ┌────▼────┐ ┌───▼───────────▼───┐                 │
│  ecs  │ │ lambda  │ │    api_gateway    │                 │
│ srvcs │ │  funcs  │ └─────────┬─────────┘                 │
└───┬───┘ └────┬────┘           │                           │
    │          │         ┌──────▼──────┐                    │
    │          │         │     waf     │                    │
    │          │         └─────────────┘                    │
    │          │                                            │
    └──────────┼────────────────────────────────────────────┘
        ┌──────▼──────┐
        │step_functs  │
        └─────────────┘
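
A deployment order can be derived from a graph like the one above with a topological sort. The sketch below is illustrative only: the `DEPENDS_ON` mapping is one reading of a subset of the diagram's edges, not the authoritative dependency list, and `deployment_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Each module maps to the set of modules it depends on.
# Assumption: these edges are read off a subset of the diagram above.
DEPENDS_ON = {
    "vpc": {"pulumi_backend"},
    "s3": {"pulumi_backend"},
    "dynamodb": {"pulumi_backend"},
    "cloudtrail": {"s3"},
    "security_groups": {"vpc"},
    "rds": {"security_groups"},
    "alb": {"security_groups"},
    "secret_rotation": {"rds"},
}


def deployment_order(graph):
    """Return module names so that every dependency precedes its dependents."""
    return list(TopologicalSorter(graph).static_order())


order = deployment_order(DEPENDS_ON)
print(" -> ".join(order))
```

`TopologicalSorter` raises `CycleError` if the mapping contains a cycle, which makes it a cheap sanity check when new modules are added.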

Module Details

vpc.py (IF002)

Creates:

- VPC with CIDR 10.0.0.0/16
- 6 subnets across 2 AZs (public, private, database)
- Internet Gateway
- Route tables per tier

Outputs:

vpc_id: str
public_subnet_ids: list[str]
private_subnet_ids: list[str]
database_subnet_ids: list[str]
private_route_table_id: str
database_route_table_id: str

Usage:

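import pulumi
from modules.vpc import VpcNetwork  # assumed import path, based on /infra/modules/vpc.py
vpc = VpcNetwork()  # provisions the VPC, subnets, IGW, and route tables
pulumi.export("vpc_id", vpc.vpc.id)  # expose the VPC ID as a stack output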
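
To make the subnet layout concrete, the following sketch carves six blocks out of 10.0.0.0/16, one per tier and AZ. The /24 subnet size, the allocation order, and the `plan_subnets` helper are assumptions for illustration; the module's actual CIDR assignments are not stated here.

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
TIERS = ["public", "private", "database"]
AZ_COUNT = 2


def plan_subnets(vpc_cidr, tiers, az_count, new_prefix=24):
    """Assign one block per (tier, AZ index), taken in order from the VPC range."""
    blocks = vpc_cidr.subnets(new_prefix=new_prefix)
    return {(tier, az): next(blocks) for tier in tiers for az in range(az_count)}


plan = plan_subnets(VPC_CIDR, TIERS, AZ_COUNT)
for (tier, az), cidr in plan.items():
    print(f"{tier}-{az}: {cidr}")
```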


security_groups.py (IF004)

Creates 5 security groups:

| SG | Ingress | Egress | Purpose |
| --- | --- | --- | --- |
| ALB | 80, 443 from 0.0.0.0/0 | All | Load balancer |
| ECS | From ALB SG | All | Container traffic |
| Lambda | None | All | Function networking |
| RDS | 5432 from ECS SG | All | Database access |
| NAT | From private CIDR | All | Outbound internet |

Outputs:

alb_sg_id: str
ecs_sg_id: str
lambda_sg_id: str
rds_sg_id: str
nat_sg_id: str
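
The ingress rules can also be written down as data, which makes them easy to check in a unit test. This is a sketch, not the module's code: `IngressRule`, `RULES`, and both helpers are hypothetical, and it assumes that rules without a listed port (ECS, NAT) allow all ports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    group: str    # security group receiving the traffic
    ports: tuple  # allowed ports; an empty tuple means all ports (assumption)
    source: str   # CIDR block or name of the source security group


RULES = [
    IngressRule("alb", (80, 443), "0.0.0.0/0"),
    IngressRule("ecs", (), "alb"),
    IngressRule("rds", (5432,), "ecs"),
    IngressRule("nat", (), "private-cidr"),
    # The Lambda group has no ingress rules.
]


def open_to_internet(rules):
    """Return the groups that accept traffic from any address."""
    return sorted({r.group for r in rules if r.source == "0.0.0.0/0"})


def allows(rules, group, port, source):
    """Return True if some rule lets `source` reach `group` on `port`."""
    return any(
        r.group == group and r.source == source and (not r.ports or port in r.ports)
        for r in rules
    )


print(open_to_internet(RULES))
print(allows(RULES, "rds", 5432, "ecs"))
```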


s3.py (IS001)

Creates 5 buckets:

| Bucket | Purpose | Lifecycle |
| --- | --- | --- |
| configs | Strategy configurations | None |
| results | Backtest results | 90-day expiration |
| arcticdb | Time-series data | None |
| logs | Application logs | 30-day expiration |
| mlflow | MLflow artifacts | None |

Features:

- AES-256 encryption
- Versioning enabled
- Public access blocked
- Lifecycle policies
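
As an illustration of the lifecycle column, the sketch below builds an expiration rule in the shape accepted by the S3 `PutBucketLifecycleConfiguration` API. The `lifecycle_configuration` helper and the rule IDs are hypothetical; the real module may define its rules differently.

```python
# Expiration in days per bucket, taken from the table above (None = no lifecycle rule).
EXPIRATION_DAYS = {
    "configs": None,
    "results": 90,
    "arcticdb": None,
    "logs": 30,
    "mlflow": None,
}


def lifecycle_configuration(bucket):
    """Return a lifecycle configuration dict, or None if the bucket has no rule."""
    days = EXPIRATION_DAYS[bucket]
    if days is None:
        return None
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix applies to every object
                "Expiration": {"Days": days},
            }
        ]
    }


print(lifecycle_configuration("results"))
```

Because versioning is enabled, an `Expiration` rule on its own only adds a delete marker to the current version; removing older versions would additionally need a `NoncurrentVersionExpiration` setting.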


dynamodb.py (IS003)

Creates 8 tables:

| Table | Primary Key | Purpose |
| --- | --- | --- |
| workflow-state | job_id | Backtest job tracking |
| idempotency | idempotency_key | Request deduplication |
| health-state | service_id | Service health tracking |
| trading-state | strategy_id | Live trading state |
| deployments | deployment_id | Deployment tracking |
| drift-state | model_id | Model drift tracking |
| retraining-state | model_id | Retraining job state |
| rollback-state | model_id | Model rollback history |

Features:

- On-demand billing (pay per request)
- Point-in-time recovery enabled
- TTL configured where appropriate
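
The table list maps naturally onto DynamoDB `CreateTable` parameters. The following is a minimal sketch under stated assumptions: every key is treated as a string (`"S"`) partition key with no sort key, and `create_table_params` and its `prefix` argument are hypothetical.

```python
# Partition key per table, taken from the table above.
TABLE_KEYS = {
    "workflow-state": "job_id",
    "idempotency": "idempotency_key",
    "health-state": "service_id",
    "trading-state": "strategy_id",
    "deployments": "deployment_id",
    "drift-state": "model_id",
    "retraining-state": "model_id",
    "rollback-state": "model_id",
}


def create_table_params(table, prefix=""):
    """Build CreateTable parameters for an on-demand table with a single hash key."""
    key = TABLE_KEYS[table]
    return {
        "TableName": f"{prefix}{table}",
        "KeySchema": [{"AttributeName": key, "KeyType": "HASH"}],
        # Assumption: all keys are strings.
        "AttributeDefinitions": [{"AttributeName": key, "AttributeType": "S"}],
        "BillingMode": "PAY_PER_REQUEST",
    }


print(create_table_params("workflow-state", prefix="dev-"))
```

Point-in-time recovery and TTL are not part of `CreateTable`; in the AWS API they are enabled through separate calls (`UpdateContinuousBackups` and `UpdateTimeToLive`).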


ecs.py (IC001, BE007)

Creates:

- ECS cluster with Fargate + Fargate Spot capacity
- Generic strategy task definition
- CloudWatch log group

Strategy Task Definition:

- Image: Overridden at runtime via ECSBacktestExecutor
- CPU: 512 (configurable)
- Memory: 1024 (configurable)
- Uses Fargate Spot for cost savings
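
Since CPU and memory are configurable, it can help to validate overrides before deploying, because Fargate only accepts certain CPU/memory pairs. The sketch below covers the sizes up to 4 vCPU (larger sizes exist but are omitted); `is_valid_fargate_size` is a hypothetical helper, not part of the module.

```python
# Supported Fargate memory values (MiB) per CPU value (CPU units), up to 4 vCPU.
FARGATE_MEMORY = {
    256: (512, 1024, 2048),
    512: tuple(range(1024, 4096 + 1, 1024)),
    1024: tuple(range(2048, 8192 + 1, 1024)),
    2048: tuple(range(4096, 16384 + 1, 1024)),
    4096: tuple(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu, memory):
    """Return True if the CPU/memory pair is one Fargate accepts."""
    return memory in FARGATE_MEMORY.get(cpu, ())


print(is_valid_fargate_size(512, 1024))  # the documented default
```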


lambda_funcs.py (IC004)

Creates 8 container-image Lambdas:

| Function | Schedule | Purpose |
| --- | --- | --- |
| health-check | rate(5 minutes) | Service health monitoring |
| heartbeat-check | rate(1 minute) | Trading heartbeat detection |
| orphan-scanner | rate(15 minutes) | Orphaned ECS task cleanup |
| drift-monitor | rate(1 day) | Model drift detection |
| retraining-scheduler | rate(7 days) | Retraining triggers |
| sqs-consumer | SQS trigger | Backtest queue processing |
| validate-strategy | On-demand | Strategy validation |
| data-proxy | On-demand | Data collection proxy |

Features:

- VPC placement in private subnets
- Environment variables from config
- Container images from ECR
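
The schedules above use EventBridge `rate(...)` expressions, which require a singular unit for a value of 1 and a plural unit otherwise. A small validator such as the hypothetical `rate_to_seconds` below can catch a malformed expression before deployment; it is a sketch and not part of the module.

```python
import re

_RATE = re.compile(r"^rate\((\d+) (minute|minutes|hour|hours|day|days)\)$")
_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}


def rate_to_seconds(expression):
    """Convert an EventBridge rate expression to an interval in seconds."""
    match = _RATE.match(expression)
    if match is None:
        raise ValueError(f"not a rate expression: {expression!r}")
    value, unit = int(match.group(1)), match.group(2)
    # Singular unit is required for 1, plural for everything else.
    if value < 1 or (value == 1) == unit.endswith("s"):
        raise ValueError(f"invalid value/unit combination: {expression!r}")
    return value * _SECONDS[unit.rstrip("s")]


for expr in ("rate(5 minutes)", "rate(1 minute)", "rate(7 days)"):
    print(expr, rate_to_seconds(expr))
```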


api_gateway.py (IC005)

Creates:

- HTTP API Gateway
- 11 routes with ALB integration
- Cognito JWT authorizer
- Optional custom domain

Routes:

| Method | Path | Auth | Target |
| --- | --- | --- | --- |
| GET | /health | No | Backend |
| POST | /api/v1/backtests | Yes | Backend |
| GET | /api/v1/backtests | Yes | Backend |
| GET | /api/v1/backtests/{id} | Yes | Backend |
| GET | /api/v1/strategies | Yes | Strategy Service |
| POST | /api/v1/strategies/* | Yes | Strategy Service |
| GET | /api/v1/data/* | Yes | Data Collection |
| POST | /api/v1/hyperopt | Yes | Strategy Service |
| GET | /api/v1/models/* | Yes | Strategy Service |
| POST | /api/v1/models/* | Yes | Strategy Service |
| GET | /api/v1/catalog/* | Yes | Backend |
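
HTTP API routes are identified by a route key of the form `METHOD /path`. The sketch below turns the route table into route keys; it assumes that a trailing `/*` in the table stands for a greedy `{proxy+}` path variable, which is a guess about the module's behavior. `ROUTES` and `route_key` are hypothetical names.

```python
# (method, path, requires_auth, target), taken from the table above.
ROUTES = [
    ("GET", "/health", False, "backend"),
    ("POST", "/api/v1/backtests", True, "backend"),
    ("GET", "/api/v1/backtests", True, "backend"),
    ("GET", "/api/v1/backtests/{id}", True, "backend"),
    ("GET", "/api/v1/strategies", True, "strategy"),
    ("POST", "/api/v1/strategies/*", True, "strategy"),
    ("GET", "/api/v1/data/*", True, "data"),
    ("POST", "/api/v1/hyperopt", True, "strategy"),
    ("GET", "/api/v1/models/*", True, "strategy"),
    ("POST", "/api/v1/models/*", True, "strategy"),
    ("GET", "/api/v1/catalog/*", True, "backend"),
]


def route_key(method, path):
    """Format 'METHOD /path', mapping a trailing /* to a greedy {proxy+} variable."""
    if path.endswith("/*"):
        path = path[:-1] + "{proxy+}"  # assumption: * means "match the rest of the path"
    return f"{method} {path}"


keys = [route_key(method, path) for method, path, _, _ in ROUTES]
public = [route_key(m, p) for m, p, auth, _ in ROUTES if not auth]
print("\n".join(keys))
```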

step_functions.py (IO002, BE008)

Creates:

- Backtest workflow state machine
- IAM execution role

Workflow States:

1. ValidateStrategy - Lambda validation
2. DecideExecutionMode - Choice state
3. RunBacktest - ECS task
4. Notify - Success/failure handling

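Type: STANDARD. Standard workflows can run for up to one year, so they cover backtests of 2+ hours; Express workflows are limited to five minutes.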
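
The four workflow states could be expressed in Amazon States Language roughly as follows. This is a sketch, not the module's definition: the `$.execution_mode` field, the `"ecs"` value, the fallback to `Notify`, the `Catch` clause, and all ARNs passed in are assumptions or placeholders.

```python
import json


def backtest_definition(validate_fn_arn, cluster_arn, task_definition_arn, topic_arn):
    """Build a States Language document for the four documented states."""
    return {
        "StartAt": "ValidateStrategy",
        "States": {
            "ValidateStrategy": {
                "Type": "Task",
                "Resource": validate_fn_arn,
                "Next": "DecideExecutionMode",
            },
            "DecideExecutionMode": {
                "Type": "Choice",
                # Hypothetical input field and value; the real branching rule is not documented.
                "Choices": [
                    {"Variable": "$.execution_mode", "StringEquals": "ecs", "Next": "RunBacktest"}
                ],
                "Default": "Notify",
            },
            "RunBacktest": {
                "Type": "Task",
                # The .sync suffix makes the state wait until the ECS task stops.
                "Resource": "arn:aws:states:::ecs:runTask.sync",
                "Parameters": {"Cluster": cluster_arn, "TaskDefinition": task_definition_arn},
                "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "Notify"}],
                "Next": "Notify",
            },
            "Notify": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sns:publish",
                "Parameters": {"TopicArn": topic_arn, "Message.$": "States.JsonToString($)"},
                "End": True,
            },
        },
    }


def dangling_targets(definition):
    """Return transition targets that do not name a defined state."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        targets.update(t for t in (state.get("Next"), state.get("Default")) if t)
        for branch in state.get("Choices", []) + state.get("Catch", []):
            targets.add(branch["Next"])
    return sorted(targets - set(states))


# Placeholder ARNs for illustration only.
definition = backtest_definition(
    "arn:aws:lambda:us-east-1:123456789012:function:validate-strategy",
    "arn:aws:ecs:us-east-1:123456789012:cluster/example",
    "arn:aws:ecs:us-east-1:123456789012:task-definition/strategy",
    "arn:aws:sns:us-east-1:123456789012:alerts",
)
print(json.dumps(definition, indent=2))
```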


cloudwatch_alarms.py (MN003, INF007)

Creates:

- Composite alarm for service health
- Stale heartbeat alarm
- Lambda error alarms (per function)
- EventBridge rules for Lambda schedules

Configurable Thresholds:

pulumi config set alarm_latency_threshold 5000
pulumi config set alarm_min_strategies 1
pulumi config set alarm_stale_threshold 1
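
A CloudWatch composite alarm combines child alarms through a rule expression such as `ALARM("a") OR ALARM("b")`. The sketch below shows how such a rule string might be assembled; the alarm names and the `composite_alarm_rule` helper are hypothetical, and the module's actual rule is not documented here.

```python
def composite_alarm_rule(alarm_names):
    """Build a rule that fires when any of the named child alarms is in ALARM state."""
    if not alarm_names:
        raise ValueError("at least one child alarm is required")
    return " OR ".join(f'ALARM("{name}")' for name in alarm_names)


# Hypothetical child alarm names.
rule = composite_alarm_rule(["stale-heartbeat", "health-check-errors"])
print(rule)
```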


Environment-Specific Behavior

| Resource | Dev | Staging | Prod |
| --- | --- | --- | --- |
| RDS Instance | db.t4g.micro | db.t4g.small | db.t4g.small |
| RDS Multi-AZ | No | No | Yes |
| NAT | NAT instance | NAT instance | NAT Gateway |
| ECS Replicas | 1 | 1 | 2+ |
| Log Retention | 7 days | 30 days | 90 days |
| Fargate Spot | Yes | Yes | No (for live) |
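
One way to keep these per-environment differences in a single place is a small lookup keyed by stack name. The sketch below mirrors the table; the stack names `dev`, `staging`, and `prod`, the `EnvironmentSettings` class, and `settings_for` are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentSettings:
    rds_instance_class: str
    rds_multi_az: bool
    nat: str                 # "instance" or "gateway"
    ecs_min_replicas: int
    log_retention_days: int
    fargate_spot: bool


SETTINGS = {
    "dev": EnvironmentSettings("db.t4g.micro", False, "instance", 1, 7, True),
    "staging": EnvironmentSettings("db.t4g.small", False, "instance", 1, 30, True),
    # Prod runs at least 2 replicas and avoids Spot for live workloads.
    "prod": EnvironmentSettings("db.t4g.small", True, "gateway", 2, 90, False),
}


def settings_for(stack):
    """Return the settings for a stack name, failing loudly on unknown stacks."""
    try:
        return SETTINGS[stack]
    except KeyError:
        raise ValueError(f"unknown stack: {stack!r}") from None


print(settings_for("prod"))
```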

Outputs Quick Reference

# Core
pulumi stack output vpc_id
pulumi stack output ecs_cluster_name
pulumi stack output api_gateway_endpoint

# Database
pulumi stack output rds_endpoint
pulumi stack output rds_secret_arn

# Storage
pulumi stack output s3_bucket_ids
pulumi stack output ecr_repository_urls

# Auth
pulumi stack output cognito_user_pool_id
pulumi stack output cognito_user_pool_client_id

# Monitoring
pulumi stack output composite_alarm_arn
pulumi stack output dashboard_url
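
For scripts that need several outputs at once, `pulumi stack output --json` returns all of them as a single JSON object. The helpers below are a hypothetical sketch of reading and checking that object; they assume the `pulumi` CLI is on the PATH and a stack is selected.

```python
import json
import subprocess


def parse_stack_outputs(json_text, required=()):
    """Parse the JSON printed by `pulumi stack output --json` and check required keys."""
    outputs = json.loads(json_text)
    missing = [name for name in required if name not in outputs]
    if missing:
        raise KeyError(f"missing stack outputs: {', '.join(missing)}")
    return outputs


def read_stack_outputs(required=()):
    """Run the Pulumi CLI in the current directory and return its outputs as a dict."""
    result = subprocess.run(
        ["pulumi", "stack", "output", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_stack_outputs(result.stdout, required)


# Example with a canned payload instead of a live stack (values are placeholders).
sample = '{"vpc_id": "vpc-0123", "ecs_cluster_name": "example"}'
print(parse_stack_outputs(sample, required=("vpc_id",)))
```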