Lambda Functions¶
Serverless functions for TradAI platform orchestration, monitoring, and automation.
Overview¶
TradAI uses AWS Lambda for event-driven processing, scheduled tasks, and Step Functions workflow integration. All Lambdas use the lambda_handler decorator from tradai-common for consistent error handling and metrics.

    from tradai.common.lambda_ import lambda_handler, LambdaResponse

    @lambda_handler
    def handler(event: dict, context) -> dict:
        # Business logic here
        return LambdaResponse.success({"result": "data"})

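
To illustrate the kind of behaviour a decorator like this can centralise, here is a minimal, hypothetical sketch. It is not the tradai-common implementation: `sketch_lambda_handler`, the error-to-status mapping, and the logged duration are all assumptions, loosely modelled on the response format described under Response Format.

```python
import functools
import logging
import time
from typing import Any, Callable

logger = logging.getLogger(__name__)

Handler = Callable[[dict, Any], dict]


def sketch_lambda_handler(func: Handler) -> Handler:
    """Hypothetical stand-in for lambda_handler (not the real decorator)."""

    @functools.wraps(func)
    def wrapper(event: dict, context: Any) -> dict:
        started = time.perf_counter()
        try:
            return func(event, context)
        except ValueError as exc:
            # Assumption: invalid input maps to a 400-style error body.
            return {"statusCode": 400, "body": {"error": str(exc)}}
        except Exception:
            # Assumption: unexpected errors are logged and reported as a 500.
            logger.exception("Unhandled error in %s", func.__name__)
            return {"statusCode": 500, "body": {"error": "Internal error"}}
        finally:
            # Runs on every path, so the duration is always recorded.
            elapsed_ms = (time.perf_counter() - started) * 1000
            logger.info("%s finished in %.1f ms", func.__name__, elapsed_ms)

    return wrapper


@sketch_lambda_handler
def example_handler(event: dict, context: Any) -> dict:
    # Hypothetical business logic: require a "symbol" key in the event.
    if "symbol" not in event:
        raise ValueError("Missing 'symbol'")
    return {"statusCode": 200, "body": {"symbol": event["symbol"]}}
```

Wrapping every handler this way keeps the business logic free of try/except boilerplate, which is presumably the motivation for sharing one decorator across all Lambdas.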
Lambda Categories¶
Workflow Orchestration¶
| Lambda | Trigger | Purpose |
|---|---|---|
| backtest-consumer | SQS | Launches ECS backtesting tasks from queue |
| sqs-consumer | SQS | Launches ECS retraining tasks from queue |
| cleanup-resources | Step Functions | Cleans up orphaned ECS tasks on workflow failure |
Model Lifecycle¶
| Lambda | Trigger | Purpose |
|---|---|---|
| check-retraining-needed | Step Functions | Evaluates if model needs retraining |
| compare-models | Step Functions | Compares champion vs challenger models |
| promote-model | Step Functions | Promotes model to Production in MLflow |
| model-rollback | CloudWatch Alarm | Rolls back model to previous version |
| retraining-scheduler | EventBridge | Schedules and triggers model retraining |
Monitoring & Health¶
| Lambda | Trigger | Purpose |
|---|---|---|
| health-check | EventBridge | Checks ECS service health via Service Discovery |
| trading-heartbeat-check | EventBridge | Monitors live trading container heartbeats |
| drift-monitor | EventBridge | Detects model drift using PSI metrics |
| orphan-scanner | EventBridge | Finds and stops orphaned ECS tasks |
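
The drift-monitor row above mentions PSI (Population Stability Index). As a rough sketch of how such a score can be computed, the function below uses the standard formula with equal-width bins derived from the reference sample. The function name, the binning scheme, and the `eps` floor are assumptions for illustration; the actual drift-monitor implementation and its thresholds are not described on this page.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI = sum((a_i - e_i) * ln(a_i / e_i)) over equal-width bins."""
    lo, hi = min(expected), max(expected)
    # Fall back to a width of 1.0 when every reference value is identical.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # The eps floor keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the cut-off is a tuning choice rather than a fixed standard.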
Deployment & Validation¶
| Lambda | Trigger | Purpose |
|---|---|---|
| validate-strategy | Step Functions | Validates strategy config before deployment |
| notify-completion | Step Functions | Sends SNS/Slack notifications |
Data Services¶
| Lambda | Trigger | Purpose |
|---|---|---|
| data-collection-proxy | API/Step Functions | Proxy to data-collection service |
Common Patterns¶
Environment Variables¶
All Lambdas use these common environment variables:
| Variable | Description |
|---|---|
| `ENVIRONMENT` | Environment name (dev/staging/prod) |
| `DYNAMODB_TABLE_NAME` | Default state repository table |
| `ALERT_SNS_TOPIC_ARN` | SNS topic for alerts |
| `ECS_CLUSTER` | ECS cluster name/ARN |
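
One way to consume these variables is to read them once at start-up and fail fast if any are missing. The sketch below assumes all four are mandatory; `LambdaSettings` and `load_settings` are hypothetical names, not part of tradai-common.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_VARIABLES = (
    "ENVIRONMENT",
    "DYNAMODB_TABLE_NAME",
    "ALERT_SNS_TOPIC_ARN",
    "ECS_CLUSTER",
)


@dataclass(frozen=True)
class LambdaSettings:
    """Hypothetical container for the common environment variables."""

    environment: str
    dynamodb_table_name: str
    alert_sns_topic_arn: str
    ecs_cluster: str


def load_settings(env: Mapping[str, str] = os.environ) -> LambdaSettings:
    # Collect every missing or empty variable so the error names them all at once.
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return LambdaSettings(
        environment=env["ENVIRONMENT"],
        dynamodb_table_name=env["DYNAMODB_TABLE_NAME"],
        alert_sns_topic_arn=env["ALERT_SNS_TOPIC_ARN"],
        ecs_cluster=env["ECS_CLUSTER"],
    )
```

Passing the mapping as a parameter makes the loader easy to exercise in unit tests without touching the real process environment.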
Response Format¶
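Lambdas return responses in the LambdaResponse format (Lambdas in Step Functions workflows are the exception; see Step Functions Integration):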

    # Success
    {
        "statusCode": 200,
        "body": {"result": "data"}
    }

    # Error
    {
        "statusCode": 400,
        "body": {"error": "Error message"}
    }

Step Functions Integration¶
Lambdas in Step Functions workflows return data directly (not wrapped in statusCode):

    # For Step Functions
    return LambdaResponse.success({"decision": "PROMOTE"})
    # Returns: {"decision": "PROMOTE"}

SQS Batch Processing¶
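SQS-triggered Lambdas support partial batch failure: when some messages in a batch fail, only those messages are returned to the queue for retry, and the rest are treated as processed.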
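
A minimal sketch of the pattern follows. It relies on the AWS partial batch response format (`batchItemFailures` with `itemIdentifier` entries), which takes effect only when `ReportBatchItemFailures` is enabled on the event source mapping. The helper names `process_batch` and `launch_task`, and the JSON message shape, are assumptions for illustration.

```python
import json
from typing import Callable


def process_batch(event: dict, process_record: Callable[[dict], None]) -> dict:
    """Process each SQS record and report only the failed message IDs."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_record(json.loads(record["body"]))
        except Exception:
            # Messages listed here become visible on the queue again;
            # everything else in the batch is deleted as successfully processed.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


def launch_task(message: dict) -> None:
    # Hypothetical processor: a real consumer would start an ECS task here.
    if "strategy" not in message:
        raise ValueError("Message has no 'strategy' field")
```

Returning an empty `batchItemFailures` list signals that the whole batch succeeded, so one bad message no longer forces a retry of its neighbours.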
Deployment¶
Lambdas are deployed via Pulumi in infra/modules/lambda_funcs.py:
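
    import pulumi_aws as aws

    # Example: Creating a Lambda function
    # (name, environment, and state_table are defined elsewhere in the module;
    # the required role and the code source arguments are omitted here)
    lambda_func = aws.lambda_.Function(
        f"{name}-lambda",
        runtime="python3.11",
        handler="handler.handler",  # handler() in handler.py
        timeout=300,  # seconds
        memory_size=256,  # MB
        environment=aws.lambda_.FunctionEnvironmentArgs(
            variables={
                "ENVIRONMENT": environment,
                "DYNAMODB_TABLE_NAME": state_table.name,
            }
        ),
    )
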
IAM Permissions¶
Each Lambda has specific IAM permissions defined in infra/modules/iam.py. Common patterns:
- ECS Lambdas: `ecs:RunTask`, `ecs:DescribeTasks`, `ecs:ListTasks`, `ecs:StopTask`
- DynamoDB Lambdas: `dynamodb:GetItem`, `dynamodb:PutItem`, `dynamodb:Query`, `dynamodb:Scan`
- SNS Lambdas: `sns:Publish`
- CloudWatch Lambdas: `cloudwatch:PutMetricData`
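
As an illustration of how these action groups translate into a policy, the sketch below builds a standard IAM policy document for one group. `COMMON_ACTIONS` and `build_policy_document` are hypothetical helpers, the resource ARNs are placeholders supplied by the caller, and the real definitions in infra/modules/iam.py may be structured differently.

```python
import json

# Action groups copied from the list above.
COMMON_ACTIONS = {
    "ecs": ["ecs:RunTask", "ecs:DescribeTasks", "ecs:ListTasks", "ecs:StopTask"],
    "dynamodb": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:Query",
        "dynamodb:Scan",
    ],
    "sns": ["sns:Publish"],
    "cloudwatch": ["cloudwatch:PutMetricData"],
}


def build_policy_document(group: str, resources: list[str]) -> str:
    """Return a JSON policy allowing one action group on the given resources."""
    statement = {
        "Effect": "Allow",
        "Action": COMMON_ACTIONS[group],
        "Resource": resources,
    }
    return json.dumps({"Version": "2012-10-17", "Statement": [statement]})
```

Scoping `Resource` to specific ARNs keeps each Lambda close to least privilege. Some actions, such as `cloudwatch:PutMetricData`, do not support resource-level restrictions and need `"*"` instead.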
Monitoring¶
CloudWatch Metrics¶
All Lambdas publish custom metrics to the CloudWatch namespace TradAI/Lambda:
| Metric | Description |
|---|---|
| `Invocations` | Total invocations |
| `Errors` | Error count |
| `Duration` | Execution time (ms) |
| `ConcurrentExecutions` | Concurrent executions |
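
The sketch below shows how a single `Duration` data point could be published to that namespace. The call shape matches boto3's CloudWatch `put_metric_data`, but the `FunctionName` dimension, the helper name `publish_duration`, and the `RecordingClient` test double are assumptions; in production the client would be `boto3.client("cloudwatch")`.

```python
from typing import Any, Protocol


class CloudWatchLike(Protocol):
    """Minimal interface needed from a CloudWatch client."""

    def put_metric_data(self, *, Namespace: str, MetricData: list) -> Any: ...


class RecordingClient:
    """Test double that records calls instead of contacting AWS."""

    def __init__(self) -> None:
        self.calls: list[dict] = []

    def put_metric_data(self, *, Namespace: str, MetricData: list) -> None:
        self.calls.append({"Namespace": Namespace, "MetricData": MetricData})


def publish_duration(
    client: CloudWatchLike,
    function_name: str,
    duration_ms: float,
    namespace: str = "TradAI/Lambda",
) -> None:
    # Assumption: metrics are dimensioned by function name.
    client.put_metric_data(
        Namespace=namespace,
        MetricData=[
            {
                "MetricName": "Duration",
                "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
                "Value": duration_ms,
                "Unit": "Milliseconds",
            }
        ],
    )
```

Injecting the client keeps the publishing logic testable without AWS credentials.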
CloudWatch Alarms¶
Critical Lambdas have CloudWatch alarms configured:
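
    import pulumi_aws as aws

    # Example alarm for health-check failures
    # Fires when the error sum exceeds 1 in 3 consecutive 5-minute periods
    aws.cloudwatch.MetricAlarm(
        "health-check-alarm",
        comparison_operator="GreaterThanThreshold",
        evaluation_periods=3,
        metric_name="Errors",
        namespace="AWS/Lambda",  # built-in Lambda metrics, not TradAI/Lambda
        period=300,  # seconds
        statistic="Sum",
        threshold=1,
    )
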
See Also¶
Architecture:
- Architecture Overview - System diagrams including Lambda infrastructure
- Step Functions - Workflow orchestration details
- Services - ECS service definitions
- ML Lifecycle - Model training and drift detection
SDK Reference:
- tradai-common - Lambda decorators and utilities
- tradai-strategy - Strategy validation
Services:
- Backend Service - Backtest submission
- Strategy Service - Model registry integration
CLI:
- CLI Reference - Command-line tools
Infrastructure:
- Pulumi Code - Lambda deployment code