promote-model¶
Promotes a challenger model version to the Production stage in MLflow and archives the previous champion.
Overview¶
| Property | Value |
|---|---|
| Trigger | Step Functions / Direct |
| Runtime | Python 3.11 |
| Timeout | 60 seconds |
| Memory | 256 MB |
| Settings class | ModelManagementSettings |
Input Schema¶
The model version comes from the Step Functions event payload (not from an environment variable):
{
"model_name": "PascalStrategy", # Required
"new_version": "3", # Optional: version to promote
"confidence": 0.85 # Optional: comparison confidence score (default: 0.0)
}
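A minimal sketch of how a handler might validate this payload. The `PromotionRequest` dataclass and `parse_event` helper are hypothetical names used for illustration and are not taken from the handler's source. Only `model_name` is required, and `confidence` falls back to the documented default of 0.0.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class PromotionRequest:
    """Hypothetical container for the documented event fields."""

    model_name: str
    new_version: Optional[str] = None
    confidence: float = 0.0


def parse_event(event: dict[str, Any]) -> Optional[PromotionRequest]:
    """Return a PromotionRequest, or None when model_name is missing or empty."""
    model_name = event.get("model_name")
    if not model_name:
        return None
    new_version = event.get("new_version")
    return PromotionRequest(
        model_name=str(model_name),
        # The schema documents versions as strings; coerce ints defensively.
        new_version=str(new_version) if new_version is not None else None,
        confidence=float(event.get("confidence", 0.0)),
    )
```

Returning `None` for a missing `model_name` lets the caller map it to a `promoted: false` response, which matches the first branch of the promotion flowchart.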
Version Resolution¶
- If `new_version` is provided in the event, that version is promoted directly.
- If `new_version` is not provided, the handler finds the latest `Staging` version from MLflow's `registered_model.latest_versions` and promotes that.
- If no Staging version exists and no version is specified, the handler returns `{"promoted": false, "reason": "No Staging version found"}`.
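The three rules above can be sketched as a single function. This is an illustration under assumptions: `resolve_version` is a hypothetical helper, and the entries of `latest_versions` are assumed to carry the `version` and `current_stage` attributes that MLflow's `ModelVersion` objects expose. The `SimpleNamespace` objects are stand-ins so that the example runs without an MLflow server.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def resolve_version(new_version: Optional[str], latest_versions: Iterable[Any]) -> Optional[str]:
    """Pick the version to promote, or None if nothing is eligible."""
    if new_version:
        # Rule 1: an explicit version always wins.
        return str(new_version)
    staging = [v for v in latest_versions if v.current_stage == "Staging"]
    if not staging:
        # Rule 3: no explicit version and nothing in Staging.
        return None
    # Rule 2: promote the newest Staging version.
    return str(max(int(v.version) for v in staging))


# Stand-ins for MLflow ModelVersion objects (illustrative only).
example_versions = [
    SimpleNamespace(version="2", current_stage="Production"),
    SimpleNamespace(version="3", current_stage="Staging"),
]

resolved = resolve_version(None, example_versions)
if resolved is None:
    result = {"promoted": False, "reason": "No Staging version found"}
else:
    result = {"resolved_version": resolved}
```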
Output Schema¶
Success¶
{
"promoted": true,
"new_version": "3",
"new_stage": "Production",
"archived_version": "2" # Previous Production version, or null
}
Idempotent (already promoted)¶
{
"promoted": true,
"new_version": "3",
"new_stage": "Production",
"archived_version": null,
"already_promoted": true
}
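Failure¶
{
"promoted": false,
"reason": "No Staging version found" # Example: no new_version given and no Staging version exists
}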
Environment Variables¶
| Variable | Required | Default | Description |
|---|---|---|---|
| MLFLOW_TRACKING_URI | Yes | - | MLflow server URL |
| MLFLOW_TRACKING_USERNAME | No | - | MLflow basic auth username |
| MLFLOW_TRACKING_PASSWORD | No | - | MLflow basic auth password |
| MODEL_REGISTRY_NAME | No | tradai-models | MLflow model registry name |
| ENVIRONMENT | No | dev | Environment name |
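As a rough illustration of the table above, the variables could be read as follows. `PromoteModelEnv` and `load_env` are hypothetical names; the real `ModelManagementSettings` class may load and validate these values differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PromoteModelEnv:
    """Hypothetical stand-in for the values listed in the table."""

    mlflow_tracking_uri: str
    mlflow_tracking_username: Optional[str]
    mlflow_tracking_password: Optional[str]
    model_registry_name: str
    environment: str


def load_env(env: Mapping[str, str] = os.environ) -> PromoteModelEnv:
    """Read the documented variables, failing fast if the required one is absent."""
    uri = env.get("MLFLOW_TRACKING_URI")
    if not uri:
        raise RuntimeError("MLFLOW_TRACKING_URI is required")
    return PromoteModelEnv(
        mlflow_tracking_uri=uri,
        mlflow_tracking_username=env.get("MLFLOW_TRACKING_USERNAME"),
        mlflow_tracking_password=env.get("MLFLOW_TRACKING_PASSWORD"),
        # Defaults taken from the table above.
        model_registry_name=env.get("MODEL_REGISTRY_NAME", "tradai-models"),
        environment=env.get("ENVIRONMENT", "dev"),
    )
```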
Promotion Process¶
flowchart TD
A[Promotion Request] --> A1{model_name provided?}
A1 -->|No| A2[Return promoted: false]
A1 -->|Yes| B{new_version specified?}
B -->|Yes| C[Use specified version]
B -->|No| D[Find latest Staging version]
D -->|Found| C
D -->|Not found| E[Return promoted: false]
C --> F{Already Production?}
F -->|Yes| F1[Return already_promoted: true]
F -->|No| G[Atomic transition to Production]
G --> H[archive_existing_versions=True]
H --> I[Send SNS notification]
I --> J[Return promoted: true]
Key Features¶
- Atomic promotion: Uses `archive_existing_versions=True` in the MLflow `transition_model_version_stage` call to archive old Production versions in the same API call, avoiding partial failure states on retry
- Idempotency: If the requested version is already in Production, returns success with `already_promoted: true` without making any changes
- Fallback to Staging: When no `new_version` is specified, automatically finds and promotes the latest Staging version
- Uses `MLflowAdapter`: Lazy-imported from `tradai.common.mlflow` to reduce cold start overhead
- Uses `ModelStage` enum: From `tradai.common.entities.mlflow` for stage values
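The idempotency check and the atomic transition might look roughly like this. The sketch bypasses `MLflowAdapter`, whose interface is not shown here, and instead assumes a `client` object exposing the `MlflowClient` methods `get_model_version`, `get_latest_versions`, and `transition_model_version_stage`. The `promote` function name and the plain string stage values are illustrative; the handler itself uses the `ModelStage` enum.

```python
from typing import Any, Optional


def promote(client: Any, model_name: str, version: str) -> dict[str, Any]:
    """Promote `version` to Production, archiving the old champion in one call.

    `client` is assumed to behave like mlflow.tracking.MlflowClient.
    """
    current = client.get_model_version(model_name, version)
    if current.current_stage == "Production":
        # Idempotent path: the version is already live, so change nothing.
        return {
            "promoted": True,
            "new_version": version,
            "new_stage": "Production",
            "archived_version": None,
            "already_promoted": True,
        }

    # Look up the current champion first so it can be reported afterwards.
    previous = client.get_latest_versions(model_name, stages=["Production"])
    archived: Optional[str] = str(previous[0].version) if previous else None

    # One API call moves the challenger in and archives the old champion,
    # so a retry never observes two Production versions.
    client.transition_model_version_stage(
        name=model_name,
        version=version,
        stage="Production",
        archive_existing_versions=True,
    )
    return {
        "promoted": True,
        "new_version": version,
        "new_stage": "Production",
        "archived_version": archived,
    }
```

Because the already-promoted check runs before any write, a Step Functions retry after a timeout returns the same success shape without touching the registry a second time.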
Step Functions Integration¶
{
  "PromoteModel": {
    "Type": "Task",
    "Resource": "arn:aws:lambda:...:promote-model",
    "Parameters": {
      "model_name.$": "$.model_name",
      "new_version.$": "$.challenger_version",
      "confidence.$": "$.comparison.confidence"
    },
    "Next": "NotifySuccess"
  }
}
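The overview also lists direct invocation as a trigger. A minimal sketch of that path, assuming `lambda_client` is a boto3 Lambda client (`boto3.client("lambda")`) and that the caller supplies the deployed function name or ARN. `build_event` and `invoke_promote_model` are hypothetical helpers.

```python
import json
from typing import Any, Optional


def build_event(model_name: str, new_version: Optional[str] = None,
                confidence: float = 0.0) -> dict[str, Any]:
    """Build the same payload that the Step Functions Parameters block produces."""
    event: dict[str, Any] = {"model_name": model_name, "confidence": confidence}
    if new_version is not None:
        # Omit the key entirely to trigger the Staging fallback.
        event["new_version"] = new_version
    return event


def invoke_promote_model(lambda_client: Any, function_name: str,
                         event: dict[str, Any]) -> dict[str, Any]:
    """Invoke the function synchronously and decode its JSON response."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(event).encode("utf-8"),
    )
    return json.loads(response["Payload"].read())
```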
SNS Notification Format¶
Sent via ctx.alert_publisher when promotion succeeds (and alerts are enabled):
Model Promotion Completed
Model: PascalStrategy
New Production Version: 3
Previous Version: 2
Comparison Confidence: 85.0%
Environment: prod
Subject: [TradAI] Model Promoted: {model_name}
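A sketch of how the subject and body above could be assembled. `format_promotion_alert` is a hypothetical helper, and the way a missing previous version is rendered (`None` here) is an assumption; the interface of `ctx.alert_publisher` is not shown in this page, so publishing is left out.

```python
from typing import Optional


def format_promotion_alert(model_name: str, new_version: str,
                           previous_version: Optional[str],
                           confidence: float, environment: str) -> tuple[str, str]:
    """Return (subject, body) in the documented notification layout."""
    subject = f"[TradAI] Model Promoted: {model_name}"
    body = "\n".join([
        "Model Promotion Completed",
        f"Model: {model_name}",
        f"New Production Version: {new_version}",
        # Assumption: a first-ever promotion has no previous version to report.
        f"Previous Version: {previous_version if previous_version is not None else 'None'}",
        # The event carries confidence as a fraction; render it as a percentage.
        f"Comparison Confidence: {confidence:.1%}",
        f"Environment: {environment}",
    ])
    return subject, body
```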
Related¶
- compare-models - Comparison before promotion
- model-rollback - Reverse promotion
- check-retraining-needed - Triggers retraining flow