orphan-scanner

Scans for orphaned ECS tasks that have been running too long or have no state record.

Overview

| Property | Value |
| --- | --- |
| Trigger | EventBridge scheduled event |
| Runtime | Python 3.11 |
| Timeout | 300 seconds |
| Memory | 256 MB |

Input Schema

{
  "dry_run": true,            # Default: true (safety default)
  "max_runtime_hours": 6      # Default: 6
}
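
As a rough illustration of how the input fields and their defaults might be resolved, here is a minimal sketch. The helper names `parse_bool` and `resolve_settings` are hypothetical, and the precedence (event input first, then the `DRY_RUN` and `MAX_TASK_RUNTIME_HOURS` environment variables, then the documented defaults) is an assumption rather than confirmed behavior of the function.

```python
import os
from typing import Any, Mapping


def parse_bool(value: Any, default: bool = True) -> bool:
    """Interpret JSON booleans and strings such as "true" or "false".

    Only an explicit false-like value turns the flag off, so a typo in
    the configuration falls back to the safe (dry-run) behavior.
    """
    if isinstance(value, bool):
        return value
    if value is None:
        return default
    return str(value).strip().lower() not in ("0", "false", "no")


def resolve_settings(event: Mapping[str, Any],
                     env: Mapping[str, str] = os.environ) -> dict:
    """Merge event input with environment defaults.

    Assumed precedence (not confirmed by the source): event, then env,
    then the documented defaults (dry_run=true, max_runtime_hours=6).
    """
    dry_run = parse_bool(event.get("dry_run", env.get("DRY_RUN")), default=True)
    max_hours = float(
        event.get("max_runtime_hours", env.get("MAX_TASK_RUNTIME_HOURS", 6))
    )
    return {"dry_run": dry_run, "max_runtime_hours": max_hours}
```

Treating anything other than an explicit false value as dry-run keeps the safety default intact when the input is malformed.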

Output Schema

{
  "summary": {
    "total_running": 15,
    "orphans_found": 2,
    "tasks_stopped": 2,       # 0 if dry_run
    "dry_run": false
  },
  "orphans": [
    {
      "task_arn": "arn:aws:ecs:...:task/abc123",
      "reason": "Running longer than 6 hours",
      "runtime_hours": 8.5
    },
    {
      "task_arn": "arn:aws:ecs:...:task/def456",
      "reason": "No matching DynamoDB state record"
    }
  ]
}

Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| ECS_CLUSTER | Yes | - | ECS cluster name/ARN |
| RETRAINING_STATE_TABLE | Yes | - | Retraining state table |
| TRADING_STATE_TABLE | Yes | - | Trading state table |
| MAX_TASK_RUNTIME_HOURS | No | 6 | Max expected runtime |
| DRY_RUN | No | "true" | Safety default |

Orphan Detection Criteria

A task is considered orphaned if:

- Runtime exceeded: Running longer than max_runtime_hours
- No state record: No matching entry in DynamoDB
- Stale state: State shows "completed"/"failed" but task still running
- Mismatched ARN: State record points to different task ARN

CloudWatch Metrics

| Metric | Description |
| --- | --- |
| RunningTasksScanned | Total tasks scanned |
| OrphanTasksFound | Orphaned tasks detected |
| OrphanTasksStopped | Tasks actually stopped |

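
The four orphan detection criteria could be expressed as a single check along the following lines. This is a sketch under stated assumptions, not the function's actual code: the name `orphan_reason`, the state record fields `status` and `task_arn`, and the order in which the criteria are evaluated are all hypothetical. The first two reason strings follow the examples in the output schema; the last two are illustrative wording.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# States that mean the job is done (taken from the "Stale state" criterion).
FINISHED_STATES = {"completed", "failed"}


def orphan_reason(task_arn: str,
                  started_at: datetime,
                  state_record: Optional[dict],
                  max_runtime_hours: float,
                  now: Optional[datetime] = None) -> Optional[str]:
    """Return a reason string if the task looks orphaned, else None."""
    now = now or datetime.now(timezone.utc)
    runtime_hours = (now - started_at).total_seconds() / 3600

    # 1. Runtime exceeded
    if runtime_hours > max_runtime_hours:
        return f"Running longer than {max_runtime_hours:g} hours"
    # 2. No state record
    if state_record is None:
        return "No matching DynamoDB state record"
    # 3. Stale state ("status" is an assumed field name)
    if state_record.get("status") in FINISHED_STATES:
        return f"State is {state_record['status']!r} but task is still running"
    # 4. Mismatched ARN ("task_arn" is an assumed field name)
    if state_record.get("task_arn") != task_arn:
        return "State record points to a different task ARN"
    return None
```

Passing `now` explicitly makes the runtime check deterministic in tests.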
Key Features

- Uses pagination for large task lists (100 tasks per batch)
- Checks both retraining and trading state tables
- Dry-run mode for safe inspection
- Detailed alert with orphan reasons

EventBridge Schedule

{
  "ScheduleExpression": "rate(6 hours)",
  "Targets": [{
    "Arn": "arn:aws:lambda:...:orphan-scanner",
    "Input": "{\"dry_run\": false, \"max_runtime_hours\": 6}"
  }]
}
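
The batching noted under Key Features (100 tasks per batch) might be implemented roughly as follows. This sketch assumes a boto3 ECS client is passed in and uses its `list_tasks` paginator together with `describe_tasks`, which accepts up to 100 task ARNs per call. The function name `iter_running_tasks` is hypothetical, and the actual scanner may structure this differently.

```python
from typing import Any, Iterator


def iter_running_tasks(ecs_client: Any, cluster: str) -> Iterator[dict]:
    """Yield a task description for every RUNNING task in the cluster.

    ecs_client is expected to behave like boto3.client("ecs").
    """
    paginator = ecs_client.get_paginator("list_tasks")
    pages = paginator.paginate(
        cluster=cluster,
        desiredStatus="RUNNING",
        # One page of ARNs fits into a single describe_tasks call.
        PaginationConfig={"PageSize": 100},
    )
    for page in pages:
        task_arns = page.get("taskArns", [])
        if not task_arns:
            continue  # describe_tasks rejects an empty task list
        described = ecs_client.describe_tasks(cluster=cluster, tasks=task_arns)
        yield from described.get("tasks", [])
```

Injecting the client keeps the function easy to exercise with a stub, without AWS credentials.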

Alert Format

{
  "subject": "Orphaned ECS Tasks Found",
  "message": {
    "total_running": 15,
    "orphans_found": 2,
    "tasks_stopped": 2,
    "orphans": [
      {
        "task_arn": "...",
        "reason": "Running longer than 6 hours",
        "runtime_hours": 8.5
      }
    ]
  }
}
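
The alert above carries the same counts and orphan entries as the output schema, so it could plausibly be derived from the scan result as sketched below. The function name `build_alert` is hypothetical, and skipping the alert when no orphans are found is an assumption; the source does not say how or when the alert is sent.

```python
from typing import Optional


def build_alert(result: dict) -> Optional[dict]:
    """Build the alert payload from a scan result in the output schema shape.

    Returns None when there is nothing to report (assumed behavior).
    """
    summary = result["summary"]
    if summary["orphans_found"] == 0:
        return None
    return {
        "subject": "Orphaned ECS Tasks Found",
        "message": {
            "total_running": summary["total_running"],
            "orphans_found": summary["orphans_found"],
            "tasks_stopped": summary["tasks_stopped"],
            "orphans": result["orphans"],
        },
    }
```

Usage, with values taken from the examples in this document:

```python
result = {
    "summary": {"total_running": 15, "orphans_found": 1,
                "tasks_stopped": 0, "dry_run": True},
    "orphans": [{"task_arn": "...", "reason": "Running longer than 6 hours",
                 "runtime_hours": 8.5}],
}
alert = build_alert(result)
```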