Security Incident Response Runbook

Procedures for handling security incidents including credential exposure, unauthorized access, and security policy violations.

Severity Classification

Level	Description	Response Time	Examples
P1	Active breach, data exposure	Immediate (<15 min)	Compromised AWS creds, unauthorized trading
P2	Suspected compromise	< 1 hour	Suspicious API activity, failed auth spike
P3	Policy violation	< 4 hours	Exposed secrets in logs, missing encryption

API Key/Secret Exposure

Symptoms

Secrets found in logs, code commits, or public repositories
Unexpected API calls from unknown sources
Exchange reporting unusual activity

Immediate Actions (P1)

Identify exposed credentials:

# Check what was exposed
# - Exchange API keys (Binance, etc.)
# - AWS credentials
# - Database passwords
# - JWT signing keys

Rotate exchange API keys immediately:

# 1. Create new API key on exchange (manually via exchange UI)
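# 2. Update Secrets Manager (an inline --secret-string lands in shell history;
#    passing file://<path-to-json> instead avoids that)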
aws secretsmanager put-secret-value \
  --secret-id tradai/${ENVIRONMENT}/exchange-keys \
  --secret-string '{"api_key":"NEW_KEY","api_secret":"NEW_SECRET"}'

# 3. Restart services to pick up new credentials
aws ecs update-service \
  --cluster tradai-${ENVIRONMENT} \
  --service tradai-strategy-service-${ENVIRONMENT} \
  --force-new-deployment
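
`update-service --force-new-deployment` returns once the deployment is requested, not once the new tasks are running with the new credentials. The sketch below is one possible way to wait for the rollout before revoking the old exchange keys. `restart_service_and_wait` is a hypothetical helper name, not part of the existing tooling. `aws ecs wait services-stable` polls until the service reaches a steady state or gives up.

```shell
# Hypothetical helper: force a new deployment, then block until ECS reports
# the service as stable. Returns non-zero if either step fails.
restart_service_and_wait() {
  cluster="$1"
  service="$2"
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --force-new-deployment \
    --query 'service.serviceName' --output text &&
    aws ecs wait services-stable \
      --cluster "$cluster" \
      --services "$service" &&
    echo "stable: $service"
}

# Usage (names follow the convention used in this runbook):
#   restart_service_and_wait "tradai-${ENVIRONMENT}" "tradai-strategy-service-${ENVIRONMENT}"
```

A non-zero exit means the rollout did not settle; in that case the old keys are probably still in use by running tasks.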

Revoke old exchange keys (via exchange UI)

Check for unauthorized activity:

# Review exchange order history for unauthorized trades
# Review API call logs for suspicious patterns

AWS Credential Rotation

If AWS credentials are exposed:

Disable compromised credentials:

# For IAM user access keys
aws iam update-access-key \
  --user-name $USER_NAME \
  --access-key-id $COMPROMISED_KEY_ID \
  --status Inactive

# For IAM user console access (removes the console password)
aws iam delete-login-profile --user-name $USER_NAME

Create new credentials:

aws iam create-access-key --user-name $USER_NAME

Review CloudTrail for unauthorized actions:

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=AccessKeyId,AttributeValue=$COMPROMISED_KEY_ID \
  --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ)
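
The `date -v` offsets used in this runbook are BSD/macOS syntax and fail under GNU `date` on most Linux hosts. The sketch below is a portable alternative based on epoch arithmetic. The function names are hypothetical.

```shell
# Hypothetical helpers: timestamps N hours in the past, computed from the
# epoch so they do not depend on the BSD-only `date -v` flag.

# ISO-8601 UTC timestamp, e.g. for `aws cloudtrail lookup-events --start-time`
iso_hours_ago() {
  epoch=$(( $(date -u +%s) - $1 * 3600 ))
  # GNU and BusyBox date accept -d @EPOCH; BSD/macOS date uses -r EPOCH,
  # which is tried as a fallback when the first form fails.
  date -u -d "@${epoch}" +%Y-%m-%dT%H:%M:%SZ 2>/dev/null ||
    date -u -r "${epoch}" +%Y-%m-%dT%H:%M:%SZ
}

# Epoch milliseconds, e.g. for `aws logs filter-log-events --start-time`
ms_hours_ago() {
  echo $(( ($(date -u +%s) - $1 * 3600) * 1000 ))
}
```

With these defined, `--start-time "$(iso_hours_ago 24)"` can stand in for the `date -u -v-24H ...` expression, and `--start-time "$(ms_hours_ago 24)"` for the `+%s)000` form.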

Delete compromised credentials (after verification):

aws iam delete-access-key \
  --user-name $USER_NAME \
  --access-key-id $COMPROMISED_KEY_ID

JWT Token Invalidation

If the JWT signing key is compromised:

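Rotate Cognito app client secret:

# update-user-pool-client does not accept --generate-secret; the flag belongs
# to create-user-pool-client. Create a replacement client (NEW_CLIENT_NAME is
# a placeholder), switch applications to it, then delete the old client.
aws cognito-idp create-user-pool-client \
  --user-pool-id $USER_POOL_ID \
  --client-name $NEW_CLIENT_NAME \
  --generate-secret
aws cognito-idp delete-user-pool-client \
  --user-pool-id $USER_POOL_ID \
  --client-id $CLIENT_ID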

Force sign-out of all users (admin-user-global-sign-out signs out one user per call, so repeat it for each username):

aws cognito-idp admin-user-global-sign-out \
  --user-pool-id $USER_POOL_ID \
  --username $USERNAME
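
`admin-user-global-sign-out` acts on a single username, so covering the whole pool means iterating over `list-users`. The sketch below shows one way to write that loop. `sign_out_all_users` is a hypothetical helper name, and it assumes usernames contain no whitespace.

```shell
# Hypothetical helper: globally sign out every user in a Cognito user pool.
sign_out_all_users() {
  pool_id="$1"
  aws cognito-idp list-users \
    --user-pool-id "$pool_id" \
    --query 'Users[].Username' --output text |
    tr '\t' '\n' |
    while IFS= read -r username; do
      # Skip blank lines produced by empty result pages
      [ -n "$username" ] || continue
      aws cognito-idp admin-user-global-sign-out \
        --user-pool-id "$pool_id" \
        --username "$username" &&
        echo "signed out: $username"
    done
}

# Usage:
#   sign_out_all_users "$USER_POOL_ID"
```

Users whose sign-out call fails are not printed, so comparing the output against the user list shows which accounts still need attention.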

Unauthorized Access Detected

Symptoms

Unexpected IAM activity in CloudTrail
Unknown IP addresses in access logs
Privilege escalation attempts
Resource creation outside normal patterns

Diagnosis

Review CloudTrail events:

# Access keys created in the last 24 hours
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=CreateAccessKey \
  --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ) \
  --query 'Events[*].{Time:EventTime,User:Username,Event:EventName,Source:EventSource}'

Check for new IAM users/roles:

# List recently created users
aws iam list-users \
  --query "Users[?CreateDate>='$(date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)']"

# List recently created roles
aws iam list-roles \
  --query "Roles[?CreateDate>='$(date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)']"

Check API Gateway access logs:

aws logs filter-log-events \
  --log-group-name /aws/api-gateway/tradai-${ENVIRONMENT} \
  --start-time $(date -u -v-24H +%s)000 \
  --filter-pattern '{ $.status = 401 || $.status = 403 }'

ECS Task Isolation

If a container is suspected to be compromised:

Stop the suspicious task:

aws ecs stop-task \
  --cluster tradai-${ENVIRONMENT} \
  --task $TASK_ARN \
  --reason "Security incident investigation"

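Create an isolated security group with no rules, for resources preserved for forensics:

# Create isolated security group and capture its ID
ISOLATED_SG_ID=$(aws ec2 create-security-group \
  --group-name tradai-isolated-${ENVIRONMENT} \
  --description "Isolated for security investigation" \
  --vpc-id $VPC_ID \
  --query GroupId --output text)

# A new security group starts with an allow-all egress rule; revoke it so the
# group has no ingress or egress rules and is fully isolated
aws ec2 revoke-security-group-egress \
  --group-id $ISOLATED_SG_ID \
  --ip-permissions '[{"IpProtocol":"-1","IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]'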
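
A security group only isolates a resource if it has neither ingress nor egress rules, which is worth confirming before relying on it. The sketch below is a minimal check. `sg_is_isolated` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the security group has zero ingress
# rules and zero egress rules. Prints the counts either way.
sg_is_isolated() {
  counts=$(aws ec2 describe-security-groups \
    --group-ids "$1" \
    --query 'SecurityGroups[0].[length(IpPermissions), length(IpPermissionsEgress)]' \
    --output text)
  # Text output separates the two counts with a tab
  ingress=$(printf '%s\n' "$counts" | cut -f1)
  egress=$(printf '%s\n' "$counts" | cut -f2)
  echo "ingress_rules=$ingress egress_rules=$egress"
  [ "$ingress" = "0" ] && [ "$egress" = "0" ]
}

# Usage:
#   sg_is_isolated "$ISOLATED_SG_ID" || echo "security group still has rules"
```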

Capture task logs for forensics:

aws logs get-log-events \
  --log-group-name /ecs/tradai-${SERVICE_NAME}-${ENVIRONMENT} \
  --log-stream-name $LOG_STREAM \
  --start-time $(date -u -v-24H +%s)000 \
  --output json > forensics-logs-$(date +%Y%m%d).json

Network Security Breach

Symptoms

Unexpected outbound connections
Data exfiltration attempts in VPC Flow Logs
Unusual traffic patterns

Diagnosis

Check VPC Flow Logs:

aws logs filter-log-events \
  --log-group-name /aws/vpc/flowlogs/tradai-${ENVIRONMENT} \
  --start-time $(date -u -v-24H +%s)000 \
  --filter-pattern '[version, accountid, interfaceid, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action=REJECT, logstatus]'

Look for unexpected destinations:

# Check for traffic to unexpected IPs
aws logs filter-log-events \
  --log-group-name /aws/vpc/flowlogs/tradai-${ENVIRONMENT} \
  --start-time $(date -u -v-1H +%s)000 \
  --filter-pattern '[version, accountid, interfaceid, srcaddr, dstaddr != 10.*, srcport, dstport, protocol, packets, bytes, start, end, action=ACCEPT, logstatus]'
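
Raw flow log records are hard to scan for exfiltration, so ranking destinations by bytes transferred can help. The sketch below assumes the default flow log format matched by the filter patterns above (14 space-separated fields, with `dstaddr` in field 5, `bytes` in field 10, and `action` in field 13). `top_destinations` is a hypothetical helper name.

```shell
# Hypothetical helper: read flow log records on stdin (one per line, default
# format) and print the destinations of ACCEPTed traffic ranked by total bytes.
# Optional first argument: how many destinations to show (default 10).
top_destinations() {
  awk '$13 == "ACCEPT" { bytes[$5] += $10 }
       END { for (ip in bytes) print bytes[ip], ip }' |
    sort -rn |
    head -n "${1:-10}"
}

# Usage: feed it the `message` field of each returned log event, one per line:
#   aws logs filter-log-events \
#     --log-group-name /aws/vpc/flowlogs/tradai-${ENVIRONMENT} \
#     --query 'events[].message' --output text | tr '\t' '\n' | top_destinations 10
```

Each output line is `<total bytes> <destination IP>`. A destination with a large total that is not a known exchange or AWS endpoint is a candidate for the containment steps that follow.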

Containment

Block suspicious IPs via WAF:

# Get WAF Web ACL
WAF_ACL_ID=$(aws wafv2 list-web-acls --scope REGIONAL --query "WebACLs[?Name=='tradai-${ENVIRONMENT}-waf'].Id" --output text)

# Add IP to block list (requires updating WAF rule)
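
One way to carry out the block-list update is through a WAFv2 IP set. The sketch below assumes an IP set already exists and is referenced by a Block rule in the Web ACL. The helper name, IP set name, and IP set ID are all hypothetical. Because `update-ip-set` replaces the whole address list, the current addresses are read first and re-sent together with the new one.

```shell
# Hypothetical helper: append one CIDR to an existing regional WAFv2 IP set.
# Arguments: IP set name, IP set ID, CIDR to block (e.g. 203.0.113.7/32).
waf_block_cidr() {
  name="$1"
  id="$2"
  cidr="$3"
  # Fetch the lock token and current addresses in a single call; text output
  # is "<token><TAB><space-separated addresses>"
  state=$(aws wafv2 get-ip-set \
    --scope REGIONAL --name "$name" --id "$id" \
    --query "[LockToken, join(' ', IPSet.Addresses)]" \
    --output text)
  token=$(printf '%s\n' "$state" | cut -f1)
  current=$(printf '%s\n' "$state" | cut -f2)
  # $current is left unquoted on purpose so each address is its own argument
  aws wafv2 update-ip-set \
    --scope REGIONAL --name "$name" --id "$id" \
    --addresses $current "$cidr" \
    --lock-token "$token"
}

# Usage (IP set name and ID are placeholders):
#   waf_block_cidr "$IP_SET_NAME" "$IP_SET_ID" "$SUSPICIOUS_IP/32"
```

If another change lands between the read and the write, the lock token no longer matches and the update is rejected. Re-running the helper picks up the new token.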

Block the IP with a network ACL deny rule (security groups cannot express deny rules):

# Deny outbound traffic to the IP; add a matching --ingress entry to block inbound traffic
aws ec2 create-network-acl-entry \
  --network-acl-id $NACL_ID \
  --rule-number 50 \
  --protocol -1 \
  --rule-action deny \
  --egress \
  --cidr-block $SUSPICIOUS_IP/32
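
Blocking a host completely takes a deny entry in each direction. The sketch below wraps both calls in one function. `nacl_block_ip` is a hypothetical helper name. Ingress and egress rules are numbered independently, so the same rule number can serve both directions, provided it is lower than the ACL's existing allow rules.

```shell
# Hypothetical helper: deny all traffic to and from one IP on a network ACL.
# Arguments: NACL ID, IP address, optional rule number (default 50).
nacl_block_ip() {
  nacl_id="$1"
  ip="$2"
  rule_number="${3:-50}"
  for direction in --ingress --egress; do
    aws ec2 create-network-acl-entry \
      --network-acl-id "$nacl_id" \
      --rule-number "$rule_number" \
      --protocol -1 \
      --rule-action deny \
      "$direction" \
      --cidr-block "$ip/32" || return 1
  done
}

# Usage:
#   nacl_block_ip "$NACL_ID" "$SUSPICIOUS_IP"
```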

CloudTrail Audit Analysis

Common Queries

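Failed write API calls:

# lookup-events has no top-level ErrorCode field; the error code lives inside
# the CloudTrailEvent JSON string, so parse it (this assumes jq is installed)
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=ReadOnly,AttributeValue=false \
  --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ) \
  --query 'Events[].CloudTrailEvent' --output json \
  | jq -r '.[] | fromjson | select(.errorCode != null) | [.eventTime, .eventName, .errorCode, .userIdentity.arn] | @tsv'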

Root account activity (should be rare):

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=Username,AttributeValue=root \
  --start-time $(date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)

IAM policy changes:

aws cloudtrail lookup-events \
  --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ) \
  --query "Events[?contains(EventName, 'Policy')].{Time:EventTime,Event:EventName,User:Username}"

Security group changes:

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventSource,AttributeValue=ec2.amazonaws.com \
  --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ) \
  --query "Events[?contains(EventName, 'SecurityGroup')].{Time:EventTime,Event:EventName,User:Username}"

Post-Incident Actions

Immediate (within 1 hour)

[ ] Contain the incident (rotate credentials, isolate resources)
[ ] Preserve evidence (logs, configurations, artifacts)
[ ] Notify stakeholders (team lead, security contact)
[ ] Document initial findings

Short-term (within 24 hours)

[ ] Complete root cause analysis
[ ] Verify all compromised credentials rotated
[ ] Review all related resources for tampering
[ ] Check for persistence mechanisms (backdoors, new users/keys)
[ ] Update WAF/security group rules as needed

Long-term (within 7 days)

[ ] Write incident report with timeline
[ ] Identify and implement preventive measures
[ ] Update runbooks with lessons learned
[ ] Review and update monitoring/alerting
[ ] Conduct team debrief

Emergency Contacts

Role	Contact	When to Escalate
On-call Engineer	[PagerDuty]	P1/P2 incidents
Security Lead	[Contact info]	All P1, suspected breaches
AWS Support	[Support case]	AWS resource compromise
Exchange Support	[Exchange contact]	Trading account compromise

Verification Checklist

After security incident resolution:

[ ] All compromised credentials rotated
[ ] Unauthorized access terminated
[ ] No persistent backdoors found
[ ] CloudTrail showing normal activity
[ ] VPC Flow Logs clean
[ ] Services operating normally
[ ] Monitoring alerts configured for recurrence
[ ] Incident documented and reported