Security Incident Response Runbook

Procedures for handling security incidents including credential exposure, unauthorized access, and security policy violations.

Severity Classification

Level  Description                   Response Time        Examples
P1     Active breach, data exposure  Immediate (<15 min)  Compromised AWS creds, unauthorized trading
P2     Suspected compromise          < 1 hour             Suspicious API activity, failed auth spike
P3     Policy violation              < 4 hours            Exposed secrets in logs, missing encryption

API Key/Secret Exposure

Symptoms

  • Secrets found in logs, code commits, or public repositories
  • Unexpected API calls from unknown sources
  • Exchange reporting unusual activity
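
For the first symptom, a minimal sketch of scanning logs or a checkout for strings shaped like AWS access key IDs. It assumes the well-known `AKIA` (long-term) and `ASIA` (temporary) prefixes followed by 16 uppercase alphanumerics. The function name `scan_for_aws_keys` is hypothetical, and a match is only a lead to investigate, not proof of exposure.

```shell
# Hypothetical helper: print file:line:match for strings shaped like AWS access key IDs.
scan_for_aws_keys() {
  # -r recurse, -H always print the file name, -n line numbers,
  # -o print only the matching text, -E extended regex
  grep -rHnoE '(AKIA|ASIA)[A-Z0-9]{16}' "$@" 2>/dev/null
}
```

Example call: `scan_for_aws_keys ./logs ./src`. Exchange keys and database passwords have no fixed shape, so they would need their own patterns.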

Immediate Actions (P1)

  1. Identify exposed credentials:

    # Check what was exposed
    # - Exchange API keys (Binance, etc.)
    # - AWS credentials
    # - Database passwords
    # - JWT signing keys
    

  2. Rotate exchange API keys immediately:

    # 1. Create new API key on exchange (manually via exchange UI)
    # 2. Update Secrets Manager
    aws secretsmanager put-secret-value \
      --secret-id tradai/${ENVIRONMENT}/exchange-keys \
      --secret-string '{"api_key":"NEW_KEY","api_secret":"NEW_SECRET"}'
    
    # 3. Restart services to pick up new credentials
    aws ecs update-service \
      --cluster tradai-${ENVIRONMENT} \
      --service tradai-strategy-service-${ENVIRONMENT} \
      --force-new-deployment
    

  3. Revoke old exchange keys (via exchange UI)

  4. Check for unauthorized activity:

    # Review exchange order history for unauthorized trades
    # Review API call logs for suspicious patterns
    

AWS Credential Rotation

If AWS credentials are exposed:

  1. Disable compromised credentials:

    # For IAM user access keys
    aws iam update-access-key \
      --user-name $USER_NAME \
      --access-key-id $COMPROMISED_KEY_ID \
      --status Inactive
    
    # For IAM user console access (removes the console password)
    aws iam delete-login-profile --user-name $USER_NAME
    

  2. Create new credentials:

    aws iam create-access-key --user-name $USER_NAME
    

  3. Review CloudTrail for unauthorized actions:

    aws cloudtrail lookup-events \
      --lookup-attributes AttributeKey=AccessKeyId,AttributeValue=$COMPROMISED_KEY_ID \
      --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ)
    

  4. Delete compromised credentials (after verification):

    aws iam delete-access-key \
      --user-name $USER_NAME \
      --access-key-id $COMPROMISED_KEY_ID
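
The `date -u -v-24H` expressions used throughout this runbook are BSD/macOS syntax; GNU `date` on most Linux hosts rejects `-v` and expects `-d '24 hours ago'`. A minimal sketch of wrappers that work on both, assuming one of the two `date` implementations is installed. The function names `hours_ago_iso` and `hours_ago_epoch_ms` are hypothetical.

```shell
# Hypothetical helpers: portable replacements for `date -u -v-<N>H ...`.
hours_ago_iso() {
  # $1 = number of hours; prints a UTC ISO 8601 timestamp
  if date -u -v-1H +%s >/dev/null 2>&1; then
    date -u -v-"$1"H +%Y-%m-%dT%H:%M:%SZ            # BSD/macOS date
  else
    date -u -d "$1 hours ago" +%Y-%m-%dT%H:%M:%SZ   # GNU coreutils date
  fi
}

hours_ago_epoch_ms() {
  # $1 = number of hours; prints epoch milliseconds, the unit `aws logs` expects
  if date -u -v-1H +%s >/dev/null 2>&1; then
    echo "$(date -u -v-"$1"H +%s)000"
  else
    echo "$(date -u -d "$1 hours ago" +%s)000"
  fi
}
```

With these defined, `--start-time $(hours_ago_iso 24)` works for `aws cloudtrail lookup-events`, and `--start-time $(hours_ago_epoch_ms 24)` works for `aws logs` commands, on either platform.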
    

JWT Token Invalidation

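If a JWT signing key or app client secret is compromised:

  1. Replace the Cognito app client secret:

    # update-user-pool-client does not accept --generate-secret (that flag belongs
    # to create-user-pool-client). One approach: create a replacement client with a
    # new secret, move services to it, then delete the old client.
    aws cognito-idp create-user-pool-client \
      --user-pool-id $USER_POOL_ID \
      --client-name $NEW_CLIENT_NAME \
      --generate-secret
    
    aws cognito-idp delete-user-pool-client \
      --user-pool-id $USER_POOL_ID \
      --client-id $CLIENT_ID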
    

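  2. Force sign-out all users:

    # admin-user-global-sign-out acts on one user at a time, so loop over the pool
    for USERNAME in $(aws cognito-idp list-users --user-pool-id $USER_POOL_ID --query 'Users[].Username' --output text); do
      aws cognito-idp admin-user-global-sign-out \
        --user-pool-id $USER_POOL_ID \
        --username "$USERNAME"
    done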
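
Signing a user out revokes their refresh tokens, but a service that validates JWTs locally may keep accepting an already-issued token until its `exp` claim passes. A minimal sketch of reading `exp` from a captured token to estimate that window. The function name `jwt_exp` is hypothetical, it does not verify the signature, and it assumes a `base64` that accepts `-d` (older macOS releases use `-D`).

```shell
# Hypothetical helper: print the numeric `exp` claim (epoch seconds) of a JWT.
# This only decodes the payload; it does NOT verify the token.
jwt_exp() {
  # Take the middle segment and convert base64url to standard base64
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url omits
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d 2>/dev/null \
    | grep -oE '"exp" *: *[0-9]+' | grep -oE '[0-9]+$'
}
```

Comparing the result with `date +%s` shows how long a leaked token could remain usable against locally validating services.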
    


Unauthorized Access Detected

Symptoms

  • Unexpected IAM activity in CloudTrail
  • Unknown IP addresses in access logs
  • Privilege escalation attempts
  • Resource creation outside normal patterns
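
To decide whether an address from the access logs is "unknown", it helps to test it against the CIDR ranges you expect. A minimal sketch in plain shell arithmetic; the function names `ip_to_int` and `ip_in_cidr` are hypothetical, and the ranges in the usage note are placeholders, not values from this environment. Octets with leading zeros are not handled.

```shell
# Hypothetical helpers for checking an IPv4 address against a CIDR range.
ip_to_int() (
  # Runs in a subshell so the IFS change does not leak to the caller.
  IFS=.
  set -- $1                      # split the dotted quad into $1..$4
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

ip_in_cidr() {
  # $1 = IPv4 address, $2 = CIDR such as 10.0.0.0/16; returns 0 when inside
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}
```

Example: `ip_in_cidr "$SRC_IP" 10.0.0.0/16 || echo "unknown source: $SRC_IP"`, where `$SRC_IP` comes from a log line.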

Diagnosis

  1. Review CloudTrail events:

    # Recent suspicious events
    aws cloudtrail lookup-events \
      --lookup-attributes AttributeKey=EventName,AttributeValue=CreateAccessKey \
      --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ) \
      --query 'Events[*].{Time:EventTime,User:Username,Event:EventName,Source:EventSource}'
    

  2. Check for new IAM users/roles:

    # List recently created users
    aws iam list-users \
      --query "Users[?CreateDate>='$(date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)']"
    
    # List recently created roles
    aws iam list-roles \
      --query "Roles[?CreateDate>='$(date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)']"
    

  3. Check API Gateway access logs:

    aws logs filter-log-events \
      --log-group-name /aws/api-gateway/tradai-${ENVIRONMENT} \
      --start-time $(date -u -v-24H +%s)000 \
      --filter-pattern '{ $.status = 401 || $.status = 403 }'
    

ECS Task Isolation

If a container is suspected to be compromised:

  1. Stop the suspicious task:

    aws ecs stop-task \
      --cluster tradai-${ENVIRONMENT} \
      --task $TASK_ARN \
      --reason "Security incident investigation"
    

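  2. Create a security group that denies all traffic (preserve for forensics):

    # Create isolated security group
    ISOLATED_SG_ID=$(aws ec2 create-security-group \
      --group-name tradai-isolated-${ENVIRONMENT} \
      --description "Isolated for security investigation" \
      --vpc-id $VPC_ID \
      --query GroupId --output text)
    
    # New groups have no ingress rules but allow all egress by default;
    # revoke that rule so the group is fully isolated
    aws ec2 revoke-security-group-egress \
      --group-id $ISOLATED_SG_ID \
      --ip-permissions '[{"IpProtocol":"-1","IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]'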
    

  3. Capture task logs for forensics:

    aws logs get-log-events \
      --log-group-name /ecs/tradai-${SERVICE_NAME}-${ENVIRONMENT} \
      --log-stream-name $LOG_STREAM \
      --start-time $(date -u -v-24H +%s)000 \
      --output json > forensics-logs-$(date +%Y%m%d).json
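
Once logs are captured, recording a checksum lets you show later that the evidence file was not altered. A minimal sketch, assuming either `sha256sum` (Linux) or `shasum` (macOS) is available; the function name `hash_evidence` and the `.sha256` sidecar file are hypothetical conventions.

```shell
# Hypothetical helper: record a SHA-256 digest next to an evidence file.
hash_evidence() {
  # $1 = evidence file; writes "<file>.sha256" and prints the digest
  if command -v sha256sum >/dev/null 2>&1; then
    digest=$(sha256sum "$1" | awk '{print $1}')
  else
    digest=$(shasum -a 256 "$1" | awk '{print $1}')
  fi
  # Same "digest  filename" layout that sha256sum -c can verify
  printf '%s  %s\n' "$digest" "$1" > "$1.sha256"
  echo "$digest"
}
```

Example: `hash_evidence forensics-logs-$(date +%Y%m%d).json`, then store the `.sha256` file separately from the evidence itself.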
    


Network Security Breach

Symptoms

  • Unexpected outbound connections
  • Data exfiltration attempts in VPC Flow Logs
  • Unusual traffic patterns

Diagnosis

  1. Check VPC Flow Logs:

    aws logs filter-log-events \
      --log-group-name /aws/vpc/flowlogs/tradai-${ENVIRONMENT} \
      --start-time $(date -u -v-24H +%s)000 \
      --filter-pattern '[version, accountid, interfaceid, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action=REJECT, logstatus]'
    

  2. Look for unexpected destinations:

    # Check for traffic to unexpected IPs
    aws logs filter-log-events \
      --log-group-name /aws/vpc/flowlogs/tradai-${ENVIRONMENT} \
      --start-time $(date -u -v-1H +%s)000 \
      --filter-pattern '[version, accountid, interfaceid, srcaddr, dstaddr != 10.*, srcport, dstport, protocol, packets, bytes, start, end, action=ACCEPT, logstatus]'
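
Raw flow log records are hard to read in bulk. A minimal sketch that totals accepted bytes per non-private destination, assuming the default version 2 record layout (the same 14 fields named in the filter patterns above) with one record per line on stdin. The function name `top_external_destinations` is hypothetical.

```shell
# Hypothetical helper: sum accepted bytes per destination outside RFC 1918 space.
# Field 5 = dstaddr, field 10 = bytes, field 13 = action (default v2 format).
top_external_destinations() {
  awk '$13 == "ACCEPT" && $5 !~ /^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)/ {
         bytes[$5] += $10
       }
       END { for (ip in bytes) print bytes[ip], ip }' | sort -rn
}
```

Each output line is `<total bytes> <destination IP>`, largest first, which makes a single high-volume destination stand out.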
    

Containment

  1. Block suspicious IPs via WAF:

    # Get WAF Web ACL
    WAF_ACL_ID=$(aws wafv2 list-web-acls --scope REGIONAL --query "WebACLs[?Name=='tradai-${ENVIRONMENT}-waf'].Id" --output text)
    
    # Add IP to block list (requires updating WAF rule)
    

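  2. Block the IP with a network ACL deny rule (security groups cannot express deny rules):

    # Add deny rule to NACL for immediate block. Rules are evaluated in ascending
    # rule-number order, so the deny must sit below any allow rule.
    # --egress blocks outbound traffic only; add an --ingress entry to block inbound.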
    aws ec2 create-network-acl-entry \
      --network-acl-id $NACL_ID \
      --rule-number 50 \
      --protocol -1 \
      --rule-action deny \
      --egress \
      --cidr-block $SUSPICIOUS_IP/32
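
`create-network-acl-entry` fails if the chosen rule number is already in use, so a fixed number such as 50 only works once. A minimal sketch that picks the first free number at or above a starting value, given the rule numbers already present on stdin. The function name `first_free_rule_number` is hypothetical; the upper bound 32766 is the highest rule number a NACL entry may use.

```shell
# Hypothetical helper: print the first unused NACL rule number >= $1.
# stdin: rule numbers already in use, separated by any whitespace.
first_free_rule_number() {
  used=" $(tr -s '[:space:]' ' ') "   # normalize to " 50 51 100 "
  n=$1
  while [ "$n" -le 32766 ]; do
    case "$used" in
      *" $n "*) n=$((n + 1)) ;;       # taken, try the next number
      *) echo "$n"; return 0 ;;
    esac
  done
  return 1                             # no free number in range
}
```

One way to feed it is `aws ec2 describe-network-acls --network-acl-ids $NACL_ID --query 'NetworkAcls[0].Entries[?Egress].RuleNumber' --output text | first_free_rule_number 50`, keeping the result below the number of the first allow rule.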
    


CloudTrail Audit Analysis

Common Queries

Failed API calls:

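# errorCode is inside the CloudTrailEvent JSON string, not a top-level field
# of lookup-events output, so parse it with jq (requires jq)
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=ReadOnly,AttributeValue=false \
  --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ) \
  --query 'Events[].CloudTrailEvent' --output json \
  | jq -r '.[] | fromjson | select(.errorCode != null) | [.eventTime, .eventName, .errorCode, (.userIdentity.arn // "-")] | @tsv'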

Root account activity (should be rare):

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=Username,AttributeValue=root \
  --start-time $(date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)

IAM policy changes:

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventSource,AttributeValue=iam.amazonaws.com \
  --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ) \
  --query 'Events[?contains(EventName, `Policy`)].{Time:EventTime,Event:EventName,User:Username}'

Security group changes:

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventSource,AttributeValue=ec2.amazonaws.com \
  --start-time $(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ) \
  --query 'Events[?contains(EventName, `SecurityGroup`)].{Time:EventTime,Event:EventName,User:Username}'


Post-Incident Actions

Immediate (within 1 hour)

  • [ ] Contain the incident (rotate credentials, isolate resources)
  • [ ] Preserve evidence (logs, configurations, artifacts)
  • [ ] Notify stakeholders (team lead, security contact)
  • [ ] Document initial findings

Short-term (within 24 hours)

  • [ ] Complete root cause analysis
  • [ ] Verify all compromised credentials rotated
  • [ ] Review all related resources for tampering
  • [ ] Check for persistence mechanisms (backdoors, new users/keys)
  • [ ] Update WAF/security group rules as needed

Long-term (within 7 days)

  • [ ] Write incident report with timeline
  • [ ] Identify and implement preventive measures
  • [ ] Update runbooks with lessons learned
  • [ ] Review and update monitoring/alerting
  • [ ] Conduct team debrief

Emergency Contacts

Role              Contact             When to Escalate
On-call Engineer  [PagerDuty]         P1/P2 incidents
Security Lead     [Contact info]      All P1, suspected breaches
AWS Support       [Support case]      AWS resource compromise
Exchange Support  [Exchange contact]  Trading account compromise

Verification Checklist

After security incident resolution:

  • [ ] All compromised credentials rotated
  • [ ] Unauthorized access terminated
  • [ ] No persistent backdoors found
  • [ ] CloudTrail showing normal activity
  • [ ] VPC Flow Logs clean
  • [ ] Services operating normally
  • [ ] Monitoring alerts configured for recurrence
  • [ ] Incident documented and reported