TradAI Final Architecture - Security

Version: 10.0.0 | Date: 2026-03-28 | Source: infra/persistent/modules/, infra/compute/modules/iam.py, infra/edge/modules/waf.py, infra/shared/tradai_infra_shared/core/policy_builder.py

TL;DR: Five-layer defense: WAF (rate limiting + OWASP rules) intended for API Gateway but not yet associated with any resource (see Section 4), Cognito JWT auth (MFA required, TOTP only), security groups per service tier, SSE-S3 encryption at rest with SSL-enforced RDS, and CloudTrail audit logging with S3 data events. IAM follows least privilege via PolicyBuilder.


Security Architecture

flowchart TD
    Internet[Internet Traffic]
    Internet --> WAF["1. WAF<br/>Rate Limit · OWASP · SQLi"]
    WAF --> Cognito["2. Cognito JWT Auth<br/>MFA Required · TOTP"]
    Cognito --> SGs["3. Security Groups<br/>ALB → ECS → RDS"]
    SGs --> Encryption["4. Encryption<br/>SSE-S3 · RDS SSL · Secrets Manager"]
    Encryption --> Audit["5. CloudTrail<br/>Management + S3 Data Events"]

    style WAF fill:#f9f,stroke:#333
    style Cognito fill:#bbf,stroke:#333
    style SGs fill:#bfb,stroke:#333
    style Encryption fill:#ffb,stroke:#333
    style Audit fill:#fbb,stroke:#333

1. Authentication (Cognito)

User Pool Configuration

| Property | Value |
| --- | --- |
| Name | tradai-users-{env} |
| Username attribute | Email (case-insensitive) |
| MFA | Required (ON) -- TOTP only (no SMS) |
| Password min length | 12 |
| Require lowercase | Yes |
| Require uppercase | Yes |
| Require numbers | Yes |
| Require symbols | Yes |
| Temporary password validity | 7 days |
| Account recovery | Email only (verified_email, priority 1) |
| Auto-verified attributes | Email |
| Email sending | COGNITO_DEFAULT |
| Deletion protection | ACTIVE in prod, INACTIVE otherwise |
| Schema attributes | email (String, required, mutable) |

No custom attributes (custom:org, custom:role) are defined. No advanced security mode is enabled.
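
The password rules in the table above can be expressed as a small client-side pre-check. This is a minimal sketch, not part of the infrastructure code: the function name is hypothetical, and treating ASCII punctuation as the symbol set is an assumption that only approximates what Cognito accepts.

```python
import string

MIN_LENGTH = 12  # "Password min length" from the user pool table


def password_violations(password: str) -> list[str]:
    """Return the policy rules that `password` fails (empty list = passes)."""
    checks = {
        "min length 12": len(password) >= MIN_LENGTH,
        "lowercase": any(c.islower() for c in password),
        "uppercase": any(c.isupper() for c in password),
        "number": any(c.isdigit() for c in password),
        # Assumption: ASCII punctuation approximates Cognito's symbol set.
        "symbol": any(c in string.punctuation for c in password),
    }
    return [rule for rule, ok in checks.items() if not ok]
```

Cognito remains the authority on whether a password is accepted; a check like this only gives earlier feedback in a CLI or web form.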

Public Client (Web/CLI)

| Property | Value |
| --- | --- |
| Name | tradai-api-client-{env} |
| Client secret | None (public client) |
| OAuth flows | code only (no implicit) |
| OAuth scopes | email, openid, profile |
| Auth flows | ALLOW_REFRESH_TOKEN_AUTH, ALLOW_USER_SRP_AUTH |
| Access token validity | 1 hour |
| ID token validity | 1 hour |
| Refresh token validity | 30 days |
| Callback URLs | http://localhost:3000/callback, http://localhost:8400/callback |
| Logout URLs | http://localhost:3000/logout, http://localhost:8400/logout |
| Identity providers | COGNITO only |
| Prevent user existence errors | Enabled |

Note: The callback and logout URLs listed above are development-only (localhost over plain HTTP). The production user pool should register only HTTPS callback and logout URLs.
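
A deployment-time guard could enforce the HTTPS-only rule for production. The sketch below is illustrative: the function name is hypothetical, and the assumption that non-prod environments may keep localhost HTTP URLs is inferred from the note above rather than taken from the infrastructure code.

```python
from urllib.parse import urlparse


def insecure_callbacks(urls: list[str], env: str) -> list[str]:
    """Return the URLs that should not be registered for `env`."""
    if env != "prod":
        # Assumption: plain-HTTP localhost URLs are acceptable outside prod.
        return []
    return [u for u in urls if urlparse(u).scheme != "https"]
```

A non-empty result for `env="prod"` would be a reason to fail the deployment before the client is created.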

Machine-to-Machine Client (CI/CD)

| Property | Value |
| --- | --- |
| Name | tradai-m2m-client-{env} |
| Client secret | Generated (stored in Secrets Manager for prod) |
| OAuth flows | client_credentials only |
| OAuth scopes | tradai-api/read, tradai-api/write, tradai-api/admin |
| Auth flows | None (empty list) |
| Access token validity | 1 hour |
| ID token validity | 1 hour |
| Identity providers | COGNITO only |

Resource Server

| Property | Value |
| --- | --- |
| Identifier | tradai-api |
| Scopes | read (read access), write (write access), admin (administrative access) |

User Pool Domain

Cognito hosted UI at tradai-{env}.auth.eu-central-1.amazoncognito.com.
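
For the machine-to-machine client, a token is obtained from the `/oauth2/token` endpoint on this domain using the client_credentials grant. The sketch below only builds the request and does not send it; the function name and the way the secret is passed in are assumptions for illustration.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(env: str, client_id: str, client_secret: str,
                        scopes: list[str]) -> urllib.request.Request:
    """Build (but do not send) a client_credentials token request."""
    url = f"https://tradai-{env}.auth.eu-central-1.amazoncognito.com/oauth2/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "scope": " ".join(scopes),  # e.g. "tradai-api/read tradai-api/write"
    }).encode()
    # The client authenticates with HTTP Basic auth: base64("id:secret").
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(url, data=body, method="POST", headers={
        "Authorization": f"Basic {basic}",
        "Content-Type": "application/x-www-form-urlencoded",
    })
```

Passing the result to `urllib.request.urlopen` would perform the call. In CI the secret would presumably come from Secrets Manager rather than a literal.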

Source: infra/persistent/modules/cognito.py


2. IAM Roles

All roles use PolicyBuilder for DRY policy creation (infra/shared/tradai_infra_shared/core/policy_builder.py). The builder provides fluent methods like .with_dynamodb_crud(), .with_s3_readwrite(), .with_secrets_read() that generate least-privilege statements with resource patterns scoped to tradai-*.
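
To illustrate the fluent pattern, here is a hypothetical stand-in for PolicyBuilder. Only the three method names come from this document; the constructor arguments, the exact action lists, the `build()` method, and the ARN patterns are assumptions, and the real class in policy_builder.py may differ.

```python
import json


class PolicyBuilderSketch:
    """Hypothetical stand-in for PolicyBuilder; real signatures may differ."""

    def __init__(self, region: str, account_id: str) -> None:
        self._region = region
        self._account = account_id
        self._statements: list[dict] = []

    def _add(self, actions: list[str], resources: list[str]) -> "PolicyBuilderSketch":
        self._statements.append(
            {"Effect": "Allow", "Action": actions, "Resource": resources})
        return self  # returning self is what makes the calls chainable

    def with_dynamodb_crud(self) -> "PolicyBuilderSketch":
        arn = f"arn:aws:dynamodb:{self._region}:{self._account}:table/tradai-*"
        return self._add(
            ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:UpdateItem",
             "dynamodb:DeleteItem", "dynamodb:Query"], [arn])

    def with_s3_readwrite(self) -> "PolicyBuilderSketch":
        return self._add(["s3:GetObject", "s3:PutObject"],
                         ["arn:aws:s3:::tradai-*/*"])

    def with_secrets_read(self) -> "PolicyBuilderSketch":
        arn = f"arn:aws:secretsmanager:{self._region}:{self._account}:secret:tradai-*"
        return self._add(["secretsmanager:GetSecretValue"], [arn])

    def build(self) -> str:
        """Serialize the accumulated statements as an IAM policy document."""
        return json.dumps({"Version": "2012-10-17", "Statement": self._statements})
```

A role definition would then chain the calls, for example `PolicyBuilderSketch("eu-central-1", "123456789012").with_dynamodb_crud().with_s3_readwrite().build()`, so that each role lists capabilities instead of repeating raw JSON statements.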

Role Inventory

| Role | Trust Principal | Purpose | Key Permissions |
| --- | --- | --- | --- |
| tradai-ecs-execution-{env} | ecs-tasks.amazonaws.com | ECS agent operations | ECR pull, CloudWatch Logs, Secrets Manager read |
| tradai-ecs-task-{env} | ecs-tasks.amazonaws.com | Container runtime | DynamoDB CRUD, S3 R/W, Secrets Manager, CloudWatch metrics, SNS publish, CodeArtifact read, RDS secrets |
| tradai-lambda-role-{env} | lambda.amazonaws.com | All Lambda functions (shared) | Basic execution, VPC access, ECS RunTask, SQS, DynamoDB (scoped to used tables), S3, Secrets Manager, SNS, CloudWatch |
| tradai-cli-ci-{env} | IAM users (same account, sts:ExternalId=tradai-cli) | CLI/CI strategy lifecycle | ECS service management, DynamoDB (tradai-deployments-*), ECR describe, iam:PassRole for task/execution roles |
| tradai-consolidated-{env} | ec2.amazonaws.com | Consolidated EC2 (dev/staging) | ECR read-only, CloudWatch agent, SSM, DynamoDB CRUD, S3 R/W, SQS, Secrets Manager, RDS secrets, CloudWatch metrics, SNS, CodeArtifact, Service Discovery register, ASG lifecycle |
| tradai-nat-role-{env} | ec2.amazonaws.com | NAT instance | ec2:AssociateAddress, ec2:ModifyInstanceAttribute, ec2:DescribeInstances |
| tradai-nat-lambda-role-{env} | lambda.amazonaws.com | NAT route updater Lambda | CloudWatch Logs, ec2:CreateRoute/ReplaceRoute/DeleteRoute, autoscaling:CompleteLifecycleAction |
| tradai-flow-logs-role-{env} | vpc-flow-logs.amazonaws.com | VPC Flow Logs delivery | CloudWatch Logs write |
| tradai-cloudtrail-role-{env} | cloudtrail.amazonaws.com | CloudTrail delivery | CloudWatch Logs write |
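
The trust relationship for tradai-cli-ci-{env} combines a same-account principal with an `sts:ExternalId` condition. A minimal sketch of such a trust policy follows; using the account root ARN to stand for "IAM users (same account)" is an assumption, and the actual principal in iam.py may be narrower.

```python
import json


def cli_ci_trust_policy(account_id: str) -> str:
    """Trust policy that only allows AssumeRole with ExternalId=tradai-cli."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            # Assumption: account root stands in for "IAM users (same account)".
            "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": "tradai-cli"}},
        }],
    })
```

A caller would then have to pass `ExternalId="tradai-cli"` in its AssumeRole request; requests without it fail the condition.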

Lambda Role Design

Lambda functions share a single execution role (tradai-lambda-role-{env}), not per-function roles. The role includes managed policies (AWSLambdaBasicExecutionRole, AWSLambdaVPCAccessExecutionRole) plus an inline policy granting ECS RunTask, SQS, DynamoDB (scoped to specific table names that are actually provisioned), S3, Secrets Manager, SNS, and CloudWatch access.

Shared Lambda Role = Wide Blast Radius

All Lambda functions share a single IAM execution role (tradai-lambda-role-{env}) with permissions across ECS, DynamoDB, S3, SQS, and SNS. A compromised Lambda (e.g., promote-model) could invoke ecs:RunTask or modify any DynamoDB table the shared role is scoped to, including tables that function never uses. Consider per-function roles for production to reduce blast radius.

Consolidated Role (dev/staging only)

Only created when is_consolidated_mode() returns True. Combines ECS task role permissions with EC2-specific access (ECR read-only, SSM for Session Manager, SQS for backtest queue, Service Discovery registration, ASG lifecycle hooks). Uses PolicyBuilder to eliminate ~80 lines of duplicate JSON.

Source: infra/compute/modules/iam.py, infra/foundation/modules/nat_instance.py


3. Encryption

S3 Buckets

All buckets (configs, results, arcticdb, logs, mlflow) use:

| Property | Value |
| --- | --- |
| Encryption | SSE-S3 (AES256) with bucket key enabled |
| Public access | Fully blocked (all 4 public access block settings enabled) |
| Versioning | Enabled on all except logs |

Lifecycle rules are configured per bucket where applicable:

  • results: Glacier transition at 30 days
  • logs: Delete at 90 days
  • configs, arcticdb, mlflow: No lifecycle rules
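
The per-bucket lifecycle rules can be written in the rule shape that the S3 lifecycle configuration API expects. This is a sketch under assumptions: the function name and rule IDs are hypothetical, and the empty prefix filter (apply to the whole bucket) is a guess at the intended scope.

```python
def lifecycle_rules(bucket_kind: str) -> list[dict]:
    """Return S3 lifecycle rules for a bucket kind, or [] when none apply."""
    whole_bucket = {"Prefix": ""}  # assumption: rules cover every object
    if bucket_kind == "results":
        return [{"ID": "results-to-glacier", "Status": "Enabled",
                 "Filter": whole_bucket,
                 "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}]}]
    if bucket_kind == "logs":
        return [{"ID": "logs-expire", "Status": "Enabled",
                 "Filter": whole_bucket,
                 "Expiration": {"Days": 90}}]
    if bucket_kind in {"configs", "arcticdb", "mlflow"}:
        return []  # no lifecycle rules for these buckets
    raise ValueError(f"unknown bucket kind: {bucket_kind}")
```

The returned list matches the `Rules` element of a lifecycle configuration, so it could be passed to whichever provisioning layer creates the bucket.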

RDS PostgreSQL

| Property | Value |
| --- | --- |
| Storage encryption | storage_encrypted=True |
| SSL enforcement | rds.force_ssl=1 via parameter group |
| Master password | Managed by Secrets Manager (manage_master_user_password=True) |
| Publicly accessible | False |
| Performance Insights | Enabled (7-day retention) |
| Log exports | PostgreSQL logs to CloudWatch |
| Parameter group | log_statement=ddl, log_min_duration_statement=1000 (slow queries >1s) |

DynamoDB

DynamoDB tables use AWS default encryption (AWS owned keys). No customer-managed KMS keys are configured.

Source: infra/persistent/modules/s3.py, infra/foundation/modules/rds.py


4. WAF (Web Application Firewall)

WAF Not Currently Effective

The WebACL is created but not associated with any resource. WAFv2 cannot parse the $default stage ARN of HTTP APIs. Until the API Gateway is migrated to REST API or the WAF is associated with the ALB, the WAF rules provide no protection. See 09-PULUMI-CODE.md Section 7 for details.

The WAF WebACL is created with scope REGIONAL and is intended for API Gateway association.

Rules

| Priority | Rule Name | Type | Action | Description |
| --- | --- | --- | --- | --- |
| 1 | RateLimitRule | Rate-based | Block | 100 requests per 5 minutes per IP |
| 2 | AWSManagedRulesCommonRuleSet | Managed (AWS) | Override: none | OWASP Top 10 protection |
| 3 | AWSManagedRulesKnownBadInputsRuleSet | Managed (AWS) | Override: none | Known malicious input patterns |
| 4 | AWSManagedRulesSQLiRuleSet | Managed (AWS) | Override: none | SQL injection protection |

Default action: Allow (only matched rules block/count).
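
The rule set can be written out in the WAFv2 rule structure as plain dictionaries. This is a sketch rather than a copy of waf.py: the helper names are hypothetical, and the explicit 300-second evaluation window is an assumption that matches the stated "per 5 minutes".

```python
def visibility(metric_name: str) -> dict:
    """CloudWatch metrics plus sampled requests, as enabled on every rule."""
    return {"SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": metric_name}


def managed_rule(priority: int, group: str, metric: str) -> dict:
    """AWS managed rule group with no action override."""
    return {"Name": group, "Priority": priority,
            "Statement": {"ManagedRuleGroupStatement":
                          {"VendorName": "AWS", "Name": group}},
            "OverrideAction": {"None": {}},
            "VisibilityConfig": visibility(metric)}


WAF_RULES = [
    {"Name": "RateLimitRule", "Priority": 1,
     "Statement": {"RateBasedStatement": {
         "Limit": 100,                # requests per window per IP
         "EvaluationWindowSec": 300,  # assumption: explicit 5-minute window
         "AggregateKeyType": "IP"}},
     "Action": {"Block": {}},
     "VisibilityConfig": visibility("RateLimitRule")},
    managed_rule(2, "AWSManagedRulesCommonRuleSet", "CommonRuleSet"),
    managed_rule(3, "AWSManagedRulesKnownBadInputsRuleSet", "KnownBadInputs"),
    managed_rule(4, "AWSManagedRulesSQLiRuleSet", "SQLiRuleSet"),
]
```

Managed rule groups use `OverrideAction` instead of `Action` because the group's own rules decide whether to block; "none" means those decisions are applied unchanged.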

WAF Logging

| Property | Value |
| --- | --- |
| Log destination | CloudWatch Logs (aws-waf-logs-tradai-{env}) |
| Retention | 30 days (dev/staging), 90 days (prod) |
| Logged requests | All (blocked, allowed, counted) |

CloudWatch Metrics

All rules have CloudWatch metrics enabled with sampled requests. Metric names: RateLimitRule, CommonRuleSet, KnownBadInputs, SQLiRuleSet, plus overall tradai-waf-metrics.

Source: infra/edge/modules/waf.py


5. CloudTrail

Trail Configuration

| Property | Value |
| --- | --- |
| Trail name | tradai-audit-trail-{env} |
| S3 bucket | tradai-logs-{env} (prefix: cloudtrail/) |
| CloudWatch Logs | /aws/cloudtrail/tradai |
| Log retention | 30 days (dev/staging), 90 days (prod) |
| Multi-region | No (a single-region trail saves ~$2/month; enable a multi-region trail for compliance requirements or multi-region deployments) |
| Global service events | Yes |
| Log file validation | Enabled |

Event Selectors

| Type | Resources | Read/Write |
| --- | --- | --- |
| Management events | All | All |
| S3 data events | tradai-configs-{env}/* | All |
| S3 data events | tradai-results-{env}/* | All |

No DynamoDB data events are configured. The code comment notes: "DynamoDB data events removed -- wildcards not supported by CloudTrail. Management events already track DynamoDB API calls. For item-level auditing, use DynamoDB Streams instead."
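
The selectors above correspond to a single CloudTrail event selector that includes management events and two S3 data resources. A minimal sketch in the shape used by the CloudTrail event selector API follows; the function name is hypothetical, and grouping both buckets into one selector is an assumption about how cloudtrail.py is written.

```python
def event_selectors(env: str) -> list[dict]:
    """Management events plus S3 object-level events for two buckets."""
    buckets = [f"tradai-configs-{env}", f"tradai-results-{env}"]
    return [{
        "ReadWriteType": "All",
        "IncludeManagementEvents": True,
        "DataResources": [{
            "Type": "AWS::S3::Object",
            # A bucket ARN with a trailing slash covers all objects in it.
            "Values": [f"arn:aws:s3:::{b}/" for b in buckets],
        }],
    }]
```

No DynamoDB data resource appears here, consistent with the note that item-level auditing is left to DynamoDB Streams.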

Insight Selectors

Insight Type
ApiCallRateInsight
ApiErrorRateInsight

S3 Bucket Policy

CloudTrail has a dedicated S3 bucket policy allowing s3:GetBucketAcl and s3:PutObject (with bucket-owner-full-control ACL condition) from the cloudtrail.amazonaws.com service principal, scoped to the specific trail ARN.
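
A bucket policy with these two statements typically looks like the sketch below. The function name, the statement IDs, and the `AWSLogs/{account_id}/` object path are assumptions based on the usual CloudTrail layout; only the actions, the ACL condition, the service principal, the trail ARN scoping, and the cloudtrail/ prefix come from this document.

```python
import json


def cloudtrail_bucket_policy(bucket: str, trail_arn: str, account_id: str) -> str:
    """Bucket policy letting one specific trail check the ACL and write logs."""
    principal = {"Service": "cloudtrail.amazonaws.com"}
    source = {"aws:SourceArn": trail_arn}  # scopes access to this trail only
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "AWSCloudTrailAclCheck", "Effect": "Allow",
             "Principal": principal, "Action": "s3:GetBucketAcl",
             "Resource": f"arn:aws:s3:::{bucket}",
             "Condition": {"StringEquals": source}},
            {"Sid": "AWSCloudTrailWrite", "Effect": "Allow",
             "Principal": principal, "Action": "s3:PutObject",
             # Assumption: standard AWSLogs/<account> layout under cloudtrail/.
             "Resource": f"arn:aws:s3:::{bucket}/cloudtrail/AWSLogs/{account_id}/*",
             "Condition": {"StringEquals": {
                 **source, "s3:x-amz-acl": "bucket-owner-full-control"}}},
        ],
    })
```

The `aws:SourceArn` condition is what prevents trails in other accounts from writing into the bucket through the shared service principal.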

Source: infra/persistent/modules/cloudtrail.py


Verified Security Controls

  • Cognito MFA is required (TOTP only, no SMS) with 12-character minimum passwords.
  • WAF defines rate limiting (100 req/5min) plus 3 AWS managed rule sets (OWASP, bad inputs, SQLi), but is not currently associated with any resource (see Section 4).
  • All S3 buckets have public access fully blocked and SSE-S3 encryption enabled.
  • RDS enforces SSL via parameter group (rds.force_ssl=1), and the master password is managed by Secrets Manager (manage_master_user_password=True).
  • CloudTrail logs management events + S3 data events with log file validation enabled.
  • IAM roles use PolicyBuilder for least-privilege, scoped to tradai-* resource patterns.

6. Network Security

Network security is covered in detail in 03-VPC-NETWORKING.md. Key points:

  • Security groups enforce stateful rules: ALB -> ECS/Consolidated (ports 8000-8003, 5000), Lambda -> ECS/Consolidated (same ports), ECS/Lambda -> RDS (5432), NAT accepts ALL TCP from private subnets
  • NACLs provide stateless defense-in-depth at the subnet boundary; database tier only allows PostgreSQL from private subnets
  • VPC endpoints keep S3, DynamoDB, ECR, STS, Secrets Manager, CloudWatch, SSM, and SQS traffic within the AWS network
  • VPC Flow Logs capture all traffic to CloudWatch Logs with 7-day retention
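
The security group flows listed above can be modeled as a small allow-list, which is handy for reasoning about or unit-testing intended connectivity. This is an illustrative model only: the tier labels and function name are hypothetical, "ecs" stands for the ECS/Consolidated tier, and the NAT rule is left out.

```python
APP_PORTS = frozenset({8000, 8001, 8002, 8003, 5000})
POSTGRES = frozenset({5432})

# (source tier, destination tier) -> allowed TCP ports
ALLOWED_FLOWS = {
    ("alb", "ecs"): APP_PORTS,
    ("lambda", "ecs"): APP_PORTS,
    ("ecs", "rds"): POSTGRES,
    ("lambda", "rds"): POSTGRES,
}


def is_allowed(src: str, dst: str, port: int) -> bool:
    """True if the modeled security groups permit src -> dst on `port`."""
    return port in ALLOWED_FLOWS.get((src, dst), frozenset())
```

Anything not listed is denied in this model, mirroring the default-deny behavior of security group ingress rules.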

7. What's NOT Implemented

Known Security Gaps

The following security features are documented as planned but are not present in the current infrastructure code. Prioritize security headers and ALB access logs for production readiness.

| Gap | Description |
| --- | --- |
| Security headers middleware | No X-Content-Type-Options, X-Frame-Options, Strict-Transport-Security, Content-Security-Policy headers are added by application middleware or ALB |
| Advanced Cognito security | user_pool_add_ons with ENFORCED advanced security mode is not configured (no adaptive authentication, compromised credential checks, or risk-based MFA) |
| Security alarms | No CloudWatch Alarms for security events (e.g., unauthorized API calls, root account usage, console sign-in without MFA) |
| ALB access logs | ALB does not have access_logs configured to an S3 bucket |
| DisableExecuteApiEndpoint | API Gateway disable_execute_api_endpoint is not set, so the default execute-api endpoint remains accessible alongside any custom domain |
| KMS customer-managed keys | S3 uses SSE-S3 (AES256), not KMS CMKs; DynamoDB uses AWS-owned keys; no envelope encryption for sensitive config values |
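
As an illustration of what closing the security headers gap might involve, the sketch below wraps a WSGI application and adds the four missing headers. It is not the planned implementation: the web framework used by the services is not stated in this document, WSGI is chosen only because it needs no third-party imports, and the header values are common defaults rather than project decisions.

```python
SECURITY_HEADERS = [
    ("X-Content-Type-Options", "nosniff"),
    ("X-Frame-Options", "DENY"),
    ("Strict-Transport-Security", "max-age=31536000; includeSubDomains"),
    ("Content-Security-Policy", "default-src 'self'"),  # assumption: strict default
]


def security_headers_middleware(app):
    """Wrap a WSGI app so every response carries the security headers."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Do not override headers the application already set.
            present = {name.lower() for name, _ in headers}
            extra = [(n, v) for n, v in SECURITY_HEADERS
                     if n.lower() not in present]
            return start_response(status, list(headers) + extra, exc_info)
        return app(environ, patched_start)
    return wrapped
```

An ASGI service would need the equivalent wrapper around its `send` callable, and the Content-Security-Policy value in particular would have to be tuned to what the frontend actually loads.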

Changelog

| Version | Date | Changes |
| --- | --- | --- |
| 10.0.0 | 2026-03-28 | Full regeneration. Corrected Cognito (TOTP only), WAF (API Gateway not ALB), honest gaps section |

Dependencies

| If This Changes | Update This Doc |
| --- | --- |
| infra/persistent/modules/cognito.py | Authentication section |
| infra/edge/modules/waf.py | WAF rules section |
| infra/persistent/modules/cloudtrail.py | Audit section |
| infra/compute/modules/iam.py | IAM roles section |