
TradAI E2E Verification Report — 2026-04-28

Environment: AWS dev (600802701449, eu-central-1)
ALB: http://tradai-dev-1942285475.eu-central-1.elb.amazonaws.com
Verified by: Automated scripts executed from Windows 11 (Git Bash)


Executive Summary

| Issue | Title | Script | Result |
|---|---|---|---|
| #89 | Deploy dev environment | issue89-verify.sh | 65/66 PASS (1 gap: SNS 0 email subs) |
| #90 | Backtest E2E flow | issue90-verify.sh | 18/18 PASS |
| #91 | Training flow E2E | issue91-verify.sh | 38/38 PASS |
| #92 | Promotion & rollback | issue92-verify.sh | 27/28 PASS (1 gap: SNS notification) |
| #308-313 | Config versioning | config-versioning-e2e-aws.sh | 16/16 PASS |
| #376/#385 | API endpoints | api-endpoint-audit.sh | 26/26 PASS + 2 FIXED + 5 known bugs |

Total: 190/192 checks PASS (99.0%)
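
The total can be cross-checked from the per-script results in the summary table. The snippet below is a minimal sketch that assumes the pass/total pairs are copied by hand from that table; the `tally` helper is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum "passed/total" pairs and print the aggregate.
tally() {
  local passed=0 total=0 pair
  for pair in "$@"; do
    passed=$(( passed + ${pair%/*} ))   # text before the slash
    total=$(( total + ${pair#*/} ))     # text after the slash
  done
  # awk computes the floating-point percentage, which bash arithmetic cannot.
  awk -v p="$passed" -v t="$total" \
    'BEGIN { printf "%d/%d (%.1f%%)\n", p, t, 100 * p / t }'
}

# Per-script results from the Executive Summary table.
tally 65/66 18/18 38/38 27/28 16/16 26/26
```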


How to Reproduce (macOS)

Prerequisites

# 1. AWS CLI + SSO login
brew install awscli jq
aws sso login --profile tradai

# 2. Clone the repository
git clone git@github.com:tradai-bot/tradai.git
cd tradai

# 3. Make scripts executable
chmod +x docs/verification/*.sh
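
Before running anything, it can help to confirm that the tools are installed and that the SSO session points at the dev account. The helpers below are a hedged sketch: `missing_tools` and `check_account` are hypothetical names, and the account ID is taken from the report header.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; not part of the repository.

# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Succeed only if the active SSO session resolves to the expected account.
# The default account ID is the dev account named in the report header.
check_account() {
  local profile="${1:-tradai}" expected="${2:-600802701449}" actual
  actual="$(aws sts get-caller-identity --profile "$profile" \
    --query Account --output text)" || return 1
  [ "$actual" = "$expected" ]
}

# Usage (needs network access and a valid SSO session):
#   missing="$(missing_tools aws jq git)"
#   [ -z "$missing" ] || echo "missing: $missing"
#   check_account tradai || echo "run: aws sso login --profile tradai"
```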

Run All Verification Scripts

# Set ALB base URL (optional — scripts default to the dev ALB)
export ALB_BASE="http://tradai-dev-1942285475.eu-central-1.elb.amazonaws.com"

# Issue #89 — Infrastructure
./docs/verification/issue89-verify.sh
# Expected: 65/66 PASS (J.2: SNS 0 email subscribers — configuration gap)

# Issue #90 — Backtest E2E (submits a real backtest, takes ~3-5 min)
./docs/verification/issue90-verify.sh
# Expected: 18/18 PASS

# Issue #91 — Training Flow E2E (runs retraining workflow, takes ~5-10 min)
./docs/verification/issue91-verify.sh
# Expected: 38/38 PASS

# Issue #92 — Promotion & Rollback E2E
./docs/verification/issue92-verify.sh
# Expected: 27/28 PASS (H.1: SNS notification not sent)

# Issue #308-313 — Config Versioning E2E
./docs/verification/config-versioning-e2e-aws.sh
# Expected: 16/16 PASS

# Issue #376/#385 — API Endpoint Audit
./docs/verification/api-endpoint-audit.sh
# Expected: 26/26 PASS, 2 FIXED, 5 known bugs
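
To run the whole suite unattended, the six commands above could be wrapped in a loop that keeps one log per script. This is only a sketch: `run_all` and the log directory are hypothetical, and it assumes a non-zero exit status signals failures, which the report does not confirm for partial passes such as 65/66.

```shell
#!/usr/bin/env bash
# Run each verification script in order and record its exit status.
# Assumption: a non-zero exit status means the script reported failures.
SCRIPTS=(
  issue89-verify.sh
  issue90-verify.sh
  issue91-verify.sh
  issue92-verify.sh
  config-versioning-e2e-aws.sh
  api-endpoint-audit.sh
)

run_all() {
  local dir="$1" log_dir="$2" name status failed=0
  mkdir -p "$log_dir"
  for name in "${SCRIPTS[@]}"; do
    # Keep going after a failure so one broken script does not hide the rest.
    if bash "$dir/$name" >"$log_dir/${name%.sh}.log" 2>&1; then
      status=0
    else
      status=$?
    fi
    printf '%-32s exit=%d\n' "$name" "$status"
    [ "$status" -eq 0 ] || failed=$(( failed + 1 ))
  done
  # The return value is the number of scripts that exited non-zero.
  return "$failed"
}

# Usage: run_all docs/verification /tmp/tradai-verify-logs
```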

Notes for macOS

  • Scripts use bash with set -euo pipefail — no special shell required
  • jq is needed for JSON parsing; the scripts degrade gracefully if it is missing, but some checks will then show warnings
  • AWS CLI v2 required for SSO auth and DynamoDB/S3/ECS/Step Functions queries
  • issue90-verify.sh and issue91-verify.sh submit real AWS workloads — cost is minimal (~$0.10 per run)
  • All scripts are idempotent — safe to re-run
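
A script that runs under `set -euo pipefail` needs a way to record a failed check without aborting. The helper below sketches one common pattern; the `check` function and its counters are hypothetical and may differ from what the verification scripts actually do.

```shell
#!/usr/bin/env bash
set -euo pipefail

PASS=0
FAIL=0

# Hypothetical check helper: run a command, record the outcome, never abort.
check() {
  local label="$1"
  shift
  # A command tested by `if` is exempt from `set -e`, so a failing check is
  # counted instead of terminating the script.
  if "$@" >/dev/null 2>&1; then
    PASS=$(( PASS + 1 ))
    echo "PASS  $label"
  else
    FAIL=$(( FAIL + 1 ))
    echo "FAIL  $label"
  fi
}

# PASS=$(( PASS + 1 )) is used rather than (( PASS++ )): the latter returns a
# non-zero status when the old value is 0, which `set -e` treats as an error.

check "bash is available" command -v bash
check "deliberately failing check" false
echo "Result: $PASS/$(( PASS + FAIL )) PASS"
```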

Detailed Results

Issue #89 — Deploy Dev Environment (65/66 PASS)

| Section | Checks | Pass | Fail | Notes |
|---|---|---|---|---|
| A. Pulumi stacks | 4 | 4 | 0 | All 4 stacks deployed |
| B. Lambda functions | 3 | 3 | 0 | 18 functions (exceeds req of 16) |
| C. ALB & Health | 4 | 4 | 0 | All targets healthy |
| D. DynamoDB | 2 | 2 | 0 | 12 tables present |
| E. S3 buckets | 2 | 2 | 0 | 5 buckets |
| F. Step Functions | 3 | 3 | 0 | 2 state machines |
| G. ECS | 4 | 4 | 0 | Cluster + task defs |
| H. CloudWatch alarms | 4 | 4 | 0 | Alarms configured |
| I. ECR | 3 | 3 | 0 | Repos with images |
| J. SNS | 2 | 1 | 1 | J.2: 0 email subscribers |
| K. RDS | 2 | 2 | 0 | PostgreSQL 15, available |
| L. Consolidated EC2 | 5 | 5 | 0 | ASG + instance healthy |
| M-P. Remaining | 27 | 27 | 0 | Cognito, SQS, security, etc. |

Gap: J.2 — SNS topic tradai-alerts-dev has 0 email subscriptions. Alert pipeline configured but no subscribers.
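
Closing this gap comes down to two AWS CLI calls: one to count email subscriptions and one to add a subscriber. The sketch below assumes the topic lives in the account and region named in the report header; the helper names and the email address are placeholders.

```shell
#!/usr/bin/env bash
# Assumptions: the topic ARN is built from the dev account and region in the
# report header; helper names and the example address are placeholders.

topic_arn() {
  local region="${1:-eu-central-1}" account="${2:-600802701449}"
  echo "arn:aws:sns:${region}:${account}:tradai-alerts-dev"
}

# Print the number of email subscriptions on the topic (what J.2 checks).
# Subscriptions still awaiting confirmation are included in the count.
count_email_subs() {
  aws sns list-subscriptions-by-topic --profile tradai --region eu-central-1 \
    --topic-arn "$(topic_arn)" \
    --query "length(Subscriptions[?Protocol=='email'])" --output text
}

# Request an email subscription; the recipient must confirm it by email
# before SNS delivers anything to that address.
subscribe_email() {
  aws sns subscribe --profile tradai --region eu-central-1 \
    --topic-arn "$(topic_arn)" \
    --protocol email --notification-endpoint "$1"
}

# Usage (needs AWS access):
#   subscribe_email "oncall@example.com" && count_email_subs
```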

Issue #90 — Backtest E2E Flow (18/18 PASS)

Full happy path verified:

API submit → Backend → Step Functions → ECS task (FreqAI/LightGBM)
  → S3 results → DynamoDB status → API retrieval

| Step | Check | Result |
|---|---|---|
| A.1-A.3 | Submit backtest via API | 202 Accepted, job_id returned |
| B.1-B.3 | Step Functions execution | Started, RUNNING state |
| C.1-C.3 | Poll until completion | SUCCEEDED |
| D.1-D.2 | DynamoDB job status | succeeded, result populated |
| E.1-E.2 | S3 artifacts | Backtest results uploaded |
| F.1-F.3 | API retrieval | Job retrievable, equity data present |
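
Steps B and C (start the execution, then poll until it finishes) amount to a bounded polling loop. The following is a minimal sketch, not the logic of issue90-verify.sh: `poll_until_done` and `sfn_status` are hypothetical names, and the execution ARN must be supplied by the caller.

```shell
#!/usr/bin/env bash
# Poll a status command until it reports a terminal Step Functions state.
# Arguments: max attempts, delay in seconds, then the status command.
# Returns 0 on SUCCEEDED, 1 on a failed terminal state, 2 if the status
# command itself fails, 3 if the attempts run out.
poll_until_done() {
  local max_attempts="$1" delay="$2" status attempt
  shift 2
  for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
    status="$("$@")" || return 2
    case "$status" in
      SUCCEEDED) echo "$status"; return 0 ;;
      FAILED|TIMED_OUT|ABORTED) echo "$status"; return 1 ;;
    esac
    sleep "$delay"
  done
  echo "TIMEOUT"
  return 3
}

# Print the current status of one execution (needs AWS access).
sfn_status() {
  aws stepfunctions describe-execution --profile tradai \
    --execution-arn "$1" --query status --output text
}

# Usage: poll_until_done 40 15 sfn_status "$EXECUTION_ARN"
```

Forty attempts at 15 seconds each allow up to 10 minutes, which covers the 3-5 minute runtime quoted for the backtest with some margin.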

Issue #91 — Training Flow (38/38 PASS)

| Section | Tests | Result |
|---|---|---|
| A. Prerequisites | 4 | PASS |
| B. Failure path | 9 | PASS |
| C. Happy path | 13 | PASS |
| D. Skip path | 3 | PASS |
| E. Config version | 3 | PASS |
| F. MLflow verification | 6 | PASS |

Issue #92 — Promotion & Rollback (27/28 PASS)

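| Section | Tests | Pass | Notes |
|---|---|---|---|
| A. Setup | 2 | 2 | |
| B. Stage | 2 | 2 | |
| C. Promote | 3 | 3 | |
| D. Model rollback | 3 | 3 | |
| E. Verify rollback | 5 | 5 | |
| F. Safety | 2 | 2 | |
| G. Error handling | 3 | 3 | |
| H. Notifications | 1 | 0 | SNS notification not sent |
| I. Cleanup | 7 | 7 | |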

Gap: H.1 — alert_publisher Lambda not invoked after rollback. The rollback operation succeeds but does not trigger an alert.

Issues #308-313 — Config Versioning (16/16 PASS)

10-step AWS E2E flow:

Create config → Activate → Submit backtest with config_version_id
  → Verify propagation through Step Functions → ECS → S3 → DynamoDB
  → Verify config_version_id in backtest result
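
Each propagation check reduces to finding the same `config_version_id` in the JSON produced at that hop. The helper below is a minimal sketch of such a check; `has_config_version` is a hypothetical name, and whether each hop exposes the field under exactly this key is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the JSON text on stdin contains
# "config_version_id": "<id>". A plain-text match keeps the sketch free of
# dependencies; it assumes IDs hold only letters, digits, "-" and "_".
has_config_version() {
  local id="$1"
  grep -Eq "\"config_version_id\"[[:space:]]*:[[:space:]]*\"${id}\""
}

# Usage against a Step Functions execution input (needs AWS access):
#   aws stepfunctions describe-execution --execution-arn "$EXECUTION_ARN" \
#     --query input --output text | has_config_version "$CONFIG_VERSION_ID"
```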

Issue #376/#385 — API Endpoint Audit (26/26 PASS + 2 FIXED + 5 Known Bugs)

Previously broken, now FIXED:

  • K.1: GET /backtests Pydantic deserialization error → now 200
  • K.5: POST /models/{name}/rollback 503 → now returns proper response

Known bugs still present:

| ID | Endpoint | Issue | Severity |
|---|---|---|---|
| K.2 | GET /strategies/{name}/instances | 503: ECS IAM Access denied | Medium |
| K.3 | POST /strategies/{name}/run | 503: ECS IAM Access denied | Medium |
| K.4 | GET /strategies/pnl | 503: proxy 404 | Low |
| K.6 | GET /strategies/{name}/trades | 400: DynamoDB key schema mismatch | Medium |
| K.7 | GET /mlflow/api/2.0/mlflow/experiments/list | 404: expected (use backend proxy) | Info |
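
Any row in this table can be re-checked by hand by comparing the observed status code with the expected one. The sketch below is not taken from api-endpoint-audit.sh: the helper names are hypothetical, and it assumes the routes sit under /api/v1 (the prefix the report uses for its proxy routes) and need no auth header.

```shell
#!/usr/bin/env bash
# Assumptions: routes are served under /api/v1 and need no auth header;
# both may differ in practice.
ALB_BASE="${ALB_BASE:-http://tradai-dev-1942285475.eu-central-1.elb.amazonaws.com}"
API_PREFIX="${API_PREFIX:-/api/v1}"

# Print only the HTTP status code of a request.
http_status() {
  local method="$1" url="$2"
  curl -s -o /dev/null -w '%{http_code}' --max-time 15 -X "$method" "$url"
}

# Compare the observed status code with the expected one.
audit_endpoint() {
  local method="$1" path="$2" expected="$3" actual
  actual="$(http_status "$method" "${ALB_BASE}${API_PREFIX}${path}")" || actual="000"
  if [ "$actual" = "$expected" ]; then
    echo "PASS  $method $path -> $actual"
  else
    echo "FAIL  $method $path -> $actual (expected $expected)"
    return 1
  fi
}

# Usage (needs network access to the dev ALB):
#   audit_endpoint GET /backtests 200
#   audit_endpoint GET /strategies/pnl 200   # fails while K.4 is open
```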

#385 Coverage:

  • Group 1 (MLflow REST API 404): Covered. The fix was to route through the backend proxy (/api/v1/experiments, /api/v1/runs). The direct MLflow REST API still returns 404 (K.7), which is expected.
  • Group 2 (data export/sync proxy): Partially covered. Proxy routes exist, but /api/v1/export and /api/v1/sync are not explicitly tested in the audit.
  • Group 3 (strategy stage 503): Covered. F.2 verifies that the stage endpoint returns 422 (validation), and the K.5 rollback is now FIXED.


Remaining Gaps

| # | Gap | Affected Issues | Severity | Recommended Action |
|---|---|---|---|---|
| 1 | SNS: 0 email subscribers | #89, #92 | Medium | Add email subscription to tradai-alerts-dev |
| 2 | ECS operations IAM denied | #376 (K.2, K.3) | Medium | Fix ecs:DescribeTasks / ecs:RunTask IAM policy |
| 3 | PnL proxy 404 | #376 (K.4) | Low | Wire /strategies/pnl route |
| 4 | Trades DynamoDB key mismatch | #376 (K.6) | Medium | Fix DynamoDB key schema in trades query |
| 5 | Audit missing export/sync test | #385 | Low | Add D.4/D.5 checks to api-endpoint-audit.sh |
| 6 | Postman collections outdated | | Medium | Update both collections (see below) |
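
For gap 2, the IAM policy simulator can confirm which ECS actions are denied before and after a policy change. This is a hedged sketch: the helper names are hypothetical, and the role ARN must be supplied by the caller because the report does not name the role the backend runs under. Note that ecs:RunTask usually also requires iam:PassRole for the task roles, which this sketch does not test.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the caller provides the role ARN to evaluate.

# Ask IAM how it would evaluate the two ECS actions for the given role.
# Output: one "action<TAB>decision" line per action.
simulate_ecs_actions() {
  aws iam simulate-principal-policy --profile tradai \
    --policy-source-arn "$1" \
    --action-names ecs:DescribeTasks ecs:RunTask \
    --query 'EvaluationResults[].[EvalActionName,EvalDecision]' --output text
}

# Read "action decision" lines on stdin and print the actions not allowed.
denied_actions() {
  awk '$2 != "allowed" { print $1 }'
}

# Usage (needs AWS access and permission to run the simulator):
#   simulate_ecs_actions "$BACKEND_ROLE_ARN" | denied_actions
```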

Postman Collection Audit

Both dev-tradai.postman_collection.json (local) and aws-dev-tradai.postman_collection.json (AWS) contain 19 endpoints each. The backend service exposes 34+ routes.

Endpoints Missing from Postman Collections

| Category | Missing Endpoints | Count |
|---|---|---|
| Backtests | POST/GET /backtests, GET /{id}, POST /{id}/cancel, GET /{id}/equity, GET /{id}/report-data | 6 |
| Config Versions | POST /configs, GET /configs/{strategy}, GET /{id}, POST activate, POST deprecate | 5 |
| Trading Ops | POST /strategies/{id}/run, GET instances, POST stop/pause/resume, GET logs | 6 |
| Trading Status | GET /trading/status, GET /strategies/pnl, GET /strategies/{id}/trades | 3 |
| MLflow Proxy | GET /runs/{exp_id}, GET /runs/detail/{id}, GET metrics, GET metrics/history | 4 |
| Other | GET /catalog/leaderboard, GET compare, POST /ws/ticket, GET routing-info, GET /data/ohlcv | 5 |

Total: 29 endpoints missing (60% of backend routes)

Issues in Current Collections

| Issue | Collection | Details |
|---|---|---|
| Empty base_url | Local | Variable defined but empty; requests fail without manual config |
| Hardcoded localhost | Local | strategy-stage uses http://localhost:8000/... instead of {{base_url}} |
| Hardcoded localhost | Local | mlflow(all-needs) uses http://localhost:5001/... |
| Stale MLflow paths | AWS | mlflow(all-needs) folder still uses /mlflow/api/2.0/... (returns 404) |
| Duplicate entries | Both | strategy-detail and strategy-config are identical requests |

Recommendations

  1. Regenerate from OpenAPI spec — backend has auto-generated OpenAPI docs at /api/v1/docs
  2. Use environment variables consistently — replace all hardcoded URLs
  3. Add all 34 backend endpoints — current coverage is only 40%
  4. Remove stale MLflow REST API entries — replace with backend proxy routes
  5. Add test assertions — Postman supports test scripts for response validation
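
Recommendation 2 can be checked mechanically by listing every line in a collection file that carries a literal URL. The helper below is a minimal sketch with a hypothetical name; it works on raw text rather than parsed JSON, so it assumes the collection file is pretty-printed with one URL per line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list lines in a Postman collection that contain a
# literal http(s) URL instead of the {{base_url}} variable. The collection's
# own schema URL is excluded because it is not a request target.
find_hardcoded_urls() {
  grep -nE 'https?://' "$1" \
    | grep -vF '{{base_url}}' \
    | grep -vF 'schema.getpostman.com'
}

# Usage:
#   find_hardcoded_urls dev-tradai.postman_collection.json
```

An empty result means every request URL goes through `{{base_url}}`; any line printed is a candidate for replacement.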