Performance Tuning

Guide for optimizing TradAI performance.


Backtest Optimization

Data Loading

Problem: Slow data loading for large backtests.

Solutions:

  1. Pre-fetch data to ArcticDB:

    # Sync data before backtest
    tradai data sync --symbols BTC/USDT:USDT --timeframe 1h --days 365
    

  2. Use appropriate timeframes:

     - Development: use 1h or 4h for faster iteration
     - Production: use 5m or 15m for final validation

  3. Limit symbol count for development:

    # Development: Test with fewer symbols
    symbols = ["BTC/USDT:USDT", "ETH/USDT:USDT"]
    
    # Production: Full symbol list
    symbols = strategy.metadata.symbols
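The iteration cost of a timeframe choice is easy to quantify: candle count scales inversely with the timeframe, so a 365-day window at 5m holds 48x the candles of the same window at 4h. A quick sketch (the helper function is illustrative, not part of TradAI):

```python
# Rough candle counts for a 365-day backtest at different timeframes.
# Fewer candles means faster data loading and feature calculation.
MINUTES_PER_DAY = 24 * 60

def candle_count(days: int, timeframe_minutes: int) -> int:
    """Number of candles a backtest window contains."""
    return days * MINUTES_PER_DAY // timeframe_minutes

for label, minutes in [("5m", 5), ("15m", 15), ("1h", 60), ("4h", 240)]:
    print(f"{label}: {candle_count(365, minutes):,} candles")
```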
    

Memory Management

Problem: Out of memory during backtest.

Solutions:

  1. Reduce data window:

    # freqtrade config
    startup_candle_count: 200  # Reduce from default 400
    

  2. Use chunked processing:

    # Process data in chunks to bound memory usage
    import pandas as pd

    for chunk in pd.read_csv(file, chunksize=10000):
        process_chunk(chunk)
    

  3. Enable swap space:

    # Add 4GB swap
    sudo fallocate -l 4G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile
    

Parallel Backtests

Run multiple backtests concurrently:

# Using GNU parallel
parallel -j 4 tradai backtest --strategy {} ::: Strategy1 Strategy2 Strategy3 Strategy4
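If GNU parallel is unavailable, the same fan-out can be sketched in Python with a bounded thread pool. The `run_parallel` helper below is illustrative, not part of the tradai CLI:

```python
# Run shell commands concurrently with a bounded worker pool,
# mirroring `parallel -j 4` above.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_parallel(commands: list[str], max_workers: int = 4) -> list[int]:
    """Run shell commands concurrently; return exit codes in input order."""
    def run(cmd: str) -> int:
        return subprocess.run(cmd, shell=True).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))

cmds = [f"tradai backtest --strategy {s}" for s in ("Strategy1", "Strategy2")]
# exit_codes = run_parallel(cmds)  # requires the tradai CLI on PATH
```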

Data Sync Optimization

Batch Size

Adjust batch size based on exchange rate limits:

# data-collection settings
DATA_COLLECTION_STREAMING_BUFFER_SIZE=100  # Candles per batch
DATA_COLLECTION_STREAMING_FLUSH_INTERVAL=60  # Seconds between flushes
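The two settings interact as a size-or-time trigger: a batch is written when the buffer reaches `STREAMING_BUFFER_SIZE` candles or when `STREAMING_FLUSH_INTERVAL` seconds have elapsed, whichever comes first. A minimal sketch of that pattern (illustrative only, not the data-collection service's actual code):

```python
# Buffer candles and flush on size or elapsed time, whichever comes first.
import time

class CandleBuffer:
    def __init__(self, flush, buffer_size: int = 100, flush_interval: float = 60.0):
        self.flush_fn = flush          # callback receiving a batch of candles
        self.buffer_size = buffer_size
        self.flush_interval = flush_interval
        self.buffer: list = []
        self.last_flush = time.monotonic()

    def add(self, candle) -> None:
        self.buffer.append(candle)
        interval_due = time.monotonic() - self.last_flush >= self.flush_interval
        if len(self.buffer) >= self.buffer_size or interval_due:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.flush_fn(self.buffer)
            self.buffer = []
        self.last_flush = time.monotonic()
```

Larger buffers mean fewer writes (friendlier to exchange rate limits); a shorter interval bounds how stale stored data can get.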

Concurrent Fetching

Use AsyncCCXTRepository for concurrent symbol fetching:

import asyncio

from tradai.data.infrastructure.repositories import AsyncCCXTRepository

async def fetch_all_symbols():
    repo = AsyncCCXTRepository(config=exchange_config)

    # Fetch all symbols concurrently
    tasks = [
        repo.fetch_ohlcv(SymbolList.from_input([symbol]), date_range, timeframe)
        for symbol in symbols
    ]
    results = await asyncio.gather(*tasks)
    return results

ArcticDB Optimization

  1. Use versioning wisely:

    # Append data instead of full writes
    adapter.append(new_data, symbol)
    

  2. Configure deduplication:

    # Drop duplicate rows on write instead of storing them twice
    adapter.save(data, symbols, datetime.now(UTC), dedup=True)
    


API Performance

Caching

Enable response caching for read-heavy endpoints:

from fastapi_cache.decorator import cache

@router.get("/strategies")
@cache(expire=60)  # Cache for 60 seconds
async def list_strategies():
    ...

Connection Pooling

Configure database connection pooling:

# SQLAlchemy settings
SQLALCHEMY_POOL_SIZE=10
SQLALCHEMY_MAX_OVERFLOW=20
SQLALCHEMY_POOL_TIMEOUT=30
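If the engine is constructed in code, the same settings map onto `create_engine` keyword arguments. The helper below is a hypothetical sketch showing that mapping, not TradAI's actual configuration code:

```python
# Map the pooling environment variables above onto create_engine kwargs.
import os

def pool_kwargs() -> dict:
    return {
        "pool_size": int(os.getenv("SQLALCHEMY_POOL_SIZE", "10")),
        "max_overflow": int(os.getenv("SQLALCHEMY_MAX_OVERFLOW", "20")),
        "pool_timeout": int(os.getenv("SQLALCHEMY_POOL_TIMEOUT", "30")),
        # Check connections before use so stale ones are recycled
        "pool_pre_ping": True,
    }

# engine = create_engine(DATABASE_URL, **pool_kwargs())
```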

Async Operations

Use async for I/O-bound operations:

# Good: Async database query
async def get_strategy(strategy_id: str):
    async with session.begin():
        return await session.get(Strategy, strategy_id)

# Bad: Sync in async context
def get_strategy_sync(strategy_id: str):
    return session.query(Strategy).get(strategy_id)

ML Training Optimization

FreqAI Settings

# Reduce model complexity for faster training
freqai_config = {
    "feature_parameters": {
        "include_timeframes": ["1h"],  # Fewer timeframes
        "indicator_periods_candles": [10, 20],  # Fewer periods
    },
    "data_split_parameters": {
        "test_size": 0.2,  # Hold out 20% for testing
    },
    "model_training_parameters": {
        "n_estimators": 100,  # Reduce from 1000
        "max_depth": 5,  # Limit tree depth
    },
}

GPU Acceleration

For supported models:

# Enable CUDA
export CUDA_VISIBLE_DEVICES=0

# Install GPU-enabled PyTorch (CUDA 11.8 build)
pip install torch --index-url https://download.pytorch.org/whl/cu118

# TensorFlow 2.x bundles GPU support in the main package
pip install tensorflow

Model Caching

Cache trained models to avoid retraining:

# MLflow model caching
from mlflow.tracking import MlflowClient

client = MlflowClient()
model = mlflow.pyfunc.load_model(f"models:/{model_name}/Production")

Infrastructure Optimization

ECS Task Sizing

Task Type | CPU (units) | Memory (MiB) | Use Case
--------- | ----------- | ------------ | ------------
Small     | 256         | 512          | API services
Medium    | 512         | 1024         | Data sync
Large     | 1024        | 2048         | Backtests
XLarge    | 2048        | 4096         | ML training

# Pulumi task definition
task_def = aws.ecs.TaskDefinition(
    "strategy-task",
    cpu="1024",
    memory="2048",
    ...
)

Auto-Scaling

Configure ECS auto-scaling:

# Scale on CPU utilization
scaling_target = aws.appautoscaling.Target(
    max_capacity=10,
    min_capacity=1,
    scalable_dimension="ecs:service:DesiredCount",
)

scaling_policy = aws.appautoscaling.Policy(
    policy_type="TargetTrackingScaling",
    target_tracking_scaling_policy_configuration={
        "target_value": 70.0,  # Target 70% CPU
        "predefined_metric_specification": {
            "predefined_metric_type": "ECSServiceAverageCPUUtilization",
        },
    },
)

Fargate Spot

Use Fargate Spot for cost savings on non-critical workloads:

# Use Spot for backtests
capacity_provider_strategy=[
    {
        "capacity_provider": "FARGATE_SPOT",
        "weight": 1,
        "base": 0,
    },
]

Profiling Tools

Python Profiling

import cProfile
import pstats

# Profile a function
profiler = cProfile.Profile()
profiler.enable()
result = slow_function()
profiler.disable()

# Print stats
stats = pstats.Stats(profiler)
stats.sort_stats("cumulative")
stats.print_stats(20)

Memory Profiling

# Install memory profiler
pip install memory-profiler

# Run with line-by-line memory profiling
python -m memory_profiler script.py

Decorate the functions you want profiled:

from memory_profiler import profile

@profile
def memory_intensive_function():
    ...

Line Profiling

# Install line profiler
pip install line_profiler

# Profile specific function
kernprof -l -v script.py
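kernprof injects the `profile` decorator as a builtin at runtime, so a script decorated for line profiling raises NameError when run normally. A no-op fallback keeps it runnable either way (sketch with a hypothetical function):

```python
# kernprof defines `profile` as a builtin when it runs the script.
# This fallback makes the same script runnable without kernprof.
try:
    profile  # provided by kernprof when present
except NameError:
    def profile(func):
        return func

@profile
def rolling_mean(values: list[float], window: int) -> list[float]:
    # Line-by-line timings appear for this body under `kernprof -l -v`
    out = []
    for i in range(window - 1, len(values)):
        out.append(sum(values[i - window + 1 : i + 1]) / window)
    return out
```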

Monitoring Performance

CloudWatch Metrics

Key metrics to monitor:

Metric             | Alert Threshold | Description
------------------ | --------------- | ------------------
CPU Utilization    | > 80%           | Task overloaded
Memory Utilization | > 80%           | Memory pressure
Request Latency    | p99 > 1s        | Slow responses
Error Rate         | > 1%            | Too many failures

Custom Metrics

import boto3

cloudwatch = boto3.client("cloudwatch")

def record_metric(name: str, value: float, unit: str = "Count"):
    cloudwatch.put_metric_data(
        Namespace="TradAI",
        MetricData=[{
            "MetricName": name,
            "Value": value,
            "Unit": unit,
        }],
    )

# Record backtest duration
record_metric("BacktestDuration", duration_seconds, "Seconds")

Benchmarks

Expected Performance

Operation                | Time    | Notes
------------------------ | ------- | ----------------
Backtest (30 days, 1h)   | < 30s   | Single symbol
Backtest (365 days, 1h)  | < 5min  | Single symbol
Data sync (1 symbol, 1h) | < 10s   | Per 1000 candles
API response (cached)    | < 50ms  | With caching
API response (uncached)  | < 500ms | Without caching

Benchmark Script

# Run performance benchmark
tradai benchmark --strategy MyStrategy --days 30

# Output:
# Data loading: 2.3s
# Feature calculation: 5.1s
# Model inference: 1.2s
# Total backtest time: 8.6s
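A per-stage breakdown like the output above can be collected with a small timing context manager. This is an illustrative sketch, not the tradai benchmark implementation:

```python
# Time named stages and print a per-stage breakdown.
import time
from contextlib import contextmanager

timings: dict[str, float] = {}

@contextmanager
def stage(name: str):
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start

with stage("Data loading"):
    time.sleep(0.01)  # stand-in for the real work
with stage("Feature calculation"):
    time.sleep(0.01)

for name, seconds in timings.items():
    print(f"{name}: {seconds:.1f}s")
print(f"Total backtest time: {sum(timings.values()):.1f}s")
```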

See Also