# Step 3 — Run a Traffic Simulation
Once the canvas is wired and services are configured, run the simulation to propagate synthetic traffic through the architecture and observe per-node behaviour.
## Setting simulation parameters
### Base RPS
Set this to your expected peak load, not your average. The goal is to surface how the architecture behaves at stress, not at idle. For initial runs, a good practice is to set three scenarios:
- Baseline: typical daily load
- Peak: known high-traffic period (e.g. 3–5× baseline)
- Stress: maximum plausible spike (e.g. 10× baseline, or a campaign launch scenario)
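As a rough sketch, the three scenarios above can be derived from a single baseline figure. The 4× peak and 10× stress multipliers below are illustrative assumptions within the ranges suggested, not fixed product defaults:

```python
# Hypothetical helper: derive the three suggested simulation scenarios
# from a single baseline RPS figure. The multipliers are assumptions
# (peak ~3-5x baseline, stress ~10x baseline per the guidance above).
def simulation_scenarios(baseline_rps, peak_factor=4, stress_factor=10):
    return {
        "baseline": baseline_rps,
        "peak": baseline_rps * peak_factor,
        "stress": baseline_rps * stress_factor,
    }

# e.g. a service with a typical daily load of 2,500 RPS
scenarios = simulation_scenarios(2_500)
```

Running all three as separate simulations makes Execution History comparisons straightforward: each run answers one question at one load level.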
The RPS slider ranges from 10 to 100,000,000 requests per second.
### Traffic pattern
Constant is available on all plans; Ramp, Spike, and Wave require a Pro, Team, or Enterprise plan.
| Pattern | When to use |
|---|---|
| Constant | Steady-state validation — confirms the architecture handles sustained load. Available on all plans. |
| Ramp | Gradual traffic growth — tests auto-scaling responsiveness |
| Spike | Sudden burst — stress-tests cold start behaviour, concurrency limits, and throttling |
| Wave | Periodic traffic (e.g. hourly batch jobs, daily peaks) — tests recovery between bursts |
Run at least two traffic patterns per architecture iteration — Constant confirms steady-state health; Spike reveals latency and concurrency failure modes that Constant will not expose.
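To make the four shapes concrete, here is a minimal sketch of each pattern as an RPS-over-time function. The curve shapes follow the table above; the exact parameters (spike multiplier, wave amplitude, cycle count) are illustrative assumptions, not the simulator's internals:

```python
import math

def constant(base_rps, t, duration):
    # Steady-state: the same rate for the whole run.
    return base_rps

def ramp(base_rps, t, duration):
    # Linear growth from zero up to base_rps over the run.
    return base_rps * min(t / duration, 1.0)

def spike(base_rps, t, duration, spike_at=0.5, width=0.05):
    # Low steady traffic with a sudden burst (assumed 10x) mid-run.
    in_spike = abs(t / duration - spike_at) < width
    return base_rps * (10 if in_spike else 1)

def wave(base_rps, t, duration, cycles=4):
    # Periodic oscillation around the base rate (assumed +/-50%).
    return base_rps * (1 + 0.5 * math.sin(2 * math.pi * cycles * t / duration))
```

Plotting these over a 60-second run shows why Constant and Spike expose different failure modes: Spike forces cold starts and concurrency ceilings that a flat rate never touches.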
## Live simulation metrics
While the simulation runs, the panel updates in real time:
| Metric | What it shows |
|---|---|
| Current RPS | Live requests per second flowing through the architecture |
| Elapsed Time | Duration the simulation has been running |
| Est. Cost | Real-time monthly cost estimate at the current load level (Pro and above) |
| Alerts | Count of threshold breaches or configuration warnings |
| Node Metrics | Per-service RPS, latency, health status, and utilisation % |
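For intuition on the Est. Cost figure, monthly cost can be extrapolated from the current request rate. This is a made-up illustration of the extrapolation, not the product's actual pricing model; the per-million-requests price is an assumption:

```python
# Illustrative only: extrapolate a monthly cost from the live RPS.
# The $0.40 per million requests figure is a hypothetical price,
# not the simulator's real pricing data.
SECONDS_PER_MONTH = 30 * 24 * 3600

def est_monthly_cost(current_rps, price_per_million_requests=0.40):
    monthly_requests = current_rps * SECONDS_PER_MONTH
    return monthly_requests / 1_000_000 * price_per_million_requests
```

Because the estimate scales linearly with load, watching it during a Ramp run shows directly how cost grows with traffic.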
A non-zero alert count requires investigation before proceeding. Do not dismiss alerts without understanding their cause.
## Simulation controls
Simulations can be started, stopped, paused, and resumed from the panel.
If you change Lambda concurrency settings while a simulation is paused, stop the simulation fully and restart it. Concurrency changes are applied at simulation initialisation — resuming a paused run will not pick up the new values.
Each run is saved automatically to Execution History (Pro and above).
## Simulation best practices
- Let each simulation run long enough to stabilise. Particularly for Ramp and Wave patterns, early metrics may not represent steady-state behaviour. For most architectures, 30–60 seconds of simulated time is sufficient.
- Do not dismiss zero-alert runs as complete. A clean simulation confirms the architecture handles load without breaching thresholds — it does not mean the architecture is optimised. Proceed to AI Recommendations even when alerts are zero.
- Log your simulation intent. Before each run, note what question you are trying to answer. This makes Execution History comparisons meaningful.
- Test at the extremes, not just the expected load. If your expected peak is 10,000 RPS, also simulate at 1,000 RPS (normal load) and 100,000 RPS (stress/campaign spike). A brittle component often appears only at the edges.
- Pay attention to which node throttles first. The first node to show throttling under increasing load is your current bottleneck. In a typical Lambda API stack, the throttle order under load is: Lambda concurrency → API Gateway request limits → CloudFront. Addressing them in this order is the most efficient path to a healthy simulation.
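The throttle-order observation above can be sketched numerically. Lambda's concurrency limit converts to an effective RPS ceiling via Little's law (concurrency ≈ RPS × average duration), which is why it is usually breached first. The limit values and the 200 ms average duration below are illustrative assumptions, not service defaults:

```python
# Sketch: which node throttles first as offered load increases?
# All limit values here are hypothetical examples.
NODE_LIMITS = {
    "lambda": 1_000,        # reserved concurrency (concurrent executions)
    "api_gateway": 10_000,  # requests per second
    "cloudfront": 250_000,  # requests per second
}

def first_throttle(rps, avg_lambda_duration_s=0.2):
    # Little's law: concurrency = RPS x duration, so the Lambda
    # limit becomes an effective RPS ceiling.
    ceilings = {
        "lambda": NODE_LIMITS["lambda"] / avg_lambda_duration_s,
        "api_gateway": NODE_LIMITS["api_gateway"],
        "cloudfront": NODE_LIMITS["cloudfront"],
    }
    # The bottleneck is the breached node with the lowest ceiling.
    breached = {n: c for n, c in ceilings.items() if rps > c}
    return min(breached, key=breached.get) if breached else None
```

With these numbers, a 1,000-concurrency Lambda at 200 ms per invocation caps out near 5,000 RPS, well below the API Gateway and CloudFront ceilings, matching the Lambda-first throttle order described above.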