Summary of "AWS re:Invent 2025 - Building for Efficiency & Reliability with Performance Testing on AWS (CMP351)"
Session
CMP351 — Building for Efficiency & Reliability with Performance Testing on AWS
Goal
Teach how to build reliable, resilient, and efficient applications by using performance testing to validate behavior, capacity, and cost trade-offs.
Key definitions
- Resiliency: the ability to recover from a crash (how quickly and how seamlessly you recover).
- Reliability: everything you do to prevent failures in the first place.
- Efficiency: meeting SLOs while consuming the least resources possible.
Types of performance tests (what they measure)
- Load testing: expected / normal traffic levels.
- Stress testing: push beyond expected load to find breaking or degradation points (test multiples of expected load: 2x, 5x, 10x).
- Endurance / soak testing: long-run behavior to detect leaks or degradation over time.
- Scalability testing: ability to scale from zero to target (for example, 0 → 1M users).
- Spike testing: sudden jumps (up and down) in traffic; important to test both ramp-up and ramp-down behavior (autoscaling speed and application handling).
- Volume testing: behavior with large volumes of data (database size, file uploads, ingestion).
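The test types above differ mainly in the shape of the load over time. As a minimal sketch (not taken from the session), the profiles can be written as lists of (duration, target users) stages. The function names, stage durations, and the 10x spike multiplier are illustrative assumptions; only the 2x, 5x, 10x stress multiples come from the talk.

```python
from typing import List, Tuple

# One stage: (duration in seconds, target concurrent users at the end of the stage).
Stage = Tuple[int, int]


def build_profile(kind: str, expected_users: int) -> List[Stage]:
    """Return illustrative stages for a named test type (durations are assumptions)."""
    e = expected_users
    profiles = {
        # Load: ramp to the expected level and hold it.
        "load": [(300, e), (900, e)],
        # Stress: step through multiples of expected load (1x, 2x, 5x, 10x).
        "stress": [(300, e * m) for m in (1, 2, 5, 10)],
        # Soak: hold expected load for hours to expose leaks or slow degradation.
        "soak": [(300, e), (8 * 3600, e)],
        # Spike: sudden jump up, short hold, sudden drop, then observe recovery.
        "spike": [(300, e), (10, e * 10), (60, e * 10), (10, e), (300, e)],
    }
    return profiles[kind]


def users_at(stages: List[Stage], t: float) -> int:
    """Linearly interpolate the target user count at second t, starting from zero users."""
    start_users = 0
    elapsed = 0
    for duration, target in stages:
        if t <= elapsed + duration:
            frac = (t - elapsed) / duration
            return round(start_users + (target - start_users) * frac)
        elapsed += duration
        start_users = target
    # Past the end of the profile: stay at the last target.
    return start_users


if __name__ == "__main__":
    for kind in ("load", "stress", "soak", "spike"):
        print(kind, build_profile(kind, 1000))
```

Most load-testing tools accept a staged profile of this kind in their own syntax, so the same shapes can be carried over to whichever engine is in use.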
Why run performance tests
- Scale and deploy with confidence: pick the right compute and database options by testing them.
- Improve reliability and user experience: reduce preventable outages and keep responses within SLOs.
- Cost optimization: find the sweet spot and avoid overprovisioning by measuring response time against instance size.
- Continuous validation: test early, after changes, and regularly (daily or before big events).
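The cost-optimization point can be made concrete with a small selection step: run the same test against several instance sizes, then keep the cheapest configuration that still meets the latency SLO. The sketch below assumes that approach. The class name, size labels, prices, and latencies are invented placeholders, not AWS figures or results from the session.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SizingResult:
    instance_type: str     # hypothetical label, not a real instance name
    hourly_cost: float     # placeholder price per instance-hour
    instances_needed: int  # instances required to carry the test load
    p99_ms: float          # measured 99th-percentile latency


def cheapest_meeting_slo(results: List[SizingResult],
                         slo_p99_ms: float) -> Optional[SizingResult]:
    """Return the lowest-total-cost result whose p99 is within the SLO, or None."""
    passing = [r for r in results if r.p99_ms <= slo_p99_ms]
    if not passing:
        return None
    return min(passing, key=lambda r: r.hourly_cost * r.instances_needed)


if __name__ == "__main__":
    results = [
        SizingResult("small", 0.05, 12, 480.0),
        SizingResult("medium", 0.10, 5, 310.0),
        SizingResult("large", 0.20, 4, 180.0),
    ]
    best = cheapest_meeting_slo(results, slo_p99_ms=400.0)
    print(best.instance_type if best else "no configuration meets the SLO")
```

Ranking by total cost rather than per-instance price matters here, because a larger instance can be cheaper overall if fewer of them are needed.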
What to measure (important metrics)
- Latency percentiles (95th and 99th recommended, because they show what users actually experience).
- Throughput: transactions per second / minute.
- Bandwidth.
- Global latency distribution (test from multiple regions, not just one).
- Errors (type and count).
- Resource utilization for root-cause analysis: CPU, memory, etc.
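As a rough illustration of how the latency, throughput, and error metrics above are derived from raw results, the sketch below summarizes a list of request samples. It uses the nearest-rank percentile definition, which is one common convention; individual tools may interpolate differently. The sample format of (latency in ms, success flag) is an assumption for illustration.

```python
import math
from typing import Dict, List, Tuple


def percentile(values: List[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples: List[Tuple[float, bool]], duration_s: float) -> Dict[str, float]:
    """Summarize (latency_ms, ok) samples collected over duration_s seconds."""
    latencies = [latency for latency, _ in samples]
    errors = sum(1 for _, ok in samples if not ok)
    return {
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "throughput_rps": len(samples) / duration_s,
        "error_rate": errors / len(samples),
    }


if __name__ == "__main__":
    # Synthetic data: latencies 1..100 ms, with every 50th request failing.
    samples = [(float(i), i % 50 != 0) for i in range(1, 101)]
    print(summarize(samples, duration_s=10.0))
```

Percentiles are reported instead of the mean because a handful of very slow requests barely moves an average while still affecting real users.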
Popular test frameworks
- Apache JMeter (Java-based; test plans are stored as XML; long-established)
- k6 (Grafana Labs; JavaScript-based)
- Locust (Python)
AWS offering: Distributed Load Testing on AWS (open-source)
- Fully supported AWS open-source solution, single-tenant (you deploy it into your account).
- Deployable from a CloudFormation template; the demo claims it is production-ready in about 5 minutes.
- Recent release referenced as v4.0.1.
Features:
- Supports multiple engines: JMeter, k6, Locust, or simple HTTP endpoints from a single platform.
- Multi-region / global testing: spin up agents across regions to simulate worldwide traffic.
- Auto-provisions required infrastructure for the test, runs it, then tears down resources when finished.
- Can be run via a web UI or automated via APIs / CI-CD pipelines.
- Configuration options: number of unique IPs, concurrent users per region, ramp-up patterns, sustain time.
- Real-time dashboards: average response time, success / error counts, percentiles, etc.
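To show how the per-region configuration options fit together, here is a minimal planning sketch that splits a total virtual-user target across regions and works out how many load-generator tasks each region would need. This is not the solution's API: the function, the field names, the traffic weights, and the users-per-task limit are all hypothetical.

```python
import math
from typing import Dict


def plan_regions(total_users: int,
                 weights: Dict[str, float],
                 users_per_task: int) -> Dict[str, Dict[str, int]]:
    """Split total_users across regions by weight and size the generator fleet per region.

    users_per_task is an assumed cap on the concurrent users one load generator can drive.
    """
    total_weight = sum(weights.values())
    plan = {}
    for region, weight in weights.items():
        users = round(total_users * weight / total_weight)
        plan[region] = {
            "users": users,
            "tasks": math.ceil(users / users_per_task),
        }
    return plan


if __name__ == "__main__":
    # Assumed traffic mix: half from one region, a quarter from each of two others.
    weights = {"us-east-1": 2, "eu-west-1": 1, "ap-southeast-1": 1}
    for region, cfg in plan_regions(1_000_000, weights, users_per_task=2000).items():
        print(region, cfg)
```

Because each region's share is rounded independently, the per-region totals can differ from the overall target by a few users when the weights do not divide evenly.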
Benefits:
- No need to manage separate testing infrastructure per framework; centralized execution and regional distribution.
Demo highlights (from session)
- Example test (transcribed as "global sport global test") configured to ramp to 1M virtual users in 1 minute and then hold for 1 minute.
- Live view showed per-region agent provisioning, traffic ramping, response-time charts, successes and errors.
- Errors in the demo illustrated autoscaling delays (undersized instances triggered autoscaling too late).
- Emphasis on testing autoscaling speed both up and down.
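The autoscaling behavior seen in the demo can be approximated with a toy model: load ramps linearly, a scale-out is requested once load passes a utilization threshold, and new capacity only arrives after a delay. The sketch below is an assumption-laden simplification, not the demo's configuration. All parameter names and numbers are invented, and scale-in (the ramp-down side) is not modeled.

```python
from collections import deque


def simulate_ramp(ramp_s: int, hold_s: int, peak_rps: float,
                  start_capacity_rps: float, scale_step_rps: float,
                  scale_delay_s: int, scale_threshold: float = 0.8) -> float:
    """Return the total number of requests that exceeded available capacity."""
    capacity = start_capacity_rps
    pending = deque()  # times (seconds) at which requested capacity comes online
    dropped = 0.0
    for t in range(ramp_s + hold_s):
        # Capacity requested earlier becomes available once its delay has elapsed.
        while pending and pending[0] <= t:
            pending.popleft()
            capacity += scale_step_rps
        # Linear ramp up to peak, then hold at peak.
        load = peak_rps * min(1.0, (t + 1) / ramp_s)
        # Request one more step when load exceeds the threshold of planned capacity
        # (planned = available capacity plus capacity already on its way).
        planned = capacity + scale_step_rps * len(pending)
        if load > scale_threshold * planned:
            pending.append(t + scale_delay_s)
        # Anything above current capacity is counted as an error.
        dropped += max(0.0, load - capacity)
    return dropped


if __name__ == "__main__":
    for delay in (10, 60, 120):
        errors = simulate_ramp(60, 60, 1000, 100, 100, delay)
        print(f"scale delay {delay:>3}s -> about {errors:,.0f} failed requests")
```

Even in this simplified model, the error count grows with the provisioning delay, which is why a fast ramp is a useful way to expose slow scale-out.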
Best practices & recommendations
- Run performance tests regularly, not just once — establish and monitor a baseline.
- Investigate deviations quickly (both slower and unexpectedly faster behavior).
- Test globally if you have global traffic; single-region tests are insufficient for realistic behavior.
- Combine different test types based on objectives (for example, stress + endurance + volume).
- Use performance testing to evaluate architecture / instance choices (for example, Graviton vs x86).
- Include performance tests in CI/CD for automated validation.
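The baseline and CI/CD recommendations above can be combined into a simple gate: compare the current run's p95 latency with a stored baseline and flag deviations in either direction. This is a minimal sketch under assumed names and an assumed 10% tolerance, not something prescribed in the session.

```python
def check_against_baseline(baseline_ms: float, current_ms: float,
                           tolerance: float = 0.10) -> str:
    """Classify the current p95 latency as 'ok', 'slower', or 'faster' than baseline."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    change = (current_ms - baseline_ms) / baseline_ms
    if change > tolerance:
        return "slower"
    if change < -tolerance:
        return "faster"
    return "ok"


def ci_exit_code(baseline_ms: float, current_ms: float, tolerance: float = 0.10) -> int:
    """Return 0 when the run is within tolerance, 1 otherwise, for use as a pipeline gate."""
    verdict = check_against_baseline(baseline_ms, current_ms, tolerance)
    return 0 if verdict == "ok" else 1


if __name__ == "__main__":
    print(check_against_baseline(200.0, 250.0))  # 25% above baseline
    print(check_against_baseline(200.0, 150.0))  # 25% below baseline
    print(check_against_baseline(200.0, 205.0))  # within tolerance
```

Treating "faster" as a failure is deliberate: an unexpectedly quick result can mean requests are failing early or skipping work, so it deserves the same investigation as a slowdown.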
Guides, tutorials, demos, and next steps
- Deploy Distributed Load Testing on AWS via the provided CloudFormation template to get started.
- Use the web UI for ad-hoc tests or call APIs from CI/CD pipelines to automate tests.
- The speaker offered a deeper, hands-on demo at the AWS village for a guided walkthrough.
Main speaker / source
Luis (last name unclear in the transcript), Head of Infrastructure Solutions at AWS. The content and demo originate from the AWS CMP351 session.
Category
Technology