Rate Limit Best Practices

Use these practices to run high-volume Switch API workflows without creating request spikes or avoidable errors.

For implementation guidance, see Spike Limiter Implementation.

For conceptual background on tasks, concurrency, and workflow design, see Understanding Tasks and Concurrency in Workflows.

What counts as a task?

A task is any unit of work in your workflow. In Switch integrations, tasks often include API operations such as:

Fetching tariffs
Viewing territories
Viewing tariff properties
Running calculations

When your application runs many tasks at once, the number of simultaneous API calls matters as much as the total number of calls.

What is concurrency?

Concurrency is the number of API tasks running at the same time. A higher concurrency value can help you process work faster, but it can also create spikes that increase response times or error rates.

Keep concurrency low and raise it gradually. A workflow that runs 1,000 API calls with a small concurrency limit and gradual ramp-up is safer than a workflow that starts all 1,000 calls at once.

Best practices

Limit concurrency

Set a hard maximum on the number of simultaneous API calls your application can make. For high-volume workflows, use a request queue and process work in small batches instead of launching every request at the same time.
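
One way to enforce a hard cap is a fixed pool of workers that pull from a queue. The following is a minimal sketch, not a Switch client: `run_with_cap` and its parameters are hypothetical names, and each job stands in for whatever function your integration uses to make one API call.

```python
import asyncio


async def run_with_cap(jobs, max_concurrency=4):
    """Run zero-argument coroutine functions with a hard concurrency cap.

    `jobs` is a list of coroutine functions (hypothetical stand-ins for
    API calls). Results are returned in the same order as `jobs`.
    """
    queue = asyncio.Queue()
    for index, job in enumerate(jobs):
        queue.put_nowait((index, job))

    results = [None] * len(jobs)

    async def worker():
        # Each worker handles one job at a time, so the number of workers
        # is the maximum number of calls in flight.
        while True:
            try:
                index, job = queue.get_nowait()
            except asyncio.QueueEmpty:
                return
            try:
                results[index] = await job()
            except Exception as exc:
                # Store the failure instead of stopping the other workers.
                results[index] = exc

    await asyncio.gather(*(worker() for _ in range(max_concurrency)))
    return results
```

Because the queue holds pending work and only the workers start calls, adding more jobs never raises the number of simultaneous requests above `max_concurrency`.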

Ramp up gradually

Increase request volume over 8–10 minutes instead of starting at full speed. A gradual ramp-up gives your application time to observe latency, errors, and retry behavior before reaching peak throughput.

Monitor continuously

Track response times, HTTP status codes, retry counts, and error rates while the workflow runs. If response times or error rates increase, reduce concurrency or add delays between batches.
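
Monitoring can be as simple as keeping a rolling window of recent results. The sketch below is illustrative only: the class name `RequestMonitor`, the window size, and both thresholds are assumptions to tune for your own workflow, and it treats any status code of 400 or higher as an error.

```python
from collections import deque


class RequestMonitor:
    """Track recent status codes, latency, and retries (hypothetical helper)."""

    def __init__(self, window_size=100, max_error_rate=0.05, max_avg_latency=2.0):
        # Thresholds are placeholders, not values recommended by Switch.
        self.samples = deque(maxlen=window_size)
        self.max_error_rate = max_error_rate
        self.max_avg_latency = max_avg_latency  # seconds
        self.retry_count = 0

    def record(self, status_code, latency_seconds, retried=False):
        """Store one completed request; old samples fall out of the window."""
        self.samples.append((status_code, latency_seconds))
        if retried:
            self.retry_count += 1

    def error_rate(self):
        if not self.samples:
            return 0.0
        errors = sum(1 for status, _ in self.samples if status >= 400)
        return errors / len(self.samples)

    def average_latency(self):
        if not self.samples:
            return 0.0
        return sum(latency for _, latency in self.samples) / len(self.samples)

    def should_slow_down(self):
        """True when recent errors or latency exceed the configured limits."""
        return (
            self.error_rate() > self.max_error_rate
            or self.average_latency() > self.max_avg_latency
        )
```

A workflow could call `record` after every response and check `should_slow_down` between batches to decide whether to reduce concurrency or lengthen the delay.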

Acceptable usage example

An acceptable way to run 1,000 API calls is to combine a low concurrency limit with gradual ramp-up and ongoing monitoring.

For example:

Time range	Concurrency
Minutes 0–2	1 concurrent API call
Minutes 2–4	2 concurrent API calls
Minutes 4–6	3 concurrent API calls
Minutes 6–8	4 concurrent API calls

After each batch, pause briefly and check latency and error rates before continuing.
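
The ramp-up schedule in the table above can be expressed as a small function. This is a minimal sketch; `allowed_concurrency` and its parameters are hypothetical names, with defaults chosen to mirror the table (one extra concurrent call every 2 minutes, capped at 4).

```python
def allowed_concurrency(elapsed_seconds, step_seconds=120, max_concurrency=4):
    """Return how many simultaneous calls are allowed at this point in the run.

    Starts at 1 and adds one concurrent call per completed step,
    never exceeding max_concurrency.
    """
    if elapsed_seconds < 0:
        raise ValueError("elapsed_seconds must not be negative")
    steps_completed = int(elapsed_seconds // step_seconds)
    return min(1 + steps_completed, max_concurrency)
```

A scheduler could call this function before starting each batch, passing the time elapsed since the workflow began, and use the result as the current concurrency limit.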

Recommended settings for this pattern:

Use a concurrency limit of 4.
Process requests in small batches, such as batches of 50.
Add short delays between batches.
Monitor latency, HTTP status codes, retry counts, and error rates.
Back off automatically if latency or error rates increase.
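
The settings above might be combined roughly as follows. This is a sketch under stated assumptions: `run_in_batches` is a hypothetical helper, each job is a coroutine function that returns an HTTP status code, the 5% error threshold is a placeholder, and "back off" is interpreted here as halving concurrency and doubling the pause.

```python
import asyncio


async def run_in_batches(jobs, batch_size=50, concurrency=4, pause_seconds=1.0,
                         max_error_rate=0.05, sleep=asyncio.sleep):
    """Process jobs in batches, pausing between batches and backing off on errors.

    Returns the collected status codes plus the final concurrency and pause,
    so callers can see whether a back-off occurred.
    """
    statuses = []
    for start in range(0, len(jobs), batch_size):
        batch = jobs[start:start + batch_size]
        semaphore = asyncio.Semaphore(concurrency)

        async def run_one(job):
            # The semaphore keeps at most `concurrency` calls in flight.
            async with semaphore:
                return await job()

        batch_statuses = await asyncio.gather(*(run_one(job) for job in batch))
        statuses.extend(batch_statuses)

        errors = sum(1 for status in batch_statuses if status >= 400)
        if errors / len(batch) > max_error_rate:
            # Back off: halve concurrency (never below 1) and double the pause.
            concurrency = max(1, concurrency // 2)
            pause_seconds *= 2

        # Pause between batches, but not after the final one.
        if start + batch_size < len(jobs):
            await sleep(pause_seconds)

    return statuses, concurrency, pause_seconds
```

The `sleep` parameter exists only so the pause can be replaced in tests; in normal use the default `asyncio.sleep` applies.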

Unacceptable usage example

Avoid request patterns that create sudden spikes.

For example, do not:

Start immediately with 50 concurrent requests.
Attempt to process all 1,000 requests simultaneously.
Skip gradual ramp-up.
Run high-volume workflows without monitoring response times or errors.

Risks of ignoring this guidance

Ignoring these practices can lead to:

Increased response times
Higher error rates
Retries, failed workflow steps, or temporary instability

Implementation checklist

Before running a high-volume workflow, confirm that your integration can:

Queue requests instead of launching them all at once.
Enforce a hard concurrency cap.
Start with a concurrency of 1 and increase gradually.
Process work in small batches.
Add short delays between batches.
Track HTTP status codes, latency, retry counts, and error rates.
Reduce concurrency or pause when response times or errors increase.

These controls help your integration process large workflows safely while keeping Switch API traffic predictable.