What is Little's Law?

Little's Law states that the average number of items in a system (L) equals the average arrival rate (λ) multiplied by the average time an item spends in the system (W). In web services: concurrent requests = throughput × response time.
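As a quick illustration of the relationship, here is a minimal Python sketch. The function name and the sample figures (2,000 requests per second, 50 ms) are chosen for illustration and mirror the example values used elsewhere on this page.

```python
def concurrent_requests(throughput_rps: float, response_time_s: float) -> float:
    """Little's Law, L = lambda * W: average number of requests in flight."""
    return throughput_rps * response_time_s


# A service handling 2,000 requests/s at a 50 ms average response time
# holds roughly 100 requests in flight at any moment.
print(concurrent_requests(2000, 0.050))
```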

What counts as a worker?

A worker is any unit that can process a request concurrently: a thread in a thread pool, a process in a process pool, or an async connection handler. For Node.js with async I/O, the effective concurrency can be much higher than the number of CPU cores.

Why does actual throughput differ from calculated?

Real systems have overhead: garbage collection, context switching, lock contention, I/O bottlenecks, and network latency. The calculated maximum is an upper bound that assumes every worker is busy all the time. Achievable throughput is typically 50–80% of that theoretical maximum, so confirm the real figure with a load test.

How do I increase throughput?

Two levers: increase concurrency (more workers, servers, or instances) or reduce response time (optimize code, caching, database queries). Reducing response time is often more cost-effective than adding servers.

Should I use mean or percentile response time?

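Little's Law is defined on averages, so the mean response time gives the theoretical throughput, which is optimistic for planning because it ignores slow requests. Using p95 or p99 instead gives a conservative estimate that leaves room for tail latency. For SLO compliance planning, use the percentile that matches your SLO definition.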
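The gap between the two choices can be large. The sketch below compares them for the same worker pool; the helper name and the sample latencies (50 ms mean, 200 ms p99) are illustrative values, not measurements.

```python
def max_rps(workers: int, response_time_ms: float) -> float:
    """Throughput ceiling for a given worker count and response time."""
    return workers * 1000.0 / response_time_ms


workers = 100
mean_ms, p99_ms = 50.0, 200.0  # sample values, not measurements

print(f"planned on mean: {max_rps(workers, mean_ms):,.0f} RPS")
print(f"planned on p99:  {max_rps(workers, p99_ms):,.0f} RPS")
```

With these inputs the p99-based figure is a quarter of the mean-based one, which is why the choice of statistic matters for the resulting plan.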

How does this relate to auto-scaling?

Auto-scaling triggers should fire before throughput reaches maximum capacity. If your max is 2,000 RPS, set scaling thresholds at 1,400–1,600 RPS (70–80%) to allow time for new instances to start and absorb load.
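The threshold arithmetic can be sketched as follows. The function names and the default 70% and 80% fractions are assumptions taken from the example above; they are not tied to any particular auto-scaler's API.

```python
def scaling_band(max_rps: float, low: float = 0.70, high: float = 0.80) -> tuple[float, float]:
    """Return the RPS range in which a scale-out trigger should fire."""
    return max_rps * low, max_rps * high


def should_scale_out(current_rps: float, max_rps: float, threshold: float = 0.70) -> bool:
    """True once load reaches the chosen fraction of maximum capacity."""
    return current_rps >= max_rps * threshold


low_rps, high_rps = scaling_band(2000)
print(f"scale out between {low_rps:,.0f} and {high_rps:,.0f} RPS")
print(should_scale_out(current_rps=1500, max_rps=2000))
```

A slow instance start-up time argues for the lower end of the band, since load keeps growing while new instances boot.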

Throughput Capacity Calculator

Calculate max throughput using Little's Law from average response time and concurrent workers. Plan server capacity for peak loads.

Workload Type
Avg Response Time
P99 Response Time
Concurrent Workers
Error Rate
Target RPS (req/s)

Metric	Value	Note
Max Throughput	2,000.0 RPS	100 workers ÷ 50 ms avg latency
Effective Throughput	1,980.0 RPS	After 1% error rate deduction
Safe Capacity (80%)	1,600.0 RPS	Recommended operating limit
Throughput at P99	500.0 RPS	Worst-case at 200 ms tail latency
Max RPM / RPD	120,000.00 / 172,800,000.00	Requests per minute / per day
Current Utilization	75.0%	Headroom: 500.0 RPS
Workers for Target	75.00	At 80% util: 94.00 workers
Avg Wait (at target)	200.0 ms	Service 50 ms + queue 150.0 ms

Utilization at Target Load

75.0%

Scale: 0%, 70% safe, 80% limit, 100% max

Worker Scaling Analysis

Scale	Workers	Max RPS	Efficiency
0.25×	25.00	500.0	29.4%
0.5×	50.00	1,000.0	16.9%
1×	100.00	2,000.0	9.2%
2×	200.00	4,000.0	4.8%
4×	400.00	8,000.0	2.4%
8×	800.00	16,000.0	1.2%

Planning notes, formulas, and examples

About the Throughput Capacity Calculator

Throughput capacity determines how many requests your system can process per unit of time. Using Little's Law, you can calculate the maximum throughput from the average response time and the number of concurrent workers (threads, processes, or connections).

This calculator applies the fundamental queueing theory relationship: throughput equals the number of concurrent workers divided by the average processing time per request. It helps capacity planners determine how many servers, containers, or worker processes are needed to handle expected load.

Understanding throughput capacity is essential for capacity planning, auto-scaling configuration, and performance testing. It tells you the theoretical maximum your current architecture can handle before you need to scale horizontally or optimize response times.

When This Page Helps

Capacity planning without a model tends to end in either over-provisioning (wasting money) or under-provisioning (causing outages). This calculator uses Little's Law to estimate maximum throughput from basic, measurable parameters, giving you a quantitative starting point for capacity decisions.

How to Use the Inputs

Measure or estimate the average response time for your service in milliseconds.
Enter the number of concurrent workers (threads, processes, or connections).
Review the calculated maximum throughput in requests per second.
Compare with your expected peak load to determine if capacity is sufficient.
Adjust workers or optimize response time to meet target throughput.
Leave a safety margin by planning production load at 70–80% of the maximum.

Formula used

Max Throughput (RPS) = Concurrent Workers / Avg Response Time (in seconds). This follows from Little's Law, L = λ × W, rearranged to λ = L / W, where L = concurrent requests (capped by the number of workers), λ = throughput, and W = response time.
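A minimal Python sketch of the formula, together with the two derived figures the calculator displays. The function names are illustrative. The error-rate and 80% adjustments are simple multiplications that agree with the displayed results (1,980 and 1,600 RPS), though the page does not spell out its exact implementation.

```python
def max_throughput_rps(workers: int, avg_response_ms: float) -> float:
    """Max throughput = workers / response time in seconds."""
    if avg_response_ms <= 0:
        raise ValueError("avg_response_ms must be positive")
    return workers * 1000.0 / avg_response_ms


def effective_throughput_rps(max_rps: float, error_rate: float) -> float:
    """Count only successful requests (assumed: a plain percentage deduction)."""
    return max_rps * (1.0 - error_rate)


def safe_capacity_rps(max_rps: float, utilization_limit: float = 0.80) -> float:
    """Recommended operating limit as a fraction of the maximum."""
    return max_rps * utilization_limit


peak = max_throughput_rps(workers=100, avg_response_ms=50)
print(f"max:       {peak:,.1f} RPS")
print(f"effective: {effective_throughput_rps(peak, 0.01):,.1f} RPS")
print(f"safe:      {safe_capacity_rps(peak):,.1f} RPS")
```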

Example Calculation

Result: 2,000 requests per second max throughput

With 100 concurrent workers and 50ms average response time (0.05 seconds), the maximum throughput is 100 / 0.05 = 2,000 RPS. At 80% safe capacity, you should plan for handling up to 1,600 RPS before scaling.
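The same formula can be inverted to size a worker pool for a target load. The sketch below is one way to do it; the function name is illustrative, and the 1,500 RPS target is an assumed value chosen to match the 75% utilization shown in the calculator output.

```python
import math


def workers_for_target(target_rps: float, avg_response_ms: float,
                       utilization: float = 1.0) -> int:
    """Workers needed to serve target_rps while staying at or below `utilization`."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    busy_workers = target_rps * avg_response_ms / 1000.0  # Little's Law: L = lambda * W
    return math.ceil(busy_workers / utilization)


print(workers_for_target(1500, 50))        # every worker fully busy
print(workers_for_target(1500, 50, 0.80))  # with an 80% utilization ceiling
```

Rounding up is deliberate: a fractional worker cannot be provisioned, and rounding down would push utilization above the chosen ceiling.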

Tips & Best Practices

Use p95 or p99 response time instead of mean for conservative capacity estimates.
Maximum throughput assumes all workers are always busy — real utilization is lower.
Keep utilization below 80% to avoid queuing delays that amplify latency.
Connection pool size, thread pool size, and database connections all limit concurrency.
Reducing average response time by 50% doubles your throughput capacity.
Test actual throughput with load tests — theoretical and real often diverge.
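To see why high utilization amplifies latency, the sketch below uses the single-server M/M/1 approximation, where average time in the system is service time divided by (1 − utilization). This model is an assumption: it matches the calculator's 200 ms figure at 75% utilization, but the page does not state which model it uses, and a pool of many workers generally queues less than a single-server model predicts.

```python
def avg_time_in_system_ms(service_ms: float, utilization: float) -> float:
    """M/M/1 approximation: W = S / (1 - rho). An assumed model, not the page's stated one."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_ms / (1.0 - utilization)


service_ms = 50.0
for rho in (0.50, 0.75, 0.80, 0.90, 0.95):
    total = avg_time_in_system_ms(service_ms, rho)
    print(f"{rho:.0%} utilization: {total:7.1f} ms total, {total - service_ms:7.1f} ms queued")
```

The growth is nonlinear: under this model, going from 80% to 90% utilization doubles the total time, and going from 90% to 95% doubles it again.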

Little's Law in Practice

Little's Law is one of the most fundamental and useful results in queueing theory. It applies to any stable system regardless of the arrival distribution, service time distribution, or queueing discipline. This universality makes it invaluable for capacity planning.

Concurrency Bottlenecks

Every system has a concurrency limit. For threaded servers, it is the thread pool size. For database-backed services, it may be the connection pool size. For upstream dependencies, it may be rate limits. The lowest limit in the chain determines overall throughput.
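A small sketch of finding the limiting resource. The limit names and sizes are hypothetical, and the calculation makes the simplifying assumption that every request holds each resource for its full response time.

```python
def bottleneck(limits: dict[str, int]) -> tuple[str, int]:
    """Return the name and size of the smallest concurrency limit."""
    name = min(limits, key=limits.__getitem__)
    return name, limits[name]


# Hypothetical limits for one service instance
limits = {"thread_pool": 200, "db_connection_pool": 50, "upstream_connections": 120}

name, effective_workers = bottleneck(limits)
avg_response_ms = 50.0
print(f"{name} caps concurrency at {effective_workers}")
print(f"max throughput: {effective_workers * 1000.0 / avg_response_ms:,.0f} RPS")
```

In this example, enlarging the thread pool would change nothing until the database connection pool is raised.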

Response Time Optimization

Reducing average response time is often the highest-leverage capacity improvement. Cutting it from 50 ms to 25 ms doubles throughput on the same number of workers, without adding any infrastructure. Common optimizations include caching, query optimization, payload reduction, and eliminating unnecessary I/O.

Capacity Planning Process

Start by measuring current throughput and response times under load. Apply Little's Law to calculate theoretical maximum. Compare against projected peak load with a safety margin. Decide whether to optimize response time or add capacity based on cost analysis.
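The comparison step of this process could look like the sketch below. The class, function, and field names are illustrative, and the 80% safe fraction is the default suggested on this page rather than a universal rule.

```python
from dataclasses import dataclass


@dataclass
class CapacityCheck:
    max_rps: float            # theoretical ceiling from Little's Law
    safe_rps: float           # ceiling after applying the safety margin
    safe_headroom_rps: float  # safe_rps minus projected peak (negative = shortfall)
    sufficient: bool


def check_capacity(workers: int, avg_response_ms: float,
                   projected_peak_rps: float, safe_fraction: float = 0.80) -> CapacityCheck:
    """Compare a projected peak load against the safe share of maximum throughput."""
    max_rps = workers * 1000.0 / avg_response_ms
    safe_rps = max_rps * safe_fraction
    return CapacityCheck(
        max_rps=max_rps,
        safe_rps=safe_rps,
        safe_headroom_rps=safe_rps - projected_peak_rps,
        sufficient=projected_peak_rps <= safe_rps,
    )


print(check_capacity(100, 50.0, projected_peak_rps=1500))
print(check_capacity(100, 50.0, projected_peak_rps=1800))
```

A negative headroom is the signal to weigh the two options: shorten the response time or add workers.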

Sources & Methodology

Last updated: February 8, 2026
