Free tool

How many servers do you need?

Size your fleet for peak load. Enter your peak traffic and per-request latency and get the number of instances you need — with headroom so you're not running at the redline.

💡 Every value is editable. Defaults are illustrative. Use your own measured peak RPS, average request time and per-instance concurrency for a number you can plan around.

Your peak load

Peak requests per second (RPS)

Average request processing time (ms)

Concurrent workers / threads per instance

Target utilization % (headroom)

All values are illustrative — replace them with your own measured numbers from load tests or production metrics.


Mind the headroom. At 100% utilization you have no room for traffic spikes, rolling deploys, or instance failure. Target 60–75% so a bad minute doesn't become an outage.

Little's Law, in one line

An instance that runs W requests concurrently, each taking L seconds, sustains about W / L requests per second. So capacity = workers / latency. With 50 workers at 200 ms (0.2 s), that's 50 / 0.2 = 250 req/s per instance at full tilt — then you apply your utilization target for safe capacity.
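The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names are hypothetical, and the 70% target is simply one value inside the 60–75% range the page recommends.

```python
def max_rps_per_instance(workers: int, latency_ms: float) -> float:
    """Throughput at full utilization: concurrency / latency (in seconds)."""
    if workers <= 0 or latency_ms <= 0:
        raise ValueError("workers and latency_ms must be positive")
    # Multiply by 1000 first so the division works directly on milliseconds.
    return workers * 1000.0 / latency_ms


def safe_rps_per_instance(workers: int, latency_ms: float,
                          target_utilization_pct: float) -> float:
    """Full-tilt capacity scaled down by the utilization target (in percent)."""
    if not 0 < target_utilization_pct <= 100:
        raise ValueError("target_utilization_pct must be in (0, 100]")
    return max_rps_per_instance(workers, latency_ms) * target_utilization_pct / 100.0


print(max_rps_per_instance(50, 200))       # 250.0
print(safe_rps_per_instance(50, 200, 70))  # 175.0
```

The second figure is the one to plan with: it is what an instance can carry while still leaving the headroom you asked for.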

Assumes each request occupies one worker for its full processing time and that load is spread evenly across instances. Real systems have queueing, GC pauses, and uneven load, so load-test to confirm.

Capacity is a design decision, not a guess.

We profile your real latency distribution, find the bottlenecks, and right-size the fleet so you carry headroom without overpaying for idle instances. Book a free build audit.


How it works

Capacity planning, the simple way

The question "how many servers do I need" has a surprisingly clean answer once you know three numbers: peak load, per-request latency, and concurrency per instance. This capacity planning calculator turns them into a fleet size using Little's Law: one instance sustains roughly its worker count divided by average request time in seconds, you multiply that by your target utilization to get its safe capacity, and you divide peak requests per second by that safe number to get the instance count. Round up, and you have a defensible answer for scaling servers before peak hits.
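Put together, the whole calculation fits in one small function. The sketch below is an assumption about how such a calculator could be written, not a copy of this page's implementation; `instances_needed` and its parameter names are hypothetical, and the 2,000 req/s peak is an illustrative figure.

```python
import math


def instances_needed(peak_rps: float, latency_ms: float, workers: int,
                     target_utilization_pct: float) -> int:
    """Fleet size for a given peak, rounded up to whole instances."""
    if min(peak_rps, latency_ms, workers, target_utilization_pct) <= 0:
        raise ValueError("all inputs must be positive")
    # Per-instance throughput at full utilization (workers / latency in seconds).
    max_rps = workers * 1000.0 / latency_ms
    # Scale down by the utilization target to leave headroom.
    safe_rps = max_rps * target_utilization_pct / 100.0
    # A fractional instance cannot be deployed, so always round up.
    return math.ceil(peak_rps / safe_rps)


# Illustrative inputs: 2,000 req/s peak, 200 ms latency, 50 workers, 70% target.
print(instances_needed(2000, 200, 50, 70))  # 12
```

With these inputs each instance is safe up to 175 req/s, so 2,000 req/s needs 11.43 instances, which rounds up to 12.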

The piece teams skip is headroom. Running a fleet at 100% utilization looks efficient on a spreadsheet and fails the first time traffic spikes, a deploy rolls, or an instance dies. Targeting 60–75% costs a little more steady-state and saves you from the outage — which is why a good server capacity calculator bakes utilization into the math rather than leaving it as an afterthought.
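To see why the headroom matters, compare what happens to two fleets when a single instance drops out at peak. This is a rough sketch under assumed numbers (2,000 req/s peak, 250 req/s per instance at full tilt); the `utilization` helper is hypothetical and ignores queueing effects.

```python
def utilization(peak_rps: float, max_rps_per_instance: float,
                healthy_instances: int) -> float:
    """Fraction of total full-tilt capacity consumed at peak."""
    if healthy_instances <= 0:
        raise ValueError("need at least one healthy instance")
    return peak_rps / (max_rps_per_instance * healthy_instances)


# 8 instances is what a 100% target yields for these numbers; 12 is the 70% answer.
for label, fleet in (("sized at 100%", 8), ("sized at 70%", 12)):
    normal = utilization(2000, 250, fleet)
    degraded = utilization(2000, 250, fleet - 1)
    print(f"{label}: {normal:.0%} normally, {degraded:.0%} with one instance down")
```

Under these assumptions the fleet sized at 100% is pushed past its capacity as soon as one instance fails, while the fleet sized at 70% stays inside the 60–75% band.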

Input	What it means	Typical
Peak RPS	Requests/sec at your busiest minute	Measured
Latency	Average request processing time	50–500 ms
Workers / instance	Concurrent threads or connections	10–200
Target utilization	Headroom for spikes & failures	60–75%

FAQ

Questions about capacity planning

How many servers do I need for X requests per second?

Divide your peak requests per second by what one instance can safely handle. One instance handles roughly its worker/thread count divided by the average request time in seconds (Little's Law). For example, 50 workers at 200 ms each is about 250 req/s. Multiply that by your target utilization to get the safe figure (250 × 70% = 175 req/s), divide peak by it, and round up.

What utilization should I target?

Aim for 60–75% at peak. Running closer to 100% leaves no room for traffic spikes, rolling deploys, garbage-collection pauses, or a failed instance — all of which happen routinely. The headroom is cheaper than an outage.

What is Little's Law for capacity planning?

Little's Law says the number of in-flight requests equals throughput times latency. Rearranged for capacity: an instance that can run W requests concurrently, each taking L seconds, sustains about W/L requests per second. It's the simplest reliable way to turn latency and concurrency into a server count.
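The same law can be applied in its original direction: compute how many requests are in flight at peak, then ask how many workers it takes to hold them at your target utilization. The sketch below uses hypothetical function names and an illustrative 2,000 req/s peak, and it should give the same fleet size as the W/L form.

```python
import math


def in_flight_requests(peak_rps: float, latency_ms: float) -> float:
    """Little's Law: concurrent requests = throughput x latency (in seconds)."""
    return peak_rps * latency_ms / 1000.0


def instances_from_in_flight(peak_rps: float, latency_ms: float, workers: int,
                             target_utilization_pct: float) -> int:
    """Instances needed to hold the in-flight requests with headroom."""
    in_flight = in_flight_requests(peak_rps, latency_ms)
    # Only this many workers per instance should be busy at peak.
    usable_workers = workers * target_utilization_pct / 100.0
    return math.ceil(in_flight / usable_workers)


print(in_flight_requests(2000, 200))                # 400.0
print(instances_from_in_flight(2000, 200, 50, 70))  # 12
```

Here 400 requests are in flight at peak, each instance should keep about 35 of its 50 workers busy, and 400 / 35 rounds up to 12 instances.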
