Intermediate 25 min

Scaling Options

When your system can’t handle the load, you scale. But how?

  • Vertical scaling: a stronger machine
  • Horizontal scaling: more machines

Let’s see how each works.

Vertical Scaling

Stronger machine → higher per-instance throughput

If one instance handles 5 req/s, a machine twice as powerful might handle 10 req/s.

Pros:

  • Simple: just upgrade the machine
  • No code changes needed
  • Single instance to manage

Cons:

  • Single point of failure: if the machine goes down, the whole service goes down
  • Can’t scale beyond one machine’s capacity
  • Expensive at the high end

Horizontal Scaling

More workers behind a queue → higher total throughput

If one worker handles 5 req/s, 10 workers handle up to 50 req/s, as long as nothing else in the system limits them.

Pros:

  • Can scale beyond single machine limits
  • Better fault tolerance (one worker fails, others continue)
  • Cost-effective (many small machines vs one huge one)

Cons:

  • More complex: need load balancing, coordination
  • Some bottlenecks don’t scale (database, external services)

Interactive Scaling Simulator

Adjust the number of worker instances and see the effect. The simulator’s initial readout:

  • Max Throughput: 5 req/s
  • Queue Length: 0
  • Avg Latency: 200ms
  • Current Bottleneck: Worker CPU
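The simulator's numbers can be approximated with a few lines of Python. This is a minimal sketch, not the page's actual simulator: the function name `simulate`, the one-second time step, and the latency estimate (time to drain the backlog plus one service time) are all assumptions made for illustration.

```python
def simulate(arrival_rate, workers, per_worker_rate=5.0, seconds=10):
    """Step a simple queue one second at a time and return summary metrics.

    Hypothetical model: requests arrive at a constant rate and every worker
    serves exactly `per_worker_rate` requests per second.
    """
    capacity = workers * per_worker_rate  # max throughput in req/s
    queue = 0.0
    for _ in range(seconds):
        # Requests arrive, then up to `capacity` of them are served.
        queue = max(0.0, queue + arrival_rate - capacity)
    service_time_ms = 1000.0 / per_worker_rate
    # Rough estimate for a request arriving at the end of the run:
    # wait for the backlog to drain, then get served.
    latency_ms = queue / capacity * 1000.0 + service_time_ms
    return {
        "max_throughput": capacity,
        "queue_length": queue,
        "latency_ms": latency_ms,
    }


print(simulate(arrival_rate=5, workers=1))   # load matches capacity
print(simulate(arrival_rate=20, workers=1))  # overloaded: the queue grows
print(simulate(arrival_rate=20, workers=4))  # scaled out: the queue stays empty
```

With one worker and 5 req/s of load, the sketch reproduces the readout above (5 req/s, empty queue, 200 ms). Raising the load without adding workers makes the queue, and with it the latency, grow every second.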

Real-World Constraints

The database often becomes the next bottleneck:

You scale workers to 10 instances. Each handles 5 req/s. Total capacity: 50 req/s.

But your database can only handle 30 req/s. That’s your bottleneck now.
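The arithmetic here is a minimum over two capacities. A small sketch, assuming every request needs exactly one database operation (the function name `effective_throughput` is made up for this example):

```python
def effective_throughput(workers, per_worker_rate, db_capacity):
    """Return (throughput in req/s, name of the limiting component)."""
    worker_capacity = workers * per_worker_rate
    if worker_capacity <= db_capacity:
        return worker_capacity, "workers"
    return db_capacity, "database"


# 10 workers at 5 req/s each, in front of a database that handles 30 req/s.
print(effective_throughput(10, 5, 30))  # (30, 'database')
```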

External services can also cap throughput:

Your workers call external services (maps API, payment gateway). Those services have rate limits. Even with 100 workers, you’re limited by external service capacity.
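One practical consequence is that there is a worker count beyond which adding more workers buys nothing. The sketch below assumes each request makes exactly one external call and uses a hypothetical rate limit of 40 req/s; the text gives no real figure, so treat the number as a placeholder.

```python
import math


def capped_throughput(workers, per_worker_rate, rate_limit):
    """Throughput in req/s once the external rate limit is applied."""
    return min(workers * per_worker_rate, rate_limit)


def useful_workers(per_worker_rate, rate_limit):
    """Smallest worker count whose combined capacity reaches the rate limit."""
    return math.ceil(rate_limit / per_worker_rate)


RATE_LIMIT = 40  # hypothetical external limit, in req/s

print(useful_workers(5, RATE_LIMIT))          # workers needed to saturate the limit
print(capped_throughput(8, 5, RATE_LIMIT))    # throughput with 8 workers
print(capped_throughput(100, 5, RATE_LIMIT))  # throughput with 100 workers
```

Under these assumptions, 8 workers and 100 workers deliver the same throughput; the extra 92 only add cost.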

The Bottleneck Shifts

Solving one bottleneck just moves the bottleneck.

  1. Start: Worker CPU is the bottleneck
  2. Scale workers: Now database is the bottleneck
  3. Scale database: Now external services are the bottleneck
  4. Scale external services: Now network is the bottleneck

This is normal. You solve one problem, then tackle the next.
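The four steps above can be sketched as a loop that repeatedly finds the lowest-capacity stage and then upgrades it. Only the 5 req/s worker and the 30 req/s database come from this page; every other capacity and every upgrade target is a hypothetical number chosen to make the shift visible.

```python
def find_bottleneck(capacities):
    """Return the (stage, capacity) pair with the lowest capacity."""
    stage = min(capacities, key=capacities.get)
    return stage, capacities[stage]


# Capacity of each stage in req/s (mostly hypothetical values).
capacities = {
    "workers": 5,
    "database": 30,
    "external services": 80,
    "network": 200,
}

# Each upgrade raises one stage's capacity (hypothetical targets).
upgrades = [("workers", 500), ("database", 400), ("external services", 600)]

history = [find_bottleneck(capacities)]
for stage, new_capacity in upgrades:
    capacities[stage] = new_capacity
    history.append(find_bottleneck(capacities))

for stage, capacity in history:
    print(f"bottleneck: {stage} at {capacity} req/s")
```

Each upgrade raises total throughput only as far as the next-slowest stage, which is why measuring after every change matters more than guessing the next bottleneck up front.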

When Scaling Stops Helping

Vertical scaling stops when:

  • You hit the largest available machine
  • Cost becomes prohibitive
  • Single point of failure is unacceptable

Horizontal scaling stops when:

  • Database becomes the bottleneck (harder to scale)
  • External services become the bottleneck (can’t control)
  • Coordination overhead exceeds benefits
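One way to picture the last point is the Universal Scalability Law, a model this page does not itself use and that is brought in here only as an illustration. It discounts ideal linear scaling by a contention term and a coherency (coordination) term. The coefficients below are invented, not measured; real values have to be fitted from load tests.

```python
def usl_throughput(n, per_worker_rate=5.0, contention=0.05, coherency=0.002):
    """Throughput of n workers under the Universal Scalability Law.

    contention: fraction of work that is serialized (hypothetical value)
    coherency:  cost of workers coordinating with each other (hypothetical value)
    """
    return per_worker_rate * n / (1 + contention * (n - 1) + coherency * n * (n - 1))


# Worker count with the highest modeled throughput, searched over 1..100.
best_n = max(range(1, 101), key=usl_throughput)

for n in (1, 10, best_n, 100):
    print(f"{n:>3} workers -> {usl_throughput(n):.1f} req/s")
```

With these made-up coefficients, modeled throughput peaks at a few dozen workers and then falls: past that point each extra worker costs more in coordination than it contributes.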

Cost Considerations

Vertical scaling:

  • Cost rises at least linearly with machine size
  • 2x machine ≈ 2x cost at moderate sizes, often more at the high end
  • Simple but expensive at scale

Horizontal scaling:

  • Can use smaller, cheaper machines
  • Better cost efficiency
  • But you need a load balancer and coordination

Key Takeaways

  • Vertical scaling: Stronger machine, higher per-instance throughput
  • Horizontal scaling: More workers, higher total throughput
  • The database often becomes the bottleneck after scaling workers
  • External services can cap throughput regardless of worker count
  • Solving one bottleneck reveals the next; this is normal
  • Scaling stops helping when you hit a bottleneck you can’t scale

What’s Next?

On the final page, we’ll wrap up with a practical checklist you can use for your own systems. You’ll get a simple framework for thinking about capacity, bottlenecks, and scaling.