
What Stops This From Blowing Up?

So far we’ve seen queues grow. But what stops them from growing forever?

Backpressure. That’s the answer.

Backpressure in Plain Terms

Slow down producers when consumers fall behind.

Stop accepting more work when the queue is too long.

Think of a sink with a clogged drain. Water flows in faster than it drains. The sink fills up. Eventually, you turn off the tap. That’s backpressure.
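The "turn off the tap" moment can be sketched with a bounded queue. This is a minimal illustration (the queue size of 3 is arbitrary), not production code:

```python
import queue

# A bounded queue refuses new work once it is full:
# that is the "turn off the tap" moment.
tasks = queue.Queue(maxsize=3)

for i in range(5):
    try:
        tasks.put_nowait(f"request-{i}")   # accept without blocking
        print(f"accepted request-{i}")
    except queue.Full:                     # queue at capacity: apply backpressure
        print(f"rejected request-{i} (queue full)")
```

The first three requests are accepted; the last two are rejected instead of piling up.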

Techniques

Reject requests when:

  • Queue depth > threshold (e.g., 1000 requests)
  • CPU usage > threshold (e.g., 80%)
  • Memory usage > threshold (e.g., 80%)

Return clear error codes:

  • 429 Too Many Requests (rate limited)
  • 503 Service Unavailable (overloaded)

Include helpful messaging:

  • “Service is temporarily overloaded. Please try again in a few seconds.”
  • “Rate limit exceeded. Try again in 60 seconds.”
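Putting the thresholds, error codes, and messages together, an admission check might look like this sketch. The threshold values and the `admit` function are illustrative, not from any particular framework:

```python
# Illustrative thresholds; real values depend on your service.
MAX_QUEUE_DEPTH = 1000
MAX_CPU_PCT = 80.0
MAX_MEM_PCT = 80.0

def admit(queue_depth, cpu_pct, mem_pct, rate_limited=False):
    """Decide whether to accept a request; return (status, message)."""
    if rate_limited:
        return 429, "Rate limit exceeded. Try again in 60 seconds."
    if (queue_depth > MAX_QUEUE_DEPTH
            or cpu_pct > MAX_CPU_PCT
            or mem_pct > MAX_MEM_PCT):
        return 503, ("Service is temporarily overloaded. "
                     "Please try again in a few seconds.")
    return 200, "OK"
```

A healthy request gets 200; a request arriving while the queue, CPU, or memory is over its threshold gets 503 with a clear retry message.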

Why This Is Better

Better than letting queues grow unbounded:

  • Queues consume memory
  • Long queues mean long wait times
  • System may crash when memory runs out

Better than letting the whole service crash:

  • Some requests succeed (those already in queue)
  • System recovers faster
  • Users get clear error messages

Interactive Backpressure Toggle

Toggle backpressure on/off and see the difference:

[Interactive widget: a Backpressure Control toggle with live counters for Accepted/sec, Rejected/sec, Queue Length, and Avg Latency (starting at 200ms), plus an Outcome Summary such as “System is healthy”.]

When to Use Each Strategy

Reject immediately (429/503):

  • When queue is full
  • When system is clearly overloaded
  • Better to fail fast than wait forever

Throttle (rate limit):

  • When you want to smooth traffic
  • When you can accept some delay
  • Better user experience than rejection

Queue with limits:

  • When you can handle some queuing
  • When you want to smooth spikes
  • But set a maximum queue depth
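To make the throttling option concrete, here is a minimal token-bucket sketch. The class name, rate, and capacity are made up for illustration; the idea is that tokens refill at a steady rate, so bursts beyond the bucket’s capacity are smoothed out:

```python
import time

class TokenBucket:
    """Refill at `rate` tokens/sec up to `capacity`; each request costs one token."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Add tokens for the time elapsed since the last check.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=2)      # 5 req/sec, bursts of 2
results = [bucket.allow() for _ in range(4)]  # four back-to-back requests
```

Back-to-back, the first two requests fit in the burst capacity and the rest are throttled until tokens refill.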

Real-World Example

Ride-hailing during surge:

  • Too many requests arrive
  • System rejects some with “High demand, try again in a minute”
  • Accepted requests get processed
  • System stays responsive

Without backpressure:

  • All requests accepted
  • Queue grows to thousands
  • Everyone waits minutes
  • System may crash
  • Worse user experience

The Trade-off

With backpressure:

  • ✅ System stays responsive
  • ✅ Some requests succeed quickly
  • ❌ Some users see errors
  • ❌ Need to handle retries

Without backpressure:

  • ✅ Fewer immediate errors (all requests accepted)
  • ❌ System may crash
  • ❌ Everyone waits longer
  • ❌ Worse overall experience

Key Takeaways

  • Backpressure slows producers when consumers fall behind
  • Reject requests when queue depth, CPU, or memory exceeds a threshold
  • Return clear errors (429/503) with helpful messages
  • Better than unbounded queues - prevents crashes
  • Better than crashing - some requests succeed
  • Trade-off: Some users see errors, but system stays up

What’s Next?

In the next page, we’ll explore scaling patterns. You’ll learn about vertical vs horizontal scaling, and when they stop helping. You’ll also see how solving one bottleneck often reveals the next one.