What Stops This From Blowing Up?
So far we’ve seen queues grow. But what stops them from growing forever?
Backpressure. That’s the answer.
Backpressure in Plain Terms
- Slow down producers when consumers fall behind.
- Stop accepting more work when the queue is too long.
Think of a sink with a clogged drain. Water flows in faster than it drains. The sink fills up. Eventually, you turn off the tap. That’s backpressure.
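The "turn off the tap" idea can be sketched with a bounded queue (a minimal illustration, not a specific framework; the queue size and timings are made up). When the queue is full, `put()` blocks, so the producer is forced down to the consumer's pace:

```python
import queue
import threading
import time

work = queue.Queue(maxsize=3)  # the "sink" can only hold 3 items

def consumer():
    while True:
        item = work.get()
        if item is None:       # sentinel: stop consuming
            break
        time.sleep(0.01)       # slow "drain"

threading.Thread(target=consumer, daemon=True).start()

start = time.monotonic()
for i in range(10):
    work.put(i)                # blocks whenever 3 items are waiting
work.put(None)
elapsed = time.monotonic() - start

# Enqueueing alone would be nearly instant; instead the producer
# spent most of its time blocked, waiting for the consumer.
print(f"producer blocked for ~{elapsed:.2f}s")
```

Blocking is the simplest form of backpressure; the techniques below do the same thing at a service boundary, where you can't block the caller and must reject instead.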
Techniques
Reject requests when:
- Queue depth > threshold (e.g., 1000 requests)
- CPU usage > threshold (e.g., 80%)
- Memory usage > threshold (e.g., 80%)
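An admission check built on those thresholds might look like this (a sketch; the threshold constants mirror the examples above, and the metrics are passed in as plain numbers rather than read from a real runtime):

```python
MAX_QUEUE_DEPTH = 1000   # requests
MAX_CPU_PCT = 80.0       # percent
MAX_MEM_PCT = 80.0       # percent

def should_reject(queue_depth: int, cpu_pct: float, mem_pct: float) -> bool:
    """Return True if the service should shed this request."""
    return (
        queue_depth > MAX_QUEUE_DEPTH
        or cpu_pct > MAX_CPU_PCT
        or mem_pct > MAX_MEM_PCT
    )

print(should_reject(50, 40.0, 30.0))    # healthy system: accept
print(should_reject(1500, 40.0, 30.0))  # queue too deep: reject
```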
Return clear error codes:
- 429 Too Many Requests (rate limited)
- 503 Service Unavailable (overloaded)
Include helpful messaging:
- “Service is temporarily overloaded. Please try again in a few seconds.”
- “Rate limit exceeded. Try again in 60 seconds.”
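Putting the status codes and messages together, a shed request's response might be shaped like this (the helper name and JSON fields are illustrative, not a specific framework's API; a `Retry-After` header tells well-behaved clients when to come back):

```python
import json

def overload_response(kind: str) -> tuple[int, dict, str]:
    """Return (status, headers, body) for a request we are shedding."""
    if kind == "rate_limited":
        status = 429  # Too Many Requests
        message = "Rate limit exceeded. Try again in 60 seconds."
        retry_after = "60"
    else:
        status = 503  # Service Unavailable
        message = ("Service is temporarily overloaded. "
                   "Please try again in a few seconds.")
        retry_after = "5"
    headers = {"Retry-After": retry_after,
               "Content-Type": "application/json"}
    return status, headers, json.dumps({"error": message})

status, headers, body = overload_response("rate_limited")
print(status, headers["Retry-After"])
```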
Why This Is Better
Better than letting queues grow unbounded:
- Queues consume memory
- Long queues mean long wait times
- System may crash when memory runs out
Better than letting the whole service crash:
- Some requests succeed (those already in the queue)
- System recovers faster
- Users get clear error messages
When to Use Each Strategy
Reject immediately (429/503):
- When queue is full
- When system is clearly overloaded
- Better to fail fast than wait forever
Throttle (rate limit):
- When you want to smooth traffic
- When you can accept some delay
- Better user experience than rejection
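A common way to implement throttling is a token bucket (a minimal sketch; the rate and burst values are made-up parameters). Each request spends one token; tokens refill at a fixed rate, so short bursts pass but sustained traffic is smoothed:

```python
import time

class TokenBucket:
    def __init__(self, rate: float, burst: int):
        self.rate = rate             # tokens added per second
        self.capacity = burst        # maximum stored tokens
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, burst=2)
decisions = [bucket.allow() for _ in range(4)]
print(decisions)  # the 2-token burst passes, the rest are throttled
```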
Queue with limits:
- When you can handle some queuing
- When you want to smooth spikes
- But set a maximum queue depth
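"Queue with limits" is the bounded-queue version of the same idea (a sketch; the maximum depth of 5 is illustrative). Work is accepted until the cap, then shed instead of growing without bound:

```python
import queue

backlog = queue.Queue(maxsize=5)  # hard cap on queue depth

accepted, rejected = 0, 0
for request in range(8):
    try:
        backlog.put_nowait(request)  # raises queue.Full at the cap
        accepted += 1
    except queue.Full:
        rejected += 1                # would map to a 503 in a service

print(accepted, rejected)
```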
Real-World Example
Ride-hailing during surge:
- Too many requests arrive
- System rejects some with “High demand, try again in a minute”
- Accepted requests get processed
- System stays responsive
Without backpressure:
- All requests accepted
- Queue grows to thousands
- Everyone waits minutes
- System may crash
- Worse user experience
The Trade-off
With backpressure:
- ✅ System stays responsive
- ✅ Some requests succeed quickly
- ❌ Some users see errors
- ❌ Need to handle retries
Without backpressure:
- ✅ Fewer visible errors (every request is accepted)
- ❌ System may crash
- ❌ Everyone waits longer
- ❌ Worse overall experience
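The "need to handle retries" cost usually lands on the client. One common approach is retrying with exponential backoff and jitter (a sketch; `send` is a stand-in for a real request function that returns an HTTP status code, and the delays are shortened so the example runs quickly):

```python
import random
import time

def call_with_retries(send, max_attempts: int = 4) -> int:
    delay = 0.01  # short base delay for the sketch
    for attempt in range(max_attempts):
        status = send()
        if status not in (429, 503):
            return status            # success or a non-overload error
        if attempt < max_attempts - 1:
            time.sleep(delay * (1 + random.random()))  # jittered wait
            delay *= 2               # exponential backoff
    return status                    # give up after the last attempt

# Simulated server: overloaded twice, then recovers.
responses = iter([503, 429, 200])
result = call_with_retries(lambda: next(responses))
print(result)
```

The jitter matters: if every rejected client retried on the same schedule, the retries themselves would arrive as a synchronized spike.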
Key Takeaways
- Backpressure slows producers when consumers fall behind
- Reject requests when queue depth, CPU, or memory exceeds a threshold
- Return clear errors (429/503) with helpful messages
- Better than unbounded queues: prevents crashes
- Better than crashing: some requests succeed
- Trade-off: Some users see errors, but system stays up
What’s Next?
In the next page, we’ll explore scaling patterns. You’ll learn about vertical vs horizontal scaling, and when they stop helping. You’ll also see how solving one bottleneck often reveals the next one.