Intermediate 25 min

Adding a Queue

Now we add an actual message queue between the API and the worker.

New Architecture

Before (Direct Call):

Client → API → Worker → DB → Response

After (Queued):

Client → API → Queue → Worker → DB → External Services

The API becomes “fire-and-forget” relative to the worker. It accepts requests and puts them in a queue. The worker processes them asynchronously.
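As a minimal sketch (the handler names and the in-memory queue here are hypothetical stand-ins for a real API framework and message broker), the fire-and-forget pattern looks like this:

```python
import queue

# In-memory stand-in for a real message broker (hypothetical example).
jobs = queue.Queue()

def handle_request(payload):
    """Enqueue the work and acknowledge immediately,
    without waiting for the worker to finish."""
    jobs.put(payload)
    return {"status": "accepted"}  # fast acknowledgment (HTTP 202-style)

def worker_step():
    """The worker drains the queue asynchronously, at its own pace."""
    payload = jobs.get()
    # ... process payload: write to DB, call external services ...
    jobs.task_done()
    return payload
```

Note that `handle_request` returns before any processing happens; the worker picks the job up later.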

Why Add Queues?

Smooth spikes:

  • If 200 requests arrive in one second, the queue absorbs them
  • Workers process them at a steady rate
  • System doesn’t crash

Avoid losing requests:

  • If worker restarts, requests stay in queue
  • No requests lost during restarts
  • Better reliability

Decouple components:

  • API doesn’t wait for worker
  • API responds quickly
  • Worker processes at its own pace

What Changes

API behavior:

  • Before: API waits for worker to finish
  • After: API puts request in queue and responds immediately


User experience:

  • Before: User waits for full processing
  • After: User gets quick acknowledgment, processing happens in background

End-to-end time:

  • Users still care about total time
  • But the shape is different: fast acknowledgment, then background processing

Interactive Architecture Toggle

Switch between architectures and see the difference:

Producer and Consumer

Now we have two rates to think about:

  • Producer rate: requests per second entering the queue (from the API)
  • Consumer rate: jobs per second a worker can process

If producer rate > consumer rate, the queue grows.
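A tiny simulation makes this concrete. The function below (a sketch; the constant per-second rates are a simplifying assumption) tracks queue length second by second:

```python
def queue_length_over_time(producer_rate, consumer_rate, seconds):
    """Simulate queue length each second under constant rates.
    Consumers cannot drain the queue below zero."""
    length = 0
    history = []
    for _ in range(seconds):
        length += producer_rate                   # jobs arriving this second
        length = max(0, length - consumer_rate)   # jobs processed this second
        history.append(length)
    return history

# Producer faster than consumer: the queue grows without bound.
print(queue_length_over_time(producer_rate=10, consumer_rate=7, seconds=5))
# → [3, 6, 9, 12, 15]

# Consumer keeps up: the queue stays empty.
print(queue_length_over_time(producer_rate=5, consumer_rate=7, seconds=5))
# → [0, 0, 0, 0, 0]
```

The growth rate in the first case is exactly producer rate minus consumer rate: 3 jobs per second, forever, unless something intervenes.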

Interactive Producer/Consumer Simulator

Adjust the rates and watch the queue:

New Trade-offs

Higher peak throughput possible:

  • Queue absorbs spikes
  • Workers process steadily
  • Can handle bursts

But if consumers can’t keep up:

  • Queue grows without bound
  • Requests wait longer
  • Memory usage increases
  • System may crash

You need backpressure:

  • Stop accepting new work when the queue is too long
  • Reject requests when overloaded
  • We’ll cover this next

Key Takeaways

  • Queues smooth spikes by absorbing bursts
  • API becomes fire-and-forget: it responds quickly
  • Workers process at a steady rate, decoupled from the API
  • Queue grows if producer rate > consumer rate
  • Need backpressure to prevent unbounded growth

What’s Next?

On the next page, we’ll add backpressure and load shedding. You’ll learn how to prevent queues from growing without bound and how to handle overload gracefully.