From One Request to Many
So far we’ve looked at one request. Now let’s add traffic.
What if 10 users per second request a ride?
What if 200 users per second do it?
Let’s see what happens.
The Intuition
Think of a cashier at a store. One cashier can serve about 5 customers per minute. If 20 customers arrive per minute, a queue forms. The store door still lets people in, but they wait longer.
Same thing happens with our dispatcher. If requests arrive faster than we can process them, they queue up inside the dispatcher process.
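The cashier intuition can be sketched in a few lines of code. This is purely illustrative, using the numbers from the analogy (5 served per minute, 20 arriving per minute), not code from any real dispatcher:

```python
# Illustrative sketch: one cashier serving 5 customers/minute
# while 20 arrive/minute. Watch the line grow.

CAPACITY_PER_MIN = 5
ARRIVALS_PER_MIN = 20

queue = 0
for minute in range(1, 6):
    queue += ARRIVALS_PER_MIN              # customers join the line
    queue -= min(queue, CAPACITY_PER_MIN)  # cashier serves up to 5
    print(f"minute {minute}: {queue} waiting")
```

Every minute, 15 more customers join the line than leave it, so after 5 minutes there are 75 people waiting, and the number keeps climbing.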

What Happens as Load Grows
Low load (1-5 req/s):
- No queue forms
- Latency stays at ~200ms
- System is comfortable
Medium load (6-15 req/s):
- Queue starts forming
- Latency grows: 200ms + queue wait time
- System is stressed but functional
High load (16-30 req/s):
- Queue grows quickly
- Latency can reach several seconds
- Some requests may time out
Very high load (30+ req/s):
- Queue grows without bound
- Requests start timing out
- System may become unresponsive
The Math
If each request takes 200ms:
- One instance's capacity: 5 req/s (1000 ms ÷ 200 ms)
- If 20 requests arrive per second:
- 5 are processed immediately
- 15 queue up
- Queue grows by 15 per second
- After 10 seconds, 150 requests are queued
- Each queued request waits: queue position × 200ms
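The arithmetic above can be spelled out directly. All numbers come from the text (200 ms per request, 20 req/s arriving at a 5 req/s instance):

```python
# The math section's numbers, spelled out.

SERVICE_MS = 200
capacity = 1000 // SERVICE_MS           # 5 requests per second
arrivals = 20                           # requests arriving per second

growth_per_s = arrivals - capacity      # 15 requests queue up each second
queued_10s = growth_per_s * 10          # 150 requests queued after 10 s
back_wait_ms = queued_10s * SERVICE_MS  # the request at the back waits 30,000 ms

print(growth_per_s, queued_10s, back_wait_ms)
```

Note that the request at the back of the queue already waits 30 seconds, which is why the timeout discussion below matters.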
Real-World Analogy
Think of a highway toll booth:
- One booth can process 5 cars per minute
- If 20 cars arrive per minute, 15 wait
- The line grows longer
- Cars at the back wait much longer
Same with our dispatcher. Requests at the back of the queue wait much longer than requests at the front.
When Requests Time Out
Most clients have timeout settings:
- Mobile apps: 5-10 seconds
- Web browsers: 30 seconds
- API clients: varies
If latency exceeds the timeout, the client gives up. The request might still be processing on the server, but the user sees an error.
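We can estimate how deep in the queue a request can sit before it exceeds a client timeout. The 10-second figure is just one of the example timeouts above:

```python
# How far back in the queue can a request be before it times out?
# Assumes a 200 ms service time and a 10 s client timeout (the upper
# end of the mobile-app range mentioned above).

SERVICE_MS = 200
TIMEOUT_MS = 10_000

# A request at queue position n waits roughly n * SERVICE_MS.
positions_before_timeout = TIMEOUT_MS // SERVICE_MS
print(f"requests past position {positions_before_timeout} will time out")
```

With 150 requests queued (the scenario from The Math), everything past position 50 is already doomed: the server will eventually process those requests, but the clients have given up.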
Key Takeaways
- Queues form when requests arrive faster than capacity
- Latency grows as queue length increases
- Requests time out when latency exceeds the client's timeout
- Capacity limits determine when queues form
- If capacity is 5 req/s and 20 arrive, queue grows by 15 per second
What’s Next?
In the next page, we’ll introduce a message queue. This changes the architecture and lets us smooth out traffic spikes. You’ll see how queues can help, but also create new problems.