Setting the Scene
We run a tiny ride-hailing service. Users tap a button to request a ride. Our dispatcher service finds a driver and confirms the ride.
That’s it. Simple.
What We Care About
Users want:
- Fast confirmation. They don’t want to wait 10 seconds for a response.
The business wants:
- Handle traffic spikes without the system falling over. When a concert ends and 200 people request rides at once, the system should still work.
The System We’ll Build
Here’s our starting point:
User → Dispatcher → Database → Driver Match → Response
One dispatcher service. One database. Everything synchronous. We’ll add complexity as we go.
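To make the flow concrete, here is a minimal sketch of that synchronous pipeline in Python. The step names and timings are invented for illustration; the point is that each step blocks until the previous one finishes, so the response time is the sum of the steps.

```python
import time

def query_database(user_id):
    """Simulated database lookup (hypothetical 50 ms)."""
    time.sleep(0.05)
    return {"user_id": user_id, "location": "downtown"}

def match_driver(request):
    """Simulated driver matching (hypothetical 150 ms)."""
    time.sleep(0.15)
    return {"driver": "driver-42", "eta_min": 3}

def dispatch(user_id):
    """One synchronous request: database, then matching, then respond."""
    start = time.time()
    request = query_database(user_id)
    match = match_driver(request)
    latency = time.time() - start
    return match, latency

match, latency = dispatch("user-1")
print(f"matched {match['driver']} in {latency:.2f}s")
```

Because nothing runs in parallel, the 50 ms and 150 ms add up to roughly the 200 ms processing time we use in the scenarios below.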
Key Concepts We’ll Explore
Before we dive in, let’s define what we’re measuring:
Latency: Time for one request to complete. If a user clicks “Request Ride” at 10:00:00 and gets a response at 10:00:02, the latency is 2 seconds.
Throughput: Requests handled per second. If our system can process 10 ride requests per second, that’s our throughput.
Concurrency: How many requests are “in flight” at once. If 5 requests are being processed simultaneously, concurrency is 5.
Queue: Requests waiting to be processed. When requests arrive faster than we can process them, they queue up.
These concepts connect. If each request takes 200ms to process, one instance can finish about 5 requests per second (1 ÷ 0.2s). If 20 requests arrive per second, the queue grows by roughly 15 requests every second.
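The arithmetic above can be written out directly. This is back-of-the-envelope math, not a queuing simulation; it assumes one instance processing requests one at a time.

```python
# One instance, processing requests serially.
processing_time_s = 0.2   # 200 ms per request
arrival_rate = 20         # requests arriving per second

# Capacity: how many requests one instance can finish per second.
capacity = 1 / processing_time_s

# If arrivals exceed capacity, the difference piles up in the queue.
queue_growth = arrival_rate - capacity

print(f"capacity: {capacity:.0f} req/s")       # 5 req/s
print(f"queue grows by {queue_growth:.0f}/s")  # 15 per second
```

After one minute at this rate, the queue would hold about 900 requests, which is why latency explodes long before the server itself "fails."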
Choose Your Scenario
Different cities have different traffic patterns. Pick a scenario to explore:
Small City:
- 5 requests per second
- 200ms average processing time
- Low traffic, easy to handle
Medium City:
- 20 requests per second
- 200ms average processing time
- Moderate load, some queuing
Big City:
- 50 requests per second
- 200ms average processing time
- High load, queues will form
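Running all three scenarios through the same capacity math shows the pattern at a glance. The list-of-dicts structure here is just one convenient way to organize the numbers from above.

```python
# The three scenarios from above, checked against single-instance capacity.
scenarios = [
    {"name": "Small City",  "arrival_rate": 5,  "processing_time_s": 0.2},
    {"name": "Medium City", "arrival_rate": 20, "processing_time_s": 0.2},
    {"name": "Big City",    "arrival_rate": 50, "processing_time_s": 0.2},
]

for s in scenarios:
    capacity = 1 / s["processing_time_s"]          # req/s one instance finishes
    backlog = max(0.0, s["arrival_rate"] - capacity)  # req/s added to the queue
    status = "keeps up" if backlog == 0 else f"queue grows by {backlog:.0f}/s"
    print(f"{s['name']}: capacity {capacity:.0f} req/s, {status}")
```

Only the small city stays at or under capacity; the medium and big cities accumulate a backlog of 15 and 45 requests per second, respectively.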
What’s Next?
On the next page, we’ll model the basic synchronous system. You’ll see how a single request flows through the dispatcher, where the time goes, and how we measure latency and throughput.