The Simplest Version
Let’s start with the simplest possible system:
- Single dispatcher service
- Synchronous request handling
- Everything on one instance
The Request Flow
Here’s what happens when a user requests a ride:
Client → Dispatcher → DB Lookups → Driver Match → Response
Step by step:
- Client sends request: User taps “Request Ride” button
- Dispatcher receives: API endpoint receives the request
- DB lookups: Dispatcher queries database for:
  - Available drivers near the user
  - User’s ride history
  - Pricing rules
- Driver match: Dispatcher selects best driver
- Response: Dispatcher sends confirmation back to client
Everything happens in sequence. The client waits for the entire process to finish.
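The synchronous flow above can be sketched as one blocking function. This is a toy simulation, not a real dispatcher: the stage names and timings are illustrative stand-ins, and each `time.sleep` plays the role of real network, compute, or database work.

```python
import time

# Illustrative stage durations (ms); real values vary per deployment.
NETWORK_MS = 25      # one-way hop between client and dispatcher
DISPATCHER_MS = 50   # request parsing, validation, business logic
DATABASE_MS = 100    # driver lookup, ride history, pricing rules

def handle_ride_request() -> float:
    """Simulate one synchronous request end to end; returns latency in ms."""
    start = time.monotonic()
    time.sleep(NETWORK_MS / 1000)     # client -> dispatcher
    time.sleep(DISPATCHER_MS / 1000)  # dispatcher logic
    time.sleep(DATABASE_MS / 1000)    # DB lookups + driver match
    time.sleep(NETWORK_MS / 1000)     # dispatcher -> client
    return (time.monotonic() - start) * 1000

latency = handle_ride_request()
print(f"latency: {latency:.0f}ms")  # roughly 200ms
```

Because every stage blocks the next, the total latency is simply the sum of the stages, which is the key property of this simplest version.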
Where Time Is Spent
Let’s break down where time goes in a typical request:
Network time: 50ms
- Client to dispatcher: 25ms
- Dispatcher to client: 25ms
Dispatcher logic: 50ms
- Request parsing and validation
- Business logic execution
Database time: 100ms
- Query execution
- Result processing
Total latency: ~200ms
This is a simplified model, but it’s close enough for learning.
Key Definitions
Latency: Time for one request to complete. If a request takes 200ms from start to finish, latency is 200ms.
Throughput: Requests handled per second. If each request takes 200ms, one instance can finish about 5 requests per second (1000ms ÷ 200ms = 5).
Concurrency: How many requests are “in flight” at once. If requests take 200ms and 10 arrive per second, you’ll have about 2 requests being processed simultaneously on average.
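The three definitions above reduce to two lines of arithmetic. The average-concurrency figure is an instance of Little's Law (average in-flight requests = arrival rate × latency); the numbers below are the ones used in the definitions.

```python
latency_s = 0.200    # 200ms per request
arrival_rate = 10    # requests arriving per second

# Throughput ceiling of one instance processing requests one at a time:
max_throughput = 1 / latency_s              # 5 requests per second

# Average concurrency via Little's Law (L = lambda * W):
avg_concurrency = arrival_rate * latency_s  # 2 requests in flight

print(max_throughput, avg_concurrency)  # 5.0 2.0
```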
The Math
If each request takes 200ms:
- One instance can process: 1000ms ÷ 200ms = 5 requests per second
- This is the maximum throughput for one instance
If requests arrive faster than 5 per second, they’ll start queuing. We’ll see that next.
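To see why queuing is inevitable, compare arrival rate to the throughput ceiling. The 8 req/s arrival rate below is a hypothetical load chosen for illustration: anything above 5 req/s makes the backlog grow without bound.

```python
latency_s = 0.200
max_throughput = 1 / latency_s  # 5 req/s per instance

arrival_rate = 8  # hypothetical load above capacity
backlog_growth = arrival_rate - max_throughput  # requests queued per second

# After 10 seconds at this rate, ~30 requests are waiting, and each one
# waits behind the queue before its own 200ms of processing even starts.
queued_after_10s = backlog_growth * 10
print(backlog_growth, queued_after_10s)  # 3.0 30.0
```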
Key Takeaways
- Latency is the time for one request to complete
- Throughput is requests handled per second
- Concurrency is how many requests are in flight
- If latency is 200ms, max throughput is about 5 req/s per instance
- Network, dispatcher logic, and database all contribute to latency
What’s Next?
In the next page, we’ll add load and watch bottlenecks appear. You’ll see how queues form, latency grows, and requests start timing out as traffic increases.