
The Simplest Version

Let’s start with the simplest possible system:

  • Single dispatcher service
  • Synchronous request handling
  • Everything on one instance

The Request Flow

Here’s what happens when a user requests a ride:

Client → Dispatcher → DB Lookups → Driver Match → Response

Step by step:

  1. Client sends request: User taps “Request Ride” button
  2. Dispatcher receives: API endpoint receives the request
  3. DB lookups: Dispatcher queries database for:
    • Available drivers near the user
    • User’s ride history
    • Pricing rules
  4. Driver match: Dispatcher selects best driver
  5. Response: Dispatcher sends confirmation back to client

Everything happens in sequence. The client waits for the entire process to finish.
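To make the sequence concrete, here's a minimal Python sketch of the handler. Every name in it (find_nearby_drivers, select_best_driver, and so on) is made up for illustration; the point is the shape of the flow, with each step blocking until the previous one finishes.

```python
# Hypothetical stubs standing in for real database queries.
def find_nearby_drivers(location):
    return [{"id": "d1", "eta_min": 4}, {"id": "d2", "eta_min": 7}]

def get_ride_history(user_id):
    return []  # would return past rides in a real system

def get_pricing_rules(location):
    return {"base_fare": 3.0}

def select_best_driver(drivers):
    # Simplest possible matching rule: lowest ETA wins.
    return min(drivers, key=lambda d: d["eta_min"])

def handle_ride_request(user_id, location):
    # Step 3: DB lookups, one after another.
    drivers = find_nearby_drivers(location)
    history = get_ride_history(user_id)   # would inform matching in a real system
    pricing = get_pricing_rules(location)
    # Step 4: driver match.
    driver = select_best_driver(drivers)
    # Step 5: response. The client has been waiting this whole time.
    return {"driver_id": driver["id"], "fare": pricing["base_fare"]}

print(handle_ride_request("u42", (37.77, -122.42)))
```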

Where Time Is Spent

Let’s break down where time goes in a typical request:

Network time: 50ms

  • Client to dispatcher: 25ms
  • Dispatcher to client: 25ms

Dispatcher logic: 50ms

  • Request parsing and validation
  • Business logic execution

Database time: 100ms

  • Query execution
  • Result processing

Total latency: ~200ms

This is a simplified model, but it’s close enough for learning.
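If you want to poke at the numbers, here's a tiny simulation of one request, assuming the stage budgets above (time.sleep stands in for each stage; the values are this page's model, not measurements of a real system):

```python
import time

def simulate_request() -> float:
    """Run one synchronous request using the stage budgets above."""
    start = time.perf_counter()
    time.sleep(0.025)  # network: client → dispatcher
    time.sleep(0.050)  # dispatcher logic: parsing, validation, business logic
    time.sleep(0.100)  # database: query execution and result processing
    time.sleep(0.025)  # network: dispatcher → client
    return (time.perf_counter() - start) * 1000  # total latency in ms

print(f"Latency: {simulate_request():.0f}ms")  # ~200ms
```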

Interactive Timeline

Watch a single request flow through the system:

[Interactive timeline: Request (25ms) → Query (100ms) → Match (50ms) → Confirm (25ms), flowing from Client to Dispatcher to Database to Driver Match and back as the Response]

Key Definitions

Latency: Time for one request to complete. If a request takes 200ms from start to finish, latency is 200ms.

Throughput: Requests handled per second. If each request takes 200ms, one instance can finish about 5 requests per second (1000ms ÷ 200ms = 5).

Concurrency: How many requests are “in flight” at once. If requests take 200ms and 10 arrive per second, you’ll have about 2 requests being processed simultaneously on average.
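These three quantities are linked by Little's Law: concurrency = arrival rate × latency. A quick check with the numbers above:

```python
latency_s = 0.200    # 200ms per request
arrival_rate = 10.0  # requests arriving per second

# Little's Law: in-flight requests = arrival rate × latency
concurrency = arrival_rate * latency_s
print(concurrency)   # 2.0 requests in flight, on average

# One instance handling requests one at a time can't exceed:
max_throughput = 1 / latency_s
print(max_throughput)  # 5.0 requests per second
```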

Latency Breakdown

Hover over each stage to see where time is spent:

[Interactive chart: Network 50ms · Dispatcher Logic 50ms · Database 100ms]

The Math

If each request takes 200ms:

  • One instance can process: 1000ms ÷ 200ms = 5 requests per second
  • This is the maximum throughput for one instance

If requests arrive faster than 5 per second, they’ll start queuing. We’ll see that next.
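Here's that ceiling in code, assuming a steady arrival rate and one instance processing requests one at a time:

```python
latency_s = 0.200
max_throughput = 1 / latency_s  # 5 req/s ceiling for one instance

for arrival_rate in (3, 5, 8):
    # Anything above the ceiling has nowhere to go but a queue.
    backlog_growth = max(0.0, arrival_rate - max_throughput)
    print(f"{arrival_rate} req/s in → queue grows by {backlog_growth:.0f} req/s")

# 3 req/s in → queue grows by 0 req/s  (headroom)
# 5 req/s in → queue grows by 0 req/s  (saturated)
# 8 req/s in → queue grows by 3 req/s  (requests pile up)
```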

Key Takeaways

  • Latency is the time for one request to complete
  • Throughput is requests handled per second
  • Concurrency is how many requests are in flight
  • If latency is 200ms, max throughput is about 5 req/s per instance
  • Network, dispatcher logic, and database all contribute to latency

What’s Next?

On the next page, we'll add load and watch bottlenecks appear. You'll see how queues form, latency grows, and requests start timing out as traffic increases.