Designing a Ride Sharing Service (Uber/Lyft)
Designing Uber is one of the most complex system design questions because it combines real-time constraints, heavy write throughput (location updates), and complex consistency requirements (payment/matching).
1. Requirements
Functional Requirements
- Rider: Can see nearby cabs. Can request a ride. Can track the driver in real-time.
- Driver: Can accept a ride. Can mark trip as started/ended.
- System: Must match riders with the nearest available driver optimally.
Non-Functional Requirements
- Low Latency: Matching must happen in < 2 seconds.
- Consistency: A driver cannot be assigned to two riders simultaneously.
- High Availability: The system must accept ride requests even if payments are slow.
2. High-Level Architecture
The system is split into two distinct subsystems: the Supply Service (Drivers) and the Demand Service (Riders).
Components
- WebSocket Gateway: Maintains persistent connections with Driver and Rider apps (avoiding HTTP polling overhead).
- Location Service: Ingests streams of latitude/longitude updates from drivers.
- Map Service: Calculates ETA and routes (using Google Maps or OpenStreetMap data).
- Matching Service: The "brain" that pairs Supply with Demand.
- Trip Service: Manages the state machine of a trip.
3. The Core Challenge: Geospatial Indexing
How do we efficiently find "all drivers within 1km of User X"?
Naive Approach: SQL
SELECT * FROM drivers WHERE lat BETWEEN x1 AND x2 AND long BETWEEN y1 AND y2
- Problem: This requires a full table scan or massive B-Tree index lookups for every request. It doesn't scale to millions of drivers.
Solution 1: QuadTrees
Solution 2: Geohashing (String-based)
We divide the world into a grid and assign a unique string base-32 string to each cell.
u4pruydqqvjrepresents a precise 1 meter box.u4pruyrepresents a larger neighborhood.- Pro: Neighbors share common prefixes. querying neighbors is just string matching.
Interactive: Spatial Indexing (H3/S2)
Geospatial Grid Search (H3/S2)
> User Location: Cell #21
Solution 3: Google S2 / Uber H3 (The Industry Standard)
Uber actually uses H3 (Hexagonal Hierarchical Spatial Index).
- Why Hexagons? In a square grid, diagonal neighbors are further away than horizontal neighbors. In a hexagon grid, all 6 neighbors are equidistant. This simplifies "Expand Ring" search algorithms significantly.
- Implementation: Each driver's location is mapped to an H3 Cell ID (a 64-bit integer). This ID is the Sharding Key.
4. Driver Location Updates (Heavy Write Load)
Drivers send updates every 3 seconds. With 1 million active drivers, that's ~330k writes per second.
The Problem with Databases
Writing 330k updates/sec to Postgres or MySQL will kill the disk I/O.
The Solution: Ephemeral Storage (Redis)
We don't need to persist every historical location forever immediately. We only need the current location for matching.
- Ingestion: Driver app sends coordinates via WebSocket.
- Buffer: Updates go into a Redis Cluster (Geospatial Set).
- Command:
GEOADD drivers <long> <lat> <driver_id>
- Command:
- Expiry: Set a TTL of 10 seconds. If a driver loses connection, they "disappear" from the index automatically.
- Archival: A separate Kafka stream pumps data to Cassandra/S3 for "Trip History" and analytics/auditing.
5. The Matching Service
When a rider requests a ride:
- Find Candidates: Query Redis
GEORADIUSfor drivers within 2km. - Filter: Remove drivers who are "Busy" or have "Low Rating".
- Rank: Call Map Service to get actual driving ETA (not just straight-line distance).
- Lock: Attempt to "Reserve" the best driver using a Distributed Lock (Redis Redlock).
- Why? Multiple riders might be requesting in the same area. We must ensure only one rider grabs that driver.
- Offer: Send "New Ride Request" notification to Driver.
- Accept: If Driver accepts, create Trip. If Driver ignores/declines, try the next driver in the ranked list.
6. Fault Tolerance
- Ringpop: Uber uses a library called Ringpop for application-layer sharding. It allows services to gossip and discover which node handles which Hexagon ID.
- Availability vs Consistency: For the "Location Service", we favor Availability (AP). It's okay if a driver position is 3 seconds stale. For "Payment/Trip", we favor Consistency (CP).
Summary
- Index: H3 (Hexagonal) for uniform neighborhood traversal.
- Storage: Redis for high-write throughput of ephemeral locations.
- Communication: WebSocket for bi-directional low-latency updates.
- Scale: Shard by H3 Cell ID.
About ScaleWiki
ScaleWiki is an interactive educational platform dedicated to demystifying distributed systems, software architecture, and system design. Our mission is to provide high-quality, technically accurate resources for software engineers preparing for interviews or solving complex scaling challenges in production.
Read more about our Editorial Guidelines & Authorship.
Educational Disclaimer: The architectural patterns and system designs discussed in this article are based on common industry practices, technical whitepapers, and public engineering blogs. Actual implementations in enterprise environments may vary significantly based on specific product requirements, legacy constraints, and evolving technologies.
Related Articles
Geohashing (Location Encoding)
A geocoding system that encodes latitude/longitude coordinates into short alphanumeric strings for efficient proximity searches and spatial indexing.
System Design: Notification System
How to send millions of SMS, Email, and Push notifications reliably. Message Queues, Rate Limiting, and Retry policies.
System Design: WhatsApp (Chat App)
How to design a massive scale chat application with focus on WebSocket architecture, End-to-End Encryption, and offline message delivery.