
Ride Hailing (e.g. Uber / Lyft)

Problem statement

Match riders to drivers in real time, estimate ETA and fare, handle surge pricing, payments, ratings, and safety events at city scale.

How it works

  1. Drivers broadcast GPS periodically → location ingestion updates a geospatial index.
  2. Rider requests ride → matching service finds nearby available drivers (radius + ETA).
  3. Dispatch assigns the best driver; both apps receive WebSocket updates as the trip state machine advances.
  4. Trip ends → billing + receipt + analytics.

Analogy: A restaurant pager system: the kitchen (matching) sees who is waiting (riders) and which tables are free (drivers), then buzzes the right pair.

High-level design


Components explained — this design

| Component | What it is | Why we use it here |
|---|---|---|
| API Gateway + WAF | Edge HTTP + firewall (see glossary). | Protects matching endpoints from scrapers; authenticates riders/drivers via JWT. |
| Location ingestion → Kafka | Stream of GPS updates from drivers. | High volume append stream; Kafka gives buffer + replay if downstream geo index lags. |
| Redis GEO / PostGIS | Geospatial indexes for “nearby drivers”. | Redis GEO for sub-second radius queries at huge QPS; PostGIS if you already run SQL and moderate scale. |
| Matching / Dispatch | Scores candidates by ETA, surge, rating. | Core marketplace brain; isolated service to scale CPU-heavy routing independently. |
| Trip state service | State machine for ride lifecycle. | Clear ownership of transitions; emits events to billing and analytics. |
| PostgreSQL | Relational store for trips, users, payments references. | ACID for money-adjacent records and referential integrity between trip and payment. |
| Time-series DB | Optimized store for metrics-like location traces. | Optional for heatmaps and fraud investigation without bloating OLTP schema. |

Shared definitions: 00-glossary-common-services.md

Low-level design

Location pipeline

  • Mobile → backend: batched HTTPS uploads, or MQTT to reduce battery drain; AWS IoT Core is an optional managed MQTT broker.
  • Stream: Amazon Kinesis or Kafka, partitioned by driver_id so that each driver's updates stay in order.
  • Hot geospatial: Redis GEOADD + GEORADIUS for sub-second nearby queries at high QPS (on Redis 6.2+, GEORADIUS is deprecated in favor of GEOSEARCH).
  • Cold path: S3 + Athena for historical heatmaps / city planning.
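The hot-path radius query can be illustrated without a Redis instance. The following is a minimal pure-Python sketch of the same idea (bucket drivers into grid cells, scan the cells that cover the search circle, then filter by exact distance). It is a stand-in for what a geospatial index does, not Redis client code; `DriverIndex`, `CELL_DEG`, and the 0.01-degree cell size are hypothetical names and values chosen for illustration, and the sketch ignores the antimeridian and the poles.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0088
CELL_DEG = 0.01  # hypothetical cell size: roughly 1.1 km of latitude


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class DriverIndex:
    """In-memory grid index: driver positions bucketed by lat/lon cell."""

    def __init__(self):
        self._pos = {}                  # driver_id -> (lat, lon)
        self._cells = defaultdict(set)  # (row, col) -> {driver_id}

    @staticmethod
    def _cell(lat, lon):
        return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))

    def update(self, driver_id, lat, lon):
        """Record the latest position, moving the driver between cells if needed."""
        old = self._pos.get(driver_id)
        if old is not None:
            self._cells[self._cell(*old)].discard(driver_id)
        self._pos[driver_id] = (lat, lon)
        self._cells[self._cell(lat, lon)].add(driver_id)

    def nearby(self, lat, lon, radius_km):
        """Return driver ids within radius_km, nearest first."""
        # Approximate bounding box of the circle (about 111 km per degree of latitude).
        dlat = radius_km / 111.0
        dlon = dlat / max(math.cos(math.radians(lat)), 0.01)
        lo = self._cell(lat - dlat, lon - dlon)
        hi = self._cell(lat + dlat, lon + dlon)
        hits = []
        for row in range(lo[0], hi[0] + 1):
            for col in range(lo[1], hi[1] + 1):
                for driver_id in self._cells.get((row, col), ()):
                    dist = haversine_km(lat, lon, *self._pos[driver_id])
                    if dist <= radius_km:
                        hits.append((dist, driver_id))
        return [driver_id for _, driver_id in sorted(hits)]
```

Calling `update` on every GPS ping and `nearby` on every ride request mirrors the write and read sides of the hot path; a production index would also need to expire drivers that stop reporting.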

Matching

  • Criteria: distance, plus road ETA from Mapbox, the Google Routes API, or a self-hosted Valhalla routing graph.
  • Surge: keep a Redis counter per geohash cell; the pricing service reads the resulting multiplier.
  • Fairness: this is a two-sided marketplace, so avoid starvation by aging entries in the match queue (priority grows with wait time).
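One way to combine ETA-based scoring with aging is a single cost function in which waiting time earns a capped credit. The sketch below is an assumption-laden illustration, not a description of any production dispatcher: the `Candidate` fields, the weights, and the choice to age drivers by idle time are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    driver_id: str
    eta_s: float   # pickup ETA in seconds, as returned by the routing engine
    rating: float  # driver rating on a 1.0 to 5.0 scale
    idle_s: float  # seconds the driver has waited without a trip


# Hypothetical tuning constants, for illustration only.
RATING_WEIGHT_S = 30.0       # each rating point below 5.0 costs 30 s of ETA
AGING_WEIGHT = 0.1           # each idle second earns 0.1 s of credit
MAX_AGING_CREDIT_S = 120.0   # cap so aging never dominates pickup time


def score(c: Candidate) -> float:
    """Lower is better: ETA plus a rating penalty, minus a capped aging credit."""
    rating_penalty = (5.0 - c.rating) * RATING_WEIGHT_S
    aging_credit = min(c.idle_s * AGING_WEIGHT, MAX_AGING_CREDIT_S)
    return c.eta_s + rating_penalty - aging_credit


def rank(candidates):
    """Order candidates best-first; ties break on driver_id for determinism."""
    return sorted(candidates, key=lambda c: (score(c), c.driver_id))
```

With these weights, a driver 180 s away who has been idle for 15 minutes outranks a driver 120 s away who just finished a trip, which is the starvation-avoidance effect aging is meant to produce. The cap keeps a long-idle but distant driver from being sent across the city.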

Trip state machine

States: REQUESTED → MATCHED → EN_ROUTE → IN_PROGRESS → COMPLETED | CANCELLED.
Event-source the transitions to Kafka and project them into a CQRS read model in PostgreSQL; the event log then doubles as the audit trail.
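The lifecycle can be sketched as a transition table plus an append-only event list. This is a minimal in-memory illustration: the Kafka and PostgreSQL parts are left out, `Trip`, `ALLOWED`, and `replay` are hypothetical names, and allowing CANCELLED from every non-terminal state is an assumption, since the state list does not say where cancellation is permitted.

```python
from enum import Enum


class TripState(Enum):
    REQUESTED = "REQUESTED"
    MATCHED = "MATCHED"
    EN_ROUTE = "EN_ROUTE"
    IN_PROGRESS = "IN_PROGRESS"
    COMPLETED = "COMPLETED"
    CANCELLED = "CANCELLED"


# Assumption: cancellation is allowed from any non-terminal state.
ALLOWED = {
    TripState.REQUESTED: {TripState.MATCHED, TripState.CANCELLED},
    TripState.MATCHED: {TripState.EN_ROUTE, TripState.CANCELLED},
    TripState.EN_ROUTE: {TripState.IN_PROGRESS, TripState.CANCELLED},
    TripState.IN_PROGRESS: {TripState.COMPLETED, TripState.CANCELLED},
    TripState.COMPLETED: set(),
    TripState.CANCELLED: set(),
}


class InvalidTransition(Exception):
    pass


class Trip:
    def __init__(self, trip_id):
        self.trip_id = trip_id
        self.state = TripState.REQUESTED
        # Append-only log of (from_state, to_state); stands in for the event stream.
        self.events = [(None, TripState.REQUESTED)]

    def transition(self, new_state):
        """Apply a transition, rejecting anything the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise InvalidTransition(f"{self.state.name} -> {new_state.name}")
        self.events.append((self.state, new_state))
        self.state = new_state


def replay(events):
    """Rebuild the current state by folding over the event log."""
    state = None
    for _, to_state in events:
        state = to_state
    return state
```

Because the current state can always be rebuilt with `replay`, the read model is disposable: it can be dropped and re-projected from the log if its schema changes.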

Payments

  • Stripe Connect for marketplace splits (platform fee + driver payout).
  • PCI DSS: never store the card number (PAN); keep only the payment processor's token.
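The split itself is plain arithmetic, and it is safest in integer minor units so that the fee and the payout always sum to the fare. A minimal sketch follows, assuming the platform fee is expressed in basis points; `split_fare` and the 25% rate in the usage note are hypothetical, and nothing here calls the Stripe API.

```python
def split_fare(fare_cents: int, platform_fee_bps: int) -> tuple[int, int]:
    """Split a fare into (platform_fee, driver_payout), both in integer cents.

    platform_fee_bps is the fee in basis points (2500 = 25%).
    The fee is rounded half up; the driver receives the remainder,
    so the two parts always add up to the original fare.
    """
    if fare_cents < 0:
        raise ValueError("fare_cents must be non-negative")
    if not 0 <= platform_fee_bps <= 10_000:
        raise ValueError("platform_fee_bps must be between 0 and 10000")
    fee = (fare_cents * platform_fee_bps + 5_000) // 10_000
    return fee, fare_cents - fee
```

For an $18.50 fare at a 25% fee, `split_fare(1850, 2500)` yields a 463-cent fee and a 1387-cent payout. Avoiding floats here prevents one-cent discrepancies between the receipt and the payout ledger.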

Notifications

  • SNS fan-out to FCM/APNs pushes ride offers to drivers; each offer times out after a 10 s accept window.

E2E: request to match


Tricky parts

| Problem | Solution |
|---|---|
| Thundering herd at bar close | Shard surge by geohash; capacity limits on dispatch |
| Split-brain double assign | Optimistic locking on driver row version; idempotent accept |
| Ghost drivers offline but “available” | Heartbeat TTL in Redis; auto-offline |
| Regulatory | Per-city feature flags; data residency (EU trips in EU region) |
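The double-assign fix can be sketched as a compare-and-set on a version column combined with an accept call that is safe to retry. The code below uses an in-memory dictionary as a stand-in for the driver table; `DriverStore`, `assign_if_version`, and `accept_offer` are hypothetical names, and the SQL in the docstring only indicates the kind of conditional update a real store would run.

```python
class DriverStore:
    """In-memory stand-in for a driver table with a version column."""

    def __init__(self):
        self._rows = {}  # driver_id -> {"version": int, "trip_id": str or None}

    def add(self, driver_id):
        self._rows[driver_id] = {"version": 0, "trip_id": None}

    def read(self, driver_id):
        row = self._rows[driver_id]
        return row["version"], row["trip_id"]

    def assign_if_version(self, driver_id, expected_version, trip_id):
        """Compare-and-set, in the spirit of:

        UPDATE drivers SET trip_id = ?, version = version + 1
        WHERE id = ? AND version = ?
        """
        row = self._rows[driver_id]
        if row["version"] != expected_version:
            return False  # someone else changed the row since we read it
        row["trip_id"] = trip_id
        row["version"] += 1
        return True


def accept_offer(store, driver_id, trip_id):
    """Idempotent accept: retrying the winning accept succeeds again."""
    version, current = store.read(driver_id)
    if current == trip_id:
        return True   # duplicate delivery of an accept that already won
    if current is not None:
        return False  # driver already holds a different trip
    return store.assign_if_version(driver_id, version, trip_id)
```

If two dispatchers read the same version and both try to assign, only the first write matches the version; the second gets `False` and must pick another driver. The idempotent path matters because mobile clients retry on flaky networks.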

Caveats

  • Map APIs are typically billed per request: cache static road segments for long periods and refresh dynamic traffic data more often.
  • Safety: run 911 integration, trip sharing, and in-app emergency features in a separate, highly available microservice.
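The caching caveat comes down to giving different kinds of map data different lifetimes. Here is a minimal sketch of a TTL cache with a per-call TTL; `TTLCache`, the key format, and both TTL constants are hypothetical, and `fetch` stands for whatever function actually calls the map provider.

```python
import time

# Hypothetical lifetimes, for illustration only.
STATIC_SEGMENT_TTL_S = 24 * 3600  # road geometry changes rarely
TRAFFIC_TTL_S = 60                # live traffic goes stale quickly


class TTLCache:
    """Tiny cache where each entry carries its own expiry time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so tests can control time
        self._items = {}      # key -> (expires_at, value)

    def get_or_fetch(self, key, ttl_s, fetch):
        """Return the cached value, or call fetch() and cache it for ttl_s seconds."""
        now = self._clock()
        hit = self._items.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = fetch()
        self._items[key] = (now + ttl_s, value)
        return value
```

A caller would use `STATIC_SEGMENT_TTL_S` for keys such as road-segment geometry and `TRAFFIC_TTL_S` for traffic speeds on the same segment, so the expensive static lookups are paid for roughly once a day while traffic stays fresh.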

Azure mapping

  • Azure Event Hubs for the location stream; Azure Cache for Redis for GEO queries; Azure Maps for routing.