Real-time Chat (e.g. Slack / WhatsApp-lite)
Problem statement
Design 1:1 and group messaging with online presence, delivery/read receipts, history sync, and mobile push when offline.
How it works
- Active users: maintain WebSocket (or MQTT) connections to a gateway tier.
- Messages: append to durable log (Kafka) + per-conversation store (sharded DB).
- Offline: push notifications (APNs/FCM) + client pulls history on reconnect.
Analogy: Walkie-talkie channel (WebSocket) for live voice, plus voicemail inbox (DB) when the other person is offline.
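The online/offline fork above can be sketched with in-memory stand-ins (`ConnectionRegistry` and the push queue are hypothetical names for illustration, not real APIs):

```python
class ConnectionRegistry:
    """user_id -> live gateway connection; a list stands in for a WebSocket."""
    def __init__(self):
        self._conns = {}

    def connect(self, user_id):
        self._conns[user_id] = []          # pretend socket: captures pushed frames
        return self._conns[user_id]

    def get(self, user_id):
        return self._conns.get(user_id)


def deliver(registry, push_queue, recipient, message):
    """Push over the live socket if the user is online, else enqueue a mobile push."""
    conn = registry.get(recipient)
    if conn is not None:
        conn.append(message)               # server push over the WebSocket
        return "websocket"
    push_queue.append((recipient, message))  # offline: APNs/FCM fan-out
    return "push"
```

On reconnect the client would additionally pull missed history from the conversation store, per the bullet above.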
High-level design
Components explained — this design
| Component | What it is | Why we use it here |
|---|---|---|
| NLB / ALB | Layer 4/7 load balancers distributing TCP/WebSocket connections. | Sticky sessions or least-conn routing to WebSocket gateways; health checks remove unhealthy nodes from rotation. |
| WebSocket gateway | Maintains long-lived connections; frames to/from clients. | Chat needs server push; HTTP polling is wasteful at scale. |
| Presence Redis | Fast key-value with TTL for “user online/offline”. | Presence is ephemeral and high-churn; Redis TTL matches heartbeat semantics cheaply. |
| Kafka | Append-only event log for messages and receipts. | Ordering per conversation partition, replay for new consumers (search indexer), decoupling persistence from gateway CPU. |
| Cassandra / DynamoDB | Wide-column / key-value for message history at scale. | Message tables are append-heavy and partitioned by conversation_id; SQL can become a write bottleneck. |
| S3 | Object storage for attachments/voice notes. | Keeps large blobs out of the DB; lifecycle to cold storage. |
| SNS + APNs/FCM | Mobile push fan-out (see glossary). | Offline users don’t hold WebSockets; push wakes device for new messages. |
Shared definitions: 00-glossary-common-services.md
Low-level design
WebSocket gateway
- Sticky sessions at the load balancer pin a client to the same gateway pod (or use Redis pub/sub for cross-pod fan-out).
- Heartbeat to detect dead TCP; reconnect with exponential backoff on client.
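The client-side reconnect schedule can be sketched as exponential backoff with full jitter (the base/cap/attempt values are illustrative defaults, not part of the design):

```python
import random

def backoff_delays(base=0.5, cap=30.0, attempts=6, rng=random.random):
    """Exponential backoff with full jitter for client reconnects.

    Full jitter (delay = random() * ceiling) spreads reconnect storms
    when a gateway node dies and thousands of clients retry at once.
    """
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng() * ceiling)
    return delays
```

Passing a fixed `rng` makes the schedule deterministic for tests; in production the default `random.random` is used.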
Message ordering
- Per-conversation sequence numbers: keying the Kafka topic by conversation_id routes all of a conversation's messages to a single partition, which guarantees order within the log.
- Clients dedupe by message_id.
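Both ideas can be sketched as below, with CRC32 standing in for Kafka's murmur2 key hash and an in-memory set standing in for client-side dedupe state:

```python
import zlib

def partition_for(conversation_id, num_partitions):
    """All messages keyed by the same conversation_id land on one partition,
    so the log preserves per-conversation order. (Kafka's default partitioner
    uses murmur2 on the key; CRC32 is a stable stand-in here.)"""
    return zlib.crc32(conversation_id.encode()) % num_partitions


class Deduper:
    """Client-side dedupe: at-least-once delivery becomes effectively-once."""
    def __init__(self):
        self._seen = set()

    def accept(self, message_id):
        """Return True the first time a message_id is seen, False on replays."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True
```

A production client would bound the seen-set (e.g. keep only IDs newer than the last acked sequence number) rather than grow it forever.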
Storage
- Hot messages: Redis recent window for speed.
- Cold history: S3 + Parquet for cheap archive, or DB partitions by month.
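The hot window can be sketched as a bounded per-conversation buffer, an in-memory analogue of the Redis LPUSH + LTRIM pattern (the window size is an illustrative default):

```python
from collections import deque

class RecentWindow:
    """Per-conversation hot cache: keeps only the last `size` messages,
    like LPUSH + LTRIM on a Redis list. Older history falls through to
    the cold store (DB partitions or S3/Parquet)."""
    def __init__(self, size=50):
        self._msgs = deque(maxlen=size)   # deque evicts oldest automatically

    def append(self, msg):
        self._msgs.append(msg)

    def latest(self, n):
        """Most recent n messages, oldest first."""
        return list(self._msgs)[-n:]
```

A conversation-open request serves `latest(n)` from this cache and only pages into cold storage when the user scrolls further back.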
Search (Slack-like)
- OpenSearch (Elasticsearch) indexes messages asynchronously from Kafka.
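A sketch of what the async indexer emits: an OpenSearch/Elasticsearch `_bulk` request body (NDJSON: one action line plus one document line per message). The index name and document fields are illustrative assumptions:

```python
import json

def bulk_payload(messages, index="messages"):
    """Build an OpenSearch/Elasticsearch _bulk body from message events
    consumed off Kafka. Each message becomes an action line + a doc line;
    the body must end with a trailing newline per the bulk API."""
    lines = []
    for m in messages:
        lines.append(json.dumps({"index": {"_index": index, "_id": m["message_id"]}}))
        lines.append(json.dumps({
            "conversation_id": m["conversation_id"],
            "sender": m["sender"],
            "text": m["text"],
        }))
    return "\n".join(lines) + "\n"
```

Using message_id as `_id` makes indexing idempotent, so replaying the Kafka topic (e.g. to rebuild the index) is safe.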
Security
- E2E encryption (Signal model) changes design: server stores ciphertext only; key exchange out of band.
- If not E2E: TLS in transit, KMS at rest, field-level encryption for sensitive attachments metadata.
Identity
- Amazon Cognito / Auth0 for JWT; gateway validates JWT per connection upgrade.
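The gateway-side check at connection upgrade can be sketched as below. For brevity this verifies an HS256 token with a shared secret; real Cognito/Auth0 deployments would verify RS256 signatures against the IdP's published JWKS instead:

```python
import base64, hashlib, hmac, json, time

def _b64url_decode(part):
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def verify_jwt_hs256(token, secret, now=None):
    """Validate a JWT at WebSocket upgrade: check the HMAC-SHA256 signature
    and the `exp` claim. Returns the claims dict, or None to reject the
    upgrade with 401."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
    except ValueError:
        return None                        # malformed token
    signing_input = (header_b64 + "." + payload_b64).encode()
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        return None                        # bad signature
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", 0) < (now if now is not None else time.time()):
        return None                        # expired
    return claims
```

Validating once per connection (not per message) keeps the hot path cheap; long-lived connections should be re-checked or closed when the token expires.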
E2E: send message in group
Tricky parts
| Problem | Solution |
|---|---|
| Ordering across servers | Single Kafka partition per conversation |
| Split brain presence | TTL keys in Redis; CRDTs optionally for multi-region merging |
| Large groups | Fan-out on read (pull model) instead of per-member write fan-out; ephemeral messages for mega channels |
| Delivery receipts | Separate small events topic; idempotent upserts |
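The idempotent receipt upsert from the table can be sketched as a monotonic state merge, where a dict stands in for the receipts table (the status ladder sent < delivered < read is an assumption from typical messenger semantics):

```python
RECEIPT_RANK = {"sent": 0, "delivered": 1, "read": 2}

def upsert_receipt(store, message_id, user_id, status):
    """Idempotent, monotonic receipt upsert: replayed or out-of-order events
    from the receipts topic never downgrade a 'read' back to 'delivered'.
    Returns the stored status after the merge."""
    key = (message_id, user_id)
    current = store.get(key)
    if current is None or RECEIPT_RANK[status] > RECEIPT_RANK[current]:
        store[key] = status
    return store[key]
```

Because the merge is take-the-max, consumers of the receipts topic can safely be at-least-once with no dedupe of their own.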
Caveats
- Backpressure: slow clients → bounded per-connection queues; drop or disconnect abusive clients.
- Compliance: FINRA / healthcare may require WORM storage and audit trails.
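The bounded per-connection queue from the backpressure caveat can be sketched as below (the limit and policy names are illustrative, not from the design):

```python
from collections import deque

class BoundedSendQueue:
    """Outbound queue for one WebSocket connection. When a slow client
    falls behind, either drop the oldest frames or flag the connection
    for disconnect so it cannot exhaust gateway memory."""
    def __init__(self, limit=1024, policy="drop_oldest"):
        self._q = deque()
        self.limit = limit
        self.policy = policy
        self.disconnect = False

    def push(self, frame):
        """Returns False when the frame was rejected (disconnect policy)."""
        if len(self._q) >= self.limit:
            if self.policy == "drop_oldest":
                self._q.popleft()          # shed load, keep newest frames
            else:                          # "disconnect": kill abusive client
                self.disconnect = True
                return False
        self._q.append(frame)
        return True

    def frames(self):
        return list(self._q)
```

Drop-oldest suits presence/typing events where only the latest state matters; message frames usually warrant the disconnect policy plus history resync on reconnect.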
Azure equivalents
- SignalR Service for WebSocket scale-out.
- Event Hubs instead of Kafka.
- Notification Hubs for push.