News Feed / Timeline (e.g. Twitter-style)
Problem statement
Design a system where users follow others and see a chronological or ranked feed of their posts at scale, handling the fan-out-on-write vs fan-out-on-read trade-off and mixed media.
How it works
Two classic strategies:
- Fan-out on write (push): when user posts, push post id into every follower’s feed cache (Redis list / sharded DB).
- Fan-out on read (pull): store posts by author; at read time, merge recent posts from all followees.
Analogy: Push = newspaper delivered to every doorstep nightly. Pull = you walk to each friend’s house and collect today’s notes when you open the app.
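The two strategies can be sketched side by side. This is a toy in-memory model: plain dicts stand in for the Redis feed cache and the posts-by-author store, and the merge-by-timestamp step is elided — all names here are illustrative assumptions, not a production schema.

```python
from collections import defaultdict

followers = {"alice": ["bob", "carol"]}   # author -> follower ids
posts_by_author = defaultdict(list)       # author -> [post_id, ...] newest first
feed_cache = defaultdict(list)            # user -> [post_id, ...] newest first

def publish_push(author: str, post_id: str) -> None:
    """Fan-out on write: push the post id into every follower's feed."""
    posts_by_author[author].insert(0, post_id)
    for user in followers.get(author, []):
        feed_cache[user].insert(0, post_id)

def read_feed_push(user: str, limit: int = 20) -> list[str]:
    """Cheap read: the first page comes straight from the precomputed cache."""
    return feed_cache[user][:limit]

def read_feed_pull(user: str, followees: list[str], limit: int = 20) -> list[str]:
    """Fan-out on read: merge recent posts from all followees at read time."""
    merged = []
    for author in followees:
        merged.extend(posts_by_author[author][:limit])
    return merged[:limit]  # a real system merges by timestamp across authors

publish_push("alice", "p1")
publish_push("alice", "p2")
```

Note the asymmetry: push pays O(followers) at write time for an O(1) read, pull pays O(followees) at read time for an O(1) write.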
When to use push vs pull
| Scenario | Prefer |
|---|---|
| Celebrity with 50M followers | Hybrid: push to normal users, pull merge for mega-followed accounts |
| Small follower counts | Push for low read latency |
| Ephemeral stories | Often push + TTL |
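The table above amounts to a routing decision per author. A minimal sketch of that decision, assuming a hypothetical follower-count threshold (the `10_000` cutoff is an illustrative value, not from the source):

```python
FANOUT_THRESHOLD = 10_000  # assumed cutoff for "mega-followed" accounts

def fanout_strategy(follower_count: int, ephemeral: bool = False) -> str:
    """Pick a fan-out strategy per the scenarios in the table."""
    if ephemeral:
        return "push_with_ttl"    # stories: push, let entries expire via TTL
    if follower_count > FANOUT_THRESHOLD:
        return "pull_at_read"     # celebrity: skip fan-out, merge at read time
    return "push_on_write"        # normal account: precompute for low read latency
```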
High-level design
Components explained — this design
| Component | What it is | Why we use it here |
|---|---|---|
| Post service | Writes new posts to durable store and emits events. | Source of truth for content; publishing to Kafka enables fan-out workers without blocking the user’s POST response. |
| Kafka / Kinesis | Durable ordered log (see glossary). | Buffers fan-out work; replays if workers lag; partition by author or post for ordering guarantees where needed. |
| Fan-out worker pool | Consumers that push post ids into per-user feed lists. | Implements push model feeds; horizontal scale by adding consumers (watch consumer lag). |
| Redis feed cache | LIST/ZSET of recent post ids per user. | O(1) reads for home timeline first page; trim to cap memory. |
| Feed API + Ranking | Assembles hydrated posts; optional ML ranker. | Read path composition; ranking should not block basic chronological fallback on outage. |
Shared definitions: 00-glossary-common-services.md
Low-level design
Post storage
- Cassandra / DynamoDB partitioned by `user_id` + time (wide rows) for write scalability.
- Or PostgreSQL with partitioning if the team's strength is SQL.
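One common wide-row layout puts the author id in the partition key and an inverted timestamp in the sort key so an ascending scan returns newest posts first. A sketch under those assumptions (the key names and inversion trick are illustrative, not a prescribed schema):

```python
MAX_TS = 2**63 - 1  # sentinel for inverting millisecond timestamps

def post_key(user_id: str, posted_at_ms: int) -> tuple[str, int]:
    """Partition key = author; sort key = inverted timestamp.

    Inverting the timestamp means newer posts get smaller sort keys,
    so a default ascending range scan yields newest-first.
    """
    return (user_id, MAX_TS - posted_at_ms)

older = post_key("alice", 1_000)
newer = post_key("alice", 2_000)
```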
Feed cache
- Redis lists `feed:{user_id}` capped at N (e.g. 800) post ids (LPUSH + LTRIM).
- TTL on inactive users to save memory.
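The cap pattern is: push the new id to the head, then trim the tail. A pure-Python model of the same logic (a cap of 5 here just for demonstration; the text suggests ~800 in production):

```python
FEED_CAP = 5
feeds: dict[str, list[str]] = {}  # stand-in for Redis

def push_to_feed(user_id: str, post_id: str) -> None:
    key = f"feed:{user_id}"
    lst = feeds.setdefault(key, [])
    lst.insert(0, post_id)   # ~ LPUSH feed:{user_id} post_id
    del lst[FEED_CAP:]       # ~ LTRIM feed:{user_id} 0 FEED_CAP-1

for i in range(8):
    push_to_feed("u1", f"p{i}")
```

With redis-py the equivalent would be `r.lpush(key, post_id)` followed by `r.ltrim(key, 0, 799)`.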
Fan-out worker
- Kafka consumers horizontally scaled; idempotent writes (dedupe by `post_id`).
- Dead-letter queue (SQS DLQ) for poison posts.
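A minimal sketch of an idempotent fan-out handler, assuming a hypothetical message shape: `seen` stands in for a dedupe store (e.g. a Redis SET with TTL), `dlq` stands in for SQS.

```python
seen: set[str] = set()            # dedupe store keyed by post_id
feeds: dict[str, list[str]] = {}  # per-user feed lists
dlq: list[dict] = []              # dead-letter queue for poison messages

def handle(message: dict) -> None:
    post_id = message.get("post_id")
    try:
        if post_id is None:
            raise ValueError("missing post_id")
        if post_id in seen:       # redelivery from Kafka: safe to skip
            return
        for user in message["follower_ids"]:
            feeds.setdefault(user, []).insert(0, post_id)
        seen.add(post_id)         # mark done only after all writes succeed
    except Exception:
        dlq.append(message)       # poison message goes to the DLQ

handle({"post_id": "p1", "follower_ids": ["u1", "u2"]})
handle({"post_id": "p1", "follower_ids": ["u1", "u2"]})  # duplicate delivery
handle({"follower_ids": ["u3"]})                         # poison: no post_id
```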
Ranking (optional)
- Online: lightweight model in AWS SageMaker endpoint or TensorFlow Serving.
- Offline: precompute scores in batch and publish them to a feature store.
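Per the components table, the ranker must not block the chronological fallback during an outage. A sketch of that read-path guard, where `score` is a hypothetical call to a model endpoint (simulated as failing here):

```python
def score(post: dict) -> float:
    """Stand-in for a model endpoint call; simulated outage."""
    raise TimeoutError("model endpoint unavailable")

def rank_feed(posts: list[dict]) -> list[dict]:
    try:
        return sorted(posts, key=score, reverse=True)  # ranked order
    except Exception:
        # Outage fallback: serve basic chronological order instead
        return sorted(posts, key=lambda p: p["ts"], reverse=True)

posts = [{"id": "a", "ts": 1}, {"id": "b", "ts": 3}, {"id": "c", "ts": 2}]
ranked = rank_feed(posts)
```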
Media
- S3 + CloudFront for images/video; transcoding with Elastic Transcoder / MediaConvert.
E2E: post then read feed
Tricky parts
| Problem | Solution |
|---|---|
| Hot celebrity | Don’t fan-out to millions; merge at read for those authors |
| Stale ordering | Hybrid timeline: merge push cache + pull recent from celebs |
| Global ordering | True global order needs single writer or vector clocks — usually per-user relative order is enough |
| Muted / blocked users | Filter layer in Feed API reading social graph service |
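The hybrid timeline in the table merges two newest-first streams by timestamp: the precomputed push cache plus posts pulled live from mega-followed authors. A sketch using `heapq.merge`; the `(timestamp, post_id)` tuple shape is an assumption for illustration.

```python
import heapq

def hybrid_timeline(push_cache: list[tuple[int, str]],
                    celeb_posts: list[tuple[int, str]],
                    limit: int = 20) -> list[str]:
    """Merge two timestamp-descending streams into one timeline."""
    merged = heapq.merge(push_cache, celeb_posts,
                         key=lambda entry: entry[0], reverse=True)
    return [post_id for _, post_id in merged][:limit]

cache = [(30, "friend2"), (10, "friend1")]   # from the Redis feed list
celeb = [(40, "celeb2"), (20, "celeb1")]     # pulled at read time
timeline = hybrid_timeline(cache, celeb)
```

`heapq.merge` streams the result lazily, so the merge stays cheap even if the underlying sources are long, provided each input is already sorted.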
Caveats
- Strong consistency between “posted” and “visible” is often relaxed; eventual fan-out is acceptable as long as the “sent” ack is returned only after the post is in the durable log.
- Compliance: geo-fencing content, GDPR delete must cascade to feed shards (async jobs).
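A GDPR delete must eventually purge the post id from every feed shard, typically via async jobs rather than synchronously at delete time. A toy sketch of that cascade; the shard layout and in-memory job queue are stand-ins for real feed shards and a job runner.

```python
from collections import deque

feed_shards = [{"u1": ["p1", "p2"]}, {"u2": ["p1"]}]  # toy shard layout
delete_jobs: deque[str] = deque()                     # stand-in job queue

def delete_post(post_id: str) -> None:
    """Record the delete durably; feed purge happens asynchronously."""
    delete_jobs.append(post_id)

def run_cascade() -> None:
    """Async worker: purge deleted post ids from every feed shard."""
    while delete_jobs:
        post_id = delete_jobs.popleft()
        for shard in feed_shards:
            for feed in shard.values():
                while post_id in feed:
                    feed.remove(post_id)

delete_post("p1")
run_cascade()
```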
Tech picks (example)
| Layer | AWS | Azure |
|---|---|---|
| Stream | Kinesis / MSK | Event Hubs |
| Cache | ElastiCache | Azure Redis |
| CDN | CloudFront | Azure CDN |