
Video Streaming (e.g. Netflix-style)

Problem statement

Design video upload, transcoding to multiple bitrates, optional DRM, global low-latency playback with adaptive bitrate (ABR), and a recommendations surface.

How it works

  1. Creator uploads source video → stored in object storage.
  2. Transcoding pipeline produces HLS/DASH manifests + segments (360p–4K).
  3. Player uses CDN to fetch segments; ABR switches quality by bandwidth.
  4. Playback auth via signed URLs or tokenized manifest.

Analogy: A kitchen (transcoder) slices a pizza (video) into many slice sizes; diners (players) pick slice size based on how hungry their network is.
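The ABR step above can be sketched as a throughput-based rung picker. The ladder heights and bitrates below are illustrative; real players (e.g. hls.js, Shaka) also factor in buffer occupancy, not just measured bandwidth.

```python
# Minimal throughput-based ABR rung picker (illustrative ladder values).
LADDER = [  # (height, required bits per second)
    (2160, 16_000_000),
    (1080, 6_000_000),
    (720, 3_000_000),
    (480, 1_500_000),
    (360, 800_000),
]

def pick_rung(measured_bps: float, safety: float = 0.8) -> int:
    """Pick the highest rung whose bitrate fits within a safety
    margin of the measured network throughput."""
    budget = measured_bps * safety
    for height, bps in LADDER:
        if bps <= budget:
            return height
    return LADDER[-1][0]  # always fall back to the lowest rung
```

With a 10 Mbps measurement and the default 0.8 safety margin, the budget is 8 Mbps, so the picker lands on the 1080p rung rather than risking the 16 Mbps 4K rung.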

High-level design


Components explained — this design

| Component | What it is | Why we use it here |
| --- | --- | --- |
| S3 raw bucket | Stores mezzanine uploads from creators. | Durable ingest landing zone; triggers downstream transcode via events. |
| SQS / EventBridge | Queue or event bus notifying transcode workers. | Decouples upload completion from long CPU/GPU jobs so the API returns quickly. |
| MediaConvert / Transcoder | Managed video encoding to multi-bitrate HLS/DASH. | FFmpeg ops are painful at scale; managed encoders handle formats, DRM packaging, scaling. |
| S3 encoded bucket | Stores packaged segments + manifests. | Origin for the CDN; a separate bucket allows different IAM and lifecycle policies from raw uploads. |
| CloudFront CDN | Edge cache for segments close to viewers. | Bandwidth is the main cost driver; the CDN reduces origin load and improves startup latency. |
| Metadata DB | Titles, genres, entitlements, playback position. | Relational or document DB for complex queries not suited to object keys alone. |
| Recommendation API | Retrieves candidates + scores. | Keeps the personalized home separate from byte delivery so teams can iterate on models independently. |

Shared definitions: 00-glossary-common-services.md
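The upload-to-transcode decoupling can be sketched as a small handler that turns an S3 `ObjectCreated` event (as delivered via SQS/EventBridge) into per-profile transcode jobs. The record field names follow the S3 event notification format; the job dictionary shape and profile names are hypothetical internal choices.

```python
# Sketch: S3 ObjectCreated event -> per-profile transcode job descriptions.
def event_to_jobs(s3_event: dict, profiles=("360p", "720p", "1080p")) -> list[dict]:
    jobs = []
    for record in s3_event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        for profile in profiles:
            jobs.append({
                "asset_id": key,  # derive a stable asset id from the object key
                "source": f"s3://{bucket}/{key}",
                "profile": profile,
            })
    return jobs
```

A worker polling the queue would feed each job to the encoder; fanning out one upload into one job per profile is what lets the encoding rungs scale independently.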

Low-level design

Transcoding

  • AWS Elemental MediaConvert (managed), or Kubernetes + FFmpeg workers for finer control.
  • Spot instances for cost savings; idempotent jobs keyed by asset_id + profile.
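The idempotency point above can be sketched as a dedup key over `asset_id + profile`, so a redelivered queue message becomes a no-op. The in-memory set is a stand-in for a durable store (e.g. a DynamoDB conditional put).

```python
import hashlib

def job_key(asset_id: str, profile: str) -> str:
    """Stable idempotency key for a (asset, profile) transcode job."""
    return hashlib.sha256(f"{asset_id}:{profile}".encode()).hexdigest()

_done: set[str] = set()  # stand-in for a durable completed-jobs store

def run_once(asset_id: str, profile: str, transcode) -> bool:
    """Run transcode only if this (asset, profile) pair hasn't completed."""
    key = job_key(asset_id, profile)
    if key in _done:
        return False  # duplicate delivery: skip
    transcode(asset_id, profile)
    _done.add(key)
    return True
```

This matters because SQS standard queues deliver at-least-once; without the key, a spot interruption plus redelivery would double-encode the same rung.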

Streaming protocols

  • HLS (Apple ecosystems) / DASH (wider DRM support). LL-HLS for lower-latency live streams.
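For concreteness, an HLS master playlist is just a text manifest listing the ABR ladder; the player picks a variant stream by its advertised `BANDWIDTH`. The bitrates, resolutions, and codec strings below are illustrative, not a recommended ladder.

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360,CODECS="avc1.4d401e,mp4a.40.2"
360p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720,CODECS="avc1.4d401f,mp4a.40.2"
720p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=6000000,RESOLUTION=1920x1080,CODECS="avc1.640028,mp4a.40.2"
1080p/index.m3u8
```

Each variant URI points at a media playlist of segment files, which is what the CDN actually caches.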

DRM (premium content)

  • Widevine / FairPlay / PlayReady via AWS DRM or Axinom / BuyDRM license servers.

Metadata & catalog

  • PostgreSQL for titles, seasons, rights windows.
  • OpenSearch for search / faceted browse.

View telemetry

  • Kinesis Firehose → S3 → Athena / Snowflake for “watch events” feeding ML.
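A watch event landing in Firehose might look like the record below; the field names here are illustrative, not a fixed schema.

```python
import json
import time
import uuid

# Sketch of a "watch event" record as it might flow Firehose -> S3.
def watch_event(user_id: str, asset_id: str, position_s: float, bitrate: int) -> str:
    return json.dumps({
        "event_id": str(uuid.uuid4()),
        "ts": int(time.time() * 1000),  # epoch millis
        "user_id": user_id,
        "asset_id": asset_id,
        "position_s": position_s,       # playhead position in seconds
        "bitrate": bitrate,             # current ABR rung, bits/sec
    })
```

Keeping events as flat JSON lines makes them directly queryable from Athena once Firehose batches them into S3.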

E2E: user presses play


Tricky parts

| Problem | Solution |
| --- | --- |
| Cold CDN cache | Origin shield (secondary CDN layer) |
| Seek latency | Independent segments + keyframe alignment (GOP) |
| Live sync | WebRTC for ultra-low latency; HLS for scale |
| Piracy | Short-TTL signed URLs; per-session watermarking |

Caveats

  • 4K transcoding is CPU-heavy; queue depth monitoring + autoscaling on consumer lag.
  • Regional rights require geo-blocking at CDN + manifest filtering.
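The regional-rights caveat can be sketched as an entitlement check shared by both enforcement points: the CDN rejects playback using a viewer-country header (e.g. `CloudFront-Viewer-Country`), and the catalog service uses the same check to filter titles out of the browse manifest. The rights data below is illustrative.

```python
# Illustrative rights windows: title -> set of licensed country codes.
RIGHTS = {"title-1": {"US", "CA"}, "title-2": {"IN"}}

def playable(title_id: str, viewer_country: str) -> bool:
    """True if the title is licensed for the viewer's country.
    Unknown titles default to not playable."""
    return viewer_country in RIGHTS.get(title_id, set())
```

Running the same predicate in both places avoids the leak where a geo-blocked title still appears in search results with a dead play button.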

Azure mapping

  • Azure Media Services (legacy; being retired) or FFmpeg on AKS.
  • Azure CDN + Front Door.