Online Judge / Code Execution Platform

Problem statement

Users submit code in multiple languages; the system runs it against tests in a sandbox, enforces CPU, memory, and time limits, prevents cheating and network exfiltration, and scales during contests.

How it works

Submit → queue → worker pulls a job and launches a Docker container or Firecracker microVM, mounts test cases read-only, runs the compiler and then the binary, and captures stdout, stderr, exit code, and metrics.
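The capture step at the end of this flow can be sketched as follows. This is a minimal illustration that assumes the command already runs inside the sandbox; `RunResult` and `run_once` are hypothetical names, and only wall-clock time is measured here.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunResult:
    stdout: str
    stderr: str
    exit_code: Optional[int]  # None when the run was killed on timeout
    wall_seconds: float
    timed_out: bool


def run_once(cmd: List[str], stdin_text: str, wall_limit_s: float) -> RunResult:
    """Run one command, feed it a test case on stdin, and capture its outputs."""
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=wall_limit_s,  # subprocess.run kills the child on expiry
        )
    except subprocess.TimeoutExpired:
        return RunResult("", "", None, time.monotonic() - start, True)
    return RunResult(proc.stdout, proc.stderr, proc.returncode,
                     time.monotonic() - start, False)


if __name__ == "__main__":
    # Stand-in for a compiled submission: reverse the first input line.
    result = run_once([sys.executable, "-c", "print(input()[::-1])"], "abc\n", 5.0)
    print(result.exit_code, result.stdout.strip(), result.timed_out)
```

A real runner would also record CPU time and peak memory from the sandbox's accounting rather than relying on wall-clock time alone.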

Analogy: Driving test track — car (submission) must stay inside cones (sandbox); examiner (judge) records time and penalties without letting you drive on public roads (internet).

High-level design


Components explained — this design

Component	What it is	Why we use it here
Submission API	Validates auth, language, size; enqueues run.	Must never execute user code in API process (security).
SQS / Redis queue	Job queue for judge runs.	Backpressure during contests; autoscale runners on depth.
Runner on K8s	Pulls jobs, launches sandboxed container per submission.	Isolation boundary; burst scale for contest spikes.
gVisor / Firecracker	Kernel isolation layers.	Reduce container escape risk vs default runc-only profiles.
PostgreSQL results	Verdict, runtime, memory usage.	Leaderboards and re-judges need queryable history.
S3 artifacts	Stdout dumps for debugging (optional).	Keeps large logs out of DB.
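Autoscaling runners on queue depth, as noted in the queue row, might look like the sketch below. The function name, the `jobs_per_runner` target, and the min/max bounds are assumptions for illustration; in practice the calculation would live in whatever autoscaler the cluster uses.

```python
import math


def desired_runners(queue_depth: int, jobs_per_runner: int,
                    min_runners: int, max_runners: int) -> int:
    """Pick a runner count proportional to backlog, clamped to a safe range."""
    if jobs_per_runner <= 0:
        raise ValueError("jobs_per_runner must be positive")
    needed = math.ceil(queue_depth / jobs_per_runner)
    return max(min_runners, min(max_runners, needed))


if __name__ == "__main__":
    for depth in (0, 95, 10_000):
        print(depth, desired_runners(depth, jobs_per_runner=10,
                                     min_runners=2, max_runners=50))
```

The upper clamp is what turns a contest spike into queue backpressure instead of unbounded cost.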

Shared definitions: 00-glossary-common-services.md

Low-level design

Isolation

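gVisor plus seccomp-BPF profiles that deny dangerous syscalls; no network access by default (each sandbox gets its own network namespace with no external interface).
Resource limits: cgroups v2 CPU quota (cpu.max), memory cap (memory.max), and process-count limit (pids.max).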
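As a rough sketch of the resource limits above, the cgroups v2 interface files take plain-text values: `cpu.max` holds a quota and a period in microseconds, while `memory.max` and `pids.max` hold a single number. The helper names and the cgroup directory are hypothetical; writing to a real cgroup requires suitable privileges, and container runtimes normally do this for you.

```python
from pathlib import Path
from typing import Dict


def cgroup_v2_limits(cpu_cores: float, memory_bytes: int, max_pids: int,
                     period_us: int = 100_000) -> Dict[str, str]:
    """Build the values to write into a cgroup v2 directory for one run."""
    quota_us = int(cpu_cores * period_us)  # CPU time allowed per period
    return {
        "cpu.max": f"{quota_us} {period_us}",
        "memory.max": str(memory_bytes),
        "pids.max": str(max_pids),
    }


def apply_limits(cgroup_dir: str, limits: Dict[str, str]) -> None:
    """Write each limit into its interface file (assumes the cgroup exists)."""
    for filename, value in limits.items():
        (Path(cgroup_dir) / filename).write_text(value)


if __name__ == "__main__":
    # Half a core, 256 MiB, at most 64 processes or threads.
    print(cgroup_v2_limits(0.5, 256 * 1024 * 1024, 64))
```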

Anti-cheat

Plagiarism detection via MOSS or AST hashing, run as batch jobs; timing-anomaly detection flags perfect solutions submitted implausibly fast.
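One way to picture AST hashing is the sketch below, which uses Python's standard `ast` module on Python submissions only. It is not how MOSS works; `ast_fingerprint` and the normalization rules are assumptions chosen for illustration, and other languages would need their own parsers.

```python
import ast
import hashlib


class _Normalize(ast.NodeTransformer):
    """Replace identifiers with a placeholder so renaming does not change the hash."""

    def visit_Name(self, node: ast.Name) -> ast.AST:
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_arg(self, node: ast.arg) -> ast.AST:
        node.arg = "_"
        node.annotation = None
        return node

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.AST:
        node.name = "_"
        self.generic_visit(node)  # also normalize arguments and body
        return node


def ast_fingerprint(source: str) -> str:
    """Hash the normalized syntax tree; comments and whitespace never reach the AST."""
    tree = _Normalize().visit(ast.parse(source))
    return hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    original = "def add(x, y):\n    return x + y\n"
    renamed = "def plus(first, second):\n    # renamed copy\n    return first + second\n"
    print(ast_fingerprint(original) == ast_fingerprint(renamed))
```

Identical fingerprints are only a signal for human review, since short problems often have one obvious solution.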

Languages

Prebuilt container images per language version; imagePullPolicy: IfNotPresent in contest clusters, so images already cached on a node are not re-pulled during a spike.

Scoring

Partial scoring with weighted tests; TLE (Time Limit Exceeded) and WA (Wrong Answer) are distinct statuses.
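A minimal sketch of weighted partial scoring with distinct verdicts follows. The `Verdict` enum, the rule that a time-limit breach takes precedence over output correctness, and the 100-point scale are assumptions, not something the design prescribes.

```python
from enum import Enum
from typing import List, Tuple


class Verdict(Enum):
    AC = "Accepted"
    WA = "Wrong Answer"
    TLE = "Time Limit Exceeded"


def classify(output_ok: bool, cpu_seconds: float, time_limit_s: float) -> Verdict:
    """Report TLE separately from WA; the time limit is checked first."""
    if cpu_seconds > time_limit_s:
        return Verdict.TLE
    return Verdict.AC if output_ok else Verdict.WA


def partial_score(results: List[Tuple[Verdict, float]],
                  max_points: float = 100.0) -> float:
    """Award points in proportion to the total weight of accepted tests."""
    total_weight = sum(weight for _, weight in results)
    if total_weight == 0:
        return 0.0
    earned = sum(weight for verdict, weight in results if verdict is Verdict.AC)
    return max_points * earned / total_weight


if __name__ == "__main__":
    runs = [(Verdict.AC, 1.0), (Verdict.WA, 1.0), (Verdict.AC, 2.0)]
    print(partial_score(runs))
```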

E2E: judge run


Tricky parts

Problem	Solution
Side-channel attacks	Randomize test order; measure both wall-clock and CPU time
Fork bombs	pids cgroup + RLIMIT_NPROC
Persistent FS attacks	Ephemeral overlay per run
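The RLIMIT_NPROC half of the fork-bomb defence can be sketched with Python's `resource` module on Linux. `make_limiter` and `run_limited` are hypothetical helpers; `preexec_fn` is used for brevity even though it is unsafe in multi-threaded parents, and the pids cgroup remains the stronger control because RLIMIT_NPROC is counted per user.

```python
import resource
import subprocess
import sys
from typing import Callable, List


def _lower(kind: int, value: int) -> None:
    """Set soft and hard limits, never exceeding the current hard limit."""
    _, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def make_limiter(max_procs: int, cpu_seconds: int) -> Callable[[], None]:
    """Return a function that runs in the child between fork and exec."""
    def apply() -> None:
        _lower(resource.RLIMIT_NPROC, max_procs)   # caps processes for this user
        _lower(resource.RLIMIT_CPU, cpu_seconds)   # kernel signals the child on overrun
    return apply


def run_limited(cmd: List[str], max_procs: int,
                cpu_seconds: int) -> subprocess.CompletedProcess:
    return subprocess.run(cmd, capture_output=True, text=True,
                          preexec_fn=make_limiter(max_procs, cpu_seconds))


if __name__ == "__main__":
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_CPU)[0])"
    print(run_limited([sys.executable, "-c", probe], 64, 5).stdout.strip())
```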

Caveats

Floating-point nondeterminism across hardware: compare floats with an epsilon tolerance stated in the problem statement.
Cost explosion during contests: use a pre-warmed node pool and cap concurrency per user.
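For the floating-point caveat above, a token-wise checker with an epsilon tolerance could look like this sketch. The function name, the default `eps`, and the choice to apply it as both relative and absolute tolerance are assumptions; the actual tolerance should come from the problem statement.

```python
import math


def outputs_match(expected: str, actual: str, eps: float = 1e-6) -> bool:
    """Compare whitespace-separated tokens; numeric tokens may differ by eps."""
    exp_tokens, act_tokens = expected.split(), actual.split()
    if len(exp_tokens) != len(act_tokens):
        return False
    for exp, act in zip(exp_tokens, act_tokens):
        if exp == act:
            continue  # exact match, numeric or not
        try:
            if not math.isclose(float(exp), float(act), rel_tol=eps, abs_tol=eps):
                return False
        except ValueError:
            return False  # differing non-numeric tokens
    return True


if __name__ == "__main__":
    print(outputs_match("3.1415926 yes", "3.14159265 yes"))
    print(outputs_match("1.0", "1.1"))
```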

Azure

Azure Container Instances can run one container group per job; Hyper-V-isolated Windows containers are rarely needed for an online judge.