Online Judge / Code Execution Platform
Problem statement
Users submit code in multiple languages; the system runs it against tests in a sandbox, enforces CPU/memory/time limits, prevents cheating and network exfiltration, and scales during contests.
How it works
- Submit → queue → worker pulls the job, launches a Docker container or Firecracker microVM, mounts test cases read-only, runs the compiler and then the binary, and captures stdout/stderr/exit code/metrics.
Analogy: Driving test track — car (submission) must stay inside cones (sandbox); examiner (judge) records time and penalties without letting you drive on public roads (internet).
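The capture step can be sketched as below. This is a minimal illustration, not the platform's runner: `RunResult` and `run_once` are hypothetical names, and it runs the command directly on the host, whereas a real worker would launch it inside the container or microVM.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunResult:
    stdout: str
    stderr: str
    exit_code: Optional[int]  # None when the run was killed on timeout
    wall_seconds: float
    timed_out: bool


def run_once(cmd: List[str], stdin_text: str, wall_limit_s: float) -> RunResult:
    """Run one command, feed it a test input, and record what the judge needs."""
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=wall_limit_s,  # subprocess.run kills the child on expiry
        )
    except subprocess.TimeoutExpired:
        # Partial output is discarded here to keep the sketch simple.
        return RunResult("", "", None, time.monotonic() - start, True)
    return RunResult(
        proc.stdout, proc.stderr, proc.returncode, time.monotonic() - start, False
    )


if __name__ == "__main__":
    result = run_once([sys.executable, "-c", "print(input()[::-1])"], "abc\n", 5.0)
    print(result.exit_code, result.stdout.strip(), round(result.wall_seconds, 3))
```

The wall-clock timeout here is only a backstop; CPU and memory limits would come from the sandbox itself.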
High-level design
Components explained — this design
| Component | What it is | Why we use it here |
|---|---|---|
| Submission API | Validates auth, language, size; enqueues run. | Must never execute user code in API process (security). |
| SQS / Redis queue | Job queue for judge runs. | Backpressure during contests; autoscale runners on queue depth. |
| Runner on K8s | Pulls jobs, launches sandboxed container per submission. | Isolation boundary; burst scale for contest spikes. |
| gVisor / Firecracker | Kernel isolation layers. | Reduce container escape risk vs default runc-only profiles. |
| PostgreSQL results | Verdict, runtime, memory usage. | Leaderboards and re-judges need queryable history. |
| S3 artifacts | Stdout dumps for debugging (optional). | Keeps large logs out of DB. |
Shared definitions: 00-glossary-common-services.md
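Autoscaling runners on queue depth can be sketched as a small sizing function. The name `desired_runners` and its parameters are assumptions for illustration; in practice an autoscaler (for example one driven by a queue-length metric) would apply the same arithmetic.

```python
import math


def desired_runners(
    queue_depth: int, jobs_per_runner: int, min_runners: int, max_runners: int
) -> int:
    """Size the runner pool from queue depth, clamped to a floor and a ceiling."""
    if jobs_per_runner <= 0:
        raise ValueError("jobs_per_runner must be positive")
    wanted = math.ceil(queue_depth / jobs_per_runner)
    return max(min_runners, min(max_runners, wanted))


if __name__ == "__main__":
    for depth in (0, 95, 10_000):
        print(depth, desired_runners(depth, jobs_per_runner=10, min_runners=2, max_runners=50))
```

The floor keeps a warm pool between contests, and the ceiling bounds spend when the queue spikes.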
Low-level design
Isolation
- gVisor or seccomp-BPF profiles that deny dangerous syscalls; no network access by default (each run gets its own network namespace with no external interface).
- Resource limits: cgroups v2 CPU quota, memory max, pids limit.
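As a rough sketch of the resource limits above, the helper below renders values for the cgroup v2 interface files `cpu.max`, `memory.max`, and `pids.max`. The function names and the 100 ms period are assumptions; writing into a real cgroup directory under `/sys/fs/cgroup` needs suitable privileges, so the example only builds the values and writes them to whatever directory the caller passes.

```python
from pathlib import Path
from typing import Dict


def cgroup_v2_settings(
    cpu_cores: float, memory_bytes: int, max_pids: int, period_us: int = 100_000
) -> Dict[str, str]:
    """Map per-submission limits onto cgroup v2 interface files."""
    quota_us = int(cpu_cores * period_us)
    return {
        "cpu.max": f"{quota_us} {period_us}",  # format: "<quota> <period>" in microseconds
        "memory.max": str(memory_bytes),       # hard memory cap in bytes
        "pids.max": str(max_pids),             # cap on tasks in the cgroup
    }


def write_settings(cgroup_dir: Path, settings: Dict[str, str]) -> None:
    """Write each value to its interface file inside the given cgroup directory."""
    for filename, value in settings.items():
        (cgroup_dir / filename).write_text(value)


if __name__ == "__main__":
    print(cgroup_v2_settings(cpu_cores=0.5, memory_bytes=256 * 1024 * 1024, max_pids=64))
```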
Anti-cheat
- Plagiarism detection with MOSS or AST-hashing batch jobs; timing anomaly detection (flag perfect solutions submitted implausibly fast).
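A minimal sketch of AST hashing, assuming Python submissions and using the standard `ast` module: identifiers are blanked out so that renaming variables does not change the fingerprint. The names `_Normalizer` and `ast_fingerprint` are hypothetical, and this exact-match hash is much cruder than MOSS, which compares fingerprints of overlapping fragments; other languages would need their own parser.

```python
import ast
import hashlib


class _Normalizer(ast.NodeTransformer):
    """Replace identifiers with a placeholder so only code structure remains."""

    def visit_Name(self, node: ast.Name) -> ast.AST:
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_arg(self, node: ast.arg) -> ast.AST:
        node.arg = "_"
        node.annotation = None
        return node

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.AST:
        node.name = "_"
        self.generic_visit(node)  # also normalizes arguments and body
        return node


def ast_fingerprint(source: str) -> str:
    """Hash the normalized syntax tree; comments and whitespace never reach the AST."""
    tree = _Normalizer().visit(ast.parse(source))
    dumped = ast.dump(tree, annotate_fields=False, include_attributes=False)
    return hashlib.sha256(dumped.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    a = "def add(x, y):\n    total = x + y\n    return total\n"
    b = "def plus(p, q):\n    s = p + q  # renamed\n    return s\n"
    print(ast_fingerprint(a) == ast_fingerprint(b))
```

A batch job could group submissions per problem by fingerprint and send any group with more than one author to manual review.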
Languages
- Prebuilt container images per language version; use pull policy IfNotPresent in contest clusters so runners reuse images already cached on the node.
Scoring
- Partial scoring with weighted tests; TLE (time limit exceeded) and WA (wrong answer) are distinct statuses.
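Weighted partial scoring might look like the sketch below. The `Verdict` values beyond TLE and WA, the `CaseResult` type, and the rule that the overall verdict is the first non-accepted test are assumptions for illustration, not rules taken from this design.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Verdict(Enum):
    AC = "accepted"
    WA = "wrong answer"
    TLE = "time limit exceeded"


@dataclass
class CaseResult:
    weight: float
    verdict: Verdict


def score(results: List[CaseResult], max_points: float = 100.0) -> Tuple[float, Verdict]:
    """Return (points, overall verdict) for one submission."""
    total_weight = sum(r.weight for r in results)
    if total_weight <= 0:
        raise ValueError("test weights must sum to a positive value")
    earned = sum(r.weight for r in results if r.verdict is Verdict.AC)
    # Assumed rule: report the first failing verdict, otherwise AC.
    overall = next((r.verdict for r in results if r.verdict is not Verdict.AC), Verdict.AC)
    return max_points * earned / total_weight, overall


if __name__ == "__main__":
    runs = [CaseResult(1, Verdict.AC), CaseResult(1, Verdict.WA), CaseResult(2, Verdict.AC)]
    print(score(runs))
```

Keeping TLE and WA as separate enum members lets the UI tell a contestant whether to fix correctness or performance.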
E2E: judge run
Tricky parts
| Problem | Solution |
|---|---|
| Side-channel attacks | Randomize test order; judge on CPU time and keep wall-clock time as a separate cap |
| Fork bombs | pids cgroup + RLIMIT_NPROC |
| Persistent FS attacks | Ephemeral overlay per run |
Caveats
- Floating-point results can differ across hardware: compare floats with an epsilon tolerance, and state that tolerance in the problem statement.
- Cost explosion during contests: use a pre-warmed node pool and cap concurrent runs per user.
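For the floating-point caveat, a checker along these lines compares output token by token with a tolerance. The name `outputs_match` and the default epsilon of 1e-6 are assumptions; the real tolerance should be whatever the problem statement promises.

```python
import math


def outputs_match(expected: str, actual: str, eps: float = 1e-6) -> bool:
    """Compare whitespace-separated tokens; numeric tokens may differ by up to eps."""
    exp_tokens = expected.split()
    act_tokens = actual.split()
    if len(exp_tokens) != len(act_tokens):
        return False
    for exp, act in zip(exp_tokens, act_tokens):
        if exp == act:
            continue
        try:
            exp_val, act_val = float(exp), float(act)
        except ValueError:
            return False  # non-numeric tokens must match exactly
        if not math.isclose(exp_val, act_val, rel_tol=eps, abs_tol=eps):
            return False
    return True


if __name__ == "__main__":
    print(outputs_match("0.3\n", "0.30000000001\n"))
    print(outputs_match("1 2", "1 3"))
```

Because every numeric token gets the tolerance, a checker like this suits only problems whose answers are real numbers; integer-output problems should keep exact comparison.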
Azure
- Azure Container Instances per job; Windows containers with Hyper-V isolation are rarely needed for an online judge.