API Gateway & Edge Policy
Problem statement
A single entry point for authentication, rate limiting, routing, A/B testing, request validation, observability, and protocol translation (REST ↔ gRPC).
How it works
- All clients hit the gateway (managed, or self-hosted Kong / Envoy / NGINX+).
- The gateway validates the JWT (JWKS from Cognito / Azure AD), applies policies, and forwards to upstream Kubernetes services.
Analogy: a hotel front desk. One door checks ID and luggage rules, then sends you to the correct tower (service), so no tower has to hire its own bouncer.
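The flow above can be sketched as a short chain of policy steps. This is a minimal illustration, not how Kong, Envoy, or NGINX+ implement it: `Request`, `ROUTES`, `pick_upstream`, and `handle` are hypothetical names, the upstream addresses are made up, and token verification is passed in as a callable so the sketch stays independent of any JWT library.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Optional, Tuple

# Hypothetical route table: path prefix -> upstream service address.
ROUTES = {
    "/orders": "http://orders.default.svc.cluster.local",
    "/users": "http://users.default.svc.cluster.local",
}


@dataclass
class Request:
    path: str
    headers: dict = field(default_factory=dict)


def pick_upstream(path: str) -> Optional[str]:
    """Return the upstream for the longest matching path prefix, if any."""
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    return ROUTES[max(matches, key=len)] if matches else None


def handle(req: Request, verify_token: Callable[[str], bool]) -> Tuple[int, Optional[str]]:
    """Apply edge policy in order: authenticate, tag, route.

    Returns (status_code, upstream_or_None). Forwarding itself is omitted.
    """
    auth = req.headers.get("Authorization", "")
    if not auth.startswith("Bearer ") or not verify_token(auth[len("Bearer "):]):
        return 401, None
    # Attach a request ID so upstream logs can be correlated.
    req.headers.setdefault("X-Request-Id", str(uuid.uuid4()))
    upstream = pick_upstream(req.path)
    if upstream is None:
        return 404, None
    return 200, upstream
```

Authenticating before routing means unauthenticated callers cannot probe which paths exist; a real gateway may order these steps differently per route.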
High-level design
Components explained — this design
| Component | What it is | Why we use it here |
|---|---|---|
| CDN (optional) | Caches static assets served via GET, or signed responses. | Offloads TLS and reduces origin bandwidth for public reference data. |
| API Gateway | Central HTTP policy enforcement point. | JWT validation once at edge; adds request IDs for tracing; routes to internal gRPC/HTTP services. |
| JWT validate + JWKS cache | Fetches public keys from IdP periodically; verifies signatures. | Stateless auth for microservices; avoids calling Cognito on every request. |
| Rate limit plugin | Token bucket / fixed window at gateway. | Protects fragile services from retry storms and abuse. |
| Path router | Maps /orders → orders service, etc. | Keeps clients stable while internal service topology changes. |
| OpenSearch logging | Searchable access logs for support/debug. | Operational insight separate from metrics time series. |
Shared definitions: 00-glossary-common-services.md
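To make the rate limit row concrete, here is a minimal token bucket sketch. The class name `TokenBucket` and its parameters are illustrative assumptions, not the API of any gateway plugin; the clock is injectable so the behaviour can be checked deterministically.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 now: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._now = now
        self._tokens = capacity  # start full so short bursts are allowed
        self._last = now()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if enough are available."""
        t = self._now()
        # Refill for the time elapsed since the last call, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + (t - self._last) * self.rate)
        self._last = t
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A gateway would typically keep one bucket per client key (API key, user ID, or IP) and answer with HTTP 429 when `allow()` returns False. This sketch is single-process and not thread-safe; sharing limits across gateway replicas needs a shared store.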
Low-level design
Managed gateways
| Cloud | Product |
|---|---|
| AWS | API Gateway HTTP/WebSocket + VPC Link |
| Azure | Azure API Management (APIM) |
| GCP | Apigee X / Cloud Endpoints |
Cross-cutting concerns
- mTLS between gateway and mesh (Istio) for zero-trust.
- WAF integration (AWS WAF, Azure Front Door) for OWASP Top 10.
- Request size limits & timeout budgets per route.
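Per-route size limits and timeout budgets from the list above could be expressed as a small policy table. The sketch below uses hypothetical names (`RoutePolicy`, `ROUTE_POLICIES`, `admit`, `remaining_budget`) and made-up limits; real values depend on each service.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RoutePolicy:
    max_body_bytes: int  # larger request bodies are rejected with 413
    timeout_s: float     # total time budget for the upstream call


# Hypothetical limits, for illustration only.
DEFAULT_POLICY = RoutePolicy(max_body_bytes=1_000_000, timeout_s=5.0)
ROUTE_POLICIES: Dict[str, RoutePolicy] = {
    "/uploads": RoutePolicy(max_body_bytes=50_000_000, timeout_s=30.0),
    "/orders": RoutePolicy(max_body_bytes=64_000, timeout_s=2.0),
}


def policy_for(path: str) -> RoutePolicy:
    """Pick the policy of the longest matching prefix, else the default."""
    matches = [p for p in ROUTE_POLICIES if path == p or path.startswith(p + "/")]
    return ROUTE_POLICIES[max(matches, key=len)] if matches else DEFAULT_POLICY


def admit(path: str, content_length: int) -> int:
    """Return 413 if the declared body size exceeds the route limit, else 200."""
    return 413 if content_length > policy_for(path).max_body_bytes else 200


def remaining_budget(path: str, elapsed_s: float) -> float:
    """Time left for the upstream call after `elapsed_s` spent at the gateway."""
    return max(0.0, policy_for(path).timeout_s - elapsed_s)
```

Passing the remaining budget downstream (for example as a deadline header) lets services stop work that the gateway will time out anyway.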
Canary & traffic split
- Weighted routes in Envoy xDS, or weighted `set-backend-service` in APIM policies.
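A weighted split can be illustrated with a small selection function. This sketch is an assumption about one reasonable approach, not what Envoy or APIM do internally: it hashes a caller key so the same caller consistently lands on the same backend, whereas a gateway may instead pick randomly per request. `choose_backend` and the backend names are hypothetical.

```python
import hashlib
from typing import Mapping


def choose_backend(key: str, weights: Mapping[str, int]) -> str:
    """Map `key` to a backend in proportion to integer weights.

    The same key always maps to the same backend while the weights
    (and their order) stay unchanged.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") % total
    cumulative = 0
    for name, weight in weights.items():
        cumulative += weight
        if point < cumulative:
            return name
    raise AssertionError("unreachable: point is always below total")
```

For example, `choose_backend(user_id, {"stable": 90, "canary": 10})` sends roughly one caller in ten to the canary, and raising the canary weight moves more callers over.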
Developer portal
- OpenAPI import → API keys for partners; OAuth2 client credentials for B2B.
E2E: authenticated call
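One step of an authenticated call that is easy to under-specify is claim validation. The sketch below assumes the token's signature has already been verified against a JWKS key (that part is deliberately omitted) and checks only the registered claims `exp`, `nbf`, `iss`, and `aud`. The function names, the 30-second leeway, and any issuer or audience values passed in are illustrative assumptions.

```python
import base64
import json
import time
from typing import Optional


def decode_claims(token: str) -> dict:
    """Decode the JWT payload. This does NOT verify the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore base64url padding
    return json.loads(base64.urlsafe_b64decode(padded))


def claims_ok(claims: dict, issuer: str, audience: str,
              now: Optional[float] = None, leeway_s: float = 30.0) -> bool:
    """Check expiry, not-before, issuer, and audience with a small clock leeway."""
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp + leeway_s:
        return False
    nbf = claims.get("nbf")
    if isinstance(nbf, (int, float)) and now + leeway_s < nbf:
        return False
    if claims.get("iss") != issuer:
        return False
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return audience in audiences
```

Rejecting tokens that lack `exp` is a conservative choice for this sketch; the gateway returns 401 when `claims_ok` is False and forwards the request otherwise.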
Tricky parts
| Problem | Solution |
|---|---|
| JWKS stampede | Cache JWKS for 15 min; refresh in the background |
| Large JWT cookies | Prefer Authorization header |
| N+1 routing rules | OpenAPI-first codegen for routes |
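The JWKS stampede fix can be sketched as a TTL cache in which only one caller refetches at a time. This is a minimal illustration under stated assumptions: `JwksCache` and its `fetch` callable are hypothetical, the 900-second default mirrors the 15 minutes in the table, and the background refresh itself is left out (a timer thread could call `get()` shortly before expiry).

```python
import threading
import time
from typing import Callable, Optional


class JwksCache:
    """Cache a JWKS document for `ttl_s` seconds with single-flight refetch."""

    def __init__(self, fetch: Callable[[], dict], ttl_s: float = 900.0,
                 now: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_s
        self._now = now
        self._lock = threading.Lock()
        self._keys: Optional[dict] = None
        self._fetched_at = 0.0

    def _fresh(self) -> bool:
        return self._keys is not None and self._now() - self._fetched_at < self._ttl

    def get(self) -> dict:
        # Fast path: serve from cache without taking the lock.
        if self._fresh():
            return self._keys
        with self._lock:
            # Re-check: another thread may have refreshed while we waited.
            if not self._fresh():
                try:
                    self._keys = self._fetch()
                    self._fetched_at = self._now()
                except Exception:
                    if self._keys is None:
                        raise  # nothing cached to fall back on
                    # Otherwise serve the stale keys; the next call retries.
            return self._keys
```

Serving stale keys when the IdP is briefly unreachable is a design choice of this sketch; it trades slower key-rotation pickup for availability.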
Caveats
- Business logic creep in gateway → keep policy only; complex rules belong in domain services.
- Cold start on Lambda@Edge / serverless gateways — watch p99 under burst.
Observability
- W3C trace context propagation; X-Ray / Application Insights integration.