Typeahead / Autocomplete Service
Problem statement
Return top-k suggestions in under 50 ms as the user types, supporting personalization, trending queries, spell tolerance, and prefix-index updates at scale.
How it works
- In-memory trie (prefix tree) for small, static dictionaries.
- At large scale: Elasticsearch edge n-grams or a custom finite state transducer (Lucene) plus an inverted prefix index.
- Personalization: merge global suggestions with a per-user history trie (Redis).
Analogy: Predictive text on phone keyboard blending dictionary + words you actually type.
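A minimal sketch of the in-memory trie approach (a hypothetical `Trie` class; production tries usually precompute and store the top-k completions at each node instead of walking subtrees per request):

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def suggest(self, prefix, k=5):
        """Walk to the prefix node, then DFS for up to k completions."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        out = []
        def dfs(n, path):
            if len(out) >= k:
                return
            if n.is_word:
                out.append(prefix + path)
            for ch in sorted(n.children):
                dfs(n.children[ch], path + ch)
        dfs(node, "")
        return out
```

The DFS here returns completions in lexicographic order; a ranked system would order by a stored popularity score instead.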
High-level design
Components explained — this design
| Component | What it is | Why we use it here |
|---|---|---|
| Aggregator | Merges global + personal + trending lists. | Product wants blended ranking; keeps each source independently cacheable. |
| OpenSearch prefix query | Index with edge n-grams / completion suggester. | Sub-50ms suggestions at scale vs naive LIKE 'abc%' on SQL. |
| Personal Redis ZSET | User-specific recent searches / clicks. | Personalization without heavy joins on every keystroke. |
| Trending Kafka windows | Counts popular prefixes in last N minutes. | Surfaces viral queries during live events. |
Shared definitions: 00-glossary-common-services.md
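The Aggregator's blend step can be sketched as a weighted score merge across the three sources. The weights and source shapes below are illustrative assumptions, not values from this design:

```python
def blend(global_hits, personal_hits, trending_hits, k=8,
          weights=(1.0, 2.0, 1.5)):
    """Merge (term, score) lists from each source into one ranked list.

    Personal matches get the highest weight so a user's own history
    outranks globally popular terms at equal raw score.
    """
    wg, wp, wt = weights
    scores = {}
    for w, hits in ((wg, global_hits), (wp, personal_hits), (wt, trending_hits)):
        for term, score in hits:
            scores[term] = scores.get(term, 0.0) + w * score
    return [t for t, _ in sorted(scores.items(), key=lambda kv: -kv[1])][:k]
```

Keeping each source as an independent list also matches the table's note: each one stays independently cacheable and can be fetched in parallel.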
Low-level design
Indexing
- Edge n-gram tokenizer on the `title` field → matches `st`, `sta`, `star`….
- Index size explosion: limit max gram length; use the completion suggester field type for a compact FST.
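The indexing behavior is easy to preview offline; a sketch of what an edge n-gram tokenizer emits per term, with the min/max gram limits that cap index blow-up (parameter names are illustrative):

```python
def edge_ngrams(term, min_gram=2, max_gram=6):
    """All prefixes of term between min_gram and max_gram characters,
    mirroring an edge n-gram tokenizer's output for one token."""
    return [term[:i] for i in range(min_gram, min(len(term), max_gram) + 1)]
```

Every emitted gram becomes an indexed token, which is why a long `max_gram` multiplies index size per document.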
Ranking signals
- Popularity: BM25 score boosted by global CTR from ClickHouse rollups.
- Freshness: exponential decay for news queries.
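Combining the two signals might look like the sketch below; the multiplicative CTR boost and the 24-hour half-life are assumptions for illustration:

```python
def rank_score(bm25, ctr, age_hours, half_life_hours=24.0):
    """BM25 relevance, boosted by click-through rate, decayed by age.

    freshness halves every half_life_hours, so day-old news queries
    score half of fresh ones at equal relevance.
    """
    freshness = 0.5 ** (age_hours / half_life_hours)  # exponential decay
    return bm25 * (1.0 + ctr) * freshness
```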
Caching
- CDN not ideal (personalization); Redis `GET suggest:sta` with a 5–10s TTL is acceptable for anonymous users.
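The Redis usage above is a cache-aside pattern with a short TTL. An in-memory stand-in to show the flow (the `TTLCache` class mimics Redis `SETEX`/`GET`; key format `suggest:<prefix>` follows the note above):

```python
import time

class TTLCache:
    """In-memory stand-in for Redis SETEX/GET with per-key expiry."""
    def __init__(self):
        self._store = {}

    def setex(self, key, ttl_s, value):
        self._store[key] = (value, time.monotonic() + ttl_s)

    def get(self, key):
        hit = self._store.get(key)
        if hit is None:
            return None
        value, expires = hit
        if time.monotonic() >= expires:
            del self._store[key]
            return None
        return value

def cached_suggest(cache, prefix, fetch, ttl_s=7):
    """Serve from cache; on miss, hit the index and cache the result."""
    key = f"suggest:{prefix}"
    hits = cache.get(key)
    if hits is None:
        hits = fetch(prefix)          # fall through to the search index
        cache.setex(key, ttl_s, hits)
    return hits
```

Within the TTL window, repeated anonymous requests for the same prefix never touch the index.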
Debouncing
- Client-side: wait 150 ms after the last keystroke before issuing the network call; this cuts request traffic by roughly 90%.
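Debouncing normally lives in client-side JS; the same idea expressed in Python with a timer that resets on every call (the 150 ms window comes from the note above):

```python
import threading

def debounce(wait_s):
    """Decorator: run fn only after wait_s seconds with no further calls.

    Each new call cancels the pending timer, so only the final
    keystroke in a burst triggers the wrapped function.
    """
    def wrap(fn):
        timer = None
        lock = threading.Lock()
        def debounced(*args, **kwargs):
            nonlocal timer
            with lock:
                if timer is not None:
                    timer.cancel()
                timer = threading.Timer(wait_s, fn, args, kwargs)
                timer.start()
        return debounced
    return wrap
```

With this wrapper, typing "s", "st", "sta" in quick succession results in a single request for "sta".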
E2E: suggestion path
Tricky parts
| Problem | Solution |
|---|---|
| Shingled adult queries | SafeSearch filter list + human moderation queue |
| Locale variants | Separate index per language; Accept-Language routing |
| Empty prefix | Do not query; show defaults only |
Caveats
- Legal: autocomplete defamation cases exist — blocklist high-risk phrases.
- Abuse prevention: cap per-user request concurrency and rate to block enumeration attacks.
Azure
- Azure Cognitive Search suggesters; Redis Enterprise modules.