Typeahead / Autocomplete Service
Problem statement
Return top-k suggestions in under 50 ms as the user types, supporting personalization, trending queries, spell tolerance, and prefix-index updates at scale.
How it works
- In-memory trie (prefix tree) for small, static dictionaries.
- At large scale: Elasticsearch edge n-grams or a custom finite state transducer (Lucene) plus an inverted prefix index.
- Personalization: merge global suggestions with a per-user history trie (Redis).
Analogy: Predictive text on phone keyboard blending dictionary + words you actually type.
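A minimal sketch of the in-memory trie approach (a hypothetical `Trie` class; production tries usually precompute and store the top-k completions at each node instead of walking subtrees per request):

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def suggest(self, prefix, k=5):
        """Walk to the prefix node, then DFS for up to k completions."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        out = []
        def dfs(n, path):
            if len(out) >= k:
                return
            if n.is_word:
                out.append(prefix + path)
            for ch in sorted(n.children):
                dfs(n.children[ch], path + ch)
        dfs(node, "")
        return out
```

The DFS here returns completions in lexicographic order; a ranked system would order by a stored popularity score instead.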
High-level design
Components explained — this design
| Component | What it is | Why we use it here |
|---|---|---|
| Aggregator | Merges global + personal + trending lists. | Product wants blended ranking; keeps each source independently cacheable. |
| OpenSearch prefix query | Index with edge n-grams / completion suggester. | Sub-50ms suggestions at scale vs naive LIKE 'abc%' on SQL. |
| Personal Redis ZSET | User-specific recent searches / clicks. | Personalization without heavy joins on every keystroke. |
| Trending Kafka windows | Counts popular prefixes in last N minutes. | Surfaces viral queries during live events. |
Shared definitions: 00-glossary-common-services.md
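The Aggregator's blend step can be sketched as a weighted score merge across the three sources. The weights and source shapes below are illustrative assumptions, not values from this design:

```python
def blend(global_hits, personal_hits, trending_hits, k=8,
          weights=(1.0, 2.0, 1.5)):
    """Merge (term, score) lists from each source into one ranked list.

    Personal matches get the highest weight so a user's own history
    outranks globally popular terms at equal raw score.
    """
    wg, wp, wt = weights
    scores = {}
    for w, hits in ((wg, global_hits), (wp, personal_hits), (wt, trending_hits)):
        for term, score in hits:
            scores[term] = scores.get(term, 0.0) + w * score
    return [t for t, _ in sorted(scores.items(), key=lambda kv: -kv[1])][:k]
```

Keeping each source as an independent list also matches the table's note: each one stays independently cacheable and can be fetched in parallel.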
Low-level design
Indexing
- Edge n-gram tokenizer on the `title` field → matches `st`, `sta`, `star`….
- Index size explosion: limit max gram length; use the completion suggester field type for a compact FST.
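The indexing behavior is easy to preview offline; a sketch of what an edge n-gram tokenizer emits per term, with the min/max gram limits that cap index blow-up (parameter names are illustrative):

```python
def edge_ngrams(term, min_gram=2, max_gram=6):
    """All prefixes of term between min_gram and max_gram characters,
    mirroring an edge n-gram tokenizer's output for one token."""
    return [term[:i] for i in range(min_gram, min(len(term), max_gram) + 1)]
```

Every emitted gram becomes an indexed token, which is why a long `max_gram` multiplies index size per document.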
Ranking signals
- Popularity: BM25 score boosted by global CTR from ClickHouse rollups.
- Freshness: exponential decay for news queries.
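Combining the two signals might look like the sketch below; the multiplicative CTR boost and the 24-hour half-life are assumptions for illustration:

```python
def rank_score(bm25, ctr, age_hours, half_life_hours=24.0):
    """BM25 relevance, boosted by click-through rate, decayed by age.

    freshness halves every half_life_hours, so day-old news queries
    score half of fresh ones at equal relevance.
    """
    freshness = 0.5 ** (age_hours / half_life_hours)  # exponential decay
    return bm25 * (1.0 + ctr) * freshness
```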
Caching
- CDN not ideal (personalization); Redis `GET suggest:sta` with a 5–10s TTL is acceptable for anonymous users.
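The Redis usage above is a cache-aside pattern with a short TTL. An in-memory stand-in to show the flow (the `TTLCache` class mimics Redis `SETEX`/`GET`; key format `suggest:<prefix>` follows the note above):

```python
import time

class TTLCache:
    """In-memory stand-in for Redis SETEX/GET with per-key expiry."""
    def __init__(self):
        self._store = {}

    def setex(self, key, ttl_s, value):
        self._store[key] = (value, time.monotonic() + ttl_s)

    def get(self, key):
        hit = self._store.get(key)
        if hit is None:
            return None
        value, expires = hit
        if time.monotonic() >= expires:
            del self._store[key]
            return None
        return value

def cached_suggest(cache, prefix, fetch, ttl_s=7):
    """Serve from cache; on miss, hit the index and cache the result."""
    key = f"suggest:{prefix}"
    hits = cache.get(key)
    if hits is None:
        hits = fetch(prefix)          # fall through to the search index
        cache.setex(key, ttl_s, hits)
    return hits
```

Within the TTL window, repeated anonymous requests for the same prefix never touch the index.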
Debouncing
- Client-side: wait 150 ms after the last keystroke before issuing the network call; this cuts request traffic by roughly 90%.
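Debouncing normally lives in client-side JS; the same idea expressed in Python with a timer that resets on every call (the 150 ms window comes from the note above):

```python
import threading

def debounce(wait_s):
    """Decorator: run fn only after wait_s seconds with no further calls.

    Each new call cancels the pending timer, so only the final
    keystroke in a burst triggers the wrapped function.
    """
    def wrap(fn):
        timer = None
        lock = threading.Lock()
        def debounced(*args, **kwargs):
            nonlocal timer
            with lock:
                if timer is not None:
                    timer.cancel()
                timer = threading.Timer(wait_s, fn, args, kwargs)
                timer.start()
        return debounced
    return wrap
```

With this wrapper, typing "s", "st", "sta" in quick succession results in a single request for "sta".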
E2E: suggestion path
Tricky parts
| Problem | Solution |
|---|---|
| Shingled adult queries | SafeSearch filter list + human moderation queue |
| Locale variants | Separate index per language; Accept-Language routing |
| Empty prefix | Do not query; show defaults only |
Caveats
- Legal: autocomplete defamation cases exist — blocklist high-risk phrases.
- Abuse prevention: cap per-user request concurrency and rate to block enumeration attacks.
Azure
- Azure Cognitive Search suggesters; Redis Enterprise modules.