What is a Trie?
A Trie (pronounced "try", from retrieval) is a tree-like data structure that stores strings in a way that allows for fast prefix-based searches. Each path from root to leaf represents a word, and nodes sharing common prefixes share the same path.
Think of it like organizing a dictionary: all words starting with "ca" are in the same section, so "car", "cat", and "care" share the path C→A.
The Problem: Fast Prefix Search
How does Google autocomplete your search as you type?
User types: "sys" Need to find: ["system", "systematic", "systems", "sysadmin", ...]
Naive Approach: Linear Scan
def autocomplete(prefix, words):
return [w for w in words if w.startswith(prefix)]
# With 1 million words
autocomplete("sys", dictionary) # O(N × L) - scan all words
Problems:
- Time: O(N × L) where N = total words, L = word length
- For 1M words, every keystroke scans 1M entries
- Unusable for real-time autocomplete
Trie Approach
trie.search("sys") # O(L) - only traverse 3 characters!
Benefits:
- Time: O(L) where L = prefix length (constant time w.r.t. dictionary size)
- Space efficient: common prefixes stored once
- Natural for autocomplete and dictionary operations
Trie Structure
Visual Representation
Words: ["car", "cat", "care", "dog", "dodge"]
(root)
/ \
c d
| |
a o
/ \ |
r t g
| |
e e
Key properties:
- Each edge represents a character
- Path from root to node spells a prefix
- Special marker indicates end of word
- Children stored in hash map or array
Node Structure
class TrieNode:
def __init__(self):
self.children = {} # char -> TrieNode
self.is_end_of_word = False
self.word = None # Optional: store full word
self.frequency = 0 # Optional: for ranking
Implementation
Basic Trie (Python)
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word):
"""Insert a word into the trie. O(L) time."""
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.is_end_of_word = True
node.word = word
def search(self, word):
"""Check if word exists. O(L) time."""
node = self._find_node(word)
return node is not None and node.is_end_of_word
def starts_with(self, prefix):
"""Check if any word starts with prefix. O(L) time."""
return self._find_node(prefix) is not None
def _find_node(self, prefix):
"""Helper: traverse to node for prefix."""
node = self.root
for char in prefix:
if char not in node.children:
return None
node = node.children[char]
return node
def autocomplete(self, prefix, max_results=10):
"""Find all words with given prefix."""
node = self._find_node(prefix)
if not node:
return []
results = []
self._collect_words(node, results, max_results)
return results
def _collect_words(self, node, results, max_results):
"""DFS to collect all words under node."""
if len(results) >= max_results:
return
if node.is_end_of_word:
results.append(node.word)
for child in node.children.values():
self._collect_words(child, results, max_results)
# Usage
trie = Trie()
for word in ["car", "cat", "care", "dog", "dodge"]:
trie.insert(word)
print(trie.search("car")) # True
print(trie.search("ca")) # False (not a complete word)
print(trie.starts_with("ca")) # True
print(trie.autocomplete("ca")) # ["car", "cat", "care"]
print(trie.autocomplete("do")) # ["dog", "dodge"]
JavaScript Implementation
class TrieNode {
constructor() {
this.children = new Map();
this.isEndOfWord = false;
this.word = null;
}
}
class Trie {
constructor() {
this.root = new TrieNode();
}
insert(word) {
let node = this.root;
for (const char of word) {
if (!node.children.has(char)) {
node.children.set(char, new TrieNode());
}
node = node.children.get(char);
}
node.isEndOfWord = true;
node.word = word;
}
search(word) {
const node = this.findNode(word);
return node !== null && node.isEndOfWord;
}
autocomplete(prefix, maxResults = 10) {
const node = this.findNode(prefix);
if (!node) return [];
const results = [];
this.collectWords(node, results, maxResults);
return results;
}
findNode(prefix) {
let node = this.root;
for (const char of prefix) {
if (!node.children.has(char)) return null;
node = node.children.get(char);
}
return node;
}
collectWords(node, results, maxResults) {
if (results.length >= maxResults) return;
if (node.isEndOfWord) results.push(node.word);
for (const child of node.children.values()) {
this.collectWords(child, results, maxResults);
}
}
}
Time & Space Complexity
Time Complexity
| Operation | Complexity | Explanation |
|---|---|---|
| Insert | O(L) | L = word length, traverse each character |
| Search | O(L) | Traverse each character in word |
| Prefix search | O(L) | Find prefix node |
| Autocomplete | O(L + M) | L = prefix length, M = matches to collect |
| Delete | O(L) | Traverse and mark node |
Compare to HashMap:
- HashMap: O(1) for exact match
- Trie: O(L) for prefix match (HashMap can't do this efficiently)
Space Complexity
Worst case: O(ALPHABET_SIZE × N × L)
- N = number of words
- L = average word length
- ALPHABET_SIZE = 26 for lowercase English
Best case (shared prefixes): Much better!
Without Trie (store separately):
["car", "cat", "care"] = 3 + 3 + 4 = 10 characters
With Trie (share prefix):
c → a → r → (end)
↓
t → (end)
↓
e → (end)
Total: 5 unique nodes
Space Optimization Techniques
1. Compressed Trie (Radix Tree)
Merge chains of single-child nodes:
Before: After: c car | / \ a e t | r / \ e (end)
class RadixNode:
def __init__(self):
self.children = {}
self.label = "" # Edge label (can be multi-char)
self.is_end = False
Benefits: Reduces nodes for words with unique suffixes
2. Array vs HashMap for Children
# HashMap (flexible, sparse)
node.children = {} # char -> TrieNode
# Array (faster, memory-hungry)
node.children = [None] * 26 # Only lowercase a-z
Trade-off:
- Array: O(1) lookup, wastes space if sparse
- HashMap: O(1) average lookup, memory efficient
3. Ternary Search Tree
Hybrid between Trie and BST:
class TSTNode:
def __init__(self, char):
self.char = char
self.left = None # chars < this char
self.mid = None # next char in word
self.right = None # chars > this char
self.is_end = False
Benefits: Space efficient (3 pointers vs 26), still fast
Real-World Use Cases
1. Autocomplete / Type-Ahead
# Google Search autocomplete
search_trie = Trie()
# Pre-populate with popular searches (ranked by frequency)
queries = [
("system design interview", 1000000),
("system design primer", 500000),
("systematic review", 200000)
]
for query, freq in queries:
search_trie.insert(query)
# Store frequency for ranking
# User types "syst"
suggestions = search_trie.autocomplete("syst", max_results=5)
# Returns: ["system design interview", "system design primer", "systematic review"]
2. Spell Checker
def spell_check(word, dictionary_trie):
if dictionary_trie.search(word):
return True, "Correct"
# Find similar words (edit distance 1)
suggestions = find_similar_words(word, dictionary_trie)
return False, f"Did you mean: {suggestions}?"
def find_similar_words(word, trie):
"""Find words with 1 character difference"""
suggestions = []
# Try deletions, insertions, substitutions
for i in range(len(word)):
# Deletion
candidate = word[:i] + word[i+1:]
if trie.search(candidate):
suggestions.append(candidate)
# Substitution
for c in 'abcdefghijklmnopqrstuvwxyz':
candidate = word[:i] + c + word[i+1:]
if trie.search(candidate):
suggestions.append(candidate)
return suggestions[:5]
3. IP Routing (Longest Prefix Match)
Routers use tries to find the longest matching IP prefix:
# IP routing table
routing_trie = Trie()
routing_trie.insert("192.168.0.0/16", next_hop="Router A")
routing_trie.insert("192.168.1.0/24", next_hop="Router B")
# Packet to 192.168.1.100
# Matches both, but 192.168.1.0/24 is longer (more specific)
# Forward to Router B
4. T9 Predictive Text
Old phone keyboards (2ABC, 3DEF, etc.):
t9_trie = Trie()
# Map: 2 -> ['a','b','c'], 3 -> ['d','e','f'], ...
def t9_search(digits):
"""User presses 228 -> find words like 'cat', 'bat', 'act'"""
candidates = []
def dfs(node, pos, current_word):
if pos == len(digits):
if node.is_end_of_word:
candidates.append(current_word)
return
# Try all letters for this digit
for char in digit_to_chars[digits[pos]]:
if char in node.children:
dfs(node.children[char], pos + 1, current_word + char)
dfs(t9_trie.root, 0, "")
return candidates
5. Word Games (Boggle, Scrabble)
def find_words_in_grid(grid, dictionary_trie):
"""Find all valid words in Boggle grid"""
words = set()
def dfs(i, j, node, path, visited):
if node.is_end_of_word:
words.add(node.word)
for di, dj in [(0,1), (1,0), (0,-1), (-1,0)]:
ni, nj = i + di, j + dj
if 0 <= ni < len(grid) and 0 <= nj < len(grid[0]):
if (ni, nj) not in visited:
char = grid[ni][nj]
if char in node.children:
visited.add((ni, nj))
dfs(ni, nj, node.children[char], path + char, visited)
visited.remove((ni, nj))
for i in range(len(grid)):
for j in range(len(grid[0])):
char = grid[i][j]
if char in dictionary_trie.root.children:
dfs(i, j, dictionary_trie.root.children[char], char, {(i,j)})
return words
Trie vs Alternatives
| Data Structure | Insert | Search (exact) | Prefix search | Space |
|---|---|---|---|---|
| Trie | O(L) | O(L) | O(L + M) ✅ | High (many nodes) |
| HashMap | O(1) | O(1) ✅ | O(N × L) ❌ | Medium |
| Binary Search Tree | O(log N) | O(log N) | O(log N + M) | Low |
| Sorted Array | O(N) | O(log N) | O(log N + M) | Low ✅ |
Choose Trie when:
- Prefix-based queries are common
- Autocomplete is needed
- Dictionary operations dominate
- Memory is available
Avoid Trie when:
- Only exact matches needed (use HashMap)
- Memory constrained
- Words don't share prefixes
Interview Tips 💡
When discussing tries in system design interviews:
- Explain the problem: "We need to support autocomplete, which requires efficient prefix searches..."
- Justify trie choice: "Trie gives us O(L) prefix search, independent of dictionary size..."
- Discuss optimizations: "For production, we'd use a compressed trie to save memory..."
- Mention ranking: "We'd store word frequency in nodes to rank autocomplete results..."
- Consider scale: "For billions of queries, we'd cache popular prefixes in Redis..."
- Address alternatives: "For exact-match-only queries, a hash table would be faster..."
Related Concepts
- Consistent Hashing — Another specialized data structure
- Bloom Filters — Probabilistic membership testing
- Geohashing — Prefix-based spatial indexing
- Caching — Cache autocomplete results
- Redis Internals — Redis supports sorted sets for autocomplete
About ScaleWiki
ScaleWiki is an interactive educational platform dedicated to demystifying distributed systems, software architecture, and system design. Our mission is to provide high-quality, technically accurate resources for software engineers preparing for interviews or solving complex scaling challenges in production.
Read more about our Editorial Guidelines & Authorship.
Educational Disclaimer: The architectural patterns and system designs discussed in this article are based on common industry practices, technical whitepapers, and public engineering blogs. Actual implementations in enterprise environments may vary significantly based on specific product requirements, legacy constraints, and evolving technologies.
Related Articles
Bloom Filters
Space-efficient probabilistic data structure for membership testing that allows false positives but guarantees no false negatives, using minimal memory compared to hash sets.
Cache Eviction Policies
When the cache is full, something has to go. A comprehensive guide to LRU, LFU, ARC, and other replacement algorithms with implementation details.
Geohashing (Location Encoding)
A geocoding system that encodes latitude/longitude coordinates into short alphanumeric strings for efficient proximity searches and spatial indexing.