# Trie (Prefix Tree) ## Problem Description Implement a Trie (also called Prefix Tree) data structure for efficient string storage and retrieval, particularly useful for prefix-based operations. ## What is a Trie? A trie is a tree-like data structure used to store strings where each node represents a character. Words with common prefixes share the same path from the root, making it efficient for prefix operations. ## Characteristics - **Time Complexity**: - Insert: O(m) where m is the length of the word - Search: O(m) - StartsWith: O(m) - Delete: O(m) - **Space Complexity**: O(ALPHABET_SIZE * N * M) where N is number of words and M is average length - **Best for**: Prefix operations, autocomplete, spell checking ## Structure Each TrieNode contains: - Map of character to child TrieNode - Boolean flag indicating end of word ``` Trie with "cat", "car", "dog": root / \ c d | | a o / \ | t r g * * * (* = end of word) ``` ## Core Operations 1. **insert(word)**: Add a word to the trie 2. **search(word)**: Check if exact word exists 3. **startsWith(prefix)**: Check if any word starts with prefix 4. **delete(word)**: Remove a word from trie 5. **wordsWithPrefix(prefix)**: Get all words with given prefix ## Your Task Implement the following methods in `Trie.java`: 1. `insert(String word)` - Insert a word 2. `search(String word)` - Search for exact match 3. `startsWith(String prefix)` - Check for prefix 4. `delete(String word)` - Delete a word 5. `wordsWithPrefix(String prefix)` - Find all words with prefix ## Test Cases to Consider - Insert and search words - Search for non-existent words - Check prefixes - Words that are prefixes of other words ("app" and "apple") - Empty string - Delete words - Single character words - Words with common prefixes ## Hints - Use a HashMap for children - Mark end of word with a boolean flag - For search: traverse to end and check isEndOfWord - For startsWith: just check if path exists - For delete: use recursion and clean up nodes with no children ## Insert Algorithm ``` 1. Start at root 2. For each character in word: - If child doesn't exist, create it - Move to child 3. Mark last node as end of word ``` ## Search Algorithm ``` 1. Start at root 2. For each character in word: - If child doesn't exist, return false - Move to child 3. Check if current node is marked as end of word ``` ## Applications - Autocomplete systems - Spell checkers - IP routing (longest prefix matching) - T9 predictive text - Solving word games (Boggle, Scrabble) - Dictionary implementation - Search engines ## Advantages over Hash Table 1. Can quickly find all words with given prefix 2. Alphabetical ordering of words 3. No hash collisions 4. Predictable lookup time ## Disadvantages 1. More memory intensive 2. Cache-unfriendly (pointer chasing) 3. More complex to implement ## Follow-up Questions 1. How would you implement word count (number of times a word is inserted)? 2. How would you find the longest word in the trie? 3. How would you implement a case-insensitive trie? 4. Can you optimize space by compressing common paths? ## Variations 1. **Radix Tree (Patricia Trie)**: Compressed trie 2. **Suffix Tree**: For substring operations 3. **Ternary Search Tree**: Space-efficient variant 4. **Compressed Trie**: Merge nodes with single child ## Common Pitfalls - Not marking end of word correctly - Forgetting to handle words that are prefixes of others - Memory leaks in delete operation - Not handling empty string - Case sensitivity issues ## Delete Operation Considerations When deleting: 1. Check if word exists 2. Remove end-of-word marker 3. Clean up nodes with no children (optional optimization) 4. Don't delete nodes that are part of other words ## Comparison with BST | Operation | Trie | BST | |-----------|------|-----| | Search | O(m) | O(log n) | | Insert | O(m) | O(log n) | | Space | Higher | Lower | | Use Case | Prefix ops | General | m = word length, n = number of words ## Practice Problems 1. Implement autocomplete 2. Word search in 2D grid 3. Replace words (longest prefix) 4. Design search autocomplete system 5. Longest word in dictionary 6. Stream of characters ## Related Data Structures - Hash Table - Binary Search Tree - Suffix Tree - Radix Tree