Problem Statement
Contains Duplicate
Given an integer array nums
, return true
if any value appears at least twice in the array, and return false
if every element is distinct.
Example 1:
Input: nums = [1,2,3,1]
Output: true
Example 2:
Input: nums = [1,2,3,4]
Output: false
Optimal Approach
The most efficient way to solve this problem is by using a HashSet. This allows us to check if a number has already appeared in the array in constant time, O(1). As we iterate through the array, we simply check if the number is already in the set. If it is, we return true
. If we finish the loop without finding any duplicates, we return false
.
LeetCode-Compatible Code Implementation
C++ Solution
class Solution {
public:
bool containsDuplicate(vector<int>& nums) {
unordered_set<int> seen;
for (int num : nums) {
if (seen.find(num) != seen.end()) {
return true;
}
seen.insert(num);
}
return false;
}
};
Step-by-Step Explanation of Code
- Initialization: We initialize an unordered set called
seen
. This will store the elements that we have encountered so far. - Iterate through the array: We loop through each element in the array
nums
. - Check for duplicate: For each element
num
, we check if it already exists in the set:- If it does, we immediately return
true
(indicating a duplicate was found). - If it doesn't, we insert
num
into the set.
- If it does, we immediately return
- Return false: If no duplicates are found after iterating through the entire array, return
false
.
Dry Run / Execution Steps
Input: [1, 2, 3, 1]
seen = {}
(empty initially)- Iterating over the array:
- For
num = 1
:seen = {}
→ Insert 1 →seen = {1}
- For
num = 2
:seen = {1}
→ Insert 2 →seen = {1, 2}
- For
num = 3
:seen = {1, 2}
→ Insert 3 →seen = {1, 2, 3}
- For
num = 1
:seen = {1, 2, 3}
→ Found 1 in the set → Returntrue
- For
Output: true
Alternative Approaches & Why Not?
1. Brute Force Approach
- Description: Compare every pair of elements in the array. If any two elements are equal, return
true
. - Time Complexity: O(n^2) because we are comparing all pairs.
- Space Complexity: O(1), no extra space used.
Why Not?: This approach is inefficient with a time complexity of O(n^2). It's impractical for large arrays.
2. Sorting Approach
- Description: Sort the array, and then check adjacent elements for duplicates.
- Time Complexity: O(n log n) due to sorting.
- Space Complexity: O(1) if in-place sorting is used, O(n) if sorting requires additional space.
Why Not?: While more efficient than brute force, sorting introduces additional complexity and is not the most optimal approach for this problem.
3. Two Pointers
- Description: This approach involves using two pointers to compare elements.
- Time Complexity: O(n log n) (because we need to sort first).
- Space Complexity: O(1).
Why Not?: It’s not the most direct approach and relies on sorting, making it less optimal than using a hash set.
Best Solution & Why It’s Best
Solution Using HashSet
- Time Complexity: O(n), where n is the number of elements in the array. Each lookup and insertion operation in the
unordered_set
takes O(1) time on average. - Space Complexity: O(n), because in the worst case, we may need to store all the elements in the set.
This approach is optimal because it solves the problem in linear time with constant-time lookups and insertions.
Comparison Table:
Approach | Time Complexity | Space Complexity |
---|---|---|
Brute Force | O(n^2) | O(1) |
Sorting | O(n log n) | O(1) / O(n) |
Two Pointers | O(n log n) | O(1) |
HashSet (Best) | O(n) | O(n) |
Complexity Analysis
-
Brute Force:
- Time Complexity: O(n^2) due to nested loops.
- Space Complexity: O(1) as no extra space is used.
-
Sorting:
- Time Complexity: O(n log n) because of the sorting step.
- Space Complexity: O(1) if sorting is in-place.
-
Two Pointers:
- Time Complexity: O(n log n) due to sorting.
- Space Complexity: O(1).
-
HashSet:
- Time Complexity: O(n) because we only traverse the array once.
- Space Complexity: O(n) for the hash set.
Conclusion
The HashSet approach is the most efficient for solving the "Contains Duplicate" problem. It offers linear time complexity and is simple to implement, making it the best choice. For similar problems, you can practice with the following: