Determining Common Characters in Strings: Effective Techniques and Leetcode Example
Discovering common characters between two strings or determining if they have no common characters can be crucial in various programming scenarios. This article presents different approaches, including a bitmask approach, to efficiently identify common characters. Additionally, it showcases a practical Leetcode example highlighting the application of these techniques.
Let's say we have
string a="wxyz";
string b= "uxaoooal";
To determine common characters or to determine if two strings have no common characters:
Naive Approach:
- Compare each character from the first string with every character in the second string.
- Store the matching characters in another string or an array.
- This approach is straightforward but may not be efficient for larger strings.
Little fast Approach:
- Sort both strings before comparing.
- Sorting enables faster comparison as characters with similar values will be adjacent, allowing for optimized matching.
Using Set/Hash-Table data structure:
- Utilize a set or hashtable to store all characters from one of the strings.
- Iterate through the characters of the second string, allowing for faster identification of common characters.
Bitmask Approach(Optimized/Fastest):
- Initialize an integer to zero, representing a 32-bit bitmask.
- Utilize the bits to indicate the presence of a character in a word.
- Subtract the character from the base character (e.g., 'a') to determine the bit position to mask.
- Set the corresponding bit to 1 using bitwise OR operations.
- By performing this operation on both strings, obtain two integers.
- Apply a logical AND operation between the two integers.
- If the result is non-zero, common characters exist.
- If the result is zero, the strings have no common characters.
Leetcode Example: Maximum Product of Word Lengths
- This Leetcode problem (Maximum-product-of-word-lengths/) demonstrates the practical application of the bitmask approach.
- The provided solution utilizes a bitmask representation of characters for each word in a vector.
- It calculates the maximum product of word lengths that have no common characters using bitwise operations.
- The solution efficiently processes the words and finds the maximum product by comparing their bitmasks.
For letter 'a' : .....00001 as letter a is 0 chars away from 'a';
We can get bit position to mask by subtracting the char from 'a' character;
ex. int pos='c'-'a', which is 2, then index 2 from RHS will be made 1.
so,
pos= ....000100,
We can do this using the bit right shift operation.
1<<n
here, 1 in bit form is shifted left by n,
now,
for ab=......00011 // for a and b
we can achieve this using (a or b) i.e. a|b where a=...0001,b =...0010
Therefore,
for string a="wxyz"
Let n=0
foreach( c in a):
n = n| 1<<(c-'a')
we will get an integer n, similarly on doing the above operation for string b, we will get another integer say n1.
Now, if we do logical AND operation on n & n1;
suppose n=5- 0101;
and n1=7-0111;
n&n1=0101
Since the above value is not 0, we can say it has common characters.
to find common characters,
we know as first bit from RHS is 1, so char a, and 3rd bit is also 1, so char c is also common.
However, if the value after the AND operation is 0, then these strings have no characters common.
Here is a question from Leetcode,
Whose answer is posted below.
class Solution { public: int maxProduct(vector<string>& words) { int N=0; int n=words.size(); vector<int> st(n,0); int k=0; for(int i=0;i<n;i++) { string s=words[i]; for(int j=0;j<s.size();j++) { st[i] |=1<<(s[j]-'a'); } } int ans=0; for(int i=0;i<n;i++) { int a=st[i]; for(int j=i+1;j<n;j++) { int b=st[j]; if((a&b)==0) { int s1=words[i].size(); int s2=words[j].size(); ans=max(ans,s1*s2); } } } return ans; } };
Comment your doubts. Have a good day!