Algorithm to find two repeated numbers in an array, without sorting


I saw this question in Stack Overflow which I found interesting.

This is the link to the actual question on StackOverflow.com.

There is an array of size n (numbers are between 0 and n – 3) and only 2 numbers are repeated. Elements are placed randomly in the array.

E.g. in {2, 3, 6, 1, 5, 4, 0, 3, 5} n=9, and repeated numbers are 3 and 5.

What is the best way to find the repeated numbers? Space complexity should not be more than O(1) and time complexity should not exceed O(n). Note thatn O(1) means constant amount of memory where memory consumption is not relative to the size of n and O(n) means you can look into the array only a constant number of times.

HINT: Sorting, or using a hash table is not the answer.

One solution is to sum the numbers and also sum the squares then solve the problem by solving the following math equation: A+B=cte and A^2+B^2=anotherCte.

I would suggest a different solution:

As the numbers are integers, they have constant bit size (i.e. 32). Let us assume we are working with 4 bit integers right now. We look for A and B which are the duplicate numbers.

We need 4 buckets, each for one bit. Each bucket contains numbers which its specific bit is 1. For example bucket 1 gets 2, 3, 4, 7, …:

Bucket 0 : Sum ( x where: x & 2 power 0 == 0 )
...
Bucket i : Sum ( x where: x & 2 power i == 0 )

We know what would be the sum of each bucket if there was no duplicate. I consider this as prior knowledge.
Once above buckets are generated, a bunch of them would have values more than expected. By constructing the number from buckets we will have (A OR B for your information).

We can calculate (A XOR B) as follows:

A XOR B = Array[i] XOR Array[i-1] XOR … 0, XOR n-3 XOR n-2 … XOR 0
Now going back to buckets, we know exactly which buckets have both our numbers and which ones have only one (from the XOR bit).

For the buckets that have only one number we can extract the number num = (sum – expected sum of bucket). However, we should be good only if we can find one of the duplicate numbers so if we have at least one bit in A XOR B, we’ve got the answer.

But what if A XOR B is zero? Well this case is only possible if both duplicate numbers are the same number, which then our number is the answer of A OR B.

Example:

Given array of numbers:

0, 1, 2, 3, 4, 5, 2, 3

Let’s write them in Binary format:

 0 => 0 0 0
 1 => 0 0 1
 2 => 0 1 0
 3 => 0 1 1
 4 => 1 0 0
 5 => 1 0 1
 2 => 0 1 0
 3 => 0 1 1

Let us calculate the XOR (^) first: (number xor index)

0^0 ^ 1^1 ^ 2^2 ^ 3^3 ^ 4^4 ^ 2 ^ 3 => 0 0 1

Then we create 3 bit buckets by summing all number where bucket bit is 1:

Expected bucket values:
 B3(2) B2(5) B1(9)
Actual bucket values:
 B3(2) B2(10) B1(12)

From the buckets we can say A|B = 0 1 1 which is correct 0 1 0 | 0 1 1 = 0 1 1.
Using the result of XOR, we know that bucket B1 contains only one of the extra numbers. This is because when A|B is 1, and A^B=1, then one of the numbers should be 0 and the other should be 1.
So we can calculate one number : 12 – 9 = 3. The other number is 10-3-5 = 2.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: