Algorithm to find two repeated numbers in an array, without sorting

I saw this question in Stack Overflow which I found interesting.

This is the link to the actual question on StackOverflow.com.

There is an array of size n (numbers are between 0 and n – 3) and only 2 numbers are repeated. Elements are placed randomly in the array.

E.g. in {2, 3, 6, 1, 5, 4, 0, 3, 5} n=9, and repeated numbers are 3 and 5.

What is the best way to find the repeated numbers? Space complexity should not be more than O(1) and time complexity should not exceed O(n). Note thatn O(1) means constant amount of memory where memory consumption is not relative to the size of n and O(n) means you can look into the array only a constant number of times.

HINT: Sorting, or using a hash table is not the answer.

One solution is to sum the numbers and also sum the squares then solve the problem by solving the following math equation: A+B=cte and A^2+B^2=anotherCte.

I would suggest a different solution:

As the numbers are integers, they have constant bit size (i.e. 32). Let us assume we are working with 4 bit integers right now. We look for A and B which are the duplicate numbers.

We need 4 buckets, each for one bit. Each bucket contains numbers which its specific bit is 1. For example bucket 1 gets 2, 3, 4, 7, …:

Bucket 0 : Sum ( x where: x & 2 power 0 == 0 )
...
Bucket i : Sum ( x where: x & 2 power i == 0 )

We know what would be the sum of each bucket if there was no duplicate. I consider this as prior knowledge.
Once above buckets are generated, a bunch of them would have values more than expected. By constructing the number from buckets we will have (A OR B for your information).

We can calculate (A XOR B) as follows:

A XOR B = Array[i] XOR Array[i-1] XOR … 0, XOR n-3 XOR n-2 … XOR 0
Now going back to buckets, we know exactly which buckets have both our numbers and which ones have only one (from the XOR bit).

For the buckets that have only one number we can extract the number num = (sum – expected sum of bucket). However, we should be good only if we can find one of the duplicate numbers so if we have at least one bit in A XOR B, we’ve got the answer.

But what if A XOR B is zero? Well this case is only possible if both duplicate numbers are the same number, which then our number is the answer of A OR B.

Example:

Given array of numbers:

0, 1, 2, 3, 4, 5, 2, 3

Let’s write them in Binary format:

 0 => 0 0 0
 1 => 0 0 1
 2 => 0 1 0
 3 => 0 1 1
 4 => 1 0 0
 5 => 1 0 1
 2 => 0 1 0
 3 => 0 1 1

Let us calculate the XOR (^) first: (number xor index)

0^0 ^ 1^1 ^ 2^2 ^ 3^3 ^ 4^4 ^ 2 ^ 3 => 0 0 1

Then we create 3 bit buckets by summing all number where bucket bit is 1:

Expected bucket values:
 B3(2) B2(5) B1(9)
Actual bucket values:
 B3(2) B2(10) B1(12)

From the buckets we can say A|B = 0 1 1 which is correct 0 1 0 | 0 1 1 = 0 1 1.
Using the result of XOR, we know that bucket B1 contains only one of the extra numbers. This is because when A|B is 1, and A^B=1, then one of the numbers should be 0 and the other should be 1.
So we can calculate one number : 12 – 9 = 3. The other number is 10-3-5 = 2.

Advertisements

WordPress reminded me of Iran today

I opened the wordpress home page this morning and suddenly I felt like I am using internet in Iran. No image was visible and all the images were censored.

This might be a weird concept for you but it is the most common internet page for Iranians. Iranian regime blocks inappropriate material, and you know how governments work, level of inappropriateness grows throughout the time and once a black censored box appeared in your browser, it would be just a matter of time for them to fill up your screen.

Please protest against SOPA http://sopastrike.com/strike. Even if you are not American you will be affected with american government’s decision on internet, so let our voice be heard.

To realize how dangerous this bill is, please check this good article.