Binary Search : Extra Care


Author - FrugalisMinds

So you might be aware of the algorithm of Binary Search, simple one right. But there is simple bug lies in Binary Search if we don't care at all while writing the codes and most of the beginner does same. Here is Java Edition of Algorithm -

public static int binarySearch(int[] a, int key) {
int low = 0;
int high = a.length - 1;
while (low <= high) {
int mid = (low + high) / 2;
int midVal = a[mid];
if (midVal < key)
low = mid + 1
else if (midVal > key)
high = mid - 1;
return mid; // Key has found
return -1; // key not found.
view raw hosted with ❤ by GitHub

So what's Bug Here or do you think everything is okay here? The Bug is in the line

int mid = (low + high) / 2;
view raw midOperationOldWay hosted with ❤ by GitHub

In Programming Pearls Bentley says that the analogous line "sets m to the average of l and u, truncated down to the nearest integer." On the face of it, this assertion might appear correct, but it fails for large values of the int variables low and high. Specifically, it fails if the sum of low and high is greater than the maximum positive int value (231 - 1). The sum overflows to a negative value, and the value stays negative when divided by two. In C this causes an array index out of bounds with unpredictable results. In Java, it throws ArrayIndexOutOfBoundsException.

This bug can manifest itself for search space is 230 or greater (roughly a billion elements). 

So what's best way to fix the Bug? Here's the one way manually add half the range to the low number:

int mid = low + ((high - low) / 2);
view raw midOperationNewWay hosted with ❤ by GitHub

Do you know how this works,
Calculation without adjustment would've been like 5 + 10 = 15 -> 15/2 = 7.5
here, sum can cause overflow but not subtraction.
5 + (10-5)/2 == 5 + 5/2 == 5 + 2.5 == 7.5 . We calculate the average difference instead of actual average and add it to smaller number to get same results.

Second way to fix this is using logical bit shifting.

int mid = (high + low)>>>1;

Where >>> is logical right shift operator.

Reason this works is, logical bit shift unlike arithmetic shift considers sign bit also in shift operation. When overflow happens addition operation sets sign-bit to store result. This is still valid summation but interpreting it as signed int will have wrong results.
1111 unsigned represents 31
signed represents - 15
Also as we know right shift bit operation is equivalent to dividing by 2. So, shift considering the extra bit results,
0111. ie, 15.

Happy Exploring and Happy Coding...

Connect with us at LinkedIn -

Yogesh Joshi