Binary Search

February 15, 2019 4 minutes

While Linear Search can be used in every possible scenario for sequential data structures, the overhead of linear time complexity makes it unsuitable for large data sets.

Another problem involved with linear-search algorithm is that, it does not consider any additional meta-data, that might be available for the input dataset. For example, if the data is already sorted and compared to the first value the value that we need to search is very large, we can start scanning from the end, to reduce the number of iterations performed.

In order to avoid these problems, the next best possible solution available is to use Binary Search. Binary Search, as the name suggests includes searching for the given number in two partitions. But then the questions come, which two partitions are we referring to? How do we partition the input data? etc. etc.

Before we answer these questions and look at the algorithm itself.

Binary Search algorithm only works for the sorted dataset. If the data is not sorted, we will need to sort it beforehand.

Now, let’s look at the algorithm:

Pseudo Code

  1. Initialize
    • key = value to be found
    • low = first index of the dataset
    • high = last index of the input data set
  2. While low <= high
    • Calculate the mid index
    • Check if the element at index mid is equal to the element we need to find (key)?
      • If yes, then return mid
      • Or else
        • If value[mid] < key, set low = mid + 1
        • Else, set high = mid -1
  3. Return -1, indicating that the value is not present in the input dataset.

Ok, so having seen the algorithm, lets try to understand what’s going on here. As we know the dataset is already sorted and is in increasing order of the values, we can use this property to reduce the number of iterations required to see if the value (denoted by key) is present in it or not.

BinarySearch

In every iteration, we compare the value to be found key to the middle value of the dataset. If the two values are equal, we return the index mid or else we partition the dataset in two halves (let’s call those left and right).

If the key is greater than the value at mid index, we start the next iteration considering only the right half. On the other hand, if the key is less than the value at mid index, we start the next iteration considering only the left half.

This way in every iteration, we are considering only half of the elements and thus reducing the complexity as compared to the linear search algorithm.

Complexity

As every iteration considers just the half of the elements of the current dataset, the overall complexity of this algorithm is $O(log_2 n)$

Java Implementation

// binary search implementation in java
// the method will search input[] array for the presence of value denoted by key
public int binarySearch(int[] input, int key) {
        int low = 0, high = input.length - 1;
        while (high >= low) {
            int mid = (high + low) >> 1;
            if (input[mid] == key) {
                return mid;
            } else {
                if (input[mid] < key) {
                    low = mid + 1;
                } else {
                    high = mid - 1;
                }
            }
        }
        return -1;
}

Java Collections class also provide two utility methods to search List data structure using BinarySearch :

  1. int binarySearch(List<? extends Comparable<? super T>> list, T key)
  2. int binarySearch(List<? extends T> list, T key, Comparator<? super T> c)

The second method takes an additional comparator for situations where the comparison between elements in not straight forward (like comparing two custom classes). Using these methods is much easier than writing your own implementation for the same :

//Notice Integer[] instead of int[]; can you figure out the reason for using the same ?
public int binarySearchV2(Integer[] input, int key) {
     return Collections.binarySearch(Arrays.asList(input), key);
}

A working implementation of this can be found here


comments powered by Disqus