# Binary Search

While Linear Search can be used in every possible scenario for sequential data structures, the overhead of linear time complexity makes it unsuitable for large data sets.

Another problem involved with linear-search algorithm is that, it does not consider any additional meta-data, that might be available for the input dataset. **For example**, if the data is already sorted and compared to the first value the value that we need to search is very large, we can start scanning from the end, to reduce the number of iterations performed.

In order to avoid these problems, the next best possible solution available is to use **Binary Search**. Binary Search, as the name suggests includes searching for the given number in two partitions. But then the questions come, which two partitions are we referring to? How do we partition the input data? etc. etc.

Before we answer these questions and look at the algorithm itself.

Binary Search algorithm only works for the sorted dataset. If the data is not sorted, we will need to sort it beforehand.

Now, let’s look at the algorithm:

#### Pseudo Code

- Initialize
- key = value to be found
- low = first index of the dataset
- high = last index of the input data set

- While low <= high
- Calculate the
**mid**index - Check if the
**element at index mid**is equal to the element we need to find (key)?- If yes, then return
**mid** - Or else
- If
**value[mid] < key**, set**low = mid + 1** - Else, set
**high = mid -1**

- If

- If yes, then return

- Calculate the
- Return
**-1**, indicating that the value is not present in the input dataset.

Ok, so having seen the algorithm, lets try to understand what’s going on here. As we know the dataset is already sorted and is in increasing order of the values, we can use this property to reduce the number of iterations required to see if the value (denoted by key) is present in it or not.

In every iteration, we compare the value to be found `key`

to the middle value of the dataset. If the two values are equal, we return the index `mid`

or else we partition the dataset in two halves (let’s call those **left** and **right**).

If the `key`

is greater than the value at `mid`

index, we start the next iteration considering only the `right`

half. On the other hand, if the `key`

is less than the value at `mid`

index, we start the next iteration considering only the `left`

half.

This way in every iteration, we are considering only half of the elements and thus reducing the complexity as compared to the linear search algorithm.

#### Complexity

As every iteration considers just the half of the elements of the current dataset, the overall complexity of this algorithm is **$O(log_2 n)$**

#### Java Implementation

```
// binary search implementation in java
// the method will search input[] array for the presence of value denoted by key
public int binarySearch(int[] input, int key) {
int low = 0, high = input.length - 1;
while (high >= low) {
int mid = (high + low) >> 1;
if (input[mid] == key) {
return mid;
} else {
if (input[mid] < key) {
low = mid + 1;
} else {
high = mid - 1;
}
}
}
return -1;
}
```

Java `Collections`

class also provide two utility methods to search List data structure using BinarySearch :

- int binarySearch(List<? extends Comparable<? super T>> list, T key)
- int binarySearch(List<? extends T> list, T key, Comparator<? super T> c)

The second method takes an additional comparator for situations where the comparison between elements in not straight forward (like comparing two custom classes). Using these methods is much easier than writing your own implementation for the same :

```
//Notice Integer[] instead of int[]; can you figure out the reason for using the same ?
public int binarySearchV2(Integer[] input, int key) {
return Collections.binarySearch(Arrays.asList(input), key);
}
```

A working implementation of this can be found here