# Interpolation Search

**Interpolation Search** is another searching algorithm, used to search sorted datasets with an additional condition that the data should be **uniformaly distribued**. Keep in mind that the uniformity of the data is NOT a mandatory ask. Its just the fact that the algorithm performs better in case the data is uniformaly distributed.

Umm.. Ok.. but what is **uniformaly distributed** data?

In layman terms, we can say that when all the elements in a dataset have equal probability to get selected, the dataset is said to be uniformaly distributed. **For example**, when rolling a fair die, every number (from 1 to 6) has equal probability to be on the top face.

Interpolation Search is an improvement over Binary Search, as it performs better for uniformaly distributed datasets. As mentioned in the linked post, binary search divides the given dataset into two equal parts and then based on the comparison with the **middle** index value, it selects the next dataset for search.

Interpolation search differs from Binary Search in this context. Instead of always selecting the middle element as **pivot**, interpolation-search calculates the pivot element by interpolating the **key** element(the value we need to find) with the **lower** (*input[start]*) and **upper** (*input[end]*) values of one of the intervals from the given dataset.

$$ pivot= start + (\frac{(end - start)}{input[end] - input[start]} * (key - input[start])) $$

The above mentioned mathematical equation is used to find the element that we will be using to perform the comparison operation. If the **key** element is not found in the current interval (*input[start] : input[end]*), the **start** and **end** indices are updated to choose another interval (based on the logic mentioned below) and the process is repeated until we have either found the element or have compared all the intervals.

#### Pseudo Code

- Initialize
**start**= 0**end**= size of the input dataset -1**key**= element to be searched

- while
**start < end**and key lies between**start**and**end**indices:- calculate
**pivot**index using above mentioned formula - if the value at
**pivot**index is equal to the key element, return**pivot** - else if,
**key < input[pivot]**, set**start = pivot + 1** - else set
**end = pivot - 1**

- calculate

#### Complexity

The average case complexity of interpolation search algorithm (assuming the data is uniformaly distributed) is $O(log log (n))$. In worst case scenarios, the algorithm performs in a linear time $O(n)$

#### Java Implementation

```
/**
* a method to search input array for the existence of element denoted by key using interpolation search
*
* @param input input array to be searched
* @param key element to be found
* @return index of key element if found, -1 otherwise
*/
public int search(int[] input, int key) {
int start = 0, end = input.length - 1;
//while the key is in the range
while (start <= end && (key >= input[start] && key <= input[end])) {
int pivot = start + ((key - input[start]) * (end - start) / (input[end] - input[start]));
if (input[pivot] == key) {
return pivot;
} else {
if (key > input[pivot]) {
end = pivot - 1;
} else {
start = pivot + 1;
}
}
}
return -1;
}
```

A working implementation with test cases can be found here