Binary search

Binary search is an algorithm that is used to efficiently search for an element in a sorted list. As covered in the section on arrays, searching through an array is a linear-time operation. However, if the array is sorted and we need to find out if it contains a particular element, we can improve this running time. Here's how.

The main idea

Suppose we have the following array of integers, sorted in ascending order: [0, 1, 2, 3, 4, 5]. We want to find out if the array contains the element 1. We can calculate the midpoint of the array by calculating the midpoint of the lowest and highest indices. This gives us $(0 + 5) / 2 = 2$ (we floor the result).

Next, we compare the element at index 2 to the desired value. Since it's greater than the desired value, we can search for the value in the lower half of the array. That is, we can search all the elements to the left of the midpoint.

We calculate a new midpoint in this new sub-array: $(0 + 1) / 2 = 0$ . Since the value at this new midpoint is lower than the desired value, we increase the value of the current low index, so that we search only the elements on the right of the midpoint. The new midpoint is $(1 + 1) / 2 = 1$ and the value at that midpoint is the desired value, so we return it. Let's have a look at the code.

Implementation

Below is the implementation of the binary search method. In this case, the input to the method is a sorted array of ints along with the value to find in the array.

Binary search method

public int binarySearch(int[] array, int element) {
  if (array.length == 0) return -1;
  int low = 0;
  int high = array.length - 1;
  int mid;
  while (low <= high) {
    mid = (low + high) / 2;
    if (array[mid] < element) {
      low = mid + 1;      
    } else if (array[mid] > element) {
      high = mid - 1;      
    } else {
      return mid;
    }
  }
  return -1; // Didn't find the element
}

Time complexity

What's the time complexity of this algorithm? The first few lines all take constant time to execute. The real work happens in the while loop. So the question is: How many times does the while loop execute?

Suppose the input array has size $n$ . In the worst case, where element doesn't exist in the array, the while loop will terminate when the variable low is greater than the variable high. This occurs when the sorted subarray that we're inspecting has no more elements in it.

On each iteration of the while loop, we halve the size of the sorted subarray. In the example above, the original array has length six. Then this is halved to become three. Then this is halved once more to become one.

The first iteration of the while loop looks through an array of size $n / 2$ . The next iteration looks through an array of size $n / 4$ . The next looks through an array of size $n / 8$ . This continues until the array has size 1. So the number of iterations is the number of times $n$ can be halved until we reach 1. Another way of expressing this is $\log_{2} n$ . Therefore, the worst-case running time of the algorithm is logarithmic with respect to the input array size. That is, the algorithm has a $O (\log (n))$ running time.

Key takeaways

Any time you need to scan through a sorted list, consider using binary search. If the list is small, binary search might not seem like a massive improvement over doing a linear scan, but as the list size grows, a logarithmic running time begins to seriously outperform a linear one.